CPU Load Average (CPL1)
Description
This probe measures the average processor load over the last 1, 5 and 15 minutes. A value of 1 means that one CPU is fully loaded. As a rule of thumb, a load of up to about 0.7 per core is considered healthy. On a system with more than one core, this guideline scales with the number of cores: on a quad-core system, for example, a load average of around 3 is perfectly healthy. A higher load average means that work is being slowed down by a lack of CPU power. Because it is hard to tell at which load average services actually start to be affected, it is advised not to set your CRITICAL threshold too low.
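To compare load averages across machines with different core counts, it can help to divide the load by the number of cores. The following is a minimal sketch of that arithmetic, not the probe's own implementation; the function names are hypothetical, and `os.getloadavg()` is only available on Unix-like systems.

```python
import os


def load_per_core(load_average, cores):
    """Return the load average divided by the number of CPU cores."""
    if cores < 1:
        raise ValueError("cores must be at least 1")
    return load_average / cores


def current_load_per_core():
    """Return the 1, 5 and 15 minute load averages of this host, per core."""
    cores = os.cpu_count() or 1  # cpu_count() may return None
    return tuple(load_per_core(value, cores) for value in os.getloadavg())
```

For example, `load_per_core(3, 4)` gives 0.75 for the quad-core case mentioned above, which is close to the 0.7 rule of thumb.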
Release notes
Version 1.0 - General deployment
- Feature: Initial version
- Dependency:
Resource configuration interface
Resource parameters
- load warning level: the minimum load average that generates a WARNING alarm. The recommended value is 0.75 times the number of CPU cores. Some experience is typically required to find a value that works for your system. The default value is 10, which is definitely not a normal load.
- load critical level: the minimum load average that generates a CRITICAL alarm. It is recommended either not to set this at all, or to set it high enough to avoid false alarms. Whether a given load, even one as high as 10, actually affects services depends on various factors. Some experience is typically required to find a value that works for your system. The default value is 50, to make sure there are no false alarms.
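The recommendation for the load warning level is simple arithmetic. As a rough illustration, assuming the core count is known, a starting value could be computed as below. The function and constant names are hypothetical; only the 0.75 factor and the two default values come from the text above.

```python
DEFAULT_WARNING_LEVEL = 10.0   # default stated in this document
DEFAULT_CRITICAL_LEVEL = 50.0  # default stated in this document


def recommended_warning_level(cores, factor=0.75):
    """Return a starting point for the load warning level: factor x cores."""
    if cores < 1:
        raise ValueError("cores must be at least 1")
    return factor * cores
```

On a quad-core system this yields 3.0 and on an 8-core system 6.0. These are starting points only and will usually need tuning based on experience with the system.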
This probe generates a graph with the following performance metrics:
- load (1 minute average): a value of 1 means that 1 CPU is fully loaded on average.
Alarms
This probe can report the following alarm states:
- WARNING: any of the 1, 5 or 15 minute load averages surpassed the load warning level parameter
- CRITICAL: any of the 1, 5 or 15 minute load averages surpassed the load critical level parameter
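The alarm rules above can be sketched as a small function. This is an illustration under stated assumptions, not the probe's actual code: "surpassed" is read as a strict greater-than comparison, the `"OK"` label for the no-alarm case is hypothetical, and passing `None` for a level models a threshold that has not been set.

```python
def alarm_state(load_averages, warning_level=10.0, critical_level=50.0):
    """Classify (1 min, 5 min, 15 min) load averages as OK, WARNING or CRITICAL."""
    # "Any of the averages surpassed the level" is the same as
    # "the highest of the averages surpassed the level".
    peak = max(load_averages)
    if critical_level is not None and peak > critical_level:
        return "CRITICAL"
    if warning_level is not None and peak > warning_level:
        return "WARNING"
    return "OK"
```

With the default levels, `alarm_state((12.0, 3.0, 2.0))` returns `"WARNING"` because the 1 minute average alone exceeds 10, even though the 5 and 15 minute averages do not.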
Possible causes
A high CPU load may have various causes.
Possible consequences
A high CPU load may cause the system to respond more slowly than normal. At higher loads, voice quality may be impacted and various services may start to fail.
Possible actions
The following actions may be taken in response to a high load:
- Investigate which process is responsible for the high load.
- SOP Shell > Diagnostics > System > CPU/Memory usage
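Where a plain command line is available instead of the SOP Shell menu, a similar overview can be obtained from the standard `ps` command. The sketch below is an assumption about how one might do this on a Unix-like host, not a description of what the SOP Shell does; the function names are hypothetical. Note that the `pcpu` value reported by `ps` is typically averaged over the lifetime of each process, so it is only a rough indicator of what is causing the current load.

```python
import subprocess


def parse_ps_output(text, limit=5):
    """Parse 'ps -eo pid,pcpu,comm' output and return the top CPU consumers."""
    rows = []
    for line in text.strip().splitlines()[1:]:  # skip the header line
        parts = line.split(None, 2)
        if len(parts) < 3:
            continue  # ignore malformed lines
        pid, pcpu, command = parts
        rows.append((int(pid), float(pcpu), command.strip()))
    # Highest CPU percentage first.
    rows.sort(key=lambda row: row[1], reverse=True)
    return rows[:limit]


def top_cpu_processes(limit=5):
    """Run ps and return (pid, cpu percent, command) for the busiest processes."""
    result = subprocess.run(
        ["ps", "-eo", "pid,pcpu,comm"],
        capture_output=True, text=True, check=True,
    )
    return parse_ps_output(result.stdout, limit)
```

Calling `top_cpu_processes()` returns a list of tuples that can be printed or logged while the load is high.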