CPU Temperature (CPT1)

Description

This probe measures the temperature of each CPU in the SOP. Do NOT confuse the number of cores with the number of CPUs. For example, a machine with 2 quad-core CPUs has 8 cores in total, but the probe monitors the 2 individual CPUs.

Release notes

Version 2.0.0 - Early deployment
  • Feature: Monitor component through SNMP (M12556)
  • Limitation: If a temperature alert occurs in another hardware component, this probe generates an alert even if that component is not monitored. (M12556)
  • Dependency:
    • SNMP Agent module >= 3.8.0
    • Monitoring Service module >= 1.6.0 installed on the BMS
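
The minimum versions above are compared number by number, not as text: 3.10.0 satisfies ">= 3.8.0" even though it sorts lower as a string. A minimal sketch of such a check, assuming plain dotted numeric version strings; the helper names are hypothetical and are not part of the probe or of the SNMP Agent module:

```python
def parse_version(text: str) -> tuple:
    """Turn a dotted version such as '3.8.0' into a tuple of integers."""
    return tuple(int(part) for part in text.strip().split("."))


def meets_minimum(installed: str, minimum: str) -> bool:
    """Return True when the installed version is at least the minimum.

    Tuples compare element by element, so 3.10.0 is correctly treated
    as newer than 3.8.0.
    """
    return parse_version(installed) >= parse_version(minimum)


# Minimum versions listed for probe version 2.0.0.
SNMP_AGENT_MINIMUM = "3.8.0"
MONITORING_SERVICE_MINIMUM = "1.6.0"
```

For instance, meets_minimum("3.7.9", SNMP_AGENT_MINIMUM) is false, so a system with that SNMP Agent version would not satisfy the dependency.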

Version 1.1.2 - Deprecated
  • Improvement: Made line more visible in graph
  • Bugfix: Made the probe run on both the active and the standby SOP (M9172)
  • Deprecated: Superseded by the HP Hardware probe. Please use that one where possible (M12556)
  • Dependency:
    • SNMP Agent v3.0.0 or higher
    • HP DL120 G7 or DL360 G7

Version 1.1.1 - Deprecated
  • Improvement: Added the possibility to activate the probe at cluster level
  • Deprecated: Superseded by the HP Hardware probe. Please use that one where possible (M12556)
  • Dependency:
    • SNMP Agent v3.0.0 or higher
    • HP DL120 G7 or DL360 G7

Version 1.1.0 - Deprecated
  • Improvement: The probe is no longer executed on the server if it is not defined on the SMP (M6425)
  • Deprecated: Superseded by the HP Hardware probe. Please use that one where possible (M12556)
  • Dependency:
    • SNMP Agent v3.0.0 or higher
    • HP DL120 G7 or DL360 G7

Version 1.0.0 - Deprecated
  • Feature: Initial version (M0004796)
  • Deprecated: Superseded by the HP Hardware probe. Please use that one where possible (M12556)
  • Dependency:
    • SNMP Agent v2.7.6+
    • HP DL120 G7 or DL360 G7

Resource configuration interface

GUI unavailable.

Compatibility

This resource has only been developed and tested on the HP DL360 G7 and the HP DL120 G7.

Resource parameters

  • None

Performance graphs

This probe generates a graph with the following performance metrics:

  • temperature of the CPUs (1 minute average): the temperature is in degrees Celsius.
  • temperature threshold (1 minute average): the temperature is in degrees Celsius and the threshold is fixed by the hardware manufacturer.
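
As an illustration of what a 1 minute average means for the metrics above, here is a minimal sketch. It assumes raw readings arrive as (timestamp in seconds, degrees Celsius) pairs, oldest first; the sampling interval and the function name are assumptions made for this example and do not describe the probe's actual implementation:

```python
from statistics import fmean
from typing import List, Optional, Tuple


def one_minute_average(readings: List[Tuple[float, float]]) -> Optional[float]:
    """Average the readings taken in the 60 seconds before the latest one.

    readings: (timestamp_seconds, celsius) pairs, oldest first.
    Returns None when there are no readings at all.
    """
    if not readings:
        return None
    latest = readings[-1][0]
    # Keep only the samples that fall inside the last 60 seconds.
    window = [celsius for stamp, celsius in readings if latest - stamp < 60]
    return fmean(window)


# Hypothetical readings, one every 30 seconds.
samples = [(0, 50.0), (30, 60.0), (60, 70.0), (90, 80.0)]
average = one_minute_average(samples)
```

With these hypothetical readings only the samples taken at 60 and 90 seconds fall inside the window, so the average is 75.0.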

Alarms

This probe can report the following alarm states:

  • WARNING: the CPU temperature has exceeded the threshold defined by the hardware vendor for 15 minutes.
  • UNKNOWN: the test could not be performed.
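
The alarm rules above can be sketched as a small function. This is only an illustration under stated assumptions: the input is a list of 1 minute averages, a failed measurement is represented by None, and "OK" is a hypothetical name for the state in which no alarm is raised. None of these names come from the probe itself:

```python
from typing import Optional, Sequence

WARNING_AFTER_MINUTES = 15  # duration taken from the WARNING description


def alarm_state(
    averages: Sequence[Optional[float]], threshold: Optional[float]
) -> str:
    """Evaluate the alarm state from 1 minute averages, oldest first.

    A value of None stands for a measurement that could not be taken.
    """
    # UNKNOWN: the latest test could not be performed.
    if threshold is None or not averages or averages[-1] is None:
        return "UNKNOWN"
    recent = averages[-WARNING_AFTER_MINUTES:]
    # WARNING: every one of the last 15 minutes was above the threshold.
    if len(recent) == WARNING_AFTER_MINUTES and all(
        value is not None and value > threshold for value in recent
    ):
        return "WARNING"
    return "OK"  # hypothetical label for "no alarm"
```

In this sketch a single minute at or below the threshold within the last 15 minutes is enough to prevent the WARNING.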

Possible causes

  • a broken CPU fan
  • failure to detect CPU temperature
  • air conditioning failure in the datacenter

Possible consequences

The computer might power off or hardware components could be physically damaged.

Possible actions

  • check the air conditioning
  • replace the SOP
Copyright © Escaux SA