GenResource124

Disk Space (DPS1)

This probe measures the available and used space on a specific file system (disk partition).

Version 1.2 - General deployment

Version 1.1 - General deployment

Feature: option to check the thresholds in megabytes (MB) instead of as a percentage
Dependency:
- SNMP Agent module v2.0+

Version 1.0 - General deployment

GUI unavailable.

partition: the name of the file system to be monitored. Possible values are:
- /: the root (system) file system: a relatively small partition that contains the most critical parts of the system.
- /data: this larger partition contains all user data, such as voicemails.
warning level: generate a WARNING alarm if more than this percentage is used. The default value is 85 (%). The value should be chosen to give enough headroom and time to take action to avoid a CRITICAL alarm.
critical level: generate a CRITICAL alarm if more than this percentage is used. The default value is 95 (%). The value should be chosen such that it is (almost) certain that service is currently impacted and immediate action is required.
Check type: leave empty if the level values are expressed as a percentage; specify 'bu' if the values are expressed in megabytes.
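
The way the two levels and the check type combine can be sketched as follows. This is only an illustration, not the probe's actual implementation: the function name classify_usage and its arguments are hypothetical, and it assumes that with check type 'bu' the levels are compared against the used space in megabytes.

```python
def classify_usage(used_mb, total_mb, warning=85, critical=95, check_type=""):
    """Return 'OK', 'WARNING' or 'CRITICAL' for one partition.

    check_type ""   -> warning/critical are percentages of used space.
    check_type "bu" -> warning/critical are megabytes
                       (assumption: compared against used MB).
    """
    if check_type == "bu":
        value = used_mb
    else:
        value = 100.0 * used_mb / total_mb

    # An alarm is raised only when usage is strictly above the level.
    if value > critical:
        return "CRITICAL"
    if value > warning:
        return "WARNING"
    return "OK"


if __name__ == "__main__":
    # Default levels (85 % / 95 %) on a 1000 MB partition with 900 MB used.
    print(classify_usage(900, 1000))
    # Same partition, levels given in megabytes.
    print(classify_usage(900, 1000, warning=800, critical=950, check_type="bu"))
```

Keeping the warning level well below the critical level, as in the defaults, leaves time to act between the two alarms.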

To monitor the root partition with a warning at 80% and a critical alarm at 99%:

This probe generates a graph with the following performance metrics:

This probe can report the following alarm states:

WARNING: the used space on the given partition has surpassed the warning level threshold.
CRITICAL: the used space on the given partition has surpassed the critical level threshold.

No likely root-causes are documented yet.

As long as there is enough space for the processes that require it, no service is impacted or even degraded.
But if there is not enough space, any process that requires it may fail in various ways, with very severe consequences. This includes:
- telephony
- netDesktop, netConsole
- (almost) anything

Look at the graph:
- If the usage is slowly increasing over a long period of time, you can predict when the used space will reach 100% and so determine the urgency.
- If the slope is rather steep, this is not normal. Something may be generating excessive logs, for instance. Again, you should determine the urgency by predicting when it will reach 100%.
- If the increase was instantaneous and the level is again constant, something must have happened at that time but the situation is stable now.
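
The prediction mentioned above can be approximated with a simple linear fit through a few points read from the graph. The sketch below is illustrative only: the function name hours_until_full and the sample format (hours, used percentage) are assumptions, and a straight-line trend is only a rough model of real disk growth.

```python
def hours_until_full(samples):
    """Estimate the hours from the last sample until usage reaches 100 %.

    samples: list of (hours_since_first_sample, used_percent) pairs,
             at least two, taken at different times.
    Returns None when the fitted trend is flat or decreasing.
    """
    n = len(samples)
    if n < 2:
        raise ValueError("need at least two samples")

    mean_t = sum(t for t, _ in samples) / n
    mean_u = sum(u for _, u in samples) / n
    var_t = sum((t - mean_t) ** 2 for t, _ in samples)
    if var_t == 0:
        raise ValueError("samples must be taken at different times")

    # Least-squares slope, in percentage points per hour.
    slope = sum((t - mean_t) * (u - mean_u) for t, u in samples) / var_t
    if slope <= 0:
        return None

    _, last_used = samples[-1]
    return max(0.0, (100.0 - last_used) / slope)


if __name__ == "__main__":
    # Usage read from the graph once a day: 70 %, 72 %, 74 %.
    estimate = hours_until_full([(0, 70), (24, 72), (48, 74)])
    print(f"about {estimate / 24:.0f} days until the partition is full")
```

A result of a few hours calls for immediate action, while a result of several weeks leaves time to plan a cleanup.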
Contact Escaux