Wednesday, April 17, 2013

OpsMgr: How to monitor disk space

I always use this as a general guide:
  1. Always calculate both the MB and the percentage threshold.
    A drive must breach both thresholds for a state change to occur. For example, say you need a warning when a disk has less than 5 GB of free space. The default warning threshold on a non-system disk is 10 % and 2 GB, so on a 40 GB disk the percentage threshold corresponds to 4 GB (10 %). This disk enters a warning state when it has less than 2 GB of free space. If you override the MB threshold to 5120 MB and keep the 10 % threshold (on a 40 GB disk), you will not get a warning at 5 GB but at 4 GB, because 10 % is the last threshold to be reached. In this example, you would also have to override the percentage threshold to at least 12.5 %, or to 25 % if the warning should apply to a 20 GB disk as well.
  2. Minimize the number of overrides needed in your environment.
The current Base OS management pack, 6.0.7026.0, does not show percent and MB free values in the Logical Disk free space alerts. To remedy this, consider using the addendum management packs from Kevin Holman: "Logical Disk free space alerts don’t show percent and MB free values in the alert description". If a future Base OS management pack fixes this issue, the addendum management packs need to be removed and the disk monitoring overrides need to be recreated.
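The two-threshold rule from item 1 can be checked with a few lines of Python. This is only a minimal sketch of the arithmetic, not how the monitor itself is implemented; the function names `effective_warning_mb` and `required_pct` are made up for illustration, and 1 GB is assumed to be 1024 MB.

```python
GB = 1024  # MB per GB (assumption: binary units)


def effective_warning_mb(disk_mb, pct_threshold, mb_threshold):
    """Free space (MB) below which BOTH thresholds are breached."""
    pct_as_mb = disk_mb * pct_threshold / 100.0
    # Both thresholds must be breached, so the lower value decides.
    return min(pct_as_mb, float(mb_threshold))


def required_pct(disk_mb, target_mb):
    """Smallest percentage threshold that lets target_mb take effect."""
    return target_mb / disk_mb * 100.0


# Default warning thresholds (10 % and 2 GB) on a 40 GB disk
print(effective_warning_mb(40 * GB, 10, 2 * GB))  # 2048.0
# MB threshold overridden to 5120 MB, percentage left at 10 %
print(effective_warning_mb(40 * GB, 10, 5 * GB))  # 4096.0
# Percentage needed for a 5 GB warning on 40 GB and 20 GB disks
print(required_pct(40 * GB, 5 * GB))  # 12.5
print(required_pct(20 * GB, 5 * GB))  # 25.0
```

Running the same functions against the smallest and largest disks in a group quickly shows whether one pair of overrides can cover them all.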

Another issue to consider is whether you want to alert on the warning state. In general, alerting on the warning state of a three-state monitor is something you should avoid. This is because the monitor updates the existing alert when the state changes instead of creating a new one. So if you configure alerting on the warning state together with notifications, the change to the critical state will not trigger a new notification. The latest Base OS management pack has an aggregate rollup monitor that can be used for critical alerting, while the warning alerts come from the child unit monitor. However, if you use the addendum management packs mentioned earlier, this rollup monitor is not included and therefore cannot be used. To remedy this, you could use two monitors: one for warning and one for critical.

To calculate thresholds, use the Logical Disk Free Space Monitor Calculator by Jonathan Almquist.

Start by finding the threshold values that best fit your environment, and use them to override all logical disks. If this is not sufficient, look at the disks that do not fit the general monitoring rule and group those that have the same threshold needs. These could be large file share disks, disks where data grows faster than normal, and so on. Then override based on those groups, using the calculator to find the best threshold values for each group.

A good starting point would be to use the largest disk size, the smallest disk size and the average disk size for both System and Non-System drives in your environment.
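As a rough sketch of how those three sizes could be pulled from an exported disk list, the Python below groups rows into System and Non-System drives and reports the minimum, maximum and average size. It assumes each row is a (computer, drive, size in MB) tuple and that the system drive is always `C:`, which may not hold in your environment; the function name `summarize_sizes` and the sample servers are hypothetical.

```python
from statistics import mean


def summarize_sizes(rows, system_drive="C:"):
    """Return min/max/avg size in MB for System and Non-System drives.

    rows: iterable of (computer, drive, size_mb) tuples.
    Assumption: a drive is a system drive when its name equals system_drive.
    """
    groups = {"System": [], "Non-System": []}
    for _computer, drive, size_mb in rows:
        key = "System" if drive.upper() == system_drive else "Non-System"
        groups[key].append(float(size_mb))
    # Skip empty groups so min() and max() never see an empty list.
    return {
        name: {"min": min(sizes), "max": max(sizes), "avg": mean(sizes)}
        for name, sizes in groups.items()
        if sizes
    }


# Hypothetical sample data, not taken from a real environment
sample = [
    ("srv01.contoso.local", "C:", 40960),
    ("srv01.contoso.local", "D:", 102400),
    ("srv02.contoso.local", "C:", 61440),
    ("srv02.contoso.local", "E:", 20480),
]

for name, stats in summarize_sizes(sample).items():
    print(name, stats)
```

The resulting six sizes (three per drive type) are the values to feed into the calculator when looking for thresholds that work across the whole range.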

You can list drives monitored by OpsMgr with this SQL query (import the list to Excel for sorting and filtering):

 /* [MTV_LogicalDisk] is a view in the OperationsManager database */
 SELECT [PrincipalName] AS [Computer]
    ,[DisplayName_55270A70_AC47_C853_C617_236B0CFF9B4C] AS [Drive]
    ,[SizeNumeric_486ADDDB_2EB8_819A_FA24_8F6AB3E29543] AS [MBSize]
 FROM [MTV_LogicalDisk]
 /* Sort by the column aliases defined above */
 ORDER BY [Computer], [Drive]

/* Or (depending on your OpsMgr/Base OS MP version): */
 SELECT [PrincipalName] AS [Computer]
    ,[DisplayName] AS [Drive]
    ,[SizeNumeric_749AE6FF_BCF2_0852_7B6D_CE73CA77FD8F] AS [MBSize]
 FROM [MTV_Microsoft$Windows$Server$2008$LogicalDisk]
 ORDER BY [Computer], [Drive]