Tuesday, June 4, 2013

OpsMgr: Use Command Channel to reset monitor health

Let's assume we have created a two-state script monitor (see my example) that generates critical alerts. When the monitor turns from healthy to critical, it raises an alert. It will not generate another alert unless it becomes healthy again and then turns critical a second time. So, what if you want the script to generate an alert each time it runs?

Of course, you could do this in many ways, and one way would be to use a rule instead of a monitor. The problem is that the Operations Manager Console does not include a wizard for that. We can create Alert Generating Rules, but none of those will run a script. We can create a Timed Commands Rule, but it will not generate an alert. So if we want to do this with a rule, we must use a tool like Visual Studio Authoring Extensions or Visio MP Designer (or an XML editor for those geeky enough to try).

However, if we keep the two state script monitor, we can send the alert to a Notification Command Channel and use PowerShell to reset the monitor health. This is how:

Before you create a command channel that interacts with Operations Manager, you need a user account with the proper permissions. The command runs as the account assigned to the Run As profile called Notification Account. To set this, open the Operations Manager Console and go to Administration > Run As Configuration > Profiles. Open Notification Account and add the domain user to Run As Accounts. The account you add here must also be added to Administration > Security > User Roles > Operations Manager Operators.
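If you want to double-check the role membership from PowerShell rather than from the console, something like the following sketch might help. It assumes the OperationsManager module is available on the management server, and the account name CONTOSO\svc-scom-notify is purely hypothetical; replace it with the domain user you added to the Notification Account profile.

```powershell
Import-Module OperationsManager

# Hypothetical account name; use the domain user assigned to the Notification Account profile
$Account = 'CONTOSO\svc-scom-notify'

# Look up the built-in Operators role by its display name
$Role = Get-SCOMUserRole -DisplayName 'Operations Manager Operators'

If ($Role -and ($Role.Users -contains $Account)) {
  Write-Output "$Account is a member of $($Role.DisplayName)"
}
Else {
  Write-Output "$Account was not found in the Operators role (it may be a member through a group)"
}
```

Note that this only sees accounts listed directly in the role. If the account is a member through an Active Directory group, the check reports it as not found even though the permissions are in place.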

Create a new Command Channel: Operations Console > Administration > Notifications > Channels > New channel > Command...

Give it a name, e.g. Reset monitor health Command, then configure the settings as follows:



Full path of the command line:
c:\windows\system32\WindowsPowerShell\v1.0\powershell.exe
Command line parameters:
-noexit D:\Scripts\resetmonitorhealth.ps1 '$Data/Context/DataItem/AlertId$'
Startup folder for the command line:
c:\windows\system32\windowspowershell\v1.0\

Create the script resetmonitorhealth.ps1 with the following content and save it to D:\Scripts (or the same path used in the Command line parameters) on each Management Server that is a member of the Notifications Resource Pool:
Param([string]$AlertID)

# Import PowerShell Modules
Import-Module OperationsManager

# Get the alert
$SCOMAlert = Get-SCOMAlert -Id ($AlertID) -ErrorAction SilentlyContinue

If ($SCOMAlert) {
  # Get the Monitoring Object (e.g. Windows Computer Object)
  $MonObj = Get-SCOMMonitoringobject -Id $SCOMAlert.MonitoringObjectId

  # If the monitoring object exists and is unhealthy
  If (($MonObj) -and ($MonObj.HealthState -ne "Success")) {
    # Get the monitor for the alert
    $Mon = Get-SCOMMonitor -Id $SCOMAlert.MonitoringRuleId
    If ($Mon) {
      $MonObj.ResetMonitoringState($Mon)
      # If your monitor does not auto-close the alert, uncomment the next line to close it
      #$SCOMAlert | Set-SCOMAlert -Comment "Command Channel: Reset Monitor Health" -ResolutionState 255
      #Write-EventLog -LogName Application -Source "PowerShellScript" -EntryType Information -EventID 101 -Message "Done resetting Monitor $($Mon.DisplayName)"
    }
    #Else {
    #  Write-EventLog -LogName Application -Source "PowerShellScript" -EntryType Warning -EventID 201 -Message "Unable to find monitor for Alert with ID $AlertID"
    #}
  }
  #ElseIf (!$MonObj) {
  #  Write-EventLog -LogName Application -Source "PowerShellScript" -EntryType Warning -EventID 202 -Message "Unable to find monitoring object for Alert with ID $AlertID"
  #}

}
# To debug, you can write to the event log:
# First, on each management server, open PowerShell and run:
#   New-EventLog -LogName Application -Source "PowerShellScript"
# This creates a new source that you can filter on in Event Viewer.
# Then uncomment all the lines with Write-EventLog in this script, including the Else sections.
#Else {
#  Write-EventLog -LogName Application -Source "PowerShellScript" -EntryType Warning -EventID 203 -Message "Unable to find Alert with ID $AlertID"
#}
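Before wiring the script into a subscription, it can be useful to run it by hand on a management server. The following is only a sketch under a few assumptions: the monitor display name DC1 Backup Folder State is the example used in this post, the script is saved at D:\Scripts, and there is currently at least one new alert from that monitor. The variable names are made up for illustration.

```powershell
Import-Module OperationsManager

# Assumption: the display name of the monitor you created (example name from this post)
$Monitor = Get-SCOMMonitor -DisplayName 'DC1 Backup Folder State' | Select-Object -First 1

If ($Monitor) {
  # Pick one new (resolution state 0) alert raised by that monitor
  $TestAlert = Get-SCOMAlert -ResolutionState 0 |
    Where-Object { $_.IsMonitorAlert -and ($_.MonitoringRuleId -eq $Monitor.Id) } |
    Select-Object -First 1

  If ($TestAlert) {
    # Run the script the same way the command channel would, passing the alert ID
    & 'D:\Scripts\resetmonitorhealth.ps1' -AlertID $TestAlert.Id

    # Re-read the alert to see whether its resolution state changed
    Get-SCOMAlert -Id $TestAlert.Id | Select-Object Name, ResolutionState
  }
}
```

If you have enabled the Write-EventLog lines, Get-EventLog -LogName Application -Source "PowerShellScript" -Newest 10 should list what the script logged.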

Then create a Notification Subscription. Under Criteria, select the option Created by specific rules or monitors and choose the monitor you have created, and also select With a specific resolution state and choose New (0):

To send the alert to the Command Channel you must also create a Subscriber:
Subscriber Name: Reset monitor health Subscriber
Addresses: Create a new address named Reset monitor health Subscriber Address, choose the Channel Type Command, and select the Command Channel you created earlier.

Also add any other subscribers you would like to notify, e.g. by e-mail or SMS.

In the Channels section add the Command Channel you created earlier.

When you are done with the Notification Subscription Wizard, the Summary should look something like this:
Name
DC1 Backup Folder State

Description
Reset the monitor state for DC1 Backup Folder State.

Criteria
Notify on all alerts where
created by DC1 Backup Folder State rules or monitors (e.g., sources)
and with New (0) resolution state

Subscribers
Reset monitor health Subscriber

Channels
Reset monitor health Command


You may run into a problem where the message "Maximum Number of Asynchronous Responses (5) has Been Reached" shows up in the event log. Read more in Jim Moldenhauer's Blog.

To resolve this, modify the registry by running the following commands on each management server in the Notifications Resource Pool:
REG ADD "HKLM\Software\Microsoft\Microsoft Operations Manager\3.0\Modules\Global\Command Executer" /v AsyncProcessLimit /t REG_DWORD /d 20 /f
net stop healthservice
net start healthservice
Set the decimal value between 1 and 100. We start with 20 and move up from there if needed. The Health Service is restarted so that the new setting takes effect.
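If you prefer to stay in PowerShell, the same change could be sketched roughly like this. Set-AsyncProcessLimit is a hypothetical helper name, not something that ships with Operations Manager. It writes the same AsyncProcessLimit value to the same registry key as the REG ADD command, restarts the Health Service, and rejects values outside the 1 to 100 range. Run it from an elevated prompt.

```powershell
# Hypothetical helper (not part of Operations Manager)
function Set-AsyncProcessLimit {
    param(
        # Decimal value between 1 and 100, as recommended in this post
        [Parameter(Mandatory = $true)]
        [ValidateRange(1, 100)]
        [int]$Limit
    )

    $Key = 'HKLM:\SOFTWARE\Microsoft\Microsoft Operations Manager\3.0\Modules\Global\Command Executer'

    # Create the key if it does not exist yet
    if (-not (Test-Path -Path $Key)) {
        New-Item -Path $Key -Force | Out-Null
    }

    # Write the DWORD value (created if missing, overwritten otherwise)
    New-ItemProperty -Path $Key -Name 'AsyncProcessLimit' -PropertyType DWord -Value $Limit -Force | Out-Null

    # Restart the Health Service so the new limit is picked up
    Restart-Service -Name 'HealthService'

    # Return the value now stored in the registry
    (Get-ItemProperty -Path $Key -Name 'AsyncProcessLimit').AsyncProcessLimit
}

# Example call on a management server:
#   Set-AsyncProcessLimit -Limit 20
```

The range check happens when the parameter is bound, so an out-of-range value such as 0 fails before the registry or the service is touched.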