Executing PowerShell scripts on Windows clients monitored by Icinga (Nagios etc.)

For one of my customers I have set up a monitoring system based on Icinga2.
Icinga2 runs on an Ubuntu Linux host and monitors clients, servers and devices on the network from there. To monitor the Windows computers, the Icinga2 client was installed on them. The clients were then connected to the Icinga2 server, and the usual parameters such as CPU, RAM and hard disk usage were recorded.

My customer was very satisfied, but had another special task for me: “I would like to monitor an application running on one of our Windows servers for errors,” he said. “In the event of an error, the application writes a dump of it to a file. Can we monitor a specific directory on a Windows server so that we are notified when a new file with a specific file extension appears?” Nothing easier than that, I thought. A PowerShell script should make this easy to implement.

Unfortunately, there is no predefined command in Icinga2 to start a PowerShell script on a Windows computer. I therefore created a new command via the Icinga Director. As command type I choose “Plugin Check Command”, as command name I use “windows-powershell”. The actual command is as follows:

C:\Windows\System32\WindowsPowerShell\v1.0\powershell.exe "& '$script_name$'"

The command uses the $script_name$ variable to pass the path to the PowerShell file, which is then executed on the Windows computer. The path to powershell.exe relative to the Windows directory is the same on every Windows system. However, the drive letter could theoretically be something other than “C:”, although I assume that the operating system is installed on drive “C:” on the vast majority of Windows systems. To cover such cases, the %SystemRoot% variable is usually used in Windows environments to make sure that the path always points to the “Windows” directory on the correct drive.

Unfortunately, this cannot be used here: if a path does not begin with a drive letter, Icinga assumes that it is relative to the Icinga installation directory on the client (usually “C:\Program Files\ICINGA2”).

For example, if I use the path

"%SystemRoot%\System32\WindowsPowerShell\v1.0\powershell.exe"

Icinga tries to find the command relative to the installation directory, which would result in the path “C:\Program Files\ICINGA2\%SystemRoot%\System32\WindowsPowerShell\v1.0\powershell.exe”. Therefore, I specify the absolute path.
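If a client might have Windows installed on a different drive, the absolute path can be looked up on that client before it is entered in the Director. The following is only a minimal sketch: the function name Get-PowerShellCommandPath and its output properties are made up for illustration. It builds the path from %SystemRoot% and also prints it with doubled backslashes, as they appear in the command definition.

```powershell
# Hypothetical helper (not part of Icinga): builds the absolute path to
# powershell.exe from the Windows directory of the current machine.
function Get-PowerShellCommandPath {
    param(
        # Defaults to the %SystemRoot% value of the machine the function runs on
        [string]$SystemRoot = $env:SystemRoot
    )
    # Remove a trailing backslash, then append the fixed relative part
    $path = '{0}\System32\WindowsPowerShell\v1.0\powershell.exe' -f $SystemRoot.TrimEnd('\')
    [pscustomobject]@{
        Path      = $path                      # plain Windows path
        IcingaDsl = $path.Replace('\', '\\')   # backslashes doubled for the config file
    }
}

# Example with an assumed Windows directory on drive D:
Get-PowerShellCommandPath -SystemRoot 'D:\Windows' | Format-List
```

Run without the parameter on the Windows client, the function uses that machine's own %SystemRoot% value.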

So that the path of the PowerShell script file to be executed can be passed to our command, we define a new field called “script_name” on the tab “Fields”. Since our command doesn’t work without a script, we also specify that this is a mandatory field. The definition of our command is as follows:

object CheckCommand "windows-powershell"
{
   import "plugin-check-command"
   command = [
      "C:\\Windows\\System32\\WindowsPowerShell\\v1.0\\powershell.exe",
      "\"&",
      "'$script_name$'\""
   ]
}

Next we define a new command called “windows-powershell-check-error-files” of type “Plugin Check Command”. In this command we import our previously defined command “windows-powershell”.

As value for the variable “script_name” we give the absolute path to the PowerShell script file that will be used to check our folder. I plan to place the script file in the Icinga sbin directory later, so in my case the path is “C:\Program Files\ICINGA2\sbin\check-error-files.ps1”.

Since the script also needs to know which folder to check, we create a new argument with the name “-folderPath” and the value “$folder_path$” of the type “string”. In addition, we create the corresponding mandatory field “folder_path”. After saving the change, we can specify the script path on the first tab of the command. In my case, this is “C:\Program Files\ICINGA2\sbin\check-error-files.ps1”. The definition of the second command is then as follows:

object CheckCommand "windows-powershell-check-error-files"
{
   import "plugin-check-command"
   import "windows-powershell"
   arguments +=
   {
      "-folderPath" =
      {
         required = true
         value = "$folder_path$"
      }
   }
   vars.script_name = "C:\\Program Files\\ICINGA2\\sbin\\check-error-files.ps1"
}

Resolved (i.e. including the settings it inherits from the parent command “windows-powershell”), the command looks like this:

object CheckCommand "windows-powershell-check-error-files"
{
   import "plugin-check-command"
   command = [
      "C:\\Windows\\System32\\WindowsPowerShell\\v1.0\\powershell.exe",
      "\"&",
      "'$script_name$'\""
   ]
   arguments +=
   {
      "-folderPath" =
      {
         required = true
         value = "$folder_path$"
      }
   }
   vars.script_name = "C:\\Program Files\\ICINGA2\\sbin\\check-error-files.ps1"
}

In the next step, we use this command in a new service template. I call the service template “PowerShell Check Error Files” and import the check command “windows-powershell-check-error-files”. In addition, I define that the service has to be run on an agent, i.e. not on the Icinga server itself, where no PowerShell is available. After the configuration, the file for the service looks like this:

template Service "PowerShell Check Error Files"
{
   check_command = "windows-powershell-check-error-files"
   enable_notifications = true
   enable_active_checks = true
   enable_passive_checks = true
   enable_event_handler = true
   enable_perfdata = true
   command_endpoint = host_name
}

In the last step of the Icinga server configuration, I assign the service to a Windows computer and run the check. In this case, I use the computer on which the folder is located. It would also be possible to run the check over the network and check if the folder is present on another computer. However, the Icinga service runs in the default settings on Windows computers under the Network Service account. This account can access the local computer, but cannot access resources on other Windows computers over the network. If this is desired, the Icinga service account on the Windows machine should be changed to a default domain account.

The service is assigned to the desired computer via the Icinga Director and the section “Hosts”, in my case to the computer “Guinan”.

So that the PowerShell script also knows which folder it should check, I enter the path to the folder as argument for “folderPath”.

The configuration file looks like this:

object Service "PowerShell Check Error Files"
{
   host_name = "Guinan.StarTrek.net"
   import "PowerShell Check Error Files"
   vars.folder_path = "C:\\inetpub\\temp"
}

For the Icinga 2 Agent to be able to run the PowerShell script, it must be located on the Windows computer in the Icinga 2 Agent directory under “C:\Program Files\ICINGA2\sbin”. There I copy the following script under the name “check-error-files.ps1”:

Param(
    [string]$folderPath
)


function Exit-WithExitCode($exitCode)
{
    $host.SetShouldExit($exitcode)
    exit
}

if (!$folderPath)
{
    Write-Host "UNKNOWN: Parameter 'folderPath' is not set. Script will be aborted."
    Exit-WithExitCode 3
}

$result = $null
$exitcode = $null

# Does the folder exist?
if (Test-Path -LiteralPath $folderPath)
{
    # @() guarantees an array, so .Count is reliable even for a single match
    $items = @(Get-ChildItem -LiteralPath $folderPath -Filter "*.err")
   
    if ($items)
    {
        if ($items.Count -gt 1)
        {
            $result = ("CRITICAL: Folder '{0}' contains '{1}' error files" -f $folderPath, $items.Count)
            $exitCode = 2
        }
        else
        {
            $result = ("WARNING: Folder '{0}' contains '1' error files" -f $folderPath)
            $exitCode = 1
        }
    }
    else
    {
        $result = ("OK: Folder '{0}' contains '0' error files" -f $folderPath)
        $exitCode = 0
    }
}
else
{
    $result = ("UNKNOWN: Folder '{0}' does not exist" -f $folderPath)
    $exitCode = 3
}

Write-Host $result
Exit-WithExitCode $exitCode

The script accepts the parameter “folderPath” and searches this directory for files with the extension “.err”. If the directory contains no such file, the script returns the exit code 0.
If there is exactly one file with the extension “.err” in the directory, the script returns the exit code 1 and signals a warning.
If there is more than one such file in the directory, the script returns the exit code 2 and signals a critical state.
If the parameter is missing or the folder does not exist, the script returns the exit code 3 (unknown).

The PowerShell script can be extended relatively easily with additional parameters, e.g. to pass in the file extension to check for or the threshold values for warning and critical.
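As a rough sketch of such an extension, the threshold logic could be moved into a small function. The name Get-ErrorFileState and the parameters -Warning and -Critical are assumptions made for this example and are not part of the script above. The default values reproduce the current behavior (one file is a warning, two or more are critical).

```powershell
# Hypothetical helper: maps the number of error files to a check state.
function Get-ErrorFileState {
    param(
        [int]$Count,
        [int]$Warning = 1,    # assumed default: warning from 1 file, as in the script above
        [int]$Critical = 2    # assumed default: critical from 2 files, as in the script above
    )
    # Check the more severe threshold first
    if ($Count -ge $Critical) { return [pscustomobject]@{ Label = 'CRITICAL'; ExitCode = 2 } }
    if ($Count -ge $Warning)  { return [pscustomobject]@{ Label = 'WARNING';  ExitCode = 1 } }
    return [pscustomobject]@{ Label = 'OK'; ExitCode = 0 }
}

# Example: three files with custom thresholds (warning at 2, critical at 5)
$state = Get-ErrorFileState -Count 3 -Warning 2 -Critical 5
Write-Host ("{0}: folder contains '{1}' error files" -f $state.Label, 3)
```

In the check script, the label and exit code returned here would replace the nested if/else block. A file extension parameter could be handled in a similar way by building the -Filter value from it.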

The function “Exit-WithExitCode” is essential for the script. It sets the exit code in such a way that the Icinga2 agent actually receives it. If the exit code is returned directly via the PowerShell command “Exit 1”, the Icinga2 agent does not receive the real value but always receives the value 0.

To check on the client which command the Icinga2 agent executes, debug logging can be enabled. To do this, the following commands are executed on the command line on the client:

icinga2 feature enable debuglog
net stop icinga2
net start icinga2

The log file is created in the following directory:

C:\ProgramData\icinga2\var\log\icinga2

If everything works as desired, debugging can be disabled with the following commands:

icinga2 feature disable debuglog
net stop icinga2
net start icinga2

In the debug log, the script call looks like this:

[2018-03-04 13:50:58 +0100] notice/JsonRpcConnection: Received 'event::ExecuteCommand' message from 'Icheb.startrek.net'
[2018-03-04 13:50:58 +0100] notice/Process: Running command 'C:\Windows\System32\WindowsPowerShell\v1.0\powershell.exe ""&" "'C:\Program Files\ICINGA2\sbin\check-error-files.ps1'"" -folderPath C:\inetpub\temp': PID 3556
[2018-03-04 13:50:59 +0100] notice/Process: PID 3556 ('C:\Windows\System32\WindowsPowerShell\v1.0\powershell.exe ""&" "'C:\Program Files\ICINGA2\sbin\check-error-files.ps1'"" -folderPath C:\inetpub\temp') terminated with exit code 0
[2018-03-04 13:50:59 +0100] notice/ApiListener: Sending message 'event::CheckResult' to 'Icheb.startrek.net'
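Because the debug log grows quickly, it can help to filter it for the lines that report the exit code of a finished check. The following is a minimal sketch. The function name Get-CheckExitCode is made up for this example, and the regular expression is based only on the log format shown above.

```powershell
# Hypothetical helper: extracts process ID and exit code from
# "notice/Process ... terminated with exit code N" lines of the debug log.
function Get-CheckExitCode {
    param([string[]]$LogLine)
    foreach ($line in $LogLine) {
        if ($line -match 'notice/Process: PID (\d+) .*terminated with exit code (\d+)') {
            [pscustomobject]@{
                ProcessId = [int]$Matches[1]
                ExitCode  = [int]$Matches[2]
            }
        }
    }
}

# Example with the log line shown above
$sample = @'
[2018-03-04 13:50:59 +0100] notice/Process: PID 3556 ('C:\Windows\System32\WindowsPowerShell\v1.0\powershell.exe ""&" "'C:\Program Files\ICINGA2\sbin\check-error-files.ps1'"" -folderPath C:\inetpub\temp') terminated with exit code 0
'@
Get-CheckExitCode -LogLine $sample

# On the client, the lines could be read from the log file instead
# (the file name debug.log is an assumption):
# Get-CheckExitCode -LogLine (Get-Content 'C:\ProgramData\icinga2\var\log\icinga2\debug.log')
```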

Robocopy error 1338 (0x0000053A) during file migration from NetApp to Windows Server 2016

I am working on a project to migrate CIFS shares on a NetApp to Windows Server 2016 file servers. To migrate the folder and file structure I use Robocopy with the following command line:

robocopy \\netapp\vol_home$ D:\vol_home /MIR /SEC /SECFIX /R:1 /W:1 /MT:32 /LOG:D:\Migration\Robocopy\vol_home_output.log /NFL /NDL /NP

The migration of the data had worked without problems so far. Today, however, Robocopy displayed an error while trying to copy a directory:

2018/02/25 13:47:46 FEHLER 1338 (0x0000053A) NTFS-Sicherheit wird in Zielverzeichnis kopiert D:\vol_home\test\
Die Struktur der Sicherheitsbeschreibung ist unzulässig.

Or in English:

ERROR 1338 (0x0000053A) Copying NTFS Security to Destination Directory D:\vol_home\test\
The security descriptor structure is invalid.

I found the following information about this error online (KB2459083):

“The error is usually caused by the CIFS file server returning invalid security information for a file. For example, if the CIFS file server returns a NULL Security ID (SID) for a file’s Owner, or a file’s Primary Group, when Robocopy tries to copy this information to the destination file, Windows will return error 87 “The parameter is incorrect” or error 1338 “The security descriptor is invalid”. This is by design – file security information in Windows is expected to contain both Owner and Primary Group SIDs.”

The reason for the problem could be that no owner and/or primary group is set in the folder’s security descriptor. I can change the owner via the properties in Windows Explorer, so I first tried to set another owner there. The new owner was set, but this did not solve the problem. The same error message still appeared.

Unfortunately, Windows Explorer can only display and set the owner; the primary group is not displayed. Fortunately, Windows PowerShell is able to display the ACL of a folder including its owner and primary group. The Get-ACL cmdlet is used for this purpose:

Get-ACL \\netapp\vol_home$\test

In the output, the value for the group is missing. For comparison, a folder with a correctly set primary group shows the group name at this point.
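If more than one folder might be affected, the same check could be applied to a whole directory tree. The following is only a sketch. The function name Test-MissingPrimaryGroup is made up for this example, and it assumes that Get-Acl reports an empty Group value for affected folders, as observed above.

```powershell
# Hypothetical helper: returns $true if the given ACL object has no primary group value.
function Test-MissingPrimaryGroup {
    param($Acl)
    return [string]::IsNullOrWhiteSpace([string]$Acl.Group)
}

# Possible usage on the Windows server (commented out, as it needs access to the share):
# Get-ChildItem -LiteralPath '\\netapp\vol_home$' -Directory -Recurse |
#     Where-Object { Test-MissingPrimaryGroup (Get-Acl -LiteralPath $_.FullName) } |
#     Select-Object -ExpandProperty FullName
```

The commented pipeline would list the full path of every subfolder whose Group value is empty.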

To set the primary group for the folder, I have written the following PowerShell script:

# UNC path of the folder whose primary group is missing
$folderPath = '\\netapp2a\vol_home$\test'
# Localized name of the built-in Administrators group (German Windows); adjust for your system
$primaryGroupName = 'VORDEFINIERT\Administratoren'
$folder = Get-Item -LiteralPath $folderPath
Write-Host ("ACL for folder '{0}' before change:" -f $folderPath)
$folderACL = Get-Acl -LiteralPath $folderPath
$folderACL | Format-List
# Create an empty security descriptor that carries only the new primary group
$newPrimaryGroupACL = New-Object System.Security.AccessControl.DirectorySecurity
$primaryGroup = New-Object System.Security.Principal.NTAccount($primaryGroupName)
# Sets the primary group for the security descriptor associated with this ObjectSecurity object
# https://msdn.microsoft.com/en-us/library/system.security.accesscontrol.directorysecurity(v=vs.110).aspx
$newPrimaryGroupACL.SetGroup($primaryGroup)
$folder.SetAccessControl($newPrimaryGroupACL)

Subsequently, a new call to the Get-Acl cmdlet showed that the primary group was now set.

Robocopy then copied the directory without any problems.