Kaseya Community

Agent offline for 30 min send notification to after hours monitoring email address based on schedule

  • Hi All,
    Just wondering if anyone is using a similar setup and can advise how to configure the following.
    During working hours we send notifications to email groups which contain the monitoring email address and the client's IT email address.
    For some critical devices, our old monitoring system had a "30 min server down" alert which sent emails to the after-hours monitoring team on a schedule: 17:30 - 8:30 on weekdays and all day Saturday and Sunday.
    I am trying to replicate the same behaviour.
    I can see how to set up a policy in Kaseya to send email notifications to a specific email address when an agent is offline for 30 min. However, I cannot see a way to restrict this policy to a schedule (only after hours and weekends).
    I was thinking about procedures, but:
    1. If the server is offline, you probably cannot run procedures on it.
    2. The monitoring schedule is based on Kaseya server time, not the client's time. There are also no built-in time variables, so you have to run procedures to get the current time and date on the client.
    You could probably run procedures on some other server, e.g. a gateway server, but that does not offer much visibility or manageability: you would have to drill through the policies and procedures applied to the gateway to understand why some alerts were generated.
    Any pointers?
    Kind regards,
    Vadim
  • To start with, as you have said, you can rule out a procedure, as that can't be run on an offline machine.

    You really need to set up the Agent Status email notification and, unfortunately, have an external system define the schedule.

    In our case we go through a 3rd party call center who receive these alerts and then assign them to the appropriate after-hours engineer; this may not be acceptable for you.

    See what you can do with the email rules/settings your email security can provide. (Can you set up a rule in O365, for example?)

    Kaseya policies and alerts cannot be restricted to certain hours of the day; once configured, they always apply.
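
    As a rough illustration of what "an external system defining the schedule" could look like, here is a minimal Python sketch that picks a recipient list from the time an alert arrives. The schedule (17:30 - 8:30 on weekdays, all day Saturday and Sunday) comes from the original question; the function names and both email addresses are hypothetical, and the time is assumed to already be in the relevant local time zone.

```python
from datetime import datetime, time

# Business hours taken from the question: 08:30-17:30, Monday to Friday.
DAY_START = time(8, 30)
DAY_END = time(17, 30)

# Hypothetical addresses, for illustration only.
BUSINESS_HOURS_TO = ["monitoring@example.com"]
AFTER_HOURS_TO = ["afterhours@example.com"]


def is_after_hours(now):
    """Return True outside 08:30-17:30 Monday-Friday (naive local time)."""
    if now.weekday() >= 5:  # 5 = Saturday, 6 = Sunday
        return True
    return not (DAY_START <= now.time() < DAY_END)


def recipients_for(now):
    """Pick the recipient list for an alert received at `now`."""
    return AFTER_HOURS_TO if is_after_hours(now) else BUSINESS_HOURS_TO
```

    For example, `recipients_for(datetime(2016, 7, 9, 12, 0))` (a Saturday) returns the after-hours list, while a Monday 09:00 alert returns the business-hours list.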

  • Hello vGonzales,

    A possible idea (not the easiest of the options, but it should work as long as your SQL Server edition is not Express, which does not include SQL Server Agent).

    You can use a SQL Server Agent job schedule to achieve what you need.

    We start from a simple query as below:

    SELECT
         [displayName]
        ,[lastCheckinTime]
        ,DATEDIFF(minute, [lastCheckinTime], GETDATE()) AS [minutesSinceCheckin]
        ,[timeZone]
        ,[onlineState]
    FROM [ksubscribers].[dbo].[vAgents_AgentStatus]
    WHERE DATEDIFF(minute, [lastCheckinTime], GETDATE()) > 30

    That will give you a list of machines that have been offline more than 30 minutes.
    You can change the parameters and join more tables if you need more info.
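
    For readers less familiar with SQL, the filter in that query amounts to the following Python sketch. The function name and the dictionary shape are made up for illustration. Note one small difference: `DATEDIFF(minute, ...)` in SQL Server counts minute boundaries crossed rather than elapsed time, so results right at the 30-minute edge can differ slightly from this elapsed-time version.

```python
from datetime import datetime, timedelta


def offline_longer_than(last_checkins, now, minutes=30):
    """Return the names whose last check-in is more than `minutes` before `now`.

    `last_checkins` maps a display name to its last check-in datetime.
    """
    threshold = timedelta(minutes=minutes)
    return sorted(
        name for name, seen in last_checkins.items() if now - seen > threshold
    )
```

    With `now = datetime(2016, 7, 11, 20, 0)` and two agents last seen 45 and 5 minutes earlier, only the first one is returned.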

    I should mention that you will need to make sure SQL Server Database Mail has been configured.

    Here is a good tutorial:

    www.snapdba.com/.../enabling-and-configuring-database-mail-in-sql-server-using-t-sql

    Then you create a Job with the following step:

    -- List agents whose last check-in is more than 30 minutes old
    SELECT
         [displayName]
        ,[lastCheckinTime]
        ,DATEDIFF(minute, [lastCheckinTime], GETDATE()) AS [minutesSinceCheckin]
        ,[timeZone]
        ,[onlineState]
    FROM [ksubscribers].[dbo].[vAgents_AgentStatus]
    WHERE DATEDIFF(minute, [lastCheckinTime], GETDATE()) > 30;

    -- Only send mail when the query above returned rows
    IF (@@ROWCOUNT > 0)
    BEGIN
        EXEC msdb.dbo.sp_send_dbmail
            @profile_name = 'Your_Email_Profile',
            @recipients = 'Recipients Destination / distribution group',
            @query = 'SELECT [displayName], [lastCheckinTime], DATEDIFF(minute, [lastCheckinTime], GETDATE()) AS [minutesSinceCheckin], [timeZone], [onlineState] FROM [ksubscribers].[dbo].[vAgents_AgentStatus] WHERE DATEDIFF(minute, [lastCheckinTime], GETDATE()) > 30',
            @subject = 'Servers Down!!',
            @body = 'Please find attached list of Servers Down',
            @attach_query_result_as_file = 1;
    END


    And you can attach 2 schedules as follows:


    As a result you will get an Email (only when rows are returned) with the list of servers offline attached.


    Best Regards

  • That's a really great idea.

    One thing with that: you need a way to filter out suspended agents or agents with suspended alarms.

    I know in our VSA there are often dozens at a time that are suspended for one reason or another.

    I guess it's just a case of joining in the correct tables to do that.

  • Hello Rowan,

    Of course, the query above was a simple one to get the idea across, but you can change it to include logic for suspended agents.

    In fact, the query I posted before also picked up EVERY agent offline for more than 30 minutes (not just servers).

    There are so many ways to do the same thing that you may find your own way that you prefer or need.

    The below would exclude suspended agents (or agents with suspended monitoring) and take only servers into consideration.

    SELECT
        vAgentLabel.displayName AS 'Agent',
        agentstate.online AS 'Online',
        offlinetime AS 'Last Online',
        DATEDIFF(minute, offlinetime, GETDATE()) AS 'Offline Since Minutes',
        startSuspend AS 'Suspend Monitors From',
        endSuspend AS 'Suspend Monitors To',
        suspendagent AS 'Suspend Agent',
        userIpInfo.osInfo AS 'OS Info'
    FROM dbo.vAgentLabel
    LEFT JOIN monitorSuspend ON monitorSuspend.agentguid = vAgentLabel.agentGuid
    LEFT JOIN users ON users.agentguid = vAgentLabel.agentGuid
    LEFT JOIN agentstate ON agentstate.agentguid = vAgentLabel.agentGuid
    LEFT JOIN userIpInfo ON userIpInfo.agentguid = vAgentLabel.agentGuid
    LEFT JOIN dbo.vAgents_AgentStatus ON vAgentLabel.agentGuid = vAgents_AgentStatus.agentguid
    WHERE userIpInfo.osInfo LIKE '%server%'
      AND agentstate.online = 0
      AND (suspendAgent IS NULL OR suspendAgent = 0)
      -- Keep agents with no suspend window, or whose window has not started yet or has already ended.
      -- (Requiring start >= now AND end <= now, as first posted, can never match a real window.)
      AND (startSuspend IS NULL OR startSuspend > GETDATE() OR endSuspend < GETDATE())
      AND DATEDIFF(minute, offlinetime, GETDATE()) > 30;

    Really up to you to define the conditions.
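
    The "not currently suspended" condition is easy to get backwards, so here it is spelled out as a small Python sketch. The function name is hypothetical, and it assumes that a missing start means no suspension and that a window with no end is still in effect; adjust both to however your VSA actually records suspensions.

```python
from datetime import datetime


def is_suspended(now, start, end):
    """Return True when `now` falls inside the suspend window.

    `start` and `end` are datetimes or None. Assumptions: no start means
    no suspension; a start with no end is treated as open-ended.
    """
    if start is None:
        return False
    if now < start:
        return False  # window has not started yet
    return end is None or now <= end
```

    An offline agent would then be reported only when `is_suspended(...)` is False.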

    Best Regards

  • Vadim,

    Our VSA is configured to do precisely what you are looking for. We use Service Desk to accomplish this, and this is exactly what Service Desk is designed for. Unfortunately, Kaseya has marketed it as a ticketing system and not an alert-automation system. That's about to change. :)

    We use 4 variables in the SD configuration that define the helpdesk start and end times for weekdays and for weekends/holidays. Our helpdesk does not operate on weekends/holidays, so the weekend start/end times are both set to 0:00.

    When a high-priority ticket arrives, such as a "server down", we process the alert through service desk, extracting the priority and other data from the alert ID. After processing is complete, we perform two tests - one to determine if the help desk is NOT operating and another to determine if the time is within customer coverage (that's a built-in SD procedure). If the help desk is not operating, the alert is enabled for notification. If the customer is within coverage time, a notification is sent immediately, otherwise it is queued to be sent early in the am (6am weekdays/8am weekends & holidays). An email is always sent immediately when a P1/P2 alert arrives - only the notification can be deferred.
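
    The two tests and the deferral described above could be sketched roughly as follows in Python. This is not the Service Desk procedure itself, only the decision logic as described: the function names are hypothetical, the two booleans stand in for the "help desk operating" and "customer in coverage" tests, and holidays are not modelled.

```python
from datetime import datetime, time, timedelta

WEEKDAY_SLOT = time(6, 0)  # deferred notifications go out at 6am on weekdays
WEEKEND_SLOT = time(8, 0)  # and at 8am on weekends (holidays not modelled)


def morning_slot(day):
    """Return the deferred-notification datetime for the given date."""
    slot = WEEKEND_SLOT if day.weekday() >= 5 else WEEKDAY_SLOT
    return datetime.combine(day, slot)


def notification_time(now, helpdesk_open, in_coverage):
    """Return when to notify, or None if no notification is needed."""
    if helpdesk_open:
        return None  # help desk is operating, so notification is not enabled
    if in_coverage:
        return now  # customer is within coverage: notify immediately
    candidate = morning_slot(now.date())
    if candidate <= now:
        # Today's slot has already passed, so use tomorrow's.
        candidate = morning_slot(now.date() + timedelta(days=1))
    return candidate
```

    For a Friday 23:00 alert for a customer outside coverage, this returns Saturday 08:00; for a Monday 02:00 alert it returns Monday 06:00.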

    We use a notification system that places a voice call. Our call goes to an answering service that contacts the primary on-call engineer. If he doesn't answer or is unavailable to take the ticket, they notify the backup on-call engineer. The system could call the engineers directly, but we prefer the human control of escalation for the $100/month it costs. It also provides clients with a direct-call number for after-hours support.

    FYI - we never use Agent Offline to report system status. It's a pretty unreliable indication of the server state. The agent is designed to relinquish resources when the system is under heavy load, such as when backups or end of period closing tasks are running and the CPU is at 90% or more for extended periods. Our P2 alerts use Network Monitor and alert when the system doesn't respond to a specific query (never ping!) after 15 minutes (5 failures on 3-minute intervals). The Agent Offline alert is set for 1 hour and is P4.
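
    The "5 failures on 3-minute intervals" rule can be pictured with a small Python sketch. This is not how Network Monitor is implemented; the class name is hypothetical, and it assumes the failures must be consecutive (a single successful check resets the count).

```python
class FailureCounter:
    """Fire an alert once after `threshold` consecutive failed checks."""

    def __init__(self, threshold=5):
        self.threshold = threshold
        self.failures = 0

    def record(self, ok):
        """Record one check result; return True when the alert should fire."""
        if ok:
            self.failures = 0  # any success resets the streak
            return False
        self.failures += 1
        # Fire exactly once, on the check that reaches the threshold.
        return self.failures == self.threshold
```

    Called once per 3-minute check with a threshold of 5, `record` first returns True on the fifth consecutive failure, i.e. about 15 minutes into an outage.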

    The part that makes this easy is our Multi-Tool for Service Desk. It has complex time calculation and string manipulation functions that simplify this logic.

    TIWE - Time-Is WeekEnd, returns true if the current day/time is a weekend.

    TITR - Time-In Time Range, returns true if a specific (defaults to current) time is within a specific time range. Nice thing about this is it is based on the current 24-hour period, so it understands that 2am is between 5pm YESTERDAY and 8am today!
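
    To show the idea behind these two functions (not the Multi-Tool's actual code, which I am not reproducing here), a weekend test and a time-range test that handles ranges crossing midnight might look like this in Python. The function names are made up, and treating the end of the range as exclusive is an assumption.

```python
from datetime import datetime, time


def is_weekend(now):
    """Return True on Saturday or Sunday."""
    return now.weekday() >= 5


def time_in_range(start, end, t):
    """Return True if time `t` is in [start, end).

    When start is later than end, the range is taken to wrap past
    midnight, e.g. 17:00 to 08:00 covers the evening and early morning.
    """
    if start <= end:
        return start <= t < end
    return t >= start or t < end
```

    So `time_in_range(time(17, 0), time(8, 0), time(2, 0))` is True, and a range whose start and end are both 0:00 is empty in this sketch.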

    The string manipulation functions allow you to extract specific values from delimited strings, so if your alert contains ",priority 2," as one of the message parts, you could split on the comma, and then on the word "priority" to get the priority value.
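
    The split-on-comma, then split-on-"priority" step described above can be sketched in Python like this. The function name and the sample message are made up for illustration.

```python
def extract_priority(message):
    """Return the number following 'priority' in a comma-delimited message.

    Returns None when no part of the message has the form 'priority N'.
    """
    for part in message.split(","):
        # partition() splits once on the word 'priority'.
        _before, found, after = part.partition("priority")
        value = after.strip()
        if found and value.isdigit():
            return int(value)
    return None
```

    For example, `extract_priority("server down,priority 2,host srv01")` returns 2.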

    The math and comparison functions are double-precision for numeric values, not string based as is the Kaseya default.
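
    Why this matters: string comparison goes character by character, so "10" sorts before "9". A quick Python sketch of the difference (the helper name is made up):

```python
def compare_as_numbers(a, b):
    """Compare two numeric strings as floats; return -1, 0 or 1."""
    x, y = float(a), float(b)
    return (x > y) - (x < y)


# As strings, "10" comes before "9" because "1" sorts before "9".
string_says_smaller = "10" < "9"
# As numbers, 10 is greater than 9.
number_says_greater = compare_as_numbers("10", "9") == 1
```

    Both `string_says_smaller` and `number_says_greater` are True here, which is exactly the kind of mismatch that bites when thresholds are compared as text.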

    Our multi tool has over 60 math, string, comparison, and time functions and is just $299 - you can get more info here at www2.mspbuilder.com/products/multi-tool, and it should be up on the application exchange later this week. The complete manual is available for download and review, and we should have a demo version available for download by the end of today.

    Glenn



    Typo
    [edited by: gbarnas at 3:53 AM (GMT -7) on Jul 12, 2016]
  • Hi All,

    Thank you all for your responses (especially those containing SQL queries, as SQL is not one of my strongest skills).

    I was trying to bring VSA and KNM notifications to a "common denominator" and manage everything within one tool, e.g. Kaseya.

    KNM and VSA notifications differ a bit (well, more than a bit)

    1. VSA - Policies can be created to send some alerts to hardcoded email addresses or to use variables for sev1, sev2, sev3. There are no notification schedules. You can use procedures to change email notification schedules based on client time (or recalculate server time based on GMT offset), but it is not built in, so you have to write a piece of code that checks against the schedule. And that will not work for "client offline" events, as offline clients are down and cannot process any procedures.

    2. KNM - has built-in notification schedules, notification groups and "user on duty", so emails can be sent to users based on a schedule.
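
    On the "recalculate server time based on GMT offset" idea from point 1, the arithmetic is simple enough to sketch in Python. The function name is hypothetical, and a fixed offset ignores daylight saving changes, so treat it as an approximation.

```python
from datetime import datetime, timedelta, timezone


def client_local_time(server_utc, gmt_offset_hours):
    """Convert a timezone-aware UTC datetime to a fixed-offset client time."""
    client_tz = timezone(timedelta(hours=gmt_offset_hours))
    return server_utc.astimezone(client_tz)
```

    For a client at GMT+10, `client_local_time(datetime(2016, 7, 11, 7, 45, tzinfo=timezone.utc), 10)` gives 17:45 on the same day, which could then be checked against the after-hours schedule.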

    In our efforts to manage both of these without developing a case of extreme schizophrenia, I ended up modifying Exchange transport rules.

    To refresh memory on our case:

    - All alerts are sent to our alertXXX@YYY email address.

    - All after-hours (5:30 PM - 8:30 AM) "device is offline for 30 min" alerts from VSA and KNM should be sent to a third party, who would raise a case and call the on-call person.

    We send alert emails to email groups which include the after-hours email address, BUT block emails to AfterHoursMonitoring during working hours via an Exchange transport rule.

    The Exchange transport rule is enabled/disabled via scripts running on a schedule.

    See the example below for the script that enables the transport rule and sends a notification email.

    This script runs at 8:30 AM, enabling the transport rule; another script runs at 5:30 PM, disabling the rule.

    The Exchange rule is set up to redirect emails sent to a specific email address to the Emergency Response group, so we can see which critical alerts were supposed to go to AfterHoursMonitoring during business hours.

    I hope that VSA and KNM alerts will become more interconnected, so you would not have to deal with low-visibility workarounds like Exchange transport rules or the SQL scheduler.

    Looks like the end of the story for this particular case.

    Kind regards,

    Vadim

    # 06.07.2016
    # Add PowerShell snap-ins and modules
    Add-PSSnapin Microsoft.Exchange.Management.PowerShell.E2010

    # Set up parameters (HH = 24-hour clock, so the timestamp is unambiguous)
    $timestamp = Get-Date -Format yyMMddHHmm
    $errorfile = "C:\_scripts\MonitoringAlertsScheduleChange-errors-$timestamp.txt"
    $OutputFile = "C:\_scripts\MonitoringAlertsScheduleChange-Output-$timestamp.csv"
    # Rule state is captured here, before the rule is enabled further down
    $EmailTransportRuleState = (Get-TransportRule -Identity "Kaseya disable alerts to AfterHoursMonitoring" | Select-Object Name, State, Comments)

    # Send Email Variables
    $smtp = "ex01.xxx.com.au"
    $from = "Exchange Administrator <support@xxx.com.au>"
    $to = "vg@xxx.com.au"
    $subject = "Monitoring Alerts Schedule change - Critical Kaseya Alerts will be sent to AfterHoursMonitoring"
    $sBody = @"
    Hi All,

    Please be advised Critical Kaseya Alerts will be redirected to Emergency Response group starting from now.
    Schedule will be changed at 5:30 PM for alerts to be redirected to AfterHoursMonitoring.
    You will receive additional email confirming schedule change.

    Kind regards,
    IT Support

    (email is generated by MonitoringAlertsSchedule-enable.ps1 script running on EX01 via Task Scheduler)

    Debug info:
    $EmailTransportRuleState
    "@

    # Enable the rule, send the notification, then log the captured state and any errors
    Enable-TransportRule "Kaseya disable alerts to AfterHoursMonitoring"
    Send-MailMessage -From $from -To $to -Subject $subject -Body $sBody -SmtpServer $smtp
    $EmailTransportRuleState | Out-File $OutputFile -Append
    $error | Out-File $errorfile -Append