Kaseya Community

Offline Servers getting grouped together in one email - can this be stopped?

  • We use the traditional agent offline alerting for our servers. We are working on some automation for offline servers and have run into a snag: if more than one server goes offline within a machine group, Kaseya groups the servers into a single email instead of sending one email per offline server.

    Does anyone know of a way to stop this so each server gets its own email notification? 

  • I don't believe so, no. However, if memory serves, the body of the email lists all the downed servers by default, so you could perhaps process that information?
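
As a rough illustration of processing that information, here is a minimal Python sketch that splits one grouped email body into one notification per server. The body layout (one machine ID per line, in a dotted form such as "srv01.servers.acme") is an assumption, not the documented Kaseya template, and the names `MACHINE_ID` and `split_offline_email` are hypothetical. Check the pattern against a real alert email before relying on it.

```python
import re

# ASSUMPTION: each offline server appears alone on a line as a dotted
# machine ID (for example "srv01.servers.acme"). Adjust the pattern to
# match whatever your alert email template actually emits.
MACHINE_ID = re.compile(r"^\s*([A-Za-z0-9_-]+(?:\.[A-Za-z0-9_-]+)+)\s*$")


def split_offline_email(body):
    """Return one (machine_id, message) pair per server listed in the body."""
    notifications = []
    seen = set()
    for line in body.splitlines():
        match = MACHINE_ID.match(line)
        if not match:
            continue  # header text, blank lines, signatures, etc.
        machine = match.group(1)
        if machine in seen:
            continue  # ignore a server that is listed twice
        seen.add(machine)
        notifications.append((machine, f"Agent offline: {machine}"))
    return notifications
```

Each returned pair could then be handed to whatever sends the per-server notification (a ticketing API, a second email, and so on); that step is left out because it depends entirely on the automation in use.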

    We have offline alerting enabled for servers also; here is my pet peeve:

    We re-arm the alert after 30 minutes so our techs are nagged regularly about down servers. Every 30 minutes the VSA sends another email if the server is still down (that's fine), but the VSA also opens a new Alert. So if the server is down for, say, 4 hours, we end up with 8 alerts open for the same incident. When the server comes back up, only the last open Alert is closed, leaving 7 alerts open for a resolved issue. This looks really bad in the customer reports.

    I don't believe Kaseya cares about fixing this issue, as you're apparently "supposed" to use KNM instead these days.

    I quietly await the day when these sorts of legacy issues are actually revisited by Kaseya and addressed.
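
To make the duplicate-alert arithmetic above concrete, here is a small Python model of the behavior described: one alert opened per re-arm interval, and a cleanup step that closes every open alert for a machine rather than only the latest. This is purely illustrative. The `Alert` class and both functions are hypothetical and do not correspond to any VSA API; a real cleanup would have to go through whatever interface your VSA exposes.

```python
from dataclasses import dataclass


@dataclass
class Alert:
    # Hypothetical record; not a VSA data structure.
    alert_id: int
    machine: str
    is_open: bool = True


def alerts_raised(outage_minutes, rearm_minutes):
    """Alerts opened during an outage if one is raised per re-arm interval."""
    return outage_minutes // rearm_minutes


def close_all_for_machine(alerts, machine):
    """Close every open alert for a machine; return how many were closed."""
    closed = 0
    for alert in alerts:
        if alert.machine == machine and alert.is_open:
            alert.is_open = False
            closed += 1
    return closed
```

With a 4-hour outage and a 30-minute re-arm, `alerts_raised(240, 30)` gives 8, matching the count in the post. Closing all of them on recovery is the behavior the reports would need.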

  • We don't use or recommend the Agent Offline method of server-down alerting. The agent, from what I'm told by engineering, will "go quiet" when the server is under high load, such as you might see during overnight backups or similar nightly processing. We used to get dozens of alerts each night when we used this method. Our Agent Offline setting for servers is 1 hour down, 1 hour to reset for all but two critical 24/7 clients, which are set to 20/60 as a backup to the primary monitor.

    We use Network Monitor to perform a very specific check that makes a request from the server. We call it an "Operational Availability" monitor. Unlike Ping, which can be ignored during high load conditions, the query we make requires a specific response from the server. It can be delayed but not ignored, so we ALWAYS get a response if the server is powered up AND the O/S is operating properly. We set the threshold on this to 15 minutes (5 consecutive failures on 3-minute check cycles), which is usually enough for a system to reboot without getting an offline alert. (We have a separate Smart Monitor that reports reboots during business hours, or booting into Safe Mode.)

    We've found that this simple monitor has eliminated nearly all false "Server Down" alerts based on Agent Offline, yet provided accurate and timely reporting of when a server is actually unavailable.

    These KNM monitors are part of our Core Automation suite for VSA, and the Smart Monitors are part of the Enhanced Maintenance and Monitoring Kit.
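
The 15-minute threshold described above (5 consecutive failures on 3-minute check cycles) can be sketched as a simple consecutive-failure counter. This is a generic Python illustration of the idea, not the Network Monitor implementation; the class name and parameters are hypothetical, and the actual check that queries the server is left to the caller.

```python
class ConsecutiveFailureMonitor:
    """Raise an alert only after N failed checks in a row."""

    def __init__(self, failures_to_alert=5, interval_minutes=3):
        self.failures_to_alert = failures_to_alert
        self.interval_minutes = interval_minutes
        self.consecutive_failures = 0

    @property
    def alert_delay_minutes(self):
        # 5 failures x 3-minute cycle = 15 minutes before an alert.
        return self.failures_to_alert * self.interval_minutes

    def record(self, check_passed):
        """Record one check result; return True when an alert should fire."""
        if check_passed:
            # Any good response resets the count, so a reboot that finishes
            # inside the window never produces an alert.
            self.consecutive_failures = 0
            return False
        self.consecutive_failures += 1
        # Fire once, at the moment the threshold is reached.
        return self.consecutive_failures == self.failures_to_alert
```

Firing only when the count first reaches the threshold means a long outage produces one alert rather than one per check cycle. Whether to re-alert later is a separate policy decision.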