Kaseya Community

Alarm Delay

  • Has anyone else seen delays in alarms triggering after an event? Or not triggering at ALL?

    I've seen all sorts of randomness over the years but finally got some hard facts.

    I rebooted a client's server. It was set to alarm on an offline condition of 2 minutes and send me an e-mail. I eventually got an e-mail saying that it had been offline for 5 minutes (and then the "back online" e-mail a few minutes later). So just to make sure that it wasn't a delay in e-mail, I checked the alarms. Sure enough, the ALARM ITSELF was stamped 5 minutes after the system first went offline - meaning that the alarm took 2-1/2 times as long to trigger as it was supposed to!

    Although it was no biggie in this particular case since it was something planned, this really makes me look like a schmo to my clients when they're paying me all this money to monitor their systems and I'm not able to do so reliably.

    Another recent case - I've got external checks of two web sites and one sub-page on one particular client's web server. They're set as 1-1-1... one minute check, trigger after 1 minute, and rearm after 1 minute (I found there's really no immediate rearm capability). Well, this server conked out and generated some other alarms, but only ONE of those web alerts ever triggered.

    Again, if the client had been relying on me, I'd have looked like an idiot.

    Legacy Forum Name: Alarm Delay,
    Legacy Posted By Username: warever
  • We are having the same problems with Monitor sets, Alarms, and SNMP sets alike. Alarms are triggered at random intervals...

    Legacy Forum Name: Monitor/Event Sets/SNMP Sets,
    Legacy Posted By Username: HakanH
  • I just got a somewhat disturbing e-mail response from Kaseya support today on this subject.

    "Kaseya doesn't flag a machine as offline instantly, it will wait 2x the check-in time to decide that the machine is truly offline, and that it hasn't just missed a check-in due to a temporary factor. Once this is flagged as offline, the background task that runs every 2 minutes will process this alert and send the email. It is reasonable that an additional delay would be seen between the actual time of the machine going offline and the alert being raised, in addition to the time for the mail to be processed, sent, received and read."

    This is disturbing because it means that even if I set the offline figure at 2 minutes (which should ALREADY help prevent false alarms if one or two check-ins were randomly missed), I'm not going to get a notification until what amounts to DOUBLE that.

    Even more disturbing is the revelation that the alert process runs every 2 minutes - which presumably means that EVERY alarm may be delayed up to 2 minutes!!!
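
    To put rough numbers on support's explanation, here is a minimal sketch. It assumes the "2x" wait applies to the configured offline threshold (the reading taken in this post; support's wording could also mean the check-in interval) and that the background task adds anywhere from 0 to 2 minutes. The function name and its parameters are hypothetical and are not part of any Kaseya API.

```python
def alert_delay_window(offline_threshold_s, task_interval_s=120, wait_factor=2):
    """Return (earliest, latest) seconds from outage start to the alarm being raised.

    Assumptions (not confirmed by Kaseya):
      - the machine is flagged offline after wait_factor * offline_threshold_s
      - the alert task then picks it up within 0..task_interval_s seconds
    """
    flagged_after = wait_factor * offline_threshold_s
    return flagged_after, flagged_after + task_interval_s


# A 2-minute offline setting, as used in this thread.
earliest, latest = alert_delay_window(120)
print(earliest / 60, latest / 60)  # prints: 4.0 6.0
```

    Under those assumptions, a 2-minute offline setting yields an alarm somewhere between 4 and 6 minutes after the machine drops, which is consistent with the 5-minute stamp reported in the first post. E-mail delivery time would come on top of that.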

    Now, for some things, it doesn't matter - a 2 minute delay in a drive hitting 50% free space is no biggie (heck, if it waited several HOURS, I wouldn't care). But for other things - like a critical service that stopped - it is just intolerable.

    My ISP guarantees "4-9's" uptime - 99.99% uptime. That leaves a grand total of about 4.3 minutes of allowable downtime per 30-day month. Now, if I'm trying to monitor that, I can't - because I'm not going to get notification within that time frame.
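
    The 4.3-minute figure can be checked with a short calculation. This is only a sketch of the arithmetic, assuming a 30-day month; the helper name is made up for illustration.

```python
def downtime_budget_minutes(uptime_pct, days=30):
    """Minutes of downtime allowed in a period of `days` at the given uptime %."""
    total_minutes = days * 24 * 60
    return total_minutes * (1 - uptime_pct / 100)


budget = downtime_budget_minutes(99.99)
print(round(budget, 2))  # prints: 4.32

# The alarm in the first post was stamped 5 minutes after the outage began,
# which is already past the whole monthly budget.
observed_alarm_delay_min = 5
print(observed_alarm_delay_min > budget)  # prints: True
```

    In other words, a single outage that takes 5 minutes just to raise an alarm has used up the month's allowance before anyone is notified.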

    Nice to finally know the truth - that I really can't reliably count on Kaseya to alert me in a timely fashion to any alarms.

    Legacy Forum Name: Monitor/Event Sets/SNMP Sets,
    Legacy Posted By Username: warever
  • Is it really that important? You are not going to be able to do anything in that additional two minutes anyway. I may be way off base, but I would think if you were getting that tight on alerts that you might risk getting a lot of false positives.

    David Wertz

    Legacy Forum Name: Monitor/Event Sets/SNMP Sets,
    Legacy Posted By Username: DaveW
  • I set servers for 2 minute offline detection precisely because it helps PREVENT false alarms. Normally, a check-in is every 30 seconds so that means it would have to miss FOUR check-ins to hit the 2 minute offline mark.

    If a server happens to be a bit busy, it may miss one. It may miss two. It won't miss four.
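
    The "four check-ins" figure is just the offline threshold divided by the check-in interval. A tiny sketch, using the 30-second check-in and 2-minute threshold mentioned above (the function name is hypothetical):

```python
def checkins_missed_before_offline(offline_threshold_s, checkin_interval_s=30):
    """Number of consecutive check-ins that must be missed to reach the threshold."""
    return offline_threshold_s // checkin_interval_s


print(checkins_missed_before_offline(120))  # prints: 4
```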

    A quick Internet drop may be a minute. Anything longer and I'm going to want to know about it anyway.

    Can I FIX a problem in 2 minutes? Nope. I'm not THAT good! (grin)

    But if a server is down for 5 minutes, a client has already figured out that their server (or Internet connection) is down and has probably already called me.

    If I'm able to answer the phone and tell them that I already know it is down and have already gotten into Kaseya to see if the rest of their machines are offline (which means a likely Internet issue) or if just that one machine is offline (in which case it is probably a crashed server), then I'm a hero to them.

    If I just heard about it from them, then WHY are they paying me all that money every month to monitor and manage their systems?

    Or maybe it is just an Exchange store service that went offline and not the entire server. The client is probably going to call me pretty quickly to let me know that there's an issue with their e-mail. If I have to sit around with my thumb up my rear trying to figure out what happened, I'm going to look like an idiot. If, on the other hand, when they call I can tell them that their Exchange store went down and have the error information already in my hand, I can tell them that I'm already working on it... and I'm a hero. No QUESTION about why they're paying me money every month. ANY schmo can start looking for problems after they call to say there are problems.

    Legacy Forum Name: Monitor/Event Sets/SNMP Sets,
    Legacy Posted By Username: warever