I am wondering what others do when they perform maintenance on their Kaseya Server and need to reboot.
We have around 3,500 agents. When we reboot, I have to keep alarm generation turned off for about 25 minutes after the server is back up; otherwise we get slammed with offline alerts as the agents all fall over each other trying to check back in.
I read somewhere in the help that this is not supposed to happen, but it does. Do others experience this?
Before reboot: System - Configure - uncheck 'Enable Alarm Generation'.
Re-enable it after the reboot, once the agents are all back online.
Craig - that is exactly what I do at the moment. My question is why we have to do this when the documentation implies that it's not necessary. I am trying to locate that little nugget of info in the documentation, but can't find it yet. I think it said that the KServer knows to suppress offline alarms after it has been offline, which is not true at all in my experience.
I think you are mistaken. Standard procedure is to turn off alarm generation as Craig described. I have several thousand agents as well.
Yes, I agree that standard procedure is to turn off alarm generation during maintenance. That is exactly what I do as I have already said. The point I am trying to get across is that once the KServer is back online, we have to wait for the agents to settle down with their connections before turning alarm generation back on.
Perhaps we can submit a feature request for a "suspend alarm generation" window. I empathize with you: a single action would be more useful than two separate steps for disable followed by enable (or, even worse, forgetting to turn the damn thing back on).
Tonijo, I can restart my K server and not experience the issues you are having. In your case it sounds much more like your K server isn't fast enough to handle the workload of a lot of agents checking in, plus the backlog of scheduled procedures running.
Have you reviewed the system performance under System - Statistics to determine whether the K server and/or database engine aren't keeping up?
Hi Craig, our KServer is spec'd for 10,000 agents. We have had performance issues in the past, but they were sorted out by changing the standard Kaseya monitoring sets, as monitoring every service and checking the CPU too often bogged down the SQL Server. Our SQL box is separate from the KServer.
I have tried staggering the check-in times for servers and workstations to try and spread the load, but that hasn't helped.
So I just have to leave alarm generation turned off for around 25 minutes after the KServer is up and watch the online/offline numbers yo-yo before they settle down. Then it's safe to turn alerts back on.
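For anyone who would rather not eyeball the numbers, the "wait until the counts stop bouncing" check can be sketched in a few lines of Python. This is only an illustration: `has_settled`, `window`, and `tolerance` are made-up names and values, and how you collect the online-agent counts (for example, noting the figure once a minute) is left open, since nothing here talks to the KServer.

```python
def has_settled(online_counts, window=5, tolerance=10):
    """Return True when the last `window` samples differ by at most `tolerance`.

    online_counts: online-agent counts, oldest first (hypothetical input;
    how they are gathered is up to you).
    """
    if len(online_counts) < window:
        # Not enough samples yet to judge stability.
        return False
    recent = online_counts[-window:]
    return max(recent) - min(recent) <= tolerance


# Made-up samples taken after a reboot: the count swings, then levels off.
samples = [100, 3000, 1200, 3400, 3490, 3495, 3498, 3500, 3499]
print(has_settled(samples[:5]))  # still swinging
print(has_settled(samples))      # last five samples are within 10 of each other
```

The window and tolerance would need tuning to your own agent count; the idea is simply to turn alerts back on only after several consecutive samples agree.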
When I perform maintenance on our Kaseya server, I do it in three steps:
1) I go under Monitor -> Status -> Suspend Alarm and, for all workstations and servers, I put in the amount of time I have booked for maintenance so we receive no tickets (this is usually 4 hours).
2) I then go under Agent -> Configure Agents -> Suspend and suspend the Kaseya agents on all workstations and servers.
3) I then log into our firewall and turn off outside access to the Kaseya server.
I go ahead and apply patches, apply updates, and reboot. When I am finally done, I turn the outside access back on and remove the suspend on the agents. I then wait until the workstations and servers come online, remove the alarm suspends from a few machines, and test with those before removing them from all.
Maybe this is not the best way to do it, but it prevents us from getting hammered with alerts.
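If you ever script a procedure like the one above, the main thing to get right is that every step is undone in reverse order, even when the patching goes wrong, so nobody forgets to turn something back on. Here is a minimal Python sketch of that pattern using the standard library's `contextlib.ExitStack`. The name `run_maintenance` and the step labels are hypothetical, and the lambdas only write to a log; they stand in for whatever you actually do by hand in the Kaseya UI and on the firewall.

```python
from contextlib import ExitStack


def run_maintenance(steps, work):
    """Apply each (apply, undo) pair in order, run work(), then undo in reverse.

    The undo callables run even if work() or a later apply() raises.
    """
    with ExitStack() as stack:
        for apply, undo in steps:
            apply()
            # Registered callbacks run last-in, first-out when the block exits.
            stack.callback(undo)
        work()


# Placeholder steps that only record what would happen.
log = []
steps = [
    (lambda: log.append("suspend alarms"), lambda: log.append("resume alarms")),
    (lambda: log.append("suspend agents"), lambda: log.append("resume agents")),
    (lambda: log.append("block firewall"), lambda: log.append("open firewall")),
]
run_maintenance(steps, lambda: log.append("patch and reboot"))
print(log)
```

The sketch does not model the wait for machines to come back online or the trial removal on a few machines first; those would sit between resuming the agents and resuming the alarms.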