Kaseya Community

Agents going offline randomly

This question is answered

We have our server agents set to alert us if an agent hasn't checked in for 10 minutes.  For the last month or so I have noticed that agents will go offline and then come back online an hour or two later.  It usually happens between 12:30 AM and 3 AM.  I have checked the logs and can't find any clues as to why this is happening.  Has anyone else seen this?

All Replies
  • Yes. As a matter of fact, it just happened again between 12:10 and 12:16 PM MST for us. This is the fifth time in two weeks. We have been in contact with Kaseya support multiple times about this and are getting absolutely nowhere. A Kaseya technician even blamed it on our firewall/network infrastructure after we told him that our clients are geographically dispersed and all of them are reporting this. There is no way this is on our end.

    This can't keep happening; we even have clients of ours subscribed to these alerts, so it's very aggravating.

  • I've seen this happen when BUDR is running, as well as with some other offsite backups we run.  It tends to happen when the machine is heavily loaded or the network is saturated.  We see it very consistently on certain machines when the backup runs through the night.

  • I'm not sure what kind of setup DanGross has, but ours is a hosted SaaS Kaseya instance, and we do not currently use BUDR for any of our clients.

  • This has happened to us at various times over the years. In our case it was never network related, but was due to high CPU and/or disk usage on the VSA or SQL servers.

    Are you scheduling client audits, backups, AV scans or patch installs during this outage window perhaps?

    Also, if this is on virtual infrastructure, make sure the host is not the cause of the outages. Either the VMs could be getting backed up, or other VMs on the same host could be affecting yours.



  • We have our own Kaseya server. It happened again last night while I was working on another issue; I checked CPU and memory usage and both were normal.  Our schedule for audits, AV scans, and patch scans is spread over 8 hours, and this usually happens within a 2-hour window.

  • I know this may be an odd, and possibly stupid, question, but I don't know how many agents you're referring to or where they are located.  If the agents are at client locations, have you been able to narrow it down to any particular sites?  The reason I ask is that it could be ISP related at your clients' sites.  We've had instances in the past where a number of agents went offline repeatedly at certain times during the evening across numerous clients.  At first we assumed it was Kaseya related, but after further investigation we found that all of the affected clients were using the same ISP.  After discussing it extensively with the ISP, we worked with them to track down an issue on their end that was causing outages at roughly the same time each night.

    Of course, if all the agents are on premise with the KServer, that pretty much rules out this possibility.

  • So far I can't see any connection between the server agents that are going offline: different ISPs, different geography, different clients.

  • We have seen this on several occasions.

    At one location, we saw in the agent logs that the line was bouncing randomly. Most or all of the machines at that location would drop off for 10-30 minutes and then come back online. The ISP was notified and AT&T fixed the problem.

    On several other occasions, agents would drop offline at various clients, at various locations, at various times; some would check back in while others would not. We finally discovered that, now and then, when the VSA applied its updates, some of those updates would disconnect and/or block agent connections. There was no pattern, which made it harder to drill down on, but in the end the VSA updates turned out to be the cause. We disabled automatic updates and have not had the problem since. We now apply VSA updates manually and take more control of the process to keep the problem from coming back. It's so random that you spend more time trying to figure out what's going on than being productive.

  • We have had all agents drop their connections to the KServer during business hours, maybe once every two weeks. We've had it looked at multiple times, but it continues to happen. The server is not overloaded or anything, so it's very strange.

  • Last night we had 4 servers go offline at 12:53 across 3 different locations.  At one location 1 server went offline but the other 9 didn't, so that isn't ISP related.  At another location (hosted space in a datacenter) 2 of 5 went offline; one was off for 8 minutes and the other for over 2 hours.  Since they came back online at different times, I don't think it's related to the Kaseya server being busy.  All of the other managed servers, at 25 different clients in different states with different ISPs, stayed online.  Very strange.

  • Another question.  Have you checked the License Manager (License tab under System > Server Management) to make sure you're not running out of agent licenses for your KServer, and that none of the affected clients have per-client limits that are being exceeded?  If more agents are deployed than are allowed, agents will randomly stop checking in so that no more than 100% of the licensed/allowed agents are checking in at once.

  • I had a similar problem with servers dropping offline, but after checking the agent logs I found that the agents were up and running and simply couldn't connect to the KServer. After some digging, I found that a NIC setting was the cause.

    Under the LAN interface, go to Properties > Configure > Power Management,

    and uncheck "Allow the computer to turn off this device to save power".

    I found that Windows puts the NICs into a sleep-like mode, listening but not transmitting, which causes your agents to show as offline on the KServer.
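
    If you want to spot-check this across a batch of machines, here is a rough sketch of the idea (my own script, not anything from Kaseya): it assumes Python and PowerShell are on the box and reads the MSPower_DeviceEnable WMI class, which lists every device that supports this setting, so look for your NIC's driver name in the InstanceName column.

      # Sketch: report devices that Windows is allowed to power down to save power.
      # Assumes Windows with PowerShell available; run elevated on the agent machine.
      import json
      import subprocess

      PS_QUERY = (
          "Get-CimInstance -Namespace root/wmi -ClassName MSPower_DeviceEnable | "
          "Select-Object InstanceName, Enable | ConvertTo-Json"
      )

      def report_power_saving():
          raw = subprocess.run(
              ["powershell", "-NoProfile", "-Command", PS_QUERY],
              capture_output=True, text=True, check=True,
          ).stdout.strip()
          if not raw:
              print("No power-management info reported.")
              return
          devices = json.loads(raw)
          if isinstance(devices, dict):  # a single result is not wrapped in a list
              devices = [devices]
          for dev in devices:
              state = "CAN be turned off to save power" if dev["Enable"] else "power saving disabled"
              print(f"{dev['InstanceName']}: {state}")

      if __name__ == "__main__":
          report_power_saving()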

    I hope this helps; if this isn't your solution, then best of luck.

  • To echo danrche, I have also seen the issue he references on desktops; however, that "shouldn't?" be the case on servers.  That said, we chose to remedy the power management issue using the Kaseya Desktop Policy & Migration plugin, which made it easy to manage (again, on desktops), along with many other features.

  • Joshua,

    You may want to look for SQL bottlenecks; you should be able to schedule all of that throughout the day without any trouble. If you're tied to a SAN on the back end, make sure you've set up the multipath I/O (MPIO) driver in Windows, and look at your SQL disk layout: the more spindles, the better your I/O. Also split up your drives; putting C:\, tempdb, and ksubscribers on separate partitions should help with I/O issues on your DB.
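
    As a quick sanity check on your current layout, here is a rough sketch (mine, not Kaseya's; the server name is a placeholder and it assumes the pyodbc package) that just lists where the tempdb and ksubscribers files physically live:

      # Sketch: show physical file locations for tempdb and ksubscribers.
      # Swap YOUR_SQL_SERVER for your Kaseya SQL instance; the login needs
      # VIEW ANY DEFINITION (or sysadmin) to see all rows in sys.master_files.
      import pyodbc

      CONN_STR = (
          "DRIVER={ODBC Driver 17 for SQL Server};"
          "SERVER=YOUR_SQL_SERVER;"
          "Trusted_Connection=yes;"
      )

      QUERY = """
          SELECT DB_NAME(database_id) AS db, name AS logical_name, physical_name
          FROM sys.master_files
          WHERE DB_NAME(database_id) IN ('tempdb', 'ksubscribers')
          ORDER BY db, name;
      """

      with pyodbc.connect(CONN_STR) as conn:
          for db, logical_name, physical_name in conn.cursor().execute(QUERY):
              print(f"{db:<12}{logical_name:<24}{physical_name}")

    If those files are sitting on the same volume as each other or as the OS, that's usually the first thing to split out.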

  • We just had four servers report offline (9:16 PM MST). All four servers are completely fine, and they are split between two geographically separate locations. This is the sixth time in less than a month that this has happened.

    Most of the answers I'm seeing here are for on-premise Kaseya installs. We have a hosted SaaS instance.