I've just gotten our Server 2008 and SBS 2011 service monitoring lists/sets set up and have enabled them on our in-house 2008 servers. We're using K2/6.2. I'm now getting bombarded with tickets on services "not running", even though when I log in shortly thereafter to check, the services are started and running fine. It's only happening on specific services, such as ShellHWDetection and BITS, and my theory is that these services may just be restarted on occasion, even on healthy servers, for whatever reason. In particular, the tickets appear right on the hour (e.g. 6:01am, 7:02am, 8:00am), so this looks like something the server is doing on its own on a timer. We have no procedures running during those time spans, so it's not our management turning them off.
The problem is, even if the service is down for only very brief moments (like 5 seconds) we get a ticket, and the noise is aggravating. Is there a way to set up each monitor list item or the monitor set itself to only alert/ticket after "x" number of seconds, like 60 seconds or 120 seconds of a service being offline? This would eliminate the "noise" entirely and only really alert us if something "stays down".
Any assistance or suggestions would be appreciated.
Can you set some of those services to manual instead of automatic? I think ShellHWDetection and BITS can both function properly when set to manual.
We had the same problem (like everyone) so this is what we did.
Instead of getting your monitor set to create an alarm when a service stoppage is detected, we get it to run a script. That script waits 1 minute and then checks whether the service is still stopped. If it's still stopped, it creates an alarm; otherwise it does nothing, because the service is now running.
Yes, this takes some basic scripting, but it's the only way to achieve what you want.
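For anyone who wants to see the wait-and-recheck idea spelled out, here is a minimal sketch in Python. It is not the poster's actual script and it is not a Kaseya agent procedure. The names should_alarm and is_running_windows are made up for illustration, and the Windows check assumes that parsing the output of "sc query" is acceptable in your environment.

```python
import subprocess
import time
from typing import Callable


def is_running_windows(service: str) -> bool:
    """Assumption: ask Windows via 'sc query' and look for RUNNING in the output."""
    result = subprocess.run(
        ["sc", "query", service], capture_output=True, text=True
    )
    return "RUNNING" in result.stdout


def should_alarm(
    service: str,
    is_running: Callable[[str], bool],
    delay_seconds: float = 60.0,
    sleep: Callable[[float], None] = time.sleep,
) -> bool:
    """Wait, then re-check. Return True only if the service is still stopped."""
    sleep(delay_seconds)
    return not is_running(service)


if __name__ == "__main__":
    # Hypothetical usage: raise the alarm only if BITS is still down after 60 s.
    if should_alarm("BITS", is_running_windows):
        print("BITS is still stopped, create the alarm")
```

The check and the sleep are passed in as parameters so the logic can be tried out without touching a real service.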
@Michael - How do you do this across multiple services in one monitoring set? For example if I monitor *ALL (all automatic services) in one monitoring set and I want to run a script to see if services are actually down or if they've just been restarted quickly, then I imagine I'd need a pretty huge script that will check each service and then alert if one of the services is still stopped. And since each machine would have slightly different services, then you'd have to build a script for each machine.
Maybe I'm missing something though, so if you could provide more info on how you have your alerting set up, maybe that would clear things up for me.
In a perfect world you'd create individual monitor sets, one per service. Then use Policy Management to deploy them to the server if the service is running and set to start automatically. Obviously the deployment of the relevant policies is automatic (how you achieve that is more magic).
You might say that's heaps of work, but you just export the first one, change the name and service name, and re-import it, so it's not as bad as using the interface to make them from scratch.
That's one way to achieve what you want. There is of course more than one way to skin a cat.
And then you have to create a matching script for each individual monitor set. That definitely sounds like a lot of work. And if you have a new client come on board then you have to check each server for new services that you haven't built monitoring sets for, create the monitor set and then create the script.
In my perfect world, Kaseya would have the ability to alert if the service is down for more than x number of checks.
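To make the "down for more than x checks" wish concrete, here is a rough Python sketch of that logic. Kaseya does not expose anything like this as far as this thread establishes; the class name ConsecutiveDownTracker and its threshold parameter are purely illustrative.

```python
from collections import defaultdict


class ConsecutiveDownTracker:
    """Signal an alert only after a service fails `threshold` checks in a row."""

    def __init__(self, threshold: int = 3) -> None:
        if threshold < 1:
            raise ValueError("threshold must be at least 1")
        self.threshold = threshold
        self._down_counts = defaultdict(int)

    def record(self, service: str, running: bool) -> bool:
        """Record one check result. Return True exactly when the threshold is hit."""
        if running:
            # Any successful check resets the streak, so quick restarts are ignored.
            self._down_counts[service] = 0
            return False
        self._down_counts[service] += 1
        return self._down_counts[service] == self.threshold
```

Because the counter is kept per service name, one tracker could cover an *ALL style monitor set without a separate script per service.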
We have one script that gets called by everything, so no, I don't have to create a matching script for each one.
If you did bring on a new client with new services you'd have to add them, but that is no different from now anyway.
Yes, in a perfect world it would work out of the box. lol
By monitoring all automatic services, you don't have to add new monitoring sets for services when you onboard a new server. That being said, you do have to ensure that each newly onboarded server doesn't have any services set to start automatically that are often in a stopped state (shellhwdetection and tbs are two that I see quite a lot).
I'm wondering how you do that check with just one script for all the different monitoring sets? Are you able to pass the service name from the monitor set into the script somehow? If so, and you're willing to share your work, I'd love to take a look at it.
Well, the script that gets run does some remediation, i.e. it tries to restart the service and then waits 6 minutes, or until the next business day, before it checks whether the service is still stopped and creates an alarm. That just about always deals with the backups at night, or with services that get restarted all the time when it's not actually an issue. It also means engineers on site can restart services without creating alarms back at base.
Yes, it is possible to pass the name of the service to the script in a roundabout way.
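As a rough illustration of the restart-then-wait flow described above, here is a small Python sketch. It is a guess at the general shape only, not the poster's script: the function remediate_then_check and its parameters are hypothetical, the service name is simply passed in as an argument, and the "next business day" rule is left out.

```python
import time
from typing import Callable


def remediate_then_check(
    service: str,
    is_running: Callable[[str], bool],
    restart: Callable[[str], None],
    wait_seconds: float = 360.0,  # 6 minutes, as in the post above
    sleep: Callable[[float], None] = time.sleep,
) -> str:
    """Try a restart, wait, then decide. Returns 'recovered' or 'alarm'."""
    if is_running(service):
        # The service came back on its own before we did anything.
        return "recovered"
    restart(service)
    sleep(wait_seconds)
    return "recovered" if is_running(service) else "alarm"
```

The same function handles every service because nothing in it is specific to one service; only the name passed in changes.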
I'm quite interested in how you pass the service name to the script. Any desire to share with the class?
Well, it gets pretty complicated for me. I don't administer Kaseya; we have someone else for that. I get paid to extend Kaseya's functionality to provide my company with a point of difference. So while I would love to tell you everything I've done in detail, I would no longer have that point of difference. Also, one paragraph on how to do something can be worth many months of pondering, and my company needs to see a return on that investment. To some degree I've already told you too much: now that you know it's possible, you'll probably try to work out how and achieve it, whereas before you may not have thought it was possible and therefore never tried.
I would like to point out I'm NOT a guru of Kaseya, but I have been asked by Kaseya to help people out on the forum for the good of us all, and I'm glad to do so and frankly enjoy it :)
You can pass the service name with the variable #ln# (that is a lowercase L).
www.superchargeyourmsp.com and subscribe for a free trial.
Essentially we have the complete answer to the above post