Just wondering what the best way is for a new Kaseya user to build up comprehensive event log monitoring. We currently alert on a number of critical events, like Event ID 6008 (unexpected shutdown), etc. However, a client's Exchange went down, which was logged in the event log, but we were not notified. I am not happy with this, as I need to make sure what we advertise (proactive) is what we deliver.
Two-Three ways of doing this I guess...
These solutions have their downsides, but what do you recommend as a way forward? Can anyone advise on where I can grab a list of error IDs for a given source? Even eventid.net doesn't have a list. I'd happily pay a few hundred for a list too!
Let me know what you guys think, and thank you in advance for any comments, it is very much appreciated. If anyone would like to assist, I am willing to pay a consultancy fee for one-to-one remote assistance in getting our Kaseya working properly for us.
This is one horrible chestnut and I am sure you will get a multitude of different responses to this.
What we have done is take the approach that we just can't miss anything, so we collect everything. We then have a global block list and Priority 1 / 2 / 3 lists; everything else goes to a different location.
The global block list is everything we don't want to see. I am currently working on a way to get this down to the exclusion XML that you can put on the machines so that Kaseya does not even collect these events.
The Priority 1 / 2 / 3 lists are event sets that define events with different priorities, and we do different things based on the priority. P1 goes directly to our monitoring board in ConnectWise and gets a red P1 stamp. P2 goes to the monitoring board but gets a P2 stamp. P3 goes to a proactive board in CW and gets a normal stamp.
All the rest of the noise goes to another mailbox that is filtered through. We still have lots of work to do on it, but we are getting there. We are going to implement email2db and then parse all the "other stuff" into a DB that we can report on, so we can see the noisy machines and then try to fix them.
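To make the routing above a bit more concrete, here is a rough Python sketch of the decision order: block list first, then the priority lists, then the catch-all mailbox, plus a simple count of which machines generate the most "other stuff". This is only an illustration, not how Kaseya or ConnectWise work internally. The `Event` fields, the destination names and every list entry except Event ID 6008 (mentioned in the first post) are made-up placeholders.

```python
from collections import Counter
from dataclasses import dataclass


@dataclass(frozen=True)
class Event:
    machine: str
    log: str        # e.g. "System" or "Application"
    source: str
    event_id: int


# Hypothetical lists keyed by (source, event_id). Real contents would come
# from your own event sets; these entries are placeholders only.
GLOBAL_BLOCKLIST = {("ExampleNoisySource", 1001)}
PRIORITY = {
    ("EventLog", 6008): 1,             # unexpected shutdown, from the first post
    ("ExampleBackupSource", 2002): 2,  # placeholder
    ("ExampleDiskSource", 3003): 3,    # placeholder
}
DESTINATIONS = {
    1: "monitoring-board-P1",
    2: "monitoring-board-P2",
    3: "proactive-board",
}


def route(event: Event) -> str:
    """Pick a destination: block list first, then priority lists, then the catch-all."""
    key = (event.source, event.event_id)
    if key in GLOBAL_BLOCKLIST:
        return "drop"
    priority = PRIORITY.get(key)
    if priority is not None:
        return DESTINATIONS[priority]
    return "other-mailbox"


def noisiest_machines(events, top=3):
    """Count catch-all events per machine so the noisy ones stand out."""
    counts = Counter(e.machine for e in events if route(e) == "other-mailbox")
    return counts.most_common(top)
```

Anything that is not on a list still ends up somewhere visible, which is the whole point of the "collect everything" approach.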
Yes, it is a messy solution, but we agreed we want to see it all and then fix it rather than have a blinkered view of what is going on. If you say "yes, I have all the critical IDs", I think you would be fooling yourself. I know, for instance, that in Win2k8 they changed the IDs for security auditing, so you have to stay on top of the game, whereas we just see the events and then decide.
Anyway, that's us :o)
Exactly, thank you for the helpful response mmartin.
For reference how would I do the following in Kaseya:
Monitor all event IDs in Application and System for Error status, and ignore noise that isn't important?
Also, we rely on the Kaseya alerts dashboard at the moment, as we are currently trying to integrate email2db with a custom ticketing system. How could we flag important things like 6008 events as alerts, and send the rest as just e-mails (noise) that we can filter out and use discretion on?
I don't like eventlog monitoring on workstations just yet, I will get round to it but with that it's like signing yourself up for spam.
To monitor everything, use the event set "All Events" for the selected event log (so App or Sys), then pick error / warning / information. Once you apply this to the machine, you are monitoring all events of the selected type in the selected event log.
To ignore events, create an event set, call it blocklist, and add the event IDs or source / description etc. of the ones you want to ignore. Tick the ignore box and then add. Apply this to the machine as well, and remember to apply it against the particular event log and for the right type (error / warning).
Note - if you are applying an ignore event set, you must apply it first and then the all events set (something quirky about that).
This means when Kaseya sees the event, it looks in blocklist first. If it finds it there, it moves on; otherwise it goes to all events and sends it on to you.
Note also - under log settings (I think in the Agent menu) you must make sure you are collecting Error / Warning for both app and system, otherwise Kaseya will not collect the events to be able to alarm on them.
If you apply an event set to a machine and you don't have the collection enabled, I think it will highlight it in red so you know something is amiss.
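If it helps to picture why the order matters, here is a small Python model of the behaviour as described in this thread: event sets are checked in the order they were applied, the first match wins, and nothing can alarm unless that log and type are being collected. This is purely a mental model and an assumption on my part, not Kaseya's actual code. The class names, the `collected` set and the example source/ID are hypothetical.

```python
from dataclasses import dataclass
from typing import Callable, Iterable, Optional, Set, Tuple


@dataclass(frozen=True)
class Event:
    log: str       # "Application" or "System"
    level: str     # "Error", "Warning", "Information"
    source: str
    event_id: int


@dataclass
class EventSet:
    name: str
    matches: Callable[[Event], bool]
    ignore: bool = False   # models the "ignore" tick box


def first_match(event: Event, assigned: Iterable[EventSet]) -> Optional[EventSet]:
    """Return the first assigned event set that matches, in the order applied."""
    for event_set in assigned:
        if event_set.matches(event):
            return event_set
    return None


def should_alarm(event: Event, assigned: Iterable[EventSet],
                 collected: Set[Tuple[str, str]]) -> bool:
    """Alarm only if the log/type is collected and the first match is not an ignore set."""
    if (event.log, event.level) not in collected:
        return False   # never collected, so there is nothing to alarm on
    hit = first_match(event, assigned)
    return hit is not None and not hit.ignore


# Placeholder block list entry; a real one would hold your own noisy events.
NOISE = {("ExampleNoisySource", 1001)}
blocklist = EventSet("blocklist", lambda e: (e.source, e.event_id) in NOISE, ignore=True)
all_events = EventSet("all events", lambda e: True)
```

With `[blocklist, all_events]` the noisy event is swallowed by the ignore set. With `[all_events, blocklist]` the catch-all matches first and the ignore set is never reached, which is one way to explain the quirk above.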
Very helpful, confirmed what we have so far. Just wasn't sure about ignore lists.
Basically we currently have:
Windows Base Events (error)
Windows Backup Monitoring Events (error)
and so on... which all generate an alarm.
Below that you're saying we could add Global Noise Blacklist, ignore things as mentioned, then a monitor all set?
Can I just check with you on the: "Ignore additional alarms for xx"
In theory, if I set 1 Day, I wouldn't be notified about other alerts would I? Or is it per eventid?
Ignore for 1 day means that event will not alarm again for 1 day, but all others will.
On a clean machine, you put the ignore set first, then Windows Base Events, then Windows Backup Monitoring Events, then all events.
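A quick way to picture the "ignore additional alarms" behaviour described above is a suppression window kept per event rather than per machine. The Python sketch below is only an illustration of that idea. In particular, keying the window on machine + source + event ID is my assumption; I have not confirmed exactly what Kaseya keys it on.

```python
from datetime import datetime, timedelta


class AlarmSuppressor:
    """Suppress repeat alarms for the same event within a time window."""

    def __init__(self, window: timedelta):
        self.window = window
        self._last_alarm = {}   # (machine, source, event_id) -> time of last alarm

    def should_alarm(self, machine: str, source: str, event_id: int,
                     now: datetime) -> bool:
        key = (machine, source, event_id)   # assumed key, see the note above
        last = self._last_alarm.get(key)
        if last is not None and now - last < self.window:
            return False                    # same event, still inside the window
        self._last_alarm[key] = now
        return True
```

So with a 1 day window, a second 6008 an hour later stays quiet, while a different event on the same machine still alarms straight away.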
Thank you. Think I'm sorted on that. What I can do for any urgent ones is build a list and add them to the alarmed sets so we get a red spot and a ticket for those.
Many thanks mmartin. Can I ask if you do the same for service monitoring - automatic services? I.e. report if any automatic services fail or stop? Or do you do this on a per-service level? I kind of like the idea of getting everything, as it means nothing is ever missed.
However, if anyone else would like to chip in before I close this discussion it would be most welcome?
heading home now so I will let the others from across the pond do some talking.
For service monitoring we built our own application - it's a small exe that runs on the servers and alerts via the event logs. We don't use Kaseya's service monitoring, as you can either monitor ALL with no exceptions or you have to manually select services, add them to a monitor set and apply it to servers. With our app we can add exclusions at a global level or at a machine level, and it monitors all automatic services only.
Best of luck anyway.
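For anyone wanting to build something similar, the core logic of "all automatic services, minus global and per-machine exclusions" is small. The Python sketch below is not the app described above, just a guess at the shape of it. The service list is passed in rather than queried from Windows, and the `Service` fields, machine name and exclusion entries are all hypothetical placeholders.

```python
from dataclasses import dataclass
from typing import Iterable, List


@dataclass(frozen=True)
class Service:
    name: str
    start_type: str   # "auto", "manual" or "disabled"
    state: str        # "running" or "stopped"


# Placeholder exclusions; real ones would be services you expect to be stopped.
GLOBAL_EXCLUSIONS = {"ExampleTriggerStartService"}
MACHINE_EXCLUSIONS = {"srv-example-01": {"ExampleLegacyService"}}


def failed_automatic_services(machine: str, services: Iterable[Service]) -> List[str]:
    """Names of automatic services that are not running and are not excluded."""
    excluded = GLOBAL_EXCLUSIONS | MACHINE_EXCLUSIONS.get(machine, set())
    return [
        s.name
        for s in services
        if s.start_type == "auto"
        and s.state != "running"
        and s.name not in excluded
    ]
```

Each name returned could then be written to the event log so the normal event log alerting picks it up.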
Just to throw my thoughts in here.....
We do exactly what has been suggested already: use an ignore list and monitor for all events. At one point we were monitoring for specific event IDs, but missing two RAID failures in a week caused us to change our monitoring (it was an atypical RAID controller and its source/ID had not been added to the watch list).
So now we monitor for all Application / Directory Service / DNS / File Replication / System critical, error and warning events (but only on servers; on workstations it is mainly disk events and user profile events). We have an extensive ignore list that is continually being added to as I document each event we see and process. I have a SharePoint list that is far from complete but has a decent number of events, with an explanation of what happened, any applicable links to documentation, and whether we want to alert or ignore and why. I also included resolution steps for the event if we are monitoring it.
As for service monitoring, we do not use the Auto Learn option or a *all option. Instead when we take on a client we check each server for Exchange/SQL and any other critical client apps that may run as a service. We then add the specific service monitor. I have universal monitors for Exchange/SQL (if installed), and basic critical Windows services.
This is what happened to us. An exchange failure was missed, which caused me to look into this in more detail. We are not monitoring everything and building up an ignore list.
We also have a custom database (not SharePoint) that does exactly the same. It allows us to keep track of jobs and the resolutions to them.
Thank you for all the comments, it has been really helpful to find out what others do. We're new to Kaseya, but every day we're getting more from it, so that is good.
Sorry, I meant we are NOW monitoring everything. :-)