Kaseya Community

More interesting SQL and custom reporting, Event ID's and Performance

  • Hi guys and girls, another post similar to my last,

    After i wrote that query to pull in events i wondered exactly what i could do with it - turns out it is quite extensible and quite powerful. 

    I thought to create a "Event ID percentage score" similar to the patch score given in the default Kaseya reports. 

    If you run the following Query in Microsoft Excel, and create a pivot table for it.... you get an accurate percentage of the EventID's that should consume the most of your time

    select b.displayname, a.eventid, a.source, b.groupName, count(a.agentguid) cnt
    from ntEventLog20120717 a, machnametab b
    where a.agentguid = b.agentguid
    and a.eventid not like '25'
    and a.eventid not like '26'
    and a.eventid not like '30'
    and a.eventid not like '31'
    and a.eventid not like '32'
    and a.eventid not like '34'
    and a.eventid not like '35'
    and a.eventid not like '36'
    and a.eventid not like '37'
    and a.eventid not like '45'
    and a.eventid not like '47'
    and a.eventid not like '50'
    and a.eventid not like '53'
    and a.eventid not like '54'
    and a.eventid not like '55'
    and a.eventid not like '59'
    and a.eventid not like '60'
    and a.eventid not like '61'
    and a.eventid not like '62'
    and a.eventid not like '63'
    and a.eventid not like '70'
    and a.eventid not like '77'
    and a.eventid not like '100'
    and a.eventid not like '101'
    and a.eventid not like '102'
    and a.eventid not like '103'
    and a.eventid not like '107'
    and a.eventid not like '108'
    and a.eventid not like '109'
    and a.eventid not like '114'
    and a.eventid not like '118'
    and a.eventid not like '120'
    and a.eventid not like '124'
    and a.eventid not like '126'
    and a.eventid not like '128'
    and a.eventid not like '130'
    and a.eventid not like '134'
    and a.eventid not like '135'
    and a.eventid not like '141'
    and a.eventid not like '142'
    and a.eventid not like '145'
    and a.eventid not like '148'
    and a.eventid not like '152''
    group by b.groupName, b.displayname, a.eventid, a.source
    order by cnt desc

    From there, you take the sum of the top 5 occuring event ID's filtered by the site you want to check, and divide it by the grand total. This site for example, i'd say was "90.00%" healthy.

    Of course, health is a made up figure, but i can justify it easy enough to my customer.

    Out of 20 servers this customer had 5 events occuring on 2 servers that made up 90% of their errors. Thats justification enough to me.

    Of course, with that query, there are 2 things. 
    1. Change the date each day
    2. The "a.EventID not like '##' - the actual query i use has 1000 or so event ID's to ignore - remember that the table being queried is events collected not alarms, so the ignore sets don't apply. 

    Next up, Performance Monitoring, this one i'm pretty proud of. I've yet to come across a good way of reporting performance. 

    The Query - This only shows Performance Monitor > Memory Counters. Audit your performance sets to get counter names. The Data Dump The Pivot Table
    SELECT     b.displayName, a.monitorCounterId, a.counterValue, b.groupName, COUNT(a.agentGuid) AS cnt, monitorCounter.name
    FROM monitorCounterLog20120717 AS a INNER JOIN
    machNameTab AS b ON a.agentGuid = b.agentGuid INNER JOIN
    monitorCounter ON a.monitorCounterId = monitorCounter.monitorCounterId
    WHERE monitorCounter.name like 'Available MBytes < 256'
    and counterValue < 256
    and counterValue not like '-999'
    and counterValue not like '-998'

    or monitorCounter.Name like 'Page Reads/sec'
    and counterValue > 90

    or monitorCounter.name like 'Page Writes/sec'
    and counterValue > 90

    or monitorCounter.name like 'Pages/sec'
    and counterValue > 150

    or monitorCounter.name like '% Committed Bytes In Use'
    and counterValue > 85

    GROUP BY b.groupName, b.displayName, a.monitorCounterId, a.counterValue, monitorCounter.name
    ORDER BY cnt DESC

    A few things to note about this

    1. The counter value is the greater or less than value it takes for our performance monitoring to exceed a threshold
    2. In the Datadump / work sheet in excel, you need to create a "Percentage" column. This is, the count of the times the thresholds were exceeded  divided by the number of times you collected that data.
    2A. We collect each performance metric 96 times a day, or even 15 minutes. If the threshold was exceeded 9.6 times, then i can safely say the server was busy 10% of the day

    3.  When you pivot from that table on the fields in the above table - you will see which servers were really busy for that period of time.

    I know it's a bit confusing to wrap your head around - it took me ages to figure all this out, but it was a worth while process. 

    Any queries or additions? Let me know if i have a flaw in my logic or i missed anything :)

    Cheerio All,

  • well that didn't work - for those interested - right click and save the pictures as you need to view them