Kaseya Community

Monitoring the Monitor

  • I have been searching for a good answer to this but have yet to find one. Has anyone come up with a good way to Monitor the "Monitor Sets" in Kaseya so you can be alerted when they are not working on the servers you are trying to monitor?

    For context, we have a base of 5 monitor sets that we apply to all servers, and we then set up alerts that fire if they go beyond the thresholds. Lately we have been finding that on some servers these monitors are in a not-responding state, and we have to troubleshoot the monitor set.

    The issue is that we do not want to check the monitor sets every day to make sure they are collecting data.

  • Bumping this one - we would love to know as well.  I am having trouble understanding why there is no error checking in monitor set deployment.  You apply the policy, it's compliant, but as far as I am concerned, it's NOT compliant if the monitor sets are not receiving data.

    I have had one this morning like this - the Agent had failed to deploy any perfmon counters.  Removed the agent, cleaned up files, deployed agent, perfmon counters now created.  But still no data coming through.  It's really hard to troubleshoot WHY.

  • Hello,

    At the current time there is no feature within Kaseya to alarm when a monitor set is not responding or not returning data in X amount of time.

    However, the following SQL query can be run in SQL Server Management Studio against the KSubscribers database to find up to 800 monitor counters that are not responding or have not returned data in the last 24 hours.

    (This query is a fairly basic template and can be changed and customized.)

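    -- counters whose last value is older than 24 hours, or whose value looks like a not-responding code
    select top 800 b.machname, b.agentguid, a.monitorcounterid, a.agentguid, a.countervalue, a.eventdatetime
    from monitorcounterlogsummary a
    left join machnametab b on a.agentguid = b.agentguid
    where a.agentguid = b.agentguid
    -- parentheses keep the OR from bypassing the agentguid match (AND binds tighter than OR)
    and (DATEDIFF(hour, a.eventDateTime, GETDATE()) > 24 or a.counterValue like '%999%')
    order by machname desc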

    Please note - This SQL query will only point out the monitor sets that may have stale data or are in a not responding state.

    If there is an issue on the endpoint itself, such as the Perfmon counters failing to start, the following articles should assist in troubleshooting:

    http://community.kaseya.com/kb/w/wiki/692.aspx

    http://community.kaseya.com/kb/w/wiki/1192.perfmon-counter-fails-to-start.aspx

    To add on to these articles: the best way to verify the status of counters is to run a "logman query" command on these endpoints, either from the command prompt or via KLC > CMD.

    It returns a list of all counters on the machine, whether running or stopped.

    If any are stopped, you can try to restart them manually with the following command:

    "Logman Start KCTR$XXXX"

    If a counter does not start when you run this manually, an error is recorded in the Event Logs on that machine.

    The Kaseya Agent will not be able to start counters that are throwing an error, because something on the endpoint itself is preventing them from starting.

    ~

    Going forward, according to the Kaseya roadmap, KNM will be included in the core monitoring. KNM uses WMI to query values instead of Perfmon counters.

    http://community.kaseya.com/p/roadmap.aspx

    Kind Regards,

    Nicolas
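
The "logman query" check described in the post above could be scripted rather than read by eye. The following is a minimal Python sketch, not a Kaseya feature: the function name `stopped_kaseya_counters` and the sample text are hypothetical, and it assumes the output uses the usual three-column layout (name, type, status) and that the Kaseya counters keep the `KCTR$` prefix mentioned above.

```python
from __future__ import annotations


def stopped_kaseya_counters(logman_output: str, prefix: str = "KCTR$") -> list[str]:
    """Return the names of collector sets with the given prefix whose status is Stopped."""
    stopped = []
    for line in logman_output.splitlines():
        parts = line.split()
        # Expect at least name, type and status; this skips the header,
        # the separator rule, blank lines and the trailing status message.
        if len(parts) < 3 or not parts[0].startswith(prefix):
            continue
        if parts[-1].lower() == "stopped":
            stopped.append(parts[0])
    return stopped


# Hypothetical sample text, assumed to resemble real `logman query` output.
SAMPLE = """\
Data Collector Set                      Type                          Status
-------------------------------------------------------------------------------
KCTR$1001                               Counter                       Running
KCTR$1002                               Counter                       Stopped
SomeOtherSet                            Trace                         Stopped

The command completed successfully.
"""

if __name__ == "__main__":
    for name in stopped_kaseya_counters(SAMPLE):
        # Print the restart command from the post above; nothing is executed here.
        print(f"logman start {name}")
```

In practice the text would come from running "logman query" on the endpoint and capturing its output; how that output is collected and alerted on is left open here.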

  • You can use the following query as an example to find non-working counter monitors

    DECLARE @Agentguid numeric
    SET @Agentguid = 0123456789 -- set this to the agent to be queried

    ;WITH cte AS
    (
        SELECT  monitorcounterid,
                countervalue,
                -- rank each counter's values by most recent sample; partitioning by agent as well
                -- keeps agents separate when the agentguid filter below is left commented out
                Rn = ROW_NUMBER() OVER (PARTITION BY agentguid, monitorcounterid
                                        ORDER BY MAX(eventdatetime) DESC)
        FROM    vmonitorcounterlog
        WHERE   eventdatetime >= DATEADD(day, -1, GETDATE())
        --AND   agentguid = @Agentguid
        GROUP BY agentguid, monitorcounterid, countervalue
    )
    SELECT  mc.name,
            mc.counterscript,
            cte.countervalue
    FROM    cte
    INNER JOIN monitorCounter mc ON cte.monitorCounterId = mc.monitorCounterId
    WHERE   Rn = 1
    AND     cte.counterValue IN (-998, -999)   -- comment this line out to return all values for the agentguid
    GROUP BY mc.name, mc.counterScript, cte.countervalue;
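
The idea behind the query above is "take the most recent value for each counter, then keep only the ones reporting -998 or -999". As an illustration only, here is a small Python sketch of that same logic on in-memory rows. The `Sample` type, the function name and the demo rows are hypothetical; the only things taken from the thread are the two sentinel values and the notion of grouping by agent and counter.

```python
from __future__ import annotations

from datetime import datetime, timedelta
from typing import Iterable, NamedTuple

NOT_RESPONDING = {-998, -999}  # sentinel values filtered on in the query above


class Sample(NamedTuple):
    agent_guid: int
    counter_id: int
    value: float
    when: datetime


def latest_not_responding(samples: Iterable[Sample]) -> list[Sample]:
    """Keep the newest sample per (agent, counter), then return those with a sentinel value."""
    latest: dict[tuple[int, int], Sample] = {}
    for s in samples:
        key = (s.agent_guid, s.counter_id)
        if key not in latest or s.when > latest[key].when:
            latest[key] = s
    flagged = [s for s in latest.values() if s.value in NOT_RESPONDING]
    return sorted(flagged, key=lambda s: (s.agent_guid, s.counter_id))


# Made-up demo rows: agent 1 ends on -999, agent 2 recovers after a -999.
T0 = datetime(2014, 1, 30, 7, 0)
DEMO = [
    Sample(1, 10, 42.0, T0),
    Sample(1, 10, -999, T0 + timedelta(hours=1)),
    Sample(2, 10, -999, T0),
    Sample(2, 10, 17.0, T0 + timedelta(hours=2)),
]

if __name__ == "__main__":
    for s in latest_not_responding(DEMO):
        print(s.agent_guid, s.counter_id, s.value, s.when)
```

Keying on both the agent and the counter matters: if the same counter ID is shared by several agents, keying on the counter alone would report only one of them.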

  • Would it be possible to get this into a SQLREAD step and then use it within a procedure to automate the check?

  • Create a view with the following query and you can use that with an agent procedure sqlread() step

    WITH cte AS
    (
        SELECT  monitorcounterid,
                countervalue,
                -- newest sample first for each counter on each agent
                Rn = ROW_NUMBER() OVER (PARTITION BY agentGuid, monitorcounterid
                                        ORDER BY MAX(eventdatetime) DESC),
                agentGuid,
                eventDateTime
        FROM    vmonitorcounterlog
        WHERE   eventdatetime >= DATEADD(day, -1, GETDATE())
        GROUP BY agentguid, monitorcounterid, countervalue, eventDateTime
    )
    SELECT  cte.agentGuid,
            mc.name,
            mc.counterscript,
            cte.countervalue,
            cte.eventDateTime
    FROM    cte
    INNER JOIN monitorCounter mc ON cte.monitorCounterId = mc.monitorCounterId
    WHERE   Rn = 1
    GROUP BY cte.agentGuid, mc.name, mc.counterScript, cte.countervalue, cte.eventDateTime
    HAVING  cte.counterValue = -998 OR cte.counterValue = -999

  • Where do you create the view? Do you have an example?

  • I used the attached script to find machines with monitors that are not responding.

    Upload collectlogs.sql into managed files and schedule the script on the KServer.

    You need to replace the email ID in the script.

    CollectLogs.sql
    6215.Procedure MonitorNotResponding.xml

  • We took a different approach to this problem: we have an agent procedure that writes a batch file, which checks the modification dates of the %workingdirectory%\logs\klog*.csv files and alerts us if none of them has been modified in a day.
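
The file-age check described above can be sketched in a few lines. This is not the poster's batch file, which is not shown in the thread; it is a hypothetical Python equivalent, assuming the logs directory is passed in by the caller and that "a day" means 24 hours since the newest file's last modification. The function name `logs_are_stale` is made up for illustration.

```python
from __future__ import annotations

import tempfile
import time
from pathlib import Path

ONE_DAY = 24 * 60 * 60  # seconds


def logs_are_stale(log_dir, pattern: str = "klog*.csv",
                   max_age_seconds: float = ONE_DAY, now: float | None = None) -> bool:
    """Return True if no file matching the pattern was modified within max_age_seconds."""
    now = time.time() if now is None else now
    # Newest modification time among matching files, or None if there are none.
    newest = max((p.stat().st_mtime for p in Path(log_dir).glob(pattern)), default=None)
    return newest is None or now - newest > max_age_seconds


if __name__ == "__main__":
    # Demo against a throwaway directory instead of a real agent working directory.
    with tempfile.TemporaryDirectory() as d:
        print("empty directory stale?", logs_are_stale(d))
        (Path(d) / "klog1.csv").write_text("sample")
        print("fresh file stale?", logs_are_stale(d))
```

Treating a directory with no matching files as stale is a design choice in this sketch: an agent that never created its counter logs is as much a problem as one that stopped writing them.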

  • I'm getting data back from this query but when I check the monitor sets for the machines listed, all the monitor sets are up to date.  Not sure what monitor sets this query is referencing.  Could this query be pulling up old data and/or monitor sets that are no longer active?

  • This is the query that has worked for us:

    select distinct
        b.displayname,
        c.name,
        a.countervalue,
        a.eventdatetime
    from monitorcounterlogsummary a
    left join monitorCounterAgent d on a.monitorCounterId = d.monitorCounterId
    left join machnametab b on a.agentguid = b.agentguid
    left join monitorCounter c on a.monitorcounterid = c.monitorcounterid
    where (DATEDIFF(day, a.eventDateTime, GETDATE()) > 7 AND DATEDIFF(day, a.eventDateTime, GETDATE()) < 30)
    and d.activeFlag = 1
    order by
        b.displayname,
        c.name,
        a.countervalue,
        a.eventdatetime

  • @rmeyer: that's a good idea! Only problem is that you won't be able to detect when kaseya stops correctly reporting on *services*, not just counters. We've got a periodic report that runs to check this, but it's quite annoying.

  • I would like to bump this as a suggested integrated function of Kaseya as well.

    I had a problem with monitors not working about 2 years ago that led to a client being quite upset that we didn't know about their problem.  I worked with a few very helpful and knowledgeable folks at Kaseya (who are no longer with the company) to correct the problem, and I was told about future improvements to proactively identify this issue, but none were ever implemented.

    Just this week I had another problem that we didn't know about because services monitoring was not working.  I have the monitors working again, but I am worried about the next time it breaks.

    I very much hope that this will be addressed.  I find it frustrating that end users must come up with workarounds to discover when monitoring software is not working.



    Correct wording.
    [edited by: eperson at 7:03 AM (GMT -8) on Jan 30, 2014]
  • Totally agree with eperson, it boggles my mind that we have to find workarounds to ensure alerting works properly, when the entire point of alerting is to enable automated notification of issues. So we can't depend on it for the core purpose of what it is supposed to provide.

    We've also been burned by alerting not working properly, and we've (tried) to invoice Kaseya for what it cost us.

  • I see a few workarounds being offered by various members and a Kaseya employee as a way to make progress toward the ultimate goal.

    For this I thank you!  

    For those who are concerned: submit your suggestions or issues to the new management team, in the hope that they will address your concerns.