I'm using Service Desk for automated remediation of events. Each event has a "remediation time" assigned - use 15 minutes for this discussion.
When an event arrives, the DeDup procedure fires and any duplicate events within the 15-minute remediation time are treated as dups. The dup is logged in the ticket associated with the first event, and the dup is ignored.
In addition to this, I have some logic to detect "repeating" events. Each event writes a time-stamped entry to a SQL table that includes the monitor set ID, Agent Guid, and other values to uniquely identify the event. The first stage of the Service Desk process writes this entry, then requests a count of identical events from the last 4 hours, storing the result into a variable called RepeatCount. The following steps are then performed:
If RepeatCount is 1, we progress to another stage that invokes the identified remediation task.
If RepeatCount is greater than or equal to 5, the ticket is marked as "Repeating", the remediation task is disabled, the priority is set to High, and a ticket is generated in ConnectWise with "REPEATING" status. The remediation stage is bypassed.
If the remediation is enabled at this point, the RepeatCount must be between 2 and 4 - the repeat count is written to the ticket log, but remediation is still permitted and we escalate to the remediation stage.
Depending on the result of the remediation stage (if performed), a Closed (remediate success) or New (remediate failed) ticket is sent to ConnectWise.
Bottom line - first time we remediate with no repeat count reported, next 3 times we remediate with a repeat count reported, and 5 or more, we don't remediate and send a NEW/REPEATING ticket to ConnectWise.
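To make the thresholds concrete, here is a minimal Python sketch of the intended decision rules. It is not Service Desk code, only an illustration of how the comparison should behave with real numbers; the function name `triage` and the dictionary keys are made up for this example.

```python
def triage(repeat_count: int) -> dict:
    """Map the 4-hour repeat count to the intended handling (illustrative only)."""
    if repeat_count >= 5:
        # Repeating: skip remediation and flag the ticket as REPEATING.
        return {"remediate": False, "log_count": True, "repeating": True}
    if repeat_count >= 2:
        # 2 to 4 occurrences: note the count in the ticket, still remediate.
        return {"remediate": True, "log_count": True, "repeating": False}
    # First occurrence: remediate without reporting a repeat count.
    return {"remediate": True, "log_count": False, "repeating": False}


for count in (1, 3, 5, 10):
    print(count, triage(count))
```

With a numeric comparison, a count of 10 lands in the same branch as 5 through 9, which is the behavior the process expects.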
Today I had a situation where a monitor set was applied to a non-domain computer, and the NetLogon service continuously reported failures. This exposed a problem in the above logic.
While the value of RepeatCount was between 5 and 9, the "repeating" logic if statement fired properly, no remediation was attempted, and a New/Repeating ticket was created. (This is de-dup'd in ConnectWise, BTW). When the RepeatCount hit 10 and above, it behaved as if it were between 2 and 4 - always reporting the repeat count in the SD ticket, attempting remediation, and creating a New ticket in ConnectWise instead of one with Repeating status.
There is a SQLQuery command that returns a Count of matching items to the RepeatCount variable. The next line only seems to work with values of 5-9: If CheckVariable("#RepeatCount#") Is Greater Than Or Equal To "5"
I'm sure I could rewrite the logic to use "Less than 6" but it seems strange that this basic math comparison is failing with double-digit values.
I appreciate any insight that can be offered!
Offhand, my guess would be that it's using "string" comparison rather than numeric comparison...
And in "string" comparison... 15 is less than 5... Because you start by comparing the leftmost character first, and 5 is greater than 1...
Of course the real test would be to generate a "repeating" event and make it repeat 50 times to see if "50" is greater than or equal to "5"...
I would wonder what happens (if it's even possible), if you leave out the quotes to make it
If CheckVariable(#RepeatCount#) Is Greater Than or Equal to 5
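The string-comparison guess can be illustrated outside Kaseya with a short Python sketch. Python compares strings character by character, which may or may not be exactly what Service Desk does internally; that part is an assumption. The helper names `string_ge` and `numeric_ge` are invented for this example.

```python
def string_ge(value: str, threshold: str) -> bool:
    """Compare as text: characters are compared left to right."""
    return value >= threshold


def numeric_ge(value: str, threshold: str) -> bool:
    """Compare as numbers: convert both sides to integers first."""
    return int(value) >= int(threshold)


for count in ("4", "5", "9", "10", "15", "49", "50"):
    print(count, "string:", string_ge(count, "5"), "numeric:", numeric_ge(count, "5"))
```

Under text comparison, "5" through "9" pass the test and "10" through "49" fail it because "1" to "4" sort before "5". "50" would pass again, which is why the 50-repeat test would be a telling one.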
My other post has been moderated, so it won't show up until later :)... However in further testing I wasn't able to duplicate this... But I would still bet it has to do with it doing a string comparison rather than a numeric one. From the help file:
"Values in variables are stored as strings, so compared numbers must be of equal string length"
... Just for the heck of it, could you try changing from the If CheckVar to If Eval()?
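The help-file note about equal string length suggests one possible workaround: pad both values to the same width before comparing them as text. The Python sketch below only shows why equal-length strings compare correctly; whether the SQL query or the Service Desk variable can be padded this way is an assumption, and `padded_ge` is a hypothetical helper.

```python
def padded_ge(count: int, threshold: int, width: int = 3) -> bool:
    """Text comparison of zero-padded values, e.g. '010' >= '005'."""
    left = str(count).zfill(width)       # 10 -> '010'
    right = str(threshold).zfill(width)  # 5  -> '005'
    return left >= right


for count in (4, 5, 10, 100):
    print(count, padded_ge(count, 5))
```

This only holds while every value fits within the chosen width; a count with more digits than `width` would bring the original problem back.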
That (string compare issue) was suggested by my peer here at the office. I bet that would work if the If Eval() was an option, but the available statements in Service Desk/Stage Entry & Exit don't include If Eval(). Only the CheckVariable(), inReopenTicket(), isWithinCoverage(), testIncidentCustomField(), and testIncidentProperty() statements are available.
I've changed the logic to (pseudocode):
If CF:CW_Status <> "None" ; If ConnectWise Status is New or Closed
  If CF:Actionable = "Yes" ; If event is Actionable
    RepeatCount = SQLQUERY:SDRC_GetCount ; get count of matching Repeating entries
    SQLNonQUERY:SDRC_Create ; Add new Repeating entry
    If #RepeatCount# < 5 ; not yet considered "REPEATING"
      If RepeatCount > 0 ; Has occurred 1+ times in past 4 hours, so note in ticket
        AddNote "...occurred [=RepeatCount=] times..."
    Else ; considered "REPEATING"
      Actionable = No
      Priority = Repeating
      AddNote "...occurred [=RepeatCount=] times and has REPEATING status..."
I have my "bad" monitor set re-applied and currently have 9 tickets (initial and 8 repeats) from that host... as of now everything after "4" is identified as "repeating" and no remediation is being performed - at the current rate, I should see the result in the next 30 minutes when the count hits 10.
If this doesn't work, I may add a new SQL statement that returns True or False if the count exceeds 4 and use that result in the "Is Repeating" test. I'll use both SQL calls - one to log the number of repeats in the ticket and the other to decide on the Repeating status.
Sure would be nice if the command availability was consistent between SD and Agent Procedures. :)
I'll post the result of the test in progress shortly..
Back to the drawing board, since the procedure thinks that 10 is less than 5, too. :(
Yeah, I keep forgetting that the Service Desk module doesn't support all the same code, since we don't use that module much. I've tried to use it like you are, as a "filter" for some things before they hit our ConnectWise service board, but too often I run into stuff like this that just frustrates my efforts. Between that and the fact that ConnectWise and Kaseya decided to give up on the two-way ticketing integration and leave it up to DevIO (which is expensive, and I haven't heard of many people having great success with it), I find myself barely utilizing the Service Desk module at all.
Well, I can report success, now that I avoid Kaseya's If test with regard to math ops.
I still call the SQL query that returns a count, but now just use it for reporting and to test equality for zero. I wrote another SQL query that returns 1 if the count is 4 or less and zero otherwise. So now, the logic is:
If RepeatStatus=1 ; not repeating yet - 4 or fewer times in 4 hours
  If RepeatCount=0 ; is first time
    AddLog "Initial event" ; a note for debugging, mostly
  AddLog "... [=RepeatCount=] times in 4 hours..."
Else ; is repeating
  set repeating priority, clear actionable status, set CW_Status, write log with count, etc...
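For anyone wanting to try the same idea, here is a rough Python/SQLite sketch of pushing the comparison into SQL so the procedure only has to test for equality. The table name `repeat_log`, its columns, and the function names are all hypothetical, and for brevity the count and the 1/0 flag come back from a single query here rather than the two separate queries described above.

```python
import sqlite3
from datetime import datetime, timedelta


def make_db() -> sqlite3.Connection:
    """Create an in-memory table standing in for the repeat-tracking table."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE repeat_log (monitor_set_id TEXT, agent_guid TEXT, logged_at TEXT)"
    )
    return conn


def record_event(conn, monitor_set_id, agent_guid, when):
    """Write a time-stamped entry for one event."""
    conn.execute(
        "INSERT INTO repeat_log VALUES (?, ?, ?)",
        (monitor_set_id, agent_guid, when.isoformat(timespec="seconds")),
    )


def repeat_info(conn, monitor_set_id, agent_guid, now):
    """Return (RepeatCount, RepeatStatus) for the last 4 hours.

    RepeatStatus is 1 when the count is 4 or less and 0 otherwise, so the
    caller never compares multi-digit numbers as text.
    """
    cutoff = (now - timedelta(hours=4)).isoformat(timespec="seconds")
    return conn.execute(
        "SELECT COUNT(*), CASE WHEN COUNT(*) <= 4 THEN 1 ELSE 0 END "
        "FROM repeat_log "
        "WHERE monitor_set_id = ? AND agent_guid = ? AND logged_at >= ?",
        (monitor_set_id, agent_guid, cutoff),
    ).fetchone()


conn = make_db()
now = datetime(2024, 1, 1, 12, 0, 0)
for minutes_ago in range(10):
    record_event(conn, "ms-1", "guid-1", now - timedelta(minutes=minutes_ago))
print(repeat_info(conn, "ms-1", "guid-1", now))
```

The database does the arithmetic with real integers, and the only values the If test sees are single characters, where text and numeric comparison agree.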
All of our alerts pass through ServiceDesk, where we check for dups, repeats, assign status/priority, determine if a remediation action is possible, perform and evaluate the remediation result when present, and finally decide if and what kind of ticket to forward to ConnectWise, which is our primary tech interface for ticket management. The techs aren't even aware of tickets in Kaseya.
Thanks for your ideas!