The initiative & opportunity
I'm considering our options for a big data / analytics initiative in our business. We have the benefit of being able to leverage AWS Datalakes, and we are looking at front ends for analysing data. We are a ConnectWise shop, so that part of our business is catered for; I'm now turning my attention to Kaseya. I know the Kaseya database pretty well, so I know what we can get out of it. Patch / Audit data is a given: I'd just point our tooling at that data and away we go.
Then there is Event log, Syslog and KNM data. I'll start with event logs. If you collect event logs for too long, your database size per agent gets large, and the first thing we hear from support is to get that number down. I want every event log ever generated on all systems we support, going back to the dawn of time. True Big Data. True insights into all systems we manage. Imagine being able to look at the event logs an average retail device generates, and using that data in your next sales pitch for a similar customer. Spread that over 100 customers. First foot in the door, second foot to follow. The same goes for Syslog data and KNM data. We can't currently leverage KNM's database easily as it's not in the Kaseya DB; it's a separate DB. Getting that data into a datalake should be a "simple enough job".
Should Kaseya consider an initiative as part of the platform to capture data without impacting production performance? Has anyone here considered, or implemented, an initiative to capture event log data beyond "7 days"?
I'd love it if Kaseya had an initiative of its own, or made an acquisition of a small / startup BI company. My current thinking is to keep 3 days of data and have a script that targets my DB and copies the data to a separate table not linked to the agent, to keep the data away from the production DB and reduce the performance hit.
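A minimal sketch of the "keep 3 days, copy the rest elsewhere" idea described above. It uses an in-memory SQLite database as a stand-in for the real SQL Server instance, and the table names (event_log, event_log_archive) and columns are hypothetical, not Kaseya's actual schema.

```python
import sqlite3
from datetime import datetime, timedelta

RETENTION_DAYS = 3  # days of events to leave in the production table


def archive_old_events(conn, now, retention_days=RETENTION_DAYS):
    """Move rows older than the retention window into the archive table.

    Returns the number of rows moved. Table and column names are
    hypothetical stand-ins, not the real Kaseya schema.
    """
    cutoff = (now - timedelta(days=retention_days)).isoformat()
    with conn:  # one transaction: the copy and the delete commit together
        conn.execute(
            "INSERT INTO event_log_archive (agent_id, event_time, event_id, message) "
            "SELECT agent_id, event_time, event_id, message "
            "FROM event_log WHERE event_time < ?",
            (cutoff,),
        )
        cur = conn.execute("DELETE FROM event_log WHERE event_time < ?", (cutoff,))
    return cur.rowcount


def demo():
    """Build a tiny database, archive it, and return (moved, live, archived)."""
    conn = sqlite3.connect(":memory:")
    cols = "(agent_id TEXT, event_time TEXT, event_id INTEGER, message TEXT)"
    conn.execute("CREATE TABLE event_log " + cols)
    conn.execute("CREATE TABLE event_log_archive " + cols)
    now = datetime(2024, 1, 10, 12, 0, 0)
    # Five events aged 0, 1, 2, 5 and 9 days; timestamps are ISO-8601 text,
    # which sorts correctly as plain strings.
    rows = [
        ("agent-1", (now - timedelta(days=d)).isoformat(), 4624, "logon")
        for d in (0, 1, 2, 5, 9)
    ]
    conn.executemany("INSERT INTO event_log VALUES (?, ?, ?, ?)", rows)
    moved = archive_old_events(conn, now)
    live = conn.execute("SELECT COUNT(*) FROM event_log").fetchone()[0]
    archived = conn.execute("SELECT COUNT(*) FROM event_log_archive").fetchone()[0]
    return moved, live, archived
```

Running the copy and the delete in a single transaction means a failed run should not lose events. On a production database the same job would probably need to work in batches to avoid long locks, which this sketch does not attempt.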
We've captured just event logs (primarily security logs) for a couple of hundred machines, and attempted to keep these for a year. Unfortunately, Kaseya's DB doesn't scale that well, and by 5TB it was impacting our ability to use Kaseya for normal purposes. Even after we've optimised the underlying compute power, I suspect we won't get to 10TB without the performance being a significant problem for normal functions. This was only with about 3 months of data too - the idea of keeping a single year of all logs in Kaseya has become impractical.
The biggest hurdle for us is the Kaseya DB structure. Event logs have always been modular: the message text for a given event is static, and only the insertion strings change from one event to the next. So a stored event description is 99% redundant. In Vista and above, the XML-based event logs are even better for storage, as they break up the fields for you, so there is even more information for less work. Unfortunately, Kaseya stores the entire message in plain text, so it is massively redundant, which makes searching slow.
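To illustrate the point about insertion strings, here is a rough sketch of storing the static template once and keeping only the varying strings per event. The template text is a shortened, made-up stand-in for a real event description, and the TEMPLATES dictionary and function names are hypothetical; only the %1, %2 placeholder style follows how Windows message templates refer to insertion strings.

```python
import json
import re

# Hypothetical template store: one entry per (source, event_id), kept once.
# The text is an illustrative stand-in, not the real event description.
TEMPLATES = {
    ("Security", 4624): (
        "An account was successfully logged on. Account: %1 Logon type: %2"
    ),
}


def compact(source, event_id, insertion_strings):
    """Keep only the fields that vary between events."""
    return {"source": source, "event_id": event_id, "strings": list(insertion_strings)}


def render(record):
    """Rebuild the full message from the shared template and the stored strings."""
    template = TEMPLATES[(record["source"], record["event_id"])]
    strings = record["strings"]

    def substitute(match):
        index = int(match.group(1)) - 1  # %1 refers to the first insertion string
        # Leave the placeholder untouched if no matching string was stored.
        return strings[index] if 0 <= index < len(strings) else match.group(0)

    return re.sub(r"%(\d+)", substitute, template)


def size_comparison(record):
    """Return (bytes of the full message, bytes of the stored strings only)."""
    full = len(render(record).encode("utf-8"))
    small = len(json.dumps(record["strings"]).encode("utf-8"))
    return full, small
```

For example, render(compact("Security", 4624, ["alice", "3"])) rebuilds the full sentence on demand, while only the two short strings need to be stored per event. Real event descriptions are far longer than this toy template, so the ratio in practice would differ.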
As an example, a complex SQL query (complex in the sense of being long and searching for string occurrences) ran for about 36 hours before I cancelled it. Less complex queries take between 1 and 3 hours, searching for fewer than three occurrences of a string in the message field. When we search for too many text strings at once, it takes 100% CPU on an 80-core system!
At the moment we're focusing on writing a basic front-end for a SIEM component, and then will have to look at writing our own SIEM, or purchasing another product. It is really a shame, as the information is being collected by Kaseya, and the things we are looking for in the logs are all well documented, but we may have to spend money to duplicate the process with another commercial product, because of poor database design.
In the end, we are extracting data once per day into a separate DB, where we get down to <100M records, which is much more searchable than 1B+ records.
@eyeTSystem, can I ask what system specs you have, and for how many agents? We also have some trouble with collecting large amounts of event logs.
Mark Boyd (mark David Boyd), I will comment with a description of how we did that. The only thing I still have trouble with is that KNM is worthless for SNMP data, because it doesn't store the string. I just ended up with an engineer who pointed me to a feature request, but I will take this up with my sales rep first, because of the lack of functionality. (Traverse has the same problem with reports.)
At the moment our collection is just 40 MB per agent (62 days of event data), but we keep our eyes on the tables and filter many events.
The data from KNM is stored in Record Manager, and we can't manage to get the SNMP data into a database ...
Will update with more information soon :)
We are running on an 80-core Xeon system with 140GB RAM and an FC-connected SAN. The system itself is quite speedy, but the design means even this hardware can't deal with the volume of data being stored as strings in the database.
We are averaging 26.1GB/machine, and that is with <90 days of logs and lots of audit logging. While we could trim down the logs we are collecting, having only some of the data available would be an issue for incident response.
The typical answer from the community will be "you don't need that much data", but let me give this example. The CEO of an organisation had a malware incident because he required admin rights (read: desired them without cause and, being the CEO, insisted he got them). The malware was running on his system for some time, under both the system security context and the CEO's security context. A brief examination of the process's activity showed that it was reaching out to internet hosts on SMB ports. We were able (after running a week of queries against this slow DB) to prove with certainty that there was no data exfiltration occurring. Without these log files, we couldn't have made that affirmation.
If the event logs were stored properly, using the insertion strings, I estimate a 90% space saving on 1B+ records, which would fix our problem entirely.
To me the SNMP monitoring in Kaseya is confusing, and as you say, isn't particularly useful, so we haven't persisted. We've recently been looking for alternatives - if you have suggestions, let me know!
26.1 GB per machine is an absurd amount of data for each individual machine in your database, and I can definitely see performance hits every time something tries to query that amount of data for each machine.
I would not currently recommend using Kaseya VSA as a log storage utility, especially if you are trying to store every log from the beginning of time. The database was not designed to be an infinitely scalable and searchable log stash.
If you need to store and maintain enormous amounts of logs for long periods of time (big data), I would recommend pursuing open source stacks like Elastic's Elasticsearch, Logstash and Kibana, "ELK" (https://www.elastic.co/products/kibana).
Alternatively, there are paid log stash options that are truly designed for collecting, storing, and reviewing logs (Splunk/ArcSight).
KNM SNMP and Classic Monitor SNMP are very different; however, both should have some sort of data in the database to query.
If you have questions on either, feel free to PM me, as I don't want to derail the thread.
I don't disagree that the large size of the database is a problem; however, if the data were stored properly, there wouldn't be much of a problem (2.6GB/machine is much better than 26GB, which would be the approximate saving if the DB were redesigned). SQL can easily deal with this data set, and would yield reasonable search times if we only had to search on four fields (e.g. event time, event ID, event source, computer name). I agree that Kaseya isn't a SIEM, but it's already got log collection and aggregation sorted, so it would be great if we could use it for the analysis of large event sets. Especially when the logs we are collecting currently cover less than 90 days.
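A small sketch of what searching on four structured fields could look like, again using in-memory SQLite as a stand-in for SQL Server. The events table, its columns and the idx_events_lookup index are hypothetical. The point is only that equality and range filters on indexed columns can use a composite index, whereas a leading-wildcard LIKE over a free-text message column cannot use an ordinary B-tree index.

```python
import sqlite3

# Query restricted to the four structured fields; no free-text matching.
QUERY = (
    "SELECT event_time, strings FROM events "
    "WHERE computer_name = ? AND event_source = ? AND event_id = ? "
    "AND event_time >= ? AND event_time < ? "
    "ORDER BY event_time"
)


def build_demo_db():
    """Create a tiny hypothetical events table with a composite index."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE events ("
        "computer_name TEXT, event_source TEXT, event_id INTEGER, "
        "event_time TEXT, strings TEXT)"
    )
    # Equality columns first, the range column (event_time) last.
    conn.execute(
        "CREATE INDEX idx_events_lookup "
        "ON events (computer_name, event_source, event_id, event_time)"
    )
    rows = [
        ("PC-01", "Security", 4624, "2024-01-01T08:00:00", '["alice"]'),
        ("PC-01", "Security", 4625, "2024-01-01T08:05:00", '["bob"]'),
        ("PC-02", "Security", 4624, "2024-01-02T09:00:00", '["carol"]'),
        ("PC-01", "Security", 4624, "2024-02-01T08:00:00", '["alice"]'),
    ]
    conn.executemany("INSERT INTO events VALUES (?, ?, ?, ?, ?)", rows)
    return conn


def find_events(conn, computer, source, event_id, start, end):
    """Return (event_time, strings) rows matching the four-field filter."""
    return conn.execute(QUERY, (computer, source, event_id, start, end)).fetchall()


def query_plan(conn, computer, source, event_id, start, end):
    """Return SQLite's plan description for the same query as one string."""
    rows = conn.execute(
        "EXPLAIN QUERY PLAN " + QUERY, (computer, source, event_id, start, end)
    ).fetchall()
    return " ".join(row[-1] for row in rows)
```

Calling query_plan with the same arguments as find_events shows whether the planner picked the index. Other database engines expose their plans differently, so treat that part as specific to this SQLite stand-in.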
In any event, we're also at the same point, where the Kaseya DB is a temporary measure until we can get a better solution in place.
eyeTSystems - That is a crazy amount of data! Lordy, I'm envious, I wish I had that much, but the problem would remain for us, and I agree with Nicolas (not in a mudslinging way) - that amount of data would kill most systems.
I agree with the comment in this thread that it's essentially a lost opportunity for Kaseya here - they could dominate the Big Data market and make companies like LogStash or Secureworx look completely irrelevant, if they were smart about their big data capturing.
The solution to me would be a new module called "Insights" or something similar, where every bit of data that is ever pulled into the Kaseya database is stored in a detached DB and you can pick and choose what archiving you want. Event Logs, Syslogs, SNMP data, all of it. What you do with the data, and what analytics tools you put over it, is entirely up to you, but the ability to capture the data and not impact production systems would be amazing.