I am re-evaluating how much data I collect in my monitor sets, my alarm thresholds, and so on.
I realised after carefully planning for a new KServer that we:
1. Don't collect enough accurate and useful Performance Monitor data.
2. Even if we wanted to collect accurate and useful data, the sheer volume doesn't scale if I wanted to see a year's worth of performance data.
The simple problem is that even collecting on a sample interval of 60 minutes, a year's worth of performance data would mean millions of extra rows in the database (more on that in another post to come).
Anyway, my question is this:
If I have a sample interval on a monitor set of 60 minutes, and I set the collection threshold to -1, does it:
a) send 1 piece of data every 60 minutes, or
b) send 3,600 pieces of data every 60 minutes?
I am hoping for the former: I want it to collect the data, whatever the value, once every 60 minutes.
If I can do this, the data becomes useful over a yearly period of monitoring.
For those interested, I am going to write an article titled "Collecting performance monitor data and analysing it", or something to that effect. When I have finished writing it I'll post a link to my blog here.
Hoping for some good feedback. Cheers, Mark.
I'm thinking that it will collect 1 piece of data every 60 minutes. You can test this for sure by checking the KLOG file that your monitor set produces in c:\ktemp\logs\. I believe that these log files are then sent back to the KServer for processing.
I think that you'd be able to set up two monitoring sets, one that checks every 5 minutes and another that checks every 60 minutes. Then you'd have one graph that shows a week or two at high resolution and one that shows a year at a lower resolution. My only concern is that the yearly graph's lower resolution may not be accurate enough: it could miss lots of spikes and may not be representative of the average usage that your 5-minute checks are seeing.
IMHO a better solution (and one that I would love Kaseya to adopt) would be to use RRDTool as their data collection, storage and analysis tool.
Here's how RRDTool handles this kind of problem:
You may log data at a 1-minute interval, but you might also be interested in the development of the data over the last year. You could do this by simply storing the data in 1-minute intervals for the whole year. While this would take considerable disk space, it would also take a lot of time to analyse the data when you wanted to create a graph covering the whole year. RRDtool offers a solution to this problem through its data consolidation feature. When setting up a Round Robin Database (RRD), you can define at which interval this consolidation should occur, and what consolidation function (CF) (average, minimum, maximum, total, last) should be used to build the consolidated values (see rrdcreate). You can define any number of different consolidation setups within one RRD. They will all be maintained on the fly when new data is loaded into the RRD.
Round Robin Archives
Data values of the same consolidation setup are stored into Round Robin Archives (RRA). This is a very efficient manner to store data for a certain amount of time, while using a known and constant amount of storage space.
It works like this: if you want to store 1,000 values at a 5-minute interval, RRDtool will allocate space for 1,000 data values and a header area. In the header it stores a pointer telling which slot in the storage area was last written to. New values are written to the Round Robin Archive in, you guessed it, a round robin manner. This automatically limits the history to the last 1,000 values (in our example). Because you can define several RRAs within a single RRD, you can set up another one for storing 750 data values at a 2-hour interval, for example, and thus keep a log for the last two months at a lower resolution.
The use of RRAs guarantees that the RRD does not grow over time and that old data is automatically eliminated. By using the consolidation feature, you can still keep data for a very long time, while gradually reducing the resolution of the data along the time axis.
Using different consolidation functions (CF) allows you to store exactly the type of information that actually interests you: the maximum one minute traffic on the LAN, the minimum temperature of your wine cellar, the total minutes of down time, etc.
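To make the RRA example above concrete, here's a sketch (in Python, just assembling the command line; the file name `perf.rrd` and the data source `cpu` are hypothetical) of an `rrdtool create` call that keeps 1,000 raw 5-minute samples plus 750 two-hour averages and maxima:

```python
# Sketch: build an "rrdtool create" command matching the example above.
# Assumes a gauge-style counter in the 0-100 range (e.g. CPU %).
step = 300                      # base interval: 5 minutes, in seconds
cmd = [
    "rrdtool", "create", "perf.rrd",
    "--step", str(step),
    "DS:cpu:GAUGE:600:0:100",   # one data source; 600 s heartbeat, 0-100 range
    "RRA:AVERAGE:0.5:1:1000",   # 1,000 raw 5-minute values (~3.5 days)
    "RRA:AVERAGE:0.5:24:750",   # 750 averages of 24 steps = 2 h (~2 months)
    "RRA:MAX:0.5:24:750",       # 750 maxima at the same 2-hour resolution
]
print(" ".join(cmd))
```

The MAX archive is what catches the spikes that a pure-average consolidation would smooth away, which addresses the accuracy concern about low-resolution yearly graphs.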
Thank you SO much for such a large measured response, this has given me a lot to think about. I have since discovered the answer to my question. It will collect 1 piece of data every 60 minutes.
I created an Excel spreadsheet that shows the database writes to the KServer daily / monthly / yearly, given anywhere from 1, 2, 5, 10 up through 100, 500, 1,000 servers, and the numbers are MASSIVE.
At any one time we have 20 performance counters running, doing various things.
I would dearly love to graph this out over a yearly period, but I just discovered there are some real complexities to this: Excel doesn't like (until I find a workaround) graphing out 255 pieces of data on an X axis.
I will look at RRDTool and see how I can make use of it. I think you might be right: if Lua can be used to script against the data sets (excuse me if my terminology is incorrect), then Kaseya can definitely make use of it, especially as it is open source.
I am going to rest on this overnight and give you a more measured response.
In the meantime, have a look at http://www.scribd.com/doc/81414818/Database-Capacity-Planning to see the spreadsheet for exactly how much data would be collected over any given period of time.
At the moment we collect 1 piece of data every 60 minutes x 200 servers x 20 performance counters x 24 hours in a day x 365 days a year, or thereabouts; something like that, I forget.
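The spreadsheet linked above does this properly, but a quick back-of-the-envelope version of that multiplication, assuming the figures quoted (200 servers, 20 counters, one sample per hour):

```python
# Rough sizing of the performance-log table under the numbers above.
servers = 200
counters = 20
samples_per_day = 24            # one sample per counter per hour

rows_per_day = servers * counters * samples_per_day
rows_per_year = rows_per_day * 365
print(f"{rows_per_day:,} rows/day, {rows_per_year:,} rows/year")
# 96,000 rows/day, 35,040,000 rows/year
```

So even at a 60-minute interval, a year of retention lands in the tens of millions of rows, which is why a 30-day cap plus an archive table (or RRDTool's fixed-size archives) starts to look attractive.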
Note that I only keep 30 days of performance log data. I am going to write a stored procedure to get the data and pump it out to a different location, even just a table that gets yesterday's performance data written to it. I have read up on the limitations of querying a table with 50 million rows in it, and there aren't too many, other than that things should be indexed and you should expect the results to take a sh!tload of time.
Happy to continue fleshing ideas out here with the community if appropriate.
@Mark - I like the idea of writing a stored procedure that would move data from the production db off to another location. I'm wondering if somehow there's a way to pull performance data from MS SQL directly into RRDTool.
Alternatively, because RRDTool was built to keep historical data while staying within a set size limit, and is quite small itself, maybe it would be possible to set up rrdtool on managed machines and pull data from the Kaseya log files kept on each machine. The data could either be saved locally and/or onto an external storage device (usb/nas). That would also save a lot of processing by the DB.
Every night the files RRDTool creates could be pulled back to the KServer, or could be included in the client backups just in case.
Of course, all of this is just wishful thinking at this point, but the local option sounds like it could be done. RRDTool has a Windows binary (oss.oetiker.ch/.../rrdtool-1.2.30-win32-perl510.zip).