I'm wondering if anyone here has experience monitoring and making sense of the metrics the MS SQL servers return (via WMI). Specifically I have trouble with the 'Buffer Cache Hit Ratio' test.

The supplied test definition looks like this (in 95_wmi_ms_sqlsvr.xml):


    <probeConfig>
      <subType name="sqlsvr_stats_cachehit" enabled="true">
        <wmiObject>
          <prefix>Win32_PerfRawData_MSSQLSERVER_SQLServerBufferManager</prefix>
          <poll name="keyField" property="@"/>
          <poll name="ratioBase" property="Buffercachehitratio_Base"/>
          <poll name="input" property="Buffercachehitratio"/>
        </wmiObject>
        <filter name="objectName"
          pattern="^_Total$"
          action="reject"/>
        <filter name="ratioBase"
          pattern="(^$|^0$)"
          action="reject"/>
        <postProcess
          testName="SQL Server: Buffer Cache Hit Ratio"
          displayUnit="%"
          resultMultiplier="[[100/$ratioBase]]"
          directive="none"/>
      </subType>
    </probeConfig>


So what this is doing essentialy is dividing the Buffercachehitratio property by the Buffercachehitratio_base property and multiplying it by 100 thus getting a percentage.

The issue I observed (using a WMI browser) is that the value of both of these properties change in every second, so putting the Buffercachehitratio_base in the static 'resultmultiplier' results in bogus numbers (I routinely observe values like 300%).

I figured I could just create a composite test and do this calculation in real time, but even then the rapidly changing nature of these properties would not give me meaningful results. These values are not true counters, rather results from the past 1 second I assume. (so assuming 5 minute test intervals, I'd only get data for the past 1 second?)

Let me know if someone could explain to me how this should be properly calculated and tracked.

Regards,

Krisztian



.
[edited by: kmizser at 2:02 PM (GMT -7) on Sep 22, 2014]