BIND 9.5+ provides a rich set of metrics that a server admin can use to troubleshoot problems and to monitor the overall health and usage of an environment, but these details are only available locally, through the statistics file. Rather than write an agent that runs on the server, we opted to use the UCD-SNMP MIB and leverage the EXTERNAL test framework, so that arbitrary metrics from your BIND9 installation can be retrieved for monitoring within Traverse without putting undue load on your server.

The design goals were simple: provide metrics through a monitor that is easy to configure and has a low impact on the server.

To this end, we chose the external script configuration with a standard net-snmp snmpd. Sample config lines for this are :

Code:

exec bind-nxrrset /etc/snmp/bind96-getstats.pl --file=/var/named/data/named_stats.txt --prefix=name_server_statistics --item=queries_resulted_in_nxrrset
exec bind-servfail /etc/snmp/bind96-getstats.pl --file=/var/named/data/named_stats.txt --prefix=name_server_statistics --item=queries_resulted_in_servfail
exec bind-nxdomain /etc/snmp/bind96-getstats.pl --file=/var/named/data/named_stats.txt --prefix=name_server_statistics --item=queries_resulted_in_nxdomain
exec bind-recursion /etc/snmp/bind96-getstats.pl --file=/var/named/data/named_stats.txt --prefix=name_server_statistics --item=queries_caused_recursion
exec bind-duplicates /etc/snmp/bind96-getstats.pl --file=/var/named/data/named_stats.txt --prefix=name_server_statistics --item=duplicate_queries_received
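
The five lines above differ only in their label and --item value, so they can be generated instead of typed by hand. The following is a minimal sketch in Python, not part of the original setup; the names ITEMS and exec_line are hypothetical, and the paths are simply the ones used in the sample lines.

Code:

```python
# Sketch: generate the snmpd 'exec' lines shown above.
# ITEMS and exec_line are illustrative names, not part of net-snmp or the script.
SCRIPT = "/etc/snmp/bind96-getstats.pl"
STATS_FILE = "/var/named/data/named_stats.txt"
PREFIX = "name_server_statistics"

# label used by snmpd -> counter name passed to --item
ITEMS = {
    "bind-nxrrset": "queries_resulted_in_nxrrset",
    "bind-servfail": "queries_resulted_in_servfail",
    "bind-nxdomain": "queries_resulted_in_nxdomain",
    "bind-recursion": "queries_caused_recursion",
    "bind-duplicates": "duplicate_queries_received",
}


def exec_line(label, item, script=SCRIPT, stats_file=STATS_FILE, prefix=PREFIX):
    """Build one 'exec' configuration line in the format used above."""
    return f"exec {label} {script} --file={stats_file} --prefix={prefix} --item={item}"


if __name__ == "__main__":
    # Dicts keep insertion order, so the lines come out in the order listed.
    for label, item in ITEMS.items():
        print(exec_line(label, item))
```

Keep the order of the generated lines stable once monitoring is in place, since snmpd numbers the entries by their position in the file.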

These entries tell snmpd to run the bind96-getstats.pl script (with the provided arguments) whenever the OIDs assigned to them are queried. The OIDs are assigned at run time, in the order in which the 'exec' configuration lines appear in the file. In this case we have only the 5 lines shown above, which produce the following snmpwalk output :

Code:

snmpwalk -v2c -c public localhost .1.3.6.1.4.1.2021.8.1 
UCD-SNMP-MIB::extIndex.1 = INTEGER: 1
UCD-SNMP-MIB::extIndex.2 = INTEGER: 2
UCD-SNMP-MIB::extIndex.3 = INTEGER: 3
UCD-SNMP-MIB::extIndex.4 = INTEGER: 4
UCD-SNMP-MIB::extIndex.5 = INTEGER: 5
UCD-SNMP-MIB::extNames.1 = STRING: bind-nxrrset
UCD-SNMP-MIB::extNames.2 = STRING: bind-servfail
UCD-SNMP-MIB::extNames.3 = STRING: bind-nxdomain
UCD-SNMP-MIB::extNames.4 = STRING: bind-recursion
UCD-SNMP-MIB::extNames.5 = STRING: bind-duplicates
UCD-SNMP-MIB::extCommand.1 = STRING: /etc/snmp/bind96-getstats.pl
UCD-SNMP-MIB::extCommand.2 = STRING: /etc/snmp/bind96-getstats.pl
UCD-SNMP-MIB::extCommand.3 = STRING: /etc/snmp/bind96-getstats.pl
UCD-SNMP-MIB::extCommand.4 = STRING: /etc/snmp/bind96-getstats.pl
UCD-SNMP-MIB::extCommand.5 = STRING: /etc/snmp/bind96-getstats.pl
UCD-SNMP-MIB::extResult.1 = INTEGER: 0
UCD-SNMP-MIB::extResult.2 = INTEGER: 0
UCD-SNMP-MIB::extResult.3 = INTEGER: 0
UCD-SNMP-MIB::extResult.4 = INTEGER: 0
UCD-SNMP-MIB::extResult.5 = INTEGER: 0
UCD-SNMP-MIB::extOutput.1 = STRING: 109351
UCD-SNMP-MIB::extOutput.2 = STRING: 8428
UCD-SNMP-MIB::extOutput.3 = STRING: 667201
UCD-SNMP-MIB::extOutput.4 = STRING: 778181
UCD-SNMP-MIB::extOutput.5 = STRING: 13407
UCD-SNMP-MIB::extErrFix.1 = INTEGER: noError(0)
UCD-SNMP-MIB::extErrFix.2 = INTEGER: noError(0)
UCD-SNMP-MIB::extErrFix.3 = INTEGER: noError(0)
UCD-SNMP-MIB::extErrFix.4 = INTEGER: noError(0)
UCD-SNMP-MIB::extErrFix.5 = INTEGER: noError(0)
UCD-SNMP-MIB::extErrFixCmd.1 = STRING: 
UCD-SNMP-MIB::extErrFixCmd.2 = STRING: 
UCD-SNMP-MIB::extErrFixCmd.3 = STRING: 
UCD-SNMP-MIB::extErrFixCmd.4 = STRING: 
UCD-SNMP-MIB::extErrFixCmd.5 = STRING:
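
The symbolic names in this walk map to numeric OIDs under the table entry that was walked (.1.3.6.1.4.1.2021.8.1): a column number followed by the index of the 'exec' line. The sketch below builds those OIDs in Python. The column numbers are taken from the UCD-SNMP-MIB definition of extTable; the helper name ext_oid is hypothetical.

Code:

```python
# Sketch: numeric OID for a column of the UCD-SNMP-MIB extTable.
# ext_oid is an illustrative helper, not part of net-snmp.
EXT_ENTRY = ".1.3.6.1.4.1.2021.8.1"

# Column numbers as defined for extEntry in UCD-SNMP-MIB.
EXT_COLUMNS = {
    "extIndex": 1,
    "extNames": 2,
    "extCommand": 3,
    "extResult": 100,
    "extOutput": 101,
    "extErrFix": 102,
    "extErrFixCmd": 103,
}


def ext_oid(column, index):
    """Return the numeric OID for `column` of the index-th 'exec' line (1-based)."""
    if index < 1:
        raise ValueError("exec indexes start at 1")
    return f"{EXT_ENTRY}.{EXT_COLUMNS[column]}.{index}"


if __name__ == "__main__":
    # bind-nxdomain is the third 'exec' line in the sample configuration.
    print(ext_oid("extOutput", 3))
```

Because the index is the position of the 'exec' line, inserting a new line above an existing one shifts the OIDs of every entry after it.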

The script itself is the meat of this process; it parses your statistics file and returns the individual counter specified on the command line. To begin using the script, save it to your snmpd configuration directory (typically /etc/snmp/ on a Linux installation), and edit it to set the location of your rndc binary if it differs from the default : $RNDC = "/usr/sbin/rndc";
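
To give a feel for the parsing step, here is a rough sketch in Python. It is not the Perl script itself. It assumes the statistics file layout written by BIND 9.5+/9.6: each dump starts with a "+++ Statistics Dump +++ (timestamp)" line, sections are introduced by "++ Section Name ++", and each counter line is a number followed by a description. The key format (lower case, words joined by underscores) is inferred from the script's debug output, and the SAMPLE text is made up for illustration.

Code:

```python
import re

# Assumed layout of the BIND statistics file (see the note above).
DUMP_START = re.compile(r"^\+\+\+ Statistics Dump \+\+\+ \((\d+)\)")
SECTION = re.compile(r"^\+\+ (.+?) \+\+$")
COUNTER = re.compile(r"^\s*(\d+)\s+(\S.*)$")


def normalize(text):
    """Lower-case a label and join its words with underscores."""
    return re.sub(r"[^a-z0-9]+", "_", text.lower()).strip("_")


def latest_dump(lines):
    """Return (timestamp, {"prefix:item": value}) for the last dump found."""
    stamp, stats, prefix = None, {}, None
    for raw in lines:
        line = raw.rstrip()
        m = DUMP_START.match(line)
        if m:
            # A new dump starts: discard counters from any older dump.
            stamp, stats, prefix = int(m.group(1)), {}, None
            continue
        m = SECTION.match(line)
        if m:
            prefix = normalize(m.group(1))
            continue
        m = COUNTER.match(line)
        if m and prefix is not None:
            stats[f"{prefix}:{normalize(m.group(2))}"] = int(m.group(1))
    return stamp, stats


# Made-up sample in the assumed layout, for illustration only.
SAMPLE = """\
+++ Statistics Dump +++ (1262304000)
++ Name Server Statistics ++
              109418 queries resulted in nxrrset
                8428 queries resulted in SERVFAIL
++ Zone Maintenance Statistics ++
                  55 IPv4 IXFR requested
--- Statistics Dump --- (1262304000)
"""

if __name__ == "__main__":
    stamp, stats = latest_dump(SAMPLE.splitlines())
    for key, value in stats.items():
        print(f"{key}={value}")
```

Only the last dump in the file is kept, because every run of 'rndc stats' appends a new dump to the end of the file.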

The script will call 'rndc stats' for you to create and update the statistics file when necessary. The default interval is 300 seconds: the script parses the stats file for the most current data set, and if no set is found within the interval, rndc is called to update the stats file and the script tries once more. Any error produces an output of "-1"; if this occurs, run the script by hand with the --debug flag. NOTE: The script will NOT manage the statistics file for you, and the file will continue to grow. You will need to craft a cron entry to manage its size separately.
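
The check-then-refresh behaviour described above can be sketched roughly as follows. This is an illustration in Python, not the Perl script's actual code. It assumes each dump in the statistics file begins with a "+++ Statistics Dump +++ (timestamp)" line carrying a Unix timestamp; the function names, and the refresh and clock parameters (added so the logic can be tried without a running named), are hypothetical.

Code:

```python
import re
import subprocess
import time

RNDC = "/usr/sbin/rndc"  # same default location the script uses

# Assumed header line of each dump in the statistics file.
DUMP_START = re.compile(r"^\+\+\+ Statistics Dump \+\+\+ \((\d+)\)")


def last_dump_time(path):
    """Return the timestamp of the newest dump in the file, or None."""
    stamp = None
    try:
        with open(path) as fh:
            for line in fh:
                m = DUMP_START.match(line)
                if m:
                    stamp = int(m.group(1))  # later dumps overwrite earlier ones
    except OSError:
        return None
    return stamp


def is_fresh(stamp, now, interval=300):
    """True if a dump exists and is no older than `interval` seconds."""
    return stamp is not None and now - stamp <= interval


def ensure_fresh(path, interval=300, refresh=None, clock=time.time):
    """Make sure the file holds a recent dump, asking rndc for one if needed.

    Returns False where the real script would print -1.
    """
    if refresh is None:
        refresh = lambda: subprocess.run([RNDC, "stats"], check=True)
    if is_fresh(last_dump_time(path), clock(), interval):
        return True
    try:
        refresh()  # one attempt to update the statistics file
    except (OSError, subprocess.CalledProcessError):
        return False
    # Try once more after the refresh, then give up.
    return is_fresh(last_dump_time(path), clock(), interval)


if __name__ == "__main__":
    print(ensure_fresh("/var/named/data/named_stats.txt"))
```

With this approach rndc is only invoked when the newest dump is older than the interval, so several 'exec' entries polled in the same cycle can share a single dump.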

Once this has been done, you should be able to run the script on a command line to validate that it is getting statistics back. Providing the --debug flag twice also prints the full list of statistics your server provides :

Code:

./bind96-getstats.pl --interval 300 --file /var/named/data/named_stats.txt --prefix test --item test --debug --debug


This should produce output similar to the following (list truncated for brevity) :

Code:

name_server_statistics:responses_sent=4876383
name_server_statistics:truncated_responses_sent=1
name_server_statistics:queries_resulted_in_successful_answer=4090859
name_server_statistics:queries_resulted_in_authoritative_answer=3724617
name_server_statistics:queries_resulted_in_non_authoritative_answer=1143292
name_server_statistics:queries_resulted_in_nxrrset=109418
name_server_statistics:queries_resulted_in_servfail=8428
name_server_statistics:queries_resulted_in_nxdomain=667632
name_server_statistics:queries_caused_recursion=779391
name_server_statistics:duplicate_queries_received=13411
name_server_statistics:queries_dropped=3
name_server_statistics:other_query_failures=46
zone_maintenance_statistics:ipv4_soa_queries_sent=3079
zone_maintenance_statistics:ipv4_ixfr_requested=55
zone_maintenance_statistics:transfer_requests_succeeded=54
zone_maintenance_statistics:transfer_requests_failed=1

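You can see the 5 statistics we're tracking in the list above. Each line has the form prefix:item=value: the prefix and the item name are separated by ":", and the counter value follows the "=". The prefix and item are the values to pass to --prefix and --item in the snmpd configuration.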
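
If you want to turn that list into ready-made arguments, a line of debug output can be split into its parts with a few lines of Python. This is only a sketch, assuming each line looks like the ones in the list above (prefix, then ":", then the item name, then "=", then an integer); parse_debug_line is a hypothetical helper name.

Code:

```python
# Sketch: split one line of the --debug --debug output into its parts.
# parse_debug_line is an illustrative helper, not part of the script.
def parse_debug_line(line):
    """Return (prefix, item, value) for a 'prefix:item=value' line."""
    key, _, value = line.strip().partition("=")
    prefix, _, item = key.partition(":")
    return prefix, item, int(value)


if __name__ == "__main__":
    line = "name_server_statistics:queries_resulted_in_nxrrset=109418"
    prefix, item, value = parse_debug_line(line)
    # Print the arguments as they would appear on an 'exec' line.
    print(f"--prefix={prefix} --item={item}")
```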

To track these values in Traverse, you will need to utilize the UCD-HOST-EXTERNAL signature, which you can find in this post : http://community.zyrion.com/showthre...&p=531#post531