I recently saw this on Twitter and realised that I never shared how I monitor my caching server:
— Marcus Ransom (@marcusransom) March 29, 2014
Jedda pinged back my 2012 post about the caching server, which is how I found out about his Nagios work. If you use Nagios and Macs, be sure to check out his blog.
In any case, I do not use Nagios, and I never got around to moving to Logstash (+ Elasticsearch + Kibana) for keeping track of the server logs. This is mainly due to a lack of time, but after seeing the latest Kibana I might be able to find some :P
I use the free version of Splunk and it has been configured for quite a while now without any issue.
Back to the topic: for the caching server I created a dashboard and wrote a query against the logs that looks like this:
source="/Library/Server/Caching/Logs/Debug.log" "bytes served," | rex "(?i)(?P<BYTESSERVED>[^ ]+) bytes served, [0-9]+ from cache, (?P<DOWNLOADEDBYTES>[^ ]+) downloaded" | eval GBSERVED=round(BYTESSERVED/1024/1024/1024,2) | eval DOWNLOADEDINGB=round(DOWNLOADEDBYTES/1024/1024/1024, 2) | timechart sum(GBSERVED) AS Served sum(DOWNLOADEDINGB) AS Downloaded span=1d
This worked fine until 10.9 came around and changed the log format a bit. Then I changed the query to this:
source="/Library/Server/Caching/Logs/Debug.log" " served," | rex "(?i)(?P<MBYTESSERVED>[^ ]+) MB served," | rex "(?P<DOWNLOADEDMBYTES>[^ ]+) MB downloaded from origin," | eval GBSERVED=round(MBYTESSERVED/1024,2) | eval DOWNLOADEDINGB=round(DOWNLOADEDMBYTES/1024, 2) | timechart sum(GBSERVED) AS Served sum(DOWNLOADEDINGB) AS Downloaded span=1d
I know it only cares about megabytes, but that covers most (all?) of the downloads I’ve seen anyway, and I don’t want to spend too much time writing a better query for something I consider non-critical.
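If the megabytes-only limitation ever becomes a problem, the extraction could be made unit-aware. Below is a minimal Python sketch of that idea, run outside Splunk purely to illustrate the logic. It mirrors the two `rex` extractions and the conversion to GB from the 10.9 query. The names `UNIT_TO_BYTES`, `to_gb` and `parse_line` are hypothetical, the unit labels other than `MB` are an assumption about what the log might contain, and the sample line is made up for the demonstration rather than copied from a real `Debug.log`.

```python
import re

# Unit labels and their size in bytes. Only "MB" is confirmed by the
# queries in this post; the other labels are assumptions.
UNIT_TO_BYTES = {
    "bytes": 1,
    "KB": 1024,
    "MB": 1024 ** 2,
    "GB": 1024 ** 3,
    "TB": 1024 ** 4,
}

_UNITS = "|".join(UNIT_TO_BYTES)
_NUMBER = r"([0-9]+(?:\.[0-9]+)?)"

# Same anchors as the two rex expressions, but the unit is captured too.
SERVED_RE = re.compile(rf"{_NUMBER} ({_UNITS}) served,")
DOWNLOADED_RE = re.compile(rf"{_NUMBER} ({_UNITS}) downloaded from origin,")


def to_gb(value, unit):
    """Convert a value in the given unit to gigabytes, rounded to 2 places."""
    return round(float(value) * UNIT_TO_BYTES[unit] / 1024 ** 3, 2)


def parse_line(line):
    """Return (served_gb, downloaded_gb), or None if the line does not match."""
    served = SERVED_RE.search(line)
    downloaded = DOWNLOADED_RE.search(line)
    if served is None or downloaded is None:
        return None
    return to_gb(*served.groups()), to_gb(*downloaded.groups())


if __name__ == "__main__":
    # Made-up sample line, shaped after the patterns used in the query.
    sample = "512 MB served, 1.5 GB downloaded from origin,"
    print(parse_line(sample))
```

The same approach should translate back to Splunk by capturing the unit in a second named group and scaling the value in an `eval`, but I have not tested that against real logs.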
Here is the end result:
Easy to monitor, right?