In this blog post we’re going to be looking at what some people might call “big” data. No, that doesn’t mean big in the conventional sense; it means big in the sense that the single-file dataset is 10 GB in size, and I wanted to make a “big data” pun.
The data in question is a record of NYC’s 311 complaints since 2010, the 6th most popular dataset on the open data website. “311” is NYC’s complaints hotline. For those interested in following along or investigating the data themselves, it is freely available from the open data website.
Today we’re going to cover:
Creating a data source and importing the data
First look at the data to determine interesting fields
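That “first look” step doesn’t require loading the whole 10 GB file. A minimal Python sketch of peeking at the header and a few sample rows (the `peek_csv` helper is illustrative, and the column names shown are only what I’d expect from the 311 dataset — check your own download):

```python
import csv
from itertools import islice

def peek_csv(path, n_rows=5):
    """Read only the header and the first n_rows of a CSV file,
    so even a 10 GB file is inspected in milliseconds."""
    with open(path, newline="") as f:
        reader = csv.reader(f)
        header = next(reader)          # first line: column names
        sample = list(islice(reader, n_rows))  # stop after n_rows
    return header, sample

# header, sample = peek_csv("311_Service_Requests.csv")
```

Because the file is read lazily, this works regardless of the file size — only the first few lines ever leave the disk.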
In my ongoing quest to show the world how easy it is to get up and running with Logscape, today I’m going to use a Logscape docker container to build visualisations based on some publicly available CSV files in no time at all. If you’ve never used the Logscape docker image, check out my previous blog.
Today we’re going to be analysing data made available via the gov.uk website, which offers statistics for road crashes in the UK for 2015. The specific dataset is available for download here.
Here at Logscape it should go without saying that monitoring is sort of a big deal. Some would even go as far as to say it’s our “thing”. To go with that, we’ve collated what we think are the 10 best monitoring talks people should watch. Whether you’re looking to implement a logging tool, build your own, or are just a curious developer, these talks are worth the time.
Introduction

Logscape Analytics are incredibly powerful, but are you using them to their full potential? In this blog post we’re going to go over some of the less-used analytics, show you how to use them, and hopefully inspire you to use your Logscape instance in new and exciting ways. So, without further ado, let’s get into some searches.
So you have written an app and its log output – it’s brilliant: it grabs all the data you need and runs like greased lightning. All you need to do now is ensure your output file has a nice clean format – preferably one that means Logscape does all the work for you! So here are some of my top tips.
1) Add a full timestamp to every line. You wouldn’t believe how much trouble is caused by people using just times or just dates. At best, you struggle to get your data properly organised. At worst, you end up with a mess and data appears in the wrong place on the graph. Do it right: set the date and the time!
2) Add a time zone to that stamp. “My computer will never change time zone, surely it’ll be fine?” Don’t count on it. British Summer Time changing the system time on half your servers, servers being reset to US time, data centres moving location… all these things can and will happen. Adding the time zone to the stamp gives you a cast-iron assurance that the data will always be correct. That peace of mind is worth a few bytes.
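Both tips together amount to: stamp every line with a full ISO-8601 date, time, and zone offset. A minimal sketch using Python’s standard `logging` module (the logger name and message are just examples):

```python
import logging
from datetime import datetime, timezone

class UTCFormatter(logging.Formatter):
    """Emit every record with a full ISO-8601 UTC timestamp:
    date, time, and explicit +00:00 zone offset on each line."""
    def formatTime(self, record, datefmt=None):
        # record.created is a Unix epoch float; render it in UTC
        return datetime.fromtimestamp(record.created, tz=timezone.utc).isoformat()

handler = logging.StreamHandler()
handler.setFormatter(UTCFormatter("%(asctime)s %(levelname)s %(message)s"))
logger = logging.getLogger("myapp")
logger.addHandler(handler)
logger.setLevel(logging.INFO)

logger.info("service started")
```

Pinning the formatter to UTC sidesteps the BST and server-relocation problems entirely: the offset on the line tells any reader exactly when the event happened, wherever the box was sitting.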
To chart the data in Logscape you need a passing familiarity with how to search using a data type.
To execute a search I need to know which Collectd plugin I am interested in and what metrics it outputs. A table of all Collectd plugins can be found here. Here’s an example which charts the load of host svr0001.
Logging and monitoring system health is a hot topic where operational engineers manage large server estates. There are many solutions out there that solve a piece of the puzzle of how the metrics are generated, where the metric data is stored and how it is then visualized.
In this blog post we are going to take a look at Collectd and how to integrate it with Logscape. Collectd is an excellent monitoring backend for collecting operating-system metrics, with more than 90 plugins, including hardware sensors such as temperature and power usage. Metric data by itself is of little use unless you can visualise it in some way, or fire alerts based on trends in the systems under supervision.
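To make the shape of that metric data concrete, here is a minimal Python sketch of parsing rows like those Collectd’s CSV output writes for the load plugin. The column names (`epoch` plus the three load averages) are an assumption — layouts vary by plugin, so check your own files:

```python
import csv
import io

# Sample rows in the shape I'd expect from Collectd's CSV output
# for the "load" plugin: an epoch timestamp plus the 1-, 5-, and
# 15-minute load averages.
sample = """epoch,shortterm,midterm,longterm
1434112800,0.15,0.10,0.05
1434112810,0.55,0.20,0.08
"""

def parse_load(fileobj):
    """Return (epoch, shortterm) pairs, ready to chart."""
    reader = csv.DictReader(fileobj)
    return [(float(r["epoch"]), float(r["shortterm"])) for r in reader]

points = parse_load(io.StringIO(sample))
```

Once the data is in Logscape, of course, this parsing is done for you by the data type — the sketch is only to show what a single metric series looks like on disk.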
This table shows some of the available sensors being collected. There are about 32 different sensors from 8 different hosts being imported in this environment. Here is a dashboard of system-health KPIs.