Visualizing UK accident data with Logscape

In my ever onward quest to show the world how easy it is to get up and running with Logscape, today I’m going to use a Logscape docker container to build visualisations from some publicly available CSV files in no time at all. If you’ve never used the Logscape docker image, then check out my previous blog.

Today we’re going to be analysing data made available via the gov.uk website, which offers statistics for crashes in the UK for the year of 2015. The specific dataset is available for download here.


30 Seconds of reading, hours of watching – 10 monitoring talks everyone should see

Here at Logscape it should go without saying that monitoring is sort of a big deal. Some would even go as far as to say it’s our “thing”. With that in mind, we’ve put together a collection of what we think might be the 10 best monitoring talks people should watch. Regardless of whether you’re looking to implement a logging tool, build your own, or are just a developer, these talks are worth the time.

Monitoring at Spotify – When things go ping in the night. – by Martin Parm

This non-tech talk covers how logging at Spotify has adapted over the years: from the days when monitoring was just a gleam in the developer’s eye, to the months when operations slept restlessly fearing the inevitable phone call, to the current-day monitoring system which acts as caretaker to the infrastructure responsible for streaming hundreds of thousands of tracks a day.

Metrics, Metrics everywhere – by Coda Hale

Digging back into the archives, we have Coda Hale’s talk from 2011. While you may think it’s dated and irrelevant by today’s standards, Coda covers some topics which will simply never get old.

Monitoring is Dead. Long Live Monitoring. – by Greg Poirier

Greg Poirier thinks it’s time we stopped viewing metrics in isolation and declaring things alive or dead. In this laid-back yet high-energy talk, he gives his opinion on the definition of monitoring, with only a few jabs at DevOps. A must-watch.

Better living through statistics. Monitoring doesn’t have to suck. – by Jamie Wilkinson

Jamie Wilkinson goes over what he believes to be the problems in how we currently monitor, and how we can get rid of these problems so that everyone can benefit from logging. A great talk that discusses why our current logs simply aren’t precise enough.

 

The art of performance monitoring. – by Brian Smith

Brian Smith covers the mistakes that he’s made, and the mistakes he keeps seeing developers making. For a 25 minute talk the sheer quantity of technical ideas conveyed in this talk is impressive, but not for the faint of heart.

What should I monitor, and how should I do it? – by Baron Schwartz

In this talk, Baron criticises our approach to monitoring, which is to just stare at a graph and attempt to determine what’s gone wrong. Baron highlights the importance of not just collecting data, but collecting actionable data.

Creating a Culture of Observability. – by Cory Watson

Taking a leaf out of the Spotify playbook, this talk covers the culture around logging, rather than logging itself. It’s Cory’s own story of how, after joining Stripe, he managed to instil a culture of observability and monitoring. You’ll hear about his journey towards that goal, with the good, the bad and the downright sneaky.

How monitoring works at scale. – by Ran Leibman

Facebook manages to claim another spot on this top list, this time discussing the challenge of monitoring the huge amount of infrastructure that makes up Facebook around the world. Most companies don’t have to monitor on this scale, which leaves an obvious question: “How exactly do you?”

The evolution of monitoring systems at Google. – by Tom Rippy

Following in the steps of Facebook, Google returns for its second spot in the list. Much like the Spotify talk above, Tony Rippy aims to walk us through the progression of monitoring at Google, and includes some facts that you just wouldn’t believe about the now tech giant. Tony wasn’t present for the whole of this history, but it’s portrayed in a fun and interesting way, which demonstrates that whilst your current monitoring solution may not be the best, it doesn’t mean you can’t progress.

Allison McKnight’s talk demonstrates how monitoring doesn’t have to be expensive. Hailing from Etsy, which is known for doing a lot with not a lot, Allison walks us through her experiences of using open source projects to build a monitoring system capable of monitoring the entirety of Etsy’s back end.

 

Hopefully you’ve enjoyed the videos. If you have any that you feel should be added to the list, feel free to drop them below, or tweet @logscape with why.

Logscape and Docker – Get monitoring in 60 seconds

Monitoring Magic

It’s finally that day: Logscape is now on docker hub. As such, I’m going to be walking you through the process of getting Logscape running, and once you’ve got the hang of it, you’ll be able to download, run and start using Logscape all within 60 seconds. Monitoring in a heartbeat.

Concatenation or Parameters? Both? What’s the top method of Java logging?

Concatenation or Parameters? Which should we use?

Now it’s undeniable, we techies love to argue about anything we can. Emacs or Vi? Tabs or spaces? Dark theme or Light theme? Brackets on the method line, or the next? to name but a few. We can even see examples of these arguments if you follow discussions on Twitter.

However, whilst you sit in your corner and argue Emacs or Vi (The winner is Vi for the record) we decided to take action by looking at the top Java repositories on GitHub and settling once and for all, which is the more used method of logging.

Take a guess, we dare you.

Again, looking at Twitter (Do we spend too much time staring at that scrolling feed?) polls in the past have shown parameterized logging to have a distinct lead over its String concatenation cousin. But regardless let’s take a look at the actual data.

The proof is in the pu-… repository.

We ingested the top Java repositories on GitHub, pruned out those using fewer than 200 log statements in the entire file, glared at the inconsiderate repos that were ruining our stats with their outliers, and then broke the logging statements down by the most common methods.

We took the results of our search, dropped them into Logscape, generated a pie chart and got….

Use of logging method across all repos

It’s fair to say we were as surprised as you. Of all the log lines we ingested and checked, a whopping 52% have no parameterized elements at all. They’re just plain old strings. Coming in second, we have parameterized statements with about 33%, then concatenation at 14%, and, trying its best to avoid notice, less than 1% of log statements use both.
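For anyone wanting to try a similar breakdown on their own code, here is a rough sketch of how a single log statement could be sorted into those four buckets. It is not the tooling we used; the class name, the `Style` enum and the two regular expressions are assumptions made up for illustration, and the concatenation check is a crude heuristic that only looks for a `+` sitting next to a string literal.

```java
import java.util.regex.Pattern;

public class LogStyleClassifier {
    /** The four buckets from the pie chart (names are illustrative). */
    enum Style { STATIC, PARAMETERIZED, CONCATENATION, BOTH }

    // A "{}" anchor anywhere in the statement is treated as parameterization.
    private static final Pattern ANCHOR = Pattern.compile("\\{\\}");

    // Heuristic: a '+' directly before or after a string literal's quote.
    // This will misfire on a literal that itself contains a quoted '+'.
    private static final Pattern CONCAT = Pattern.compile("\"\\s*\\+|\\+\\s*\"");

    static Style classify(String statement) {
        boolean parameterized = ANCHOR.matcher(statement).find();
        boolean concatenated = CONCAT.matcher(statement).find();
        if (parameterized && concatenated) return Style.BOTH;
        if (parameterized) return Style.PARAMETERIZED;
        if (concatenated) return Style.CONCATENATION;
        return Style.STATIC;
    }

    public static void main(String[] args) {
        String[] samples = {
            "LOGGER.info(\"Started\");",
            "LOGGER.info(failure + \" Just blew up\");",
            "LOGGER.debug(\"This {} went wrong, with state {}\", item, state);",
            "LOGGER.warn(\"{} failed: \" + reason, id);"
        };
        for (String sample : samples) {
            System.out.println(classify(sample) + "  <-  " + sample);
        }
    }
}
```

A real survey would want a proper Java parser rather than regular expressions, but the sketch is enough to show what each bucket means.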

The fact so many messages contain no variables at all is interesting, as it means the application has no way of telling you what state it was in, only that it executed that particular log line. However good logs vs bad logs is a discussion for another time.

So we now know that no parameters seem to be vastly more common than any other type of message, but what’s the spread of logging styles? Do people who use static logging always log that way, or do they mix it up? That brings us to our second graph, the average breakdown of logging style on a per-repo basis, and the results are…

Logging type by repo

…What I would say is a better look at the breakdown of statements. Parameterized takes the lead, but only by the smallest slice of pie, coming in at 42% compared to the 36% of statements where both are used. This shows us that whilst the most commonly used format is parameterized, a similar number of devs either can’t decide, or just don’t care, and opt for whichever format suits them best. This raises the question: what is the actual difference?

Concatenation or Parameters, the who, what and where.

So, from an end user’s point of view, when they open the log file, regardless of which logging format was used, they’re going to see the same thing: a (hopefully) nicely formatted log, full of data they’re interested in. So where’s the difference? The difference lies in the code.

  • String concatenation is combining strings, i.e.
     LOGGER.info(failure + " Just blew up");
  • Parameterized uses a formatting anchor, i.e.
     LOGGER.info("{} just blew up", failure);

From a visual standpoint, they’re really not that different, but what if we look a level deeper?

The Deep Dark

So we know that Parameters are the most popular logging method, and we know that from a code perspective, they both look reasonably similar. So what is the actual difference between them? Well, it mainly comes down to how the JVM treats each statement.

For the case of concatenation, if we take a line such as –

LOGGER.debug("This " + item + " went wrong, with state " + state);

Regardless of the log level, the variables in this message will be converted to Strings, because the concatenation happens before debug() is ever called. If the log level is actually INFO, we’ve just converted those variables into Strings and built the full message, only to throw it away.

This can admittedly be avoided, but it makes your code even more verbose,

if(LOGGER.isDebugEnabled()) LOGGER.debug("This " + item + " went wrong, with state " + state);

Even using this as an inlined if statement, that’s plenty of visual clutter.

Looking instead at parameterization that same log message is going to look something like this,

LOGGER.debug("This {} went wrong, with state {}", item, state);

If the Logger is set to INFO, these objects will never be converted into Strings.

There are arguments for and against the visual style of the two, but the biggest fact is a fairly simple one. Using parameterization your objects will only be converted to Strings when they’re needed, saving you time and memory.
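To make that concrete, here is a small self-contained sketch that counts toString calls for each style when DEBUG is switched off. `TinyLogger` is a hypothetical stand-in written for this example so that it runs without any logging library on the classpath; it is not the API of any real framework, and a real logger will differ in the details. `Tracked`, `format` and the counter are likewise made up for illustration.

```java
import java.util.concurrent.atomic.AtomicInteger;

public class LazyLoggingDemo {
    /** Counts every toString() call made on a Tracked object. */
    static final AtomicInteger TO_STRING_CALLS = new AtomicInteger();

    /** A value object that records each time it is turned into a String. */
    static final class Tracked {
        private final String name;
        Tracked(String name) { this.name = name; }
        @Override public String toString() {
            TO_STRING_CALLS.incrementAndGet();
            return name;
        }
    }

    /** Replaces each "{}" anchor, in order, with the matching argument. */
    static String format(String template, Object... args) {
        StringBuilder sb = new StringBuilder();
        int from = 0;
        for (Object arg : args) {
            int at = template.indexOf("{}", from);
            if (at < 0) break;
            sb.append(template, from, at).append(arg); // toString() happens here
            from = at + 2;
        }
        return sb.append(template.substring(from)).toString();
    }

    /** Hypothetical stand-in for a logger: only the level check matters. */
    static final class TinyLogger {
        private final boolean debugEnabled;
        TinyLogger(boolean debugEnabled) { this.debugEnabled = debugEnabled; }

        void debug(String message) {
            if (debugEnabled) System.out.println(message);
        }

        void debug(String template, Object... args) {
            if (!debugEnabled) return; // arguments are never formatted
            System.out.println(format(template, args));
        }
    }

    static int callsWithConcatenation() {
        TO_STRING_CALLS.set(0);
        TinyLogger logger = new TinyLogger(false); // DEBUG is off
        Tracked item = new Tracked("order-42");
        logger.debug("This " + item + " went wrong"); // message built before the call
        return TO_STRING_CALLS.get();
    }

    static int callsWithParameters() {
        TO_STRING_CALLS.set(0);
        TinyLogger logger = new TinyLogger(false); // DEBUG is off
        Tracked item = new Tracked("order-42");
        logger.debug("This {} went wrong", item); // item passed through untouched
        return TO_STRING_CALLS.get();
    }

    public static void main(String[] args) {
        System.out.println("concatenation: " + callsWithConcatenation() + " toString call(s)");
        System.out.println("parameterized: " + callsWithParameters() + " toString call(s)");
    }
}
```

With DEBUG disabled, the concatenated call still pays for one toString, while the parameterized call pays for none. The parameterized version does still allocate a small array for its arguments in this sketch, so it is cheaper rather than free.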

Thoughts

For us, the most surprising thing to come out of our research was discovering the sheer number of static log messages that we saw in the first graph. The second showed us that whilst parameterization has a lead, it’s not much of one. This probably reflects the fact that in the long run, there really isn’t that much difference between the two methods. However, once you enter the realms of large-scale logging, it’s clear that parameterization can be simpler and more performant. Not performing additional calls to toString, and thus not spending time and resources for nought, seems small now, but scale that up to a system that is potentially making that call hundreds, if not thousands, of times per minute, and you see why people prefer parameterization. The major drawback of concatenation can be avoided, but it will cost you visual clarity, and so developer time.

Hopefully, this has helped to shine some light on the matter, and persuaded you to use parameterization over concatenation. Your applications will thank me.

 

 

Native JSON Support

Working with JSON in Logscape 3.2

Logscape 3.2 introduced native JSON support, meaning that when working with JSON data there’s no need for datatypes. Instead, Logscape automatically pulls the keys from your structure.

This removes the sometimes daunting configuration step, and instead lets you get straight down to business with visualising your data. With that in mind, today we’re going to be embracing our inner geek, and get to work visualising some JSON from the game EvE Online™.


 


Logscape 3.2 Touches Down

Logscape version 3.2 is now available for public download; you can get it now from the Logscape Website.

A brief rundown of what Logscape 3.2 brings with it, and what we’re going to cover today…

  • File Explorer
  • JSON Support (Including JSON Arrays)
  • Failover Overhaul
  • Performance and Stability Changes

 


 


Advanced data analytics and use-cases in Logscape

Introduction
Logscape’s analytics are incredibly powerful, but are you using them to their full potential? In this blog post we’re going to go over some of the less used analytics, show you how to use them, and hopefully inspire you to use your Logscape instance in new and exciting ways. So, without further ado, let’s get into some searches.

Logscape Tutorials – Logscape in 10 minutes

Recently we’ve been working on creating new learning materials for the release of Logscape 3.0: materials appropriate for both the Logscape expert and an individual just picking Logscape up for the first time. The first person to be addressed by this was of course the beginner, so here’s a 10 minute introduction to the basics of Logscape 3.0.

 

 

Hopefully this helps some of our newer users. Keep an eye out for more advanced tutorials!

CSV Discovery in Logscape

New in Logscape 3.0

Logscape 3.0 introduces a new feature that makes working with CSV data easier and faster. Logscape will now automatically generate a datatype from imported CSV data, so you’ll be free to immediately build a workspace around your data rather than having to worry about setting up your datatype.
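To give a feel for the general idea of discovering fields from a CSV file, here is a minimal sketch that reads field names from a header row and guesses a type for each from one sample row. This is not how Logscape does it internally; the class, the method names, the "number"/"text" labels and the sample column names are all assumptions invented for this example, and the naive comma split ignores quoted fields.

```java
import java.util.LinkedHashMap;
import java.util.Map;

public class CsvFieldSketch {
    /** Guesses "number" if the sample parses as a double, otherwise "text". */
    static String guessType(String value) {
        try {
            Double.parseDouble(value.trim());
            return "number";
        } catch (NumberFormatException e) {
            return "text";
        }
    }

    /** Maps each header name to a guessed type, keeping column order. */
    static Map<String, String> inferFields(String headerLine, String sampleLine) {
        String[] names = headerLine.split(",", -1);   // naive split: no quoted fields
        String[] values = sampleLine.split(",", -1);
        Map<String, String> fields = new LinkedHashMap<>();
        for (int i = 0; i < names.length; i++) {
            String sample = i < values.length ? values[i] : "";
            fields.put(names[i].trim(), guessType(sample));
        }
        return fields;
    }

    public static void main(String[] args) {
        // Hypothetical header and row, made up for illustration.
        String header = "Accident_Index,Number_of_Vehicles,Date";
        String row = "2015A1,2,12/01/2015";
        System.out.println(inferFields(header, row));
    }
}
```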

Using Logscape with HPC, 3 of 3

Today marks the last in the series of three blogs around Microsoft HPC by guest writer Ben Newton. We hope the articles have helped to demonstrate the time and thought that goes into the development of a Logscape App. For the final section, Ben covers the development of the actual app that will run inside the Logscape environment. You can find the past articles below, and more of Ben’s work on his Github page or his LinkedIn.

Part One, Part Two