Welcoming a new NGINX and Apache app.

For a long time we’ve had the web app available on our app repository. It covers NGINX, Apache and a whole host of other formats, and whilst it’s functional, it hasn’t been touched in a long time, so it looks a little less than pretty.

Given the popularity of Apache and NGINX, and the fact they both use the same out-of-the-box log format, we’re going to give them a dedicated app and a brand new look.
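To make the shared log format concrete, here is a minimal Python sketch that parses one access-log line. It assumes the format in question is the Combined Log Format (NGINX's predefined "combined" format and Apache's "combined" LogFormat). The regex, the `parse_line` helper and the sample line are illustrative assumptions, not part of the Logscape app.

```python
import re

# Combined Log Format: host, ident, user, [time], "request", status, size,
# "referer", "user agent". This pattern is an illustrative sketch only.
COMBINED = re.compile(
    r'(?P<host>\S+) (?P<ident>\S+) (?P<user>\S+) '
    r'\[(?P<time>[^\]]+)\] '
    r'"(?P<request>[^"]*)" '
    r'(?P<status>\d{3}) (?P<size>\S+) '
    r'"(?P<referer>[^"]*)" "(?P<agent>[^"]*)"'
)

def parse_line(line):
    """Return a dict of named fields, or None if the line does not match."""
    match = COMBINED.match(line)
    return match.groupdict() if match else None

# Made-up sample line, for illustration only.
sample = ('203.0.113.7 - - [10/Oct/2016:13:55:36 +0000] '
          '"GET /index.html HTTP/1.1" 200 2326 '
          '"http://example.com/start" "Mozilla/5.0"')

fields = parse_line(sample)
print(fields["status"], fields["request"])
```

Because both servers can emit this layout, a single pattern like the one above can serve either source, which is the property a dedicated app relies on.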

We’re hoping this works out for everyone. If you’re currently using the existing web app you can continue to do so. But if you’re specifically running NGINX or Apache, and want a change of pace, then read on.

Continue reading

30 Seconds of reading, hours of watching – 10 monitoring talks everyone should see

Here at Logscape it should go without saying that monitoring is sort of a big deal. Some would even go as far as to say it’s our “thing”. With that in mind, we’ve put together a collection of what we think might be the 10 best monitoring talks to watch. Regardless of whether you’re looking to implement a logging tool, build your own, or are just a developer, these talks are worth the time.

Monitoring at Spotify – When things go ping in the night. – by Martin Parm

This non-tech talk covers how logging at Spotify has adapted over the years: from the days when monitoring was just a gleam in the developer’s eye, to the months when operations slept restlessly fearing the inevitable phone call, to the current-day monitoring system which acts as caretaker to the infrastructure responsible for streaming hundreds of thousands of tracks a day.

Metrics, Metrics everywhere – by Coda Hale

Digging back into the archives, we have Coda Hale’s talk from 2011. While you may think it’s dated and irrelevant by today’s standards, Coda covers some topics which will simply never get old.

Monitoring is Dead. Long Live Monitoring. – by Greg Poirier

Greg Poirier thinks it’s time we stopped viewing metrics in isolation and declaring things alive or dead. In this laid-back yet high-energy talk, he covers his opinion on the definition of monitoring, with only a few jabs at DevOps. A must-watch.

Better living through statistics. Monitoring doesn’t have to suck. – by Jamie Wilkinson

Jamie Wilkinson goes over what he believes to be the problems in how we currently monitor, and how we can get rid of these problems so that everyone can benefit from logging. A great talk that discusses why our current logs simply aren’t precise enough.

 

The art of performance monitoring. – by Brian Smith

Brian Smith covers the mistakes that he’s made, and the mistakes he keeps seeing developers make. For a 25-minute talk, the sheer quantity of technical ideas conveyed is impressive, but it’s not for the faint of heart.

What should I monitor, and how should I do it? – by Baron Schwartz

In this talk, Baron criticises our approach to monitoring, which is to just stare at a graph and attempt to determine what’s gone wrong. Baron highlights the importance of not just collecting data, but collecting actionable data.

Creating a Culture of Observability. – by Cory Watson

Taking a leaf out of the Spotify playbook, this talk covers the culture around logging, rather than logging itself. It’s Cory’s own story of how, after joining Stripe, he managed to instil a culture of observability and monitoring. You’ll hear about his journey towards that goal, with the good, the bad and the downright sneaky.

How monitoring works at scale. – by Ran Leibman

Facebook manages to claim another spot on this list, this time discussing the challenge of monitoring the huge amount of infrastructure that makes up Facebook around the world. Most companies don’t have to monitor on this scale, which leaves an obvious question: “How exactly do you?”

The evolution of monitoring systems at Google. – by Tom Rippy

Following in the steps of Facebook, Google returns for its second spot in the list. Much like the Spotify talk above, Tony Rippy aims to walk us through the progression of monitoring at Google, and includes some facts that you just wouldn’t believe about the now tech giant. Tony wasn’t present for the whole of this history, but it’s portrayed in a fun and interesting way, which demonstrates that whilst your current monitoring solution may not be the best, it doesn’t mean you can’t progress.

Allison McKnight’s talk demonstrates how monitoring doesn’t have to be expensive. Hailing from Etsy, which is known for doing a lot with not a lot, Allison walks us through her experiences of using open source projects to build a monitoring system capable of monitoring the entirety of Etsy’s back end.

 

Hopefully you’ve enjoyed the videos. If you have any that you feel should be added to the list, feel free to drop them below, or tweet @logscape with why.

Logscape and Docker – Get monitoring in 60 seconds

Monitoring Magic

It’s finally that day: Logscape is now on Docker Hub. As such, I’m going to be walking you through the process of getting Logscape running, and once you’ve got the hang of it, you’ll be able to download, run and start using Logscape all within 60 seconds. Monitoring in a heartbeat. Continue reading

Logscape 3.22 Arrives

It’s Alive! Download Logscape 3.22

Today Logscape 3.22 becomes available to the public. We’re really excited and hope everyone’s going to love the improvements that come with it. We’ve packed in numerous performance tweaks, but we’ve also started to focus heavily on UI/UX to make the Logscape experience better for you, our users. In case you’ve missed it, you can grab the newest release from the website. Without further ado, let’s get onto some of the highlights of the 3.22 release. Continue reading

3.03 is here (and now)

Performance:
For this release we carried out more work on execution performance.
Single-threaded benchmarking uses 2 profiles: Search page and Workspace oriented execution. When a search is executed from the Search page it builds the facet stats to support ad hoc analysis; it also streams a large set of events to the JettyWebServer. All the extra work yields about a 40% overhead, and we were seeing about 80k events per second for a single thread (30-40 discovered fields). The Workspace execution plan yields 120k events per second, per thread.
The execution plan follows these steps:
1. Identify log files that fall within the selected time period and meet the system field criteria (i.e. _agent, _type, _tag etc.)
2. Select the time-series buckets associated with each resource
3. Scan the time-series buckets and build data-type patterns, synthetics and discovered fields for each event. (Using indexed fields is much faster than synthetics – 3.03 enhancement)
4. Aggregate and pump data using map-reduce execution of the functions (avg, count etc.) (3.03)
5. Jetty aggregates the incoming streams and drives the interface using websockets
6. Websocket events then send status messages, notification of replay-events (3.03), facets and updated histogram data.
Note I: “3.03” marks where performance improvements were made.
Note II: A single thread processing 100,000 events per second is sustainable, so 16 threads should process the equivalent of 1,600,000 events per second (in theory). Scalability depends upon I/O subsystem performance relating to disk I/O, OS buffers and network.
Note III: Logfile processing is carried out with 1-thread per request.
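The arithmetic in Note II can be sketched in a few lines of Python. This assumes perfectly linear scaling across threads, which is the theoretical best case only; the function name is made up for illustration, and the rates are the per-thread figures quoted in this post.

```python
def theoretical_events_per_second(per_thread_rate, threads):
    """Upper bound on throughput, assuming perfectly linear scaling."""
    return per_thread_rate * threads

# Per-thread rates quoted in the text (events per second).
rates = {"search_page": 80_000, "workspace": 120_000, "note_ii": 100_000}

for name, rate in rates.items():
    # 16 threads, as in Note II; real results depend on disk, OS and network.
    print(name, theoretical_events_per_second(rate, 16))
```

In practice the I/O subsystem will cap throughput well before the thread count does, so treat these numbers as a ceiling rather than a forecast.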
Important: Before upgrading remember to: 1) back up your config, 2) back up the downloads and space folders (in case of reversion), 3) make sure all agents are online!
Release notes:
1. Fix summation problem where only the first event was evaluated
2. Further improvements to search performance and UI interaction
3. Ability to index any field, discovered or synth (yields faster performance and requires reindexing)
4. Improved data types page for debugging and benchmarking
5. Datasources now use natural keys instead of UUIDs. This should combat DS duplication when importing/exporting. Note: ids are only generated when new DSs are saved
6. You can set the java.tmp.io.dir in boot.properties (boot.properties sets it to work/tmp by default). The directory is cleared on rebooting Logscape. When upgrading you will need to perform this operation manually.
7. Networking now uses faster lz4 compression. This will make offline agents break if not updated!
8. Geo-maps now use a choropleth palette
9. Workspace linking now forces correct filtering when driven via URL clicks
10. Fixed random hs_err crashing caused by ChronicleQ
11. Search page chain.button now saves state and auto-runs search when auto-run is enabled
12. Rickshaw charts now format numbers with ‘,’ on mouse-over
13. Syslog no longer prints to stdout

Correlated Alerts in Logscape

In my experience, correlated alerts are something the average user doesn’t touch, either thinking that they don’t need them, or believing (falsely) that setting one up is much more difficult than it is. While correlated alerts can be used on almost any form of data, my personal opinion is that they’re at their best when dealing with data such as audit or webserver logs, though they will excel in any environment that makes use of error messages or codes. Today, I’m going to walk you through setting up a correlated alert, just to show how easy it is.

Continue reading

Lifecycle tracking with TXN in Logscape!

The Problem of Ticket Tracking

Systems tend to have workflows, where an object or ticket is passed around different systems. As someone responsible for monitoring such a system, I need to be able to keep track of the events to ensure that customers get served and orders get processed. My main aim in this example is to track how long it takes to process tickets.
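As a rough illustration of the goal (not of Logscape's TXN function itself), the plain Python sketch below groups events by ticket id and measures the time between each ticket's first and last event. The event data, field layout and `ticket_durations` helper are all hypothetical.

```python
from datetime import datetime

# Hypothetical events: (timestamp, ticket id, system that handled it).
events = [
    ("2016-01-05 09:00:00", "T-1", "intake"),
    ("2016-01-05 09:00:30", "T-2", "intake"),
    ("2016-01-05 09:02:10", "T-1", "billing"),
    ("2016-01-05 09:05:00", "T-1", "dispatch"),
    ("2016-01-05 09:07:45", "T-2", "dispatch"),
]

def ticket_durations(events):
    """Map each ticket id to the seconds between its first and last event."""
    first, last = {}, {}
    for stamp, ticket, _system in events:
        when = datetime.strptime(stamp, "%Y-%m-%d %H:%M:%S")
        if ticket not in first or when < first[ticket]:
            first[ticket] = when
        if ticket not in last or when > last[ticket]:
            last[ticket] = when
    return {t: (last[t] - first[t]).total_seconds() for t in first}

durations = ticket_durations(events)
print(durations)
```

The key idea is that a shared identifier ties together events emitted by different systems, so the lifecycle can be measured end to end.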

This kind of assumes you already have a Logscape environment running. If you don’t, download it now and get started!

Continue reading

Post-Aggregates in Logscape 3

Version 3 introduces a new search analytic to Logscape’s arsenal: the Post-Aggregate function. This feature is designed to give the user more control over each individual search, allowing them to perform multiple functions on a value by aliasing the result of one function into another value, which can then be given as an argument to another function. This allows you to chain a value through several analytics to obtain your desired effect.
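The chaining idea can be sketched conceptually in plain Python. This is not Logscape search syntax; the events, the `hits` alias and the choice of functions are assumptions made purely for illustration.

```python
from collections import Counter
from statistics import mean

# Hypothetical events: (host, HTTP status).
events = [
    ("web1", 200), ("web1", 500), ("web1", 200),
    ("web2", 200), ("web3", 200), ("web3", 404),
    ("web3", 200), ("web3", 200), ("web3", 500),
]

# First aggregate: count events per host, kept under the alias "hits".
hits = Counter(host for host, _status in events)

# Post-aggregates: feed the aliased result into further functions.
avg_hits = mean(hits.values())
busiest = max(hits, key=hits.get)

print(dict(hits), avg_hits, busiest)
```

The first function produces a named intermediate result, and later functions consume that name rather than the raw events, which is the pattern the Post-Aggregate feature describes.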

Today I’m going to walk through an example of how to use Post-Aggregate functions inside Logscape. Hopefully it will be insightful and show you how Post-Aggregates can be used to improve your own dashboards. We’re already using them inside the Univa Grid Engine app, and some of our own monitoring workspaces. Continue reading