Logscape 2.1 is out!

Log analysis just got more powerful!

Since December we have been working hard on this release. It was originally targeted as 2.0.5, but as we realised the extent of the cool things being added, a bump to 2.1 became warranted.

Links: Release Notes and Update Instructions

So what's new:

  • Better documentation: support.logscape.com has been given a much-needed overhaul
  • Better chart navigation: zoom in and out using the mouse wheel, click and drag to pan. It all feels very natural
  • Improved Intelligence: Field Discovery has a second cousin, 'GrokIt' (yes, similar to logstash-grok), which is really just a bunch of regexp patterns (check out this blog post for more detail)
  • Faster: We re-architected the Field Discovery DB to use Persist-It and introduced dictionary maps to reduce size and improve performance
  • Usable: The log viewer can now nudge back and forth using the mouse wheel or cursor keys
  • More SAASy: DataSource tags are indispensable when you are identifying sources and trying to make sense of the big picture. You can now add multiple tags (attributes), which makes the cross-cutting functionality even more powerful!
  • More Configurable: Field Discovery can now be configured on the DataSource/Advanced page; you can turn it off, or choose between auto key-value discovery (system metrics) and GrokIt pattern extraction (emails, URL paths, etc.)

In general, we have made Logscape faster and prettier, and fixed quite a few bugs. Bear witness to the Release Notes.

Best Regards,
Neil.

 

Intelligent Log Analysis – Field Discovery

Field discovery..

.. is cool because it does most of the hard work for you. It finds system metrics, emails, IP addresses and all sorts of things that you never really realised were filling up your logs. Log analysis has never been so powerful :). It's nice that you can add data, click Search and see stuff. Log analysis tools keep getting smarter and smarter.

Logscape 2.1 builds on the already popular auto field discovery by giving users the ability to add their own 'auto-patterns'. The system is called GrokIt. I'm going to discuss the two approaches and how they work within Logscape.

Implementations:

  • Auto-Field discovery (Key-Value pairs)
  • GrokIt Pattern based discovery (Well known patterns)

Automatic Log Analysis of Key-Value pairs

With 2.0 we launched Key-Value pattern extraction. The idea is simple: whenever a recognised Key-Value pattern is found, we index the pair and make both the key and the value searchable terms.

For example:    CPU:99 hostname:travisio 

OR      { "user":"john barness", "ip":"128.10.8.150", "action":"login" }
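To make this concrete, here is a minimal, illustrative sketch of key-value extraction in Java. It is not Logscape's internal implementation (the class name and the regular expression below are invented for this post); it simply shows how pairs like the two examples above can be pulled out of a raw line and turned into searchable field/value terms.

import java.util.LinkedHashMap;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Sketch only - not Logscape's internal code. Pulls key:value and "key":"value"
// pairs out of a raw log line so they can be indexed as searchable terms.
public class KeyValueSketch {

    // key, then ':', then either a quoted value or a bare token
    private static final Pattern KV = Pattern.compile(
            "\"?([A-Za-z_][A-Za-z0-9_.-]*)\"?\\s*:\\s*(?:\"([^\"]*)\"|([^\\s,}{]+))");

    public static Map<String, String> extract(String line) {
        Map<String, String> fields = new LinkedHashMap<>();
        Matcher m = KV.matcher(line);
        while (m.find()) {
            String value = m.group(2) != null ? m.group(2) : m.group(3);
            fields.put(m.group(1), value);
        }
        return fields;
    }

    public static void main(String[] args) {
        System.out.println(extract("CPU:99 hostname:travisio"));
        // {CPU=99, hostname=travisio}
        System.out.println(extract("{ \"user\":\"john barness\",\"ip\":\"128.10.8.150\",\"action\":\"login\" }"));
        // {user=john barness, ip=128.10.8.150, action=login}
    }
}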

Pattern-based extraction (GrokIt):

With this release we have included the ability to extract well-known patterns such as email addresses, hostnames, log levels, paths, etc. So every time john@jj-pennies.com is seen, the value is extracted and indexed against the key (_email). The standard config file is logscape/downloads/grokit.properties:

#field-name, substring match (leave blank if unavailable), and a regular expression that extracts a single group for the value
_email::.*?([_A-Za-z0-9-\.]+@[A-Za-z0-9-]+\.[A-Za-z]{2,}).*?
_ipAddress::.*?([0-9]{1,3}\.[0-9]{1,3}\.[0-9]{1,3}\.[0-9]{1,3}).*?
_exception::.*?([_A-Za-z0-9-\.]+Exception).*?
_url::.*?([A-Za-z]{4,4}://[A-Za-z.0-9]+[:0-9]{0,6}[A-Za-z/]+).*?
_level::.*?(INFO|ERROR|WARN|DEBUG|FATAL|TRACE|SEVERE).*?
_hour::.*?[,.\s-]([0-9]{2,2}):[0-9]{2,2}:[0-9]{2,2}[,.\s-].*?
_minute::.*?[,.\s-][0-9]{2,2}:([0-9]{2,2}):[0-9]{2,2}[,.\s-].*?
_gpath::.*?(\/[A-Za-z0-9]+\/[\/A-Za-z0-9]+).*?

Each of these patterns was chosen because it is practical for either a) surfacing useful information, or b) slicing your data by time (hour of day).

Each entry contains the FieldName (lhs) and the Expression (rhs).
The regular expression must return a single group that contains the value (see the bracketed capturing groups in the patterns above). At the bottom of this post we reference some of the awesome regular expression tools we used to build these.
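As an illustration of how one of these entries drives extraction, the sketch below (plain Java, not Logscape's internal code; the class name is made up) parses the _email entry into its three colon-separated parts and applies the regular expression to a sample line, indexing group(1) as the field value.

import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Sketch only - not Logscape's internal code. Shows how a grokit.properties
// entry (field-name, blank substring hint, regex with one capturing group)
// turns a matching log line into an indexed field value.
public class GrokItSketch {

    public static void main(String[] args) {
        // the _email entry from the standard grokit.properties above
        String entry = "_email::.*?([_A-Za-z0-9-\\.]+@[A-Za-z0-9-]+\\.[A-Za-z]{2,}).*?";

        // split into at most 3 parts so any ':' inside the regex is preserved
        String[] parts = entry.split(":", 3);
        String field = parts[0];             // "_email"
        Pattern pattern = Pattern.compile(parts[2]);

        String logLine = "2015-03-01 10:15:01 INFO mail sent to john@jj-pennies.com";
        Matcher m = pattern.matcher(logLine);
        if (m.find()) {
            // group(1) is the single capturing group that holds the value
            System.out.println(field + " = " + m.group(1)); // _email = john@jj-pennies.com
        }
    }
}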

How do I configure it?

To make changes you can add or remove entries: open your favourite text editor (vim?), make the changes and save the file (and make sure you test it). Once saved, upload the file via the Deployments page and it will be replicated to all agents on the network.
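As a hypothetical example (the field name and pattern below are made up, not part of the standard file), if your application logs lines containing something like orderId=ABC-12345, you could add an entry of the same shape, with a single capturing group for the value:

_orderId::.*?orderId=([A-Z0-9-]{5,}).*?

Once uploaded, any value captured by the group is indexed against _orderId in the same way as the built-in fields above.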

Any new files being monitored will pick up the configuration change (note: it won't take effect part-way through a file that is already being followed). To have the change applied retrospectively you will need to re-index the DataSource.

When is it applied?

As with anything, we have tried to make both discovery systems as fast as possible. Key-Value extraction runs at 17-20 MB/s per pattern, but the 8 supported rules cumulatively slow things down. GrokIt, i.e. regular expression parsing, runs at about 14 MB/s per compiled pattern. Again, that is too slow to apply while the user is waiting; as you can see above, there are 8 of those patterns as well.

IndexTime: The easiest way to remove the performance penalty is to do the work once, at index time, rather than while the user is waiting. In our case, when either of the discovery systems is enabled, a Field Database is used to store the discovered fields in their most efficient form (dictionary-oriented maps). This decouples the processing and provides reasonable search performance on attributes that are unlikely to change.

SearchTime: At search time the executor pulls in any discovered fields and makes them available for that event. This provides decent performance and better system scalability.
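For readers curious about the "dictionary-oriented maps" mentioned above, the sketch below shows the general idea of dictionary encoding. It is not Logscape's actual Field Database; the class and method names are invented purely for illustration.

import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Sketch only - the general idea of dictionary encoding, not Logscape's FieldDB.
// Each distinct field value is stored once and referenced by a small int id,
// so repeated values (hostnames, levels, emails) cost almost nothing to keep.
public class FieldDictionarySketch {

    private final Map<String, Integer> idsByValue = new HashMap<>();
    private final List<String> valuesById = new ArrayList<>();

    public int encode(String value) {
        Integer id = idsByValue.get(value);
        if (id == null) {
            id = valuesById.size();
            idsByValue.put(value, id);
            valuesById.add(value);
        }
        return id;
    }

    public String decode(int id) {
        return valuesById.get(id);
    }

    public static void main(String[] args) {
        FieldDictionarySketch dict = new FieldDictionarySketch();
        int a = dict.encode("travisio");   // first occurrence -> new id
        int b = dict.encode("travisio");   // repeat -> same id, no extra storage
        System.out.println(a + " " + b + " " + dict.decode(a)); // 0 0 travisio
    }
}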

Configurable per DataSource

To allow better performance, we have exposed the Field Discovery flags on the DataSource/Advanced tab. The standard Logscape sources ship with discovery disabled.

[Screenshot: data-sources-discovery, the Field Discovery options on the DataSource/Advanced tab]

Some great regular expression tools:

https://www.debuggex.com

http://regex101.com/

Regards, Neil.