Today marks the last in a series of three blogs on Microsoft HPC by guest writer Ben Newton. We hope the articles have helped to demonstrate the time and thought that go into developing a Logscape App. In this final section, Ben covers the development of the actual app that will run inside the Logscape environment. You can find the past articles below. You can find more of Ben’s work on his GitHub page, or his LinkedIn.
“Building an App for Logscape” sounds like much harder work than it is. An App is just a collection of scripts and Logscape config XML, zipped up. The important thing is that it should contain everything a user needs (that’s you by the way) to get their monitoring up and running instantly.
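To make the "zipped up" part concrete, here is a minimal Python sketch that packs a folder of scripts and config XML into a single archive. The folder name, file names and version suffix are all hypothetical placeholders, and Logscape may expect a particular naming convention that this sketch does not try to enforce.

```python
import tempfile
import zipfile
from pathlib import Path


def build_app_zip(app_dir: Path, zip_path: Path) -> list[str]:
    """Zip every file under app_dir and return the archived relative names."""
    names = []
    with zipfile.ZipFile(zip_path, "w", zipfile.ZIP_DEFLATED) as archive:
        for path in sorted(app_dir.rglob("*")):
            if path.is_file():
                # Store paths relative to the app folder, with forward slashes.
                name = path.relative_to(app_dir).as_posix()
                archive.write(path, name)
                names.append(name)
    return names


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        # Hypothetical app layout, created here only so the sketch runs anywhere.
        app_dir = Path(tmp) / "HpcApp"
        app_dir.mkdir()
        for file_name in ("HpcApp.bundle", "HpcApp.config", "clusterOverview.ps1"):
            (app_dir / file_name).write_text("placeholder\n")
        print(build_app_zip(app_dir, Path(tmp) / "HpcApp-1.0.zip"))
```

The archive is written outside the app folder so that it never ends up zipping itself.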
In the previous sections, we looked at what data we wanted and the scripts used to extract that data. Now we’ve got the data out of HPC, we need to get it into Logscape.
Bundle it! Dealing with scripts
The key to any Logscape App is the Bundle file. It determines which hosts run which services. Open up the bundle file and you’ll see a large amount of XML. The main element I want to focus on is the Service, which might look something like this…
<Service>
  <name>ClusterOverview</name>
  <resourceSelection>Zone contains headnode</resourceSelection>
  <fork>false</fork>
  <background>true</background>
  <instanceCount>-1</instanceCount>
  <pauseSeconds>0</pauseSeconds>
  <script>powerShellRunner.groovy clusterOverview.ps1</script>
</Service>
Now, if you’ve got HPC Server already installed, you could deploy this app right now. Go on, I’ll wait. Did it work? Probably not. That’s because the most important part of the Service is resourceSelection – which resource (i.e. which Logscape agent) is running that service. You’ll notice here that it says Zone contains headnode.
The default zone for a forwarder upon installation is dev.Forwarder – notice the word headnode does not appear – therefore that service will not run on that host. You therefore have three options.
- Change the Zone of the Forwarder to dev.headnode.Forwarder
- Change the resourceSelection in the bundle file
- Use an override.properties file to change the resourceSelection.
Once the resourceSelection includes the Head Node, that service will run! That’s the only bit of the Bundle file you’ll want to edit – changing the others will probably break it! And what is powerShellRunner.groovy? Just a wrapper that allows Logscape to start PowerShell scripts as services.
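The zone matching described above can be sketched in a few lines of Python. This is only an illustration of the idea, not Logscape's own evaluator: it assumes that "Zone contains X" behaves like a case-sensitive substring test, it handles only that one form of resourceSelection, and the `<Bundle>` wrapper element is a hypothetical stand-in for whatever surrounds the Service in a real bundle file.

```python
import xml.etree.ElementTree as ET

# Hypothetical wrapper element; the Service fields are taken from the example above.
BUNDLE_XML = """
<Bundle>
  <Service>
    <name>ClusterOverview</name>
    <resourceSelection>Zone contains headnode</resourceSelection>
    <script>powerShellRunner.groovy clusterOverview.ps1</script>
  </Service>
</Bundle>
"""


def zone_matches(resource_selection: str, zone: str) -> bool:
    """Treat 'Zone contains <text>' as a substring test (an assumption)."""
    prefix = "Zone contains "
    if not resource_selection.startswith(prefix):
        raise ValueError(f"unsupported selection: {resource_selection!r}")
    return resource_selection[len(prefix):].strip() in zone


def services_for_zone(bundle_xml: str, zone: str) -> list[str]:
    """Return the names of the services whose selection matches the zone."""
    root = ET.fromstring(bundle_xml)
    return [
        service.findtext("name")
        for service in root.iter("Service")
        if zone_matches(service.findtext("resourceSelection", default=""), zone)
    ]


if __name__ == "__main__":
    for zone in ("dev.Forwarder", "dev.headnode.Forwarder"):
        print(zone, "->", services_for_zone(BUNDLE_XML, zone))
```

Under these assumptions the default zone dev.Forwarder matches no services, while dev.headnode.Forwarder picks up ClusterOverview, which is the behaviour described above.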
You’ll also need to set up the SQL connection for the SQL Services – see the Quick Start guide in the App to cover that. The LogParser and AzureNodeBalancer services are not used by default – they need a little more configuration – please check out the documentation for those.
New World – New Workspaces!
The HPC App has been redesigned to take into account the features that Logscape Version 3.0 can offer, so you may need to upgrade to take advantage of them. These include Workspace linking, dynamic filters, new charting libraries and better CSS – all of which should make the user experience much more pleasant.
From our original user stories, our main concern was ensuring that the Support team could have a complete Grid overview from a single screen, which we made the Home Page of the App. The Task Monitor and Core Distribution graphs give a visual indication of the current load upon the Clusters. The Overview Tables allow us to filter down to a cluster level as well as giving us the headline figures. The Node Overview panel gives us a health check on each node as well as its current status. Finally, the scheduler events table lets us know if there are any events to be aware of. This provides our monitoring overview, with the Node, Job and Broker Monitoring Workspaces drilling down to the detail. As for reporting, the Job History and Cluster Usage Workspaces can provide detailed reports on any metric available.
Assuming you’ve got this far, you should have data flowing into Logscape. That you can view it is thanks to the .config files – open one up! They are simply XML files which contain the searches, workspaces and data types. Don’t worry, you don’t have to create them in XML form. To extract XML from a Logscape environment, use the backup tool. Of course, that means you build them first, then extract them into XML form. Why would you bother putting them in an App if they’re already saved to your environment?
- Portability: The App can be ported to another Logscape environment (perhaps from Development to Production) in a single step, rather than recreating them manually.
- Durability: If a search is created in the GUI and then deleted (perhaps by accident), it’s gone for good. If it resides in the App, it can be restored instantly. In a more extreme example, should the entire environment be rebuilt, deploying the App would instantly return it to the correct state.
- Consistency: As and when the app is superseded, it can be removed with all components instantly, avoiding conflicts or latent Workspaces.
- Version Control: An App is a collection of XML and scripts – all of which can be version controlled.
Designing Workspaces – Lessons from Construction
Hopefully, you’ll find the way we presented the data useful and easy to use. This iteration of the App is the culmination of six months’ worth of testing and redesigning, so here are some of the lessons we learnt along the way.
Include Hyperlink Menus on your Workspaces: There is a Logscape style guide – it’s not compulsory but it definitely helps. Having a hyperlink menu on the left makes finding subsequent pages so much easier for users.
Use the Linked Workspaces: The linked workspaces were added in version 3.0 and with a little work, you can make really effective workflows. It also allows you to apply filters to the same space. For example, you’ll notice the HPC App workspaces are broken down by cluster. That’s so you can analyse multiple clusters from the same environment, but use the workspace linking to quickly focus on the cluster that’s important to you.
Use the right charts: The data you’re analysing won’t always suit the same chart type. So make sure you use the right chart types:
- Line-Connect/c3.spline: Suitable for continuous data changes like call queues or disk monitors.
- Stack/c3.area-spline.stack: Suitable for comparing levels of individual activity across multiple hosts, e.g. count of errors.
- Area/c3.area.stack: Used to compare percentages (e.g. Idle pct and ActivePct).
- Table: Use for heat mapped data and hyperlinks.
- Scatter/c3.scatter: Use for the txn(elapsed) function and events that are isolated incidents.
- Pie/d3.pie: Use for aggregated data.
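If it helps to keep the guidance above close at hand while building workspaces, it can be written down as a small lookup table. The Python sketch below does that; the chart names come straight from the list, while the data-shape labels used as keys are hypothetical shorthand invented for this example and are not Logscape terms.

```python
# Chart names are copied from the list above; the keys are hypothetical labels.
CHART_FOR_DATA = {
    "continuous": ("Line-Connect", "c3.spline"),
    "per-host comparison": ("Stack", "c3.area-spline.stack"),
    "percentage": ("Area", "c3.area.stack"),
    "heat map or hyperlinks": ("Table",),
    "isolated events": ("Scatter", "c3.scatter"),
    "aggregate": ("Pie", "d3.pie"),
}


def suggest_chart(data_shape: str) -> tuple[str, ...]:
    """Return the chart types suggested for a data shape."""
    try:
        return CHART_FOR_DATA[data_shape]
    except KeyError:
        known = ", ".join(sorted(CHART_FOR_DATA))
        raise ValueError(f"unknown data shape {data_shape!r}; expected one of: {known}") from None


if __name__ == "__main__":
    print(suggest_chart("percentage"))
```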
Hopefully, you have found this overview of the new HPC App useful, whether you are interested in monitoring HPC or just in following the general principles for designing your own app. If you have any feedback or further questions, please let us know!