Workload Heatmap With Tableau

by Dec 15, 2006

Hi, I thought I would post an example of how Tableau can be used to generate very effective workload characterization reports from the data that up.time collects from each of your agent systems.

Before reading this post, I recommend reviewing these KB articles for basic background on interfacing with your up.time DataStore via ODBC and setting up a connection to your DataStore within Tableau:

Connecting to the DataStore via ODBC
Creating custom reports with Tableau

The graphic below is commonly referred to as a workload heat map. It gives a high-level view of CPU utilization grouped by hour across many systems over a given time period. For each hour block the CPU total is averaged, and as that average gets closer to 100% utilized the square changes from green (OK) to red (highest utilization). In addition to the standard up.time reports and graphs, this display gives a very valuable perspective on workload characterization across your enterprise.
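If you want to see the idea behind the heat map outside of Tableau, here is a minimal Python sketch of the same aggregation: average CPU per host per hour, then bucket the average into a color. The sample values, the function names, the intermediate "yellow" bucket and the 60/85 thresholds are all assumptions for illustration; they are not taken from the up.time DataStore or from the attached workbook.

```python
from collections import defaultdict
from datetime import datetime


def hourly_cpu_average(samples):
    """Average CPU % per (host, weekday, hour) cell; weekday 0 is Monday."""
    totals = defaultdict(lambda: [0.0, 0])  # cell -> [sum, sample count]
    for host, timestamp, cpu_percent in samples:
        cell = (host, timestamp.weekday(), timestamp.hour)
        totals[cell][0] += cpu_percent
        totals[cell][1] += 1
    return {cell: total / count for cell, (total, count) in totals.items()}


def heat_color(avg_cpu, warn=60.0, hot=85.0):
    """Map an hourly average to a color bucket (thresholds are assumptions)."""
    if avg_cpu >= hot:
        return "red"
    if avg_cpu >= warn:
        return "yellow"
    return "green"


# Hypothetical samples: (host, timestamp, CPU % used)
samples = [
    ("happy", datetime(2006, 12, 11, 9, 0), 90.0),
    ("happy", datetime(2006, 12, 11, 9, 30), 100.0),
    ("prod1", datetime(2006, 12, 11, 9, 15), 10.0),
]

for (host, weekday, hour), avg in sorted(hourly_cpu_average(samples).items()):
    print(host, weekday, hour, round(avg, 1), heat_color(avg))
```

Each printed row corresponds to one square in the heat map: a host, a day of the week, an hour, and the color that hour would get.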

Let's take a quick look at the workload profile in the screenshot above.

happy – happy is a Solaris server running at maximum capacity almost 24 hours a day (as indicated by the red squares across the board). Generally this is not a good thing, but you would need to check the process queue to tell whether processes are backlogging due to CPU utilization or whether the high CPU utilization is not having a major impact on the system's ability to process new jobs.
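As a rough sketch of that check, you could compare the average run queue length for an hour against the number of CPUs in the box. The function name, the 90% "busy" threshold and the "queue longer than CPU count" rule of thumb are assumptions for illustration only; tune them to what you know about your own systems.

```python
def cpu_pressure(avg_cpu, avg_run_queue, cpu_count, busy=90.0):
    """Classify one hour of data for one host.

    avg_cpu       -- average CPU % used during the hour
    avg_run_queue -- average number of runnable processes waiting for CPU
    cpu_count     -- number of CPUs in the system
    """
    if avg_cpu < busy:
        return "ok"
    # Assumed rule of thumb: more waiting processes than CPUs means a backlog.
    if avg_run_queue > cpu_count:
        return "backlogged"
    return "busy but keeping up"


# Hypothetical numbers for a 4-CPU server
print(cpu_pressure(avg_cpu=98.0, avg_run_queue=9.5, cpu_count=4))
print(cpu_pressure(avg_cpu=98.0, avg_run_queue=1.2, cpu_count=4))
```

The first call describes a server where jobs are queuing up behind the CPUs; the second describes one that is fully used but still keeping pace with new work.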

hpinteg – hpinteg is an HP Integrity rx2620 running an Oracle database. We can see that it only has a few hotspots throughout the week. In particular, the system is utilized more between 8-10 Monday through Friday. Knowing the specifics of this system, we know that some heavy reports are run during those hours, adding workload to the system for those peak periods.

prod1 – prod1 is a system that shows a typical workload pattern for system backups. We can see that the system is not heavily utilized at any time outside of 11-12pm daily, with an extended period Tuesday morning. This system runs a complete backup Monday night that flows into Tuesday morning, and we can see CPU utilization skyrocket during that period. Every other night of the week a much less intensive incremental backup job runs that only generates workload for a few hours. Taking a higher-level view of this profile, I can see that resources on the system are not an issue, but we need to look into why the full backup is taking so long to finish. It shouldn't overflow into work hours on Tuesday morning, because that will start to impact end users of this system.
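To spot this kind of overflow without eyeballing the chart, you could flag any hot hour that falls inside the working day. This is only a sketch: the 8:00 to 18:00 weekday window, the 85% threshold, the function name and the sample averages are all assumptions, not values from the workbook.

```python
def hot_hours_in_work_window(hourly_avg, hot=85.0, work_start=8, work_end=18):
    """Return the (weekday, hour) cells that are hot during work hours.

    hourly_avg maps (weekday, hour) -> average CPU %, with weekday 0 = Monday.
    """
    return sorted(
        (weekday, hour)
        for (weekday, hour), cpu in hourly_avg.items()
        if weekday < 5 and work_start <= hour < work_end and cpu >= hot
    )


# Hypothetical averages for a backup that starts Monday night
# and is still running on Tuesday morning.
backup_profile = {
    (0, 23): 97.0,  # Monday 23:00, backup starts
    (1, 7): 95.0,   # Tuesday 07:00, still before work hours
    (1, 8): 92.0,   # Tuesday 08:00, now overlapping work hours
    (1, 9): 90.0,   # Tuesday 09:00
    (2, 9): 20.0,   # Wednesday 09:00, normal load
}

print(hot_hours_in_work_window(backup_profile))
```

Any cell in the result is an hour where a heavy job is competing with end users, which is a good starting point for asking why the job is running that long.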

To help you generate your own examples, I have attached the Tableau workbook that was used to generate this quick example. It can easily be adapted to group data over different time periods, different performance stats, or whatever you like.

Here is another example screenshot where systems are grouped individually, showing workload per day over an entire month.