The Performance Data Investigator has a wealth of capabilities, some of which I have written about in the past. This week, I want to review the Health Indicators.
Health Indicators are a set of graphs (perspectives) found within the Investigate Data task in the Navigator for i Web console. They are based upon Collection Services data and are included with the base operating system, so you do not need to purchase or install any additional licensed programs to use them. Health Indicators give you an overall view of the performance of your partition and make it easy to determine whether you should investigate that performance further.
As you can see from the screen capture above, there are several types of Health Indicators. The System Resources Health Indicators provide an overview of high-level performance metrics for the partition (CPU, memory, disk, etc.). The other types of Health Indicators give you a bit more detail. Below is an example of the System Resources Health Indicators:
It’s important that you understand how to interpret the graph. For each metric, the colors (green, yellow, and red) show the percentage of intervals within the collection whose values fell within that threshold level.
It’s easiest to explain by referring to the example above. If we look at the CPU metric, we see that the CPU metrics were healthy (green) in around 70% of the intervals. In about 10% of the intervals, values were at the warning level (yellow), and in about 20% of the intervals, values were above the action level (red). This graph does not tell us when the warning and action thresholds were hit, only that they were hit in some percentage of the intervals within the collection.
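To make the interval arithmetic concrete, here is a minimal sketch in Python of how such a breakdown could be computed. It is only an illustration, not how the Performance Data Investigator is implemented: the function names, the sample values, the threshold values (80 and 90), and the choice to treat a value equal to a threshold as having reached it are all assumptions made for the example.

```python
from collections import Counter

def classify(value, warning, action):
    """Return 'green', 'yellow', or 'red' for one interval's value.

    Assumption: a value equal to a threshold counts as reaching it.
    """
    if value >= action:
        return "red"
    if value >= warning:
        return "yellow"
    return "green"

def health_summary(values, warning, action):
    """Return the percentage of intervals that fall in each threshold level."""
    if not values:
        raise ValueError("the collection contains no intervals")
    counts = Counter(classify(v, warning, action) for v in values)
    total = len(values)
    return {level: 100.0 * counts[level] / total
            for level in ("green", "yellow", "red")}

# Hypothetical CPU utilization (%) for ten collection intervals.
cpu_samples = [35, 42, 50, 61, 48, 55, 66, 85, 93, 97]

# Hypothetical thresholds: warning at 80, action at 90.
summary = health_summary(cpu_samples, warning=80, action=90)
print(summary)  # {'green': 70.0, 'yellow': 10.0, 'red': 20.0}
```

Seven of the ten made-up samples are below the warning threshold, one falls between the two thresholds, and two reach the action threshold, which reproduces the 70/10/20 split described for the CPU metric. Note that the order of the samples plays no part in the result, which is exactly why the graph cannot tell you when the thresholds were hit.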
We can drill into the CPU health indicators to see more detail:
From this chart we can now see that partition CPU Utilization is a concern about 5% of the time, but the major issue is Jobs CPU Queuing, which was happening around 80% of the time during the collection. To better understand what’s happening, we can drill into the CPU Utilization and Waits Overview to look at the Collection Services data over time and see when these situations were occurring:
In the resulting chart, you can now see when the CPU Queuing time occurred and when the partition CPU utilization exceeded the Action threshold (circled in red).
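The over-time view answers the "when" question that the percentage view cannot. As a rough sketch of the same idea in Python, the snippet below groups consecutive intervals that reached the action threshold into periods. The interval timestamps, utilization values, threshold of 90, and the `action_periods` helper are all hypothetical and chosen only for illustration.

```python
from itertools import groupby

def action_periods(samples, action):
    """Return (first, last) interval labels for each run of consecutive
    intervals whose value is at or above the action threshold.

    `samples` is a time-ordered list of (label, value) pairs.
    """
    periods = []
    for over, run in groupby(samples, key=lambda s: s[1] >= action):
        if over:
            run = list(run)
            periods.append((run[0][0], run[-1][0]))
    return periods

# Hypothetical time-ordered CPU utilization samples (interval start, %).
samples = [
    ("09:00", 40),
    ("09:15", 92),
    ("09:30", 95),
    ("09:45", 60),
    ("10:00", 91),
]

periods = action_periods(samples, action=90)
print(periods)  # [('09:15', '09:30'), ('10:00', '10:00')]
```

Here the made-up data contains two separate periods above the action threshold, something the percentage breakdown alone would report only as "60% of intervals in red".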
These threshold values have defaults supplied by IBM, but you can modify them by selecting Define Health Indicators from the action drop-down:
The work management System Status GUI also provides links to the Health Indicators. The General tab provides a link to the System Resources Health Indicators, as shown below.
If you select other tabs from System Status, you will find links to other types of health indicators: the Processors tab includes a link to the CPU Health Indicators, the Memory tab includes a link to the Memory Health Indicators, and the Disk Space tab includes a link to the Disk Health Indicators. These buttons allow you to go from the work management system status information to view the Health Indicators based upon the current Collection Services data.
As previously mentioned, the Health Indicators are based upon Collection Services data. You can view the Health Indicators for the most recent collection (near real-time information that shows how the system is performing now), as well as for any past collections you still have available.
In summary, you can use the Health Indicators to get an overview of the performance of your IBM i partition; if you see areas of concern, you can then use the Performance Data Investigator to drill into the performance data details to determine what issues may be occurring on your partition.
This blog post was originally published on IBMSystemsMag.com and is reproduced here by permission of IBM Systems Media.