iCan Blog Archive

In today’s computing environment, disk response time is a critical factor in understanding your system’s performance. Processor speed has improved significantly over the years, while disk I/O performance has not improved at the same pace. Solid state drives (SSDs) have the potential to change this, but in practice spinning disks continue to be a major component of system performance.

It’s important to understand the role of disk performance, and one critical measure is whether slow disk operations are occurring. Too many slow disk operations can have a negative impact on overall system performance.

In the 6.1 release of IBM i, the capability was added to collect what we call “disk response time groups.” Disk response times are measured by the low-level I/O disk driver component within the Licensed Internal Code (LIC); that is, the time between sending the disk I/O request and receiving the corresponding response is measured. Response time groups are time ranges for the I/O operations. For each time range, a count is maintained of the number of I/O operations whose response time fell within that range. Since the response times are measured within the LIC, they apply to all disk operations, whether to internal disk or external storage.

In the initial support in 6.1, the response time groups were defined as follows:

Range 1:     0ms  to  <      1ms
Range 2:     1ms  to  <     16ms
Range 3:    16ms  to  <     64ms
Range 4:    64ms  to  <    256ms
Range 5:   256ms  to  <  1,024ms
Range 6:             >=  1,024ms

In the 7.1 release, additional disk response time groups were added to provide more granularity. In particular, smaller response time groups were defined to better account for the faster response times of I/O operations to SSDs; these new response time groups are measured in microseconds rather than milliseconds. For these new response time groups, the counts are accumulated separately for read and write operations.

The combination of the additional granularity and separate groups for read and write times results in a total of 22 response time group fields in the 7.1 release (11 ranges each for reads and writes). The response time groups are fields in the QAPMDISKRB file. The response time groups that were added in the 6.1 release continue to be supported on 7.1 as well, so 7.1 has two sets of response time groups. This support remains unchanged in the releases after 7.1.

The 7.1 response time groups are as follows:

Range  1:          0us  to  <         15us
Range  2:         15us  to  <        250us
Range  3:        250us  to  <      1,000us
Range  4:      1,000us  to  <      4,000us
Range  5:      4,000us  to  <      8,000us
Range  6:      8,000us  to  <     16,000us
Range  7:     16,000us  to  <     64,000us
Range  8:     64,000us  to  <    256,000us
Range  9:    256,000us  to  <    500,000us
Range 10:    500,000us  to  <  1,024,000us
Range 11:                   >= 1,024,000us
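
To make the range boundaries concrete, here is a minimal Python sketch of how a single response time would map to one of the 11 ranges above. This is only an illustration of the bucketing logic, not how the LIC implements it; the names BOUNDS_US, range_number, and histogram are hypothetical and do not correspond to anything in IBM i.

```python
from bisect import bisect_right

# Exclusive upper bounds of ranges 1-10, in microseconds, copied from the
# list above. Anything at or above the last bound falls into range 11.
BOUNDS_US = [15, 250, 1_000, 4_000, 8_000, 16_000,
             64_000, 256_000, 500_000, 1_024_000]


def range_number(response_us):
    """Return the 1-based range number for a response time in microseconds."""
    if response_us < 0:
        raise ValueError("response time cannot be negative")
    # bisect_right counts how many bounds are <= response_us, so a time
    # exactly on a boundary (for example 15us) lands in the next range.
    return bisect_right(BOUNDS_US, response_us) + 1


def histogram(times_us):
    """Count how many response times fall into each of the 11 ranges."""
    counts = [0] * (len(BOUNDS_US) + 1)
    for t in times_us:
        counts[range_number(t) - 1] += 1
    return counts
```

For example, range_number(9_000) returns 6, because 9,000us lies between 8,000us and 16,000us. In the real collection, this counting would be kept separately for reads and writes.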

Beginning with the 7.1 release of the Performance Data Investigator, there are IBM-supplied charts of the disk response time groups. The following screen capture shows the charts and tables that are available in the Performance Data Investigator for disk response time analysis. Note that there are two sets of charts: one in the “Detailed” folder, and a second set in the Disk Response Time folder. The “Detailed” charts are based upon the more detailed 7.1 response time groups, while the ones in the Disk Response Time folder use the more general response time groups that were introduced in 6.1.

Figure 1: Disk response time charts and tables in the Performance Data Investigator (screen capture)

Finally, I’ve included an example screen capture of the “Disk I/O Rates Overview – Detailed” chart below, which gives you a histogram of the disk response time groups. This screen capture is using the new PDI charting engine introduced early in 2020.

If you see long response times, you should perform more detailed analysis to understand and correct them. Disk response times longer than 10 milliseconds are considered bad and should be investigated. For more information on disk performance, refer to section 8.2.3 and Appendix A in “End to End Performance Management on IBM i.”
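
Because 10 milliseconds (10,000us) falls inside the 8,000us to 16,000us range of the 7.1 groups, the bucket counts alone cannot give an exact count of operations over that threshold, but they do bound it. The sketch below, in Python, shows that arithmetic. It assumes you have already extracted 11 counts, one per 7.1 range, for reads or for writes; the function name slow_share_bounds and the sample counts are hypothetical, and no actual QAPMDISKRB field names are used.

```python
# Inclusive lower bounds of the 11 ranges in the 7.1 release, in microseconds.
LOWER_BOUNDS_US = [0, 15, 250, 1_000, 4_000, 8_000, 16_000,
                   64_000, 256_000, 500_000, 1_024_000]
# Matching exclusive upper bounds; the last range has no upper limit.
UPPER_BOUNDS_US = LOWER_BOUNDS_US[1:] + [float("inf")]


def slow_share_bounds(counts, threshold_us=10_000):
    """Return (minimum, maximum) share of operations at or above the threshold.

    counts holds one count per range, in range order (hypothetical input).
    """
    if len(counts) != len(LOWER_BOUNDS_US):
        raise ValueError("expected one count per range (11 values)")
    total = sum(counts)
    if total == 0:
        return (0.0, 0.0)
    # Ranges that start at or above the threshold are slow for certain.
    definite = sum(c for lo, c in zip(LOWER_BOUNDS_US, counts)
                   if lo >= threshold_us)
    # Ranges that end above the threshold might contain slow operations.
    possible = sum(c for hi, c in zip(UPPER_BOUNDS_US, counts)
                   if hi > threshold_us)
    return (definite / total, possible / total)


# Made-up sample: 1,000 operations spread across the ranges.
sample_counts = [0, 500, 300, 100, 50, 30, 15, 5, 0, 0, 0]
low, high = slow_share_bounds(sample_counts)
```

With these made-up counts, between 2% and 5% of the operations took 10 milliseconds or longer; the uncertainty comes entirely from the 30 operations in the 8,000us to 16,000us range.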

This blog post was originally published on IBMSystemsMag.com and is reproduced here by permission of IBM Systems Media.