One of the presentations I give often covers the various ways you can monitor your IBM i partition using IBM-provided tools.
When I talk about watches, I always ask the audience how many people are familiar with them. And I’m always amazed at how few people know about watches.
Watches are wonderful! I first blogged about them years ago (January 2010) in Automate Monitoring With Watches. Everything written in that prior blog is still correct. If you’re not familiar with watches, start by reading that blog.
To help you understand when and why you may want to consider using watches, I’ll provide some examples.
- Let’s say you have an intermittent problem you need to resolve where the symptom of that problem is a message to a QZDASOINIT job log. But which one? There can be hundreds of them and looking for it manually is very painful.
Using watches, you can efficiently monitor for messages sent to a job log.
You can start a single watch session that covers every QZDASOINIT job on the system and looks for the specific message you are interested in. When that message is logged in any of the watched jobs, your exit program runs and you can take whatever action you wish.
- You can watch for messages sent to the history log. IBM’s message monitors do not support the history log.
- Perhaps you are working with IBM on a PMR and the key diagnostic data IBM is asking for is a Product Activity Log (PAL) entry or a Licensed Internal Code (LIC) log entry. It’s tedious to manually go into STRSST to look for these artifacts. You can use watches to monitor for PAL entries and LIC logs so you are notified when one of these occurs instead of checking manually.
- Maybe you need a very simple solution for monitoring QSYSOPR during off hours. You could use watches to monitor the QSYSOPR message queue for high-severity messages and have your program send an email or a text message to the person on call for off-hours support.
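
To make the job log and history log examples above more concrete, here is a minimal sketch of the Start Watch (STRWCH) commands they might use. The session IDs DBSRVWCH and HSTWCH, the exit program MYLIB/WCHEXIT, and the message ID CPF1234 are all placeholders I made up for illustration; substitute your own values and prompt the command (F4) to review the remaining parameters.

```cl
/* Watch the job logs of all QZDASOINIT jobs for one message.      */
/* DBSRVWCH, MYLIB/WCHEXIT and CPF1234 are hypothetical values.     */
STRWCH     SSNID(DBSRVWCH) WCHPGM(MYLIB/WCHEXIT) +
             WCHMSG((CPF1234)) WCHMSGQ((*JOBLOG)) +
             WCHJOB((*ALL/*ALL/QZDASOINIT))

/* Watch the history log for the same (placeholder) message.        */
STRWCH     SSNID(HSTWCH) WCHPGM(MYLIB/WCHEXIT) +
             WCHMSG((CPF1234)) WCHMSGQ((*HSTLOG))

/* End a watch session when it is no longer needed.                 */
ENDWCH     SSNID(DBSRVWCH)
```

You can use the Work with Watches (WRKWCH) command to see which watch sessions are currently active.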
Watches also have some very significant advantages for automation:
- Watches are extremely efficient. There is almost no overhead until your watch program is called, and then the overhead is under your control since it is your program that will run. Your watch program runs in a separate job so it does not interfere with the running job that sent the message.
- As the example above demonstrated, you can watch for messages sent to job logs.
- Watches have a command (and API) interface and you can easily set up your watches to start when the system is IPLed by using basic work management features, such as autostart jobs.
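
As a rough sketch of the autostart job approach, the commands below assume a hypothetical CL program, MYLIB/STRWATCHES, that contains your STRWCH commands. The job description MYLIB/WCHJOBD, the user profile WCHOWNER, and the subsystem description MYLIB/MYSBS are also invented names; the user profile you choose needs whatever authority STRWCH requires on your system.

```cl
/* Job description whose request data calls the (hypothetical)      */
/* program that issues the STRWCH commands.                          */
CRTJOBD    JOBD(MYLIB/WCHJOBD) USER(WCHOWNER) +
             RQSDTA('CALL PGM(MYLIB/STRWATCHES)')

/* Autostart job entry: the job runs each time the subsystem        */
/* starts, which includes the start after an IPL if the subsystem   */
/* is started by your startup program.                               */
ADDAJE     SBSD(MYLIB/MYSBS) JOB(STRWATCHES) JOBD(MYLIB/WCHJOBD)
```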
The biggest challenge with watches is you need to develop the exit program that is called when the watched condition occurs. There is an example program in the IBM i Knowledge Center.
There’s also an IBM support document, STRWCH – Watch Exit Programs Explained with CL Example, that you might find very useful to get started.
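
To give a feel for how small an exit program can be, here is a minimal CL sketch. It assumes the four-parameter interface described for watch exit programs (watch option, session ID, an error indicator you return, and the event data) and that, for a message watch, the message ID sits in positions 5 through 11 of the event data. Please verify both against the IBM documentation before relying on them. The user profile ONCALL is a placeholder for whoever should be notified.

```cl
/* Watch exit program sketch. ONCALL is a hypothetical user profile. */
PGM        PARM(&WCHOPT &SSNID &ERROR &EVTDTA)
  DCL        VAR(&WCHOPT) TYPE(*CHAR) LEN(10)   /* *MSGID, *LICLOG or *PAL  */
  DCL        VAR(&SSNID)  TYPE(*CHAR) LEN(10)   /* Watch session ID         */
  DCL        VAR(&ERROR)  TYPE(*CHAR) LEN(10)   /* Returned to the system   */
  DCL        VAR(&EVTDTA) TYPE(*CHAR) LEN(1024) /* Event data (read only)   */
  DCL        VAR(&MSGID)  TYPE(*CHAR) LEN(7)
  DCL        VAR(&TEXT)   TYPE(*CHAR) LEN(100)

  /* Only act on message events in this sketch.                       */
  IF         COND(&WCHOPT *EQ '*MSGID') THEN(DO)
    /* Assumed layout: 4-byte length, then the 7-byte message ID.     */
    CHGVAR     VAR(&MSGID) VALUE(%SST(&EVTDTA 5 7))
    CHGVAR     VAR(&TEXT) VALUE('Watch' *BCAT &SSNID *BCAT +
                 'detected message' *BCAT &MSGID)
    SNDMSG     MSG(&TEXT) TOUSR(ONCALL)
  ENDDO

  /* Blanks tell the system no error occurred, so the watch goes on.  */
  CHGVAR     VAR(&ERROR) VALUE(' ')
ENDPGM
```

A real program would typically pull more fields out of the event data (such as the job that logged the message) and send an email or text instead of a message to a user.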
Next time you have a situation where you need automation to help you do your job better, consider expanding your toolset with watches.
This blog post was edited to fix broken links on April 13, 2020.
This blog post was originally published on IBMSystemsMag.com and is reproduced here by permission of IBM Systems Media.