Watches provide a way to automate tasks when certain events occur. An event can be a message, a Licensed Internal Code (LIC) log (also known as a VLOG), or a Problem Activity Log (PAL) entry. The primary motivation for adding watches to the operating system was to improve diagnostics, but watches, particularly message watches, can also be used for automated monitoring of system conditions. Watches notify you programmatically when the event occurs so immediate action can be taken. Because the actions are automated, watches are also very useful for detecting situations that occur intermittently.
Watches have minimal system overhead when they’re defined but not “hit.” When the watch condition occurs, the actions taken depend upon how the watch is defined. When using watches independent of traces, the actions taken are whatever the watch for event exit program is coded to do.
To start a watch, use the Start Watch (STRWCH) command or QSCSWCH API. To end a watch, use the End Watch (ENDWCH) command or QSCEWCH API. You can also use the Work with Watches (WRKWCH) command to display, start, or end watches. Up to 10,000 watches can be active at one time.
Watching for LIC logs or PAL entries is probably something you won’t need to do, but watching for messages can be an effective way to perform proactive and automated monitoring of system conditions. You can watch for messages sent to any message queue, the history log or job logs. When you define the watch condition, you can specify several different conditions you want to check for to limit the situations under which the watch will trigger. When the watch condition is matched, your exit program will get control and you can take whatever actions you deem appropriate for the condition.
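To make the shape of a message watch concrete, here is a minimal sketch in Python that only assembles the command strings you might key in or submit. The session ID DISKWATCH, the library MYLIB, and the exit program WCHEXIT are hypothetical names, and CPF0907 is used purely as a sample message ID. The keywords shown (SSNID, WCHPGM, WCHMSG, WCHMSGQ) reflect my understanding of the STRWCH command, so prompt the command on your own release to confirm them before relying on this.

```python
def build_strwch(session_id, watch_program, message_ids, message_queue="*SYSOPR"):
    """Assemble a STRWCH command string for a message watch.

    watch_program is a (library, program) pair naming the exit program.
    All names passed in are illustrative; nothing is run on the system.
    """
    library, program = watch_program
    # Each watched message ID goes in its own parenthesized element.
    msgs = " ".join(f"({msg_id})" for msg_id in message_ids)
    return (
        f"STRWCH SSNID({session_id}) WCHPGM({library}/{program}) "
        f"WCHMSG({msgs}) WCHMSGQ(({message_queue}))"
    )


def build_endwch(session_id):
    """Assemble the matching ENDWCH command string for the same session."""
    return f"ENDWCH SSNID({session_id})"


if __name__ == "__main__":
    # Hypothetical watch: call MYLIB/WCHEXIT when CPF0907 arrives on QSYSOPR.
    print(build_strwch("DISKWATCH", ("MYLIB", "WCHEXIT"), ["CPF0907"]))
    print(build_endwch("DISKWATCH"))
```

The session ID is what ties the two commands together: whatever name you start the watch with is the name you use to end it.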
Using watches to automate monitoring of messages is much more efficient than using the Navigator for i Message Monitors. Navigator Message Monitors use a polling technique to retrieve messages, which has more overhead and is less timely than watches.
However, Message Watches aren’t as easy to set up, since there’s no GUI for watches, only command and API interfaces. In addition, with watches you have to write your own watch exit program, whereas Navigator Message Monitors have a nice interface to define the actions taken when the monitored condition occurs. IBM has provided an example watch exit program in the Knowledge Center.
The now unsupported Management Central function had Job Monitors, where you could monitor for job log messages. However, monitoring job log messages with Management Central Job Monitors was very expensive in terms of the system resources used due to its polling nature. Using watches, you can watch for messages sent to job logs, which is a much improved solution since watches use significantly fewer system resources.
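The difference between polling and being notified can be illustrated with a toy model. The sketch below is not how Navigator, Management Central, or watches are implemented; it is a simplified simulation, with made-up arrival times and a made-up polling interval, that counts how often each approach has to run and how long a message waits before it is noticed.

```python
def polling_monitor(arrivals, interval, horizon):
    """Simulate a monitor that checks for messages every `interval` ticks.

    Returns (number of checks performed, delay before each message is seen).
    """
    checks = 0
    delays = []
    pending = sorted(arrivals)
    for now in range(interval, horizon + 1, interval):
        checks += 1  # the poller does work whether or not anything arrived
        while pending and pending[0] <= now:
            delays.append(now - pending.pop(0))
    return checks, delays


def watch_monitor(arrivals):
    """Simulate a watch: it runs only when a message actually arrives.

    Returns (number of invocations, delay before each message is seen).
    """
    return len(arrivals), [0 for _ in arrivals]


if __name__ == "__main__":
    arrivals = [7, 31]  # hypothetical ticks at which messages are sent
    print(polling_monitor(arrivals, interval=10, horizon=60))  # (6, [3, 9])
    print(watch_monitor(arrivals))                             # (2, [0, 0])
```

In this model the poller runs six times to find two messages and sees each one late, while the watch runs twice and sees each one immediately. Shortening the polling interval reduces the delay but increases the number of checks, which is the trade-off a watch avoids.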
When watches are defined, additional jobs run on the system. These jobs differ from release to release. On 5.4, there was one batch job per watch in the QUSRWRK subsystem, and each job ran until the watch was ended. This changed starting in 6.1, where watches were changed to use batch prestart jobs. There’s a single batch job, QSCWCHMS, in QUSRWRK for message watches; when a message watch condition is hit, the user exit program is called in a prestart batch job, QSCWCHPS. This change reduces the number of jobs in the QUSRWRK subsystem.
IBM i has a function called Service Monitor that uses watches as its notification mechanism. Service Monitor was added in the 5.4 release and is controlled by the QSFWERRLOG system value; when this system value is set to *LOG, Service Monitor will be active. Service Monitor uses many watches, and you’ll see evidence of this in the jobs running in QUSRWRK, as well as when you work with watches. I’ll talk more about Service Monitor in a future blog article.
For a little history, watches were initially added to the operating system in the V5R3 release as a way to automate the ending of traces. In 5.4, watches were supported independent of the trace commands.
This blog post was updated for currency on January 29, 2020.
This blog post was originally published on IBMSystemsMag.com and is reproduced here by permission of IBM Systems Media.