Design of General Message-Passing Watchdog Process

IP.com Disclosure Number: IPCOM000129517D
Original Publication Date: 2005-Oct-06
Included in the Prior Art Database: 2005-Oct-06
Document File: 3 page(s) / 34K

Publishing Venue

IBM

Abstract

As client/server software architectures proliferate, each component increasingly needs to know the state and availability of other components. There is currently no good way to do so, because the conventional approach is initiator driven: whenever an entity needs the state of another entity, the initiating entity must probe it. For example, when the Resource Manager migrates 10,000 documents (or batches of documents, whatever the unit is) to Tivoli Storage Manager (TSM), it probes TSM 10,000 times to see whether it is available. This is impractical. If TSM does crash, the worst case is that a timeout occurs during the file transfer. That timeout would likely occur sooner than 10,000 probes could complete, and it would consume fewer resources than probing TSM 10,000 times.

This is the abbreviated version, containing approximately 53% of the total text.

Our invention is an approach to communicating "states" between entities. One of our objectives is to make it as generic as possible so that it applies to any other product. The approach uses a watchdog process, which is trivial to implement in Java, C++, or any other language. The watchdog is placed on the machine that needs to be monitored. It takes as input a list of processes to monitor, along with any information it needs to verify that each process is accepting requests, such as a port number or connection information.
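The watchdog's input and its connection test might look roughly like the following Python sketch. It is only an illustration under stated assumptions: the disclosure does not define a configuration format, so the `MonitoredProcess` record, its field names, and the `is_accepting` helper are all hypothetical, and a plain TCP connect stands in for "making an actual connection to the process".

```python
import socket
from dataclasses import dataclass


@dataclass(frozen=True)
class MonitoredProcess:
    """One entry in the watchdog's input list (hypothetical field names)."""
    name: str                 # label for the process, e.g. "TSM"
    host: str                 # where the process listens
    port: int                 # port number used for the connection test
    timeout_s: float = 2.0    # assumed default; not taken from the disclosure


def is_accepting(target: MonitoredProcess) -> bool:
    """Return True if a TCP connection to the process can be opened."""
    try:
        with socket.create_connection((target.host, target.port),
                                      timeout=target.timeout_s):
            return True
    except OSError:
        # Refused connections, timeouts, and unreachable hosts all count as "down".
        return False
```

A real deployment would likely replace the bare TCP connect with a protocol-level check specific to the monitored product.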

Then periodically (at an interval configurable by the user), the watchdog checks whether each monitored process is up, either by checking that the process is running or by making an actual connection to it. During this check it can also gather statistics, such as network throughput. If and only if the state of a monitored process changes, the watchdog sends a message to the remote machine that needs the process. The message could take the form of an email, a file, etc., so that the remote machine can retrieve it whenever it wants. Previous messages about the monitored process can be purged periodically on the remote machine to save space. This approach is good because the remote machine acts only when it needs to, which frees it from the burden of having to respond to every machine it is connected to whenever it is pinged (polling).

For example, suppose we have a content management system with one central library server and 50 remote resource managers (RMs). We place this generic watchdog process on each of the remote resource managers. Every 5 minutes, each watchdog checks for a change in the status of its Resource Manager. If the state has changed, it sends an email (or some other message) to the library server machine, notifying it of the RM's latest state and statistics. Then, only if the library server needs access to a particular RM, say RM 23, does the library server check the email from RM 23. As you can see, in this approach, the communication cost on each RM as well as on...
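The notify-on-change behaviour described above can be sketched in Python as follows. This is a minimal illustration, not the authors' implementation: the `Watchdog` and `Mailbox` classes and their method names are hypothetical, an in-memory dictionary stands in for the email or file transport, and treating the very first check as a state change (so the receiver learns the initial state) is an assumption.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, Optional


@dataclass
class Watchdog:
    """Probes one process and reports only when its state changes."""
    source: str                            # e.g. "RM23" (hypothetical id)
    probe: Callable[[], bool]              # returns True if the process is up
    notify: Callable[[str, bool], None]    # delivers (source, is_up)
    last_state: Optional[bool] = None      # None until the first check

    def check_once(self) -> bool:
        """Run one probe; return True if a message was sent."""
        state = self.probe()
        if state == self.last_state:
            return False                   # unchanged: stay silent
        self.last_state = state
        self.notify(self.source, state)
        return True


@dataclass
class Mailbox:
    """Receiver-side store that keeps only the newest message per source."""
    latest: Dict[str, bool] = field(default_factory=dict)

    def deliver(self, source: str, is_up: bool) -> None:
        self.latest[source] = is_up        # overwriting discards older messages

    def state_of(self, source: str) -> Optional[bool]:
        return self.latest.get(source)     # consulted only when access is needed


# Usage: three scheduled checks, with canned probe results in place of a real probe.
mailbox = Mailbox()
readings = iter([True, True, False])
dog = Watchdog("RM23", lambda: next(readings), mailbox.deliver)
sent = [dog.check_once() for _ in range(3)]
print(sent, mailbox.state_of("RM23"))
```

In a running watchdog, `check_once` would be called from a loop that sleeps for the configured interval (5 minutes in the example above), and `deliver` would be replaced by whatever message transport the installation uses.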