Automatic Restart of DCE through Tivoli Management Environment
Original Publication Date: 1999-Dec-01
Included in the Prior Art Database: 2003-Jun-19
One aspect of monitoring the status of DCE servers with a *Tivoli Management Environment (TME) monitor is deciding what to do when one or more of the servers are observed to be "down". For DCE, it is possible to restart DCE remotely. Because an administrator may not want this function to apply at all times, a mechanism is also needed to disable it.
The DCE support implemented using Tivoli management interfaces provides a DCE Server Status monitor and a Restart DCE task. Automatic handling of a "down" status from the monitor should result in the Restart DCE task being called after a user-defined time interval. The minimum interval should be the time it takes to execute the Restart DCE task and get back an "up" status from the monitor, so that a host is not flooded with restart requests that have not had time to complete.
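The interval and disable behavior described above can be illustrated with a minimal Python sketch. It is not the Tivoli rule implementation: the class RestartPolicy, its fields, and the restart callback are hypothetical names introduced here, and the sketch assumes the user-defined interval is applied as the minimum spacing between restart requests for a single host.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict


@dataclass
class RestartPolicy:
    """Hypothetical helper: decides whether a "down" status triggers a restart."""

    # Assumption: user-defined minimum number of seconds between restart
    # requests sent to the same host.
    restart_interval: float
    # Assumption: administrator switch that disables automatic restart.
    enabled: bool = True
    # Time of the most recent restart request, keyed by host name.
    _last_request: Dict[str, float] = field(
        default_factory=dict, init=False, repr=False
    )

    def on_status(self, host: str, status: str, now: float,
                  restart: Callable[[str], None]) -> bool:
        """Handle one monitor reading; return True if a restart was requested."""
        if status == "up":
            # The host recovered, so clear its record of earlier requests.
            self._last_request.pop(host, None)
            return False
        if not self.enabled:
            # Automatic restart has been switched off by the administrator.
            return False
        last = self._last_request.get(host)
        if last is not None and now - last < self.restart_interval:
            # An earlier restart may still be running; do not flood the host.
            return False
        self._last_request[host] = now
        restart(host)
        return True
```

With restart_interval set to 300 seconds, a second "down" reading for the same host 120 seconds after a restart request is ignored, while one arriving at 300 seconds or later issues a new request. Setting enabled to False suppresses all requests.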
The DCE rules were written to handle the "up/down" status received from the monitor. The "up" processing differs depending on whether one of the servers that was down was a core server (dced, secd, cdsd, or cdsadv). When a core server that was down comes back up, many of the events that came in from that host are closed automatically. This does not occur if only non-core servers went down and came back up.
The "down" processing likewise differs depending on whether one of the servers that is down is a core server. For a core server that is down, a DCECoreServerDown event is generated, with the name of the core server included in the new event's msg slot as follows:
msg="The core server,