Intelligent selection of logs required during recovery processing
Original Publication Date: 2002-Sep-16
Included in the Prior Art Database: 2003-Jun-21
Disclosed is a process for improving the speed of restoring a shared database by replaying the logs of only those systems that have updated the database since the previous backup. The IBM* Websphere* MQ* product (MQ), database managers and other programs increasingly allow concurrent shared access to recoverable resources from multiple instances of queue/database managers. This is done to provide increased capacity, reliability and availability. If the underlying data storage mechanism fails, recovering the data requires restoring a backup copy and then replaying the logs of each of the queue/database managers to bring the data back to its state immediately prior to the failure. Performance studies have shown that the volume of log replay is the most significant factor in achieving a fast recovery from failure, so reducing the amount of log data that needs to be replayed is desirable.

Although in a data-sharing environment every instance of a queue/database manager has the potential to make recoverable updates that need to be replayed from the log, in practice some instances may have been inactive, while others may only have performed read-only or non-recoverable access to the data, or may have made all of their updates to portions of the data that have not failed. What is needed is a means of identifying which queue/database manager instances are likely to have made updates to the failed portion of the data, so that when a recovery is needed only the appropriate subset of the logs need be replayed. However, it is important to realize that storage media failure is a rare event, and therefore the overhead of collecting the information about which logs will be needed during a potential replay should not impose a significant burden on the normal operation of the queue/database manager.
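The idea above can be illustrated with a minimal sketch. This is not the disclosed implementation; all names (RecoveryTracker, note_update and so on) are hypothetical, and the sketch assumes a simple in-memory registry mapping each data portion to the set of manager instances that have made recoverable updates to it since the last backup.

```python
from collections import defaultdict


class RecoveryTracker:
    """Hypothetical sketch: track which queue/database manager
    instances have made recoverable updates to which data portions
    since the most recent backup of each portion."""

    def __init__(self):
        # portion -> set of manager instance names that updated it
        self._updaters = defaultdict(set)

    def note_update(self, manager, portion):
        # Record a recoverable update. To keep normal-operation
        # overhead low, a real system would record only the FIRST
        # update per manager per backup interval; read-only and
        # non-recoverable access is never recorded.
        self._updaters[portion].add(manager)

    def backup_taken(self, portion):
        # A fresh backup of a portion means earlier log data is no
        # longer needed to recover it, so forget its updaters.
        self._updaters.pop(portion, None)

    def logs_needed(self, failed_portions):
        # At recovery time, return only the managers whose logs must
        # be replayed to restore the failed portions.
        needed = set()
        for portion in failed_portions:
            needed |= self._updaters.get(portion, set())
        return needed
```

For example, if QMGR1 updated portion P1 and QMGR2 updated only portion P2 before P2 was freshly backed up, then a failure of both portions would require replaying only QMGR1's log, since the backup of P2 made QMGR2's earlier log records unnecessary.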