Optimization Methods for Fast Loading of TOC's.
Original Publication Date: 2005-Sep-02
Included in the Prior Art Database: 2005-Sep-02
This disclosure pertains to management and optimization of tables of contents (TOC's) kept for backupsets. The assumptions are that for each backupset generated, a TOC will be created that is stored on the media with the backupset. Another copy will be retained in a fast access (disk) storage pool for performance purposes.
For TDP's whose applications are not being changed to support restore from backupsets, there are two access methods:
1. query and restore active data.
2. query and restore data from all previous backups (inactive + active).
The second access method requires that all TOC's for backupsets containing TDP data be loaded and merged with the active inventory for the TDP client. This all-or-nothing approach, which is needed for transparency, creates a performance issue. As time goes by, the number of TOC's will continue to grow, and the time to load all of the TOC's associated with the node will grow without bound. This is especially problematic for TDP's, where most data is kept for very long periods.
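The cost of the load-everything approach can be illustrated with a minimal sketch. All names here (TocEntry, load_all_tocs, the sample object names) are hypothetical and chosen only for illustration; they are not part of any product interface described in this disclosure.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TocEntry:
    """One object listed in a TOC (hypothetical layout)."""
    object_name: str
    backupset_id: str


def load_all_tocs(backupset_tocs, active_inventory):
    """Merge every backupset TOC with the active inventory at query time.

    The work done is proportional to the number of TOC's, so it grows
    with every backupset that is generated.
    """
    merged = list(active_inventory)
    tocs_loaded = 0
    for toc in backupset_tocs:
        merged.extend(toc)
        tocs_loaded += 1
    return merged, tocs_loaded


# Assumed scenario: one small backupset per day for 100 days.
tocs = [[TocEntry(f"db.log.{day}", f"bs{day}")] for day in range(100)]
active = [TocEntry("db.log.current", "active")]
merged, tocs_loaded = load_all_tocs(tocs, active)
```

In this sketch every query for inactive data repeats the full loop, so the interactive user pays for all TOC's each time.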
The basic idea behind this solution is to keep a "running master TOC." A master TOC is created and then updated incrementally as new backupsets are created. Because the changes associated with a given backupset are small compared to the total universe of all backupsets, the cost of adding to the master TOC is also small, and can be done in a constant amount of time regardless of the number of backupsets that have been previously created. More importantly, updating the master TOC is done as part of the back-end processing instead of in response to an interactive query, meaning that end-user response time improves dramatically.
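The incremental update can be sketched as follows. This is an illustration under stated assumptions, not the disclosed implementation: the MasterToc class, its in-memory dictionary layout, and the sample object names are all hypothetical.

```python
class MasterToc:
    """Running master TOC, updated once per new backupset (hypothetical)."""

    def __init__(self):
        # object name -> list of backupset ids that contain it
        self.entries = {}
        self.merged_backupsets = set()

    def add_backupset(self, backupset_id, object_names):
        """Back-end step: fold one new backupset's TOC into the master.

        The work depends only on the size of the new TOC, not on how
        many backupsets were merged before. Returns the number of
        entries added (0 if this backupset was already merged).
        """
        if backupset_id in self.merged_backupsets:
            return 0
        for name in object_names:
            self.entries.setdefault(name, []).append(backupset_id)
        self.merged_backupsets.add(backupset_id)
        return len(object_names)

    def query(self, object_name):
        """Interactive step: look up all backupsets holding an object."""
        return list(self.entries.get(object_name, []))


# Assumed scenario: one backupset per day for 100 days.
master = MasterToc()
for day in range(100):
    master.add_backupset(f"bs{day}", ["db.dat", f"log.{day}"])

versions = master.query("db.dat")
repeat = master.add_backupset("bs5", ["db.dat", "log.5"])
```

Here the per-backupset merge happens in the back end, and the interactive query is a single lookup against the already-built master TOC.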
An additional advantage of having a master TOC is that it allows the user to issue repeated operations on the same set of data while paying a one-time price for the initial load. After that, the master TOC will always be available, and the user will perceive no significant delay in accessing the information in the TOC.
With the existing mechanism, if a customer generates a backupset daily, then after 100 days any operation that accesses inactive data will load and insert 100 different TOC's. With the proposed mechanism, by contrast, the master TOC will be updated daily with...