Counter-intuitive RAID Rebuild Algorithm for Virtual Tape Server
Original Publication Date: 2001-Oct-01
Included in the Prior Art Database: 2003-Jun-20
We seek to improve the rebuild of a RAID (Redundant Array of Independent Disks) in a VTS (Virtual Tape Server) environment. Our approach is counter-intuitive: we propose to first rebuild the least recently used (LRU) or least frequently used (LFU) files in cache, so that they can be flushed from the cache. This frees cache space for the more actively used files.
A VTS consists of a hard-disk front end to an automated tape library. If the data in the tape library is organized as a RAID, which may also be called a RAIT (Redundant Array of Independent Tape), the prospect of a RAID/RAIT rebuild needs to be addressed.
We prefer that each file in the VTS be represented by a metadata structure called an inode. Each inode contains the description of the file, including, among other things, file type, access rights, owners, time stamps, size, and pointers to data blocks. Key to our invention is that the addresses of the data blocks allocated to a file are stored in its inode. When a user requests an I/O operation on the file, the kernel code converts the current offset to a block number, uses this number as an index into the block address table, and reads or writes the physical block.
This inode structure would be used to identify the RAID/RAIT stripes of the data held in cache. Because the inodes give the addresses of the damaged portions of the damaged data stripes, the cache can be searched for data that could be used to rebuild those stripes. Rebuilding damaged data stripes from cache avoids time-consuming exclusive-or (XOR) parity calculations. Because a RAID/RAIT rebuild via XOR calculations can take several hours and consume most of the controller's CPU (central processing unit), rebuilding the missing data directly from cache is a major performance advantage.
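The inode lookup and the cache-first rebuild described above can be illustrated with a minimal Python sketch. It is not the disclosed implementation: the `Inode` class, the `BLOCK_SIZE` value, the `cache` dictionary keyed by block address, and the function names are all assumptions made for illustration. The sketch returns a block from cache when it is present and falls back to the conventional XOR of the surviving stripe members otherwise.

```python
from functools import reduce

BLOCK_SIZE = 4096  # assumed block size in bytes (not given in the disclosure)


class Inode:
    """Hypothetical inode holding the addresses of a file's data blocks."""

    def __init__(self, name, block_addresses):
        self.name = name
        self.block_addresses = block_addresses

    def block_for_offset(self, offset):
        # Convert the byte offset to a block number, then use that number
        # as an index into the block address table.
        return self.block_addresses[offset // BLOCK_SIZE]


def xor_blocks(blocks):
    # Byte-wise XOR of equally sized blocks: the conventional parity rebuild.
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))


def rebuild_block(address, cache, surviving_blocks):
    """Rebuild one damaged block and report where the data came from.

    cache: assumed mapping of block address -> bytes held in the disk cache.
    surviving_blocks: the other data and parity blocks of the same stripe.
    """
    if address in cache:
        return cache[address], "cache"  # no parity arithmetic needed
    return xor_blocks(surviving_blocks), "xor"


# Usage: locate the block behind a file offset and rebuild it.
inode = Inode("volume_a", [100, 101, 102])
address = inode.block_for_offset(5000)  # offset 5000 lies in block number 1
data0, data1 = bytes([1, 2, 3]), bytes([4, 5, 6])
parity = xor_blocks([data0, data1])
block, source = rebuild_block(address, {101: data1}, [data0, parity])
```

In this sketch the XOR path runs only for blocks that are absent from cache, which reflects the saving in controller CPU time that the disclosure aims for.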
The parity of the repaired data stripe could still be checked later, when the RAID/RAIT is quiescent, to verify the rebuild from cache. Knowledge of the contents of the cache is taught by U.S. Patent 6,163,773, which teaches a neural network embedded in a cache management engine. The controller keeps a catalog, a data-set access log, a training record, and a score list. The access log, training record, and score list would be used to identify the LRU or LFU files in cache.
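The rebuild ordering proposed in this disclosure (least recently used files first, then flushed from cache) can be sketched as follows. This is a hedged illustration only: the access log is modeled as a plain dictionary of last-access timestamps, and `rebuild_order`, `rebuild_and_flush`, and the `write_to_tape` callback are hypothetical names. The neural-network scoring of U.S. Patent 6,163,773 is not modeled here; a simple LRU sort stands in for it.

```python
def rebuild_order(access_log, damaged_files):
    """Order damaged files so the least recently used come first.

    access_log: assumed mapping of file name -> last access timestamp.
    Files missing from the log are treated as the oldest.
    """
    return sorted(damaged_files, key=lambda name: access_log.get(name, 0))


def rebuild_and_flush(cache, access_log, damaged_files, write_to_tape):
    """Rebuild cached files in LRU order and flush each one afterwards.

    cache: assumed mapping of file name -> bytes held in the disk cache.
    write_to_tape: hypothetical callback that writes the rebuilt data.
    Returns the names of the files whose cache space was freed.
    """
    freed = []
    for name in rebuild_order(access_log, damaged_files):
        if name not in cache:
            continue  # not in cache; would need a conventional XOR rebuild
        write_to_tape(name, cache[name])  # rebuild directly from cache
        del cache[name]                   # flush to free space for active files
        freed.append(name)
    return freed


# Usage: "b" was accessed longest ago, so it is rebuilt and flushed first.
cache = {"a": b"A", "b": b"B", "c": b"C"}
access_log = {"a": 30, "b": 10, "c": 20}
written = []
freed = rebuild_and_flush(cache, access_log, ["a", "b"],
                          lambda name, data: written.append((name, data)))
```

An LFU variant would sort on an access count instead of a timestamp. The deferred parity check mentioned above would run separately, once the array is quiescent.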