Method for an Effectively Infinite Bad Block Table
Original Publication Date: 2009-Oct-22
Included in the Prior Art Database: 2009-Oct-22
Disclosed is a method for an effectively infinite bad block table.
A storage subsystem can encounter many kinds of failures. In many failure scenarios, data can be recovered using techniques such as RAID or mirrored caches. When data cannot be recovered, however, the subsystem is still responsible for maintaining data integrity: although data may be lost, the subsystem must never return faulty or incorrect data. To do this, the subsystem must keep track of Bad Blocks. Any read of a Bad Block results in a failure (e.g., Data Check) being reported, so that the data loss condition is known.
For example, assume a RAID-5 array has a failed drive (A), and a second drive (B) encounters a read error while the failed drive (A) is being rebuilt. Once the array has finished rebuilding, the subsystem must maintain the integrity of the data by remembering that both the block that encountered the read error and the block being rebuilt have been lost. This record must persist indefinitely, across power cycles, and through any future activity and failures. If the drive that had the read error (B) later fails and is rebuilt, the indication must not be lost.
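The rebuild scenario above can be sketched as follows. This is only an illustration under stated assumptions, not the disclosed method: the names rebuild_drive and ReadError, the per-drive read callables, and the set of (drive, stripe) pairs are all hypothetical, and persistence of the record across power cycles is not shown.

```python
from functools import reduce
from operator import xor


class ReadError(Exception):
    """Raised by a (hypothetical) drive read callable for an unreadable block."""


def rebuild_drive(drives, failed, num_stripes, bad_blocks):
    """Rebuild drive `failed` from the surviving drives, one stripe at a time.

    drives     : dict mapping drive name -> callable(stripe) returning bytes
                 or raising ReadError (an assumed interface).
    bad_blocks : set of (drive, stripe) pairs recording lost blocks. A real
                 subsystem would have to keep this in persistent storage.
    Returns a dict of stripe -> rebuilt block for every recoverable stripe.
    """
    survivors = [name for name in drives if name != failed]
    rebuilt = {}
    for stripe in range(num_stripes):
        blocks, unreadable = [], []
        for name in survivors:
            try:
                blocks.append(drives[name](stripe))
            except ReadError:
                unreadable.append(name)
        if unreadable:
            # With one drive already failed, a read error on a survivor means
            # the stripe cannot be reconstructed. Both the unreadable block
            # and the block being rebuilt are lost and must be remembered.
            for name in unreadable:
                bad_blocks.add((name, stripe))
            bad_blocks.add((failed, stripe))
            continue
        # RAID-5 reconstruction: byte-wise XOR of all surviving blocks.
        rebuilt[stripe] = bytes(reduce(xor, column) for column in zip(*blocks))
    return rebuilt
```

In this sketch, a read error on drive B at some stripe while rebuilding drive A leaves both (B, stripe) and (A, stripe) in bad_blocks, so a later rebuild of B can consult the same record instead of silently regenerating incorrect data.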
This problem has been addressed in the past, usually by one of two methods. First, specially formatted drives or special drive commands (e.g., Write Long) can be used. Drives formatted with extra bytes in each block can hold a Bad Block indication in those extra bytes. Similarly, special commands that corrupt or alter the drive's internal block ECC can hold, or at least flag the possible presence of, a Bad Block. Second, Bad Block Tables are used when specially formatted drives are not available, and may be used even when they are. Because every operation must be checked against the Bad Block Table, the check must be as fast as possible.
For performance, the Bad Block Table should be small and kept in memory, where it can be checked easily. However, a table that covers a large number of bad blocks can become very large. Conversely, if the size of the Bad Block Table is limited, only a limited number of bad blocks can be tracked, and once the table is exhausted the entire drive or array has to be considered failed.
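The trade-off of a conventional, size-limited table can be illustrated with a minimal sketch. It shows the limitation described above, not the method disclosed here. The class and exception names, the use of logical block addresses (LBAs) as keys, and the in-memory set are assumptions made for illustration.

```python
class DataCheck(Exception):
    """Reported to the host instead of returning data known to be lost."""


class TableExhausted(Exception):
    """No free entry remains; the caller must fail the drive or array."""


class BoundedBadBlockTable:
    """Fixed-capacity in-memory table of individual bad blocks (hypothetical)."""

    def __init__(self, capacity):
        self.capacity = capacity
        self.entries = set()  # LBAs of blocks whose data has been lost

    def mark_bad(self, lba):
        """Record a lost block, or raise TableExhausted if the table is full."""
        if lba in self.entries:
            return
        if len(self.entries) >= self.capacity:
            raise TableExhausted(f"cannot record LBA {lba}")
        self.entries.add(lba)

    def check_read(self, start_lba, count):
        """Run on every read: raise DataCheck if the range holds a bad block.

        The scan is over the table, not the requested range, so its cost is
        bounded by the table capacity rather than by the size of the read.
        """
        end_lba = start_lba + count
        for lba in self.entries:
            if start_lba <= lba < end_lba:
                raise DataCheck(f"LBA {lba} is marked bad")
```

With a capacity of two, for example, the first two calls to mark_bad succeed and a third call for a new LBA raises TableExhausted, which corresponds to the point at which the drive or array would have to be considered failed.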
This invention provides a method to track an effectively infinite number of bad blocks in a Bad Block Table which uses only a small amount of memory (see Figures 1 and 2).
A key idea of the invention is that the Bad Block Table can track a reasonable number of individual bad blocks just l...