Method and System to Provide Fault Tolerance Against a Single Point of Failure During a Logical Drive Migration

IP.com Disclosure Number: IPCOM000013805D
Original Publication Date: 2001-Aug-28
Included in the Prior Art Database: 2003-Jun-18
Document File: 1 page(s) / 38K

Publishing Venue

IBM

Abstract

One of the major objectives of ServeRAID is to protect the user from any single point of failure. Meeting that objective requires the ability to recover from an adapter failure during a logical drive migration (LDM). During an LDM, data is moved one stripe at a time, and the current stripe being migrated is stored in nonvolatile SRAM on the adapter. The problem is that if the adapter fails or the NVRAM is corrupted, there is no way to determine the LDM status or which data has already been migrated.

This invention solves the problem by maintaining a redundant copy of the LDM status in a removable cache controller. If the adapter fails during an LDM, the user can fully recover the data: remove the cache controller from the faulty adapter, insert it into a new adapter, and import the configuration from the drives, at which point the LDM and SYS drives are operational again. After the adapter is reset, internal firmware logic restarts the LDM, which eventually completes and returns the logical drives to the ONL state. The advantage of this feature is that the user can eventually return to a normal environment, that is, one in which all logical drives are back in the ONL state.
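The mirrored status and the recovery path can be illustrated with a small simulation. This is only a sketch under stated assumptions, not the ServeRAID firmware: the names LdmStatus, Adapter, CacheController, checkpoint, run_migration and recover are hypothetical, the status record is assumed to hold a stripe index, and re-migrating the stripe that was in flight at the time of the failure is assumed to be safe.

```python
from dataclasses import dataclass, replace
from typing import List, Optional, Tuple


@dataclass(frozen=True)
class LdmStatus:
    """Hypothetical status record: the stripe currently being migrated."""
    logical_drive: int
    current_stripe: int
    total_stripes: int


class Adapter:
    """Hypothetical adapter; its NVRAM holds one copy of the LDM status."""
    def __init__(self) -> None:
        self.nvram: Optional[LdmStatus] = None


class CacheController:
    """Hypothetical removable cache controller holding the redundant copy."""
    def __init__(self) -> None:
        self.ldm_status: Optional[LdmStatus] = None


def checkpoint(adapter: Adapter, cache: CacheController,
               status: Optional[LdmStatus]) -> None:
    """Write the same status to the adapter NVRAM and the cache controller."""
    adapter.nvram = status
    cache.ldm_status = status


def run_migration(adapter: Adapter, cache: CacheController,
                  status: LdmStatus,
                  fail_at: Optional[int] = None) -> List[int]:
    """Migrate stripe by stripe and return the stripes moved in this run.

    fail_at simulates the adapter failing just before that stripe is moved.
    """
    moved: List[int] = []
    for stripe in range(status.current_stripe, status.total_stripes):
        # Record the stripe in both places before touching its data.
        checkpoint(adapter, cache, replace(status, current_stripe=stripe))
        if stripe == fail_at:
            adapter.nvram = None  # the adapter's copy is lost with the adapter
            return moved
        moved.append(stripe)  # stand-in for actually moving the stripe's data
    checkpoint(adapter, cache, None)  # migration finished, nothing pending
    return moved


def recover(cache: CacheController) -> Tuple[Adapter, List[int]]:
    """Move the cache controller to a new adapter and resume the migration."""
    new_adapter = Adapter()
    status = cache.ldm_status  # redundant copy that survived the failure
    if status is None:
        return new_adapter, []  # no migration was in progress
    return new_adapter, run_migration(new_adapter, cache, status)


if __name__ == "__main__":
    adapter, cache = Adapter(), CacheController()
    start = LdmStatus(logical_drive=0, current_stripe=0, total_stripes=8)
    first = run_migration(adapter, cache, start, fail_at=5)
    new_adapter, rest = recover(cache)
    print(first, rest)  # [0, 1, 2, 3, 4] [5, 6, 7]
```

In this sketch the status is written to both locations before each stripe is moved, so the copy that survives an adapter failure always points at the first stripe that is not known to be complete. The migration then resumes from that stripe on the replacement adapter.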
