A method to balance the life cycles of SSD members within RAID arrays

IP.com Disclosure Number: IPCOM000212659D
Publication Date: 2011-Nov-22
Document File: 3 page(s) / 47K

Publishing Venue

The IP.com Prior Art Database

Abstract

Systems and methods for balancing the life cycles of members within SSD RAID arrays. The S.M.A.R.T. attributes of each SSD member drive are collected periodically. Based on the S.M.A.R.T. attributes, the remaining life of each SSD member is calculated. An algorithm then detects life cycle imbalance among the members of an SSD RAID array and finds an optimized swap plan to rebalance them. With life cycles balanced among the members, the overall life cycle of the SSD array is extended.

This text was extracted from a PDF file.
This is the abbreviated version, containing approximately 51% of the total text.

The flash SSD (Solid State Disk) used by most enterprise disk storage subsystems has a limited number of write (erase) cycles for each block. This means that the life of an SSD is shortened each time a write (erase) is issued against it. Within an SSD RAID array, one SSD drive often receives many more writes (erases) than the other drives because write (erase) operations are not evenly distributed. That drive then reaches the end of its life much sooner than the other drives.

The basic idea is to balance the life span by swapping the data among different members in the SSD array.

The solution has at least the following benefits:


a) Improve the safety of data on the SSD RAID arrays.

When an SSD drive reaches the end of its life, the RAID controller normally rejects that member, takes a spare drive, and rebuilds the data.

During the rebuild, the RAID array is in a risky state (especially for RAID5).

With this solution the life spans are balanced, which delays the point at which any member drive is rejected for exceeding its write (erase) cycle limit.


b) Reduce the service cost.

If a member drive fails (or is rejected) during the warranty period, the drive must be replaced for the customer free of charge.

With this solution, there is a longer time window during which no SSD drive fails from exceeding its write (erase) cycle limit.


c) Protect and enhance the reputation of storage products.

Whenever hardware or software fails, the customer is left with a bad impression, especially if the failure occurs early.

How this solution solves the problem:

Each SSD drive needs to be able to provide performance data such as:

- The (average) maximum write (erase) cycles;

- The current write (erase) cycles.

Currently, most flash SSDs are able to provide such performance data as S.M.A.R.T. attributes.
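
As a rough illustration, the two values listed above are enough to derive the remaining write (erase) budget of a drive. The sketch below assumes a hypothetical `SsdWearData` record; the field names are illustrative only and do not correspond to specific S.M.A.R.T. attribute IDs, which vary by vendor.

```python
from dataclasses import dataclass


@dataclass
class SsdWearData:
    """Hypothetical wear record for one SSD member (field names are assumptions)."""
    drive_id: str
    max_cycles: int      # the (average) maximum write (erase) cycles
    current_cycles: int  # the write (erase) cycles consumed so far


def remaining_cycles(d: SsdWearData) -> int:
    # Clamp at zero: a drive past its rated limit has no budget left.
    return max(d.max_cycles - d.current_cycles, 0)


def remaining_fraction(d: SsdWearData) -> float:
    # Fraction of the rated write (erase) budget that is still available.
    return remaining_cycles(d) / d.max_cycles if d.max_cycles > 0 else 0.0
```

For example, a drive rated for 10000 cycles that has consumed 2500 has 7500 cycles, or 75% of its budget, remaining.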

Details of the solution:


1. First, the following metrics are defined by the user or the system:

a) The time window that determines how often the system is triggered to detect and recover from the imbalance:

A global time window defines how often the system checks for available-life imbalance within an array.

The time window can be exposed for the user to define or set automatically by the system.

The time window needs to be long enough that the write (erase) speed in the previous time window can be used to estimate the speed in the next time window.

The time window should NOT be so long that an SSD drive could exceed its write (erase) limit within a single window.

b) The threshold used to determine whether it is necessary to recover from the imbalance:

The threshold can be defined, for example, as how much longer (20% or more) the shortest life span becomes after the recovery.
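
The threshold in item b) can be read as a relative gain in the shortest life span. The following is a minimal sketch under that reading; the function name `swap_is_worthwhile` and its parameters are hypothetical, and the 20% default is taken from the example figure above.

```python
def swap_is_worthwhile(shortest_life_before: float,
                       shortest_life_after: float,
                       threshold: float = 0.20) -> bool:
    """Return True if the shortest life span grows by at least `threshold`.

    Both life spans must be in the same unit (for example, days).
    """
    if shortest_life_before <= 0:
        # Assumption: a member already at end of life cannot be helped by a swap.
        return False
    gain = (shortest_life_after - shortest_life_before) / shortest_life_before
    return gain >= threshold
```

With this check, a swap plan that extends the shortest life span from 100 days to 125 days would be carried out, while one that extends it only to 110 days would be skipped.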

2. At the end of each time window:

Step 1: Calculate the write (erase) speed of each SSD drive in the previous time window (this speed will be used to estimate the speed for the next time window).

Calculate the remai...
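
The speed calculation in Step 1 can be sketched as follows. The write (erase) speed is taken as the cycles consumed during the window divided by the window length. The remaining-life estimate (remaining cycles divided by that speed) is an assumption, since the abbreviated text is cut off at this point; all function and parameter names are hypothetical.

```python
import math


def write_speed(cycles_at_start: int, cycles_at_end: int, window_days: float) -> float:
    """Write (erase) cycles consumed per day over the previous time window."""
    if window_days <= 0:
        raise ValueError("window_days must be positive")
    return (cycles_at_end - cycles_at_start) / window_days


def estimated_remaining_days(max_cycles: int, current_cycles: int, speed: float) -> float:
    """Assumed estimate: remaining cycles divided by the observed speed."""
    remaining = max(max_cycles - current_cycles, 0)
    if speed <= 0:
        # An idle drive does not wear out under this simple model.
        return math.inf
    return remaining / speed
```

For example, a drive that went from 1000 to 1300 cycles over a 30-day window has a speed of 10 cycles per day; with a limit of 10000 cycles, this model estimates 870 days of remaining life.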