Efficiently Distributing Knowledge of Device State in a Clustered System

IP.com Disclosure Number: IPCOM000031562D
Original Publication Date: 2004-Sep-29
Included in the Prior Art Database: 2004-Sep-29
Document File: 3 page(s) / 41K

Publishing Venue

IBM

Abstract

Disclosed is an Enterprise Storage System (ESS) organized as follows: it has two or more clusters, each cluster [1] is composed of one or more ranks, and each rank is composed of one or more logical volumes. Each cluster element needs to determine the state of volumes on both clusters. When a cluster queries the state of a remote volume, i.e., a volume owned by the other cluster, a remote call is executed. On a typical ESS such queries are frequent. We address the problem of minimizing remote queries without impacting normal system operations.

This text was extracted from a PDF file.
This is the abbreviated version, containing approximately 50% of the total text.

An Enterprise Storage System (ESS) is organized as follows: it has two or more clusters, each cluster is composed of one or more ranks, and each rank is composed of one or more logical volumes. Each cluster element needs to determine the state of volumes on both clusters. When a cluster queries the state of a remote volume, i.e., a volume owned by the other cluster, a remote call is executed. On a typical ESS such queries are frequent. We address the problem of minimizing remote queries without impacting normal system operations. There is no known solution to this problem. The solution rests on three core observations:

* When a physical rank changes its state in an ESS, the states of multiple logical volumes change simultaneously.
* The state of each ESS volume can be kept on each cluster (there is no space constraint).
* There is an order on volume states, and a query can be answered with an incorrect volume state as long as that state is less than the current state in the given order (this is acceptable only for a short window of time).
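
The third observation can be illustrated with a minimal Python sketch. The state names and their numeric order below are hypothetical (the disclosure does not list the actual ESS volume states), and treating an exactly correct answer as acceptable is an assumption added for completeness.

```python
from enum import IntEnum


class VolumeState(IntEnum):
    """Hypothetical volume states; a larger value is later in the assumed order."""
    OFFLINE = 0
    REBUILDING = 1
    ONLINE = 2


def is_acceptable_answer(reported: VolumeState, current: VolumeState) -> bool:
    """Return True if a query may be answered with `reported`.

    Per the third observation, a stale answer is tolerable when it is
    less than the current state in the order. An exact answer is
    accepted as well (an assumption of this sketch).
    """
    return reported <= current


# A stale, lower-ordered answer is tolerated; a higher-ordered one is not.
stale_ok = is_acceptable_answer(VolumeState.OFFLINE, VolumeState.ONLINE)
too_high = is_acceptable_answer(VolumeState.ONLINE, VolumeState.REBUILDING)
```

Using an integer-valued enumeration keeps the order explicit, so the acceptability check reduces to a single comparison.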

The idea is to keep the state of each ESS volume on each cluster in a Volume State Table (VST), which can be viewed as a type of cache [2]. If the VST is up to date when a cluster queries the state of a remote volume, the remote call is avoided. The solution minimizes the number of remote updates of the VST.

When a physical rank changes state, the states of its associated volumes change simultaneously, so a rank change produces a burst of volume state changes. For each rank we estimate the length of this burst. When we observe a burst of volume state changes that belongs to a given rank, we accumulate these changes for the estimated length of the burst and then make a single remote call that updates the VST on the other side. Thus, on average, a rank change requires only one remote update call. We can accumulate volume state changes for the estimated duration of the burst because, for that short duration, we return volume states that are ordered higher than the current volume state. (In the redundant case where there is no order on volume states, one embodiment of the invention is for the VST to return an unknown state for the short duration of the burst.)
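
The accumulate-then-send idea can be sketched in Python as follows. This is an illustration under stated assumptions, not the disclosed implementation: the class names, the `send_remote` callback, the integer state encoding, and the use of explicit timestamps (instead of a real clock or daemon) are all hypothetical.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, Optional


class VolumeStateTable:
    """Per-cluster cache of volume states (the VST)."""

    def __init__(self) -> None:
        self._states: Dict[str, int] = {}

    def apply(self, changes: Dict[str, int]) -> None:
        self._states.update(changes)

    def query(self, volume: str) -> Optional[int]:
        # None stands for an unknown state.
        return self._states.get(volume)


@dataclass
class BurstAccumulator:
    """Batches the volume state changes of a rank into one remote call."""
    send_remote: Callable[[Dict[str, int]], None]  # hypothetical remote-update hook
    burst_length: Dict[str, float]  # estimated burst length per rank, in seconds
    pending: Dict[str, Dict[str, int]] = field(default_factory=dict)
    deadline: Dict[str, float] = field(default_factory=dict)

    def record(self, rank: str, volume: str, state: int, now: float) -> None:
        # The first change of a burst opens the accumulation window.
        if rank not in self.pending:
            self.pending[rank] = {}
            self.deadline[rank] = now + self.burst_length[rank]
        self.pending[rank][volume] = state

    def flush_due(self, now: float) -> int:
        """Send one update per rank whose window has elapsed; return the call count."""
        due = [rank for rank, d in self.deadline.items() if d <= now]
        for rank in due:
            self.send_remote(self.pending.pop(rank))
            del self.deadline[rank]
        return len(due)


# Usage: four volumes of one rank change state within the estimated burst.
remote_vst = VolumeStateTable()
calls = []


def send(changes: Dict[str, int]) -> None:
    calls.append(dict(changes))  # stands in for the remote call
    remote_vst.apply(changes)


acc = BurstAccumulator(send_remote=send, burst_length={"rank0": 0.5})
for i in range(4):
    acc.record("rank0", f"vol{i}", 1, now=0.1 * i)

early = acc.flush_due(now=0.3)  # window still open, nothing is sent
late = acc.flush_due(now=0.6)   # window elapsed, one call carries all four changes
```

Until the window elapses, the remote VST keeps answering with whatever it already holds (or an unknown state), which is the short-lived inaccuracy the text describes.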

Note:
* We describe the solution in the context of two clusters. The solution is easily generalized to multiple clusters using the remote broadcast service.
* The solution applies not only to ranks but to any group of volumes that change state simultaneously.

We keep a map that associates each rank with its estimated burst length. In addition, a dual buffer scheme is used to accumulate volume changes on the VST when transferring volume states to the opposite cluster. A transfer daemon is responsible for transmitting accumulated volume changes to the other cluster. As a result of the dual buffer mechanism we have two VST for each clu...