Striped Mirroring of Data for Redundancy in a Shared Nothing Cluster Architecture

IP.com Disclosure Number: IPCOM000237832D
Publication Date: 2014-Jul-16
Document File: 3 page(s) / 45K

Publishing Venue

The IP.com Prior Art Database

Abstract

This article describes a method that extends the error correction techniques of a Redundant Array of Inexpensive Disks (RAID) on a single controller or storage system to a Shared Nothing (SN) architecture, in which data is stored on the local disks of multiple server nodes within a cluster. The data is additionally striped, i.e. split into smaller chunks and distributed across multiple nodes, to balance the data stored across the cluster. Such an architecture provides better scalability than a centralized storage system, which is especially important for Big Data applications deployed on cluster architectures.

This text was extracted from a PDF file.
This is the abbreviated version, containing approximately 51% of the total text.

Redundant Arrays of Inexpensive Disks (RAID) [1] are built from disks attached to a single controller, or to controllers contained in a single server system or storage subsystem. In a server cluster where data is stored locally on the server nodes with such RAID protection, the system is protected against disk failures, but not against server node failures. Such clusters therefore usually rely on central storage, with RAID implemented locally on a storage server. Shared Nothing (SN) architectures [2], by contrast, use only local storage, which must be protected against both disk and server failures. Disclosed here is a method to store data dispersed across server nodes using the functionality of a cluster filesystem such as IBM's General Parallel File System (GPFS) [3].

GPFS supports SN architectures through the File Placement Optimizer (FPO) [4] feature, which combines the local storage subsystems under one cluster filesystem. FPO provides data locality, i.e. data accessed by a node is stored on that node's local disks, and replication, i.e. data within the filesystem can be replicated. With such replication, all replicated copies can be guaranteed to reside on distinct cluster nodes, so that after a node failure a copy of the data is still found on the remaining nodes. Furthermore, striping of the data is supported to balance the I/O load across the nodes and to better exploit the distributed I/O bandwidth of the cluster.
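The placement guarantee described above can be illustrated with a minimal sketch. This is not GPFS internals; the function name and the round-robin scheme are hypothetical, chosen only to show that a replica is always placed on a node distinct from the primary, so a single node failure never destroys both copies of a chunk.

```python
def place_replicas(chunk_id: int, writer: int, nodes: list[int]) -> tuple[int, int]:
    """Return (primary_node, replica_node) for one data chunk.

    Hypothetical placement rule: the primary copy stays on the writing
    node (data locality); the replica is spread round-robin over the
    remaining nodes, so it always lands on a distinct node.
    """
    primary = writer
    others = [n for n in nodes if n != primary]          # candidate replica nodes
    replica = others[chunk_id % len(others)]             # round-robin distribution
    return primary, replica

# Six chunks written by node 0 in a four-node cluster:
placements = [place_replicas(c, writer=0, nodes=[0, 1, 2, 3]) for c in range(6)]
# No replica ever shares a node with its primary copy.
assert all(p != r for p, r in placements)
```

The invariant checked by the final assertion is exactly what makes the scheme survive a node failure: whichever node drops out, each chunk still has one surviving copy elsewhere.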

Figure: Cluster implementation using a Shared Nothing storage architecture with striped replication

The Figure shows how storage is distributed across several nodes: data is stored locally as the primary replica, and a secondary replica is provided in a striped manner to protect the overall cluster system against data loss. The secondary replica is stored entirely on distinct nodes. Thus, if a server node drops out of the cluster, all data is preserved, i.e. a full mirror of the data is maintained across the cluster. To balance the I/O workload, this replication is distributed through striping. As one can imagine, storing the data of one node on a single distinct node would result in a high I/O load on both nodes. That would be a...
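The load-balancing benefit of striping the secondary replica, as opposed to mirroring one node's data wholly onto a single partner node, can be sketched as follows. This is an illustrative model, not the disclosed implementation; the function name and the round-robin striping rule are assumptions.

```python
from collections import Counter

def striped_mirror_load(num_chunks: int, primary: int, nodes: list[int]) -> Counter:
    """Count how many replica chunks of `primary`'s data each other node
    holds when the secondary replica is striped round-robin across all
    remaining nodes (hypothetical scheme)."""
    others = [n for n in nodes if n != primary]
    return Counter(others[c % len(others)] for c in range(num_chunks))

# Node 0 holds 9 primary chunks in a four-node cluster.
load = striped_mirror_load(num_chunks=9, primary=0, nodes=[0, 1, 2, 3])
# Striping spreads the replica load evenly: 3 chunks per remaining node,
# instead of all 9 concentrating on a single mirror partner.
assert load == Counter({1: 3, 2: 3, 3: 3})
```

With a plain one-to-one mirror, the partner node would absorb the full replication write load and the full read load during a rebuild; striping divides both across every remaining node in the cluster.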