Method for a Resilient Data Storage

IP.com Disclosure Number: IPCOM000033684D
Original Publication Date: 2004-Dec-23
Included in the Prior Art Database: 2004-Dec-23
Document File: 4 page(s) / 103K

Publishing Venue

IBM

Abstract

This invention addresses the issue of data storage security beyond the traditional protections based on preventing intruders/hackers from accessing/damaging data using technologies such as firewalls, encryption, and authentication.

This text was extracted from a PDF file.
This is the abbreviated version, containing approximately 27% of the total text.

This invention addresses data storage security beyond the traditional protections that rely on preventing intruders or hackers from accessing or damaging data, such as firewalls, encryption, and authentication. The invention proposes a novel system in which the data is constantly fragmented, and the fragments are constantly duplicated and moved across a worldwide distributed network. In particular, if a disaster occurs, the fragments of data move away from its location (ground zero). This allows the data to be recovered even if a major disaster strikes part of the system. The invention also introduces a novel way to make the system resilient and self-generating by automatically replacing damaged parts (data fragments) with undamaged, "healthy" duplicates.

The invented system relies on the continuous fragmentation and migration of the stored data items among the nodes of the distributed storage network. As data fragments migrate from one node to another, fragment replicas are also created probabilistically to achieve greater resilience through redundancy. Migration patterns of data fragments are randomized in a way that minimizes the chance of losing all the replicas of a fragment to a single node failure. These migration patterns are non-deterministic, so a particular data item cannot be traced and selectively destroyed. When a node is infected or suffers a failure, it is automatically isolated from the rest of the storage system. Fragments residing on neighboring nodes are temporarily migrated away to protect them in case those neighbors are also threatened by the same source of failure, infection, or attack. Below we detail the data lifecycle from submission for backup until recovery.
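The migration, probabilistic replication, and isolation behavior described above can be illustrated with a minimal Python sketch. It is not the disclosure's implementation: the `Node` class, the `build_ring`, `migrate_step`, and `isolate` helpers, the `replicate_prob` parameter, and the rule for choosing "safe" destinations are all assumptions made for illustration.

```python
import random
from dataclasses import dataclass, field


@dataclass
class Node:
    name: str
    neighbors: set = field(default_factory=set)   # names of adjacent nodes
    fragments: set = field(default_factory=set)   # ids of fragments held here
    isolated: bool = False


def build_ring(names):
    """Build a hypothetical ring overlay; any topology would do."""
    nodes = {n: Node(n) for n in names}
    for i, n in enumerate(names):
        nxt = names[(i + 1) % len(names)]
        nodes[n].neighbors.add(nxt)
        nodes[nxt].neighbors.add(n)
    return nodes


def migrate_step(nodes, source, rng, replicate_prob=0.3):
    """Send each fragment on `source` to a randomly chosen healthy neighbor.

    With probability `replicate_prob` (an assumed parameter) the fragment is
    copied rather than moved, so a replica stays behind on `source`.
    """
    src = nodes[source]
    targets = sorted(n for n in src.neighbors if not nodes[n].isolated)
    if not targets:
        return
    for frag in sorted(src.fragments):
        nodes[rng.choice(targets)].fragments.add(frag)
        if rng.random() >= replicate_prob:
            src.fragments.discard(frag)   # plain move, no replica left behind


def isolate(nodes, failed, rng):
    """Isolate `failed` and move its neighbors' fragments further away."""
    bad = nodes[failed]
    bad.isolated = True
    for nb_name in sorted(bad.neighbors):
        nb = nodes[nb_name]
        # Assumed rule: a destination is safe if it is healthy and is not
        # itself adjacent to the failed node.
        safe = sorted(n for n in nb.neighbors
                      if n != failed
                      and not nodes[n].isolated
                      and n not in bad.neighbors)
        if not safe:
            continue   # nowhere safer to go; leave the fragments in place
        for frag in sorted(nb.fragments):
            nodes[rng.choice(safe)].fragments.add(frag)
        nb.fragments.clear()
```

For example, on a six-node ring built with `build_ring(["a", "b", "c", "d", "e", "f"])`, calling `isolate(nodes, "a", random.Random())` marks node `a` as isolated and pushes the fragments held by `b` and `f` one hop further from `a`. The sketch does not model the later return of temporarily migrated fragments, which the text mentions but does not specify.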

The system assumes that a peer-to-peer overlay network is in place and that the resilient data storage system runs on top of it. The overlay network topology may take any form (e.g., mesh, ring, etc.). The following is assumed about the overlay network:
(1) each node is aware of its immediate neighboring nodes, and (2) each node is connected to the rest of the network by at least two logical links.
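As a rough illustration of assumption (2), the following Python sketch checks an overlay described as an adjacency map and reports the nodes with fewer than two logical links. The function name `overlay_violations` and the adjacency-map representation are hypothetical, and the sketch reads "at least two logical links" as a per-node link count, which is only one possible interpretation of the text.

```python
def overlay_violations(adjacency, min_links=2):
    """Return the nodes that have fewer than `min_links` distinct neighbors.

    `adjacency` maps each node name to the set of its immediate neighbors,
    mirroring assumption (1) that every node knows its neighbors.
    """
    violations = []
    for node, neighbors in adjacency.items():
        links = set(neighbors) - {node}   # a self-loop is not a real link
        if len(links) < min_links:
            violations.append(node)
    return sorted(violations)


# A ring satisfies the assumption; the two ends of a chain do not.
ring = {"a": {"b", "d"}, "b": {"a", "c"}, "c": {"b", "d"}, "d": {"c", "a"}}
chain = {"a": {"b"}, "b": {"a", "c"}, "c": {"b"}}
```

Here `overlay_violations(ring)` returns an empty list, while `overlay_violations(chain)` flags the end nodes `a` and `c`, each of which would be cut off by a single link failure.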

Initial data fragmentation and placement

In this phase, a new data item is presented to the system to be stored. The name of the data item is mapped into a unique system-wide identifier. Uniqueness may be ensured in several ways. One is to generate an identifier that simply combines the name or network address of the host at which the data item is introduced, the identity of the user introducing it, the time at which it is introduced (which may be expressed in seconds since a certain fixed date), and the short given name of the data item. Alternatively, this combination may be hashed, with the hash code serving as the data id. Yet another alternative approac...
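The first two identifier schemes described above (the plain combination and its hashed form) might be sketched in Python as follows. The function names, the `|` separator, and the choice of SHA-256 are assumptions; the text does not name a separator or a hash function.

```python
import hashlib


def combined_id(host, user, epoch_seconds, short_name):
    """Identifier formed by simply combining the four components.

    The '|' separator is an assumption; any unambiguous encoding would do.
    """
    return f"{host}|{user}|{epoch_seconds}|{short_name}"


def hashed_id(host, user, epoch_seconds, short_name):
    """Identifier formed by hashing the combination (SHA-256 assumed here)."""
    raw = combined_id(host, user, epoch_seconds, short_name).encode("utf-8")
    return hashlib.sha256(raw).hexdigest()
```

The hashed form gives a fixed-length identifier that does not expose the host or user name directly, whereas the plain combination keeps those components readable.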