Browse Prior Art Database

Method to Recover Dedupe-Data by Writing Metadata behind the Data

IP.com Disclosure Number: IPCOM000246296D
Publication Date: 2016-May-25
Document File: 4 page(s) / 103K

Publishing Venue

The IP.com Prior Art Database

Abstract

Disclosed is a method to recover damages within deduplicated storage environments. The core idea is to add an identifier for the starting metadata information at the end of each userdata record; therefore, in the event of metadata corruption, the system can use the identifiers to help restore the repository and save data.

This text was extracted from a PDF file.
This is the abbreviated version, containing approximately 51% of the total text.

Page 01 of 4

Method to Recover Dedupe-Data by Writing Metadata behind the Data

This article addresses securing data integrity in deduplicated block storage environments, especially adding the possibility to recover data.

The amount of digitally stored information is dramatically increasing. In order to mitigate the effects of this dramatic growth, organizations utilize data compression and data deduplication methods to eliminate duplicated data.

The present disclosure introduces a mechanism to recover damages within deduplicated storage environments.

Figure 1 outlines the principles of data deduplication. An algorithm identifies

identical blocks. If the algorithm identifies a new block as a duplicate of a previously stored block, then a system only stores a pointer to the physically stored block in the metadata.

Figure 1: Prior art principle of deduplication. The system identifies duplicated chunks; each chunk is stored one time.

If damage occurs on data related to the metadata, the impact is higher, resulting in the total loss of a complete repository.

Figure 2: Current deduplication data structure

• Metadata are separate from the userdata
• Metadata get used as pointers to userdata (chunks). Metadata corruption causes loss of the data repository.

1


Page 02 of 4

The novel idea is to add an identifier for the starting metadata information at the end of each userdata record ( writing metadata identifier behind the data) . Thus, in the event of metadata corruption, the system can restore the repository and not completely lose the data. If reconstruction becomes necessary, then the system does not need to perform a complete scan of the userdata structure in order to find any missing metadata. A restore of the metadata recovery values can repair the

damages or reconstru...