Greener Method for Data Deduplication with Secure Data Remanence

IP.com Disclosure Number: IPCOM000198559D
Publication Date: 2010-Aug-09
Document File: 3 page(s) / 26K

Publishing Venue

The IP.com Prior Art Database

Abstract

Described is a method for increasing the energy efficiency of the data deduplication (dedup) process by coupling it with data remanence, ensuring secure deletion of classified data during deduplication while decreasing heat dissipation, using less cooling power, and consuming fewer carbon credits.

This text was extracted from a PDF file.
This is the abbreviated version, containing approximately 46% of the total text.

Data remanence is the residual representation of data that remains after the data has been nominally erased or removed. Countermeasures are often classified, depending on their effectiveness and intent, as either clearing or purging/sanitizing. Specific methods include overwriting, degaussing, encryption, and physical destruction. The most popular is overwriting with a given number of passes, as specified and mandated by Department of Defense (DOD) specification 5220.22-M [1].

Deduplication is a process in which duplicate data is deleted, leaving only one copy of the data to be stored. However, an index of all data is retained should that data ever be required [2].
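As an illustration of the definition above, the following is a minimal sketch of content-based deduplication. It assumes duplicates are detected by comparing SHA-256 digests of whole items; the function name `deduplicate` and the sample file names are hypothetical and do not come from the disclosure.

```python
import hashlib


def deduplicate(blocks):
    """Keep one stored copy per unique content, plus an index entry for every name.

    `blocks` maps a logical name to its bytes (an assumed, simplified data model).
    """
    store = {}  # digest -> the single retained copy of the data
    index = {}  # logical name -> digest, so every name can still be resolved
    for name, data in blocks.items():
        digest = hashlib.sha256(data).hexdigest()
        store.setdefault(digest, data)  # only the first copy is stored
        index[name] = digest
    return store, index


blocks = {"a.txt": b"hello", "b.txt": b"hello", "c.txt": b"world"}
store, index = deduplicate(blocks)
```

Here `a.txt` and `b.txt` resolve to the same stored copy through the index, which is the "indexing of all data" the text refers to.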

Modern storage appliances and solutions are equipped with both of the above technologies, and some storage offerings in the industry already bundle them.

Hybrid storage is increasingly common: a storage system may mix SSDs, HDDs, phase-change disks, racetrack disks, or even small-form-factor and regular-size disks, each with its own characteristics (pros and cons). Layering data remanence and data dedup over hybrid storage creates a potential problem that none of the existing offerings address and that directly influences greener storage technology.

Problem with Existing Technology:

The data dedup process eliminates duplicate data across storage farms and maintains a single copy, which is then referenced wherever the data is needed. The most popular and widely deployed form of data dedup (namely, post-process deduplication) involves deleting duplicate data. When the data being deleted is categorized as sensitive, systems activate data remanence. Per the Department of Defense (DOD) specification, data must be securely purged with a given number of overwrites depending on its classification. As a result, data remanence needs to be inherently associated with the data dedup process in storage systems.
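The classification-dependent purge described above can be sketched as follows. This is an illustration only: the pass counts in `OVERWRITE_PASSES` are hypothetical placeholders and are not taken from the DOD specification, and a `bytearray` stands in for the region of disk holding a duplicate copy.

```python
# Hypothetical pass counts per classification; NOT values from any specification.
OVERWRITE_PASSES = {"public": 0, "sensitive": 3, "classified": 7}

# Assumed alternating fill patterns for successive passes.
PATTERNS = (0x00, 0xFF)


def secure_delete(region: bytearray, classification: str) -> int:
    """Overwrite `region` in place and return the number of bytes written."""
    passes = OVERWRITE_PASSES[classification]
    written = 0
    for i in range(passes):
        pattern = PATTERNS[i % len(PATTERNS)]
        region[:] = bytes([pattern]) * len(region)  # one full overwrite pass
        written += len(region)
    return written


duplicate = bytearray(b"secret")
bytes_written = secure_delete(duplicate, "sensitive")
```

The returned byte count makes the cost visible: deleting one sensitive duplicate writes several times its own size, which is why the choice of which copies to delete matters.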

Currently, none of the documented art on dedup considers this aspect; hence, it can be a market differentiator for regulatory-governed industries. Secure delete (data remanence) is a write-intensive process: depending on the level of secure delete required, a number of overwrites are executed on the disks to ensure secure scrubbing of the deleted data. Write-intensive operations dissipate a lot of heat, which in turn costs the business in carbon credits and cooling. In a hybrid storage system where data dedup and data remanence are coupled, a strong data dedup scheme that intelligently selects the unique copy to retain and the duplicate copies to delete in a storage farm/cloud reduces heat dissipation during the process to minimum levels. The lower heat dissipation saves carbon credits and cooling, making the system greener.
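One possible selection rule is sketched below; the disclosure's actual selection criteria are not visible in this excerpt. The sketch assumes that each media type has a relative heat cost per byte overwritten (`HEAT_PER_BYTE`, with made-up values) and that the scheme retains the copy whose secure deletion would be most expensive, so that the overwritten duplicates generate the least total heat. The names `Copy`, `overwrite_heat`, and `choose_copy_to_retain` are hypothetical.

```python
from dataclasses import dataclass

# Hypothetical relative heat cost per byte overwritten, by media type.
HEAT_PER_BYTE = {"ssd": 1.0, "hdd": 3.0}


@dataclass(frozen=True)
class Copy:
    location: str  # where this duplicate lives (illustrative label)
    media: str     # key into HEAT_PER_BYTE
    size: int      # bytes occupied by the copy


def overwrite_heat(copy: Copy, passes: int) -> float:
    """Estimated heat cost of securely deleting one copy."""
    return passes * copy.size * HEAT_PER_BYTE[copy.media]


def choose_copy_to_retain(copies, passes):
    """Retain the costliest-to-scrub copy; return it and the heat spent on the rest."""
    retained = max(copies, key=lambda c: overwrite_heat(c, passes))
    total = sum(overwrite_heat(c, passes) for c in copies if c is not retained)
    return retained, total


copies = [Copy("ssd0", "ssd", 100), Copy("hdd0", "hdd", 100), Copy("ssd1", "ssd", 100)]
retained, heat = choose_copy_to_retain(copies, passes=3)
```

Under these assumed factors the HDD copy is kept and only the two SSD copies are overwritten. A real scheme would likely weigh further factors, such as access latency or media wear, which this sketch ignores.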

The core idea:

1. Incorporates a data dedup process coupled with data ramane...