A method to accelerate calculate/compare Hash value in NTFS De-dup enabled disk

IP.com Disclosure Number: IPCOM000232518D
Publication Date: 2013-Nov-15
Document File: 6 page(s) / 60K

Publishing Venue

The IP.com Prior Art Database

Abstract

This disclosure aims to speed up hash calculation in the copy scenario. Generally, the common method for dealing with duplicate files in an offline de-dup area is to divide each file into small chunks, calculate the hash key for each chunk, compare the hash keys, and delete the duplicates, which necessarily have the same hash value.

This text was extracted from a PDF file.
This is the abbreviated version, containing approximately 52% of the total text.

Currently, the host-side de-duplication feature significantly improves the efficiency of storage capacity usage. It saves storage capacity by deleting the duplicate contents on de-dup enabled disks. The files are no longer stored as independent streams of data, but are replaced with pointers to data stored within a common chunk store, as shown in Chart 1-1. Actions (1)-(6) are all performed at an appointed time.


(1) Divide file A into data chunks of variable chunk size;

(2) Calculate the hash value for each data chunk of file A;

(3) Perform the same actions for file B;

(4) Compare the hash values of the data chunks of files A and B;

(5) Save the metadata and map each file to the right pointers;

(6) Delete the duplicate data chunks, i.e. those with identical hash values.
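Steps (1)-(6) can be illustrated with a minimal Python sketch. It is not the NTFS implementation. It assumes fixed-size chunks (the text describes variable-size chunks) and SHA-256 as the hash function (the disclosure does not name one). The names split_chunks, dedup and read_file are hypothetical.

```python
import hashlib


def split_chunks(data: bytes, chunk_size: int = 4) -> list:
    """Step (1), simplified: fixed-size chunks instead of variable-size ones."""
    return [data[i:i + chunk_size] for i in range(0, len(data), chunk_size)]


def dedup(files: dict, chunk_size: int = 4):
    """Build a common chunk store plus per-file metadata for the given files."""
    chunk_store = {}  # hash -> the single stored copy of that chunk
    file_maps = {}    # file name -> ordered list of chunk hashes (the "pointers")
    for name, data in files.items():
        hashes = []
        for chunk in split_chunks(data, chunk_size):
            # Steps (2)-(3): hash every chunk of every file (SHA-256 is an assumption).
            h = hashlib.sha256(chunk).hexdigest()
            # Steps (4) and (6): a chunk whose hash is already known is not stored again.
            chunk_store.setdefault(h, chunk)
            hashes.append(h)
        # Step (5): save the metadata that maps the file to its chunks.
        file_maps[name] = hashes
    return chunk_store, file_maps


def read_file(name: str, chunk_store: dict, file_maps: dict) -> bytes:
    """Rebuild a file from the chunk store by following its pointers."""
    return b"".join(chunk_store[h] for h in file_maps[name])


files = {"A": b"aaaabbbbcccc", "B": b"aaaaXXXXcccc"}
store, maps = dedup(files)
```

Here files A and B share two of their three chunks, so the store keeps four chunks rather than six, and both files can still be rebuilt from their pointer lists.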

This disclosure works as an enhanced algorithm that improves on (1)-(4) in special scenarios, such as when a Copy command is issued, as shown in Chart 1-2. It speeds up hash value calculation and comparison by moving (1)-(4) into general operating time rather than a dedicated, busy computing time, giving a more efficient de-dup method when a Copy happens.
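One possible reading of this idea is sketched below in Python. Because this text is an abbreviated version, the disclosure's actual mechanism is not visible here, so everything in the sketch is an assumption: the class CopyAwareIndex, its on_copy hook, fixed-size chunks and SHA-256 are all hypothetical. The sketch assumes that a copy is byte-identical to its source, so the destination can inherit the source's chunk hashes at copy time with no new hash calculation.

```python
import hashlib


class CopyAwareIndex:
    """Hypothetical per-file hash index that reuses hashes when a copy happens."""

    def __init__(self, chunk_size: int = 4):
        self.chunk_size = chunk_size
        self.chunk_hashes = {}  # path -> ordered list of chunk hashes
        self.hash_calls = 0     # counts how many chunks were actually hashed

    def add_file(self, path: str, data: bytes) -> None:
        """Ordinary path: chunk the file and hash every chunk."""
        hashes = []
        for i in range(0, len(data), self.chunk_size):
            self.hash_calls += 1
            chunk = data[i:i + self.chunk_size]
            hashes.append(hashlib.sha256(chunk).hexdigest())
        self.chunk_hashes[path] = hashes

    def on_copy(self, src: str, dst: str) -> None:
        """Copy path (assumed hook): the destination inherits the source's hashes.

        Assumes src was indexed earlier and the copy is byte-identical,
        so nothing has to be hashed or compared again.
        """
        self.chunk_hashes[dst] = list(self.chunk_hashes[src])


index = CopyAwareIndex()
index.add_file("A", b"aaaabbbbcccc")
calls_before = index.hash_calls
for n in range(1000):
    index.on_copy("A", f"A_copy_{n}")
```

In this sketch the number of hash calculations stays the same after 1000 copies, because each copy only duplicates a short list of existing hashes.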

Chart 1-1: general de-dup method

Chart 1-2: improvement in this disclosure. General de-dup vs. this disclosure.

As stated in the background, this idea benefits from the OS filesystem itself to save hash value calculation and comparison time. Imagine if 1000 copies happened on a de-dup enabled disk,...