Method and System for Managing Multi-tenancy Over Deduplication Technology for a Storage Cloud
Publication Date: 2011-Sep-02
The IP.com Prior Art Database
A method and system for managing multi-tenancy over de-duplication technology for a storage cloud is disclosed. Selective de-duplication in the storage cloud is performed by de-duplicating the data of a file or a block based on a de-duplication policy defined for that file or block. The policy comprises information regarding ownership of the data present in the file or the block, the business importance of the data, and a list of friendly owners for the data. Based on the information in the policy, potentially redundant copies of a file or block are determined.
Disclosed is a method and system for managing multi-tenancy over de-duplication technology for a storage cloud. A plurality of files or blocks may be de-duplicated in the storage cloud according to a user-created policy defined for each file or block. The policy comprises information regarding ownership of the data present in the file or the block, the business importance of the data, and a list of friendly owners for the data. De-duplication is performed on the plurality of files in the storage cloud according to the policy defined for each file or block. When a file defined as business-important exists as multiple copies held by multiple friendly owners, the redundant copies may be removed and replaced by references to a single master copy of the file. An architecture that implements the disclosed method is shown in Fig. 1. The architecture includes one or more application servers, a de-duplication server, and a storage system for managing de-duplication for the storage cloud.
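The replacement of redundant copies by references to a single master copy can be illustrated with a minimal sketch. It is not the disclosed implementation: the function name `deduplicate`, the `may_share` callback standing in for the policy check, and the use of SHA-256 content digests to detect identical data are all assumptions made for illustration.

```python
import hashlib
from typing import Callable, Dict, List, Tuple


def deduplicate(
    files: Dict[str, Tuple[str, bytes]],
    may_share: Callable[[str, str], bool],
) -> Dict[str, str]:
    """Map each file name to the name of the master copy it references.

    files maps a file name to (owner, content). may_share(owner_a, owner_b)
    is a hypothetical stand-in for the per-file policy check.
    """
    masters: Dict[str, List[str]] = {}  # content digest -> master file names
    reference: Dict[str, str] = {}      # file name -> master file name
    for name, (owner, content) in files.items():
        # Assumption: identical content is detected by comparing digests.
        digest = hashlib.sha256(content).hexdigest()
        for master in masters.setdefault(digest, []):
            master_owner = files[master][0]
            if may_share(owner, master_owner):
                reference[name] = master  # redundant copy: point at master
                break
        else:
            # No permitted master exists yet, so this copy becomes one.
            masters[digest].append(name)
            reference[name] = name
    return reference
```

In this sketch, two files with identical content are merged only when the callback permits it, so the same data may legitimately have more than one master copy in the storage cloud, one per group of owners that are allowed to share.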
The table below shows the database schema of the policy associated with the files and blocks that are to be de-duplicated. Each file or block has a policy for de-duplication. The policy comprises one or more of an owner, a dataclass, a datatype, and a list of friendly owners for the file. The owner is the user ID associated with the file to which the data belongs. The dataclass is a classification of the data based on file characteristics, such as one or more of the semantic content, metadata, and attributes of the file. The datatype indicates the type of data and can be '0' or '1'. A datatype value of '0' indicates that data belonging to this dataclass is private and business-important. A datatype value of '1' indicates that data belonging to this dataclass is not important and can be de-duplicated with any other files in the storage cloud. The list of friendly owners indicates that a file can be de-duplicated with the data of the listed owners, provided that they in turn have the owner of the file in their own friendly lists. In one embodiment, all the policies for de-duplication are stored in a central
Class of data based on its contents or...
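The policy fields and the friendly-owner rule described above can be sketched as a small record type and an eligibility check. This is an illustrative sketch under stated assumptions rather than the disclosed schema: the names `Policy`, `dataclass_name`, `PRIVATE`, `SHARED`, and `can_deduplicate` are hypothetical, and two points are interpretations not stated in the text, namely that a private file may always be de-duplicated against another file of the same owner, and that a '1' file is freely shareable only with another '1' file.

```python
from dataclasses import dataclass
from typing import FrozenSet

PRIVATE = 0  # datatype '0': private and business-important
SHARED = 1   # datatype '1': not important, may be shared with any file


@dataclass(frozen=True)
class Policy:
    owner: str            # user ID that the data belongs to
    dataclass_name: str   # classification of the data (hypothetical field name)
    datatype: int         # PRIVATE or SHARED
    friendly_owners: FrozenSet[str] = frozenset()


def can_deduplicate(a: Policy, b: Policy) -> bool:
    """Return True if files governed by policies a and b may share one copy."""
    # Assumption: unimportant data is freely shareable only when both
    # sides are marked '1'; a '0' file still needs the checks below.
    if a.datatype == SHARED and b.datatype == SHARED:
        return True
    # Assumption: copies held by the same owner may always be merged.
    if a.owner == b.owner:
        return True
    # Friendship must be mutual: each owner lists the other.
    return b.owner in a.friendly_owners and a.owner in b.friendly_owners
```

For example, if alice lists bob and bob lists alice, their private files are eligible for de-duplication with each other; if carol lists alice but alice does not list carol, the check fails because the friendship is one-sided.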