Enabling Application and Meta Data Information through Tape Deduplication Monitor

IP.com Disclosure Number: IPCOM000198576D
Publication Date: 2010-Aug-09
Document File: 2 page(s) / 23K

Publishing Venue

The IP.com Prior Art Database

Abstract

Described is a tape deduplication monitor that uses the unique application information along with specific qualifiers to identify the source and characteristics of the tape and build tables of identifying information. Deduplication applications access this table as they process writes to a tape to filter metadata and application-specific characteristics. This enables more efficient deduplication processing and better data reduction percentages.

This text was extracted from a PDF file.
This is the abbreviated version, containing approximately 51% of the total text.

Customer data management storage trends are moving toward the combination of virtual tape and data deduplication to improve efficiency and reduce the costs of tape processing operations. Applications such as DB2, HSM, and FDR must overcome serious obstacles to attain the extraordinary efficiency of up to 25 TB of application data being stored in 1 TB of disk cache after deduplication.

One of the inhibitors that deduplication products face is the metadata that the writing applications intermix with the actual customer raw data as they write to tape. Some of this metadata includes items such as dates or timestamps, which might differ between otherwise identical versions of an item, such as data set backups. This disrupts data patterns and reduces the ability of deduplication products to find patterns in the data stream. In addition, the identity of the processing application is not readily apparent from the raw tape data.
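The effect of intermixed metadata can be illustrated with a minimal sketch. The record layout used here (a `TS=` header carrying a 14-digit timestamp, followed by the payload) is purely hypothetical and is not the tape format of any application named above. SHA-256 stands in for whatever fingerprint a deduplication product actually uses.

```python
import hashlib
import re

# Hypothetical record layout: "TS=<14 digits>;" header followed by the payload.
TIMESTAMP = re.compile(rb"TS=\d{14};")


def fingerprint(block: bytes) -> str:
    """Return a SHA-256 digest, standing in for a deduplication fingerprint."""
    return hashlib.sha256(block).hexdigest()


def mask_metadata(block: bytes) -> bytes:
    """Replace the assumed timestamp field with a fixed placeholder."""
    return TIMESTAMP.sub(b"TS=00000000000000;", block)


# Two backups of the same data set, written on different days.
payload = b"customer raw data " * 4
backup_monday = b"TS=20100802010000;" + payload
backup_tuesday = b"TS=20100803010000;" + payload

# Without knowledge of the metadata, the blocks look different.
raw_match = fingerprint(backup_monday) == fingerprint(backup_tuesday)

# With the timestamp masked, the blocks are recognized as duplicates.
masked_match = (
    fingerprint(mask_metadata(backup_monday))
    == fingerprint(mask_metadata(backup_tuesday))
)

print(raw_match, masked_match)
```

In this sketch the unmasked blocks do not match while the masked blocks do, which is the kind of gain that application-aware filtering is meant to make possible.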

Improvements are needed to give deduplication products information about the application writing data to a tape so that they can tune their deduplication algorithms accordingly.

The solution presents a new tape deduplication monitor which uses the unique application information along with specific qualifiers in the mount and SMS message text to identify the source and characteristics of the tape. For each tape currently mounted, the system constructs a unique table where each application provides information for use by deduplication processing. This information can be general for any component or tailored to unique or specific processing that component might perform. It describes the characteristics of the processing application and allows deduplication processing to use parsers tailored to the data stream unique to that application.
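The per-tape table and the parser selection it enables might be sketched as follows. This is an illustration under stated assumptions, not the disclosed implementation: the class names, the field names, the use of the volume serial as the table key, and the 16-byte header handled by `strip_header_parser` are all hypothetical.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, Optional


@dataclass
class TapeEntry:
    """Identifying information for one mounted tape (field names are assumed)."""
    volser: str
    application: str
    characteristics: Dict[str, str] = field(default_factory=dict)


class TapeMonitorTable:
    """In-memory table holding one entry per currently mounted tape."""

    def __init__(self) -> None:
        self._entries: Dict[str, TapeEntry] = {}

    def on_mount(self, entry: TapeEntry) -> None:
        self._entries[entry.volser] = entry

    def on_unmount(self, volser: str) -> None:
        self._entries.pop(volser, None)

    def lookup(self, volser: str) -> Optional[TapeEntry]:
        return self._entries.get(volser)


def generic_parser(data: bytes) -> bytes:
    """Fallback: pass the data stream through unchanged."""
    return data


def strip_header_parser(data: bytes) -> bytes:
    """Hypothetical parser: treat the first 16 bytes as application metadata."""
    return data[16:]


# Assumed mapping from application name to a tailored parser.
PARSERS: Dict[str, Callable[[bytes], bytes]] = {"HSM": strip_header_parser}


def select_parser(table: TapeMonitorTable, volser: str) -> Callable[[bytes], bytes]:
    """Pick the parser for a tape, falling back to the generic one."""
    entry = table.lookup(volser)
    if entry is None:
        return generic_parser
    return PARSERS.get(entry.application, generic_parser)


table = TapeMonitorTable()
table.on_mount(TapeEntry("VOL001", "HSM", {"metadata": "16-byte header"}))
parser = select_parser(table, "VOL001")
```

Because the lookup happens outside the data path of the writing application, a sketch like this leaves the data written to tape unchanged.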

An advantage of this solution is that it affects neither the format of current information written to tapes nor the performance of deduplication tape processing. This differs from existing approaches, which often require application program modifications to identify the type of data on a tape.

Existing processes and procedures need not be changed to accommodate the new technology and little or no operating system modification is required. The deduplication monitor is compatible with any of the currently available or future deduplication products.

The disclosed process used by the tape deduplication monitor follows:

1. Combine product usage identification data. The tape monitor interrogates every tape mount message issued, along with associated SMS information, to compare combinations of the program, data set name, jobname and stepname to a table/fi...
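The comparison named in step 1 could look roughly like the sketch below. It assumes that the identifying table holds wildcard patterns for each of the four fields and that the first matching rule wins; the pattern syntax, the rule contents, the program names, and the names `MountInfo`, `Rule`, and `identify` are all hypothetical and are not taken from the disclosure.

```python
from fnmatch import fnmatchcase
from typing import NamedTuple, Optional, Sequence


class MountInfo(NamedTuple):
    """Fields assumed to be extracted from a mount message and SMS information."""
    program: str
    dsname: str
    jobname: str
    stepname: str


class Rule(NamedTuple):
    """One row of the hypothetical identification table."""
    program: str
    dsname: str
    jobname: str
    stepname: str
    application: str


# Made-up rules for illustration only.
RULES = [
    Rule("HSMPGM", "HSM.BACKTAPE.*", "*", "*", "HSM"),
    Rule("FDR*", "*", "*", "*", "FDR"),
]


def identify(mount: MountInfo, rules: Sequence[Rule] = RULES) -> Optional[str]:
    """Return the application of the first rule whose four patterns all match."""
    for rule in rules:
        patterns = rule[:4]  # program, dsname, jobname, stepname
        if all(fnmatchcase(value, pattern) for value, pattern in zip(mount, patterns)):
            return rule.application
    return None


mount = MountInfo("HSMPGM", "HSM.BACKTAPE.DATASET", "JOB1", "STEP1")
application = identify(mount)
```

A mount that matches no rule yields `None`, in which case a deduplication product would presumably fall back to its generic processing.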