
A method for using extra metadata space on block storage system to apply block level data analysis
Disclosure Number: IPCOM000257175D
Publication Date: 2019-Jan-18
Document File: 5 page(s) / 161K

Publishing Venue

The Prior Art Database

This text was extracted from a PDF file.
This is the abbreviated version, containing approximately 32% of the total text.

A method for using extra metadata space on block storage system to apply block level data analysis

Background: a) Traditional block storage systems store data on disk drives by writing it into fixed-size blocks.

The storage system is not aware of the content of the stored data at all, and the data blocks are managed entirely by higher-level software.

b) Existing object storage systems are the closest analogue: they store metadata at the same level as the data itself, and they have the potential to accelerate data analysis by supplying that metadata to a data analysis system. However, an object storage system relies on block storage to hold the actual data, so even if it implements analysis capabilities internally, the analysis is performed at a level above the block device.

c) Other methods also introduce a similar metadata space for data stored on a block device, but they all use that metadata for data integrity and other basic block-level operations that do not involve any data analysis.

d) This method adds extra space alongside each data block as a metadata block, allowing storage systems that have embedded data analysis capabilities to store the analysis result for each data block inside its metadata block, as an aid to subsequent overall data analysis.

In this way, storage systems with embedded data analysis capabilities can take advantage of the metadata block and perform the analysis in-flight, while the data is being read or written. The analysis runs in parallel with a normal I/O request and is complete when the request finishes, so no additional time is needed specifically for analyzing the data.
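The idea in item d) can be illustrated with a minimal in-memory sketch. It is not the disclosure's implementation: the names `Block`, `AnalyzingBlockStore`, `analyze`, and `BLOCK_SIZE` are hypothetical, the "analysis" is a trivial byte and newline count, and the analysis runs synchronously inside the write call rather than truly in parallel with the I/O.

```python
from dataclasses import dataclass, field

BLOCK_SIZE = 16  # hypothetical, tiny block size chosen for readability


@dataclass
class Block:
    data: bytes = b""
    metadata: dict = field(default_factory=dict)  # the extra per-block metadata space


def analyze(data: bytes) -> dict:
    """Hypothetical per-block analysis: payload length and newline count."""
    return {"length": len(data), "newlines": data.count(b"\n")}


class AnalyzingBlockStore:
    """Toy in-memory block device with one metadata block per data block."""

    def __init__(self, num_blocks: int) -> None:
        self.blocks = [Block() for _ in range(num_blocks)]

    def write(self, lba: int, data: bytes) -> None:
        if len(data) > BLOCK_SIZE:
            raise ValueError("data does not fit in one block")
        block = self.blocks[lba]
        block.data = data
        # The analysis result is produced as part of the write itself,
        # so no separate analysis pass over the stored data is needed.
        block.metadata = analyze(data)

    def read(self, lba: int) -> bytes:
        return self.blocks[lba].data

    def read_metadata(self, lba: int) -> dict:
        # Return a copy so callers cannot mutate the stored metadata.
        return dict(self.blocks[lba].metadata)


store = AnalyzingBlockStore(num_blocks=4)
store.write(0, b"a\nb\n")
result = store.read_metadata(0)
```

Here `result` holds the analysis for block 0 as soon as the write returns; a real device would presumably compute it in the controller's data path.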

e) The metadata block can also solve another problem. By adding relationship information, such as a previous-block pointer and a next-block pointer, to the metadata, a block storage system that uses this method gains the ability to manage data at the block level. For example, it can keep logically consecutive data blocks together by moving them asynchronously in the background, so that when the data is later requested by read I/O, read latency improves because all requested blocks are stored contiguously in one place.
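The block relocation described in item e) might look roughly like the following sketch. It assumes, purely for illustration, that each metadata block holds `prev` and `next` fields containing physical block indices; the names `Block`, `chain_order`, and `compact` are hypothetical, and the "background" move is modeled as a plain function that returns a new layout.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Block:
    data: bytes
    prev: Optional[int] = None  # physical index of the previous block in the chain
    next: Optional[int] = None  # physical index of the next block in the chain


def chain_order(blocks: List[Block], head: int) -> List[int]:
    """Follow next-pointers from head and return the physical indices visited."""
    order, seen = [], set()
    cur: Optional[int] = head
    while cur is not None:
        if cur in seen:
            raise ValueError("cycle in block chain")
        seen.add(cur)
        order.append(cur)
        cur = blocks[cur].next
    return order


def compact(blocks: List[Block], head: int) -> List[Block]:
    """Return a new layout in which the chain starting at head is contiguous.

    The chain is placed at indices 0..n-1; all other blocks follow in their
    original relative order. Every prev/next pointer is rewritten to match.
    """
    order = chain_order(blocks, head)
    chained = set(order)
    layout = order + [i for i in range(len(blocks)) if i not in chained]
    new_index = {old: new for new, old in enumerate(layout)}

    def remap(ptr: Optional[int]) -> Optional[int]:
        return None if ptr is None else new_index[ptr]

    return [
        Block(blocks[old].data, remap(blocks[old].prev), remap(blocks[old].next))
        for old in layout
    ]


# Chain A -> B -> C is scattered over physical indices 2, 3, 0.
scattered = [
    Block(b"C", prev=3, next=None),
    Block(b"x"),
    Block(b"A", prev=None, next=3),
    Block(b"B", prev=2, next=0),
]
compacted = compact(scattered, head=2)
```

After compaction, the chain occupies indices 0 to 2 in order, so a sequential read of A, B, C touches adjacent blocks.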

Summary: This method tries to improve data analysis efficiency by introducing an extra metadata space for each data block on the block storage device, and by providing an interface that allows the storage system and higher-level applications to use this metadata space to store the analysis result for each data block in its corresponding metadata block, so that the analysis can be done in parallel with the data I/O requests. In some scenarios, it might not be necessary to read all raw data from the storage device and then analyze each piece of data to get the result. It may be enough to extract the analysis results from the metadata blocks and put them together to form the final analysis...
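As a rough illustration of forming a final result from per-block metadata alone, the sketch below assumes each metadata block stores a small numeric summary (count, sum, min, max) of the values in its data block. The summary format and the names `summarize` and `combine` are assumptions made for this example and are not taken from the disclosure.

```python
from typing import Dict, Iterable, List, Optional, Union

Number = Union[int, float]
Summary = Dict[str, Optional[Number]]


def summarize(values: List[Number]) -> Summary:
    """Per-block analysis result, as it might be stored in a metadata block."""
    if not values:
        return {"count": 0, "sum": 0, "min": None, "max": None}
    return {
        "count": len(values),
        "sum": sum(values),
        "min": min(values),
        "max": max(values),
    }


def combine(summaries: Iterable[Summary]) -> Summary:
    """Merge per-block summaries without touching the raw data blocks."""
    non_empty = [s for s in summaries if s["count"]]
    if not non_empty:
        return {"count": 0, "sum": 0, "min": None, "max": None}
    return {
        "count": sum(s["count"] for s in non_empty),
        "sum": sum(s["sum"] for s in non_empty),
        "min": min(s["min"] for s in non_empty),
        "max": max(s["max"] for s in non_empty),
    }


# Three blocks' worth of metadata; the second block is empty.
metadata_blocks = [summarize([3, 9]), summarize([]), summarize([1, 4, 4])]
total = combine(metadata_blocks)
mean = total["sum"] / total["count"]
```

This only works for analyses whose per-block results can be merged (counts, sums, extrema, and the like); an analysis that needs to see records spanning block boundaries would still require the raw data.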