A Method and System for Allowing Compression above Cache Platforms to Maintain a Fully Persistent Index on Disk

IP.com Disclosure Number: IPCOM000241060D
Publication Date: 2015-Mar-23
Document File: 4 page(s) / 161K

Publishing Venue

The IP.com Prior Art Database

Abstract

A method and system are disclosed for allowing compression above cache platforms to maintain a fully persistent index on disk without any performance degradation or extra IO.

This is the abbreviated version, containing approximately 51% of the total text.

Compressing data is an efficient way to save space on storage systems. Real-time compression improves space efficiency by compressing user data as it is written to storage and decompressing it on user read demand. Because the data is compressed before it is written to the physical layer, it occupies less disk space and requires fewer input/output (IO) operations to disk. In some cases, depending on the integration platform, compression occurs above the cache. The compressed data is then not actually written to disk but to a lower cache. The integration platform where compression takes place is required to synchronize the data with a secondary cache, so that the data remains persistent in case of a failure, and to flush the data to disk when needed.

The data is written to disk/cache in a log-structured (journal) format. It is compressed in the order in which the user/application writes it (time-based compression, or temporal locality). After user data is compressed, it is written into fixed-size physical blocks, where each compressed block might hold several user logs from different virtual offsets that may not be adjacent. Since data modifications may be very frequent, the index is flushed to the underlying storage periodically. In case of a failure, the index entries for data that has not yet been indexed may be reconstructed by processing the part of the stream that was about to be indexed; a fully persistent index may therefore be inefficient to maintain. Writing each index record directly to disk before returning an acknowledgment to the front end may have a large performance impact, because it increases the number of IOs issued to disk.
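The background scheme described above (compressed user writes appended to a log, an index flushed only periodically, and the unindexed tail of the log replayed after a failure) can be illustrated with a minimal sketch. It is not the disclosed method. The class name, record header layout, 4096-byte block size, and use of zlib are all assumptions made for illustration, and an in-memory bytearray stands in for the disk or lower cache.

```python
import struct
import zlib

BLOCK_SIZE = 4096                 # hypothetical fixed physical block size
HEADER = struct.Struct("<QII")    # virtual offset, raw length, compressed length


class LogStructuredStore:
    """Toy model of a compressed journal with a periodically flushed index."""

    def __init__(self):
        self.log = bytearray()        # stands in for the disk / lower cache
        self.index = {}               # in-memory index: virtual offset -> log position
        self.persisted_index = {}     # last index snapshot flushed to storage
        self.indexed_upto = 0         # log position covered by that snapshot

    def write(self, virtual_offset, data):
        # Compress in arrival order and append to the log (temporal locality).
        payload = zlib.compress(data)
        pos = len(self.log)
        self.log += HEADER.pack(virtual_offset, len(data), len(payload)) + payload
        self.index[virtual_offset] = pos

    def flush_index(self):
        # Periodic flush: persist a snapshot of the index.
        self.persisted_index = dict(self.index)
        self.indexed_upto = len(self.log)

    def read(self, virtual_offset):
        pos = self.index[virtual_offset]
        _, raw_len, comp_len = HEADER.unpack_from(self.log, pos)
        start = pos + HEADER.size
        data = zlib.decompress(bytes(self.log[start:start + comp_len]))
        assert len(data) == raw_len
        return data

    def recover(self):
        # After a failure the in-memory index is lost: start from the last
        # snapshot and replay the part of the log that was not yet indexed.
        self.index = dict(self.persisted_index)
        pos = self.indexed_upto
        replayed = 0
        while pos < len(self.log):
            virtual_offset, _, comp_len = HEADER.unpack_from(self.log, pos)
            self.index[virtual_offset] = pos
            pos += HEADER.size + comp_len
            replayed += 1
        return replayed

    def physical_blocks(self):
        # Number of fixed-size blocks the log occupies (ceiling division).
        return -(-len(self.log) // BLOCK_SIZE)


store = LogStructuredStore()
store.write(0, b"a" * 1000)
store.write(8192, b"b" * 1000)     # non-adjacent virtual offset, same block
store.flush_index()
store.write(0, b"c" * 1000)        # overwrite after the flush: not yet persisted
replayed = store.recover()         # replays only the one unindexed record
```

In this sketch, recovery cost grows with the number of records written since the last index flush, while flushing on every write would add one extra index IO per user write. That is the trade-off the paragraph above describes.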

Disclosed is a method and system for allowing compression above cache platforms to maintain a fully persistent index on d...