The rank-based partial compression method depending on different access frequency
Publication Date: 2015-Nov-23
The IP.com Prior Art Database
This article describes a method that compresses or decompresses individual parts of a file according to how frequently each part is accessed. The compression level can be calculated by a formula. The method can be used with any file system, including distributed file systems.
When files are stored in a file system or content management system, compressing them is a common way to save space.
The tradeoff is that compressed files must be decompressed before they can be accessed, and the cost in computing resources and time can be very high.
An existing solution leaves hot files uncompressed, or compresses them at a low level, while compressing cold files at a high level.
In the more common scenario, however, a hot file may have cold parts and a cold file may have hot parts. For example:
a. In an article storage system, the title, abstract, keywords, and some key flowcharts or formulas are hot, while all other parts may be cold.
b. In a web environment, many file formats support streaming access. That means the beginning of the file is usually much hotter than the middle and the tail.
c. A resource file for a computer game may be very large, yet some parts are read far more often than others. Hot parts may include the opening animation and the main character's resources. Cold parts may include the ending animation and NPC characters that are seldom used.
In the scenarios above, the whole-file hot/cold method brings little benefit.
Our claim is the rank-based partial compression method. Parts of a file that are accessed frequently (hot) are left uncompressed or compressed at a low level. Parts that are seldom accessed (cold) are compressed at a high level.
With this method, a file can be stored in less space while its hot content can still be retrieved at low cost and with quick response.
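The idea can be illustrated with a minimal Python sketch. It is not the disclosed implementation: the use of zlib, the function names (split_parts, store_part, read_part), and the convention that level 0 means "stored uncompressed" are all assumptions made for illustration.

```python
import zlib
from typing import List, Tuple

# A stored part is (level, payload); level 0 is an assumed marker
# meaning the payload is the raw, uncompressed bytes.
StoredPart = Tuple[int, bytes]


def split_parts(data: bytes, part_size: int) -> List[bytes]:
    """Split data into fixed-size logical parts (the last may be shorter)."""
    return [data[i:i + part_size] for i in range(0, len(data), part_size)]


def store_part(part: bytes, level: int) -> StoredPart:
    """Keep a hot part raw (level 0) or compress it at zlib level 1..9."""
    if level == 0:
        return (0, part)
    return (level, zlib.compress(part, level))


def read_part(stored: StoredPart) -> bytes:
    """Return the original bytes; only compressed parts pay for inflation."""
    level, payload = stored
    return payload if level == 0 else zlib.decompress(payload)


# Hypothetical file whose first part (title and abstract) is hot.
data = b"TITLE and abstract " * 50 + b"body text " * 500
parts = split_parts(data, 1024)
stored = [store_part(p, 0 if i == 0 else 9) for i, p in enumerate(parts)]
```

Reading the hot first part returns the stored bytes directly, while each cold part is inflated only when it is actually requested.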
The implementation, which moves parts between higher and lower compression levels, is as follows.
a. Set a partition size and logically split the file into multiple parts. Alternatively, for a file with a known or fixed format, such as an article, manage its structural sections as the parts.
b. At the beginning, all files are uncompressed.
c. Whenever the file is accessed, record the hit count and keep time.
d. Based on the recorded hit counts and keep times, compute the ratio and adjust each part's compression level accordingly.
e. When hot parts turn cold, they are recompressed at a higher level according to the ratio. The compression can be triggered by lifecycle control events. The most direct way is to run it as a batch sweep task at a fixed interval. In short, it can be triggered either by an event or by a schedule.
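The steps above can be sketched in Python as follows. The text does not give the formula, so everything specific here is an assumption for illustration: the ratio is taken to be hits divided by keep time in seconds, keep time is taken to be the time since tracking of the part began, and the thresholds (1.0 and 0.1) and levels (0, 1, 9) are arbitrary. The names PartStats, record_read, and sweep are hypothetical.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class PartStats:
    hits: int = 0            # reads recorded for this part (step c)
    created_at: float = 0.0  # when tracking began, in seconds
    level: int = 0           # 0 = uncompressed, the initial state (step b)


def access_ratio(stats: PartStats, now: float) -> float:
    """Assumed formula: hits per second of keep time (step d)."""
    keep_time = max(now - stats.created_at, 1e-9)  # avoid division by zero
    return stats.hits / keep_time


def level_for_ratio(ratio: float, hot: float = 1.0, warm: float = 0.1) -> int:
    """Map a ratio to a compression level using assumed thresholds."""
    if ratio >= hot:
        return 0   # hot: keep uncompressed
    if ratio >= warm:
        return 1   # warm: light compression
    return 9       # cold: strongest compression


def record_read(parts: List[PartStats], offset: int, length: int,
                part_size: int) -> None:
    """Count one hit for every part overlapped by the byte range read."""
    first = offset // part_size
    last = min((offset + length - 1) // part_size, len(parts) - 1)
    for i in range(first, last + 1):
        parts[i].hits += 1


def sweep(parts: List[PartStats], now: float) -> List[int]:
    """Batch task (step e): update levels, return indexes that changed."""
    changed = []
    for i, stats in enumerate(parts):
        new_level = level_for_ratio(access_ratio(stats, now))
        if new_level != stats.level:
            stats.level = new_level
            changed.append(i)
    return changed
```

A caller would recompress only the parts returned by sweep, so unchanged parts cost nothing. Whether sweep runs on a timer or in response to a lifecycle event is left to the surrounding system.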
This picture shows a file being accessed, where the parts read are the 2nd and 3rd parts from the beginning.