A sort method based on sorted compress dictionary for memory Column store

IP.com Disclosure Number: IPCOM000238510D
Publication Date: 2014-Sep-01
Document File: 4 page(s) / 76K

Publishing Venue

The IP.com Prior Art Database

Abstract

This method applies to in-memory column stores. Data stored in memory is encoded, so sorting it normally requires decoding it first, and decoding consumes a lot of CPU and memory resources. This method sorts the compression dictionary first; the data in memory can then be ordered based on the sorted compression dictionary. The data in memory does not need to be decoded, which saves a lot of CPU and memory resources.

This is the abbreviated version, containing approximately 63% of the total text.

BACKGROUND

Column tables are increasingly used in OLAP systems. In a column table, data can be compressed based on how often each value appears. Huffman encoding illustrates the idea: a value that appears many times can be compressed more than values that appear less often. For instance, the letter 'e' may appear many times, so it can be encoded with a single bit (1), giving very good compression for this frequent item. The letter 'q', however, may appear rarely, perhaps only once, so it is encoded with seven bits. As a result, items that appear more often get a higher level of compression. The letters here only illustrate the concept.
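The frequency-based idea can be sketched with a small Huffman code builder. This is only an illustration of the concept, not the encoder of any particular column store: the function name huffman_codes and the sample column are hypothetical, and the exact code lengths depend on the frequencies chosen.

```python
import heapq
from collections import Counter


def huffman_codes(values):
    """Build a prefix code in which frequent values get shorter codes."""
    counts = Counter(values)
    if len(counts) == 1:
        # A single distinct value still needs one bit.
        return {next(iter(counts)): "0"}
    # Heap entries: (frequency, tie_breaker, {value: code_so_far}).
    # The unique tie_breaker keeps the dicts from ever being compared.
    heap = [(freq, i, {value: ""}) for i, (value, freq) in enumerate(counts.items())]
    heapq.heapify(heap)
    next_id = len(heap)
    while len(heap) > 1:
        # Merge the two least frequent subtrees, prefixing their codes.
        freq_a, _, codes_a = heapq.heappop(heap)
        freq_b, _, codes_b = heapq.heappop(heap)
        merged = {v: "0" + c for v, c in codes_a.items()}
        merged.update({v: "1" + c for v, c in codes_b.items()})
        heapq.heappush(heap, (freq_a + freq_b, next_id, merged))
        next_id += 1
    return heap[0][2]


# Hypothetical column: 'e' is very frequent, 'q' appears once.
column = ["e"] * 50 + ["t"] * 20 + ["a"] * 10 + ["q"] * 1
codes = huffman_codes(column)
```

With these assumed frequencies, the code for 'e' is shorter than the code for 'q', which is the property the paragraph above describes.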

This raises a question: how can a compressed column table be sorted? The usual answer is to decompress the data first and then sort it, but for very large data that requires a lot of space and CPU resources.

DETAILED DESCRIPTION

This disclosure describes a method that performs the sort directly on the dictionary of the table. Search results can then be ordered directly by the sorted compression dictionary, so the database engine does not need to decompress the codes of the results. This saves a lot of CPU and memory resources.
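A minimal sketch of this idea follows, under stated assumptions: the dictionary is modeled as a plain mapping from code to value, the column as a list of codes, and the helper names rank_codes and sort_rows_by_code are hypothetical. The disclosure's own data layout is not shown in this excerpt, so this is one possible reading rather than the described implementation.

```python
def rank_codes(dictionary):
    """Map each code to the rank of its value in sorted order.

    `dictionary` maps code -> value (an assumed layout).
    Only the dictionary entries are compared, never the column rows.
    """
    ordered = sorted(dictionary, key=dictionary.get)
    return {code: rank for rank, code in enumerate(ordered)}


def sort_rows_by_code(column_codes, dictionary):
    """Return row positions ordered by the value each code stands for."""
    rank = rank_codes(dictionary)
    # Rows are compared through the precomputed rank of their code,
    # so no row is decoded during the sort.
    return sorted(range(len(column_codes)), key=lambda row: rank[column_codes[row]])


# Hypothetical example data: frequent values carry shorter codes.
dictionary = {"0": "Shanghai", "10": "Beijing", "110": "Austin", "111": "Zurich"}
column_codes = ["0", "110", "10", "0", "111", "10"]

row_order = sort_rows_by_code(column_codes, dictionary)
sorted_codes = [column_codes[row] for row in row_order]
```

The dictionary is sorted once, at a cost that depends on the number of distinct values rather than the number of rows. The rows themselves stay encoded, and values need to be looked up only when results are finally presented.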

Let's see how it works. Take the following table as...