Abstract of LOB data
Publication Date: 2017-Mar-02
The IP.com Prior Art Database
A new solution for extracting an abstract from a Large Object (LOB) column for different data types by sampling the raw data, together with a storage method for the database, to improve LOB read performance where a lower quality is acceptable.
The LOB data types exist to store various kinds of large objects in the database, for example large quantities of text, images, maps, audio, and video. There are three basic LOB types, used for different purposes, as described below:
Character Large Object (CLOB)/Double-Byte Character Large Object (DBCLOB)
A character large object is a varying-length string. A CLOB or DBCLOB is designed to store large single-byte (SBCS) or double-byte (DBCS) character data, such as lengthy documents. For example, you can store an employee resume, the script of a play, or the text of a novel.
Binary Large Object (BLOB)
A binary large object is a varying-length string. A BLOB is designed to store non-traditional data such as pictures, voice, and mixed media.
Problem
The main bottleneck in using LOB data is the poor performance of storing (inserting) it into and retrieving (reading) it from the database, and updating is even more costly. Normally the database performs a delete followed by a reinsert to update a LOB.
However, the LOB data types are not designed for data that needs to be updated frequently. The typical use of a LOB is to store permanent reference data, such as a photo or a voice or video clip. When such data is requested, the database reads the whole LOB object and returns all of it to the caller, and the application can then use it as it wishes. But in some scenarios the user needs only an abstract of the LOB data, and the cost of retrieving the full LOB becomes a visible weakness.
We propose a new solution for extracting the abstract from the LOB column for different data types, together with a storage method for the database.
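To make the idea of sampling concrete, the following is a minimal sketch only. It assumes the LOB holds raw, uncompressed samples (for example PCM audio or bitmap pixels) and that a fixed-stride subsample is an acceptable abstract. The function name sample_abstract and the parameters ratio and chunk are hypothetical and are not taken from the disclosure.

```python
def sample_abstract(data: bytes, ratio: int = 4, chunk: int = 1) -> bytes:
    """Keep one chunk of `chunk` bytes out of every `ratio` chunks.

    Assumption: `data` is raw, uncompressed sample data, so dropping
    samples at a fixed stride lowers quality without corrupting a format.
    """
    if ratio < 1 or chunk < 1:
        raise ValueError("ratio and chunk must be positive")
    step = ratio * chunk  # distance between the starts of kept chunks
    return b"".join(data[i:i + chunk] for i in range(0, len(data), step))


if __name__ == "__main__":
    raw = bytes(range(16))
    # Keep 2 bytes out of every 8: a quarter of the original size.
    print(sample_abstract(raw, ratio=4, chunk=2).hex())
```

With ratio=4 the abstract is roughly a quarter of the original size, which is the trade of quality for read performance described above. Compressed formats would need format-aware sampling instead of a plain byte stride.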
Metadata for LOB data
To recognize and process the different types of LOB data, we need metadata that indicates the content type and format, as well as the precision of the data, as shown below; we can then decide how to extract the abstract for this LOB. This information can be collected by reading the LOB object itself.
Content type: image
File format: JPEG
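As an illustration of collecting this metadata from the LOB itself, the sketch below inspects the leading bytes of the object and maps well-known file signatures to a content type and file format. The names LobMetadata and detect_metadata are hypothetical, only a few image formats are covered, and precision is left out because deriving it requires format-specific header parsing.

```python
from typing import NamedTuple, Optional


class LobMetadata(NamedTuple):
    content_type: str
    file_format: str


# Well-known leading signatures ("magic numbers") of a few image formats.
_SIGNATURES = [
    (b"\xff\xd8\xff", LobMetadata("image", "JPEG")),
    (b"\x89PNG\r\n\x1a\n", LobMetadata("image", "PNG")),
    (b"GIF87a", LobMetadata("image", "GIF")),
    (b"GIF89a", LobMetadata("image", "GIF")),
]


def detect_metadata(head: bytes) -> Optional[LobMetadata]:
    """Return metadata for the first bytes of a LOB, or None if unknown."""
    for signature, meta in _SIGNATURES:
        if head.startswith(signature):
            return meta
    return None


if __name__ == "__main__":
    # Only the first few bytes of the LOB are needed, not the whole object.
    print(detect_metadata(b"\xff\xd8\xff\xe0" + b"\x00" * 12))
```

Because only a short prefix is examined, the metadata can be gathered without reading the full LOB.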