A Method and System for Performing Integrated Query Processing on Hybrid Data using a Materialized Query Table

IP.com Disclosure Number: IPCOM000235979D
Publication Date: 2014-Apr-01
Document File: 4 page(s) / 79K

Publishing Venue

The IP.com Prior Art Database

Abstract

A method and system is disclosed for performing integrated query processing on hybrid data including one or more of structured data, semi-structured data, and unstructured data, using a Materialized Query Table.

This is the abbreviated version, containing approximately 31% of the total text.

Disclosed is a method and system for integrated query processing on hybrid data including one or more of structured data, semi-structured data, and unstructured data. The method and system uses a Materialized Query Table (MQT) as a caching mechanism for storing the results of queries on unstructured data and semi-structured data. Further, the MQT is incorporated within a relational database management system (RDBMS) used in processing queries directed towards structured data.
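The caching role of the MQT can be illustrated with a minimal sketch. It assumes an ordinary SQLite table standing in for the MQT (SQLite has no native MQT feature) and a hypothetical function `run_big_data_job` standing in for an expensive query on the big data storage structure. The table name, the query key, and the returned rows are illustrative only and do not come from the disclosure.

```python
import sqlite3
from typing import List, Tuple

# Assumption: a plain table emulates the MQT inside the relational engine.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE mqt_cache (query_key TEXT, tag TEXT, mentions INTEGER)"
)

calls = {"big_data_jobs": 0}  # counts how often the expensive path runs


def run_big_data_job(query_key: str) -> List[Tuple[str, int]]:
    """Hypothetical stand-in for a costly job on the big data store."""
    calls["big_data_jobs"] += 1
    return [("#camera", 1), ("#phone", 6)]


def query_with_mqt(conn: sqlite3.Connection, query_key: str) -> List[Tuple[str, int]]:
    """Serve from the emulated MQT when possible, otherwise compute and store."""
    rows = conn.execute(
        "SELECT tag, mentions FROM mqt_cache WHERE query_key = ? ORDER BY tag",
        (query_key,),
    ).fetchall()
    if rows:
        return rows  # cache hit: no big data job is needed
    # Cache miss (an empty cached result is also treated as a miss here).
    rows = run_big_data_job(query_key)
    with conn:  # commit the cached rows as one transaction
        conn.executemany(
            "INSERT INTO mqt_cache (query_key, tag, mentions) VALUES (?, ?, ?)",
            [(query_key, tag, mentions) for tag, mentions in rows],
        )
    return rows


first = query_with_mqt(conn, "tag_counts_last_week")
second = query_with_mqt(conn, "tag_counts_last_week")
```

In this sketch the second call is answered from the table inside the relational engine, so the expensive job runs only once.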

The method and system receives and stores hybrid data in one or more data sources. Semi-structured and unstructured data are stored in big data storage structures, while structured data is stored in relational databases. In an instance, the method and system stores structured data, such as mission-critical transactional data, using relational databases. Similarly, the method and system uses a big data storage structure such as Hadoop* for storing unstructured data such as instrumented data and social media data.

Thereafter, the method and system obtains hybrid data from the one or more data sources. Subsequently, unstructured data and semi-structured data are subjected to data filtering and cleansing. In an instance, the method and system uses stream-based approaches for performing data filtering and cleansing. The method and system also transforms the storage format of the unstructured data and semi-structured data from an original append-optimized format into a read-optimized format.
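The filtering, cleansing, and format transformation described above can be sketched as follows. The sketch assumes that the append-optimized format is a row-oriented log of records and that the read-optimized format is a column-oriented layout; the disclosure does not name either format. The field names in `REQUIRED_FIELDS` and the sample records are hypothetical.

```python
from typing import Any, Dict, Iterable, Iterator, List

REQUIRED_FIELDS = ("sensor_id", "ts", "value")  # hypothetical schema


def cleanse(records: Iterable[Dict[str, Any]]) -> Iterator[Dict[str, Any]]:
    """Lazily filter out unusable records and normalize the rest."""
    for rec in records:
        # Filtering: skip records that lack any required field.
        if any(rec.get(field) is None for field in REQUIRED_FIELDS):
            continue
        try:
            value = float(rec["value"])
        except (TypeError, ValueError):
            continue  # Filtering: skip readings that cannot be parsed.
        # Cleansing: normalize identifiers and types.
        yield {
            "sensor_id": str(rec["sensor_id"]).strip().lower(),
            "ts": int(rec["ts"]),
            "value": value,
        }


def to_columnar(rows: Iterable[Dict[str, Any]]) -> Dict[str, List[Any]]:
    """Pivot row-oriented records into one list per column."""
    columns: Dict[str, List[Any]] = {field: [] for field in REQUIRED_FIELDS}
    for row in rows:
        for field in REQUIRED_FIELDS:
            columns[field].append(row[field])
    return columns


# Row-oriented, append-style input (illustrative data only).
raw_log = [
    {"sensor_id": " S1 ", "ts": "100", "value": "21.5"},
    {"sensor_id": "S2", "ts": 101, "value": None},
    {"sensor_id": "S3", "ts": 102, "value": "n/a"},
    {"sensor_id": "s1", "ts": 103, "value": 22},
]
columns = to_columnar(cleanse(raw_log))
```

Because `cleanse` is a generator, records are processed one at a time as they arrive, which is one simple way to approximate a stream-based approach.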

Further, the method and system performs temporal filtering on the unstructured data for extracting recent unstructured data. Thereafter, the recent unstructured data is converted by the method and system into a semi-structured format for subsequent storage into the big data storage structure. The method and system thereafter pushes queries requiring processing of unstructured data to the big data storage structure using MapReduce jobs. Additionally, the method and system uses Structured Query Language (SQL) as a front-end query language for receiving queries.
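A minimal sketch of the temporal filtering, the conversion to a semi-structured format, and a MapReduce-style aggregation is given below. It assumes raw text lines of the hypothetical form "<epoch seconds> <free text>", uses JSON as the semi-structured format, and runs the map and reduce phases in a single process rather than on an actual cluster. All function names and sample data are illustrative.

```python
import json
from collections import defaultdict
from typing import Dict, Iterable, Iterator, Tuple


def recent_only(lines: Iterable[str], now: int, window: int) -> Iterator[Tuple[int, str]]:
    """Temporal filter: keep lines whose timestamp lies in [now - window, now]."""
    for line in lines:
        ts_text, _, body = line.partition(" ")
        if not ts_text.isdigit():
            continue  # no leading timestamp, so recency cannot be judged
        ts = int(ts_text)
        if now - window <= ts <= now:
            yield ts, body


def to_semi_structured(ts: int, body: str) -> str:
    """Convert one raw line into a JSON document with extracted tags."""
    tags = [word for word in body.split() if word.startswith("#")]
    return json.dumps({"ts": ts, "text": body, "tags": tags})


def map_phase(doc_json: str) -> Iterator[Tuple[str, int]]:
    """Map: emit (tag, 1) for every tag in a document."""
    for tag in json.loads(doc_json)["tags"]:
        yield tag.lower(), 1


def reduce_phase(pairs: Iterable[Tuple[str, int]]) -> Dict[str, int]:
    """Reduce: sum the counts emitted for each tag."""
    totals: Dict[str, int] = defaultdict(int)
    for key, count in pairs:
        totals[key] += count
    return dict(totals)


posts = [
    "1000 old post about #phones",
    "1900 loving my new #Phone and #camera",
    "1950 the #phone battery is great",
    "garbled line without timestamp",
]
docs = [to_semi_structured(ts, body) for ts, body in recent_only(posts, now=2000, window=200)]
tag_counts = reduce_phase(pair for doc in docs for pair in map_phase(doc))
```

In a deployment, the map and reduce functions would be submitted as a job to the big data storage structure, and a SQL front end would translate an aggregate query into such a job; neither step is shown here.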

The method and system thereafter stores the results of queries on unstructured data and semi-structured data into the MQT. The results can be intermediate results, final results, or both. The method and system also includes incrementally updating the MQT with the recent unstructured data from the big data storage structure. Further, aged unstructured data stored within the MQT is retired to the big data storage structure for occasional data serving. The method and system also allows the RDBMS to use the MQT for storing query results on structured data.
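The incremental update and the retirement of aged data can be sketched as follows. The sketch again assumes a plain SQLite table emulating the MQT, and a Python list named `archive` standing in for the big data storage structure. The table layout (day, tag, mentions) and the cutoff policy are hypothetical and are not taken from the disclosure.

```python
import sqlite3
from typing import List, Tuple

Row = Tuple[int, str, int]  # (day, tag, mentions), an illustrative layout

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE mqt_tag_counts ("
    "day INTEGER, tag TEXT, mentions INTEGER, PRIMARY KEY (day, tag))"
)


def refresh_mqt(conn: sqlite3.Connection, new_results: List[Row]) -> None:
    """Merge newly computed results into the emulated MQT without a full rebuild."""
    with conn:
        for day, tag, mentions in new_results:
            cur = conn.execute(
                "UPDATE mqt_tag_counts SET mentions = mentions + ? "
                "WHERE day = ? AND tag = ?",
                (mentions, day, tag),
            )
            if cur.rowcount == 0:  # no existing row, so insert a new one
                conn.execute(
                    "INSERT INTO mqt_tag_counts (day, tag, mentions) VALUES (?, ?, ?)",
                    (day, tag, mentions),
                )


def retire_aged(conn: sqlite3.Connection, cutoff_day: int, archive: List[Row]) -> int:
    """Move rows older than cutoff_day out of the emulated MQT into the archive."""
    with conn:
        aged = conn.execute(
            "SELECT day, tag, mentions FROM mqt_tag_counts "
            "WHERE day < ? ORDER BY day, tag",
            (cutoff_day,),
        ).fetchall()
        archive.extend(aged)
        conn.execute("DELETE FROM mqt_tag_counts WHERE day < ?", (cutoff_day,))
    return len(aged)


refresh_mqt(conn, [(1, "#phone", 3), (2, "#phone", 2)])
refresh_mqt(conn, [(2, "#phone", 4), (2, "#camera", 1)])  # later batch of recent data

archive: List[Row] = []  # stand-in for the big data storage structure
retired = retire_aged(conn, cutoff_day=2, archive=archive)
remaining = conn.execute(
    "SELECT day, tag, mentions FROM mqt_tag_counts ORDER BY tag"
).fetchall()
```

Keeping only recent rows in the table keeps it small for frequent queries, while the retired rows remain available from the archive for occasional serving.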

The method and system also includes using the MQT as a caching structure for future queries requiring processing of hybrid data. In an instance, a query requires processing of both structured data, such as transactional data, stored in the RD...