Browse Prior Art Database

Modeling of Aggregate Awareness and Cache Management in Cubes

IP.com Disclosure Number: IPCOM000243490D
Publication Date: 2015-Sep-24
Document File: 3 page(s) / 209K

Publishing Venue

The IP.com Prior Art Database

Abstract

Disclosed is a method of modeling Aggregate Awareness and Cache Management in Cubes to increase the overall performance of queries.

This text was extracted from a PDF file.
This is the abbreviated version, containing approximately 46% of the total text.


Existing Approaches

Most of the Business Intelligence (BI) product vendors today use the following types of Aggregate Awareness and Cache Management mechanisms:

In-database Aggregates, with related processes that make a database query aware that a database aggregate exists and should be used if it satisfies the needs of the query
In-memory Aggregates, with related processes that generate in-memory aggregates based on database cardinality and pre-collected usage patterns

Data Cache, with related processes based on warm-up queries and online usage
Result Set Cache, with related processes based on online usage
Expression Cache, with related processes based on online usage

A master process selects the single most appropriate mechanism from those listed above.
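The aggregate-awareness check described above can be sketched as follows. This is a minimal illustration, not taken from any particular BI product; the class, function, and column names are all hypothetical:

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Aggregate:
    name: str
    group_by: frozenset   # dimension columns the aggregate is grouped by
    measures: frozenset   # measures the aggregate stores

def choose_source(query_group_by, query_measures, aggregates, fact_table="fact_table"):
    """Return an aggregate that satisfies the query, else the detailed fact table."""
    candidates = [
        a for a in aggregates
        if set(query_group_by) <= a.group_by and set(query_measures) <= a.measures
    ]
    if not candidates:
        return fact_table  # no aggregate satisfies the query
    # Prefer the coarsest usable aggregate: fewer grouping columns means fewer rows.
    return min(candidates, key=lambda a: len(a.group_by)).name

aggs = [
    Aggregate("agg_day_product", frozenset({"day", "product"}), frozenset({"revenue"})),
    Aggregate("agg_month", frozenset({"month"}), frozenset({"revenue", "units"})),
]
print(choose_source({"month"}, {"revenue"}, aggs))      # agg_month
print(choose_source({"customer"}, {"revenue"}, aggs))   # fact_table
```

The selection prefers the smallest qualifying aggregate, and only falls back to the most detailed table when no aggregate covers both the query's grouping columns and its measures.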

Problems with Existing Approaches

While there are mature algorithms for picking the right mechanism, none uses the best mix of the different techniques or brings dynamism to the static aggregates, which are revised only periodically and manually. Some of the problems that arise from such an approach are listed below:

Poor utilization of In-database Aggregates: Because the aggregates are pre-defined, based on pre-run workload information and database component design strategies, they are not current enough to handle the latest workload. As a result, users face performance issues.

Poor utilization of In-memory Aggregates: Part of the problem is the same as with In-database Aggregates. In addition, these aggregates are rebuilt every time the Cube is refreshed; depending on their definition, the rebuild can take a good amount of time, resulting in poor performance while it runs.

Data Cache limitations: The data cache is rebuilt every time the Cube is refreshed, and it grows only through usage, either direct runs by end users or warm-up queries; in either case, the process takes time. Also, once the allocated memory is full, the cache stops growing, leaving the latest usage data out of its scope.

Result Set Cache limitations: Similar to the Data Cache limitations explained above.

Expression Cache limitations: Similar to the Data Cache limitations explained above.
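The cache-full limitation shared by the three caches above can be shown with a minimal sketch (all names hypothetical): once the allocated memory is exhausted, new entries are simply dropped, so the latest usage never enters the cache.

```python
class FixedBudgetCache:
    """A cache that stops admitting entries when its allocated capacity is full."""

    def __init__(self, capacity):
        self.capacity = capacity
        self.entries = {}

    def put(self, key, value):
        if len(self.entries) >= self.capacity and key not in self.entries:
            return False        # cache is full: the latest usage stays out of scope
        self.entries[key] = value
        return True

cache = FixedBudgetCache(capacity=2)
cache.put("q1", "rows1")
cache.put("q2", "rows2")
print(cache.put("q3", "rows3"))   # False: the newest query result is not cached
```

This is exactly the behavior the Dynamic storage areas in the summary below are meant to avoid, by evicting low-value entries instead of rejecting new ones.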

Brief Summary

1. Divide the In-memory Aggregate and the various Caches into Fixed and Dynamic storage areas. Target the Fixed In-memory Aggregate area for structure-based aggregations and the Dynamic area for aggregations based on the most recent query usage, with the highest hit rate receiving the highest priority.

2. Instead of a full In-memory and Cache refresh, perform only an incremental refresh.

3. Stitch together results when a query cannot be serviced by any single Aggregate or Cache, instead of hitting the most detailed database table.
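Step 1's Fixed/Dynamic split can be sketched as follows, assuming hypothetical names throughout: the Fixed area holds structure-based aggregates, while the bounded Dynamic area admits aggregates from recent queries and evicts the least-hit entry when full, so the entries with the highest hit rate keep the highest priority.

```python
class AggregateStore:
    def __init__(self, dynamic_capacity):
        self.fixed = {}                      # structure-based aggregates
        self.dynamic = {}                    # usage-based: key -> [hits, data]
        self.dynamic_capacity = dynamic_capacity

    def add_fixed(self, key, data):
        self.fixed[key] = data

    def add_dynamic(self, key, data):
        if len(self.dynamic) >= self.dynamic_capacity and key not in self.dynamic:
            # Evict the entry with the fewest hits; high hit rate = high priority.
            coldest = min(self.dynamic, key=lambda k: self.dynamic[k][0])
            del self.dynamic[coldest]
        self.dynamic[key] = [0, data]

    def lookup(self, key):
        if key in self.fixed:
            return self.fixed[key]
        if key in self.dynamic:
            self.dynamic[key][0] += 1        # record the hit for eviction priority
            return self.dynamic[key][1]
        return None

store = AggregateStore(dynamic_capacity=2)
store.add_fixed("agg_by_year", "structural rollup")
store.add_dynamic("a", "recent rollup A")
store.add_dynamic("b", "recent rollup B")
store.lookup("a")                  # "a" gains a hit, raising its priority
store.add_dynamic("c", "recent rollup C")   # evicts "b", the least-hit entry
```

Unlike a fixed-budget cache that rejects new entries when full, the Dynamic area always admits the latest usage, trading away only its coldest entry.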




Advantages

1. It will completely eliminate the need for warm-up queries.

2. It will reduce cache generation time.

3. It will also increase the hit...