Browse Prior Art Database

Optimization for query with heavy Aggregate Functions and Expressions

IP.com Disclosure Number: IPCOM000241251D
Publication Date: 2015-Apr-09
Document File: 7 page(s) / 235K

Publishing Venue

The IP.com Prior Art Database

Abstract

There are many requirements to show top N grids or top N reports in database. It is a burden to calculate so much useless aggregate functions and expressions when there are too many functions in the select list. In current database implementation, during generating access plan, database will calculate all columns listed in “SELECT” clause. After "group by" and "order by", then return top N rows. We would like to introduce a new method that we could postpone the aggregate functions and expressions calculations to save CPU cost when optimizer assume the aggregate functions and expressions CPU cost are heavy. This method would access the sorted column instead of calculating all the aggregate functions and expressions in advance. After knowing record which should be accessed, database would access base table again with less record accessing.

This text was extracted from a PDF file.
This is the abbreviated version, containing approximately 29% of the total text.

Page 01 of 7

Optimization for query with heavy Aggregate Functions and Expressions

There are many requirements to show top N grids or top N reports in database. Such as:

1. In database monitoring tool, tell me top slowest 10 SQLs including their corresponding metrics: elapsed time, rows read, number of executions, CPU time,I/O time, lock Wait time, sorts, etc.

2. In e-commerce website, tell me the most popular 50 merchants and their corresponding sales data: stocks, sold numbers, amount, like votes, unlike votes, category, etc.

The sample query behind the scenario likes below:
SELECT key, sum(c1) as sum_c1, max(c2), avg(c3), sum(c4)+sum(c5), ... FROM table1
GROUP BY key
ORDER BY sum_c1
FETCH FIRST 10 ROWS ONLY

It is a burden to calculate so much useless aggregate functions and expressions.

In current database implementation, during generating access plan, database will calculate all columns listed in "SELECT" clause. After "group by" and "order by", then return top N rows.

We would like to introduce a new method that we could postpone the aggregate functions and expressionscalculations to save CPU cost when optimizer assume the aggregate functions and expressions CPU cost are heavy. This method would access the sorted column instead of calculating all the aggregate functions and expressions in advance. After knowing record which should be accessed, database would access base table again with less record accessing.

Advantages of Our Invention By the invention, we gain:


Less CPU consumption


Cheaper access path

Better performance

Novelty & Claims

For query with heavy aggregate functions and expressions, new access path generated following two steps fetching.


Optimizer detects the typical pattern query with heavy aggregate functions and expressions.


Optimizer would decide whether to apply this new method depending on the estimated cost.

Apply this new method to save CPU cost during runtime.

This invention provides a method to optimize query with heavy Aggregate Functions and Expressions. The query could qualify the following condition, and optimizer would decide whether to apply this new method.

1. The query should have heavy Aggregate Functions and Expressions with lots of CPU cost

1



Page 02 of 7


2. The query would have ORDER BY and FETCH FIRST N ROWS.

Once the query is identified during receiving, a new optimized access path will be generated and compared with the original access path. Database engine can compare the cost then choose cheaper access path for execution.

Optimizer would consider the following access path as candidate:

1. two steps fetching
a. fetching just "order by" column for all rows to find top N keys
b. fetching all columns for just keys desired then return

2. one step fetching
a. Fetching all columns for all rows then return just top N rows

During Access Plan evaluation, optimizer choose the lower estimation cost access path for the process for Aggregate Functions and Expressions evaluation during runtime. Figure 1 is the complete...