Browse Prior Art Database

A method to automatically reuse Sparse Index

IP.com Disclosure Number: IPCOM000248875D
Publication Date: 2017-Jan-19
Document File: 7 page(s) / 182K

Publishing Venue

The IP.com Prior Art Database

Abstract

Our main idea is that database could automatically release the sparse index depending on the optimizer, when the cost is huge and the required resource is small. If there is other queries refer the same sparse index, database could reuse the sparse index instead of creating a new one. Optimizer would decide whether caching sparse index depending on the build sparse index cost, sparse set size and life cycle. It is not hard for optimizer to estimate the sparse index cost and result set size, the following example shows the access path and cost for the query, and it also contains the cost for sparse index building and sparse size information.

This text was extracted from a PDF file.
This is the abbreviated version, containing approximately 51% of the total text.

1

A method to automatically reuse Sparse Index

Sparse index is widely used by database, especially when the inner has no efficient index on the join columns.

For example,

SELECT MIN ( Z01 . FK_EN_PR_DEPOTID_N )

FROM WP01.EN_DEPOT_UMSATZ Z01 ,

WP01.EN_UMSATZ_ABRECHNU H03

WHERE Z01 . FK_EN_PR_DEPOTID_N IS NOT NULL

AND H03 . FK_EN_DEPOT_UMSID = Z01.ID_DEPOT_UMSATZ

AND H03 . AT_STATUS_ABRECHNU = 1

For the query above, database may choose to create a sparse index on H03, store results of the WHERE selection in the sparse index, then efficiently access to the inner table

H03 with the sparse index. But if there is another query with H03 which is needed the same sparse index, the database would create a new sparse index for H03

again, even though the sparse index created for the first query can be reused.

Our main idea is that database could automatically release the sparse index depending on the optimizer, when the cost is huge and the required resource is small. If there is

other query refer the same sparse index, database could reuse the sparse index instead of creating a new one.

Optimizer would decide whether caching sparse index depending on the build sparse index cost, sparse set size and life cycle. It is not hard for optimizer to estimate the sparse

index cost and result set size, the following example shows the access path and cost for the query, and it also contains the cost for sparse index building and sparse size

information.

SELECT MIN ( Z01 . FK_EN_PR_DEPOTID_N )

FROM WP01.EN_DEPOT_UMSATZ Z01 ,

WP01.EN_UMSATZ_ABRECHNU H03

WHERE Z01 . FK_EN_PR_DEPOTID_N IS NOT NULL

AND H03 . FK_EN_DEPOT_UMSID = Z01.ID_DEPOT_UMSATZ

AND H03 . AT_STATUS_ABRECHNU = 1

There are two tables join, the inner table uses sparse index access.

2

The following is the cost which records the cost for sparse index.

From the cost table, it is easy to get the sparse cost is 20884.895+10626=31510.895, there is one column in the sparse index and the column length is 4. The result size of

sparse index is 1508307.52 (4* 377076.88) (Record Size * Record number). So that benefit ratio for each storage usage is R = sparse cost/record_size =

31510.895/1508307.52= 0.021. The higher the ratio is, the benefit is more.

numberrecordsizerecord

tscantsort

sizeindexsparse

tindexsparse ratiobenefit

__

cos_cos_

__

cos__ _

× +==

We also define the life cycle extinction function. As our design, with the time flying, there is less use sparse index value. We suppose that there is 10 life cycle queue and the

following is the extinction curve. Customer could define the life cycle interval and there are 10 life cycle queues by default. As the time flies, the sparse index in the last queue

become invalid and are released. If the sparse index is referred, database would move the sparse index to the newest queue. For each life cycle queue, we would assign a

3

factor.

If the storage for sparse index cache is big enough, the sparse index set would be cached immediately. The sparse index would compete...