Browse Prior Art Database

Provisioning Deployment of Materialized Views and Indices for Data Warehouse Operation Schedule

IP.com Disclosure Number: IPCOM000176163D
Original Publication Date: 2008-Nov-06
Included in the Prior Art Database: 2008-Nov-06

Publishing Venue

IBM

Abstract

A large enterprise has multiple business intelligence (BI) applications with workloads running at different times and frequency (e.g. some workloads run daily while some others run every weekend). Current physical design advisors for data warehouse operations consider a workload and recommend materialized views and indexes to improve the query process competition time of the workload. Such recommendation aims at local optimization (i.e. execution of the single workload) instead of global optimization (i.e. data warehouse operation performance in a month). In this paper, we propose a new approach to data warehouse physical design from a point of view of long term operation. Our approach looks at workloads and their execution schedule as a whole and recommends both MQTs/indexes as well as their provisioning schedule (when to activate, refresh, and drop). We have implemented our proposed scheduling algorithm and validated using TPC-H workload.

This text was extracted from a PDF file.
This is the abbreviated version, containing approximately 9% of the total text.

Page 1 of 11

Provisioning Deployment of Materialized Views and Indices for Data Warehouse Operation Schedule

1. Background:


A large enterprise has multiple business intelligence (BI) applications with workloads running at different times and frequency (e.g. some workloads run daily while some others run every weekend). Current physical design advisors for data warehouse operations consider a workload and recommend materialized views and indexes to improve the query process competition time of the workload. Such recommendation aims at local optimization (i.e. execution of the single workload) instead of global optimization (i.e. data warehouse operation performance in a month).

2. Summary of Invention:


In this disclosure, we propose a new approach to data warehouse physical design from a point of view of long term operation. Our approach looks at workloads and their execution schedule as a whole and recommends both MQTs/indexes as well as their provisioning schedule (when to activate, refresh, and drop).

1

Page 2 of 11

Provisioning Deployment of Materialized Views and Indices for Data Warehouse Operation Schedule

Dengfeng Gao ¤ † Wen-Syan Li Daniel C. Zilio §

    IBM Silicon Valley Laboratories 555 Baily Road, San Jose, CA 95120 USA

    IBM Almaden Research Center
650 Harry Road, San Jose, CA 95120 USA

                § IBM Canada Ltd.
8200 Warden Ave Markham, Ontario, Canada L6G 1C7

Abstract

  A large enterprise has multiple business intelligence (BI) applications with workloads running at different times and frequency (e.g. some workloads run daily while some others run every weekend). Current physical design advisors for data warehouse operations consider a workload and recom- mend materialized views and indexes to improve the query process competition time of the workload. Such recom- mendation aims at local optimization (i.e. execution of the single workload) instead of global optimization (i.e. data warehouse operation performance in a month). In this pa- per, we propose a new approach to data warehouse physical design from a point of view of long term operation. Our ap- proach looks at workloads and their execution schedule as a whole and recommends both MQTs/indexes as well as their provisioning schedule (when to activate, refresh, and drop). We have implemented our proposed scheduling algorithm and validated using TPC-H workload.

status. In a typical enterprise data warehouse system, the transaction information is consolidated at the headquarter using ETL (Extract, Transform, and Load) tools in the early evening. After all the transactional data has been processed into a form, the transactional data can be loaded into the base tables of the data warehouse. The data in the staging area can be viewed as the delta of the base tables. The delta of the base tables in the staging tables is propagated to the base tables. The indexes of the base tables are updated ac- cordingly as well. Then the materialized view...