Redistributing Data In Shared Nothing Partitioned Database Environments Via Restore Mechanism

IP.com Disclosure Number: IPCOM000132342D
Original Publication Date: 2005-Dec-08
Included in the Prior Art Database: 2005-Dec-08
Document File: 2 page(s) / 56K

Publishing Venue

IBM

Abstract

This article discloses how redistributing data in shared nothing partitioned database environments can be achieved using the existing backup image. Two different methods of utilizing a backup image to redistribute data are described, a row based method and a page based method.

This is the abbreviated version, containing approximately 52% of the total text.

THIS COPY WAS MADE FROM AN INTERNAL IBM DOCUMENT AND NOT FROM THE PUBLISHED BOOK

CA820050259 Peter K Wang/Markham/IBM Miso Cilimdzic, Ronen Grosman

Disclosed are two methods of using a database backup image to redistribute data in shared-nothing partitioned database environments. When the number of physical nodes changes, it may be necessary to redistribute the existing data across the new configuration. Current solutions require either double the storage of the base tables or extra log space to perform the redistribution. The two methods described in this disclosure use the existing backup images for a given database and, through a restore-like operation, redistribute the data set across the new physical node layout. The advantage of these methods over current solutions is the saving in time and space, which comes from avoiding data offloading (exporting all the data from the tables), additional table creation, and extra logging overhead by leveraging the storage already allocated for backups and the data contained in them.

Row based redistribution is referred to as method 1 and page based redistribution as method 2.

Both methods require an up-to-date database backup image, require that no transactions be executed against the database from the time of the backup until the redistribution is completed, and require re-creation of any indexes that existed prior to the redistribution.

Method 1

The row-based method extracts data at row granularity from the backup images, uses each row's hash key to calculate the row's new placement, and stores the row at that location, which effectively redistributes the data across any physical node configuration.

It is assumed that data is distributed across the new physical node configuration by applying a hash function to the hash key values, which determines the node on which each row is placed. This method applies whether or not the partitioning key is unique.
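The placement assumption above can be illustrated with a minimal sketch. The disclosure does not name a hash function, so the use of CRC32, the function name `target_node`, and the representation of a key as a tuple of column values are all assumptions made for illustration only.

```python
import zlib


def target_node(partition_key: tuple, num_nodes: int) -> int:
    """Map a partitioning-key value to a node number in [0, num_nodes).

    Hypothetical helper: CRC32 stands in for whatever hash function the
    database actually uses for partitioning.
    """
    if num_nodes <= 0:
        raise ValueError("num_nodes must be positive")
    # Assumption: repr() is a stable encoding for simple key tuples.
    encoded = repr(partition_key).encode("utf-8")
    return zlib.crc32(encoded) % num_nodes


# Two rows that share a (non-unique) key value hash to the same node,
# so uniqueness of the partitioning key is not required.
first = target_node(("customer-17",), 4)
second = target_node(("customer-17",), 4)
same_node = first == second
```

Because placement depends only on the key value and the node count, the same function can be applied to any new node configuration by changing `num_nodes`.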

The process is the following:

- Retrieve a table object page from the backup image.
- For every row on the page, apply the hashing algorithm to decide the correct location for the row and place it onto the correct physical node.

- Method 1 is complete at this point and all of the rows are redistributed to reflect the new physical layout.
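The steps of method 1 can be sketched roughly as follows. This is an illustration under stated assumptions, not the disclosed implementation: pages are modeled as in-memory lists of row tuples rather than being read from a real backup image, nodes are modeled as lists in a dictionary, and `redistribute_rows`, `crc_placement`, and the CRC32-based hash are hypothetical names and choices.

```python
import zlib
from collections import defaultdict
from typing import Callable, Dict, Iterable, List, Sequence, Tuple

Row = Tuple[object, ...]   # hypothetical row: a tuple of column values
Page = List[Row]           # hypothetical page: a list of rows


def crc_placement(key: Row, num_nodes: int) -> int:
    """Assumed hash: CRC32 of the key's repr, modulo the node count."""
    return zlib.crc32(repr(key).encode("utf-8")) % num_nodes


def redistribute_rows(
    pages: Iterable[Page],
    key_columns: Sequence[int],
    num_nodes: int,
    place: Callable[[Row, int], int] = crc_placement,
) -> Dict[int, List[Row]]:
    """Walk every page, hash each row's key, and append the row to its node."""
    nodes: Dict[int, List[Row]] = defaultdict(list)
    for page in pages:                      # retrieve a table object page
        for row in page:                    # for every row on the page
            key = tuple(row[i] for i in key_columns)
            nodes[place(key, num_nodes)].append(row)
    return dict(nodes)


# Usage: two pages from a pretend backup image, redistributed over 3 nodes.
backup_pages = [[(1, "a"), (2, "b")], [(3, "c"), (1, "d")]]
layout = redistribute_rows(backup_pages, key_columns=[0], num_nodes=3)
```

Each row is touched once and written directly to its destination, which mirrors the point of the disclosure that no intermediate export or duplicate table is needed.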

Method 2

Page based method extracts data at a page granularity from the backup...