DB2 Replication of Random Samples

IP.com Disclosure Number: IPCOM000021776D
Original Publication Date: 2004-Feb-06
Included in the Prior Art Database: 2004-Feb-06
Document File: 2 page(s) / 31K

Publishing Venue

IBM

This text was extracted from a PDF file.
This is the abbreviated version, containing approximately 55% of the total text.

Disclosed is a DB2 query that extracts random samples for replication and analysis. The query specifies the approximate sample size. Potential customers for this query are executives, analysts and programmers.
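To make "approximate sample size" concrete, here is a minimal Python sketch of Bernoulli sampling, in which each row is kept independently with probability target/total. This is an assumption chosen for illustration: the text available here does not say how the disclosed DB2 query selects rows, and the function name, parameters, and the table size of 1,000,000 are hypothetical.

```python
import random


def bernoulli_sample(rows, target_size, total_rows, seed=None):
    """Keep each row independently with probability target_size / total_rows.

    The result has about target_size rows, not exactly that many,
    because every row is accepted or rejected on its own.
    """
    rng = random.Random(seed)
    # Cap at 1.0 so a target larger than the table simply keeps every row.
    keep_probability = min(1.0, target_size / total_rows)
    return [row for row in rows if rng.random() < keep_probability]


# Hypothetical table of 1,000,000 row ids; aim for roughly 10,000 of them.
TOTAL_ROWS = 1_000_000
sample = bernoulli_sample(range(TOTAL_ROWS), 10_000, TOTAL_ROWS, seed=1)
print(len(sample))  # close to 10,000, varying with the seed
```

Because rows are accepted independently, the sample size varies from run to run. For these numbers the standard deviation of the size is about 100 rows, which is why only an approximate size can be promised.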

Beyond data entry and retrieval, analysts often pose broad queries to understand and describe the population that a database represents. If the database rows describe trees, the entire database describes a forest. Interactive exploration of a large forest proceeds best on a PC or workstation that holds a replicated random sample of the forest. Average height, average age, geographical center, percentages by kind, mortality by cause, and similar metrics can be estimated accurately from a random sample of manageable size.
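The claim that a manageable sample estimates such metrics well can be checked with a small Python sketch. The "forest" below is synthetic, made-up data (heights drawn from a normal distribution, three hypothetical kinds of tree), and the helper names are illustrative only. It uses a fixed-size simple random sample and reports each estimate with its standard error.

```python
import math
import random
import statistics


def estimate_mean(values):
    """Return (sample mean, standard error of the mean)."""
    mean = statistics.fmean(values)
    std_err = statistics.stdev(values) / math.sqrt(len(values))
    return mean, std_err


def estimate_share(labels, label):
    """Return (sample proportion of `label`, its standard error)."""
    n = len(labels)
    share = sum(1 for x in labels if x == label) / n
    return share, math.sqrt(share * (1 - share) / n)


rng = random.Random(42)

# Synthetic forest: (height in metres, kind). All values are invented.
forest = [
    (rng.gauss(20.0, 5.0), rng.choice(["oak", "pine", "birch"]))
    for _ in range(200_000)
]

# Simple random sample of 10,000 trees, drawn without replacement.
sample = rng.sample(forest, 10_000)

true_mean = statistics.fmean([h for h, _ in forest])
est_mean, mean_se = estimate_mean([h for h, _ in sample])

true_share = sum(1 for _, k in forest if k == "oak") / len(forest)
est_share, share_se = estimate_share([k for _, k in sample], "oak")

print(f"mean height: true {true_mean:.3f}, estimate {est_mean:.3f} +/- {mean_se:.3f}")
print(f"share of oak: true {true_share:.4f}, estimate {est_share:.4f} +/- {share_se:.4f}")
```

The standard error shrinks with the square root of the sample size and does not depend on how large the full table is, which is why a sample of about 10,000 rows can serve a table of any size.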

When developing applications that run against a large database, an application programmer may wish to debug and tune the code rapidly and thoroughly outside the production environment. A random sample replicated to a PC or workstation suffices to exercise and debug the major code paths of the application.

The present DB2 Replication product lacks explicit support for replication of random samples. However, such support is implicit in the replication of views, as documented in "IBM DB2 Universal Database Replication Guide and Reference."

The following query -- valid on DB2 S/390, UDB, and AS/400 -- extracts a random sample of size 10,000 (approx.) from a large table of arbitrary size (exceeding 10,000 of cou...