
Page level sampling when total number of pages is unknown
Disclosure Number: IPCOM000015623D
Original Publication Date: 2002-Mar-05
Included in the Prior Art Database: 2003-Jun-20

An algorithm is disclosed that allows page-level sampling of a table when the total number of pages in the table is unknown.

When the total number of pages is not known beforehand, it is difficult to determine which pages should be part of the sample and which should not. This can lead to poor sampling rates, where the actual percentage of pages sampled does not reflect the desired percentage. With the method disclosed here, the actual percentage of pages sampled can be kept close (if not equal) to the desired sampling rate even though the total number of pages is unknown.

This method of page-level sampling relies on two counters being maintained: one for the number of pages included in the sample (call it sampledPages), and one for the total number of pages encountered thus far (call it totalPages). The first page is always part of the sample. At the second page, and at every subsequent page, the following test determines whether the page is included in the sample:

    if (sampledPages / totalPages) < desired sampling rate then include this page in the sample
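
The two-counter test described above can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the disclosed implementation: the function name sample_pages is hypothetical, the desired sampling rate is assumed to be given as a fraction between 0 and 1 rather than as a percent, the comparison is assumed to be a strict less-than on the ratio sampledPages / totalPages, and both counters are assumed to be read before they are updated for the current page.

```python
from typing import Iterable, Iterator, TypeVar

T = TypeVar("T")


def sample_pages(pages: Iterable[T], desired_rate: float) -> Iterator[T]:
    """Yield a sample of `pages` without knowing how many pages there are.

    Hypothetical helper: `desired_rate` is assumed to be a fraction
    (for example 0.25 for a 25 percent sample).
    """
    sampled_pages = 0  # pages included in the sample so far
    total_pages = 0    # pages encountered so far, before the current one

    for page in pages:
        # The first page is always sampled. Every later page is sampled
        # only while the running ratio is below the desired rate
        # (strict less-than is an assumption of this sketch).
        include = total_pages == 0 or sampled_pages / total_pages < desired_rate

        # Update both counters before handing the page to the caller.
        total_pages += 1
        if include:
            sampled_pages += 1
            yield page


if __name__ == "__main__":
    # Page identifiers stand in for real table pages.
    picked = list(sample_pages(range(100), 0.25))
    print(len(picked), picked[:4])
```

Because the decision for each page uses only the two running counters, the input can be any stream of pages, and the sampled fraction is pulled back toward the desired rate whenever it falls below it. Under the assumptions above, a rate of 0.25 over 100 pages keeps 25 of them.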