Browse Prior Art Database

Extendible Hashing for Line-Oriented Paging Stores

IP.com Disclosure Number: IPCOM000042302D
Original Publication Date: 1984-May-01
Included in the Prior Art Database: 2005-Feb-03
Document File: 2 page(s) / 15K

Publishing Venue

IBM

Related People

Bryant, RM: AUTHOR

Abstract

The extendible hashing algorithm for hash tables stored in page-formatted main memory allows most hash-table probes to be resolved using only two memory references, each memory reference bringing a multibyte line into the processor cache. The following assumptions are made. Pages have a size of 4096 bytes and are divided into 128-byte lines. Hash table keys are 8-bytes long and 5 bytes of data are associated with each key. Each hash table entry occupies 16 bytes. The 8-byte key is hashed down to a 3-byte (24-bit) hash table address (HTA). The bottom byte of the HTA gives the entry-in-page (EIP). The top n bits of the HTA give the directory index (DI). Each directory entry occupies 8-bytes so that an initial value of n=9 corresponds to one page of directory.

This text was extracted from a PDF file.
This is the abbreviated version, containing approximately 51% of the total text.

Page 1 of 2

Extendible Hashing for Line-Oriented Paging Stores

The extendible hashing algorithm for hash tables stored in page-formatted main memory allows most hash-table probes to be resolved using only two memory references, each memory reference bringing a multibyte line into the processor cache. The following assumptions are made. Pages have a size of 4096 bytes and are divided into 128-byte lines. Hash table keys are 8-bytes long and 5 bytes of data are associated with each key. Each hash table entry occupies 16 bytes. The 8-byte key is hashed down to a 3-byte (24-bit) hash table address (HTA). The bottom byte of the HTA gives the entry-in-page (EIP). The top n bits of the HTA give the directory index (DI). Each directory entry occupies 8-bytes so that an initial value of n=9 corresponds to one page of directory. Each directory entry has the format: 1 byte - count 3 bits - L2Depth 21 bits - page number (PN) 4 bytes - free space map The count field contains the number of hash table entries in the page, the L2Depth is the logarithm base-2 of the number of index entries that point at the particular page ("depth" of the page), the PN is the address of the page, and the free space map is one allocation bit per line of the page. A 0-bit indicates that there is space available in that line, a 1-bit indicates that the corresponding line is full. Adjacent directory entries that are buddy system "buddies" can point at the same hash table page. The count and free space fields of the directory entry are used only in the first directory entry that points at a particular page. The first entry can always be found from the address of the present entry and the L2Depth value stored there. The maximum allowed L2Depth is 4 (depth=16). Then all directory entries that point at a particular page always reside in the same line of the directory, and only one line reference is required to fetch the PN, count, depth, and space map for a hash table page. (Note: If a page with depth of one is split (L2Depth=0), then the restriction that the maximum allowed depth is 16 may cause other pages to be split, as detailed in the following). The EIP portion of the HTA identifies one of 256 hash table entries per page. Each hash table page is conceptually organized as a "bucket and chain" hash table. The EIP is used to index into a list of pointers each of which points at a linked list of synonyms. However, the standard "bucket and chain" hash table organization would require at least two line references: one to access the pointer and one to fetch the head of the synonym chain. To eliminate one of these references, one pointer is stored with each hash table entry. Each hash table entry has the format:

(Image Omitted)

where the fields have the following meanings: IND - Indirection byte. Points to head of synonym chain. HTL - Hash table link. Points to next entry in synonym chain. STATE - bit 0=0 if this entry is free. bit 0=1 if this entry is allocated. bit 1=0 if IND byte is...