Browse Prior Art Database

OVERFLOW HAMbLING IN HAS&&NG1ABLES: A HYBRID APPROACH

IP.com Disclosure Number: IPCOM000128153D
Original Publication Date: 1979-Dec-31
Included in the Prior Art Database: 2005-Sep-15
Document File: 13 page(s) / 40K

Publishing Venue

Software Patent Institute

Related People

Peter Scheuermann: AUTHOR [+3]

Abstract

The hybrid method of handling overflows in hashing tables, which incapsulates both open addressing and chairning, is presented. A simulation model which accounts for the effect of the loading order is developed in order to evaluate the average number of accesses and the average number of overflows under the hybrid method. Furthermore, two cost models are considered to compare the performance of the hybrid method with open addressing and chaining for hashing tables kept in main core and on secondary storage devices.

This text was extracted from a PDF file.
This is the abbreviated version, containing approximately 11% of the total text.

Page 1 of 13

THIS DOCUMENT IS AN APPROXIMATE REPRESENTATION OF THE ORIGINAL.

OVERFLOW HAMbLING IN HAS&&NG1ABLES: A HYBRID APPROACH*

Peter Scheuermann Electrical Engineering and Computer Science Department Northwestern University Evanston, illinots 60201

Abstract

The hybrid method of handling overflows in hashing tables, which incapsulates both open addressing and chairning, is presented. A simulation model which accounts for the effect of the loading order is developed in order to evaluate the average number of accesses and the average number of overflows under the hybrid method. Furthermore, two cost models are considered to compare the performance of the hybrid method with open addressing and chaining for hashing tables kept in main core and on secondary storage devices.

* This work is sponsored in part by NSF Grant MCS77-03904. Will appear in Information Systems, 1979.

Introduction

The problem of quickly locating a particular record in a file, or all the records satisfying a given condition is crucial for on-line systems. Hashing functions, or key-to-address transforms, provide an efficient tool to be used either for primary key addressing or as the underlying access mechanism in the construction of indices for secondary key addressing. While a number of other methods, such as B-trees, have comparable retrieval ef-ficiencies, they are often restricted either in their applicability or in their dependence on the availability of auxiliary system resources to support insertion/deletion operations C2].

The problem of choosing a suitable hashing function for general classes of keys was settled in literature after the experimental study reported in C41 and the theoretical proof given in C11 concluded that the division tech-nique outperforms all other methods. In addition to the selection of the transform, which will not concern us here,the other decision variables which affect the performance of a hashing scheme are the method of handling over-flows and bucket size and loading factor. A number of analytic models have been developed [6,7,9,10] to compare the performance of different overflow handling techniques (open addressing, rehashing, chaining) in terms of the average number of accesses required to retrieve a required record or the number of overflow records. More recently, some of these results have been incorporated by Severance and Duhne C81 into a mathematical model which explicitly considers secondary storage device characteristics and time/space costs in order to account realistically for all the design variables men-tioned above. In this paper we consider the hybrid m ethod of handling overflows proposed originally by Van der Pool CIO] - Overflow records are stored in successive buckets, as in open- addressing, up to a selected distance away from the home buckets. From that point on, additional overflow records are handled as in the separate chaining method. We show first that the loading order has an effect on the number of additional...