Browse Prior Art Database

Compressed Tries

IP.com Disclosure Number: IPCOM000128511D
Original Publication Date: 1974-Dec-31
Included in the Prior Art Database: 2005-Sep-16
Document File: 9 page(s) / 29K

Publishing Venue

Software Patent Institute

Related People

Kurt Maly: AUTHOR [+3]

Abstract

In this paper we present a new data structure, called a compressed trie or C-trie,to be used in information retrieval systems. It has the same underlying m-ary tree structure as a tries but, whereas the fields of the nodes 3n a trio have to be large enough to hold a key or at least a.pointer, the fields in a C-trie are only one bit long. In the analysis part of the paper we shall show that for a collection of n keys the retrieval time of one key is of the order log m n and the storage requirement of the order n.(m + log2n) bits. We first analyze the C-trie as a data structure and then discuss several methods of its use for a data base. One method particularly is of interest in so far as we achieve the same storage requirement bound for a data base, keys plus addresses to records, as for the data structure above at the cost of increasing the retrieval time to some extent. Key words and phrases: data structure, data base, m-ary tree, trie, retrieval time, storage requirement, keys. CR categories: 3.70, 3.74, 3.75,

This text was extracted from a PDF file.
This is the abbreviated version, containing approximately 17% of the total text.

Page 1 of 9

THIS DOCUMENT IS AN APPROXIMATE REPRESENTATION OF THE ORIGINAL.

Compressed Tries

Kurt Maly

Computer, Information, and Control Sciences

Institute of Technology

University of Minnesota

Minneapolis, Minnesota 55455 Technical Report 74-12

June, 1974 Cover courtesy of Ruth and Jay Leavitt

Compressed Tries

by

Kurt Maly

Department of Computer arid Information Sciences University of Minnesota Minneapolis, Minnesota 55455

Abstract

In this paper we present a new data structure, called a compressed trie or C-trie,to be used in information retrieval systems. It has the same underlying m-ary tree structure as a tries but, whereas the fields of the nodes 3n a trio have to be large enough to hold a key or at least
a.pointer, the fields in a C-trie are only one bit long. In the analysis part of the paper we shall show that for a collection of n keys the retrieval time of one key is of the order log m n and the storage requirement of the order n.(m + log2n) bits. We first analyze the C-trie as a data structure and then discuss several methods of its use for a data base. One method particularly is of interest in so far as we achieve the same storage requirement bound for a data base, keys plus addresses to records, as for the data structure above at the cost of increasing the retrieval time to some extent.

Key words and phrases: data structure, data base, m-ary tree, trie, retrieval time, storage requirement, keys.

CR categories: 3.70, 3.74, 3.75,

1. INTRODUCTION

The problem of creating and maintaining a data base has been approached from many sides. On the one hand we have the ideal direct addressing scheme which has negligible retrieval time but requires unrealistic'a12y high amounts of storage; and on the other end of the spectrum we

University of Minnesota Page 1 Dec 31, 1974

Page 2 of 9

Compressed Tries

have searching schemes, such as binary search~which require a retrieval time proportional to log 2R and an amount of storage proportional to the number of keys, n . Schemes which lie between those two extremes all involve a trade-off between storage needs and retrieval time. For instance, in tries we add information about the structure of the keys thereby increasing the basic amount of storage necessary to store the n keys by a considerable amount, but reducing the retrieval time, measured in digit inspections, to logmn where m is a parameter of the trie. The schemes, C-tries, we propose has most of the advantages of tries but yew of the disadvantages. That is, we shall show

that retrieval time still is proportional to log ,,n and in general the t storage requirement will be even less than the storage required for the binary search method.

2. THE C-TRIE AS A DATA STRUCTURE

The basic idea of the trie-structure, see Fredkin [1], is to represent the keys by means of an m- cry tree. Each key x , viewed as a bit string, is partitional into substrings of equal length, such that these strings viewed as binary numbers have values between 0 and m .

(Equ...