Browse Prior Art Database

The Balanced Cube: A Concurrent Data Structure

IP.com Disclosure Number: IPCOM000127940D
Original Publication Date: 1985-Dec-31
Included in the Prior Art Database: 2005-Sep-14
Document File: 22 page(s) / 64K

Publishing Venue

Software Patent Institute

Related People

William J. Dally: AUTHOR [+4]

Abstract

The appearance of concurrent computers such as the Caltech Cosmic Cube (11 is creating a need for algorithms and data structures which can exploit concurrency. To date these machines have been employed primarily for numerical calculations where the structure of the data is closely matched to the communications structure of the machine. If these machines are to be applied effectively to non-numerical problems, new data structures and algorithms are required. Sequential computers spend a large fraction of their time manipulating ordered sets of data. For these operations to be performed efficiently on a concurrent computer a new data structure for ordered seta is required. Conventional ordered set data structures such as heaps, balanced trees and B-trees [2] have a single root node through which all operations must pass. This root bottleneck limits the potential concurrency of tree structures making them unable to take advantage of the power of concurrent computers. Their maximum throughput is O(1). This paper proposes a new data structure for implementing ordered sets, the balanced cube, which offers significantly improved concurrency. The balanced cube eliminates the root bottleneck allowing it to achieve a throughput of O( off). Concurreacy in the balanced cube is achieved through uniformity. With the exception of the balancing algorithm, all nodes are equals. An operation may originate at any node and need not pave through a root bottleneck as in a tree structure. In keeping with the spirit of a homogeneous machine, the balanced cube is a homogeneous data structure. The balanced cube's topology is well matched to binary a-cube multiprocessors. .The bal-anced cube maps members of an ordered set to aubcubes of a binary n-cube. A Gray code mapping is used to preserve the linear adjacency of the ordered set in the Hamming distance adjacency of the cube. The balanced cube algorithms are based on a message passing model of concurrent com-putation. Each processing node includes a processor and a memory. Nodes must pass messages over an interconnection network to access the memory in other nodes. We assume that the communication time required by this message passing dominates the processing time, which we may safely ignore. Thin communication cost model accurately reflects the cost/performance behavior of massively concurrent machines constructed from VLSI pro-ceasing nodes. The Caltech Cosmic Cube is a scale model of such a machine. In this paper we concentrate on developing algorithms which make efficient use of the available communication resources. In sharp contrast, most existing concurrent algorithms have been developed assuming an ideal shared memory multiprocessor. In the shared memory model, communications coat is ignored. Processes can access any memory location with unit cost and an unlimited number of processes can access a single memory location simultaneously. Performance of algorithms analyzed using the shared memory model does not accurately re$ect their performance on large scale multiprocessors. Previous work on concurrent data structures has concentrated on reducing the interference between concurrent processes accessing a common data base but has not addressed the limited concurrency of existing data structures. Kung and Lehman [3] have developed 1 concurrent algorithms for manipulating binary search trees. Lehman and Yao [4J have extended these concepts and applied them to B-trees. Algorithms for concurrent search and insertion of data in AVL-trees [5J and 2-3 trees (6) have been developed by Ellis. These papers introduce a number of useful concepts that minimize locking of records, post-pone operations to be performed, and use marking mechanisms to modify the data structure. However, these papers consider the processes and the data to be stationary, and thus do not address the problems of moving processes and data between the nodes of a concurrent computer. The cost of communications, which we assume to dominate processing costs, has largely been ignored. The remainder of this paper describes the balanced cube and how it addresses the issues of correctness, concurrency, and throughput. In the next section the data structure is presented and the consistency conditions are described. This section also discusses the assumptions which are made is analyzing the balanced cube, and the environment in which the data structure and its algorithms are expected to reside. The V W search algorithm is described in Section 3. VW search uses the distance properties of the Gray code to search the balanced cube for a data record in O(log N) time while locking only a single node at a time. An insert algorithm is presented is Section 4. Insertion is performed by recursively splitting subcubea of the balanced cube. Section 5 discusses the delete operation. Deletion is accomplished by simply marking a record as deleted. A background garbage collection process reclaims deleted subcubes: The insertion and deletion algorithms tend to unbalance the cube. A balancing algorithm, presented in Section 6, acts to restore balance. Each of the algorithms presented in this paper is analyzed in terms of complexity, concurrency, and correctness. Section 7 extends the balanced cube concept to B-cubes which store several records in each node. Finally, Section 8 discusses the results of experiments run to verify the balanced cube algorithms.

This text was extracted from a PDF file.
This is the abbreviated version, containing approximately 6% of the total text.

Page 1 of 22

THIS DOCUMENT IS AN APPROXIMATE REPRESENTATION OF THE ORIGINAL.

The Balanced Cube: A Concurrent Data Structure

William J. Dally Charles L. Seitz

5174:TR:55 June 20. 1985 (note: revised from February 1985 version) Abstract This paper describes the balanced cube, a new data structure for implementing ordered sets. Conventional data structures such as heaps, balanced trees and B-trees have root bottlenecks which limit their potential concurrency ;sad make them unable to take advantage of the computing potential of concurrent machines. The balanced cube achieves greater concurrency by eliminating the root bottleneck; an operation in the balanced cube can be initiated from any node. The throughput of the balanced cube on a concurrent computer is O( off) compared with O(1) for a conventional data structure. Operations on the balanced cube are shown to be deadlock free and consistent with a sequential execution ordered by completion time.

1 Introduction

The appearance of concurrent computers such as the Caltech Cosmic Cube (11 is creating a need for algorithms and data structures which can exploit concurrency. To date these machines have been employed primarily for numerical calculations where the structure of the data is closely matched to the communications structure of the machine. If these machines are to be applied effectively to non-numerical problems, new data structures and algorithms are required. Sequential computers spend a large fraction of their time manipulating ordered sets of data. For these operations to be performed efficiently on a concurrent computer a new data structure for ordered seta is required. Conventional ordered set data structures such as heaps, balanced trees and B-trees [2] have a single root node through which all operations must pass. This root bottleneck limits the potential concurrency of tree structures making them unable to take advantage of the power of concurrent computers. Their maximum throughput is O(1). This paper proposes a new data structure for implementing ordered sets, the balanced cube, which offers significantly improved concurrency. The balanced cube eliminates the root bottleneck allowing it to achieve a throughput of O( off). Concurreacy in the balanced cube is achieved through uniformity. With the exception of the balancing algorithm, all nodes are equals. An operation may originate at any node and need not pave through a root bottleneck as in a tree structure. In keeping with the spirit of a homogeneous machine, the balanced cube is a homogeneous data structure. The balanced cube's topology is well matched to binary a-cube multiprocessors. .The bal-anced cube maps members of an ordered set to aubcubes of a binary n-cube. A Gray code mapping is used to preserve the linear adjacency of the ordered set in the Hamming distance adjacency of the cube. The balanced cube algorithms are based on a message passing model of concurrent com-putation. Each processing node includes a processor an...