Browse Prior Art Database

Synchronizing Distant Cooperating Processes

IP.com Disclosure Number: IPCOM000128612D
Original Publication Date: 1977-Dec-31
Included in the Prior Art Database: 2005-Sep-16
Document File: 16 page(s) / 55K

Publishing Venue

Software Patent Institute

Related People

Jerome A. Feldman: AUTHOR [+3]

Abstract

Distributed Computing, the execution of a single computation by a network of computers, is currently a very hot topic. As is usual in computing, applications are proceeding in advance of any systematic study of the subject: The technical and economic reasons for the great increase in distributed computing seem likely to remain in force for some time. In addition to its practical significance, distributed computing is a subject of inherent intellectual interest because every complex system, from a micro-organism to a society, does its computations with a number of fairly independent sub-systems whose communications can be viewed as messages. We have previously worked on a message-based model for system programs [Ball et al '76J and on a high-level programming language (PLITS) for distributed computing [Feldman '76J. This paper attempts to lay out an abstract framework for the analysis and synthesis of message-based distributed computation. The direct motivation for the current effort came from a graduate seminar that was working on the refinement and applications of PLITS. The students had very little trouble writing message-based PLITS programs but had great difficulty rovin them correct, even though they could prove ordinary programs. One difficulty was the absence of an agreed upon formalism for demonstrating the correctness of such programs. An initial attempt at such a formalism is one of the three goals of this paper. A second goal is to extend the recent work on synchronization of modules which share memory to those which communicate only by messages. Our third goal is to develop programming and proof techniques which will help in the production of demonstrably correct solutions to real distributed computing problems. >The choice of a model is often the crucial step in formalization of a domain. Distributed computing presents particular modeling dif-ficulties because an adequate treatment of it requires consideration of several poorly understood concepts including time-dependency, parallelism and unreliability. There also is much less accumulated intuition than is available for monoprocessing.~ The particular model chosen here assumes-that reliable transmission (including preserving sequence) and the detection of dead modules is treated by an underlying system. This is the same level of model used in our PLITS work and seems to be a clean abstraction of a level to be found in most distributed systems. This paper can also be viewed as part of the continuing effort to carefully define and study the problems of synchronizing multiple con-current program modules (or processes). The most recent formal effort along these lines is a paper by Peterson and Fischer [Peterson, G. '77J which discusses the critical section problem for distributed systems. The first two pages of that caper are included as the Appendix, to help provide the setting for this work. The critical section problem is not particularly interesting (cf. below), but is very simple and has a long history. We will loosely follow the formulation of Peterson and Fischer, but our concern is with message-based coordination problems. The section of Peterson and Fischer marked (1) describes how their model is based on shared memory for state variables (flags). This model makes sense in a single processor with multiple modules, but does not have an obvious >-3- realization in a system distributed in a network. One thing we will do later is explore direct extensions of the flag style of solution to message-based (realistic) distributed systems. The historical context of the critical section problem is in the design of resource-sharing operating systems. A resource (such as a printer or a disk unit) had to be used by only one process of the system at a time. Since all the processes were on a single machine, the reading of flags was perfectly straightforward. We are concerned with the allocation of resources by a widely distributed systen This paper explores the techniques available for synchronizing message-based systems and, especially, the proof methods which can be employed with them. There is an additional set of problems involving the allocation of multiple resources among parallel processes with overlapping requirements [Coffman '71]. This paper does not directly address these problems, but the results developed here can be used to extend the existing co-ordination algorithms to distributed systems. A typical resource to be allocated by a distributed processing system might be a distributed data base or band of the electromagnetic spectrum or a volume of air space by an air traffic control system. Often the best solution to these shared resource allocation problems is to have a specific module in charge of allocating 'the resource (this is analogous to replacing global variables with classes or modules). The major problem with having one controller per resource is that a failure in the controller totally precludes use of the resource. If there are redundant controllers for a resource, they will have a synchronization problem tike those discussed here.

This text was extracted from a PDF file.
This is the abbreviated version, containing approximately 6% of the total text.

Page 1 of 16

THIS DOCUMENT IS AN APPROXIMATE REPRESENTATION OF THE ORIGINAL.

Synchronizing Distant Cooperating Processes

TR26

by

Jerome A. Feldman

October, 1977 Department of Computer Science The University of Rochester Rochester, New York 14627 The preparation of this paper was supported by the Alfred P. Sloan Foundation under grant number 74-I2-5 and by the National Science Foundation grant number MCS76- 10825. >

1. Introduction

Distributed Computing, the execution of a single computation by a network of computers, is currently a very hot topic. As is usual in computing, applications are proceeding in advance of any systematic study of the subject: The technical and economic reasons for the great increase in distributed computing seem likely to remain in force for some time. In addition to its practical significance, distributed computing is a subject of inherent intellectual interest because every complex system, from a micro-organism to a society, does its computations with a number of fairly independent sub-systems whose communications can be viewed as messages. We have previously worked on a message-based model for system programs [Ball et al '76J and on a high-level programming language (PLITS) for distributed computing [Feldman '76J. This paper attempts to lay out an abstract framework for the analysis and synthesis of message-based distributed computation. The direct motivation for the current effort came from a graduate seminar that was working on the refinement and applications of PLITS. The students had very little trouble writing message-based PLITS programs but had great difficulty rovin them correct, even though they could prove ordinary programs. One difficulty was the absence of an agreed upon formalism for demonstrating the correctness of such programs. An initial attempt at such a formalism is one of the three goals of this paper. A second goal is to extend the recent work on synchronization of modules which share memory to those which communicate only by messages. Our third goal is to develop programming and proof techniques which will help in the production of demonstrably correct solutions to real distributed computing problems.

>The choice of a model is often the crucial step in formalization of a domain. Distributed computing presents particular modeling dif-ficulties because an adequate treatment of it requires consideration of several poorly understood concepts including time-dependency, parallelism and unreliability. There also is much less accumulated intuition than is available for monoprocessing.~ The particular model chosen here assumes-that reliable transmission (including preserving sequence) and the detection of dead modules is treated by an underlying system. This is the same level of model used in our PLITS work and seems to be a clean abstraction of a level to be found in most distributed systems. This paper can also be viewed as part of the continuing effort to carefully define and study the problems of s...