
Automatic Use of Parallelism

IP.com Disclosure Number: IPCOM000092783D
Original Publication Date: 1967-Feb-01
Included in the Prior Art Database: 2005-Mar-05
Document File: 2 page(s) / 14K

Publishing Venue

IBM

Related People

Driscoll, GC: AUTHOR [+2]

Abstract

In a computing system which contains several processing units, all of which have access to main storage, there are many computations in which a procedure is to be repeated a definite number of times. Such a procedure uses different data or combinations of data each time, and the iterations of the procedure can take place simultaneously. Matrix multiplication is an example of such a computation, with vector multiplication as the procedure. This method allows a computing system, as described above, to provide automatically the greatest possible degree of parallelism during these computations. At the same time, it breaks the computation into pieces that are as large as possible, to minimize the overhead incurred when a processor changes from task to task.



Automatic Use of Parallelism

In a computing system which contains several processing units, all of which have access to main storage, there are many computations in which a procedure is to be repeated a definite number of times. Such a procedure uses different data or combinations of data each time, and the iterations of the procedure can take place simultaneously. Matrix multiplication is an example of such a computation, with vector multiplication as the procedure. This method allows a computing system, as described above, to provide automatically the greatest possible degree of parallelism during these computations. At the same time, it breaks the computation into pieces that are as large as possible, to minimize the overhead incurred when a processor changes from task to task. It never creates a task when there is no more work for that task to perform. The method also allows dynamic respecification of the maximum degree of parallelism during the computation, as discussed below.
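
As a concrete illustration of the kind of computation meant (this sketch is not part of the original disclosure; the names matmul and dot are purely illustrative), the following C fragment writes an n-by-n matrix product as n*n repetitions of a single procedure, a vector (dot) product. Each repetition writes a distinct output element and reads only shared, unmodified inputs, so any number of the repetitions could proceed simultaneously on separate processors.

#include <stddef.h>

/* One iteration of the repeated procedure: a single vector (dot) product.
 * Each call reads the shared inputs but writes nothing shared, so
 * distinct iterations are independent of one another.                 */
static double dot(const double *row, const double *col, size_t n, size_t stride)
{
    double s = 0.0;
    for (size_t k = 0; k < n; k++)
        s += row[k] * col[k * stride];
    return s;
}

/* The computation: the procedure above repeated a definite number of
 * times (n * n), once per output element.  Any subset of these
 * iterations may be executed at the same time.                        */
void matmul(const double *A, const double *B, double *C, size_t n)
{
    for (size_t i = 0; i < n; i++)
        for (size_t j = 0; j < n; j++)
            C[i * n + j] = dot(&A[i * n], &B[j], n, n);
}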

A split instruction is provided which has three parameters. First, there is the address in main storage of a program status word (PSW) which specifies the parallel branch of the program that is to be executed. Second, there is a maximum parallelism index which specifies the maximum number of processors which are to undertake execution of this branch. Third, there is the storage address of a counter which generally specifies the remaining amount of logical parallelism in the program at this point, e.g., the total number of executions of the branch still to be performed. When a processor executes a split instruction, normally having first stored the appropriate PSW and counter at the indicated locations, it makes an entry in the appropriate queue. This entry is made in an area of main storage accessible only to executive, i.e., privileged, routines, and it contains the three parameters included in the split instruction. The processor then executes the next instruction in sequence. The counter itself is generally in a location accessible to the program and thus can be counted down as the proce...
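
The following C sketch is a loose, modern model of the mechanism just described, not the original machine facility: a function pointer stands in for the stored PSW, POSIX threads stand in for additional processors, and a C11 atomic counter plays the role of the counter in program-accessible storage. The struct and function names (split_entry, split, worker) are invented for this illustration, and the issuing "processor" here waits for the workers to finish rather than proceeding immediately to its next instruction as the disclosure describes.

#include <pthread.h>
#include <stdatomic.h>

/* Entry placed on the executive's queue by the split instruction:
 * the three parameters named in the text, plus shared context.      */
struct split_entry {
    void      (*branch)(long iter, void *ctx);  /* stands in for the stored PSW      */
    int          max_parallelism;               /* maximum number of processors used */
    atomic_long *counter;                       /* remaining logical parallelism     */
    void        *ctx;                           /* shared data for the branch        */
};

/* Each worker (a processor that picks up the queue entry) counts the
 * counter down, claiming one iteration per decrement, and gives up
 * the processor as soon as no work remains.                          */
static void *worker(void *arg)
{
    struct split_entry *e = arg;
    for (;;) {
        long remaining = atomic_fetch_sub(e->counter, 1);
        if (remaining <= 0)
            break;                      /* nothing left to do */
        e->branch(remaining - 1, e->ctx);
    }
    return NULL;
}

/* "split": dispatch up to max_parallelism workers, but never more
 * than there are iterations outstanding, so no task is created when
 * there is no work for it to perform.                                */
static void split(struct split_entry *e)
{
    long todo = atomic_load(e->counter);
    int  nworkers = e->max_parallelism < todo ? e->max_parallelism : (int)todo;
    pthread_t tid[64];
    if (nworkers > 64)
        nworkers = 64;
    for (int p = 0; p < nworkers; p++)
        pthread_create(&tid[p], NULL, worker, e);
    for (int p = 0; p < nworkers; p++)
        pthread_join(tid[p], NULL);
}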