Start-up Methodology for a NUMA Machine

IP.com Disclosure Number: IPCOM000123467D
Original Publication Date: 1998-Dec-01
Included in the Prior Art Database: 2005-Apr-04
Document File: 3 page(s) / 128K

Publishing Venue

IBM

Related People

Bannister, JP: AUTHOR [+9]

Abstract

A method for starting nodes on a Non-Uniform Memory Access (NUMA) machine is disclosed. The nodes are synchronized through the use of an initialization sequence and hardware registers at each node.

This text was extracted from an ASCII text file.
This is the abbreviated version, containing approximately 52% of the total text.

   For the nodes in a Non-Uniform Memory Access (NUMA) system to
communicate efficiently, they must be correctly configured and
enabled before the operating system can use them.  This is not a
trivial problem, since the nodes in a NUMA system may operate
independently of one another and can start at different times.
Initiating a remote transaction to a node that is not ready for
operation may result in a deadlock and overall system failure.

   One solution is to provide a hardware register on each node
(the Start-up Register) and a predefined sequence of events (the
Start-up Protocol) that the start-up software must follow to ensure
that all nodes in a NUMA system are ready and present when the
operating system boots.  The following two sections explain the
solution in more detail.

   The Start-up Register

   The start-up register contains all the information
necessary for ensuring a good start-up on each node.  As shown in
Figure 1, the register contains two sections: 1) the local area, in
which information is communicated from the local node to the start-up
software and operating system, and 2) the shared area, in which
information about remote nodes is communicated to the start-up
software and operating system.

   Each register must contain in its local area the fields
N_RDY, N_EN, and N_ID.  N_RDY and N_EN are each 1 bit wide and serve
as control bits for the start-up software.  N_RDY is a read-only
field that the hardware sets active when the current node is ready to
be enabled.  N_EN is a read/write bit that the software sets when it
is ready to enable the node for NUMA operation.  N_ID is a read-only
field that uniquely identifies the current node.  It is ceil(lg(N))
bits wide, where N is the number of nodes, lg(a) is the log base 2 of
a, and ceil(a) is the function that returns the smallest integer
greater than or equal to a.
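
   The roles of the three local-area fields can be sketched in C as
follows.  This is only a software model under stated assumptions: the
struct local_area_t and the function try_enable_node are hypothetical
names introduced for illustration, and the struct does not represent
the actual hardware bit layout.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical software model of the local area. Field names follow the
 * text; the struct layout is an assumption, not the hardware layout. */
typedef struct {
    bool     n_rdy; /* read-only: hardware sets it when the node is ready to be enabled */
    bool     n_en;  /* read/write: software sets it to enable NUMA operation */
    uint32_t n_id;  /* read-only: uniquely identifies the current node */
} local_area_t;

/* Set N_EN only if the hardware has already raised N_RDY.
 * Returns true if the node is now enabled, false if it is not ready yet. */
bool try_enable_node(local_area_t *la)
{
    assert(la != NULL);
    if (!la->n_rdy) {
        return false; /* node not ready: leave N_EN untouched */
    }
    la->n_en = true;
    return true;
}
```

   In this model the start-up software would call try_enable_node
repeatedly until it returns true, so N_EN is never set before the
hardware reports the node as ready.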

   The shared area holds a shadow copy of the N_EN bit of every
node, one bit per node.  This allows each node to see when the other
nodes have been enabled.

   In general, the total size of the register in bits is N +
ceil(lg(N)) + 2: N bits for the shared area, ceil(lg(N)) bits for
N_ID, and one bit each for N_RDY and N_EN.
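
   The size formula can be checked with a short C sketch.  The
function names ceil_lg and startup_reg_bits are hypothetical and do
not appear in the original disclosure.

```c
#include <assert.h>
#include <stdint.h>

/* ceil(lg(n)) for n >= 1: the smallest b such that 2^b >= n. */
unsigned ceil_lg(unsigned n)
{
    assert(n >= 1);
    unsigned bits = 0;
    /* 64-bit shift avoids overflow when n is close to UINT_MAX */
    while (((uint64_t)1 << bits) < n) {
        bits++;
    }
    return bits;
}

/* Total register width in bits:
 * N shared bits + ceil(lg(N)) bits of N_ID + 1 bit N_RDY + 1 bit N_EN. */
unsigned startup_reg_bits(unsigned n_nodes)
{
    return n_nodes + ceil_lg(n_nodes) + 2;
}
```

   For example, startup_reg_bits(4) evaluates to 4 + 2 + 2 = 8, and
startup_reg_bits(5) evaluates to 5 + 3 + 2 = 10.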

   Figure 2 is an example of a start-up register for a 4 node
NUMA machine.  As the figure shows, there are 4 bits in the local
area (N_RDY, N_EN, and a 2-bit N_ID) and 4 bits in the shared area,
one bit per node.  N_ID in this case is 2 bits wide since
ceil(lg(4)) = 2.
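
   A 4 node register of this kind could be decoded as in the C sketch
below.  The bit positions are an assumption made for illustration,
since Figure 2 is not reproduced in this text: bit 0 is taken as
N_RDY, bit 1 as N_EN, bits 3:2 as N_ID, and bits 7:4 as the shared
N_EN shadows for nodes 0 through 3.  All macro and function names are
hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* ASSUMED bit layout for a 4-node start-up register (not taken from Figure 2). */
#define N_RDY_BIT    0u   /* local area: node ready */
#define N_EN_BIT     1u   /* local area: node enabled */
#define N_ID_SHIFT   2u   /* local area: 2-bit node ID */
#define N_ID_MASK    0x3u
#define SHARED_SHIFT 4u   /* shared area: one N_EN shadow bit per node */
#define SHARED_MASK  0xFu
#define NUM_NODES    4u

bool reg_n_rdy(uint8_t reg) { return ((reg >> N_RDY_BIT) & 1u) != 0; }
bool reg_n_en(uint8_t reg)  { return ((reg >> N_EN_BIT) & 1u) != 0; }
unsigned reg_n_id(uint8_t reg) { return (reg >> N_ID_SHIFT) & N_ID_MASK; }

/* True if the shared area shows the given node as enabled. */
bool reg_node_enabled(uint8_t reg, unsigned node)
{
    assert(node < NUM_NODES);
    return ((reg >> (SHARED_SHIFT + node)) & 1u) != 0;
}

/* True if every node's N_EN shadow bit is set. */
bool reg_all_enabled(uint8_t reg)
{
    return ((reg >> SHARED_SHIFT) & SHARED_MASK) == SHARED_MASK;
}
```

   Under this assumed layout, the value 0xB7 read on node 1 would
mean that the local node is ready and enabled, and that nodes 0, 1,
and 3 are enabled while node 2 is not.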

   The Start-up Protocol

   Utilizing the start-up register, the start-up software
will be able to correctly bring up all the nodes.  The software can
determine if i...