Distributed Power/Thermal Monitoring and Control in Large Computer Systems

IP.com Disclosure Number: IPCOM000114262D
Original Publication Date: 1994-Dec-01
Included in the Prior Art Database: 2005-Mar-28
Document File: 4 page(s) / 105K

Publishing Venue

IBM

Related People

Plat, E: AUTHOR [+2]

Abstract

Large computer systems are composed of various functional elements. These elements generally reside in either Thermal Conduction Module (TCM) or Card On Board (COB) packages. Each package is served by a number of power supplies and is cooled by chilled water or forced air.

This text was extracted from an ASCII text file.
This is the abbreviated version, containing approximately 52% of the total text.

      Large computer systems are composed of various functional
elements.  These elements generally reside in either Thermal
Conduction Module (TCM) or Card On Board (COB) packages.  Each
package is served by a number of power supplies and is cooled by
chilled water or forced air.

      Three Power/Thermal (P/T) environmental conditions must be
maintained to assure safe operation of each functional element:  (1)
voltage, (2) current, and (3) temperature.  A violation of the P/T
environmental specifications, called a load fault, can damage system
hardware.  To prevent damage, a functional element exhibiting a load
fault is powered off.
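
      The load fault rule above can be sketched in a few lines of
Python.  This is an illustration only: the Limits class, the SPEC
table, the function names, and every numeric limit are assumptions
made for the example and do not come from the article.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Limits:
    """Allowed range for one measured quantity (hypothetical)."""
    low: float
    high: float

    def violated_by(self, value: float) -> bool:
        return not (self.low <= value <= self.high)


# Illustrative specification only; these limits are assumptions.
SPEC = {
    "voltage": Limits(4.75, 5.25),      # volts
    "current": Limits(0.0, 40.0),       # amperes
    "temperature": Limits(10.0, 70.0),  # degrees Celsius
}


def load_faults(reading: dict) -> list:
    """Return the names of the P/T conditions that the reading violates."""
    return [name for name, limits in SPEC.items()
            if limits.violated_by(reading[name])]


def protect(reading: dict) -> str:
    """Power off the element on any load fault; otherwise leave it on."""
    return "power_off" if load_faults(reading) else "on"
```

      For example, a reading of 5.0 V, 20 A, and 85 degrees C
violates only the assumed temperature limit, so protect() returns
"power_off" for it.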

      The service system monitors and controls the P/T environment
via centralized instrumentation stations, called P/T Stations (PTS).
Each PTS has the following major characteristics:
  o  Resides in its own COB package with power supplies
  o  Services more than one functional element at a time
  o  Is a critical link between the service system and the power
      system
  o  Provides digital control of sense I/O
  o  Makes voltage, current, and temperature measurements
  o  Depends upon the service system to respond to fault conditions

      Some large computer systems employ "N+1" power supplies.  "N+1"
denotes the N power supplies required to fulfill the load demand plus
one spare, which makes the configuration fault tolerant.  The
contents of this article are most applicable to this class of large
computer system.
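
      The "N+1" arithmetic can be illustrated with a short Python
sketch.  It assumes identical supplies that share the load, which the
article does not state; the function names and the example figures
are hypothetical.

```python
import math


def supplies_required(load_amps: float, rating_amps: float) -> int:
    """N: the fewest supplies whose combined rating covers the load."""
    if load_amps <= 0 or rating_amps <= 0:
        raise ValueError("load and rating must be positive")
    return math.ceil(load_amps / rating_amps)


def supplies_installed(load_amps: float, rating_amps: float) -> int:
    """N+1: the required supplies plus one spare."""
    return supplies_required(load_amps, rating_amps) + 1


def tolerates_single_failure(installed: int, load_amps: float,
                             rating_amps: float) -> bool:
    """True if the load is still covered after any one supply fails."""
    return (installed - 1) * rating_amps >= load_amps
```

      With an assumed 250 A load and 100 A supplies, N is 3 and four
supplies are installed.  Losing one of the four leaves 300 A of
capacity, which still covers the load; with only three installed, a
single failure would leave 200 A and the load would not be covered.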

      An improvement to the centralized PTS approach is to distribute
the PTS monitoring and control functions to the power supplies.  This
concept depends upon two fundamental design requirements:
  1.  The service system communicates directly with the power
      supplies.
  2.  Each power supply monitors and protects its load autonomously.
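
      The two requirements can be pictured with a minimal Python
model of one power supply.  This is a sketch under stated
assumptions, not the design described in the article: the
PowerSupply class, its address field, the limits table, and the
sample() and status() methods are all hypothetical.

```python
class PowerSupply:
    """Hypothetical supply that protects its own load without outside help."""

    def __init__(self, address: int, limits: dict):
        self.address = address    # assumed network address of this supply
        self.limits = limits      # name -> (low, high), illustrative values
        self.output_on = True
        self.fault = None         # first violated condition, if any

    def sample(self, **measurements: float) -> None:
        """Check one set of measurements and trip locally on a load fault."""
        if not self.output_on:
            return
        for name, value in measurements.items():
            low, high = self.limits[name]
            if not (low <= value <= high):
                self.output_on = False   # autonomous power-off
                self.fault = name
                return

    def status(self) -> dict:
        """Reply to a direct query from the service system."""
        return {"address": self.address,
                "on": self.output_on,
                "fault": self.fault}
```

      In this model the supply powers off its load by itself inside
sample(); the service system only reads the outcome through status()
and is not needed to respond to the fault.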

      Fig. 1 presents a topological overview of the power system
network.  The key power system components are (1) the communications
bridge between the service system and power system, (2) the
communications network, and (3) the power supplies.  The
communications network is composed of three major elements:
  o  BUS CONTROLLER - Manages the network protocol
  o  NETWORK BUS - Distributes the network
  o  REMOTE TERMINAL - Responds to the network protocol and bridges
      from the network to the subs...