Zero Cost Power and Cooling Monitor System

IP.com Disclosure Number: IPCOM000112719D
Original Publication Date: 1994-Jun-01
Included in the Prior Art Database: 2005-Mar-27
Document File: 4 page(s) / 144K

Publishing Venue

IBM

Related People

Morris, N: AUTHOR [+2]

Abstract

Disclosed is the use of existing general purpose input signals on a disk device to signal power and cooling faults to the host system. Additional hardware is avoided.

This text was extracted from an ASCII text file.
This is the abbreviated version, containing approximately 37% of the total text.

      Many computer storage products consist of an array of identical
storage devices housed in a sub-system mechanical package that
provides them with bulk-power and cooling.  Each device can be
accessed from the host computer and can report its status.  The
status of the bulk-power and cooling components in the sub-system
must also be reported, so a device that can send messages to the
host computer has to be added.  For example, in a SCSI sub-system
this would be a monitor circuit consisting of a microprocessor and
SCSI communication chips.  In other architectures such as the IBM
SSA, a dedicated SSA chip and support components would likewise be
required.  The main disadvantage is cost: the monitor circuit has to
be packaged on its own PCB, which requires design effort and adds
manufacturing cost to the product.  Another disadvantage is that the
monitor circuit needs its own dedicated target ID.  In a typical
SCSI sub-system with a maximum of 7 target IDs on one bus, the
design is limited to 6 storage devices plus 1 monitor circuit, which
worsens the price-performance ratio.
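
      The target-ID arithmetic above can be made concrete with a short
sketch.  The figure of 7 target IDs per bus comes from the text; the
helper name storage_share and the choice to express the result as a
fraction of IDs holding storage devices are illustrative assumptions.
The second case, a bus with no dedicated monitor, is an inference
about what freeing that ID would allow.

```python
from fractions import Fraction

TARGET_IDS_PER_BUS = 7  # from the text: maximum of 7 target IDs on one bus


def storage_share(monitor_circuits: int,
                  target_ids: int = TARGET_IDS_PER_BUS) -> Fraction:
    """Fraction of target IDs left for storage devices (hypothetical helper)."""
    if not 0 <= monitor_circuits <= target_ids:
        raise ValueError("monitor count must fit within the available target IDs")
    return Fraction(target_ids - monitor_circuits, target_ids)


# Dedicated monitor circuit: 6 of the 7 IDs hold storage devices.
with_monitor = storage_share(monitor_circuits=1)

# No dedicated monitor (assumed case): all 7 IDs can hold storage devices.
without_monitor = storage_share(monitor_circuits=0)
```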

      The new concept is to use a number of the storage devices as
bulk-power and cooling component monitors.  Each storage device was
designed so that it could monitor one FRU (Field Replaceable Unit).
Each FRU provides a single bit of status data, which is wired to an
input of one of the storage devices.  If that device detects a
failure in the FRU, it inserts an error message into its
communication stream with the host computer.  The monitor design
includes three further features.

      (1)  An Early Power Off Warning (EPOW) detector was designed to
operate within the storage device.  If it detects total loss of bulk
DC power, the device completes any outstanding sector write
operations and then quiesces itself.  This ensures that data is not
corrupted in the event of an unexpected mains or bulk power-supply
failure.

      (2)  Each storage device is powered from two or more sources of
bulk-power to provide fault-tolerance.  The monitor can report the
loss of just one of these as a non-critical fault.  If two or more
of the redundant bulk-power sources fail, this is reported as a
critical failure (EPOW).  This is a "quorum/voting" system, in which
normal operation requires a "quorum" of bulk-power sources.

      (3)  The system is failsafe so that if any of the status sense
connections is broken, the appropriate error message is generated,
indicating the failure (or absence) of the cable FRU involved.  The
error reported must be one that leads the person investigating the
problem to check the correct cable...
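
      The monitoring behaviour described in this disclosure can be
sketched roughly as follows.  This is a minimal model under stated
assumptions, not the disclosed firmware: the class and function names
are hypothetical, the sense-line polarity (an open or broken line
reads as 1) is assumed because the text does not say how the failsafe
property is achieved, and the sketch treats the "critical" power
classification as the trigger for the EPOW quiesce.

```python
from collections import deque
from enum import Enum
from typing import Deque, List, Optional, Sequence

# ASSUMPTION: an open or broken sense line reads as 1 (e.g. via a pull-up),
# and a healthy FRU actively drives the line to 0.
FAULT_LEVEL = 1


class Severity(Enum):
    OK = "ok"
    NON_CRITICAL = "non-critical"
    CRITICAL = "critical (EPOW)"


def fru_message(fru_name: str, sense_level: int) -> Optional[str]:
    """Feature (3): a failed FRU and a broken sense cable look the same on
    the line, so the message names both as suspects (assumed wording)."""
    if sense_level == FAULT_LEVEL:
        return f"{fru_name}: FRU failed, or its sense cable is broken or absent"
    return None


def classify_power(sources_ok: Sequence[bool]) -> Severity:
    """Feature (2): one lost source is non-critical; two or more is critical."""
    failed = sum(1 for ok in sources_ok if not ok)
    if failed == 0:
        return Severity.OK
    return Severity.NON_CRITICAL if failed == 1 else Severity.CRITICAL


class StorageDevice:
    """Hypothetical model of one storage device acting as a monitor."""

    def __init__(self) -> None:
        self.pending_writes: Deque[int] = deque()  # queued sector numbers
        self.written: List[int] = []
        self.quiesced = False

    def queue_write(self, sector: int) -> bool:
        """Accept a sector write unless the device has quiesced."""
        if self.quiesced:
            return False
        self.pending_writes.append(sector)
        return True

    def on_power_status(self, sources_ok: Sequence[bool]) -> Severity:
        """Feature (1): on a critical loss, finish outstanding sector writes,
        then quiesce so that no further writes are accepted."""
        severity = classify_power(sources_ok)
        if severity is Severity.CRITICAL:
            while self.pending_writes:
                self.written.append(self.pending_writes.popleft())
            self.quiesced = True
        return severity
```

      With two bulk-power sources, for example, calling
on_power_status([True, False]) reports a non-critical fault and
leaves the device running, while on_power_status([False, False])
drains the queued sector writes and then refuses new ones.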