Method for aggregating Light Path Diagnostics status across multiple servers.
Original Publication Date: 2004-Oct-22
Included in the Prior Art Database: 2004-Oct-22
In a customer environment, a heterogeneous selection of server models may be employed to meet the customer's requirements. Each model contains different components and is therefore subject to different types of hardware failures. As such, each model maintains a different scheme of Light Path Diagnostics LEDs. In this environment it is necessary for the system administrator to monitor the state of Light Path Diagnostics for each individual system and comprehend the various Light Path Diagnostics schemes. This article describes a method for aggregating Light Path Diagnostics status across multiple heterogeneous servers.
This article describes a method for remotely retrieving and aggregating Light Path Diagnostics status across a grouping of servers. The method applies to servers within a single systems management network as well as servers within a single chassis (e.g., BladeCenter). It uses the communication network and protocols defined by the systems management network to detect the service processors within a group of servers. Each service processor is then queried to obtain the status of Light Path Diagnostics on its host system. Applications that use this method may display the information within a Graphical User Interface or create a text file describing the state of the detected LEDs for each server.
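The query-and-aggregate flow described above can be illustrated with a minimal sketch. Everything named here is an assumption made for illustration: the article does not define a programming interface, so `ServerStatus`, `aggregate`, `format_report`, the LED names, and the server names are all hypothetical, and the real service processor query is replaced by an in-memory stand-in (`fake_query`).

```python
from dataclasses import dataclass
from typing import Callable, Dict, Iterable, List

# Hypothetical representation: LED name (e.g. "FAN") -> True when lit.
LedStates = Dict[str, bool]


@dataclass
class ServerStatus:
    """Light Path status reported for one server (illustrative only)."""
    name: str
    leds: LedStates
    reachable: bool = True

    @property
    def lit(self) -> List[str]:
        # Names of the LEDs that are currently lit, in a stable order.
        return sorted(led for led, on in self.leds.items() if on)


def aggregate(servers: Iterable[str],
              query: Callable[[str], LedStates]) -> List[ServerStatus]:
    """Query each server's service processor and collect the results.

    `query` stands in for whatever transport the management network
    provides; it is assumed to raise OSError when a server is unreachable.
    """
    results = []
    for name in servers:
        try:
            results.append(ServerStatus(name, query(name)))
        except OSError:
            results.append(ServerStatus(name, {}, reachable=False))
    return results


def format_report(statuses: Iterable[ServerStatus]) -> str:
    """Render one text line per server, as a text-file report might."""
    lines = []
    for s in statuses:
        if not s.reachable:
            state = "UNREACHABLE"
        elif s.lit:
            state = "LIT: " + ", ".join(s.lit)
        else:
            state = "OK"
        lines.append(f"{s.name}: {state}")
    return "\n".join(lines)


# Stand-in data: two server models with different LED schemes (made up).
FAKE_INVENTORY: Dict[str, LedStates] = {
    "blade-01": {"CPU": False, "FAN": True, "DASD": False},
    "blade-02": {"CPU": False, "MEM": False},
}


def fake_query(name: str) -> LedStates:
    # Simulates a service processor query; unknown servers do not respond.
    if name not in FAKE_INVENTORY:
        raise ConnectionError(f"no response from {name}")
    return FAKE_INVENTORY[name]


statuses = aggregate(["blade-01", "blade-02", "rack-07"], fake_query)
report = format_report(statuses)
print(report)
```

Passing the query function in as a parameter keeps the aggregation logic independent of any one server model's LED scheme, which is the point of the method: each model may report a different set of LEDs, yet the report treats them uniformly.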
One of the problems facing customers in an enterprise environment is the management of a host of servers that are implemented on various platforms. The choice of platform for a particular system is based on the applications that the server must run, in conjunction with the business needs of the client. Problem determination in this environment becomes difficult. The I/T administrator is forced to learn various architectures and various error detection techniques in order to successfully manage the servers under her responsibility. IBM xSeries systems ease this burden by providing a mechanism to visually detect various hardware problems and warning conditions. This mechanism is implemented in the form of Light Path Diagnostics™. Light Path Diagnostics guides the administrator to a failing part on the system via a path of LEDs.
An administrator can use Light Path Diagnostics to detect the presence of a system failure without having to understand the specifics of the architecture on each system. The Light Path architecture, however, solves only one aspect of the administration problem. In the described environment, there may be hundreds, and conceivably thousands, of systems. It is not practical for the administrator to walk up to each system and view the status of the hardware.
This invention describes a mechanism for aggregating the system status of multiple systems in a heterogeneous environment using a system management network. A heterogeneous environment, in this article, is an I/T shop with a variety of xSeries machine types and a variety of operating systems. System status in this environment includes the status of FRU components, processors, voltage regulators, ambient temperatures, DASD error indications, and PCI error indications, among other error indicators. The system management network consists of a network of dedicated PCI cards and baseboard management controllers connected via RS-485 connections. Also included on the network is a single system that is used to manage the other systems on the network. The devices on this network are responsible for interacting with the host system to retrieve environmental information and send...