
Dynamic reconfiguration based on system health and cost

IP.com Disclosure Number: IPCOM000019276D
Original Publication Date: 2003-Sep-09
Included in the Prior Art Database: 2003-Sep-09
Document File: 4 page(s) / 89K


Given a workload running on a large NUMA system composed of nodes, or distributed across individual blades in a blade center, a mechanism is needed to monitor system health in order to predict system failure and/or high resource utilization. Current system health monitors look at individual components of system health, e.g., the amount of available memory or the number of single-bit errors in memory. This mechanism combines both hardware and software health indicators to form a comprehensive node health indicator. If system failure is predicted from the system health, resources can be shifted from the failing node onto a node computed as being healthy. This information could also be used to shift a workload at health risk to a less congested or lower-cost system.



Configure each node with a cost and a health measurement for determining failover or dynamic reconfiguration.

Self-monitoring nodes/blades compute a node health and cost that the OS, hardware, or a service processor can use to alter the running system configuration.

Each node provides health and cost indicators, both hardware- and software-based. Hardware health can be determined from on-chip counters; OS health would come from the OS's internal statistics-gathering data structures. Example health indicators are listed below.

Hardware:

- Scalability Port link utilization
- Scalability Port CRC error rate
- Processor/memory bus utilization
- Memory configuration (mirrored, RBS)
- Memory single-bit errors, multiple-bit errors
- Directory error rates
- CEC issues: fans, power supplies

Software:

- Page misses
- Available memory
- Disk utilization
- Network utilization

The cost component provides a picture of how expensive a migration would be, i.e., the performance degradation that would be seen if a task is migrated to a new node. For example, in a system with direct SCP links between each node, if the traffic is asymmetric and one link is heavily utilized, it would not be advantageous to switch that route to a 2-hop route through an under-utilized node just to avoid the high cable utilization, as the two-hop path would cost more than the highly utilized one-hop path.
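The routing trade-off above can be sketched as a simple cost model. This is an illustrative assumption, not the disclosure's actual formula: each hop has a base latency inflated by the link's utilization, so an extra hop can outweigh congestion on a direct link.

```python
# Hypothetical sketch of the 1-hop vs. 2-hop cost comparison. The model
# (cost = per-hop latency * (1 + utilization)) and all numbers are
# illustrative assumptions, not part of the original disclosure.

def path_cost(hop_utilizations, per_hop_us=1.0):
    """Sum per-hop costs; a saturated hop costs up to 2x its base latency."""
    return sum(per_hop_us * (1.0 + u) for u in hop_utilizations)

# Direct link at 80% utilization vs. a 2-hop detour over nearly idle links.
direct = path_cost([0.80])        # 1.8 us
detour = path_cost([0.10, 0.10])  # 2.2 us

# Even though the direct link is heavily utilized, the extra hop makes the
# detour more expensive, so the existing route is kept.
```

Under this model the detour only wins when the direct link's congestion penalty exceeds the fixed cost of the additional hop.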

Combining the hardware and software indicators is necessary because, typically, by the time the hardware detects "bad" errors, there is little room for recovery. By using a combined hardware/software system health metric, failure or degraded performance can be better predicted.
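One way such a combined metric could be formed is sketched below, assuming each indicator is a (value, limit) pair and the groups are blended with a weighted average; the indicator names, limits, and weights are illustrative assumptions, not from the disclosure.

```python
# Hypothetical sketch: blend hardware and software indicators into one
# node health score in [0, 1]. All limits and weights are assumed.

def indicator_health(value, limit):
    """Map a raw counter to [0, 1]: 1.0 is healthy, 0.0 is at/over limit."""
    return max(0.0, 1.0 - value / limit)

def node_health(hw_indicators, sw_indicators, hw_weight=0.5):
    """Average each group of (value, limit) pairs, then blend the groups."""
    hw = sum(indicator_health(v, lim) for v, lim in hw_indicators) / len(hw_indicators)
    sw = sum(indicator_health(v, lim) for v, lim in sw_indicators) / len(sw_indicators)
    return hw_weight * hw + (1.0 - hw_weight) * sw

# Example: memory single-bit errors/hour and port CRC errors/hour (hardware);
# page misses/s and disk utilization (software).
score = node_health(hw_indicators=[(5, 100), (0, 10)],
                    sw_indicators=[(200, 1000), (0.6, 1.0)])
# A score falling below a policy threshold marks the node as failing, and
# workload is shifted toward nodes with higher scores.
```

A service processor could recompute this score periodically and compare it against each candidate node's cost before triggering a reconfiguration.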

[A parallel/example from existing technology: hard drives have SMART technology to indicate when they are seeing high error rates while reading from the platters (or at least that is my understanding). This mechanism would take that into account and lower the system health. If other disks are available, the data could be shifted from the failing disk to a healthy disk. The copy could be scheduled, and, based on the projected system performance impact, the workload could be shifted away from this node. If the workload wasn't shifted, the disk copy activity would lower this node's health, preventing additional workloads from being added to an already busy node.]

Here is an example. Initially, Node 1 has a direct connection to Node 2, as shown in Figure 1.



[Figure suppressed: four-node topology; each of Node 1 through Node 4 has 8 uP and 16 GB of memory]

Figure 1

If this connection fails the s...