A Method of Utilizing OS Dynamic Resource Allocation/Configuration for Power Throttling in a Multi-Node (Bladecenter) Environment
Original Publication Date: 2004-Aug-20
Included in the Prior Art Database: 2004-Aug-20
On the next generation of Blade servers, power utilization and thermal cooling will become a major issue for individual node performance and functionality. With current Blade designs, power consumption and/or thermal output could induce CPU power throttling and/or complete node shutdown in some situations. If power draw or thermal conditions exceed power supply thresholds, the power supply will shut itself off. If the power supply shuts down, all blades in that power domain will go down as well. To prevent this, the management module monitoring the power supplies and power domain must implement recovery policies that reduce net power consumption and heat to values within the power supply's operating range. Many different algorithms are possible for deciding how to achieve the required reduction. Current solutions determine the total reduction in power consumption required and divide that value evenly across all blades in the power domain, reducing the performance of all blades equally. Realistically, different applications running on different blades have different performance characteristics. This solution enables users to tailor power reduction goals for each blade to the resources not required by the primary application running on that blade.
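The difference between the current even-division approach and the per-blade tailoring described above might be sketched as follows. All blade names, wattages, and function names here are illustrative assumptions, not taken from the disclosure:

```python
def even_division(required_reduction_w, blades):
    """Current approach: split the required reduction equally across blades."""
    per_blade = required_reduction_w / len(blades)
    return {blade: per_blade for blade in blades}


def tailored_reduction(required_reduction_w, reducible_w_by_blade):
    """Proposed approach: take reductions from lower-priority blades first,
    up to each blade's user-approved reducible wattage, sparing high-priority
    blades entirely when possible."""
    plan = {}
    remaining = required_reduction_w
    # List is assumed to be in the user's priority order (cut these first).
    for blade, reducible_w in reducible_w_by_blade:
        cut = min(reducible_w, remaining)
        plan[blade] = cut
        remaining -= cut
        if remaining <= 0:
            break
    return plan
```

With a 120 W target across four blades, `even_division` slows every blade equally, while `tailored_reduction` can satisfy the same target from the secondary blades alone, leaving the highest-priority blade untouched.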
On current Bladecenter designs, it will be possible to have a total of 14 blades per chassis, with each blade supporting up to eight 1 GB DIMMs and two processors. As on current blades, each blade will have a local H8-based integrated system management module, and a management module will provide manageability and control over the entire chassis.
For this particular solution, the end user has configured a power utilization policy via outside system software (e.g., IBM Director) to reserve specific nodes within the chassis as highest priority for power utilization, minimizing or eliminating their performance reduction. Secondary node(s) are configured in the same policy so that, in the event of a power or thermal issue, individual node resources (primarily memory and CPUs) can be removed or disabled.
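A policy of the kind described above could be represented as a simple record with the two node classes. The field names, node numbers, and component lists here are assumptions for illustration only:

```python
# Illustrative power utilization policy: high-priority nodes are spared,
# while secondary nodes list the resources eligible for removal/disablement.
power_policy = {
    "high_priority_nodes": [1, 2],
    "secondary_nodes": {
        3: {"removable_dimms": [2, 3, 4, 5, 6, 7], "removable_cpus": [2]},
        4: {"removable_dimms": [2, 3], "removable_cpus": [2]},
    },
}
```

Note that each secondary node keeps at least its first DIMM and first CPU, since a node cannot run with zero of either.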
The policy manager in the management application will show users which blades are members of each power domain and the characteristics of the power supplies that serve them. Generally, two power supplies act in a redundant capacity for a given power domain. In order to compensate for the loss of a power supply, the management module must reduce power consumption down to the capacity of the remaining power supply. Users will be given a value indicating the net reduction that needs to be achieved. For each blade in the domain, the user interface will display the power consumption allotted to the CPUs and DIMMs on that blade. Users will be able to select any DIMMs and CPUs beyond the first on each blade to disable in the case of power loss; these will be prioritized within the blade. As users select components to disable, their consumption will be subtracted from the total reduction that needs to be achieved. When this value reaches zero, the interface will indicate to users that they have a valid policy to apply.
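The bookkeeping described above, where each selected component's draw is subtracted from the required reduction until the target is met, might be sketched as follows. The component labels and wattages are illustrative assumptions:

```python
def build_policy(required_reduction_w, selections):
    """selections: (blade, component, watts) tuples in the order the user
    selected them. Returns the accepted selections and whether the policy
    is valid (i.e., the remaining required reduction has reached zero)."""
    remaining = required_reduction_w
    policy = []
    for blade, component, watts in selections:
        policy.append((blade, component))
        remaining -= watts
        if remaining <= 0:
            return policy, True   # valid policy: target met
    return policy, False          # still short of the target


policy, valid = build_policy(
    90,
    [("blade3", "DIMM_7", 10), ("blade3", "CPU_2", 55),
     ("blade4", "DIMM_3", 10), ("blade4", "DIMM_2", 10),
     ("blade4", "CPU_2", 55)],
)
```

In a real interface this runs incrementally as the user clicks components, with the "valid policy" indicator lighting up once the remaining value is non-positive.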
Once all nodes are running an operating system which supports dynamic resource addition/removal, each node's service processor and the chassis management module will constantly monitor power consumption and/or temperature within the chassis. In the event that the management module has detected that power utilization has exceeded a threshold usage, it will immediately notify the non-mission...