Autonomous Learning Algorithm for Power Management
Original Publication Date: 2009-Jul-14
Included in the Prior Art Database: 2009-Jul-14
Energy efficiency in computer data centers is becoming a critical issue for many companies. Disclosed is an autonomous learning method for improving energy efficiency in data centers.
Disclosed is an autonomous learning algorithm to optimize power consumption in a data center or massively multi-node computer. The disclosure includes building an energy/power profile for servers operating under different workloads, analyzing those profiles (which include system hardware and software configuration information) to identify patterns, and adjusting both the configuration and operation of the servers based on the analysis. Four specific embodiments of our disclosure follow:
1) Software Patch/Version Maintenance Considering Power Usage
This embodiment monitors systems for changes in efficiency after a configuration change is made. If the change improves efficiency, it can be propagated to more systems; if it reduces efficiency, it can be limited (i.e., applied only to a subset of systems just large enough to fulfill the necessary workload demands, and "undone" if the workload drops enough to fit on a smaller number of servers). For example, suppose a specific patch or software version is required to run a certain workload, but that version runs less efficiently than the old one. We would not update all of the servers to the new version but would instead limit it to the smallest number of systems necessary. In contrast, if the new version ran more efficiently, we would update all of the servers. This capability would be integrated into the customer's fix/patch management software (like Management Central for System i*).
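The limiting policy described above can be sketched in a few lines. This is only an illustrative sketch, not the disclosed implementation: the names `VersionProfile` and `plan_rollout`, the energy-per-unit-of-work metric, and the assumption that every node serves the same amount of workload are all hypothetical choices made for the example.

```python
import math
from dataclasses import dataclass


@dataclass
class VersionProfile:
    """Hypothetical measured profile for one software version."""
    name: str
    joules_per_unit: float  # assumed metric: energy consumed per unit of work
    units_per_node: float   # assumed: workload units a single node can serve


def plan_rollout(old: VersionProfile, new: VersionProfile,
                 total_nodes: int, demand_needing_new: float) -> int:
    """Return how many nodes should run the new version.

    If the new version is at least as efficient as the old one, it is
    propagated to every node. Otherwise it is limited to the smallest
    number of nodes that can carry the workload requiring it.
    """
    if new.joules_per_unit <= old.joules_per_unit:
        return total_nodes
    needed = math.ceil(demand_needing_new / new.units_per_node)
    return min(total_nodes, needed)
```

Calling `plan_rollout` again whenever the measured demand changes covers both directions: the count grows as more workload needs the new version, and it shrinks (the "undo" case) when the workload fits on fewer servers.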
A software upgrade (this applies to hardware as well) can cause more CPU cycles to be consumed, and increased use of CPU cycles causes an increase in energy consumption. Other upgrades are intended specifically to reduce the use of CPU cycles and boost performance (for example, performance patches/PTFs). New versions of software can therefore cause either an increase or a decrease in power needs. In addition, some software upgrades might require a hardware upgrade to increase performance. This requirement can be costly, and it might ultimately raise or lower energy consumption, depending on the hardware.
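One way to tell whether an upgrade raised or lowered power needs is to compare the energy consumed per unit of work before and after the change. The sketch below assumes that each monitoring sample records average power, duration, and completed work. The `Sample` fields and the function names are hypothetical and are not part of the disclosure.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Sample:
    """Hypothetical monitoring sample for one node over one interval."""
    avg_power_watts: float
    duration_s: float
    work_units: float  # e.g. transactions completed in the interval


def energy_per_unit(samples: List[Sample]) -> float:
    """Total energy in joules (watts * seconds) divided by total work."""
    total_energy = sum(s.avg_power_watts * s.duration_s for s in samples)
    total_work = sum(s.work_units for s in samples)
    if total_work == 0:
        raise ValueError("no work recorded in the samples")
    return total_energy / total_work


def relative_change(before: List[Sample], after: List[Sample]) -> float:
    """Positive result: the upgrade costs more energy per unit of work."""
    base = energy_per_unit(before)
    return (energy_per_unit(after) - base) / base
```

Normalizing by completed work, rather than comparing raw power draw, keeps a heavier workload from being mistaken for a less efficient software version.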
(Example of autonomically dealing with a patch/upgrade that causes an increase in power consumption): A subset of a customer's ERP application might need a newer software version to run successfully. With our invention integrated into a software management tool, we would detect the increase in energy consumption (or the increase in CPU cycles needed) for the new version as compared to the old, and we would build a plan to reduce the number of systems to which this software upgrade is applied. Our invention can monitor the computing resource needs of the new software version so that, as more of the customer's ERP application needs the new version, we gradually convert more nodes over to it. Optionally, we can switch nodes between software versions more...