Optimizing Systems Management using Swarm Intelligence Principles

IP.com Disclosure Number: IPCOM000185303D
Original Publication Date: 2009-Jul-20
Included in the Prior Art Database: 2009-Jul-20
Document File: 4 page(s) / 66K

Publishing Venue

IBM

Abstract

Disclosed is an approach to the management of large numbers of computer servers or other computing devices, using the principles of swarm intelligence, which has previously been used to model the swarming or flocking behaviors of animals. Specifically, several basic behavioral principles of swarm intelligence are translated into analogous behaviors involving the management of computing workloads, and a method of managing workloads through low-level, peer-to-peer interaction following these swarm principles is then described.

This is the abbreviated version, containing approximately 24% of the total text.

The common approach to management of large numbers of servers or other devices, including virtualization, workload balancing, and network management, is to employ a powerful management server with sophisticated top-down control mechanisms to manage the deployment of computing resources (e.g., IBM Systems Director*, IBM Enterprise Workload Manager**, HP*** Systems Insight Manager). Centralized management servers frequently require rich agent software to be installed on each managed system or device, to communicate performance metrics and state information to the management server, and to run management operations on the managed systems under the control of the management server. The disadvantages of such centralized systems management tools include:

- the expense of dedicating the resources of a powerful server to the management tasks;
- the resource consumption and performance impact of the agent software running on each managed system and communicating over a network with a central server;
- a tendency towards reactive behavior, for example, lags in responding to relatively rapid increases or decreases in demand. Such lag might be due to the time and resources required for communication between managed systems and the management server, or to inefficiencies in the policy rules and algorithms for redeploying resources to the systems that require them;
- increasing complexity and difficulty of creating effective management policies as the number of managed systems increases into the hundreds or thousands (scalability issues).

Swarm Intelligence (SI) has been a field of study in computer science for almost 20 years. According to the Wikipedia article on the subject, "SI is artificial intelligence based on the collective behavior of decentralized, self-organized systems. SI systems are typically made up of a population of simple agents interacting locally with one another and with their environment. The agents follow very simple rules, and although there is no centralized control structure dictating how individual agents should behave, local interactions between such agents lead to the emergence of complex global behavior. Natural examples of SI include behavior of ant colonies, bird flocking, animal herding, bacterial growth, and fish schooling."
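
The idea in the quoted definition, that simple local rules with no central controller can produce orderly global behavior, can be illustrated with a small simulation. The following Python sketch is an illustration only and is not taken from this disclosure: the ring topology, the function names (ring_neighbors, balance_round, simulate) and the rate parameter are all assumptions made for the example. Each node sees only its two neighbors and hands part of any load difference to the less busy one; no node ever sees the whole group.

```python
def ring_neighbors(n):
    """Hypothetical topology: node i knows only its left and right neighbor.
    Assumes n >= 3 so that the two neighbors are distinct nodes."""
    return [((i - 1) % n, (i + 1) % n) for i in range(n)]


def balance_round(loads, neighbors, rate=0.5):
    """Apply one local rule at every node, all at once: for each pair of
    neighbors, the busier node hands a fraction of the load difference
    to the less busy one. Total load is conserved."""
    updated = list(loads)
    for i, nbrs in enumerate(neighbors):
        for j in nbrs:
            if i < j:  # visit each neighbor pair exactly once
                transfer = rate * (loads[i] - loads[j]) / 2
                updated[i] -= transfer
                updated[j] += transfer
    return updated


def simulate(loads, rounds, rate=0.5):
    """Repeat the local rule; no step consults a central manager."""
    neighbors = ring_neighbors(len(loads))
    for _ in range(rounds):
        loads = balance_round(loads, neighbors, rate)
    return loads


if __name__ == "__main__":
    # All work starts on one node; the group evens it out on its own.
    start = [100.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0]
    print(simulate(start, rounds=200))
```

In this sketch the balanced end state is never computed by any single node; it emerges from repeated pairwise exchanges. With rate=0.5 on a ring, each node's new load is a weighted average of its own load and its neighbors' loads, which is why the values settle toward the group mean.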

This invention, which can be considered an instantiation of autonomic computing principles, applies basic principles of swarm intelligence to the management of large, complex groups of both physical and virtual computer servers in a novel way. This approach replaces, or at least supplements, top-down management and complex agent software with simpler peer-to-peer agents that follow basic principles analogous to those identified in SI research. By following this approach, large groups of servers, including "grid" or "cloud"...