
Method for server self-load balancing Disclosure Number: IPCOM000198803D
Publication Date: 2010-Aug-17
Document File: 6 page(s) / 44K

Publishing Venue

The Prior Art Database


Traditional load balancing techniques focus on predicting the maximal load on the server and ensuring that enough resources are available to serve all traffic during the highest load. This means that some resources are wasted outside of the peak load. In this article we present a new approach to this problem.

This text was extracted from a PDF file.
At least one non-text object (such as an image or picture) has been suppressed.
This is the abbreviated version, containing approximately 26% of the total text.




A common problem in client-server architectures is distributing the load that the services of connecting clients place on the server. In an environment with a high number of clients, independent and routinely scheduled services may cause periods of high server loading counterbalanced by periods of light server loading. These peaks are inevitable when potentially large groups of clients are initialized and scheduled to run their services at the same time. This can lead to sub-optimal server performance, as client requests are rejected and common server tasks take longer to complete due to the lack of available server resources (such as memory and CPU cycles). Ideally, client connections and services should be distributed in time such that the server observes a constant, or at least more uniform, loading.

General Description

The core idea of this method focuses on a strategy for the server to predict and balance its own client service loading using an adaptive statistical algorithm.

The simplest invocation of the method can be described as follows. First, upon client request, the server communicates to each client the default frequency at which to run a service. This may take the form of a standard date and time, or of a frequency such as every six hours. Second, the server collects information about each client service and its duration for each client in the network. Third, once a minimum set of historical data is available, the server forecasts its client service loading for a period of time. Fourth, the server analyzes the forecast and identifies peak periods of client service loading. Finally, the server reacts to peak periods by iteratively moving load from highly loaded client service timeframes into more lightly loaded ones until a desired threshold or tolerance is met. After multiple iterations of this scheme, the server will have communicated more optimal execution times for each client's services, resulting in a server workload that is more uniformly distributed over time without excessive loading peaks.
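The final balancing step above could be sketched as follows. This is a minimal illustration, not the disclosed implementation; the timeframe buckets, the move-smallest-service heuristic, and the data structures are assumptions.

```python
def balance_schedule(forecast, threshold, max_iterations=100):
    """Iteratively move load out of peak timeframes into lighter ones.

    forecast: dict mapping timeframe -> list of (client, load) tuples.
    threshold: desired maximum total load per timeframe (the tolerance).
    Returns a rebalanced copy of the schedule.
    """
    schedule = {t: list(entries) for t, entries in forecast.items()}
    for _ in range(max_iterations):
        # Identify the most heavily loaded timeframe in the forecast.
        peak = max(schedule, key=lambda t: sum(load for _, load in schedule[t]))
        peak_load = sum(load for _, load in schedule[peak])
        if peak_load <= threshold:
            break  # every timeframe is within the desired tolerance
        # Move the smallest client service from the peak timeframe
        # into the most lightly loaded timeframe.
        light = min(schedule, key=lambda t: sum(load for _, load in schedule[t]))
        schedule[peak].sort(key=lambda entry: entry[1])
        schedule[light].append(schedule[peak].pop(0))
    return schedule
```

In practice the server would then communicate the new timeframe of each moved service back to the affected client, per the scheme described above.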

Typical Embodiment

The prediction algorithm relies on the number of clients, the frequency of the services, the duration and priority of each service, and the frequency and weight of background administrative tasks.
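A toy forecast over these inputs might aggregate the expected load per timeframe as sketched below. The hourly bucket granularity and the field names are assumptions for illustration; background administrative tasks are omitted for brevity.

```python
def forecast_load(clients, horizon_hours):
    """Estimate total expected service duration per hour over the horizon.

    clients: list of dicts with keys:
      'next_run_hour' - hour offset of the first scheduled run
      'period_hours'  - how often the service runs (e.g. every 6 hours)
      'avg_duration'  - pre-calculated average duration of the service
    Returns a list: expected load for each hour in the horizon.
    """
    load = [0.0] * horizon_hours
    for client in clients:
        # Project each client's periodic service runs onto the horizon.
        hour = client["next_run_hour"]
        while hour < horizon_hours:
            load[hour] += client["avg_duration"]
            hour += client["period_hours"]
    return load
```

The balancing step would then scan this list for hours whose total exceeds the desired threshold.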

Each service (i.e., all requests ending up in the same server callback) is associated with a unique priority number, identifying its importance from a business point of view, and with a pre-calculated average duration (for example, a critical service that runs very quickly has a low probability of being dropped in a stress situation). The priority can be determined by taking into consideration specific system or business information, such as whether the system or solution will maintain integrity if that service gets dropped.
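Each service could be modeled as a record combining its priority with a running average duration, as in the sketch below. The field and method names are illustrative assumptions, not part of the disclosure.

```python
from dataclasses import dataclass, field

@dataclass
class ServiceProfile:
    """Per-service bookkeeping used by the prediction algorithm."""
    name: str
    priority: int           # unique priority number: business importance
    droppable: bool         # can the system maintain integrity if dropped?
    durations_ms: list = field(default_factory=list)

    def record_invocation(self, duration_ms: float) -> None:
        # The server records the duration of every invocation of the service.
        self.durations_ms.append(duration_ms)

    @property
    def avg_duration_ms(self) -> float:
        # Pre-calculated average duration, consumed by the forecast step.
        if not self.durations_ms:
            return 0.0
        return sum(self.durations_ms) / len(self.durations_ms)
```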

Every time a service gets invoked, the following measures are...