Genetic Algorithm for Runtime Choice of Optimal I/O Scheduling
Original Publication Date: 2004-Oct-13
Included in the Prior Art Database: 2004-Oct-13
Described herein are the foundations necessary to automate the run-time choosing of optimal I/O scheduling algorithms and the corresponding tunable parameters. Furthermore, this paper shows how to adapt to changing workloads and hardware without human intervention.
Genetic Algorithm for Runtime Choice of Optimal I /O Scheduling
Input/Output schedulers in Operating Systems have long been identified as a point for potential performance improvement. Many algorithms have been designed in several Operating Systems. For instance, in Linux there is support for the Anticipatory, Deadline, No-op, and CFQ I/O scheduling algorithms. New work is being done to support I/O scheduler modules that are replaceable at runtime, which would make the number of supported I/O scheduling algorithms almost infinite.
Each I/O scheduler algorithm works best on certain hardware and certain workloads. Changing either the workload or the hardware can change which algorithm would yield the best performance, but there is currently no way to automatically choose the best algorithm. Changing the algorithm requires investigation and administration by a highly skilled person. In other words it costs lots of time and money.
To make matters worse each algorithm has many tunable parameters that can be set to arbitrary values. Much effort is spent by performance analysts and system administrators choosing how to choose an I/O algorithm and then tune these variables. In the majority of cases machines run with sub optimal algorithms and sub optimal parameters.
Described are the foundations necessary to automate the choosing of the algorithm and the tunable parameters. Furthermore, this invention shows how to adapt to changing workloads and hardware without human intervention.
The core idea of this paper is to use a genetic algorithm to analyze collected I/O performance data, for example throughput and latency. Using this data the system evolves to adapt to workloads and hardware. Using a two level custom genetic algorithm that chooses I/O scheduling algorithms and tunable parameters as phenotypes and alleles respectively. This genetic algorithm has the advantage of not requiring intervention from the user or system administrator, of adjusting to changing workloads, and of giving excellent performance.
Each member of the population would consist of a weighting of I/O scheduling algorithms corresponding to the amount of time each I/O scheduling algorithm would be run during that member's time period. These algorithms are referred to as phenotypes. Each phenotype also has an associated set of tunable parameters, referred to as genes.
The high level flow on an iteration of a life-cycle would be:
I. Randomly generate the initial population
A. Assign all the phenotypes (I/O scheduling algorithms) and genes (tunable parameters) for each member randomly.
II. Run population timeslices
A. Give each member of the population system time configured using the
configuration determined by their phenotypes and genes.
III. Evaluate population member performance, discard the members in the lower half of performers.
A. Using a performance metric of choice, such as transactions per second, throughput, or latency the population members are rate...