Jake Moilanen provided a series of four patches against the 2.6.9 Linux kernel [story [1]] that introduce a simple genetic algorithm [2] used for automatic tuning. The patches update the anticipatory IO scheduler [story [3]] and the zaphod CPU scheduler [story [4]] to both use the new in-kernel library, theoretically allowing them to automatically tune themselves for the best possible performance for any given workload. Jake says, "using these patches, there are small gains (1-3%) in Unixbench & SpecJBB. I am hoping a scheduler guru will able to rework them to give higher gains."
Genetic algorithms as used in machine learning are modeled after the process of evolution as observed in nature, and are a field within the science of artificial intelligence. The idea is to generate a "population" defined with unique strings of "chromosomes", to test each of these chromosome strings for "fitness", to select a subset of the chromosome strings with the best fitness and use them to create new chromosomes, to apply random mutation to a small subset, and finally to start the process all over again. Over time, all the chromosomes should "evolve" toward having the best possible fitness, as defined by the algorithm.
In Jake's patch, the chromosome strings are possible tunings for Linux kernel internals, and fitness is measured in overall performance. Essentially, a large pool of possible tunings are selected (referred to above as chromosome strings). Each of these possible tunings is tested under the current workload, measured for best performance (referred to above as fitness). All of the tunings are then ordered from best performance to worst, and the worst half of the performers are replaced with new possible tunings derived from the best half of the performers. Finally, random mutation is applied to slightly modify some of the tunings. Jake explains, "over time the tunables should converge toward the optimal settings for that workload. If the workload changes, the tunables should converge to the new optimal settings (this is part of the reason for mutation)."
From: Jake Moilanen [email blocked] To: linux-kernel Subject: [ANNOUNCE 0/4][RFC] Genetic Algorithm Library Date: Thu, 6 Jan 2005 10:08:44 -0600 I'm pleased to announce a new in-kernel library to do kernel tuning using a genetic algorithm. This library provides hooks for kernel components to take advantage of a genetic algorithm. There are patches to hook the different schedulers included. The basic flow of the genetic algorithm is as follows: 1.) Start w/ a broad list of initial tunable values (each set of tunables is called a child) 2.) Let each child run for a timeslice. 3.) Once the timeslice is up, calculate the fitness of the child (how well performed). 4.) Run the next child in the list. 5.) Once all the children have run, compare the fitnesses of each child and throw away the bottom-half performers. 6.) Create new children to take the place of the bottom-half performers using the tunables from the top-half performers. 7.) Mutate a set number of children to keep variance. 8.) Goto step 2. Over time the tunables should converge toward the optimal settings for that workload. If the workload changes, the tunables should converge to the new optimal settings (this is part of the reason for mutation). This algorithm is used extensively in AI. Using these patches, there are small gains (1-3%) in Unixbench & SpecJBB. I am hoping a scheduler guru will able to rework them to give higher gains. The main area that could use reworking is the fitness calculation. The problem is that the kernel is looking more at the micro of what's going on, instead of the macro. I am thinking of moving the fitness calculation to outside the kernel. However, I would advocate keeping the number of layers needed to communicate between the genetic library and the hooked component down in order to keep it as lightweight as possible. The patches are based on 2.6.9 and still a little rough, but here is the descriptions: [1/4 genetic-lib]: This is the base patch for the genetic algorithm. It's based against 2.6.9. [2/4 genetic-io-sched]: The base patch for the IO schedulers to use the genetic library. [3/4 genetic-as-sched]: A genetic-lib hooked anticipatory IO scheduler. [4/4 genetic-zaphod-cpu-sched]: A hooked zaphod CPU scheduler. Depends on the zaphod-v6 patch. Please send comments, Jake Moilanen
From: Jake Moilanen [email blocked] Subject: [ANNOUNCE 1/4][RFC] Genetic Algorithm Library Date: Thu, 6 Jan 2005 10:14:47 -0600 Here is the base patch for the genetic-library. It includes generic routines for modifying and manipulating genes of children. If a specific routine is needed, that can be used instead. Signed-off-by: Jake Moilanen [email blocked] --- [patch [5]]
From: Jake Moilanen [email blocked] Subject: [ANNOUNCE 2/4][RFC] Genetic Algorithm Library Date: Thu, 6 Jan 2005 10:18:38 -0600 This is the base patch for the io-schedulers. It contains the fitness routine, disk_calc_fitness(), that could use a rework. Signed-off-by: Jake Moilanen [email blocked] --- [patch [6]]
From: Jake Moilanen [email blocked] Subject: [ANNOUNCE 3/4][RFC] Genetic Algorithm Library Date: Thu, 6 Jan 2005 10:22:49 -0600 This is the hooked anticipatory IO scheduler. Here is an example of the hooked scheduler. It should optimally tweak the tunables for the current workload. If nothing else, hopefully the genetic-lib will help scheduler writers find the optimal tunable settings. Signed-off-by: Jake Moilanen [email blocked] --- [patch [6]]
From: Jake Moilanen [email blocked] Subject: [ANNOUNCE 4/4][RFC] Genetic Algorithm Library Date: Thu, 6 Jan 2005 10:27:42 -0600 The hooked zaphod CPU scheduler patch. It depends on the zaphod-v6 patch. The ranges for the tunables could be tweaked better. Signed-off-by: Jake Moilanen [email blocked] --- [patch [7]]
Related Links:
- Archive of above thread [8]