Working on dynticks for i386 reminds me how every project in the kernel is 1000 times more intricate than it first appears. Put together your code and it looks fine but then you have to rewrite it twenty times or so for things you didn't anticipate. Then you chip away at the unexpected bugs one by one like trying to carve an ice cube out of an iceberg with a toothpick.
I've been battling with this swappiness issue and 2.6 for as long as I can remember now. No amount of tuning of the current system seemed optimal, and it finally dawned on me how I should approach it to make it actually behave very well on a desktop. I made a patch I called mapped_watermark.
This patch readjusts the way memory is evicted by lightly removing cached ram once the ram is more than 2/3 full, if less than the "mapped watermark" percent of ram is mapped ram (i.e. applications). The normal system is to aggressively start scanning ram once it is completely full. The benefits of this are:
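As a rough illustration of the eviction rule described above (not the patch itself), here is a toy model in Python. The function name, the page-count parameters and the watermark value used in the usage note are all hypothetical; only the "more than 2/3 full" and "less than the mapped watermark percent" conditions come from the description.

```python
def should_trim_cache_lightly(total_pages: int,
                              used_pages: int,
                              mapped_pages: int,
                              mapped_watermark_pct: int) -> bool:
    """Toy model of the mapped_watermark decision (names are assumptions).

    Returns True when cached ram should be lightly trimmed:
    ram is more than 2/3 full AND mapped ram (applications) is below
    the "mapped watermark" percentage of total ram.
    """
    # Integer arithmetic avoids rounding surprises with fractions.
    more_than_two_thirds_full = used_pages * 3 > total_pages * 2
    mapped_below_watermark = mapped_pages * 100 < total_pages * mapped_watermark_pct
    return more_than_two_thirds_full and mapped_below_watermark
```

For example, with 900 pages of ram, 700 in use and only 100 of them mapped, a watermark of 66 (an arbitrary value picked for illustration) would trigger light trimming, whereas the same machine with 700 mapped pages would not.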
I've put up a new release of -ck
http://kernel.kolivas.org
I made lots of little updates to the extra scheduling policies in staircase, a tiny micro-optimisation to staircase itself, added a cfq fix, an updated bootsplash and there were security fixes that are going into mainline. After looking at how big this list was I realised I should just release a new -ck so I've released 2.6.7-ck5.
After extensive testing by all the helpful people out there I've released a new version of the staircase scheduler patch. Version 7.7 was nice and stable but probably underperformed about 4 minor versions before it. The stability was necessary, though, because a whole swag of little annoying starvation issues made it into 7.4. This version adds a few more planned features, and has improved the performance substantially, and improved the fairness of the non-interactive and computational scheduler settings. Since it's a significant improvement I've also resynced the -ck patchset without any other changes and released 2.6.7-ck4.
Good to be back and hacking away in earnest. After resyncing with the current kernel I spent a lot of time thinking about the staircase scheduler. I've managed to resolve the remaining corner cases and believe I have a good working design now that seems stable, fair and has good interactivity and responsiveness.
While I've been stuck(?) here in Europe having a ball, occasionally checking my email on weird keyboards, I've been thinking about the evolution and completion of the staircase scheduler design.
One thing I can tell you is that I've been thinking a lot about the code and have 3 changes planned for it. 2 involve real infrastructure changes that should complete the scheduler design without increasing its complexity significantly. The other is the simple removal of these lines from kernel/sched.c which you can do right now:
I couldn't resist. I posted a little update that causes a massive improvement in performance. It _may_ bring back the starvation in a milder form but the gains are worth it for those using this scheduler now. Patches against mainline stair5 and -ck2 are available.
http://ck.kolivas.org/patches/2.6/2.6.4/experimental/staircase/
http://ck.kolivas.org/patches/2.6/2.6.4/2.6.4-ck2/experimental/
Although I'm officially not doing any coding since I'll be leaving in a couple of days, I thought of something while on the treadmill. So I've posted a very minor update to the staircase code bringing it up to v5.1. First - tasks that get preempted should not be put to the back of the queue; this is something even mainline does that annoys me. Second - if a task has been preempted, when it is scheduled again it should have at least one jiffy worth of time_slice to optimise cache performance and prevent virtually useless wakeups. No massive performance changes or anything but it was easy enough to tack on in no time.
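The two rules above can be sketched as a small user-space toy in Python. This is not the kernel code: the `Task` class, the `preempted` flag and the function names are made up for illustration, and a plain `deque` stands in for a run queue.

```python
from collections import deque
from dataclasses import dataclass

MIN_SLICE_JIFFIES = 1  # floor taken from the post: at least one jiffy


@dataclass
class Task:
    name: str
    time_slice: int          # remaining timeslice, in jiffies
    preempted: bool = False  # hypothetical flag, set on preemption


def requeue_preempted(run_queue: deque, task: Task) -> None:
    """Rule 1: a preempted task goes back to the head of its queue."""
    task.preempted = True
    run_queue.appendleft(task)


def requeue_expired(run_queue: deque, task: Task) -> None:
    """A task that used up its slice goes to the tail, as usual."""
    task.preempted = False
    run_queue.append(task)


def pick_next(run_queue: deque) -> Task:
    """Rule 2: a previously preempted task runs with at least one jiffy."""
    task = run_queue.popleft()
    if task.preempted:
        task.time_slice = max(task.time_slice, MIN_SLICE_JIFFIES)
        task.preempted = False
    return task
```

In this toy, a task preempted with almost no slice left is the next one picked from its queue and is topped up to one jiffy, so it gets a useful amount of cpu before being descheduled again.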
I have a public release announcement ready for the staircase scheduler and have put it online with a current stable usable version of it here: staircase scheduler.
I've addressed the issue of starvation in an even simpler manner that is far more effective now. There are no known issues at this time. I'll be throwing a bootload of benchmarks at it again to see how it performs now.
http://ck.kolivas.org/patches/2.6/2.6.4/experimental/staircase/
Well if you've been wondering where I've been hiding for a while now it's because I've been working on a new scheduler. I was hoping to get something together for release before I went away for two months just to leave people with something to play with.
This new scheduler uses an algorithm I invented which I describe as a staircase-deadline scheduler because of the way priority and timeslice are handled (staircase) and the addition of deadlines to prevent starvation of low priority tasks and maintain appropriate cpu distribution.
I've ported grsec test2 to 2.6.4-ck1 and the cd audio dma patch and placed them in my experimental directory.
Resync with 2.6.4
Added sched_domains
Added reiser4
Swappiness autoregulation now has the option of changing from autoregulated
swappiness to static swappiness on the fly:
echo 0 > /proc/sys/vm/autoswappiness
My fans arrived! See my previous blog for what I'm talking about.
I ripped out all the fans from every machine I could find and replaced them with my new silenx fans.
The first step was putting the 120mm fan where the case fan was. Let's just say I've never done this before so it wasn't as simple as it should have been... but it was simple. I dashed into the other room to plug it in, turn it on and hear some awful grating noise. Turned it off, pulled it apart and took off the guard that was in place, which it was hitting. Turned it back on again and heard... cdrom spin up and down... hard disks spin up... cpu fan ticking ever so slightly and that's about it. HOLY SHIT this thing ROCKS! Put my hand over the fan outlet and I think subjectively it doesn't have any more flow than the previous one. Hmm not quite what I expected but ultimately it was the silence I was after. Actually it is audible once I close the box (which dampens down the sound of the 20dB cpu fan).
Time to try and shut the door on this work. I spent many hours this past weekend converting the SMT (aka hyperthreading) priority support to work with Nick's sched_domains in 2.6.4-rc1-mm1. The infrastructure of the domains is very interesting and seems nicely extendable. With Zwane and Nick's help I believe this time I've done it the right way without making changes to kernel/sched.c that are too architecture-specific. It should be easy to use this for newer SMT designs as they come out with both improved performance and more siblings support.