"After posting some benchmarks involving cfs, I got some feedback, so I decided to do a follow-up that'll hopefully fill in the gaps many people wanted to see filled," Rob Hussey began. He added, "this time around I've done the benchmarks against 2.6.21, 2.6.22-ck1, and 2.6.23-rc6-cfs-devel (latest git as of 12 hours ago)." Rob briefly summarized, "the only analysis I'll offer is that both sd and cfs are improvements, and I'm glad that there is a lot of work being done in this area of linux development. Much respect to Con Kolivas, Ingo Molnar, and Roman Zippel, as well all the others who have contributed."
Referring to a chart in which the blue line represented the CFS process scheduler and the green line represented the SD "staircase" process scheduler, Ingo Molnar noted, "heh - am i the only one impressed by the consistency of the blue line in this graph? :-) [ and the green line looks a bit like a .. staircase? ]" He acknowledged some slowdown in CFS compared to SD in one of the benchmarks, "-ck1 is 0.8% faster in this particular test." Ingo then explained, "many things happened between 2.6.22-ck1 and 2.6.23-cfs-devel that could affect performance of this test. My initial guess would be sched_clock() overhead." In further testing, he applied a low-res-sched-clock patch that improved CFS performance, leading him to conclude, "the performance difference between -ck and -cfs-devel seems to be mostly down to the more precise (but slower) sched_clock() introduced in v2.6.23 and to the startup penalty of freshly created tasks." When asked if the low-res-sched-clock patch was likely to be merged, Ingo replied:
"I don't think so - we want precise/accurate scheduling before performance. (otherwise tasks working off the timer tick could steal away cycles without being accounted for them fairly, and could starve out all other tasks.) Unless the difference was really huge in real life - but it isn't."
"It turns out that USB devices suck when it comes to powermanagement issues :(" lamented Greg KH in posting some patches to handle USB autosuspend problems. He noted that the patches were intended for inclusion in the upcoming 2.6.23 kernel, "a number of patches have been submitted near the end of this kernel release cycle that add new device ids to the quirk table in the kernel to disable autosuspend for specific devices. However, a number of developers are very worried that even with the testing that has been done, once 2.6.23 is released, we are going to get a whole raft of angry users when their devices break in nasty ways." He proved an example, "it seems that almost 2/3 of all USB printers just can not handle autosuspend. And there's a _lot_ of USB printers out there..."
Later in the discussion, Linux creator Linus Torvalds commented, "in general, I think the USB blacklist/whitelists are generally a sign of some deeper bug." He went on to point out a number of quirks in the USB layer that need to be addressed, adding:
"We used to have a lot of those things due to simply incorrect SCSI probing, causing devices to lock up because Linux probed them with bad or unexpected modepages etc. I suspect we still have old blacklist entries from those days that just never got cleaned up, because nobody ever dared remove the blacklist entry.
"We should strive to make the default behaviour be so safe that we never need a black-list (or a whitelist), and basically consider blacklists to be not a way to 'fix up a device', but a way to avoid some really serious AND *RARE* error."
"With 2.6.24 probably opening in the not-too-distant future, it's probably a good time to review what my plans are for when the merge window opens," began Roland Dreier on the Linux Kernel mailing list. He reflected on the recent decision to phase in usage of reviewed-by tags noting that he was a little behind on reviews, "unfortunately, due to the length of the backlog and the fact that 2.6.23 seems fairly close, some of the things listed below are going to miss the 2.6.24 merge window." Roland continued, stressing the importance of getting in-depth reviews:
"Although the plan is to phase in requiring 'Reviewed-by:' gently, for this merge, if you can get someone other than me to review your work, then the chances of it being merged increase dramatically. I'm talking about a real review -- ideally, someone independent (from another company would be good) who is willing to provide a 'Reviewed-by:' line that means the reviewer has really looked at and thought about the patch. There should be a mailing list thread you can point me at where the reviewer comments on the patch and a new version of that patch addressing all comments is posted (or in exceptional cases, where the patch is perfect to start with, where the reviewer says the patch is great)."
Linus Torvalds announced the sixth release candidate of the upcoming 2.6.23 kernel, with a final release expected within the next few weeks. He noted:
"So last week was a bust, with a lot of core people away for the kernel summit, and with -rc5 having two rather nasty (and silly) one-liner problems that bit a number of people - a missing NULL pointer check in TCP, and a missing list terminator in ata_piix.
"So the fixes for those things were both pretty trivial, and they've been in the -git trees for the last few days, but I just pushed out an -rc6 that also merges up some other updates that did come in during the week."
The -rc6 source level changes can be browsed via the gitweb interface.
Hua Zhong reported an NFS regression in 2.6.23-rc4 as compared to 2.6.22, "[upgrading] causes several autofs mounts to fail silently - they just [do] not appear when they should." Trond Myklebust explained that the change to the default behavior was intentional, to prevent an NFS mount from being mounted with the wrong options. The patch also introduced a new mount option, "the new option is there in order to make it damned clear to sysadmins that this is a dangerous thing to do: mounts which don't share the same superblock also don't share the same data and attribute caches. Any file or directory which appears in both mounts had better only be used by one application at a time or be using an appropriate locking scheme." Jakob Oestergaard defended the change, asserting, "what he 'broke' is, for example, a ro mount being mounted as rw. That *could* be a very serious security (etc.etc.) problem which he just fixed. Anything depending on read-only not being enforced will cease to work, of course, and that is what a few people complain about(!)."
Linus Torvalds disagreed strongly with the change, "that commit gets reverted or fixed. It's a regression, and your theories that it's 'better' that way are obviously broken." He added:
"The point being that you just disallowed people from doing things that are sane but _potentially_ dangerous. That's not how we work. The UNIX way is to give people rope - if you cannot *prove* that what they are doing is wrong, then you damn well better not disallow it."
In response to the concern that the changes to NFS were necessary to fix a security hole, Linus retorted, "this is *not* a security hole. In order to make it a security hole, you need to be root in the first place. So what you call a security hole is really no different from root installing a bad SUID binary. It's simply not the kernels place to then say 'SUID binaries will not work, because it's a potential security hole'."
Linus Torvalds announced the fifth release candidate for the upcoming 2.6.23 Linux kernel, noting that he was on his way to Cambridge, England, for the 2007 kernel summit. The invite-only kernel summit has been hosted in Ontario, Canada, for the past five years; this is the first year it has been hosted in Europe. It will take place over three days, from September 4th through September 6th.
Regarding 2.6.23-rc5, Linus noted, "hopefully we've addressed most regressions, so please do give it a good testing." He went on to summarize, "the shortlog and diffstat are appended: the diffstat is uglified by some powerpc defconfig updates, but otherwise it all looks pretty nice and small. The shortlog is fairly informative if you care about the details of what changed, but it does end up boiling down to 'fixing a number of generally pretty small issues'. Mostly in drivers and SCTP. So have fun, give it a go, and expect a quiet week next week."
Linux creator Linus Torvalds announced the latest release candidate of the upcoming 2.6.23 kernel, "it can mostly be described with the one word, 'boring'", he said, noting there weren't any exciting changes. He added that there were two weeks between this and the last release candidate, summarizing:
"As a result, -rc4 is a bit bigger than it would/should have been, but hopefully it's all good, and we've fixed most regressions. There's some arch updates (MIPS, power, sparc64, s390) and an ACPI update, but the rest of it is mainly lots of small fixes (mostly to various random drivers). With some scheduler and networking noise."
Actual source-level changes can be viewed through the gitweb interface. Kernel Newbies maintains a list of all changes in the upcoming kernel.
Ingo Molnar announced version 20 of his Completely Fair Scheduler patchset, offering further cleanups for the new scheduler code that will be part of the upcoming 2.6.23 kernel, "there have been lots of small regression fixes, speedups, debug enhancements and tidy-ups - many of which can be user-visible." Ingo went on to summarize:
"There are nearly 100 changes - they do add up to a significant total linecount change. There was no crash bug or hang bug found in the CFS code since v19 was released. (in fact the last crash/hang bug in CFS was found and fixed in v7, more than 3 months ago, and even that crash only happened in an uncommon sw-suspend setup, not during normal use. So CFS has turned out to be a pretty robust codebase.) Nevertheless, if you had any problems (performance or behavioral) with v19 it's worth checking v20 out - and if v19 worked great for you it's worth checking out that v20 still works great =B-)"
"Either people really are calming down, and figuring out that we're in the stabilization phase," Linus Torvalds began in announcing 2.6.23-rc3, "or it's just that it's the middle of August, and most everybody at least in Europe are off on vacation." The actual source-level changes can be browsed via the kernel.org gitweb interface. Linus went on to summarize:
"Regardless of why, -rc3 is out, and doesn't have the tons of changes that -rc2 did. But there's some scheduler updates, sparc64 and powerpc changes, and random driver updates (the lpfc SCSI driver kind of stands out in the diffstat).
"Shortlog appended, I don't know what I can add to it.. Please do give it a good testing, unless you're on a beach sunning yourself (and who are we kidding: you're pasty white, and sand is hard to get out of the keyboard - beaches are overrated)."
Some entertaining lguest documentation discussed in an earlier story was merged into the mainline kernel with the commit message, "the netfilter code had very good documentation: the Netfilter Hacking HOWTO. Noone ever read it. So this time I'm trying something different, using a bit of Knuthiness." Both Netfilter and lguest, as well as the documentation for both, were written by Rusty Russell. He describes the lguest driver as, "a simple hypervisor for Linux on Linux. Unlike kvm it doesn't need VT/SVM hardware. Unlike Xen it's simply 'modprobe and go'. Unlike both, it's 5000 lines and self-contained."
Downloading the 2.6.23-rc2 kernel and looking in the "drivers/lguest/" directory, I found a simple README that kicks off an interesting documentation process, beginning, "welcome, friend reader, to lguest." It goes on to note, "I can't think of many 5000-line projects which offer both such capability and glimpses of future potential; it is an exciting time to be delving into the source!" At the end of the included README is a hint as to how to find the rest of the documentation, which is embedded inline within all the lguest files. Read on to begin the exploration into lguest and its documentation.
"So I tried to hold people to the merge window," Linus Torvalds began in announcing the 2.6.23-rc2 kernel, "and said no to a few pull requests, but this whole '-rc2 is the new -rc1' thing is a disease, and not only is -rc2 late, it's bigger than it should be. Oh, well." He noted that over 250 people contributed patches between -rc1 and -rc2, adding:
"A lot of the changes are small, and a lot of them really are fixes, but there's a MIPS merge in there too, and some absolutely _huge_ diffs due to some drivers undergoing Lindent cleanups (28 _thousand_ lines changes in advansys.c, and the PNP files got Lindented too, although those weren't nearly as big).
"But if you ignore the Lindent changes, the MIPS merge, the lguest documentation updates, and the MPT fusion driver changes, and the removal of the broken arm26 support, the rest of the changes really aren't that big."
"People who think SD was 'perfect' were simply ignoring reality," Linus Torvalds began in a succinct explanation as to why he chose the CFS scheduler written by Ingo Molnar instead of the SD scheduler written by Con Kolivas. He continued, "sadly, that seemed to include Con too, which was one of the main reasons that I never [entertained] the notion of merging SD for very long at all: Con ended up arguing against people who reported problems, rather than trying to work with them." He went on to stress the importance of working toward a solution that is good for everyone, "that was where the SD patches fell down. They didn't have a maintainer that I could trust to actually care about any other issues than his own." He then offered some praise to Ingo, "as a long-term maintainer, trust me, I know what matters. And a person who can actually be bothered to follow up on problem reports is a *hell* of a lot more important than one who just argues with reporters." Linus went on to note a comparison between the two schedulers:
"I realize that this comes as a shock to some of the SD people, but I'm told that there was a university group that did some double-blind testing of the different schedulers - old, SD and CFS - and that everybody agreed that both SD and CFS were better than the old, but that there was no significant difference between SD and CFS."
"Lguest is an adventure, with you, the reader, as Hero," began some documentation for lguest recently submitted by Rusty Russell. The documentation continued, "but be warned; this is an arduous journey of several hours or more! And as we know, all true Heroes are driven by a Noble Goal. Thus I offer a Beer (or equivalent) to anyone I meet who has completed this documentation. So get comfortable and keep your wits about you (both quick and humorous). Along your way to the Noble Goal, you will also gain masterly insight into lguest, and hypervisors and x86 virtualization in general."
Andrew Morton noted that he would consider the documentation patches for inclusion in the 2.6.23 kernel, to which Rusty replied, "indeed, no code changes, and I feel strongly that it should go into 2.6.23 because it's *fun*. And (as often complained) there's not enough poetry in the kernel." Linus Torvalds quipped, "there's a reason for that," going on to rhyme, "there once was a lad from Braidwood, with a wife and a hatred for FUD, he hacked kernels for fun, couldn't get them to run, but he always felt that he should." He added, "so when you say 'there's not enough poetry', next time you'll know why. You *really* don't want want poetry." This led to numerous additional poetic submissions about which Rusty noted, "there was a poetic infection, which distorted the kernel's direction, the code got no time, as they all tried to rhyme, and it shipped needing lots of correction."
As expected, Linus Torvalds released the 2.6.23-rc1 kernel two weeks after the release of 2.6.22, ending the merge window, "and it has a *ton* of changes as usual for the merge window, way too much for me to be able to post even just the shortlog or diffstat on the mailing list". He noted, "I personally like how 'sendfile' is now totally gone internally, and the kernel now ends up doing all that with splice insted. Good riddance, although we'll obvously end up supporting the old user level interfaces for a long time." Linus went on to summarize the other changes:
"Lots of architecture updates (for just about all of them - x86[-64], arm, alpha, mips, ia64, powerpc, s390, sh, sparc, um..), lots of driver updates (again, all over - usb, net, dvb, ide, sata, scsi, isdn, infiniband, firewire, i2c, you name it).
"Filesystems, VM, networking, ACPI, it's all there. And virtualization all over the place (kvm, lguest, Xen).
"Notable new things might be the merge of the cfs scheduler, and the UIO driver infrastructure might interest some people."
The Xen virtual machine monitor was recently merged into the upcoming 2.6.23 Linux kernel in a series of patches from Jeremy Fitzhardinge. Xen began as a research project at the University of Cambridge, and has been repeatedly discussed as a merge candidate for the mainline Linux kernel.
Xen is described in the project's FAQ as:
"Xen is a virtual machine monitor (VMM) for x86-compatible computers. Xen can securely execute multiple virtual machines, each running its own OS, on a single physical system with close-to-native performance."