Allan McKinnon offered some benchmark testing results, comparing Ingo Molnar's HT-aware patch [earlier story] for the O(1) scheduler (2.5.33-HT) against the stock 2.5.33 and 2.4.8-10bigmem kernels. He used kernel compilation as the benchmark, increasing the number of threads with each test. All of his results were quite positive, showing a significant improvement when using the patch with one or two active tasks. When using the "everypoint internal" Java-based benchmark, Allan also noted that while there was quite a bit of variance in the tests without the HT-aware scheduler patch applied, with the patch this variance went away. Read on to see his actual results.
[Note: I've modified the table formatting below, as the original formatting was much too wide. No information has changed. If you wish, you can view the original formatting here]
Earlier, when testing Roman Zippel's new kernel configuration system [earlier story], I was unable to compile the QT graphical configuration portion. After fixing a mismatch between my compiler and binutils I was still running into numerous "undefined references". Roman explained to me, "You have to use the same compiler as qt was compiled with", and I did not have a gcc 2.95 compiler installed, as was evidently used to build the copy of QT that came packaged with my Red Hat 7.2 server.
I downloaded the source for the latest version of QT from Trolltech (though older versions will work fine) and followed their instructions to compile it, using gcc 3.2. Quite a long time later (on my aging 550MHz P3) QT was fully compiled and installed. At this point I was finally able to compile the 'qconfig' portion of Roman's new configuration system...
Con Kolivas has been maintaining an excellent set of kernel patches "designed to improve system responsiveness, with emphasis on desktop pcs." The latest patches are against the stable 2.4.19 kernel, adding the O(1) scheduler, batch scheduling, kernel preemption, low latency and Andrea Arcangeli's latest -aa VM improvements. You can optionally swap the -aa VM for Rik van Riel's [earlier interview] -rmap VM, as included in Alan Cox's [earlier interview] 2.4 -ac branch.
With the recent release of 2.5.33, Linus Torvalds commented:
"There's a fair amount of stuff in here again, but I'd personally like to have people who actually use that d*ng floppy driver please test it out. I finally broke down and tried to fix it, since it's been broken in 2.5.x for longer than most people care to remember. I don't even have floppies to test with, I just verified that I could read two old backup disks, and one seemed fine, and the other read 90% of the thing, which was a lot more than I expected since they are both at least five years old. I've never had good luck with those unreliable 3.5" things, I'd rather have as little to do with them as possible."
Today Linus posted a small patch against 2.5.33 fixing a problem where "any partial request completion would be totally messed up by the BIO layer". Jens Axboe acknowledged that that patch was correct, "the most embarassing thing is that Bart and I have both found this independently months ago but it seems it got lost at my end (or your end, but lets not point fingers :-) :-(" The bug triggered corruption with the floppy driver, and possibly other block device drivers generating partial requests.
Scott Feldman announced that with the release of development kernel 2.5.33, the e1000 driver now supports TCP Segmentation Offloading (TSO), offering a significant boost in two-way transfer rates. (In the provided benchmark, send-only throughput did not increase, as the wire's physical limitation had already been reached.)
TCP Segmentation Offload (or TCP Large Send) means that buffers much larger than the supported maximum transmission unit (MTU) of a given medium are passed through the bus to the network interface card. The work of dividing these large buffers into smaller packets is thus offloaded to the NIC. More specifically, the e1000 driver passes 64k packets to the network card, which then divides them into proper MTU-sized 1500 byte packets.
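To make the arithmetic concrete, here is a minimal sketch of the splitting that TSO moves from the host CPU to the NIC. It is not the e1000 driver's code: the function name `segment` and the 40-byte header figure (a 20-byte IPv4 header plus a 20-byte TCP header, no options) are assumptions made for illustration.

```python
from typing import List

def segment(payload: bytes, mtu: int = 1500, header_len: int = 40) -> List[bytes]:
    """Split one large buffer into chunks that fit in MTU-sized packets.

    header_len is an assumed 20-byte IPv4 header plus 20-byte TCP header.
    """
    mss = mtu - header_len  # payload bytes that fit in one packet
    if mss <= 0:
        raise ValueError("MTU must be larger than the header length")
    return [payload[i:i + mss] for i in range(0, len(payload), mss)]

# One 64k buffer, as handed to the card in a single request.
big = bytes(64 * 1024)
segments = segment(big)
print(len(segments), len(segments[0]), len(segments[-1]))  # 45 1460 1296
```

Without offloading, the host builds each of those 45 packets itself; with TSO it hands over the one large buffer and the card does the splitting.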
Alexey Kuznetsov added TSO support into the stack, noting that as of yet, "the implementation in tcp is still at [the] level of a toy". Be that as it may, it's a good start, and before long other TSO capable devices will likely also be supported.
Luca Barbieri posted an interesting patch, which "implements a system that modifies the kernel code at runtime depending on CPU features and SMPness".
What this patch does (as far as I can figure out) is detect whether your machine actually has multiple CPUs (you must have SMP support compiled into the kernel), and modify the kernel accordingly.
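The general idea of deciding once, at startup, which code path to use can be sketched in user space. This is only an analogy: the patch modifies kernel code in place, while the sketch below merely selects between two implementations. Locking is used here as a hypothetical example of work that only matters with more than one CPU, and `NoOpLock` and `select_lock` are made-up names.

```python
import os
import threading

class NoOpLock:
    """Stand-in lock for a single-CPU machine (illustrative only)."""
    def __enter__(self):
        return self

    def __exit__(self, *exc):
        return False  # never swallow exceptions

def select_lock(cpu_count: int):
    """Pick the lock implementation once, based on the detected CPU count."""
    return threading.Lock if cpu_count > 1 else NoOpLock

# Decide at startup; every later use pays only for the chosen variant.
LockType = select_lock(os.cpu_count() or 1)
with LockType():
    pass  # critical section would go here
```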
It doesn't seem likely that it will be merged into mainline anytime soon (if at all), but it's still pretty interesting in concept.
Trond Myklebust posted a set of three patches to the lkml today to introduce BSD-style user credentials into the 2.5 development kernel. The first two patches simply lay some ground work, converting all references to the 'current' structure pointer defined in '<asm/current.h>' into inline function calls, and renaming the 'ucred' structure defined in '<linux/socket.h>' to 'scm_ucred'.
The third patch begins to actually introduce BSD style credentials, introducing the 'ucred' structure which "will later be used as the basic element of user authentication at the VFS level in lieu of the current hodge-podge of partial creds in struct file and lower level filesystem code." Linus Torvalds responded to the third patch, and the resulting conversation helps to further explain the direction these patches are leading.
Christoph Hellwig posted about a patch which "includes only the core functionality of the SGI XFS filesystem for Linux 2.5.32. It does NOT include changes for Posix ACLs, dmapi, kdb or other code included in the XFS CVS tree".
Hopefully, this will ultimately pave the way for the inclusion of XFS in 2.5.
Ingo Molnar, author of the O(1) scheduler [earlier story] and the original preemptive kernel patch, has provided a patch to make the O(1) scheduler fully aware of HyperThreading. Ingo explains:
"Symmetric multithreading (hyperthreading) is an interesting new concept that IMO deserves full scheduler support. Physical CPUs can have multiple (typically 2) logical CPUs embedded, and can run multiple tasks 'in parallel' by utilizing fast hardware-based context-switching between the two register sets upon things like cache-misses or special instructions. To the OSs the logical CPUs are almost undistinguishable from physical CPUs. In fact the current scheduler treats each logical CPU as a separate physical CPU - which works but does not maximize multiprocessing performance on SMT/HT boxes."
Read on for Ingo's full explanation.
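As a rough illustration of why treating logical CPUs as independent can waste capacity, the sketch below places a task on a fully idle physical CPU before falling back to an idle sibling of a busy one. This is not Ingo's patch; the placement policy, the `pick_cpu` name and the topology layout are assumptions made for the example.

```python
from typing import Dict, List, Optional, Set

def pick_cpu(siblings: Dict[int, List[int]], busy: Set[int]) -> Optional[int]:
    """Choose a logical CPU for a new task.

    siblings maps a physical CPU id to its logical CPU ids;
    busy holds the logical CPU ids that are already running a task.
    """
    # First choice: a physical CPU whose logical CPUs are all idle.
    for _, logical in sorted(siblings.items()):
        if all(cpu not in busy for cpu in logical):
            return logical[0]
    # Fallback: any idle logical CPU, even if its sibling is busy.
    for _, logical in sorted(siblings.items()):
        for cpu in logical:
            if cpu not in busy:
                return cpu
    return None  # every logical CPU is busy

# Two physical CPUs with two logical CPUs each; logical CPU 0 is busy.
topology = {0: [0, 1], 1: [2, 3]}
print(pick_cpu(topology, {0}))  # 2
```

A scheduler that ignores the topology could just as easily pick logical CPU 1 here, leaving the second physical CPU idle while two tasks share the first.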
Mel Gorman recently announced the release of version 0.6 of VM Regress, "a regression, benchmarking and test tool for the Linux VM." Following this release, Mel posted the results of a series of four tests comparing the stock 2.4.19 VM with Rik van Riel's [earlier interview] 2.4.19-rmap14a VM [earlier story].
Regarding the results of these tests, Mel acknowledges that, "It is hard to draw solid conclusions because large gaps still exist in the data but some can be drawn." His tests indicate no significant difference between the two VMs when there is sufficient memory (no swapping). However, when memory is low and memory references occur in a linear pattern, the stock VM appears to perform better. When memory references occur in other patterns with large anonymous page use, the rmap VM appears to perform better. Future versions of this VM tool will provide much more information, such as "the page age graphs [which] are on the way and will be available in VM Regress 0.7".
Patent issues are not new for the Linux kernel. Red Hat patented Ingo Molnar's work, and the SELinux project faced patent claims from the company hired to do most of the coding. Now patent concerns are plaguing the VM. Linux Weekly News has posted a few stories, and Slashdot picked them up. However, none of these summaries reflects the breadth of debate found on the lkml. The debate centers around this question: What should OSS projects do about software patents?
I have included some of the more relevant posts (at last count google indicates that there are 109 posts in the thread, though not all of them discuss the patent issues) so that all arguments can be examined. Included emails are from Rob Landley, Daniel Phillips, Alexander Viro, Rik van Riel [earlier interview], Larry McVoy [earlier interview], Alan Cox [earlier interview] and Linus Torvalds.
Stephen Tweedie, author and maintainer of the ext3 journaled filesystem, submitted a couple of ext3 patches for inclusion into the stable 2.4 kernel. He noted that "this patch set contains the biggest recent change to ext3". Both patches have been tested "for some time now" in the ext3 CVS tree. (Stephen Tweedie originally wrote ext3 for the 2.2 kernel. It was later ported to the 2.4 kernel by Peter Braam, Andreas Dilger and Andrew Morton [earlier interview], with help from Stephen. The ext3 filesystem was merged into the 2.4 mainline kernel tree with the release of kernel 2.4.15.)
The first included patch against the ext3 filesystem helps to make it more "robust against things like dump(8) or tune2fs(8) playing with the block device on a live filesystem." The second patch in the set fixes "a race window in buffer refiling" in which bdflush could put the buffer into an unexpected state. Read on for Stephen's full explanations.
With the 2.5 development kernel recently switching from Marcin Dalecki's IDE core work to a port of the 2.4 IDE core [earlier story], many are curious to know what the plan is looking forward. Paul Bennett posted the following questions to the lkml regarding the future of IDE, "What are the goals for 2.5. What is the implementation plan? What were the problems in 2.4, and how will they be fixed in 2.5, etc?"
In an earlier thread, Linus pointed to Alan Cox [earlier interview] as the likely candidate for taking on the IDE tangle, "Right now it looks like Alan is at least for the moment willing to work on the IDE code, which is obviously great. I just wonder how long he'll stand it (he's maintained various IDE buglists etc issues for years, so we can hope)."
Alan replied to Paul's question, describing a four phase plan to begin untangling the IDE code in a sane manner, "that should allow us to keep solid stable IDE along the way." He marked 'phase one' as "mostly complete".
Marcelo released 2.4.20-pre4 today. Included in the bug fixes and driver updates was the merge of JFS. JFS is IBM's journaled filesystem from OS/2. JFS had previously been merged into the -ac tree (2.4.18pre9-ac4) and was merged into the 2.5 tree early on (2.5.6). JFS joins ext3 and reiserfs in the 2.4 tree. SGI's XFS is still awaiting inclusion into the stable tree. [Ed: the JFS IBM port was from OS/2 not from AIX. Sorry for the mistake.]
Ingo Molnar has implemented a replacement for the old sys_exit call, which was an O(N) function. The new version is an O(1) function. With the old function, the overhead grew as the number of processes and threads (N) went up. O(1) means that the overhead is constant, regardless of the number of processes. The initial post and a link to the thread follow.
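For readers unfamiliar with the notation, the following generic sketch counts the work done by an O(N) approach (scanning every task) against an O(1) approach (a direct lookup). It does not describe what sys_exit actually does or how Ingo changed it; the function names and data structures are made up for the example.

```python
from typing import Dict, List

def exit_by_scan(tasks: List[int], pid: int) -> int:
    """O(N): walk the whole task list until the exiting task is found.

    Returns the number of entries examined.
    """
    for steps, candidate in enumerate(tasks, start=1):
        if candidate == pid:
            return steps
    raise KeyError(pid)

def exit_by_lookup(task_index: Dict[int, str], pid: int) -> int:
    """O(1): remove the task through a direct lookup, with no scan.

    Returns the number of entries examined, which is always one.
    """
    del task_index[pid]
    return 1

for n in (10, 1000, 100000):
    tasks = list(range(n))
    index = {pid: "task" for pid in tasks}
    # Worst case for the scan: the exiting task is last in the list.
    print(n, exit_by_scan(tasks, n - 1), exit_by_lookup(index, n - 1))
```

The scan's cost grows in step with the number of tasks, while the lookup's cost stays at one regardless of how many tasks exist.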