"This is a bugfixed version of 2.6.26-rc5-mm2, which was a bugfixed version of 2.6.26-rc5-mm1. None of the git trees were repulled for -mm3 (and nor were they repulled for -mm2). The aim here is to get all the stupid bugs out of the way so that some serious MM testing can be performed. Please perform some serious MM testing."
"We are working [on] a new I/O scheduler based on CFQ, aiming at improved predictability and fairness of the service, while maintaining the high throughput it already provides," began Fabio Checconi, announcing the BFQ I/O scheduler. "The Budget Fair Queueing (BFQ) scheduler turns the CFQ Round-Robin scheduling policy of time slices into a fair queuing scheduling of sector budgets," he continued, "more precisely, each task is assigned a budget measured in number of sectors instead of amount of time, and budgets are scheduled using a slightly modified version of WF2Q+. The budget assigned to each task varies over time as a function of its behaviour. However, one can set the maximum value of the budget that BFQ can assign to any task." Fabio went on to explain:
"The time-based allocation of the disk service in CFQ, while having the desirable effect of implicitly charging each application for the seek time it incurs, suffers from unfairness problems also towards processes making the best possible use of the disk bandwidth. In fact, even if the same time slice is assigned to two processes, they may get a different throughput each, as a function of the positions on the disk of their requests. On the contrary, BFQ can provide strong guarantees on bandwidth distribution because the assigned budgets are measured in number of sectors. Moreover, due to its Round Robin policy, CFQ is characterized by an O(N) worst-case delay (jitter) in request completion time, where N is the number of tasks competing for the disk. On the contrary, given the accurate service distribution of the internal WF2Q+ scheduler, BFQ exhibits O(1) delay."
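The contrast Fabio describes between time slices and sector budgets can be illustrated with a small sketch. This is only a toy model, not the BFQ code: the task names, the `sectors_per_ms` rates and both helper functions are hypothetical, and it does not implement WF2Q+ at all. It only shows why equal time slices can yield unequal throughput while equal sector budgets do not.

```python
from dataclasses import dataclass


@dataclass
class Task:
    name: str
    # Hypothetical effective transfer rate. A task whose requests are
    # scattered across the disk spends more time seeking, so it moves
    # fewer sectors per millisecond of disk time.
    sectors_per_ms: float


def time_slice_round(tasks, slice_ms):
    """CFQ-like round: every task receives the same amount of disk time.

    Returns the number of sectors each task managed to transfer.
    """
    return {t.name: t.sectors_per_ms * slice_ms for t in tasks}


def sector_budget_round(tasks, budget_sectors):
    """BFQ-like round: every task is served until it has used its sector budget.

    Returns (sectors served per task, disk time consumed per task in ms).
    """
    served = {t.name: budget_sectors for t in tasks}
    time_used = {t.name: budget_sectors / t.sectors_per_ms for t in tasks}
    return served, time_used


tasks = [Task("sequential", 100.0), Task("seeky", 10.0)]

# Same time slice, very different throughput.
by_time = time_slice_round(tasks, slice_ms=10)

# Same sector budget, same amount of service; the seeky task
# simply occupies the disk for longer.
by_budget, disk_time = sector_budget_round(tasks, budget_sectors=500)
```

With these made-up rates, a 10 ms slice gives the sequential task ten times as many sectors as the seeky one, whereas a 500-sector budget gives both the same service and only the time spent on the disk differs.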
Jens Axboe reacted favorably, "Fabio, I've merged the scheduler for some testing. Overall the code looks great, you've done a good job!" He noted that the scheduler should soon appear in the -mm tree, and that it was worth considering merging the two I/O schedulers together.
"I'm pleased to announce [the] 7'th and final release of the distributed storage subsystem (DST)," Evgeniy Polyakov stated, completing the TODO list on the project's web page. He titled the release, "squizzed black-out of the dancing back-aching hippo", noting, "it clearly shows my condition". New features in this release include checksum support, extended auto-configuration for detecting and auto-enabling checksums if supported by the remote host, new sysfs files for marking a given node as clean (in-sync) or dirty (not-in-sync), and numerous bug fixes.
Evgeniy released the first version of his distributed storage subsystem in July of 2007. In September he explained that this was the first step in a larger distributed filesystem project he's planning. In late October, Andrew Morton noted that the work looked ready to be merged into his -mm kernel.
Andrew Morton responded favorably to Evgeniy Polyakov's most recent release of his distributed storage subsystem, "I went back and re-read last month's discussion and I'm not seeing any reason why we shouldn't start thinking about merging this." He then asked, "how close is it to that stage? A peek at your development blog indicates that things are still changing at a moderate rate?" Evgeniy replied:
"I completed storage layer development itself, the only remaining todo item is to implement [a] new redundancy algorithm, but I did not see major demand on that, so it will stay for now with low priority. I will use DST as a transport layer for [a] distributed filesystem, and probably that will require additional features, I have no clean design so far, but right now I have nothing in the pipe to commit to DST."
Andrew Morton posted his first -mm patchset against the recently released 2.6.23 kernel, preparing for a big merge of patches bound for inclusion in the upcoming 2.6.24 kernel. He noted:
"I've been largely avoiding applying anything since rc8-mm2 in an attempt to stabilise things for the 2.6.23 merge.
"But that didn't stop all the subsystem maintainers from going nuts, with the usual accuracy. We're up to a 37MB diff now, but it seems to be working a bit better."
With the official release of the 2.6.23 kernel expected any day now, Andrew Morton posted his -mm merge plans for the 2.6.24 kernel. The current Linux kernel development model is to open up the mainline kernel for significant merges during the two weeks following a major kernel release. Thus, during the two weeks following the imminent release of the 2.6.23 kernel, subsystem maintainers will push their latest trees to Linus' mainline tree. Andrew Morton will also push many of the patches he collects in his -mm tree to Linus' mainline tree during these two weeks, as detailed in his email. At the end of the merge window, 2.6.24-rc1 will be released and the stabilization process begins, though in reality significant merges also often slip in between -rc1 and -rc2. A series of -rc kernels will be released, eventually leading to a stable 2.6.24 kernel two or three months after the process started, and it all starts again.
"This feature allows a read-only view into a read-write filesystem. In the process of doing that, it also provides infrastructure for keeping track of the number of writers to any given mount," Dave Hansen began, describing his "read-only bind mounts" patches. He continued, "this has a number of uses. It allows chroots to have parts of filesystems writable. It will be useful for containers in the future because users may have root inside a container, but should not be allowed to write to some filesystems. This also replaces patches that vserver has had out of the tree for several years. It allows security enhancements by making sure that parts of your filesystem [are] read-only (such as when you don't trust your FTP server), when you don't want to have entire new filesystems mounted, or when you want atime selectively updated."
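The two ideas in Dave's description, a read-only view onto a writable filesystem and a per-mount count of writers, can be sketched in a few lines. This is a conceptual model only, not the kernel implementation: the `Mount` class and its method names are hypothetical, the filesystem is just a dictionary, and the rule that a mount refuses to turn read-only while writers are active is an assumption about how such a count would be used.

```python
class Mount:
    """One view (mount point) onto a shared filesystem, modeled as a dict."""

    def __init__(self, filesystem, read_only=False):
        self.filesystem = filesystem  # shared by every mount of this filesystem
        self.read_only = read_only    # per-mount flag, not per-filesystem
        self.writers = 0              # writers currently active through this mount

    def want_write(self):
        # Register a writer, or refuse if this particular view is read-only.
        if self.read_only:
            raise PermissionError("this mount is read-only")
        self.writers += 1

    def drop_write(self):
        # Unregister a writer previously registered with want_write().
        if self.writers == 0:
            raise RuntimeError("drop_write without matching want_write")
        self.writers -= 1

    def make_read_only(self):
        # Assumed policy: only switch to read-only when no writer is active.
        if self.writers > 0:
            return False
        self.read_only = True
        return True

    def write(self, path, data):
        self.want_write()
        try:
            self.filesystem[path] = data
        finally:
            self.drop_write()

    def read(self, path):
        return self.filesystem[path]


fs = {}
rw_view = Mount(fs)                  # ordinary read-write mount
ro_view = Mount(fs, read_only=True)  # read-only view of the same filesystem

rw_view.write("/etc/motd", "hello")
contents = ro_view.read("/etc/motd")  # data written via rw_view is visible here
```

Writes through `ro_view` raise `PermissionError` even though the underlying filesystem remains writable through `rw_view`, which is the "read-only view into a read-write filesystem" the announcement describes.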
Christoph Hellwig was interested in seeing the patches get some more testing, "I still think we really want this in -mm. As we've seen at the kernel summit there's a pretty desperate need for it." Andrew Morton noted that the "unprivileged mounts" code was working in the same area, but described that work as "a bit stuck." He suggested, "it sounds like a better approach would be for me to merge the r/o bind mounts code and to drop (or maybe rework) the unprivileged mounts patches." Dave explained that they don't collide much, to which Andrew's reply suggested that the read-only mount patches would be merged into the -mm kernel soon.
"Recently, the CE Linux forum has been working to revive the Linux-tiny project," stated Tim Bird on the Linux Kernel mailing list, adding that Michael Opdenacker has been selected as the project's new primary maintainer. The project's website explains:
"The linux-tiny patchset is a series of patches against the 2.6 mainline Linux kernel to reduce its memory and disk footprint, as well as to add features to aid working on small systems. Target users are developers of embedded system and users of small or legacy machines such as 386s and handhelds."
Andrew Morton suggested that patches should be sent to him to be merged into his -mm tree, aiming for inclusion in the mainline kernel, "seriously, putting this stuff into some private patch collection should be a complete last resort - you should only do this with patches which you (and the rest of us) agree have no hope of ever getting into mainline." Michael, the project's new maintainer, agreed, "you're completely right... The patches should all aim at being included into mainline or die." Tim added, "the patchkit gives a place for things to live while they are out of mainline, and still have multiple people use and work on them. Optimally the duration of being out-of-mainline would be short, but my experience is that sometimes what an embedded developer considers reasonable to hack off the kernel is not considered so reasonable by other developers (even with config options)."
Following Andrew Morton's recent comment, "this just isn't working any more," Miles Lane asked, "what can be done to reduce the huge number of build fixes required to release an MM tree?" Andrew jokingly replied, "my mind turns to cattle prods." Regarding the suggestion that he could publicly list the offenders he quipped, "I could name names, but it would look like 'grep @ MAINTAINERS' ;))" He continued to say, "I don't think much can be done about it, really," going on to explain:
"See, what I do is to merge probably hundreds of patches into the -mm-only part of the tree and then, after a few days, get down and compile-test it all, then fix it, then runtime test it all, then fix that. Because it is vastly more efficient to do all this work against hundreds of patches than it is to do it against one patch at a time, no?
"And guess what? All the other maintainers do the same thing: someone sends you a patch, it looks good, so you commit it. After you've committed a decent batch of patches, get in there and test it all. Problem is, I often will get in there and do all that testing before the subsystem-tree owner has done his testing."
A frustrated-sounding Andrew Morton released the 2.6.23-rc6-mm1 kernel as "a 29MB diff against 2.6.23-rc6." Many patches are merged first into Andrew's -mm tree for testing before being pushed to Linus' mainline tree during the merge window. Andrew suggested that the -mm process wasn't working as well as it could:
"It took me over two solid days to get this lot compiling and booting on a few boxes. This required around ninety fixup patches and patch droppings. There are several bugs in here which I know of (details below) and presumably many more which I don't know of. I have to say that this just isn't working any more."
"The cfs core has been enhanced since quite sometime now to understand task-groups and [to] provide fairness to such task-groups," began Srivatsa Vaddagiri, "what was needed was an interface for the administrator to define task-groups and specify group 'importance' in terms of its cpu share. The patch below adds such an interface."
Srivatsa requested that his patch be merged into Andrew Morton's -mm tree to receive more testing, "note that the load balancer needs more work, esp to deal with cases like 2-groups on 4-cpus, one group has 3 tasks and other having 4 tasks. We are working on some ideas, but nothing to share in the form of a patch yet. I felt sending this patch out would help folks start testing the feature and also improve it."
In a series of 5 patches, Jesper Juhl proposed moving 4K stacks from a debug feature to a non-debug feature, and enabling it by default in the -mm tree. He referred back to a lengthy earlier discussion in which he had proposed making 4K stacks the default in the mainline kernel, then added:
"Based on the comments in that thread I conclude that 4KSTACKS are not really considered a debug-only feature any longer, but the time is not right (yet) to make them the default - and it's certainly not yet the time to get rid of 8K stacks.
"In that thread I promised to provide some patches that would lift 4KSTACKS out of debug-only feature status, which is what the first two patches in this series do. I also said I would provide a patch to make 4KSTACKS 'default y' to get more testing, but restrict that patch to -mm - that's the fifth patch in this series. Patches 3 & 4 in this series move the config option out of the Kernel hacking menu and into Processor types and features."
"Is anyone testing the kgdb code in here?" Andrew Morton asked in his release announcement for the 2.6.23-rc1-mm2 patchset. Mike Frysinger asked, "does kgdb actually have a chance to get merged? With the history of it, i just assumed it was never going in". In the past, Linus Torvalds has resisted merging kernel debuggers and famously said, "I don't like debuggers. Never have, probably never will," going on to explain why he didn't want it to be too easy to hack the Linux kernel. An earlier push to get kgdb merged in 2004 didn't succeed, though some architectures already have versions of the debugger. The current kgdb patchset in Andrew's tree includes code for the i386, x86_64, ppc, mips, sh and arm architectures.
Andrew replied to Mike's question, "I was hoping for a 2.6.24 merge. But I haven't actually looked at it yet. Hopefully Jason is planning to get it all out for review soonish." He went on to add, "runtime testing isn't actually the most important thing at this time - if it doesn't work, well hey, we fix it, easy - we always have bugs. The main emphasis right now should be on higher-level design/review/integration stuff." Jason Wessel noted, "the KGDB tree is broken up into incremental units each layer adding more functionality and or arch specific pieces."
Following the release of the 2.6.22 kernel, Andrew Morton posted a list of a wide range of patches that are in his -mm kernel, summarizing for each his plans as to whether or not they will be pushed upstream for inclusion in the upcoming 2.6.23 kernel. Comments included simply noting "merge" or "hold", as well as "these appear to need some work," "don't know, need to ping suitable developers over this work," and "sent to maintainer." Perhaps most entertaining was Andrew's response to the vmscan-give-referenced-active-and-unmapped-pages-a-second-trip-around-the-lru.patch, "this is scary. Will sit and admire it until it has been demonstrated to be a net gain." It is possible to track which patches are actually merged using the gitweb interface to Linus' kernel tree.
Following the release of the 2.6.21 kernel, Andrew Morton posted a list of patches in his -mm kernel, summarizing for each his plans as to whether or not they will be pushed upstream for inclusion in the upcoming 2.6.22 kernel. He noted, "the overall stability in recent -mm's was not sufficiently high and we ran out of time to find all the bugs. I shouldn't have merged all those patches last week - they contained an exceptional amount of garbage. This all means that more bugs than usual will probably leak into mainline, and we'll have to fix them there." He went on to add, "I've been ducking most non-bugfix patches recently. I have ~200 feature and cleanup patches queued for later consideration, so people who sent those will be hearing from me eventually."