"I don't think anything of what was discussed in this thread would be in scope for 2.6.24 (unless Linus wants to let the bunny that brings eggs release 2.6.24)."
"[The] text below is mostly for the benefit of newbies - it's more along the lines of 'how to get from [a] bug report to the source of [the] bug', with more details than normal," began Al Viro, offering a full review of another Linux kernel oops in an effort to educate more people on how this is done. Al's walk through included a patch to fix the bug that caused the oops. He noted:
"This might be worth doing on [a] more or less regular basis, especially if more people join the fun; everyone [has] their own set of tricks in [this] area and making it easier to gather might help a lot of people. It's not just about oops-tracing per se, of course - Arjan's site gives a nice collection of those, so that makes an obvious starting point."
"I've decided to change the copyright to have the same set of rules as the GNU copyleft - I got some mail asking about it, and I agree."
"This patch speeds up e2fsck on Ext3 significantly using a technique called Metaclustering," stated Abhishek Rai. In an earlier thread he quantified this claim, "this patch will help reduce full fsck time for ext3. I've seen 50-65% reduction in fsck time when using this patch on a near-full file system. With some fsck optimizations, this figure becomes 80%." Most criticism so far has been in regards to formatting issues with the patch preventing it from being easily tested, resolved in the latest postings. It was also cautioned that the patch affects a significant amount of ext3 code, and thus will require very heavy testing. Abhishek described how the patch offers its significant gains for e2fsck:
"Metaclustering refers to storing indirect blocks in clusters on a per-group basis instead of spreading them out along with the data blocks. This makes e2fsck faster since it can now read and verify all indirect blocks without much seeks. However, done naively it can affect IO performance, so we have built in some optimizations to prevent that from happening. Finally, the benefit in fsck performance is noticeable only when indirect block reads are the bottleneck which is not always the case, but quite frequently is, in the case of moderate to large disks with lot of data on them. However, when indirect block reads are not the bottleneck, e2fsck is generally quite fast anyway to warrant any performance improvements."
"Many things are possible, in the NASA sense of 'with enough thrust, anything will fly'. Whether or not it is *useful* and *worthwhile* are of course different questions!"
"A small word of warning: linux looks like a unix, but I implemented it from scratch, and with very little literature on how things 'should' be done."
"Sorry to sound a bit harsh, but sometimes it doesn't hurt to think a bit outside your own sandbox."
"This week, a total of 49 oopses and warnings have been reported, compared to 53 reports in the previous week," Arjan van de Ven noted, sending out a list of the week's top 10 kernel oopses. Al Viro suggested, "FWIW, people moaning about the lack of entry-level kernel work would do well by decoding those to the level of 'this place in this function, called from <here>, with so-and-so variable being <this>' and posting the results." This was met by multiple requests for documentation on how to actually decode an oops. Linus Torvalds explained:
"It's actually not necessarily at all that trivial, unless you have a deep understanding of the code generated for the architecture in question (and even then, some oopses take more time to figure out than others, thanks to inlining and tailcalls etc). If the oops happened with a kernel you generated yourself, it's usually rather easy. Especially if you said 'y' to the 'generate debugging info' question at configuration time."
Linus went on to detail how to debug a random oops reported on the lkml: "you will generally have to disassemble the hex sequence given in the oops (the 'Code:' line), and try to match it up against the source code to try to figure out what is going on." He then offered a number of tips on how this is best accomplished, continuing with an example that walked through one of the reported oopses. Al Viro replied by describing his own methods for accomplishing the same thing, walking through another oops and isolating a bug.
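As a first step toward the disassembly Linus describes, the hex sequence has to be turned back into raw bytes. The following is a minimal sketch, assuming the x86-style format in which the 'Code:' line lists two-digit hex bytes and marks the byte at the faulting instruction pointer with angle brackets. The function name and the sample line are made up for illustration and are not taken from any of the reported oopses.

```python
def parse_code_line(line):
    """Return (code_bytes, fault_offset) from an x86-style oops 'Code:' line.

    fault_offset is the index of the byte wrapped in <...>, or None if no
    byte is marked. Assumes every token is one two-digit hex byte.
    """
    _, sep, payload = line.partition("Code:")
    if not sep:
        raise ValueError("not a 'Code:' line")

    code = bytearray()
    fault_offset = None
    for token in payload.split():
        if token.startswith("<") and token.endswith(">"):
            fault_offset = len(code)  # position of the marked byte
            token = token[1:-1]
        code.append(int(token, 16))   # raises ValueError on a non-byte token
    return bytes(code), fault_offset


if __name__ == "__main__":
    # Hypothetical sample line, not from a real report.
    sample = "Code: 8b 45 f0 <8b> 00 c3"
    code, offset = parse_code_line(sample)
    print(code.hex(" "), "fault at byte", offset)
```

The resulting bytes can then be written to a file and handed to a disassembler, with the marked offset showing which instruction to match against the source. The kernel tree also carries a `scripts/decodecode` helper intended for this job.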
"I must say that the number of bugs which actually go away when the user stops using nvidia/fglrx/ndiswrapper/etc is a small minority."
"It's been two weeks since rc6, but let's face it, with xmas and new years (and birthdays) in between, there hasn't actually been a lot of working days, and the incremental patch from -rc6 is about half the size of the one from rc5->rc6," began Linus Torvalds, announcing the release of the 2.6.24-rc7 Linux kernel. He then quipped, "and I'll be charitable and claim it's because it's all stabilizing, and not because we've all been in a drunken stupor over the holidays." Linus quickly summarized the changes:
"The shortlog (appended below) is short and fairly informative. It's all really just a lot of rather small changes. The diffstat shows a lot of one- and two-liners, with just a few drivers (and the Cell platform) getting a bit more attention, and the SLUB support of /proc/slabinfo showing up as a blip."
"Current Linux versions can enter suspend-to-RAM just fine, but only can do it on explicit request. But suspend-to-RAM is important, eating something like 10% of [the] power needed [compared to an] idle system. Starting suspend manually is not too convenient," began Pavel Machek, describing an idea he referred to as Sleepy Linux. He continued, "[starting suspend manually] is not an option on multiuser machines, and even on single user machines some things are not easy: 1) Download this big chunk in Mozilla, then go to sleep; 2) Compile this, then go to sleep; 3) You can sleep now, but wake me up in 8:30 with mp3 player". Pavel provided a simple not-fully-functional patch, then described his proposed solution:
"Today's hardware is mostly capable of doing better: with correctly set up wakeups, machine can sleep and successfully pretend it is not sleeping -- by waking up whenever something interesting happens. Of course, it is easier on machines not connected to the network, and on notebook computers."
"Perhaps it will also help with whatever effort I find time to make towards convincing Andrew that [TuxOnIce] really does have significant advantages over [u]swsusp and kexec based hibernation."
"What I'm arguing (very strongly) against is this attitude of 'we don't know what's wrong, but we'll leave it broken because we can't be bothered to figure it out'."
"New year, new kernel: Linux 2.4.36 is finally ready and has been checked long enough to be released. Quite a bunch of bugs, build errors and security issues have been fixed since 2.4.35, but all of those fixes were merged into 2.4.35-stable," 2.4 maintainer Willy Tarreau stated, announcing the latest 2.4 stable Linux kernel. He noted, "I should say that I'm quite satisfied of this dual-branch release model which proves to be very successful at separating quick fixes from changes which require more thorough testing." Willy went on to add:
"Concerning future versions, I have nothing pending in the queue anymore. I will then go on with 2.4.36.X when bug fixes come in, and only open 2.4.37 when I get something which I do not consider suitable for 2.4.36.X."
The previous 2.4.35 stable kernel was released in July of 2007. Source level changes can be viewed through the linux-2.4 gitweb interface.
"Repeatedly posting crud does not make it right."