"There are lots of things in the FS that need deep thought,and the perfect system to fully use the first 64k of a 1TB filesystem isn't quite at the top of my list right now."
"Or, we could just do the ugliest patch ever, namely
-#define pcibus_to_node(node) (-1) +#define pcibus_to_node(node) ((int)(long)(node),-1)
Wow. It's so ugly it's almost wraps around and comes out the other side and looks pretty."
Patches for a much publicized Linux kernel local root exploit were released today as 2.6.24.2, 2.6.23.16, and 2.6.22.18. The latest bug, labeled as CVE-2008-0600, was introduced by the vmsplice() system call and added into the 2.6 kernel in 2.6.17. It is the third in a series of root exploits surrounding the same system call, the two earlier bugs being CVE-2008-0009 and CVE-2008-0010. Easily obtained exploits exist for both the older CVE-2008-0010 which affected the 2.6.23 and 2.6.24 kernels, and the latest CVE-2008-0600, allowing a local non-root user to gain root permissions.
"All currently active Linux kernel versions are now released with a fix for this problem. We have released them through our normal channels, with the needed information as to what the problem is, a pointer to the CVE number, and the patch itself."
"Ok, it's a bloody large -rc (as was 24-rc1, for that matter), probably because the 2.6.24 release cycle dragged out, so people had a lot of things pending," noted Linus Torvalds, announcing the 2.6.25-rc1 kernel. He added, "the full diff is something like 11MB and 1.4M lines of diffs, with the bulk of the stuff being in architecture updates and drivers." Linus continued:
"Just to have some fun, I did trivial statistics, and of the 1.4M lines of diffs, about 38% - 530k lines - were in architecture files (400k+ lines of diffs in arch/, 100k+ lines of diffs in include/asm-*), and another big chunk is in drivers (including sound) at about 44% - 610k lines - of changes. The rest comes in much smaller, but still noticeable is networking (8% - 110k lines), with filesystems at 4%, and documentation at about 2%. The remaining crumbles being spread out mostly over block layer, crypto, kernel core, and security layer updates (ie SElinux and smack)."
Linus highlighted a few of the changes, including, "the Intel graphics driver is starting to do suspend/resume natively (ie even without X support), which is a welcome sign of the times and may help some people; lots of cleanups from the x86 merge (making more and more use of common files), but also the big page attribute stuff is in and caused a fair amount of churn, and while most of the issues should have been very obvious and all got fixed, this is definitely one of those things that we want a lot of very wide testing of to make sure nothing regressed; fair number of changes to things like the legacy IDE drivers too, and a totally new driver for the very common PCIE version of the Intel e1000 network card etc; and I've probably totally forgotten about tons of other stuff I should have mentioned, but the point is that not only do we have lots of new core, we do have a fair amout of changes to basic stuff that can actually affect perfectly bog-standard hardware setups. So give it all a good testing."
"We've gone and made it awfully easy to get code into the kernel nowadays. Perhaps too easy. I'm presently having a little campaign of watching what's going on a bit more closely, and encouraging people to make it easier for others to see what's going on, should they choose to do so."
"While this is probably one of the last days of the merge window, please still consider pulling the 'kgdb light' git tree," began Ingo Molnar, explaining:
"This is a slimmed-down and cleaned up version of KGDB that i've created out of the original patches that we submitted two weeks ago. I went over the kgdb patches with Thomas and we cut out everything that we did not like, and cleaned up the result. KGDB is still just as functional as it was before (i tested it on 32-bit and 64-bit x86) - and any desired extra capability or complexity should be added as a delta improvement, not in this initial merge."
Ingo noted that the previous merge request modified 41 files, while this new merge request modifies only 22 files. Among the changes, he highlighted, "removed _all_ critical path impact, even if KGDB is enabled and active; removed all the lowlevel serial drivers; added a redesigned and cleaned up version of the 'KGDB over polled consoles' approach; removed the longjump code; removed the module symbol hacks; removed the GTOD/clocksource hacks; removed the softlockup hacks; removed the toplevel Makefile changes; removed the might_sleep scheduler hack; and did lots of other cleanups and rewrites as well." Ingo summarized, "as a result, this kgdb series has _obviously_ zero impact on the kernel, because it just does not touch any dangerous codepath. From this point on KGDB can evolve in small, well-controlled baby steps, as all other kernel code as well. And the resulting kgdb is still very functional: it can still break into a kernel (via SysRq-G), can catch crashes, can single-step, etc. It's already a quite usable first step."
"If you listen carefully you can hear dozens of Linux kernel developers collectively holding their breath and thinking 'Maybe Linus will finally merge kgdb'. Yes, user bug reports are important. Developer efficiency is important too."
"With a lot of help from Ingo Molnar and Pekka Enberg over the last couple of weeks, we've been able to produce a new version of kmemcheck!" announced Vegard Nossum, adding, "the current version of the patch boots on real hardware, but we've seen freezes on some machines, so it's not perfect yet. (In other words, this patch is HIGHLY experimental, and run at your own risk, etc.)". He also offered a high level summary of the patch:
"kmemcheck is a patch to the linux kernel that detects use of uninitialized memory. It does this by trapping every read and write to memory that was allocated dynamically (e.g. using kmalloc()). If a memory address is read that has not previously been written to, a message is printed to the kernel log."
Ingo Molnar credited the new patch with already finding 4 kernel bugs, and offered some more insights into how the patch works, and why it's useful, "it should also be made clear that not only does kmemcheck consume half of the RAM to do byte granular tracking of the other half of RAM, it's also slow, very slow, because almost every kernel-space instruction will generate a pagefault and then it will be single-stepped and it takes a debug fault as well. That's of course totally crazy, but that's also OK and it's what makes the feature so interesting and powerful."
"It's ascii art I took it from someone's signature 12 years ago, it's meant to be the guy on the cover of some of the editions of the Hitchhikers Guide to the Galaxy by Douglas Adams. Don't Panic! :-)"
"You can play games in user space, but you're fooling yourself if you think you can do as well as doing it in the kernel. And you're *definitely* fooling yourself if you think mmap() solves performance issues. 'Zero-copy' does not equate to 'fast'. Memory speeds may be slower than core CPU speeds, but not infinitely so!"
"I wasn't planning on releasing v0.12 yet, and it was supposed to have some initial support for multiple devices. But, I have made a number of performance fixes and small bug fixes, and I wanted to get them out there before the (destabilizing) work on multiple-devices took over," explained Chris Mason regarding the 0.12 release of his new btrfs filesytem. Btrfs was first announced in June of 2007, as an alpha-quality filesystem offering checksumming of all files and metadata, extent based file storage, efficient packing of small files, dynamic inode allocation, writable snapshots, object level mirroring and striping, and fast offline filesystem checks, among other features. The project's website explains, "Linux has a wealth of filesystems to choose from, but we are facing a number of challenges with scaling to the large storage subsystems that are becoming common in today's data centers. Filesystems need to scale in their ability to address and manage large storage, and also in their ability to detect, repair and tolerate errors in the data stored on disk." Regarding the latest release, Chris offered:
"So, here's v0.12. It comes with a shiny new disk format (sorry), but the gain is dramatically better random writes to existing files. In testing here, the random write phase of tiobench went from 1MB/s to 30MB/s. The fix was to change the way back references for file extents were hashed."
"Patches like this scare the pants off me."
It was recently pointed out that most of the x86 architecture patches had been merged into the mainline 2.6.25 kernel, except for the kgdb patches. Linus Torvalds replied, "I won't even consider pulling it unless it's offered as a separate tree, not mixed up with other things. At that point I can give a look." He continued:
"That said, I explained to Ingo why I'm not particularly interested in it. I don't think that 'developer-centric' debugging is really even remotely our problem, and that I'm personally a lot more interested in infrastructure that helps normal users give better bug-reports. And kgdb isn't even _remotely_ it.
"So I'd merge a patch that puts oops information (or the whole console printout) in the Intel management stuff in a heartbeat. That code is likely much grottier than any kgdb thing will ever be (Intel really screwed up the interface and made it some insane XML thing), but it's also fundamentally more important - if it means that normal users can give oops reports after they happened in X (or, these days, probably more commonly during suspend/resume) and the machine just died."
"[The] lkml is the right mailing list for reporting Linux bugs. This is an extremely harmful trend I've seen lately: some kernel hackers going out on a limb directing the flow of bugreports _away_ from lkml, by suggesting to testers that lkml is somehow inappropriate for reporting Linux kernel bugs."