"A weak coder becomes a strong coder by reading code and writing code - every day, for fun."
Linus Torvalds announced the 2.6.27-rc5 Linux Kernel, noting that his "weekly releases" tend to happen every eight days, adding, "the bulk of it is all config updates, and with arm and powerpc leading the pack." Linus continued:
"While the config updates amount to about three quarters of the diff, and if you don't use a rename-aware diff the blackfin include file movement pretty much accounts for the rest, hidden behind all those trivial (but bulky) changes are a lot of small changes that hopefully fix a number of regressions.
"The most exciting (well, for me personally - my life is apparently too boring for words) was how we had some stack overflows that totally corrupted some basic thread data structures. That's exciting because we haven't had those in a long time. The cause turned out to be a somewhat overly optimistic increase in the maximum NR_CPUS value, but it also caused some introspection about our stack usage in general. Including things like a patch to gcc to fix insane stack usage for vararg functions on x86-64. But that one would only hit anybody who was a bit too adventurous and selected the big 4096 CPU configuration. The rest of the regressions fixed are a bit more pedestrian."
"Be careful -- there are some serious dragons there in the presence of multiple threads."
"Another week, another -rc," began Linus Torvalds, announcing the 2.6.27-rc4 Linux kernel, continuing, "this time the diffstat is almost totally dominated by the addition of the musb driver that drives the MUSB and TUSB controllers integrated into omap2430 and davinci. That, together with the removal of the auerswald USB driver (replaced by libusb version) is more than half of the bulk of the patch, and obviously most users won't ever notice." Linus added:
"Apart from those bulky USB updates, there's some arch updates (blackfin and ia64), network and input driver updates, and an XFS and UBIFS update. The rest is mostly random stuff all over, probably best described by the appended shortlog. A number of regressions should be off the table, but more remain..."
"'Good enough' is never good enough ;) What is the ideal implementation? Let's implement that."
"I'd like to get a first round of review on my AXFS filesystem," began Jared Hulbert, describing his new Advanced XIP File System for Linux. XIP stands for eXecute-In-Place. The new filesystem received quite a bit of positive feedback. Jared offered the following description:
"This is a simple read only compressed filesystem like Squashfs and cramfs. AXFS is special because it also allows for execute-in-place of your applications. It is a major improvement over the cramfs XIP patches that have been floating around for ages. The biggest improvement is in the way AXFS allows for each page to be XIP or not. First, a user collects information about which pages are accessed on a compressed image for each mmap()ed region from /proc/axfs/volume0. That 'profile' is used as an input to the image builder. The resulting image has only the relevant pages uncompressed and XIP. The result is smaller memory sizes and faster launches."
"The C standard will eventually support concurrency (they are working on it), and it will almost inevitably be a horrible pile of stinking sh*t, and we'll continue to use the gcc inline asms instead, but then the gcc people will ignore our complaints when they break the compiler, and say that we should use the stinking pile-of-sh*t ones that are built in."
A recent discussion on the Linux Kernel mailing list noted that threaded 64-bit applications suffer a drastic slowdown in pthread_create performance when stack utilization goes above 4GB. Ingo Molnar offered an explanation of the problem, "unfortunately MAP_32BIT use in 64-bit apps for stacks was apparently created without foresight about what would happen in the MM when thread stacks exhaust 4GB. The problem is that MAP_32BIT is used both as a performance hack for 64-bit apps and as an ABI compat mechanism for 32-bit apps. So we cannot just start disregarding MAP_32BIT in the kernel - we'd break 32-bit compat apps and/or compat 32-bit libraries." The original report noted that once the shared stack goes above 4GB in size, thread creation can take as long as 10 milliseconds, a slowdown described as "quite unacceptable".
Ingo created a patch introducing a new MAP_STACK flag for glibc to be used instead of MAP_32BIT and avoid imposing the 32-bit performance limitation on threaded 64-bit applications. He noted, "glibc can switch to this new flag straight away - it will be ignored by the kernel." The new flag was quickly merged upstream, and changes were planned for glibc.
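As a rough illustration of what the user-space side of such a change might look like, the sketch below maps an anonymous region intended for use as a thread stack, passing MAP_STACK where MAP_32BIT would previously have been used. The helper names alloc_thread_stack and free_thread_stack are hypothetical and are not taken from glibc or from Ingo's patch, and the fallback definition of MAP_STACK as 0 is an assumption for headers that predate the flag.

```c
#define _GNU_SOURCE            /* expose MAP_ANONYMOUS and MAP_STACK in <sys/mman.h> */
#include <assert.h>
#include <stddef.h>
#include <sys/mman.h>

#ifndef MAP_STACK
#define MAP_STACK 0            /* assumption: older headers lack the flag; 0 adds no hint */
#endif

/*
 * Hypothetical helper: map `size` bytes of private anonymous memory that
 * is meant to serve as a thread stack. MAP_STACK is only a hint, and no
 * MAP_32BIT is passed, so the mapping is not confined to the low part of
 * the address space. Returns NULL on failure.
 */
void *alloc_thread_stack(size_t size)
{
    assert(size > 0);          /* mmap() rejects zero-length mappings */

    void *stack = mmap(NULL, size, PROT_READ | PROT_WRITE,
                       MAP_PRIVATE | MAP_ANONYMOUS | MAP_STACK, -1, 0);
    return stack == MAP_FAILED ? NULL : stack;
}

/* Hypothetical helper: release a stack obtained from alloc_thread_stack(). */
int free_thread_stack(void *stack, size_t size)
{
    return munmap(stack, size);
}
```

A caller could hand the returned region to pthread_attr_setstack() before calling pthread_create(); that step is left out here to keep the sketch short.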
"If web browsers, office suites and mail clients on Windows have certain kinds of vulnerabilities, it is safe to assume that the same programs on Linux will have similar problems."
"It is about time to take a step back and describe what I have been implementing," began Daniel Phillips, referring to his new Tux3 filesystem. He provided a simple ASCII diagram that detailed the filesystem's hierarchical structure, describing each of the elements. About one he noted, "the volume table is a new addition not central to the goals of Tux3, but a nice feature to have given that it comes nearly for free. One Tux3 volume can have an arbitrary number of separate filesystems tucked inside it, indexed by a simple integer parameter at mount time. People say they like this idea and it imposes no significant complexity, so it goes in." Daniel continued:
"Each volume has a metablock pointing at the forward log chain for the volume, a version table that describes the hierarchical relationship between versions (snapshots), an atime table to take care of that horrid legacy Unix feature, and an inode table containing files and attributes of files. [...] Versioning takes place in three places, versioned pointers in the atime btree, versioned extents in a file data btree and versioned attributes in the inode table. [...] Notice the absence of a journal, the functionality of which is provided by forward log elements that I described in the Hammer thread (and will eventually write a separate post about)."
"History is a one way street, and you might as well have the fs known the way it is so that people remember 'reiser oh wasn't he the guy who..' - unless you are trying to market the fs I guess."
"Things really _have_ calmed down, and hopefully we've also resolved a lot of the regressions in -rc3," began Linus Torvalds, announcing the 2.6.27-rc3 Linux kernel. He noted that much of the patch size was from the inclusion of the new ath9k wireless driver, with much of the rest due to the renaming of many arch include files in the ARM, AVR32 and m68knommu architectures. Linus continued:
"All the small changes are where the regression fixes are, and other random improvements. And they're all over. The ShortLog (appended) probably gives a taste of it."
"Security is not an absolute. Just as the terrorists win if it can induce the White House to shred the constitution and force us all to live in a constant state of fear, it is also pointless to induce people to install software that horrifically slows down their server so badly that you can't get anything done."
Mikulas Patocka announced new patches introducing snapshot merging for the Linux kernel's logical volume manager. He explained, "snapshot merging allows you to merge snapshot content back into the original device. The most useful use for this feature is the possibility to rollback [the] state of the whole computer after [a] failed package upgrade, [or an] administrator's error". The patches are for the 2.6.26 kernel, with device mapper 1.02.27 and LVM2 2.02.39.
Mikulas noted that there are three types of merges supported: --nameorigin, --namesnapshot, and --onactivate. The default merge method is --nameorigin, which merges a snapshot into the origin volume; the origin can be mounted at any time after the merge starts. The --namesnapshot method merges into a snapshot, which can then be mounted. And the --onactivate method schedules a merge to happen the next time the volume is activated, such as during a reboot. Mikulas noted, "this implementation of snapshot merging is meant to be stable, report any possible bugs to me."
"Btrfs v0.16 is available for download," began Chris Mason, announcing the latest release of his new Btrfs filesystem. He noted, "v0.16 has a shiny new disk format, and is not compatible with filesystems created by older Btrfs releases. But, it should be the fastest Btrfs yet, with a wide variety of scalability fixes and new features." Scalability and performance improvements include fine grained btree locking, pushing CPU intensive operations such as checksumming into their own background threads, improved data=ordered mode, and a new cache to reduce IO requirements when cleaning up old transactions. Other new features include support for ACLs, prevention of orphaned inodes so files won't be lost after a crash, and a more robust directory index format. Chris noted:
"There are still more disk format changes planned, but we're making every effort to get them out of the way as quickly as we can. You can see the major features we have planned on the development timeline. [...] the btrfs kernel module now weighs in at 30,000 LOC, which means we're getting very close to the size of ext."