The Linux Kernel Archives are perhaps most familiar through their web interface, http://kernel.org/. The latest release of the Linux kernel is easily found here, along with patches by various Linux kernel hackers, and mirrors of other popular free and open source projects. Countless people worldwide happily rely on this archive without giving much thought to the effort behind it.
In a recent announcement to the Linux Kernel Mailing List, H. Peter Anvin detailed an upgrade of the infrastructure behind kernel.org. The new servers were donated by Hewlett-Packard, and are each quad Opterons with 24 gigabytes of RAM and 10 terabytes of disk space. Internet Systems Consortium, Inc. donates the bandwidth in the form of two independent gigabit-connected datacenters, PAIX Palo Alto and e200paul in San Francisco. H. Peter Anvin, Nathan Laredo, and Kees Cook all donate time to maintain the archives. KernelTrap recently spoke with Peter Anvin to learn more about the history behind the Linux Kernel Archives.
BitKeeper was first utilized by a Linux project in December of 1999, when it was employed by the Linux PowerPC project. Then in February of 2002, Linux creator Linus Torvalds decided that BitKeeper was "the best tool for the job" and started using it to manage the mainline kernel, an event that received much attention in the free and open source communities [story], and beyond. BitMover, the company behind BitKeeper, was founded by its current CEO, Larry McVoy [interview], who originally conceived of BitKeeper as a tool to keep Linus from getting burnt out by the growing task of managing the Linux Kernel. Since Linus began using the tool three years ago, the pace of Linux kernel development has doubled [story].
There are two definitions for the word "free" that are commonly used to describe software. The first is "Free as in Freedom", and the other is "Free as in Free Beer". BitKeeper was made available freely under the latter definition, allowing free and open source software developers to use the tool without having to pay any money. It was provided under the agreement that anyone actively using the free tool would not develop a competing product at the same time. In other words, the aim was to provide a tool that could be freely used, but not freely cloned. At the same time, a more advanced version of BitKeeper has been sold commercially, and both products remain the intellectual property of BitMover.
A vocal group has long protested Linus' use of BitKeeper, considering Linux the free and open source flagship product. GNU Project founder Richard Stallman [interview] is among the protestors, harshly criticizing Linus' decision to use a non-free (as in freedom) tool [story]. However, most acknowledge that no free tool currently exists that is as powerful as BitKeeper, offering the ability to perform truly distributed development. Attempts to reverse engineer some of BitKeeper's features have led to repeated cautions by BitMover. Most recently, two such reverse engineering attempts have contributed to BitMover's decision to end the development and availability of the free BitKeeper product.
As RAM increasingly becomes a commodity, prices drop and computer users are able to buy more. 32-bit architectures face certain limitations with regard to accessing these growing amounts of RAM. To better understand the problem and the various solutions, we begin with an overview of Linux memory management. With an understanding of how basic memory management works, we are better able to define the problem, and finally to review the various solutions.
This article was written by examining the Linux 2.6 kernel source code for the x86 architecture types.
This HowTo was written for users who don't know how to install NVidia and/or VMWare modules with the 2.6 Linux kernel [forum].
If you're currently running the 2.4 Linux kernel [forum] and are interested in upgrading to the new 2.6 Linux kernel then you might want to read KernelTrap's 'How To Upgrade To The 2.6 Kernel' [story].
When I wrote this guide I was using the 2.6.0-test11-wli-2 kernel [story] [howto].
This HOWTO comes with no guarantees; use at your own risk.
Note: I'm not fond of binary-only drivers myself, but some Linux users are forced to use binary drivers (myself included). This HowTo is here for users that are forced to use NVidia or VMWare, so please _don't_ start a flamewar about this.
William Lee Irwin III [interview], from here on referred to simply as 'wli', has been maintaining a patchset against the 2.5 development kernel for some time. His announcement for 2.6.0-test11-wli-2 [story] caught my attention, so I decided to give it a try. Scroll down to the end of this article for a step-by-step guide walking you through how to apply the -wli patchset and compile your new kernel.
Curious to know more about wli's efforts, I dropped him an email with a few questions. His in-depth replies, included within, are quite insightful and informative. He explains the history behind this patchset, provides an overview of some of the improvements it contains, evaluates its stability, and talks a little about where he's going with it. Regarding the patchset, he explains, "one of the primary goals is to improve performance", adding, "there is a secondary goal of improving resource scalability and another of improving resource accounting."
On a cautionary note, some drivers and possibly some filesystems may have problems with a reduced kernel stack, so the 4K_STACK configuration option may be best left disabled, though read wli's comments within to determine if this affects you. Additionally, wli explains that the -wli patchset is incompatible with smbfs and ncpfs due to the removal of d_validate(), another change explained within. Finally, wli warns against using his patchset with binary-only graphics drivers, commenting that they seem "utterly unable to cope with the changes I've made". None of these warnings applied to my personal desktop server, which booted the -wli-2 kernel without problems. I'm happily testing it now as I write this article.
In a couple of earlier articles, we walked through the process of upgrading to the 2.6.0-test4 kernel [story], and then using a small patch to upgrade to the 2.6.0-test5 kernel [story]. Today we'll continue our patching efforts to upgrade to an even faster feeling and more stable kernel with Andrew Morton's [interview] -mm patchset [forum].
Andrew Morton began releasing his -mm kernel patches a little over a year ago, in the summer of 2002. The -mm tree began as a 90k patch against the 2.5.17 development kernel, merging in the remote kernel debugger, kgdb. By the release of 2.5.18, the -mm patchset had grown to nearly 238k, merging in a wide assortment of fixes and new functionality. As of this writing, the current -mm patchset is 2.6.0-test5-mm3, weighing in at nearly 5 megabytes. Andrew's -mm tree has evolved from a testing ground for numerous new technologies, to a comprehensive patchset that is usually more stable than the mainline 2.6.0-test kernel itself. This bodes well for the future of the 2.6 kernel, as Andrew Morton will soon be the official 2.6 kernel maintainer.
There are numerous reasons you may want to try Andrew's -mm kernel tree. Stability alone is a good incentive, and scanning the lengthy changelog you'll find a significant number of bug fixes that have been applied. I asked Andrew how the stability of his kernel compares to that of the mainline 2.6.0-test kernel, and he replied that though new bugs occasionally creep in, the -mm tree is generally more stable and up-to-date because it carries the latest fixes.
Linux creator Linus Torvalds has released the Linux 2.6.0-test5 kernel, with the following comments:
"Lots of small stuff, as usual. I think the biggest "core" change is the Futex changes by Jamie and Hugh, and the dev_t preparations by Al Viro. But there are ARM and ppc updates here too, and a few drivers have bigger fixes (tg3 driver and the USB gadget interface stand out on diffstat). Watchdog driver updates etc. And Russell King fixed more PCMCIA issues."
Read on for the full changelog.
Additionally, if you followed my recent upgrade howto [story], are running a 2.6.0-test kernel, and are interested in upgrading to 2.6.0-test5, read on for a few simple tips on upgrading with incremental patches.
Anyone who's been following Linux kernel development for the past several months has heard about one exciting feature after another being merged into the still unreleased 2.6 kernel. New features that noticeably affect user experience include Robert Love's [interview] preemptible kernel work [story], Ingo Molnar's [interview] O(1) Scheduler [story], Rik van Riel's [interview] reverse mapping VM [story], Nick Piggin's [interview] Anticipatory I/O scheduler [story], and much, much more...
Having some spare time a few nights ago, I decided to give the latest kernel, 2.6.0-test4, a trial run on my aging 550MHz PIII desktop computer, and the result was nothing short of spectacular. As the final 2.6.0 release approaches, it is important that an increasing number of users (aka testers) give this kernel a try, especially while it's still a sexy task for developers to track down kernel bugs and stabilize their work. Once work starts on the 2.7 development tree, inevitably much talent will again be focusing on new features.
The purpose of this document is to provide some helpful tips to readers who currently compile their own 2.4 kernels, but haven't yet made the leap to 2.6. This is still a development kernel, so you may run into problems, but overall stability and performance are quite impressive and I can't recommend enough that you try it today.
William Lee Irwin III [interview] recently announced on the lkml that he'd successfully gotten Linux running on a 64GB x86 server. His posts included two different boot message logs, one without his page clustering patch, and one with. In the latter case, his patch overcomes the 1GB mem_map virtual space limitation imposed by x86 32-bit servers, without which the kernel over-runs allowable memory space.
Bill's current efforts are based upon Hugh Dickins' earlier page clustering patches for the 2.4.x kernel. Hugh's efforts were actually focused on allowing larger filesystem block sizes, prompting Bill to say, "The fact it resolves the horror of mem_map overrunning kernel virtualspace on i386 PAE is really an obscure coincidence." His patch is still a work in progress, but with time will offer a number of additional benefits beyond the support of 64GB x86 servers. For example, utilizing the entire software page in fault handlers results in prefaulting benefits, and increasing the physical contiguity of data results in I/O throughput benefits. However, at this time "until it is done it will have severe performance problems on small memory machines (say, less than 16GB)."
I approached Bill, asking questions to better understand what he was working toward. He replied with a wealth of information, including several ASCII diagrams and lengthy explanations. To summarize, he offered:
"Without pgcl, 64GB is a doorstop, because in /proc/meminfo LowTotal: was a mere 176MB and so incapable of supporting any significant loads. With pgcl, 64GB functions quite nicely, because LowTotal is 750MB and has room for all the kernel bloat that should be there (but things that shouldn't still need to be fixed)."
For the complete details, read on.
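The mem_map pressure Bill describes can be illustrated with some back-of-the-envelope arithmetic. The sketch below is not taken from his patch: it assumes 4KB hardware pages, one descriptor per page, a hypothetical 40-byte descriptor, and a hypothetical 32KB clustered software page. The helper name is made up for illustration.

```python
GIB = 1024 ** 3
MIB = 1024 ** 2
KIB = 1024

def mem_map_bytes(ram_bytes, page_bytes, descriptor_bytes):
    """Size of an array holding one descriptor per page of RAM."""
    pages = ram_bytes // page_bytes
    return pages * descriptor_bytes

RAM = 64 * GIB
DESCRIPTOR = 40  # assumption: bytes per page descriptor, chosen for illustration

plain = mem_map_bytes(RAM, 4 * KIB, DESCRIPTOR)       # one descriptor per 4KB page
clustered = mem_map_bytes(RAM, 32 * KIB, DESCRIPTOR)  # assumed 32KB software page

print(f"4KB pages:  {plain // MIB} MB of descriptors")      # 640 MB
print(f"32KB pages: {clustered // MIB} MB of descriptors")  # 80 MB
```

Under these assumptions the descriptor array alone eats most of a 1GB kernel virtual space with 4KB pages, while an eightfold larger software page shrinks it eightfold. The real figures depend on the actual descriptor size and clustering factor.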
Rusty Russell's new module loader was recently merged into Linus' 2.5 kernel tree [story]. This new implementation aims to clean up and reduce the amount of code in the kernel and user space required to load a kernel module. Additionally, it removes the requirement that the kernel and user space code for modutils be kept in sync.
Robert Love [interview] recently backported the jiffies_to_clock_t() code from the 2.5 development kernel to the 2.4 stable kernel. This patch allows one to adjust the frequency of the timer interrupt, defined in the standard 2.4 kernel with HZ=100. In 2.5 this has been increased to HZ=1000.
I wrote Robert asking if he could explain the usefulness of his patch, and he replied in kind with a lengthy and very interesting email detailing what the patch is, how it works, and why it's useful. He explains, "The timer interrupt is at the heart of the system. Everything lives and dies based on it. Its period is basically the granularity of the system: timers hit on 10ms intervals, timeslices come due at 10ms intervals, etc."
Read on to learn what effect changing this value will have on your Linux server, and to see the actual patch...
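The relationship between HZ and timer granularity is simple arithmetic, sketched below in Python. This is a model of the idea, not the kernel's C code: the function names mirror the ones mentioned above only for readability, and USER_HZ = 100 (the tick rate reported to user space) is an assumption made for the example.

```python
USER_HZ = 100  # assumption: tick rate that user space expects to see

def tick_period_ms(hz):
    """Milliseconds between timer interrupts at a given HZ."""
    return 1000 / hz

def jiffies_to_clock_t(jiffies, hz, user_hz=USER_HZ):
    """Scale kernel ticks down to user-visible ticks.

    Assumes hz is an exact multiple of user_hz.
    """
    return jiffies // (hz // user_hz)

print(tick_period_ms(100))    # 10.0 ms granularity at HZ=100
print(tick_period_ms(1000))   # 1.0 ms granularity at HZ=1000

# Five seconds of uptime looks the same to user space at either rate.
print(jiffies_to_clock_t(500, hz=100))    # 500
print(jiffies_to_clock_t(5000, hz=1000))  # 500
```

The conversion step is what lets the kernel tick faster internally without changing the values user space programs see.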
Recently, a lot of work has gone into the 2.5 development kernel to facilitate better debugging. Starting with the 2.5.39 kernel, an infrastructure is in place for tracking down a wide range of atomicity/sleep bugs.
For example, a task in the kernel cannot sleep if it is atomic (by definition). By atomic we mean a number of things: the task holds a spinlock, holds the BKL, or has explicitly disabled preemption. Further, interrupt handlers are not schedulable, so they too are atomic. In other words, atomic code is code that is not willing to be scheduled.
Roman Zippel recently released version 0.4 of his new configuration system for the 2.5 Linux kernel, a replacement for the aging CML currently in use. He has attempted to make the transition to this new system much simpler than was the case with CML2 [earlier story], in the hope of having it merged into the 2.5 kernel.
Curiosity got the best of me, so I downloaded the latest version as well as the 2.5.33 kernel tree, and gave it a try. I ran into a couple of compilation errors, so initiated an email exchange with Roman. Beyond helping me troubleshoot these problems, he was also kind enough to answer some general questions about his efforts with the new configuration system. Read on for the full details, then try it out yourself and provide Roman with feedback.
Robert Love [earlier interview] released version 0.0.9 of his scheduler utilities package tonight. The schedutils README explains:
"These are the Linux scheduler utilities - schedutils for short. These programs take advantage of the scheduler family of syscalls that Linux implements across various kernels. These system calls implement interfaces for scheduler-related parameters such as CPU affinity and real-time attributes. The standard UNIX utilities do not provide support for these interfaces -- thus this package."
Having learned about these utilities, I decided to give them a try myself. I downloaded the source tarball from Robert's home page (an RPM is also available) and extracted the source. Compilation and installation from source was nothing more than the familiar './configure && make && su -c "make install"'. Very quickly I was reading the man pages and playing with these useful utilities.
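To get a feel for the kind of scheduler parameters the README mentions, here is a minimal Python sketch that queries and narrows CPU affinity and reads the scheduling policy for the current process. It is not the schedutils code; it relies on the Linux-only `os.sched_*` functions in Python's standard library, and the two helper names are hypothetical.

```python
import os

POLICY_NAMES = {
    os.SCHED_OTHER: "SCHED_OTHER",
    os.SCHED_FIFO: "SCHED_FIFO",
    os.SCHED_RR: "SCHED_RR",
}

def describe_scheduling(pid=0):
    """Return (sorted list of allowed CPUs, policy name) for a process.

    pid 0 means the calling process.
    """
    cpus = sorted(os.sched_getaffinity(pid))
    policy = os.sched_getscheduler(pid)
    return cpus, POLICY_NAMES.get(policy, f"policy {policy}")

def pin_to_first_allowed_cpu(pid=0):
    """Restrict a process to the lowest-numbered CPU it may already use."""
    allowed = os.sched_getaffinity(pid)
    os.sched_setaffinity(pid, {min(allowed)})
    return os.sched_getaffinity(pid)

if __name__ == "__main__":
    original = os.sched_getaffinity(0)
    print("before:", describe_scheduling())
    print("pinned to:", pin_to_first_allowed_cpu())
    os.sched_setaffinity(0, original)  # restore the original mask
```

Changing real-time attributes generally requires elevated privileges, so the sketch only reads the policy rather than setting it.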