The latest upload [1] of the Debian [2] hurd [3] package features a patch [4] by Ognyan Kulev that adds support for ext2 partitions larger than 2 GB on 32-bit systems. Over the past few years, this limit had become an annoying issue of the GNU/Hurd system, so this change represents an important milestone for the Debian GNU/Hurd port [5] with respect to user expectations. Although the patch has not yet been integrated by the upstream GNU Hurd maintainers, the Debian package maintainers consider it (after thorough testing) stable enough to warrant its inclusion in the Debian hurd package.
The original implementation of the ext2 file system translator simply mmap'ed the whole file system into its address space, which limited support to file systems no larger than approximately 2 GB on 32-bit architectures. At that time (the early 90s), this limit was not a practical issue, and the benefit of a much simpler implementation strongly outweighed this shortcoming.
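To illustrate why the whole-image mapping was so convenient, here is a minimal Python sketch (not the translator's actual C code). It maps a small toy image in one piece and reads the ext2 magic number simply by indexing the mapping with the on-disk offset. The helper names and the temporary-file image are assumptions made for the example; only the superblock position (byte 1024), the magic field position (byte 56 within the superblock) and the magic value 0xEF53 come from the ext2 on-disk format.

```python
import mmap
import struct
import tempfile

SUPERBLOCK_OFFSET = 1024   # ext2 superblock starts at byte 1024
MAGIC_FIELD_OFFSET = 56    # s_magic lies 56 bytes into the superblock
EXT2_MAGIC = 0xEF53


def make_toy_image(size=8192):
    """Create a zero-filled temporary file carrying only the ext2 magic (hypothetical helper)."""
    f = tempfile.TemporaryFile()
    f.write(b"\0" * size)
    f.seek(SUPERBLOCK_OFFSET + MAGIC_FIELD_OFFSET)
    f.write(struct.pack("<H", EXT2_MAGIC))
    f.flush()
    return f


def read_u16(image, partition_offset):
    """With the whole image mapped, the partition offset is directly the index into the map."""
    return struct.unpack_from("<H", image, partition_offset)[0]


def read_magic_via_whole_map(f):
    """Map the entire image at once and read the magic without any lookup table."""
    with mmap.mmap(f.fileno(), 0, access=mmap.ACCESS_READ) as image:
        return read_u16(image, SUPERBLOCK_OFFSET + MAGIC_FIELD_OFFSET)
```

The simplicity comes at the price described above: the entire image has to fit into the address space of the process, which is what fails for large partitions on 32-bit machines.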
There have been several discussions on the re-design of the ext2fs translator among the Hurd developers over the years. The earlier ones were not held on publicly archived lists, but summaries and reposts of the proposals by Thomas Bushnell, BSG and Roland McGrath can be found here [6] and here [7]. Later discussions [8] included a proposal [9] by Neal Walfield and subsequently a concrete implementation proposal [10] hashed out at a Hurd developer meeting in Karlsruhe in early 2003. In April 2003, Ognyan Kulev posted [11] the first alpha version of his patch. He released several more alpha versions after input [12] from Neal Walfield over the summer and then a first beta in late October. During that time, Ognyan also published [13] an extensive write-up of the design and implementation details of his patch, based on an early version. The general approach has been summarized by Neal as follows:
"By mapping the entire file system into memory, ext2fs was able to trivially calculate where metadata was located by simply adding the offset of the data on the partition to the base address of the memory map. In order to allow ext2fs to scale to larger partition sizes (i.e. those in which the partition could not be memory mapped in its entirety), a level of indirection has been introduced in the form of two hashes which map partition offsets to memory addresses and vice versa. When a block of metadata is accessed, the partition offset is looked up in the hash. If a corresponding address is not found, a mapping is created. In this way, the virtual address space is better used. This additional level of indirection is not necessary on architectures where an address space is really sparse as the original implementation is sufficient and marginally faster."
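The indirection Neal describes can be modelled roughly as follows. This is only a toy Python sketch, not Ognyan's implementation: the class name, the fixed-size window of slots and the "evict the oldest mapping" policy are assumptions made for illustration, and addresses are plain integers rather than real memory mappings. The two dictionaries stand in for the two hashes.

```python
class BlockMapCache:
    """Toy model: a small window of address slots shared by a large partition."""

    def __init__(self, base_address, block_size, slots):
        self.base_address = base_address
        self.block_size = block_size
        self.free_slots = list(range(slots))  # window slots not yet in use
        self.offset_to_addr = {}              # first hash: partition offset -> address
        self.addr_to_offset = {}              # second hash: address -> partition offset

    def address_for(self, offset):
        """Look up the block in the hash; create a mapping if none is found."""
        within = offset % self.block_size
        block_offset = offset - within
        addr = self.offset_to_addr.get(block_offset)
        if addr is None:
            addr = self._map_block(block_offset)
        return addr + within

    def offset_for(self, address):
        """Reverse lookup: which partition offset does this address refer to?"""
        within = (address - self.base_address) % self.block_size
        return self.addr_to_offset[address - within] + within

    def _map_block(self, block_offset):
        if self.free_slots:
            slot = self.free_slots.pop(0)
        else:
            # Assumed policy: reuse the slot of the oldest mapping
            # (dicts preserve insertion order).
            old_offset, old_addr = next(iter(self.offset_to_addr.items()))
            del self.offset_to_addr[old_offset]
            del self.addr_to_offset[old_addr]
            slot = (old_addr - self.base_address) // self.block_size
        addr = self.base_address + slot * self.block_size
        self.offset_to_addr[block_offset] = addr
        self.addr_to_offset[addr] = block_offset
        return addr
```

With a window of only two 4096-byte slots, `BlockMapCache(0x10000000, 4096, 2).address_for(5 * 2**30 + 10)` still yields a valid address inside the window, even though the offset lies far beyond what a 32-bit address space could map in one piece.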
After the beta release of the patch, things slowed down a bit until Neal took up Hurd development again last summer and began to thoroughly review [14] the interface changes in libpager and to compare them with Thomas' [15] and Roland's [16] proposals, which resulted in further discussion. Ognyan changed his implementation slightly according to Neal's advice and later fixed some more bugs that surfaced when people started to stress-test the code.
Although the details of the current implementation have not yet been approved (or even reviewed) by the Hurd maintainers (Thomas, Roland and Marcus Brinkmann), and the interface changes to libpager are therefore not final, the Debian GNU/Hurd porters have decided that the practical gain of supporting large partitions outweighs the risk of diverging from upstream with respect to the well-contained libpager interface.
Ognyan has also worked on an ext3fs translator as the subject of his master's thesis [17] (in Bulgarian) and already released [18] the first version some time ago, so further improvements can be expected in the future.