Interview: Neal Walfield

Submitted by Jeremy
on November 12, 2001 - 11:24am

This week KernelTrap spoke with Neal Walfield of the GNU/Hurd development team. From their project FAQ, "'Hurd', as an acronym, stands for `Hird of Unix-Replacing Daemons'. Hird, in turn, stands for `Hurd of Interfaces Representing Depth'."


Jeremy Andrews:
How did you get started working on the GNU Hurd?

Neal Walfield:
Before I began working on the Hurd, I was a user. The new ideas
seemed quite powerful and I had to know more. So, like any hacker, I
dove in. On the path to enlightenment, I found bugs and missing
features and began to submit patches.

JA:
When did you first start using the Hurd? How much has changed in
that time?

Neal Walfield:
I first started getting involved with the GNU/Hurd about two years
ago. Since then, the GNU/Hurd has had some important stabilizations and
feature completions and also undergone some major work from a user's
perspective: the installation is easier, most day-to-day packages
exist, etc. Plus the community is much larger today than when I
started: we have a lot of new potential waiting to join the ranks of
the developers. The major barrier to their entry is understanding
the system: unlike Unix, Hurd concepts are not being taught in school
or covered by any books.

JA:
What is the Hurd?

Neal Walfield:
The Hurd is a set of servers that provide similar interfaces to those
found in traditional Unix-like kernels. The servers, each designed to
do one task or manage one aspect of the system, run in user space
thereby isolating them from both the kernel and each other. This
offers more power and flexibility to both the administrator and the
user and, in doing so, increases system security.

When Unix was created more than thirty years ago, certain compromises
were made which, given the resources available at the time, made sense. Time
passed and both Unix and computers evolved. However, the initial
compromises, which required rearchitecting central parts of the system
to fix, became design flaws. The Hurd is one reaction to these
defects.

The central concept of the Hurd is that the user is empowered yet
isolated from the system. This does not, and cannot, exist in Unix:
there is just too much core functionality that lives in the kernel.
Why is this bad? Well, it means that parts of the system a user
could take advantage of become off limits.

One example of this is mounting file systems. Users have come to
accept that only root can mount a given file system even if the user
has access to the underlying data. There is not, however, any reason
that a regular user should not be able to mount a file system anywhere
he has permission to create a directory. The reason that this is not
permitted in Unix is that the file system code lives in the kernel.
Thus, if the file system is rigged correctly, it becomes possible for
the user to elevate his privileges or even crash the system. In
the Hurd, this is not possible: a file system is managed by a normal
user space process that runs as the user that set it up. Therefore,
even if a user's file system crashes, it appears to the system as
nothing more than any other program receiving a SIGSEGV, i.e. no
reason for the system to panic.
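
As a rough illustration of that isolation (generic POSIX behavior shown in Python, not Hurd code), a crashing user space server looks to its parent like any other process killed by a signal. The child here is only a stand-in for a user's file system server and deliberately sends itself SIGSEGV:

```python
import signal
import subprocess
import sys

# Child process standing in for a user's file system server: it sends itself SIGSEGV.
crash = "import os, signal; os.kill(os.getpid(), signal.SIGSEGV)"
result = subprocess.run([sys.executable, "-c", crash])

# The parent (standing in for the rest of the system) just sees a dead process.
crashed_with_sigsegv = result.returncode == -signal.SIGSEGV
print("server died from SIGSEGV:", crashed_with_sigsegv, "- parent still running")
```

The parent keeps running and can decide what to do next, for example restart the server.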

When we refer to file systems, we are not just referring to data that
lives on raw media, but also file systems that can be accessed over a
network and anything else that can be imagined. Thus, if a user is
able to connect to an FTP server using an FTP client, they should be
able to just place the entire hierarchy directly within the file
system and work with the files directly. In fact, tools that do this
already exist: notably, bash can access `/dev/tcp/HOST/PORT' and Gnome
has its own virtual file system. However, to make this a global
solution, it would have to be reimplemented by every application
from word processors down to cat. Or, we could just implement it once
in the right place: directly within the system itself.

By giving users more power, we are also able to increase system
security. In order to clarify what I am driving at, I will attempt to
offer a small concrete example using one of the Hurd servers, the
password server, which listens on the `/servers/password' node. It is
the password server's job to hand out authentication tokens to clients
who are able to successfully identify themselves.

When a process obtains a send right to the `/servers/password' node
(by, for instance, calling open), it can send messages to the
underlying server. One of the RPCs, remote procedure calls, that it
can send is `password_check_user'. The protocol requires that the
caller supply a user id and a password. The server, upon deeming
the password correct, returns an authentication token to the caller.
Using this token, the client is able to elevate his privileges with
daemons that respect the token.

What this means is that, for instance, an FTP server can be
implemented to start with no user ids (i.e. no authentication tokens
and, therefore, no privileges) and, when a client attempts to
authenticate, it can pass the actual work off to the password server.
Beyond making the coding of the daemon easier, a huge security
advantage has been gained: the daemon raises, not lowers, its
privileges within the system when the client has successfully
authenticated itself. Although the result is the same, attacks, such
as buffer overflows, are rendered benign: even if the attacker is able
to successfully crack the FTP daemon during the authentication phase,
he enters the system with no privileges -- not root privileges as he
would with a traditional FTP daemon which drops its suid root
privileges after the authentication phase.
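
The raise-instead-of-lower flow described above can be modeled in a few lines of Python. This is only a toy sketch: apart from the `password_check_user' name taken from the discussion, every class, method and value here is hypothetical, and the real servers exchange Mach RPCs rather than Python calls.

```python
import secrets


class PasswordServer:
    """Toy stand-in for the server on /servers/password (not the real RPC interface)."""

    def __init__(self, passwords):
        self._passwords = dict(passwords)   # uid -> password (plain text only for the demo)
        self._issued = {}                   # token -> uid

    def password_check_user(self, uid, password):
        """Return a fresh token if the password matches, else None."""
        if self._passwords.get(uid) != password:
            return None
        token = secrets.token_hex(8)
        self._issued[token] = uid
        return token

    def uid_for(self, token):
        """Let other servers ask which uid a token stands for (None if unknown)."""
        return self._issued.get(token)


class FtpDaemon:
    """Starts with no ids at all and only gains one after a successful login."""

    def __init__(self, password_server):
        self._password_server = password_server
        self.uids = set()                   # empty set: no privileges

    def login(self, uid, password):
        token = self._password_server.password_check_user(uid, password)
        if token is None:
            return False                    # still unprivileged
        self.uids.add(self._password_server.uid_for(token))
        return True


server = PasswordServer({1000: "s3cret"})
daemon = FtpDaemon(server)
failed = daemon.login(1000, "wrong")
uids_after_failure = set(daemon.uids)
succeeded = daemon.login(1000, "s3cret")
```

In this model, a daemon compromised before `login` succeeds holds an empty id set, which is the point being made: there is nothing privileged to steal during the authentication phase.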

Additionally, none of the pieces of the system are static: each can be
replaced or supplemented by both the system administrator and the
users. A powerful implication is that a user can create his
own authentication daemon. Although other processes will not trust
the authentication tokens issued by that server, the new authentication
server can be used as a proxy to subdivide the resources controlled by
a given user and provide an isolated subsystem.

For instance, a web application, such as email, may put all of its
mailbox data in an ext2 file system in a file (i.e. a loop back device
in Linux terminology). The file would be owned by, for instance,
`application-data'. An ext2fs server could be set up to use that file
and the new authentication server. Additionally, when clients log in
to check their email, they would be
issued tokens from the private authentication server and not the
system authentication server. If an attacker is able to compromise
the application, he will only be given an authentication token that is
valuable within the email application and useless outside of that
subsystem.
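
A small Python model may make the scoping clearer. It assumes, purely for illustration, that each file server trusts exactly one authentication server; the class names, the "webmail" label and the user "alice" are all hypothetical.

```python
class AuthServer:
    """Toy token issuer; `name` is only a label for the demo."""

    def __init__(self, name):
        self.name = name
        self._tokens = {}
        self._next = 0

    def issue(self, user):
        self._next += 1
        token = (self.name, self._next)
        self._tokens[token] = user
        return token

    def lookup(self, token):
        return self._tokens.get(token)


class FileServer:
    """Honors only tokens that its own trusted auth server recognizes."""

    def __init__(self, trusted_auth):
        self._auth = trusted_auth

    def read(self, token, path):
        user = self._auth.lookup(token)
        if user is None:
            raise PermissionError("token not recognized by this server's auth server")
        return f"{path} read as {user}"


system_auth = AuthServer("system")
mail_auth = AuthServer("webmail")        # private server run by `application-data'

system_fs = FileServer(system_auth)
mailbox_fs = FileServer(mail_auth)       # e.g. the ext2fs instance over the image file

token = mail_auth.issue("alice")
inside = mailbox_fs.read(token, "/inbox")
try:
    system_fs.read(token, "/etc/passwd")
    outside_allowed = True
except PermissionError:
    outside_allowed = False
```

The token works against the mailbox file system and is rejected everywhere else, so a stolen token is confined to the subsystem that issued it.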

We also have the ability to boot a Subhurd: a new instantiation of
the Hurd that runs in parallel with the original Hurd. They are
almost completely isolated from each other with the exception of some
device sharing. This is roughly equivalent to User-mode Linux; however,
it has existed in the Hurd since the beginning: the Hurd just lends
itself to this idea.

JA:
Do the servers have to run on the same system? Or can they be arranged in a
sort-of multi-system cluster?

Neal Walfield:
The servers do not have to run on the same system, at least
theoretically. I have a few ideas about how I think network
transparent IPC could work but, as yet, no code. This is a topic that I
would like to explore more in the future.

JA:
To isolate file systems on a user basis sounds very powerful and convenient.
Does it currently work as described?

Neal Walfield:
Everything that I described above is, minus bugs, how it actually
works.

JA:
When you talk about placing entire directory hierarchies within a file system
and using the files directly, how does this differ from NFS?

Neal Walfield:
Functionally, NFS offers the same mechanism, however, the important
idea to take away from this discussion is that the Hurd has a
completely different policy than traditional Unix.

In Unix, only root can mount file systems -- including NFS. It is
true that there exist certain NFS automounters, however, these are
only a bandaid for a much deeper wound: the user is limited to a
single type of file system. What about ftpfs or smbfs? And should we
stop with network file systems? For instance, in the Hurd, any user
can use what Linux refers to as a loop back device. And, I feel
obliged to add shadowfs: a file system which takes multiple
directories and merges them according to a given set of rules. Should
there be an automounter for each of these types of file systems? As
far as I can see, it makes much more sense to just fix the problem at
its root so that normal users can securely modify the virtual file system,
the VFS.

Another advantage of Hurd policy is that all file systems are
developed in user space as normal programs. Thus, the servers have
full access to the C library or whatever language the implementor
chooses (including Perl, Python, Scheme and even Bourne shell). The
developer can use a real debugger and does not have to worry about
system crashes: there are no more kernel panics, only SIGSEGVs. And,
when it comes to distributing the server, this can be done completely
independently of the `kernel' like a normal program. A Debian administrator
may one day say:

        # apt-get install ftpfs

No module to insert, no reboot required.

JA:
The Hurd authentication server system is interesting, but how heavily has this
been audited and actually tested for security? How is this testing
done?

Neal Walfield:
All of the developers have taken a look at it at one time or another,
however, if that counts as an audit, I am not sure. I would not be
surprised to learn that bugs remain.

JA:
You mention the ext2 file system, the Linux standard. What is the standard Hurd
file system?

Neal Walfield:
The Hurd supports both the ext2 file system and the fast file system,
i.e. BSD's UFS; there is no special file system for the Hurd. In
order to support some of the Hurd's new features such as passive
translators and an additional set of permission bits for the unknown
user (i.e. users with no ids), we have exercised the ability to add
operating-system-specific extensions to the file systems and yet remain
completely compatible. The file system's owner (as set when running
e.g. mke2fs) determines which features are available.

JA:
What are passive translators?

Neal Walfield:
When a server or translator, as it is also called, is running, it is
considered active, that is, it is listening on a port in the
virtual file system and ready to service clients. This is generally
referred to as a mounted file system in Unix terminology.

In Unix, when the system starts up, the boot scripts will generally
mount a certain number of file systems as determined by the contents
of the `/etc/fstab' file. Using the same method in the Hurd would be
against our philosophy: `/etc/fstab' is a central resource modifiable
only by the super user.

In order to overcome this, the idea of the passive translator was
born. Much like a symbolic link, a passive translator is stored
inside of the inode. It provides a specification string on how to
start an instance of the active translator. As normal, when a path is
being looked up, file systems will route requests to the appropriate
children. Now, however, the server will also check to see if a given
path component has a passive translator associated with it before
attempting to continue the resolution. If so, it will try to start an
active translator and reroute the request to it.

For instance, `/home' might contain a passive translator with the
following specification `/hurd/ext2fs /dev/hd0s3'. This means that
the `/hurd/ext2fs' program should be started and given `/dev/hd0s3'
as its argument list. There is a small protocol that is used to attach
the new server to the VFS, however, that is beyond the scope of this
discussion.

Now, say that you log in to the system: one of the first things that
your shell will do is load its startup files, e.g. `/home/jeremy/.profile'.
This is done by contacting the root file system and asking it to
return a port to `/home/jeremy/.profile'. In our scenario, the root
file system will get as far as `/home' before it sees that another
translator manages the VFS past this point. If there is not already
an active translator listening on `/home', the root file
system will start one and then return to the user a port to the new
server and a message indicating that it needs to ask it about the rest
of the path, i.e. `jeremy/.profile'. The client will send a
second message, this time to the new server and then ask it to try to
resolve the rest of the path. If all goes well, it will return a port
to the indicated file.
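
The lookup just described can be sketched as a toy Python model. Nothing here is the real Hurd protocol: `Translator`, `lookup`, `resolve` and the "retry" reply are hypothetical names, and ports are replaced by plain object references. Only the `/home' example and its specification string come from the discussion above.

```python
class Translator:
    """Toy file system server: a flat dict of entries plus optional passive translators."""

    def __init__(self, name, files=None, passive=None):
        self.name = name
        self.files = dict(files or {})       # relative path -> contents
        self.passive = dict(passive or {})   # path component -> command line string
        self.active = {}                     # path component -> running Translator

    def lookup(self, path, start_translator):
        """Return ("file", contents) or ("retry", server, remaining_path)."""
        first, _, rest = path.partition("/")
        if first in self.passive:
            if first not in self.active:     # start the active translator on demand
                self.active[first] = start_translator(self.passive[first])
            return ("retry", self.active[first], rest)
        return ("file", self.files[path])


started = []


def start_translator(command_line):
    """Pretend to run the command stored in the passive translator record."""
    started.append(command_line)
    return Translator("home", files={"jeremy/.profile": "export EDITOR=emacs\n"})


def resolve(root, path):
    """Client side: keep asking until some server returns the file itself."""
    server, remaining = root, path.lstrip("/")
    while True:
        reply = server.lookup(remaining, start_translator)
        if reply[0] == "file":
            return reply[1]
        _, server, remaining = reply


root = Translator("root", passive={"home": "/hurd/ext2fs /dev/hd0s3"})
first = resolve(root, "/home/jeremy/.profile")
second = resolve(root, "/home/jeremy/.profile")
```

The second lookup finds the translator already active, so the stored command is run only once.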

JA:
Does the Hurd have a root user with superuser privileges?

Neal Walfield:
The user with id zero is generally considered to have superuser
privileges and all of the system servers recognize this. However,
users' servers are not required to respect this.

JA:
I'm afraid I don't understand. If I have a server that is running the GNU/Hurd,
and I am the root user, can another user with an account on my machine then make
a client server that I can not control, kill, or otherwise affect? Even though
I am the superuser?

Neal Walfield:
A server may choose to deny access to root, however, the superuser has
absolute control over all of the system servers. As such, he can send
signals to any process that he wants or reclaim any and all resources
as he sees fit.

JA:
Can you explain more about 'the unknown user'? How would a user not have a user
id?

Neal Walfield:
The authentication model in the Hurd is based on authentication
tokens: either you have them, or you do not. This is quite different
from the Unix model where each process is running with a single discrete
identity.

You might think of the tokens as identity cards. Which cards you
need depends on where you are going and what services you require.
A user may have access to many different identity tokens, a
single token, or none at all. In the last case, the user is
considered an unidentified foreigner and will be granted very little
access to the system.
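
A loose Python model of the identity-card idea follows. It assumes a deliberately simplified rule (groups are ignored, and the function name `allowed` and its parameters are hypothetical), so it should be read as a sketch of the concept rather than the Hurd's actual permission check.

```python
def allowed(process_uids, owner_uid, owner_ok, other_ok, unknown_ok):
    """Toy access check for a process holding a set of user ids (groups ignored)."""
    if not process_uids:
        return unknown_ok        # no identity cards at all: the unknown user
    if owner_uid in process_uids:
        return owner_ok          # any one of the held ids may match the owner
    return other_ok


file_owner = 1000
checks = {
    "no ids": allowed(set(), file_owner, True, True, False),
    "one id": allowed({1001}, file_owner, True, False, False),
    "several ids": allowed({1001, 1000}, file_owner, True, False, False),
}
```

The contrast with the single-identity Unix model is the set: a process may hold zero, one or many ids at once.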

JA:
How many servers is the Hurd currently comprised of? Can you list some of the
major ones, and describe them a little?

Neal Walfield:
The Hurd is composed of several core servers, specifically, the exec,
proc and auth servers. The exec server is in charge of setting up new
Hurd processes, the proc server manages processes (pids, process groups --
all of the POSIX details plus a few extensions) and the auth server
implements the trust protocol.

Other important servers include the physical file systems such as
ext2fs, UFS and NFS. These also play a central role in the
construction of the VFS.

However, no one has to use any of these servers -- a user can write
his own and choose to ignore what the system offers.

JA:
How usable is the Hurd in its current version?

Neal Walfield:
There has not been an official release of the Hurd since 1997. Most
of the developers are concentrating on finishing the current feature
set and working out important bugs.

With respect to usability, the Hurd works quite well as a desktop
system, however, I would not yet recommend it to anyone as a server.
That said, approximately half of the Debian Woody archive has been
compiled for the Hurd. This includes most development tools and
noteworthy programs such as XFree86.

More programs have not been compiled for two main reasons. First,
many Free Software programmers introduce small
Linux-isms. Happily, most of these can be fixed within a few hours,
however, contacting the upstream author and being sure that the patch
is properly integrated can take much longer. When this number is
multiplied by the number of packages in Debian, the second reason
becomes quite clear: a lack of manpower.

When I say Linux-ism, I refer to the phenomenon where developers
think that GNU/Linux is POSIX. This is just not true. GNU/Linux is
an implementation of POSIX. The GNU/Hurd is yet another. How is this
possible if POSIX is an industry standard? POSIX gives system developers
many choices and has places where behavior is undefined. Clearly,
undefined behavior is defined in a real implementation (even if it
means a SIGSEGV), however, it does not have to be consistent across
implementations. One example is that POSIX allows the setting of
PATH_MAX, which, if defined, is the system limit for the length of the
entire file name. Note the `if defined.' In Linux, this is defined.
In the Hurd, it is not. However, many developers believe that since
it is defined in GNU/Linux, the GNU/Hurd must be wrong. No, this is a choice
that POSIX offers operating system developers. Those interested in
writing portable applications must respect all possible
choices.
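
One portable habit is to ask the system at run time instead of assuming a compile-time constant. The sketch below uses Python's `os.pathconf` on a POSIX system; `path_limit` is a hypothetical helper name, and treating every failure as "no fixed limit" is an assumption made to keep the example short.

```python
import os


def path_limit(directory="/"):
    """Return the path length limit under `directory`, or None if there is no fixed limit."""
    try:
        limit = os.pathconf(directory, "PC_PATH_MAX")
    except (AttributeError, OSError, ValueError):
        # pathconf is missing, failed, or does not know this name on this platform.
        return None
    # POSIX pathconf reports -1 when the limit is indeterminate.
    return limit if limit >= 0 else None


limit = path_limit("/")
if limit is None:
    print("no fixed limit: size buffers dynamically")
else:
    print("limit is", limit, "bytes")
```

Code written this way behaves sensibly whether or not the system defines a limit.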

JA:
How big is the team of people currently working on the Hurd?

Neal Walfield:
There are currently about five people who work actively on the Hurd
proper. As far as porting is concerned, there are about fifteen
developers who participate regularly. Many of the Debian developers
have started to help by porting their own packages.

JA:
What are some of the outstanding "important bugs" that still need to
be fixed?

Neal Walfield:
These are mostly stability issues or features that we claim to support
but only support halfway.

JA:
In trying to understand how usable the Hurd currently is, I'd be curious to know
what are some of the features claimed to be supported, but really only supported
halfway?

Neal Walfield:
The GNU/Hurd, as a desktop system, is quite usable, albeit a bit slow.
In terms of stability, there are not many major crashers. Which is to
say, an uptime of over a week is quite possible. That said, I would
not recommend using the GNU/Hurd on a server. At least not yet.

With respect to the misreported features, we claim to support
setrlimit and family, however, we do not report accurate statistics
and cannot actually set all of the defined limits. Another example is
locking. We have an implementation of BSD flock which is used to
emulate parts of POSIX locking, however, it does not provide a full
compatibility layer.
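
To show why one style of locking cannot simply stand in for the other, here is a small Python sketch using the standard `fcntl` module on a POSIX system. It only demonstrates the two interfaces side by side on a temporary file; it says nothing about how the Hurd implements either of them.

```python
import fcntl
import tempfile

with tempfile.TemporaryFile() as f:
    f.write(b"0123456789")
    f.flush()

    # BSD style: one lock covering the whole file.
    fcntl.flock(f, fcntl.LOCK_EX)
    fcntl.flock(f, fcntl.LOCK_UN)

    # POSIX style: lock only the first four bytes, leaving the rest to others.
    fcntl.lockf(f, fcntl.LOCK_EX, 4, 0)
    fcntl.lockf(f, fcntl.LOCK_UN, 4, 0)
    locked_ok = True
```

A whole-file lock has no way to express "only these bytes", which is the kind of gap an emulation layer has to work around.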

JA:
Is there a target date for the next official release?

Neal Walfield:
Not that I am aware of.

JA:
It would seem to me that having another official release would generate more
interest and potentially increase your user base. What needs to happen before
we'll see another official release?

Neal Walfield:
I am sure that an official release would generate a lot more interest
in the GNU/Hurd, however, I do not know if we need that type of
interest at the moment. The developers are already spread quite thin
and having to play technical support (which is what you promise when
you do a release) would be quite taxing. Additionally, we can only
ask users to give us so many chances. If we release today and they see the
current limitations, a year later, they may not be so willing to try
again.

As for what is required before another release, I am not the
maintainer, however, some important items that need to be done
eventually include: integrating pthread support; rewriting libdiskfs
to allow larger partitions; and using OSKit-Mach, an implementation of
Mach based on the University of Utah's OSKit, which would provide a new
driver framework. There are also stability issues that need to
be addressed and the VM subsystem needs some work.

JA:
What window managers and desktop environments are available?

Neal Walfield:
All of the basic window managers are available: blackbox, fvwm, twm
and Window Maker. The more advanced Gnome desktop and KDE desktop
environments do not yet work under the Hurd due to a lack of pthreads.

The C library actually predates the pthreads standard; we use a
package called cthreads. Getting a working implementation of pthreads
should not be difficult. However, getting a fully conforming
implementation that works on both the Hurd and Linux (which is the
goal of the glibc developers), requires a bit of restructuring and
careful designing.

Once we have a working pthreads implementation, a lot more of the
archive will be compilable.

JA:
What processors are supported by the Hurd?

Neal Walfield:
Currently, only ia32 is officially supported. In the past, there was
a port to the mips architecture, however, it is no longer maintained
and has likely suffered from bit rot. Looking to the future, work is
currently underway to port the Hurd to the PowerPC platform by Peter
Bruin. We consider the ease of this port, which was done by a single
person in his spare time over the past few months without _any_ help
from the core developers, an affirmation of
many of the design decisions that were made in the Hurd.

JA:
Does the Hurd support multiple processor systems?

Neal Walfield:
The Hurd itself is aggressively multi-threaded and all of the locking
has been done with an eye towards multi-processor systems. That said,
we have not yet used a microkernel that stably supports multiple
cpus.

JA:
Is the Hurd kernel based on earlier UNIX kernels, or the Linux
kernel?

Neal Walfield:
As I already alluded to, the Hurd is not a kernel; it is a set of
servers. We do, however, use two main parts of the Linux kernel: In GNU
Mach, some glue code has been written to allow the use of Linux Device
drivers. We have also used Linux's implementation of TCP/IP in the
pfinet server.

JA:
My background is in UNIX. I understand that the Hurd is not in and of
itself a kernel, but it must have a kernel and therefore kernel space processes?
What then does the kernel itself do, and what processes remain in kernel
space?

Neal Walfield:
Unix's design is what is referred to as a monolithic kernel. In this
architecture, most of the system's functionality lives in the kernel
proper. The limitations that the Hurd tries to address are
essentially the limitations imposed on the user in this architecture.

The GNU/Hurd is a multiserver, microkernel-based system. What this means is that a
microkernel, in our case, GNU Mach, provides the basic mechanisms --
virtual memory, device drivers, a minimal task framework and very
general interprocess communication facilities -- and multiple external
servers (this is the Hurd) dictate system policy -- the concept of
users, authentication and trust, file systems, VFS, networking, etc.

It is worth noting that the Hurd is not married to GNU Mach; there
is an effort underway to port the Hurd to the L4 microkernel, a newer,
next-generation microkernel that has built upon many of the ideas
found in Mach. Once this is done satisfactorily, it will likely be
adopted.

JA:
The Debian
GNU/Hurd project is working to make available the same software as is available with
Debian GNU/Linux. What are the advantages to using the Hurd with this
software, instead of the Linux kernel?

Neal Walfield:
With respect to applications, the advantages are almost completely
transparent: as the servers live within the virtual file system,
no special API is needed. Daemons, however, will need to be rewritten
to take full advantage of the Hurd's security model.

JA:
Linus Torvalds has frequently and openly criticized the Hurd on the Linux Kernel
Mailing List. Recently he went on a tirade regarding MAP_COPY, convinced it was
a terrible idea. Can you talk a little about what MAP_COPY is? And why Linus
may be so anti-Hurd?

Neal Walfield:
The only thing that I have to say is that, beyond a few glasses of wine
now and then, I have never taken any non-prescribed drugs.

JA:
Can you offer any insight and/or encouragement to aspiring kernel hackers?

Neal Walfield:
Patience. An important element in systems programming is not coding
but designing: interfaces are much harder to change than
implementations. Be prepared to throw away a lot of code
before you find the best design.

JA:
What are some good web links to get more Hurd information?

Neal Walfield:

JA:
Is there anything else you'd like to add?

Neal Walfield:
An important consideration is speed: will the Hurd be able to
outperform current operating systems? The answer is maybe, but that
is not the top priority. The system, as it is today, is not
optimized at all. To quote from ``The UNIX Time Sharing
System,'' a paper published by Ritchie and Thompson in 1974: ``Early
versions of the operating system were written in assembly language,
but during the summer of 1973, it was rewritten in C. The size of the
new system is about one third greater than the old. Since the new
system is not only easier to understand and to modify but also
includes many functional improvements . . . we consider this increase
in size quite acceptable.'' We feel that the Hurd offers many
functional improvements and, as each server is completely isolated not
only by each being placed in its own user space task, but also by a
strict formal API, we feel that the maintainability of the system as a
whole will also be vastly improved.

JA:
Thank you very much for your time! The Hurd sounds quite interesting, so much so
that I'm now inspired to dig up a server to give it a try.

Neal Walfield:
Great! I look forward to seeing you on the mailing lists.


About the interviewer:

Jeremy Andrews was born and raised in Southeast Alaska. Currently he lives and works in South Florida. He maintains KernelTrap as a hobby.
Copyright (c) 2001, Jeremy Andrews and Neal Walfield

Permission is granted to copy, distribute and/or modify this document under the terms of the GNU Free Documentation License, Version 1.1 or any later version published by the Free Software Foundation; with no Invariant Sections, with no Front-Cover Texts, and no Back-Cover Texts. A copy of the license is available at http://www.gnu.org/copyleft/fdl.html.

Running daemons without user IDs

Anonymous
on
March 3, 2002 - 11:16am

Hm, it's an interesting idea to have someone non-existent occupying system resources and eventually trying to gain users' permissions.

That idea seems FUBAR to me.

Wow

on
September 30, 2002 - 9:15pm

The Hurd is so cool. The design seems really flexible,
a true Unix replacement.

Not being able to install applications in my home
directory without being root is something that always
bothered me. I hate having to pester a sysadmin to
install the Guile. Also, it would allow a user to
play with, say, the development version of MySQL
without interfering with the system.

As for security, this is how kernel should have been
developed all along.

You already can install application in your home

Anonymous (not verified)
on
January 20, 2005 - 8:16am

and can run server in unprivileged port (say 8080 .. you can install zope in your home), you can install guile and mysql. You can do it with linux, bsd, modern *nix and older unix.
