Linux: v9fs, a 9P Filesystem Protocol Implementation

Submitted by Jeremy
on June 23, 2005 - 8:31am

Among the patches in Andrew Morton's -mm kernel recently discussed for possible inclusion in 2.6.13 was v9fs, the Linux port of Plan 9's 9P filesystem protocol. Andrew noted, "I'm not sure that this has a sufficiently high usefulness-to-maintenance-cost ratio." Defined in an RFC, v9fs is a connection-oriented networked filesystem. One defender of v9fs explained:

"The 9P protocol implemented by v9fs is the result of over a decade of research in distributed systems at Bell Labs by the original Unix team, and it has various implementations for other operating systems that have been used in production systems for many years.

"9P is designed to be portable across systems and transport protocols, it's network transparent, and it gives us interoperativity with Inferno(which can run hosted under Linux already), Plan 9, and p9p, and implementations for *BSD and other systems are in the works."


From: Eric Van Hensbergen [email blocked]
To: Andrew Morton [email blocked]
Subject: Re: v9fs (-mm -> 2.6.13 merge status)
Date:	Tue, 21 Jun 2005 08:51:27 -0500

On 6/21/05, Andrew Morton [email blocked] wrote:
> 
> v9fs
> 
>     I'm not sure that this has a sufficiently high
>     usefulness-to-maintenance-cost ratio.
> 
 
I think v9fs/9P has some unique aspects which differentiate it from
the other distributed system protocols integrated into Linux:
a) it presents a unified distributed resource sharing protocol.  It
will be able to distribute devices, file systems, system services, and
application interfaces.
b) it provides non-caching RPC-style access to synthetic file systems
which could be used with in-kernel file systems such as sysfs or with
user-space synthetics such as those provided by FUSE
c) its implementation supports transport independence enabling easy
support for different interconnects (shared memory, Xen device
channels, RDMA, Infiniband, etc.)
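
To illustrate point (c), here is a minimal Python sketch of what transport independence can look like in code. It is not taken from v9fs: the `Transport` interface, both transport classes, and the helper names are hypothetical. The only protocol detail borrowed from 9P is that each message starts with a 4-byte little-endian size that counts itself.

```python
import socket
from typing import Protocol


class Transport(Protocol):
    """Hypothetical interface: anything that can move bytes in order."""

    def send(self, data: bytes) -> None: ...
    def recv(self, n: int) -> bytes: ...


class SocketTransport:
    """Byte stream over any connected socket (TCP, Unix domain, socketpair)."""

    def __init__(self, sock: socket.socket) -> None:
        self.sock = sock

    def send(self, data: bytes) -> None:
        self.sock.sendall(data)

    def recv(self, n: int) -> bytes:
        # recv() may return fewer bytes than requested, so loop until done.
        buf = b""
        while len(buf) < n:
            chunk = self.sock.recv(n - len(buf))
            if not chunk:
                raise ConnectionError("peer closed the connection")
            buf += chunk
        return buf


class MemoryTransport:
    """Stand-in for a shared-memory channel: bytes queued in a bytearray."""

    def __init__(self) -> None:
        self.buf = bytearray()

    def send(self, data: bytes) -> None:
        self.buf += data

    def recv(self, n: int) -> bytes:
        if len(self.buf) < n:
            raise ConnectionError("not enough data queued")
        out = bytes(self.buf[:n])
        del self.buf[:n]
        return out


def send_message(t: Transport, payload: bytes) -> None:
    """Prefix the payload with a 4-byte size that includes the prefix."""
    t.send((len(payload) + 4).to_bytes(4, "little") + payload)


def recv_message(t: Transport) -> bytes:
    """Read one size-prefixed message and return its payload."""
    size = int.from_bytes(t.recv(4), "little")
    return t.recv(size - 4)
```

The message helpers never look at which transport they were given, so adding a new interconnect in this sketch means writing one more class with `send` and `recv`.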

v9fs-2.0 has a somewhat limited audience at the moment - but now that
the initial implementation is more or less complete we are working to
build applications on top of it (and provide a better server).  It's
being integrated into cluster projects at LANL and being looked at wrt
virtualization I/O at IBM.  It's our hope that these improvements and
cluster applications will motivate more wide-spread use of the v9fs
module.

     -eric


From: Uriel [email blocked]
Subject: Re: v9fs (-mm -> 2.6.13 merge status)
Date: Tue, 21 Jun 2005 16:35:52 +0100

On Tue, Jun 21, 2005 at 08:51:27AM -0500, Eric Van Hensbergen wrote:
> On 6/21/05, Andrew Morton [email blocked] wrote:
> >
> > v9fs
> >
> >     I'm not sure that this has a sufficiently high
> >     usefulness-to-maintenance-cost ratio.

The 9P protocol implemented by v9fs is the result of over a decade of
research in distributed systems at Bell Labs by the original Unix team,
and it has various implementations for other operating systems that have
been used in production systems for many years.

9P is designed to be portable across systems and transport protocols,
it's network transparent, and it gives us interoperativity with
Inferno(which can run hosted under Linux already), Plan 9, and p9p, and
implementations for *BSD and other systems are in the works.

9P has the potential to become the standard protocol for distributed
resources and I don't think any of the alternatives come anywhere near
being as well designed, well proven and encompassing.

uriel


From: Martin Atkins [email blocked]
Subject: Re: v9fs (-mm -> 2.6.13 merge status)
Date: Wed, 22 Jun 2005 05:13:28 +0000 (UTC)

Uriel [email blocked] writes:
>..
> The 9P protocol implemented by v9fs is the result of over a decade of
> research in distributed systems at Bell Labs by the original Unix team,

I would second that. There are several other filesystems already merged
or under discussion that are superficially similar - fuse, coda, and
even nfs!

However, v9fs is the only one that is (all of)
1) suitable for synthetic filesystems (resources), as well as 'normal'
filesystems
2) equally good for local and remote filesystems
3) OS/machine independent "on-the-wire"
4) (network) transport independent
5) has a significant history of deployment

If there are few Linux applications using v9fs as yet (but there is
plan9ports), then one must admit that there is something of a
chicken-and-eggs situation involved!

Martin

Another reply in that thread:

Ron Minnich (not verified)
on
June 23, 2005 - 1:01pm

I got pointed at this discussion. Here are my $.02 on why we at LANL are
interested in v9fs.

We build clusters on the order of 2000 machines at present, with larger
systems coming along. The system which we use to run these clusters is
bproc. While bproc has proven to be very powerful to date, it does have
its limits:
- requires a homogeneous system
- the network protocols it uses, while simple, are somewhat ad-hoc
(as is common in this type of system)
- if you are on a bproc system as user x, using 25% of the system,
you still see 100% of the processes. This is a bit of a security issue.

We have a desire to build single-system-image looking clusters along the
bproc model, but at the same time compose those clusters of, e.g.,
Opterons and G5s. This mixing is highly desirable for computations that
have phases, some of which belong on one type of a machine, and some on
another.

We are going to use v9fs as the glue for our next-generation cluster
software, called 'xcpu'. Xcpu has been implemented on Plan 9 and works
there. I have ported xcpu to Linux, using v9fs as the client side and Russ
Cox's plan9ports server to write servers.

xcpu presents a remote execution service as a 9p server. xcpu has been
tested across architectures and it works very well. By summer 2006, we
hope to have cut over our bproc systems to xcpu.

That's one use for v9fs. We also plan to use v9fs to provide us with
servers for global /proc, monitoring, and control systems for our
clusters.

The global /proc is interesting. bproc provides a global /proc, but it is
incomplete; entries for, e.g., exe and maps are not filled in. bproc also
caches part of the /proc, but the rules about what is cached and what the
timeouts are, are set in the kernel module and not easily changed. We are
going to have an "aggregating" user level 9p server based on
Mirtchovski's aggrfs, which will both aggregate all the cluster nodes,
and have caching rules that make sense in clusters of 1000s of nodes (for
example, it is ok to cache /proc/x/status; there is no need to cache
/proc/x/maps, and you probably don't want to anyway).
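
As a rough illustration of per-file caching rules like the ones described above, here is a small Python sketch. It is not aggrfs or xcpu code: the `ProcCache` class, the `CACHE_TTL` table, the 5-second lifetime, and the `fetch` callback are all hypothetical. The one rule it takes from the text is that `status` may be cached while `maps` is always read fresh.

```python
import time
from typing import Callable, Dict, Tuple

# Hypothetical policy table: how many seconds a /proc/<pid>/<name> entry
# may be served from cache. Names missing from the table are never cached.
CACHE_TTL: Dict[str, float] = {"status": 5.0}


class ProcCache:
    """Caches remote /proc reads according to a per-file-name policy."""

    def __init__(
        self,
        fetch: Callable[[str, str], bytes],
        ttl: Dict[str, float] = CACHE_TTL,
        clock: Callable[[], float] = time.monotonic,
    ) -> None:
        self.fetch = fetch      # fetch(node, path) reads from the real node
        self.ttl = ttl
        self.clock = clock
        self.entries: Dict[Tuple[str, str], Tuple[float, bytes]] = {}

    def read(self, node: str, path: str) -> bytes:
        name = path.rsplit("/", 1)[-1]
        ttl = self.ttl.get(name)
        if ttl is None:
            # No rule for this file (e.g. maps): always go to the node.
            return self.fetch(node, path)

        key = (node, path)
        now = self.clock()
        hit = self.entries.get(key)
        if hit is not None and now - hit[0] < ttl:
            return hit[1]       # still fresh, serve the cached copy

        data = self.fetch(node, path)
        self.entries[key] = (now, data)
        return data
```

Because the policy is an ordinary table in a user-level server, changing what is cached or for how long in this sketch is a one-line edit rather than a kernel module change.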

A neat capability is that if we give a user, e.g., 25% of the cluster, we
can tailor that user's name space so that they only see their procs and
the 25% of the cluster they own. This is good for security, but also good
for convenience: most users don't really care that some other user is on
75% of the cluster. Global pid spaces are neat in theory, messy in
practice at large scale. I want my global pid space to be global to *me*,
meaning I see the global space of the nodes I care about. The sysadmin,
of course, wants to see everything. All this is possible. V9fs, along with
Linux private name spaces, will allow us to provide this model: users can
see some or all of the global pid space, depending on need; users can be
constrained to only see part of the global pid space, depending on other
issues.

9p will also replace the Supermon protocol, allowing people to easily view
status information in a file system.

In addition to the cluster usage, there is also grid usage. The 9grid,
composed of plan 9 systems, is connected by 9p servers. Linux systems can
join the 9grid with no problem, once Linux has v9fs.

Were v9fs just a file system, I would not really be interested in it one
way or another; we have NFS, after all. But v9fs is really the key piece
of a new model of cluster services we are building at LANL. 9p will be the
glue, and v9fs will be the needed client side for hooking 9p servers into
the file system name space.

I'm hoping we can see v9fs in the kernel someday.

thanks

ron

User experiences with distributed FSs?

HM (not verified)
on
June 23, 2005 - 2:08pm

Has anyone around here had any experience with this 9P filesystem? Is it any good for distributed filesystem use compared to NFS and Coda?

Comparison with NFS

Eric Van Hensbergen (not verified)
on
June 23, 2005 - 8:07pm

There's a comparison with NFS (running Bonnie and Postmark) in a FREENIX 2005 paper. The results are summarized in a graph (http://v9fs.sourceforge.net/perf/index.html) on the v9fs web page. The answer is that for raw throughput (bonnie), v9fs is more or less similar if you run with a large enough block size (32k). For lots of granular access and metadata operations (postmark), v9fs pulls ahead mostly due to lower code complexity and a simple protocol. These are NFSv3 numbers, not NFSv4 -- I haven't done an NFSv4 comparison yet.

I'd like to point out, though, that if all you are looking for is a distributed file system for static files, 9P may not be the right answer for you (at least not until we add a better cache layer). Its strength lies in many of the items mentioned in the thread: transport independence, the ability to share multiple types of resources and interfaces in a unified and consistent manner, and its ability to provide a mechanism for accessing cluster-wide synthetic file system interfaces to applications and operating systems. It's in these areas that it really distinguishes itself from the other distributed system protocols.

Yes. I tried it out today for

Anonymous (not verified)
on
March 20, 2006 - 4:55am

Yes. I tried it out today for the first time. It's unbelievable. With all the problems I have been having with NFS (the usual NFS bugs), it's a real relief seeing its magic working by default.

John

"I'm not sure that this has a

anon (not verified)
on
June 24, 2005 - 5:46am

"I'm not sure that this has a sufficiently high usefulness-to-maintenance-cost ratio."

The replies so far have been geared at increasing the perceived usefulness. But nobody has talked about the maintenance cost. The Linux 2.6 port seems to be fairly new and untested. There also don't appear to be many people making use of this port. The few people who do need this can download the patch from SourceForge, so why include it in the kernel distribution at this point?

The amount of code in v9fs is

anon (not verified)
on
June 24, 2005 - 7:28am

The amount of code in v9fs is very small and doesn't touch or duplicate any existing kernel code. 9P is a very small, simple and clean protocol.
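
To give a feel for how small the protocol is on the wire, here is a minimal Python sketch that builds the first message of a 9P2000 session, Tversion. Every 9P2000 message starts with `size[4] type[1] tag[2]` in little-endian byte order, and strings are a 2-byte length followed by UTF-8 text. The function names are hypothetical and the default `msize` of 8192 is only an example value; this is not code from v9fs.

```python
import struct

TVERSION = 100    # message type number for Tversion in 9P2000
NOTAG = 0xFFFF    # tag used for version negotiation


def pack_string(s: str) -> bytes:
    """Encode a 9P string: 2-byte little-endian length, then UTF-8 bytes."""
    data = s.encode("utf-8")
    return struct.pack("<H", len(data)) + data


def pack_tversion(msize: int = 8192, version: str = "9P2000") -> bytes:
    """Build size[4] Tversion tag[2] msize[4] version[s]."""
    body = struct.pack("<BHI", TVERSION, NOTAG, msize) + pack_string(version)
    # The size field counts the whole message, including its own 4 bytes.
    return struct.pack("<I", len(body) + 4) + body


def unpack_header(msg: bytes) -> tuple:
    """Return (size, type, tag) from the fixed 7-byte message header."""
    return struct.unpack_from("<IBH", msg, 0)
```

With the default arguments the whole request is 19 bytes: 7 bytes of header, 4 bytes of `msize`, and 8 bytes for the version string.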

All Plan 9 and Inferno users are potential users; p9p provides a great set of 9P servers for Linux. And various other applications are starting to use 9P to export fs-like network transparent interfaces.

v9fs has at least three or four active contributors and it's being used for big projects at LANL and IBM Research, so there should be no concern about its maintenance.

I've used and tested this sof

Dave Leimbach (not verified)
on
June 25, 2005 - 2:07pm

I've used and tested this software across different Unixes. The software distribution comes with a purely userland 9P server that can be run on Mac OS X, and then I can mount those files on my Linux box. I know of people using the same u9fs [the userland implementation referenced above] to connect to Solaris boxes and import filesystems over long distances [though I question the safety of that, since there is no authentication; you could tunnel it over ssh, I suppose].

Anyway, I've used 9p to mount a filesystem server that's located in Japan while I'm in the United States. I wouldn't dare try that with NFS.

I find it to be very useful. Might make a nice diskless node protocol too [it works well for that on Plan 9 :)]

I still think this is more useful than the Ham radio support in Linux :)

I would say

on
June 29, 2005 - 12:51am

I would say (to AKPM) that if the original devels are willing to do the ongoing maintenance, then why not? That would eliminate the maintenance-cost aspect, IMHO. Honestly, I do not have any immediate usefulness for this; but as long as the maintenance is zilch I'd say it's OK. When I have an immediate usefulness, then I will worry about the maintenance.

v9fs maintenance

ron minnich (not verified)
on
June 30, 2005 - 5:17pm

I wrote the original version of v9fs for 2.0.36 and it has run on just about every Linux rev since then, starting from 1997 or so until now. It has actually seen a fair amount of use; the 2.4 version was run on the 1024-node Pink cluster here at LANL. We've had good external involvement over the years, and the latest set of people (Eric, Lucho, and others) is the best yet. They've done a great deal of work in getting this code ready for submission.

I think it is safe to say we've got a solid core group to make sure it keeps working. I hope we can ease any worries on that issue.

thanks again.

ron

ipv6 support?

Anonymous (not verified)
on
January 12, 2008 - 8:12am

Right now v9fs only supports IPv4 when using proto=tcp. Is there work being done on IPv6 support?

Will the developers talk more

Anonymous Coward (not verified)
on
July 8, 2005 - 12:45am

Will the developers talk more about how this compares with FUSE? I think it would be a killer feature for users in that sort of capacity.

You really can't compare it t

Anonymous (not verified)
on
March 20, 2006 - 5:05am

You really can't compare it to FUSE, because something like this makes FUSE totally useless.
