Re: sysfs: tagged directories not merged completely yet

To: Tejun Heo <tj@...>
Cc: Greg KH <greg@...>, Al Viro <viro@...>, Benjamin Thery <benjamin.thery@...>, <linux-kernel@...>, Serge E. Hallyn <serue@...>, Al Viro <viro@...>, Linus Torvalds <torvalds@...>
Date: Tuesday, October 14, 2008 - 8:19 am

Tejun Heo <tj@kernel.org> writes:


I don't see how FUSE can help.  The problem is getting the information
out of the kernel, and not breaking backwards compatibility while we
do it.  As I understand FUSE, it just allows for user space filesystems,
which is great if I want to hide information.


I don't see how.  If userspace doesn't have the information I don't
see how placing a filter will allow it to show up there.

The challenge is to not conflict on network device names.  If someone can
think of where in sysfs we can put the network devices that are in
different network namespaces, so they don't conflict when they have the
same name, I have no problem with that.  But where can we put them?
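
To make the name conflict concrete, here is a minimal sketch of one way to
picture the tagged-directory idea this thread is about: two entries may share
a name as long as they carry different namespace tags, and a lookup only
matches entries whose tag agrees with the viewer's.  This is not the sysfs
code.  The struct, the function, and the rule that untagged entries are
visible to everyone are all hypothetical and chosen for illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical directory entry: a name plus an opaque namespace tag. */
struct demo_entry {
    const char *name;
    const void *ns_tag;   /* assumption: NULL means "untagged, visible to all" */
};

/*
 * Linear-scan lookup that matches on name AND tag.  Two "eth0" entries can
 * coexist in one directory because a given viewer only ever matches one.
 */
static const struct demo_entry *
demo_lookup(const struct demo_entry *entries, size_t n,
            const char *name, const void *viewer_tag)
{
    assert(name != NULL);
    for (size_t i = 0; i < n; i++) {
        const struct demo_entry *e = &entries[i];

        if (strcmp(e->name, name) != 0)
            continue;
        if (e->ns_tag == NULL || e->ns_tag == viewer_tag)
            return e;
    }
    return NULL;
}
```

A viewer holding the first namespace's tag resolves "eth0" to the first
entry, a viewer holding the second tag resolves it to the second, and a
viewer with an unrelated tag sees no "eth0" at all.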


Well, compared to the rest of it, the part with dirents is just a handful
of lines of code.  The VFS part is the expensive and hairy part.


Yes.  I am looking at that.


Yes.  I think parts of the page cache and anything in the inode itself
are protected by i_mutex.  As for timestamps or anything else that
we really care about, we can and should put them in sysfs_dirent, and
we can have the stat method recreate them, and possibly have d_revalidate
refresh them.


Reasonable.  I have seen two ways of handling rename properly:
some weird variant of d_splice_alias, or some cleaner variant of what
we are doing today.


The guarantee is that we will see all entries that are there for the
duration of readdir.  We order the directory by inode and stick
the inode number in f_pos.  So now we don't have the problem of
returning the same entry multiple times or skipping existing entries.
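
The ordering scheme described above can be sketched as follows.  This is a
minimal user-space illustration, not the sysfs implementation: the struct and
function names are hypothetical, and the position is simply "the next inode
number to return", which stands in for f_pos.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for a directory entry; not the kernel struct. */
struct demo_dirent {
    unsigned long ino;          /* inode number, unique within the directory */
    const char *name;
    struct demo_dirent *next;   /* sibling list, sorted by ascending ino */
};

/* Insert an entry, keeping the sibling list ordered by inode number. */
static void demo_insert(struct demo_dirent **head, struct demo_dirent *d)
{
    assert(d != NULL);
    while (*head && (*head)->ino < d->ino)
        head = &(*head)->next;
    d->next = *head;
    *head = d;
}

/* Unlink the entry with the given inode number, if it is present. */
static void demo_remove(struct demo_dirent **head, unsigned long ino)
{
    while (*head && (*head)->ino != ino)
        head = &(*head)->next;
    if (*head)
        *head = (*head)->next;
}

/*
 * Return the first entry whose inode number is >= *pos and advance *pos
 * past it, or return NULL at the end of the directory.  Because the cursor
 * is an inode number rather than a list index, entries added or removed
 * between calls cannot cause a surviving entry to be skipped or returned
 * twice.  Finding the resume point is a linear scan.
 */
static struct demo_dirent *
demo_readdir_next(struct demo_dirent *head, unsigned long *pos)
{
    for (; head; head = head->next) {
        if (head->ino >= *pos) {
            *pos = head->ino + 1;
            return head;
        }
    }
    return NULL;
}
```

If the entry that was just returned is removed before the next call, the
cursor still points past its inode number, so the walk resumes at the next
surviving entry.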


Lookup, create, unlink and, if we drop the lock during readdir, readdir
restart.  They all require a linear scan.


Depends on how many devices people are adding and removing dynamically
I guess.  sysctl has had that issue so I am thinking about it.  I
figure we need to make things work properly first.


Not really.  It is really very straightforward.  99% of the modified
code simply has an extra pointer dereference.

Except for sysfs, the network namespace code that has merged is in a
very usable state.  There are a few little things like iptables
support that still need some work.  From a practical standpoint sysfs
was one of the first things I started working on and it is one of the
last things to be done.


Some of it, yes.  Which asks for a more comprehensive solution.  Part
of the challenge is that there has been insistence on an especially
generic solution in sysfs, and I'm not certain that has helped.


To my knowledge yes.  Most of the cost is trivial, and it makes
a darn good excuse to clean up problem code.


So far sysfs is the most costly and the hardest part.  Most of the
cost is in the noise and in the design.

One thing the namespaces fundamentally get you is scaling.  You can
probably run 10x more environments on a single server, which makes
them cheaper and available on all hardware.

Beyond that there are people who actually just want to use a single
namespace for what you can do.  They are general tools and are useful
in more ways than just checkpoint restart and virtualization.

Think what happens if you are a switch/router and you switch two
different networks both using overlapping addresses in the 10.x segment.

Or think how much easier it is to test routing with just a single machine.

All kinds of interesting uses.


This isn't a partial view thing really.  This is: how do I put it all
in there, not have conflicts, and preserve backwards compatibility?

In proc.  I have worked as hard as I can to build a design that will let
us see it all without sacrificing backwards compatibility.  With /proc/<pid>
I have a natural place to put data in a per process view.  I don't
have that in sysfs, and sysfs at some point stopped being about just
the hardware.  So the only way I have found to have places for everything
is to do multiple mounts.


It is probably worth a double check.  Coming in, all physical network
devices happen in the initial network namespace, so that direction isn't
a problem.  Worst case, I expect we figure out how to add a field that
specifies enough about the network namespace so the events can be relayed
to the appropriate part of user space.

Eric
