Re: [PATCH 09/10] netns: Enable tagging for net_class directories in sysfs

Previous thread: [PATCH 08/10] driver core: Implement tagged directory support for device classes. by Benjamin Thery on Monday, June 2, 2008 - 9:45 am. (1 message)

Next thread: [PATCH 10/10] sysfs: user namespaces: fix bug with clone(CLONE_NEWUSER) with fairsched by Benjamin Thery on Monday, June 2, 2008 - 9:46 am. (2 messages)
To: Andrew Morton <akpm@...>
Cc: Greg Kroah-Hartman <gregkh@...>, Eric Biederman <ebiederm@...>, Serge Hallyn <serue@...>, <linux-kernel@...>, Tejun Heo <htejun@...>, Al Viro <viro@...>, Daniel Lezcano <dlezcano@...>, Benjamin Thery <benjamin.thery@...>
Date: Monday, June 2, 2008 - 9:46 am

net: Enable tagging for net_class directories in sysfs

The problem. Network devices show up in sysfs and with the network
namespace active multiple devices with the same name can show up in
the same directory, ouch!

To avoid that problem and allow existing applications in network namespaces
to see the same interface that is currently presented in sysfs, this
patch enables the tagging directory support in sysfs.

By using the network namespace pointers as tags to separate out the
the sysfs directory entries we ensure that we don't have conflicts
in the directories and applications only see a limited set of
the network devices.

Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>
Signed-off-by: Benjamin Thery <benjamin.thery@bull.net>
---
fs/sysfs/mount.c | 36 ++++++++++++++++++++++++++++++++++++
include/linux/sysfs.h | 2 ++
net/Kconfig | 2 +-
net/core/net-sysfs.c | 20 ++++++++++++++++++++
4 files changed, 59 insertions(+), 1 deletion(-)

Index: linux-mm/fs/sysfs/mount.c
===================================================================
--- linux-mm.orig/fs/sysfs/mount.c
+++ linux-mm/fs/sysfs/mount.c
@@ -16,6 +16,8 @@
#include <linux/mount.h>
#include <linux/pagemap.h>
#include <linux/init.h>
+#include <linux/nsproxy.h>
+#include <net/net_namespace.h>

#include "sysfs.h"

@@ -78,6 +80,7 @@ static int sysfs_fill_super(struct super
root->d_sb = sb;
sb->s_root = root;
sb->s_fs_info = info;
+ info->tag.net_ns = hold_net(current->nsproxy->net_ns);
return 0;

out_err:
@@ -95,6 +98,9 @@ static int sysfs_test_super(struct super
struct sysfs_super_info *info = sysfs_info(sb);
int found = 1;

+ if (task->nsproxy->net_ns != info->tag.net_ns)
+ found = 0;
+
return found;
}

@@ -131,6 +137,8 @@ static void sysfs_kill_sb(struct super_b
struct sysfs_super_info *info = sysfs_info(sb);

kill_anon_super(sb);
+ if (info->tag.net_ns)
+...

To: Benjamin Thery <benjamin.thery@...>
Cc: Andrew Morton <akpm@...>, Eric Biederman <ebiederm@...>, Serge Hallyn <serue@...>, <linux-kernel@...>, Tejun Heo <htejun@...>, Al Viro <viro@...>, Daniel Lezcano <dlezcano@...>
Date: Tuesday, June 3, 2008 - 12:08 am

I don't like it how the network subsystem is starting to leach into the
sysfs core here. What happens when the next subsystem wants to do the
same thing? And then the next one? Will they all have to do this kind
of intrusive changes to sysfs?

Can't this be done only in the network subsystem?

thanks,

greg k-h
--

To: Greg KH <gregkh@...>
Cc: Andrew Morton <akpm@...>, Eric Biederman <ebiederm@...>, Serge Hallyn <serue@...>, <linux-kernel@...>, Tejun Heo <htejun@...>, Al Viro <viro@...>, Daniel Lezcano <dlezcano@...>
Date: Tuesday, June 3, 2008 - 11:24 am

>
> Can't this be done only in the network subsystem?

I'm not sure to understand exactly what you mean.

What you don't like is seeing these hunks of network code in
fs/sysfs/mount.c? And you prefer to see these bits of code resides in
the network subsystem instead and see only "generic" sysfs services in
fs/sysfs/mount.c?

If this is it, I have some idea to implement a less intrusive
sysfs_net_exit(), which can be shared with the other namespaces.
Serge introduces the same kind of changes in patch 10 to fix an issue in
user namespace. I think we can share a bit of code and move the parts
specific to each namespace in their own subsystems.

--
B e n j a m i n T h e r y - BULL/DT/Open Software R&D

http://www.bull.com
--

To: Benjamin Thery <benjamin.thery@...>
Cc: Andrew Morton <akpm@...>, Eric Biederman <ebiederm@...>, Serge Hallyn <serue@...>, <linux-kernel@...>, Tejun Heo <htejun@...>, Al Viro <viro@...>, Daniel Lezcano <dlezcano@...>
Date: Tuesday, June 3, 2008 - 12:35 pm

Yes, exactly.

I don't want the problem that if more subsystems want to implement
something like this, they too need to modify the sysfs core.

I think that would be a good idea. Care to redo the series?

thanks,

greg k-h
--

To: Greg KH <gregkh@...>
Cc: Andrew Morton <akpm@...>, Eric Biederman <ebiederm@...>, Serge Hallyn <serue@...>, <linux-kernel@...>, Tejun Heo <htejun@...>, Al Viro <viro@...>, Daniel Lezcano <dlezcano@...>
Date: Tuesday, June 3, 2008 - 3:10 pm

I'm already working on it.
I'll try to re-post in a couple of days.

----------------------------------------------------------------
This message was sent using IMP, the Internet Messaging Program.

--

To: Greg KH <gregkh@...>
Cc: Benjamin Thery <benjamin.thery@...>, Andrew Morton <akpm@...>, Eric Biederman <ebiederm@...>, Serge Hallyn <serue@...>, <linux-kernel@...>, Tejun Heo <htejun@...>, Al Viro <viro@...>, Daniel Lezcano <dlezcano@...>
Date: Tuesday, June 3, 2008 - 8:16 am

At least as far as the tagging goes, each namespace which needs to do
this (network, devices, and user, at least) will add a field to the
tag structure and call sysfs_enable_tagging() on the relevant
directories. So no more core sysfs changes should be needed, as Eric

sysfs/kobject layer has to somehow decide what to show for
/sys/class/net contents based on the mountpoint, right, so I don't see
how the network subsystem could do it.

The only non-tagging alternative I'd see would be to keep entirely
separate kobject pools for each namespace. To do that we'd probably
want to break /sys/class/net into a separate fs that can be
remounted, so at least we don't have to keep the rest of the kobject
pools (/sys/firmware, kernel, etc) in sync...

Eric had mentioned before breaking /sys into multiple mountpoints, so
I'll assume the fact that he implemented tagging means that there was
too much cross-linking and whatnot across the /sys tree to make that
feasible.

More importantly, that approach would require more core sysfs changes
for the next namespace, whereas the tagging approach does not!

thanks,
-serge
--

Previous thread: [PATCH 08/10] driver core: Implement tagged directory support for device classes. by Benjamin Thery on Monday, June 2, 2008 - 9:45 am. (1 message)

Next thread: [PATCH 10/10] sysfs: user namespaces: fix bug with clone(CLONE_NEWUSER) with fairsched by Benjamin Thery on Monday, June 2, 2008 - 9:46 am. (2 messages)