[PATCH 0/2]v2 configfs: symlink() fixes

To: <Joel.Becker@...>
Cc: <linux-kernel@...>, <ocfs2-devel@...>, Louis Rilling <louis.rilling@...>
Date: Monday, June 23, 2008 - 8:16 am

[ applies on top of the previously submitted rename() vs rmdir() deadlock fix ]

Hi,

The following patchset prevents configfs symlinks from pointing to dead items;
such symlinks are forbidden by the specification.

The first patch actually prevents such dangling symlinks from being created, but
introduces a weird(?) behavior where a failing symlink() can make a racing
rmdir() fail in the symlink's parent and in the symlink's target as well. This
behavior is fixed with the next patch.

Changelog:
- fix error code when symlink's target is being removed
- re-implemented the weird(?) behavior fix in a way that does not temporarily
instantiate the new symlink in the VFS.

Summary:
configfs: Fix symlink() to a removing item
configfs: Fix failing symlink() making rmdir() fail

fs/configfs/configfs_internal.h | 1 +
fs/configfs/dir.c | 24 +++++++++++++++++-------
fs/configfs/symlink.c | 14 +++++++++++++-
3 files changed, 31 insertions(+), 8 deletions(-)

--
Dr Louis Rilling Kerlabs
Skype: louis.rilling Batiment Germanium
Phone: (+33|0) 6 80 89 08 23 80 avenue des Buttes de Coesmes
http://www.kerlabs.com/ 35700 Rennes
--

To: Louis Rilling <louis.rilling@...>
Cc: <linux-kernel@...>, <ocfs2-devel@...>
Date: Monday, June 23, 2008 - 6:20 pm

Silly question: you've tested this, right?

Joel

--

"I almost ran over an angel
He had a nice big fat cigar.
'In a sense,' he said, 'You're alone here
So if you jump, you'd best jump far.'"

Joel Becker
Principal Software Developer
Oracle
E-mail: joel.becker@oracle.com
Phone: (650) 506-8127
--

To: Joel Becker <Joel.Becker@...>
Cc: <linux-kernel@...>, <ocfs2-devel@...>
Date: Tuesday, June 24, 2008 - 9:20 am

Yup. I tested this on Linux 2.6.26-rc5 with a modified example_configfs.c
(attached), and on a backport for 2.6.20 with my own subsystem (based on
Kerrighed http://www.kerrighed.org/wiki/index.php/SchedConfig ).

Do you have some regression tests?

sample tests for symlinks:

mkdir -p /config/03-group-children/c1/c1.1
cd /config/03-group-children/c1/c1.1

# repeatedly create and remove a symlink to self
ln -s . link
rm link
ln -s . link
rm link

mkdir /config/02-simple-children/c2

# repeatedly create and remove a symlink to another item
ln -s /config/02-simple-children/c2 link
rm link
ln -s /config/02-simple-children/c2 link
rm link

# create two links with inverted source and target
ln -s /config/02-simple-children/c2 link
ln -s . /config/02-simple-children/c2/link
rm link
rm /config/02-simple-children/c2/link

# try hard to create a link to a removed item
# last loop is to be run in a separate shell
cd /config/03-group-children/c1
while true; do mkdir c1.1 && echo mkdir ok; rmdir c1.1 && echo rmdir ok; done
# in a separate shell
cd /config/02-simple-children/c2
while true; do ln -s /config/03-group-children/c1/c1.1 link && echo ln ok && \
(ls link/ || (echo failed!!!; read foo)) && rm link; done

Louis

[Attachment: configfs_example.c]

/*
* vim: noexpandtab ts=8 sts=0 sw=8:
*
* configfs_example.c - This file is a demonstration module containing
* a number of configfs subsystems.
*
* This program is free software; you can redistribute it and/or
* modify it under the terms of the GNU Genera...

To: Louis Rilling <Louis.Rilling@...>
Cc: <linux-kernel@...>, <ocfs2-devel@...>
Date: Tuesday, June 24, 2008 - 1:04 pm

No. I've been meaning to create a complex configfs_example with
lots of different scenarios so I could write some. Feel free to do so!
Thank you for sending this along.

Joel

--

"There is a country in Europe where multiple-choice tests are
illegal."
- Sigfried Hulzer

Joel Becker
Principal Software Developer
Oracle
E-mail: joel.becker@oracle.com
Phone: (650) 506-8127
--

To: <Joel.Becker@...>
Cc: <linux-kernel@...>, <ocfs2-devel@...>, Louis Rilling <louis.rilling@...>
Date: Monday, June 23, 2008 - 8:16 am

Following a pattern similar to mkdir() vs rmdir(), a failing symlink() may make
rmdir() fail for the symlink's parent and for the symlink's target as well.

failing symlink() making target's rmdir() fail:

process 1:                              process 2:
symlink("A/S" -> "B")
  allow_link()
  create_link()
    attach to "B" links list
                                        rmdir("B")
                                          detach_prep("B")
                                            error because of new link
    configfs_create_link("A", "S")
      error (e.g. -ENOMEM)

failing symlink() making parent's rmdir() fail:

process 1:                              process 2:
symlink("A/D/S" -> "B")
  allow_link()
  create_link()
    attach to "B" links list
    configfs_create_link("A/D", "S")
      make_dirent("A/D", "S")
                                        rmdir("A")
                                          detach_prep("A")
                                            detach_prep("A/D")
                                              error because of "S"
      create("S")
        error (e.g. -ENOMEM)
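
The first interleaving can be replayed deterministically in user space. The
sketch below is only a model under stated assumptions: `race_item`,
`race_attach_link()`, `race_detach_prep()`, `race_detach_link()` and
`race_replay()` are hypothetical names, not configfs functions, and the steps of
the two processes are simply called in the order of the diagram instead of
running concurrently.

```c
#include <assert.h>
#include <errno.h>

/* Hypothetical user-space stand-in for a config_item; not kernel code. */
struct race_item {
	int nr_links;	/* number of symlinks pointing at this item */
};

/* Process 1, first step: the new link is attached to the target. */
void race_attach_link(struct race_item *target)
{
	target->nr_links++;
}

/* Process 2: rmdir() refuses to remove an item that still has links. */
int race_detach_prep(const struct race_item *target)
{
	return target->nr_links ? -EBUSY : 0;
}

/* Process 1, second step: creating the dirent failed, undo the attach. */
void race_detach_link(struct race_item *target)
{
	assert(target->nr_links > 0);
	target->nr_links--;
}

/*
 * Replay the interleaving. Returns what rmdir("B") saw and stores the
 * number of links left on "B" once the failed symlink() has been undone.
 */
int race_replay(int *links_left)
{
	struct race_item b = { .nr_links = 0 };
	int rmdir_ret;

	race_attach_link(&b);			/* process 1 */
	rmdir_ret = race_detach_prep(&b);	/* process 2 */
	race_detach_link(&b);			/* process 1, e.g. -ENOMEM */

	*links_left = b.nr_links;
	return rmdir_ret;
}
```

In this model rmdir() reports -EBUSY although "B" ends up with no link at all,
which is the spurious failure described above.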

We cannot use the same solution as for mkdir() vs rmdir(), since rmdir() on the
target cannot wait on the i_mutex of the new symlink's parent without risking a
deadlock (with another symlink() or sys_rename()). Instead, we define a global
mutex that protects the attachment of all configfs symlinks, so that rmdir() can
avoid the races above.
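
The idea can be sketched in user space as follows. This is a minimal model, not
the patch itself (the diff below is truncated, so the exact lock placement in
the kernel code is not shown here): `model_symlink_mutex`, `model_target`,
`model_symlink()` and `model_rmdir()` are hypothetical names, and a pthread
mutex stands in for the kernel mutex.

```c
#include <assert.h>
#include <errno.h>
#include <pthread.h>

/* Hypothetical stand-in for the global mutex described above. */
static pthread_mutex_t model_symlink_mutex = PTHREAD_MUTEX_INITIALIZER;

struct model_target {
	int nr_links;	/* number of symlinks pointing at this item */
};

/*
 * Model of symlink(): attach and create happen under one mutex, so a
 * failed create is rolled back before rmdir() can look at nr_links.
 * create_err simulates the result of creating the dirent (0 or -ENOMEM).
 */
int model_symlink(struct model_target *target, int create_err)
{
	pthread_mutex_lock(&model_symlink_mutex);
	target->nr_links++;		/* attach to the links list */
	if (create_err)
		target->nr_links--;	/* roll back the attach */
	pthread_mutex_unlock(&model_symlink_mutex);
	return create_err;
}

/* Model of rmdir(): the link check is serialized against symlink(). */
int model_rmdir(struct model_target *target)
{
	int ret;

	pthread_mutex_lock(&model_symlink_mutex);
	assert(target->nr_links >= 0);
	ret = target->nr_links ? -EBUSY : 0;
	pthread_mutex_unlock(&model_symlink_mutex);
	return ret;
}
```

Because the rollback happens before the mutex is released, model_rmdir() never
observes a link that is about to disappear.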

Signed-off-by: Louis Rilling <louis.rilling@kerlabs.com>
---
fs/configfs/configfs_internal.h | 1 +
fs/configfs/dir.c | 10 ++++++++++
fs/configfs/symlink.c | 8 +++++++-
3 files changed, 18 insertions(+), 1 deletions(-)

diff --git a/fs/configfs/configfs_internal.h b/fs/configfs/configfs_internal.h
index da015c1..5f61b26 100644
--- a/fs/configfs/configfs_internal.h
+++ b/fs/configfs/configfs_internal.h
@@ -51,6 +51,7 @@ struct configfs_dirent {
 #define CONFIGFS_USET_IN_MKDIR 0x0200
 #define CONFIGFS_NOT_PINNED (CONFIGFS_ITEM_ATTR)

+extern struct mutex configfs_symlink_mutex;
 extern spinlock_t configfs_dirent_lock;

 extern struct vfsmount * configfs_mount;
diff --git a/fs/configfs/dir.c b/fs/configfs/dir.c
index f2a12d0..2c873fd 100644
--- a/fs/configfs/dir.c
+++ ...

To: <Joel.Becker@...>
Cc: <linux-kernel@...>, <ocfs2-devel@...>, Louis Rilling <louis.rilling@...>
Date: Monday, June 23, 2008 - 8:16 am

The rule for configfs symlinks is that symlinks always point to valid
config_items, and prevent the target from being removed. However,
configfs_symlink() only checks that it can grab a reference on the target item,
without ensuring that it remains alive until the symlink is correctly attached.

This patch makes configfs_symlink() fail whenever the target is being removed,
using the CONFIGFS_USET_DROPPING flag set by configfs_detach_prep() and
protected by configfs_dirent_lock.
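
A minimal user-space sketch of this check, under stated assumptions:
`drop_dirent`, `drop_detach_prep()`, `drop_attach_link()`, `DROP_USET_DROPPING`
and `drop_dirent_lock` are hypothetical names modelling the kernel objects, a
pthread mutex stands in for the spinlock, the flag value is arbitrary, and
-ENOENT is an assumed error code for a target that is being removed.

```c
#include <assert.h>
#include <errno.h>
#include <pthread.h>

#define DROP_USET_DROPPING 0x0100	/* arbitrary value for the model */

/* Stand-in for configfs_dirent_lock (a spinlock in the kernel). */
static pthread_mutex_t drop_dirent_lock = PTHREAD_MUTEX_INITIALIZER;

struct drop_dirent {
	unsigned int type;	/* flag bits, like s_type */
	int nr_links;		/* symlinks pointing at this item */
};

/* Model of the rmdir() side: mark the item first, then check for links. */
int drop_detach_prep(struct drop_dirent *sd)
{
	int ret;

	pthread_mutex_lock(&drop_dirent_lock);
	sd->type |= DROP_USET_DROPPING;
	ret = sd->nr_links ? -EBUSY : 0;
	if (ret)
		sd->type &= ~DROP_USET_DROPPING;	/* rollback on failure */
	pthread_mutex_unlock(&drop_dirent_lock);
	return ret;
}

/* Model of the symlink() side: refuse a target that is being removed. */
int drop_attach_link(struct drop_dirent *target)
{
	int ret = 0;

	pthread_mutex_lock(&drop_dirent_lock);
	if (target->type & DROP_USET_DROPPING)
		ret = -ENOENT;		/* assumed error code */
	else
		target->nr_links++;	/* link attached, target is pinned */
	pthread_mutex_unlock(&drop_dirent_lock);
	return ret;
}
```

Since both sides test and update the state under the same lock, either the link
is attached first and the removal fails, or the removal marks the item first and
the link is refused.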

This patch introduces a (weird?) behavior similar to mkdir() failures making
rmdir() fail: if symlink() races with rmdir() of the parent directory (or of its
youngest user-created ancestor if the parent is a default group) or with rmdir()
of the target directory, and then fails in configfs_create(), the racing rmdir()
can fail even though, in the end, the directory concerned has no user-created
entry (resp. no symlink pointing to it or to one of its default groups).
This behavior is fixed in later patches.

Signed-off-by: Louis Rilling <louis.rilling@kerlabs.com>
---
fs/configfs/dir.c | 14 +++++++-------
fs/configfs/symlink.c | 6 ++++++
2 files changed, 13 insertions(+), 7 deletions(-)

diff --git a/fs/configfs/dir.c b/fs/configfs/dir.c
index 614e382..f2a12d0 100644
--- a/fs/configfs/dir.c
+++ b/fs/configfs/dir.c
@@ -370,6 +370,9 @@ static int configfs_detach_prep(struct dentry *dentry, struct mutex **wait_mutex
 	struct configfs_dirent *sd;
 	int ret;

+	/* Mark that we're trying to drop the group */
+	parent_sd->s_type |= CONFIGFS_USET_DROPPING;
+
 	ret = -EBUSY;
 	if (!list_empty(&parent_sd->s_links))
 		goto out;
@@ -385,8 +388,6 @@ static int configfs_detach_prep(struct dentry *dentry, struct mutex **wait_mutex
 					*wait_mutex = &sd->s_dentry->d_inode->i_mutex;
 				return -EAGAIN;
 			}
-			/* Mark that we're trying to drop the group */
-			sd->s_type |= CONFIGFS_USET_DROPPING;

 			/*
 			 * Yup, recursive. If there's a problem, blame
@@ -414,12...
