Al, Christoph,
Zach just ran into this bug as well. Does this fix look reasonable?
thanks-
sage

----
From: Sage Weil <sage@newdream.net>

d_move() is strangely implemented in that it swaps the position of
new_dentry and old_dentry in the namespace. This is admittedly weird (see
comments for d_move_locked()), but normally harmless: even though
new_dentry swaps places with old_dentry, it is unhashed, and won't be seen
by a subsequent lookup.

However, vfs_rename_dir() doesn't properly account for filesystems with
FS_RENAME_DOES_D_MOVE. If new_dentry has a target inode attached, it
unhashes the new_dentry prior to the rename() iop and rehashes it after,
but doesn't account for the possibility that rename() may have swapped
{old,new}_dentry. For FS_RENAME_DOES_D_MOVE filesystems, it rehashes
new_dentry (now the old renamed-from name, which d_move() expected to go
away), such that a subsequent lookup will find it.

To correct this, move vfs_rename_dir()'s call to d_move() _before_ the
target inode mutex is dealt with. Since d_move() will have been called
for all filesystems at this point, there is no need to rehash new_dentry
unless the rename failed. (If the rename succeeded, old_dentry should
already be rehashed in the new location.)

The only in-tree filesystems with FS_RENAME_DOES_D_MOVE are ocfs2 and nfs.
My suspicion is that they are not bitten by this particular bug because
the incorrectly rehashed new_dentry gets rejected by d_revalidate().

This was caught by the recently posted POSIX fstest suite, rename/10.t
test 62 (and others) on ceph. With this patch, all tests succeed.

Signed-off-by: Sage Weil <sage@newdream.net>
---
fs/namei.c | 9 +++++----
1 file changed, 5 insertions(+), 4 deletions(-)

--- linux-2.6.25-orig/fs/namei.c 2008-04-16 19:49:44.000000000 -0700
+++ linux/fs/namei.c 2008-04-18 13:59:30.000000000 -0700
@@ -2488,17 +2488,18 @@
error = -EBUSY;
else
error = old_dir->i_op->rename(old_dir...
I think rehashing the new dentry is bogus, even on error. And it
looks racy with lookup as well.

I wonder what the original reason for that was? Git history doesn't
tell...

So a better fix would be just to remove the rehashing completely.
Does the below patch work for you?

Thanks,
Miklos

---
fs/namei.c | 2 --
1 file changed, 2 deletions(-)

Index: linux-2.6/fs/namei.c
===================================================================
--- linux-2.6.orig/fs/namei.c 2008-07-11 22:09:32.000000000 +0200
+++ linux-2.6/fs/namei.c 2008-07-11 22:40:16.000000000 +0200
@@ -2643,8 +2643,6 @@ static int vfs_rename_dir(struct inode *
 		if (!error)
 			target->i_flags |= S_DEAD;
 		mutex_unlock(&target->i_mutex);
-		if (d_unhashed(new_dentry))
-			d_rehash(new_dentry);
 		dput(new_dentry);
 	}
 	if (!error)
--
I assume just to leave the dentry in the same state we originally found it
This would work as well, yeah. I've no real preference, here...
thanks-
--
So we'd just come back through lookup to repopulate the existing
destination name that vfs_rename_dir() unhashed before calling
->rename() in the case that the rename fails? That seems gross, but
relatively harmless.

It'd work for my case, yeah.
- z
--
We are talking about an _extremely_ rare event. Even the
"vfs_rename_dir() with positive target" is very rare, let alone a
failing one.

If we are going to worry about directory removal failure cases, we
should start with rmdir(), which is a wee bit more common than the
above case here.

Miklos
--
Agreed, which is why I called it relatively harmless.
- z
--
