On Mon, 2007-11-05 at 15:40 +0000, Hugh Dickins wrote:I've never actually seen this happen in practice, but I do know exactly what you're talking about. Actually, I think your s/while/if/ change is probably a decent fix. Barring any other races, that loop should always have made progress on mnt->__mnt_writers the way it is written. If we get to: ----------------->HERE and don't have a positive mnt->__mnt_writers, we know something is going badly. We WARN_ON() there, which should at least give an earlier warning that the system is not doing well. But it doesn't fix the inevitable. Could you try the attached patch and see if it at least warns you earlier? I have a decent guess what the bug is, too. In the unionfs code: The r/o bind mount code will mnt_drop_write() on that file's f_vfsmnt at __fput() time. Since that code never got a write on the mount, we'll see an imbalance if the file was opened for a write. I don't see this file's mnt set anywhere, so I'm not completely sure that this is it. In any case, rolling your own 'struct file' without using alloc_file() and friends is a no-no. BTW, I have some "debugging" code in my latest set of patches that I think should fix this kind of imbalance with the mnt->__mnt_writers(). It ensures that before we do that mnt_drop_write() at __fput() that we absolutely did a mnt_want_write() at some point in the 'struct file's life. -- Dave linux-2.6.git-dave/fs/namespace.c | 31 ++++++++++++++++++++++--------- linux-2.6.git-dave/include/linux/mount.h | 1 + 2 files changed, 23 insertions(+), 9 deletions(-) diff -puN fs/namei.c~fix-naughty-loop fs/namei.c diff -puN fs/namespace.c~fix-naughty-loop fs/namespace.c --- linux-2.6.git/fs/namespace.c~fix-naughty-loop 2007-11-05 08:03:59.000000000 -0800 +++ linux-2.6.git-dave/fs/namespace.c 2007-11-05 08:35:06.000000000 -0800 @@ -225,16 +225,29 @@ static void lock_and_coalesce_cpu_mnt_wr */ static void handle_write_count_underflow(struct vfsmount *mnt) { - while (atomic_read(&mnt->__mnt_writers) < - MNT_WRITER_UNDERFLOW_LIMIT) { - /* - * It isn't necessary to hold all of the locks - * at the same time, but doing it this way makes - * us share a lot more code. - */ - lock_and_coalesce_cpu_mnt_writer_counts(); - mnt_unlock_cpus(); + if (atomic_read(&mnt->__mnt_writers) >= + MNT_WRITER_UNDERFLOW_LIMIT) + return; + /* + * It isn't necessary to hold all of the locks + * at the same time, but doing it this way makes + * us share a lot more code. + */ + lock_and_coalesce_cpu_mnt_writer_counts(); + /* + * If coalescing the per-cpu writer counts did not + * get us back to a positive writer count, we have + * a bug. + */ + if ((atomic_read(&mnt->__mnt_writers) < 0) && + !(mnt->mnt_flags & MNT_IMBALANCED_WRITE_COUNT)) { + printk("leak detected on mount(%p) writers count: %d\n", + mnt, atomic_read(&mnt->__mnt_writers)); + WARN_ON(1); + /* use the flag to keep the dmesg spam down */ + mnt->mnt_flags |= MNT_IMBALANCED_WRITE_COUNT; } + mnt_unlock_cpus(); } /** diff -puN include/linux/mount.h~fix-naughty-loop include/linux/mount.h --- linux-2.6.git/include/linux/mount.h~fix-naughty-loop 2007-11-05 08:22:21.000000000 -0800 +++ linux-2.6.git-dave/include/linux/mount.h 2007-11-05 08:28:20.000000000 -0800 @@ -32,6 +32,7 @@ struct mnt_namespace; #define MNT_READONLY 0x40 /* does the user want this to be r/o? */ #define MNT_SHRINKABLE 0x100 +#define MNT_IMBALANCED_WRITE_COUNT 0x200 /* just for debugging */ #define MNT_SHARED 0x1000 /* if the vfsmount is a shared mount */ #define MNT_UNBINDABLE 0x2000 /* if the vfsmount is a unbindable mount */ _ - To unsubscribe from this list: send the line "unsubscribe linux-fsdevel" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
| Greg KH | [GIT PATCH] driver core patches against 2.6.24 |
| Tejun Heo | [PATCH 2/5] sysfs: simplify sysfs_rename_dir() |
| Andi Kleen | [PATCH x86] [0/16] Various i386/x86-64 changes |
| Dave Hansen | Re: [RFC/PATCH] Documentation of kernel messages |
git: | |
| Gerrit Renker | [PATCH 15/37] dccp: Set per-connection CCIDs via socket options |
| David Miller | [GIT]: Networking |
| Jarek Poplawski | [PATCH] pkt_sched: Destroy gen estimators under rtnl_lock(). |
| Thomas Gleixner | Re: [BUG] New Kernel Bugs |
