For a long time I've been bitten by a bad interaction between mount -o remount,ro and quota operations. The sequence is as follows:

  mount /fs
  quotaon -ug /fs
  mount -o remount,ro /fs
  umount /fs

At this point, umount never returns. /proc/$pid/wchan shows vfs_quota_off:

Feb 6 20:53:25 linux kernel: umount D e5183eb8 0 8646 1
Feb 6 20:53:25 linux kernel: e5183ecc 00000086 00000002 e5183eb8 e5183eb0 00000000 c1db2540 c1db2684
Feb 6 20:53:25 linux kernel: c1db2684 c1c0dd00 00000000 cfd9f1c0 c0367080 c0367080 f5849000 f7f06880
Feb 6 20:53:25 linux kernel: f7e89d80 00000000 c0367080 b7c9795c 005f3997 00000000 000000ff 00000000
Feb 6 20:53:25 linux kernel: Call Trace:
Feb 6 20:53:25 linux kernel: [<c01a2a65>] vfs_quota_off+0x345/0x490
Feb 6 20:53:25 linux kernel: [<c013a3a0>] autoremove_wake_function+0x0/0x50
Feb 6 20:53:25 linux kernel: [<c0174bf6>] deactivate_super+0x46/0x80
Feb 6 20:53:25 linux kernel: [<c0188bba>] sys_umount+0x4a/0x240
Feb 6 20:53:25 linux kernel: [<c017637f>] sys_stat64+0xf/0x30
Feb 6 20:53:25 linux kernel: [<c0162069>] remove_vma+0x39/0x50
Feb 6 20:53:25 linux kernel: [<c0162b67>] do_munmap+0x197/0x1f0
Feb 6 20:53:25 linux kernel: [<c0188dc5>] sys_oldumount+0x15/0x20
Feb 6 20:53:25 linux kernel: [<c010417e>] sysenter_past_esp+0x5f/0x85

The filesystem is ext3. The issue has been here for a long time, at least since before 2.6.20, and is still present in 2.6.23 (I'll try 2.6.24 later today).

Can it be fixed please? :)

Thanks!

/mjt
--
Of course, thanks for the report :). The problem is that we allow remounting read-only, which we should refuse when quota is enabled. I'll fix that in a minute.

Honza
--
Jan Kara <jack@suse.cz>
SuSE CR Labs
--
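The policy proposed above (refuse remount read-only while quota is enabled) can be sketched as a toy model. This is not kernel code: `ToyFilesystem`, `ReadOnlyRemountRefused` and every method name are hypothetical, and the rule that quotaoff needs a writable filesystem is taken from the guess made in this thread, not from a confirmed diagnosis.

```python
class ReadOnlyRemountRefused(Exception):
    """Raised when a read-only remount is rejected because quota is active."""


class ToyFilesystem:
    """Hypothetical stand-in for a mounted filesystem with VFS-style quota."""

    def __init__(self):
        self.read_only = False
        self.quota_enabled = False

    def quotaon(self):
        if self.read_only:
            raise PermissionError("cannot enable quota on a read-only filesystem")
        self.quota_enabled = True

    def quotaoff(self):
        # Assumption from the thread: turning quota off updates the quota
        # file, so it needs a writable filesystem.
        if self.read_only:
            raise PermissionError("quotaoff needs to write the quota file")
        self.quota_enabled = False

    def remount_ro(self):
        # The proposed fix: refuse instead of entering a state in which
        # a later umount cannot turn quota off.
        if self.quota_enabled:
            raise ReadOnlyRemountRefused("turn quota off before remounting read-only")
        self.read_only = True

    def umount(self):
        # umount turns quota off first, as in the reported call trace.
        if self.quota_enabled:
            self.quotaoff()
        return "unmounted"


if __name__ == "__main__":
    fs = ToyFilesystem()
    fs.quotaon()
    try:
        fs.remount_ro()
    except ReadOnlyRemountRefused as exc:
        print("remount refused:", exc)
    fs.quotaoff()
    fs.remount_ro()
    print(fs.umount())
```

With the check in `remount_ro`, the state "read-only with quota still on" is never reached in this model, so `umount` never has to attempt a write it cannot complete.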
Jan Kara wrote:
[deadlock after remount-ro followed with umount when

Hmm. While that will prevent the lockup, maybe it's better to perform an equivalent of quotaoff on mount-ro instead? Or even do something more useful, like flushing the quota data the same way the rest of the filesystem is flushed to disk, so that on umount quota will not get in the way...

I mean, why does it lock up in the first place? Is the quota subsystem trying to write something to a read-only filesystem? If so, WHY is it trying to do that on umount instead of on remount-ro?

Thanks!

/mjt
--
We couldn't leave quota on when the filesystem is remounted ro, because we need to modify the quota file when quota is being turned off. We could turn off quotas when remounting read-only. As we turn them off during umount, it

Actually, I couldn't reproduce the hang on my testing machine, so I don't know exactly why it hangs. But my guess is that it's because we try to write to the filesystem...

Honza
--
Jan Kara <jack@suse.cz>
SUSE Labs, CR
--
Objection. XFS handles quotas differently, in a way that does not involve modifying a file on the fs, so quotas could stay on (even if it does not make much sense) while the fs is ro.

(Hm, storing quota as files reminds me of the ugly xattr hack in reiserfs3.)
--
Yes, but XFS doesn't give a damn about what we do in VFS with quotas ;) So we are speaking here only about quotas implemented in VFS, and these need writing.

BTW: When a filesystem is remounted read-only, quota information shouldn't change, so it doesn't matter whether you turn it off or leave it on. The only difference is that when you later remount rw, you have to turn

Oh yes... there are some similarities ;). But quota was first! ;)

Honza
--
Jan Kara <jack@suse.cz>
SUSE Labs, CR
--
Wrong and right. XFS uses files for the backing store for quota information, though it does handle quotas very differently to every other Linux filesystem.

On remount-ro, we simply flush all the dirty dquots to their backing store so the dquots can then be treated as ro just like every other object in the filesystem. You don't need to turn off quotas to do this....

Cheers,

Dave.
--
Dave Chinner
Principal Engineer
SGI Australian Software Group
--
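The flush-on-remount idea described above can be illustrated with a small toy model. It is only a sketch of the idea, not XFS code: `ToyQuotaFs`, `charge`, `flush_dquots` and the dict-based "backing store" are all hypothetical names chosen for illustration.

```python
class ToyQuotaFs:
    """Hypothetical model: dirty dquots are flushed when going read-only."""

    def __init__(self):
        self.read_only = False
        self.backing_store = {}  # uid -> usage already persisted
        self.dirty = {}          # uid -> usage changed in memory, not yet written

    def charge(self, uid, blocks):
        """Account `blocks` to `uid`; only possible while writable."""
        if self.read_only:
            raise PermissionError("filesystem is read-only")
        current = self.dirty.get(uid, self.backing_store.get(uid, 0))
        self.dirty[uid] = current + blocks

    def flush_dquots(self):
        """Write every dirty record to the backing store; return the count."""
        written = len(self.dirty)
        self.backing_store.update(self.dirty)
        self.dirty.clear()
        return written

    def remount_ro(self):
        # Flush while writes are still allowed; afterwards the quota
        # records are clean and can be treated as read-only objects.
        self.flush_dquots()
        self.read_only = True

    def umount(self):
        """Return how many quota records still had to be written."""
        if self.read_only and self.dirty:
            raise RuntimeError("dirty quota data on a read-only filesystem")
        return self.flush_dquots()


if __name__ == "__main__":
    fs = ToyQuotaFs()
    fs.charge(1000, 5)
    fs.charge(1000, 3)
    fs.remount_ro()
    print(fs.backing_store, fs.umount())
```

In this model quota stays enabled across the read-only remount, and `umount` has nothing left to write because everything was flushed while the filesystem was still writable.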
Jan Kara wrote:

I can't reproduce it here easily either. Yesterday I had a locked-up console and had to hard-reboot the machine due to this (it was far from the first time I've hit this issue), but "on-demand reproducing" doesn't work (the uptime on that host was about 100 days, and I had to do some repartitioning, hence the remount-ro to copy consistent data to another place; maybe during those 100 days there was something... ;)

And I haven't been able to reproduce it on 2.6.24 so far either (that one is only used on a test machine so far).

I'll keep trying ;)

Thanks for your support!

/mjt
--
