Re: xfs [_fsr] probs in 2.6.24.0

To: Linda Walsh <xfs@...>
Cc: <xfs@...>, Linux-Kernel <linux-kernel@...>
Date: Tuesday, February 12, 2008 - 4:58 am

On Mon, Feb 11, 2008 at 05:02:05PM -0800, Linda Walsh wrote:

Filesystem bugs rarely hang systems hard like that - a hardware or
driver problem is more likely. And neither of the lockdep reports
below is likely to be responsible for a system-wide, no-response
hang.

[cut a bunch of speculation and stuff about hardware problems
causing XFS problems]


If your hardware or drivers are unstable, then XFS cannot be
expected to reliably work. Given that xfs_fsr apparently triggers
the hangs, I'd suggest putting lots of I/O load on your disk subsystem
by copying files around with direct I/O (just like xfs_fsr does) to
try to reproduce the problem.

Perhaps by running xfs_fsr manually you could reproduce the
problem while you are sitting in front of the machine...

Looking at the lockdep reports:


dio_get_page() takes the mmap_sem of the process's
vma that has the pages we do I/O into. That's not new.
We're holding the xfs inode iolock at this point to protect
against truncate and simultaneous buffered I/O races and
this is also unchanged. i.e. this is normal.


This is munmap() dropping the last reference to its vm_file and
calling ->release(), which causes a truncate of speculatively
allocated space to take place. IOWs, ->release() is called
with the mmap_sem held. Hmmm....

Looking at it in terms of i_mutex, other filesystems hold
i_mutex over dio_get_page() (all those that use DIO_LOCKING)
so the question is whether we are allowed to take the i_mutex
in ->release. I note that both reiserfs and hfsplus take
i_mutex in ->release as well as use DIO_LOCKING, so this
problem is not isolated to XFS.

However, it would appear that mmap_sem -> i_mutex is illegal
according to the comment at the head of mm/filemap.c. While we are
not using i_mutex in this case, the inversion would seem to be
equivalent in nature.
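
To make the inversion concrete, here is a toy lock-order checker in
Python. It is only a sketch of the general idea (record "held while
taking" edges, flag an acquisition that closes a cycle) and is not how
lockdep is implemented. The class and method names are made up for
illustration; the lock labels are just strings standing in for the
iolock and mmap_sem discussed above.

```python
from collections import defaultdict


class LockOrderGraph:
    """Toy validator: remembers 'A was held while taking B' edges."""

    def __init__(self):
        self.edges = defaultdict(set)

    def _reachable(self, start, goal):
        # Depth-first search over the recorded ordering edges.
        stack, seen = [start], set()
        while stack:
            node = stack.pop()
            if node == goal:
                return True
            if node in seen:
                continue
            seen.add(node)
            stack.extend(self.edges[node])
        return False

    def record_path(self, name, locks):
        """Record one code path taking `locks` in order.

        Returns a list of (path, held_lock, new_lock) inversions.
        """
        problems = []
        held = []
        for lock in locks:
            for h in held:
                # Adding h -> lock closes a cycle if lock already leads to h.
                if self._reachable(lock, h):
                    problems.append((name, h, lock))
                self.edges[h].add(lock)
            held.append(lock)
        return problems


graph = LockOrderGraph()
# Direct I/O: iolock held, then dio_get_page() takes mmap_sem.
dio = graph.record_path("direct I/O", ["xfs_iolock", "mmap_sem"])
# munmap(): mmap_sem held, then ->release() truncates under the iolock.
unmap = graph.record_path("munmap release", ["mmap_sem", "xfs_iolock"])
print(dio)
print(unmap)
```

The first path records iolock -> mmap_sem without complaint; the second
path is the one that gets flagged, because it asks for the opposite
order.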

There's not going to be a quick fix for this.

And the other one:


Looks like yet another false positive. Basically we do this
in xfs_swap_extents:

	inode A: i_iolock class 2
	inode A: i_ilock class 2
	inode B: i_iolock class 3
	inode B: i_ilock class 3
	.....
	inode A: unlock ilock
	inode B: unlock ilock
	.....
	inode A: ilock class 2
	inode B: ilock class 3

And lockdep appears to be complaining about the relocking of inode A
as class 2 because we've got a class 3 iolock still held, hence
violating the order it saw initially.  There's no possible deadlock
here so we'll just have to add more hacks to the annotation code to make
lockdep happy.
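
For illustration, the sequence can be replayed through a toy class-order
checker in Python. This is a rough model under my own assumptions, not
lockdep's real algorithm: it only remembers pairs of lock classes seen
as "held, then taken" and complains when the reverse pair shows up. The
class labels such as "ilock/2" are invented shorthand for "ilock,
class 2", and the trace includes the relock of inode A as class 2
described above.

```python
def check_trace(trace):
    """Replay (op, lock_class) events and return class-order complaints.

    Each complaint is (new_class, held_class): new_class was taken while
    held_class was held, although the opposite order was seen earlier.
    """
    held = []        # lock classes currently held, in acquisition order
    seen = set()     # (held_class, new_class) pairs observed so far
    complaints = []
    for op, cls in trace:
        if op == "unlock":
            held.remove(cls)
            continue
        for h in held:
            if (cls, h) in seen:
                complaints.append((cls, h))
            seen.add((h, cls))
        held.append(cls)
    return complaints


swap_extents = [
    ("lock", "iolock/2"),    # inode A
    ("lock", "ilock/2"),     # inode A
    ("lock", "iolock/3"),    # inode B
    ("lock", "ilock/3"),     # inode B
    ("unlock", "ilock/2"),   # inode A
    ("unlock", "ilock/3"),   # inode B
    ("lock", "ilock/2"),     # inode A relocked, class 3 iolock still held
    ("lock", "ilock/3"),     # inode B relocked
]

complaints = check_trace(swap_extents)
print(complaints)
```

The only complaint is the class 2 ilock being taken while the class 3
iolock is held, which is the order violation described above. The
checker sees classes only, so it has no way to know that no other path
takes these locks the other way around.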

Cheers,

Dave.
-- 
Dave Chinner
Principal Engineer
SGI Australian Software Group

Messages in current thread:
xfs [_fsr] probs in 2.6.24.0, Linda Walsh, (Mon Feb 11, 9:02 pm)
Re: xfs [_fsr] probs in 2.6.24.0, David Chinner, (Tue Feb 12, 4:58 am)
Re: xfs [_fsr] probs in 2.6.24.0, Linda Walsh, (Tue Feb 12, 5:02 pm)
Re: xfs [_fsr] probs in 2.6.24.0, David Chinner, (Tue Feb 12, 5:59 pm)
Re: xfs [_fsr] probs in 2.6.24.0, Eric Sandeen, (Tue Feb 12, 5:24 pm)
Re: xfs [_fsr] probs in 2.6.24.0, Linda Walsh, (Tue Feb 12, 5:44 pm)
Re: xfs [_fsr] probs in 2.6.24.0, Eric Sandeen, (Tue Feb 12, 5:54 pm)