I'm getting similar errors on an x86-32 & x86-64 kernel. The x86-64 system
(2nd log below w/date+times) was unusable this morning: one or more of the
xfs file systems had "gone off line" due to some unknown error (upon reboot,
no errors were indicated; all partitions on the same physical disk).

I keep ending up with random failures on two systems. The 32-bit sys more
often than not just "locks-up" -- no messages, no keyboard response...etc.
Nothing to do except reset -- and of course, nothing in the logs....

I'm turning on all the stat and diagnostics that don't seem to have a noted
performance penalty (or not much of one) to see if that helps. Perhaps one
issue is the "bug" (looks like multiple instances of the same bug) in xfs.

The 32-bit machine does lots of disk activity in early morning hours...and
coincidentally, it seemed to be crashing (not consistently) in the morning.
With 2.6.24, the 32-bit machine "seems" a bit more stable -- up for over 3
days now (average before was <48 hours). But the 64-bit machine went
bonkers (not sure of exact time) -- It *had* been stable (non crashing,
anyway) before 2.6.24.

I vaguely remember there being a similar xfs lock bug a few versions back
and was told just "not to worry about it" and to turn off the lock checking to
"avoid the problem"....( :^| ). So I did, but I'm still trying to track down
randomness. So I turn back on what checks I could and ... ding -- still a
problem in the xfs area -- which "coincidentally" (has happened on more than
one occasion), an xfs partition developed a run-time error and shut itself
down. This also happened on the 32-bit machine with the SATA disk (thought
it might be sata specific), so removed its controller and disk and threw in
a same-size PATA. No more file-system errors on 32bit. But first time I've
had a filesystem 'forced offline' on the 64bit (but just switched to 2.6.24
recently for obvious reasons).

Sadly -- it also seemed to be the case that the 32bit machine when it had
the SATA c...
Filesystem bugs rarely hang systems hard like that - more likely is
a hardware or driver problem. And neither of the lockdep reports
below are likely to be responsible for a system wide, no-response
hang.

[cut a bunch of speculation and stuff about hardware problems]

If your hardware or drivers are unstable, then XFS cannot be
expected to reliably work. Given that xfs_fsr apparently triggers
the hangs, I'd suggest putting lots of I/O load on your disk subsystem
by copying files around with direct I/O (just like xfs_fsr does) to
try to reproduce the problem.

Perhaps by running xfs_fsr manually you could reproduce the
problem while you are sitting in front of the machine...

dio_get_page() takes the mmap_sem of the process's
vma that has the pages we do I/O into. That's not new.
We're holding the xfs inode iolock at this point to protect
against truncate and simultaneous buffered I/O races and

munmap() dropping the last reference to its vm_file and
calling ->release() which causes a truncate of speculatively
allocated space to take place. IOWs, ->release() is called
with the mmap_sem held. Hmmm....

Looking at it in terms of i_mutex, other filesystems hold
i_mutex over dio_get_page() (all those that use DIO_LOCKING)
so question is whether we are allowed to take the i_mutex
in ->release. I note that both reiserfs and hfsplus take
i_mutex in ->release as well as use DIO_LOCKING, so this
problem is not isolated to XFS.

However, it would appear that mmap_sem -> i_mutex is illegal
according to the comment at the head of mm/filemap.c. While we are
not using i_mutex in this case, the inversion would seem to be
equivalent in nature.

There's not going to be a quick fix for this.
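
To make the inversion above a bit more concrete, here is a minimal sketch of
the kind of bookkeeping a lock-order checker does. It is not lockdep's code;
the tracker, the two lock class names and the two "path" helpers are all
hypothetical and only mirror the orderings discussed above: a direct I/O read
that holds the iolock and then takes mmap_sem, and a munmap() that holds
mmap_sem when ->release() ends up needing the iolock.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

/* Hypothetical lock classes, for illustration only. */
enum lock_class { MMAP_SEM, XFS_IOLOCK, NR_CLASSES };

struct order_tracker {
    /* seen[a][b] is true once b has been taken while a was held. */
    bool seen[NR_CLASSES][NR_CLASSES];
    bool held[NR_CLASSES];
};

static void tracker_init(struct order_tracker *t)
{
    memset(t, 0, sizeof(*t));
}

/* Record the acquisition; return true if it inverts an earlier order. */
static bool tracker_acquire(struct order_tracker *t, enum lock_class c)
{
    bool inversion = false;

    assert(c < NR_CLASSES);
    for (int h = 0; h < NR_CLASSES; h++) {
        if (!t->held[h] || h == (int)c)
            continue;
        if (t->seen[c][h])      /* earlier: c held, then h taken */
            inversion = true;
        t->seen[h][c] = true;   /* now: h held, then c taken */
    }
    t->held[c] = true;
    return inversion;
}

static void tracker_release(struct order_tracker *t, enum lock_class c)
{
    assert(c < NR_CLASSES);
    t->held[c] = false;
}

/* Assumed ordering on the direct I/O read side: iolock, then mmap_sem. */
static bool direct_io_read_path(struct order_tracker *t)
{
    bool inv = tracker_acquire(t, XFS_IOLOCK);

    inv = tracker_acquire(t, MMAP_SEM) || inv;
    tracker_release(t, MMAP_SEM);
    tracker_release(t, XFS_IOLOCK);
    return inv;
}

/* Assumed ordering on the munmap side: mmap_sem, then iolock in ->release(). */
static bool munmap_release_path(struct order_tracker *t)
{
    bool inv = tracker_acquire(t, MMAP_SEM);

    inv = tracker_acquire(t, XFS_IOLOCK) || inv;
    tracker_release(t, XFS_IOLOCK);
    tracker_release(t, MMAP_SEM);
    return inv;
}
```

Running the read path first and the munmap path second makes the second one
report an inversion, even though nothing actually deadlocked in that run. That
is the sense in which a checker can warn about a possible deadlock that has
never been hit in practice.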
Looks like yet another false positive. Basically we do this
in xfs_swap_extents:

inode A: i_iolock class 2
inode A: i_ilock class 2
inode B: i_iolock class 3
inode B: i_ilock class 3
.....
inode A: unlock ilock
inode B: unlock ilock
inode B: ilock class 3

And...
Well, tickless is new and shiny and I doubt anyone has done
much testing with XFS on tickless kernels. Still, if that's a new
config option you set, change it back to what you had for .23 on

If you have a multithreaded application that mixes mmap and
direct I/O, and you have a simultaneous munmap() call and
read() to the same file, you might be able to deadlock access
to that file. However, you'd have to be certifiably insane
to write an application that did this (mix mmap and direct I/O
to the same file at the same time), so I think exposure is

That's client side direct I/O, which is not what the server
does. Client side direct I/O results in synchronous buffered
I/O on the server, which will thrash your disks pretty hard.

It prevents a single thread deadlock when doing transaction reservation.
i.e. the process of setting up a transaction can require the ilock
to be taken, and hence we have to drop it before and pick it back up
after the transaction reservation.

We hold on to the iolock to prevent the inode from having new I/O
started while we do the transaction reservation, so it's in the
same state after the reservation as it was before....

We have to hold both locks to guarantee exclusive access to the
inode, so once we have the reservation we need to pick the ilocks
back up. The way we do it here does not violate lock ordering at all
(iolock before ilock on a single inode, and ascending inode number
order for multiple inodes), but lockdep is not smart enough to know
that. Hence we need more complex annotations to shut it up.

Cheers,
Dave.
--
Dave Chinner
Principal Engineer
SGI Australian Software Group
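
For readers following the locking description in the message above, here is a
rough user-space sketch of the sequence: take both locks on each inode in
ascending inode number order, drop only the ilocks around the transaction
reservation, then pick the ilocks back up in the same order. This is not the
XFS code. The struct, the helper names and the trace buffer are hypothetical,
and the "locks" are plain flags so the ordering can be checked without threads.

```c
#include <assert.h>
#include <stddef.h>

enum lock_kind { KIND_IOLOCK, KIND_ILOCK };

/* Hypothetical stand-in for an inode with its two locks. */
struct sketch_inode {
    unsigned long ino;
    int iolock_held;
    int ilock_held;
};

/* Records the order in which locks were taken. */
struct lock_trace {
    struct { unsigned long ino; enum lock_kind kind; } e[16];
    size_t n;
};

static void take(struct lock_trace *tr, struct sketch_inode *ip,
                 enum lock_kind kind)
{
    if (kind == KIND_IOLOCK) {
        assert(!ip->iolock_held && !ip->ilock_held);
        ip->iolock_held = 1;
    } else {
        /* iolock before ilock on a single inode */
        assert(ip->iolock_held && !ip->ilock_held);
        ip->ilock_held = 1;
    }
    assert(tr->n < 16);
    tr->e[tr->n].ino = ip->ino;
    tr->e[tr->n].kind = kind;
    tr->n++;
}

/* Arrange the pair so that *lo has the smaller inode number. */
static void sort_pair(struct sketch_inode **lo, struct sketch_inode **hi)
{
    if ((*lo)->ino > (*hi)->ino) {
        struct sketch_inode *tmp = *lo;
        *lo = *hi;
        *hi = tmp;
    }
}

/* Take iolock and ilock on both inodes, lower inode number first. */
static void lock_both(struct lock_trace *tr, struct sketch_inode *a,
                      struct sketch_inode *b)
{
    sort_pair(&a, &b);
    take(tr, a, KIND_IOLOCK);
    take(tr, a, KIND_ILOCK);
    take(tr, b, KIND_IOLOCK);
    take(tr, b, KIND_ILOCK);
}

/* Drop only the ilocks; the iolocks stay held to keep new I/O out. */
static void drop_ilocks(struct sketch_inode *a, struct sketch_inode *b)
{
    a->ilock_held = 0;
    b->ilock_held = 0;
}

/* After the reservation, retake the ilocks in ascending inode order. */
static void retake_ilocks(struct lock_trace *tr, struct sketch_inode *a,
                          struct sketch_inode *b)
{
    sort_pair(&a, &b);
    take(tr, a, KIND_ILOCK);
    take(tr, b, KIND_ILOCK);
}
```

The retake step acquires the lower inode's ilock while the higher inode's
iolock is already held. The stated ordering rule allows that, but a checker
that only compares lock classes pairwise could flag it, which would fit the
false positive described above.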
--
4k stacks?
-Eric
--
----
But but but...almost from the day they were introduced. And
-- guess it doesn't work so well?

If they are that problematic, maybe selecting xfs as a config
option should force 8k stacks (ugly solution, but might eliminate
some lost hair (from pulling it out) for end users)....?

Guess I should go back to 8k's for now...seems odd that it'd
pop up now, but maybe it's the xtra NFS loading? Sigh.
--
Resource requirements grow over time, film at 11? :)
the checker is a random thing, it checks only on interrupts; it won't
always hit. you could try CONFIG_DEBUG_STACK_USAGE too, each thread
prints max stack used when it exits, to see if you're getting close on
normal usage.

Or just use 8k.
-Eric
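
As an illustration of how a "max stack used" figure like the one mentioned
above can be measured, here is a small user-space sketch. It assumes a stack
region that starts out zero-filled and grows downward from the high end, so
the untouched part is the run of zero words at the low end. The function
names and the region size are hypothetical; this shows the general
high-water-mark idea and is not the kernel's implementation.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define SKETCH_STACK_WORDS 1024   /* hypothetical region size, in words */

/* Count words at the low end of the region that were never written. */
static size_t stack_words_unused(const unsigned long *stack, size_t nwords)
{
    size_t i = 0;

    assert(stack != NULL);
    while (i < nwords && stack[i] == 0)
        i++;
    return i;
}

/* Deepest point the stack ever reached, measured from the high end. */
static size_t stack_words_used_max(const unsigned long *stack, size_t nwords)
{
    return nwords - stack_words_unused(stack, nwords);
}
```

A stack word that legitimately holds zero at the deepest point would make
this under-report slightly, so the number is an estimate. Like the interrupt
time checker, it can only tell you about paths that were actually exercised.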
--
