non-zero exclusive count (Re: locking against myself in getcacheblk()?)

Previous thread: wlan config crash by Edward Smith on Friday, December 24, 2010 - 12:52 pm. (1 message)

Next thread: hammer assertion panic by Peter Avalos on Sunday, December 26, 2010 - 4:17 pm. (1 message)
From: YONETANI Tomokazu
Date: Sunday, December 26, 2010 - 9:00 am

Hi.
I managed to trigger this panic while trying to find out how to
reliably reproduce another panic (lockmgr: non-zero exclusive count).
Just issue the following command on a machine running recent -DEVELOPMENT
(either on i386 or x86_64):

$ grep -r --mmap SomeString /usr/pkgsrc

If this won't trigger the panic but you need kern/vmcore, please let me
know and I can upload mine to my leaf account.

Best Regards,
YONETANI Tomokazu.
From: Matthew Dillon
Date: Sunday, December 26, 2010 - 12:09 pm

How much RAM do you have in that machine (so I can reproduce the test)?
    I have a feeling that cache cycling of VM pages might tend to trigger
    the (second) panic more often.

    non-zero exclusive counts usually mean an extra lock release or a
    missing lock acquisition for a lockmgr lock.  It can be a little
    tricky if it is a vnode lock since a completely unrelated bit of
    code might be causing the situation and then later on a perfectly
    fine piece of code triggers it.
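
To make the point above concrete, here is a minimal userland sketch of how an unbalanced release can leave a count stale so that a later, correct caller is the one that trips the check. It is a toy model under stated assumptions, not the DragonFly lockmgr code: the struct and the toy_* names are hypothetical, and the -1 return stands in for where a kernel would panic.

```c
/* Toy model of an exclusive-count check; NOT the DragonFly lockmgr code. */
struct toy_lock {
    int exclusivecount;   /* expected to be 0 whenever the lock is free */
};

/* Acquire: returns -1 where a kernel would panic on a stale count. */
static int toy_acquire(struct toy_lock *lk)
{
    if (lk->exclusivecount != 0)
        return -1;
    lk->exclusivecount = 1;
    return 0;
}

/* Release: decrements without checking, so an extra call goes unnoticed. */
static void toy_release(struct toy_lock *lk)
{
    lk->exclusivecount--;
}

/* Balanced acquire/release pairs: the next acquire succeeds. */
static int toy_demo_balanced(void)
{
    struct toy_lock lk = { 0 };

    if (toy_acquire(&lk) != 0)
        return 1;
    toy_release(&lk);
    return toy_acquire(&lk);
}

/* One extra release: the count ends up non-zero while the lock is free. */
static int toy_demo_extra_release(void)
{
    struct toy_lock lk = { 0 };

    if (toy_acquire(&lk) != 0)
        return 1;
    toy_release(&lk);
    toy_release(&lk);          /* the actual bug happens here */
    return toy_acquire(&lk);   /* but the failure is reported here */
}
```

In this sketch the faulty code is the second toy_release(), yet the failure only shows up in the following toy_acquire(), which is why the backtrace at panic time need not point at the culprit.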

					-Matt
					Matthew Dillon 
					<dillon@backplane.com>
From: YONETANI Tomokazu
Date: Sunday, December 26, 2010 - 9:57 pm

As for non-zero exclusive count panic, it's odd because although
lockmgr() triggered the non-zero exclusive count panic, kgdb shows
that lkp->lk_exclusivecount == 0 (shown below).  Usually the panic is
followed by several more panics even while the dump routine is running,
but luckily I managed to save the crash dump and put it on my leaf account
as ~y0netan1/crash/{kern,vmcore}.21 (these were saved yesterday, so this
kernel doesn't have the patch you suggested in another message yet).

				:
#14 0xffffffff802a6cc4 in panic (
    fmt=0xffffffff80508548 "lockmgr: non-zero exclusive count")
    at /usr/obj/src.test/sys/kern/kern_shutdown.c:783
#15 0xffffffff80298497 in lockmgr (lkp=0xffffffe060e72878,
    flags=<value optimized out>) at /usr/obj/src.test/sys/kern/kern_lock.c:369
#16 0xffffffff8042e387 in vm_map_find (map=0xffffffe060e72800, object=0x0,
    offset=0, addr=0xffffffe060410ac0, length=32768, align=4096, fitit=1,
    maptype=1 '\001', prot=3 '\003', max=7 '\a', cow=0)
    at /usr/obj/src.test/sys/vm/vm_map.c:1201
#17 0xffffffff8043175a in vm_mmap (map=0xffffffe060e72800,
    addr=0xffffffe060410ac0, size=<value optimized out>, prot=3 '\003',
    maxprot=7 '\a', flags=<value optimized out>, handle=0x0, foff=0)
    at /usr/obj/src.test/sys/vm/vm_mmap.c:1362
#18 0xffffffff80431d61 in kern_mmap (vms=0xffffffe060e72800,
    uaddr=<value optimized out>, ulen=<value optimized out>,
    uprot=<value optimized out>, uflags=<value optimized out>, fd=-1, upos=0,
    res=0xffffffe060410b58) at /usr/obj/src.test/sys/vm/vm_mmap.c:397
				:
(kgdb) fr 15
#15 0xffffffff80298497 in lockmgr (lkp=0xffffffe060e72878,
    flags=<value optimized out>) at /usr/obj/src.test/sys/kern/kern_lock.c:369
369                             panic("lockmgr: non-zero exclusive count");
(kgdb) p *lkp
$1 = {lk_spinlock = {lock = 0}, lk_flags = 3146240, lk_sharecount = 1,
  lk_waitcount = 3, lk_exclusivecount = 0, lk_unused1 = 0,
  lk_wmesg = 0xffffffff80537872 "thrd_sleep", lk_timo = 0,
  lk_lockholder ...
From: Matthew Dillon
Date: Monday, December 27, 2010 - 10:21 am

Here's a bug fix for the second bug... the recursive lock --mmap issue.
    Please test it (with the originally suggested #if 0 workaround removed).
    This fix is for HAMMER; if it works I'll do the same thing w/ NFS and UFS.

	fetch http://apollo.backplane.com/DFlyMisc/buf01.patch

    We still have a recursive lock or deadlock possibility with write()'s
    which isn't quite so easy to fix.  I'm not quite sure what to do there.

    I'll review the bisect range for the first bug and see if I can find
    where the missing lock pairing is.

						-Matt
From: YONETANI Tomokazu
Date: Monday, January 3, 2011 - 7:01 pm

Apparently the previous versions (I haven't tried other versions than
2.8-RELEASE yet) are affected too, and the above patch seems to fix it
as well.
From: Matthew Dillon
Date: Monday, December 27, 2010 - 4:01 pm

Hmm.  The inconsistency is probably due to the secondary panics.
    Run 'dmesg -M vmcore.21 -N kern.21'; two panics occurred at the same
    time and their kprintf output got mixed in together.  Once a panic
    occurs numerous subsystems will allow locks through whether they can
    acquire them or not.

    Now on the non-zero exclusive count issue.  I don't see any mismatched
    lock/unlock operations but I suspect it may be the TDF_DEADLKTREAT
    code allowing an exclusive lock to worm its way in between an
    EXCLUPGRADE request (because it doesn't block on LK_WANT_UPGRADE in
    that TDF_DEADLKTREAT case).

    The way to test this is to comment out the setting of TDF_DEADLKTREAT
    in kern/kern_subr.c.  There's just one place where it happens.

    If that turns out to fix the problem then I think we may be forced
    to get rid of the shared->exclusive upgrade code entirely or restrict
    it to less sophisticated cases.  The whole shared->exclusive priority
    mechanism in lockmgr has had deadlock issues for a long time.

						-Matt


From: Matthew Dillon
Date: Sunday, December 26, 2010 - 12:39 pm

Ah!!  I reproduced it easily w/ 8G ram on my 64-bit test box!

					-Matt
					Matthew Dillon 
					<dillon@backplane.com>
From: Matthew Dillon
Date: Sunday, December 26, 2010 - 1:07 pm

Ok, this grep caused a 'lockmgr: locking against myself' panic
    due to the grep code doing a read() system call INSIDE the same
    mmap()'d file, causing the uiomove/copyout call to overlap the
    buffer cache buffer being held by the read().
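
The access pattern described above can be sketched in userland roughly as follows: map a file, then read() the same file with the mapping itself as the destination buffer, so the kernel's copy-out lands in pages backed by the file being read. This is only an illustration of the pattern; demo_read_into_own_mapping, DEMO_LEN and the temporary file path are hypothetical, and the thread does not confirm that this exact sequence reproduces the panic. It completes normally on an unaffected kernel and should not be run on a machine suspected of having the bug.

```c
#include <stdlib.h>
#include <string.h>
#include <sys/mman.h>
#include <sys/types.h>
#include <unistd.h>

#define DEMO_LEN 4096   /* hypothetical size of the demo file */

/*
 * Create a scratch file, map it, then read() the file into its own
 * mapping.  Returns the number of bytes read, or -1 on any error.
 */
static ssize_t demo_read_into_own_mapping(void)
{
    char path[] = "/tmp/mmapread.XXXXXX";
    char block[DEMO_LEN];
    ssize_t n = -1;
    void *map;
    int fd;

    fd = mkstemp(path);
    if (fd < 0)
        return -1;
    unlink(path);               /* file disappears once fd is closed */

    memset(block, 'x', sizeof(block));
    if (write(fd, block, sizeof(block)) != (ssize_t)sizeof(block))
        goto out;

    map = mmap(NULL, DEMO_LEN, PROT_READ | PROT_WRITE, MAP_SHARED, fd, 0);
    if (map == MAP_FAILED)
        goto out;

    /* The destination of read() is the mapping of the same file. */
    if (lseek(fd, 0, SEEK_SET) == 0)
        n = read(fd, map, DEMO_LEN);

    munmap(map, DEMO_LEN);
out:
    close(fd);
    return n;
}
```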

    Hmm.  It looks like it hit a read-ahead-mark and called readrest
    on a valid page.  This should be an allowed operation but I'm going
    to think a bit on how to fix it.  I'll probably have to add a hold_count
    field to the buffer cache buf structure.

    You should be able to temporarily work around the bug by commenting
    out the PG_RAM test on line 1083 of vm/vm_fault.c:

#if 0
	if (fs->m->flags & PG_RAM) {
		if (debug_cluster)
			kprintf("R");
		vm_page_flag_clear(fs->m, PG_RAM);
		goto readrest;
	}
#endif

    This code isn't the bug, but it is probably triggering the bug.  If
    that fixes this secondary issue for you, you can go back to finding
    the primary panic you were trying to track down and I will continue
    looking into this one figuring out how to fix it properly.

						-Matt
