Re: panic: System call lstat returning with 1 locks held

To: <freebsd-current@...>
Date: Tuesday, January 15, 2008 - 9:52 am

When I boot a Jan 13th or Jan 15th kernel, and then run
/usr/local/etc/cvsup/update.sh to update the local CVS repository, I
get the following panic:

panic: System call lstat returning with 1 locks held
cpuid = 0
KDB: enter: panic
[thread ; pid 1240 tid 10031]
stopped at kdb_enter+0x3d: movq $0,0x41b048(%rip)
db> show alllocks
db> show locks
db> bt
tracing pid 1240 tid 10031 td 0xffffff001c1ad360
kdb_enter() at kdb_enter+0x3d
panic() at panic+0x176
syscall() at syscall+0x66d
Xfast_syscall() at Xfast_syscall+0xab
--- syscall (0, FreeBSD ELF64, nosys), rip = 0x8009e87ec, rsp = 0x72ec50, rbp = 0x72ed28 ---

----
$ strings /boot/kernel_hp_debug/kernel | grep CURRENT
@(#)FreeBSD 8.0-CURRENT #0: Tue Jan 15 01:30:50 CST 2008
FreeBSD 8.0-CURRENT #0: Tue Jan 15 01:30:50 CST 2008
8.0-CURRENT

$ strings /boot/kernel_hp_debug.old/kernel | grep CURRENT
@(#)FreeBSD 8.0-CURRENT #0: Sun Jan 13 13:12:56 CST 2008
FreeBSD 8.0-CURRENT #0: Sun Jan 13 13:12:56 CST 2008
8.0-CURRENT
---

When I try to look at the core file that gets generated, kgdb is
having problems reading it:

hp010# cd /sys/amd64/compile/DV8135NR
hp010# kgdb -n 14 kernel.debug
kgdb: kvm_read: invalid address (0x1050000)
[GDB will not be able to debug user-mode threads:
/usr/lib/libthread_db.so: Undefined symbol "ps_pglobal_lookup"]
GNU gdb 6.1.1 [FreeBSD]
Copyright 2004 Free Software Foundation, Inc.
GDB is free software, covered by the GNU General Public License, and you are
welcome to change it and/or distribute copies of it under certain conditions.
Type "show copying" to see the conditions.
There is absolutely no warranty for GDB. Type "show warranty" for details.
This GDB was configured as "amd64-marcel-freebsd".
Ready to go. Enter 'tr' to connect to the remote target
with /dev/cuad0, 'tr /dev/cuad1' to connect to a different port
or 'trf portno' to connect to the remote target with the firewire
interface. portno defaults to 5556.

Type 'getsyms' after connection to load kld symbols.

If you'...

To: Scot Hetzel <swhetzel@...>
Cc: <freebsd-current@...>
Date: Tuesday, January 15, 2008 - 10:39 am

I think this could be related to the recent vn_lock()/VOP_LOCK() KPI changes.
Please add DEBUG_VFS_LOCKS to the kernel config, and run
show lockedvnods
from the ddb prompt when the panic occurs. The witness does not track

To: Kostik Belousov <kostikbel@...>
Cc: Attilio Rao <attilio@...>, <freebsd-current@...>
Date: Thursday, January 24, 2008 - 8:28 am

I think I'm seeing the same panic on UFS. It's rather nasty: I
cannot rebuild CURRENT natively because of it, so I have to build it
under 6-STABLE. My favourite way to trigger the panic reliably is
running `make install' in a simple port directory, e.g., portmaster,
but my system also panics during the daily scripts run and, as already
said, when trying to build world.

Now my kernel config is:

include GENERIC
ident "_LOCKTEST"
options DEBUG_VFS_LOCKS

Attached is the debug output after the panic. Alas, no locked
vnodes are shown. How can I help to investigate this issue
further?

Yar

panic: System call lstat returning with 1 locks held
cpuid = 0
KDB: enter: panic
[thread pid 2024 tid 100102 ]
Stopped at kdb_enter+0x3a: movl $0,kdb_why
db> show lockedvnods
Locked vnodes
db> where
Tracing pid 2024 tid 100102 td 0xc3720000
kdb_enter(c0b0bdc0,c0b0bdc0,c0b3c319,d6218c8c,0,...) at kdb_enter+0x3a
panic(c0b3c319,c0b11902,1,c0b11902,c0bbf390,...) at panic+0x12c
syscall(d6218d38) at syscall+0x46e
Xint0x80_syscall() at Xint0x80_syscall+0x20
--- syscall (0, FreeBSD ELF32, nosys), eip = 0x2815da8b, esp = 0xbfbfe86c, ebp = 0xbfbfe8f8 ---
_______________________________________________
freebsd-current@freebsd.org mailing list
http://lists.freebsd.org/mailman/listinfo/freebsd-current
To unsubscribe, send any mail to "freebsd-current-unsubscribe@freebsd.org"

To: Yar Tikhiy <yar@...>
Cc: Kostik Belousov <kostikbel@...>, <freebsd-current@...>
Date: Thursday, January 24, 2008 - 9:18 am

Yar,
as it seems reproducible for you, can you please add this patch to the tree:
http://www.freebsd.org/~attilio/debug_tdlocks.diff

compile your kernel with:
options KTR
options KTR_COMPILE=(KTR_SPARE2)
options KTR_MASK=(KTR_SPARE2)
options KTR_ENTRIES=32768

Thanks,
Attilio

--
Peace can only be achieved by understanding - A. Einstein

To: Attilio Rao <attilio@...>
Cc: <freebsd-current@...>
Date: Saturday, January 26, 2008 - 12:15 am

I added the above options to my kernel, and performed a scripted textdump.

/sbin/ddb script lockinfo="show locks; show alllocks; show lockedvnods"
/sbin/ddb script kdb.enter.panic="textdump set; capture on; show ktr; run lockinfo; show pcpu; bt; ps; alltrace; capture off; call doadump; reset"

After the kernel panicked, the kdb.enter.panic script ran and created a
textdump. When I extracted the 2.7M ddb.txt file, it didn't show any
calls to lockmgr_disown in the ktr trace.

Let me know if there is anything else that I can do.

To get this dump, I had to increase DB_CAPTURE_MAXBUFSIZE
(sys/ddb/db_capture.c) from its default of 512K to 5M, and then set
debug.ddb.capture.bufsize to 5M after rebooting with the new kernel.

See PR 119993 (http://www.freebsd.org/cgi/query-pr.cgi?pr=119993)
which adds two new kernel options to allow the capture buffer size to
be changed at compile time.

Scot

To: Scot Hetzel <swhetzel@...>
Cc: <freebsd-current@...>
Date: Saturday, January 26, 2008 - 9:09 am

Scot,
thanks a lot for your effort.

I think I will produce more patches which can help with diagnosis very
soon, so that we can gather more information and see what is the real

Oh, there is a PR bugathon too, this week... :)

Attilio

--
Peace can only be achieved by understanding - A. Einstein

To: Attilio Rao <attilio@...>
Cc: Kostik Belousov <kostikbel@...>, <freebsd-current@...>
Date: Thursday, January 24, 2008 - 1:47 pm

The first lines of the ktr output are attached. I can try to extract
all 32768 lines if needed, but they all look almost the same. I.e.,
the function name and the td pointer are the same, while the td_locks
bump is 1 or -1.

--
Yar

db> show ktr
430 (0xc37e1220:cpu0): _lockmgr: 0xc37e1220 bumping of -1 td_locks
429 (0xc37e1220:cpu0): _lockmgr: 0xc37e1220 bumping of -1 td_locks
428 (0xc37e1220:cpu0): _lockmgr: 0xc37e1220 bumping of -1 td_locks
427 (0xc37e1220:cpu0): _lockmgr: 0xc37e1220 bumping of 1 td_locks
426 (0xc37e1220:cpu0): _lockmgr: 0xc37e1220 bumping of 1 td_locks
425 (0xc37e1220:cpu0): _lockmgr: 0xc37e1220 bumping of -1 td_locks
424 (0xc37e1220:cpu0): _lockmgr: 0xc37e1220 bumping of 1 td_locks
423 (0xc37e1220:cpu0): _lockmgr: 0xc37e1220 bumping of 1 td_locks
422 (0xc37e1220:cpu0): _lockmgr: 0xc37e1220 bumping of -1 td_locks
421 (0xc37e1220:cpu0): _lockmgr: 0xc37e1220 bumping of 1 td_locks
420 (0xc37e1220:cpu0): _lockmgr: 0xc37e1220 bumping of 1 td_locks
419 (0xc37e1220:cpu0): _lockmgr: 0xc37e1220 bumping of -1 td_locks
418 (0xc37e1220:cpu0): _lockmgr: 0xc37e1220 bumping of -1 td_locks
417 (0xc37e1220:cpu0): _lockmgr: 0xc37e1220 bumping of -1 td_locks
416 (0xc37e1220:cpu0): _lockmgr: 0xc37e1220 bumping of 1 td_locks
415 (0xc37e1220:cpu0): _lockmgr: 0xc37e1220 bumping of 1 td_locks
414 (0xc37e1220:cpu0): _lockmgr: 0xc37e1220 bumping of -1 td_locks
413 (0xc37e1220:cpu0): _lockmgr: 0xc37e1220 bumping of 1 td_locks
412 (0xc37e1220:cpu0): _lockmgr: 0xc37e1220 bumping of -1 td_locks
411 (0xc37e1220:cpu0): _lockmgr: 0xc37e1220 bumping of 1 td_locks
410 (0xc37e1220:cpu0): _lockmgr: 0xc37e1220 bumping of 1 td_locks
409 (0xc37e1220:cpu0): _lockmgr: 0xc37e1220 bumping of -1 td_locks
408 (0xc37e1220:cpu0): _lockmgr: 0xc37e1220 bumping of -1 td_locks
407 (0xc37e1220:cpu0): _lockmgr: 0xc37e1220 bumping of -1 td_locks
406 (0xc37e1220:cpu0): _lockmgr: 0xc37e1220 bumping of 1 td_locks
405 (0xc37e1220:cpu0): _lockmgr: 0xc37e1220 bumping of 1 td_locks
404 (0xc37e1220:cpu0): _lockmgr: 0x...
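
The excerpt above can be checked mechanically. Below is a minimal Python sketch, assuming ktr lines of exactly the form shown in this message; the name net_td_locks is hypothetical and not part of any FreeBSD tool. It sums the bumps per pointer printed before "bumping of". A non-zero net over a complete trace would be consistent with a "returning with N locks held" panic, but since the KTR buffer is circular, a trace that lost its oldest entries may not balance on its own.

```python
import re
from collections import defaultdict

# Matches e.g.
# "430 (0xc37e1220:cpu0): _lockmgr: 0xc37e1220 bumping of -1 td_locks"
BUMP_RE = re.compile(
    r"^\s*\d+\s+\(0x[0-9a-f]+:cpu\d+\):\s+\w+:\s+"
    r"(0x[0-9a-f]+) bumping of (-?\d+) td_locks"
)

def net_td_locks(lines):
    """Return {pointer: net td_locks change} over the given ktr lines."""
    net = defaultdict(int)
    for line in lines:
        m = BUMP_RE.match(line)
        if m:  # truncated or unrelated lines are skipped
            net[m.group(1)] += int(m.group(2))
    return dict(net)

# A few lines copied from the excerpt above, plus its truncated last line.
sample = [
    "430 (0xc37e1220:cpu0): _lockmgr: 0xc37e1220 bumping of -1 td_locks",
    "429 (0xc37e1220:cpu0): _lockmgr: 0xc37e1220 bumping of -1 td_locks",
    "428 (0xc37e1220:cpu0): _lockmgr: 0xc37e1220 bumping of -1 td_locks",
    "427 (0xc37e1220:cpu0): _lockmgr: 0xc37e1220 bumping of 1 td_locks",
    "426 (0xc37e1220:cpu0): _lockmgr: 0xc37e1220 bumping of 1 td_locks",
    "404 (0xc37e1220:cpu0): _lockmgr: 0x...",
]
print(net_td_locks(sample))  # {'0xc37e1220': -1}
```

The order of the lines does not matter for the sum, so the newest-first order printed by "show ktr" can be fed in unchanged.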

To: Yar Tikhiy <yar@...>
Cc: Kostik Belousov <kostikbel@...>, <freebsd-current@...>
Date: Thursday, January 24, 2008 - 1:54 pm

[Empty message]
To: Attilio Rao <attilio@...>
Cc: Kostik Belousov <kostikbel@...>, <freebsd-current@...>
Date: Thursday, January 24, 2008 - 10:58 am

Thank you for your instant response!

The patched kernel is already being built, but I've got the following
question in the meantime: Should I have updated my kernel to get your
latest changes to kern_lock.c? My local copy of kern_lock.c is now at
rev. 1.119, i.e., one revision behind today's change.

--
Yar

To: Yar Tikhiy <yar@...>
Cc: Kostik Belousov <kostikbel@...>, <freebsd-current@...>
Date: Thursday, January 24, 2008 - 11:07 am

As long as this patch still applies, it should not make any meaningful difference.

Thanks a lot for your effort in testing and reporting bugs!

Attilio

--
Peace can only be achieved by understanding - A. Einstein

To: Attilio Rao <attilio@...>
Cc: Kostik Belousov <kostikbel@...>, <freebsd-current@...>
Date: Friday, January 25, 2008 - 3:55 am

I don't deserve these kind words because I seriously misinformed you.

My panic appears to be related not to UFS, but to NTFS. Namely I
have an NTFS volume mounted read-only at /ntfs. I have no idea why
the ports framework touches the /ntfs sub-tree, but not mounting
it in the first place makes the panic go away. (I still wonder why
my system would also panic during buildworld, which should not touch
my /ntfs at all... Now I'll try to do a buildworld w/o /ntfs mounted.)

At the same time, dismounting the NTFS volume leads to an instant
panic of a similar kind:

panic: System call unmount returning with 5 locks held

More debug output is attached.

So at least UFS doesn't seem affected by the panic. Excuse me for
having provided wrong info.

--
Yar

panic: System call unmount returning with 5 locks held
cpuid = 0
KDB: enter: panic
[thread pid 985 tid 100085 ]
Stopped at kdb_enter+0x3a: movl $0,kdb_why
db> show lockedvn
Locked vnodes
db> sh ktr
678 (0xc37dd000:cpu0): _lockmgr: 0xc37dd000 bumping of -1 td_locks
677 (0xc37dd000:cpu0): _lockmgr: 0xc37dd000 bumping of -1 td_locks
676 (0xc37dd000:cpu0): _lockmgr: 0xc37dd000 bumping of -1 td_locks
675 (0xc37dd000:cpu0): _lockmgr: 0xc37dd000 bumping of 1 td_locks
674 (0xc37dd000:cpu0): _lockmgr: 0xc37dd000 bumping of -1 td_locks
673 (0xc37dd000:cpu0): _lockmgr: 0xc37dd000 bumping of 1 td_locks
672 (0xc37dd000:cpu0): _lockmgr: 0xc37dd000 bumping of -1 td_locks
671 (0xc37dd000:cpu0): _lockmgr: 0xc37dd000 bumping of 1 td_locks
670 (0xc37dd000:cpu0): _lockmgr: 0xc37dd000 bumping of -1 td_locks
669 (0xc37dd000:cpu0): _lockmgr: 0xc37dd000 bumping of 1 td_locks
668 (0xc37dd000:cpu0): _lockmgr: 0xc37dd000 bumping of -1 td_locks
667 (0xc37dd000:cpu0): _lockmgr: 0xc37dd000 bumping of 1 td_locks
666 (0xc37dd000:cpu0): _lockmgr: 0xc37dd000 bumping of -1 td_locks
665 (0xc37dd000:cpu0): _lockmgr: 0xc37dd000 bumping of 1 td_locks
664 (0xc37dd000:cpu0): _lockmgr: 0xc37dd000 bumping of -1 td_locks
663 (0xc37dd000:cpu0)...

To: Yar Tikhiy <yar@...>
Cc: Kostik Belousov <kostikbel@...>, <freebsd-current@...>
Date: Friday, January 25, 2008 - 4:00 am

Do you see any call to lockmgr_disown() in this ktr trace?
If so, can you paste the relevant lines of it?

Thanks,
Attilio

--
Peace can only be achieved by understanding - A. Einstein

To: Attilio Rao <attilio@...>
Cc: Kostik Belousov <kostikbel@...>, <freebsd-current@...>
Date: Saturday, January 26, 2008 - 10:29 am

No doubt. :-) But the urgency of this problem appears much lower
than I estimated in the first place--fortunately. Broken UFS

I've uploaded the full "show ktr" outputs for the lstat- and
umount-triggered panics there: http://people.freebsd.org/~yar/debug/

Here's their summary:

$ awk '{print $3}' ktr_lstat.txt | sort | uniq -c
32752 _lockmgr:
8 sharelock:
8 shareunlock:

$ awk '{print $4}' ktr_lstat.txt | sort | uniq -c
16 0xc322ccc0
32752 0xc37e1220

$ awk '{print $3}' ktr_umount.txt | sort | uniq -c
28663 _lockmgr:
1901 lockmgr_disown:
1102 sharelock:
1102 shareunlock:

$ awk '{print $4}' ktr_umount.txt | sort | uniq -c
4550 0xc322ccc0
288 0xc3281220
14 0xc3282220
322 0xc3282660
104 0xc33e6220
2 0xc33e6440
4 0xc3520220
10 0xc3520440
24 0xc3772220
82 0xc3772aa0
7149 0xc3772cc0
358 0xc3774000
17766 0xc3774220
288 0xc3774440
1058 0xc3774660
719 0xc3774880
30 0xc3775000

That is, I lied again, sorry: There were calls to functions other
than _lockmgr. But the ktr log for umount looks much more interesting
than that for lstat.
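
The awk | sort | uniq -c summaries above can also be produced in a single pass. This is a small Python sketch, assuming whitespace-separated fields as in the ktr excerpts posted in this thread; summarize_ktr is a hypothetical helper name. Unlike the awk pipelines, it skips lines with fewer than four fields instead of counting an empty field for them.

```python
from collections import Counter

def summarize_ktr(lines):
    """Count field 3 (function name) and field 4 (pointer) of each ktr line.

    Mirrors: awk '{print $3}' file | sort | uniq -c
             awk '{print $4}' file | sort | uniq -c
    """
    funcs, ptrs = Counter(), Counter()
    for line in lines:
        fields = line.split()
        if len(fields) >= 4:          # ignore truncated lines
            funcs[fields[2]] += 1
            ptrs[fields[3]] += 1
    return funcs, ptrs

# Lines copied from the umount-triggered excerpt earlier in the thread.
sample = [
    "678 (0xc37dd000:cpu0): _lockmgr: 0xc37dd000 bumping of -1 td_locks",
    "677 (0xc37dd000:cpu0): _lockmgr: 0xc37dd000 bumping of -1 td_locks",
    "676 (0xc37dd000:cpu0): _lockmgr: 0xc37dd000 bumping of -1 td_locks",
    "675 (0xc37dd000:cpu0): _lockmgr: 0xc37dd000 bumping of 1 td_locks",
    "663 (0xc37dd000:cpu0)...",
]
funcs, ptrs = summarize_ktr(sample)
for name, count in funcs.most_common():
    print(count, name)
for ptr, count in ptrs.most_common():
    print(count, ptr)
```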

I'm ready to do more debug runs if needed--instructions are welcome.

Thank you!

--
Yar

To: Yar Tikhiy <yar@...>
Cc: Attilio Rao <attilio@...>, Kostik Belousov <kostikbel@...>, <freebsd-current@...>
Date: Saturday, January 26, 2008 - 12:42 pm

IMO if we're going to ship NTFS support in the base it should actually
function, or at minimum not panic the box. As I reported earlier, I can
panic my -current system with 100% reliability with fairly light access
to an NTFS volume, which I consider to be a fairly large problem, at
least for my personal usage pattern.

Doug

--

This .signature sanitized for your protection

To: Doug Barton <dougb@...>
Cc: Attilio Rao <attilio@...>, Yar Tikhiy <yar@...>, Kostik Belousov <kostikbel@...>, <freebsd-current@...>
Date: Saturday, January 26, 2008 - 7:02 pm

This is a symptom of a general problem, "things that we doubt work very
well". What's your suggestion on how we can flag these?

The traditional argument is that if we don't ship code in the base, it
will never get tested. IMHO it would be a big change to turn off everything
that isn't 100% solid.

I'm open to suggestions.

mcl

To: Doug Barton <dougb@...>
Cc: Kostik Belousov <kostikbel@...>, Yar Tikhiy <yar@...>, <freebsd-current@...>
Date: Saturday, January 26, 2008 - 1:18 pm

I'm not sure now, are you referring to some problems introduced by my
patches or not?

Thanks,
Attilio

--
Peace can only be achieved by understanding - A. Einstein

To: Attilio Rao <attilio@...>
Cc: Kostik Belousov <kostikbel@...>, Yar Tikhiy <yar@...>, <freebsd-current@...>
Date: Saturday, January 26, 2008 - 7:09 pm

Not sure of the timeline. I think this is the most relevant post on
the matter, let me know if there is anything I can do to help diagnose
this.

http://docs.freebsd.org/cgi/getmsg.cgi?fetch=644445+0+archive/2008/freeb...

--

This .signature sanitized for your protection

To: Doug Barton <dougb@...>
Cc: Kostik Belousov <kostikbel@...>, Yar Tikhiy <yar@...>, <freebsd-current@...>
Date: Saturday, January 26, 2008 - 8:57 pm

As my very first VFS commit happened on 28 December (and it
should also be a no-op), and you reported a 23 December kernel, it seems
the problem was already there by that time.

Attilio

--
Peace can only be achieved by understanding - A. Einstein

To: Attilio Rao <attilio@...>
Cc: Kostik Belousov <kostikbel@...>, Yar Tikhiy <yar@...>, <freebsd-current@...>
Date: Sunday, January 27, 2008 - 12:40 am

Ok, so you're off the hook. :) Interested in helping track down why
it's panicking?

Doug

--

This .signature sanitized for your protection

To: Doug Barton <dougb@...>
Cc: Kostik Belousov <kostikbel@...>, Yar Tikhiy <yar@...>, <freebsd-current@...>
Date: Sunday, January 27, 2008 - 10:42 am

Sure, I'm testing a patch which instruments lockmgr with ktr and
witness support.
I will post it later in the day to make it available to users.

Attilio

--
Peace can only be achieved by understanding - A. Einstein
_______________________________________________
freebsd-current@freebsd.org mailing list
http://lists.freebsd.org/mailman/listinfo/freebsd-current
To unsubscribe, send any mail to "freebsd-current-unsubscribe@freebsd.org"

To: Attilio Rao <attilio@...>
Cc: Kostik Belousov <kostikbel@...>, Yar Tikhiy <yar@...>, <freebsd-current@...>
Date: Tuesday, January 29, 2008 - 6:08 pm

OK. FYI I tried torture-testing it today, and things are looking a
little better. It only panicked once, with the same message as in the
subject, but it was stat, not lstat. Unfortunately it didn't actually
do the dump, so I don't have a backtrace. If I can get it to
panic&dump I'll let you know.

Doug

--

This .signature sanitized for your protection

To: Doug Barton <dougb@...>
Cc: Kostik Belousov <kostikbel@...>, Yar Tikhiy <yar@...>, <freebsd-current@...>
Date: Tuesday, January 29, 2008 - 6:11 pm

Which fs? Always NTFS?
I'm committing my WITNESS patch now to perforce so that other people
can hopefully stress-test it before it is committed.

Thanks,
Attilio

--
Peace can only be achieved by understanding - A. Einstein

To: Attilio Rao <attilio@...>
Cc: Kostik Belousov <kostikbel@...>, Doug Barton <dougb@...>, <freebsd-current@...>
Date: Wednesday, January 30, 2008 - 9:08 am

Do you think that that patch is applicable in my case? I.e., shall
I use it to get more debug info on my panics?

If so, where is the patched file in the depot?

Thanks!

--
Yar

To: Yar Tikhiy <yar@...>
Cc: Kostik Belousov <kostikbel@...>, Doug Barton <dougb@...>, <freebsd-current@...>
Date: Wednesday, January 30, 2008 - 11:07 am

Sorry, but I had to delay the operation until now.
In the end, a suitable patch is located here:
http://www.freebsd.org/~attilio/witness_lockmgr.diff

I tried it and it already reported 4 LORs just when booting the kernel :)
So I would reasonably expect LOR cascades with this patch.

If all three of you (Scot, Yar and Doug) could try and test it, I would
appreciate it a lot.

Thanks,
Attilio

PS: This is the commit log to perforce:
Add WITNESS support to lockmgr.
A couple of notes:
- Two options have been added in order to serve WITNESS:
  * LK_NOWITNESS which disables the support for the specified
    lock
  * LK_NODUP which disallows the usual DUPOK behaviour
    assumed as the default with lockmgr
- In the case of lockmgr_disown() the lock is simply dropped.
  This means that a printout won't show the lock held even if it
  is basically held by LK_KERNPROC
- In the case of upgrade we can have 3 different cases:
  * The shared lock is unheld but consequent acquisition
    fails; in this case the lock is reported dropped
  * We are the first upgrader so there is an effective
    WITNESS_UPGRADE
  * We are not the first upgrader so after the shared unlocking
    we need to acquire the lock in exclusive mode; this will be
    reported with 2 different WITNESS steps.
- In the case of LK_DRAIN the lock will be only checked about the
  order but it won't be marked as acquired. This happens because a
  drained lock is directly destroyed and not really released, so
  witness_destroy() would badly panic in this case

To: Attilio Rao <attilio@...>
Cc: Kostik Belousov <kostikbel@...>, Yar Tikhiy <yar@...>, Doug Barton <dougb@...>, <freebsd-current@...>
Date: Wednesday, January 30, 2008 - 5:52 pm

Reading back to Doug's and Yar's messages regarding the NTFS
filesystem, I noticed that I am also mounting NTFS filesystems at boot
time. I disabled the mounting of the NTFS filesystems. When 'cd
/usr/ports ; find . -print' or '/usr/local/etc/cvsup/update.sh' is
run, the panic doesn't occur.

But when I mount the NTFS filesystem and rerun the above commands,
they cause the lstat panic, even though these commands do not
touch the NTFS filesystems.

Also, mounting/unmounting an NTFS filesystem will cause a panic.

I applied the above patch to sources that were checked out about 2 hrs
ago, rebuilt and installed the kernel, and rebooted.

If I don't mount an NTFS filesystem then the kernel doesn't panic when
the above commands are run.

But when the NTFS filesystem is mounted, the following lock order
reversal occurs:

lock order reversal:
1st 0xffffff0023285288 pseudofs (pseudofs) @ kern/vfs_subr.c:2061
2nd 0xffffff00232f2ca0 vfslock (vfslock) @ kern/vfs_subr.c:364
KDB: stack backtrace:
db_trace_self_wrapper() at db_trace_self_wrapper+0x2a
witness_checkorder() at witness_checkorder+0x606
_lockmgr() at _lockmgr+0x4cb
vfs_busy() at vfs_busy+0xdf
vfs_donmount() at vfs_donmount+0x9aa
nmount() at nmount+0xa4
syscall() at syscall+0x1ce
Xfast_syscall() at Xfast_syscall+0xab
--- syscall (378, FreeBSD ELF64, nmount), rip = 0x80079a57c, rsp = 0x7fffffffe828, rbp = 0x65a9d0 ---
lock order reversal:
1st 0xffffff002347f668 ntfs (ntfs) @ kern/vfs_subr.c:2061
2nd 0xffffff00232f2650 vfslock (vfslock) @ kern/vfs_subr.c:364
KDB: stack backtrace:
db_trace_self_wrapper() at db_trace_self_wrapper+0x2a
witness_checkorder() at witness_checkorder+0x606
_lockmgr() at _lockmgr+0x4cb
vfs_busy() at vfs_busy+0xdf
vfs_donmount() at vfs_donmount+0x9aa
nmount() at nmount+0xa4
syscall() at syscall+0x1ce
Xfast_syscall() at Xfast_syscall+0xab
--- syscall (378, FreeBSD ELF64, nmount), rip = 0x80079a57c, rsp = 0x7fffffffe828, rbp = 0x65ad80 ---

Instead of getting the lstat panic, I am now getting th...
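
For readers unfamiliar with the reports above: a lock order reversal means that two locks were once taken in one order and are now being taken in the opposite order. The following is a toy Python model of that idea only. It is not the WITNESS algorithm, which is considerably more involved; the OrderChecker class and its methods are hypothetical, and the initial order used in the example is made up for illustration.

```python
class OrderChecker:
    """Remember observed lock orders and report the first reversal of each."""

    def __init__(self):
        self.seen = set()   # (first, second) pairs observed so far
        self.held = []      # names of locks currently held, oldest first

    def acquire(self, name):
        """Record the acquisition; return a list of (1st, 2nd) reversals."""
        reversals = []
        for h in self.held:
            # Taking `name` while holding `h`: this is a reversal if the
            # opposite order (name before h) was observed earlier.
            if (name, h) in self.seen and (h, name) not in self.seen:
                reversals.append((h, name))
            self.seen.add((h, name))
        self.held.append(name)
        return reversals

    def release(self, name):
        self.held.remove(name)


checker = OrderChecker()
checker.acquire("vfslock")
checker.acquire("ntfs")          # establishes vfslock -> ntfs (made-up order)
checker.release("ntfs")
checker.release("vfslock")
checker.acquire("ntfs")
print(checker.acquire("vfslock"))  # [('ntfs', 'vfslock')]
```

The printed pair corresponds to the "1st" and "2nd" lines of a reversal report: the lock already held, then the lock being acquired.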

To: Scot Hetzel <swhetzel@...>
Cc: Kostik Belousov <kostikbel@...>, Yar Tikhiy <yar@...>, Doug Barton <dougb@...>, <freebsd-current@...>
Date: Thursday, January 31, 2008 - 6:43 am

The assertion failure should not happen now.
Could you please hand-add a check in _lockmgr_disown()
(kern/kern_lock.c) to test panicstr before calling WITNESS?
I cannot access perforce right now to produce a suitable diff,
so you can just do this by hand:

if (lkp->lk_lockholder == td) {
        if (panicstr != NULL)
                WITNESS_UNLOCK(&lkp->lk_object, LOP_EXCLUSIVE, file, line);
        td->td_locks--;
}

--
Peace can only be achieved by understanding - A. Einstein

To: Attilio Rao <attilio@...>
Cc: Kostik Belousov <kostikbel@...>, Scot Hetzel <swhetzel@...>, <freebsd-current@...>, Doug Barton <dougb@...>
Date: Thursday, January 31, 2008 - 9:02 am

Shouldn't the test for panicstr be inverted: `panicstr == NULL'?
I guess we shouldn't call WITNESS when panicking, should we?
Sorry if I got it wrong.

--
Yar

To: Yar Tikhiy <yar@...>
Cc: Kostik Belousov <kostikbel@...>, Scot Hetzel <swhetzel@...>, <freebsd-current@...>, Doug Barton <dougb@...>
Date: Thursday, January 31, 2008 - 9:04 am

Weee, you are right, sorry!
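
With the test inverted as agreed, the intent is that WITNESS is consulted only while the system is not panicking, and the td_locks counter is adjusted either way. A small Python model of that branch follows. The kernel code itself is C; the field names mirror the snippet quoted earlier in the thread, while the Thread and Lock classes, the function name disown_accounting and the witness_unlock callback are stand-ins invented for this illustration.

```python
class Thread:
    """Stand-in for struct thread; only the counter matters here."""
    def __init__(self, td_locks=0):
        self.td_locks = td_locks


class Lock:
    """Stand-in for a lockmgr lock with its current holder."""
    def __init__(self, holder):
        self.lk_lockholder = holder


def disown_accounting(lkp, td, panicstr, witness_unlock):
    """Model of the hand-added check with the test corrected to == NULL."""
    if lkp.lk_lockholder is td:
        if panicstr is None:       # not panicking: safe to call into WITNESS
            witness_unlock(lkp)
        td.td_locks -= 1           # done whether or not WITNESS was called


calls = []
td = Thread(td_locks=1)
disown_accounting(Lock(td), td, None, calls.append)
print(len(calls), td.td_locks)     # 1 0

calls_in_panic = []
td2 = Thread(td_locks=1)
disown_accounting(Lock(td2), td2, "some panic message", calls_in_panic.append)
print(len(calls_in_panic), td2.td_locks)   # 0 0
```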

Attilio

--
Peace can only be achieved by understanding - A. Einstein

To: Attilio Rao <attilio@...>
Cc: Kostik Belousov <kostikbel@...>, Yar Tikhiy <yar@...>, Doug Barton <dougb@...>, <freebsd-current@...>
Date: Friday, February 1, 2008 - 2:41 am

I added this change to kern/kern_lock.c, but I'm still getting this
panic after mounting the ntfs filesystem, and using cvsup to update
the local mirror:

Fatal trap 9: general protection fault while in kernel mode
cpuid = 0; apic id = 00
instruction pointer = 0x8:0xffffffff80301051
stack pointer = 0x10:0xffffffffd43b9100
frame pointer = 0x10:0xffffffffd43b9190
code segment = base 0x0, limit 0xfffff, type 0x1b
= DPL 0, pres 1, long 1, def32 0, gran 1
processor eflags = resume, IOPL = 0
current process = 1229 (cvsup)
panic: Assertion !mtx_owned(&w_mtx) failed at ../../../kern/subr_witness.c:959
cpuid = 0
Uptime: 4m38s
Physical memory: 2031 MB
Dumping 324 MB: 309 293 277 261 245 229 213 197 181 165 149 133 117 101 85 69 53 37 21 5

Scot

To: Scot Hetzel <swhetzel@...>
Cc: Attilio Rao <attilio@...>, Kostik Belousov <kostikbel@...>, Doug Barton <dougb@...>, <freebsd-current@...>
Date: Friday, February 1, 2008 - 10:50 am

FWIW, the same panic happens in my case, too. In addition, a number
of LORs I haven't seen before are reported. The relevant kernel
message log from the serial console is attached. Thanks!

--
Yar

[...]
WARNING: WITNESS option enabled, expect reduced performance.
GEOM_LABEL: Label for provider ad0s1 is ntfs/SYSTEM.
GEOM_LABEL: Label for provider ad0s2 is ntfs/STORE.
lock order reversal:
1st 0xc2ecfe28 devfs (devfs) @ /usr/src/sys/kern/vfs_subr.c:2061
2nd 0xc2fbead4 devfsmount (devfsmount) @ /usr/src/sys/fs/devfs/devfs_vnops.c:201
KDB: stack backtrace:
db_trace_self_wrapper(c0b0fa49,d3cecbbc,c07a2c5e,c0b11f90,c2fbead4,...) at db_trace_self_wrapper+0x26
kdb_backtrace(c0b11f90,c2fbead4,c0b0343a,c0b0343a,c0b0347b,...) at kdb_backtrace+0x29
witness_checkorder(c2fbead4,9,c0b0347b,c9,c7,...) at witness_checkorder+0x6de
_sx_xlock(c2fbead4,0,c0b0347b,c9,c2fbead4,...) at _sx_xlock+0x7d
devfs_allocv(c2fbd700,c2fc3000,d3cecc28,c2d0fcc0,c0b17d73,...) at devfs_allocv+0x144
devfs_root(c2fc3000,2,c0e1c118,c2d0fcc0,ca,...) at devfs_root+0x51
set_rootvnode(c0e1c100,0,c0b17d73,5ed,c07e0150,...) at set_rootvnode+0x2b
vfs_mountroot(c0dc9d50,4,c0b078ec,260,0,...) at vfs_mountroot+0x356
start_init(0,d3cecd38,c0b091cd,30c,c2d0dab0,...) at start_init+0x65
fork_exit(c0732530,0,d3cecd38) at fork_exit+0xb8
fork_trampoline() at fork_trampoline+0x8
--- trap 0, eip = 0, esp = 0xd3cecd70, ebp = 0 ---
Trying to mount root from ufs:/dev/ad0s3a
WARNING: / was not properly dismounted
lock order reversal:
1st 0xc2ecf9e8 ufs (ufs) @ /usr/src/sys/kern/vfs_subr.c:2061
2nd 0xc2fc3000 vfslock (vfslock) @ /usr/src/sys/kern/vfs_subr.c:364
KDB: stack backtrace:
db_trace_self_wrapper(c0b0fa49,d3cec9e0,c07a2c5e,c0b11f90,c2fc3000,...) at db_trace_self_wrapper+0x26
kdb_backtrace(c0b11f90,c2fc3000,c0b17e71,c0b17e71,c0b1840e,...) at kdb_backtrace+0x29
witness_checkorder(c2fc3000,1,c0b1840e,16c,151,...) at witness_checkorder+0x6de
_lockmgr(c2fc3000,2001,c2fc3030,c0b1840e,16c,...) at _lockmgr+0x174
vfs_bus...

To: Yar Tikhiy <yar@...>
Cc: Kostik Belousov <kostikbel@...>, Scot Hetzel <swhetzel@...>, <freebsd-current@...>, Doug Barton <dougb@...>
Date: Friday, February 1, 2008 - 2:41 pm

[Empty message]
To: Attilio Rao <attilio@...>
Cc: Kostik Belousov <kostikbel@...>, Scot Hetzel <swhetzel@...>, <freebsd-current@...>, Doug Barton <dougb@...>
Date: Tuesday, February 5, 2008 - 12:36 pm

BTW, I seem to be hitting yet another lockmgr-related panic.

My system won't reboot under certain conditions: it'll panic instead
in vfs_unmountall(). Namely, it will panic if I reboot from single-user
mode, while it'll reboot OK from multi-user mode. I have no idea
yet about the exact reason for this behaviour. The output from kgdb
is attached.

Note that this panic shouldn't be NTFS-related as I can trigger it
by typing `reboot' immediately after booting into single user.

Thanks!

--
Yar

Waiting (max 60 seconds) for system process `vnlru' to stop...done
Waiting (max 60 seconds) for system process `bufdaemon' to stop...done
Waiting (max 60 seconds) for system process `syncer' to stop...
Syncing disks, vnodes remaining...0 0 0 0 0 0 0 0 done
All buffers synced.
panic: lock (lockmgr) vfslock not locked @ /usr/src/sys/kern/vfs_mount.c:1317
cpuid = 0
KDB: enter: panic
panic: from debugger
cpuid = 0
Uptime: 3m6s
Physical memory: 499 MB
Dumping 31 MB: 16

#0 doadump () at pcpu.h:195
195 pcpu.h: No such file or directory.
in pcpu.h
(kgdb) where
#0 doadump () at pcpu.h:195
#1 0xc0768d4e in boot (howto=260) at /usr/src/sys/kern/kern_shutdown.c:417
#2 0xc0769013 in panic (fmt=Variable "fmt" is not available.
) at /usr/src/sys/kern/kern_shutdown.c:571
#3 0xc048f567 in db_panic (addr=Could not find the frame base for "db_panic".
) at /usr/src/sys/ddb/db_command.c:444
#4 0xc048ff6c in db_command (last_cmdp=0xc0bf8e54, cmd_table=0x0, dopager=1)
at /usr/src/sys/ddb/db_command.c:411
#5 0xc049007a in db_command_loop () at /usr/src/sys/ddb/db_command.c:464
#6 0xc049181d in db_trap (type=3, code=0) at /usr/src/sys/ddb/db_main.c:228
#7 0xc0792036 in kdb_trap (type=3, code=0, tf=0xd614aadc)
at /usr/src/sys/kern/subr_kdb.c:510
#8 0xc0a798cb in trap (frame=0xd614aadc) at /usr/src/sys/i386/i386/trap.c:647
#9 0xc0a5f2bb in calltrap () at /usr/src/sys/i386/i386/exception.s:146
#10 0xc07921ba in kdb_enter (why=0xc0b0cbb4 "panic", msg=0xc0b0cbb4 "panic")
a...

To: Yar Tikhiy <yar@...>
Cc: Kostik Belousov <kostikbel@...>, Scot Hetzel <swhetzel@...>, <freebsd-current@...>, Doug Barton <dougb@...>
Date: Tuesday, February 5, 2008 - 12:38 pm

Yes, sorry for this.
I know about this and I have already patched my tree.
I did the WITNESS work on the assumption that a drained lock would
never be released while drained. Unmounting is currently the only
consumer of LK_DRAIN I know of, so I just need to do a WITNESS_LOCK()
over there.
I will provide you an updated patch about this problem and another one
found by kris@.

I will look later at the backtrace so we can go on.

Thanks,
Attilio

--
Peace can only be achieved by understanding - A. Einstein
_______________________________________________
freebsd-current@freebsd.org mailing list
http://lists.freebsd.org/mailman/listinfo/freebsd-current
To unsubscribe, send any mail to "freebsd-current-unsubscribe@freebsd.org"

To: Yar Tikhiy <yar@...>
Cc: Kostik Belousov <kostikbel@...>, Scot Hetzel <swhetzel@...>, <freebsd-current@...>, Doug Barton <dougb@...>
Date: Tuesday, February 5, 2008 - 1:00 pm

More specifically, here is the "fixed" version:
http://www.freebsd.org/~attilio/witness_lockmgr2.diff

(against stock -CURRENT).
Kris and I have tested this version fairly thoroughly and I found it
reliable, so I want to commit it to CVS in a couple of hours.
If you can add any other feedback to it, I would be very happy.

Thanks,
Attilio

PS: this WITNESS patch reports 3-4 different LORs at boot time and
another 3-4 at shutdown time...

--
Peace can only be achieved by understanding - A. Einstein

To: Attilio Rao <attilio@...>
Cc: Kostik Belousov <kostikbel@...>, Scot Hetzel <swhetzel@...>, <freebsd-current@...>, Doug Barton <dougb@...>
Date: Wednesday, February 6, 2008 - 6:30 am

Feedback 1: The panic before reboot is gone. Thanks a lot!
Now returning to the NTFS issue, stay tuned... :-)

--
Yar

To: Attilio Rao <attilio@...>
Cc: Kostik Belousov <kostikbel@...>, Scot Hetzel <swhetzel@...>, <freebsd-current@...>, Doug Barton <dougb@...>
Date: Tuesday, February 5, 2008 - 12:22 pm

DDB was there (my kernel was GENERIC + DEBUG_VFS_LOCKS,) but it
failed, too. Fortunately, I've managed to save a dump with the
whole call stack. Attached is the respective output from kgdb,
showing multiple failures including the one in NTFS.

I'm keeping the dump so that I can dig deeper into it under your
guidance. Thanks!

--
Yar

Fatal trap 12: page fault while in kernel mode
cpuid = 0; apic id = 00
fault virtual address = 0xdeadc0ee
fault code = supervisor read, page not present
instruction pointer = 0x20:0xc07a0676
stack pointer = 0x28:0xd615a9a0
frame pointer = 0x28:0xd615a9a4
code segment = base 0x0, limit 0xfffff, type 0x1b
= DPL 0, pres 1, def32 1, gran 1
processor eflags = resume, IOPL = 0
current process = 40 (umount)
panic: Assertion !mtx_owned(&w_mtx) failed at /usr/src/sys/kern/subr_witness.c:959
cpuid = 0
Uptime: 1m0s
Physical memory: 499 MB
Dumping 32 MB: 17 1

#0 doadump () at pcpu.h:195
195 pcpu.h: No such file or directory.
in pcpu.h
(kgdb) where
#0 doadump () at pcpu.h:195
#1 0xc0768d4e in boot (howto=260) at /usr/src/sys/kern/kern_shutdown.c:417
#2 0xc0769013 in panic (fmt=Variable "fmt" is not available.
) at /usr/src/sys/kern/kern_shutdown.c:571
#3 0xc07a2839 in witness_checkorder (lock=0xc0dd2c2c, flags=Variable "flags" is not available.
)
at /usr/src/sys/kern/subr_witness.c:959
#4 0xc075be7c in _mtx_lock_flags (m=0xc0dd2c2c, opts=0,
file=0xc0b0f79f "/usr/src/sys/kern/subr_eventhandler.c", line=212)
at /usr/src/sys/kern/kern_mutex.c:179
#5 0xc07903e9 in eventhandler_find_list (name=0xc0adccf5 "dcons_poll")
at /usr/src/sys/kern/subr_eventhandler.c:212
#6 0xc055fd88 in dcons_os_checkc (dc=0xc0c015a0)
at /usr/src/sys/dev/dcons/dcons_os.c:264
#7 0xc055feae in dcons_cngetc (cp=0xc0b6d9e0)
at /usr/src/sys/dev/dcons/dcons_os.c:473
#8 0xc07b57a8 in cncheckc () a...

To: Yar Tikhiy <yar@...>
Cc: Kostik Belousov <kostikbel@...>, Scot Hetzel <swhetzel@...>, <freebsd-current@...>, Doug Barton <dougb@...>
Date: Tuesday, February 5, 2008 - 3:56 pm

Currently it is DDB that makes it fail in witness after the memory
corruption. But I'm more interested in the origin of the panic; so,
since DDB is unusable here, can you please remove the DDB option and
try to reproduce the panic? It should not give you the failing
assertion without DDB.

Thanks,
Attilio

--
Peace can only be achieved by understanding - A. Einstein

To: Attilio Rao <attilio@...>
Cc: Kostik Belousov <kostikbel@...>, Scot Hetzel <swhetzel@...>, <freebsd-current@...>, Doug Barton <dougb@...>
Date: Wednesday, February 6, 2008 - 7:29 am

Sure, here it is, attached.

By the way, not that I want to stop helping you, but I can provide
you with a small NTFS image so that you can test the driver against
it by yourself and save a few round-trips. :-) The crash session
shown in the attachment was conducted using this NTFS image file:

http://people.freebsd.org/~yar/debug/ntfs.bz2

Thanks!

--
Yar

[causing the panic]

Enter full pathname of shell or RETURN for /bin/sh:
# dumpon /dev/ad0s3b
# mdconfig -a -f /root/ntfs
WARNING: opening backing store: /root/ntfs readonly
GEOM_LABEL: Label for provider md0 is ntfs/TEST_NTFS.
md0
# mount -r -t ntfs /dev/md0 /mnt
# umount /mnt
lock order reversal:
1st 0xc30566b8 ntfs (ntfs) @ /usr/src/sys/kern/vfs_subr.c:2361
2nd 0xc2fd4924 ntnode (ntnode) @ /usr/src/sys/modules/ntfs/../../fs/ntfs/ntfs_subr.c:361
kernel trap 12 with interrupts disabled

Fatal trap 12: page fault while in kernel mode
cpuid = 0; apic id = 00
fault virtual address = 0xdeadc0ee
fault code = supervisor read, page not present
instruction pointer = 0x20:0xc0791e86
stack pointer = 0x28:0xd61559a0
frame pointer = 0x28:0xd61559a4
code segment = base 0x0, limit 0xfffff, type 0x1b
= DPL 0, pres 1, def32 1, gran 1
processor eflags = resume, IOPL = 0
current process = 39 (umount)
trap number = 12
panic: page fault
cpuid = 0
Uptime: 1m0s
Physical memory: 499 MB
Dumping 32 MB: 17 1
Dump complete
Automatic reboot in 15 seconds - press a key on the console to abort

[post-mortem kgdb session]

Fatal trap 12: page fault while in kernel mode
cpuid = 0; apic id = 00
fault virtual address = 0xdeadc0ee
fault code = supervisor read, page not present
instruction pointer = 0x20:0xc0791e86
stack pointer = 0x28:0xd61559a0
frame pointer = 0x28:0xd61559a4
code segment = base 0x0, limit 0xfffff, type 0x1b
= DPL 0, pres 1, def32 1, g...

To: Yar Tikhiy <yar@...>
Cc: Kostik Belousov <kostikbel@...>, Scot Hetzel <swhetzel@...>, <freebsd-current@...>, Doug Barton <dougb@...>
Date: Wednesday, February 6, 2008 - 9:49 am

Let's see whether this bt has been helpful. :)
Can you try the following patch and see whether the kernel rings a bell?
http://www.freebsd.org/~attilio/ntfs_debug.diff

Thanks,
Attilio

--
Peace can only be achieved by understanding - A. Einstein

To: Attilio Rao <attilio@...>
Cc: Kostik Belousov <kostikbel@...>, Scot Hetzel <swhetzel@...>, <freebsd-current@...>, Doug Barton <dougb@...>
Date: Wednesday, February 6, 2008 - 10:48 am

On Wed, Feb 06, 2008 at 02:49:49PM +0100, Attilio Rao wrote:

The kernel just panics. :-)

--
Yar

panic: ntfs_ntput: lock should be unreleased!
cpuid = 0
Uptime: 1m14s
Physical memory: 499 MB
Dumping 32 MB: 17 1

#0 doadump () at pcpu.h:195
195 pcpu.h: No such file or directory.
in pcpu.h
(kgdb) bt
#0 doadump () at pcpu.h:195
#1 0xc075ba7e in boot (howto=260) at /usr/src/sys/kern/kern_shutdown.c:417
#2 0xc075bd09 in panic (fmt=Variable "fmt" is not available.
) at /usr/src/sys/kern/kern_shutdown.c:571
#3 0xc2fe9b65 in ntfs_ntput (ip=Variable "ip" is not available.
)
at /usr/src/sys/modules/ntfs/../../fs/ntfs/ntfs_subr.c:467
#4 0xc2fe76a4 in ntfs_reclaim (ap=0xd614eb04)
at /usr/src/sys/modules/ntfs/../../fs/ntfs/ntfs_vnops.c:262
#5 0xc0a51205 in VOP_RECLAIM_APV (vop=0xc2fed320, a=0xd614eb04)
at vnode_if.c:1566
#6 0xc07d84ff in vgonel (vp=0xc2fd3990) at vnode_if.h:819
#7 0xc07d9fb7 in vflush (mp=0xc2fb6a70, rootrefs=0, flags=1, td=0xc2fdeaa0)
at /usr/src/sys/kern/vfs_subr.c:2406
#8 0xc2fe6c4f in ntfs_unmount (mp=0xc2fb6a70, mntflags=134217728,
td=0xc2fdeaa0) at /usr/src/sys/modules/ntfs/../../fs/ntfs/ntfs_vfsops.c:489
#9 0xc07d37c6 in dounmount (mp=0xc2fb6a70, flags=134217728, td=0xc2fdeaa0)
at /usr/src/sys/kern/vfs_mount.c:1299
#10 0xc07d3d90 in unmount (td=0xc2fdeaa0, uap=0xd614ecfc)
at /usr/src/sys/kern/vfs_mount.c:1195
#11 0xc0a45d53 in syscall (frame=0xd614ed38)
at /usr/src/sys/i386/i386/trap.c:1034
#12 0xc0a2ca50 in Xint0x80_syscall () at /usr/src/sys/i386/i386/exception.s:203
#13 0x00000033 in ?? ()
Previous frame inner to this frame (corrupt stack?)


To: Yar Tikhiy <yar@...>
Cc: Kostik Belousov <kostikbel@...>, Scot Hetzel <swhetzel@...>, <freebsd-current@...>, Doug Barton <dougb@...>
Date: Wednesday, February 6, 2008 - 10:52 am

This is the news I wanted to hear! :)
With better checks in the lockmgr code, we would have caught more
information about it.

Can you please now add DDB support and, once it breaks into DDB, do a
'show alllocks' and maybe a few other small investigations?
This should shed some light for us.

Attilio

--
Peace can only be achieved by understanding - A. Einstein

To: Yar Tikhiy <yar@...>
Cc: Kostik Belousov <kostikbel@...>, Scot Hetzel <swhetzel@...>, <freebsd-current@...>, Doug Barton <dougb@...>
Date: Wednesday, February 6, 2008 - 10:57 am

Could you please enable NTFS_DEBUG too and check, when the kernel
panics, what the value of i_usecount is for the specified ip?
I want to rule out a refcount leak.

Thanks,
Attilio

--
Peace can only be achieved by understanding - A. Einstein

To: Attilio Rao <attilio@...>
Cc: Kostik Belousov <kostikbel@...>, Scot Hetzel <swhetzel@...>, <freebsd-current@...>, Doug Barton <dougb@...>
Date: Saturday, February 9, 2008 - 4:33 pm

i_usecount is just zero for the faulty ip:

(kgdb) bt
#0 doadump () at pcpu.h:195
#1 0xc07680de in boot (howto=260) at /usr/src/sys/kern/kern_shutdown.c:417
#2 0xc07683a3 in panic (fmt=Variable "fmt" is not available.
) at /usr/src/sys/kern/kern_shutdown.c:571
#3 0xc048e507 in db_panic (addr=Could not find the frame base for "db_panic".
) at /usr/src/sys/ddb/db_command.c:444
#4 0xc048ef0c in db_command (last_cmdp=0xc0bdd194, cmd_table=0x0, dopager=1)
at /usr/src/sys/ddb/db_command.c:411
#5 0xc048f01a in db_command_loop () at /usr/src/sys/ddb/db_command.c:464
#6 0xc04907bd in db_trap (type=3, code=0) at /usr/src/sys/ddb/db_main.c:228
#7 0xc07913b6 in kdb_trap (type=3, code=0, tf=0xd614e9fc)
at /usr/src/sys/kern/subr_kdb.c:510
#8 0xc0a5dedb in trap (frame=0xd614e9fc) at /usr/src/sys/i386/i386/trap.c:647
#9 0xc0a438cb in calltrap () at /usr/src/sys/i386/i386/exception.s:146
#10 0xc079153a in kdb_enter (why=0xc0af0f54 "panic", msg=0xc0af0f54 "panic")
at cpufunc.h:60
#11 0xc076838c in panic (fmt=0xc0aee673 "lockmgr still held")
at /usr/src/sys/kern/kern_shutdown.c:555
#12 0xc0755abe in lockdestroy (lkp=0xc0af0f54)
at /usr/src/sys/kern/kern_lock.c:574
#13 0xc2ff8688 in ntfs_ntput (ip=0xc2fbba00)
at /usr/src/sys/modules/ntfs/../../fs/ntfs/ntfs_subr.c:467
#14 0xc2ff5eb8 in ntfs_reclaim (ap=0xd614eb04)
at /usr/src/sys/modules/ntfs/../../fs/ntfs/ntfs_vnops.c:262
[...]
(kgdb) frame 13
#13 0xc2ff8688 in ntfs_ntput (ip=0xc2fbba00)
at /usr/src/sys/modules/ntfs/../../fs/ntfs/ntfs_subr.c:467
467 lockdestroy(&ip->i_lock);
(kgdb) p *ip
$2 = {i_devvp = 0xc2fe2dd0, i_dev = 0xc2e7de00, i_hash = {le_next = 0x0,
le_prev = 0xc302002c}, i_next = 0x0, i_prev = 0x0, i_mp = 0xc2fbb500,
i_number = 10, i_flag = 32768, i_lock = {lk_object = {
lo_name = 0xc2ffb672 "ntnode", lo_type = 0xc2ffb672 "ntnode",
lo_flags = 91947008, lo_witness_data = {lod_list = {
stqe_next = 0xc0c21190}, lod_witness = 0xc0c21190}},
lk_i...

To: Yar Tikhiy <yar@...>
Cc: <yar@...>, Scot Hetzel <swhetzel@...>, Doug Barton <dougb@...>, <freebsd-fs@...>, <freebsd-current@...>, Kostik Belousov <kostikbel@...>
Date: Saturday, February 9, 2008 - 7:11 pm

With yar's decisive help, I think I have found how the lock leak happens.
Basically, in ntfs_ntput() the inode refcount (ip->i_usecount) is
decremented and then checked. When the check finds i_usecount == 0, it
means that i_usecount was initially 1, which also means the lockmgr
lock was held. The i_usecount == 0 path, however, contains no lockmgr
release operation.
This patch should fix the NTFS problems even with the stricter
assertions I plan to commit rather soon:
http://www.freebsd.org/~attilio/ntfs.diff

This patch was initially provided by yar as a workaround, but it
prompted me to analyze the refcount handling and find this bug.
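
The leak described above can be sketched in user-space C. This is only an illustration under stated assumptions: toy_lock, toy_ntnode and the toy_* functions are hypothetical stand-ins, not the real ntfs or lockmgr code, and the assumptions that the caller enters with the lock held and that the non-final path releases it are made for the sketch rather than taken from ntfs_subr.c.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-ins; these are NOT the real ntfs or lockmgr types. */
struct toy_lock {
	bool held;
};

struct toy_ntnode {
	int i_usecount;
	struct toy_lock i_lock;
};

static void toy_lock_acquire(struct toy_lock *lk)
{
	assert(!lk->held);
	lk->held = true;
}

static void toy_lock_release(struct toy_lock *lk)
{
	assert(lk->held);
	lk->held = false;
}

/* Returns false when the lock is destroyed while still held. */
static bool toy_lock_destroy(const struct toy_lock *lk)
{
	return !lk->held;
}

/*
 * Leaky shape: the last-reference path goes straight to destroy
 * without releasing the lock the caller is assumed to hold.
 */
static bool toy_ntput_leaky(struct toy_ntnode *ip)
{
	ip->i_usecount--;
	if (ip->i_usecount > 0) {
		toy_lock_release(&ip->i_lock);
		return true;
	}
	return toy_lock_destroy(&ip->i_lock);
}

/* Fixed shape: release on every path, then destroy on the last reference. */
static bool toy_ntput_fixed(struct toy_ntnode *ip)
{
	ip->i_usecount--;
	toy_lock_release(&ip->i_lock);
	if (ip->i_usecount > 0)
		return true;
	return toy_lock_destroy(&ip->i_lock);
}
```

With i_usecount starting at 1 and the lock held, toy_ntput_leaky() reports a destroy of a held lock, which corresponds to the "lockmgr still held" panic in the backtrace, while toy_ntput_fixed() leaves the lock released before destroying it.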

Please test and report if it solves problems for you.

Thanks,
Attilio

--
Peace can only be achieved by understanding - A. Einstein

To: Yar Tikhiy <yar@...>
Cc: Kostik Belousov <kostikbel@...>, Doug Barton <dougb@...>, <freebsd-current@...>
Date: Wednesday, January 30, 2008 - 11:12 am

What I forgot to mention in this log is that the patch also fixes what
seems to me to be a bug in the case where a thread holds an exclusive
lock and tries to acquire the same lock in shared mode. What happens in
the current CVS code is that the lock gets downgraded, but this seems to
be handled badly, so this patch should fix the behaviour.

Attilio

--
Peace can only be achieved by understanding - A. Einstein

To: Attilio Rao <attilio@...>
Cc: Kostik Belousov <kostikbel@...>, Yar Tikhiy <yar@...>, <freebsd-current@...>
Date: Tuesday, January 29, 2008 - 7:42 pm

Yes, sorry I wasn't clear. I'm using UFS2, NFS, MSDOSFS (occasionally)

Ok, let me know when it's ready for a mere mortal like me to test. :)

BTW, I had something very odd happen just now. I had some time where I
didn't need to be at the keyboard, so I exited X, and at the console I
ran the following:

for file in /mnt/ad0s1/<blah>/*; do
    cp "$file" /tmp/
    cmp "$file" "/tmp/${file##*/}"
    rm "/tmp/${file##*/}"
done

where the <blah> directory has thousands of jpegs, and /tmp is a
memory disk, in case it matters. I repeated this 4-6 times (not sure
exactly) and it never crashed. But when I tried to restart X I got all
sorts of odd errors, all related in some way to files (e.g.,
/var/log/Xorg.0.log rename, hsetroot not being able to find my desktop
background jpeg and dumping core, xauth not being able to lock and/or
rename ~/.Xauthority, etc.).

I finally gave up and rebooted, now everything is back to normal.
Unfortunately it just occurred to me that I should have tried
unmounting the NTFS volume first, d'oh. But there is definitely
something wrong with the NTFS code if it can scramble things that
badly even when it's not being accessed.

Doug

--

This .signature sanitized for your protection


To: Doug Barton <dougb@...>
Cc: Kostik Belousov <kostikbel@...>, Yar Tikhiy <yar@...>, <freebsd-current@...>
Date: Tuesday, January 29, 2008 - 6:16 pm

BTW, I've just found a bug in lockmgr() which could produce such
results, but it should be a long-standing one.

More info to follow.

Attilio

--
Peace can only be achieved by understanding - A. Einstein

To: Yar Tikhiy <yar@...>
Cc: Kostik Belousov <kostikbel@...>, Attilio Rao <attilio@...>, <freebsd-current@...>
Date: Thursday, January 24, 2008 - 9:12 am

Try backing the -CURRENT sources to "Jan 08 23:45 UTC 2008", and then
rebuild/install the kernel. This is the point before the
vn_lock/VOP_LOCK changes. Reboot your system and perform your steps
that caused the lstat panic (this kernel shouldn't give the lstat
panic).

Create a backup of this kernel:

cd /boot
cp -rp kernel kernel_good

Then try building/installing a kernel from the -CURRENT sources that
have been updated to "Jan 08 23:49 UTC 2008". Reboot and try the same
steps again, this kernel should give the lstat panic.

After it panics, just reboot, break into the boot loader, then do:

unload
boot kernel_good

Then either copy kernel_good to kernel or add the following to
/boot/loader.conf:

kernel="kernel_good"

Scot

To: Kostik Belousov <kostikbel@...>
Cc: <freebsd-current@...>
Date: Wednesday, January 16, 2008 - 3:01 am

I added DEBUG_VFS_LOCKS to the kernel config file, then rebuilt and
installed the kernel. After rebooting the system, I started the cvsup
update for my local mirror and got a panic similar to the one above.
When I used 'show lockedvnods', the only thing that was displayed was
'Locked vnodes' and that was it.

I'm going to try a binary search to see if I can narrow the problem down.

Scot

To: Kostik Belousov <kostikbel@...>
Cc: <freebsd-current@...>
Date: Wednesday, January 16, 2008 - 8:24 pm

I found the point where the problem occurs. If I update /usr/src/sys
to Jan 08 23:45 UTC 2008, then I don't get the lstat panic. But when
I update to Jan 08 23:49 UTC 2008, the panic returns.

These are the files that change between these times:

dev/usb/ehci.c:
$FreeBSD: src/sys/dev/usb/ehci.c,v 1.57 2008/01/08 23:48:30 attilio Exp $

dev/usb/if_udav.c:
$FreeBSD: src/sys/dev/usb/if_udav.c,v 1.34 2008/01/08 23:48:30 attilio Exp $

fs/hpfs/hpfs_subr.h:
$FreeBSD: src/sys/fs/hpfs/hpfs_subr.h,v 1.4 2008/01/08 23:48:31 attilio Exp $

fs/ntfs/ntfs_subr.c:
$FreeBSD: src/sys/fs/ntfs/ntfs_subr.c,v 1.43 2008/01/08 23:48:31 attilio Exp $

kern/kern_lock.c:
$FreeBSD: src/sys/kern/kern_lock.c,v 1.117 2008/01/08 23:48:31 attilio Exp $

sys/buf.h:
$FreeBSD: src/sys/sys/buf.h,v 1.197 2008/01/08 23:48:31 attilio Exp $

sys/lockmgr.h:
$FreeBSD: src/sys/sys/lockmgr.h,v 1.56 2008/01/08 23:48:31 attilio Exp $

Scot

To: Scot Hetzel <swhetzel@...>
Cc: Kostik Belousov <kostikbel@...>, <freebsd-current@...>
Date: Tuesday, February 5, 2008 - 5:40 pm

At least now I know why the problem became visible just after these commits.
This is because, before them, the ntfs lockmgr locks were just operating
with the kernel as owner; consequently td_locks could not be bumped and
the problem stayed hidden.
I also think the problem is not linked to vnodes, so enabling vnode
debugging should not make any difference. NTFS uses a lot of
lockmgr locks for tracking its internal state.
More analysis to come.
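
A minimal user-space sketch of why the leak only became visible once the locks were charged to the acquiring thread. Everything here is hypothetical (toy_thread, toy_lockmgr, the toy_* functions); a NULL owner is an assumed way to model "kernel as owner", and the syscall-return check is a simplified stand-in for the one behind the "returning with 1 locks held" panic.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical model; not the real struct thread or struct lock. */
struct toy_thread {
	int td_locks;		/* locks currently charged to this thread */
};

struct toy_lockmgr {
	struct toy_thread *owner;	/* NULL models "kernel as owner" */
	bool held;
};

/* Acquire; only a thread owner gets its per-thread counter bumped. */
static void toy_acquire(struct toy_lockmgr *lk, struct toy_thread *td)
{
	assert(!lk->held);
	lk->held = true;
	lk->owner = td;
	if (td != NULL)
		td->td_locks++;
}

/* Release; undo the accounting done at acquire time. */
static void toy_release(struct toy_lockmgr *lk)
{
	assert(lk->held);
	if (lk->owner != NULL)
		lk->owner->td_locks--;
	lk->held = false;
	lk->owner = NULL;
}

/* Simplified form of the check made when a system call returns. */
static bool toy_syscall_return_ok(const struct toy_thread *td)
{
	return td->td_locks == 0;
}
```

A lock leaked with a NULL owner leaves td_locks at zero, so the check passes and the leak stays hidden; the same leak with a thread owner leaves td_locks at 1 and the check fails.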

Attilio

--
Peace can only be achieved by understanding - A. Einstein

To: <freebsd-current@...>
Date: Wednesday, January 16, 2008 - 10:10 pm

The local CVS repository is on a ZFS filesystem. Is anyone seeing
this problem on a UFS filesystem?

Scot

To: Scot Hetzel <swhetzel@...>
Cc: <freebsd-current@...>
Date: Thursday, January 17, 2008 - 7:40 am

WITNESS won't work on ZFS locks by default. Please add:

CFLAGS+=-DDEBUG

to sys/modules/zfs/Makefile and recompile the zfs kernel module. The
DEBUG define tells ZFS not to add the NOWITNESS flag at lock
initialization time.
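
As a rough illustration of that compile-time switch, here is a hedged sketch. TOY_LK_NOWITNESS and both functions are hypothetical names invented for this example, not the actual ZFS or lock API; only the idea that defining DEBUG suppresses the no-witness flag comes from the explanation above.

```c
#include <stdbool.h>

/* Hypothetical flag standing in for a "do not track with WITNESS" bit. */
#define TOY_LK_NOWITNESS 0x1u

/* Flags chosen at lock initialization time (illustrative only). */
static unsigned toy_lock_init_flags(void)
{
#ifdef DEBUG
	return 0;			/* -DDEBUG: let WITNESS see the lock */
#else
	return TOY_LK_NOWITNESS;	/* default: hide the lock from WITNESS */
#endif
}

/* True when a lock initialized this way would be tracked. */
static bool toy_witness_sees_lock(void)
{
	return (toy_lock_init_flags() & TOY_LK_NOWITNESS) == 0;
}
```

Built without -DDEBUG, toy_witness_sees_lock() returns false; built with it, the function returns true.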

--
Pawel Jakub Dawidek http://www.wheel.pl
pjd@FreeBSD.org http://www.FreeBSD.org
FreeBSD committer Am I Evil? Yes, I Am!

To: Pawel Jakub Dawidek <pjd@...>
Cc: <freebsd-current@...>
Date: Friday, January 18, 2008 - 3:12 am

I rebuilt the zfs module as suggested.

When I reboot, I am now seeing 4 different lock order reversals related to ZFS:

1. This lock order reversal occurs most often:

lock order reversal:
1st 0xffffff0001b95838 dr->dt.di.dr_mtx (dr->dt.di.dr_mtx) @
/usr/src/sys/modules/zfs/../../contrib/opensolaris/uts/common/fs/zfs/dbuf.c:1866
2nd 0xffffff00017531c0 db->db_mtx (db->db_mtx) @
/usr/src/sys/modules/zfs/../../contrib/opensolaris/uts/common/fs/zfs/dbuf.c:1888
KDB: stack backtrace:
db_trace_self_wrapper() at db_trace_self_wrapper+0x2a
witness_checkorder() at witness_checkorder+0x606
_sx_xlock() at _sx_xlock+0x52
dbuf_sync_list() at dbuf_sync_list+0x215
dbuf_sync_list() at dbuf_sync_list+0x194
dnode_sync() at dnode_sync+0x385
dmu_objset_sync() at dmu_objset_sync+0x116
dsl_pool_sync() at dsl_pool_sync+0x153
spa_sync() at spa_sync+0x39e
txg_sync_thread() at txg_sync_thread+0x17d
fork_exit() at fork_exit+0x12a
fork_trampoline() at fork_trampoline+0xe
--- trap 0, rip = 0, rsp = 0xffffffffd72a3d30, rbp = 0 ---

2. This lock order reversal is similar to the one above:

lock order reversal:
1st 0xffffff0001b00d38 dr->dt.di.dr_mtx (dr->dt.di.dr_mtx) @
/usr/src/sys/modules/zfs/../../contrib/opensolaris/uts/common/fs/zfs/dbuf.c:1866
2nd 0xffffff0001a27760 db->db_mtx (db->db_mtx) @
/usr/src/sys/modules/zfs/../../contrib/opensolaris/uts/common/fs/zfs/dbuf.c:1837
KDB: stack backtrace:
db_trace_self_wrapper() at db_trace_self_wrapper+0x2a
witness_checkorder() at witness_checkorder+0x606
_sx_xlock() at _sx_xlock+0x52
dbuf_sync_list() at dbuf_sync_list+0xaf
dbuf_sync_list() at dbuf_sync_list+0x194
dnode_sync() at dnode_sync+0x385
dmu_objset_sync() at dmu_objset_sync+0x116
dsl_pool_sync() at dsl_pool_sync+0x72
spa_sync() at spa_sync+0x39e
txg_sync_thread() at txg_sync_thread+0x17d
fork_exit() at fork_exit+0x12a
fork_trampoline() at fork_trampoline+0xe
--- trap 0, rip = 0, rsp = 0xffffffffd72a3d30, rbp = 0 ---

3. lock order reversal related to the write sys...

To: Pawel Jakub Dawidek <pjd@...>
Cc: <freebsd-current@...>
Date: Friday, January 18, 2008 - 3:40 am

Previous message sent prematurely.

3. lock order reversal related to the write syscall

lock order reversal:
1st 0xffffff00269f6500 dn->dn_mtx (dn->dn_mtx) @
/usr/src/sys/modules/zfs/../../contrib/opensolaris/uts/common/fs/zfs/dnode.c:874
2nd 0xffffff0026338d38 dr->dt.di.dr_mtx (dr->dt.di.dr_mtx) @
/usr/src/sys/modules/zfs/../../contrib/opensolaris/uts/common/fs/zfs/dnode.c:875
KDB: stack backtrace:
db_trace_self_wrapper() at db_trace_self_wrapper+0x2a
witness_checkorder() at witness_checkorder+0x606
_sx_xlock() at _sx_xlock+0x52
dnode_new_blkid() at dnode_new_blkid+0x15b
dbuf_dirty() at dbuf_dirty+0x7dc
dmu_write_uio() at dmu_write_uio+0x167
zfs_freebsd_write() at zfs_freebsd_write+0x9b4
VOP_WRITE_APV() at VOP_WRITE_APV+0x131
vn_write() at vn_write+0x24f
dofilewrite() at dofilewrite+0x85
kern_writev() at kern_writev+0x60
write() at write+0x54
syscall() at syscall+0x1ce
Xfast_syscall() at Xfast_syscall+0xab
--- syscall (4, FreeBSD ELF64, write), rip = 0x8009f623c, rsp =
0x7729d0, rbp = 0x772a18 ---

4. lock order reversal related to the lstat syscall

lock order reversal:
1st 0xffffff0001578058 zfsvfs->z_um_lock (zfsvfs->z_um_lock) @
/usr/src/sys/modules/zfs/../../contrib/opensolaris/uts/common/fs/zfs/zfs_vnops.c:2949
2nd 0xffffff001725cd0 tx->tx_sync_lock (tx->tx_sync_lock) @
/usr/src/sys/modules/zfs/../../contrib/opensolaris/uts/common/fs/zfs/txg.c:414
KDB: stack backtrace:
db_trace_self_wrapper() at db_trace_self_wrapper+0x2a
witness_checkorder() at witness_checkorder+0x606
_sx_xlock() at _sx_xlock+0x52
txg_wait_open() at txg_wait_open+0x34
dmu_tx_assign() at dmu_tx_assign+0x2a5
zfs_inactive() at zfs_inactive+0x21f
zfs_freebsd_inactive() at zfs_freebsd_inactive+0x18
VOP_INACTIVE_APV() at VOP_INACTIVE_APV+0xb5
vinactive() at vinactive+0x90
vput() at vput+0x24d
namei() at namei+0x29a
kern_lstat() at kern_lstat+0x5e
lstat() at lstat+0x2a
syscall() at syscall+0x1ce
Xfast_syscall() at Xfast_syscall+0xab
--- syscall (190, FreeBSD ELF64, lstat...

To: Scot Hetzel <swhetzel@...>
Cc: <freebsd-current@...>, Pawel Jakub Dawidek <pjd@...>
Date: Friday, January 18, 2008 - 3:48 am

What does the following show? Try:

ddb> show locks


To: Kip Macy <kip.macy@...>
Cc: <freebsd-current@...>, Pawel Jakub Dawidek <pjd@...>
Date: Friday, January 18, 2008 - 5:11 am

It doesn't show any locks. I also tried 'show alllocks', and nothing was displayed.

When the panic occurred this time, lock order reversal 4 didn't occur.

Scot

To: <freebsd-current@...>
Cc: Kip Macy <kip.macy@...>, Scot Hetzel <swhetzel@...>, Pawel Jakub Dawidek <pjd@...>
Date: Friday, January 18, 2008 - 12:55 pm

Hi.

I have the same trouble that swhetzel described. Please also see the
LOR information in my /var/log/messages. Sorry, I couldn't split the
LOR logs the way swhetzel did :-(.

And I couldn't get any lock information:
db> show lockedvnodes, alllocks, locks, ...
(I reproduced it with 'find /usr/ports' and, of course, with csup too)

P.S.
Thanks to pjd, dfr and simokawa. I have left the UFS world. I'm
living with a Panasonic Toughbook CF-R4 in the new^W ZFS world.

# disklabel ad0s2
# /dev/ad0s2:
8 partitions:
# size offset fstype [fsize bsize bps/cpg]
a: 64772741 0 ZFS
b: 4194304 64772741 swap
c: 68967045 0 unused 0 0 # "raw" part, don't edit

# mount
zoot on / (zfs, local, noatime)
devfs on /dev (devfs, local)
zoot/compat on /compat (zfs, local, noatime)
zoot/home on /home (zfs, local, noatime)
zoot/home/nork on /home/nork (zfs, local, noatime)
zoot/tmp on /tmp (zfs, local, noatime)
zoot/usr on /usr (zfs, local, noatime)
zoot/usr/doc on /usr/doc (zfs, local, noatime)
zoot/usr/obj on /usr/obj (zfs, local, noatime)
zoot/usr/ports on /usr/ports (zfs, local, noatime)
zoot/usr/ports/distfiles on /usr/ports/distfiles (zfs, local, noatime)
zoot/usr/src on /usr/src (zfs, local, noatime)
zoot/var on /var (zfs, local, noatime)
zoot/var/empty on /var/empty (zfs, local, noatime, read-only)
/dev/ad0s1 on /ntfs (ntfs, local, read-only)

# df -h
Filesystem Size Used Avail Capacity Mounted on
zoot 29G 350M 29G 1% /
devfs 1.0K 1.0K 0B 100% /dev
zoot/compat 29G 0B 29G 0% /compat
zoot/home 29G 0B 29G 0% /home
zoot/home/nork 29G 0B 29G 0% /home/nork
zoot/tmp 29G 0B 29G 0% /tmp
zoot/usr 29G 211M 29G 1% /usr
zoot/usr/doc 29G 2...

To: Scot Hetzel <swhetzel@...>
Cc: Kostik Belousov <kostikbel@...>, <freebsd-current@...>
Date: Wednesday, January 16, 2008 - 5:55 am

Do you also have witness enabled?

Kris

To: Kris Kennaway <kris@...>
Cc: Kostik Belousov <kostikbel@...>, <freebsd-current@...>
Date: Wednesday, January 16, 2008 - 8:33 pm

witness is enabled in the kernel:

# Debugging for use in -current
options KDB # Enable kernel debugger support.
options DDB # Support DDB.
options GDB # Support remote GDB.
options INVARIANTS # Enable calls of extra sanity checking
options INVARIANT_SUPPORT # Extra sanity checks of internal structures, required by INVARIANTS
options WITNESS # Enable checks to detect deadlocks and cycles
options WITNESS_SKIPSPIN # Don't run witness on spinlocks for speed
options DEBUG_VFS_LOCKS

and sysctl debug.witness.watch is set to 1.

Scot

To: <freebsd-current@...>
Cc: Kostik Belousov <kostikbel@...>, Scot Hetzel <swhetzel@...>
Date: Wednesday, January 16, 2008 - 11:11 am

It could be that lockmgr_disown() doesn't update curthread->td_locks which is
checked by INVARIANTS. Not sure if it needs to update td_locks, but it might.
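
To make the hypothesis concrete, here is a small user-space sketch. All names (sk_thread, sk_lock, the sk_* functions) are hypothetical and do not reflect the real lockmgr_disown() implementation; whether the real function needs to adjust td_locks is exactly the open question, so both variants are shown.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical model; not the real kernel structures. */
struct sk_thread {
	int td_locks;		/* locks charged to this thread */
};

struct sk_lock {
	struct sk_thread *owner;	/* NULL: not owned by any thread */
};

/* A thread takes the lock and is charged for it. */
static void sk_lock_acquire(struct sk_lock *lk, struct sk_thread *td)
{
	assert(lk->owner == NULL);
	lk->owner = td;
	td->td_locks++;
}

/* Suspected shape: ownership is handed away, the counter is untouched. */
static void sk_disown_no_update(struct sk_lock *lk)
{
	lk->owner = NULL;
}

/* Alternative shape: the previous owner is no longer charged. */
static void sk_disown_with_update(struct sk_lock *lk)
{
	if (lk->owner != NULL)
		lk->owner->td_locks--;
	lk->owner = NULL;
}
```

After sk_disown_no_update() the thread still shows td_locks == 1 although it no longer owns anything, which is the state a check at syscall return would complain about; sk_disown_with_update() leaves the counter at zero.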

--
John Baldwin
