[Fwd: Re: kern/118258 sysctl causing panics on 7.0-xxx]

From: Remko Lodder
Date: Wednesday, November 28, 2007 - 7:57 am

Hello,

So, as per Jeff's information, could someone from the -current
list either contact Jeff or try to resolve the problems
mentioned? :)

Cheers
remko

-------- Original Message --------
Subject: Re: kern/118258 sysctl causing panics on 7.0-xxx
Date: Wed, 28 Nov 2007 14:50:02 GMT
From: Jeff Palmer <jeff@rexdb.com>
Reply-To: Jeff Palmer <jeff@rexdb.com>
To: freebsd-bugs@FreeBSD.org

The following reply was made to PR kern/118258; it has been noted by GNATS.

From: Jeff Palmer <jeff@rexdb.com>
To: bug-followup@freebsd.org
Cc:
Subject: Re: kern/118258 sysctl causing panics on 7.0-xxx
Date: Wed, 28 Nov 2007 09:28:47 -0500

 Caught another vmcore,
 this time simply from 'sysctl -a'
 This one showed debugging data from the if_wpi driver on the terminal
 immediately before the panic.

 Since the wpi driver is in -CURRENT (and not in 7.0-BETA3) I'm uncertain
 if this should actually be filed as a PR,  or just brought up on the
 current@ list.
 I'm not certain it's directly related to the wpi driver,  so until
 informed otherwise,  I'll just keep submitting followups to this PR as I
 can.

 Thanks,

 Jeff Palmer


 Laptop# kgdb /usr/obj/usr/src/sys/GENERIC/kernel.debug /var/crash/vmcore.1
 [GDB will not be able to debug user-mode threads:
 /usr/lib/libthread_db.so: Undefined symbol "ps_pglobal_lookup"]
 GNU gdb 6.1.1 [FreeBSD]
 Copyright 2004 Free Software Foundation, Inc.
 GDB is free software, covered by the GNU General Public License, and you are
 welcome to change it and/or distribute copies of it under certain
 conditions.
 Type "show copying" to see the conditions.
 There is absolutely no warranty for GDB.  Type "show warranty" for details.
 This GDB was configured as "i386-marcel-freebsd".

 Unread portion of the kernel message buffer:


 Fatal trap 12: page fault while in kernel mode
 cpuid = 1; apic id = 01
 fault virtual address   = 0x18264
 fault code              =3D supervisor read, page not present
 instruction pointer   ...
From: Tai-hwa Liang
Date: Thursday, November 29, 2007 - 8:53 pm

This is a longstanding bug which also exists in RELENG_6.  It turns out
that running 'sysctl kern.ttys' after a terminal device is removed can
trigger this panic reliably.  For example, run 'sysctl kern.ttys' several
times after detaching a USB-to-RS232 serial cable or a PCMCIA modem card.

   Alternatively, the following script demonstrates the panic if you don't
have a physically removable terminal device:

#!/bin/sh
#
# Warning! Running this script as root will panic your CURRENT box...
#
while true; do
 	kldload dcons
 	kldunload dcons
 	ls /dev
 	sysctl kern.ttys
 	sleep 1
done

   This seems to be a race between devfs and destroy_dev().  Cc'ing kib@.
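
   To make the shape of such a race concrete, here is a minimal userland
sketch, not kernel code.  It assumes the problem is a reader dereferencing
a device pointer while another path is tearing the device down.  The names
fake_cdev, fake_tty, dev_list_lock, report_udev() and detach_dev() are all
hypothetical stand-ins, and the pthread mutex only plays the role of a
kernel list lock.  The idea is that the teardown path hides the pointer
under the same lock the reader takes, and only then frees the device.

```c
#include <assert.h>
#include <pthread.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical stand-ins for the kernel's cdev and tty; not the real structures. */
struct fake_cdev {
	unsigned int udev;	/* device number reported to userland */
};

struct fake_tty {
	struct fake_cdev *dev;	/* NULL once the device has been detached */
};

/* Plays the role of a list lock shared by readers and the teardown path. */
static pthread_mutex_t dev_list_lock = PTHREAD_MUTEX_INITIALIZER;

/* Reader side: return the device number, or 0 if the device is gone. */
unsigned int
report_udev(struct fake_tty *tp)
{
	unsigned int udev = 0;

	assert(tp != NULL);
	pthread_mutex_lock(&dev_list_lock);
	if (tp->dev != NULL)		/* check and dereference under the lock */
		udev = tp->dev->udev;
	pthread_mutex_unlock(&dev_list_lock);
	return (udev);
}

/* Teardown side: clear the pointer under the lock, then free the device. */
void
detach_dev(struct fake_tty *tp)
{
	struct fake_cdev *dev;

	assert(tp != NULL);
	pthread_mutex_lock(&dev_list_lock);
	dev = tp->dev;
	tp->dev = NULL;			/* readers now see NULL, never a stale pointer */
	pthread_mutex_unlock(&dev_list_lock);
	free(dev);			/* free(NULL) is a no-op, so a repeated detach is harmless */
}
```

   Without the lock, or with the free done before the pointer is cleared,
a reader could pass the NULL check and then touch freed memory, which is
the kind of page fault described above.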
-- 
Cheers,

Tai-hwa Liang
_______________________________________________
freebsd-current@freebsd.org mailing list
http://lists.freebsd.org/mailman/listinfo/freebsd-current
To unsubscribe, send any mail to "freebsd-current-unsubscribe@freebsd.org"
From: jeff
Date: Friday, November 30, 2007 - 1:35 am

Suddenly, it all becomes quite clear.  I use umodem and ucom for my
cellphone (Motorola RAZR V3), mostly to charge it but occasionally to
provide my laptop with mobile connectivity.  So I add/remove a USB tty
almost every night.

My only question is:  up until recently the panic was a double panic and
hang, with no vmcore available.  Then all of a sudden it was a single
panic where savecore was actually useful.  I wonder what changed (or even
if it's the same panic).

If this is a known and longstanding bug, I'll just use a wall charger for
my cellphone, and leave everyone alone.  Thanks for the input!



From: John Baldwin
Date: Thursday, January 3, 2008 - 2:52 pm

Try this patch.  Also available at
http://www.FreeBSD.org/~jhb/patches/ttys_sysctl.patch

--- //depot/vendor/freebsd/src/sys/fs/devfs/devfs_vnops.c	2007/10/24 19:06:35
+++ //depot/user/jhb/acpipci/fs/devfs/devfs_vnops.c	2007/11/01 17:09:40
@@ -995,17 +995,20 @@
 
 	vnode_destroy_vobject(vp);
 
+	VI_LOCK(vp);
 	dev_lock();
 	dev = vp->v_rdev;
 	vp->v_rdev = NULL;
 
 	if (dev == NULL) {
 		dev_unlock();
+		VI_UNLOCK(vp);
 		return (0);
 	}
 
 	dev->si_usecount -= vp->v_usecount;
 	dev_unlock();
+	VI_UNLOCK(vp);
 	dev_rel(dev);
 	return (0);
 }
--- //depot/vendor/freebsd/src/sys/kern/tty.c	2007/07/20 09:45:18
+++ //depot/user/jhb/acpipci/kern/tty.c	2007/11/13 18:59:58
@@ -3040,16 +3040,19 @@
  *
  * XXX: This shall sleep until all threads have left the driver.
  */
- 
 void
 ttyfree(struct tty *tp)
 {
+	struct cdev *dev;
 	u_int unit;
  
 	mtx_assert(&Giant, MA_OWNED);
 	ttygone(tp);
 	unit = tp->t_devunit;
-	destroy_dev(tp->t_mdev);
+	dev = tp->t_mdev;
+	tp->t_dev = NULL;
+	ttyrel(tp);
+	destroy_dev(dev);
 	free_unr(tty_unit, unit);
 }
 
@@ -3065,7 +3068,6 @@
 	tp = TAILQ_FIRST(&tty_list);
 	if (tp != NULL)
 		ttyref(tp);
-	mtx_unlock(&tty_list_mutex);
 	while (tp != NULL) {
 		bzero(&xt, sizeof xt);
 		xt.xt_size = sizeof xt;
@@ -3074,6 +3076,18 @@
 		xt.xt_cancc = tp->t_canq.c_cc;
 		xt.xt_outcc = tp->t_outq.c_cc;
 		XT_COPY(line);
+
+		/*
+		 * XXX: We hold the tty list lock while doing this to
+		 * work around a race with pty/pts tty destruction.
+		 * They set t_dev to NULL and then call ttyrel() to
+		 * free the structure which will block on the list
+		 * lock before they call destroy_dev() on the cdev
+		 * backing t_dev.
+		 *
+		 * XXX: ttyfree() now does the same since it has been
+		 * fixed to not leak ttys.
+		 */
 		if (tp->t_dev != NULL)
 			xt.xt_dev = dev2udev(tp->t_dev);
 		XT_COPY(state);
@@ -3096,6 +3110,7 @@
 		XT_COPY(olowat);
 		XT_COPY(ospeedwat);
 #undef XT_COPY
+		mtx_unlock(&tty_list_mutex);
 ...
From: Tai-hwa Liang
Date: Friday, January 4, 2008 - 3:19 am

With this patch, -CURRENT no longer boots; it panics as follows:

Unread portion of the kernel message buffer:
<118>Configuring syscons:
<118> keyrate
Sleeping thread (tid 100048, pid 307) owns a non-sleepable lock
sched_switch(c3b97a50,0,1,394c04af,12,...) at sched_switch+0x146
mi_switch(1,0,c3b97a50,f888caa8,c051788a,...) at mi_switch+0x137
sleepq_switch(c3b97a50,0,c06945ef,19b,c065bdc0,...) at sleepq_switch+0x7e
sleepq_catch_signals(0,c3b97a50,f888cae8,c04b38d0,c3c999a8,...) at sleepq_catch_signals+0x24a
sleepq_wait_sig(c3c999a8,c3c99990,c0694a50,101,0,...) at sleepq_wait_sig+0x15
_cv_wait_sig(c3c999a8,c3c99990,c103c800,0,f888cb74,...) at _cv_wait_sig+0x180
seltdwait(c3f551d4,1,c3a9f200,c3b97a50,c3f59e58,...) at seltdwait+0xd6
poll(c3b97a50,f888ccfc,c,12,88cd2c,...) at poll+0x489
syscall(f888cd38) at syscall+0x317
Xint0x80_syscall() at Xint0x80_syscall+0x20
--- syscall (209, FreeBSD ELF32, poll), eip = 0x28112baf, esp = 0xbfbfee1c, ebp = 0xbfbfee48 ---
panic: sleeping thread
KDB: enter: panic
panic: from debugger
Uptime: 8s
Physical memory: 1014 MB
Dumping 55 MB: 40 24 8

#0  doadump () at pcpu.h:195
195	pcpu.h: No such file or directory.
 	in pcpu.h
(kgdb) where
#0  doadump () at pcpu.h:195
#1  0xc04ea575 in boot (howto=260) at ../../../kern/kern_shutdown.c:417
#2  0xc04ea797 in panic (fmt=Variable "fmt" is not available.
) at ../../../kern/kern_shutdown.c:571
#3  0xc0446da7 in db_panic (addr=Could not find the frame base for "db_panic".
) at ../../../ddb/db_command.c:444
#4  0xc044751c in db_command (last_cmdp=0xc06dd754, cmd_table=0x0, dopager=1)
     at ../../../ddb/db_command.c:411
#5  0xc044762a in db_command_loop () at ../../../ddb/db_command.c:464
#6  0xc044908d in db_trap (type=3, code=0) at ../../../ddb/db_main.c:228
#7  0xc0510034 in kdb_trap (type=3, code=0, tf=0xf88df8dc)
     at ../../../kern/subr_kdb.c:510
#8  0xc0667bc7 in trap (frame=0xf88df8dc) at ../../../i386/i386/trap.c:647
#9  0xc065584b in calltrap () at ...
From: John Baldwin
Date: Friday, January 4, 2008 - 10:56 am

Fixed; the patch was missing an unlock at the bottom of the loop.  The
updated patch is at the same URL.
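
For anyone following along, the general pattern can be sketched in
userland C.  This is only an illustration, not the actual patch: node,
walk_lock, emit() and dump_list() are hypothetical names, the pthread
mutex stands in for tty_list_mutex, and the reference counting that keeps
an entry alive while the lock is dropped is left out.  The point is that
every acquire needs a matching release before any step that may sleep.

```c
#include <assert.h>
#include <pthread.h>
#include <stddef.h>

/* Hypothetical list node; stands in for an entry on a kernel list. */
struct node {
	int value;
	struct node *next;
};

static pthread_mutex_t walk_lock = PTHREAD_MUTEX_INITIALIZER;
static int held;	/* bookkeeping for this single-threaded sketch only */

static void
walk_lock_acquire(void)
{
	pthread_mutex_lock(&walk_lock);
	held = 1;
}

static void
walk_lock_release(void)
{
	held = 0;
	pthread_mutex_unlock(&walk_lock);
}

/* Stand-in for a step that may sleep, such as copying data out to userland. */
static void
emit(int snapshot, int *out, size_t *n)
{
	assert(!held);		/* aborts if the caller forgot to drop the lock */
	out[(*n)++] = snapshot;
}

/* Copy every value in the list into out[]; returns the number copied. */
size_t
dump_list(struct node *head, int *out)
{
	struct node *np;
	size_t n = 0;
	int snapshot;

	walk_lock_acquire();
	np = head;
	while (np != NULL) {
		snapshot = np->value;	/* read fields under the lock */
		walk_lock_release();	/* drop it before the sleepable step */
		emit(snapshot, out, &n);
		walk_lock_acquire();	/* retake it to advance to the next entry */
		np = np->next;
	}
	walk_lock_release();		/* balance the final acquire */
	return (n);
}
```

Removing the walk_lock_release() call inside the loop makes the assert in
emit() fire on the first entry, which is a userland analogue of a
"sleeping thread owns a non-sleepable lock" complaint.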

-- 
John Baldwin
From: Tai-hwa Liang
Date: Monday, January 7, 2008 - 8:50 pm

This one works like a charm!  I've tested it with a USB-to-RS232 cable
as well as the aforementioned script.  -CURRENT no longer panics when
dumping kern.ttys.

   Great work!

-- 
Thanks,

Tai-hwa Liang