Re: splassert: vwakeup: and friends

From: frantisek holop
Date: Monday, December 14, 2009 - 7:22 pm

hi there,

i am having difficulties copying from one external
usb device to another.  the copying stops at a certain
point and the target device stops responding.

/var/log/messages:
Dec 14 23:14:07 amaaq /bsd: umass0 at uhub0
Dec 14 23:14:07 amaaq /bsd:  port 2 configuration 1 interface 0 "Seagate USB Mass Storage" rev 2.00/0.02 addr 2
Dec 14 23:14:07 amaaq /bsd: umass0: using SCSI over Bulk-Only
Dec 14 23:14:07 amaaq /bsd: scsibus1 at umass0: 2 targets, initiator 0
Dec 14 23:14:07 amaaq /bsd: sd0 at scsibus1 targ 1 lun 0: <ST916082, 1A, 0000> SCSI0 0/direct fixed
Dec 14 23:14:07 amaaq /bsd: sd0: 152627MB, 512 bytes/sec, 312581808 sec total
Dec 15 02:45:27 amaaq /bsd: umass1 at uhub0
Dec 15 02:45:27 amaaq /bsd:  port 4 configuration 1 interface 0 "SanDisk Corporation U3 Cruzer Micro" rev 2.00/0.10 addr 3
Dec 15 02:45:27 amaaq /bsd: umass1: using SCSI over Bulk-Only
Dec 15 02:45:27 amaaq /bsd: scsibus2 at umass1: 2 targets, initiator 0
Dec 15 02:45:27 amaaq /bsd: sd1 at scsibus2 targ 1 lun 0: <SanDisk, U3 Cruzer Micro, 4.05> SCSI2 0/direct removable
Dec 15 02:45:27 amaaq /bsd: sd1: 3919MB, 512 bytes/sec, 8027789 sec total
...
Dec 15 02:54:42 amaaq /bsd: e: want 80 have 0
Dec 15 02:54:42 amaaq /bsd: splassert: vwakeup: want 80 have 0
Dec 15 02:54:42 amaaq /bsd: splassert: biodone: want 80 have 0
Dec 15 02:54:42 amaaq /bsd: splassert: vwakeup: want 80 have 0
Dec 15 02:54:42 amaaq /bsd: splassert: biodone: want 80 have 0
Dec 15 02:54:42 amaaq /bsd: splassert: vwakeup: want 80 have 0
Dec 15 02:54:42 amaaq /bsd: splassert: biodone: want 80 have 0
Dec 15 02:54:42 amaaq /bsd: splassert: vwakeup: want 80 have 0
Dec 15 02:54:42 amaaq /bsd: splassert: biodone: want 80 have 0
Dec 15 02:54:42 amaaq /bsd: splassert: vwakeup: want 80 have 0
Dec 15 02:54:42 amaaq /bsd: splassert: biodone: want 80 have 0
Dec 15 02:54:42 amaaq /bsd: splassert: vwakeup: want 80 have 0
...
Dec 15 02:54:48 amaaq /bsd: splassert: biodone: want 80 have 0
Dec 15 02:54:48 amaaq /bsd: splassert: vwakeup: want 80 have ...

From: Owain Ainsworth
Date: Monday, December 14, 2009 - 9:04 pm

please set kern.splassert to 2 and provide a dmesg of the same event.
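
For readers following along, here is a minimal sketch of how that request might be carried out on the affected OpenBSD machine, as root. The function names and the output file name are placeholders, not anything from this thread. Per sysctl(2), a kern.splassert value of 2 prints a stack trace along with each message.

```sh
#!/bin/sh
# Sketch only: intended to be run as root on the affected machine.

# Step 1: ask the kernel to print a stack trace with each
# splassert message (kern.splassert=2).
splassert_trace_on() {
	sysctl kern.splassert=2
}

# Step 2: after reproducing the problem, save the kernel message
# buffer. The default file name is a placeholder.
save_dmesg() {
	dmesg > "${1:-dmesg.splassert}"
}
```

Call splassert_trace_on first, reproduce the failing copy, then call save_dmesg and send the resulting file to the list.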

-0-
-- 
Anything free is worth what you pay for it.

From: frantisek holop
Date: Tuesday, December 22, 2009 - 7:37 pm

hi there,

i was having difficulties reproducing this (as expected probably)
but i managed to get one trace:

splassert: biodone: want 80 have 0
Starting stack trace...
splassert_check(50,0,d074ff87,0) at splassert_check+0x46
splassert_check(50,d074ff87,d9655d2c,d340fe00) at splassert_check+0x46
biodone(d9ab12fc,d08c9b30,d16e1e00,d16e1e00) at biodone+0x20
sd_kill_buffers(d340fe00,ffffffff,5,d16fb080) at sd_kill_buffers+0x33
sdactivate(d340fe00,1,4,d1450b00) at sdactivate+0x27
config_deactivate(d340fe00,1,0,1,0) at config_deactivate+0x39
scsi_activate_target(d3647800,1,1,2,0) at scsi_activate_target+0x2f
scsi_activate_bus(d3647800,1,d9655dfc,d067e46c) at scsi_activate_bus+0x29
scsi_activate(d3647800,ffffffff,ffffffff,1,d16e1e00) at scsi_activate+0x65
scsibusactivate(d3647800,1,d9655e4c,d067dd0f) at scsibusactivate+0x15
config_deactivate(d3647800,d16e1e00,d9655e8c,d067e5db,0) at config_deactivate+0x39
umass_activate(d340f000,1,d9655ebc,d340f000) at umass_activate+0x3e
config_deactivate(d340f000,d340f014,10,d067e54b) at config_deactivate+0x39
config_detach(d340f000,1,d9655f0c,d067eac4,d1374780) at config_detach+0x23b
usb_disconnect_port(d138c918,d1450a80,10) at usb_disconnect_port+0x65
uhub_explore(d1374780,d067cba4,d9655f8c,d067cc59,0) at uhub_explore+0x205
usb_discover(d1374800,1a4,d0200928,d5aca580,d5aca6e0) at usb_discover+0x36
usb_event_thread(d1374800) at usb_event_thread+0x91
Bad frame pointer: 0xd0a28e78
End of stack trace.


but i am afraid this is being caused by a dying disk...

-f
-- 
last week i couldn't even spell engineer, now i are one.

From: David Gwynne
Date: Wednesday, December 23, 2009 - 3:50 am

can you tell me what version of src/sys/scsi/sd.c you are running?

cheers,
dlg


From: frantisek holop
Date: Wednesday, December 23, 2009 - 6:05 am

the snapshot being used is from Dec  4, so i'd guess
it is Revision 1.169

background:
i am trying to run e2fsck on a 120G ext2 partition on that
usb external disk and it just stops reading from the disk at
random positions.  for example it gets up to 30% of pass 1,
then it just stops.  no read errors in dmesg, nothing, just
idling. eventually the disk spins down.  top says it's in
'biowait'.  (it could also be a bug in e2fsck)

this is a go-between disk between a windows machine and
openbsd. windows seems to have no problems with it so far..


another strange thing is that while the partition isn't
clean (i can never finish the fsck), it is still being mounted
by mount from hotplugd when i connect it..

-f
-- 
i couldn't repair your brakes so i made your horn louder.

From: frantisek holop
Date: Thursday, December 24, 2009 - 12:02 pm

i have upgraded to the latest snapshot, and indeed
the splassert goes away.

(but e2fsck still does not finish checking the disk, it stops
reading at random positions of pass 1 (30%, 11%)
and sits there doing nothing.)

-f
-- 
save a tree.  eat a beaver.

From: Jan Stary
Date: Friday, December 25, 2009 - 11:44 am

make sure the disk itself is OK
before blaming anything higher up.

"nothing" meaning which process state, really?

From: frantisek holop
Date: Friday, December 25, 2009 - 6:43 pm

openbsd's fsck goes through with no problems.

eventually i managed to get e2fsck through as well:
i made a script to touch a file every 9 seconds
so the disk doesn't spin down and for some reason
e2fsck finished.  but i am not convinced this is
the actual reason, e2fsck was never idle for 10s
in the first place, it was reading through the disk
after all.  and if there were problems spinning
up the disk, i am sure dmesg or some other layer
of the system would have told me.
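
The script itself was not posted, so the following is only a guess at what such a keepalive could look like. The function name, its arguments and the example path are all hypothetical. It touches a file at a fixed interval (9 seconds by default, as described above), with an optional count so that it can be stopped after a fixed number of touches.

```sh
#!/bin/sh
# keepalive FILE [INTERVAL] [COUNT]
# Touch FILE every INTERVAL seconds (default 9) so the disk sees
# periodic activity. COUNT limits the number of touches; 0 (the
# default) means keep going until interrupted.
keepalive() {
	file=$1
	interval=${2:-9}
	count=${3:-0}
	n=0
	while [ "$count" -eq 0 ] || [ "$n" -lt "$count" ]; do
		touch "$file" || return 1
		n=$((n + 1))
		# Skip the sleep after the final touch.
		if [ "$count" -eq 0 ] || [ "$n" -lt "$count" ]; then
			sleep "$interval"
		fi
	done
}
```

Something like "keepalive /mnt/usb/.keepalive 9 &" (path made up) would run it in the background while e2fsck works. Whether this really prevents a spin-down depends on the drive and on when the timestamp update reaches the disk, which fits the doubt expressed above.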

i am far beyond the point of pointing fingers
(disk, OS, etc), i just want it to work deterministically,
so my mails to misc don't look like dali paintings.

if it has errors, i should see them being reported.

as for the process state: sleep/biowait

-f
-- 
whatever you are, be a good one. -- abraham lincoln.

From: frantisek holop
Date: Friday, December 25, 2009 - 7:15 pm

ok.  no more splassert, but everything else remains the same.
i just came back from another hard lockup after trying
to copy files from one external usb disk to another.
at some random point the copying just dies and then
first the processes doing anything with disks and then
gradually the whole system locks up. nothing in the logs,
no dmesg, nothing.

i have just copied the same files over without any problem
using my parents' notebook.

so it's either the bios/hw/usb ports or openbsd's scsi/usb
layer.

could someone please help me build the most verbose
kernel possible (scsi+usb)?  combined with some remote
syslog, hopefully some of the logs will be readable.

-f
-- 
artificial intelligence: the other guy's opinion.

From: David Vasek
Date: Saturday, December 26, 2009 - 12:16 pm

Set ddb.console=1 in /etc/sysctl.conf and break to the debugger with 
Ctrl-Alt-Esc once the lockup happens. Run trace and ps there, then continue, 
break to the debugger again, etc. This could give you some insight into what's 
going on. Not exactly a scientific solution, but easy and quick as a first 
attempt.
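
A small sketch of the first step, assuming the line should be added only if it is not already present. The helper name is made up, and the optional argument exists only so the function can be tried against a scratch file instead of the real /etc/sysctl.conf.

```sh
#!/bin/sh
# Append ddb.console=1 to sysctl.conf unless the exact line is
# already there. Run as root when targeting /etc/sysctl.conf.
enable_ddb_console() {
	conf=${1:-/etc/sysctl.conf}
	grep -q '^ddb\.console=1$' "$conf" 2>/dev/null ||
		echo 'ddb.console=1' >> "$conf"
}
```

Settings in /etc/sysctl.conf are applied at boot, so a reboot is the simple way to make sure this one is active before reproducing the lockup. The commands to type at the debugger prompt are the ones named above: trace, ps, then continue.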

Regards,
David
