hi there,

i am having difficulties copying from one external usb device to the other. the copying stops at a certain point and the target device stops responding.

/var/log/messages:

Dec 14 23:14:07 amaaq /bsd: umass0 at uhub0
Dec 14 23:14:07 amaaq /bsd: port 2 configuration 1 interface 0 "Seagate USB Mass Storage" rev 2.00/0.02 addr 2
Dec 14 23:14:07 amaaq /bsd: umass0: using SCSI over Bulk-Only
Dec 14 23:14:07 amaaq /bsd: scsibus1 at umass0: 2 targets, initiator 0
Dec 14 23:14:07 amaaq /bsd: sd0 at scsibus1 targ 1 lun 0: <ST916082, 1A, 0000> SCSI0 0/direct fixed
Dec 14 23:14:07 amaaq /bsd: sd0: 152627MB, 512 bytes/sec, 312581808 sec total
Dec 15 02:45:27 amaaq /bsd: umass1 at uhub0
Dec 15 02:45:27 amaaq /bsd: port 4 configuration 1 interface 0 "SanDisk Corporation U3 Cruzer Micro" rev 2.00/0.10 addr 3
Dec 15 02:45:27 amaaq /bsd: umass1: using SCSI over Bulk-Only
Dec 15 02:45:27 amaaq /bsd: scsibus2 at umass1: 2 targets, initiator 0
Dec 15 02:45:27 amaaq /bsd: sd1 at scsibus2 targ 1 lun 0: <SanDisk, U3 Cruzer Micro, 4.05> SCSI2 0/direct removable
Dec 15 02:45:27 amaaq /bsd: sd1: 3919MB, 512 bytes/sec, 8027789 sec total
...
Dec 15 02:54:42 amaaq /bsd: e: want 80 have 0
Dec 15 02:54:42 amaaq /bsd: splassert: vwakeup: want 80 have 0
Dec 15 02:54:42 amaaq /bsd: splassert: biodone: want 80 have 0
Dec 15 02:54:42 amaaq /bsd: splassert: vwakeup: want 80 have 0
Dec 15 02:54:42 amaaq /bsd: splassert: biodone: want 80 have 0
Dec 15 02:54:42 amaaq /bsd: splassert: vwakeup: want 80 have 0
Dec 15 02:54:42 amaaq /bsd: splassert: biodone: want 80 have 0
Dec 15 02:54:42 amaaq /bsd: splassert: vwakeup: want 80 have 0
Dec 15 02:54:42 amaaq /bsd: splassert: biodone: want 80 have 0
Dec 15 02:54:42 amaaq /bsd: splassert: vwakeup: want 80 have 0
Dec 15 02:54:42 amaaq /bsd: splassert: biodone: want 80 have 0
Dec 15 02:54:42 amaaq /bsd: splassert: vwakeup: want 80 have 0
...
Dec 15 02:54:48 amaaq /bsd: splassert: biodone: want 80 have 0
Dec 15 02:54:48 amaaq /bsd: splassert: vwakeup: want 80 have ...
please set kern.splassert to 2 and provide a dmesg of the same event.

-0-
-- 
Anything free is worth what you pay for it.
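For reference, the request above could be carried out roughly as follows. This is a minimal sketch, assuming the commands are run as root on the affected OpenBSD machine and that level 2 makes the kernel print a stack trace along with the assertion message; the output file name is an arbitrary example.

```sh
# Run as root on the affected machine.
# Level 2: log the failed spl assertion together with a stack trace.
sysctl kern.splassert=2

# ... reproduce the failing copy, then save the kernel message buffer.
# The path below is only an example.
dmesg > /tmp/dmesg-splassert.txt
```

The setting made this way lasts until the next reboot; adding the line `kern.splassert=2` to /etc/sysctl.conf would make it persistent.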
hi there,

i was having difficulties reproducing this (as expected probably) but i managed to get one trace:

splassert: biodone: want 80 have 0
Starting stack trace...
splassert_check(50,0,d074ff87,0) at splassert_check+0x46
splassert_check(50,d074ff87,d9655d2c,d340fe00) at splassert_check+0x46
biodone(d9ab12fc,d08c9b30,d16e1e00,d16e1e00) at biodone+0x20
sd_kill_buffers(d340fe00,ffffffff,5,d16fb080) at sd_kill_buffers+0x33
sdactivate(d340fe00,1,4,d1450b00) at sdactivate+0x27
config_deactivate(d340fe00,1,0,1,0) at config_deactivate+0x39
scsi_activate_target(d3647800,1,1,2,0) at scsi_activate_target+0x2f
scsi_activate_bus(d3647800,1,d9655dfc,d067e46c) at scsi_activate_bus+0x29
scsi_activate(d3647800,ffffffff,ffffffff,1,d16e1e00) at scsi_activate+0x65
scsibusactivate(d3647800,1,d9655e4c,d067dd0f) at scsibusactivate+0x15
config_deactivate(d3647800,d16e1e00,d9655e8c,d067e5db,0) at config_deactivate+0x39
umass_activate(d340f000,1,d9655ebc,d340f000) at umass_activate+0x3e
config_deactivate(d340f000,d340f014,10,d067e54b) at config_deactivate+0x39
config_detach(d340f000,1,d9655f0c,d067eac4,d1374780) at config_detach+0x23b
usb_disconnect_port(d138c918,d1450a80,10) at usb_disconnect_port+0x65
uhub_explore(d1374780,d067cba4,d9655f8c,d067cc59,0) at uhub_explore+0x205
usb_discover(d1374800,1a4,d0200928,d5aca580,d5aca6e0) at usb_discover+0x36
usb_event_thread(d1374800) at usb_event_thread+0x91
Bad frame pointer: 0xd0a28e78
End of stack trace.

but i am afraid this is being caused by a dying disk...

-f
-- 
last week i couldn't even spell engineer, now i are one.
can you tell me what version of src/sys/scsi/sd.c you are running? cheers, dlg
the snapshot being used is from Dec 4, so i'd guess it is Revision 1.169

background: i am trying to run an e2fsck on a 120G ext2 partition on that usb external disk and it just stops reading from the disk at random positions. for example it gets up to 30% of pass 1, then it just stops. no read errors in dmesg, nothing, just idling. eventually the disk spins down. top says it's in 'biowait'. (it could also be a bug in e2fsck)

this is a go-between disk between a windows machine and openbsd. windows seems to have no problems with it so far.

another strange thing is that while the partition isn't clean (i can never finish the fsck), it is being mounted by mount from hotplugd when i connect it.

-f
-- 
i couldn't repair your brakes so i made your horn louder.
i have upgraded to the latest snapshot, and indeed the splassert goes away. (but e2fsck still does not finish the disk, it stops reading at random positions of pass 1 (30%, 11%) and sits there doing nothing.) -f -- save a tree. eat a beaver.
make sure the disk itself is OK before blaming anything higher up. "nothing" meaning which process state, really?
openbsd's fsck goes through with no problems.

eventually i managed to get e2fsck through as well: i made a script to touch a file every 9 seconds so the disk doesn't spin down, and for some reason e2fsck finished. but i am not convinced this is the actual reason: e2fsck was never idle for 10s in the first place, it was reading through the disk after all. and if there were problems spinning up the disk, i am sure dmesg or some other layer of the system would have told me.

i am far beyond the point of pointing fingers (disk, OS, etc). i just want it to work deterministically, so my mails to misc don't look like dali paintings. if it has errors, i should see them being reported.

as for the process state: sleep/biowait

-f
-- 
whatever you are, be a good one. -- abraham lincoln.
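The keep-alive workaround described above could be sketched like this. The poster's actual script was not shown, so the function name `keep_awake`, the marker file `.keepalive`, and the mount point in the usage note are all hypothetical; the 9-second default comes from the message.

```sh
#!/bin/sh
# keep_awake DIR [INTERVAL] [COUNT]
# Touch a marker file under DIR every INTERVAL seconds (default 9) so the
# drive keeps seeing activity. COUNT=0 (default) loops until interrupted.
# Hypothetical helper: names and defaults are illustrative only.
keep_awake() {
    dir=$1
    interval=${2:-9}
    count=${3:-0}
    i=0
    while [ "$count" -eq 0 ] || [ "$i" -lt "$count" ]; do
        touch "$dir/.keepalive" || return 1
        sync    # ask the kernel to flush, so the update is not just cached
        i=$((i + 1))
        # Skip the final sleep once a finite COUNT has been reached.
        if [ "$count" -eq 0 ] || [ "$i" -lt "$count" ]; then
            sleep "$interval"
        fi
    done
}
```

Something like `keep_awake /mnt/usb &` (with /mnt/usb standing in for a mounted filesystem on the same drive) would run it in the background while the check is in progress. As the message itself notes, it is not established that spin-down was the real cause.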
ok. no more splassert, but everything remains the same.

i just came back from another hard lockup after trying to copy files from one external usb disk to another. at some random point the copying just dies, and then first the processes doing anything with disks and then gradually the whole system locks up. nothing in the logs, no dmesg, nothing.

i have just copied the same files over without any problem using my parents' notebook. so it's either the bios/hw/usb ports or openbsd's scsi/usb layer.

could someone please help me create the most verbose kernel possible (scsi+usb), so that combined with some remote syslog hopefully some of the logs will be readable?

-f
-- 
artificial intelligence: the other guy's opinion.
Set ddb.console=1 in /etc/sysctl.conf and break to the debugger with Ctrl-Alt-Esc once the lockup happens. Run trace and ps there, then continue, break to the debugger again, etc. This could give you some insight into what's going on. Not exactly a scientific solution, but easy and quick as a first attempt.

Regards,
David
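The debugger procedure above might look roughly like this. It is a sketch only, assuming root access, a kernel built with ddb, and that the setting is picked up from /etc/sysctl.conf at the next boot; the notes on each ddb command are a general description, not output from this machine.

```sh
# Run as root. Appends the setting so it is applied at the next boot.
echo 'ddb.console=1' >> /etc/sysctl.conf

# After the lockup, press Ctrl-Alt-Esc on the console, then at the prompt:
#   ddb> trace       (stack trace of the current context)
#   ddb> ps          (process list, including what each process waits on)
#   ddb> continue    (resume the system; break in again a little later)
```

Repeating the trace/ps/continue cycle a few times and comparing the results may show whether the same processes stay stuck in the same place.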
