I upgraded my kernel from 2.6.23-rc2 to 2.6.23-rc5.
The system hangs (the Caps Lock and Scroll Lock LEDs both flash).
It happens *randomly*, but most of the time during or after login to KDE.

I have not investigated it yet because I have not tried doing that
before. Also, the system really does not respond, so I can't do much.
I just hope it blue-screens so I can send the error code easily :)

Please CC me, because I'm not subscribed.
Regards,
-
i experienced hangs, with the flashing caps and scroll locks as you've
described, in a few of my later pulls prior to rc5. i couldn't
reproduce the hangs and my logs didn't show evidence of a problem. my
system under rc5, so far, hasn't hung on me.

charles
-
Oh, I thought I was the only one. I also had a single hang+flashing
Caps & Scroll Lock with -rc5, but haven't had one since.

I had VMWare Player modules loaded at the time though, and I
recently rebuilt them with the any-any-patch-113 (earlier versions
would not build with very recent kernels).

-rc4-git2 and -git3 never hung even with VMWare modules loaded.
So I disabled the autoload of VMWare stuff. The problem has not
reproduced so far. My system is a Dell D610, running updated
Fedora 7, with Intel 915GM video chipset.

--alessandro
"I can't believe you if I can't hear you"
(Editors, 'Smokers Outside The Hospital Doors')
-
[...]
Hmm, it just occurred here, with no chance to capture anything. It happened
under some system and network load (distcc/nfs). (The kernel was tainted by
the latest atheros/madwifi, but I have never had trouble with it.) I have
not had much uptime with the -rc's before rc5.

Anything I can do to help debug this?
config & dmesg & lspci:
http://www.mittendorfer.com/rm/temp/info-2.6.23-rc5.txt

sl, ritch
-
First of all, try this patch:

--- a/net/ipv4/tcp_input.c
+++ b/net/ipv4/tcp_input.c
@@ -560,7 +560,7 @@ static u32 tcp_rto_min(struct sock *sk)
 	struct dst_entry *dst = __sk_dst_get(sk);
 	u32 rto_min = TCP_RTO_MIN;
 
-	if (dst_metric_locked(dst, RTAX_RTO_MIN))
+	if (dst && dst_metric_locked(dst, RTAX_RTO_MIN))
 		rto_min = dst->metrics[RTAX_RTO_MIN-1];
 	return rto_min;
 }

If that doesn't help, then set up netconsole or a serial console and
try to capture some output from the hang.
(details on how to setup net & serial consoles can be found in
Documentation/networking/netconsole.txt
and
Documentation/serial-console.txt
)
Make sure you've set your console loglevel high enough to log
everything.

Also try enabling sysrq in your kernel and, if possible, capture a
sysrq+t dump when the crash happens and send in the dmesg output
after sysrq+t - details in Documentation/sysrq.txt - there's also
info on console log level in there.

You can also try building a kernel with most (or all) of the debug
options found in the 'Kernel hacking' menu enabled. That can often
help by producing extra valuable debug output (you need to be able
to capture it though, so getting a net/serial console set up as well
is usually a good idea if the box hangs completely and you can't
just get info by running dmesg).

Kind regards,
Jesper Juhl
-
On Sun, 2 Sep 2007 22:38:28 +0200
I've just had a kernel panic, running a vanilla 2.6.23-rc5, no
proprietary modules loaded, running X11 under moderate load.
2.6.23-rc[2-4] have been rock solid on the same box. I've reverted
to rc4 for now.

François
-
Try this from net-2.6 tree:

--- a/net/ipv4/tcp_input.c
+++ b/net/ipv4/tcp_input.c
@@ -560,7 +560,7 @@ static u32 tcp_rto_min(struct sock *sk)
 	struct dst_entry *dst = __sk_dst_get(sk);
 	u32 rto_min = TCP_RTO_MIN;
 
-	if (dst_metric_locked(dst, RTAX_RTO_MIN))
+	if (dst && dst_metric_locked(dst, RTAX_RTO_MIN))
 		rto_min = dst->metrics[RTAX_RTO_MIN-1];
 	return rto_min;
 }
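
[Editor's sketch] The one-line change in the patch relies on the short-circuit behaviour of `&&`: when `dst` is NULL, neither the lock check nor the `dst->metrics[...]` read is evaluated, and the default is returned. The following user-space mock illustrates only that pattern. Everything prefixed `mock_`/`MOCK_` is hypothetical, the constant values are illustrative stand-ins rather than values copied from kernel headers, and the body of `mock_metric_locked` is an assumption about what such a helper does, not the kernel's code.

```c
#include <stddef.h>
#include <stdint.h>

/* Stand-in constants: illustrative values only, not taken from kernel headers. */
#define MOCK_RTAX_LOCK     1
#define MOCK_RTAX_RTO_MIN  13
#define MOCK_RTAX_MAX      13
#define MOCK_TCP_RTO_MIN   50u

/* Simplified mock of a route entry that carries per-route metrics. */
struct mock_dst_entry {
    uint32_t metrics[MOCK_RTAX_MAX];
};

/* Dereferences dst, so it must never be called with dst == NULL. */
static int mock_metric_locked(const struct mock_dst_entry *dst, int metric)
{
    return (dst->metrics[MOCK_RTAX_LOCK - 1] & (1u << metric)) != 0;
}

/* Patched shape: "dst &&" skips both dereferences when there is no route. */
static uint32_t mock_rto_min(const struct mock_dst_entry *dst)
{
    uint32_t rto_min = MOCK_TCP_RTO_MIN;

    if (dst && mock_metric_locked(dst, MOCK_RTAX_RTO_MIN))
        rto_min = dst->metrics[MOCK_RTAX_RTO_MIN - 1];
    return rto_min;
}
```

Calling `mock_rto_min(NULL)` returns the default instead of faulting. Without the `dst &&` test the same call would read through a NULL pointer, which is the kind of access the patch is meant to prevent.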
-
Here is a relevant oops for this hang.
[ 7329.832382] BUG: unable to handle kernel NULL pointer dereference at virtual address 00000025
[ 7329.934755] printing eip:
[ 7329.967145] 802cb921
[ 7329.993347] *pde = 00000000
[ 7330.026799] Oops: 0000 [#1]
[ 7330.060246] Modules linked in: usblp ipt_MASQUERADE iptable_nat nf_nat ipt_LOG xt_limit xt_tcpudp nf_conntrack_ipv4 xt_state nf_conntrack iptable_filter ip_tables x_tables pppoe pppox ppp_generic slhc dvb_pll cx22702 usbhid tuner hid cx88_dvb video_buf_dvb dvb_core cx8800 cx8802 cx88_alsa cx88xx ir_common i2c_algo_bit tveeprom i2c_core videodev compat_ioctl32 v4l2_common v4l1_compat video_buf ehci_hcd uhci_hcd btcx_risc usbcore
[ 7330.518076] CPU: 0
[ 7330.518077] EIP: 0060:[<802cb921>] Not tainted VLI
[ 7330.518079] EFLAGS: 00210246 (2.6.23-rc5-dirty #72)
[ 7330.669129] EIP is at tcp_rto_min+0xb/0x16
[ 7330.718088] eax: 00000032 ebx: d76b5ab0 ecx: 0000209d edx: 00000000
[ 7330.799277] esi: d76b5ab0 edi: 00000000 ebp: 803ecdd8 esp: 803ecdd8
[ 7330.880468] ds: 007b es: 007b fs: 0000 gs: 0033 ss: 0068
[ 7330.950225] Process firefox-bin (pid: 2625, ti=803ec000 task=d1bfcaa0 task.ti=d1bfd000)
[ 7331.043890] Stack: 803ecde8 802cb9e8 d76b5ab0 d76b5ab0 803ecdf4 802cc1e2 803bc3e0 803ece6c
[ 7331.144727] 802cd85f 03ccf75e 208b3367 291abfc5 6eaa5ef6 d6c52818 00000000 00000000
[ 7331.245564] 00000001 03ccf6f4 eddd6e3e 03ccf75e 00000001 00000001 803eceac 03ccf75e
[ 7331.346403] Call Trace:
[ 7331.377800] [<80102abd>] show_trace_log_lvl+0x1a/0x2f
[ 7331.439343] [<80102b6d>] show_stack_log_lvl+0x9b/0xa3
[ 7331.500885] [<80102d1f>] show_registers+0x1aa/0x278
[ 7331.560349] [<80102ed5>] die+0xe8/0x205
[ 7331.607336] [<802f0bea>] do_page_fault+0x469/0x53a
[ 7331.665761] [<802ef73a>] error_code+0x6a/0x70
[ 7331.718987] [<802cb9e8>] tcp_rtt_estimator+0xbc/0x102
[ 7331.780529] [<802cc1e2>] tcp_ack_saw_tstamp+0x17/0x47
[ 7331.842071] [<802cd85f>...
That's my impression as well. That's way too core/busy a codepath to have
a bug in. As I said earlier, almost anybody testing -rc5 is sure to hit
this within a few hours (probably less) -- sad, it greatly detracts from the
usefulness of -rc5 as a release candidate.
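
[Editor's sketch] The oops earlier in the thread locates the fault with the line `EIP is at tcp_rto_min+0xb/0x16`. As this notation is usually read, that means offset 0xb into a function that is 0x16 bytes long. A small helper for pulling those three fields out of such a string could look like the following; `struct oops_location` and `parse_oops_location` are hypothetical names introduced here for illustration, not part of any kernel tool.

```c
#include <stdio.h>

/* Hypothetical container for one "symbol+0xOFFSET/0xSIZE" location. */
struct oops_location {
    char symbol[64];
    unsigned long offset;   /* bytes from the start of the symbol */
    unsigned long size;     /* total size of the symbol in bytes */
};

/* Returns 0 on success, -1 if text does not look like sym+0xOFF/0xSIZE. */
static int parse_oops_location(const char *text, struct oops_location *out)
{
    /* %63[^+] reads the symbol name up to '+'; %lx accepts an optional 0x. */
    if (sscanf(text, "%63[^+]+%lx/%lx",
               out->symbol, &out->offset, &out->size) != 3)
        return -1;
    return 0;
}
```

Fed the string `tcp_rto_min+0xb/0x16`, the helper fills in the symbol name plus the two hexadecimal numbers, which can then be matched against a disassembly of the function to find the faulting instruction.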
-
On Mon, 3 Sep 2007 04:15:01 +0530 (IST)
For the record, I haven't been able to reproduce the lockup with this
patch.

Cheers
François
-
The above patch also fixed my problems using NFS-mounted home directories.
I already applied it yesterday without giving any feedback, so I just
thought I should comment.

In addition, I totally agree with Satyam's comment above: either
nobody is testing rc's these days, or people simply stopped reporting.

My concern is that we have to live with a git tree that has this
problem, because core developers are unavailable and nobody has the
authority to commit this fix. (That was already posted before.)

I am not voting for public commit rights and personally I don't want
them anyway. It's just my observation that nothing really happens
whenever a point release is only a few days away.

Maybe most people believe that they have to wait for the 2.6.24
cycle to start before they join the discussion again ...

Patrick
-
I've hit the problem, found this thread, found the proposed patch and
tried it. Presumably others did the same. It's not easy to report a
hang which leaves no message in the logs and happens at random,
infrequent times. It takes time to gather enough information to make a
useful report. And it takes time to make sure that the patch actually
fixed the problem, too.

(Satyam's estimation that everybody would hit the bug within an hour [...]
The fix appears to be in the tree already, so you needn't worry.
--
Jean Delvare
-
~10 people bothered to report this on LKML -- and there could easily have
been others who didn't, and simply switched to -rc4. And then many more
might've hit it, found this thread immediately (like you did), and hence
didn't report again ... so I still pretty much stand by what I'd said :-)
But in any case as you said it's fixed already so no bother.
-
Hi,
I had network-related lockups with rc5, but refrained from posting here
because I am using ndiswrapper... I was waiting to go back home and try
to reproduce them without it, by connecting the laptop to a wired network.
I will try the proposed patch.

Thanks,
Romano.
-
Well - when my box locked up I was also doing a 300MB scp
through my ipw2200 wireless interface. But I did another today
and nothing bad happened.

In another email someone mentioned /var/log/messages - no,
that was totally void of anything peculiar, despite the fact that
I rebooted via Alt-SysRq T / P / S / U / B (this last did reboot
the box). Upon restart, however, there was nothing in /var/log/messages.
I didn't think it was worth setting up netconsole - only one
occurrence AND with VMWare modules loaded...

I will apply the above fix and will post again in case the hang
issue pops up again. Thanks, ciao,

--alessandro
"I can't believe you if I can't hear you"
(Editors, 'Smokers Outside The Hospital Doors')
-
Looks like a full-blown panic to me. Didn't any of you guys manage to
catch anything in /var/log/messages? Else, please try reproducing this
with a serial console or netconsole at least. netconsole should be easy
to set up; instructions on how to use it are here (very long URL):
http://git.kernel.org/?p=linux/kernel/git/davem/net-2.6.24.git;a=blob_pl...
-
