Hi,
This happens every time there is a power disconnection
(switching to battery). Complete dmesg attached. This
one in particular is from kernel 2.6.24.3.
Greetings,
Kind Regards,
Sanjeev
BUG: unable to handle kernel NULL pointer dereference at virtual
address 00000020
printing eip: c04c4716 *pde = 578c2067
Oops: 0000 [#1] SMP
Modules linked in: cbc(U) geode_aes(U) blkcipher(U) aes_i586(U)
aes_generic(U) dm_crypt(U) ipt_MASQUERADE(U) iptable_nat(U) nf_nat(U)
bridge(U) autofs4(U) nf_conntrack_ipv4(U) xt_state(U) nf_conntrack(U)
xt_tcpudp(U) ipt_REJECT(U) iptable_filter(U) ip_tables(U) x_tables(U)
cpufreq_ondemand(U) acpi_cpufreq(U) fuse(U) loop(U) dm_mirror(U)
dm_multipath(U) dm_mod(U) ipv6(U) snd_hda_intel(U) snd_seq_dummy(U)
snd_seq_oss(U) snd_seq_midi_event(U) snd_seq(U) snd_seq_device(U)
snd_pcm_oss(U) sr_mod(U) snd_mixer_oss(U) snd_pcm(U) 8139cp(U)
snd_timer(U) button(U) 8139too(U) mii(U) snd_page_alloc(U) cdrom(U)
video(U) output(U) snd_hwdep(U) ac(U) snd(U) pcspkr(U) i2c_piix4(U)
i2c_core(U) battery(U) joydev(U) soundcore(U) sg(U) pata_atiixp(U)
pata_acpi(U) sata_sil(U) ata_generic(U) libata(U) sd_mod(U)
scsi_mod(U) ext3(U) jbd(U) mbcache(U) uhci_hcd(U) ohci_hcd(U)
ehci_hcd(U)
Pid: 69, comm: kacpi_notify Not tainted (2.6.24.3 #5)
EIP: 0060:[<c04c4716>] EFLAGS: 00010246 CPU: 0
EIP is at sysfs_addrm_start+0x21/0x81
EAX: c04c47d7 EBX: 00000000 ECX: 00000000 EDX: f78b8000
ESI: f78b8eb8 EDI: f78b8ec8 EBP: 00000000 ESP: f78b8ea4
DS: 007b ES: 007b FS: 00d8 GS: 0000 SS: 0068
Process kacpi_notify (pid: 69, ti=f78b8000 task=f78d8000 task.ti=f78b8000)
Stack: f229ff54 f229ff54 f5b8d390 fffffff4 c04c4b45 00000000 00000000 00000000
00000000 f229ff54 00000000 00000000 f782601c c04c4bab f78b8ee0 c04fe65d
f229ff54 c04fe8a8 f722e67f ffffffff ffffffff 00000007 f722e678 f78261d8
Call Trace:
[<c04c4b45>] create_dir+0x33/0x6c
[<c04c4bab>] sysfs_create_dir+0x2d/0x40
[<c04fe65d>] kobject_get+0xf/0x13
[<c04fe8a8>] ...

Hi,
Please find the dmesg below:
BTW, I tried basically with kernels after 2.6.21 and I get the same result.
Kind Regards,
Sanjeev

Initializing cgroup subsys cpuset
Linux version 2.6.24.3 (root@draksha.cultuzz.in) (gcc version 4.3.0 20080222 (Red Hat 4.3.0-0.11) (GCC) ) #5 SMP Tue Apr 1 00:04:55 IST 2008
BIOS-provided physical RAM map:
BIOS-e820: 0000000000000000 - 000000000009dc00 (usable)
BIOS-e820: 000000000009dc00 - 00000000000a0000 (reserved)
BIOS-e820: 00000000000ce000 - 00000000000d0000 (reserved)
BIOS-e820: 00000000000dc000 - 00000000000e0000 (reserved)
BIOS-e820: 00000000000e4000 - 0000000000100000 (reserved)
BIOS-e820: 0000000000100000 - 0000000057e80000 (usable)
BIOS-e820: 0000000057e80000 - 0000000057e96000 (ACPI data)
BIOS-e820: 0000000057e96000 - 0000000057f00000 (ACPI NVS)
BIOS-e820: 0000000057f00000 - 0000000058000000 (reserved)
BIOS-e820: 00000000e0000000 - 00000000f0000000 (reserved)
BIOS-e820: 00000000fec00000 - 00000000fec10000 (reserved)
BIOS-e820: 00000000fee00000 - 00000000fee01000 (reserved)
BIOS-e820: 00000000fff80000 - 0000000100000000 (reserved)
510MB HIGHMEM available.
896MB LOWMEM available.
found SMP MP-table at 000f7130
Using x86 segment limits to approximate NX protection
Entering add_active_range(0, 0, 360064) 0 entries of 256 used
Zone PFN ranges:
DMA 0 -> 4096
Normal 4096 -> 229376
HighMem 229376 -> 360064
Movable zone start PFN for each node
early_node_map[1] active PFN ranges
0: 0 -> 360064
On node 0 totalpages: 360064
DMA zone: 32 pages used for memmap
DMA zone: 0 pages reserved
DMA zone: 4064 pages, LIFO batch:0
Normal zone: 1760 pages used for memmap
Normal zone: 223520 pages, LIFO batch:31
HighMem zone: 1021 pages used for memmap
HighMem zone: 129667 pages, LIFO batch:31
Movable zone: 0 pages used for memmap
DMI present.
Using APIC driver default
ACPI: RSDP 000F7070, 0014 (r0 TOSQCI)
ACPI: RSDT 57E8F4EF, 0038 (r1 TOSQCI TOSQCI00 6040000 ...
Looks like a cpuidle problem (or at least acpi). I seem to recall having seen other reports of this? --
Hello Andrew Morton, Greetings! Thank you for the update. Is there anything I can do from my side? I thought it was an ACPI (DSDT) problem. Following a tutorial, I tried to extract, fix, and recompile the DSDT and use it with the kernel, but I still have the same problem. Let me know if I should attach the decompiled code of the original DSDT, if that helps. Kind Regards, Sanjeev On Wed, Apr 2, 2008 at 12:44 PM, Andrew Morton --
Hi, this could be due to a general memory corruption problem through ACPICA. If you get different backtraces on reboots even if you only modified things that have nothing to do with the problem, it's probably that, and related to: http://bugzilla.kernel.org/show_bug.cgi?id=10339 You might want to try the latest kernel or the patch posted there. Then it might be something else... Thomas --
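One way to check whether the backtraces really differ between reboots is to reduce each oops to its list of function names and compare the lists. The following is only a minimal sketch: the helper names (trace_symbols, same_backtrace) are made up for illustration and are not part of any kernel tool, and the two sample traces are short excerpts from the oopses posted in this thread.

```python
import re

# Matches call-trace entries such as "[<c04c4b45>] create_dir+0x33/0x6c"
# and captures only the symbol name (hypothetical helper, not a kernel tool).
TRACE_ENTRY = re.compile(
    r"\[<[0-9a-fA-F]+>\]\s+([A-Za-z_][\w.]*)\+0x[0-9a-fA-F]+/0x[0-9a-fA-F]+"
)


def trace_symbols(oops_text):
    """Return the function names of all call-trace entries, in order."""
    return TRACE_ENTRY.findall(oops_text)


def same_backtrace(oops_a, oops_b):
    """True if both texts contain the same sequence of function names."""
    return trace_symbols(oops_a) == trace_symbols(oops_b)


# Excerpt from the 2.6.24.3 oops in this thread.
first = """
EIP is at sysfs_addrm_start+0x21/0x81
[<c04c4b45>] create_dir+0x33/0x6c
[<c04c4bab>] sysfs_create_dir+0x2d/0x40
[<c04fe65d>] kobject_get+0xf/0x13
"""

# Excerpt from the 2.6.25-rc8 warning in this thread.
second = """
[<c04ef103>] kref_get+0x17/0x1c
[<c04ee6a0>] kobject_get+0xf/0x13
[<c04ee72f>] kobject_add_internal+0x42/0x13b
"""

if __name__ == "__main__":
    print(trace_symbols(first))
    print("same backtrace:", same_backtrace(first, second))
```

Addresses and offsets are deliberately ignored, since they change between kernel builds even when the call path is the same. Traces that differ after unrelated changes would point towards memory corruption, while an identical symbol list on every reboot would point towards one specific code path.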
Thank you for the update. I have checked the bug and unfortunately it's not the same issue. Things work absolutely fine when I'm running on AC power. It even displays the exact battery (and charging) status to me. It messes up when AC power suddenly gets disconnected and the machine switches to battery mode (that is when I get the Oops). The system is still usable after switching to battery mode and I still get correct battery stats until it's completely discharged. However, most commands, like kill, poweroff and java, don't work after the Oops. BTW there is one similarity with the referenced bug. If I boot the computer without AC power, it gives the same Oops and stops during booting itself. I shall try the latest kernel once and shall update you. Kind Regards,
The bug is not related to battery, but to AML parsing and can therefore ...
That would be great.
If it works, please give the patch there a try, IMO this one should see 2.6.2[34].X stable kernels soon.
Thanks, --
Hello Thomas,
I got the latest kernel, 2.6.25-rc8, today. I observed that the referenced patch is already in this kernel. However, it didn't solve the problem in question: I get the same Oops on this kernel as well. Find the latest dmesg along with the Oops:
Regards,
Sanjeev

------------[ cut here ]------------
WARNING: at lib/kref.c:43 kref_get+0x17/0x1c()
Modules linked in: wlan_scan_sta ath_rate_sample ath_pci wlan ath_hal(P) sit tunnel4 ipv6 cbc aes_i586 aes_generic dm_crypt ipt_MASQUERADE iptable_nat nf_nat bridge autofs4 nf_conntrack_ipv4 xt_state nf_conntrack xt_tcpudp ipt_REJECT iptable_filter ip_tables x_tables cpufreq_ondemand acpi_cpufreq fuse loop dm_mirror dm_multipath dm_mod snd_hda_intel snd_seq_dummy snd_seq_oss snd_seq_midi_event snd_seq snd_seq_device snd_pcm_oss snd_mixer_oss snd_pcm snd_timer sr_mod video 8139cp snd_page_alloc 8139too snd_hwdep i2c_piix4 i2c_core pcspkr output snd battery soundcore ac mii joydev sg cdrom button pata_atiixp pata_acpi sata_sil ata_generic libata sd_mod scsi_mod ext3 jbd mbcache uhci_hcd ohci_hcd ehci_hcd [last unloaded: microcode]
Pid: 70, comm: kacpi_notify Tainted: P 2.6.25-rc8 #2
[<c0427dcc>] warn_on_slowpath+0x40/0x65
[<c043007b>] switch_uid+0x5a/0x70
[<c04147bc>] smp_call_function_single+0x27/0x47
[<c04f1610>] number+0x120/0x1e6
[<c04f1ed1>] vsnprintf+0x40a/0x447
[<c04ef103>] kref_get+0x17/0x1c
[<c04ee6a0>] kobject_get+0xf/0x13
[<c04ee72f>] kobject_add_internal+0x42/0x13b
[<c04ee8dc>] kobject_init_and_add+0x23/0x25
[<c0596b90>] cpuidle_add_state_sysfs+0x63/0xd7
[<c05144d4>] acpi_os_execute_deferred+0x0/0x25
[<c0596498>] cpuidle_enable_device+0x35/0xac
[<c0531c28>] acpi_processor_cst_has_changed+0x40/0x54
[<c052fac1>] acpi_processor_notify+0x83/0xde
[<c0519549>] acpi_ev_notify_dispatch+0x4c/0x57
[<c05144f1>] acpi_os_execute_deferred+0x1d/0x25
[<c0435441>] run_workqueue+0x74/0xef
[<c0435572>] worker_thread+0xb6/0xc2
[<c0437f8a>] autoremove_wake_function+0x0/0x2d
[<c04354bc>] worker_thread+0x0/0xc2
[<c0437d35>] ...
Ok.
It is probably best to document the backtraces/oopses with kernel versions at
bugzilla.kernel.org and add dmesg and acpidump output.
It seems your machine notifies the OS that the C-state table changed.
AFAIK this is rare and there might be a general bug in the cpuidle layer,
which I do not know well.
It would be best to add Venkatesh and Shaohua Li <shaohua.li@intel.com> to the CC of
the bug.
While the backtrace shows a lot, cpuidle IMO is missing a general debug
option like in the cpufreq layer.
I couldn't find a single printk in the whole cpuidle/{cpuidle,sysfs}.c
files, even on error paths. Also in the cpuidle specific parts of
drivers/acpi/processor_idle.c some debug printks may help for future bug
reports. It is very hard to guess what happened...
--
This looks like a cpuidle and kobject interaction. The latest oops looks different from the original one: the latest one is a warn_on in lib/kref.c:43. We (Shaohua or I) will take a deeper look and get back on this. Thanks, Venki
Hello Venki,
On Fri, Apr 4, 2008 at 6:01 AM, Pallipadi, Venkatesh
Thank you for the update. Let me know if I can be of any help!
Kind Regards,
Hello Thomas, Thank you for the update. I have just filed a bug at http://bugzilla.kernel.org/show_bug.cgi?id=10394 as you suggested. Kind Regards, Sanjeev
Hello Thomas, Andrew, The recent patch given by Venkatesh at http://bugzilla.kernel.org/show_bug.cgi?id=10394 has fixed the problem. Thank you all for the support extended regarding the same. Kind Regards, Sanjeev On Sat, Apr 5, 2008 at 6:45 PM, Sanjeev Aditya Naga
