Re: 2.6.27-rc5-mm1: 3 WARN_ON dumps during boot (acpi + vmap_pte_range)

From: Krzysztof Helt
Date: Friday, September 5, 2008 - 11:45 pm

Hi,

There is a dmesg dump below from my Compaq AP550 workstation.
It has 3 WARN_ON() dumps: 1 from the ACPI layer and 2 from vmap_pte_range().
There is no such thing in 2.6.27-rc4, which I use daily, so I assume
it is something in the -mm tree.

It is a Pentium3 SMP machine.

Kind regards,
Krzysztof

Linux version 2.6.27-rc5-mm1 (root@xxx) (gcc version 3.4.6) #1 SMP Sat Sep 6 07:47:58 CEST 2008
PAT WC disabled due to known CPU erratum.
BIOS-provided physical RAM map:
 BIOS-e820: 0000000000000000 - 000000000009fc00 (usable)
 BIOS-e820: 000000000009fc00 - 00000000000a0000 (reserved)
 BIOS-e820: 00000000000e0000 - 0000000000100000 (reserved)
 BIOS-e820: 0000000000100000 - 000000001fff0000 (usable)
 BIOS-e820: 000000001fff0000 - 0000000020000000 (reserved)
last_pfn = 0x1fff0 max_arch_pfn = 0x100000
kernel direct mapping tables up to 1fff0000 @ 7000-d000
DMI 2.3 present.
ACPI: RSDP 000E0010, 0014 (r0 COMPAQ)
ACPI: RSDT 000E0080, 0040 (r1 COMPAQ CPQB154  20010410             0)
ACPI: FACP 000E00EC, 0074 (r1 COMPAQ CARMEL          1             0)
ACPI: DSDT 000E0230, 1B6C (r1 COMPAQ     DSDT        1 MSFT  100000D)
ACPI: FACS 000E0040, 0040
ACPI: APIC 000E0160, 0068 (r1 COMPAQ CARMEL          1             0)
ACPI: SSDT 000E1D9C, 005B (r1 COMPAQ   ZURICH        1 MSFT  100000D)
ACPI: SSDT 000E1FE8, 06A5 (r1 COMPAQ PNP_PRSS        1 MSFT  100000D)
ACPI: SSDT 000E3414, 005D (r1 COMPAQ     FHUB        1 MSFT  100000D)
ACPI: SSDT 000E268D, 0CBF (r1 COMPAQ  THERMAL        1 MSFT  100000D)
ACPI: SSDT 000E334C, 0024 (r1 COMPAQ       S1        1 MSFT  100000D)
511MB LOWMEM available.
  mapped low ram: 0 - 1fff0000
  low ram: 00000000 - 1fff0000
  bootmap 00002000 - 00006000
(8 early reservations) ==> bootmem [0000000000 - 001fff0000]
  #0 [0000000000 - 0000001000]   BIOS data page ==> [0000000000 - 0000001000]
  #1 [0000001000 - 0000002000]    EX TRAMPOLINE ==> [0000001000 - 0000002000]
  #2 [0000006000 - 0000007000]       TRAMPOLINE ==> [0000006000 - 0000007000]
  #3 [0000100000 - ...
From: Andrew Morton
Date: Friday, September 5, 2008 - 11:50 pm

yup thanks.  The acpi guys and Rusty are still scratching each others


That's coming out of the module loader and is a new one.  It's the same

I'm suspecting an overactive assertion in the new vmap code?
--

From: Krzysztof Helt
Date: Saturday, September 6, 2008 - 3:18 am

On Fri, 5 Sep 2008 23:50:14 -0700

I only wanted to report it. I cannot use 2.6.27-rc5-mm1 because
Xorg does not start (it hangs) with it. My card is AGP so can it be

The third one is interesting because it is "moving". On the first reboot
(the posted dmesg was from the second) this warning happened when the
snd_au8820 module was loaded, and there was no warning about ext3.

Regards,
Krzysztof

--

From: Nick Piggin
Date: Monday, September 8, 2008 - 2:05 am

It shouldn't be, because the old code has that same warning I think.

It also probably shouldn't be the caller, because nothing is using
the new interfaces yet.

I'm sure it must be something wrong with the vmap rewrite patch, but
I'm simply not having any luck reproducing it yet. Is 32-bit a common
theme? (I'm trying to test 64-bit with a greatly reduced vmalloc space,
but I don't have access to a 32-bit compiler just now - travelling).

I might have to send a test-and-report-back debug patch...
--

From: Nick Piggin
Date: Monday, September 8, 2008 - 2:37 am

OK, would it be possible to test the following patch on the failing
machine(s), and send me the complete dmesg trace afterwards, please?

The patch does a little bit of extra page table checking, and also
prints a trace of operations on the vmap-space.

Thanks,
Nick
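
[Editor's illustration] The debug patch itself is not reproduced in this archive. As a rough user-space sketch of the general idea (log every operation on an address space and check each new allocation against the ranges that are still live), something like the following could be used. The class and method names are hypothetical and are not taken from the kernel or from Nick's patch.

```python
class VmapTrace:
    """Toy shadow tracker for half-open address ranges [start, end).

    Hypothetical helper for illustration only; not kernel code.
    """

    def __init__(self):
        self.live = []  # ranges currently allocated, as (start, end)
        self.log = []   # trace of every operation, in order

    def alloc(self, start, end):
        """Record an allocation and return the live ranges it overlaps."""
        if start >= end:
            raise ValueError("empty or inverted range")
        clash = [r for r in self.live if start < r[1] and r[0] < end]
        self.log.append(("alloc", start, end, bool(clash)))
        self.live.append((start, end))
        return clash

    def free(self, start, end):
        """Record a free; raises ValueError if the range was not live."""
        self.log.append(("free", start, end, False))
        self.live.remove((start, end))
```

A non-empty return value from alloc() plays the role of the WARN_ON here: it means the allocator handed out an address range that is still in use, and the log shows which earlier operations led up to it.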

From: Krzysztof Helt
Date: Monday, September 8, 2008 - 10:52 am

On Mon, 8 Sep 2008 19:37:16 +1000

The dmesg with Nick's patch is below (now the third WARN_ON is
triggered in a different module).

Regards,
Krzysztof

Linux version 2.6.27-rc5-mm1 (root@xxx) (gcc version 3.4.6) #2 SMP Mon Sep 8 17:49:16 CEST 2008
PAT WC disabled due to known CPU erratum.
BIOS-provided physical RAM map:
 BIOS-e820: 0000000000000000 - 000000000009fc00 (usable)
 BIOS-e820: 000000000009fc00 - 00000000000a0000 (reserved)
 BIOS-e820: 00000000000e0000 - 0000000000100000 (reserved)
 BIOS-e820: 0000000000100000 - 000000001fff0000 (usable)
 BIOS-e820: 000000001fff0000 - 0000000020000000 (reserved)
last_pfn = 0x1fff0 max_arch_pfn = 0x100000
kernel direct mapping tables up to 1fff0000 @ 7000-d000
DMI 2.3 present.
ACPI: RSDP 000E0010, 0014 (r0 COMPAQ)
ACPI: RSDT 000E0080, 0040 (r1 COMPAQ CPQB154  20010410             0)
ACPI: FACP 000E00EC, 0074 (r1 COMPAQ CARMEL          1             0)
ACPI: DSDT 000E0230, 1B6C (r1 COMPAQ     DSDT        1 MSFT  100000D)
ACPI: FACS 000E0040, 0040
ACPI: APIC 000E0160, 0068 (r1 COMPAQ CARMEL          1             0)
ACPI: SSDT 000E1D9C, 005B (r1 COMPAQ   ZURICH        1 MSFT  100000D)
ACPI: SSDT 000E1FE8, 06A5 (r1 COMPAQ PNP_PRSS        1 MSFT  100000D)
ACPI: SSDT 000E3414, 005D (r1 COMPAQ     FHUB        1 MSFT  100000D)
ACPI: SSDT 000E268D, 0CBF (r1 COMPAQ  THERMAL        1 MSFT  100000D)
ACPI: SSDT 000E334C, 0024 (r1 COMPAQ       S1        1 MSFT  100000D)
511MB LOWMEM available.
  mapped low ram: 0 - 1fff0000
  low ram: 00000000 - 1fff0000
  bootmap 00002000 - 00006000
(8 early reservations) ==> bootmem [0000000000 - 001fff0000]
  #0 [0000000000 - 0000001000]   BIOS data page ==> [0000000000 - 0000001000]
  #1 [0000001000 - 0000002000]    EX TRAMPOLINE ==> [0000001000 - 0000002000]
  #2 [0000006000 - 0000007000]       TRAMPOLINE ==> [0000006000 - 0000007000]
  #3 [0000100000 - 00004dbb24]    TEXT DATA BSS ==> [0000100000 - 00004dbb24]
  #4 [00004dc000 - 00004df000]    INIT_PG_TABLE ==> [00004dc000 - 00004df000]
  #5 [000009f800 - ...
From: Nick Piggin
Date: Monday, September 8, 2008 - 8:04 pm

Thanks for that, it clearly shows the virtual address allocator
is allowing an overlapping allocation after a vm_unmap_aliases()
call. Unfortunately, my "random" test case happened not to
trigger that... I should have paid more attention to edge cases
rather than just random testing.

Anyway, I hope this fix solves the problem for you (it
fixes it here).
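
[Editor's illustration] The fix is not reproduced in this archive, and the sketch below is not the actual kernel bug. It only illustrates the testing point made above: an overlap check can look fine on typical inputs and still be wrong on a boundary case, so it helps to enumerate the edge cases explicitly. The function names and the deliberately incomplete check are assumptions made for this example.

```python
def overlaps(a_start, a_end, b_start, b_end):
    """True if half-open ranges [a_start, a_end) and [b_start, b_end) intersect."""
    return a_start < b_end and b_start < a_end


def overlaps_buggy(a_start, a_end, b_start, b_end):
    """Deliberately incomplete: only tests whether an endpoint of a lies inside b."""
    return b_start <= a_start < b_end or b_start < a_end <= b_end


# Each entry is (range a, range b, expected overlap result).
EDGE_CASES = [
    ((0x1000, 0x2000), (0x2000, 0x3000), False),  # adjacent, no overlap
    ((0x1000, 0x2000), (0x1000, 0x2000), True),   # identical
    ((0x1000, 0x4000), (0x2000, 0x3000), True),   # a encloses b
    ((0x2000, 0x3000), (0x1000, 0x4000), True),   # b encloses a
    ((0x1000, 0x2800), (0x2000, 0x3000), True),   # partial overlap
]


def failures(check):
    """Return the (a, b) pairs for which check disagrees with the expected result."""
    return [(a, b) for a, b, want in EDGE_CASES if check(*a, *b) != want]
```

With these cases, failures(overlaps) is empty, while failures(overlaps_buggy) reports the "a encloses b" case. Small random ranges scattered over a large address space rarely produce that configuration, which is why random testing alone can miss it.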
From: Krzysztof Helt
Date: Monday, September 8, 2008 - 10:05 pm

On Tue, 9 Sep 2008 13:04:47 +1000

Your patch fixes two WARN_ON dumps from my original dmesg (the
AGP-related one and the module-loading one).
The remaining one is the ACPI kobject duplication.

Tested-by: Krzysztof Helt <krzysztof.h1@wp.pl>

Thanks a lot Nick,
Krzysztof

--

From: Nick Piggin
Date: Tuesday, September 9, 2008 - 12:55 am

Great, thanks very much for reporting and testing.
--
