Hum, kernel says:
http://img177.imageshack.us/my.php?image=overlappingus2.jpg

Overlapping early reservations b98000-eff266 RAMDISK to 200000-d09cf7 TEXT DATA BSS

It would appear that the initramfs is overlapping the kernel itself. Is the boot loader (LILO) doing something stupid?
Luca --
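The overlap reported in that message can be checked by hand. Below is a minimal sketch in Python, assuming the panic message prints inclusive end addresses (an assumption about the message format, not something stated in this thread); `parse_range` and `overlap_bytes` are hypothetical helper names used only for illustration.

```python
# Ranges copied from the panic message above.
# Assumption: the printed end address is inclusive, so the half-open
# end used internally is the printed end + 1.

def parse_range(text):
    """Turn 'b98000-eff266' into a half-open (start, end) pair of ints."""
    start, last = (int(part, 16) for part in text.split("-"))
    return start, last + 1

def overlap_bytes(a, b):
    """Number of bytes shared by two half-open ranges (0 if disjoint)."""
    return max(0, min(a[1], b[1]) - max(a[0], b[0]))

ramdisk = parse_range("b98000-eff266")
kernel = parse_range("200000-d09cf7")

shared = overlap_bytes(ramdisk, kernel)
print(f"ramdisk starts at {ramdisk[0]:#x}, kernel ends at {kernel[1]:#x}")
print(f"overlap: {shared} bytes ({shared / 2**20:.2f} MiB)")
```

The ramdisk starts below the end of the kernel's TEXT DATA BSS range, which is exactly the condition the kernel complains about.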
Is there anything that the kernel could do to confuse lilo? The issue started appearing with 2.6.27, and the outcome of the boot process varies between versions and seems sensitive to configuration changes (though a "bad" kernel consistently fails).

Orthogonal to my problem: the panic() in reserve_early is useless for debugging, since the output won't reach the screen or the serial console (even worse: the kernel takes an exception while trying to execute the panic). Is it acceptable to replace it with an early_printk + hlt?
Luca --
good question. Does your successful 2.6.26 bootup actually _depend_ on the initrd? Or does it perhaps have enough built-in drivers that it boots just fine?

in that case v2.6.26 might just have stomped on the initrd silently, corrupted it (during kernel decompress), and the initrd unpacker saw the corruption and ignored it. Userspace wouldn't care as the kernel had all the drivers it needed.

or perhaps something made your v2.6.27 bzImage larger so that the

very much so. I was wondering about that already. In any case it would make sense to turn that particular overlap situation into a warning message and disable initrd decompress - and try to boot with whatever is built into the kernel.
Ingo --
console=uart8250,io,0x3f8,115200n8 could help YH --
Nope, parse_early_param() is called in start_kernel(), my kernel dies

How does LILO decide where to put the initrd (I find LILO code... obscure)? I mean, it gets a compressed image: how does it know the size of the uncompressed kernel image? Is it the payload_length in the real mode header? (answer to self: no, it appears to be the compressed

I'm already using the latest version.
Luca --
On Mon, Sep 8, 2008 at 10:54 AM, Luca Tettamanti <kronos.it@gmail.com> wrote: can you post boot log with working kernel + "debug"? YH --
This is the map of the early reservations (will send the dmesg + debug later):

[ 0.000000] (6 early reservations) ==> bootmem [0000000000 - 00bbf90000]
[ 0.000000] #0 [0000000000 - 0000001000] BIOS data page ==> [0000000000 - 0000001000]
[ 0.000000] #1 [0000006000 - 0000008000] TRAMPOLINE ==> [0000006000 - 0000008000]
[ 0.000000] #2 [0000200000 - 0000d012b8] TEXT DATA BSS ==> [0000200000 - 0000d012b8]
[ 0.000000] #3 [00037dc000 - 00040fe2d9] RAMDISK ==> [00037dc000 - 00040fe2d9]
[ 0.000000] #4 [000009c800 - 0000100000] BIOS reserved ==> [000009c800 - 0000100000]
[ 0.000000] #5 [0000008000 - 000000b000] PGTABLE ==> [0000008000 - 000000b000]

As a side note: I have bigger, older (2.6.26) kernels that boot fine, and smaller 2.6.27 kernels that do not work, e.g. this one:

Overlapping early reservations b71000-effb43 RAMDISK to 200000-c84ecf TEXT DATA BSS
Luca --
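For comparison, the reservation map of the working boot can be parsed and checked pairwise. This is only a sketch: the regular expression and the helper names `parse_reservations` and `overlapping_pairs` are made up for this illustration, and the ranges are treated as half-open (end exclusive), which is an assumption suggested by TRAMPOLINE ending at 0x8000 exactly where PGTABLE begins.

```python
import re
from itertools import combinations

# Early reservation lines copied from the working boot log above
# (timestamps dropped).
LOG = """\
#0 [0000000000 - 0000001000] BIOS data page ==> [0000000000 - 0000001000]
#1 [0000006000 - 0000008000] TRAMPOLINE ==> [0000006000 - 0000008000]
#2 [0000200000 - 0000d012b8] TEXT DATA BSS ==> [0000200000 - 0000d012b8]
#3 [00037dc000 - 00040fe2d9] RAMDISK ==> [00037dc000 - 00040fe2d9]
#4 [000009c800 - 0000100000] BIOS reserved ==> [000009c800 - 0000100000]
#5 [0000008000 - 000000b000] PGTABLE ==> [0000008000 - 000000b000]
"""

# Captures start, end and the reservation name that precedes '==>'.
LINE = re.compile(r"#\d+ \[([0-9a-f]+) - ([0-9a-f]+)\]\s+(.+?)\s+==>")

def parse_reservations(log):
    """Return {name: (start, end)} with ends treated as exclusive."""
    return {m.group(3): (int(m.group(1), 16), int(m.group(2), 16))
            for m in LINE.finditer(log)}

def overlapping_pairs(reservations):
    """List the names of every pair of reservations that share a byte."""
    return [(a, b)
            for (a, ra), (b, rb) in combinations(reservations.items(), 2)
            if ra[0] < rb[1] and rb[0] < ra[1]]

reservations = parse_reservations(LOG)
conflicts = overlapping_pairs(reservations)
gap = reservations["RAMDISK"][0] - reservations["TEXT DATA BSS"][1]

print("conflicting reservations:", conflicts)
print(f"gap between kernel end and ramdisk start: {gap / 2**20:.1f} MiB")
```

In the working boot the ramdisk sits well above the end of the kernel, so no pair conflicts; in the failing boots the ramdisk start is below the kernel end.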
that could explain something: a big kernel uses more memory, and lilo puts the ramdisk...

need to figure out why lilo puts the ramdisk so low... need to know the e820 table layout...
YH --
BIOS-provided physical RAM map:
 BIOS-e820: 0000000000000000 - 000000000009c800 (usable)
 BIOS-e820: 000000000009c800 - 00000000000a0000 (reserved)
 BIOS-e820: 00000000000e4000 - 0000000000100000 (reserved)
 BIOS-e820: 0000000000100000 - 00000000bbf90000 (usable)
 BIOS-e820: 00000000bbf90000 - 00000000bbf9e000 (ACPI data)
 BIOS-e820: 00000000bbf9e000 - 00000000bbfe0000 (ACPI NVS)
 BIOS-e820: 00000000bbfe0000 - 00000000bc000000 (reserved)
 BIOS-e820: 00000000fee00000 - 00000000fee01000 (reserved)
 BIOS-e820: 00000000ffb00000 - 0000000100000000 (reserved)
last_pfn = 0xbbf90 max_arch_pfn = 0x3ffffffff

dmesg is attached, but I haven't rebooted yet.
Luca
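To see how much usable RAM lies above the kernel in that map, the e820 lines can be parsed as below. This is a rough sketch under stated assumptions: it takes the kernel end and ramdisk size from the first panic message in this thread (printed ends assumed inclusive), it ignores every other early reservation, and it says nothing about whatever addressing limits the boot loader itself may have. `usable_regions` and `room_above` are hypothetical names.

```python
import re

# e820 lines copied from the map above.
E820 = """\
BIOS-e820: 0000000000000000 - 000000000009c800 (usable)
BIOS-e820: 000000000009c800 - 00000000000a0000 (reserved)
BIOS-e820: 00000000000e4000 - 0000000000100000 (reserved)
BIOS-e820: 0000000000100000 - 00000000bbf90000 (usable)
BIOS-e820: 00000000bbf90000 - 00000000bbf9e000 (ACPI data)
BIOS-e820: 00000000bbf9e000 - 00000000bbfe0000 (ACPI NVS)
BIOS-e820: 00000000bbfe0000 - 00000000bc000000 (reserved)
BIOS-e820: 00000000fee00000 - 00000000fee01000 (reserved)
BIOS-e820: 00000000ffb00000 - 0000000100000000 (reserved)
"""

ENTRY = re.compile(r"BIOS-e820: ([0-9a-f]+) - ([0-9a-f]+) \((.+)\)")

def usable_regions(text):
    """Return (start, end) pairs for every entry marked usable."""
    return [(int(start, 16), int(end, 16))
            for start, end, kind in ENTRY.findall(text)
            if kind == "usable"]

def room_above(regions, address):
    """Usable bytes above `address` inside the region that contains it."""
    for start, end in regions:
        if start <= address < end:
            return end - address
    return 0

# Values from the first panic message; inclusive printed ends assumed.
KERNEL_END = 0xd09cf7 + 1
RAMDISK_SIZE = 0xeff266 + 1 - 0xb98000

regions = usable_regions(E820)
room = room_above(regions, KERNEL_END)

print(f"usable RAM above the kernel: {room / 2**20:.0f} MiB")
print(f"ramdisk size: {RAMDISK_SIZE / 2**20:.2f} MiB")
print("ramdisk would fit above the kernel:", room > RAMDISK_SIZE)
```

Going by the map alone, the large usable region above 1MB has far more free space past the kernel than the ramdisk needs, so a shortage of memory does not explain the low placement.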
Yes, that's correct, but it doesn't seem related to a particular configuration item (moon phase maybe). For example the kernel I'm using right now (-rc4-something) has the same config as a non-working kernel minus LOCKDEP, but git-current minus LOCKDEP does not work. On another kernel I got a working config just enabling DEBUG_INFO, in another case I disabled MTD. Luca --
do you mean tip/master? http://people.redhat.com/mingo/tip.git/readme.txt YH --
I mean git pulled from Linus' tree. Btw, I'm attaching dmesg with debug. Luca
The image is mostly modular (e.g. ahci drivers + LVM + fs are on the ramdisk), see:

The compressed image (bzImage) is roughly the same size as older kernels (2MB), the uncompressed vmlinux is slightly bigger, but not much (~100k)

452K ./mm/built-in.o
100K ./ipc/built-in.o
8.0K ./security/built-in.o
3.0M ./drivers/built-in.o
4.0K ./usr/built-in.o
216K ./block/built-in.o
96K ./init/built-in.o
292K ./lib/built-in.o
1.3M ./net/built-in.o
16K ./crypto/built-in.o
1.1M ./kernel/built-in.o
1.3M ./fs/built-in.o
4.0K ./sound/built-in.o
Luca --
hm, that doesn't seem to match the ranges that got printed:

| Kernel is loaded at the standard 2MB physical, and goes up to 13.6MB
| physical. That's a tad large at 11.6 MB but still valid.

so how come a ~2MB vmlinux takes 11.6 MB? Is the bss that large for some reason perhaps?
Ingo --
Sorry, the sentence above is not very clear: I meant that with 2.6.27 the uncompressed vmlinux is only slightly bigger than older kernels; as you stated the size is 11.6MB. Luca --
yeah. Kernel is loaded at the standard 2MB physical, and goes up to 13.6MB physical. That's a tad large at 11.6 MB but still valid.

ramdisk image goes from 11.6 MB to 14.9 MB - roughly standard size. That overlaps 2 MB into the kernel image so we have to panic.

LILO should have loaded the ramdisk somewhere else. (or should have aborted the boot if it cannot do that)

We could perhaps print a prominent warning, delay the boot for 5 seconds or so via mdelay(5000) and simply not load the ramdisk if this happens? The kernel is obviously still functional - and such a large vmlinuz likely has all the built-in drivers to boot up to user-space - the lack of the ramdisk does not necessarily hurt.
Ingo --
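The warn-and-continue idea can be modelled in a few lines. The following is a user-space sketch in Python of the proposed policy only, not kernel code: `Region` and `check_ramdisk` are hypothetical names, the 5 second delay is left out, and the "bad" ranges assume the panic message prints inclusive end addresses.

```python
from typing import NamedTuple, Optional, Tuple

class Region(NamedTuple):
    start: int  # first byte
    end: int    # one past the last byte
    name: str

def overlaps(a: Region, b: Region) -> bool:
    """True if the two half-open regions share at least one byte."""
    return a.start < b.end and b.start < a.end

def check_ramdisk(kernel: Region,
                  ramdisk: Region) -> Tuple[Optional[Region], Optional[str]]:
    """Return (ramdisk to use, warning text).

    On overlap the ramdisk is dropped and a warning is returned,
    instead of stopping the boot.
    """
    if not overlaps(kernel, ramdisk):
        return ramdisk, None
    warning = (f"WARNING: {ramdisk.name} {ramdisk.start:#x}-{ramdisk.end - 1:#x} "
               f"overlaps {kernel.name} {kernel.start:#x}-{kernel.end - 1:#x}; "
               f"ramdisk disabled")
    return None, warning

# Failing boot: ranges from the first panic message (inclusive ends + 1).
bad_kernel = Region(0x200000, 0xd09cf7 + 1, "TEXT DATA BSS")
bad_ramdisk = Region(0xb98000, 0xeff266 + 1, "RAMDISK")

# Working boot: ranges from the early reservation map posted earlier.
good_kernel = Region(0x200000, 0xd012b8, "TEXT DATA BSS")
good_ramdisk = Region(0x37dc000, 0x40fe2d9, "RAMDISK")

for kernel, ramdisk in ((bad_kernel, bad_ramdisk), (good_kernel, good_ramdisk)):
    chosen, warning = check_ramdisk(kernel, ramdisk)
    print(warning or f"ramdisk kept at {chosen.start:#x}")
```

Whether booting without the ramdisk is useful depends on the configuration: for a mostly modular kernel with its disk drivers on the ramdisk, the warning would be the main benefit.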
wonder if lilo is fixing the bzImage at 1M when it calculates the position of the ramdisk... and bases that on decompressing in the same position later, while vmlinux gets put at 2M...

it seems kexec is putting the initrd as high as possible, or one could specify the ramdisk position...

wonder if a new lilo could help.
YH --
