Re: [GIT PULL] x86 setup: correct booting on 486 (revised)

To: Linus Torvalds <torvalds@...>
Cc: Linux Kernel Mailing List <linux-kernel@...>, H. Peter Anvin <hpa@...>, Thomas Gleixner <tglx@...>, Ingo Molnar <mingo@...>, Mikael Petterson <mikpe@...>, Eric Biederman <ebiederm@...>, Jeremy Fitzhardinge <jeremy@...>
Date: Sunday, November 4, 2007 - 10:16 pm

Hi Linus; please pull:

git://git.kernel.org/pub/scm/linux/kernel/git/hpa/linux-2.6-x86setup.git for-linus

H. Peter Anvin (2):
x86 setup: add a near jump to serialize %cr0 on 386/486
x86 setup: set %ebx == %ebp == %edi == 0 on protected mode entry

arch/x86/boot/pmjump.S | 12 ++++++++----
1 files changed, 8 insertions(+), 4 deletions(-)

commit 142a92e61f9c405a114cb2bfaf3ce3f537a48a89
Author: H. Peter Anvin <hpa@zytor.com>
Date: Sun Nov 4 17:54:31 2007 -0800

x86 setup: set %ebx == %ebp == %edi == 0 on protected mode entry

In accordance with the newly formalized 32-bit boot protocol, set
%ebx == %ebp == %edi == 0 in order to support future extensions to the
protocol.

Signed-off-by: H. Peter Anvin <hpa@zytor.com>

diff --git a/arch/x86/boot/pmjump.S b/arch/x86/boot/pmjump.S
index 18732f7..0d24e96 100644
--- a/arch/x86/boot/pmjump.S
+++ b/arch/x86/boot/pmjump.S
@@ -28,13 +28,15 @@
* void protected_mode_jump(u32 entrypoint, u32 bootparams);
*/
protected_mode_jump:
- xorl %ebx, %ebx # Flag to indicate this is a boot
movl %edx, %esi # Pointer to boot_params table
movl %eax, 3f # Patch ljmpl instruction
jmp 1f # Short jump to flush instruction q.
1:

movw $__BOOT_DS, %cx
+ xorl %ebx, %ebx # Per the 32-bit boot protocol
+ xorl %ebp, %ebp # Per the 32-bit boot protocol
+ xorl %edi, %edi # Per the 32-bit boot protocol

movl %cr0, %edx
orb $1, %dl # Protected mode (PE) bit

commit ad676d0fdf2e59ccc28ee9f6f9593ff14a3d8a5a
Author: H. Peter Anvin <hpa@zytor.com>
Date: Sun Nov 4 17:50:12 2007 -0800

x86 setup: add a near jump to serialize %cr0 on 386/486

The 386 and 486 needs a jump immediately after setting %cr0 in order
to serialize the pipeline.

Signed-off-by: H. Peter Anvin <hpa@zytor.com>

diff --git a/arch/x86/boot/pmjump.S b/arch/x86/boot/pmjump.S
index 2e55923..18732f7 100644
--- a/arch/x86/boot/pmjump.S
+++ b/arch/x86/boot/pmjump...

To: Linus Torvalds <torvalds@...>
Cc: Linux Kernel Mailing List <linux-kernel@...>, H. Peter Anvin <hpa@...>, Thomas Gleixner <tglx@...>, Ingo Molnar <mingo@...>, Mikael Petterson <mikpe@...>, Eric Biederman <ebiederm@...>, Jeremy Fitzhardinge <jeremy@...>
Date: Sunday, November 4, 2007 - 11:58 pm

Just for the record, I realized this patch could be done slightly
more cleanly and cleaned it up accordingly.

git://git.kernel.org/pub/scm/linux/kernel/git/hpa/linux-2.6-x86setup.git for-linus

H. Peter Anvin (2):
x86 setup: add a near jump to serialize %cr0 on 386/486
x86 setup: set %ebx == %ebp == %edi == 0 on protected mode entry

arch/x86/boot/pmjump.S | 8 +++++---
1 files changed, 5 insertions(+), 3 deletions(-)

commit 9f259cc59ba45b8db401d60be9700e275676fb15
Author: H. Peter Anvin <hpa@zytor.com>
Date: Sun Nov 4 17:54:31 2007 -0800

x86 setup: set %ebx == %ebp == %edi == 0 on protected mode entry

In accordance with the newly formalized 32-bit boot protocol, set
%ebx == %ebp == %edi == 0 in order to support future extensions to the
protocol.

Signed-off-by: H. Peter Anvin <hpa@zytor.com>

diff --git a/arch/x86/boot/pmjump.S b/arch/x86/boot/pmjump.S
index 26baeab..fa6bed1 100644
--- a/arch/x86/boot/pmjump.S
+++ b/arch/x86/boot/pmjump.S
@@ -28,11 +28,13 @@
* void protected_mode_jump(u32 entrypoint, u32 bootparams);
*/
protected_mode_jump:
- xorl %ebx, %ebx # Flag to indicate this is a boot
movl %edx, %esi # Pointer to boot_params table
movl %eax, 2f # Patch ljmpl instruction

movw $__BOOT_DS, %cx
+ xorl %ebx, %ebx # Per the 32-bit boot protocol
+ xorl %ebp, %ebp # Per the 32-bit boot protocol
+ xorl %edi, %edi # Per the 32-bit boot protocol

movl %cr0, %edx
orb $1, %dl # Protected mode (PE) bit

commit 7ed192906a2144ebc8ca2925a85d27b9c5355668
Author: H. Peter Anvin <hpa@zytor.com>
Date: Sun Nov 4 17:50:12 2007 -0800

x86 setup: add a near jump to serialize %cr0 on 386/486

The 386 and 486 needs a jump immediately after setting %cr0 in order
to serialize the pipeline.

Signed-off-by: H. Peter Anvin <hpa@zytor.com>

diff --git a/arch/x86/boot/pmjump.S b/arch/x86/boot/pmjump.S
index 2e55923..26baeab 100644
--- a/arch/x86/boot/pmjump.S
+...

To: H. Peter Anvin <hpa@...>
Cc: Linux Kernel Mailing List <linux-kernel@...>, Thomas Gleixner <tglx@...>, Ingo Molnar <mingo@...>, Mikael Petterson <mikpe@...>, Eric Biederman <ebiederm@...>, Jeremy Fitzhardinge <jeremy@...>
Date: Monday, November 5, 2007 - 1:15 pm

Ok, I'm obviously happier, but I have to admit that the original code was
safer than the new code. It did both the short jump and the far jump
before reloading any segments.

So I suspect the new code _works_ fine, but it's simply not as
fundamentally safe as the old code was.

The old code did do some instructions in between the short jump and the
far jump, but they were all the kind of instructions that didn't care
about the PE bit: there was a _read_ of the segment descriptor value, but
that's mode-independent (it's only the writes that matter), and the other
instructions were bog-standard integer instructions.

So I would actually prefer some additional safety, with something like
the appended..

This is TOTALLY UNTESTED! I checked with objdump that the result looks
roughly ok, but I didn't really think through the segment base address in
that long jump thing. Do we have the difference between flat mode and the
16-bit bootup mode in some better way?

Hmm?

Linus

--
arch/x86/boot/pmjump.S | 25 +++++++++++++++++--------
1 files changed, 17 insertions(+), 8 deletions(-)

diff --git a/arch/x86/boot/pmjump.S b/arch/x86/boot/pmjump.S
index fa6bed1..587dc04 100644
--- a/arch/x86/boot/pmjump.S
+++ b/arch/x86/boot/pmjump.S
@@ -29,7 +29,11 @@
*/
protected_mode_jump:
movl %edx, %esi # Pointer to boot_params table
- movl %eax, 2f # Patch ljmpl instruction
+
+ xorl %ecx, %ecx # add data segment offset to
+ movw %ds, %cx # the "in_32_bit_mode" thing.
+ shll $4, %ecx
+ addl %ecx, 2f

movw $__BOOT_DS, %cx
xorl %ebx, %ebx # Per the 32-bit boot protocol
@@ -42,15 +46,20 @@ protected_mode_jump:
jmp 1f # Short jump to serialize on 386/486
1:

- movw %cx, %ds
- movw %cx, %es
- movw %cx, %fs
- movw %cx, %gs
- movw %cx, %ss
-
# Jump to the 32-bit entrypoint
.byte 0x66, 0xea # ljmpl opcode
-2: .long 0 # offset
+2: .long in_32_bit_mode # offset
.word __BOOT_CS # segment

.size protected_mode_jump, .-protected_m...
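
For reference, the fix-up in the patch above (zero-extend %ds, shift it left by four, add it to the stored offset) is the usual real-mode segment-to-linear conversion. A minimal C sketch of that arithmetic follows; the function names are hypothetical and not taken from the kernel source.

```c
#include <stdint.h>

/* Real-mode linear base of a segment: the segment value times 16. */
static uint32_t real_mode_base(uint16_t seg)
{
    return (uint32_t)seg << 4;
}

/*
 * Flat (base-0) address of a symbol whose link-time offset is relative
 * to the 16-bit data segment, mirroring "shll $4, %ecx; addl %ecx, 2f".
 */
static uint32_t flat_address(uint16_t ds, uint32_t offset)
{
    return real_mode_base(ds) + offset;
}
```

With an example segment of 0x9000 and an offset of 0x1234, flat_address() yields 0x91234, which is the value a base-0 code segment needs as the far-jump offset.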

To: Linus Torvalds <torvalds@...>
Cc: Linux Kernel Mailing List <linux-kernel@...>, Thomas Gleixner <tglx@...>, Ingo Molnar <mingo@...>, Mikael Petterson <mikpe@...>, Eric Biederman <ebiederm@...>, Jeremy Fitzhardinge <jeremy@...>
Date: Monday, November 5, 2007 - 1:56 pm

Well, we *could* do a 16-bit PM segment (and do two far jumps), but that
seems rather silly. We'd have to patch the GDT for the base in that
case, anyway.

This is more or less the same code I had for the first version of the
patch, modulo moving the short jump of course. I do like making the
32-bit code a separate function, but it really should be "movl %ecx,..."
in the 32-bit code.

I have to admit I agree with Eric that this is probably overkill, but
hey, there is nothing like a bit of overkill to make sure something is
really and truly dead.

Cooking up a tree now.

-hpa
-

To: H. Peter Anvin <hpa@...>
Cc: Linux Kernel Mailing List <linux-kernel@...>, Thomas Gleixner <tglx@...>, Ingo Molnar <mingo@...>, Mikael Petterson <mikpe@...>, Eric Biederman <ebiederm@...>, Jeremy Fitzhardinge <jeremy@...>
Date: Monday, November 5, 2007 - 2:12 pm

Yeah, there is no point in having two far jumps. One is enough.

The point being that since apparently the new boot standards say that the
32-bit code is entered with segments etc set to specific values, then we
shouldn't do the jump to that 32-bit standard with a far jump: we should
do it as a regular jump, because we'd want to set up the segments etc

At least my assembler does the right thing with just the plain "mov" for
segments, but yes, there may be old assemblers that add a useless data
size override. So "movl %ecx,%*s" is probably the right thing to do to
make sure they don't do anything stupid..

Btw, on that same kind of thread: I think we should move the clearing of
the registers into the 32-bit mode too, since that makes the instructions
shorter (no operand size override), and makes more sense anyway (then we
can also clean %edx/%ecx).

Final comment: shouldn't we set up %esp to be correct for the new %ss too?

Linus
-

To: Linus Torvalds <torvalds@...>
Cc: Linux Kernel Mailing List <linux-kernel@...>, Thomas Gleixner <tglx@...>, Ingo Molnar <mingo@...>, Mikael Petterson <mikpe@...>, Eric Biederman <ebiederm@...>, Jeremy Fitzhardinge <jeremy@...>
Date: Monday, November 5, 2007 - 2:32 pm

Well, the 32-bit code needs to set up its own stack, and only it knows
where it wants its stack; we don't guarantee that the stack is valid
when we enter the 32-bit code and we're entering with both INT and NMI
disabled (requiring a stack would probably break all existing users of
the 32-bit entrypoint.)

However, that being said, doing so is trivial, and it might help some
debugging hack; anything that makes debugging easier is a Good Thing[TM].

-hpa
-

To: H. Peter Anvin <hpa@...>
Cc: Linux Kernel Mailing List <linux-kernel@...>, Thomas Gleixner <tglx@...>, Ingo Molnar <mingo@...>, Mikael Petterson <mikpe@...>, Eric Biederman <ebiederm@...>, Jeremy Fitzhardinge <jeremy@...>
Date: Monday, November 5, 2007 - 2:36 pm

I agree. But it would be nice if some basic instructions still worked: as
is, you cannot even do things like reloading %eflags, because the only way

Yeah. Even if it was just re-using the boot-time stack area temporarily,
just to give code the choice to use a common set of instructions.

Linus
-

To: Linus Torvalds <torvalds@...>
Cc: H. Peter Anvin <hpa@...>, Linux Kernel Mailing List <linux-kernel@...>, Thomas Gleixner <tglx@...>, Ingo Molnar <mingo@...>, Mikael Petterson <mikpe@...>, Jeremy Fitzhardinge <jeremy@...>
Date: Monday, November 5, 2007 - 4:21 pm

If I had to do it from scratch today I would make the 32-bit entry
point require a stack, segments and use C calling conventions to pass
struct boot_params *.

Besides %esi I'm not really fond of requiring anything in the 32bit
entrypoint. At the same time I totally agree that it is always nice
to provide way more than you need.

Eric
-

To: Eric W. Biederman <ebiederm@...>
Cc: Linus Torvalds <torvalds@...>, Linux Kernel Mailing List <linux-kernel@...>, Thomas Gleixner <tglx@...>, Ingo Molnar <mingo@...>, Mikael Petterson <mikpe@...>, Jeremy Fitzhardinge <jeremy@...>
Date: Monday, November 5, 2007 - 4:31 pm

Nailing down the interface as hard as possible is a good idea, to avoid
tying your hands for the future.

-hpa
-

To: H. Peter Anvin <hpa@...>
Cc: Linus Torvalds <torvalds@...>, Linux Kernel Mailing List <linux-kernel@...>, Thomas Gleixner <tglx@...>, Ingo Molnar <mingo@...>, Mikael Petterson <mikpe@...>, Jeremy Fitzhardinge <jeremy@...>
Date: Monday, November 5, 2007 - 5:14 pm

I'm just saying be liberal in what you accept and conservative in what
you send.

Making the entire process well defined is useful so things don't break
unnecessarily, and the maintainers of the pieces of software that use
the interface know what they can reliably get away with and what is
just luck.

Currently using the 32-bit entry point reliably requires:
%cs to be set.
%esi to be set.
%ebx be set to 0.
%gdt to be set and have:
0x10 a 32bit 4G code segment with base of 0
0x18 a 32bit 4G data segment with base of 0
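
As a rough illustration of the two GDT entries listed above, the sketch below packs a flat 4G descriptor into the 8-byte x86 descriptor format and maps selectors 0x10 and 0x18 to table indices. The helper names and the access-byte values (0x9A for code, 0x92 for data) are assumptions chosen for illustration, not copied from any boot loader or from the kernel.

```c
#include <stdint.h>

/* Pack base/limit/access/flags into an 8-byte x86 segment descriptor. */
static uint64_t make_descriptor(uint32_t base, uint32_t limit,
                                uint8_t access, uint8_t flags)
{
    uint64_t d = 0;

    d |= (uint64_t)(limit & 0xFFFF);             /* limit bits 15:0   */
    d |= (uint64_t)(base & 0xFFFFFF) << 16;      /* base bits 23:0    */
    d |= (uint64_t)access << 40;                 /* type, S, DPL, P   */
    d |= (uint64_t)((limit >> 16) & 0xF) << 48;  /* limit bits 19:16  */
    d |= (uint64_t)(flags & 0xF) << 52;          /* AVL, L, D/B, G    */
    d |= (uint64_t)(base >> 24) << 56;           /* base bits 31:24   */
    return d;
}

/* A selector's GDT index is the selector with the low three bits dropped. */
static unsigned selector_index(uint16_t selector)
{
    return selector >> 3;
}

/* Base 0, limit 0xFFFFF in 4K pages (4G), 32-bit default operand size. */
static uint64_t flat_code_descriptor(void)
{
    return make_descriptor(0, 0xFFFFF, 0x9A, 0xC);
}

static uint64_t flat_data_descriptor(void)
{
    return make_descriptor(0, 0xFFFFF, 0x92, 0xC);
}
```

Selectors 0x10 and 0x18 correspond to GDT slots 2 and 3, so a loader following the list above would place the code descriptor at index 2 and the data descriptor at index 3.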

With the latest generation of the boot protocol if KEEP_SEGMENTS
is set then it is only required that the data segments %ds, %es, %fs,
%gs and %ss be initialized to a valid value.

I have no problem with code providing more than what is required
above, and in fact I think it is likely a good thing.

For future expansion of the protocol things will go easiest if
we don't add additional requirements to the list above, as that
is all that I think all current boot loaders provide.

Anyway this is getting off topic. So far the changes to pmjump.S
look to be going well.

Eric
-

To: Eric W. Biederman <ebiederm@...>
Cc: Linus Torvalds <torvalds@...>, Linux Kernel Mailing List <linux-kernel@...>, Thomas Gleixner <tglx@...>, Ingo Molnar <mingo@...>, Mikael Petterson <mikpe@...>, Jeremy Fitzhardinge <jeremy@...>
Date: Monday, November 5, 2007 - 5:28 pm

Actually, I suspect the current code will handle %ebx with any value,

Specifying now that unused GPRs should be zeroed will allow for changes
if and when we need it. It's an easy requirement to fulfill, so boot
loader authors can put it through the pipe now. Then, if we find

Thanks. I just pushed two more patches to the git tree; one to do the
paranoia thing, and one to initialize LDTR and TR; the latter is for the
benefit of Intel VT and is not required for correctness, but it should
be able to speed up booting slightly on VT-based hardware.

See:

http://git.kernel.org/?p=linux/kernel/git/hpa/linux-2.6-x86setup.git;a=l...

-hpa
-

To: H. Peter Anvin <hpa@...>
Cc: Linus Torvalds <torvalds@...>, Linux Kernel Mailing List <linux-kernel@...>, Thomas Gleixner <tglx@...>, Ingo Molnar <mingo@...>, Mikael Petterson <mikpe@...>, Jeremy Fitzhardinge <jeremy@...>
Date: Monday, November 5, 2007 - 5:58 pm

Correct. So a bootloader must set %ebx to zero to handle those older kernels.

Sure it is reasonable to ask for. The bootloader pipe is awfully long though.
But putting it in at the time we clean up the rules for the 32bit entry point
is the best chance we are going to get to be able to change things.

Eric

-

To: H. Peter Anvin <hpa@...>
Cc: Eric W. Biederman <ebiederm@...>, Linus Torvalds <torvalds@...>, Linux Kernel Mailing List <linux-kernel@...>, Thomas Gleixner <tglx@...>, Ingo Molnar <mingo@...>, Mikael Petterson <mikpe@...>
Date: Monday, November 5, 2007 - 4:51 pm

Erm, I guess I see what you mean, but it comes to the effect of tying
your hands now in a specific way, rather than having them tied in an
unknown way later on...

But I hadn't noticed the 32-bit boot protocol spec go in. Unfortunately
it isn't useful for booting a pv Xen guest; I just mailed my comments.
I hope we can iterate this to something more generally useful before
getting too wedded to the current protocol.

J
-

To: Jeremy Fitzhardinge <jeremy@...>
Cc: Eric W. Biederman <ebiederm@...>, Linus Torvalds <torvalds@...>, Linux Kernel Mailing List <linux-kernel@...>, Thomas Gleixner <tglx@...>, Ingo Molnar <mingo@...>, Mikael Petterson <mikpe@...>
Date: Monday, November 5, 2007 - 5:06 pm

I'm not so sure about that. Xen PV is rather fundamentally a different

This is addressed by the "don't reload segments" bit in LOADFLAGS.

-hpa
-

To: H. Peter Anvin <hpa@...>
Cc: Eric W. Biederman <ebiederm@...>, Linus Torvalds <torvalds@...>, Linux Kernel Mailing List <linux-kernel@...>, Thomas Gleixner <tglx@...>, Ingo Molnar <mingo@...>, Mikael Petterson <mikpe@...>
Date: Monday, November 5, 2007 - 8:59 pm

OK.

J
-

To: Jeremy Fitzhardinge <jeremy@...>
Cc: Eric W. Biederman <ebiederm@...>, Linus Torvalds <torvalds@...>, Linux Kernel Mailing List <linux-kernel@...>, Thomas Gleixner <tglx@...>, Ingo Molnar <mingo@...>, Mikael Petterson <mikpe@...>
Date: Monday, November 5, 2007 - 9:11 pm

Yes; specifically, boot_params.hdr.hardware_subarch == 0 (as opposed to
compile-time subarchitectures, like Voyager, which still boots the same
way as far as I know.)

It would definitely be good to document what other values in this field

Specifically, with this bit set the decompression code won't touch the
segment registers at all, and it's up to the caller to have all code and
data segments set up with suitable descriptors. The kernel will still
try to install its own GDT when the kernel proper starts; this becomes a
hardware_subarch issue.
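
A loader-side sketch of that contract, assuming the "don't reload segments" flag is KEEP_SEGMENTS at bit 6 of loadflags as given in the published boot protocol. The struct is a cut-down, hypothetical stand-in for the real setup header, and the function name is invented for illustration.

```c
#include <stdbool.h>
#include <stdint.h>

#define KEEP_SEGMENTS (1u << 6)  /* assumed bit position in loadflags */

/* Hypothetical, reduced view of the header fields discussed here. */
struct hdr_view {
    uint8_t  loadflags;
    uint32_t hardware_subarch;   /* 0 is the default environment */
};

/*
 * True when the early 32-bit code is expected to load the segment
 * registers itself; false when the caller must supply valid ones.
 */
static bool kernel_reloads_segments(const struct hdr_view *h)
{
    return (h->loadflags & KEEP_SEGMENTS) == 0;
}
```

A caller that sets the flag takes on the duty of having every code and data segment backed by a suitable descriptor before jumping to the 32-bit entry point.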

-hpa
-

To: H. Peter Anvin <hpa@...>
Cc: Eric W. Biederman <ebiederm@...>, Linus Torvalds <torvalds@...>, Linux Kernel Mailing List <linux-kernel@...>, Thomas Gleixner <tglx@...>, Ingo Molnar <mingo@...>, Mikael Petterson <mikpe@...>
Date: Monday, November 5, 2007 - 9:18 pm

Yes, though it would be nice to use this mechanism to deal with voyager
booting too, so that it can be normalized as a pvop backend rather than

Yes, the "setup proper kernel gdt" is part of the hwsubarch-specific
startup code.

Another thing it would be nice to add is an elf-note-like notion so that
the kernel can export arbitrary key/value data to the bootloader (ie,
the converse of the bootloader->kernel value list). Xen currently does
this via ELF notes, but any semantically equivalent mechanism would do.
It's probably simpler than trying to work out how to mush bzimage and
ELF together.

J
-

To: Jeremy Fitzhardinge <jeremy@...>
Cc: Eric W. Biederman <ebiederm@...>, Linus Torvalds <torvalds@...>, Linux Kernel Mailing List <linux-kernel@...>, Thomas Gleixner <tglx@...>, Ingo Molnar <mingo@...>, Mikael Petterson <mikpe@...>
Date: Monday, November 5, 2007 - 9:31 pm

I suspect all we need is an offset-pointer field pointing into the
kernel image. As far as the kernel build process is concerned, it
becomes a section in the boot/compressed link script. That offset then
needs to get exported to the setup.elf link stage and there adjusted to
become a file offset.

The ELF note format is sane enough, although it looks like it's not
self-terminating, so we'd either need an offset and a length field, or
adopt the convention that namesz = descsz = type = 0 terminates the
block (I prefer the latter, myself.) We also need the notes documented,
obviously.
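
To make the terminator convention concrete, here is a minimal C sketch of a reader that walks a block of ELF-style notes and stops at a header with namesz = descsz = type = 0. It assumes the usual note layout (three 32-bit words, then name and descriptor each padded to 4 bytes); the function and struct names are hypothetical.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

struct note_hdr {
    uint32_t namesz;
    uint32_t descsz;
    uint32_t type;
};

/* Round a field size up to the 4-byte alignment used between fields. */
static size_t align4(uint32_t n)
{
    return ((size_t)n + 3u) & ~(size_t)3u;
}

/*
 * Count the notes in buf, stopping at an all-zero header or when fewer
 * than one header's worth of bytes remain. Returns -1 if a note claims
 * more data than the buffer holds.
 */
static int count_notes(const uint8_t *buf, size_t len)
{
    size_t off = 0;
    int count = 0;

    while (len - off >= sizeof(struct note_hdr)) {
        struct note_hdr h;
        size_t body;

        memcpy(&h, buf + off, sizeof(h));
        if (h.namesz == 0 && h.descsz == 0 && h.type == 0)
            return count;            /* terminator note */

        body = align4(h.namesz) + align4(h.descsz);
        off += sizeof(h);
        if (body > len - off)
            return -1;               /* truncated or corrupt note */
        off += body;
        count++;
    }
    return count;
}
```

The reader still takes a buffer length as a safety bound, but with the terminator in place the header only needs to export an offset, not a separate length field.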

-hpa

-

To: H. Peter Anvin <hpa@...>
Cc: Eric W. Biederman <ebiederm@...>, Linus Torvalds <torvalds@...>, Linux Kernel Mailing List <linux-kernel@...>, Thomas Gleixner <tglx@...>, Ingo Molnar <mingo@...>, Mikael Petterson <mikpe@...>
Date: Tuesday, November 6, 2007 - 12:17 pm

Hm, I think offset+length would be better: it's how they're represented
in a normal ELF file, so you can just extract the length if you're
extracting the notes. Also, generating a terminating note with the
current linker-based notes machinery would be a bit of a pain.

J
-

To: Jeremy Fitzhardinge <jeremy@...>
Cc: Eric W. Biederman <ebiederm@...>, Linus Torvalds <torvalds@...>, Linux Kernel Mailing List <linux-kernel@...>, Thomas Gleixner <tglx@...>, Ingo Molnar <mingo@...>, Mikael Petterson <mikpe@...>
Date: Tuesday, November 6, 2007 - 12:27 pm

.notes : {
*(.note.*)
. = ALIGN(4);
LONG(0);
LONG(0);
LONG(0);
}

Am I missing something?

-

To: H. Peter Anvin <hpa@...>
Cc: Jeremy Fitzhardinge <jeremy@...>, Linus Torvalds <torvalds@...>, Linux Kernel Mailing List <linux-kernel@...>, Thomas Gleixner <tglx@...>, Ingo Molnar <mingo@...>, Mikael Petterson <mikpe@...>
Date: Tuesday, November 6, 2007 - 1:04 pm

I don't think adding a length is any harder.

The all zero note is reserved so using it this way should be ok.
Regardless this sounds like a sane thing to be looking at.

Eric
-

To: H. Peter Anvin <hpa@...>
Cc: Eric W. Biederman <ebiederm@...>, Linus Torvalds <torvalds@...>, Linux Kernel Mailing List <linux-kernel@...>, Thomas Gleixner <tglx@...>, Ingo Molnar <mingo@...>, Mikael Petterson <mikpe@...>
Date: Tuesday, November 6, 2007 - 12:55 pm

Oh, I suppose, but I never much liked putting data-definition into the
linker script.

J

-

To: Jeremy Fitzhardinge <jeremy@...>
Cc: Eric W. Biederman <ebiederm@...>, Linus Torvalds <torvalds@...>, Linux Kernel Mailing List <linux-kernel@...>, Thomas Gleixner <tglx@...>, Ingo Molnar <mingo@...>, Mikael Petterson <mikpe@...>
Date: Tuesday, November 6, 2007 - 1:00 pm

I think it should be sparsely used, but stuff like simple end markers is
pretty much what it's good for.

The main reason I want to avoid adding another header field is that the
header is a finite resource; one of the many poor decisions in its
original design was using a 2-byte jump at the top, so address 0x281 is
the end of the universe.
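
The 0x281 figure follows from the reach of a two-byte short jump. Assuming, as the boot protocol documentation states, that the jump sits at offset 0x200 of the setup sector, the furthest forward target is the end of the instruction plus the largest signed 8-bit displacement. A small C sketch of that arithmetic, with hypothetical names:

```c
#include <stdint.h>

#define JUMP_OFFSET   0x200u  /* assumed position of the jump in the header */
#define SHORT_JMP_LEN 2u      /* opcode 0xEB plus one displacement byte */

/* Target of a short jump: address after the instruction plus rel8. */
static uint32_t short_jump_target(uint32_t insn_offset, int8_t rel8)
{
    return insn_offset + SHORT_JMP_LEN + (uint32_t)(int32_t)rel8;
}

/* Furthest address the header's short jump can reach. */
static uint32_t max_header_end(void)
{
    return short_jump_target(JUMP_OFFSET, INT8_MAX);
}
```

Here max_header_end() evaluates to 0x200 + 2 + 127 = 0x281, so any header field placed beyond that point could not be skipped by the existing jump.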

-hpa

-

To: H. Peter Anvin <hpa@...>
Cc: Jeremy Fitzhardinge <jeremy@...>, Linus Torvalds <torvalds@...>, Linux Kernel Mailing List <linux-kernel@...>, Thomas Gleixner <tglx@...>, Ingo Molnar <mingo@...>, Mikael Petterson <mikpe@...>
Date: Tuesday, November 6, 2007 - 1:09 pm

That was fixed long ago by having a 4-byte reserved field in the middle, so that
we can do a two-byte jump and then a farther jump from there to the 16-bit
code. So as long as we actually use discipline and really reserve
the field for a further jump, there should be no need for 0x281 to be the end
of the universe.

Eric
-

To: Eric W. Biederman <ebiederm@...>
Cc: Jeremy Fitzhardinge <jeremy@...>, Linus Torvalds <torvalds@...>, Linux Kernel Mailing List <linux-kernel@...>, Thomas Gleixner <tglx@...>, Ingo Molnar <mingo@...>, Mikael Petterson <mikpe@...>
Date: Tuesday, November 6, 2007 - 1:57 pm

That's not the only complication. The thing that concerns me more is
boot loaders using the jump as a length indicator, and there is really
very little chance to test that out safely, except perhaps by breaking
it immediately (by adding a 16-byte jump at the end; that way we provide
a minimum of overlap for boot loader authors.)

That being said, I don't see any such field (bootsect_kludge could be
recycled, arguably, and pad2 is three bytes which is enough for a 16-bit
jump.)

At the moment, though, that would only push the maximum from 0x281 to
0x290, then we run into the next field in struct boot_params. Although
this field can also be relocated over time, it once again shows that
breaking this particular limit is nontrivial, and that we're better off
trying to avoid pushing it.

However, with a little discipline I think we can make 0x281 last us for
the usable lifetime of this format. In the 10 years since the 2.00
format was created, we have only added 36 bytes of header, and we have
57 bytes left (plus 5 bytes of pad and 6 bytes of recyclable field.)
When we get closer to full, if we haven't already created a mechanism
making field additions obsolete I think we would be better off creating
a pointer to a secondary header than trying to break the limitations
involved in the current header format.

-hpa
-

To: H. Peter Anvin <hpa@...>
Cc: Jeremy Fitzhardinge <jeremy@...>, Linus Torvalds <torvalds@...>, Linux Kernel Mailing List <linux-kernel@...>, Thomas Gleixner <tglx@...>, Ingo Molnar <mingo@...>, Mikael Petterson <mikpe@...>
Date: Tuesday, November 6, 2007 - 2:27 pm

The old setup.S had that 16-byte jump in there.
We actually goofed when we added the relocatable bzImage support and

I have a hard time believing in discipline when I see the amount of
not-invented-here and various oddball mistakes (caused by overlooking
things) that seem to go on when extending the format. We never
needed to change the way the command line was passed, and we should
have kept the longer jump where we had it.

If we are going to go through and add an additional pointer to a notes section
let's please put a jump in there so we can make the header longer as
we choose.

Pointers really, really, really suck for maintenance of binary formats.
Offsets against a known base are better, but better still is if you can
avoid them entirely. For what we are doing allocating a contiguous piece
of memory or file is not at all unreasonable.

Eric
-

To: Eric W. Biederman <ebiederm@...>
Cc: Jeremy Fitzhardinge <jeremy@...>, Linus Torvalds <torvalds@...>, Linux Kernel Mailing List <linux-kernel@...>, Thomas Gleixner <tglx@...>, Ingo Molnar <mingo@...>, Mikael Petterson <mikpe@...>
Date: Tuesday, November 6, 2007 - 2:41 pm

The longer jump was never documented, and so didn't exist. There was
definitely no way to rely on it.

The old command-line protocol had some really ugly interactions with the
absolutely insane hoisting code from the pre-2.02 days. I didn't have
enough guts back then to scream and just rip it out, mostly because it
took me a long time to figure out what the heck it really did (as
opposed to what it claimed it did.) That being said, we probably could
have gotten away with leaving the protocol as-is while ripping out the
guts (as I eventually did in the rewrite), even if the old protocol only

The problem is that that will only buy us 15 bytes, and eat up 3 (in
practice, 4) of them...

It might be worth doing anyway, as it'd only break the 32-bit entrypoint
users to reorganize struct boot_params.

-hpa

-
