Re: [patch 2/3] Introduce BOOTMEM_EXCLUSIVE

Previous thread: [patch 0/3] Protect crashkernel against BSS overlap by Bernhard Walle on Tuesday, October 16, 2007 - 12:28 pm. (1 message)

Next thread: [patch 1/3] Add BSS to resource tree by Bernhard Walle on Tuesday, October 16, 2007 - 12:28 pm. (1 message)
To: <linux-kernel@...>, <kexec@...>
Cc: <akpm@...>, <ak@...>
Date: Tuesday, October 16, 2007 - 12:28 pm

This flag changes the reserve_bootmem() function to accept a new flag
BOOTMEM_EXCLUSIVE. If that flag is set, the function returns with
-EBUSY if the memory already has been reserved in the past. This is to
avoid conflicts.

IMPORTANT: The patch is only proof of concept. This means that it's only for
x86 and breaks other architectures. If the patch is ok, I'll change all other
architectures, too.

Signed-off-by: Bernhard Walle <bwalle@suse.de>

---
arch/x86/kernel/mpparse_32.c | 4 ++--
arch/x86/kernel/setup_32.c | 12 ++++++------
arch/x86/kernel/setup_64.c | 2 +-
include/linux/bootmem.h | 13 ++++++++++++-
mm/bootmem.c | 15 ++++++++++-----
5 files changed, 31 insertions(+), 15 deletions(-)

--- a/arch/x86/kernel/mpparse_32.c
+++ b/arch/x86/kernel/mpparse_32.c
@@ -736,7 +736,7 @@ static int __init smp_scan_config (unsig
smp_found_config = 1;
printk(KERN_INFO "found SMP MP-table at %08lx\n",
virt_to_phys(mpf));
- reserve_bootmem(virt_to_phys(mpf), PAGE_SIZE);
+ reserve_bootmem(virt_to_phys(mpf), PAGE_SIZE, 0);
if (mpf->mpf_physptr) {
/*
* We cannot access to MPC table to compute
@@ -751,7 +751,7 @@ static int __init smp_scan_config (unsig
unsigned long end = max_low_pfn * PAGE_SIZE;
if (mpf->mpf_physptr + size > end)
size = end - mpf->mpf_physptr;
- reserve_bootmem(mpf->mpf_physptr, size);
+ reserve_bootmem(mpf->mpf_physptr, size, 0);
}

mpf_found = mpf;
--- a/arch/x86/kernel/setup_32.c
+++ b/arch/x86/kernel/setup_32.c
@@ -317,7 +317,7 @@ static void __init reserve_ebda_region(v
unsigned int addr;
addr = get_bios_ebda();
if (addr)
- reserve_bootmem(addr, PAGE_SIZE);
+ reserve_bootmem(addr, PAGE_SIZE, 0);
}

#ifndef CONFIG_NEED_MULTIPLE_NODES
@@ -439,13 +439,13 @@ void __init setup_bootmem_allocator(void
* bootmem allocator with an invalid RAM area.
*/
reserve_bootmem(__pa_symbol(_text), (PFN_PHYS(min_low_pfn) +
...

To: Bernhard Walle <bwalle@...>
Cc: <linux-kernel@...>, <kexec@...>, <akpm@...>, <ak@...>
Date: Wednesday, October 17, 2007 - 7:05 am

I think we should unreserve the chunks of memory we have reserved so
far (Memory reserved from sidx to i), in case of error.

Thanks
Vivek
-

To: Vivek Goyal <vgoyal@...>
Cc: <linux-kernel@...>, <kexec@...>
Date: Thursday, October 18, 2007 - 7:15 am

True. Next version is coming.

Thanks,
Bernhard
-

To: Vivek Goyal <vgoyal@...>
Cc: <linux-kernel@...>, <kexec@...>, <akpm@...>, <ak@...>
Date: Wednesday, October 17, 2007 - 7:36 am

Unfortunately, that's not possible without using a lock (or counters
instead of a bitmap) any more. If we just do

for (i--; i >= sidx; i--)
clear_bit(i, bdata->node_bootmem_map);

then another thread of execution could reserve the memory (without
BOOTMEM_EXCLUSIVE) in between -- and the code would free the memory
which is already reserved.

I think that could be modelled with a rwlock, not changing the default
case where BOOTMEM_EXCLUSIVE is not specified.

Thanks,
Bernhard
-

To: <linux-kernel@...>, <kexec@...>, <akpm@...>, <ak@...>
Date: Thursday, October 18, 2007 - 12:32 am

On Wed, Oct 17, 2007 at 01:36:51PM +0200, Bernhard Walle wrote:

SMP initialization takes place after bootmem allocator has retired. That
would mean only one thread will be using bootmem allocator. Hence I think
unreserving memory without any kind of locking should be safe.

Thanks
Vivek
-

To: Bernhard Walle <bwalle@...>
Cc: <linux-kernel@...>, <kexec@...>, <akpm@...>, <ak@...>
Date: Tuesday, October 16, 2007 - 2:08 pm

Could you give all of these 0's a name? I really hate seeing random
magic numbers in these things. 0 completely kills the ability of
someone to read the code and figure out what it is trying to do without
going and looking at reserve_bootmem().

Or, alternatively, do something like this:

-extern void reserve_bootmem(unsigned long addr, unsigned long size);
+/*
+ * If flags is 0, then the return value is always 0 (success). If
+ * flags contains BOOTMEM_EXCLUSIVE, then -EBUSY is returned if the
+ * memory already was reserved.
+ */
+extern int reserve_bootmem(unsigned long addr, unsigned long size, int flag);
+int reserve_bootmem(unsigned long addr, unsigned long size)
+{
+ /* the 0 is because we don't
+ return reserve_bootmem_exclusive(addr, size, 0);
+}

Where all of the existing callers stay the same. But, the ones wanting
exclusive access actually call the _exclusive() variant.

-- Dave

-

To: Dave Hansen <haveblue@...>
Cc: <linux-kernel@...>, <kexec@...>, <akpm@...>, <ak@...>
Date: Tuesday, October 16, 2007 - 2:44 pm

Hi,

Andi was against more bootmem functions. ;)

Thanks,
Bernhard
-

To: Bernhard Walle <bwalle@...>
Cc: <linux-kernel@...>, <kexec@...>, <akpm@...>, <ak@...>
Date: Tuesday, October 16, 2007 - 2:58 pm

Yeah, I can't really blame him.

-- Dave

-

Previous thread: [patch 0/3] Protect crashkernel against BSS overlap by Bernhard Walle on Tuesday, October 16, 2007 - 12:28 pm. (1 message)

Next thread: [patch 1/3] Add BSS to resource tree by Bernhard Walle on Tuesday, October 16, 2007 - 12:28 pm. (1 message)