Re: [RFC 0/3] bootmem rewrite

Previous thread: [PATCH 3/3 2.6.27] cxgb3 - Add LRO support by Divy Le Ray on Tuesday, May 20, 2008 - 6:57 pm. (1 message)

Next thread: [RFC 1/3] mm: Move bootmem descriptors to a single place by Johannes Weiner on Tuesday, May 20, 2008 - 6:37 pm. (1 message)
From: Johannes Weiner
Date: Tuesday, May 20, 2008 - 6:37 pm

Hi,

This is a complete overhaul of the bootmem allocator while preserving
its original functionality, excluding bugs.

free_bootmem and reserve_bootmem become a bit stricter than they are
right now: callsites have to make sure that the PFN range is
contiguous, although it may cross node boundaries.
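
To illustrate that contract, here is a minimal userspace sketch, not the
code from the patches.  The names node_range, pfn_part and
split_range_by_node are hypothetical, and the node table is assumed to
be sorted by PFN.  It splits one contiguous PFN range into per-node
parts and rejects a range that hits a hole between nodes.

```c
#include <stddef.h>

/* Hypothetical, simplified node descriptor: covers PFNs [min_pfn, end_pfn). */
struct node_range {
	unsigned long min_pfn;
	unsigned long end_pfn;
};

/* One per-node slice of the caller's range: PFNs [start, end) on 'node'. */
struct pfn_part {
	int node;
	unsigned long start;
	unsigned long end;
};

/*
 * Split the contiguous PFN range [start, end) into per-node parts.
 * 'nodes' is assumed to be sorted by min_pfn (an assumption of this
 * sketch).  Returns the number of parts written to 'out', or -1 if the
 * range touches a PFN that no node covers or 'out' is too small.
 */
int split_range_by_node(const struct node_range *nodes, int nr_nodes,
			unsigned long start, unsigned long end,
			struct pfn_part *out, int max_out)
{
	unsigned long pos = start;
	int i, n = 0;

	for (i = 0; i < nr_nodes && pos < end; i++) {
		unsigned long stop;

		if (pos < nodes[i].min_pfn)
			return -1;	/* hole in front of this node */
		if (pos >= nodes[i].end_pfn)
			continue;	/* range begins on a later node */
		if (n == max_out)
			return -1;	/* output array too small */

		/* Clamp the slice to the end of this node. */
		stop = end < nodes[i].end_pfn ? end : nodes[i].end_pfn;
		out[n].node = i;
		out[n].start = pos;
		out[n].end = stop;
		n++;
		pos = stop;
	}
	/* Anything left over lies beyond the last node. */
	return pos == end ? n : -1;
}
```

With two adjacent nodes covering PFNs [0, 100) and [100, 200), the range
[50, 150) comes back as two parts, one per node.  If the second node
started at PFN 120 instead, the same call would fail.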

alloc_bootmem is more likely to satisfy the allocation goal, as the
routines first try to allocate on the node holding the goal before
falling back.  The original behaviour satisfies the goal only if it
is on the first node.
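
The goal-first policy can be sketched in userspace roughly as follows.
This is a simplified model, not the allocator from the patches: each
node is a plain bump allocator instead of a bitmap, and node_model,
alloc_on_node and alloc_goal_first are hypothetical names.

```c
#include <stddef.h>

#define NO_PFN (~0UL)	/* sentinel of this sketch: allocation failed */

/* Hypothetical node model: PFNs [min_pfn, end_pfn), bump pointer next_free. */
struct node_model {
	unsigned long min_pfn;
	unsigned long end_pfn;
	unsigned long next_free;
};

/* Allocate 'pages' PFNs on one node, at 'goal' if it is free and fits. */
unsigned long alloc_on_node(struct node_model *n, unsigned long pages,
			    unsigned long goal)
{
	unsigned long start = n->next_free;

	/* Honour the goal only if it lies in the still-free part of the node. */
	if (goal > start && goal < n->end_pfn && goal + pages <= n->end_pfn)
		start = goal;
	if (start + pages > n->end_pfn)
		return NO_PFN;
	n->next_free = start + pages;
	return start;
}

/* Try the node holding 'goal' first, then fall back to all other nodes. */
unsigned long alloc_goal_first(struct node_model *nodes, int nr_nodes,
			       unsigned long pages, unsigned long goal)
{
	unsigned long pfn;
	int i, goal_node = -1;

	for (i = 0; i < nr_nodes; i++) {
		if (goal >= nodes[i].min_pfn && goal < nodes[i].end_pfn) {
			goal_node = i;
			break;
		}
	}
	if (goal_node >= 0) {
		pfn = alloc_on_node(&nodes[goal_node], pages, goal);
		if (pfn != NO_PFN)
			return pfn;
	}
	for (i = 0; i < nr_nodes; i++) {
		if (i == goal_node)
			continue;	/* already tried above */
		pfn = alloc_on_node(&nodes[i], pages, goal);
		if (pfn != NO_PFN)
			return pfn;
	}
	return NO_PFN;
}
```

With nodes covering PFNs [0, 100) and [100, 200), a request with goal
150 is served from the second node even though it is not the first one.
Only when that node cannot hold the request does the search fall back
to the other nodes.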

All in all, I think the code has become simpler and cleaner.  All
public interfaces have been documented, too.

The first patch moves the bootmem node descriptor definitions into
bootmem.c where they belong.

The second patch is the new allocator itself.

The third patch converts all users of ->node_boot_start to
->node_min_pfn as this is what they really use.  It then removes the
unused ->node_boot_start.
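
As a rough userspace illustration of why the conversion is mechanical:
assuming ->node_boot_start holds a physical address and 4 KiB pages, a
user that only wants the first PFN had to shift the address down every
time, while ->node_min_pfn stores the PFN directly.  The struct layouts
and helper names below are hypothetical, and the macros are local
stand-ins for this sketch.

```c
#include <stddef.h>

#define PAGE_SHIFT	12	/* assumption of this sketch: 4 KiB pages */
#define PFN_DOWN(x)	((x) >> PAGE_SHIFT)	/* physical address -> PFN */
#define PFN_PHYS(x)	((x) << PAGE_SHIFT)	/* PFN -> physical address */

/* Hypothetical cut-down descriptors, before and after the conversion. */
struct old_bdata {
	unsigned long node_boot_start;	/* physical address of node start */
};

struct new_bdata {
	unsigned long node_min_pfn;	/* first PFN of the node */
};

/* Old style: every user converts the address to a PFN itself. */
unsigned long old_start_pfn(const struct old_bdata *b)
{
	return PFN_DOWN(b->node_boot_start);
}

/* New style: the PFN is stored directly. */
unsigned long new_start_pfn(const struct new_bdata *b)
{
	return b->node_min_pfn;
}

/* The physical address is still available when a user really needs it. */
unsigned long new_start_phys(const struct new_bdata *b)
{
	return PFN_PHYS(b->node_min_pfn);
}
```

For a node starting at physical address 0x100000, both variants yield
PFN 0x100.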

Compile- and runtime-tested on X86_32 only, therefore RFC only.

 arch/alpha/mm/numa.c             |    8 +-
 arch/arm/mm/discontig.c          |   34 +-
 arch/arm/plat-omap/fb.c          |    4 +-
 arch/avr32/mm/init.c             |    3 +-
 arch/ia64/mm/discontig.c         |   30 +-
 arch/m32r/mm/discontig.c         |    4 +-
 arch/m32r/mm/init.c              |    4 +-
 arch/m68k/mm/init.c              |    4 +-
 arch/mips/sgi-ip27/ip27-memory.c |    3 +-
 arch/mn10300/mm/init.c           |    6 +-
 arch/parisc/mm/init.c            |    3 +-
 arch/powerpc/mm/numa.c           |    3 +-
 arch/sh/mm/init.c                |    2 +-
 arch/sh/mm/numa.c                |    5 +-
 arch/sparc64/mm/init.c           |    3 +-
 arch/x86/mm/discontig_32.c       |    3 +-
 arch/x86/mm/numa_64.c            |    6 +-
 include/linux/bootmem.h          |  115 ++---
 mm/bootmem.c                     |  914 +++++++++++++++++++-------------------
 mm/page_alloc.c                  |    4 +-
 20 files changed, 560 insertions(+), 598 ...
From: Andrew Morton
Date: Wednesday, May 21, 2008 - 4:57 pm

On Wed, 21 May 2008 03:37:35 +0200


Oh gee.

bootmem is an area where large numbers of people have done hit-and-run
jobs over a lot of years.  Nobody owns it and I'm sure that you are now
the world's expert.  We just need to push ahead with this, I guess.

I expect there will be problems - so many architectures which do such
different things, and all the configuration options churning things
around.

So how to move ahead with this?

- I think I'd prefer not to drop

  mm-fix-free_all_bootmem_core-alignment-check.patch
  mm-normalize-internal-argument-passing-of-bootmem-data.patch
  mm-unexport-__alloc_bootmem_core.patch

  because those are small, simple things which are on track for
  2.6.27 whereas a massive rewrite may take longer to get merged, and
  may never get there at all, in which case we lost those little
  fixes.

- It would suit my purposes to have these patches right at the tail
  of the -mm patch queue so that I can drop them easily if problems
  occur, and so that others can revert them easily when diagnosing
  problems.

- It would be nice to get some review attention from architecture
  guys, but I can understand them finding other things to do, when
  bootmem is presumably good-enough-for-now.

- Is x86_32 the only test platform which you have available?  Awkward.

Anyway, if you can redo these patches against most-recent-mm or,
better, against http://userweb.kernel.org/~akpm/mmotm then it would
make things easier for me to handle.  I can then at least test it all
on my seven-odd test boxes.  Please feel free to ping me if you want a
single rolled-up patch - that's always trivial and I can do it in three
minutes.

Finally, if you haven't done so, I'd encourage you to stuff as many
handy debugging printks into this code as you possibly can.  Just fill
'er up with them.  So that when people start running it and it goes
boom, they can send you their debug output _without_ having to go
through another handful of email-email-patch-rebuild-retest ...
From: Johannes Weiner
Date: Wednesday, May 21, 2008 - 5:33 pm

Hi Andrew,


Anyone who has seen this code realizes that this kind of advertisement is

Okay, seeing it again it looks a bit brutal.  But it's not.  The basic
principles are the same, it's not that I completely changed the
implementation.  Okay, perhaps I did.


I expect problems too.  I just can not go any further with it on the





Okay, I will make it gossip and send you a -mmotm-based version of it.

	Hannes
--

From: Andrew Morton
Date: Wednesday, May 21, 2008 - 6:10 pm

Good luck :)
--
