Re: [PATCH] GSoC 2010 - Memory hotplug support for Xen guests - third fully working version

Previous thread: [PATCH] ARM: Fix gen_nand probe structures contents by Marek Vasut on Wednesday, August 11, 2010 - 6:14 pm. (3 messages)

Next thread: [PATCH] regulator: max8998 - fix memory allocation size for max8998->rdev by Axel Lin on Wednesday, August 11, 2010 - 6:32 pm. (2 messages)
From: Daniel Kiper
Date: Wednesday, August 11, 2010 - 6:22 pm

Hi,

Here is the third version of memory hotplug support
for Xen guests patch. This one cleanly applies to
git://git.kernel.org/pub/scm/linux/kernel/git/jeremy/xen.git
repository, xen/memory-hotplug head.

On Fri, Aug 06, 2010 at 04:03:18PM +0400, Vasiliy G Tolstov wrote:

Thx.

On Fri, Aug 06, 2010 at 12:34:08PM -0400, Konrad Rzeszutek Wilk wrote:

Thx. Fixed.


No. Only those changes are needed where



There is no simple way to do that. It requiers to do some
changes in drivers/base/memory.c code. I think it should
be done as kernel boot option (on by default to not break
things using this interface now). If it be useful for maintainers
of mm/memory_hotplug.c and drivers/base/memory.c code then
I could do that. Currently original arch/x86/Kconfig version

Good question. I looked throught the code and could
not find any simple explanation why mm/memory_hotplug.c
authors used __ref instead __meminit. Could you (mm/memory_hotplug.c



As I know no however as I saw anybody do not differentiate between
normal and hotplugged memory. I thought about that ealier however
stated that this soultion does not give us any real gain. That is why
I decided to use standard name for hotplugged memory.

If you have a questions please drop me a line.

Daniel

Signed-off-by: Daniel Kiper <dkiper@net-space.pl>
---
 arch/x86/Kconfig               |    2 +-
 drivers/xen/balloon.c          |   95 ++++++++-------------------------------
 include/linux/memory_hotplug.h |    3 +-
 mm/memory_hotplug.c            |   55 ++++++++++++++++-------
 4 files changed, 61 insertions(+), 94 deletions(-)

diff --git a/arch/x86/Kconfig b/arch/x86/Kconfig
index beb1aa7..9458685 100644
--- a/arch/x86/Kconfig
+++ b/arch/x86/Kconfig
@@ -1273,7 +1273,7 @@ config ARCH_SELECT_MEMORY_MODEL
 	depends on ARCH_SPARSEMEM_ENABLE
 
 config ARCH_MEMORY_PROBE
-	def_bool X86_64 && !XEN
+	def_bool X86_64
 	depends on MEMORY_HOTPLUG
 
 config ILLEGAL_POINTER_VALUE
diff --git a/drivers/xen/balloon.c ...
From: Jeremy Fitzhardinge
Date: Thursday, August 12, 2010 - 5:43 pm

First step is to post it to lkml for discussion, cc:ing the relevant 
maintainers. (I'm not really sure who that is at the moment.  It will 

I think adding a global flag which the Xen balloon driver can disable 
should be sufficient.  There's no need to make an separate user-settable 

Quite possibly a left-over from something else.  You could just try 
making it __meminit, then compile with, erm, the option which shows you 
section conflicts (it shows the number of conflicts at the end of the 

Yes, but I'm assuming the interaction between S3 and ACPI hotplug memory 
isn't something that concerns a Xen guest; our hotplug mechanism is 

Its cosmetic, but it would be useful to see what's going on.

I'll send more detailed comments on the whole patch in a separate mail.

     J
--

From: Daniel Kiper
Date: Monday, August 16, 2010 - 8:44 am

Hi,


I took all relevant addresses (sorry if I missed somebody) from MAINTAINERS


Small reminder: make CONFIG_DEBUG_SECTION_MISMATCH=y

I reviewed kernel source code once again. It is OK. Normaly it is
not allowed to reference code/data tagged as .init.* because
that sections are freed at the end of kernel boot sequence and
they do not exists any more in memory. However it is sometimes
required to use code/data marked .init.*. To allow that __ref
tag is used and then referenced objects are not removed from

Suspend/Hibernation code in Linux Kernel is platform independent
to some extent and it does not require ACPI. It means that
lock_system_sleep/unlock_system_sleep is required in that

If you wish I will do that, however then it should be changed
as well add_registered_memory() function syntax. It should
contain pointer to name published through /sys/firmware/memmap
interface. I am not sure it is good solution to change
add_registered_memory() function syntax which I think should be

No. It supports multiple allocations. This variables are
used mostly for communication between allocate_additional_memory

For full description of current algorithm

It is always PAGE_SIZE aligned and not below than


First memory is allocated in batches of min(nr_pages, ARRAY_SIZE(frame_list))

As I know (maybe I have missed something) currently Xen does not support
NUMA in guests and nid is always 0. However maybe it will be good to create



This updates /sys/devices/system/memory/memory*/state files

Because memory is allocated in relatively small

It was not available in Linux Kernel Ver. 2.6.32.*
on which based first versions of this patch.

Here is current algorithm:
  - allocate_resource() with size requested by user,
  - allocate memory in relatively small batches,
  - add_registered_memory(),
  - online_pages(),


I think that sysfs should stay intact because it contains some
useful information for admins. We should reconsider avaibilty
of ...
From: Jeremy Fitzhardinge
Date: Wednesday, August 25, 2010 - 2:33 pm

Unfortunately MAINTAINERS is often poorly maintained.  While you should
include those addresses, its also worth looking at the git history to

My question is more along the lines of whether there's an *inherent*
dependency/interaction between suspend/hibernate and hotplug memory, or
whether the interaction is a side-effect of the x86 implementation.



Using globals to communicate values between two functions is not
generally good practice.  Could the code be restructured to avoid it (by

Is the batch allocation just to avoid having a single great big piece,
or is there some other deeper reason?  If not, I don't see why that

That's OK as far as it goes, but I do tend to see memory hotplug as an
underlying implementation detail rather than something that should be
directly exposed to users (ie memory hotplug as the mechanism to allow

My understanding is that on systems with real physical hotplug memory,
the process is:

   1. you insert/enable a DIMM or whatever to make the memory
      electrically active
   2. the kernel notices this and generates a udev event
   3. a usermode script sees this and, according to whatever policy it
      wants to implement, choose to online the memory at some point

I'm concerned that if we partially implement this but leave "online" as
a timebomb then existing installs with hotplug scripts in place may poke
at it - thinking they're dealing with physical hotplug - and cause problems.

    J
--

From: KAMEZAWA Hiroyuki
Date: Wednesday, August 25, 2010 - 6:23 pm

On Wed, 25 Aug 2010 14:33:06 -0700

IIUC, IBM guys, using LPAR?, does memory hotplug on VM.

The operation is.
	1. tell the region of memory to be added to a userland daemon.
	2. The daemon write 0xXXXXXX > /sys/devices/system/memory/probe
	   (This notifies that memory is added physically.)
	   Here, memory is created.
	3. Then, online memory.

I think VM guys can use similar method rather than simulating phyiscal hotplug.
Then, you don't have to worry about udev etc...
No ?

Thanks,
-Kame

--

From: Jeremy Fitzhardinge
Date: Thursday, August 12, 2010 - 5:46 pm

Thanks.  I'll paste in the full diff and comment on that rather than 


So does this mean you only support adding a single hotplug region?  What 
happens if your initial increase wasn't enough and you want to add 
more?  Would you make this a list of hot-added memory or something?


So this just reserves enough resource to satisfy the current outstanding 
requirement?  That's OK if we can repeat it, but it looks like it will 






Is the entire reserved memory range guaranteed to be within one node?

I see that this function has multiple definitions depending on a number 

Are the pages available for allocation by the rest of the kernel from 
this point on?




In your earlier, patch I think you made the firmware_map_add_hotplug() 
be specific to add_memory, but now you have it in __add_memory.  Does it 

As before, this all looks reasonably good.  I think the next steps 
should be:

   1. identify how to incrementally allocate the memory from Xen, rather
      than doing it at hotplug time
   2. identify how to disable the sysfs online interface for Xen
      hotplugged memory

For 1., I think the code should be something like:

increase_address_space(unsigned long pages)
{
	- reserve resource for memory section
	- online section
	for each page in section {
		online page
		mark page structure allocated
		add page to ballooned_pages list
		balloon_stats.balloon_(low|high)++;
	}
}


The tricky part is making sure that the memory for the page structures 
has been populated so it can be used.  Aside from that, there should be 
no need to have another call to 
HYPERVISOR_memory_op(XENMEM_populate_physmap, ...) aside from the 
existing one.

Or to look at it another way, memory hotplug is the mechanism for 
increasing the amount of available physical address space, but 
ballooning is the way to increase the number of allocated pages.  They 
are orthogonal.


2 requires a deeper understanding of the existing hotplug code.  It 
needs to be ...
From: Jeremy Fitzhardinge
Date: Thursday, August 12, 2010 - 5:48 pm

Gah, what a mess.  Will repost later.

     J
--

Previous thread: [PATCH] ARM: Fix gen_nand probe structures contents by Marek Vasut on Wednesday, August 11, 2010 - 6:14 pm. (3 messages)

Next thread: [PATCH] regulator: max8998 - fix memory allocation size for max8998->rdev by Axel Lin on Wednesday, August 11, 2010 - 6:32 pm. (2 messages)