Re: [PATCH] mm: show node to memory section relationship with symlinks in sysfs

From: Gary Hade
Date: Monday, September 29, 2008 - 1:05 pm

Show node to memory section relationship with symlinks in sysfs

Add /sys/devices/system/node/nodeX/memoryY symlinks for all
the memory sections located on nodeX.  For example:
/sys/devices/system/node/node1/memory135 -> ../../memory/memory135
indicates that memory section 135 resides on node1.
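
As an illustration of how a userspace tool could consume these links, here
is a minimal C sketch. It is not part of the patch. The helper names
memory_section_from_name() and list_node_sections() are hypothetical, and the
sketch assumes only the nodeX/memoryY naming described above.

```c
#include <assert.h>
#include <dirent.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: parse "memory135" -> 135.
 * Returns -1 for any entry that is not "memory" followed by digits. */
static long memory_section_from_name(const char *name)
{
	const char *prefix = "memory";
	size_t plen = strlen(prefix);
	char *end;
	long nr;

	assert(name != NULL);
	if (strncmp(name, prefix, plen) != 0)
		return -1;
	if (name[plen] < '0' || name[plen] > '9')
		return -1;
	nr = strtol(name + plen, &end, 10);
	if (*end != '\0')
		return -1;
	return nr;
}

/* Hypothetical helper: print every memory section linked from one node
 * directory, e.g. "/sys/devices/system/node/node1".
 * Returns the number of sections found, or -1 if the directory
 * cannot be opened. */
static int list_node_sections(const char *node_dir)
{
	DIR *dir = opendir(node_dir);
	struct dirent *ent;
	int count = 0;

	if (!dir)
		return -1;
	while ((ent = readdir(dir)) != NULL) {
		long nr = memory_section_from_name(ent->d_name);

		if (nr >= 0) {
			printf("%s: memory section %ld\n", node_dir, nr);
			count++;
		}
	}
	closedir(dir);
	return count;
}
```

On a kernel with this patch applied, calling
list_node_sections("/sys/devices/system/node/node1") would be expected to
print one line per memoryY symlink under node1. Other entries in the node
directory (cpu links, meminfo and so on) are skipped by the name check.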

Successfully tested with 2.6.27-rc7 source on 2-node x86_64,
2-node ppc64, and 2-node ia64 systems.

Also revises documentation to cover this change as well as updating
Documentation/ABI/testing/sysfs-devices-memory to include descriptions
of memory hotremove files 'phys_device', 'phys_index', and 'state'
that were previously not described there.

Supersedes the "mm: show memory section to node relationship in sysfs"
patch posted on 05 Sept 2008 which created node ID containing 'node'
files in /sys/devices/system/memory/memoryX instead of symlinks.
Changed from files to symlinks due to feedback that symlinks were
more consistent with the sysfs way.

Signed-off-by: Gary Hade <garyhade@us.ibm.com>
Signed-off-by: Badari Pulavarty <pbadari@us.ibm.com>

---
 Documentation/ABI/testing/sysfs-devices-memory |   51 ++++++++++++
 Documentation/memory-hotplug.txt               |   16 +++
 drivers/base/memory.c                          |   10 ++
 drivers/base/node.c                            |   61 +++++++++++++++
 include/linux/memory.h                         |    4 
 include/linux/node.h                           |   11 ++
 6 files changed, 146 insertions(+), 7 deletions(-)

Index: linux-2.6.27-rc5/Documentation/ABI/testing/sysfs-devices-memory
===================================================================
--- linux-2.6.27-rc5.orig/Documentation/ABI/testing/sysfs-devices-memory	2008-09-24 13:19:23.000000000 -0700
+++ linux-2.6.27-rc5/Documentation/ABI/testing/sysfs-devices-memory	2008-09-25 13:36:41.000000000 -0700
@@ -6,7 +6,6 @@
 		internal state of the kernel memory blocks. Files could be
 		added or removed dynamically to represent hot-add/remove
 		operations.
-
 ...
From: Yasunori Goto
Date: Tuesday, September 30, 2008 - 1:06 am

I think this patch would be convenient even when memory hotplug is disabled.


If the first page of the section is not valid, then this section_nr_to_nid()
does not return the correct value.

I tested this patch. On my box, the start_pfn of node 1 is 0x1200400, but
section_nr_to_pfn(mem_blk->phys_index) returns 0x1200000. As a result,
the section is linked to node 0.
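
The rounding behind these numbers can be sketched in userspace C. This is
only an illustration: the two helpers mirror the kernel's
pfn_to_section_nr()/section_nr_to_pfn() shift arithmetic, and the value of
PFN_SECTION_SHIFT (16, i.e. 0x10000 pages per section) is an assumption
picked so that the result matches the numbers reported above. The real value
depends on the architecture and configuration.

```c
#include <assert.h>
#include <stdio.h>

/* ASSUMPTION: 2^16 pages per section, chosen to match the report above. */
#define PFN_SECTION_SHIFT 16

/* Userspace stand-ins for the kernel macros of the same names. */
static unsigned long pfn_to_section_nr(unsigned long pfn)
{
	return pfn >> PFN_SECTION_SHIFT;
}

static unsigned long section_nr_to_pfn(unsigned long section_nr)
{
	return section_nr << PFN_SECTION_SHIFT;
}

/* Show which section a pfn belongs to and where that section starts. */
static void show_section(unsigned long pfn)
{
	unsigned long nr = pfn_to_section_nr(pfn);
	unsigned long first = section_nr_to_pfn(nr);

	assert(first <= pfn);	/* rounding down never moves past pfn */
	printf("pfn 0x%lx -> section 0x%lx, first pfn 0x%lx\n",
	       pfn, nr, first);
}
```

Under that assumption, show_section(0x1200400) reports section 0x120 with
first pfn 0x1200000, so a node lookup keyed on the first pfn of the section
inspects a page 0x400 pfns below where node 1 actually starts.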

Bye.
-- 
Yasunori Goto 


--

From: Dave Hansen
Date: Tuesday, September 30, 2008 - 8:50 am

Crap, I was worried about that.

Gary, this means that we have an N:1 relationship between NUMA nodes and
sections.  This normally isn't a problem because sections don't really
care about nodes and they layer underneath them.

We'll probably need multiple symlinks in each section directory.

-- Dave

--

From: Gary Hade
Date: Tuesday, September 30, 2008 - 12:41 pm

So, using Yasunori-san's example the memory section starting at

or perhaps symlinks to the same section directory from >1 node directory.

Gary

-- 
Gary Hade
System x Enablement
IBM Linux Technology Center
503-578-4503  IBM T/L: 775-4503
garyhade@us.ibm.com
http://www.ibm.com/linux/ltc
--

From: Yasunori Goto
Date: Tuesday, September 30, 2008 - 7:48 pm

In theory, it may be possible for one section to be divided between different
nodes. (I don't know whether such a system really exists...)

But the cause of my trouble is different.
There is a memory hole which is occupied by firmware.
The memory map of my box is as follows.

----
early_node_map[3] active PFN ranges
    0: 0x00000100 -> 0x00006d00
    0: 0x00408000 -> 0x00410000
    1: 0x01200400 -> 0x01210000
----

memmap_init() initializes only from start_pfn to end_pfn.
So, the memmaps for this first hole (0x1200000 - 0x12003ff) are not initialized,
and the node id is not set for them. This is the true cause.
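
The effect can be modelled with a small userspace C sketch built from the
three ranges listed above. This is a simplified model, not kernel code:
struct pfn_range, range_nid() and memmap_nid() are hypothetical names, end_pfn
is treated as exclusive, and the model assumes that a memmap entry which was
never initialized reads back as node 0.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical model of one early_node_map[] entry; end_pfn is exclusive. */
struct pfn_range {
	int nid;
	unsigned long start_pfn;
	unsigned long end_pfn;
};

/* The three active PFN ranges from the boot log above. */
static const struct pfn_range active_ranges[] = {
	{ 0, 0x00000100UL, 0x00006d00UL },
	{ 0, 0x00408000UL, 0x00410000UL },
	{ 1, 0x01200400UL, 0x01210000UL },
};

#define NR_RANGES (sizeof(active_ranges) / sizeof(active_ranges[0]))

/* Node that owns pfn, or -1 if pfn lies in a hole (never initialized). */
static int range_nid(unsigned long pfn)
{
	size_t i;

	for (i = 0; i < NR_RANGES; i++) {
		const struct pfn_range *r = &active_ranges[i];

		assert(r->start_pfn < r->end_pfn);
		if (pfn >= r->start_pfn && pfn < r->end_pfn)
			return r->nid;
	}
	return -1;
}

/* What a lookup through the memmap would see.
 * ASSUMPTION: an uninitialized entry reads back as node 0. */
static int memmap_nid(unsigned long pfn)
{
	int nid = range_nid(pfn);

	return nid < 0 ? 0 : nid;
}
```

In this model memmap_nid(0x1200000) yields 0 because that pfn lies in the
hole, while memmap_nid(0x1200400) yields 1. A lookup based on the first pfn
of the section therefore reports node 0 even though every valid page in the
section belongs to node 1.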


Bye.

-- 
Yasunori Goto 


--

From: Gary Hade
Date: Wednesday, October 1, 2008 - 9:51 am

Thanks for the clarification.  I think we need to cover both the
theoretical single memory section spanning multiple nodes case
and your memory hole/memory section intersection case.
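
One way to handle both cases is to derive the set of nodes for a section from
the active PFN ranges instead of from its first page. The following userspace
C sketch illustrates that idea only; it is not what the patch does.
section_node_mask(), struct pfn_range, the PFN_SECTION_SHIFT value of 16 and
the split_ranges[] data are all assumptions made for the example.

```c
#include <assert.h>
#include <stddef.h>

/* ASSUMPTION: 2^16 pages per section, matching the numbers in this thread. */
#define PFN_SECTION_SHIFT 16

/* Hypothetical model of an active PFN range; end_pfn is exclusive. */
struct pfn_range {
	int nid;
	unsigned long start_pfn;
	unsigned long end_pfn;
};

/* Ranges reported earlier in the thread (memory hole case). */
static const struct pfn_range thread_ranges[] = {
	{ 0, 0x00000100UL, 0x00006d00UL },
	{ 0, 0x00408000UL, 0x00410000UL },
	{ 1, 0x01200400UL, 0x01210000UL },
};

/* Made-up ranges where section 2 is shared by nodes 0 and 1. */
static const struct pfn_range split_ranges[] = {
	{ 0, 0x00020000UL, 0x00028000UL },
	{ 1, 0x00028000UL, 0x00030000UL },
};

/* Bitmask of nodes that have at least one active pfn in the section.
 * Bit N set means the section would be linked from nodeN. */
static unsigned long section_node_mask(const struct pfn_range *ranges,
				       size_t nr_ranges,
				       unsigned long section_nr)
{
	unsigned long sec_start = section_nr << PFN_SECTION_SHIFT;
	unsigned long sec_end = sec_start + (1UL << PFN_SECTION_SHIFT);
	unsigned long mask = 0;
	size_t i;

	for (i = 0; i < nr_ranges; i++) {
		assert(ranges[i].nid >= 0 &&
		       ranges[i].nid < (int)(8 * sizeof(mask)));
		/* Half-open interval overlap test. */
		if (ranges[i].start_pfn < sec_end &&
		    ranges[i].end_pfn > sec_start)
			mask |= 1UL << ranges[i].nid;
	}
	return mask;
}
```

With thread_ranges[], section 0x120 gets mask 0x2 (node 1 only) even though
its first 0x400 pfns are a hole. With the made-up split_ranges[], section 2
gets mask 0x3, which would translate into symlinks from both node0 and node1.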

Gary

--

From: Gary Hade
Date: Tuesday, September 30, 2008 - 4:29 pm

Yes, this would be nice, but unfortunately the presence of the
memory section directories that are referenced by the symlinks
also depends on CONFIG_MEMORY_HOTPLUG_SPARSE being enabled.  Removal
of the memory hotplug dependency for the code in drivers/base/memory.c
will require more than a simple CONFIG_MEMORY_HOTPLUG_SPARSE to
CONFIG_SPARSEMEM dependency change.  I am still looking at this.

Thanks for the review and testing.

Gary

--
