Re: [PATCH -mmotm] [BUGFIX] pagemap: fix pfn calculation for hugepage

Previous thread: [PATCH] x86: mce: Xeon75xx specific interface to get corrected memory error information v2 by Andi Kleen on Tuesday, March 23, 2010 - 10:40 pm. (6 messages)

Next thread: [PATCH -mmotm] pagemap: add #ifdefs CONFIG_HUGETLB_PAGE on code walking hugetlb vma by Naoya Horiguchi on Tuesday, March 23, 2010 - 10:42 pm. (1 message)
From: Naoya Horiguchi
Date: Tuesday, March 23, 2010 - 10:42 pm

When we look into pagemap using page-types with option -p, the pfn
values reported for hugepages are wrong (see below).
This is because the pte was evaluated only once per vma,
although it should be re-evaluated for each hugepage. This patch fixes it.

  $ page-types -p 3277 -Nl -b huge
  voffset   offset  len     flags
  7f21e8a00 11e400  1       ___U___________H_G________________
  7f21e8a01 11e401  1ff     ________________TG________________
               ^^^
  7f21e8c00 11e400  1       ___U___________H_G________________
  7f21e8c01 11e401  1ff     ________________TG________________
               ^^^

One hugepage contains 1 head page and 511 tail pages on x86_64, and
each pair of lines represents one hugepage. Voffset and offset are the
virtual address and physical address in page units, respectively.
Different hugepages should not have the same offset value.
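
The arithmetic behind these numbers can be checked with a small userspace
sketch. It assumes the usual x86_64 geometry of 4 KiB base pages and 2 MiB
hugepages; the macro and function names are illustrative only and are not
taken from the kernel or from page-types.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed x86_64 geometry: 4 KiB base pages, 2 MiB hugepages. */
#define BASE_PAGE_SHIFT 12
#define HUGE_PAGE_SHIFT 21
#define PAGES_PER_HUGE  (UINT64_C(1) << (HUGE_PAGE_SHIFT - BASE_PAGE_SHIFT)) /* 512 */

/* Number of tail pages that follow the single head page of a hugepage. */
uint64_t tail_pages(void)
{
	return PAGES_PER_HUGE - 1;
}

/* Distance, in hugepages, between two offsets given in page units. */
uint64_t hugepages_between(uint64_t a, uint64_t b)
{
	assert(b >= a);
	return (b - a) / PAGES_PER_HUGE;
}

/* Nonzero if a page-unit offset falls on a hugepage boundary. */
int is_huge_aligned(uint64_t off)
{
	return (off & (PAGES_PER_HUGE - 1)) == 0;
}
```

Under these assumptions tail_pages() is 0x1ff, matching the len column, and
the two voffsets 7f21e8a00 and 7f21e8c00 are exactly one hugepage apart, so
they cannot both map to offset 11e400.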

With this patch applied:

  $ page-types -p 3386 -Nl -b huge
  voffset   offset   len    flags
  7fec7a600 112c00   1      ___UD__________H_G________________
  7fec7a601 112c01   1ff    ________________TG________________
               ^^^
  7fec7a800 113200   1      ___UD__________H_G________________
  7fec7a801 113201   1ff    ________________TG________________
               ^^^
               OK

Changelog:
 - add hugetlb entry walker in mm/pagewalk.c
   (the idea is based on Kamezawa-san's patch)

Signed-off-by: Naoya Horiguchi <n-horiguchi@ah.jp.nec.com>
Signed-off-by: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
---
 fs/proc/task_mmu.c |   27 +++++++--------------------
 include/linux/mm.h |    4 ++--
 mm/pagewalk.c      |   47 +++++++++++++++++++++++++++++++++++++----------
 3 files changed, 46 insertions(+), 32 deletions(-)

diff --git a/fs/proc/task_mmu.c b/fs/proc/task_mmu.c
index 2a3ef17..9635f0b 100644
--- a/fs/proc/task_mmu.c
+++ b/fs/proc/task_mmu.c
@@ -662,31 +662,18 @@ static u64 huge_pte_to_pagemap_entry(pte_t pte, int offset)
 	return pme;
 }
 
-static int pagemap_hugetlb_range(pte_t ...
From: Zhu, Yijun (NSN - CN/Beijing)
Date: Tuesday, March 23, 2010 - 10:58 pm

Hi all,

Thank you very much in advance.
I sent a similar question before and have also searched for solutions
on the internet.

Scenario:
A multi-core CPU where only one core runs Linux for the control
plane, while the other cores run private, 100% compute-bound code.

Solution:
The following steps have been done:
1. Add the isolcpus option to the kernel command line.
2. Restrict the init thread to run only on the control CPU.
3. Disable all local interrupts on the other CPUs/cores.

Result:
The system boots, but it is NOT stable. Any system call may hang the
shell (e.g. ping/chkconfig).

Question:
Could someone give me some help or suggestions for the next steps?

Thanks a lot.
--

From: KAMEZAWA Hiroyuki
Date: Tuesday, March 23, 2010 - 10:57 pm

On Wed, 24 Mar 2010 14:42:27 +0900
Seems good.

More info.
 - This patch modifies walk_page_range()'s hugepage walker.
   But the change only affects pagemap_read(), which is the only caller of the hugepage callback.

 - Before the patch, the hugetlb_entry() callback is called once per pgd. Then
   hugetlb_entry() has to walk the pgd's contents by itself.
   This caused the bug.

 - After the patch, the hugetlb_entry() callback is called once per hugepte entry.
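
The difference between the two call patterns can be shown with a toy
userspace model. This is only a sketch and not the code in mm/pagewalk.c:
the toy_* names, the struct, and the callback signature are all
hypothetical, and a plain array stands in for the hugepte lookup.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Toy stand-in for a hugetlb mapping: one head pfn per mapped hugepage. */
struct toy_vma {
	const uint64_t *huge_pfn;	/* head pfn of each hugepage */
	size_t nr_huge;			/* number of hugepages in the range */
};

/* Hypothetical callback type; a nonzero return aborts the walk. */
typedef int (*toy_hugetlb_entry_t)(uint64_t head_pfn, size_t idx, void *priv);

/* Buggy shape: the entry is looked up once and reused for every hugepage. */
int toy_walk_lookup_once(const struct toy_vma *vma,
			 toy_hugetlb_entry_t cb, void *priv)
{
	if (vma->nr_huge == 0)
		return 0;
	uint64_t pfn = vma->huge_pfn[0];	/* evaluated only once */
	for (size_t i = 0; i < vma->nr_huge; i++) {
		int err = cb(pfn, i, priv);
		if (err)
			return err;
	}
	return 0;
}

/* Fixed shape: the entry is looked up again for each hugepage. */
int toy_walk_per_hugepage(const struct toy_vma *vma,
			  toy_hugetlb_entry_t cb, void *priv)
{
	for (size_t i = 0; i < vma->nr_huge; i++) {
		int err = cb(vma->huge_pfn[i], i, priv);
		if (err)
			return err;	/* propagate errors from inside the loop */
	}
	return 0;
}

/* Example callback: record the head pfn reported for each hugepage. */
int toy_record(uint64_t head_pfn, size_t idx, void *priv)
{
	((uint64_t *)priv)[idx] = head_pfn;
	return 0;
}
```

In this model the first walker reports the same head pfn for every
hugepage, which is the symptom seen in the page-types output, while the
second reports a distinct pfn for each one.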
nitpick.

This seems nicer than mine, but "return 0" is OK if you add "return err" in the loop.

Thanks,
-Kame

--

From: Naoya Horiguchi
Date: Wednesday, March 24, 2010 - 10:55 pm

I think there is a misunderstanding on this part.
I would add this instead:
 - Without this patch, the hugetlb_entry() callback is called once per vma,

OK. I fixed it.

Thanks,
Naoya Horiguchi
--

From: Matt Mackall
Date: Wednesday, March 24, 2010 - 9:54 am

Looks good to me.

Acked-by: Matt Mackall <mpm@selenic.com>

-- 
http://selenic.com : development and support for Mercurial and Linux


--
