Hi Ingo, This series begins to lay the groundwork for Xen domain 0 support. Domain 0 is much like a normal Xen domain, but it is also allowed to have direct access to the underlying hardware (including both IO space and various BIOS tables), so it can run device drivers, etc. This means that we need to be able to distinguish between a Xen domain's normal pseudo-physical address space, and the real machine physical address space. This series: - Adds a new pte flag - _PAGE_IOMAP - used to indicate that a mapping is intended to map an IO device, and the physical address is actually a real machine physical address rather than a pseudo-physical address. This is exposed as PAGE_KERNEL_IO and so on. ioremap() and early_ioremap() are modified to set this flag, as they (should) always be used to map IO devices and not other memory. By default __supported_pte_mask masks this flag out, so it won't end up in the final pagetable. The Xen (and any other) code which cares about this flag unmask it from __supported_pte_mask. - Add early_memremap() to deal with the cases where early_ioremap() actually being used to map normal memory. - Remove a bogus check in x86-64's implementation of set_pte_vaddr_pud(), which prevents memory mappings from being updated. - Convert __acpi_map_table to use early_ioremap(), rather than have its own implementation. - Make __acpi_map_table always map the memory rather than assuming that the linear map maps some of the acpi tables. This won't be true in the virtual case, and always mapping doesn't hurt in the native case. I've tested these patches on both 32 and 64 native booting, and it all works for me. Thanks, J --
early_ioremap() is redeclared in several places; remove them. Signed-off-by: Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com> --- include/asm-x86/io.h | 14 -------------- include/asm-x86/io_64.h | 3 --- 2 files changed, 17 deletions(-) diff --git a/include/asm-x86/io.h b/include/asm-x86/io.h --- a/include/asm-x86/io.h +++ b/include/asm-x86/io.h @@ -4,20 +4,6 @@ #define ARCH_HAS_IOREMAP_WC #include <linux/compiler.h> - -/* - * early_ioremap() and early_iounmap() are for temporary early boot-time - * mappings, before the real ioremap() is functional. - * A boot-time mapping is currently limited to at most 16 pages. - */ -#ifndef __ASSEMBLY__ -extern void early_ioremap_init(void); -extern void early_ioremap_clear(void); -extern void early_ioremap_reset(void); -extern void *early_ioremap(unsigned long offset, unsigned long size); -extern void early_iounmap(void *addr, unsigned long size); -extern void __iomem *fix_ioremap(unsigned idx, unsigned long phys); -#endif #define build_mmio_read(name, size, type, reg, barrier) \ static inline type name(const volatile void __iomem *addr) \ diff --git a/include/asm-x86/io_64.h b/include/asm-x86/io_64.h --- a/include/asm-x86/io_64.h +++ b/include/asm-x86/io_64.h @@ -165,9 +165,6 @@ #include <asm-generic/iomap.h> -extern void *early_ioremap(unsigned long addr, unsigned long size); -extern void early_iounmap(void *addr, unsigned long size); - /* * This one maps high address device memory and turns off caching for that area. * it's useful if some control registers are in such an area and write combining --
Use one of the software-defined PTE bits to indicate that a mapping is
intended for an IO address. On native hardware this is irrelevent,
since a physical address is a physical address. But in a virtual
environment, physical addresses are also virtualized, so there needs
to be some way to distinguish between pseudo-physical addresses and
actual hardware addresses; _PAGE_IOMAP indicates this intent.
By default, __supported_pte_mask masks out _PAGE_IOMAP, so it doesn't
even appear in the final pagetable.
Signed-off-by: Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com>
---
arch/x86/mm/init_32.c | 2 +-
arch/x86/mm/init_64.c | 2 +-
arch/x86/mm/ioremap.c | 8 ++++----
include/asm-x86/fixmap.h | 2 +-
include/asm-x86/pgtable.h | 14 ++++++++++++--
5 files changed, 19 insertions(+), 9 deletions(-)
diff --git a/arch/x86/mm/init_32.c b/arch/x86/mm/init_32.c
--- a/arch/x86/mm/init_32.c
+++ b/arch/x86/mm/init_32.c
@@ -503,7 +503,7 @@
int nx_enabled;
-pteval_t __supported_pte_mask __read_mostly = ~(_PAGE_NX | _PAGE_GLOBAL);
+pteval_t __supported_pte_mask __read_mostly = ~(_PAGE_NX | _PAGE_GLOBAL | _PAGE_IOMAP);
EXPORT_SYMBOL_GPL(__supported_pte_mask);
#ifdef CONFIG_X86_PAE
diff --git a/arch/x86/mm/init_64.c b/arch/x86/mm/init_64.c
--- a/arch/x86/mm/init_64.c
+++ b/arch/x86/mm/init_64.c
@@ -88,7 +88,7 @@
int after_bootmem;
-unsigned long __supported_pte_mask __read_mostly = ~0UL;
+pteval_t __supported_pte_mask __read_mostly = ~_PAGE_IOMAP;
EXPORT_SYMBOL_GPL(__supported_pte_mask);
static int do_not_nx __cpuinitdata;
diff --git a/arch/x86/mm/ioremap.c b/arch/x86/mm/ioremap.c
--- a/arch/x86/mm/ioremap.c
+++ b/arch/x86/mm/ioremap.c
@@ -223,16 +223,16 @@
switch (prot_val) {
case _PAGE_CACHE_UC:
default:
- prot = PAGE_KERNEL_NOCACHE;
+ prot = PAGE_KERNEL_IO_NOCACHE;
break;
case _PAGE_CACHE_UC_MINUS:
- prot = PAGE_KERNEL_UC_MINUS;
+ prot = PAGE_KERNEL_IO_UC_MINUS;
break;
case ...Could PTE_SPECIAL, added for get_user_pages_really_fast(), be reused for this? -- error compiling committee.c: too many arguments to function --
I'm not sure; I still don't really understand how _PAGE_SPECIAL gets
used, other than being user-mode mapping only. But in principle,
_PAGE_IOMAP could be set on both kernel and user mappings (if you direct
map a device into a process address space), so I think they would
conflict then?
Also, _PAGE_SPECIAL is also shared with _PAGE_CPA_TEST, which is only
used on kernel mappings, so they can co-exist happily.
Is _PAGE_IOMAP at all useful for device passthrough in kvm?
J
--
It's a "don't refcount me" flag, which is not sematically the same as No. We don't care if a page is an I/O page or RAM (other than refcounting correctness and page attributes, which are handled by other means). -- error compiling committee.c: too many arguments to function --
That's basically what our _PAGE_IO flag (in our old Linux patchset) means. We use it to cause pte_pfn() to return an invalid pfn and hence avoid reference counting that way. Since kernel mappings are never reference counted (I think?) perhaps we could use _PAGE_SPECIAL even if it is restricted to use on user mappings. -- Keir --
Well, _PAGE_IOMAP's most important semantic from Xen's perspective is
that the frame number is considered to already be an MFN and so isn't
converted. It may be that _PAGE_SPECIAL is also useful for its "no
refcount" properties, but we could set both in that case.
At present we can't set _PAGE_SPECIAL on kernel mappings because we
assume its usermode only, and its shared with _PAGE_CPA_TEST which is
kernel-only. Most _PAGE_IOMAP mappings are in kernel space, but I
wouldn't discount user mappings if we end up allowing direct access to
passthrough devices.
J
--
Oh, yes, our _PAGE_IO has the same meaning. -- Keir --
Actually it's more of a "no struct page" flag, which implies no refcounting. And not having a struct page should correspond well to a pte not requiring pfn->mfn conversion and being an I/O page. -- error compiling committee.c: too many arguments to function --
Hm, is that actually true enough to define it? Could we rename it
something like _PAGE_NOSTRUCTPAGE or something a bit more specific than
But _PAGE_SPECIAL is only set in a few places. It's not set in ioremap
mappings and so on. Should it be?
There's also the hiccup that it gets set in a pte with pte_mkspecial() -
but at that point its too late because you've already constructed the
pte and done the pfn->mfn conversion. _PAGE_IOMAP can only be set when
you initially construct the pte out of a frame number and a pgprot.
J
--
It complements vm_normal_page, which was there first (and coined by Linus). It is the opposite of normal. This question always comes up and my answer is always yes, if you can convince Linus to rename vm_normal_page to the corresponding term :) It's not exactly _PAGE_NOSTRUCTPAGE. There can be struct pages under Kernel address space, you mean? No, it is only ever used on user I don't see this would be any problem because the pte is always constructed in a single line in both places where it is used. --
Not really. Normal is normal, but "special" doesn't tell us what kind
To the extent that the struct page may as well not exist? Does it
contain any meaningful state? Are they always IO mappings? Could we
Right. But if we fold _PAGE_SPECIAL and _PAGE_IOMAP together, it would
start getting used on kernel addresses (and obviously we'd need to
OK. If we were to fold these two together, then pte_mkspecial() would
have to go, since it wouldn't possible to use correctly in my use case.
J
--
The remappings in setup.c are all just ordinary memory, so use
early_memremap() rather than early_ioremap().
Signed-off-by: Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com>
---
arch/x86/kernel/setup.c | 8 ++++----
1 file changed, 4 insertions(+), 4 deletions(-)
diff --git a/arch/x86/kernel/setup.c b/arch/x86/kernel/setup.c
--- a/arch/x86/kernel/setup.c
+++ b/arch/x86/kernel/setup.c
@@ -302,7 +302,7 @@
if (clen > MAX_MAP_CHUNK-slop)
clen = MAX_MAP_CHUNK-slop;
mapaddr = ramdisk_image & PAGE_MASK;
- p = early_ioremap(mapaddr, clen+slop);
+ p = early_memremap(mapaddr, clen+slop);
memcpy(q, p+slop, clen);
early_iounmap(p, clen+slop);
q += clen;
@@ -379,7 +379,7 @@
return;
pa_data = boot_params.hdr.setup_data;
while (pa_data) {
- data = early_ioremap(pa_data, PAGE_SIZE);
+ data = early_memremap(pa_data, PAGE_SIZE);
switch (data->type) {
case SETUP_E820_EXT:
parse_e820_ext(data, pa_data);
@@ -402,7 +402,7 @@
return;
pa_data = boot_params.hdr.setup_data;
while (pa_data) {
- data = early_ioremap(pa_data, sizeof(*data));
+ data = early_memremap(pa_data, sizeof(*data));
e820_update_range(pa_data, sizeof(*data)+data->len,
E820_RAM, E820_RESERVED_KERN);
found = 1;
@@ -428,7 +428,7 @@
return;
pa_data = boot_params.hdr.setup_data;
while (pa_data) {
- data = early_ioremap(pa_data, sizeof(*data));
+ data = early_memremap(pa_data, sizeof(*data));
sprintf(buf, "setup data %x", data->type);
reserve_early(pa_data, pa_data+sizeof(*data)+data->len, buf);
pa_data = data->next;
--
early_ioremap() is also used to map normal memory when constructing
the linear memory mapping. However, since we sometimes need to be able
to distinguish between actual IO mappings and normal memory mappings,
add a early_memremap() call, which maps with PAGE_KERNEL (as opposed
to PAGE_KERNEL_IO for early_ioremap()), and use it when constructing
pagetables.
Signed-off-by: Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com>
---
arch/x86/mm/init_64.c | 2 +-
arch/x86/mm/ioremap.c | 22 +++++++++++++++++-----
include/asm-x86/io.h | 1 +
3 files changed, 19 insertions(+), 6 deletions(-)
diff --git a/arch/x86/mm/init_64.c b/arch/x86/mm/init_64.c
--- a/arch/x86/mm/init_64.c
+++ b/arch/x86/mm/init_64.c
@@ -312,7 +312,7 @@
if (pfn >= table_top)
panic("alloc_low_page: ran out of memory");
- adr = early_ioremap(pfn * PAGE_SIZE, PAGE_SIZE);
+ adr = early_memremap(pfn * PAGE_SIZE, PAGE_SIZE);
memset(adr, 0, PAGE_SIZE);
*phys = pfn * PAGE_SIZE;
return adr;
diff --git a/arch/x86/mm/ioremap.c b/arch/x86/mm/ioremap.c
--- a/arch/x86/mm/ioremap.c
+++ b/arch/x86/mm/ioremap.c
@@ -549,12 +549,12 @@
}
static inline void __init early_set_fixmap(enum fixed_addresses idx,
- unsigned long phys)
+ unsigned long phys, pgprot_t prot)
{
if (after_paging_init)
- set_fixmap(idx, phys);
+ __set_fixmap(idx, phys, prot);
else
- __early_set_fixmap(idx, phys, PAGE_KERNEL);
+ __early_set_fixmap(idx, phys, prot);
}
static inline void __init early_clear_fixmap(enum fixed_addresses idx)
@@ -582,7 +582,7 @@
}
late_initcall(check_early_ioremap_leak);
-void __init *early_ioremap(unsigned long phys_addr, unsigned long size)
+static void __init *__early_ioremap(unsigned long phys_addr, unsigned long size, pgprot_t prot)
{
unsigned long offset, last_addr;
unsigned int nrpages, nesting;
@@ -631,7 +631,7 @@
idx0 = FIX_BTMAP_BEGIN - NR_FIX_BTMAPS*nesting;
idx = idx0;
while (nrpages > 0) {
- early_set_fixmap(idx, ...The check prevents flags on mappings from being changed, which is not desireable. There's no need to check for replacing a mapping, and x86-32 does not do this check. Signed-off-by: Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com> --- arch/x86/mm/init_64.c | 3 --- 1 file changed, 3 deletions(-) diff --git a/arch/x86/mm/init_64.c b/arch/x86/mm/init_64.c --- a/arch/x86/mm/init_64.c +++ b/arch/x86/mm/init_64.c @@ -195,9 +195,6 @@ } pte = pte_offset_kernel(pmd, vaddr); - if (!pte_none(*pte) && pte_val(new_pte) && - pte_val(*pte) != (pte_val(new_pte) & __supported_pte_mask)) - pte_ERROR(*pte); set_pte(pte, new_pte); /* --
__acpi_map_table() effectively reimplements early_ioremap(). Rather
than have that duplication, just implement it in terms of
early_ioremap().
However, unlike early_ioremap(), __acpi_map_table() just maintains a
single mapping which gets replaced each call, and has no corresponding
unmap function. Implement this by just removing the previous mapping
each time its called. Unfortunately, this will leave a stray mapping
at the end.
Signed-off-by: Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com>
---
arch/x86/kernel/acpi/boot.c | 27 +++++++--------------------
include/asm-x86/acpi.h | 3 ---
include/asm-x86/fixmap_32.h | 4 ----
include/asm-x86/fixmap_64.h | 4 ----
4 files changed, 7 insertions(+), 31 deletions(-)
diff --git a/arch/x86/kernel/acpi/boot.c b/arch/x86/kernel/acpi/boot.c
--- a/arch/x86/kernel/acpi/boot.c
+++ b/arch/x86/kernel/acpi/boot.c
@@ -121,8 +121,8 @@
*/
char *__init __acpi_map_table(unsigned long phys, unsigned long size)
{
- unsigned long base, offset, mapped_size;
- int idx;
+ static char *prev_map;
+ static unsigned long prev_size;
if (!phys || !size)
return NULL;
@@ -130,26 +130,13 @@
if (phys+size <= (max_low_pfn_mapped << PAGE_SHIFT))
return __va(phys);
- offset = phys & (PAGE_SIZE - 1);
- mapped_size = PAGE_SIZE - offset;
- clear_fixmap(FIX_ACPI_END);
- set_fixmap(FIX_ACPI_END, phys);
- base = fix_to_virt(FIX_ACPI_END);
+ if (prev_map)
+ early_iounmap(prev_map, prev_size);
- /*
- * Most cases can be covered by the below.
- */
- idx = FIX_ACPI_END;
- while (mapped_size < size) {
- if (--idx < FIX_ACPI_BEGIN)
- return NULL; /* cannot handle this */
- phys += PAGE_SIZE;
- clear_fixmap(idx);
- set_fixmap(idx, phys);
- mapped_size += PAGE_SIZE;
- }
+ prev_size = size;
+ prev_map = early_ioremap(phys, size);
- return ((unsigned char *)base + offset);
+ return prev_map;
}
#ifdef CONFIG_PCI_MMCONFIG
diff --git a/include/asm-x86/acpi.h b/include/asm-x86/acpi.h
--- ...It would be better to just fix the ACPI code to unmap. -Andi --
I was concerned that would cause lots of cross-arch churn, but of course
the only other relevant architecture is ia64. I'll prep a followup patch.
J
--
uhm, there's a nasty trap in that route: it can potentially cause a lot of breakage. It's not robust to assume that the ACPI code is sane wrt. mapping/unmapping, because it currently simply doesnt rely on robust unmapping (in the linear range). I tried it in the past and i found tons of crappy ACPI code all around that just never unmapped tables. Leaking ACPI maps are hard to find as well, and it can occur anytime during bootup. As a general principle it might be worth fixing those places, and we've hardened up the early-ioremap code for leaks during the PAT rewrite, still please realize that it can become non-trivial and it might cause a lot of unhappy users. So i'd suggest a different, more carful approach: keep the new code you wrote, but print a WARN()ing if prev_map is not unmapped yet when the next mapping is acquired. That way the ACPI code can be fixed gradually and without breaking existing functionality. There's another complication: ACPI might rely on multiple mappings being present at once, so unmapping the previous one might not be safe. But it _should_ be fine most of the time as __acpi_map_table() is only used inearly init code - and we fixed most of these things in the PAT patchset in any case. Ingo --
You mean there's code which just assumes that it can keep using a
linear-mapped acpi even after __acpi_map_table() should have implicitly
__acpi_map_table() is called by acpi_map_table(), which does have a
acpi_unmap_table() counterpart. But it only calls iounmap() once we're
past the stage of calling early_*(). I could easily make it call
__acpi_unmap_table()->early_iounmap(). But if the concern is that the
early boot callers of acpi_map_table() "know" that they never need to
And the current behaviour of __acpi_map_table() is to remove the
previous mapping (implicitly, by overwriting the same fixmap slots), so
its only an issue if the callers assume they can keep using
linear-mapped acpi tables after a subsequent call to __acpi_map_table().
J
--
Yes, you're right, it's a mess. I put a unmap and warning in there, but
there's lots of acpi code which doesn't unmap after using its table. It
doesn't seem to intermix using two tables, fortunately, so the "unmap
previous" behaviour of __acpi_map_table works OK, at least as far as I
can see.
J
--
ok, i stuck in your patches into tip/master today and -tip testing quickly found an early-ioremap leak: [ 36.625100] calling check_early_ioremap_leak+0x0/0x3d [ 36.630253] ------------[ cut here ]------------ [ 36.634884] WARNING: at arch/x86/mm/ioremap.c:577 check_early_ioremap_leak+0x28/0x3d() [ 36.642811] Debug warning: early ioremap leak of 1 areas detected. find the full log below with ioremap-leak-tracing turned on. I've excluded these commits for now from tip/master. Ingo ----------------> [ 0.000000] Linux version 2.6.27-rc6-tip-00189-gafeef79-dirty (mingo@titan) (gcc version 4.2.3) #1 SMP Wed Sep 10 13:53:53 CEST 2008 [ 0.000000] Command line: root=/dev/sda1 earlyprintk=serial,ttyS0,115200 console=ttyS0,115200 console=tty 5 profile=0 debug initcall_debug apic=debug apic=verbose ignore_loglevel sysrq_always_enabled pci=nomsi early_ioremap_debug [ 0.000000] KERNEL supported cpus: [ 0.000000] Intel GenuineIntel [ 0.000000] AMD AuthenticAMD [ 0.000000] BIOS-provided physical RAM map: [ 0.000000] BIOS-e820: 0000000000000000 - 000000000009fc00 (usable) [ 0.000000] BIOS-e820: 000000000009fc00 - 00000000000a0000 (reserved) [ 0.000000] BIOS-e820: 00000000000e0000 - 0000000000100000 (reserved) [ 0.000000] BIOS-e820: 0000000000100000 - 000000003ed94000 (usable) [ 0.000000] BIOS-e820: 000000003ed94000 - 000000003ee4e000 (ACPI NVS) [ 0.000000] BIOS-e820: 000000003ee4e000 - 000000003fea2000 (usable) [ 0.000000] BIOS-e820: 000000003fea2000 - 000000003fee9000 (ACPI NVS) [ 0.000000] BIOS-e820: 000000003fee9000 - 000000003feed000 (usable) [ 0.000000] BIOS-e820: 000000003feed000 - 000000003feff000 (ACPI data) [ 0.000000] BIOS-e820: 000000003feff000 - 000000003ff00000 (usable) [ 0.000000] console [earlyser0] enabled [ 0.000000] debug: ignoring loglevel setting. [ 0.000000] last_pfn = 0x3ff00 max_arch_pfn = 0x3ffffffff [ 0.000000] init_memory_mapping [ 0.000000] 0000000000 - 003fe00000 page ...
Yes, that leak is expected, unfortunately. __acpi_map_table() has no
corresponding unmap, and only maintains one mapping. So it will leak
its last mapping when it switches over from using __acpi_map_table() to
ioremap().
So, yes, its ugly, but its guaranteed to be a single leaked mapping.
But I'm not sure what the best approach to deal with it is.
(All those other backtraces are just informational, right?)
J
--
the false positive should be avoided, for example by unmapping the no, they cause hard failures in my test setup. Nor do we want to litter the bootup log with messages that are not correct. Ingo --
Yep. I'll make it unmap just before setting acpi_gbl_permanent_mmap,
No, I mean that early_ioremap_debug causes all early_ioremap()s to dump
a stack trace, so most of the stack dumps in the boot log are
diagnostic/informational rather than indications of lots of problems.
J
--
ah - now i understand what you mean, sure. The only non-diagnostic message that the test barfed on was the leak warning itself. The others were there because i did a second boot with the diagnostic messages enabled as well. (so that there's a full track of what is mapped where) Ingo --
acpi_os_map_memory is the only user for __acpi_map_table (except es7000_32.c)
void __iomem *__init_refok
acpi_os_map_memory(acpi_physical_address phys, acpi_size size)
{
if (phys > ULONG_MAX) {
printk(KERN_ERR PREFIX "Cannot map memory that high\n");
return NULL;
}
if (acpi_gbl_permanent_mmap)
/*
* ioremap checks to ensure this is in reserved space
*/
return ioremap((unsigned long)phys, size);
else
return __acpi_map_table((unsigned long)phys, size);
}
EXPORT_SYMBOL_GPL(acpi_os_map_memory);
void acpi_os_unmap_memory(void __iomem * virt, acpi_size size)
{
if (acpi_gbl_permanent_mmap) {
iounmap(virt);
}
}
EXPORT_SYMBOL_GPL(acpi_os_unmap_memory);
just let acpi_os_unmap_memory to call __acpi_unmap_table...
YH
--
That was my first attempt. Unfortunately a lot of the acpi code seems
pretty sloppy about unmapping its tables, and basically relies on
__acpi_map_table's current behaviour of removing the previous mapping
when creating the new mapping.
J
--
could we just emit a WARN_ON_ONCE() warning when we have to remove a previous mapping, so that the ACPI code can be fixed eventually? Ingo --
could be some code is shared between early and permanent, and later may not need unmap. anyway those code need to be cleanup, to double check if some fix-map is overwrited unexpected.. YH --
Always map acpi tables, rather than assuming we can use the normal linear mapping to access the acpi tables. This is necessary in a virtual environment where the linear mappings are to pseudo-physical memory, but the acpi tables exist at a real physical address. It doesn't hurt to map in the normal non-virtual case, so just do it unconditionally. Signed-off-by: Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com> --- arch/x86/kernel/acpi/boot.c | 3 --- 1 file changed, 3 deletions(-) diff --git a/arch/x86/kernel/acpi/boot.c b/arch/x86/kernel/acpi/boot.c --- a/arch/x86/kernel/acpi/boot.c +++ b/arch/x86/kernel/acpi/boot.c @@ -126,9 +126,6 @@ if (!phys || !size) return NULL; - - if (phys+size <= (max_low_pfn_mapped << PAGE_SHIFT)) - return __va(phys); if (prev_map) early_iounmap(prev_map, prev_size); --
actually, case 1: acpi tables near mmio, range, we don't map them from 2.6.27-rc1, and it is bigger than max_low_mapped... case 2: some strange system put acpi in the middle of RAM... like when 8G ram installed, but MMIO is 3.5G, BIOS put acpi tables around 2G.. YH --
OK, so what's your conclusion? Is this change OK or not?
J
--
Acked-by: Yinghai Lu <yhlu.kernel@gmail.com> --
