>From 086179f2ab6950b339dbb4e66f749a7c87e9a5be Mon Sep 17 00:00:00 2001 From: Eli Cohen <eli@mellanox.co.il> Date: Fri, 20 Jun 2008 15:36:44 +0300 Subject: [PATCH] IB/mthca: Clear ICM pages before handing to FW Current memfree FW has a bug which in some cases, assumes that ICM pages passed to it are cleared -- this patch will clear any ICM page passed to the FW. It is essential to use this patch since there are many cards already in use which use FW with this bug. At some time when we're sure all systems are updated with the new FW (yet to be released), we can remove this patch. Signed-off-by: Eli Cohen <eli@mellanox.co.il> --- This fixes the bug reported by Arthur from SGI here: http://lists.openfabrics.org/pipermail/general/2008-May/050026.html drivers/infiniband/hw/mthca/mthca_memfree.c | 29 +++++++++++++++++++++++++++ 1 files changed, 29 insertions(+), 0 deletions(-) diff --git a/drivers/infiniband/hw/mthca/mthca_memfree.c b/drivers/infiniband/hw/mthca/mthca_memfree.c index 9e77ba9..9596c9b 100644 --- a/drivers/infiniband/hw/mthca/mthca_memfree.c +++ b/drivers/infiniband/hw/mthca/mthca_memfree.c @@ -103,6 +103,33 @@ void mthca_free_icm(struct mthca_dev *dev, struct mthca_icm *icm, int coherent) kfree(icm); } +static void clear_pages(struct page *page, int npages) +{ + struct page **page_arr; + int i; + void *buf; + + page_arr = kmalloc(npages * sizeof *page_arr, GFP_KERNEL); + if (!page_arr) { + printk(KERN_WARNING "page array alloc failure\n"); + return; + } + + for (i = 0; i < npages; ++i) + page_arr[i] = page++; + + buf = vmap(page_arr, npages, VM_MAP, PAGE_KERNEL); + if (!buf) { + printk(KERN_WARNING "vmap failed\n"); + goto exit; + } + memset(buf, 0, PAGE_SIZE * npages); + vunmap(buf); + +exit: + kfree(page_arr); +} + static int mthca_alloc_icm_pages(struct scatterlist *mem, int order, gfp_t gfp_mask) { struct page *page; @@ -111,6 +138,8 @@ static int mthca_alloc_icm_pages(struct scatterlist *mem, int order, gfp_t ...
> Current memfree FW has a bug which in some cases, assumes that ICM > pages passed to it are cleared -- this patch will clear any ICM page > passed to the FW. It is essential to use this patch since there are > many cards already in use which use FW with this bug. At some time > when we're sure all systems are updated with the new FW (yet to be > released), we can remove this patch. realistically I think we leave this workaround forever. it's not like it costs very much. Does mlx4 have any similar problem? > --- > This fixes the bug reported by Arthur from SGI here: > http://lists.openfabrics.org/pipermail/general/2008-May/050026.html This should be in the changelog itself... there's no reason to throw away this useful information when merging the patch. Anyway, thanks for debugging this... one question, why not just use the following (completely untested) change instead? Does this not work? At least using clear_highpage() instead of vmap() by hand seems much simpler too if __GFP_ZERO isn't usable for some reason. diff --git a/drivers/infiniband/hw/mthca/mthca_memfree.c b/drivers/infiniband/hw/mthca/mthca_memfree.c index 9e77ba9..4fb3c85 100644 --- a/drivers/infiniband/hw/mthca/mthca_memfree.c +++ b/drivers/infiniband/hw/mthca/mthca_memfree.c @@ -107,7 +107,7 @@ static int mthca_alloc_icm_pages(struct scatterlist *mem, int order, gfp_t gfp_m { struct page *page; - page = alloc_pages(gfp_mask, order); + page = alloc_pages(gfp_mask | __GFP_ZERO, order); if (!page) return -ENOMEM; _______________________________________________ general mailing list general@lists.openfabrics.org http://lists.openfabrics.org/cgi-bin/mailman/listinfo/general To unsubscribe, please visit http://openib.org/mailman/listinfo/openib-general
I was thinking about applications which create and destroy HCA Looks to me like using __GFP_ZERO is the cleanest approach. I like less clear_highpage() since it uses kmap_atomic() which could fail. _______________________________________________ general mailing list general@lists.openfabrics.org http://lists.openfabrics.org/cgi-bin/mailman/listinfo/general To unsubscribe, please visit http://openib.org/mailman/listinfo/openib-general
> I was thinking about applications which create and destroy HCA
> resources at a high rate which might be affected.
I would expect all the other costs of mapping pages into/out of the ICM
to be large enough that the cost of zeroing a page is not too big a
deal. But if there is anyone who cares we can always make this
__GFP_ZERO flag conditional on FW version I guess.
> Looks to me like using __GFP_ZERO is the cleanest approach. I like
> less clear_highpage() since it uses kmap_atomic() which could fail.
To the best of my knowledge, kmap_atomic() cannot fail -- and if you
think about it, what could make it fail? The whole point of
kmap_atomic() is that there is a per-cpu pre-reserved slot to map the
memory at for highmem pages, so it has to always work. As far as I can
see, __GFP_ZERO allocations use clear_highpage() internally anyway, so
it ends up being the same thing.
Since just adding __GFP_ZERO is so much simpler, I'll just commit the
patch below and send it to Linus in a day or two if it seems OK to you:
commit 801d1ad7b6bdb3418a462c6b4950aee56dbac940
Author: Eli Cohen <eli@mellanox.co.il>
Date: Sun Jun 22 12:56:58 2008 -0700
IB/mthca: Clear ICM pages before handing to FW
Current memfree FW has a bug which in some cases, assumes that ICM
pages passed to it are cleared. This patch uses __GFP_ZERO to
allocate all ICM pages passed to the FW. Once firmware with a fix is
released, we can make the workaround conditional on firmware version.
This fixes the bug reported by Arthur Kepner <akepner@sgi.com> here:
http://lists.openfabrics.org/pipermail/general/2008-May/050026.html
Signed-off-by: Eli Cohen <eli@mellanox.co.il>
[ Rewritten to be a one-liner using __GFP_ZERO instead of vmap()ing
ICM memory and memset()ing it to 0. - Roland ]
Signed-off-by: Roland Dreier <rolandd@cisco.com>
diff --git a/drivers/infiniband/hw/mthca/mthca_memfree.c ...Looking at the implementation of kmap_atomic() quoted bellow:
void *page_address(struct page *page)
{
unsigned long flags;
void *ret;
struct page_address_slot *pas;
if (!PageHighMem(page))
return lowmem_page_address(page);
pas = page_slot(page);
ret = NULL;
spin_lock_irqsave(&pas->lock, flags);
if (!list_empty(&pas->lh)) {
struct page_address_map *pam;
list_for_each_entry(pam, &pas->lh, list) {
if (pam->page == page) {
ret = pam->virtual;
goto done;
}
}
}
done:
spin_unlock_irqrestore(&pas->lock, flags);
return ret;
}
static inline void *kmap_atomic(struct page *page, enum km_type idx)
{
pagefault_disable();
return page_address(page);
}
I can't see page_address doing anything actively to map the page. It
just searches for the page and returns it's mapping. But I'm probably
Of course. Go ahead and send it.
Thanks.
_______________________________________________
general mailing list
general@lists.openfabrics.org
http://lists.openfabrics.org/cgi-bin/mailman/listinfo/general
To unsubscribe, please visit http://openib.org/mailman/listinfo/openib-general
You're looking at the no-highmem version. Check the one in arch/x86/mm/highmem_32.c for a more interesting example. -- Pete _______________________________________________ general mailing list general@lists.openfabrics.org http://lists.openfabrics.org/cgi-bin/mailman/listinfo/general To unsubscribe, please visit http://openib.org/mailman/listinfo/openib-general
