From: Hoang-Nam Nguyen <hnguyen@de.ibm.com>
...because, on virtualized hardware like System p, we can't be sure that the
physical pages behind them are contiguous.Signed-off-by: Joachim Fenkes <fenkes@de.ibm.com>
---Another patch for 2.6.24 that will apply cleanly on top of my previous
patchset. Please review and apply. Thanks!drivers/infiniband/hw/ehca/ehca_classes.h | 8 ++--
drivers/infiniband/hw/ehca/ehca_mrmw.c | 82 +++++++++++++++++++++++++----
2 files changed, 75 insertions(+), 15 deletions(-)diff --git a/drivers/infiniband/hw/ehca/ehca_classes.h b/drivers/infiniband/hw/ehca/ehca_classes.h
index 206d4eb..c2edd4c 100644
--- a/drivers/infiniband/hw/ehca/ehca_classes.h
+++ b/drivers/infiniband/hw/ehca/ehca_classes.h
@@ -99,10 +99,10 @@ struct ehca_sport {
struct ehca_sma_attr saved_attr;
};-#define HCA_CAP_MR_PGSIZE_4K 1
-#define HCA_CAP_MR_PGSIZE_64K 2
-#define HCA_CAP_MR_PGSIZE_1M 4
-#define HCA_CAP_MR_PGSIZE_16M 8
+#define HCA_CAP_MR_PGSIZE_4K 0x80000000
+#define HCA_CAP_MR_PGSIZE_64K 0x40000000
+#define HCA_CAP_MR_PGSIZE_1M 0x20000000
+#define HCA_CAP_MR_PGSIZE_16M 0x10000000struct ehca_shca {
struct ib_device ib_device;
diff --git a/drivers/infiniband/hw/ehca/ehca_mrmw.c b/drivers/infiniband/hw/ehca/ehca_mrmw.c
index 4c8f3b3..1bb9d23 100644
--- a/drivers/infiniband/hw/ehca/ehca_mrmw.c
+++ b/drivers/infiniband/hw/ehca/ehca_mrmw.c
@@ -41,6 +41,8 @@
*/#include <asm/current.h>
+#include <linux/mm.h>
+#include <linux/hugetlb.h>#include <rdma/ib_umem.h>
@@ -51,6 +53,7 @@
#define NUM_CHUNKS(length, chunk_size) \
(((length) + (chunk_size - 1)) / (chunk_size))
+
/* max number of rpages (per hcall register_rpages) */
#define MAX_RPAGES 512@@ -279,6 +282,52 @@ reg_phys_mr_exit0:
} /* end ehca_reg_phys_mr() *//*----------------------------------------------------------------------*/
+static int ehca_is_mem_hugetlb(unsigned long addr, unsigned long size)
+{
+ s...
> -#define HCA_CAP_MR_PGSIZE_4K 1
> -#define HCA_CAP_MR_PGSIZE_64K 2
> -#define HCA_CAP_MR_PGSIZE_1M 4
> -#define HCA_CAP_MR_PGSIZE_16M 8
> +#define HCA_CAP_MR_PGSIZE_4K 0x80000000
> +#define HCA_CAP_MR_PGSIZE_64K 0x40000000
> +#define HCA_CAP_MR_PGSIZE_1M 0x20000000
> +#define HCA_CAP_MR_PGSIZE_16M 0x10000000Not sure I understand what this has to do with things... is this an
unrelated fix?> +static int ehca_is_mem_hugetlb(unsigned long addr, unsigned long size)
This is rather awful -- another call to get_user_pages() to iterate
over all the vmas...I would suggest extending ib_umem_get() to check the vmas and adding a
member to struct ib_umem to say whether the memory is entirely covered
by hugetlb pages or not.> + ret = ehca_is_mem_hugetlb(virt, length);
> + switch (ret) {
> + case 0: /* mem is not from hugetlb */
> + hwpage_size = PAGE_SIZE;
> + break;
> + case 1:
> + if (length <= EHCA_MR_PGSIZE4K
> + && PAGE_SIZE == EHCA_MR_PGSIZE4K)
> + hwpage_size = EHCA_MR_PGSIZE4K;
> + else if (length <= EHCA_MR_PGSIZE64K)
> + hwpage_size = EHCA_MR_PGSIZE64K;
> + else if (length <= EHCA_MR_PGSIZE1M)
> + hwpage_size = EHCA_MR_PGSIZE1M;
> + else
> + hwpage_size = EHCA_MR_PGSIZE16M;
> + break;
> + default: /* out of mem */
> + ib_mr = ERR_PTR(-ENOMEM);
> + goto reg_user_mr_exit1;It seems like it would be better to just assume the memory is not from
a hugetlb is ehca_is_mem_hugetlb() fails its memory allocation and
fall back to the PAGE_SIZE case rather than failing entirely.Also if someone runs a kernel with 64K pages on a machine where they
end up being simulated from 4K pages, do you have the same issue with
the hypervisor ganging together non-contiguous pages?- R.
-
If ehca_is_mem_hugetlb() runs out of memory, ehca_reg_mr() is rather
unlikely to get the memory, but it's worth a try, I'll give you that. I'll
make the umem patch work that way.Joachim
-
With todays hypervisor and todays pagesizes and todays MMUs
we don't have this problem if eHCA is enabled.It is difficult to make predictions about the future,
- Christoph R.
-
| Greg Kroah-Hartman | [PATCH 001/196] Chinese: Add the known_regression URI to the HOWTO |
| Tarkan Erimer | Re: Dual-Licensing Linux Kernel with GPL V2 and GPL V3 |
| Amit K. Arora | [RFC] Heads up on sys_fallocate() |
| Bart Van Assche | Integration of SCST in the mainstream Linux kernel |
git: | |
| David Miller | [GIT]: Networking |
| Gerrit Renker | [PATCH 27/37] dccp: Integration of dynamic feature activation - part 2 (server side) |
| David Miller | Re: [PATCH] pkt_sched: Destroy gen estimators under rtnl_lock(). |
| Frans Pop | svc: failed to register lockdv1 RPC service (errno 97). |
