> We've been hit by this twice this week on two NFS/RDMA servers, so I'm
> glad to see this! But, for us it happens with memless ConnectX - our mthca
> devices are ok (but OTOH they're memfull not memfree)
Strange... As I said before, something seems to have changed to
affect this, but I have no idea what. I'm including the test program
I use to check whether QP creation succeeds; you can run it on any
suspect systems and see what it prints.
> I'll be happy to test it with our misbehaving cards, but I can't do it until
> next week since they just went into a box for shipping. In the meantime,
> dare I ask - what's different about memfree cards that limits the sge
> attributes like this? And, what values result from the new code? The
> ConnectX ones I have report 32, and fail when trying to set that.
The patch doesn't change ConnectX -- creating a QP with max send/recv
sge 32 works fine for me here with mlx4 from 2.6.26-rc2. For memfree
the new max_sge reported is 27 entries, and for memfull it is 59 (and
creating such QPs succeeds of course). The difference between memfree
and memfull that matters is just that on memfree the max_sge runs into
the max WQE size, and without the patch the code didn't handle that
correctly.
Here's the test program to check QP creation vs reported max_sge:
#include <stdio.h>
#include <string.h>
#include <infiniband/verbs.h>

int main(int argc, char *argv[])
{
	struct ibv_device **dev_list;
	struct ibv_device_attr dev_attr;
	struct ibv_context *context;
	struct ibv_pd *pd;
	struct ibv_cq *cq;
	struct ibv_qp_init_attr qp_attr;
	size_t t;
	/* QP transport types to try on each device */
	static const struct {
		enum ibv_qp_type type;
		char *name;
	} type_tab[] = {
		{ IBV_QPT_RC, "RC" },
		{ IBV_QPT_UC, "UC" },
		{ IBV_QPT_UD, "UD" },
	};

	dev_list = ibv_get_device_list(NULL);
	if (!dev_list) {
		printf("No RDMA devices found\n");
		return 1;
	}

	for (; *dev_list; ++dev_list) {
		printf("%s:\n", ibv_get_device_name(*dev_list));

		context = ibv_open_device(*dev_list);
		if (!context) {
			printf(" ibv_open_device failed\n");
			continue;
		}

		/* dev_attr.max_sge is the limit the device reports */
		if (ibv_query_device(context, &dev_attr)) {
			printf(" ibv_query_device failed\n");
			continue;
		}

		cq = ibv_create_cq(context, 1, NULL, NULL, 0);
		if (!cq) {
			printf(" ibv_create_cq failed\n");
			continue;
		}

		pd = ibv_alloc_pd(context);
		if (!pd) {
			printf(" ibv_alloc_pd failed\n");
			continue;
		}

		/* Try to create a QP of each type using the reported max_sge */
		for (t = 0; t < sizeof type_tab / sizeof type_tab[0]; ++t) {
			memset(&qp_attr, 0, sizeof qp_attr);
			qp_attr.send_cq = cq;
			qp_attr.recv_cq = cq;
			qp_attr.cap.max_send_wr = 1;
			qp_attr.cap.max_recv_wr = 1;
			qp_attr.cap.max_send_sge = dev_attr.max_sge;
			qp_attr.cap.max_recv_sge = dev_attr.max_sge;
			qp_attr.qp_type = type_tab[t].type;

			printf(" %s: SGE %d ", type_tab[t].name, dev_attr.max_sge);
			if (ibv_create_qp(pd, &qp_attr))
				printf("ok (got %d/%d)\n",
				       qp_attr.cap.max_send_sge,
				       qp_attr.cap.max_recv_sge);
			else
				printf("FAILED\n");
		}
	}

	return 0;
}
_______________________________________________
general mailing list
general@lists.openfabrics.org
http://lists.openfabrics.org/cgi-bin/mailman/listinfo/general
To unsubscribe, please visit http://openib.org/mailman/listinfo/openib-general
