| From | Subject | Date |
|---|---|---|
| Roland Dreier | [ofa-general] Updated InfiniBand/RDMA merge plans for 2.6.24
Since 2.6.23 still isn't out, and I've managed to reduce my patch
review backlog a bit, it's probably a good idea to give another update
about what I have queued for 2.6.24 already and what I hope to get to
before the merge window opens.
Core:
- My user_mad P_Key index support patch. Merged this, although I
still owe Sasha a patch to update libraries to use this.
- A fix to the user_mad 32-bit big-endian userspace 64/32 problem
with the method_mask when registering agents. Merged.
...
| Oct 5, 7:18 pm 2007 |
| Roland Dreier | [ofa-general] Re: [PATCH 3 of 3 for-2.6.24] mlx4: implement ...
Thanks, I applied cleaned-up versions of all three patches for 2.6.24.
One thing I changed was to just pass an error back to the caller
rather than doing BUG_ON() anywhere. It's very unfriendly to the user
to crash the whole machine just because of a driver bug -- much better
to try and continue so that the user sees the error and can report it.
| Oct 5, 6:56 pm 2007 |
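The point about passing an error back instead of calling BUG_ON() can be illustrated with a small userspace sketch. Everything here (`demo_table`, `demo_free_slot`, `DEMO_MAX_SLOTS`) is hypothetical and is not taken from the mlx4 patches. It only shows the general pattern of reporting a caller bug and returning a negative errno rather than aborting.

```c
#include <errno.h>
#include <stdio.h>

/* Hypothetical resource table; not taken from the mlx4 driver. */
#define DEMO_MAX_SLOTS 4

struct demo_table {
    int used[DEMO_MAX_SLOTS];
};

/*
 * Release a slot. A bad index or a double free points to a bug in the
 * caller, but instead of aborting (the userspace analogue of BUG_ON())
 * the function logs the problem and returns a negative errno so the
 * caller can propagate the failure and the system keeps running.
 */
static int demo_free_slot(struct demo_table *t, int idx)
{
    if (idx < 0 || idx >= DEMO_MAX_SLOTS) {
        fprintf(stderr, "demo: slot index %d out of range\n", idx);
        return -EINVAL;
    }
    if (!t->used[idx]) {
        fprintf(stderr, "demo: slot %d is not in use\n", idx);
        return -EINVAL;
    }
    t->used[idx] = 0;
    return 0;
}
```

A caller would check the return value and hand it up the stack, so the user sees a reportable error message instead of a crashed machine.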
| akepner | [ofa-general] mpi failures on large ia64/ofed/IB clusters
On "large" IB-connected ia64 clusters, I (and some customers) are
seeing failures in MPI programs. This is commoner the bigger the
cluster nodes are, but I've seen it with as few as 32P/node.
I'm using "Mellanox Technologies MT23108 InfiniHost (rev a1)"
HCAs, with firmware version 3.5.0 (but this has been seen with
several firmware revisions) and OFED-1.2.
For example, with 2-128P systems connected via a single IB port,
using this simple MPI program:
int main(int argc, char **argv)
{
...
| Oct 5, 6:36 pm 2007 |
| Roland Dreier | Re: [ofa-general] mpi failures on large ia64/ofed/IB clusters
> On one run we got this in syslog (ib_mthca's debug_level set to 1):
>
> 15:34:34 ib_mthca 0012:01:00.0: Command 21 completed with status 09
> 15:35:34 ib_mthca 0012:01:00.0: HW2SW_MPT failed (-16)
> ....
> (status 0x9==MTHCA_CMD_STAT_BAD_RES_STATE => problem with mpi?)
>
> or on another run:
>
> 13:57:15 ib_mthca 0005:01:00.0: Command 1a completed with status 01
> 13:57:15 ib_mthca 0005:01:00.0: modify QP 1->2 returned status 01.
> ....
| Oct 5, 6:46 pm 2007 |
| Roland Dreier | Re: [ofa-general] mpi failures on large ia64/ofed/IB clusters
> I don't really see anything racy in the FW command stuff, but it's
> possible that there's something like an mmiowb() missing somewhere (I
> have a hard time spotting that type of race for some reason).
Another possibility (independent of the hack I suggested before) would
be to add an mmiowb() before the mutex_unlock() in mthca_cmd_post().
I actually have a good feeling about this theory....
- R.
| Oct 5, 6:51 pm 2007 |
| akepner | Re: [ofa-general] mpi failures on large ia64/ofed/IB clusters
Genius!
I have completed over 275 runs with the patch below, so
we can be very confident that this has fixed things.
Roland, should I submit a proper patch, or do you want
to take care of this? (And thanks a lot, too!)
diff -rup ofa_kernel-1.2.orig/drivers/infiniband/hw/mthca/mthca_cmd.c ofa_kernel-1.2/drivers/infiniband/hw/mthca/mthca_cmd.c
--- ofa_kernel-1.2.orig/drivers/infiniband/hw/mthca/mthca_cmd.c 2007-06-21 07:38:47.000000000 -0700
+++ ofa_kernel-1.2/drivers/infiniband/hw/mthca/mthc...
| Oct 5, 8:22 pm 2007 |
| Zulfi Imani | [ofa-general] OFED libibverbs API
Hi all,
I wanted to find out where I can get the libibverbs API specification from.
I checked the openfabrics.org website but could not find anything
immediately.
Thanks
Zulfi
| Oct 5, 4:46 pm 2007 |
| Steve Wise | Re: [ofa-general] OFED libibverbs API
OFA Admins:
It would be nice to put the man pages on-line...
If we installed the man pages, then used man2html or something we could
point folks at that for on-line docs...
Zulfi, if you build/install ofed-1.2.5, you can then get man pages for
the verbs and rdmacm APIs. Also there are header files and examples
that get built/installed.
Steve.
| Oct 5, 5:53 pm 2007 |
| Zulfi Imani | Re: [ofa-general] OFED libibverbs API
Thanks Steve.
Just a couple of questions. I have installed the OFED1.2 stack. You said I
bin include lib lib64 mpi sbin src
I do not see any subdir for example programs.
Also, where can I find simple programs, such as file transfer using RDMA and
libibverbs?
Does the "verbs.h" in $INSTALL/include/infiniband represent the libibverbs
API?
I...
| Oct 5, 6:18 pm 2007 |
| Sean Hefty | RE: [ofa-general] [PATCH] rdma/cm: add locking around QP acc...
Rick, have you had a chance to test out this patch?
| Oct 5, 1:19 pm 2007 |
| Vladimir Sokolovsky ... | [ofa-general] ofa_1_3_kernel 20071005-0200 daily build status
This email was generated automatically, please do not reply
git_url: git://git.openfabrics.org/ofed_1_3/linux-2.6.git
git_branch: ofed_kernel
Common build parameters: --with-ipoib-mod --with-sdp-mod --with-srp-mod --with-user_mad-mod --with-user_access-mod --with-mthca-mod --with-mlx4-mod --with-core-mod --with-addr_trans-mod --with-rds-mod --with-cxgb3-mod --with-nes-mod
Passed:
Passed on i686 with 2.6.15-23-server
Passed on i686 with linux-2.6.13
Passed on i686 with linux-2.6.22
Passed ...
| Oct 5, 5:53 am 2007 |
| che_del_rosario | [ofa-general] Big mover shows market today
FRLE begins to deliver promised returns, Shares up over 31%.
Fearless International Inc. (F R L E)
$0.25 UP 31.76 %
Hard climb for the hottest new yacht on the market, shares jumped nearly
32% today. You cant ignore these kind of numbers, this is going to be
huge. There is a time and place for everything, and Friday more is
yours, grab this one early.
| Oct 5, 5:42 am 2007 |
| WINNING NOTIFICATION | [ofa-general] ***SPAM*** CONFIRM YOUR WINNING PRIZE Ref: XYL...
Ref: XYL /26510460037/05
Batch: 24/00319/IPD
WINNING NOTIFICATION
We happily announce to you the draw (#1071)winner of the cash prize
of £2,696,385held on the 4th of October 2007 in London Uk.
contact our fiduaciary claims department
Agents Name: Van Williams
Email: claims_uknationallotterydept3@yahoo.co.uk
Tel: +447024096270
1.Name...2.Address...3.Nationality....4.Age...5.Sex...
6.Occupation...7.Phone/Fax..8.COUNTRY..
Cordially,
Rose Woo...
| Oct 5, 5:24 am 2007 |
| damaru | [ofa-general] Do we like the same books?
I just joined Shelfari to connect with other book lovers. Come see the books
I love and see if we have any in common. Then pick my next book so I can
keep on reading.

Click below to join my group of friends on Shelfari!

http://www.shelfari.com/Register.aspx?ActivityId=22801633&InvitationCode=c70c284d-a0dd-4a69-a023-3022d4752243
...
| Oct 5, 4:30 am 2007 |
| kliteyn | [ofa-general] nightly osm_sim report 2007-10-05:normal compl...
OSM Simulation Regression Summary
[Generated mail - please do NOT reply]
OpenSM binary date = 2007-10-04
OpenSM git rev = Tue_Oct_2_22:28:56_2007 [d5c34ddc158599abff9f09a6cc6c8cad67745f0b]
ibutils git rev = Tue_Sep_4_17:57:34_2007 [4bf283f6a0d7c0264c3a1d2de92745e457585fdb]
Total=520 Pass=520 Fail=0
Pass:
39 Stability IS1-16.topo
39 Pkey IS1-16.topo
39 OsmTest IS1-16.topo
39 OsmStress IS1-16.topo
39 Multicast IS1-16.topo
39 LidMgr IS1-16.topo
13 Stability IS3-loop.topo
13...
| Oct 5, 1:09 am 2007 |
| Troy Benjegerdes | Re: [ofa-general] Setting lowest-common denominator ipoib mu...
I think it would help usability a lot to put the PARTITION CONFIGURATION
section in a separate 'opensm-partitions.conf' man page with the values
for rate, mtu and scope listed directly.
| Oct 5, 2:54 pm 2007 |
| Dotan Barak | Re: [ofa-general] Issues to scale to 64K ranks.
Hi.
This number of QPs (and of any other resource) is on a per-HCA basis.
The HCA itself supports many more QPs (and more elements of any other
resource),
but the driver limits the number of QPs to consume less memory.
The mthca low-level driver supports changing the number of resources with
module parameters.
Until those module parameters are added, the only way to do this is to
hack the low-level driver.
Dotan
| Oct 5, 1:51 pm 2007 |
| Roland Dreier | Re: [ofa-general] InfiniBand/RDMA merge plans for 2.6.24
> I tested this by simulating a slow passive side responder, and it worked as
> expected for those tests. Using an MRA does add another MAD to the CM exchange,
> which is why it is sent only after seeing a duplicate request. Alternatively,
> we can take the OFED module parameter patch.
What the heck, I added this for 2.6.24. If it doesn't work out we can
back it out.
- R.
| Oct 5, 7:10 pm 2007 |
| Roland Dreier | Re: [ofa-general] [PATCH] mlx4: increase permissible number ...
Thanks, I just applied Jack's patch and also this:
commit adeeb48f21a36693fed11b318bce132571ed3679
Author: Roland Dreier <rolandd@cisco.com>
Date: Fri Oct 5 16:03:44 2007 -0700
IB/mthca: Increase max number of QPs per multicast group to 56
Increase the number of QPs allowed per multicast group from 8 to 56.
This allows for one QP per core on 16-core systems, which are now
quite common, and allows some space for future growth.
This is basically the same pat...
| Oct 5, 7:12 pm 2007 |
| Zulfi Imani | Re: [ofa-general] Problem running SDP apps using OFED 1.2
Hi Dotan,
ifconfig shows up
ib0 Link encap:InfiniBand HWaddr
80:00:00:03:FE:80:00:00:00:00:00:00:00:00:00:00:00:00:00:00
inet addr: 140.221.37.32 Bcast: 140.221.37.255 Mask:
255.255.255.0
inet6 addr: fe80::211:7500:ff:d802/64 Scope:Link
UP BROADCAST RUNNING MULTICAST MTU:65520 Metric:1
RX packet...
| Oct 5, 6:01 pm 2007 |
| Scott Weitzenkamp (s... | RE: [ofa-general] Problem running SDP apps using OFED 1.2
Can you ping between the two nodes using the IPoIB IP address?
Scott Weitzenkamp
SQA and Release Manager
Server Virtualization Business Unit
Cisco Systems
________________________________
From: general-bounces@lists.openfabrics.org
[mailto:general-bounces@lists.openfabrics.org] On Behalf Of Zulfi Imani
...
| Oct 5, 6:08 pm 2007 |
| Zulfi Imani | Re: [ofa-general] Problem running SDP apps using OFED 1.2
I restarted openibd and now my interfaces are up.
mach#1
ib0 Link encap:InfiniBand HWaddr
80:00:00:02:FE:80:00:00:00:00:00:00:00:00:00:00:00:00:00:00
inet addr:140.221.37.46 Bcast:140.221.37.255 Mask:255.255.255.0
inet6 addr: fe80::211:7500:ff:d7f2/64 Scope:Link
mach#2
ib0 Link encap:InfiniBand HWaddr
80:00:00:...
| Oct 5, 7:25 pm 2007 |
| Scott Weitzenkamp (s... | RE: [ofa-general] Problem running SDP apps using OFED 1.2
Does "lsmod | grep sdp" report SDP is loaded on both machines?
I would then use strace with the client to watch the socket system calls
happening, to make sure the client is trying to use SDP.
Scott
________________________________
From: Zulfi Imani [mailto:zulfiimani@gmail.com]
Sent: Friday, October...
| Oct 5, 9:37 pm 2007 |
| Zulfi Imani | Re: [ofa-general] Problem running SDP apps using OFED 1.2
For machine#1 my IPoIB interface is
ib0 Link encap:InfiniBand HWaddr
80:00:00:03:FE:80:00:00:00:00:00:00:00:00:00:00:00:00:00:00
inet addr:140.221.37.32 Bcast:140.221.37.255 Mask:255.255.255.0
inet6 addr: fe80::211:7500:ff:d802/64 Scope:Link
UP BROADCAST RUNNING MULTICAST MTU:65520 Metric:1
RX packe...
| Oct 5, 6:26 pm 2007 |
| Or Gerlitz | Re: [ofa-general] Problem running SDP apps using OFED 1.2
ib0 on machine#2 is not running, but it seems that your bigger problem is
lack of some essential background on TCP/IP operation, where this list is
not the best place to gain it.
Or.
| Oct 5, 6:58 pm 2007 |
| Roland Dreier | Re: [ofa-general] Re: [PATCH RFC v2] IB/ipoib: enable IGMP f...
> I understand this desire... just need a little clarification from you
> re hotplug. First, as for OFED, looking on the openibd service script
> (excerpts below) installed by OFED 1.3 I see that mode and mtu are set
> "manually", that is the user sets/provides the mode and mtu params for
> the script and the script uses sysfs to configure the device. This
> does not address devices created after the service has started nor
> seem a very elegant way to do so.
I don't kno...
| Oct 5, 6:59 pm 2007 |
| Or Gerlitz | Re: [ofa-general] Re: [PATCH RFC v2] IB/ipoib: enable IGMP f...
OK, AFAIK under both Red Hat and SLES there is a way to install pre-up and
post-down hooks for the iftools, if this is what you were referring to in
"hot-plug", then we are on the same page, thanks.
Or.
| Oct 5, 7:06 pm 2007 |
| Sean Hefty | [ofa-general] [PATCH-2.6.24 2/2 v2] [RFC] ib/cm: add basic p...
Add performance/debug counters to track sent/received messages, retries,
and duplicates. Counters are tracked per CM message type, per port.
The counters are always enabled, so intrusive state tracking is not done.
Counters are exported as:
/sys/class/infiniband_cm/device/port/counter_description/cm_attribute
for example:
/sys/class/infiniband_cm/mthca0/1/cm_tx_msgs/req
/sys/class/infiniband_cm/mthca0/1/cm_tx_retries/rep
Signed-off-by: Sean Hefty <sean.hefty@intel.com>
This m...
| Oct 5, 5:31 pm 2007 |
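The sysfs layout described in the patch above (device, port, counter group, CM attribute) could be consumed from userspace roughly as follows. This is a minimal sketch: the helper names `cm_counter_path` and `cm_read_counter` are hypothetical, and it assumes each counter file holds a single unsigned decimal value, which the posting does not state.

```c
#include <stdio.h>
#include <string.h>

/*
 * Build /sys/class/infiniband_cm/<device>/<port>/<group>/<attr>.
 * Returns 0 on success, -1 if the buffer is too small.
 */
static int cm_counter_path(char *buf, size_t len, const char *device,
                           int port, const char *group, const char *attr)
{
    int n = snprintf(buf, len, "/sys/class/infiniband_cm/%s/%d/%s/%s",
                     device, port, group, attr);

    return (n < 0 || (size_t)n >= len) ? -1 : 0;
}

/*
 * Read one counter. Assumption: the file contains one unsigned decimal
 * number. Returns 0 and stores the value, or -1 if the file cannot be
 * opened or parsed.
 */
static int cm_read_counter(const char *path, unsigned long long *value)
{
    FILE *f = fopen(path, "r");
    int ok;

    if (!f)
        return -1;
    ok = (fscanf(f, "%llu", value) == 1);
    fclose(f);
    return ok ? 0 : -1;
}
```

For the first example path in the message, `cm_counter_path(buf, sizeof(buf), "mthca0", 1, "cm_tx_msgs", "req")` fills `buf`, which can then be passed to `cm_read_counter()`. On a machine without the patch applied the read simply returns -1.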