Re: [PATCH 1/2] IPoIB: Fix unregister_netdev hang

From: Roland Dreier
Date: Tuesday, September 18, 2007 - 7:27 am

Thanks for testing on ehca...

 > While using IPoIB over EHCA (rc6 bits), unregister_netdev hangs with

I don't think you're actually using rc6 bits, since in your patch you have:

 > -poll_more:

and I think that is only in Dave's net-2.6.24 tree now, right?

 > The problem is that the poll handler does netif_rx_complete (which
 > does a dev_put) followed by netif_rx_reschedule() to schedule for
 > more receives (which again does a dev_put). This reduces refcount to
 > < 0 (depending on how many times netif_rx_complete followed by
 > netif_rx_reschedule was called).

Dave, the real problem seems to be that netif_rx_reschedule() calls
__napi_schedule() rather than __netif_rx_schedule(), so it misses the
call to dev_hold() that is needed to balance the dev_put() in
netif_rx_complete().  The current netif_rx_reschedule() looks like it
really should be napi_reschedule(), and we need a new function that
takes a netdev too.  Or am I misunderstanding the refcounting?

I'll send a patch once I've had some breakfast and had a chance to at
least compile it...

Krishna, unfortunately your proposed fix has a race:

 > -		netif_rx_complete(dev, napi);
 > -		if (unlikely(ib_req_notify_cq(priv->cq,
 > -					      IB_CQ_NEXT_COMP |
 > -					      IB_CQ_REPORT_MISSED_EVENTS)) &&
 > -		    netif_rx_reschedule(napi))
 > -			goto poll_more;
 > +		if (likely(!ib_req_notify_cq(priv->cq,
 > +					     IB_CQ_NEXT_COMP |
 > +					     IB_CQ_REPORT_MISSED_EVENTS)))

It is possible for an interrupt to arrive right here, before
netif_rx_complete() runs, so that netif_rx_schedule() gets called
while we are still on the poll list.

 > +			netif_rx_complete(dev, napi);

 - R.

Messages in current thread:
[PATCH 1/2] IPoIB: Fix unregister_netdev hang, Krishna Kumar, (Tue Sep 18, 4:18 am)
[PATCH 2/2] IPoIB: Code cleanup, Krishna Kumar, (Tue Sep 18, 4:18 am)
Re: [PATCH 1/2] IPoIB: Fix unregister_netdev hang, Roland Dreier, (Tue Sep 18, 7:27 am)
Re: [PATCH 1/2] IPoIB: Fix unregister_netdev hang, Krishna Kumar2, (Tue Sep 18, 8:23 pm)
Re: [PATCH 1/2] IPoIB: Fix unregister_netdev hang, Roland Dreier, (Tue Sep 18, 8:30 pm)
Re: [PATCH 1/2] IPoIB: Fix unregister_netdev hang, Krishna Kumar2, (Tue Sep 18, 9:24 pm)