Re: [patch] net: avoid race between netpoll and network fast path

To: David Miller <davem@...>
Cc: <mpm@...>, <netdev@...>
Date: Wednesday, October 17, 2007 - 1:46 am

David Miller wrote:
	


	Isn't net_rx_action() already calling ->poll() to free the TX space?
	A full TX queue can only be emptied when the device is done
	transmitting, not because netpoll calls ->poll() on it.  The softirq
	(net_rx_action) exists to handle exactly that event.  Netconsole
	messages will be dropped if the device can't keep up, regardless of
	whether netpoll calls ->poll() or not.  If no dropping can be
	tolerated, then the netpoll upper layer should probably be redesigned
	to buffer the data.

	The poll_list currently lives in a per-CPU structure and is not
	protected globally, so a netpoll thread running on any CPU can
	trash it.



	The precise race is:
	1) net_rx_action gets the dev from the poll_list
	2) at the same time, netpoll's poll_napi() takes the poll lock,
	   calls ->poll(), and removes the dev from the poll list
	3) after it finishes, net_rx_action takes the poll lock, calls
	   ->poll() a second time, and panics when trying to remove the
	   dev from the poll list again.
	I have logged all the crash info from the crash scenes in the
	bug database.
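
The three steps above can be replayed single-threaded in a small user-space sketch. Everything here is a toy stand-in and an assumption for illustration: `toy_dev`, `toy_poll()` and `toy_replay_race()` are hypothetical names, not kernel code, and the `recheck` guard is only one possible way to avoid the second removal, not necessarily what the patch does.

```c
#include <stdbool.h>
#include <stddef.h>

/* Toy stand-in for a net device sitting on a per-CPU poll list. */
struct toy_dev {
	bool on_list;	/* true while the dev is queued on the poll list */
};

enum toy_result {
	TOY_OK,			/* poll finished cleanly */
	TOY_DOUBLE_REMOVE	/* second removal: where the real kernel panics */
};

/*
 * Toy ->poll(): finishes the work and takes the dev off the poll list.
 * With recheck set, it first checks (conceptually under the poll lock)
 * whether someone else already removed the dev, and backs off if so.
 */
static enum toy_result toy_poll(struct toy_dev *dev, bool recheck)
{
	if (!dev->on_list)
		return recheck ? TOY_OK : TOY_DOUBLE_REMOVE;

	dev->on_list = false;	/* remove the dev from the poll list */
	return TOY_OK;
}

/* Replays the interleaving from steps 1) to 3) in a fixed order. */
static enum toy_result toy_replay_race(bool recheck)
{
	struct toy_dev dev = { .on_list = true };

	/* 1) net_rx_action picks the dev from the poll list. */
	struct toy_dev *seen_by_rx_action = &dev;

	/* 2) netpoll takes the poll lock, polls, and removes the dev. */
	enum toy_result r = toy_poll(&dev, recheck);
	if (r != TOY_OK)
		return r;

	/* 3) net_rx_action gets the lock and polls its stale pointer. */
	return toy_poll(seen_by_rx_action, recheck);
}
```

Calling `toy_replay_race(false)` reproduces the double removal, while `toy_replay_race(true)` completes cleanly because the second caller notices the dev is already off the list.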

	As Matt Mackall has acknowledged, the network fast path went to great
	lengths to reduce locking overhead.  Should that be undone because of
	netpoll, if that is what it takes to fix this more correctly?

	

	

-
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Messages in current thread:
Re: [patch] net: avoid race between netpoll and network fast..., Tina Yang, (Wed Oct 17, 1:46 am)