Re: [Bug #10326] inconsistent lock state in net_rx_action

To: Linux Kernel Mailing List <linux-kernel@...>
Cc: Marcus Better <marcus@...>
Date: Tuesday, April 8, 2008 - 7:02 pm

This message has been generated automatically as a part of a report
of recent regressions.

The following bug entry is on the current list of known regressions
from 2.6.24. Please verify whether it should still be listed.

Bug-Entry : http://bugzilla.kernel.org/show_bug.cgi?id=10326
Subject : inconsistent lock state in net_rx_action
Submitter : Marcus Better <marcus@better.se>
Date : 2008-03-25 13:21 (15 days old)
Patch : http://bugzilla.kernel.org/show_bug.cgi?id=10326#c20

--

To: <rjw@...>
Cc: <linux-kernel@...>, <marcus@...>
Date: Tuesday, April 8, 2008 - 7:56 pm

From: "Rafael J. Wysocki" <rjw@sisk.pl>

This is already fixed by:

commit 50fd4407b8bfbde7c1a0bfe4f24de7df37164342
Author: David S. Miller <davem@davemloft.net>
Date: Thu Mar 27 17:42:50 2008 -0700

[NET]: Use local_irq_{save,restore}() in napi_complete().

Based upon a lockdep report.

Since ->poll() can be invoked from netpoll with interrupts
disabled, we must not unconditionally enable interrupts
in napi_complete().

Instead we must use local_irq_{save,restore}().

Noticed by Peter Zijlstra:

<irqs disabled>

  netpoll_poll()
    poll_napi()
      spin_trylock(&napi->poll_lock)
      poll_one_napi()
        napi->poll() := sky2_poll()
          napi_complete()
            local_irq_disable()
            local_irq_enable() <--- *BUG*

  <irq>
    irq_exit()
      do_softirq()
        net_rx_action()
          spin_lock(&napi->poll_lock) <--- Deadlock!

Because we still hold the lock....
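
For readers who want to see the difference in isolation, here is a rough
user-space sketch of the trace above. It is not kernel code: irqs_enabled,
the model_irq_*() helpers and irq_window_under_poll_lock() are hypothetical
stand-ins made up for illustration, and the model only tracks a single
"interrupts enabled" flag plus whether poll_lock is held.

```c
#include <stdbool.h>

/* Hypothetical model of the CPU's interrupt-enable state (not a kernel API). */
static bool irqs_enabled = true;

/* Stand-ins for local_irq_disable()/local_irq_enable(). */
static void model_irq_disable(void) { irqs_enabled = false; }
static void model_irq_enable(void)  { irqs_enabled = true; }

/* Stand-in for local_irq_save(): remember the old state, then disable. */
static unsigned long model_irq_save(void)
{
	unsigned long flags = irqs_enabled ? 1UL : 0UL;

	irqs_enabled = false;
	return flags;
}

/* Stand-in for local_irq_restore(): put back whatever state was saved. */
static void model_irq_restore(unsigned long flags)
{
	irqs_enabled = (flags != 0);
}

/* Shape of napi_complete() before the patch: always re-enables. */
static void napi_complete_old(void)
{
	model_irq_disable();
	/* __napi_complete(n) would run here */
	model_irq_enable();
}

/* Shape of napi_complete() after the patch: restores the caller's state. */
static void napi_complete_new(void)
{
	unsigned long flags = model_irq_save();

	/* __napi_complete(n) would run here */
	model_irq_restore(flags);
}

/*
 * Model the netpoll path: enter with interrupts disabled, take poll_lock,
 * call the given napi_complete() variant, and report whether interrupts
 * ended up enabled while the lock was still held. That is the window in
 * which net_rx_action() could run from an interrupt and spin on the lock.
 */
static bool irq_window_under_poll_lock(void (*complete)(void))
{
	bool poll_lock_held;
	bool window;

	irqs_enabled = false;	/* netpoll calls ->poll() with irqs off */
	poll_lock_held = true;	/* spin_trylock(&napi->poll_lock) succeeded */

	complete();

	window = poll_lock_held && irqs_enabled;

	irqs_enabled = true;	/* reset the model for the next caller */
	return window;
}
```

Under these assumptions, irq_window_under_poll_lock(napi_complete_old)
returns true and irq_window_under_poll_lock(napi_complete_new) returns
false. When the caller had interrupts enabled to begin with, the new
variant still leaves them enabled afterwards.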

Signed-off-by: David S. Miller <davem@davemloft.net>

diff --git a/include/linux/netdevice.h b/include/linux/netdevice.h
index a2f0032..fae6a7e 100644
--- a/include/linux/netdevice.h
+++ b/include/linux/netdevice.h
@@ -383,9 +383,11 @@ static inline void __napi_complete(struct napi_struct *n)

 static inline void napi_complete(struct napi_struct *n)
 {
-	local_irq_disable();
+	unsigned long flags;
+
+	local_irq_save(flags);
 	__napi_complete(n);
-	local_irq_enable();
+	local_irq_restore(flags);
 }
 
 /**
--

To: David Miller <davem@...>
Cc: <linux-kernel@...>, <marcus@...>
Date: Tuesday, April 8, 2008 - 8:00 pm

Thanks, closed.

Rafael
--