Re: [PATCH] Fix race with shared tag queue maps

Previous thread: 2.6.23-rc4-mm1: deadlock while mmaping video device by Jiri Slaby on Thursday, September 13, 2007 - 7:43 am. (3 messages)

Next thread: [PATCH] v4l: Build error with et61x251, if V4L1_COMPAT is not selected by aherrman on Thursday, September 13, 2007 - 8:36 am. (11 messages)
To: <linux-kernel@...>, <linux-scsi@...>
Cc: <orgis@...>, <arekm@...>, <ed.lin@...>, Andrew Morton <akpm@...>, Linus Torvalds <torvalds@...>, <James.Bottomley@...>
Date: Thursday, September 13, 2007 - 8:26 am

Hi,

There's a race condition in blk_queue_end_tag() for shared tag maps,
users include stex (promise supertrak thingy) and qla2xxx. The former at
least has reported bugs in this area, not sure why we haven't seen any
for the latter. It could be because the window is narrow and that other
conditions in the qla2xxx code hide this. It's a real bug, though, as
the stex smp users can attest.

We need to ensure two things - the tag bit clearing needs to happen
AFTER we cleared the tag pointer, as the tag bit clearing/setting is
what protects this map. Secondly, we need to ensure that the visibility
of the tag pointer and tag bit clear are ordered properly.

This is for 2.6.23-rc6-current, but it needs to go into -stable as well
once Linus has committed it. I'm cc'ing users that reported stex
problems, hopefully they can test this patch and report back.

Also see http://bugzilla.kernel.org/show_bug.cgi?id=7842

Signed-off-by: Jens Axboe <jens.axboe@oracle.com>

diff --git a/block/ll_rw_blk.c b/block/ll_rw_blk.c
index a15845c..3d9e6a1 100644
--- a/block/ll_rw_blk.c
+++ b/block/ll_rw_blk.c
@@ -1075,12 +1075,6 @@ void blk_queue_end_tag(struct request_queue *q, struct request *rq)
*/
return;

- if (unlikely(!__test_and_clear_bit(tag, bqt->tag_map))) {
- printk(KERN_ERR "%s: attempt to clear non-busy tag (%d)\n",
- __FUNCTION__, tag);
- return;
- }
-
list_del_init(&rq->queuelist);
rq->cmd_flags &= ~REQ_QUEUED;
rq->tag = -1;
@@ -1090,6 +1084,23 @@ void blk_queue_end_tag(struct request_queue *q, struct request *rq)
__FUNCTION__, tag);

bqt->tag_index[tag] = NULL;
+
+ /*
+ * Ensure ordering with tag section
+ */
+ smp_mb__before_clear_bit();
+
+ if (unlikely(!test_and_clear_bit(tag, bqt->tag_map))) {
+ printk(KERN_ERR "%s: attempt to clear non-busy tag (%d)\n",
+ __FUNCTION__, tag);
+ return;
+ }
+
+ /*
+ * Ensure ordering between ->tag_index[tag] clear and tag clear
+ */
+ smp_mb__aft...

To: Jens Axboe <jens.axboe@...>
Cc: <linux-kernel@...>, <linux-scsi@...>, <orgis@...>, <ed.lin@...>, Andrew Morton <akpm@...>, Linus Torvalds <torvalds@...>, <James.Bottomley@...>
Date: Friday, September 14, 2007 - 2:19 am

I'm running 2.6.22 with f3da54ba140c6427fa4a32913e1bf406f41b5dda patch from
Linus git tree (instead of Ed Lin workaround patch) and doing heavy rsync for
last 8 hours. No oops - seems to be working fine (+stex driver).

Thanks!

--
Arkadiusz Miśkiewicz PLD/Linux Team
arekm / maven.pl http://ftp.pld-linux.org/
-

To: Jens Axboe <jens.axboe@...>
Cc: <linux-kernel@...>, <linux-scsi@...>, <orgis@...>, <arekm@...>, <ed.lin@...>, Andrew Morton <akpm@...>, <James.Bottomley@...>
Date: Thursday, September 13, 2007 - 11:16 am

You don't need the "smp_mb__before_clear_bit()" there.

The regular "clear_bit()" needs it, but the "test_and_xxx()" versions are
architecturally defined to be memory barriers, exactly because they are
regularly used for locking.

This is even documented - see Documentation/atomic_ops.txt.

Linus
-

To: Linus Torvalds <torvalds@...>
Cc: <linux-kernel@...>, <linux-scsi@...>, <orgis@...>, <arekm@...>, <ed.lin@...>, Andrew Morton <akpm@...>, <James.Bottomley@...>
Date: Thursday, September 13, 2007 - 11:22 am

My bad, I think I added the smp_mb__before_clear_bit() when it was
__test_and_set_bit() like in the first hunk.

diff --git a/block/ll_rw_blk.c b/block/ll_rw_blk.c
index a15845c..dfc9f22 100644
--- a/block/ll_rw_blk.c
+++ b/block/ll_rw_blk.c
@@ -1075,12 +1075,6 @@ void blk_queue_end_tag(struct request_queue *q, struct request *rq)
*/
return;

- if (unlikely(!__test_and_clear_bit(tag, bqt->tag_map))) {
- printk(KERN_ERR "%s: attempt to clear non-busy tag (%d)\n",
- __FUNCTION__, tag);
- return;
- }
-
list_del_init(&rq->queuelist);
rq->cmd_flags &= ~REQ_QUEUED;
rq->tag = -1;
@@ -1090,6 +1084,18 @@ void blk_queue_end_tag(struct request_queue *q, struct request *rq)
__FUNCTION__, tag);

bqt->tag_index[tag] = NULL;
+
+ if (unlikely(!test_and_clear_bit(tag, bqt->tag_map))) {
+ printk(KERN_ERR "%s: attempt to clear non-busy tag (%d)\n",
+ __FUNCTION__, tag);
+ return;
+ }
+
+ /*
+ * Ensure ordering between ->tag_index[tag] clear and tag clear
+ */
+ smp_mb__after_clear_bit();
+
bqt->busy--;
}

--
Jens Axboe

-

To: Jens Axboe <jens.axboe@...>
Cc: <linux-kernel@...>, <linux-scsi@...>, <orgis@...>, <arekm@...>, <ed.lin@...>, Andrew Morton <akpm@...>, <James.Bottomley@...>
Date: Thursday, September 13, 2007 - 11:42 am

Ahh, that wouldn't work at all. The "__test_and_set_bit()" thing isn't
atomic at all, and no amount of memory barriers around it would help
(you'd need to use real locking, but at that point the memory barriers are

You still left this one. But never mind - I already edited your original
patch and it's in my tree with both of those things removed.

Linus
-

To: Linus Torvalds <torvalds@...>
Cc: <linux-kernel@...>, <linux-scsi@...>, <orgis@...>, <arekm@...>, <ed.lin@...>, Andrew Morton <akpm@...>, <James.Bottomley@...>
Date: Friday, September 14, 2007 - 3:47 am

Hence the change, it looks like an oversight from when we didn't allow

I took at look at the committed patch and it looks fine, thanks Linus.

--
Jens Axboe

-

Previous thread: 2.6.23-rc4-mm1: deadlock while mmaping video device by Jiri Slaby on Thursday, September 13, 2007 - 7:43 am. (3 messages)

Next thread: [PATCH] v4l: Build error with et61x251, if V4L1_COMPAT is not selected by aherrman on Thursday, September 13, 2007 - 8:36 am. (11 messages)