Re: [PATCH] BUG: ll_merge_requests_fn() updates req->nr_phys_segments wrongly

Previous thread: [PATCH] - Add early detection of UV system types by Jack Steiner on Tuesday, September 23, 2008 - 11:28 am. (6 messages)

Next thread: Interrupt handler latency and Interrupt handling issues by Singaravelan Nallasellan on Tuesday, September 23, 2008 - 11:31 am. (1 message)
From: Nikanth Karthikesan
Date: Tuesday, September 23, 2008 - 11:28 am

From: Nikanth Karthikesan <knikanth@suse.de>

ll_merge_requests_fn() decrements req->nr_phys_segments if
blk_phys_contig_segment() returns true, but it is perfectly possible that
blk_hw_contig_segment() is false. A new hw_segment implies a new phys_segment,
so decrementing nr_phys_segments here is wrong and triggers the BUG_ON() in
scsi_init_sgtable(), as blk_rq_map_sg() would map one more phys_segment than
req->nr_phys_segments. This is easily reproducible. Other callers of
blk_rq_map_sg() should also be affected by this bug.

Signed-off-by: Nikanth Karthikesan <knikanth@suse.de>
---

 block/blk-merge.c |   10 ++++------
 1 files changed, 4 insertions(+), 6 deletions(-)

diff --git a/block/blk-merge.c b/block/blk-merge.c
index 5efc9e7..6e6d04b 100644
--- a/block/blk-merge.c
+++ b/block/blk-merge.c
@@ -392,11 +392,6 @@ static int ll_merge_requests_fn(struct request_queue *q, struct request *req,
 		return 0;
 
 	total_phys_segments = req->nr_phys_segments + next->nr_phys_segments;
-	if (blk_phys_contig_segment(q, req->biotail, next->bio))
-		total_phys_segments--;
-
-	if (total_phys_segments > q->max_phys_segments)
-		return 0;
 
 	total_hw_segments = req->nr_hw_segments + next->nr_hw_segments;
 	if (blk_hw_contig_segment(q, req->biotail, next->bio)) {
@@ -410,9 +405,12 @@ static int ll_merge_requests_fn(struct request_queue *q, struct request *req,
 		if (next->nr_hw_segments == 1)
 			next->biotail->bi_hw_back_size = len;
 		total_hw_segments--;
+		if (blk_phys_contig_segment(q, req->biotail, next->bio))
+			total_phys_segments--;
 	}
 
-	if (total_hw_segments > q->max_hw_segments)
+	if (total_phys_segments > q->max_phys_segments ||
+	    total_hw_segments > q->max_hw_segments)
 		return 0;
 
 	/* Merge is OK... */
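
To make the accounting change in the diff above concrete, here is a minimal userspace sketch of the two counting rules. It is not kernel code: struct seg_counts, merged_before() and merged_after() are hypothetical names, the two booleans stand in for the results of blk_phys_contig_segment() and blk_hw_contig_segment(), and the max_phys_segments/max_hw_segments limit checks and the bi_hw_front_size/bi_hw_back_size updates are left out.

```c
#include <stdbool.h>

/* Toy stand-in for the two per-request counters the patch touches.
 * This is an illustrative type, not the kernel's struct request. */
struct seg_counts {
	int phys;	/* models nr_phys_segments */
	int hw;		/* models nr_hw_segments */
};

/* Counting rule before the patch: the physical and hardware
 * contiguity checks adjust their counters independently. */
static struct seg_counts merged_before(struct seg_counts req,
				       struct seg_counts next,
				       bool phys_contig, bool hw_contig)
{
	struct seg_counts total = { req.phys + next.phys, req.hw + next.hw };

	if (phys_contig)
		total.phys--;
	if (hw_contig)
		total.hw--;
	return total;
}

/* Counting rule after the patch: the physical count is only
 * decremented inside the hardware-contiguous branch. */
static struct seg_counts merged_after(struct seg_counts req,
				      struct seg_counts next,
				      bool phys_contig, bool hw_contig)
{
	struct seg_counts total = { req.phys + next.phys, req.hw + next.hw };

	if (hw_contig) {
		total.hw--;
		if (phys_contig)
			total.phys--;
	}
	return total;
}
```

With the hardware check modelled as false, merged_after() never decrements the physical count, whatever the physical check returns, while merged_before() still does.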

--

From: FUJITA Tomonori
Date: Thursday, September 25, 2008 - 12:35 am

On Tue, 23 Sep 2008 23:58:02 +0530

Yeah, in fact, blk_hw_contig_segment() is always false on the majority
of architectures (it can be true only on PARISC and Alpha).

Your patch doesn't look correct. Virtually, the patch always disables

I have no idea how BUG_ON() in scsi_init_sgtable() is triggered.

Can you give more information: HBA, IOMMU (if you use one), and the values
of req->nr_phys_segments, req->nr_hw_segments, count, etc. in
scsi_init_sgtable() when you hit the bug?


BTW, blk_hw_contig_segment() will be removed for 2.6.28 (virtual
merging accounting in the block layer will be removed).
--

From: Jens Axboe
Date: Thursday, September 25, 2008 - 6:55 am

This is totally broken. I gather you think the bug is fixed since
the counts always match up. Well, that's mainly because you now don't do

Hmm?

I think you should try and describe the problem you are seeing instead,
then we can perhaps find the real issue.

-- 
Jens Axboe

--
