Re: [PATCH update] ieee1394: sbp2: enforce s/g segment size limit

Previous thread: Regression in 2.6.27-rc1 for set_cpus_allowed_ptr by Langsdorf, Mark on Friday, August 8, 2008 - 12:44 pm. (13 messages)

Next thread: [PATCH -tip] x86: apic interrupts - move assignments to irqinit_32.c by Cyrill Gorcunov on Friday, August 8, 2008 - 12:50 pm. (4 messages)
From: Stefan Richter
Date: Friday, August 8, 2008 - 12:44 pm

Hi all,

the block layer usually tries to merge s/g segments if consecutive 
segments combined fit into the queue's max_segment_size.  When such a 
scatter gather list is DMA-mapped, can it happen that an IOMMU collapses 
the elements even further, so that sg_dma_len() of a DMA-mapped s/g 
segment exceeds max_segment_size?

As I understood some discussions in the past, this could indeed happen, 
which is a nuisance.  But I may have misunderstood something, or 
something may have changed in the meantime...
-- 
Stefan Richter
-=====-==--- =--- -=---
http://arcgraph.de/sr/
--

From: Grant Grundler
Date: Friday, August 8, 2008 - 1:25 pm

On Fri, Aug 8, 2008 at 12:44 PM, Stefan Richter

I don't see how. The IOMMU code only collapses the "physical" mappings and
does not add new elements to the SG list. ergo sg_dma_len() shouldn't change.

--

From: Stefan Richter
Date: Friday, August 8, 2008 - 2:21 pm

Well, I'm just doing my homework and am tracking down the various 
dma_map_sg implementations.

Here is what PPC does:
http://lxr.linux.no/linux+v2.6.26/arch/powerpc/kernel/iommu.c#L270

It looks at dma_get_max_seg_size(dev); and then merges according to it.

That's all nice and well, but in my case (FireWire storage protocol 
a.k.a. SBP-2, which is basically remote DMA), the max_segment_size of 
the PCI device is different from the size limit of the protocol.  We 
currently have to deconstruct such merges in the sbp2 drivers again:
http://lxr.linux.no/linux+v2.6.26/drivers/firewire/fw-sbp2.c#L1384

Either I keep it that way, or I let the protocol driver manipulate the 
FireWire controller's dev->dma_parms->max_segment_size (via 
dma_set_max_seg_size() of course), which is not entirely correct.

I first wanted to use blk_queue_max_segment_size(), but that falls short 
with dma_map_sg implementations like the above are at work.
-- 
Stefan Richter
-=====-==--- =--- -=---
http://arcgraph.de/sr/
--

From: FUJITA Tomonori
Date: Friday, August 8, 2008 - 2:31 pm

On Fri, 08 Aug 2008 23:21:28 +0200


Why is it not correct? Device drivers can set

Yeah, we store the same value at two places (request_queue and device)
though it's not nice.
--

From: Stefan Richter
Date: Friday, August 8, 2008 - 2:58 pm

FireWire is a multi-protocol bus.  SBP-2 is just one of many quite 
different protocols.  SBP-2 targets read or write the initiators memory 
buffers via remote DMA.  These buffer may be exposed as s/g lists to the 
target.  The protocol limits these s/g lists to up to 65535 elements of 
up to 65535 bytes size each.

FireWire controllers on the other hand get their maximum segment size 
set to 65536 by the PCI subsystem.  (All FireWire controllers supported 
by mainline Linux are PCI or PCIe devices.)

In case of the drivers/firewire/ stack, the SBP-2 driver is currently 
the only one which uses dma_map_sg.  In case of the drivers/ieee1394/ 
stack, also the drivers for isochronous protocols, including userspace 
drivers via raw1394, use dma_map_sg.

So if the SBP-2 driver manipulated the controller device's 
max_segment_size, it would influence the DMA mappings of the other 
protocols.  It wouldn't be a big deal; the isochronous mappings could 
only be collapsed to chunks of at most 15 pages instead of 16 pages. 
However, the mapping deconstructing code in the SBP-2 drivers is not a 
big deal either.
-- 
Stefan Richter
-=====-==--- =--- -=---
http://arcgraph.de/sr/
--

From: FUJITA Tomonori
Date: Friday, August 8, 2008 - 3:17 pm

On Fri, 08 Aug 2008 23:58:23 +0200


IMHO, setting the safest value to max_segment_size is fine. After all,
--

From: Stefan Richter
Date: Saturday, August 9, 2008 - 11:20 am

1. We don't need to round the SBP-2 segment size limit down to a
   multiple of 4 kB (0xffff -> 0xf000).  It is only necessary to
   ensure quadlet alignment (0xffff -> 0xfffc).

2. Use dma_set_max_seg_size() to tell the DMA mapping infrastructure
   and the block IO layer about the restriction.  This way we can
   remove the size checks and segment splitting in the queuecommand
   path.

   This assumes that no other code in the ieee1394 stack uses
   dma_map_sg() with conflicting requirements.  It furthermore assumes
   that the controller device's platform actually allows us to set the
   segment size to our liking.  Assert the latter with a BUG_ON().

3. Also use blk_queue_max_segment_size() to tell the block IO layer
   about it.  It cannot know it because our scsi_add_host() does not
   point to the FireWire controller's device.

We can also uniformly use dma_map_sg() for the single segment case just
like for the multi segment case, to further simplify the code.

Also clean up how the page table is converted to big endian.

Signed-off-by: Stefan Richter <stefanr@s5r6.in-berlin.de>
---

Applicable after
   [PATCH update] ieee1394: sbp2: stricter dma_sync
   [PATCH] ieee1394: sbp2: check for DMA mapping failures
from a few minutes ago.

 drivers/ieee1394/sbp2.c |  106 +++++++++++++---------------------------
 drivers/ieee1394/sbp2.h |   37 +++++--------
 2 files changed, 52 insertions(+), 91 deletions(-)

Index: linux-2.6.27-rc2/drivers/ieee1394/sbp2.h
===================================================================
--- linux-2.6.27-rc2.orig/drivers/ieee1394/sbp2.h
+++ linux-2.6.27-rc2/drivers/ieee1394/sbp2.h
@@ -139,13 +139,10 @@ struct sbp2_logout_orb {
 	u32 status_fifo_lo;
 } __attribute__((packed));
 
-#define PAGE_TABLE_SET_SEGMENT_BASE_HI(v)	((v) & 0xffff)
-#define PAGE_TABLE_SET_SEGMENT_LENGTH(v)	(((v) & 0xffff) << 16)
-
 struct sbp2_unrestricted_page_table {
-	u32 length_segment_base_hi;
-	u32 segment_base_lo;
-} ...
From: Stefan Richter
Date: Saturday, August 9, 2008 - 11:21 am

1. We don't need to round the SBP-2 segment size limit down to a
   multiple of 4 kB (0xffff -> 0xf000).  It is only necessary to
   ensure quadlet alignment (0xffff -> 0xfffc).

2. Use dma_set_max_seg_size() to tell the DMA mapping infrastructure
   and the block IO layer about the restriction.  This way we can
   remove the size checks and segment splitting in the queuecommand
   path.

   This assumes that no other code in the firewire stack uses
   dma_map_sg() with conflicting requirements.  It furthermore assumes
   that the controller device's platform actually allows us to set the
   segment size to our liking.  Assert the latter with a BUG_ON().

3. Also use blk_queue_max_segment_size() to tell the block IO layer
   about it.  It cannot know it because our scsi_add_host() does not
   point to the FireWire controller's device.

Signed-off-by: Stefan Richter <stefanr@s5r6.in-berlin.de>
---
 drivers/firewire/fw-sbp2.c |   68 +++++++++++++++++--------------------
 1 file changed, 33 insertions(+), 35 deletions(-)

Index: linux-2.6.27-rc2/drivers/firewire/fw-sbp2.c
===================================================================
--- linux-2.6.27-rc2.orig/drivers/firewire/fw-sbp2.c
+++ linux-2.6.27-rc2/drivers/firewire/fw-sbp2.c
@@ -29,6 +29,7 @@
  */
 
 #include <linux/blkdev.h>
+#include <linux/bug.h>
 #include <linux/delay.h>
 #include <linux/device.h>
 #include <linux/dma-mapping.h>
@@ -181,10 +182,16 @@ struct sbp2_target {
 #define SBP2_MAX_LOGIN_ORB_TIMEOUT	40000U	/* Timeout in ms */
 #define SBP2_ORB_TIMEOUT		2000U	/* Timeout in ms */
 #define SBP2_ORB_NULL			0x80000000
-#define SBP2_MAX_SG_ELEMENT_LENGTH	0xf000
 #define SBP2_RETRY_LIMIT		0xf		/* 15 retries */
 #define SBP2_CYCLE_LIMIT		(0xc8 << 12)	/* 200 125us cycles */
 
+/*
+ * The default maximum s/g segment size of a FireWire controller is
+ * usually 0x10000, but SBP-2 only allows 0xffff. Since buffers have to
+ * be quadlet-aligned, we set the length limit to 0xffff & ~3.
+ */
+#define ...
From: Grant Grundler
Date: Monday, August 11, 2008 - 12:52 pm

On Sat, Aug 9, 2008 at 11:21 AM, Stefan Richter

I didn't check the rest of the driver - but it would be good if it
explicitly called dma_set_mask() or  pci_dma_set_mask() with a 32-bit
mask value. Most drivers assume 32-bit and that's why I point this
out.

The rest looks ok at first glance.

thanks,
--

From: Stefan Richter
Date: Wednesday, August 13, 2008 - 2:38 am

I thought about this and currently think that I should not do this. 
There is the API to set the mask, but there is no API to _decrease_ the 
mask only.  The SBP-2 protocol driver should not set DMA_32BIT_MASK if 
any other driver already set for example DMA_31BIT_MASK for whatever reason.

For now, this is no issue.  FireWire low-level drivers exist only for 
controllers with 32 bit local bus addressing capability.  I am sure that 
somebody will remember to modify the SBP-2 driver(s) if there will ever 
be a controller type and a driver for it which can address more than 4 GB.

On the other hand, we also currently have no reason to set a smaller 
mask.  We recently discovered a chip bug which requires DMA_31BIT_MASK 
for a special mode of data reception (only for cache-coherent DMA 
though), but we now work around this chip bug by simply falling back to 
an alternative mode.
-- 
Stefan Richter
-=====-==--- =--- -==-=
http://arcgraph.de/sr/
--

From: Grant Grundler
Date: Tuesday, August 12, 2008 - 10:04 am

On Sat, Aug 9, 2008 at 11:20 AM, Stefan Richter

While this code will probably work correctly, I believe if the
sg_dma_len() returns 0, one has walked off the end of the
"bus address" list.

pci_map_sg() returns how many DMA addr/len pairs are used
and that count could (should?) be used when pulling the
dma_len/addr pairs out of the sg list.

Effectively, the SG list is really two lists with seperate counts:
the original list with virtual addresses/len/count and the "DMA mapped"
list with it's own count of addresses/len pairs. When no IOMMU
is used those lists are 1:1.

hth,
--

From: FUJITA Tomonori
Date: Tuesday, August 12, 2008 - 4:44 pm

On Tue, 12 Aug 2008 10:04:14 -0700

Yeah, from a quick look, seems that this patch wrongly handles
sg_count.

This patch sets scsi_sg_count(sc) to sg_count, right? for_each_sg is
expected to be used with a return value of pci_map_sg.

Then this patch can simply do something like:

for_each_sg(sg, sg, sg_count, i) {
        pt[i].high = cpu_to_be32(sg_dma_len(sg) << 16);
        pt[i].low = cpu_to_be32(sg_dma_address(sg));
}
--

From: Stefan Richter
Date: Wednesday, August 13, 2008 - 3:19 am

Well, I keep the original scsi_sg_count to call dma_unmap_sg with it
later.  According to Corbet/ Rubini/ Kroah-Hartman: LDD3, dma_unmap_sg
is to be called with the number of s/g elements before DMA-mapping.
From a quick look at some dma_unmap_sg implementations, this seems still

Thanks a lot for looking at this; here is my patch fixed.

 
From: Stefan Richter <stefanr@s5r6.in-berlin.de>
Subject: ieee1394: sbp2: enforce s/g segment size limit

1. We don't need to round the SBP-2 segment size limit down to a
   multiple of 4 kB (0xffff -> 0xf000).  It is only necessary to
   ensure quadlet alignment (0xffff -> 0xfffc).

2. Use dma_set_max_seg_size() to tell the DMA mapping infrastructure
   and the block IO layer about the restriction.  This way we can
   remove the size checks and segment splitting in the queuecommand
   path.

   This assumes that no other code in the ieee1394 stack uses
   dma_map_sg() with conflicting requirements.  It furthermore assumes
   that the controller device's platform actually allows us to set the
   segment size to our liking.  Assert the latter with a BUG_ON().

3. Also use blk_queue_max_segment_size() to tell the block IO layer
   about it.  It cannot know it because our scsi_add_host() does not
   point to the FireWire controller's device.

We can also uniformly use dma_map_sg() for the single segment case just
like for the multi segment case, to further simplify the code.

Also clean up how the page table is converted to big endian.

Thanks to Grant Grundler and FUJITA Tomonori for advice.

Signed-off-by: Stefan Richter <stefanr@s5r6.in-berlin.de>
---
 drivers/ieee1394/sbp2.c |  100 ++++++++++++----------------------------
 drivers/ieee1394/sbp2.h |   37 ++++++--------
 2 files changed, 46 insertions(+), 91 deletions(-)

Index: linux/drivers/ieee1394/sbp2.h
===================================================================
--- linux.orig/drivers/ieee1394/sbp2.h
+++ linux/drivers/ieee1394/sbp2.h
@@ -139,13 +139,10 @@ struct ...
From: Stefan Richter
Date: Wednesday, August 13, 2008 - 3:20 am

1. We don't need to round the SBP-2 segment size limit down to a
   multiple of 4 kB (0xffff -> 0xf000).  It is only necessary to
   ensure quadlet alignment (0xffff -> 0xfffc).

2. Use dma_set_max_seg_size() to tell the DMA mapping infrastructure
   and the block IO layer about the restriction.  This way we can
   remove the size checks and segment splitting in the queuecommand
   path.

   This assumes that no other code in the firewire stack uses
   dma_map_sg() with conflicting requirements.  It furthermore assumes
   that the controller device's platform actually allows us to set the
   segment size to our liking.  Assert the latter with a BUG_ON().

3. Also use blk_queue_max_segment_size() to tell the block IO layer
   about it.  It cannot know it because our scsi_add_host() does not
   point to the FireWire controller's device.

Thanks to Grant Grundler and FUJITA Tomonori for advice.

Signed-off-by: Stefan Richter <stefanr@s5r6.in-berlin.de>
---
 drivers/firewire/fw-sbp2.c |   64 ++++++++++++++++---------------------
 1 file changed, 28 insertions(+), 36 deletions(-)

Index: linux/drivers/firewire/fw-sbp2.c
===================================================================
--- linux.orig/drivers/firewire/fw-sbp2.c
+++ linux/drivers/firewire/fw-sbp2.c
@@ -29,6 +29,7 @@
  */
 
 #include <linux/blkdev.h>
+#include <linux/bug.h>
 #include <linux/delay.h>
 #include <linux/device.h>
 #include <linux/dma-mapping.h>
@@ -181,10 +182,16 @@ struct sbp2_target {
 #define SBP2_MAX_LOGIN_ORB_TIMEOUT	40000U	/* Timeout in ms */
 #define SBP2_ORB_TIMEOUT		2000U	/* Timeout in ms */
 #define SBP2_ORB_NULL			0x80000000
-#define SBP2_MAX_SG_ELEMENT_LENGTH	0xf000
 #define SBP2_RETRY_LIMIT		0xf		/* 15 retries */
 #define SBP2_CYCLE_LIMIT		(0xc8 << 12)	/* 200 125us cycles */
 
+/*
+ * The default maximum s/g segment size of a FireWire controller is
+ * usually 0x10000, but SBP-2 only allows 0xffff. Since buffers have to
+ * be quadlet-aligned, we set the length limit to 0xffff ...
From: Stefan Richter
Date: Wednesday, August 13, 2008 - 3:27 am

PS:  That's the 2.6.26 version; the 2.6.27 patch has an additional 
argument here, if anybody cares...
-- 
Stefan Richter
-=====-==--- =--- -==-=
http://arcgraph.de/sr/
--

From: FUJITA Tomonori
Date: Wednesday, August 13, 2008 - 5:55 pm

On Wed, 13 Aug 2008 12:19:59 +0200 (CEST)

You need to keep it? You can access to it via scsi_sg_count when

Are you talking about?

=
To unmap a scatterlist, just call:

	pci_unmap_sg(pdev, sglist, nents, direction);

Again, make sure DMA activity has already finished.

PLEASE NOTE:  The 'nents' argument to the pci_unmap_sg call must be
              the _same_ one you passed into the pci_map_sg call,
	      it should _NOT_ be the 'count' value _returned_ from the
              pci_map_sg call.



I think that all the IOMMUs are fine even if you call dma_unmap_sg
with a 'nents' value that dma_map_sg returned. But, well, it's against
the rule.
--

From: Stefan Richter
Date: Thursday, August 14, 2008 - 12:12 am

Yes, I can easily do it this way; fixed below.


Right.  How did I forget about that?



From: Stefan Richter <stefanr@s5r6.in-berlin.de>
Subject: ieee1394: sbp2: remove redundant struct members

Signed-off-by: Stefan Richter <stefanr@s5r6.in-berlin.de>
---
 drivers/ieee1394/sbp2.c |   14 +++++---------
 drivers/ieee1394/sbp2.h |    4 ----
 2 files changed, 5 insertions(+), 13 deletions(-)

Index: linux/drivers/ieee1394/sbp2.c
===================================================================
--- linux.orig/drivers/ieee1394/sbp2.c
+++ linux/drivers/ieee1394/sbp2.c
@@ -656,11 +656,11 @@ static struct sbp2_command_info *sbp2uti
 static void sbp2util_mark_command_completed(struct sbp2_lu *lu,
 					    struct sbp2_command_info *cmd)
 {
-	if (cmd->sg_buffer) {
-		dma_unmap_sg(lu->ud->ne->host->device.parent, cmd->sg_buffer,
-			     cmd->sg_count, cmd->sg_dir);
-		cmd->sg_buffer = NULL;
-	}
+	if (scsi_sg_count(cmd->Current_SCpnt) > 0)
+		dma_unmap_sg(lu->ud->ne->host->device.parent,
+			     scsi_sglist(cmd->Current_SCpnt),
+			     scsi_sg_count(cmd->Current_SCpnt),
+			     cmd->Current_SCpnt->sc_data_direction);
 	list_move_tail(&cmd->list, &lu->cmd_orb_completed);
 }
 
@@ -1505,10 +1505,6 @@ static int sbp2_prep_command_orb_sg(stru
 	if (n == 0)
 		return -ENOMEM;
 
-	cmd->sg_buffer = sg;
-	cmd->sg_count = sg_count;
-	cmd->sg_dir = dma_dir;
-
 	orb->data_descriptor_hi = ORB_SET_NODE_ID(hi->host->node_id);
 	orb->misc |= ORB_SET_DIRECTION(orb_direction);
 
Index: linux/drivers/ieee1394/sbp2.h
===================================================================
--- linux.orig/drivers/ieee1394/sbp2.h
+++ linux/drivers/ieee1394/sbp2.h
@@ -252,10 +252,6 @@ struct sbp2_command_info {
 	struct sbp2_unrestricted_page_table
 		scatter_gather_element[SG_ALL] __attribute__((aligned(8)));
 	dma_addr_t sge_dma;
-
-	struct scatterlist *sg_buffer;
-	int sg_count;
-	enum dma_data_direction sg_dir;
 };
 
 /* Per FireWire host */


-- 
Stefan ...
From: FUJITA Tomonori
Date: Thursday, August 14, 2008 - 12:21 am

On Thu, 14 Aug 2008 09:12:01 +0200 (CEST)

static inline unsigned scsi_sg_count(struct scsi_cmnd *cmd)


--

Previous thread: Regression in 2.6.27-rc1 for set_cpus_allowed_ptr by Langsdorf, Mark on Friday, August 8, 2008 - 12:44 pm. (13 messages)

Next thread: [PATCH -tip] x86: apic interrupts - move assignments to irqinit_32.c by Cyrill Gorcunov on Friday, August 8, 2008 - 12:50 pm. (4 messages)