Re: [PATCH -mm 0/11] fix iommu sg merging problem

Previous thread: none

Next thread: none
To: <akpm@...>
Cc: <linux-kernel@...>, <linux-scsi@...>, <James.Bottomley@...>, <jens.axboe@...>, <greg@...>, <jeff@...>, <muli@...>, <paulus@...>, <tony.luck@...>, <davem@...>, <kyle@...>, <aacraid@...>, <fujita.tomonori@...>
Date: Wednesday, October 24, 2007 - 6:47 am

IOMMUs merges scatter/gather segments without considering a low level
driver's restrictions. The problem is that IOMMUs can't access to the
limitations because they are in request_queue.

This patchset introduces a new structure, device_dma_parameters,
including dma information. A pointer to device_dma_parameters is added
to struct device. The bus specific structures (like pci_dev) includes
device_dma_parameters. Low level drivers can use dma_set_max_seg_size
to tell IOMMUs about the restrictions.

We can move more dma stuff in struct device (like dma_mask) to struct
device_dma_parameters later (needs some cleanups before that).

This includes patches for all the IOMMUs that could merge sg (x86_64,
ppc, IA64, alpha, sparc64, and parisc) though only the ppc patch was
tested. The patches for other IOMMUs are only compile tested.

Thanks to everyone for the comments on the previous submission
to linux-scsi.

This is against 2.6.24-rc1. The same patchset is also available:

git://git.kernel.org/pub/scm/linux/kernel/git/tomo/linux-2.6-misc.git iommu-sg-fixes
-

To: FUJITA Tomonori <tomof@...>
Cc: <akpm@...>, <linux-kernel@...>, <linux-scsi@...>, <James.Bottomley@...>, <greg@...>, <jeff@...>, <muli@...>, <paulus@...>, <tony.luck@...>, <davem@...>, <kyle@...>, <aacraid@...>, <fujita.tomonori@...>
Date: Wednesday, October 24, 2007 - 9:24 am

Looks good to me, I think we should get this included sooner rather than
later.

--
Jens Axboe

-

To: FUJITA Tomonori <tomof@...>
Cc: <akpm@...>, <linux-kernel@...>, <linux-scsi@...>, <James.Bottomley@...>, <jens.axboe@...>, <greg@...>, <muli@...>, <paulus@...>, <tony.luck@...>, <davem@...>, <kyle@...>, <aacraid@...>, <fujita.tomonori@...>
Date: Wednesday, October 24, 2007 - 7:40 am

Are you interested in collecting patches to libata that eliminate the
hand-rolled S/G splitting? e.g. now ata_fill_sg() and mv_fill_sg() can
be made much more efficient.

Again, thanks for doing this! It's been needed for a long time.

Jeff

-

To: <jeff@...>
Cc: <tomof@...>, <akpm@...>, <linux-kernel@...>, <linux-scsi@...>, <James.Bottomley@...>, <jens.axboe@...>, <greg@...>, <muli@...>, <paulus@...>, <tony.luck@...>, <davem@...>, <kyle@...>, <aacraid@...>, <fujita.tomonori@...>
Date: Wednesday, October 24, 2007 - 10:32 am

On Wed, 24 Oct 2007 07:40:50 -0400

Yeah, but ATA has the segment boundary limit too. It's much more
tricky than the segment size limit. It makes the IOMMU free space

My pleasure.
-

To: <akpm@...>
Cc: <linux-kernel@...>, <linux-scsi@...>, <James.Bottomley@...>, <jens.axboe@...>, <greg@...>, <jeff@...>, <fujita.tomonori@...>
Date: Wednesday, October 24, 2007 - 6:48 am

iommu code merges scatter/gather segments without considering a low
level driver's restrictions. The problem is that iommu code can't
access to the limitations because they are in request_queue.

This patch adds a new structure, device_dma_parameters, including dma
information. A pointer to device_dma_parameters is added to struct
device.

- there are only max_segment_size and segment_boundary_mask there but
we'll move more dma stuff in struct device (like dma_mask) to struct
device_dma_parameters later. segment_boundary_mask is not supported
yet.

- new accessors for the dma parameters are added. So we can easily
change where to place struct device_dma_parameters in the future.

- dma_get_max_seg_size returns 64K if dma_parms in struct device isn't
set up properly. 64K is the default max_segment_size in the block
layer.

Signed-off-by: FUJITA Tomonori <fujita.tomonori@lab.ntt.co.jp>
---
include/linux/device.h | 11 +++++++++++
include/linux/dma-mapping.h | 15 +++++++++++++++
2 files changed, 26 insertions(+), 0 deletions(-)

diff --git a/include/linux/device.h b/include/linux/device.h
index 2e15822..9eb883f 100644
--- a/include/linux/device.h
+++ b/include/linux/device.h
@@ -397,6 +397,15 @@ extern int devres_release_group(struct device *dev, void *id);
extern void *devm_kzalloc(struct device *dev, size_t size, gfp_t gfp);
extern void devm_kfree(struct device *dev, void *p);

+struct device_dma_parameters {
+ /*
+ * a low level driver may set these to teach IOMMU code about
+ * sg limitations.
+ */
+ unsigned int max_segment_size;
+ unsigned long segment_boundary_mask;
+};
+
struct device {
struct klist klist_children;
struct klist_node knode_parent; /* node in sibling list */
@@ -432,6 +441,8 @@ struct device {
64 bit addresses for consistent
allocations such descriptors. */

+ struct device_dma_parameters *dma_parms;
+
struct list_head dma_pools; /* dma pools (if dma'ble) */

struct dma_coherent_mem ...

To: FUJITA Tomonori <tomof@...>
Cc: <akpm@...>, <linux-kernel@...>, <linux-scsi@...>, <James.Bottomley@...>, <jens.axboe@...>, <greg@...>, <fujita.tomonori@...>
Date: Wednesday, October 24, 2007 - 7:33 am

ACK

-

To: <akpm@...>
Cc: <linux-kernel@...>, <linux-scsi@...>, <greg@...>, <James.Bottomley@...>, <jeff@...>, <fujita.tomonori@...>
Date: Wednesday, October 24, 2007 - 6:48 am

This adds struct device_dma_parameters in struct pci_dev and properly
sets up a pointer in struct device.

The default max_segment_size is set to 64K, same to the block layer's
default value.

Signed-off-by: FUJITA Tomonori <fujita.tomonori@lab.ntt.co.jp>
---
drivers/pci/pci.c | 8 ++++++++
drivers/pci/probe.c | 3 +++
include/linux/pci.h | 4 ++++
3 files changed, 15 insertions(+), 0 deletions(-)

diff --git a/drivers/pci/pci.c b/drivers/pci/pci.c
index 71d561f..de97c2b 100644
--- a/drivers/pci/pci.c
+++ b/drivers/pci/pci.c
@@ -1397,6 +1397,14 @@ pci_set_consistent_dma_mask(struct pci_dev *dev, u64 mask)
}
#endif

+#ifndef HAVE_ARCH_PCI_SET_DMA_MAX_SEGMENT_SIZE
+int pci_set_dma_max_seg_size(struct pci_dev *dev, unsigned int size)
+{
+ return dma_set_max_seg_size(&dev->dev, size);
+}
+EXPORT_SYMBOL(pci_set_dma_max_seg_size);
+#endif
+
/**
* pcix_get_max_mmrbc - get PCI-X maximum designed memory read byte count
* @dev: PCI device to query
diff --git a/drivers/pci/probe.c b/drivers/pci/probe.c
index 463a5a9..54edea2 100644
--- a/drivers/pci/probe.c
+++ b/drivers/pci/probe.c
@@ -985,8 +985,11 @@ void pci_device_add(struct pci_dev *dev, struct pci_bus *bus)

set_dev_node(&dev->dev, pcibus_to_node(bus));
dev->dev.dma_mask = &dev->dma_mask;
+ dev->dev.dma_parms = &dev->dma_parms;
dev->dev.coherent_dma_mask = 0xffffffffull;

+ pci_set_dma_max_seg_size(dev, 65536);
+
/* Fix up broken headers */
pci_fixup_device(pci_fixup_header, dev);

diff --git a/include/linux/pci.h b/include/linux/pci.h
index 5d2281f..f8cba3b 100644
--- a/include/linux/pci.h
+++ b/include/linux/pci.h
@@ -152,6 +152,8 @@ struct pci_dev {
this if your device has broken DMA
or supports 64-bit transfers. */

+ struct device_dma_parameters dma_parms;
+
pci_power_t current_state; /* Current operating state. In ACPI-speak,
this is D0-D3, D0 being fully functional,
and D3 being off...

To: FUJITA Tomonori <tomof@...>
Cc: <akpm@...>, <linux-kernel@...>, <linux-scsi@...>, <greg@...>, <James.Bottomley@...>, <fujita.tomonori@...>
Date: Wednesday, October 24, 2007 - 7:34 am

this should check the return value of pci_set_dma_max_seg_size(), and do
something useful.

ACK everything else

-

To: <jeff@...>
Cc: <tomof@...>, <akpm@...>, <linux-kernel@...>, <linux-scsi@...>, <greg@...>, <James.Bottomley@...>, <fujita.tomonori@...>
Date: Wednesday, October 24, 2007 - 9:41 am

On Wed, 24 Oct 2007 07:34:07 -0400

Thanks.

I wasn't sure about what to do. Should pci_device_add fail in case of
pci_set_dma_max_seg_size failure or just should we print warning?
-

To: <akpm@...>
Cc: <linux-kernel@...>, <linux-scsi@...>, <muli@...>, <fujita.tomonori@...>
Date: Wednesday, October 24, 2007 - 6:48 am

This patch makes pci-gart iommu respect segment size limits when
merging sg lists.

Signed-off-by: FUJITA Tomonori <fujita.tomonori@lab.ntt.co.jp>
---
arch/x86/kernel/pci-gart_64.c | 7 +++++++
1 files changed, 7 insertions(+), 0 deletions(-)

diff --git a/arch/x86/kernel/pci-gart_64.c b/arch/x86/kernel/pci-gart_64.c
index c56e9ee..dfe3828 100644
--- a/arch/x86/kernel/pci-gart_64.c
+++ b/arch/x86/kernel/pci-gart_64.c
@@ -384,6 +384,8 @@ static int gart_map_sg(struct device *dev, struct scatterlist *sg, int nents,
int start;
unsigned long pages = 0;
int need = 0, nextneed;
+ unsigned int seg_size;
+ unsigned int max_seg_size;
struct scatterlist *s, *ps, *start_sg, *sgmap;

if (nents == 0)
@@ -395,6 +397,8 @@ static int gart_map_sg(struct device *dev, struct scatterlist *sg, int nents,
out = 0;
start = 0;
start_sg = sgmap = sg;
+ seg_size = 0;
+ max_seg_size = dma_get_max_seg_size(dev);
ps = NULL; /* shut up gcc */
for_each_sg(sg, s, nents, i) {
dma_addr_t addr = sg_phys(s);
@@ -408,11 +412,13 @@ static int gart_map_sg(struct device *dev, struct scatterlist *sg, int nents,
/* Can only merge when the last chunk ends on a page
boundary and the new one doesn't have an offset. */
if (!iommu_merge || !nextneed || !need || s->offset ||
+ (s->length + seg_size > max_seg_size) ||
(ps->offset + ps->length) % PAGE_SIZE) {
if (dma_map_cont(start_sg, i - start, sgmap,
pages, need) < 0)
goto error;
out++;
+ seg_size = 0;
sgmap = sg_next(sgmap);
pages = 0;
start = i;
@@ -420,6 +426,7 @@ static int gart_map_sg(struct device *dev, struct scatterlist *sg, int nents,
}
}

+ seg_size += s->length;
need = nextneed;
pages += to_pages(s->offset, s->length);
ps = s;
--
1.5.2.4

-

To: <akpm@...>
Cc: <linux-kernel@...>, <linux-scsi@...>, <paulus@...>, <fujita.tomonori@...>
Date: Wednesday, October 24, 2007 - 6:48 am

This patch makes iommu respect segment size limits when merging sg
lists.

Signed-off-by: FUJITA Tomonori <fujita.tomonori@lab.ntt.co.jp>
---
arch/powerpc/kernel/dma_64.c | 2 +-
arch/powerpc/kernel/iommu.c | 8 ++++++--
include/asm-powerpc/iommu.h | 2 +-
3 files changed, 8 insertions(+), 4 deletions(-)

diff --git a/arch/powerpc/kernel/dma_64.c b/arch/powerpc/kernel/dma_64.c
index 14206e3..1806d96 100644
--- a/arch/powerpc/kernel/dma_64.c
+++ b/arch/powerpc/kernel/dma_64.c
@@ -68,7 +68,7 @@ static void dma_iommu_unmap_single(struct device *dev, dma_addr_t dma_handle,
static int dma_iommu_map_sg(struct device *dev, struct scatterlist *sglist,
int nelems, enum dma_data_direction direction)
{
- return iommu_map_sg(dev->archdata.dma_data, sglist, nelems,
+ return iommu_map_sg(dev, sglist, nelems,
device_to_mask(dev), direction);
}

diff --git a/arch/powerpc/kernel/iommu.c b/arch/powerpc/kernel/iommu.c
index 2d0c9ef..7a5d247 100644
--- a/arch/powerpc/kernel/iommu.c
+++ b/arch/powerpc/kernel/iommu.c
@@ -270,15 +270,17 @@ static void iommu_free(struct iommu_table *tbl, dma_addr_t dma_addr,
spin_unlock_irqrestore(&(tbl->it_lock), flags);
}

-int iommu_map_sg(struct iommu_table *tbl, struct scatterlist *sglist,
+int iommu_map_sg(struct device *dev, struct scatterlist *sglist,
int nelems, unsigned long mask,
enum dma_data_direction direction)
{
+ struct iommu_table *tbl = dev->archdata.dma_data;
dma_addr_t dma_next = 0, dma_addr;
unsigned long flags;
struct scatterlist *s, *outs, *segstart;
int outcount, incount, i;
unsigned long handle;
+ unsigned int max_seg_size;

BUG_ON(direction == DMA_NONE);

@@ -297,6 +299,7 @@ int iommu_map_sg(struct iommu_table *tbl, struct scatterlist *sglist,

spin_lock_irqsave(&(tbl->it_lock), flags);

+ max_seg_size = dma_get_max_seg_size(dev);
for_each_sg(sglist, s, nelems, i) {
unsigned long vaddr, npages, entry, slen;

@@ -338,7 +341,8 @@ int...

To: <akpm@...>
Cc: <linux-kernel@...>, <linux-scsi@...>, <James.Bottomley@...>, <aacraid@...>, <fujita.tomonori@...>
Date: Wednesday, October 24, 2007 - 6:48 am

This sets the segment size limit properly via pci_set_dma_max_seg_size
and remove blk_queue_max_segment_size because scsi-ml calls it.

Signed-off-by: FUJITA Tomonori <fujita.tomonori@lab.ntt.co.jp>
---
drivers/scsi/aacraid/linit.c | 9 ++++++---
1 files changed, 6 insertions(+), 3 deletions(-)

diff --git a/drivers/scsi/aacraid/linit.c b/drivers/scsi/aacraid/linit.c
index 038980b..04d6a65 100644
--- a/drivers/scsi/aacraid/linit.c
+++ b/drivers/scsi/aacraid/linit.c
@@ -435,9 +435,6 @@ static int aac_slave_configure(struct scsi_device *sdev)
else if (depth < 2)
depth = 2;
scsi_adjust_queue_depth(sdev, MSG_ORDERED_TAG, depth);
- if (!(((struct aac_dev *)host->hostdata)->adapter_info.options &
- AAC_OPT_NEW_COMM))
- blk_queue_max_segment_size(sdev->request_queue, 65536);
} else
scsi_adjust_queue_depth(sdev, 0, 1);

@@ -1045,6 +1042,12 @@ static int __devinit aac_probe_one(struct pci_dev *pdev,
if (error < 0)
goto out_deinit;

+ if (!(aac->adapter_info.options & AAC_OPT_NEW_COMM)) {
+ error = pci_set_dma_max_seg_size(pdev, 65536);
+ if (error)
+ goto out_deinit;
+ }
+
/*
* Lets override negotiations and drop the maximum SG limit to 34
*/
--
1.5.2.4

-

To: FUJITA Tomonori <tomof@...>, <akpm@...>
Cc: <linux-kernel@...>, <linux-scsi@...>, <James.Bottomley@...>, AACRAID <aacraid@...>, <fujita.tomonori@...>
Date: Wednesday, October 24, 2007 - 9:34 am

ACK

Based on the presence of the call. 2.6.22, for instance, does not have
this capability...

I did not test this change, just accepting on the principals. How much
testing of the change did you do Fujita?

-

To: <mark_salyzyn@...>
Cc: <tomof@...>, <akpm@...>, <linux-kernel@...>, <linux-scsi@...>, <James.Bottomley@...>, <aacraid@...>, <fujita.tomonori@...>
Date: Wednesday, October 24, 2007 - 12:21 pm

On Wed, 24 Oct 2007 09:34:23 -0400

Unfortunately, I tested only the main component (device and pci
changes) with ppc64 IOMMU. I don't have the other IOMMUs.

I didn't test the aacraid patch but I guess that I have aacraid in the
workplace so I can test the patch (without the IOMMU chages).
-

To: FUJITA Tomonori <tomof@...>
Cc: <akpm@...>, <linux-kernel@...>, <linux-scsi@...>, <James.Bottomley@...>, AACRAID <aacraid@...>, <fujita.tomonori@...>
Date: Wednesday, October 24, 2007 - 12:25 pm

Not requesting you to test (aacraid), just scoping any effort.

The cards in question are the (old) Dell PERC variety that would trigger
the need. I will notify our Dell liaison to see what they can do.

-

To: FUJITA Tomonori <tomof@...>
Cc: <akpm@...>, <linux-kernel@...>, <linux-scsi@...>, <James.Bottomley@...>, <aacraid@...>, <fujita.tomonori@...>
Date: Wednesday, October 24, 2007 - 7:31 am

is this needed, given that the default is already 65536?

Jeff

-

To: Jeff Garzik <jeff@...>, FUJITA Tomonori <tomof@...>
Cc: <akpm@...>, <linux-kernel@...>, <linux-scsi@...>, <James.Bottomley@...>, AACRAID <aacraid@...>, <fujita.tomonori@...>
Date: Wednesday, October 24, 2007 - 9:34 am

Apparently so, as we had to add it in the past, mainly because the
feature to limit was not part of the SCSI layer when the original limit
code was added. At that time it replaced a complicated sg breakup
algorithm.

Does your statement guarantee that the default will not change to a
large value? If not, then we need to enforce it in perpetuity...

Sincerely -- Mark Salyzyn
-

To: <jeff@...>
Cc: <tomof@...>, <akpm@...>, <linux-kernel@...>, <linux-scsi@...>, <James.Bottomley@...>, <aacraid@...>, <fujita.tomonori@...>
Date: Wednesday, October 24, 2007 - 7:35 am

On Wed, 24 Oct 2007 07:31:30 -0400

Yeah, but I thought that it would be better to set it explicitly
because 'aac->adapter_info.options & AAC_OPT_NEW_COMM' HBAs looks to
be able to handle larger segments.
-

To: <akpm@...>
Cc: <linux-kernel@...>, <linux-scsi@...>, <fujita.tomonori@...>
Date: Wednesday, October 24, 2007 - 6:48 am

[Empty message]
To: <akpm@...>
Cc: <linux-kernel@...>, <linux-scsi@...>, <kyle@...>, <fujita.tomonori@...>
Date: Wednesday, October 24, 2007 - 6:48 am

This patch makes iommu respect segment size limits when merging sg
lists.

Signed-off-by: FUJITA Tomonori <fujita.tomonori@lab.ntt.co.jp>
---
drivers/parisc/ccio-dma.c | 2 +-
drivers/parisc/iommu-helpers.h | 7 ++++++-
drivers/parisc/sba_iommu.c | 2 +-
3 files changed, 8 insertions(+), 3 deletions(-)

diff --git a/drivers/parisc/ccio-dma.c b/drivers/parisc/ccio-dma.c
index 7c60cbd..2ec7db1 100644
--- a/drivers/parisc/ccio-dma.c
+++ b/drivers/parisc/ccio-dma.c
@@ -941,7 +941,7 @@ ccio_map_sg(struct device *dev, struct scatterlist *sglist, int nents,
** w/o this association, we wouldn't have coherent DMA!
** Access to the virtual address is what forces a two pass algorithm.
*/
- coalesced = iommu_coalesce_chunks(ioc, sglist, nents, ccio_alloc_range);
+ coalesced = iommu_coalesce_chunks(ioc, dev, sglist, nents, ccio_alloc_range);

/*
** Program the I/O Pdir
diff --git a/drivers/parisc/iommu-helpers.h b/drivers/parisc/iommu-helpers.h
index 0a1f99a..97ba828 100644
--- a/drivers/parisc/iommu-helpers.h
+++ b/drivers/parisc/iommu-helpers.h
@@ -95,12 +95,14 @@ iommu_fill_pdir(struct ioc *ioc, struct scatterlist *startsg, int nents,
*/

static inline unsigned int
-iommu_coalesce_chunks(struct ioc *ioc, struct scatterlist *startsg, int nents,
+iommu_coalesce_chunks(struct ioc *ioc, struct device *dev,
+ struct scatterlist *startsg, int nents,
int (*iommu_alloc_range)(struct ioc *, size_t))
{
struct scatterlist *contig_sg; /* contig chunk head */
unsigned long dma_offset, dma_len; /* start/len of DMA stream */
unsigned int n_mappings = 0;
+ unsigned int max_seg_size = dma_get_max_seg_size(dev);

while (nents > 0) {

@@ -142,6 +144,9 @@ iommu_coalesce_chunks(struct ioc *ioc, struct scatterlist *startsg, int nents,
IOVP_SIZE) > DMA_CHUNK_SIZE))
break;

+ if (startsg->length + dma_len > max_seg_size)
+ break;
+
/*
** Next see if we can append the next chunk (i.e.
...

To: FUJITA Tomonori <tomof@...>
Cc: <akpm@...>, <linux-kernel@...>, <linux-scsi@...>, <kyle@...>, <fujita.tomonori@...>
Date: Wednesday, October 24, 2007 - 7:35 am

patches 3-8 look OK

-

To: <akpm@...>
Cc: <linux-kernel@...>, <linux-scsi@...>, <James.Bottomley@...>, <jeff@...>, <fujita.tomonori@...>
Date: Wednesday, October 24, 2007 - 6:48 am

This sets the segment size limit properly via pci_set_dma_max_seg_size
and remove blk_queue_max_segment_size because scsi-ml calls it.

Signed-off-by: FUJITA Tomonori <fujita.tomonori@lab.ntt.co.jp>
---
drivers/ata/sata_inic162x.c | 25 +++++++++++++------------
1 files changed, 13 insertions(+), 12 deletions(-)

diff --git a/drivers/ata/sata_inic162x.c b/drivers/ata/sata_inic162x.c
index 08595f3..4f3ba83 100644
--- a/drivers/ata/sata_inic162x.c
+++ b/drivers/ata/sata_inic162x.c
@@ -108,17 +108,6 @@ struct inic_port_priv {
u8 cached_pirq_mask;
};

-static int inic_slave_config(struct scsi_device *sdev)
-{
- /* This controller is braindamaged. dma_boundary is 0xffff
- * like others but it will lock up the whole machine HARD if
- * 65536 byte PRD entry is fed. Reduce maximum segment size.
- */
- blk_queue_max_segment_size(sdev->request_queue, 65536 - 512);
-
- return ata_scsi_slave_config(sdev);
-}
-
static struct scsi_host_template inic_sht = {
.module = THIS_MODULE,
.name = DRV_NAME,
@@ -132,7 +121,7 @@ static struct scsi_host_template inic_sht = {
.use_clustering = ATA_SHT_USE_CLUSTERING,
.proc_name = DRV_NAME,
.dma_boundary = ATA_DMA_BOUNDARY,
- .slave_configure = inic_slave_config,
+ .slave_configure = ata_scsi_slave_config,
.slave_destroy = ata_scsi_slave_destroy,
.bios_param = ata_std_bios_param,
};
@@ -730,6 +719,18 @@ static int inic_init_one(struct pci_dev *pdev, const struct pci_device_id *ent)
return rc;
}

+ /*
+ * This controller is braindamaged. dma_boundary is 0xffff
+ * like others but it will lock up the whole machine HARD if
+ * 65536 byte PRD entry is fed. Reduce maximum segment size.
+ */
+ rc = pci_set_dma_max_seg_size(pdev, 65536 - 512);
+ if (rc) {
+ dev_printk(KERN_ERR, &pdev->dev,
+ "failed to set the maximum segment size.\n");
+ return rc;
+ }
+
rc = init_controller(iomap[MMIO_BAR], hpriv->cached_hctl);
if (rc) {
dev_printk(KERN_ERR, &pdev->dev,
...

To: FUJITA Tomonori <tomof@...>
Cc: <akpm@...>, <linux-kernel@...>, <linux-scsi@...>, <James.Bottomley@...>, <fujita.tomonori@...>
Date: Wednesday, October 24, 2007 - 7:39 am

ACK

-

To: <akpm@...>
Cc: <linux-kernel@...>, <linux-scsi@...>, <James.Bottomley@...>, <jeff@...>, <fujita.tomonori@...>
Date: Wednesday, October 24, 2007 - 6:48 am

request_queue and device struct must have the same value of a segment
size limit. This patch adds blk_queue_segment_boundary in
__scsi_alloc_queue so LLDs don't need to call both
blk_queue_segment_boundary and set_dma_max_seg_size. A LLD can change
the default value (64KB) can call device_dma_parameters accessors like
pci_set_dma_max_seg_size when allocating scsi_host.

Signed-off-by: FUJITA Tomonori <fujita.tomonori@lab.ntt.co.jp>
---
drivers/scsi/scsi_lib.c | 3 +++
1 files changed, 3 insertions(+), 0 deletions(-)

diff --git a/drivers/scsi/scsi_lib.c b/drivers/scsi/scsi_lib.c
index 61fdaf0..23a30ab 100644
--- a/drivers/scsi/scsi_lib.c
+++ b/drivers/scsi/scsi_lib.c
@@ -1645,6 +1645,7 @@ struct request_queue *__scsi_alloc_queue(struct Scsi_Host *shost,
request_fn_proc *request_fn)
{
struct request_queue *q;
+ struct device *dev = shost->shost_gendev.parent;

q = blk_init_queue(request_fn, NULL);
if (!q)
@@ -1673,6 +1674,8 @@ struct request_queue *__scsi_alloc_queue(struct Scsi_Host *shost,
blk_queue_bounce_limit(q, scsi_calculate_bounce_limit(shost));
blk_queue_segment_boundary(q, shost->dma_boundary);

+ blk_queue_max_segment_size(q, dma_get_max_seg_size(dev));
+
if (!shost->use_clustering)
clear_bit(QUEUE_FLAG_CLUSTER, &q->queue_flags);
return q;
--
1.5.2.4

-

To: FUJITA Tomonori <tomof@...>
Cc: <akpm@...>, <linux-kernel@...>, <linux-scsi@...>, <James.Bottomley@...>, <fujita.tomonori@...>, Jens Axboe <jens.axboe@...>
Date: Wednesday, October 24, 2007 - 7:39 am

it would be nice to have something more general that's useable in
drivers/block/sx8.c (for example), something like

static inline void
dev_blk_associate(struct device *dev, request_queue *q)
{
blk_queue_max_segment_size(q,
dma_get_max_seg_size(dev));
}

still, I will ACK the above patch (#9) in case you wish that to become a
future cleanup, or others dislike this suggestion

-

To: <jeff@...>, <jens.axboe@...>
Cc: <tomof@...>, <akpm@...>, <linux-kernel@...>, <linux-scsi@...>, <James.Bottomley@...>, <fujita.tomonori@...>
Date: Wednesday, October 24, 2007 - 10:15 am

On Wed, 24 Oct 2007 07:39:16 -0400

Yeah, I thought about something like that. But I can't find many
non-scsi drivers (ide, sx8, mmc/card/, anymore?) doing dma and having
the restrictions so I just call blk_queue_max_segment_size here.

Either is ok with me. I'll modify the patch if Jens prefers such
function. Jens?
-

To: FUJITA Tomonori <tomof@...>
Cc: <jens.axboe@...>, <akpm@...>, <linux-kernel@...>, <linux-scsi@...>, <James.Bottomley@...>, <fujita.tomonori@...>
Date: Wednesday, October 24, 2007 - 10:34 am

My main idea was that request_queue will eventually want to know struct

I have no strong opinion... it was just a thought.

Jeff

-

To: FUJITA Tomonori <tomof@...>
Cc: <jeff@...>, <akpm@...>, <linux-kernel@...>, <linux-scsi@...>, <James.Bottomley@...>, <fujita.tomonori@...>
Date: Wednesday, October 24, 2007 - 10:28 am

If there's a use for it, sure.

--
Jens Axboe

-

To: <jens.axboe@...>
Cc: <tomof@...>, <jeff@...>, <akpm@...>, <linux-kernel@...>, <linux-scsi@...>, <James.Bottomley@...>, <fujita.tomonori@...>
Date: Wednesday, October 24, 2007 - 10:36 am

On Wed, 24 Oct 2007 16:28:11 +0200

Ok, I'll update the patch and add some patches for non-scsi drivers.
-

To: <akpm@...>
Cc: <linux-kernel@...>, <linux-scsi@...>, <davem@...>, <fujita.tomonori@...>
Date: Wednesday, October 24, 2007 - 6:48 am

This patch makes iommu respect segment size limits when merging sg
lists.

Signed-off-by: FUJITA Tomonori <fujita.tomonori@lab.ntt.co.jp>
---
arch/sparc64/kernel/iommu.c | 2 +-
arch/sparc64/kernel/iommu_common.c | 8 ++++++--
arch/sparc64/kernel/iommu_common.h | 3 ++-
arch/sparc64/kernel/pci_sun4v.c | 2 +-
4 files changed, 10 insertions(+), 5 deletions(-)

diff --git a/arch/sparc64/kernel/iommu.c b/arch/sparc64/kernel/iommu.c
index 070a484..4b9115a 100644
--- a/arch/sparc64/kernel/iommu.c
+++ b/arch/sparc64/kernel/iommu.c
@@ -580,7 +580,7 @@ static int dma_4u_map_sg(struct device *dev, struct scatterlist *sglist,

/* Step 1: Prepare scatter list. */

- npages = prepare_sg(sglist, nelems);
+ npages = prepare_sg(dev, sglist, nelems);

/* Step 2: Allocate a cluster and context, if necessary. */

diff --git a/arch/sparc64/kernel/iommu_common.c b/arch/sparc64/kernel/iommu_common.c
index b70324e..62c3218 100644
--- a/arch/sparc64/kernel/iommu_common.c
+++ b/arch/sparc64/kernel/iommu_common.c
@@ -4,6 +4,7 @@
* Copyright (C) 1999 David S. Miller (davem@redhat.com)
*/

+#include <linux/dma-mapping.h>
#include "iommu_common.h"

/* You are _strongly_ advised to enable the following debugging code
@@ -201,21 +202,24 @@ void verify_sglist(struct scatterlist *sglist, int nents, iopte_t *iopte, int np
}
#endif

-unsigned long prepare_sg(struct scatterlist *sg, int nents)
+unsigned long prepare_sg(struct device *dev, struct scatterlist *sg, int nents)
{
struct scatterlist *dma_sg = sg;
unsigned long prev;
u32 dent_addr, dent_len;
+ unsigned int max_seg_size;

prev = (unsigned long) sg_virt(sg);
prev += (unsigned long) (dent_len = sg->length);
dent_addr = (u32) ((unsigned long)(sg_virt(sg)) & (IO_PAGE_SIZE - 1UL));
+ max_seg_size = dma_get_max_seg_size(dev);
while (--nents) {
unsigned long addr;

sg = sg_next(sg);
addr = (unsigned long) sg_virt(sg);
- if (! VCONTIG(prev, addr)) {
+ ...

To: <akpm@...>
Cc: <linux-kernel@...>, <linux-scsi@...>, <tony.luck@...>, <fujita.tomonori@...>
Date: Wednesday, October 24, 2007 - 6:48 am

This patch makes sba iommu respect segment size limits when merging sg
lists.

Signed-off-by: FUJITA Tomonori <fujita.tomonori@lab.ntt.co.jp>
---
arch/ia64/hp/common/sba_iommu.c | 8 ++++++--
1 files changed, 6 insertions(+), 2 deletions(-)

diff --git a/arch/ia64/hp/common/sba_iommu.c b/arch/ia64/hp/common/sba_iommu.c
index bc859a3..82bf7fa 100644
--- a/arch/ia64/hp/common/sba_iommu.c
+++ b/arch/ia64/hp/common/sba_iommu.c
@@ -1265,7 +1265,7 @@ sba_fill_pdir(
* the sglist do both.
*/
static SBA_INLINE int
-sba_coalesce_chunks( struct ioc *ioc,
+sba_coalesce_chunks(struct ioc *ioc, struct device *dev,
struct scatterlist *startsg,
int nents)
{
@@ -1275,6 +1275,7 @@ sba_coalesce_chunks( struct ioc *ioc,
struct scatterlist *dma_sg; /* next DMA stream head */
unsigned long dma_offset, dma_len; /* start/len of DMA stream */
int n_mappings = 0;
+ unsigned int max_seg_size = dma_get_max_seg_size(dev);

while (nents > 0) {
unsigned long vaddr = (unsigned long) sba_sg_address(startsg);
@@ -1314,6 +1315,9 @@ sba_coalesce_chunks( struct ioc *ioc,
> DMA_CHUNK_SIZE)
break;

+ if (dma_len + startsg->length > max_seg_size)
+ break;
+
/*
** Then look for virtually contiguous blocks.
**
@@ -1441,7 +1445,7 @@ int sba_map_sg(struct device *dev, struct scatterlist *sglist, int nents, int di
** w/o this association, we wouldn't have coherent DMA!
** Access to the virtual address is what forces a two pass algorithm.
*/
- coalesced = sba_coalesce_chunks(ioc, sglist, nents);
+ coalesced = sba_coalesce_chunks(ioc, dev, sglist, nents);

/*
** Program the I/O Pdir
--
1.5.2.4

-

Previous thread: none

Next thread: none