On Wed, Jul 14, 2010 at 06:41:48PM -0700, Zach Pfeffer wrote:
That's not entirely correct. The DMA API provides two things:
1. An API for allocating DMA coherent buffers
2. An API for mapping streaming buffers
Some implementations of (2) end up using (1) to work around broken
hardware - but that's a separate problem (and causes its own set of
problems.)
You're making it sound like extremely hard work.
	struct scatterlist *sg;
	size_t len;
	void *buf;
	int i, nents = 11;

	sg = kmalloc(sizeof(*sg) * nents, GFP_KERNEL);
	if (!sg)
		return -ENOMEM;

	sg_init_table(sg, nents);
	for (i = 0; i < nents; i++) {
		/* ten 1MB buffers followed by one 64KB buffer */
		if (i != nents - 1)
			len = 1048576;
		else
			len = 64*1024;

		buf = alloc_buffer(len);
		sg_set_buf(&sg[i], buf, len);
	}
There's no need to split the scatterlist elements up into individual
pages - the block layer doesn't do that when it passes scatterlists
down to block device drivers.
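
To put a rough number on that, here is a small userspace sketch (not
kernel code) comparing one element per buffer with one element per page
for the buffer layout in the example above. MODEL_PAGE_SIZE and the
helper names are made up for illustration, and a 4KB page size is an
assumption, not something every architecture uses.

```c
#include <stddef.h>

/* Assumption: 4KB pages.  The real page size is architecture dependent. */
#define MODEL_PAGE_SIZE 4096u

/* Fill lens[] like the example: 1MB each, except a 64KB final buffer. */
static void fill_example_lens(size_t *lens, size_t nents)
{
	size_t i;

	for (i = 0; i < nents; i++)
		lens[i] = (i != nents - 1) ? 1048576 : 64 * 1024;
}

/* Total number of bytes described by the list. */
static size_t total_bytes(const size_t *lens, size_t nents)
{
	size_t i, total = 0;

	for (i = 0; i < nents; i++)
		total += lens[i];
	return total;
}

/* Elements needed if every buffer were split into page-sized pieces. */
static size_t entries_if_split(const size_t *lens, size_t nents)
{
	size_t i, count = 0;

	for (i = 0; i < nents; i++)
		count += (lens[i] + MODEL_PAGE_SIZE - 1) / MODEL_PAGE_SIZE;
	return count;
}
```

With nents = 11 the list describes 10551296 bytes in 11 elements, where
a per-page split would need 2576 elements under the 4KB assumption.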
I'm not saying that it's reasonable to pass (or even allocate) a 1MB
buffer via the DMA API.
This is something the DMA API doesn't do - probably because there hasn't
been a requirement for it.
One of the issues for drivers is that separating the mapped scatterlist
from the input buffer scatterlist gives them something else to allocate,
which adds another failure point. As all existing users are well served
by the current API, there's little reason to change it, especially given
the number of drivers which would need to be updated.
What you can do is:
struct map {
	dma_addr_t addr;
	size_t len;
};

int map_sg(struct device *dev, struct scatterlist *list,
	unsigned int nents, struct map *map, enum dma_data_direction dir)
{
	struct scatterlist *sg;
	unsigned int i, j = 0;

	for_each_sg(list, sg, nents, i) {
		map[j].addr = dma_map_page(dev, sg_page(sg), sg->offset,
					   sg->length, dir);
		map[j].len = sg->length;
		if (dma_mapping_error(dev, map[j].addr))
			break;
		j++;
	}

	/* number of entries mapped; less than nents if a mapping failed */
	return j;
}

void unmap(struct device *dev, struct map *map, unsigned int nents,
	enum dma_data_direction dir)
{
	while (nents) {
		dma_unmap_page(dev, map->addr, map->len, dir);
		map++;
		nents--;
	}
}
Note: this may not be portable to all architectures. It may also break
if there's something like the dmabounce or swiotlb code remapping buffers
which don't fit the DMA mask for the device - that's a different problem.
You can then map the same scatterlist into multiple different 'map'
arrays for several devices simultaneously. What you can't do is access
the buffers from the CPU while they're mapped to any device.
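
As a rough illustration of that pattern, here is a userspace model (not
kernel code) which maps one segment list into a separate 'map' array per
device, and unwinds a partially completed mapping. Everything named
model_* is hypothetical: the "bus address" is just the segment address
plus a per-device offset, and fail_at exists only to simulate a mapping
error. In a real driver the unwind would unmap each entry in turn
rather than adjust a counter.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-ins for a scatterlist entry, a map entry, a device. */
struct model_seg { uint64_t phys; size_t len; };
struct model_map { uint64_t addr; size_t len; };
struct model_dev {
	uint64_t bus_offset;	/* pretend per-device address offset */
	size_t fail_at;		/* index which fails to map; SIZE_MAX = never */
	size_t active;		/* number of live mappings on this device */
};

#define MODEL_MAP_ERROR UINT64_MAX

/* Map segments until one fails; return how many were mapped. */
static size_t model_map_segs(struct model_dev *dev,
	const struct model_seg *segs, size_t nents, struct model_map *map)
{
	size_t i;

	for (i = 0; i < nents; i++) {
		uint64_t addr = (i == dev->fail_at) ? MODEL_MAP_ERROR :
				segs[i].phys + dev->bus_offset;

		if (addr == MODEL_MAP_ERROR)
			break;
		map[i].addr = addr;
		map[i].len = segs[i].len;
		dev->active++;
	}
	return i;
}

/* Release the first nents mappings (the model only counts them). */
static void model_unmap_segs(struct model_dev *dev, size_t nents)
{
	dev->active -= nents;
}

/* Map everything, or undo the partial mapping and return -1. */
static int model_map_all(struct model_dev *dev,
	const struct model_seg *segs, size_t nents, struct model_map *map)
{
	size_t done = model_map_segs(dev, segs, nents, map);

	if (done != nents) {
		model_unmap_segs(dev, done);
		return -1;
	}
	return 0;
}
```

Each device gets its own 'map' array, so a failure on one device leaves
the other device's mappings untouched.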
I'm not saying that you should do the above - I'm just proving that it's
not as hard as you seem to be making out.