> > and storage devices typically use map_sg. But they both exercise the same
> > underlying resource management code since it's the same IOMMU they poke at.
> >
> > ...
> > >> Sorry, I didn't see a replacement for the deferred_flush_tables.
> > >> Mark Gross and I agree this substantially helps with unmap performance.
> > >> See
http://lkml.org/lkml/2008/3/3/373
> > >
> > > Yeah, I can add a nice trick in parisc sba_iommu uses. I'll try next
> > > time.
> > >
> > > But it probably gives the bitmap method less gain than the RB tree
> > > since clear the bitmap takes less time than changing the tree.
> > >
> > > The deferred_flush_tables also batches flushing TLB. The patch flushes
> > > TLB only when it reaches the end of the bitmap (it's a trick that some
> > > IOMMUs like SPARC does).
> >
> > The batching of the TLB flushes is the key thing. I was being paranoid
> > by not marking the resource free until after the TLB was flushed. If we
> > know the allocation is going to be circular through the bitmap, flushing
> > the TLB once per iteration through the bitmap should be sufficient since
> > we can guarantee the IO Pdir resource won't get re-used until a full
> > cycle through the bitmap has been completed.
>
> Not necessarily ... there's a safety vs performance issue here. As long
> as the iotlb mapping persists, the device can use it to write to the
> memory. If you fail to flush, you lose the ability to detect device dma
> after free (because the iotlb may still be valid). On standard systems,
> this happens so infrequently as to be worth the tradeoff. However, in
> virtualised systems, which is what the intel iommu is aimed at, stale
> iotlb entries can be used by malicious VMs to gain access to memory
> outside of their VM, so the intel people at least need to say whether
> they're willing to accept this speed for safety tradeoff.