Re: [PATCH]iommu-iotlb-flushing

To: mark gross <mgross@...>
Cc: Grant Grundler <grundler@...>, Andrew Morton <akpm@...>, <greg@...>, lkml <linux-kernel@...>, <linux-pci@...>
Date: Wednesday, March 5, 2008 - 2:23 pm

On Mon, Mar 03, 2008 at 10:34:11AM -0800, mark gross wrote:

So if we flushed the IOTLB every 16 unmap calls, it should be < 1% penalty?

That logic is how I decided the current value of DELAYED_RESOURCE_CNT
on parisc.


Agreed. PA-RISC is a legacy platform at this point.


Otherwise, anything running in a legacy IDE mode is using 32-bit DMA.
So Linux still has plenty of SATA drivers using 32-bit DMA.

All the NICs, FC, Infiniband, et al are PCI-e and thus by definition
64-bit DMA capable.


*nod* - I know. That's why I use pktgen to measure the dma map/unmap overhead.
Note that with solid state storage (should be more visible in the next year
or so), the transaction rate is going to look a lot more like a NIC than
the traditional HBA storage controller. So map/unmap performance will
matter for those configurations too.



Ok - I wasn't sure which step was the "synchronize step".

BTW, I hope you (and others from Intel - go willy! ;) can give feedback
to the Intel/AMD chipset designers to ditch this design ASAP.
It clearly sucks.


*shrug* I didn't think too much about it on the original implementation
and it happened to work out nicely. Think cacheline "life cycle" when
picking array sizes (or when comparing to linked lists) and you'll
usually end up with the best performing design.


Well, it can be more scientific. Determine the overhead (with vs without).
Then divide by how many times one can afford to ignore the sync op.
That will (roughly) be the new overhead.
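
A rough sketch of that arithmetic in C follows. The function name and the
sample figures are assumptions made up for illustration; they are not
measurements from this thread.

```c
#include <assert.h>

/*
 * Rough amortized overhead, in percent, when the sync op is paid once
 * per `deferred` unmaps instead of on every unmap.
 * `per_unmap_overhead_pct` is the measured difference (with vs without).
 * Hypothetical helper, not kernel code.
 */
double amortized_overhead_pct(double per_unmap_overhead_pct,
                              unsigned int deferred)
{
    assert(deferred > 0);
    return per_unmap_overhead_pct / (double)deferred;
}
```

For example, a hypothetical 16% overhead with a sync on every unmap would
come out at roughly 1% if the sync is deferred across 16 unmaps.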


Huh? I think you misunderstood. Decide how _many_ you want to defer
at compile time and leave the general feature enabled as a runtime option.
(e.g. use_vtd=1 boot param or when booting with Xen or KVM enabled).


Ok - my suggestion still applies. "compile time constant" only refers
to the number of unmaps the code will defer. e.g. "borrow" DELAYED_RESOURCE_CNT.
I understand the DMAR needs to be enabled at runtime for production.
(I've spent ~5 years dealing with RH/SuSE/Debian IA64-linux kernels...)
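
The split between a compile-time count and a runtime enable could look
roughly like the user-space sketch below. It is not the patch under
discussion and not the parisc code: the struct, the function names, the
DEFERRED_UNMAP_CNT constant and the `enabled` flag are all hypothetical,
and the IOTLB flush is only simulated by a counter.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Compile-time constant: how many unmaps to defer before one flush.
 * Hypothetical name; 16 mirrors the "every 16 unmap calls" figure. */
#define DEFERRED_UNMAP_CNT 16

struct deferred_unmaps {
    bool enabled;                              /* runtime switch, e.g. from a boot param */
    unsigned long pending[DEFERRED_UNMAP_CNT]; /* addresses waiting for a flush */
    size_t npending;                           /* entries currently queued */
    unsigned long flushes;                     /* simulated IOTLB flushes so far */
    unsigned long released;                    /* addresses made reusable so far */
};

/* Pay for one (simulated) flush and release everything queued. */
void flush_pending(struct deferred_unmaps *d)
{
    if (d->npending == 0)
        return;
    d->flushes++;               /* stands in for the real IOTLB flush */
    d->released += d->npending; /* only now may the addresses be reused */
    d->npending = 0;
}

/* Queue one unmap; flush only when the compile-time batch is full. */
void unmap_one(struct deferred_unmaps *d, unsigned long iova)
{
    if (!d->enabled)
        return;                 /* feature off at runtime: nothing to defer */
    assert(d->npending < DEFERRED_UNMAP_CNT);
    d->pending[d->npending++] = iova;
    if (d->npending == DEFERRED_UNMAP_CNT)
        flush_pending(d);
}
```

With this shape, 40 unmaps cost two flushes plus one final flush for the
8 leftovers, while a disabled instance never flushes at all.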

If you can reduce the overhead to < 1% for pktgen, TPC-C won't
notice it and I doubt specweb will either.


Excellent. I'm also happy to answer questions on linux-pci about this.
And willy will be back in Oregon I'm sure. :)
You can also visit him in Ottawa if you register for OLS this year.


Well, you still need a consistent benchmark/workload (e.g. pktgen) and
a precise tool to measure the overhead (e.g. oprofile or perfmon2).

Using the IOAT would allow people without 10G NICs to mess around with this.


*nod*


Understood. That's why netperf (see netperf.org) measures "service demand".
Taking CPU away from user space generally results in lower benchmark/app perf.

thanks,
grant
--