Signed-off-by: Fenghua Yu <email@example.com> Signed-off-by: Tony Luck <firstname.lastname@example.org> --- Kconfig | 17 ++++ Makefile | 1 configs/generic_defconfig | 2 configs/tiger_defconfig | 2 dig/Makefile | 2 dig/dig_vtd_iommu.c | 59 +++++++++++++++++ dig/machvec_vtd.c | 3 include/asm/cacheflush.h | 2 include/asm/device.h | 3 include/asm/dma-mapping.h | 50 ++++++++++++++ include/asm/iommu.h | 11 +++ include/asm/machvec.h | 2 include/asm/machvec_dig_vtd.h | 38 +++++++++++ include/asm/machvec_init.h | 1 include/asm/pci.h | 3 include/asm/swiotlb.h | 56 ++++++++++++++++ kernel/Makefile | 4 + kernel/acpi.c | 17 ++++ kernel/msi_ia64.c | 80 +++++++++++++++++++++++ kernel/pci-dma.c | 143 ++++++++++++++++++++++++++++++++++++++++++ kernel/pci-swiotlb.c | 46 +++++++++++++ kernel/setup.c | 42 ++++++++---- lib/flush.S | 55 ++++++++++++++++ 23 files changed, 626 insertions(+), 13 deletions(-) diff --git a/arch/ia64/Kconfig b/arch/ia64/Kconfig index 48e496f..ae965aa 100644 --- a/arch/ia64/Kconfig +++ b/arch/ia64/Kconfig @@ -125,6 +125,7 @@ config IA64_GENERIC select NUMA select ACPI_NUMA select SWIOTLB + select PCI_MSI help This selects the system type of your hardware. A "generic" kernel will run on any supported IA-64 system. However, if you configure @@ -132,6 +133,7 @@ config IA64_GENERIC generic For any supported IA-64 system DIG-compliant For DIG ("Developer's Interface Guide") compliant systems + DIG+Intel+IOMMU For DIG systems with Intel IOMMU HP-zx1/sx1000 For HP systems HP-zx1/sx1000+swiotlb For HP systems with (broken) DMA-constrained devices. SGI-SN2 For SGI Altix systems @@ ...
This patch adds clflush_cache_range(), but it's not used anywhere. If you do need it, it'd be nice if the arguments were the same types as for flush_icache_range(), and if there were a comment describing why it is necessary for VT-d. And maybe the name could be more like

Please use dev_info() here. I see you just copied this from x86, but we should fix x86, too. Or better, since this doesn't appear to be arch-specific, maybe this should be moved to drivers/pci/quirks.c

Shouldn't forbid_dac be a per-device or at least a per-bridge

The "PCI: " should be removed since dev_info() will add the driver name and device ID.

Bjorn --
Since clflush_cache_range(start, size) is defined in x86, I just want to keep the same definition. So this patch set won't change __iommu_flush_cache(). Otherwise, the patch set will have #ifdef CONFIG_IA64 in __iommu_flush_cache() which is not desired. Will change this. Bjorn --
Oh, OK. I didn't look hard enough to find __iommu_flush_cache() (currently in drivers/pci/intel-iommu.c). Architecturally, I'm surprised that ia64 would need to actually do a cache flush. I would think the VT-d hardware would do coherent accesses which would make the cache flush unnecessary. Bjorn --
VT-d hardware supports both cache-coherent and non-cache-coherent operation, as indicated by the Coherency bit in the Extended Capabilities Register. Could you please point me to the doc that explicitly says that ia64 architecturally doesn't need a cache flush? Thanks. -Fenghua --
The current patch set works just fine for both cache-coherent and non-cache-coherent hardware. We don't need to abandon non-cache-coherent support on ia64 unless there is an explicit spec stating that cache coherency is a requirement on all ia64 platforms. Thanks. -Fenghua --
I don't know the details of VT-d, so I'm just asking the question. I do know the HP IOMMU does not require flushing because it participates in the coherency domain, so I was just surprised to see this in Intel chipset support.

The following sections in volume 2 of the SDM mention DMA:

Part 1, Sec 4.4.3, Cacheability and Coherency Attribute:

    The processor must ensure that transactions from other I/O agents (such as DMA) are physically coherent with the instruction and data cache.

Part 2, Sec 2.5.4, DMA:

    Unlike Programmed I/O, which requires intervention from the CPU to move data from the device to main memory, data movement in DMA occurs without help from the CPU.

    A processor based on the Itanium architecture expects the platform to maintain coherency for DMA traffic. That is, the platform issues snoop cycles on the bus to invalidate cacheable pages that a DMA access modifies. These snoop cycles invalidate the appropriate lines in both instruction and data caches and thus maintain coherency. This behavior allows an operating system to page code pages without taking explicit actions to ensure coherency.

    Software must maintain coherency for DMA traffic through explicit action if the platform does not maintain coherency for this traffic. Software can provide coherency by using the flush cache instruction, fc, to invalidate the instruction and data cache lines that a DMA transfer modifies.

It sounds like the expectation is that DMA will be fully coherent and no flushes would be required, but there is wiggle room in that last paragraph for platforms that don't maintain coherency.

Bjorn --
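For readers following the last quoted SDM paragraph, here is a minimal sketch of what "flush the lines a transfer modifies" involves: walking every cache line that overlaps a byte range. It is not the flush.S code from the patch. The 64-byte line size, the `SKETCH_` and `sketch_` names, and the callback that stands in for the per-line flush instruction are all assumptions made for illustration.

```c
#include <stddef.h>
#include <stdint.h>

/* Assumption: 64-byte lines. Real code would query the platform. */
#define SKETCH_LINE_SIZE 64u

/* Stand-in for a per-line flush (for example, one fc on ia64). */
typedef void (*sketch_line_op)(uintptr_t line_addr, void *ctx);

/*
 * Visit each cache line overlapping [start, start + size) and
 * return how many lines were visited.
 */
static size_t sketch_for_each_line(uintptr_t start, size_t size,
                                   sketch_line_op op, void *ctx)
{
    const uintptr_t mask = ~(uintptr_t)(SKETCH_LINE_SIZE - 1);
    uintptr_t line, last;
    size_t visited = 0;

    if (size == 0)
        return 0;

    line = start & mask;              /* first line touched */
    last = (start + size - 1) & mask; /* last line touched */

    for (;;) {
        if (op)
            op(line, ctx);
        visited++;
        if (line == last)
            break;
        line += SKETCH_LINE_SIZE;
    }
    return visited;
}
```

A range that straddles a line boundary, such as 8 bytes starting at offset 60, touches two lines even though it is shorter than one line, which is why the start address is rounded down rather than the size being divided by the line size.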
The cache coherency bit in VT-d applies to the root, context, and page tables, which are used for DMA management, not to the DMA data itself. VT-d DMA data should be cache coherent. The Intel IOMMU code doesn't need to deal with non cache coherency in DMA data traffic. But the root, context, and page tables could be non cache coherent, and this is handled by the Intel IOMMU code. Thanks. -Fenghua --
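The distinction above can be sketched as a small predicate: software only needs to flush its own updates to the translation structures when the hardware reports that its table walks are not coherent. This is an illustration, not the code in drivers/pci/intel-iommu.c. It assumes the Coherency field is bit 0 of the Extended Capabilities Register, and the `SKETCH_` and `sketch_` names are hypothetical.

```c
#include <stdbool.h>
#include <stdint.h>

/* Assumption: Coherency is bit 0 of the Extended Capabilities Register. */
#define SKETCH_ECAP_COHERENT(ecap) (((ecap) & 0x1ull) != 0)

/*
 * Returns true when CPU writes to root, context, or page tables
 * must be flushed from the cache before the IOMMU reads them.
 * DMA data itself is not covered by this bit.
 */
static bool sketch_tables_need_flush(uint64_t ecap)
{
    return !SKETCH_ECAP_COHERENT(ecap);
}
```

A caller would check this predicate after writing a table entry and invoke a range flush such as clflush_cache_range() only when it returns true.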
The patch set adds the kernel parameter intel_iommu=pt to set up pass through mode in the context mapping entry. This disables DMAR in the Linux kernel, but KVM still runs on VT-d. In this mode, the kernel uses swiotlb for DMA API functions, but other VT-d functionality remains enabled for KVM. KVM always uses the multi-level translation page table in VT-d. By default, pass through mode is disabled in the kernel. This is useful when people don't want to enable VT-d DMAR in the kernel, for reasons such as kernel IOMMU performance concerns or debugging, but still want to use KVM. Thanks. -Fenghua Signed-off-by: Fenghua Yu <email@example.com> Signed-off-by: Weidong Han <firstname.lastname@example.org> Signed-off-by: Allen Kay <email@example.com> Signed-off-by: David Woodhouse <firstname.lastname@example.org> --- Documentation/kernel-parameters.txt | 5 +++ arch/ia64/include/asm/iommu.h | 1 arch/ia64/kernel/pci-swiotlb.c | 2 - arch/x86/include/asm/iommu.h | 1 arch/x86/kernel/pci-swiotlb_64.c | 4 ++- drivers/pci/intel-iommu.c | 47 ++++++++++++++++++++++++++---------- include/linux/dma_remapping.h | 3 ++ include/linux/intel-iommu.h | 3 +- 8 files changed, 50 insertions(+), 16 deletions(-) diff --git a/Documentation/kernel-parameters.txt b/Documentation/kernel-parameters.txt index e0f346d..b966185 100644 --- a/Documentation/kernel-parameters.txt +++ b/Documentation/kernel-parameters.txt @@ -931,6 +931,11 @@ and is between 256 and 4096 characters. It is defined in the file With this option on every unmap_single operation will result in a hardware IOTLB flush operation as opposed to batching them for performance. + pt [Default no Pass Through] + This option enables Pass Through in context mapping if + Pass Through is supported in hardware. With this option + DMAR is disabled in kernel and kernel uses swiotlb, but + KVM still uses VT-d hardware. io_delay= [X86-32,X86-64] I/O delay method 0x80 diff ...
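As a rough illustration of the behaviour described for intel_iommu=pt, the sketch below parses a comma-separated option string and decides whether pass through takes effect. It is not the parser from the patch: the struct, the function names, and the `hw_supports_pt` argument are hypothetical, and only the "pt" token is modelled.

```c
#include <stdbool.h>
#include <string.h>

/* Hypothetical container for the one option this sketch models. */
struct sketch_iommu_opts {
    bool pt_requested; /* "pt" appeared in intel_iommu=... */
};

/* Parse a comma-separated value such as "strict,pt". */
static void sketch_parse_intel_iommu(const char *arg,
                                     struct sketch_iommu_opts *opts)
{
    opts->pt_requested = false; /* pass through is off by default */

    while (arg && *arg) {
        size_t len = strcspn(arg, ","); /* length of this token */

        if (len == 2 && strncmp(arg, "pt", 2) == 0)
            opts->pt_requested = true;

        arg += len;
        if (*arg == ',')
            arg++; /* skip the separator */
    }
}

/*
 * Pass through only takes effect when the hardware supports it.
 * When it does, the kernel DMA API falls back to swiotlb.
 */
static bool sketch_kernel_uses_swiotlb(const struct sketch_iommu_opts *opts,
                                       bool hw_supports_pt)
{
    return opts->pt_requested && hw_supports_pt;
}
```

Matching on the exact token length keeps a value such as "ptx" from being mistaken for "pt".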
The patch contains Intel IOMMU IA64 specific code. It defines the new machvec dig_vtd, hooks for IOMMU, DMAR table detection, a cache line flush function, etc. For a generic kernel with CONFIG_DMAR=y, if an Intel IOMMU is detected, dig_vtd is used as the machine vector. Otherwise, the kernel falls back to the dig machine vector. The kernel parameter "machvec=dig" or "intel_iommu=off" can be used to force the kernel to boot with the dig machine vector. Signed-off-by: Fenghua Yu <email@example.com> Signed-off-by: Tony Luck <firstname.lastname@example.org> --- arch/ia64/Kconfig | 17 ++++ arch/ia64/Makefile | 1 arch/ia64/configs/generic_defconfig | 2 arch/ia64/configs/tiger_defconfig | 2 arch/ia64/dig/Makefile | 5 + arch/ia64/dig/dig_vtd_iommu.c | 59 ++++++++++++++ arch/ia64/dig/machvec_vtd.c | 3 arch/ia64/include/asm/cacheflush.h | 2 arch/ia64/include/asm/device.h | 3 arch/ia64/include/asm/dma-mapping.h | 50 ++++++++++++ arch/ia64/include/asm/iommu.h | 16 +++ arch/ia64/include/asm/machvec.h | 2 arch/ia64/include/asm/machvec_dig_vtd.h | 38 +++++++++ arch/ia64/include/asm/machvec_init.h | 1 arch/ia64/include/asm/pci.h | 3 arch/ia64/include/asm/swiotlb.h | 56 +++++++++++++ arch/ia64/kernel/Makefile | 4 arch/ia64/kernel/acpi.c | 17 ++++ arch/ia64/kernel/msi_ia64.c | 80 +++++++++++++++++++ arch/ia64/kernel/pci-dma.c | 129 ++++++++++++++++++++++++++++++++ arch/ia64/kernel/pci-swiotlb.c | 46 +++++++++++ arch/ia64/kernel/setup.c | 42 +++++++--- arch/ia64/lib/flush.S | 55 +++++++++++++ 23 files changed, 620 insertions(+), 13 deletions(-) diff --git a/arch/ia64/Kconfig b/arch/ia64/Kconfig index 48e496f..ae965aa 100644 --- a/arch/ia64/Kconfig +++ b/arch/ia64/Kconfig @@ -125,6 +125,7 @@ ...
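The machine vector choice described in the patch summary can be sketched as a small decision function. This is only an illustration of the stated rules, not the logic in arch/ia64/kernel/setup.c; the function name and its three boolean inputs are hypothetical.

```c
#include <stdbool.h>

/*
 * Pick a machine vector name for a generic kernel built with
 * CONFIG_DMAR=y, following the rules in the patch description:
 *   - "machvec=dig" or "intel_iommu=off" forces dig
 *   - otherwise dig_vtd is used if an Intel IOMMU was detected
 *   - otherwise the kernel falls back to dig
 */
static const char *sketch_pick_machvec(bool iommu_detected,
                                       bool machvec_dig_forced,
                                       bool intel_iommu_off)
{
    if (machvec_dig_forced || intel_iommu_off)
        return "dig";
    if (iommu_detected)
        return "dig_vtd";
    return "dig";
}
```

Checking the two command-line overrides first mirrors the description: either one wins even when DMAR tables are present.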