Re: [PATCH 2/2]Add Variable Page Size and IA64 Support in Intel IOMMU: IA64 Specific Part

Previous thread: [PATCH 1/2]Add Variable Page Size and IA64 Support in Intel IOMMU: Generic Part by Fenghua Yu on Wednesday, October 1, 2008 - 9:57 am. (9 messages)

Next thread: [RFC PATCH] LTTng relay buffer allocation, read, write v2 by Mathieu Desnoyers on Wednesday, October 1, 2008 - 10:07 am. (1 message)
From: Fenghua Yu
Date: Wednesday, October 1, 2008 - 9:57 am

Signed-off-by: Fenghua Yu <fenghua.yu@intel.com>
Signed-off-by: Tony Luck <tony.luck@intel.com>

---

 Kconfig                       |   17 ++++
 Makefile                      |    1 
 configs/generic_defconfig     |    2 
 configs/tiger_defconfig       |    2 
 dig/Makefile                  |    2 
 dig/dig_vtd_iommu.c           |   59 +++++++++++++++++
 dig/machvec_vtd.c             |    3 
 include/asm/cacheflush.h      |    2 
 include/asm/device.h          |    3 
 include/asm/dma-mapping.h     |   50 ++++++++++++++
 include/asm/iommu.h           |   11 +++
 include/asm/machvec.h         |    2 
 include/asm/machvec_dig_vtd.h |   38 +++++++++++
 include/asm/machvec_init.h    |    1 
 include/asm/pci.h             |    3 
 include/asm/swiotlb.h         |   56 ++++++++++++++++
 kernel/Makefile               |    4 +
 kernel/acpi.c                 |   17 ++++
 kernel/msi_ia64.c             |   80 +++++++++++++++++++++++
 kernel/pci-dma.c              |  143 ++++++++++++++++++++++++++++++++++++++++++
 kernel/pci-swiotlb.c          |   46 +++++++++++++
 kernel/setup.c                |   42 ++++++++----
 lib/flush.S                   |   55 ++++++++++++++++
 23 files changed, 626 insertions(+), 13 deletions(-)

diff --git a/arch/ia64/Kconfig b/arch/ia64/Kconfig
index 48e496f..ae965aa 100644
--- a/arch/ia64/Kconfig
+++ b/arch/ia64/Kconfig
@@ -125,6 +125,7 @@ config IA64_GENERIC
 	select NUMA
 	select ACPI_NUMA
 	select SWIOTLB
+	select PCI_MSI
 	help
 	  This selects the system type of your hardware.  A "generic" kernel
 	  will run on any supported IA-64 system.  However, if you configure
@@ -132,6 +133,7 @@ config IA64_GENERIC
 
 	  generic		For any supported IA-64 system
 	  DIG-compliant		For DIG ("Developer's Interface Guide") compliant systems
+	  DIG+Intel+IOMMU	For DIG systems with Intel IOMMU
 	  HP-zx1/sx1000		For HP systems
 	  HP-zx1/sx1000+swiotlb	For HP systems with (broken) DMA-constrained devices.
 	  SGI-SN2		For SGI Altix systems
@@ ...
From: Bjorn Helgaas
Date: Thursday, October 2, 2008 - 8:51 am

This patch adds clflush_cache_range(), but it's not used anywhere.

If you do need it, it'd be nice if the arguments were the same types
as for flush_icache_range(), and if there were a comment describing
why it is necessary for VT-d.  And maybe the name could be more like

Please use dev_info() here.  I see you just copied this from x86, but
we should fix x86, too.  Or better, since this doesn't appear to be
arch-specific, maybe this should be moved to drivers/pci/quirks.c

Shouldn't forbid_dac be a per-device or at least a per-bridge

The "PCI: " should be removed since dev_info() will add the driver
name and device ID.

Bjorn


--

From: Yu, Fenghua
Date: Thursday, October 2, 2008 - 10:46 am

clflush_cache_range(start, size) is already defined on x86, and I want to keep the same definition on ia64. That way this patch set doesn't have to change __iommu_flush_cache(); otherwise it would need an #ifdef CONFIG_IA64 in __iommu_flush_cache(), which is undesirable.


Will change this.



--

From: Bjorn Helgaas
Date: Friday, October 3, 2008 - 8:41 am

Oh, OK.  I didn't look hard enough to find __iommu_flush_cache()
(currently in drivers/pci/intel-iommu.c).

Architecturally, I'm surprised that ia64 would need to actually do a
cache flush.  I would think the VT-d hardware would do coherent accesses
which would make the cache flush unnecessary.

Bjorn
--

From: Yu, Fenghua
Date: Friday, October 3, 2008 - 5:53 pm

VT-d hardware can be either cache coherent or non cache coherent; the Coherency bit in the Extended Capabilities Register reports which.

Could you please point me to the doc that explicitly says that, architecturally, ia64 doesn't need a cache flush?

Thanks.

-Fenghua
--

From: David Woodhouse
Date: Friday, October 3, 2008 - 11:09 pm

But is the version without the cache coherency actually going to be

For safety, we can always make the driver just refuse to initialise on
IA64 if the cache coherency bit isn't set.
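
A minimal user-space sketch of such a check, assuming the coherency bit is bit 0 of the Extended Capabilities Register value. The function name vtd_check_coherency() and the require_coherent flag are hypothetical and only illustrate the policy; this is not code from the patch set.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stdint.h>

/* Assumption: bit 0 of the extended capabilities value reports coherency. */
#define ECAP_COHERENT(ecap) (((ecap) & 0x1ULL) != 0)

/*
 * Hypothetical init-time check. When the caller requires coherent
 * table accesses, a unit whose coherency bit is clear is refused.
 * Returns 0 if the unit may be used, -ENODEV otherwise.
 */
static int vtd_check_coherency(uint64_t ecap, bool require_coherent)
{
	if (require_coherent && !ECAP_COHERENT(ecap))
		return -ENODEV;
	return 0;
}
```

An ia64 build would pass require_coherent = true under this policy, while x86 would pass false and keep its flush path.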

-- 
David Woodhouse                            Open Source Technology Centre
David.Woodhouse@intel.com                              Intel Corporation

--

From: Yu, Fenghua
Date: Saturday, October 4, 2008 - 7:17 am

The current patch set works fine in both the cache coherent and the non cache coherent case. We don't need to abandon non cache coherency support on ia64 unless a spec explicitly states that cache coherency is a requirement on all ia64 platforms.

Thanks.

-Fenghua
--

From: Bjorn Helgaas
Date: Monday, October 6, 2008 - 7:55 am

I don't know the details of VT-d, so I'm just asking the question.  I
do know the HP IOMMU does not require flushing because it participates
in the coherency domain, so I was just surprised to see this in Intel
chipset support.

The following sections in volume 2 of the SDM mention DMA:

Part 1, Sec 4.4.3, Cacheability and Coherency Attribute:

  The processor must ensure that transactions from other I/O agents
  (such as DMA) are physically coherent with the instruction and data
  cache.

Part 2, Sec 2.5.4, DMA:

  Unlike Programmed I/O, which requires intervention from the CPU
  to move data from the device to main memory, data movement in DMA
  occurs without help from the CPU.  A processor based on the Itanium
  architecture expects the platform to maintain coherency for DMA
  traffic.  That is, the platform issues snoop cycles on the bus to
  invalidate cacheable pages that a DMA access modifies.  These snoop
  cycles invalidate the appropriate lines in both instruction and
  data caches and thus maintain coherency. This behavior allows an
  operating system to page code pages without taking explicit actions
  to ensure coherency.

  Software must maintain coherency for DMA traffic through explicit
  action if the platform does not maintain coherency for this traffic.
  Software can provide coherency by using the flush cache instruction,
  fc, to invalidate the instruction and data cache lines that a DMA
  transfer modifies.

It sounds like the expectation is that DMA will be fully coherent
and no flushes would be required, but there is wiggle room in that
last paragraph for platforms that don't maintain coherency.
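
For platforms that fall under that last paragraph, the software flush amounts to walking the buffer one cache line at a time. The following is a rough user-space sketch of that loop, not the flush.S code from the patch. LINE_SIZE, flush_line() and flush_range() are made-up names, the 64-byte stride is an assumption (real code would use the stride the platform reports), and flush_line() only counts calls where a kernel would issue fc.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Assumed stride for illustration only. */
#define LINE_SIZE 64u

/* Number of lines flushed, so the sketch can be checked in user space. */
static unsigned long lines_flushed;

/* Stand-in for the per-line flush instruction (fc on Itanium). */
static void flush_line(uintptr_t addr)
{
	(void)addr;
	lines_flushed++;
}

/* Flush every cache line that overlaps [start, start + size). */
static void flush_range(uintptr_t start, size_t size)
{
	uintptr_t addr, end;

	if (size == 0)
		return;

	end = start + size;
	/* Round the start down to a line boundary, then step by lines. */
	for (addr = start & ~(uintptr_t)(LINE_SIZE - 1); addr < end;
	     addr += LINE_SIZE)
		flush_line(addr);
}
```

A real implementation also needs the architecture's ordering instructions after the loop; those are left out here.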

Bjorn
--

From: Fenghua Yu
Date: Monday, October 6, 2008 - 5:35 pm

The cache coherency bit in VT-d covers the root, context, and page tables, which are used for DMA management, not the DMA data itself. VT-d DMA data should be cache coherent, so the Intel IOMMU code doesn't need to deal with non cache coherency in DMA data traffic. But the root, context, and page tables can be non cache coherent, and the Intel IOMMU code handles that case.
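
As a rough illustration of that rule, here is a user-space model of a conditional flush after a table update. The struct iommu_model, table_flush_stub() and flush_table_update() names are invented for this sketch, and the coherency bit is assumed to be bit 0 of the extended capabilities value. The stub only counts calls; in the kernel this is where clflush_cache_range() would be called.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Assumption: bit 0 of the extended capabilities value reports coherency. */
#define ECAP_COHERENT(ecap) (((ecap) & 0x1ULL) != 0)

/* Toy model of one IOMMU unit. */
struct iommu_model {
	uint64_t ecap;             /* extended capabilities value */
	unsigned long flush_calls; /* how often a table flush was needed */
};

/* Stand-in for the real cache-range flush; it only counts calls. */
static void table_flush_stub(struct iommu_model *iommu, void *addr,
			     size_t size)
{
	(void)addr;
	(void)size;
	iommu->flush_calls++;
}

/*
 * Call after writing a root, context, or page table entry.
 * The flush is skipped when the hardware accesses tables coherently.
 */
static void flush_table_update(struct iommu_model *iommu, void *addr,
			       size_t size)
{
	if (!ECAP_COHERENT(iommu->ecap))
		table_flush_stub(iommu, addr, size);
}
```

Nothing in this path touches DMA data buffers; only the table entries are flushed.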

Thanks.

-Fenghua
--

From: Fenghua Yu
Date: Monday, November 24, 2008 - 12:53 pm

The patch set adds the kernel parameter intel_iommu=pt, which sets up pass through
mode in the context mapping entry. This disables DMAR in the Linux kernel, but KVM
still runs on VT-d. In this mode, the kernel uses swiotlb for the DMA API functions,
while the other VT-d functionality stays enabled for KVM. KVM always uses the
multi-level translation page table in VT-d. By default, pass through mode is
disabled in the kernel.

This is useful when people don't want to enable VT-d DMAR in the kernel, for
reasons such as kernel IOMMU performance concerns or debugging, but still want to
use KVM.
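
To illustrate how such an option string could be handled, here is a small user-space sketch of a comma-separated parser. It is not the parsing code from drivers/pci/intel-iommu.c. The struct iommu_opts and parse_intel_iommu() names are hypothetical, and only the "off" and "pt" tokens mentioned in this thread are recognised.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

/* Flags derived from the option string (names are illustrative). */
struct iommu_opts {
	bool disabled;     /* "off" was given */
	bool pass_through; /* "pt" was given */
};

/* Parse a comma-separated option string such as "pt" or "off,pt". */
static struct iommu_opts parse_intel_iommu(const char *str)
{
	struct iommu_opts opts = { false, false };

	while (str && *str) {
		/* Length of the current token, up to the next comma. */
		size_t len = strcspn(str, ",");

		if (len == 3 && strncmp(str, "off", 3) == 0)
			opts.disabled = true;
		else if (len == 2 && strncmp(str, "pt", 2) == 0)
			opts.pass_through = true;
		/* Unknown tokens are ignored in this sketch. */

		str += len;
		if (*str == ',')
			str++;
	}
	return opts;
}
```

With no option given, pass_through stays false, which matches the default described above.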

Thanks.

-Fenghua


Signed-off-by: Fenghua Yu <fenghua.yu@intel.com>
Signed-off-by: Weidong Han <weidong.han@intel.com>
Signed-off-by: Allen Kay <allen.m.kay@intel.com>
Signed-off-by: David Woodhouse <david.woodhouse@intel.com>

---

 Documentation/kernel-parameters.txt |    5 +++
 arch/ia64/include/asm/iommu.h       |    1 
 arch/ia64/kernel/pci-swiotlb.c      |    2 -
 arch/x86/include/asm/iommu.h        |    1 
 arch/x86/kernel/pci-swiotlb_64.c    |    4 ++-
 drivers/pci/intel-iommu.c           |   47 ++++++++++++++++++++++++++----------
 include/linux/dma_remapping.h       |    3 ++
 include/linux/intel-iommu.h         |    3 +-
 8 files changed, 50 insertions(+), 16 deletions(-)

diff --git a/Documentation/kernel-parameters.txt b/Documentation/kernel-parameters.txt
index e0f346d..b966185 100644
--- a/Documentation/kernel-parameters.txt
+++ b/Documentation/kernel-parameters.txt
@@ -931,6 +931,11 @@ and is between 256 and 4096 characters. It is defined in the file
 			With this option on every unmap_single operation will
 			result in a hardware IOTLB flush operation as opposed
 			to batching them for performance.
+		pt	[Default no Pass Through]
+			This option enables Pass Through in context mapping if
+			Pass Through is supported in hardware. With this option
+			DMAR is disabled in kernel and kernel uses swiotlb, but
+			KVM still uses VT-d hardware.
 
 	io_delay=	[X86-32,X86-64] I/O delay method
 		0x80
diff ...
From: Fenghua Yu
Date: Monday, October 6, 2008 - 5:02 pm

The patch contains the IA64-specific Intel IOMMU code. It defines a new machvec, dig_vtd, along with hooks for the IOMMU, DMAR table detection, a cache line flush function, etc.

For a generic kernel with CONFIG_DMAR=y, dig_vtd is used as the machine vector if an Intel IOMMU is detected. Otherwise, the kernel falls back to the dig machine vector. The kernel parameter "machvec=dig" or "intel_iommu=off" can be used to force the kernel to boot with the dig machine vector.
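
The selection rule can be summarised in a few lines. This is a simplified user-space sketch, not the setup.c code in the patch. select_machvec() and its parameters are hypothetical, and it only covers the choice between dig and dig_vtd; any other machvec= value is outside the sketch and is treated as unset.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

/*
 * Pick the machine vector name for a generic kernel with CONFIG_DMAR=y.
 *   machvec_param  value of "machvec=", or NULL if not given
 *   iommu_off      true if "intel_iommu=off" was given
 *   dmar_detected  true if an Intel IOMMU (DMAR table) was found
 */
static const char *select_machvec(const char *machvec_param, bool iommu_off,
				  bool dmar_detected)
{
	/* Either parameter forces the plain dig machine vector. */
	if (machvec_param && strcmp(machvec_param, "dig") == 0)
		return "dig";
	if (iommu_off)
		return "dig";

	/* Otherwise detection decides; dig is the fallback. */
	return dmar_detected ? "dig_vtd" : "dig";
}
```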

Signed-off-by: Fenghua Yu <fenghua.yu@intel.com>
Signed-off-by: Tony Luck <tony.luck@intel.com>
 
---

 arch/ia64/Kconfig                       |   17 ++++
 arch/ia64/Makefile                      |    1 
 arch/ia64/configs/generic_defconfig     |    2 
 arch/ia64/configs/tiger_defconfig       |    2 
 arch/ia64/dig/Makefile                  |    5 +
 arch/ia64/dig/dig_vtd_iommu.c           |   59 ++++++++++++++
 arch/ia64/dig/machvec_vtd.c             |    3 
 arch/ia64/include/asm/cacheflush.h      |    2 
 arch/ia64/include/asm/device.h          |    3 
 arch/ia64/include/asm/dma-mapping.h     |   50 ++++++++++++
 arch/ia64/include/asm/iommu.h           |   16 +++
 arch/ia64/include/asm/machvec.h         |    2 
 arch/ia64/include/asm/machvec_dig_vtd.h |   38 +++++++++
 arch/ia64/include/asm/machvec_init.h    |    1 
 arch/ia64/include/asm/pci.h             |    3 
 arch/ia64/include/asm/swiotlb.h         |   56 +++++++++++++
 arch/ia64/kernel/Makefile               |    4 
 arch/ia64/kernel/acpi.c                 |   17 ++++
 arch/ia64/kernel/msi_ia64.c             |   80 +++++++++++++++++++
 arch/ia64/kernel/pci-dma.c              |  129 ++++++++++++++++++++++++++++++++
 arch/ia64/kernel/pci-swiotlb.c          |   46 +++++++++++
 arch/ia64/kernel/setup.c                |   42 +++++++---
 arch/ia64/lib/flush.S                   |   55 +++++++++++++
 23 files changed, 620 insertions(+), 13 deletions(-)


diff --git a/arch/ia64/Kconfig b/arch/ia64/Kconfig
index 48e496f..ae965aa 100644
--- a/arch/ia64/Kconfig
+++ b/arch/ia64/Kconfig
@@ -125,6 +125,7 @@ ...