Re: [PATCH 1/2] arm: msm: Add System MMU support.

Previous thread: [PATCH 0/4 -tip] delay documentation and checkpatch additions by Patrick Pannuto on Tuesday, July 27, 2010 - 3:39 pm. (5 messages)

Next thread: [PATCH] tracing: wake up tasks reading trace_pipe on write to trace_marker by Marcin Slusarz on Tuesday, July 27, 2010 - 3:44 pm. (5 messages)
From: Stepan Moskovchenko
Date: Tuesday, July 27, 2010 - 3:41 pm

Add support for the System MMUs found on the 8x60 and 8x72
families of Qualcomm chips. These SMMUs allow virtualization
of the address space used by most of the multimedia cores
on these chips.

Signed-off-by: Stepan Moskovchenko <stepanm@codeaurora.org>
---
 arch/arm/mach-msm/include/mach/smmu_driver.h  |  322 +++++
 arch/arm/mach-msm/include/mach/smmu_hw-8xxx.h | 1860 +++++++++++++++++++++++++
 arch/arm/mach-msm/smmu_driver.c               |  834 +++++++++++
 3 files changed, 3016 insertions(+), 0 deletions(-)
 create mode 100644 arch/arm/mach-msm/include/mach/smmu_driver.h
 create mode 100644 arch/arm/mach-msm/include/mach/smmu_hw-8xxx.h
 create mode 100644 arch/arm/mach-msm/smmu_driver.c

diff --git a/arch/arm/mach-msm/include/mach/smmu_driver.h b/arch/arm/mach-msm/include/mach/smmu_driver.h
new file mode 100644
index 0000000..29a643d
--- /dev/null
+++ b/arch/arm/mach-msm/include/mach/smmu_driver.h
@@ -0,0 +1,322 @@
+/* Copyright (c) 2010, Code Aurora Forum. All rights reserved.
+ *
+ * Redistribution and use in source and binary forms, with or without
+ * modification, are permitted provided that the following conditions are
+ * met:
+ *     * Redistributions of source code must retain the above copyright
+ *       notice, this list of conditions and the following disclaimer.
+ *     * Redistributions in binary form must reproduce the above
+ *       copyright notice, this list of conditions and the following
+ *       disclaimer in the documentation and/or other materials provided
+ *       with the distribution.
+ *     * Neither the name of Code Aurora Forum, Inc. nor the names of its
+ *       contributors may be used to endorse or promote products derived
+ *       from this software without specific prior written permission.
+ *
+ * THIS SOFTWARE IS PROVIDED "AS IS" AND ANY EXPRESS OR IMPLIED
+ * WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED WARRANTIES OF
+ * MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND NON-INFRINGEMENT
+ * ARE DISCLAIMED.  ...
From: Daniel Walker
Date: Tuesday, July 27, 2010 - 3:43 pm

This should be GPLv2 ..

Daniel

-- 
Sent by a consultant of the Qualcomm Innovation Center, Inc.
The Qualcomm Innovation Center, Inc. is a member of the Code Aurora Forum.

--

From: Arnd Bergmann
Date: Wednesday, July 28, 2010 - 1:39 am

How is this different from an IOMMU?

From a very brief look, it seems that you should be using the
existing dma-mapping APIs here instead of making up your own.

	Arnd
--

From: stepanm
Date: Wednesday, July 28, 2010 - 10:39 am

These are just SMMU APIs, and the DMA-mapping API is one layer above this.

We have our own SMMU API for the MSM SoCs because we have multiple IOMMUs,
each one having multiple contexts, or even having multiple instances of
the same context. Our usage model is also quite a bit different from how
the DMA APIs are set up. I believe only two IOMMU drivers actually make
use of the DMA API (Intel and AMD) and the other ones (OMAP and other
SoCs) have their own APIs for their specific use cases.

Steve

Sent by an employee of the Qualcomm Innovation Center, Inc.
The Qualcomm Innovation Center, Inc. is a member of the Code Aurora Forum.

--

From: Arnd Bergmann
Date: Wednesday, July 28, 2010 - 10:50 am

The DMA API is extremely flexible, it works just fine with all the
IOMMUs that I've seen so far. Please take a look at
include/asm-generic/dma-mapping-common.h and its users to see how
to use multiple IOMMUs depending on the device.

If the OMAP developers got this wrong, that's not your problem :-)

	Arnd
--

From: Russell King - ARM Linux
Date: Wednesday, July 28, 2010 - 2:21 pm

We don't yet use those DMA API interface extensions because we haven't
had the need.  If someone who has the need wants to put the effort in
though...

One of the problems with it though is the abstraction of the sync*
operations is the wrong way around for stuff like dmabounce - we want
to be passed the base address of the buffer (so we can look this up),
plus offset and length.  We don't want to know just the region which
is affected.
--

From: FUJITA Tomonori
Date: Wednesday, July 28, 2010 - 9:15 pm

On Wed, 28 Jul 2010 22:21:56 +0100

We can't pass the base address because the DMA API callers don't pass
the base address for dma_sync_single_for_{device|cpu}.

dma_sync_single_range_for_* requires the base address but they are
obsolete.

So you need to fix dmabounce. Actually, I sent you a patch to fix
dmabounce long ago (looks like not applied yet):

http://kerneltrap.org/mailarchive/linux-netdev/2010/4/5/6274046
--

From: Arnd Bergmann
Date: Thursday, July 29, 2010 - 1:12 am

Right, it shouldn't be hard now that the groundwork for that is done.
Also, it's only really needed if you have IOMMUs of different types in the
same system. If msm doesn't have any swiotlb or dmabounce devices,

Yes, but that is an unrelated (dmabounce specific) problem that seems to
be fixed by an existing patch.

The driver posted by Stepan doesn't even support the dma_sync_single_*
style operations, and I don't think it can run into that specific problem.
Are there even (hardware) IOMMUs that are connected to noncoherent
buses? AFAICT, anything that needs to flush a dcache range in dma_sync_*
has a trivial mapping between bus and phys addresses.

	Arnd
--

From: Russell King - ARM Linux
Date: Thursday, July 29, 2010 - 4:47 am

It's not unrelated because it stands in the way of using that interface.
The patch also seems to be buggy in that it doesn't fix the for_device
case - it leaves 'off' as zero.

I'm also not sold on this idea that the sync_range API is being obsoleted.
It seems to me to be a step in the wrong direction.  The range API is a
natural subset of the 'normal' sync API, yet people are trying to shoehorn
the range API into the 'normal' API.  If anything it's the 'normal' API
which should be obsoleted as it provides reduced information to
implementations, which then have to start fuzzy-matching the passed
address.

If we're going to start fuzzy-matching the passed address, then I think
we also need to add detection of overlapping mappings and BUG() on such
cases - otherwise we risk the possibility of having multiple overlapping
mappings and hitting the wrong mapping with this reduced-information sync

Yes.  Virtually all ARM systems have non-cache coherent DMA.  Doesn't
matter if there's an IOMMU or not.
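
The fuzzy-matching concern raised above can be illustrated with a minimal
userspace sketch. This is not the dmabounce code: struct mapping,
find_mapping() and mappings_overlap() are hypothetical names made up for
this illustration. It shows a lookup that accepts an address anywhere
inside a mapped buffer, and the overlap test that would be needed to keep
such a lookup unambiguous.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical record of one streaming mapping: [base, base + size). */
struct mapping {
	uintptr_t base;
	size_t size;
};

/* Return nonzero if the two half-open ranges share at least one byte. */
static int mappings_overlap(const struct mapping *a, const struct mapping *b)
{
	return a->base < b->base + b->size && b->base < a->base + a->size;
}

/*
 * "Fuzzy" lookup: find the mapping that contains addr, which may point
 * anywhere inside the buffer rather than at its base. On success the
 * offset of addr within the mapping is stored in *off.
 */
static const struct mapping *find_mapping(const struct mapping *tbl, size_t n,
					  uintptr_t addr, size_t *off)
{
	size_t i;

	assert(off != NULL);
	for (i = 0; i < n; i++) {
		if (addr >= tbl[i].base && addr - tbl[i].base < tbl[i].size) {
			*off = addr - tbl[i].base;
			return &tbl[i];
		}
	}
	return NULL;
}
```

If two entries in the table overlap, find_mapping() silently returns
whichever comes first, which is the "hitting the wrong mapping" risk
described above; a range-style call that passes base plus offset does not
have that ambiguity.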
--

From: FUJITA Tomonori
Date: Thursday, July 29, 2010 - 11:14 pm

On Thu, 29 Jul 2010 12:47:26 +0100

Ah, sorry about the bug. Surely, the for_device needs to do the same
as the for_cpu. I've attached the updated patch.

We need to fix dmabounce.c anyway (even if we keep the sync_range API)

It would have been nice if you had opposed when this issue was
discussed...

commit 8127bfc5645db0e050468e0ff971b4081f73ddcf
Author: FUJITA Tomonori <fujita.tomonori@lab.ntt.co.jp>
Date:   Wed Mar 10 15:23:18 2010 -0800

    DMA-API.txt: remove dma_sync_single_range description


As you said, the range API might be safer (since it requires more
information). However, there were already drivers using the
dma_sync_single_for API to do a partial sync (i.e. do a sync on
range).

Inspecting all the usage of the dma_sync_single_for API to see which
drivers do a partial sync looks unrealistic. So keeping the
dma_sync_single_range_for API is pointless since drivers keep using 
dma_sync_single_for API.

And the majority of implementations don't use 'range' information,
i.e., the implementation of dma_sync_single_for and

Strict checking would be nice. If architectures can do such easily, we
had better do so.

However, I'm not sure we need to take special care for the
dma_sync_single_for API. In general, misuse of the majority of the DMA
functions is deadly.

=
From: FUJITA Tomonori <fujita.tomonori@lab.ntt.co.jp>
Subject: [PATCH] ARM: dmabounce: fix partial sync in dma_sync_single_* API

Some network drivers do a partial sync with
dma_sync_single_for_{device|cpu}. The dma_addr argument might not be
the same as the one passed into the mapping API.

This adds some tricks to find_safe_buffer() for
dma_sync_single_for_{device|cpu}.

Signed-off-by: FUJITA Tomonori <fujita.tomonori@lab.ntt.co.jp>
---
 arch/arm/common/dmabounce.c |   32 +++++++++++++++++++++++---------
 1 files changed, 23 insertions(+), 9 deletions(-)

diff --git a/arch/arm/common/dmabounce.c b/arch/arm/common/dmabounce.c
index cc0a932..dbd30dc 100644
--- ...
From: stepanm
Date: Wednesday, July 28, 2010 - 5:58 pm

Hi Arnd,

From what I have been able to tell, the IOMMU interface was written by
AMD/Intel to allow the kvm code to work with a common IOMMU interface. To
that end, it isn't really a generic IOMMU interface. We have chosen to use
our own interface because it provides us with a lightweight way of
managing mappings for more esoteric MSM-specific use cases.

These map functions also take into account the way in which we map buffers
that we get from our own physical pool, because the current API was not
intended to deal with prioritized allocation of things like on/off-chip
memory. We are currently evaluating how to use the DMA API with our own
specialized allocator, which has been undergoing some discussion on the
other lists. We would like to use this allocator to maximize TLB
performance, as well as to prioritize the allocation from several
different memory pools.

Steve

--

From: FUJITA Tomonori
Date: Wednesday, July 28, 2010 - 8:35 pm

On Wed, 28 Jul 2010 17:58:50 -0700 (PDT)

Don't confuse the IOMMU interface with the DMA API that Arnd
mentioned.

They are not related at all.

The DMA API is defined in Documentation/DMA-API.txt.

Arnd told you that include/asm-generic/dma-mapping-common.h is the
library to support the DMA API with multiple IOMMUs. Lots of
architectures (x86, powerpc, sh, alpha, ia64, microblaze, sparc)
use it.
--

From: Arnd Bergmann
Date: Thursday, July 29, 2010 - 1:26 am

Exactly, thanks for the clarification. I also didn't realize that there
is now an include/linux/iommu.h file that only describes the PCI SR-IOV
interfaces, unlike the generic IOMMU support that we have in your
include/linux/dma-mapping.h file.

Maybe we should rename linux/iommu.h to something more specific so we
can reduce this confusion in the future.

	Arnd
--

From: FUJITA Tomonori
Date: Thursday, July 29, 2010 - 1:35 am

On Thu, 29 Jul 2010 10:26:55 +0200

developers. The author might rethink.
--

From: Roedel, Joerg
Date: Thursday, July 29, 2010 - 1:40 am

That's not 100% true. They are not strictly related, but they are related
as they may use the same backend kernel drivers to provide their

The IOMMU-API is not about SR-IOV. It is about the capabilities of
modern IOMMU hardware that we can not provide to the kernel with the
DMA-API such as the ability to choose ourselves at which io-virtual
address a given cpu physical address should be mapped.
Also I wouldn't call the DMA-API an IOMMU interface. The API does not
depend on an IOMMU which is an important difference to the IOMMU-API.
The IOMMU-API is probably not generic enough to handle all kinds of
IOMMUs but it's closer to a generic IOMMU-API than the DMA-API.

		Joerg

-- 
Joerg Roedel - AMD Operating System Research Center

Advanced Micro Devices GmbH Einsteinring 24 85609 Dornach
General Managers: Alberto Bozzo, Andrew Bowd
Registration: Dornach, Landkr. Muenchen; Registerger. Muenchen, HRB Nr. 43632

--

From: FUJITA Tomonori
Date: Thursday, July 29, 2010 - 1:46 am

On Thu, 29 Jul 2010 10:40:19 +0200

That's true. However, the point is that include/iommu.h is far from
the IOMMU-API.

You could still insist that include/iommu.h is designed for the
generic IOMMU-API. But the fact is that it's designed for very
specific purposes. No intention to make it for generic purposes.

Since you added it two years ago, nobody has tried to extend
it. Instead, we have something like
arch/arm/plat-omap/include/plat/iommu.h.
--

From: Roedel, Joerg
Date: Thursday, July 29, 2010 - 2:06 am

I have no clue about the ARM iommus on the omap-platform. From a quick
look into the header file I see some similarities to the IOMMU-API. I am
also very open for discussions about how the IOMMU-API could be extended
to fit the needs of other platforms. Only because nobody has tried to

And I think we should try to merge this platform-specific functionality
into the IOMMU-API.

	Joerg

-- 
Joerg Roedel - AMD Operating System Research Center

Advanced Micro Devices GmbH Einsteinring 24 85609 Dornach
General Managers: Alberto Bozzo, Andrew Bowd
Registration: Dornach, Landkr. Muenchen; Registerger. Muenchen, HRB Nr. 43632

--

From: FUJITA Tomonori
Date: Thursday, July 29, 2010 - 2:14 am

On Thu, 29 Jul 2010 11:06:08 +0200

ARM's iommu stuff might be more appropriate as the IOMMU-API than

Well, the reason (nobody has tried) might be that linux/iommu.h
doesn't look something intended for the generic IOMMU-API.
--

From: Roedel, Joerg
Date: Thursday, July 29, 2010 - 2:25 am

How does it not look like a generic intention? The function names are
all generic and do not express that this API should only be used for
KVM. If you talk about the design of the API itself, it was designed for
the IOMMUs I was aware of at the time writing the API (in fact, the
initial design was not my own, it was a generalization of the VT-d
interfaces for KVM).
In other words it was a bottom-up approach to fit the needs of the time
it was written. But it's a kernel-only API so we can easily change it
and extend it for other users/iommus when the need arises. I think this
is the way we should go instead of letting each architecture design
their own IOMMU-interfaces.

		Joerg

-- 
Joerg Roedel - AMD Operating System Research Center

Advanced Micro Devices GmbH Einsteinring 24 85609 Dornach
General Managers: Alberto Bozzo, Andrew Bowd
Registration: Dornach, Landkr. Muenchen; Registerger. Muenchen, HRB Nr. 43632

--

From: Roedel, Joerg
Date: Thursday, July 29, 2010 - 2:28 am

Oh, and as an additional note, the reason might also be that people were
not aware of its existence :-)

-- 
Joerg Roedel - AMD Operating System Research Center

Advanced Micro Devices GmbH Einsteinring 24 85609 Dornach
General Managers: Alberto Bozzo, Andrew Bowd
Registration: Dornach, Landkr. Muenchen; Registerger. Muenchen, HRB Nr. 43632

--

From: FUJITA Tomonori
Date: Thursday, July 29, 2010 - 2:44 am

On Thu, 29 Jul 2010 11:28:21 +0200

No. People actually read it and think that it's not intended for
generic purposes, i.e., it was designed for VT-d/AMD-IOMMU with KVM:

http://lkml.org/lkml/2010/7/28/470

You designed it for what you need at the time. It should have been
named appropriately to avoid confusion. Later, when we actually
understand what other IOMMUs need, we can evolve the specific API for
generic purposes. Then we can rename the API to more generic.
--

From: Roedel, Joerg
Date: Thursday, July 29, 2010 - 3:01 am

This states the as-is situation. There is not a single sentence that
states why the iommu-api can't be extended to fit their needs. Nobody

At the time the iommu-api was written it was generic enough for what we
had. So it was designed as a generic API. At this point in time nobody
knew what the future requirements would be. So today it turns out that
it is not generic enough anymore for latest hardware. The logical
consequence is to fix this in the iommu-api.

		Joerg

-- 
Joerg Roedel - AMD Operating System Research Center

Advanced Micro Devices GmbH Einsteinring 24 85609 Dornach
General Managers: Alberto Bozzo, Andrew Bowd
Registration: Dornach, Landkr. Muenchen; Registerger. Muenchen, HRB Nr. 43632

--

From: Arnd Bergmann
Date: Thursday, July 29, 2010 - 4:25 am

Well, I think the real question is why we have two APIs that both claim
to work with IOMMUs in a generic way and how we could unify the two.

The Intel and AMD IOMMU drivers currently register at both the DMA
API and the IOMMU API. The first one is used by everything except
KVM and the second is only used by KVM.

I really think we should not extend the (KVM) IOMMU API further but
just use the generic DMA mapping api for KVM and extend it as necessary.
It already has the concept of cache coherency and mapping/unmapping
that are in the IOMMU API and could be extended to support domains
as well, through the use of dma_attrs.

	Arnd
--

From: Roedel, Joerg
Date: Thursday, July 29, 2010 - 5:12 am

The DMA-API itself does not claim to be an iommu-frontend. The purpose
of the DMA-API is to convert physical memory addresses into dma handles
and do all the management of these handles. Backend implementations can
use hardware iommus for this task. But depending on the hardware in the
system the DMA-API can very well be implemented without any hardware
support. This is an important difference to the IOMMU-API which needs

Right. But there is also a mode where the AMD IOMMU driver only

If we find a nice and clean way to expose lower-level iommu
functionality through the DMA-API, that's fine. We could certainly
discuss ideas in this direction.  I think this is going to be hard
because the DMA-API today does not provide enough flexibility to let the
user choose both sides of an io-virtual<->cpu-physical address mapping.
That's fine for most drivers because it makes sense for them to use the
generic io-address-allocator the DMA-API provides but not for KVM which
needs this flexibility.

	Joerg

-- 
Joerg Roedel - AMD Operating System Research Center

Advanced Micro Devices GmbH Einsteinring 24 85609 Dornach
General Managers: Alberto Bozzo, Andrew Bowd
Registration: Dornach, Landkr. Muenchen; Registerger. Muenchen, HRB Nr. 43632

--

From: Arnd Bergmann
Date: Thursday, July 29, 2010 - 6:01 am

Well, you could call that a limitation in the IOMMU API ;-)

The idea behind the DMA mapping API is to allow a device driver
to work without knowing if the hardware can, cannot or must use

One way to do this would be to add a new attribute, e.g.

enum dma_attr {
        DMA_ATTR_WRITE_BARRIER,
        DMA_ATTR_WEAK_ORDERING,
	DMA_ATTR_FIXED_MAPPING, /* this one is new */
        DMA_ATTR_MAX,
};

struct dma_attrs {
        unsigned long flags[__DMA_ATTRS_LONGS];
	dma_addr_t dest;
};

Nothing except for KVM would need to use that attribute, and KVM would
obviously need a way to check if this is supported by the underlying
implementation.
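
To make the proposal above concrete, here is a minimal userspace sketch of
how a caller might set such an attribute and how a backend might honor or
reject it. Everything except the enum values quoted above is an assumption
made for illustration: struct attrs, attrs_set(), attrs_test(),
model_map() and bus_addr_t are hypothetical stand-ins, not the kernel's
dma_attrs helpers.

```c
#include <assert.h>
#include <limits.h>
#include <stdint.h>
#include <string.h>

enum dma_attr {
	DMA_ATTR_WRITE_BARRIER,
	DMA_ATTR_WEAK_ORDERING,
	DMA_ATTR_FIXED_MAPPING,	/* the proposed new attribute */
	DMA_ATTR_MAX,
};

#define BITS_PER_ULONG	(sizeof(unsigned long) * CHAR_BIT)
#define ATTRS_LONGS	((DMA_ATTR_MAX + BITS_PER_ULONG - 1) / BITS_PER_ULONG)

typedef uint64_t bus_addr_t;	/* stand-in for dma_addr_t */

struct attrs {
	unsigned long flags[ATTRS_LONGS];
	bus_addr_t dest;	/* requested bus address for FIXED_MAPPING */
};

static void attrs_init(struct attrs *a)
{
	memset(a, 0, sizeof(*a));
}

static void attrs_set(struct attrs *a, enum dma_attr attr)
{
	assert(attr < DMA_ATTR_MAX);
	a->flags[attr / BITS_PER_ULONG] |= 1UL << (attr % BITS_PER_ULONG);
}

static int attrs_test(const struct attrs *a, enum dma_attr attr)
{
	return (a->flags[attr / BITS_PER_ULONG] >> (attr % BITS_PER_ULONG)) & 1;
}

/*
 * Toy map operation: with FIXED_MAPPING the caller picks the bus address
 * (or gets an error if the backend cannot do that); otherwise the backend
 * picks the next free address itself.
 */
static int model_map(const struct attrs *a, int backend_supports_fixed,
		     bus_addr_t next_free, bus_addr_t *out)
{
	if (a && attrs_test(a, DMA_ATTR_FIXED_MAPPING)) {
		if (!backend_supports_fixed)
			return -1;
		*out = a->dest;
		return 0;
	}
	*out = next_free;
	return 0;
}
```

The error return in model_map() corresponds to the "way to check if this
is supported" mentioned above: ordinary drivers pass no attributes and are
unaffected.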

	Arnd
--

From: stepanm
Date: Thursday, July 29, 2010 - 10:19 pm

Joerg,

Thanks for the information. I have been trying to adapt the MSM IOMMU
driver to use your IOMMU interface, and it looks like it might work, with
one minor modification.

Unlike a more traditional system with one IOMMU between the bus and
memory, MSM has multiple IOMMUs, with each one hard-wired to a dedicated
device. Furthermore, each IOMMU can have more than one translation
context. One of the use cases is being able to create mappings within
multiple instances of one context, and arbitrarily context-switch the
IOMMU on the fly.

It sounds like the domain abstraction and attach_device/detach_device can
encapsulate this rather nicely and I am in the process of updating my
driver to fit this framework.

My problem, however, is with iommu_domain_alloc(). This will set up a
domain and call the ops function to initialize it, but I want to be able
to pass it an "IOMMU id" that will tell the underlying driver which IOMMU
(and which "stream id") is to be associated with that domain instance.
This can be a void* parameter that gets passed through to domain_init. I
feel like this change will make it easy to deal with multiple
IOMMUs/translation contexts, and implementations that have only a singular
IOMMU/translation context are free to ignore that parameter.

The alternative for me is to have a separate msm_iommu_domain_alloc(void
*context_id) function, to which I can specify which IOMMU I want to use,
but I would like to fully use your API if possible.

What are your thoughts? I can prepare a patch if you like - the
domain_alloc change looks like it will be very innocuous.
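
For readers following along, the suggested change can be modeled in a few
lines of userspace C. This is only a sketch under assumptions: struct
domain, struct backend_ops, domain_alloc() and struct ctx_id are
hypothetical names, and the real iommu_domain_alloc() takes no such
parameter. The point is that an opaque pointer is forwarded to the
backend's domain_init, and a single-IOMMU backend can ignore it.

```c
#include <assert.h>
#include <stdlib.h>

struct domain;

/* Hypothetical backend ops; domain_init receives the opaque parameter. */
struct backend_ops {
	int (*domain_init)(struct domain *dom, void *priv);
};

struct domain {
	const struct backend_ops *ops;
	void *priv;	/* backend-private data, e.g. which IOMMU/context */
};

/* Allocate a domain and hand the caller's opaque id to the backend. */
static struct domain *domain_alloc(const struct backend_ops *ops, void *priv)
{
	struct domain *dom;

	assert(ops && ops->domain_init);
	dom = calloc(1, sizeof(*dom));
	if (!dom)
		return NULL;
	dom->ops = ops;
	if (ops->domain_init(dom, priv) != 0) {
		free(dom);
		return NULL;
	}
	return dom;
}

/* Identifies one translation context on one IOMMU (illustrative only). */
struct ctx_id {
	int iommu;
	int stream;
};

/* Backend with several IOMMUs: the id is mandatory. */
static int multi_domain_init(struct domain *dom, void *priv)
{
	if (!priv)
		return -1;
	dom->priv = priv;
	return 0;
}

/* Backend with a single IOMMU: the id is simply ignored. */
static int single_domain_init(struct domain *dom, void *priv)
{
	(void)priv;
	dom->priv = NULL;
	return 0;
}

static const struct backend_ops multi_ops = { .domain_init = multi_domain_init };
static const struct backend_ops single_ops = { .domain_init = single_domain_init };
```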

Thanks



--

From: Arnd Bergmann
Date: Friday, July 30, 2010 - 1:01 am

This probably best fits into the device itself, so you can assign the
iommu data when probing the bus, e.g. (I don't know what bus you use)

struct msm_device {
	struct msm_iommu *iommu;
	struct device dev;
};

This will work both for device drivers using the DMA API and for KVM

No, that would require adding msm specific code to KVM and potential
other users.

	Arnd
--

From: stepanm
Date: Friday, July 30, 2010 - 9:25 am

Right, this makes sense, and that is similar to how we were planning to
set the iommus for the devices. But my question is, how does the IOMMU API
know *which* IOMMU to talk to? It seems like this API has been designed
with a singular IOMMU in mind, and it is implied that things like
iommu_domain_alloc, iommu_map, etc all use "the" IOMMU. But I would like
to allocate a domain and specify which IOMMU it is to be used for.

I can think of solving this in several ways.
One way would be to modify iommu_domain_alloc to take an IOMMU parameter,
which gets passed into domain_init. This seems like the cleanest solution.
Another way would be to have something like msm_iommu_domain_bind(domain,
iommu) which would need to be called after iommu_domain_alloc to set the
domain binding.
A third way that I could see is to delay the domain/iommu binding until
iommu_attach_device, where the iommu could be picked up from the device
that is passed in. I am not certain of this approach, since I had not been
planning to pass in full devices, as in the MSM case this makes little
sense (that is, if I am understanding the API correctly). On MSM, each
device already has a dedicated IOMMU hard-wired to it. I had been planning
to use iommu_attach_device to switch between active domains on a specific
IOMMU and the given device would be of little use because that association
is implicit on MSM.

Does that make sense? Am I correctly understanding the API? What do you
think would be a good way to handle the multiple-iommu case?

Thanks
Steve

--

From: Arnd Bergmann
Date: Friday, July 30, 2010 - 2:59 pm

The primary key is always the device pointer. If you look e.g. at 
arch/powerpc/include/asm/dma-mapping.h, you find

static inline struct dma_map_ops *get_dma_ops(struct device *dev)
{
        return dev->archdata.dma_ops;
}

From there, you know the type of the iommu, each of which has its
own dma_ops pointer. The dma_ops->map_sg() referenced there is
specific to one (or a fixed small number of) bus_type, e.g. PCI
or in your case an MSM specific SoC bus, so it can cast the device
to the bus specific data structure:

int msm_dma_map_sg(struct device *dev, struct scatterlist *sg, int nents,
                enum dma_data_direction dir)
{
	struct msm_device *mdev = container_of(dev, struct msm_device, dev);

	...

The iommu_domain is currently a concept that is only used in KVM, and there
a domain currently would always span all of the IOMMUs that can host
virtualized devices. I'm not sure what you want to do with domains though.
Are you implementing KVM or another hypervisor, or is there another use
case?

I've seen discussions about using an IOMMU to share page tables with
regular processes so that user space can program a device to do DMA into
its own address space, which would require an IOMMU domain per process
using the device.

However, most of the time, it is better to change the programming model
of those devices to do the mapping inside of a kernel device driver
that allocates a physical memory area and maps it into both the BUS
address space (using dma_map_{sg,single}) and the user address space

My impression is that you are confusing the multi-IOMMU and the multi-domain
problem, which are orthogonal. The dma-mapping API can deal with multiple
IOMMUs as I described above, but has no concept of domains. KVM uses the
iommu.h API to get one domain per guest OS, but as you said, it does not
have a concept of multiple IOMMUs because neither Intel nor AMD require that
today.

If you really need multiple domains across multiple IOMMUs, I'd suggest that
we first ...
From: stepanm
Date: Friday, July 30, 2010 - 3:58 pm

One of our use cases actually does involve using domains pretty much as
you had described them, though only on one of the IOMMUs. That is, the
domain for that IOMMU basically abstracts its page table, and it is a
legitimate thing to switch out page tables for the IOMMU on the fly. I
guess the difference is that you described the domain as the set of
mappings made on ALL the IOMMUs, whereas I had envisioned there being one
(or more) domains for each IOMMU.


--

From: Arnd Bergmann
Date: Saturday, July 31, 2010 - 2:37 am

Can you be more specific on what kind of device would use multiple domains
in your case and how you intend to use them? Is this for some kind of DSP
interacting with user processes?

This seems to be a scenario that we haven't dealt with before (or perhaps
avoided intentionally), so if we need to make API changes, we should all
understand what we need them for. It's no problem to extend the API if you
have good reasons for using multiple domains, but I also want to make sure
that there isn't also a way to achieve the same or better result with the
current APIs.

	Arnd
--

From: Roedel, Joerg
Date: Monday, August 2, 2010 - 12:58 am

Hi Stephan,


The IOMMU-API supports multiple IOMMUs (at least multiple AMD/Intel
IOMMUs). But the fact that there is more than one IOMMU is hidden in
the backend driver implementation. The API itself only works with
domains and devices. The IOMMU driver needs to know which IOMMU it needs
to program for a given device. If I understand the concept of your

In the means of the IOMMU-API the domain is the abstraction of an
address space (in other words a page table). The IOMMU(s) which this domain
is later assigned to are determined by the iommu_attach_device calls.
I think the right way to go here is to create the concept of a
device-context in the IOMMU-API and add functions like

	iommu_attach_context(struct iommu_domain *domain,
			     struct iommu_context *ctxt);
	iommu_detach_context(struct iommu_context *ctxt);

This would work if you can determine in your iommu-driver which iommu
you need to program for which device. What do you think?
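
A toy userspace model of the attach/detach idea above, to show the
intended semantics: the domain stands for a page table, the context stands
for one translation context on one IOMMU, and switching page tables is a
detach followed by an attach. All names here (toy_domain, toy_context,
toy_attach_context, toy_detach_context) are hypothetical; struct
iommu_context does not exist in the IOMMU-API discussed here.

```c
#include <assert.h>
#include <stddef.h>

/* A domain models one address space, i.e. one page table. */
struct toy_domain {
	int id;
	int attached;	/* number of contexts currently using this domain */
};

/* A context models one translation context of one particular IOMMU. */
struct toy_context {
	int iommu;
	int ctx;
	struct toy_domain *domain;	/* NULL while detached */
};

/* Point the context at the domain's page table; fails if already in use. */
static int toy_attach_context(struct toy_domain *domain,
			      struct toy_context *ctxt)
{
	assert(domain && ctxt);
	if (ctxt->domain)
		return -1;
	ctxt->domain = domain;
	domain->attached++;
	return 0;
}

/* Drop whatever domain the context is using; a no-op when detached. */
static void toy_detach_context(struct toy_context *ctxt)
{
	assert(ctxt);
	if (!ctxt->domain)
		return;
	ctxt->domain->attached--;
	ctxt->domain = NULL;
}
```

Because the context already identifies which IOMMU it lives on, the caller
never has to name an IOMMU explicitly, which is the property asked for in
the earlier messages.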


	Joerg

-- 
Joerg Roedel - AMD Operating System Research Center

Advanced Micro Devices GmbH Einsteinring 24 85609 Dornach
General Managers: Alberto Bozzo, Andrew Bowd
Registration: Dornach, Landkr. Muenchen; Registerger. Muenchen, HRB Nr. 43632

--

From: Zach Pfeffer
Date: Monday, August 2, 2010 - 1:29 pm

Joerg, I'd like to make sure I understand this. A domain is an address
space separate from the actual page-tables that may be part of an
iommu_context, correct? After I iommu_attach_context the ctxt will
--

From: Roedel, Joerg
Date: Tuesday, August 3, 2010 - 2:23 am

A domain is defined by a single page-table which can be modified using
the iommu_map/iommu_unmap function calls. I am not completely sure what
you mean by an iommu_context. Can you describe what it means in your
context?

	Joerg

-- 
AMD Operating System Research Center

Advanced Micro Devices GmbH Einsteinring 24 85609 Dornach
General Managers: Alberto Bozzo, Andrew Bowd
Registration: Dornach, Landkr. Muenchen; Registerger. Muenchen, HRB Nr. 43632

--

From: Stepan Moskovchenko
Date: Tuesday, August 3, 2010 - 11:43 am

Joerg,
I think with some rework, all my use cases can be handled by your 
existing iommu API. If the domain is treated basically as a page table, 
there will be some changes, but I think it can be done. I will push a 
new version of my driver in a few days.

One thing that may be helpful for the future, however, is maybe 
something like adding iommu_tlb_flush to the ops. I suppose this would 
either have to take a device, or the domain would need to keep a list of 
devices it had been attached to (so that their TLBs can be invalidated). 
But I suppose on the other hand, iommu_map/unmap may be able to just 
implicitly invalidate the TLB also, since TLB invalidation often follows 
map/unmap. What are your thoughts?

Thanks
Steve
--

From: Roedel, Joerg
Date: Wednesday, August 4, 2010 - 2:52 am

Sounds good. I am curious for your patches :-)

For the TLB-flush question, I think it would make sense to add iommu
tlb flushing functions to the IOMMU-API. We currently flush the TLB
implicitly in the map/unmap calls but that's very inefficient. It would
be better to have a separate function for it in the API. The right
parameter for such a function is a domain. The IOMMU driver knows which
devices are attached to a domain and could easily flush all TLBs.

One alternative I can think of: An iommu_domain_commit() function which
syncs software changes of a domain to the hardware. The map/unmap calls
have to save which parts of the tlb need to be flushed and commit does
flush those parts then (or flush everything).
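
The commit alternative can be sketched in userspace as follows. This is an
illustration under assumptions, not an existing interface:
iommu_domain_commit() is only an idea at this point in the thread, and
struct lazy_domain, lazy_map(), lazy_unmap() and lazy_commit() are
hypothetical names. Map and unmap only record which io-virtual range
became stale; the flush happens once, at commit time.

```c
#include <assert.h>
#include <stdint.h>

struct lazy_domain {
	uint64_t dirty_start;	/* stale range is [dirty_start, dirty_end) */
	uint64_t dirty_end;	/* equal to dirty_start when nothing is stale */
	unsigned int flushes;	/* counts simulated TLB flushes */
};

/* Grow the recorded stale range so that it covers [iova, iova + size). */
static void note_dirty(struct lazy_domain *d, uint64_t iova, uint64_t size)
{
	assert(size > 0);
	if (d->dirty_start == d->dirty_end) {
		d->dirty_start = iova;
		d->dirty_end = iova + size;
		return;
	}
	if (iova < d->dirty_start)
		d->dirty_start = iova;
	if (iova + size > d->dirty_end)
		d->dirty_end = iova + size;
}

static void lazy_map(struct lazy_domain *d, uint64_t iova, uint64_t size)
{
	/* A real driver would update the page table here. */
	note_dirty(d, iova, size);
}

static void lazy_unmap(struct lazy_domain *d, uint64_t iova, uint64_t size)
{
	/* A real driver would clear the page table entries here. */
	note_dirty(d, iova, size);
}

/* Flush once for everything recorded so far; returns 1 if a flush ran. */
static int lazy_commit(struct lazy_domain *d)
{
	if (d->dirty_start == d->dirty_end)
		return 0;
	d->flushes++;	/* stands in for the hardware TLB invalidation */
	d->dirty_start = d->dirty_end = 0;
	return 1;
}
```

Tracking a single bounding range is the simplest choice; it may flush more
than strictly necessary when the touched ranges are far apart, which
matches the "or flush everything" fallback mentioned above.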

	Joerg

-- 
AMD Operating System Research Center

Advanced Micro Devices GmbH Einsteinring 24 85609 Dornach
General Managers: Alberto Bozzo, Andrew Bowd
Registration: Dornach, Landkr. Muenchen; Registerger. Muenchen, HRB Nr. 43632

--

From: Benjamin Herrenschmidt
Date: Friday, July 30, 2010 - 8:15 pm

Hrm, indeed I just noticed that. Pretty gross... it should definitely be
renamed, it will cause endless confusion with unrelated iommu.h and
iommu_* interfaces which represent something different.

Ben.


--

From: Roedel, Joerg
Date: Monday, August 2, 2010 - 12:48 am

The first direction to go should be trying to unify all the different
iommu* interfaces into the iommu-api. The generic api will definitely
need to be extended for that, but since it is an in-kernel interface
that's no problem.

-- 
Joerg Roedel - AMD Operating System Research Center

Advanced Micro Devices GmbH Einsteinring 24 85609 Dornach
General Managers: Alberto Bozzo, Andrew Bowd
Registration: Dornach, Landkr. Muenchen; Registerger. Muenchen, HRB Nr. 43632

--

From: Benjamin Herrenschmidt
Date: Monday, August 2, 2010 - 1:03 am

Well, I suppose I'm the de-facto candidate to take care of the powerpc
side then :-)

I don't have the bandwidth right now, but I'll try to have a look when
time permits.

Cheers,
Ben.


--

From: Roedel, Joerg
Date: Monday, August 2, 2010 - 1:10 am

Great :-)

-- 
Joerg Roedel - AMD Operating System Research Center

Advanced Micro Devices GmbH Einsteinring 24 85609 Dornach
General Managers: Alberto Bozzo, Andrew Bowd
Registration: Dornach, Landkr. Muenchen; Registerger. Muenchen, HRB Nr. 43632

--

From: FUJITA Tomonori
Date: Monday, August 2, 2010 - 1:30 am

On Mon, 02 Aug 2010 18:03:02 +1000

Have we already agreed on what the iommu-api looks like?

ARM's iommu code (arch/plat-omap/include/plat/iommu.h) is a library to
simplify the IOMMU implementations. It could be useful for all the
iommu implementations.

The current iommu-api (include/linux/iommu.h) provides the common
interface for specific purposes (for KVM).

I think that the current iommu-api can be a part of the former.

I also think that the IOMMU part of this new msm should be integrated
into the former.

Another question is how the above can work with the DMA-API.
--

From: Russell King - ARM Linux
Date: Monday, August 2, 2010 - 2:03 am

ITYM OMAP's iommu code.
--

From: FUJITA Tomonori
Date: Monday, August 2, 2010 - 2:20 am

On Mon, 2 Aug 2010 10:03:26 +0100

Yeah, I meant that we could extend it to make it useful for other
iommu implementations. At least, we could make something generic like
struct iommu_functions, I think. Then we can embed a generic iommu
structure into an iommu specific struct (like we do with inode).
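
The embedding pattern referred to above can be shown with a small
userspace sketch. Only the name struct iommu_functions comes from this
message; its single map member, struct generic_iommu, struct toy_hw_iommu
and generic_map() are assumptions made up for illustration, and
container_of is redefined locally because this is not kernel code.

```c
#include <assert.h>
#include <stddef.h>

/* Userspace equivalent of the kernel's container_of(). */
#define container_of(ptr, type, member) \
	((type *)((char *)(ptr) - offsetof(type, member)))

struct generic_iommu;

/* Table of implementation-specific operations (contents hypothetical). */
struct iommu_functions {
	int (*map)(struct generic_iommu *mmu, unsigned long iova,
		   unsigned long pa);
};

/* The generic part that common code knows about. */
struct generic_iommu {
	const struct iommu_functions *fns;
};

/* A hardware-specific IOMMU embeds the generic structure. */
struct toy_hw_iommu {
	unsigned long last_iova;
	unsigned long last_pa;
	struct generic_iommu generic;
};

/* Backend callback: recover the outer structure from the embedded one. */
static int toy_hw_map(struct generic_iommu *mmu, unsigned long iova,
		      unsigned long pa)
{
	struct toy_hw_iommu *hw = container_of(mmu, struct toy_hw_iommu,
					       generic);

	hw->last_iova = iova;
	hw->last_pa = pa;
	return 0;
}

static const struct iommu_functions toy_hw_fns = { .map = toy_hw_map };

/* Common code only ever sees the generic structure. */
static int generic_map(struct generic_iommu *mmu, unsigned long iova,
		       unsigned long pa)
{
	assert(mmu && mmu->fns && mmu->fns->map);
	return mmu->fns->map(mmu, iova, pa);
}
```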

The current iommu-api (include/linux/iommu.h) is just about domain and
mapping concept. We can implement it on the top of the above
infrastructure.

I'm still trying to figure out how the DMA-API can work well with
them.
--

From: Russell King - ARM Linux
Date: Monday, August 2, 2010 - 3:04 am

I'm not sure it can in totality.  The DMA-API solves the host CPU <->
device aspect of DMA support only.

However, there is another use case for IOMMUs, which is to allow two
separate peripheral devices to communicate with each other via system
memory.  As the streaming DMA-API involves the idea of buffer ownership
(of a singular agent), it is unsuitable for this use case.

The coherent allocation part of the DMA-API also only deals with the
idea of there being a singular DMA agent accessing the allocated buffer
(in conjunction with the host CPU).
--

From: FUJITA Tomonori
Date: Monday, August 2, 2010 - 8:26 am

On Mon, 2 Aug 2010 11:04:19 +0100

I don't have a clear idea what kinda API works well in the above case
yet.

But we have been talking about bigger things? Not just about the
interface, how things work together.

- OMAP's iommu code is a library to simplify OMAP implementations.

- include/linux/iommu.h is an interface to IOMMUs for specific
  interfaces.

- the DMA API could access IOMMUs internally.

IOMMU library provides generic iommu interfaces (a structure including
IOMMU implementation specific function pointers). KVM uses some of
them. Architectures could implement the DMA API on top of some of the
interfaces too.
--

From: Roedel, Joerg
Date: Monday, August 2, 2010 - 2:45 am

Well, we are currently trying to figure out how to extend the IOMMU-API
concepts to fit the omap-hardware in. That's what I am currently discussing
with Stephan. It looks to me that we need to add the concept
of device contexts to the IOMMU-API. We should also add IO-TLB
management functions. The TLB management is currently handled completely
in the backend driver. This needs to be changed and makes sense for

To me it looks like a very hardware specific library. But it should fit
well in the domain/device concept the IOMMU-API provides (when we also

The IOMMU-API is not limited to the purposes of KVM. There is
currently development effort to use the IOMMU-API for UIO stuff. So the

This would work if we handle every device-context the platform provides
as 'struct device'. But does that really need to work with the DMA-API?
What is the driver use-case for that?

		Joerg

-- 
Joerg Roedel - AMD Operating System Research Center

Advanced Micro Devices GmbH Einsteinring 24 85609 Dornach
General Managers: Alberto Bozzo, Andrew Bowd
Registration: Dornach, Landkr. Muenchen; Registerger. Muenchen, HRB Nr. 43632

--

From: Roedel, Joerg
Date: Monday, August 2, 2010 - 1:35 am

Btw. I have some ideas to extend the IOMMU-API to also support GART-like
IOMMUs. These pieces could also support (limited-size) domains (without
isolation) using segmentation. Not sure if this makes sense for the
use-cases in other architectures but we should not declare this
impossible for now.

-- 
Joerg Roedel - AMD Operating System Research Center

Advanced Micro Devices GmbH Einsteinring 24 85609 Dornach
General Managers: Alberto Bozzo, Andrew Bowd
Registration: Dornach, Landkr. Muenchen; Registerger. Muenchen, HRB Nr. 43632

--

From: Benjamin Herrenschmidt
Date: Friday, July 30, 2010 - 7:30 pm

Also, some of the iommu layer actually originates from powerpc.

Cheers,
Ben.


--
