Re: [PATCH] pasemi_dma: Driver for PA Semi PWRficient on-chip DMA engine

Previous thread: [PATCH] Alchemy Semi Au1000 pcmcia driver: convert pcmcia_sockets_lock in a mutex by Matthias Kaehlcke on Thursday, March 6, 2008 - 3:22 pm. (4 messages)

Next thread: Toggling preemption on a running kernel by Matthew Hodgson on Thursday, March 6, 2008 - 5:04 pm. (2 messages)
From: Olof Johansson
Date: Thursday, March 6, 2008 - 4:39 pm

pasemi_dma: Driver for PA Semi PWRficient on-chip DMA engine
    
First cut at a dma copy offload driver for PA Semi PWRficient. It uses the
platform-specific functions to allocate channels, etc.

Signed-off-by: Olof Johansson <olof@lixom.net>


---

This has some dependencies on other patches currently queued up in the
powerpc git trees for 2.6.26. I'd appreciate reviews and acked-bys, but
it might be easiest to just merge it up the powerpc path due to the
dependencies.


-Olof

diff --git a/drivers/dma/Kconfig b/drivers/dma/Kconfig
index 27340a7..bbeaf10 100644
--- a/drivers/dma/Kconfig
+++ b/drivers/dma/Kconfig
@@ -54,6 +54,13 @@ config FSL_DMA_SELFTEST
 	  Enable the self test for each DMA channel. A self test will be
 	  performed after the channel probed to ensure the DMA works well.
 
+config PASEMI_DMA
+	tristate "PA Semi DMA Engine support"
+	depends on PPC_PASEMI
+	select DMA_ENGINE
+	help
+	  Enable support for the DMA Engine on PA Semi PWRficient SoCs
+
 config DMA_ENGINE
 	bool
 
diff --git a/drivers/dma/Makefile b/drivers/dma/Makefile
index c8036d9..6729959 100644
--- a/drivers/dma/Makefile
+++ b/drivers/dma/Makefile
@@ -4,3 +4,4 @@ obj-$(CONFIG_INTEL_IOATDMA) += ioatdma.o
 ioatdma-objs := ioat.o ioat_dma.o ioat_dca.o
 obj-$(CONFIG_INTEL_IOP_ADMA) += iop-adma.o
 obj-$(CONFIG_FSL_DMA) += fsldma.o
+obj-$(CONFIG_PASEMI_DMA) += pasemi_dma.o
diff --git a/drivers/dma/pasemi_dma.c b/drivers/dma/pasemi_dma.c
new file mode 100644
index 0000000..844ab11
--- /dev/null
+++ b/drivers/dma/pasemi_dma.c
@@ -0,0 +1,478 @@
+/*
+ * Driver for the PA Semi PWRficient DMA Engine (copy parts)
+ * Copyright (c) 2007,2008 Olof Johansson, PA Semi, Inc
+ *
+ * This program is free software; you can redistribute it and/or modify it
+ * under the terms and conditions of the GNU General Public License,
+ * version 2, as published by the Free Software Foundation.
+ *
+ * This program is distributed in the hope it will be useful, but WITHOUT
+ * ANY WARRANTY; without ...
From: Stephen Rothwell
Date: Thursday, March 6, 2008 - 5:31 pm

Hi Olof,

Just one thing I noticed ...


device_dependency_added is going away in 2.6.26 ...

--=20
Cheers,
Stephen Rothwell                    sfr@canb.auug.org.au
http://www.canb.auug.org.au/~sfr/
From: Olof Johansson
Date: Thursday, March 6, 2008 - 6:35 pm

Thanks, I guess I need to base this on top of both powerpc and the
async-tx git trees and just merge it late.


-Olof
--

From: Andrew Morton
Date: Tuesday, March 11, 2008 - 12:06 am

Applied this on Paul's latest and powerpc allmodconfig goes boom.

drivers/dma/pasemi_dma.c: In function `pasemi_dma_alloc_chan_resources':
drivers/dma/pasemi_dma.c:152: error: `PAS_DMA_TXCHAN_CFG_TY_COPY' undeclared (first use in this function)
drivers/dma/pasemi_dma.c:152: error: (Each undeclared identifier is reported only once
drivers/dma/pasemi_dma.c:152: error: for each function it appears in.)
drivers/dma/pasemi_dma.c:154: error: `PAS_DMA_TXCHAN_CFG_LPDQ' undeclared (first use in this function)
drivers/dma/pasemi_dma.c:155: error: `PAS_DMA_TXCHAN_CFG_LPSQ' undeclared (first use in this function)
drivers/dma/pasemi_dma.c: In function `pasemi_dma_probe':
drivers/dma/pasemi_dma.c:394: error: structure has no member named `device_dependency_added'


Also this driver from git-md-accel is pretty sick:


drivers/dma/fsldma.c:439: warning: comparison of distinct pointer types lacks a cast
drivers/dma/fsldma.c: In function `fsl_chan_xfer_ld_queue':
drivers/dma/fsldma.c:584: warning: long long unsigned int format, dma_addr_t arg (arg 4)
drivers/dma/fsldma.c: In function `fsl_dma_chan_do_interrupt':
drivers/dma/fsldma.c:661: warning: unsigned int format, different type arg (arg 5)
drivers/dma/fsldma.c:677: warning: long long unsigned int format, dma_addr_t arg (arg 4)
drivers/dma/fsldma.c:677: warning: long long unsigned int format, dma_addr_t arg (arg 5)
drivers/dma/fsldma.c:694: warning: unsigned int format, different type arg (arg 4)
drivers/dma/fsldma.c: In function `fsl_dma_self_test':
drivers/dma/fsldma.c:833: warning: int format, different type arg (arg 5)
drivers/dma/fsldma.c: In function `of_fsl_dma_probe':
drivers/dma/fsldma.c:1003: warning: unsigned int format, different type arg (arg 5)
drivers/dma/fsldma.c: At top level:
drivers/dma/fsldma.c:723: warning: 'fsl_dma_callback_test' defined but not used

--

From: Olof Johansson
Date: Tuesday, March 11, 2008 - 7:25 am

It's dependent on my latest pull request of pasemi.git for-2.6.26 that

.. and that one is caused by recent changes in async_tx.git. I was
waiting on other review comments from the DMA maintainers before

Yeah, Zhang Wei posted a patch for that on lkml yesterday.


-Olof
--

From: Andrew Morton
Date: Tuesday, March 11, 2008 - 10:53 am

On Tue, 11 Mar 2008 09:25:45 -0500

Maybe we should get that tree into -mm and/or linux-next.



OK.
--

From: Dan Williams
Date: Tuesday, March 11, 2008 - 11:15 am

I recently moved the async_tx git tree to kernel.org.  I gave a heads up
[1].  Here is the reasoning:

<quote>
For -mm please replace the 'git-md-accel' url with the following, it
should also be renamed to 'git-async-tx' as anything that impacts MD
will go through Neil.

	git://git.kernel.org/pub/scm/linux/kernel/git/djbw/async_tx.git upstream

My fault for not pushing out this cleanup to the old url while the
git-md-accel changeover was pending.

--
Dan

[1] http://marc.info/?l=linux-kernel&m=120465351720649&w=2


--

From: Kumar Gala
Date: Tuesday, March 11, 2008 - 11:29 am

Any reason not to push Zhang's fixes for 2.6.25?

I also have a patch I want to push for 2.6.25 to deal with the powerpc  
device tree.  I was going to handle this via the powerpc tree's since  
that seems to be the model we have used for netdev and other drivers  
that play with arch/powerpc.  I hope that's not an issue (I'll CC you  
on the patch).

- k
--

From: Dan Williams
Date: Tuesday, March 11, 2008 - 1:37 pm

I plan to push these fixes for 2.6.25.  Olof had some valid comments

Not an issue at all.  Just give me a heads up on what you want me to
pick up and what you want me to just ack.

[1] http://marc.info/?l=linux-kernel&m=120524510630957&w=2

--

From: Dan Williams
Date: Tuesday, March 11, 2008 - 10:04 am

Apologies for not getting to this sooner.

I notice that the driver does not handle callbacks in its descriptor
cleanup path.  This could be ok if your intent is only to support the
net_dma style polled operations, but this will not work for the
raid-offload async_tx case.  I think the solution is for async_tx to

Is spin_lock_bh() insufficient here?

...that's all that jumps out at the moment.

Regards,
Dan

--

From: Olof Johansson
Date: Thursday, March 13, 2008 - 12:54 pm

Good point, and correct - I have mostly been testing this with the
NET_DMA offload. async_tx doesn't make use of just memcpy, does it?

Also, how is DMA_INTERRUPT supposed to work? I see there's a separate
"prep_dma_interrupt" function, but that doesn't make sense to me. Don't
you want the interrupt associated with a specific transaction instead
of added as a separate (empty) transaction?

Once I add the descriptor to the ring, I can't change it to set the
interrupt request bit on it. I suppose I could just add a dummy descriptor

Can't do that if it's called both from the polling as well as the IRQ
context, it'd need to hold off irqs. I.e. once I add the DMA_INTERRUPT
support it will be needed.


-Olof

--

From: Dan Williams
Date: Thursday, March 13, 2008 - 3:29 pm

Right, it makes use of any capability it can get its hands on,

Yes, and that is what happens in most cases.  The additional
prep_dma_interrupt method is for two special cases:
1/ Locations where code is submitting an indeterminate number of
operations and wants an interrupt (callback) at the completion of the
chain.  Someting like:
list_for_each_entry()
     async_memcpy()
async_trigger_callback()

2/ For supporting channel switching on, for instance, a memcpy->xor
chain where chan1 supports memcpy and chan2 supports xor.  When
async_tx sees this it injects an interrupt for chan1.  The xor
operation gets kicked off by the bottom half of the chan1 interrupt
handler.  However, if chan1 can do both operations no interrupt is



Regards,
Dan
--

From: Olof Johansson
Date: Thursday, March 13, 2008 - 4:14 pm

Badly worded question, but it got answered anyway. What I really meant
to as was "does async_tx use memcpy", I thought it only used xor. Good

Ok, one could argue that it'd make more sense to have a way to issue a

Well, it'd be slightly more efficient to do add the interrupt attribute
to the last issued descriptor when it's known in advance. If the
underlying driver doesn't support it, adding a separate descriptor would
be a good fallback.

Anyway, this isn't likely to be a performance bottleneck. If it turns

Why? That just adds overhead and latency.


-Olof
--

From: Dan Williams
Date: Thursday, March 13, 2008 - 5:06 pm

When it is known in advance the interrupt attribute *is* set,

Would not hurt to have another pair of eyes on this part of the code.
A rewrite of the channel switch mechanism is currently pending in
async_tx.git#upstream.


The original ioat_dma code used nothing heavier than spin_lock_bh.
Async_tx now assumes that local_bh_disable prevents races with any
channel's cleanup routine.  Clients can place the same kind of code in
an async_tx callback as they would in a timer callback.  The
assumption is that code using async_tx can afford its extra overhead,
which is true for raid.  This is also why you don't see async_memcpy
calls in net_dma.

--
Dan
--

From: Olof Johansson
Date: Sunday, March 16, 2008 - 2:30 pm

pasemi_dma: Driver for PA Semi PWRficient on-chip DMA engine

DMA copy offload driver for PA Semi PWRficient. It uses the
platform-specific functions to allocate channels, etc.

Signed-off-by: Olof Johansson <olof@lixom.net>

---

Changes since last post:

* Add DMA_INTERRUPT support and handling of the interrupt flag
* Fix interrupt handler for above and add tasklet
* Switch to spin_lock_bh() where possible
* Remove empty dependency_added() function since it's no longer
  used in the framework.
* Fix bug in "ring full" estimation.

Note that this still needs to go on top of the powerpc.git tree due to the
pasemi_dma.h updates that this driver depends on. I suggest merging this
through pasemi.git->powerpc.git->linus with an Acked-by from the DMA guys.


-Olof

diff --git a/drivers/dma/Kconfig b/drivers/dma/Kconfig
index a703def..e4fd7e5 100644
--- a/drivers/dma/Kconfig
+++ b/drivers/dma/Kconfig
@@ -37,6 +37,13 @@ config INTEL_IOP_ADMA
 	help
 	  Enable support for the Intel(R) IOP Series RAID engines.
 
+config PASEMI_DMA
+	tristate "PA Semi DMA Engine support"
+	depends on PPC_PASEMI
+	select DMA_ENGINE
+	help
+	  Enable support for the DMA Engine on PA Semi PWRficient SoCs
+
 config DMA_ENGINE
 	bool
 
diff --git a/drivers/dma/Makefile b/drivers/dma/Makefile
index b152cd8..72bf45c 100644
--- a/drivers/dma/Makefile
+++ b/drivers/dma/Makefile
@@ -3,3 +3,4 @@ obj-$(CONFIG_NET_DMA) += iovlock.o
 obj-$(CONFIG_INTEL_IOATDMA) += ioatdma.o
 ioatdma-objs := ioat.o ioat_dma.o ioat_dca.o
 obj-$(CONFIG_INTEL_IOP_ADMA) += iop-adma.o
+obj-$(CONFIG_PASEMI_DMA) += pasemi_dma.o
diff --git a/drivers/dma/pasemi_dma.c b/drivers/dma/pasemi_dma.c
new file mode 100644
index 0000000..ad6235b
--- /dev/null
+++ b/drivers/dma/pasemi_dma.c
@@ -0,0 +1,532 @@
+/*
+ * Driver for the PA Semi PWRficient DMA Engine (copy parts)
+ * Copyright (c) 2007,2008 Olof Johansson, PA Semi, Inc
+ *
+ * This program is free software; you can redistribute it and/or modify it
+ * under the ...
From: Dan Williams
Date: Monday, March 17, 2008 - 11:46 am

Hi Olof,

Looks good, makes me want to go back and cleanup iop-adma a bit.  A

Ok, it still may not compile in mainline until after 2.6.26-rc1 due to
additional dmaengine cleanups like the ack-to-flags change I posted


Clients do not submit new operations in their callback routines so it
is "ok" to hold this lock over the callback.

Also, if your platform will need to support channel switching at some
point you can go ahead and add a call to async_tx_run_dependencies()
here.
--

From: Olof Johansson
Date: Monday, March 17, 2008 - 5:27 pm

Either go through -mm where Andrew can keep it applied in appropriate
order and send upstream, or just merge it late. It's a new driver, and
they're normally OK to go in a little later. That might be the easiest

Thanks!

-Olof

--

From: Nelson, Shannon
Date: Monday, March 17, 2008 - 1:34 pm

Hi Olof,

In the future please copy Maciej as one of "the DMA guys" as he has
taken over ioatdma for me.  Beyond that, one little picky comment

Is the number of channels defaulting to 2 or 4?

sln

--

From: Olof Johansson
Date: Monday, March 17, 2008 - 5:21 pm

Hi,


Time to set up a list, or have everyone monitor lkml, I'd say. I'd

Ah, yes, good catch.


-Olof
--

Previous thread: [PATCH] Alchemy Semi Au1000 pcmcia driver: convert pcmcia_sockets_lock in a mutex by Matthias Kaehlcke on Thursday, March 6, 2008 - 3:22 pm. (4 messages)

Next thread: Toggling preemption on a running kernel by Matthew Hodgson on Thursday, March 6, 2008 - 5:04 pm. (2 messages)