The PL330 is currently the dma controller using at the S5PC1XX arm SoC. This supports DMA_MEMCPY and DMA_SLAVE. The datasheet for the PL330 can find below url: http://infocenter.arm.com/help/topic/com.arm.doc.ddi0424a/DDI0424A_dmac_pl330_r0p0_trm... Signed-off-by: Joonyoung Shim <jy0922.shim@samsung.com> --- Change log: v2: Convert into an amba_device driver. Code clean and update from v1 patch review. drivers/dma/Kconfig | 7 + drivers/dma/Makefile | 1 + drivers/dma/pl330_dmac.c | 900 ++++++++++++++++++++++++++++++++++++++++++++ drivers/dma/pl330_dmac.h | 175 +++++++++ include/linux/amba/pl330.h | 64 ++++ 5 files changed, 1147 insertions(+), 0 deletions(-) create mode 100644 drivers/dma/pl330_dmac.c create mode 100644 drivers/dma/pl330_dmac.h create mode 100644 include/linux/amba/pl330.h diff --git a/drivers/dma/Kconfig b/drivers/dma/Kconfig index c27f80e..5989b6e 100644 --- a/drivers/dma/Kconfig +++ b/drivers/dma/Kconfig @@ -149,6 +149,13 @@ config AMCC_PPC440SPE_ADMA help Enable support for the AMCC PPC440SPe RAID engines. +config PL330_DMAC + bool "PrimeCell DMA Controller(PL330) support" + depends on ARCH_S5PC1XX + select DMA_ENGINE + help + Enable support for the PrimeCell DMA Controller(PL330) support. + config ARCH_HAS_ASYNC_TX_FIND_CHANNEL bool diff --git a/drivers/dma/Makefile b/drivers/dma/Makefile index 22bba3d..d98be12 100644 --- a/drivers/dma/Makefile +++ b/drivers/dma/Makefile @@ -20,3 +20,4 @@ obj-$(CONFIG_TXX9_DMAC) += txx9dmac.o obj-$(CONFIG_SH_DMAE) += shdma.o obj-$(CONFIG_COH901318) += coh901318.o coh901318_lli.o obj-$(CONFIG_AMCC_PPC440SPE_ADMA) += ppc4xx/ +obj-$(CONFIG_PL330_DMAC) += pl330_dmac.o diff --git a/drivers/dma/pl330_dmac.c b/drivers/dma/pl330_dmac.c new file mode 100644 index 0000000..4427110 --- /dev/null +++ b/drivers/dma/pl330_dmac.c @@ -0,0 +1,900 @@ +/* + * pl330_dmac.c -- Driver for PL330 DMA Controller + * + * Copyright (C) 2009 Samsung Electronics ...
On Thu, Mar 25, 2010 at 12:17 PM, Joonyoung Shim [CC'ing Russell and Ben as stakeholders] Dear Maintainers I too have been writing a driver for PL330 after taking into account the suggestions of Russell, Ben and other participants of the thread http://lists.infradead.org/pipermail/linux-arm-kernel/2010-February/009856.html If you don't think this driver conflicts with the theme of the thread, may I ask you to please put this driver on hold until you checkout my implementation of solution to the issue... which should be soon. Regards, Jaswinder Singh Solution-1 Group Linux Kernel Team System LSI Division Samsung Electronics Co., Ltd. --
Please post the code as it looks today even if it's not compiling instead of asking others to hold their patches back. It will be obvious from what you have if there is some special use you're covering. Perhaps Joonyoung can simply port over the stuff you need to this driver if you show your code. Yours, Linus Walleij --
On Thu, Mar 25, 2010 at 5:30 PM, Linus Walleij My approach is to write a separate PL330 core driver as the backend which can be reused by any DMA API implementer driver. That will avoid having two copies of the PL330 driver, among other benefits. And if this patch is accepted, there _will_ exist two copies of the PL330 driver -- one in drivers/dma/pl330_dmac.c and another in arch/arm/plat-samsung/. Only the former will be lying unused until some other SoC vendor decided to use PL330, because S3C has come too long a way to change its drivers to driver/dma/ API and modify DMA drivers for every SoC. I plan something like, arch/arm/common/pl330-core.c implementing the specs in http://infocenter.arm.com/help/topic/com.arm.doc.ddi0424a/DDI0424A_dmac_pl330_r0p0_trm... and drivers/dma/pl330.c implement DMA API for SoCs that chose to use it... and arch/arm/plat-samsung/dma-pl330.c implementing regular S3C DMA API. I don't claim to have a silver bullet, nobody has atm, but my approach is at least more aligned with what maintainers want. I have the pl330-core part almost ready, but i need time to implement some _testable_ implementation of the scheme. If maintainers want to see structure of my code, I can Having worked on Samsung SoCs(with PL330 DMAC) based products, I would be _very_ surprised if any user found this implementation useful. Let alone testing, this implementation can't even explain usability for fast peripherals with shallow FIFOs. I didn't give feedback for this patch because I am not sure if this is the right way to go at all. regards. --
This is the wrong attitude. If it were not for a simple oversight Joonyoung's driver would already be upstream for the past two kernel releases. So you need to work together to improve that driver to incorporate what you need. It sounds like you just need to add an extension for the arch specific dma api. At first glance this can mimic the approach taken by Nobuhiro-san with the shdma driver. -- Dan --
Nothing wrong in attitude here. Giving feedback on the code only comes after one is convinced with the overall approach taken. The last time I raised the PL330 driver issue, most people were not enthusiastic with this drivers/dma/ approach. I wasn't active mainline discussions when the driver was originally submitted a few months ago. And now my replies are not very 'polite' because theres a lot going on I actually plan more than that. Apart from inefficient design, JoonYoung's driver has made some fatal assumptions about PL330, which will result in DMA aborts if used with SoCs that implement configuration of PL330 that is very different from Samsung SoCs' Of course, I address all such issues that I can think of, in my implementation. regards. --
Ok, I'll rely on acked-by's from the interested parties once your driver is out as I do not have a vested interest in this hardware, just the dmaengine framework issues. -- Dan --
Exactly what? We are already tested it our board and play the music well. As your previous mail, we can wait until this weekend. but If you don't post the your codes until this weekend. We assume that your works are not yet done so merge this patch first and then fix it if your words(DMA aborts issue) are right. Thank you, Kyungmin Park --
On Fri, Mar 26, 2010 at 9:29 AM, Kyungmin Park I didn't want to get into gory details, but it seems I need to explain ..... Please see my quick review to the original patch in this thread. It's not upto you to give me any deadline. Not the "Until this weekend" one !!! I think only the word of the maintainers count. --
I can wait your implementation and wonder what is the issue also. I welcome you try other design and want better driver is committed at mainline kernel too. --
Good to have you at last reply to my posts --
Why not just post it on the list? I'm curious! Since I'm working on a PrimeCell DMA API I would love to look at PrimeCell DMA engine drivers. Yours, Linus Walleij --
On Fri, Mar 26, 2010 at 12:20 AM, Linus Walleij The amount of code that will be modified or taken out of drivers/dma/pl330_dmac will so much that I will be left only with constrained data structures in the file to do tricks to make it work with the PL330 engine driver. I am not very keen on authoring the driver/dma/ driver but neither am I I'll post in a day or two when the PL330 core driver takes come shape closer to what it is supposed to look. That will help me getting suggestions for improvement, i hope. regards. --
On Fri, Mar 26, 2010 at 12:20 AM, Linus Walleij
Here is untested but only compilable PL330 engine driver, with the following
features/bugs/limitations ...
o The DMA API driver submits 'request' to PL330 engine.
A request is a sequence of DMA 'xfers' to be done before the DMA
API driver
wants to be notified.
A 'xfer' is the finest grain of DMA transaction that specifies
how many _bytes_
are to be move from 'source address' to 'destination address'.
A req can be a scatter-Gather-List.
o PL330 engine accepts requests from DMA API drivers in ping-pong manner,
i.e, at any time maximum two reqs can be queued. Other reqs have
to be buffered by DMA API drivers and enqueued whenever a req-finish
callback is made.
o Only necessary resources for a DMAC are allocated according to the
configuration
read from DMAC during startup.
Since MicroCode buffers are a channel's resource, they are
allocated just once
during startup and reused for every other xfer. A channel has its
MC buff divided into
two equal parts - one for each ping-pong request. The DMA API driver can
specify the size of MC buffer that each channel(hence req) has,
which decides the
maximum possible total xfer size that the channel can do in one request.
o It is possible to do DMA request at different channel configuration
every time; this feature can
be used by DMA API drivers for implementing 'server' channels that
are not dedicated
to a particular client but accept requests from any client. Of
course, such channels won't
be able to guarantee any QOS.
o Secure, Privilege and InsnAccess modes can be specified for each request.
o TODO: Desirable is to implement true LLI using MicroCode
modification during each
request enqueue, so that the xfer continues even while IRQ is
handled and callbacks made.
To me, there doesn't seem to be a way to flush ICACHE of a channel
without halting it, so we
can't modify MicroCode in ...Hi Jassi, this is looking good. The only advantage of the other driver by Joonyoung is that it is finished and ready for integration. If you finalize the DMA devices/engine API and post this in time for the next merge window I would easily vote for including this one rather than the other one. (Whatever that means for the world.) Simply for technical merits. It's sad that you two have done duplicate work but such is life.. I understand it that as this is the core engine so you intend to keep the core in arch/arm/common/* and then a separate interface to the DMAdevices implementing <linux/dmaengine.h> in drivers/dma/ and this is what the "DMA API" referenced below refers to? In that case I really like this clear separation between hardware driver and DMA devices/engine API. And I see that the DMA API is not far away. If you implement it you will be able to excersise this with the DMA engine memcpy test to assure it's working. There is nothing wrong with moving this entire thing except the header file into drivers/dma it will be more comfortable there, with the other DMA drivers. Whether the header should be in include/linux/amba or include/linux/dma is however a good question for the philosophers, but I would stick it into linux/amba with the rest of the PrimeCells. But perhaps you have better ideas. This hints that there is some other patch to provide that API This is great, do you also plan to support that for M<->M xfers like we added for the DMA40? Then we might want to lift that into the generic True, not as elegant as being able to do it with microcode but From the DMA API level in the PrimeCell drivers the crucial driver that need something like this is the AMBA PL011 UART driver, RX part, where data comes in from the outside and we have no control over the data flow. I trigger one transfer to a buffer here, then wait for it to complete or be interrupted. If it completes, I immediately trigger another transfer to the second buffer before I start ...
On Fri, Apr 2, 2010 at 8:23 AM, Linus Walleij Thank you. I am working on it full time. This submission was solely as a preview, the code needs to be optimized, polished and tested, though I don't anticipate much changes in the API. By when should I submit the final version? Yes, drivers/dma/pl330.c could be one DMA API driver. Another may implement S3C DMA API and reside in arch/arm/plat-samsung/ or wherever the S3C PL080 driver is atm. So, DMA API driver is the frontend that makes use of the arch/arm/common/pl330.c Yes, implementing PL330 core was supposed to be the major challenge. DMA API drivers should be easy, though I plan to first test the pl330.c with S3C DMA API because that way I'll have a benchmark to compare the stability and performance of this new driver. Of course, I'll like to see driver/dma driver as well, but maybe JoonYoung wants For platforms that choose to implement their own DMA API (how good or bad is a different subject), arch/arm/common/ seems be more appropriate for this pl330.c Right, and that is what I call Client/DMA-API driver. Though the patch is not I hope the driver is efficient/fast enough for some tough test cases I have, otherwise I might have to modify or add to the API to implement this If I get it right, that is common issue with any 'slave-receiver' type device and it might do good to have timeout option support in DMA API for such receive requests. That is provide whatever data is collected within some amount of time, Even with this implementation, for PAUSE, the DMA API driver can call pl330_chan_status, saving remaining requests locally and calling That is mostly to self. It just means that every variable in the block must be analyzed before making any change there. I think it is considered good practice to export every symbol of an API. As I said, this driver is designed to be usable with any DMA API, and some, like S3C, The check is there just for some robustness. yes I was already thinking on similar lines. I'll merge them ...
Hi, If there's no comments. it's time to select to merge it. I think it can't which is better. so I hope community select any one Only consideration which is proper for linux and future usage To Linux Walleij. I hope it will merge your patch series together. Thank you, Kyungmin Park --
As i said, I am working full time on it. PL330 core and S3C DMA API driver is done and it'll be a day or two before I am done testing them. So, around this wednesday I plan to submit patches for PL330 core driver and the S3C-DMA-API driver for S5P-6442, C1XX and V210 Once the PL330 core is accepted, we can freeze it's API and start work on DMA engine API driver for it. --
On Thu, 25 Mar 2010 12:17:15 +0900 desc_pool_virt is a virtual address (from dma_alloc_coherent). In such case, write[bwl] seems to be the wrong interface. I suggest the following code: M. -- I'm the slime oozin' out from your TV set... --
Right. The write[bwl] is api for address ioremapped of io device. I will PL330 DMA controller fetches variable length instructions that consist of --
On Thu, 25 Mar 2010 18:01:00 +0900, Joonyoung Shim
I'm not too concerned about the device side of things. I'm more worried
about the CPU access when writing the 'imm' value to memory.
Consider desc_pool_virt 16bit aligned when entering the function. Writing
the opcode makes it unaligned and then writing the 'imm' value will result
as an unaligned access.
M.
--
Who you jivin' with that Cosmik Debris?
--
Why desc_pool_virt should be aligned more than 16bit? --
On Thu, 25 Mar 2010 19:05:47 +0900, Joonyoung Shim
There is reason for desc_pool_virt to be 16bit aligned. It's just that you
have 50% chance that it will.
In such case, you will write 'imm' to a non 16bit-aligned address. In my
book, that's bad.
Same for pl330_dmamov(), which tries to write a 32bit value without
checking the proper alignment.
In such case, please use the put_unaligned macro to handle the possible
unaligned access.
M.
--
Who you jivin' with that Cosmik Debris?
--
OK. i will use put_unaligned. Thanks. --
Looks good to me now so: Acked-by: Linus Walleij <linus.walleij@stericsson.com> Linus Walleij --
On Thu, Mar 25, 2010 at 12:17 PM, Joonyoung Shim Here goes some quick technical feedback of the code. Please remember that these issues are only secondary. The primary drawback is the approach that this patch adopts, as already explained in other posts. As already mentioned by Marc, it doesn't have to be read/write. PL330 specifies the microcode buffers to be on system memory and that need not be treated like ioports. These allocations are inefficient and don't need to be done so often. My implementation allocates a pool of such buffers(size specified by DMA API driver) and manage them by simple pointer manipulation. This limit, though not serious, is unconditionally imposed by your design. There are ways to get around this situation by smarter generation of Again, the DMA API drivers will suffer just because someone didn't care to generate microcode efficiently. The microcode template for xfer takes only about 50 bytes and despite having PL330_POOL_SIZE buffer, you have to drop xfer requests just because the template is not properly designed. My implementation is limited only by the microcode buffer size, which in turn This instruction generation leaves no scope for Security permissions for xfers, that is a feature of PL330. I think the DMA API already prohibits doing non-irq-context things(like enqueue) in the callbacks, so why implement tasklets here? This may still get you "audio working fine" with Samsung I2S controller, but is likely to cause problems with more demanding peripherals like SPDIF if they operate at best QOS(even 24bits/sample Stereo at 96000Hz) and has shallow FIFO(8 samples deep and hence 84 usecs acceptable latency). Remember that SPDIF usually goes with other system load like HDMI HD playaback which only increases the interrupt latency. Not to forget, the overall throughput hit taken by other dma clients, like MMC over SPI that use 256/512 bytes DMA xfers, due to delayed DMA-DONE notifications. Also, using tasklet here may break any ...
-- -- Ben Q: What's a light-year? A: One-third less calories than a regular year. --
I meant during 'probe' of the DMAC a chunk of dma consistent memory is allocated for MicroCode for each channel. We use the same chunk during xfers, since we can generate MC in a way that 256 bytes are enough to do xfer of 2.5MB at burst size of 1 byte and for bulkier requests the DMA API driver can either break the bigger request or allocate bigger chunk for the channels. --
