Re: [PATCH 1/3] drivers: misc: add omap_hwspinlock driver

Previous thread: linux-next: build failure after merge of the final tree (net tree related) by Stephen Rothwell on Monday, October 18, 2010 - 12:36 am. (2 messages)

Next thread: linux-next: Tree for October 18 by Stephen Rothwell on Monday, October 18, 2010 - 12:51 am. (1 message)
From: Ohad Ben-Cohen
Date: Monday, October 18, 2010 - 12:44 am

OMAP4 introduces a Spinlock hardware module, which provides hardware
assistance for synchronization and mutual exclusion between heterogeneous
processors and those not operating under a single, shared operating system
(e.g. OMAP4 has dual Cortex-A9, dual Cortex-M3 and a C64x+ DSP).

The intention of this hardware module is to allow remote processors,
that have no alternative mechanism to accomplish synchronization and mutual
exclusion operations, to share resources (such as memory and/or any other
hardware resource).

This patchset adds a new misc driver for this OMAP hwspinlock module.

Currently there are two users for this driver:

1. Inter-Processor Communications: on OMAP4, cpu-intensive multimedia
tasks are offloaded by the host to the remote M3 and/or C64x+ slave
processors.

To achieve fast message-based communications, a minimal kernel support
is needed to deliver messages arriving from a remote processor to the
appropriate user process.

This communication is based on simple data structures that are shared between
the remote processors, and access to it is synchronized using the hwspinlock
module (to allow the remote processor to directly place new messages in this
shared data structure).

This OMAP IPC system, called Syslink, is being actively developed by TI and
will be gradually submitted upstream.

2. OMAP I2C driver: On some OMAP4 boards, the I2C bus is shared between
the A9 and the M3, and the hwspinlock is used to synchronize access to it.


Patches are against linux-omap-2.6 master, which is 2.6.36-rc8 plus
2.6.37 omap material (needed for the omap specific patches in this set).

Tested on OMAP4 blaze.

Contributions
=============

Previous versions of this driver circulated in linux-omap several times,
and received substantial attention and contribution from many developers
(see [1][2][3][4][5][6]):

Simon Que did the initial implementation and pushed several iterations
Benoit Cousson provided extensive review, help, improvements and hwmod ...
From: Ohad Ben-Cohen
Date: Monday, October 18, 2010 - 12:44 am

From: Simon Que <sque@ti.com>

Add driver for OMAP's Hardware Spinlock module.

The OMAP Hardware Spinlock module, initially introduced in OMAP4,
provides hardware assistance for synchronization between the
multiple processors in the device (Cortex-A9, Cortex-M3 and
C64x+ DSP).

Signed-off-by: Simon Que <sque@ti.com>
Signed-off-by: Hari Kanigeri <h-kanigeri2@ti.com>
Signed-off-by: Krishnamoorthy, Balaji T <balajitk@ti.com>
[ohad@wizery.com: disable interrupts/preemption to prevent hw abuse]
[ohad@wizery.com: add memory barriers to prevent memory reordering issues]
[ohad@wizery.com: relax omap interconnect between subsequent lock attempts]
[ohad@wizery.com: timeout param to use jiffies instead of number of attempts]
[ohad@wizery.com: remove code duplication in lock, trylock, lock_timeout]
[ohad@wizery.com: runtime pm usage count to reflect num of requested locks]
[ohad@wizery.com: move to drivers/misc, general cleanups, document]
Signed-off-by: Ohad Ben-Cohen <ohad@wizery.com>
Cc: Benoit Cousson <b-cousson@ti.com>
Cc: Tony Lindgren <tony@atomide.com>
---
 Documentation/misc-devices/omap_hwspinlock.txt |  199 +++++++++
 drivers/misc/Kconfig                           |   10 +
 drivers/misc/Makefile                          |    1 +
 drivers/misc/omap_hwspinlock.c                 |  555 ++++++++++++++++++++++++
 include/linux/omap_hwspinlock.h                |  108 +++++
 5 files changed, 873 insertions(+), 0 deletions(-)
 create mode 100644 Documentation/misc-devices/omap_hwspinlock.txt
 create mode 100644 drivers/misc/omap_hwspinlock.c
 create mode 100644 include/linux/omap_hwspinlock.h

diff --git a/Documentation/misc-devices/omap_hwspinlock.txt b/Documentation/misc-devices/omap_hwspinlock.txt
new file mode 100644
index 0000000..b093347
--- /dev/null
+++ b/Documentation/misc-devices/omap_hwspinlock.txt
@@ -0,0 +1,199 @@
+OMAP Hardware Spinlocks
+
+1. Introduction
+
+Hardware spinlock modules provide hardware assistance for synchronization
+and mutual exclusion ...
From: Greg KH
Date: Tuesday, October 19, 2010 - 8:46 am

One note, do you really want to fail if this option isn't built into the
kernel, yet you have a driver that is asking for it?  Shouldn't you
instead just silently succeed, and let the code path get compiled away?

We did that for debugfs, after learning the pain that procfs had with
its api for "is not built".  Doing it the way you are requires the user
to always test for -ENOSYS, when in reality, if that is returned,
there's nothing the driver can do about it, so it should just not worry
about it.

Just something to think about.

thanks,

greg k-h
--

From: Ohad Ben-Cohen
Date: Tuesday, October 19, 2010 - 1:18 pm

Completely agree; if hwspinlock support is not needed, we better let
its users run uninterruptedly. I'll change it.

Thanks,
--

From: Kevin Hilman
Date: Tuesday, October 19, 2010 - 9:58 am

3. Ensures that in_atomic/might_sleep checks catch potential problems
   with hwspinlock usage (e.g. scheduler checks like 'scheduling while



[...]

Kevin
--

From: Ohad Ben-Cohen
Date: Tuesday, October 19, 2010 - 1:21 pm

On Tue, Oct 19, 2010 at 6:58 PM, Kevin Hilman


Thanks.
--

From: Grant Likely
Date: Tuesday, October 19, 2010 - 10:01 am

I strongly recommend reconsidering the ERR_PTR() pattern in new driver
code.  It is impossible to tell from looking at the prototype of a
function if it returns an ERR_PTR() value, or a NULL on failure.  I
pretty much guarantee that new users of this code will miss the
subtlety and introduce new bugs by assuming that the return value can
be tested with "if (!hwlock)".

ERR_PTR() is only appropriate when the caller actually cares about the
failure code and has different behaviour depending on the result.  For
example, filesystem code that needs to return to userspace a specific
error code.  Very seldom does driver code like users of this API

Disabling irqs *might* be a concern as a source of RT latency.  It
might be better to make the caller responsible for managing local spin
locks and irq disable/enable.

OTOH, I also see that a spin lock is still needed internally to
protect the hwspinlock data structure.

g.
--

From: Ohad Ben-Cohen
Date: Tuesday, October 19, 2010 - 1:43 pm

Agree; the hwspinlock users can surely live without an explicit error
code from the _request APIs, and some extra robustness can only do
good.


This a coming from an hardware requirement, rather than a choice the
user should take.

If a hwspinlock is taken over a long period of time, its other user
(with which we try to achieve synchronization) might be polling the
OMAP interconnect for too long (trying to take the hwspinlock) and
thus preventing it to be used for other transactions.

To prevent such lengthy polling on the interconnect, the hwspinlock
should only be used for very short period of times, with preemption
and interrupts disabled.

That's why we don't give users the choice whether to disable
interrupts or not - it's simply not a decision they should take.

Thanks,
Ohad.
--

From: Arnd Bergmann
Date: Tuesday, October 19, 2010 - 1:58 pm

Interrupts disabled in general might go a bit too far -- they are also
short and infrequent events unless you have seriously broken drivers.

When running with CONFIG_PREEMPT, I would guess you actually want to
turn the omap_hwspinlock into a sleeping operation, though that would
require much extra work to implement. Disabling preemption while the
hwspinlock is held is of course a correct implementation technically,

What about those cases where you already know that interrupts are
disabled, e.g. while holding a regular spin_lock_irq or inside
of an interrupt handler?

	Arnd
--

From: Ohad Ben-Cohen
Date: Tuesday, October 19, 2010 - 2:57 pm

The difference is hardware-specific: the hwspinlock device is located
on the OMAP's L4 interconnect where access is slow, wasteful of power,
and spinning on it may prevent other peripherals from interconnecting.
--

From: Kevin Hilman
Date: Tuesday, October 19, 2010 - 10:16 am

Should be tristate so it can also be built as a module by default.

e.g., when building multi-OMAP kernels, no reason or this to be
built-in.  It can then be loaded as a module on OMAP4 only.

Kevin
--

From: Ohad Ben-Cohen
Date: Wednesday, October 20, 2010 - 6:00 am

On Tue, Oct 19, 2010 at 7:16 PM, Kevin Hilman


But considering the current built-in use cases (i2c) we would then
need to have the relevant MACH_OMAP4_* select OMAP_HWSPINLOCK (only on
--

From: Kevin Hilman
Date: Wednesday, October 20, 2010 - 11:18 am

Yes, that is ok.  At least then it can be tested by default as a module.

Kevin
--

From: Arnd Bergmann
Date: Tuesday, October 19, 2010 - 10:21 am

This seems to encourage sloppy coding: The only variant you allow is the one
that corresponds to Linux's spin_lock_irqsave(), which is generally discouraged
in all places where you know if you need to disable interrupts or not.

IMHO the default should be a version that only allows locks that don't get
taken at IRQ time and consequently don't require saving the interrupt flags.

	Arnd
--

From: Ohad Ben-Cohen
Date: Tuesday, October 19, 2010 - 1:51 pm

Yes, this is a hardware requirement to minimize the period of time in
which the hwspinlock is taken, in order to prevent lengthy polling on

Please note that the hwspinlocks should never be used to achieve
synchronization with Linux contexts running on the same cpu - it's
always about achieving mutual exclusion with a remote processor.

So whether the lock is taken at IRQ time or not does not affect the
requirement to disable interrupts while it is taken (very differently
from local spin_lock{_irqsave} synchronizations).

Thanks,
--

From: Arnd Bergmann
Date: Tuesday, October 19, 2010 - 2:08 pm

Right. There are two more things to consider though:

If you know that interrupts are disabled, you may still want to avoid
having to save the interrupt flag to the stack, to save some cycles
saving and restoring it. I don't know how expensive that is on ARM,
some other architectures take an microseconds to restore the interrupt
enabled flag from a register.

Even in the case where you know that interrupts are enabled, you may
want to avoid saving the interrupt flag to the stack, the simpler
API (one argument instead of two) gives less room for screwing up.

	Arnd
--

From: Ohad Ben-Cohen
Date: Wednesday, October 20, 2010 - 3:43 pm

To do this we need to introduce a new set of API which will not
disable interrupts, and which should only be used when the caller
knows that interrupts are already disabled.

This will save some cycles, but my concern is that this API will be
abused by drivers, and will eventually be used to take the hwspinlocks
for lengthy period of times (which might even involve sleeping).

Given that the access to the L4 interconnect is anyway slow I also
feel that the gain is negligible.

Nevertheless, it's a viable way to squeeze some cycles.

We don't have a use case in which we know the interrupts are already

This sounds like adding a set of API that resembles spin_{unlock,lock}_irq.

My gut feeling here is that while this may be useful and simple at
certain places, it is somewhat error prone; a driver which would
erroneously use this at the wrong place will end up enabling
interrupts where it really shouldn't.

Don't you feel that proving a simple spin_lock_irqsave-like API is
actually safer and less error prone ?

I guess that is one of the reasons why spin_lock_irqsave is much more
popular than spin_lock_irq - people just know it will never screw up.

Thanks,
Ohad.
--

From: Arnd Bergmann
Date: Thursday, October 21, 2010 - 2:04 am

People can screw that up in different ways, e.g.

	spin_lock_irqsave(&lock_a, flags);
	spin_lock_irqsave(&lock_b, flags); /* overwrites flags */

	spin_lock_irqrestore(&lock_b, flags);
	spin_lock_irqrestore(&lock_a, flags); /* still disabled! */

I use the presence of spin_lock_irqsave in driver code as an indication
of whether the author had any clue about locking. Most experienced
coders use the right version intuitively, while beginners often just
use _irqsave because they didn't understand the API and they were told
that using this is safe.

	Arnd
--

From: Ohad Ben-Cohen
Date: Thursday, October 21, 2010 - 3:13 am

Agree.

I'll add the _irq pair of APIs, this will make the users' code
cleaner. This is valuable especially since all of our current users
have their interrupts enabled anyway.

The change is also pretty trivial:

* move the internal locking implementation to raw_ methods
* the raw_ methods would save the current interrupt state only if
given a placeholder
* wrap those raw_ methods with the desired API (but only support the
_irq and _irqsave flavors)

Thanks,
Ohad.
--

From: Arnd Bergmann
Date: Thursday, October 21, 2010 - 5:02 am

Sounds good!

	Arnd
--

From: Tony Lindgren
Date: Friday, October 22, 2010 - 10:00 am

Please let's not add yet another omap specific layer that will
make it incrementally harder to have generic drivers.

Instead, we can do the omap specific locking in the
platform code and then the drivers can use the functions
passed in the platform_data if they're implemented.

Regards,

Tony
--

From: Ohad Ben-Cohen
Date: Monday, October 18, 2010 - 12:44 am

From: Benoit Cousson <b-cousson@ti.com>

Add hwspinlock hwmod data for OMAP4 chip

Signed-off-by: Cousson, Benoit <b-cousson@ti.com>
Signed-off-by: Hari Kanigeri <h-kanigeri2@ti.com>
Signed-off-by: Ohad Ben-Cohen <ohad@wizery.com>
Cc: Paul Walmsley <paul@pwsan.com>
---
 arch/arm/mach-omap2/omap_hwmod_44xx_data.c |   63 ++++++++++++++++++++++++++++
 1 files changed, 63 insertions(+), 0 deletions(-)

diff --git a/arch/arm/mach-omap2/omap_hwmod_44xx_data.c b/arch/arm/mach-omap2/omap_hwmod_44xx_data.c
index 0d5c6eb..07c3654 100644
--- a/arch/arm/mach-omap2/omap_hwmod_44xx_data.c
+++ b/arch/arm/mach-omap2/omap_hwmod_44xx_data.c
@@ -1043,6 +1043,66 @@ static struct omap_hwmod omap44xx_uart4_hwmod = {
 	.omap_chip	= OMAP_CHIP_INIT(CHIP_IS_OMAP4430),
 };
 
+/*
+ * 'spinlock' class
+ * spinlock provides hardware assistance for synchronizing the processes
+ * running on multiple processors
+ */
+
+static struct omap_hwmod_class_sysconfig omap44xx_spinlock_sysc = {
+	.rev_offs	= 0x0000,
+	.sysc_offs	= 0x0010,
+	.syss_offs	= 0x0014,
+	.sysc_flags	= (SYSC_HAS_AUTOIDLE | SYSC_HAS_CLOCKACTIVITY |
+			   SYSC_HAS_ENAWAKEUP | SYSC_HAS_SIDLEMODE |
+			   SYSC_HAS_SOFTRESET | SYSS_HAS_RESET_STATUS),
+	.idlemodes	= (SIDLE_FORCE | SIDLE_NO | SIDLE_SMART),
+	.sysc_fields	= &omap_hwmod_sysc_type1,
+};
+
+static struct omap_hwmod_class omap44xx_spinlock_hwmod_class = {
+	.name = "spinlock",
+	.sysc = &omap44xx_spinlock_sysc,
+};
+
+/* spinlock */
+static struct omap_hwmod omap44xx_spinlock_hwmod;
+static struct omap_hwmod_addr_space omap44xx_spinlock_addrs[] = {
+	{
+		.pa_start	= 0x4a0f6000,
+		.pa_end		= 0x4a0f6fff,
+		.flags		= ADDR_TYPE_RT
+	},
+};
+
+/* l4_cfg -> spinlock */
+static struct omap_hwmod_ocp_if omap44xx_l4_cfg__spinlock = {
+	.master		= &omap44xx_l4_cfg_hwmod,
+	.slave		= &omap44xx_spinlock_hwmod,
+	.clk		= "l4_div_ck",
+	.addr		= omap44xx_spinlock_addrs,
+	.addr_cnt	= ARRAY_SIZE(omap44xx_spinlock_addrs),
+	.user		= OCP_USER_MPU | ...
From: Ohad Ben-Cohen
Date: Monday, October 18, 2010 - 12:44 am

From: Simon Que <sque@ti.com>

Build and register an hwspinlock platform device.

Although only OMAP4 supports the hardware spinlock module (for now),
it is still safe to run this initcall on all omaps, because hwmod lookup
will simply fail on hwspinlock-less platforms.

Signed-off-by: Simon Que <sque@ti.com>
Signed-off-by: Hari Kanigeri <h-kanigeri2@ti.com>
Signed-off-by: Ohad Ben-Cohen <ohad@wizery.com>
Cc: Benoit Cousson <b-cousson@ti.com>
---
 arch/arm/mach-omap2/Makefile     |    1 +
 arch/arm/mach-omap2/hwspinlock.c |   67 ++++++++++++++++++++++++++++++++++++++
 2 files changed, 68 insertions(+), 0 deletions(-)
 create mode 100644 arch/arm/mach-omap2/hwspinlock.c

diff --git a/arch/arm/mach-omap2/Makefile b/arch/arm/mach-omap2/Makefile
index 7352412..e55d1c5 100644
--- a/arch/arm/mach-omap2/Makefile
+++ b/arch/arm/mach-omap2/Makefile
@@ -190,3 +190,4 @@ obj-y					+= $(smc91x-m) $(smc91x-y)
 
 smsc911x-$(CONFIG_SMSC911X)		:= gpmc-smsc911x.o
 obj-y					+= $(smsc911x-m) $(smsc911x-y)
+obj-$(CONFIG_ARCH_OMAP4)		+= hwspinlock.o
diff --git a/arch/arm/mach-omap2/hwspinlock.c b/arch/arm/mach-omap2/hwspinlock.c
new file mode 100644
index 0000000..641a6d4
--- /dev/null
+++ b/arch/arm/mach-omap2/hwspinlock.c
@@ -0,0 +1,67 @@
+/*
+ * OMAP hardware spinlock device initialization
+ *
+ * Copyright (C) 2010 Texas Instruments. All rights reserved.
+ *
+ * Contact: Simon Que <sque@ti.com>
+ *          Hari Kanigeri <h-kanigeri2@ti.com>
+ *
+ * This program is free software; you can redistribute it and/or
+ * modify it under the terms of the GNU General Public License
+ * version 2 as published by the Free Software Foundation.
+ *
+ * This program is distributed in the hope that it will be useful, but
+ * WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
+ * General Public License for more details.
+ *
+ * You should have received a copy of the GNU General Public License
+ * along with this ...
From: Grant Likely
Date: Tuesday, October 19, 2010 - 10:05 am

On Tue, Oct 19, 2010 at 11:03 AM, Kevin Hilman

On that note, is there any reason why this file cannot be selected as a module?

g.

-- 
Grant Likely, B.Sc., P.Eng.
Secret Lab Technologies Ltd.
--

From: Kevin Hilman
Date: Tuesday, October 19, 2010 - 10:03 am

Any reason this needs to be a postcore_initcall?  Are there users of
hwspinlocks this early in boot?  Probaly subsys or even device_initcall
is more appropriate here.

I would've suspected that any users of hwspinlocks will be dependent on
drivers for the other cores (e.g. syslink) which would likely be
initialized much later.

Kevin
--

From: Ohad Ben-Cohen
Date: Tuesday, October 19, 2010 - 2:02 pm

On Tue, Oct 19, 2010 at 7:03 PM, Kevin Hilman

i2c-omap, which is subsys_initcall (the I2C bus is shared between the
A9 and the M3 on some OMAP4 boards).

And to allow early board code to reserve specific hwspinlock numbers
--

From: Grant Likely
Date: Tuesday, October 19, 2010 - 4:12 pm

Man. this is getting ugly.  I think we need to discuss how to solve
this at the Plumbers micro-conference. It kind of fits in with the
whole embedded (ab)use of the device model topic anyway. Actually,
this particular case isn't bad, but the moving of i2c and spi busses
to an earlier initcall is just band-aiding the real problem of driver
probe order dependencies.

g.
--

From: Ohad Ben-Cohen
Date: Wednesday, October 20, 2010 - 7:09 am

+1

On Wed, Oct 20, 2010 at 1:53 AM, Kevin Hilman

+1

This whole thing is a mess, and today it's being solved in the wrong,
non-scalable and error-prone way.

The question is whether we want to gate hwspinlock until this issue is solved ?


What about doing something similar to the way suspend/resume and the
device hierarchy interact ?

device_resume waits for its parent to be resumed before waking up the
device - this sounds similar to what ->probe() should do: wait for its
device dependency to probe first (so in this case, i2c-omap should
wait for hwspinlock).

Conversely, device_suspend waits for all its children to be suspended
before continuing, which sounds just like what ->remove() should do
(so again, in this case, the hwspinlock device should wait for all its
users to be removed before bailing).

This is just a quick thought, I haven't even began to think of all the
use cases and requirements.
--

From: Grant Likely
Date: Wednesday, October 20, 2010 - 8:51 am

-1. Like Ryan points out below, the problem isn't modules vs.
built-in, it is drivers depending on implicit init order which sort of
works, but is fragile and will break in subtle ways. It needs to work
in both cases, and the only solution to fix the dependency issue.

I completely agree that more drivers need to support modules, but
that is an orthogonal issue.

I suspect that in most cases the driver model already provides the
needed functionality via the parent->child relationships, and a lot of
the problems can be removed by getting the driver of the parent device
to register the children.

It gets more complex (like in this case) when a single device needs
multiple devices to be initialized before it is probed. I can think of
a couple of ways to solve this. One option is for such drivers to
explicitly look for its dependant devices and defer .probe() work
until they appear. I hacked up some code for this a while back, but I
never pursued it through to completion[1].

Another option is to defer even registering the dependent devices
until the prerequisites are all probed.  This could potentially be
done in board support code by using a bus notifier to look for the
critical device.


No, don't gate hwspinlock.  I don't see any reason to defer it for


This works for the simple hierarchy case, but it doesn't help the
multi-dependency case.

g.
--

From: Kevin Hilman
Date: Tuesday, October 19, 2010 - 4:53 pm

Rather than moving towards having more drivers have to be built in (and
depend on their probe order) we need to be moving towards building all

There's no reason for board code to have to do this at initcall time.

This kind of thing needs to be done by platform_data function pointers,
as is done for every other driver that needs platform-specific driver
customization.

Kevin
--

From: Ryan Mallon
Date: Tuesday, October 19, 2010 - 6:20 pm

The issue of probe order still needs to be resolved for those of us who
do want all the drivers built into the kernel. Everything comes out in
the wash already if everything is built as modules by installing the
modules in the correct order right?

~Ryan

-- 
Bluewater Systems Ltd - ARM Technology Solution Centre

Ryan Mallon         		5 Amuri Park, 404 Barbadoes St
ryan@bluewatersys.com         	PO Box 13 889, Christchurch 8013
http://www.bluewatersys.com	New Zealand
Phone: +64 3 3779127		Freecall: Australia 1800 148 751
Fax:   +64 3 3779135			  USA 1800 261 2934
--

From: Ohad Ben-Cohen
Date: Wednesday, October 20, 2010 - 7:38 am

On Wed, Oct 20, 2010 at 1:53 AM, Kevin Hilman

If we want to have allow both allocations of predefined hwspinlocks
with omap_hwspinlock_request_specific(int), and dynamic allocations
(where we don't care about the specific instance of the hwspinlock we
will get) with omap_hwspinlock_request(), we must ensure that the
former _specific() API will never be called after the latter.

If we will allow drivers to call omap_hwspinlock_request() before all
callers of omap_hwspinlock_request_specific() completed, then things
will break (because drivers might start getting hwspinlocks that are
predefined for dedicated use cases on the system).

So if we want the _specific API to work, we can only allow early board
code to use it in order to reserve those predefined hwspinlocks before
drivers get the chance to call omap_hwspinlock_request().

The tempting alternative is not to provide the
omap_hwspinlock_request_specific() API at all (which is something we
discussed internally).

Let's take the i2c-omap for example.

It sounds like it must have a predefined hwspinlock, but what if:

1. It will use omap_hwspinlock_request() to dynamically allocate a hwspinlock
2. Obviously, the hwspinlock id number must be communicated to the M3
BIOS, so the i2c-omap will publish that id using a small shared memory
entry that will be allocated for this sole purpose
3. we will make sure that 1+2 completes before the remote processor is
taken out of reset

This does not require any smart IPC and it will allow us to get rid of
the omap_hwspinlock_request_specific() API and its early-callers
requirement.

All we will be left to take care of is the order of the ->probe()
execution (assuming we want both the i2c and the hwspinlock drivers to

Why would we need platform-specific function pointers here ? I'm not
sure I'm following this one.

Thanks,
--

From: Grant Likely
Date: Wednesday, October 20, 2010 - 8:55 am

Having fought with this kind of thing before, I would strongly
recommend making the interface either all-dynamic, or all-predefined,
--

From: Kevin Hilman
Date: Wednesday, October 20, 2010 - 11:37 am

I understand the dependency between i2c and hwspinlock for some
platforms with a shared i2c bus.  Besides that being a broken hardware
design, I can see the need for the i2c driver to take a hwspinlock for
i2c xfers, but why does the i2c driver need to take the hwspinlock at
probe time?  Presumably, this is before the remote cores are executing

So that board code (built-in) does not call the hwspinlock driver
(potentially a module.)

The way to solve this is to have platform_data with function pointers,
so that when the driver's ->probe() is done, it can call cany custom
hooks registered by the board code.

Kevin

--

From: Ohad Ben-Cohen
Date: Wednesday, October 20, 2010 - 12:21 pm

On Wed, Oct 20, 2010 at 8:37 PM, Kevin Hilman

Balaji, Nishant, are you OK with this ?

It means that the I2C driver will dynamically get a hwspinlock and
then it would need to announce the id of that hwspinlock before the M3
is taken out of reset.

It will help us get rid of static allocations that will never scale

It doesn't; it just needs to reserve a hwspinlock, and for that, we

Sorry, still not following.

Do you refer to the i2c driver calling the hwspinlcok_request function
at probe time via platform_data function pointers ?

With the previous _request_specific() allocation API, the important
issue to follow was the timing:  it had to be called before any driver
gets the chance to call the dynamic _request() API.
That's why we had to have early board code calling it. Obviously the
hwspinlock instance would then had to be delivered to the driver via
platform data.

But now, if we get rid of that static allocation method entirely,
i2c_omap can just call hwspinlock_request() on probe(), but again, no
pdata function pointers are involved because this API is
EXPORT_SYMBOL'ed.
--

From: Kevin Hilman
Date: Wednesday, October 20, 2010 - 4:58 pm

No, earlier in this discussion, in response to my question about users

My suggestion to use platform_data + function pointers was to address
the requesting of hwspinlocks in board/platform-specific code.

If the _request_specific() API is removed, and board code no longer
needs to call hwspinlock API, then this issue is moot.  However, if
board code ever needs to call the hwspinlock API, then pdata func
pointers are needed to handle both the case of late driver probe

Agreed.  If you get rid of the _request_specific() API, then this is not
needed in the board code and pdata function pointers should not be
needed.

So then, we're back to how to ensure probe order of hwspinlock vs
i2c. :/

I agree with others that this is a much broader problem, and should not
hold up the hwspinlock driver, so for now, making hwspinlock have an
initcall before subsys_initcall is OK with me.  Probably arch_initcall()
is fine here.

I suggest you add a comment in the code at the initcall point as to why
it is using arch_initcall(), namely it has to load before i2c because...

Kevin



--

From: Ohad Ben-Cohen
Date: Wednesday, October 20, 2010 - 11:11 pm

[removed that bouncing email address we had all along :]

On Thu, Oct 21, 2010 at 1:58 AM, Kevin Hilman


How would pdata func pointers help here ?

As you say, pdata func pointers are used in scenarios where
platform-specific code needs to be executed from a driver.

But this is not the case here; here we have driver code ( the
_request_specific() API) that must be called very early in the boot,
before any other driver get the chance to call the dynamic _request()
API.

And to be able to call _request_specific() API that early in the boot,
we must have the hwspinlock driver in place (loaded and already
probe()'ed). So we had to give both the hwspinlock device creation
code and its driver a very early initcall state.

Of course, as we all agree, if we get rid of the _request_specific()
API, hwspinlock will only race with its users (namely i2c_omap), and

Sure.

Thanks,
Ohad.
--

From: Kamoolkar, Mugdha
Date: Thursday, October 21, 2010 - 1:36 am

The problem with this approach is that the i2c driver would have to 
sync up on the shared memory location that it uses to share the 
information of the spinlock ID. What location would this be? How would 
the i2c driver on the m3 know about this location? Does this mean 
hard-coding of the shared memory address? If yes, then there would be 
an impact to users if they wanted to change their memory map. Any kind 
of hard-coding of this sort can be painful in such cases, especially 
if it happens in multiple drivers. On the other hand, hard-coding 
(reserving) spinlock IDs is a much safer option and does not impact 

Regards,
Mugdha
--

From: Ohad Ben-Cohen
Date: Thursday, October 21, 2010 - 2:06 am

I agree.

But we seem to have this sort of problem anyway:

Since we are forbidden to take a hwspinlock over a lengthy period of
time, i2c-omap can't really use it directly to achieve mutual
exclusion over the entire i2c transfer (which is long and involved
sleeping).

It was suggested that i2c-omap would only use the hwspinlock to
protect a shared memory entry which would indicate the owner of the
i2c bus. Either that, or use something like Peterson's mutual
exclusion algorithm which is entirely implemented in software.

Both of the latter approaches involves sync'ing up on a shared memory
location..  so it seems like i2c-omap anyway has to deal with this
kind of pain, and having a predefined hwspinlock id number doesn't
relieve it.

What do you think ?

Thanks,
Ohad.
--

From: Kamoolkar, Mugdha
Date: Friday, October 22, 2010 - 2:59 am

That is true. Perhaps we should consider adding a software 
implementation over HW spinlocks. We were anyway considering doing this, 
because the number of hw spinlocks available for our usage are not 
sufficient when we look at multi-channel use-cases. So adding a software 
module that uses a hardware spinlock for protection of its shared memory 
could be written, and then i2c could use that software module. In that 
case, if this is the only reason why i2c driver needs shared memory, 
--

From: Ohad Ben-Cohen
Date: Friday, October 22, 2010 - 4:16 am

Yes, this is probably inevitable.

It would also be useful for the user space implementation of TI's
RingIO and FrameQueue protocols coming in Syslink 3.0, which will also
not be able to directly use the hwspinlock because of the same
reasons.

This layer would maintain the owner of each (virtual) lock in a
non-cacheable shared memory entry, together with the id of the
hwspinlock that would protect that 'owner' entry.

This way we no longer need to use predefined hwspinlocks (the id of
the hwspinlock is announced in the shared memory entry). So we can
ditch the _request_specific() API, and we can move to arch_initcall

Sure, this layer can provide any number of locks over a single
hwspinlock (with the obvious price of hwspinlock contention).
--

From: Kanigeri, Hari
Date: Thursday, October 21, 2010 - 5:26 am

We can avoid the hard-coding of shared memory location if I2C Driver maps using iommu some dynamic memory for shared memory location to M3 virtual address and shares this information to I2c driver counter-part on M3 using the IPC call. But this might not be trivial and this would be against what Ohad mentioned about not requiring any smart IPC :).
I prefer hard-coding of spinlock id to keep things simple.

Thank you,
Best regards,
Hari
--

From: Kamoolkar, Mugdha
Date: Friday, October 22, 2010 - 3:14 am

I also considered this, and had the same issue with it as what you 
mentioned above, that the virtual address for the i2c driver on m3 needs 
to be known to the Linux driver. This could potentially be taken in 
through kconfig, because there's no harm in fixing virtual maps for the 
slaves. The problem is in fixing physical memory regions. Note that this 
assumes presence of a slave MMU. In case slave MMU is not present, the 
physical address needs to be taken in through kconfig, which makes 
rearranging the memory map a virtual nightmare if multiple drivers start 
doing such things.

We do need to consider the possibility that in future there might be 
even more multi-core drivers which not only need hw spinlocks, but also 
need notifications and some shared memory to do their work effectively. As shared peripherals increase, this is a very likely possibility.
So I do believe that any effort we put on fixing this the right way 
needs to take into account all three (hw spinlocks, ipc interrupts and 
shared memory). That's of course, out of the scope of this hw spinlock 
--

From: Tony Lindgren
Date: Friday, October 22, 2010 - 9:56 am

Guys, let's try to come up with a generic interface for this instead of
polluting the device drivers with more omap specific interfaces.

We really want to keep the drivers generic and platform independent.

Sure we still have some omap specific stuff in the drivers, like
cpu_is_omapxxxx, and omap specific dma calls, but those will be going
away.

Unless somebody has better ideas, I suggest we pass a lock function
in the platform_data to the drivers that need it, and do the omap
specific nasty stuff in the platform code.

Regards,

Tony
--

From: Grant Likely
Date: Friday, October 22, 2010 - 10:03 am

For those of you going to plumbers, I'll put this on the embedded
microconference agenda when we're talking about common infrastructure.

g.

--

From: Tony Lindgren
Date: Friday, October 22, 2010 - 10:28 am

Great, thanks.

Tony
--

From: Ohad Ben-Cohen
Date: Sunday, October 24, 2010 - 10:54 am

Hi Tony,


I share this concern as well.

We basically have three options here:

1. Have the hwspinlock driver inside the omap folders and use pdata func ptrs
2. Have a generic hwspinlock framework, with a small omap-specific
implementation
3. Have a misc driver which is currently omap specific, but can very
easily be converted to a common framework

I don't like (1) so much; it's a driver that has very little that is
hardware specific (mainly just the two lines that actually access the
hardware - the lock and the unlock), and it's growing. E.g., we will
need to add it a user space interface (to allow userland IPC
implementations), we will need to add it a "virtual" locks layer that
would provide more locks than the hardware currently has, etc..

In addition, there seem to be a general discontent about drivers
piling up in the ARM folders, instead of having them visible in
drivers/ so they can become useful for other platforms as well.

Here's something Linus wrote about this awhile back:

http://www.spinics.net/lists/linux-arm-msm/msg00324.html

So initially I opted for (2). I have an implementation ready - it's a
common drivers/hwspinlock/core.c framework with a small omap-specific
part at drivers/hwspinlock/omap_hwspinlock.c. The core has all the
logic while the latter, omap-specific implementation, is really small
- it is basically just registering lock instances, that have a
trylock/unlock/relax ops struct, with the common framework.

But lack of additional users (besides the OMAP hardware), and lack of
generic drivers that would use it (we have syslink, which currently
tend to only need this from user space, i2c-omap and omap's upcoming
resource manager), made it look like an overkill for now.

So simplicity won, and (3) was eventually submitted. It exposes
exactly the same interface like (2) would (s/omap_hwspin/hwspin/), and
it's relatively easy to turn it back into a common framework in case
additional users show up.

But if you feel that (2) is ...
From: Tony Lindgren
Date: Monday, October 25, 2010 - 12:02 pm

Yes (2) please. I would assume there will be more use of this. And then
we (or probably me again!) don't have to deal with cleaning up the drivers

Yes variation of (2) where you only pass the locking function via
platform data would be best.

Regards,

Tony
--

From: Ohad Ben-Cohen
Date: Tuesday, October 26, 2010 - 4:54 am

It feels a bit funky to me because we would still have code that is
omap-specific inside the "common" probe()/remove() calls.

I suggest to move everything that is omap-specific to a small omap
module that, once probed, would register itself with the common
hwspinlock framework (after initializing its hardware).

That small platfom-specific module probably doesn't have to sit in the
arch/ folder; we can follow established conventions like
mmc/i2c/gpio/spi/etc..

With that in hand, the hwspinlock would really be hardware-agnostic,
and then applying s/omap_hwspin/hwspin/ would be justified.

Does this sound reasonable to you ?

Thanks,
Ohad.
--

From: Tony Lindgren
Date: Tuesday, October 26, 2010 - 12:06 pm

Sounds good to me.

Tony
--

From: Peter Zijlstra
Date: Monday, October 18, 2010 - 5:46 am

I just have to ask... was it really easier to build silicon than to
agree on a spinlock ABI?
--

From: Russell King - ARM Linux
Date: Monday, October 18, 2010 - 6:35 am

I'm not really sure what point you're trying to make, but if you're
suggesting that Linux's spinlock should be exposed to these other
processors, you're completely off your rocker.

Doing so would set the kernels spinlock API in stone, which is really
something you don't want to do.  Not only that, but it would mean that
software written for the M3 and DSP would have to know about the GPL'd
spinlock layout, and I suspect that would cause major licencing headaches.

In any case, Linux's spinlock API (or more accurately, the ARM exclusive
access instructions) relies upon hardware coherency support (a piece of
hardware called an exclusive monitor) which isn't present on the M3 nor
DSP processors.  So there's no way to ensure that updates from the M3
and DSP are atomic wrt the A9 updates.
--

From: Peter Zijlstra
Date: Monday, October 18, 2010 - 6:43 am

Of course not, that would indeed be utterly silly, nor would it serve
any purpose, the Linux kernel spinlocks are internal spinlocks and need
not interact with anything out side of it.

But for the purpose of communicating with a heterogeneous CPU/DSP it
would make perfect sense to specify a spinlock ABI. Creating specific

Right, so the problem is that there simply is no way to do atomic memory
access from these auxiliary processing units wrt the main CPU? Seeing as
they operate on the same memory space, wouldn't it make sense to have
them cache-coherent and thus provide atomicy guarantees through that?

But that's water under the bridge, and your last paragraph does indeed
answer my question.
--

From: Ohad Ben-Cohen
Date: Monday, October 18, 2010 - 7:28 am

Yes. There are a few relevant system-wide limitations, one of them is
that simply the system interconnect does not support those fancy
atomic operations.
--

From: Peter Zijlstra
Date: Monday, October 18, 2010 - 7:33 am

Does it support full memory coherency though? Otherwise I can see memory
based message passing becoming rather interesting.

Without coherency everybody needs to be damn sure to flush the relevant
bits before unlocking and such.


--

From: Ohad Ben-Cohen
Date: Monday, October 18, 2010 - 7:39 am

Yes, either that, or use non-cacheable memory.
--

From: Catalin Marinas
Date: Monday, October 18, 2010 - 8:27 am

With cache coherency you may get atomicity of writes or reads but
usually not atomic modifications.

-- 
Catalin
--

From: Peter Zijlstra
Date: Monday, October 18, 2010 - 8:32 am

Sure, but you can 'easily' extend your coherency protocols with support
for things like ll/sc (or larger transactions).

Have ll bring the cacheline into exclusive state and tag it, then
anything that demotes the cacheline will clear the tag and make sc fail.


--

From: Ohad Ben-Cohen
Date: Monday, October 18, 2010 - 8:35 am

In our case, there are no coherency protocols supported between the
A9, M3 and the DSP.
--

From: Peter Zijlstra
Date: Monday, October 18, 2010 - 8:48 am

Sure, I got that, I was simply commenting on Catalin's statement that
cache coherency doesn't (need to) bring atomic modifications.


--

From: Catalin Marinas
Date: Monday, October 18, 2010 - 8:51 am

For the ll/sc operations on ARM (exclusive load/store) there is a
per-CPU local exclusive monitor and a (virtual) global one. The global
one may either be a separate piece of hardware or emulated via cache
lines as you said. But if you need synchronisation with a CPU (or DSP)
like Cortex-M3 which doesn't have any built-in caches, you can only get
atomic operations on the main processor (A9) but not on the M3 (as you
can't have a cache line in exclusive state on the M3).

The M3 may have a local exclusive monitor (like the main CPU) but it
isn't cleared by memory accesses from the main CPU.

-- 
Catalin
--

From: Peter Zijlstra
Date: Monday, October 18, 2010 - 8:58 am

Right, and I take it that modifying the M3 to participate in the full

Sounds like asking for trouble if you ask me ;-)



--

From: Daniel Walker
Date: Tuesday, October 19, 2010 - 4:31 pm

Does this code interface with some hardware unit (other than the other
processors) to accomplish this locking ?

The reason I ask is because MSM has similar code, and from what I can
tell the MSM version has some structures in memory but that's all. It
just operates on the structures in memory.

It might be worth looking over the two implementation so we aren't both
remaking the wheel.

Daniel

-- 

Sent by a consultant of the Qualcomm Innovation Center, Inc.
The Qualcomm Innovation Center, Inc. is a member of the Code Aurora Forum.

--

From: Ohad Ben-Cohen
Date: Tuesday, October 19, 2010 - 11:13 pm

That's interesting.

We did have thoughts of making this a generic framework, in the hope
that it would be useful for other vendors too, but we didn't find

Indeed. Where is that MSM code ?

Thanks,
Ohad.
--

From: Ohad Ben-Cohen
Date: Wednesday, October 20, 2010 - 3:00 am

[resubmitting due to l-o being dropped from this discussion fork.
Thanks Russell for catching this]



 That's interesting.

 We did have thoughts of making this a generic framework, in the hope
 that it would be useful for other vendors too, but we didn't find

 Indeed. Where is that MSM code ?

 Thanks,
 Ohad.
--

From: Bryan Huntsman
Date: Wednesday, October 20, 2010 - 3:29 pm

We don't have an equivalent feature on MSM, nor any plans for one.  Daniel was thinking of an internal test feature that had been deprecated a while ago.

- Bryan

-- 
Sent by an employee of the Qualcomm Innovation Center, Inc.
The Qualcomm Innovation Center, Inc. is a member of the Code Aurora Forum.
--

From: Russell King - ARM Linux
Date: Wednesday, October 20, 2010 - 2:53 am

Ohad's message to which you replied had:

To: linux-omap@vger.kernel.org, linux-kernel@vger.kernel.org,
        linux-arm-kernel@lists.infradead.org
Cc: Ohad Ben-Cohen <ohad@wizery.com>, Hari Kanigeri <h-kanigeri2@ti.com>,
        Benoit Cousson <b-cousson@ti.com>, Tony Lindgren <tony@atomide.com>,
        Greg KH <greg@kroah.com>, Grant Likely <grant.likely@secretlab.ca>,
        akpm@linux-foundation.org, Suman Anna <s-anna@ti.com>

Yours has:

To: Ohad Ben-Cohen <ohad@wizery.com>
Cc: Hari Kanigeri <h-kanigeri2@ti.com>, Benoit Cousson <b-cousson@ti.com>,
        Tony Lindgren <tony@atomide.com>, Greg KH <greg@kroah.com>,
        linux-kernel@vger.kernel.org,
        Grant Likely <grant.likely@secretlab.ca>, mattw@codeaurora.org,
        akpm@linux-foundation.org, Suman Anna <s-anna@ti.com>,
        mattw@codeaurora.orgmattw, linux-arm-kernel@lists.infradead.org

which includes an invalid address "mattw@codeaurora.orgmattw".  Is there
a reason why you're excluding the linux-omap list from your message and
subsequent discussion?
--

From: Daniel Walker
Date: Wednesday, October 20, 2010 - 3:15 pm

I was trying to add Matt to the CC, but I guess I accidentally deleted
some CC's .. So it was purely an accident.

Daniel

-- 

Sent by a consultant of the Qualcomm Innovation Center, Inc.
The Qualcomm Innovation Center, Inc. is a member of the Code Aurora Forum.

--

Previous thread: