Re: [PATCH] lglock: make lg_lock_global() actually lock globally

Previous thread: Allocating memory aligned on 1M boundary? by david.hagood on Wednesday, August 25, 2010 - 11:35 am. (2 messages)

Next thread: [PATCH] gpio: Add generic driver for simple memory mapped controllers by Anton Vorontsov on Wednesday, August 25, 2010 - 12:42 pm. (3 messages)
From: Jonathan Corbet
Date: Wednesday, August 25, 2010 - 12:28 pm

lglock: make lg_lock_global() actually lock globally

lg_lock_global() currently only acquires spinlocks for online CPUs, but
it's meant to lock all possible CPUs.  At Nick's suggestion, change
for_each_online_cpu() to for_each_possible_cpu() to get the expected
behavior.

Cc: Al Viro <viro@ZenIV.linux.org.uk>
Acked-by: Nick Piggin <npiggin@kernel.dk>
Signed-off-by: Jonathan Corbet <corbet@lwn.net>
---
 include/linux/lglock.h |    4 ++--
 1 files changed, 2 insertions(+), 2 deletions(-)

diff --git a/include/linux/lglock.h b/include/linux/lglock.h
index b288cb7..f549056 100644
--- a/include/linux/lglock.h
+++ b/include/linux/lglock.h
@@ -150,7 +150,7 @@
 	int i;								\
 	preempt_disable();						\
 	rwlock_acquire(&name##_lock_dep_map, 0, 0, _RET_IP_);		\
-	for_each_online_cpu(i) {					\
+	for_each_possible_cpu(i) {					\
 		arch_spinlock_t *lock;					\
 		lock = &per_cpu(name##_lock, i);			\
 		arch_spin_lock(lock);					\
@@ -161,7 +161,7 @@
  void name##_global_unlock(void) {					\
 	int i;								\
 	rwlock_release(&name##_lock_dep_map, 1, _RET_IP_);		\
-	for_each_online_cpu(i) {					\
+	for_each_possible_cpu(i) {					\
 		arch_spinlock_t *lock;					\
 		lock = &per_cpu(name##_lock, i);			\
 		arch_spin_unlock(lock);					\
-- 
1.7.2.1

--

From: Linus Torvalds
Date: Wednesday, August 25, 2010 - 1:00 pm

Grrr. Same disease as Nick and others. Why do you repeat the subject
line in the body? Don't do that. We don't want the summary line twice
in the commit message, and we don't want it twice in the email.


Can you say what this actually matters for? Don't we do stop-machine
for CPU hotplug anyway? And if we don't, shouldn't we? Exactly because
otherwise "for_each_online_cpu()" is always racy (and that has nothing
to do with the lglock).

                  Linus
--

From: Jonathan Corbet
Date: Wednesday, August 25, 2010 - 1:16 pm

On Wed, 25 Aug 2010 13:00:59 -0700

Sorry, I just pasted in the "git format-patch" output.  Will never ever

As I understand it from Nick (after I asked him why the two lock
primitives were identical): the files_lock scalability work puts a
per-CPU list of open files into each superblock.  A CPU can be offlined
while there are open files in "its" lists, and nothing is done to shift
those files to a still-online CPU's list.  So there will still be
cross-CPU accesses to those lists as those files are closed; that means
we need to be sure to acquire locks associated with offline CPUs if we
want to avoid races.

lg_global_lock_online() is used (only) in the brlock implementation,
instead.  In this case, there's no leftover data if a CPU goes
offline, so no need to take locks associated with offline CPUs.

jon
--

From: Nick Piggin
Date: Wednesday, August 25, 2010 - 9:23 pm

Yep, thanks Jon, I owe a bit more documentation in that file, coming up.


--

From: Tejun Heo
Date: Thursday, August 26, 2010 - 1:55 am

Hello,


We only do stop-machine for cpu downs not ups, so code running w/
preemption disabled is guaranteed that no cpu goes down while it's
running but not the other way around.  There are two ways to achieve
synchronization against cpu up/down operations.  One is explicitly
using get/put_online_cpus() and the other is via cpu notifiers with
proper synchronization.

So, yeah, given that there's no cpu notifier implemented, the use of
for_each_online_cpu for brlock seems fishy to me.  It probably should
use for_each_possible_cpu().

Thanks.

-- 
tejun
--

From: Nick Piggin
Date: Thursday, August 26, 2010 - 2:46 am

Oh, I thought we quiesce / preempt all online cpus before adding
another one. That sucks if we don't because then you need a big
heavy get_online_cpus when a simple preempt_disable would have
worked.

Why is that? Don't tell me realtime people want some latency "guarantee"

It should if that's the case, yes.

Thanks,
Nick

--

From: Tejun Heo
Date: Thursday, August 26, 2010 - 2:49 am

Hello,


Probably similar rationale of not doing stop_machine() on module
unload, I suppose.  Onlining something is usually considered hotter
path than offlining.  Performance penalty caused by the difference
between possible and online cpumask or cpu onlining probably only
matters for very large machines and on those machines stop-machine is
very expensive.  If there's a pressing need, doing stop_machine for

IMHO, in most configurations the difference between possible and
online cpumasks doesn't matter much (they're the same during normal
operation), so just using possible cpumask should be fine.  It's
already a pretty heavy path, right?

Thanks.

-- 
tejun
--

From: Tejun Heo
Date: Thursday, August 26, 2010 - 2:50 am

s/unload/load/ :-)

-- 
tejun
--

From: Peter Zijlstra
Date: Thursday, August 26, 2010 - 3:08 am

I would argue against that.. we should try and rid ourselves of
stopmachine, not add more dependencies on it. If you want to sync
against preempt_disable thingies synchronize_sched() is your friend.
--

From: Nick Piggin
Date: Thursday, August 26, 2010 - 4:38 am

I don't think it's that easy to do it in hotplug handlers.

Say take a brlock (not the best example because the write side
is a slowpath itself, but I could imagine better cases), then
you actually want to be able to prevent cpu online map from
changing for the entire period of a lock/unlock.

The lock/unlock is preempt disabled. synchronize_sched in the
plug handler will not really do anything, because there could
be subsequent write locks coming in from other CPUs at any
time during the handler, before or after synchronize_sched runs.

I think for CPU plug, stop_machine is reasonable (especially
considering it is required in unload, which means any frequent
amount of cpu plug activity already will require stop_machine to
run anyway).

--

From: Peter Zijlstra
Date: Thursday, August 26, 2010 - 4:45 am

How is it required? 

Its currently implemented as such, and its sure a lot easier to do that
way, but I could imagine that unplugging a CPU could be done without it.


--

From: Peter Zijlstra
Date: Thursday, August 26, 2010 - 4:49 am

Not that reworking unplug to not use stop_machine is anywhere near my
todo list, but it would be really nice to have ;-)
--

From: Nick Piggin
Date: Thursday, August 26, 2010 - 10:51 pm

I would much prefer the rules to be simpler and easier for all
other kernel code, and keep complexity and overheads in cpu
plug/unplug.

I don't see what is so nice about stop_machine()less cpu plug/unplug
or module unload. Module load definitely is nice because you can
have a lot of modules and on demand loading from non-privileged
operations.

--

From: Peter Zijlstra
Date: Friday, August 27, 2010 - 12:57 am

There's a large three lettered company that's promoting cpu hotplug use
for power management, they hotplug at an astonishing rate.

I've been saying hotplug isn't for that, and its a terrible slow path
and disrupts your whole machine etc.. they don't seem to care much.

The trouble seems to be that some smaller chips are now also wanting to
use hotplug for PM because they want to safe silicon and not make their
chips symmetric or crap like that...

Its all really sad, and I'd prefer to keep hotplug slow and simple and
simply never use it except when you like suspend your laptop -- but even
there, people want auto-demand suspend crazy lot.
--

From: Peter Zijlstra
Date: Friday, August 27, 2010 - 12:59 am

Oh, and the virt community seems to want (or is) doing the same silly
thing.
--

From: Peter Zijlstra
Date: Thursday, August 26, 2010 - 3:08 am

I think its only done because we can, don't use kstopmachine if you
don't have to etc..

We tell people who do RT not to hotplug, not load modules etc..
--

Previous thread: Allocating memory aligned on 1M boundary? by david.hagood on Wednesday, August 25, 2010 - 11:35 am. (2 messages)

Next thread: [PATCH] gpio: Add generic driver for simple memory mapped controllers by Anton Vorontsov on Wednesday, August 25, 2010 - 12:42 pm. (3 messages)