ChangesLog:
added MAGIC_SYSRQ support. removed notify_die handler and changes
to drivers/char/keyboard.c. save or restrive pt_regs for cases where
panic is called from non-exception context. handled cases
where the per_cpu registers and context are not valid/saved
when panic is called from non-exception contexts.removed more checkpatch.pl messages. lots more messages to still
work on. volatiles left in the code due to the previously stated
(and still present) severe breakage of the GNU compiler with SMP
shared data. most of the barrier() functions are just plain broken
and do not result in proper compiler behavior in this tree. it
helps having a working debugger to actually see these compiler problems.2.6.27-rc4 1/27 mdb: export genapic and machine_restart functions
export genapic and machine_restart functions
2.6.27-rc4 2/27 mdb: add makefile for Merkey's Linux Kernel Debugger module
add makefile for Merkey's Linux Kernel Debugger module
2.6.27-rc4 3/27 mdb: add local Makefile for Merkey's Linux Kernel Debugger module
add local Makefile for Merkey's Linux Kernel Debugger module
2.6.27-rc4 4/27 mdb: add mdb-base.c architecure independent functions
add mdb-base.c architecure independent functions
2.6.27-rc4 5/27 mdb: add mdb-base.h architecure indepedent include
add mdb-base.h architecure indepedent include
2.6.27-rc4 6/27 mdb: add mdb.h architecure main include
add mdb.h architecure main include
2.6.27-rc4 7/27 mdb: add mdb-ia32.c Intel x86 architecure functions
add mdb-ia32.c Intel x86 architecure functions
2.6.27-rc4 8/27 mdb: add mdb-ia32.h Intel x86 architecure includes
add mdb-ia32.h Intel x86 architecure includes
2.6.27-rc4 9/27 mdb: add mdb-ia32.c Intel x86 architecure function includes
add mdb-ia32.c Intel x86 architecure function includes
2.6.27-rc4 10/27 mdb: add mdb-ia32-support.c disassembler support
add mdb-ia32-support.c disassembl...
Can you provide explicit detail?
By using barrier() the compiler should clobber all its memory and
registers therefore forcing a write/reload of the variable.--
I hope Jeff didn't try mere barrier()s only. smp_wmb() and smp_rmb()
are the more relevant barrier variants for mdb, from what I remember
when I last looked at it.
--
Stefan Richter
-=====-==--- =--- =-=-=
http://arcgraph.de/sr/
--
Sure, but volatile isn't a replacement for memory barriers.
--
Let's face it, the C standard does not support concurrency, so we are
all in a state of sin in any case, forced to rely on combinations of
gcc-specific non-standard language extensions and assembly language.Could be worse!!!
Thanx, Paul
--
It _will_ be worse.
The C standard will eventually support concurrency (they are working on
it), and it will almost inevitably be a horrible pile of stinking sh*t,
and we'll continue to use the gcc inline asms instead, but then the gcc
people will ignore our complaints when they break the compiler, and say
that we should use the stinking-pile-of-sh*t ones that are built in.No, I haven't seen the drafts, and maybe I'm overly pessimistic, but I'm
pretty sure that this is an area where (a) the kernel needs more support
than most normal pthread-like models and (b) any design-by-committee thing
simply won't be very good, because they'll have to try to make everybody
happy.Oh, well.
Linus
--
And not only that, it will be worse before the new standard is ratified.
Let's see...
1. Yes, the standard is being designed by a committee.
2. No, it does not simply ratify the Linux kernel's memory-barrier
API. Yes, I did propose that. No, the proposal did not go
very far, but yes, it did get people's attention. Yes, there
were a number of people who asserted that Linux's approach was
fundamentally broken/buggy/whatever. Yes, they have changed
their mind, most of them, anyway. No, there is still zero
chance of the standard simply ratifying smp_mb() and friends.3. Yes, we are working to get things closer to the Linux kernel's
memory-barrier API into the spec. No, they will likely not
be exact. Yes, getting some prominent committee members to
countenance the idea of any sort of raw memory barrier (not
connected directly to a load, store, or atomic operation)
was a long-term project that finally made headway last May.
Again, design by committee.4. No, the current Linux kernel memory-barrier API is not perfect.
Yes, the proposed standard's preferred memory-barrier approach
will make code easier to read and understand in many cases, and
could potentially allow the compiler to do better optimizations
(which, yes, might or might not be a good thing). No, the
proposed standard's preferred approach does not apply to all the
cases in the Linux kernel. No, I don't know whether or not it
will be worthwhile to introduce the standard's preferred approach
to the Linux kernel (though I suspect that it would be, but have
to wait and see). Either way, yes, the Linux kernel will likely
continue to need to resort to non-standard gcc extensions and
assembly language in at least some cases. As you say, the kernel
is not a garden-variety pthreads application.But even if the Linux kernel never uses this stuff, user applications
will need it. And there are a number of reasons why gcc extensions and
assembly language are even less welcome in many such a...
Ok, I have looked at the draft now, and I don't think I was overly
pessimistic.If I read it right, all the memory ordering operations are defined for
_single_ objects. So if you want to do the kernel kind of memory ordering
where you specify ordering requirements independently of the actual
accesses (perhaps because the accesses are in some helper function that
doesn't care, but then you want to "finalize" the thing by stating a
sequence point), it seems to be impossible with current drafts.Oh, well. Nothing lost. I didn't expect the thing to work.
Linus
--
You are looking for atomic_fence() on page 1168 (1154 virtual) of the
most recent draft. The current semantics are not correct, but this is
being worked. And yes, it does currently have a variable associated with
it, but it acts as a bare fence nevertheless. There is a proposal to;-)
Thanx, Paul
--
And the proposal for variable-free memory-ordering operations was voted
into the draft standard:http://open-std.org/jtc1/sc22/wg21/docs/papers/2008/n2731.html
Still not perfect, of course, but hopefully movement in the right
direction.Thanx, Paul
--
Nevertheless, an analysis of which particular parts of code generation
are insufficient if one particular volatile qualification is removed is
IMO likely to turn up places in mdb where a clearer or/and more
efficient implementation is possible. (Based on what I saw a few
revisions ago; I haven't looked at the current one yet.)
--
Stefan Richter
-=====-==--- =--- =-=-=
http://arcgraph.de/sr/
--
I used the smp_wmb() functions. I noted a couple of things. a) some of
these macros just emit __asm__ __volatile__ into the code so why not just
say "volatile" to begin with b) smp_wmb() in some cases worked and in
other cases jut optimized away the global reference. c) I can go back and
break the code again by inserting them and building broken assembler d) I
ave been doing hardware and software design since the early 1980;s, I
invented SMP affinity scheduling, and yes, I understand barriers and this
concept of instruction score-boarding and optimization very well -- its
not an excuse for a busted C compiler.It did not break all the places in the code, but broke enough for SMP to
lock up and fail, It turned global variables into local variables. If
you want me to reproduce this I can but it will have to wait til this
evening
because I have some product releases to get out the door at Omega 8 today.It's simple to reproduce. Take away the volatile declaration for the
rlock_t structure in mdb-ia32.c (rlock_t debug_lock) in all code
references and watch the thing lock up in SMP with multiple processors in
the debugger each stuck with their own local copy of debug_lock.The barrier functions do not appear to work in all cases.
Jeff
--
I'm having a quick look at mdb-2.6.27-rc4-ia32-08-20-08.patch at the
moment. Speaking of debug_lock()...:You use spin_trylock_irqsave((spinlock_t *)&rlock->lock, rlock->flags)
in there. Minor nit: The pointer type cast is unnecessary.Major problem: rlock->flags is wrong in this call. Use an on-stack
flags variable for the initial spin_trylock_irqsave. Ditto in the
following call of spin_trylock_irqsave.Next major problem with debug_lock() and debug_unlock(): The reference
counting doesn't work. You need an atomic_t counter. Have a look at
the struct kref accessors for example, or even make use of the kref API.
Or if it isn't feasible to fix with atomic_t, add a second spinlock to
rlock_t to ensure integrity of .count (and of the .processor if necessary).Furthermore, I have doubts about the loop which is entered by CPU B
while CPU A holds the rlock. You are fully aware that atomic_read(a) &&
!atomic_read(b) in its entirety is not atomic, I hope.All this aside, I don't see *anything* in debug_lock and _unlock which
would necessitate volatile. Well, volatile might have papered over some
of these bugs.PS:
Try to cut down on #if/#endif clutter. It should be possible to reduce
them at least in .c files; .h are a different matter. For example,
#if MDB_DEBUG_DEBUGGER
DBGPrint("something");
#endif
can be trivially reduced to
dbg_mdb_printk("something");
where dbg_mdb_printk() is defined as an inline function which does
nothing when MDB_DEBUG_DEBUGGER is false.PS2:
Why are there this many debug printks anyway?
--
Stefan Richter
-=====-==--- =--- =-=-=
http://arcgraph.de/sr/
--
The code works in debug lock provided this memory location is actually
SHARED between the processors. The various race conditions you describe
are valid cases, but he only modifier of .count and .lock is the processor
that obtains the spinlock -- the rest are readers. This code works well,
of course, when this memory location is actually SHARED between the
processors and the read/write operations serialized.Even in SMP, at various times it is necessary for the processors to
perform serializing operations. You cannot in checker-scoreboard land for
everything.Jeff
--
jmerkey@wolfmountaingroup.com wrote:
Regarding rlock->count, I stand corrected: It works correctly if the
debug_lock()...debug_unlock() region can be entered by up to two
contexts and if the second one to enter cannot be preempted by the first
one.However, since these regions are enclosed in preempt_disable/enable and
since the first one to grab the rlock->lock keeps local interrupts
disabled until debug_unlock() or even longer, there is always only a
single context in the debug_lock()...debug_unlock() region --- which
defeats the whole purpose of the rlock_t. Or what am I missing /now/?Independently of this, you cannot use rlock->flags like you currently
do. spin_trylock_irqsave would overwrite the flags of CPU A by the
flags of CPU B, making the results of spin_unlock_irqrestore in
debug_unlock unpredictable.
--
Stefan Richter
-=====-==--- =--- =-=-=
http://arcgraph.de/sr/
--
Hmmm....
Yep, that may be correct. I'll fix that one since it is outside of the
lock. At this point, the flags in the processors are all going to be
identical at this point in the code so it would not cause any problems
(local debugger state would not care), so the debugger would work with
this and does. But I will change that and repost the patch this evening.You have good eyes by friend.
--
It is not the same as volatile type. What it does is tell the compiler
to clobber all registers or temporaries. This something pretty wellLinux barriers aren't going to force a load to be emitted, if it can be
optimized away. If it optimized away a store, then I'd like to see aThe point is not whether it is possible to work with volatile types, but
that we tend not to use them in Linux to deal with concurrency.Also, barriers seem to work fine for everybody else, so I think it is
likely you either aren't using them correctly, or have other bugs in theYou should disable preempt before getting the processor id. Can't see any
other possible bugs, but you should be able to see from the disassembly
pretty easily.
--
No, that's not what "asm volatile" means. Its *only* meaning is "emit
this, even if it doesn't look like it has side-effects and its results
are not used". An asm() with no outputs is "volatile" by default, which
makes most of the uses of "asm volatile" in the kernel redundant. "asm
volatile" also has no effect on the ordering of the asm with respect to
other code; you must use constraints to do that.An asm with a "memory" clobber is sufficient to make sure that gcc
doesn't cache memory values in registers; perhaps that's what you mean.J
--
That is what I meant, yes.
--
Well, the gcc docs at _used_ to say that volatile asms are not moved
around "significantly". They've removed that.Linus
--
Results from Analysis of GCC volatile/memory barriers
Use of volatile will produce the results intended in those files which
have shared data elements, but will also result in some cases global data
which is not referenced outside of a file and which has not also been
declared as volatile as being treated as static and optimized
into local variables in some cases.If volatile is avoided entirely, the compiler appears to make correct
assumptions about whether or not it is in fact global memory references.
My conclusion is that the code generation of gcc appears correct and
in fact does a better job than Microsoft's implementation of shared
data management on SMP systems.If you use volatile, and use optimization at the same time, the compiler
will take you at your word and potentially optimize global references into
local variables. This is in fact better than MS cl which will ALWAYS
optimize global data into local if volatile is not used with SMP data within
a single file.If you choose to use volatile, then you had better use it on every variable
you need shared between processors -- or just leave it out entirely -- and
gcc does appear figure it out code references properly (though some of them
are quite odd).While this may be counter-intuitive, it makes sense. When you are using
volatile, you are telling the compiler anything not declared as volatile
witihin a given file is fair game for local optimization if you turn on
optimization at the same time.Analysis
Code Generation for two atomic_t variables. One an array and the other
standalone -- macros in kernel includes and their interactions
with the compiler may be the basis of some of these cases.atomic_inc(&debuggerActive);
atomic_inc(&debuggerProcessors[processor]);55a0: f0 ff 05 00 00 00 00 lock incl 0x0
55a7: 8d 2c 8d 00 00 00 00 lea 0x0(,%ecx,4),%ebp
55ae: 8d 85 00 00 00 00 lea 0x0(%ebp),%eax
55b4: 89 04 24 mov %eax,(%esp)
55b7: f...
So the result was that barrier()s actually worked? Great.
BTW if you did this work it might be worthwhile to go one further
step: gcc has a test suite where they run small programs and have a
simple facility to scan the resulting assembler code for specific
patterns. If you could bring your examples into that format (as in
being stand alone compiled and have a simple "yes/no" pattern to
check) and let it be included there that might make sure this always
keeps working in the future.-Andi
--
ok.
Jeff
--
Yep, they seem to do that kind of thing specifically to annoy you. But
it does mean we can't rely on it at all any more.J
--
Right, asm volatile () means that the asm may not be discarted. Very
Not sure - I think all barrier clobber the full register and memory set.
So if you access a variable after a barrier it will have to issue a
load.Well, there is of course the third option, which is what Jeff claims,
that gcc is broken. But in that case we should have more problems
elsewhere too.--
IIRC a barrier only clobbers memory. gcc must reload a variable from
memory unless it can prove the variable's address has not escaped anywhere.So:
void f()
{
int v;v = g();
barrier();
do_domething_with(v);
}Need not reload v from memory (indeed, v can be in a register for its
entire lifetime), butvoid f()
{
int v;v = g();
h(&v);
barrier();
do_domething_with(v);
}Will force v into memory, and reload it after the barrier.
--
error compiling committee.c: too many arguments to function--
Here is one example (which might or might not be what Nick had in mind):
extern int v;
void foo(void)
{
do_something_with(v);
barrier();
do_something_else_with(v - v);
}The second set of loads from v can be optimized away unless v is
declared volatile. In contrast:void bar(void)
{
do_something_with(v);
barrier();
do_something_else_with(v);
}Given the amount of code in gcc, we can reasonably assume that some
aspects of it are broken. Whether that presumed breakage is affecting
Jeff is another question altogether. ;-)Thanx, Paul
--
Ah, right. But in that case:
v-v := tmp1 = v; tmp2 = v; sub tmp1,tmp2;
Which you can of course write out more explicitly in C as well and
insert a barrier between the two reads of v, giving the same effect as
volatile.Still, point taken.
--
I'll create some cases this evening showing how GCC a) optimizes away
global references, even when you tell it not to (volatile is the only
thing that seems to work) b) takes global defines and turns them into
stack variables (WHAT A GREAT THING TO DO WITH ONLY A 4K STACK) when it
has bloated out on register aliases from all the global data it is turning
into local variables in registers or on the stack.Jeff
--
debug_lock() is AFAICS only called from contexts which have preemption
disabled. Last time around I recommended to Jeff to document this
requirement on the calling context.But even though preemption is disabled, debug_lock() is still incorrect
as I mentioned in my other post a minute ago. It corrupts its .flags
and .count members. (Or maybe it coincidentally doesn't as long as
volatile is around.)
--
Stefan Richter
-=====-==--- =--- =-=-=
http://arcgraph.de/sr/
--
I'm not talking about where debug_lock gets called, I'm talking about
where the processor id is derived that eventually filters down toI don't think so. And flags should only be restored by the processor
that saved it because the spinlock should disable preemption, right?
--
You are right, I replied to fast. debug_unlock() retrieves the
OK; the .count seems alright due to restrictions of the calling
contexts. About .flags: Jeff, can the following happen?- Context A on CPU A has interrupts enabled. Enters debug_lock(),
thus disables its interrupts. (Saves its flags in rlock->flags with
the plan to enable interrupts later when leaving debug_unlock()
provided it does so as last holder.)- Context B on CPU B happens to have interrupts disabled. Enters
debug_lock(), overwrites rlock->flags with its different value.
(Spins on the rlock which is held by CPU A.)- Context A on CPU A leaves debug_unlock. Doesn't re-enable its
interrupts as planned, since rlock->flags is the one from CPU B.
--
Stefan Richter
-=====-==--- =--- =-==-
http://arcgraph.de/sr/--
It should be benign since both procs have ints off here.
Stefan,
The flags needs fixing, you are right, however, since this case only
occurs when two processors both have an int1 exception (or some exception
other than an NMI) and ints are disabled here anyway, its benign. That
being said, it needs to be perfect. So I moved the flags to a stack
variable.I will post a new patch series correcting the flags issue along with code
fragments showing the gcc breakage with debug lock later tonight. I have
a lot to get done at omega8 on appliance releases this week, and being aJeff
--
jmerkey@wolfmountaingroup.com wrote:
Ah, thanks. I missed the bigger picture.
--
Stefan Richter
-=====-==--- =--- =-==-
http://arcgraph.de/sr/
--
Even if you use the smb_wmb() macros around the debug_lock, the compiler
still optimizes the debug_lock into a local variable. After watching the
broken behavior, the fact that some of these macro's emit __asm__
__volatile__ anyway, I just left the vars declared volatile. Its a
debugger, so its probably ok for the kernel debugger to use some volatile
data.--
Hehe, still, a little birdie told me they are working on it and perhaps
someone with clue could enlighten us on their direction.Still, I'd like Jeff to show his C, the resulting asm and the intent for
the volatile and barrier versions of his code (well, little snippets of
his code obviuosly).Either he doesn't understand barriers (nothing to be ashamed about), or
we might have more trouble lurking in the rest of the kernel.--
Well, I guess you guys will be the judge of that. Or one of the judges,
at least. ;-)One advantage of the current c++0x approach is that it allows extremely
weak memory barriers to be used in many cases that would require smp_rmb()
in current Linux kernel. If you are crazy enough to want to see a
sneak preview in standardese, try all 10MB of:http://open-std.org/jtc1/sc22/wg21/docs/papers/2008/n2691.pdf
Section 1.10 (physical page 25, logical page 11) describes the memory model.
Sections 29 and 30 describe the operations (physical page 1155, logical
page 1141). The C and C++ guys got together ahead of time and agreed to
work together towards a compatible solution.And rcu_dereference() would be implemented in terms of memory_order_consume,
Sounds fair to me!
Thanx, Paul
--
I have thoroughly reviewed Linux memory barriers and the efficacy of the
barriers as defined in Linux are not the issue here. the code segment
discussed sits and spins on a variable waiting for a specific state, and
its a spinlock which creates a hard barrier, so no amount of barrier usage
should nor does matter here. Even if a processor was late in flushing its
writes, sooner or later the spinning processor would see the change in the
shared memory address -- IF IT WERE ACTUALLY A SHARED REFERENCE. What I
am seeing is not an issue of races between processors on load/store
operations, but cases where gcc has chosen to optimize away global
references entirely.Jeff
--
| Greg Kroah-Hartman | [PATCH 012/196] nozomi driver |
| Ingo Molnar | Re: [patch 00/13] Syslets, "Threadlets", generic AIO support, v3 |
| Rafael J. Wysocki | [PATCH -mm 5/6] Freezer: Remove PF_NOFREEZE from bluetooth threads |
| Ingo Molnar | Re: [PATCH 00/23] per device dirty throttling -v8 |
git: | |
| David Miller | [GIT]: Networking |
| David Miller | Re: [PATCH] pkt_sched: Destroy gen estimators under rtnl_lock(). |
| Gerrit Renker | [PATCH 15/37] dccp: Set per-connection CCIDs via socket options |
| Natalie Protasevich | [BUG] New Kernel Bugs |
