I tried several ways of working around the bug, and even tried
implementing it with kernel threads and protecting the global data with
mutexes. Therefore I know that I have the same problem with mutexes. I
just created a simple example that shows the problem quickly; this
does not mean that it is the only case that does not work.
BTW: I have been hacking around in the PREEMPT-RT kernel for years now, I
know the history very well, and I know what I am doing... Please do
not look for holes in the example I quickly hacked together; please focus
on the OOPS message, and help me figure out what is causing it. I
can give you other examples of code that show the same problem. But,
basically, every call that blocks with _interruptible() on an rt-mutex
beneath the surface in the context of a user space process (and
receives a signal during the block) shows the same problem.
Okay, sounds fair. But: the current implementation does count the
number of up()s and down()s, suggesting that it really behaves like a
semaphore. It only does some special things during the transition of
the counter from 0->1 and from 1->0. If this counting is an illegal use
of the mutex mechanism, then the code should enforce that:
* sema_init() used on 'struct semaphore' should give a (compile) error;
init_MUTEX() must be used instead.
* sema_init() should only be usable on 'struct compat_semaphore' types.
* Several calls to up() in a row, or to down() in a row, should report
a BUG message.
* If up() is called from a different thread than the down(), it should
report a BUG message.
* Counting up()s and down()s is not allowed on 'struct semaphore'
types, so the counting should be removed from the code.
* PI should only take place if it is 100% sure that the 'struct
semaphore' is used as a mutex, and this is only the case when it is
initialised with init_MUTEX().
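To make the kind of checks meant above concrete, here is a minimal
user-space sketch of a strictly binary, owner-checked lock. It is only a
model under stated assumptions: strict_mutex, strict_down() and
strict_up() are hypothetical names, not kernel API, the thread identity
is passed in as a plain integer, and nothing here blocks or does PI.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdio.h>

/* Hypothetical user-space model; NOT the kernel's 'struct semaphore'. */
struct strict_mutex {
    bool locked;  /* binary state instead of a counter */
    int owner;    /* id of the thread holding the lock, -1 if none */
    int bugs;     /* number of misuse reports so far */
};

static void strict_mutex_init(struct strict_mutex *m)
{
    assert(m != NULL);
    m->locked = false;
    m->owner = -1;
    m->bugs = 0;
}

/* Returns 0 on success, -1 if the lock is already held (a real
 * implementation would block here instead of returning). */
static int strict_down(struct strict_mutex *m, int tid)
{
    if (m->locked)
        return -1;
    m->locked = true;
    m->owner = tid;
    return 0;
}

/* Returns 0 on success, -1 after reporting a misuse. */
static int strict_up(struct strict_mutex *m, int tid)
{
    if (!m->locked) {
        /* up() without a preceding down(): counting is not allowed */
        fprintf(stderr, "BUG: up() on an unlocked mutex\n");
        m->bugs++;
        return -1;
    }
    if (m->owner != tid) {
        /* up() from a different thread than the down() */
        fprintf(stderr, "BUG: up() by %d, but owner is %d\n",
                tid, m->owner);
        m->bugs++;
        return -1;
    }
    m->locked = false;
    m->owner = -1;
    return 0;
}
```

With such checks in place, an up() from another thread or a second
up() in a row is reported right away instead of silently incrementing
a counter.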
So, because none of these checks are there, I doubt it is really true
that it is illegal to use 'struct semaphore' types as counting
semaphores across multiple threads. BESIDES: everything works fine
UNTIL a signal is generated during a block on the semaphore. I think
Ingo tried to make the 'struct semaphore' type behave like the
non-RT kernel 'struct semaphore', which actually does NOT show this
problem with my example driver!
So, this is a regression when exactly the same driver is used on both
a kernel without the preempt-rt patch and a kernel with it.
According to the code there is a counting mechanism there, which suggests
that this is allowed. It works fine until a signal arrives. The
SIGNAL is the only problem here!
I hope you get my point now. Other mechanisms like ordinary rt-mutexes
show the same problem, so in either case: please help me figure out
which bug the signal is triggering here.
Kind Regards,
Remy Bohmer
2007/11/17, Steven Rostedt <rostedt@goodmis.org>: