>
> spin_lock_irqsave(&base->cpu_base->lock, *flags);
>
> next event on cpu_4 :
>
> ftrace_entry: 144.034294878 (/tmp/traceirq7/cpu_4), 0, 0,
> swapper, , 0, 0x0, IRQ
> { ip = 0xFFFFFFFF802595EC <remove_hrtimer+0x1c>,
> parent_ip = 0xFFFFFFFF802597D0 <hrtimer_start+0x40> }
>
> So, what my researchs point me to is the base->cpu_base->lock being
> contended for 60 ms. More to come about the events surrounding this area
> on other CPUs.
>
> Therefore, I dump the trace for all the lock events linked to the
> 0xffff880028176c38 lock for the time period around the contention. I get
> the following event for lock release on CPU 4. I have put the previous
> lock release from CPU 2 occurring just before. Actually, I see that CPU
> 2 takes and releases this lock many times while cpu 4 contends for it (15
> acquire/release cycles).
>
> locking_lock_release: 144.030594439 (/tmp/traceirq7/cpu_2), 0, 0,
> swapper, , 0, 0x0, SOFTIRQ
> { retaddr = 0xFFFFFFFF802595B8 <hrtimer_run_pending+0xc8>,
> lock = 0xffff880028176c38, nested = 1 }
> locking_lock_release: 144.034296600 (/tmp/traceirq7/cpu_4), 0, 0,
> swapper, , 0, 0x0, IRQ
> { retaddr = 0xFFFFFFFF80258FD0 <switch_hrtimer_base+0x70>,
> lock = 0xffff880028176c38, nested = 1 }
>
> CPU 2 acquire/release cycles are identical :
>
> locking_lock_acquire: 143.954580205 (/tmp/traceirq7/cpu_2), 0, 0,
> swapper, , 0, 0x0, SOFTIRQ
> { retaddr = 0xFFFFFFFF80259532 <hrtimer_run_pending+0x42>,
> subclass = 0, lock = 0xffff880028176c38, trylock = 0 }
> locking_lock_release: 143.954581990 (/tmp/traceirq7/cpu_2), 0, 0,
> swapper, , 0, 0x0, SOFTIRQ
> { retaddr = 0xFFFFFFFF802595B8 <hrtimer_run_pending+0xc8>,
> lock = 0xffff880028176c38, nested = 1 }
>
> It therefore seems that
>
> run_hrtimer_pending() iterates on 15 cpu_base->cb_pending callbacks,
> which takes a while. Note that the fact that the cpu_base->lock is
> released between each callback does not seem to give the control back to
> the contended CPU 4 probably because it runs udelay, given that spinlock
> debugging options are enabled.
>
> So let's see which functions CPU 2 is calling during the CPU 4
> contention period :
>
> here is how I did it :
> ./lttv -m textDump -t /tmp/traceirq7 \
> -e "event.time>=143.974542821&event.time<=144.034296600&state.cpu=2& \
> (event.name=locking_lock_acquire|event.name=locking_lock_release| \
> event.name=ftrace_entry)"
>
> The output, rather long, shows the functions called with the spinlock
> held. I made it available at the following address along with the kernel
> config (for 2.6.27-rc2) and the vmlinux image (to resolve the symbols)
>
http://ltt.polymtl.ca/~compudj/60msdebug2/
>
> In the end, it seems that having a delay in __spin_lock_debug with
> CONFIG_DEBUG_SPINLOCK configs behaves badly when a lock is taken
> repeatedly; it behaves as if the spinlock is contended without being
> released, which can cause long contention periods affecting, amongst
> others, interrupt handlers. Given the TSC are synchronized on this
> computer, the responsible for such long delays between spinlock loops
> seems to be arch/x86/lib/delay.c:delay_tsc().
>
> Maybe adding some fairness to debug spinlock code would be welcome ?