Hi Steven,
On Sat, Apr 3, 2010 at 5:45 AM, Steven Rostedt <rostedt@goodmis.org> wrote:
I added the RB_WARN_ON(ret == RB_PAGE_MOVED) in rb_set_head_page()
as you suggested and I think it has helped me figure out the problem.
I saw a warning triggered by this WARN_ON this morning and realized
that although we do not read from interrupt context, we sometimes
call ring_buffer_empty() from a timer interrupt handler that checks whether
new data has arrived in the trace buffer and, if so, wakes up the
user-space reader. ring_buffer_empty() calls rb_set_head_page(),
which can move the head_page. As far as I understand, it should be
OK for ring_buffer_empty() to preempt a writer, so I guess we should leave
that RB_WARN_ON out of rb_set_head_page(). The problem in our case
is that we use our own locking mechanism to guarantee a single reader
instead of using cpu_buffer->reader_lock, so the reader is not synchronized
with ring_buffer_empty(). When ring_buffer_empty() is called while
we are in the process of swapping the reader_page and head_page, the
head_page pointer can point to the old head, i.e., the new reader_page,
and we enter an infinite loop.
I wrapped our rb_get_reader_page() calls with the cpu_buffer->reader_lock
spinlock, and that seems to have solved the problem.
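
To illustrate the idea, here is a minimal user-space sketch of the fix,
not the actual kernel code. Every name in it (model_buffer, model_lock,
model_get_reader_page, model_empty, MODEL_NPAGES) is hypothetical, and a
C11 atomic_flag stands in for cpu_buffer->reader_lock. It only shows that
once the reader-page swap and the emptiness check take the same lock, the
check can no longer observe the half-finished swap.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

#define MODEL_NPAGES 4  /* hypothetical ring size */

/* Hypothetical model: NOT the kernel's ring_buffer_per_cpu layout. */
struct model_buffer {
    atomic_flag reader_lock;          /* stands in for cpu_buffer->reader_lock */
    size_t unread[MODEL_NPAGES + 1];  /* unread entries per page */
    int ring[MODEL_NPAGES];           /* page index held by each ring slot */
    int head;                         /* ring slot of the head page */
    int reader_page;                  /* page currently owned by the reader */
};

static void model_lock(struct model_buffer *b)
{
    while (atomic_flag_test_and_set_explicit(&b->reader_lock,
                                             memory_order_acquire))
        ;  /* spin until the flag is released */
}

static void model_unlock(struct model_buffer *b)
{
    atomic_flag_clear_explicit(&b->reader_lock, memory_order_release);
}

static void model_init(struct model_buffer *b)
{
    atomic_flag_clear(&b->reader_lock);
    for (int i = 0; i < MODEL_NPAGES; i++) {
        b->ring[i] = i;
        b->unread[i] = 0;
    }
    b->unread[MODEL_NPAGES] = 0;
    b->head = 0;
    b->reader_page = MODEL_NPAGES;    /* spare page starts outside the ring */
}

/* Reader side: swap the spare reader page with the head page under the lock. */
static int model_get_reader_page(struct model_buffer *b)
{
    model_lock(b);
    int old_reader = b->reader_page;
    b->reader_page = b->ring[b->head];
    /* Window: the head slot still names the page the reader just took. */
    b->ring[b->head] = old_reader;
    b->head = (b->head + 1) % MODEL_NPAGES;
    /* Invariant restored: the head page is never the reader page. */
    assert(b->ring[b->head] != b->reader_page);
    int page = b->reader_page;
    model_unlock(b);
    return page;
}

/* Emptiness check: takes the same lock, so it never sees the window above. */
static bool model_empty(struct model_buffer *b)
{
    model_lock(b);
    bool empty = b->unread[b->reader_page] == 0 &&
                 b->unread[b->ring[b->head]] == 0;
    model_unlock(b);
    return empty;
}
```

The sketch does not model interrupts. In the kernel, a lock shared with
a timer interrupt handler would also have to be taken with interrupts
disabled on the reader side, or the handler could spin on a lock held by
the context it interrupted.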
Thank you very much for the help!
Jiaying