Re: Possible race between direct IO and JBD?

To: Jan Kara <jack@...>
Cc: Badari Pulavarty <pbadari@...>, <akpm@...>, <linux-ext4@...>, <linux-kernel@...>
Date: Tuesday, April 29, 2008 - 1:49 pm

On Tue, 2008-04-29 at 14:43 +0200, Jan Kara wrote:

Thanks, I saw this piece of code after I posted.

Here are some details:

The customer workload involves both direct IO and buffered IO. They saw
EIO returned without any log messages. Initial probing via SystemTap
shows:

drop_buffers returns 0
try_to_free_buffers returns 0
try_to_release_page returns 0
drop_buffers returns 0
try_to_free_buffers returns 0
try_to_release_page returns 0

drop_buffers returns 0
try_to_free_buffers returns 0
try_to_release_page returns 0
invalidate_inode_pages2_range returns -5 (EIO)

drop_buffers returns 0
try_to_free_buffers returns 0
try_to_release_page returns 0

This indicates that the EIO comes from
invalidate_inode_pages2_range(), which tried to free the buffers but
failed.
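
To make the propagation concrete, here is a toy user-space model (not
kernel code) of how a release failure on one page could surface as -EIO
from the range invalidation. The dictionary fields and the helper names
invalidate_range() and release() are made up for illustration, and the
real loop's control flow is not modeled.

```python
EIO = 5

def invalidate_range(pages, try_to_release_page):
    """Toy model: report -EIO if any page with buffers cannot be released."""
    ret = 0
    for page in pages:
        if page["has_buffers"] and not try_to_release_page(page):
            ret = -EIO  # remember that at least one page could not be dropped
    return ret

def release(page):
    # Hypothetical stand-in: a page with busy buffers cannot be released.
    return 0 if page["busy"] else 1

pages = [
    {"has_buffers": True, "busy": False},
    {"has_buffers": True, "busy": True},   # e.g. a buffer still in use
]
result = invalidate_range(pages, release)
print(result)
```

In this model a single busy page is enough to turn the whole call into
-EIO, which matches the pattern in the trace above: a 0 from
try_to_release_page followed by -5 from invalidate_inode_pages2_range.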

We will try to add more debug information. Thanks for the suggestions.


However, since ext3 defines a releasepage method, the
try_to_free_buffers() failure should come from
try_to_release_page()->ext3_releasepage()->journal_try_to_free_buffers()->try_to_free_buffers(), rather than from try_to_release_page() calling try_to_free_buffers() directly.

If journal_try_to_free_buffers() calls try_to_free_buffers(), that means
the journal head has already been removed successfully by
journal_remove_journal_head(), so the buffer_jbd() safety check after it
is false, as expected. Otherwise try_to_free_buffers() would not be called.
In that case, I am not sure whether a race with the commit code is
possible. We seem to hold j_list_lock while
__journal_try_to_free_buffer() is taking the buffer off the
list.
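
The call path described above can be sketched as a small user-space
model. This is only an illustration of the dispatch order, not the
kernel implementation: the "jbd" and "busy" fields and the trace list
are assumptions, and locking (j_list_lock) is left out entirely.

```python
trace = []

def try_to_free_buffers(page):
    trace.append("try_to_free_buffers")
    return 0 if page["busy"] else 1      # fails while a buffer is busy

def journal_try_to_free_buffers(page):
    trace.append("journal_try_to_free_buffers")
    if page["jbd"]:                      # journal head could not be removed
        return 0
    return try_to_free_buffers(page)     # reached only once the jh is gone

def ext3_releasepage(page):
    trace.append("ext3_releasepage")
    return journal_try_to_free_buffers(page)

def try_to_release_page(page, releasepage=None):
    trace.append("try_to_release_page")
    if releasepage is not None:          # filesystem defines releasepage
        return releasepage(page)
    return try_to_free_buffers(page)     # generic fallback

# Journal head already removed, but the buffer is still busy.
page = {"jbd": False, "busy": True}
ret = try_to_release_page(page, releasepage=ext3_releasepage)
print(ret, " -> ".join(trace))
```

With "jbd" set to True the model returns 0 before try_to_free_buffers()
is ever reached, which is the "otherwise it would not be called" case.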

There are many other try_to_release_page() failures before the DIO EIO;
I am not sure where those are coming from.

Fortunately, Badari is able to reproduce this problem with a simple
buffered write and direct write to the same file on 2.6.25-git12. We could
add more debug info there to see whether we can get the counter and the jh
values out when trying to free a busy buffer.
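
For readers who want to see the I/O pattern, the following is a minimal
sketch of mixing buffered and O_DIRECT writes on one file. It is not
Badari's actual test case, and there is no claim that it triggers the
race. The function name mixed_write(), the 4096-byte block size, and the
temporary file location are all assumptions.

```python
import errno
import mmap
import os
import tempfile

BLOCK = 4096  # assumed alignment; some devices need a different size

def mixed_write(path, rounds=4):
    """Alternate buffered and O_DIRECT writes over the same block.

    Returns 0 on success or a negative errno (for example -5 for EIO).
    """
    o_direct = getattr(os, "O_DIRECT", None)
    if o_direct is None:               # platform without O_DIRECT
        return -errno.EINVAL
    buf = mmap.mmap(-1, BLOCK)         # anonymous mapping: page-aligned memory
    buf.write(b"d" * BLOCK)
    fds = []
    try:
        fds.append(os.open(path, os.O_RDWR | os.O_CREAT, 0o600))
        fds.append(os.open(path, os.O_RDWR | o_direct))
        buffered_fd, direct_fd = fds
        for _ in range(rounds):
            os.pwrite(buffered_fd, b"b" * BLOCK, 0)  # dirties the page cache
            os.pwrite(direct_fd, buf, 0)             # direct write, same block
        return 0
    except OSError as e:
        return -e.errno                # some filesystems reject O_DIRECT
    finally:
        for fd in fds:
            os.close(fd)
        buf.close()

with tempfile.TemporaryDirectory() as d:
    target = os.path.join(d, "mixed.dat")
    rc = mixed_write(target)
    size = os.path.getsize(target) if rc == 0 else None
print(rc)
```

To exercise ext3 the target path would need to live on an ext3 mount,
and a real reproducer would likely run the two writers concurrently
rather than in one loop.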


--
Messages in current thread:
[RFC] JBD ordered mode rewrite, Jan Kara, (Thu Mar 6, 1:42 pm)
Possible race between direct IO and JBD?, Mingming Cao, (Fri Apr 25, 7:38 pm)
Re: Possible race between direct IO and JBD?, Jan Kara, (Mon Apr 28, 8:26 am)
Re: Possible race between direct IO and JBD?, Badari Pulavarty, (Mon Apr 28, 1:11 pm)
Re: Possible race between direct IO and JBD?, Jan Kara, (Mon Apr 28, 2:09 pm)
Re: Possible race between direct IO and JBD?, Mingming Cao, (Mon Apr 28, 3:09 pm)
Re: Possible race between direct IO and JBD?, Jan Kara, (Tue Apr 29, 8:43 am)
Re: Possible race between direct IO and JBD?, Mingming Cao, (Tue Apr 29, 1:49 pm)
Re: Possible race between direct IO and JBD?, Andrew Morton, (Sat Apr 26, 6:41 am)
Re: [RFC] JBD ordered mode rewrite, Andreas Dilger, (Fri Mar 7, 7:52 pm)
Re: [RFC] JBD ordered mode rewrite, Jan Kara, (Mon Mar 10, 3:54 pm)
Re: [RFC] JBD ordered mode rewrite, Andreas Dilger, (Mon Mar 10, 5:37 pm)
Re: [RFC] JBD ordered mode rewrite, Christoph Hellwig, (Sat Mar 8, 8:14 am)
Re: [RFC] JBD ordered mode rewrite, Mingming Cao, (Fri Mar 7, 8:08 pm)
Re: [RFC] JBD ordered mode rewrite, Mingming Cao, (Fri Mar 7, 6:55 am)
Re: [RFC] JBD ordered mode rewrite, Jan Kara, (Mon Mar 10, 2:29 pm)
Re: [RFC] JBD ordered mode rewrite, Mark Fasheh, (Thu Mar 6, 9:34 pm)
Re: [RFC] JBD ordered mode rewrite, Jan Kara, (Mon Mar 10, 2:00 pm)
Re: [RFC] JBD ordered mode rewrite, Andrew Morton, (Thu Mar 6, 7:53 pm)
Re: [RFC] JBD ordered mode rewrite, Jan Kara, (Mon Mar 10, 1:38 pm)
Re: [RFC] JBD ordered mode rewrite, Josef Bacik, (Thu Mar 6, 3:05 pm)
Re: [RFC] JBD ordered mode rewrite, Jan Kara, (Mon Mar 10, 12:30 pm)