login
Login
/
Register
Search
Header Space
Forums
News
Jobs
Blogs
Features
Man Pages
Site
Home
»
Mailing list archives
»
linux-kernel
»
2008
»
May
»
20
Re: [PATCH] JBD: Fix DIO EIO error caused by race between free buffer and commit trasanction
view
thread
Score:
Previous message: [
thread
] [
date
] [
author
]
Next message: [thread] [
date
] [
author
]
[view in full thread]
From:
Mingming Cao <cmm@...>
To: Jens Axboe <jens.axboe@...>
Cc: Andrew Morton <akpm@...>, <jack@...>, <pbadari@...>, <linux-ext4@...>, <linux-kernel@...>
Subject:
Re: [PATCH] JBD: Fix DIO EIO error caused by race between free buffer and commit trasanction
Date: Tuesday, May 20, 2008 - 1:47 pm
On Tue, 2008-05-20 at 11:30 +0200, Jens Axboe wrote:
quoted text
> On Mon, May 19 2008, Mingming Cao wrote: > > On Mon, 2008-05-19 at 13:25 -0700, Andrew Morton wrote: > > > On Mon, 19 May 2008 12:59:18 -0700 > > > Mingming Cao <cmm@us.ibm.com> wrote: > > > > > > > On Mon, 2008-05-19 at 00:37 +0200, Jan Kara wrote: > > > > > Hi, > > > > > > > > > > > This patch fixed a few races between direct IO and kjournald commit > > > > > > transaction. An unexpected EIO error gets returned to direct IO > > > > > > caller when it failed to free those data buffers. This could be > > > > > > reproduced easily with parallel direct write and buffered write to the > > > > > > same file > > > > > > > > > > > > More specific, those races could cause journal_try_to_free_buffers() > > > > > > fail to free the data buffers, when jbd is committing the transaction > > > > > > that has those data buffers on its t_syncdata_list or t_locked_list. > > > > > > journal_commit_transaction() still holds the reference to those > > > > > > buffers before data reach to disk and buffers are removed from the > > > > > > t_syncdata_list of t_locked_list. This prevent the concurrent > > > > > > journal_try_to_free_buffers() to free those buffers at the same time, > > > > > > but cause EIO error returns back to direct IO. > > > > > > > > > > > > With this patch, in case of direct IO and when try_to_free_buffers() failed, > > > > > > let's waiting for journal_commit_transaction() to finish > > > > > > flushing the current committing transaction's data buffers to disk, > > > > > > then try to free those buffers again. > > > > > If Andrew or Christoph wouldn't beat you for "inventive use" of > > > > > gfp_mask, I'm fine with the patch as well ;). You can add > > > > > Acked-by: Jan Kara <jack@suse.cz> > > > > > > > > > > > > > This is less intrusive way to fix this problem. The gfp_mask was marked > > > > as unused in try_to_free_page(). I looked at filesystems in the kernel, > > > > there is only a few defined releasepage() callback, and only xfs checks > > > > the flag(but not used). btrfs is actually using it though. I thought > > > > about the way you have suggested, i.e.clean up this gfp_mask and and > > > > replace with a flag. I am not entirely sure if it we need to change the > > > > address_space_operations and fix all the filesystems for this matter. > > > > > > > > Andrew, what do you think? Is this approach acceptable? > > > > > > > > > > <wakes up> > > > > > > Please ensure that the final patch is sufficiently well changelogged to > > > permit me to remain asleep ;) > > :-) > > > The ->releasepage semantics are fairly ad-hoc and have grown over time. > > > It'd be nice to prevent them from becoming vaguer than they are. > > > > > > It has been (approximately?) the case that code paths which really care > > > about having the page released will set __GFP_WAIT (via GFP_KERNEL) > > > whereas code paths which are happy with best-effort will clear > > > __GFP_WAIT (with a "0'). And that's reasonsable - __GFP_WAIT here > > > means "be synchronous" whereas !__GFP_WAIT means "be non-blocking". > > > > > > > This make sense to me. > > > > > Is that old convention not sufficient here as well? Two problem areas > > > I see are mm/vmscan.c and fs/splice.c (there may be others). > > > > > > > > In mm/vmscan.c we probably don't want your new synchronous behaviour > > > and it might well be deadlockable anyway. No probs, that's what > > > __GFP_FS is for. > > > > > Sure. We could check __GFP_FS and __GFP_WAIT, and that make sense. > > > > > In fs/splice.c, reading the comment there I have a feeling that you've > > > found another bug, and that splice _does_ want your new synchronous > > > behaviour? > > > > Yes, it looks like page_cache_pipe_buf_steal() expects page is free > > before removeing it by passing the GFP_KERNEL flag, but currently ext3 > > could fails to releasepage when it called. In fact > > try_to_release_page() return value is ignored in > > page_cache_pipe_buf_steal(), should probably checked the failure case. > > > > > > The other caller of try_to_release_page() in mm/splice.c is > > fallback_migrate_page(), which does want the synchronous behaviour to > > make sure buffers are dropped. > > So something like this, then? >
Acked-by: Mingming Cao <cmm@us.ibm.com>
quoted text
> diff --git a/fs/splice.c b/fs/splice.c > index 7815003..e08a2f5 100644 > --- a/fs/splice.c > +++ b/fs/splice.c > @@ -58,8 +58,8 @@ static int page_cache_pipe_buf_steal(struct pipe_inode_info *pipe, > */ > wait_on_page_writeback(page); > > - if (PagePrivate(page)) > - try_to_release_page(page, GFP_KERNEL); > + if (PagePrivate(page) && !try_to_release_page(page, GFP_KERNEL)) > + goto out_unlock; > > /* > * If we succeeded in removing the mapping, set LRU flag > @@ -75,6 +75,7 @@ static int page_cache_pipe_buf_steal(struct pipe_inode_info *pipe, > * Raced with truncate or failed to remove page from current > * address space, unlock and return failure. > */ > +out_unlock: > unlock_page(page); > return 1; > } >
--
unsubscribe notice
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to
majordomo@vger.kernel.org
More majordomo info at
http://vger.kernel.org/majordomo-info.html
Please read the FAQ at
http://www.tux.org/lkml/
Previous message: [
thread
] [
date
] [
author
]
Next message: [thread] [
date
] [
author
]
Messages in current thread:
[RFC] JBD ordered mode rewrite
, Jan Kara
, (Thu Mar 6, 1:42 pm)
Possible race between direct IO and JBD?
, Mingming Cao
, (Fri Apr 25, 7:38 pm)
Re: Possible race between direct IO and JBD?
, Jan Kara
, (Mon Apr 28, 8:26 am)
Re: Possible race between direct IO and JBD?
, Badari Pulavarty
, (Mon Apr 28, 1:11 pm)
Re: Possible race between direct IO and JBD?
, Jan Kara
, (Mon Apr 28, 2:09 pm)
Re: Possible race between direct IO and JBD?
, Mingming Cao
, (Mon Apr 28, 3:09 pm)
Re: Possible race between direct IO and JBD?
, Jan Kara
, (Tue Apr 29, 8:43 am)
[PATCH] jbd_commit_transaction() races with journal_try_to_d...
, Badari Pulavarty
, (Thu May 1, 11:16 am)
Re: [PATCH] jbd_commit_transaction() races with journal_try_...
, Jan Kara
, (Mon May 5, 1:06 pm)
Re: [PATCH] jbd_commit_transaction() races with journal_try_...
, Mingming Cao
, (Fri May 9, 6:27 pm)
Re: [PATCH] jbd_commit_transaction() races with journal_try_...
, Jan Kara
, (Mon May 12, 11:54 am)
Re: [PATCH] jbd_commit_transaction() races with journal_try_...
, Mingming Cao
, (Mon May 12, 8:39 pm)
Re: [PATCH] jbd_commit_transaction() races with journal_try_...
, Jan Kara
, (Tue May 13, 10:54 am)
Re: [PATCH] jbd_commit_transaction() races with journal_try_...
, Mingming Cao
, (Tue May 13, 6:23 pm)
Re: [PATCH] jbd_commit_transaction() races with journal_try_...
, Jan Kara
, (Wed May 14, 1:08 pm)
Re: [PATCH] jbd_commit_transaction() races with journal_try_...
, Mingming Cao
, (Wed May 14, 1:41 pm)
Re: [PATCH] jbd_commit_transaction() races with journal_try_...
, Jan Kara
, (Wed May 14, 2:14 pm)
[PATCH] Fix DIO EIO error caused by race between jbd_commit_...
, Mingming Cao
, (Fri May 16, 10:14 am)
Re: [PATCH] Fix DIO EIO error caused by race between jbd_com...
, Badari Pulavarty
, (Fri May 16, 1:12 pm)
[PATCH] JBD2: Fix DIO EIO error caused by race between free ...
, Mingming Cao
, (Fri May 16, 5:01 pm)
[PATCH] JBD: Fix DIO EIO error caused by race between free b...
, Mingming Cao
, (Fri May 16, 5:01 pm)
Re: [PATCH] JBD: Fix DIO EIO error caused by race between fr...
, Jan Kara
, (Sun May 18, 6:37 pm)
Re: [PATCH] JBD: Fix DIO EIO error caused by race between fr...
, Mingming Cao
, (Mon May 19, 3:59 pm)
Re: [PATCH] JBD: Fix DIO EIO error caused by race between fr...
, Andrew Morton
, (Mon May 19, 4:25 pm)
Re: [PATCH] JBD: Fix DIO EIO error caused by race between fr...
, Mingming Cao
, (Mon May 19, 6:07 pm)
[PATCH -v2] JBD2: Fix race between journal free buffer and c...
, Mingming Cao
, (Tue May 20, 2:03 pm)
[PATCH-v2] JBD: Fix race between free buffer and commit tras...
, Mingming Cao
, (Tue May 20, 2:02 pm)
[PATCH 2/2][TAKE3] JBD2: Fix race between free buffer and co...
, Mingming
, (Wed May 21, 7:39 pm)
[PATCH 1/2][TAKE3] JBD: Fix race between free buffer and com...
, Mingming
, (Wed May 21, 7:38 pm)
Re: [PATCH 1/2][TAKE3] JBD: Fix race between free buffer and...
, Andrew Morton
, (Thu May 22, 1:57 am)
Re: [PATCH-v2] JBD: Fix race between free buffer and commit ...
, Jan Kara
, (Tue May 20, 7:53 pm)
Re: [PATCH-v2] JBD: Fix race between free buffer and commit ...
, Mingming
, (Wed May 21, 1:14 pm)
Re: [PATCH-v2] JBD: Fix race between free buffer and commit ...
, Jan Kara
, (Sat May 24, 6:44 pm)
Re: [PATCH-v2] JBD: Fix race between free buffer and commit ...
, Mingming Cao
, (Wed May 28, 2:18 pm)
Re: [PATCH-v2] JBD: Fix race between free buffer and commit ...
, Jan Kara
, (Wed May 28, 2:55 pm)
[PATCH][take 5] JBD: Fix race between free buffer and commit...
, Mingming Cao
, (Wed May 28, 8:16 pm)
[PATCH][take 5] JBD2: Fix race between free buffer and commi...
, Mingming Cao
, (Wed May 28, 8:18 pm)
Re: [PATCH][take 5] JBD2: Fix race between free buffer and c...
, Aneesh Kumar K.V
, (Fri May 30, 2:24 am)
Re: [PATCH][take 5] JBD2: Fix race between free buffer and c...
, Mingming Cao
, (Fri May 30, 11:17 am)
Re: [PATCH-v2] JBD: Fix race between free buffer and commit ...
, Mingming Cao
, (Wed May 28, 8:15 pm)
Re: [PATCH] JBD: Fix DIO EIO error caused by race between fr...
, Jens Axboe
, (Tue May 20, 5:30 am)
Re: [PATCH] JBD: Fix DIO EIO error caused by race between fr...
, Mingming Cao
, (Tue May 20, 1:47 pm)
Re: [PATCH] Fix DIO EIO error caused by race between jbd_com...
, Josef Bacik
, (Fri May 16, 11:01 am)
Re: [PATCH] Fix DIO EIO error caused by race between jbd_com...
, Mingming Cao
, (Fri May 16, 1:11 pm)
Re: [PATCH] Fix DIO EIO error caused by race between jbd_com...
, Badari Pulavarty
, (Fri May 16, 1:17 pm)
Re: [PATCH] Fix DIO EIO error caused by race between jbd_com...
, Mingming Cao
, (Fri May 16, 1:30 pm)
Re: [PATCH] jbd_commit_transaction() races with journal_try_...
, Mingming Cao
, (Fri May 16, 10:13 am)
Re: [PATCH] jbd_commit_transaction() races with journal_try_...
, Mingming Cao
, (Tue May 13, 12:37 pm)
Re: [PATCH] jbd_commit_transaction() races with journal_try_...
, Mingming Cao
, (Mon May 12, 3:23 pm)
Re: [PATCH] jbd_commit_transaction() races with journal_try_...
, Jan Kara
, (Tue May 13, 10:20 am)
Re: [PATCH] jbd_commit_transaction() races with journal_try_...
, Badari Pulavarty
, (Mon May 5, 8:10 pm)
Re: [PATCH] jbd_commit_transaction() races with journal_try_...
, Mingming Cao
, (Mon May 5, 1:53 pm)
Re: [PATCH] jbd_commit_transaction() races with journal_try_...
, Mingming Cao
, (Thu May 1, 6:08 pm)
Re: Possible race between direct IO and JBD?
, Mingming Cao
, (Tue Apr 29, 1:49 pm)
Re: Possible race between direct IO and JBD?
, Andrew Morton
, (Sat Apr 26, 6:41 am)
Re: [RFC] JBD ordered mode rewrite
, Andreas Dilger
, (Fri Mar 7, 7:52 pm)
Re: [RFC] JBD ordered mode rewrite
, Jan Kara
, (Mon Mar 10, 3:54 pm)
Re: [RFC] JBD ordered mode rewrite
, Andreas Dilger
, (Mon Mar 10, 5:37 pm)
Re: [RFC] JBD ordered mode rewrite
, Christoph Hellwig
, (Sat Mar 8, 8:14 am)
Re: [RFC] JBD ordered mode rewrite
, Mingming Cao
, (Fri Mar 7, 8:08 pm)
Re: [RFC] JBD ordered mode rewrite
, Mingming Cao
, (Fri Mar 7, 6:55 am)
Re: [RFC] JBD ordered mode rewrite
, Jan Kara
, (Mon Mar 10, 2:29 pm)
Re: [RFC] JBD ordered mode rewrite
, Mark Fasheh
, (Thu Mar 6, 9:34 pm)
Re: [RFC] JBD ordered mode rewrite
, Jan Kara
, (Mon Mar 10, 2:00 pm)
Re: [RFC] JBD ordered mode rewrite
, Andrew Morton
, (Thu Mar 6, 7:53 pm)
Re: [RFC] JBD ordered mode rewrite
, Jan Kara
, (Mon Mar 10, 1:38 pm)
Re: [RFC] JBD ordered mode rewrite
, Josef Bacik
, (Thu Mar 6, 3:05 pm)
Re: [RFC] JBD ordered mode rewrite
, Jan Kara
, (Mon Mar 10, 12:30 pm)
Navigation
Create content
Mailing list archives
Recent posts
Mail archive search
Enter your search terms.
all mailing lists
alsa-devel
dragonflybsd-bugs
dragonflybsd-commit
dragonflybsd-docs
dragonflybsd-kernel
dragonflybsd-submit
dragonflybsd-user
freebsd-announce
freebsd-bugs
freebsd-chat
freebsd-cluster
freebsd-current
freebsd-drivers
freebsd-embeded
freebsd-fs
freebsd-hackers
freebsd-hardware
freebsd-mobile
freebsd-net
freebsd-performance
freebsd-pf
freebsd-security
freebsd-security-notifications
freebsd-threads
git
git-commits-head
linux-activists
linux-arm
linux-ath5k-devel
linux-btrfs
linux-c-programming
linux-driver-devel
linux-ext4
linux-fsdevel
linux-ia64
linux-input
linux-kernel
linux-kernel-janitors
linux-kernel-mentors
linux-kernel-newbies
linux-kvm
linux-net
linux-netdev
linux-newbie
linux-nfs
linux-raid
linux-scsi
linux-security-module
linux-sparse
linux-usb
linux-usb-devel
madwifi-devel
netbsd-announce
netbsd-tech-kern
openbsd-announce
openbsd-bugs
openbsd-ipv6
openbsd-misc
openbsd-security-announce
openbsd-smp
openbsd-source-changes
openbsd-tech
openfabrics-general
openmoko-community
openmoko-devel
openmoko-kernel
reiserfs-devel
tux3
ucarp
Optionally limit your search to a specific mailing list.
advanced
Popular discussions
linux-kernel
:
Dmitry Torokhov
2.6.27-rc8+ - first impressions
Linus Torvalds
Linux 2.6.27-rc8
Nick Piggin
[patch 3/6] mm: fix fault vs invalidate race for linear mappings
Alan Cox
[PATCH 00/76] Queued TTY Patches
git
:
Petr Baudis
[FYI][PATCH] Customizing the WinGit installer
Pierre Habouzit
Re: git push (mis ?)behavior
Mark Levedahl
Allowing override of the default "origin" nickname
Junio C Hamano
[PATCH] Detached HEAD (experimental)
openbsd-misc
:
Richard Stallman
Real men don't attack straw men
Luca Dell'Oca
Authenticate squid in Active Directory
Leon Dippenaar
New tcp stack attack
Nuno Magalhães
Can't scp, ssh is slow to authenticate.
linux-netdev
:
David Miller
Re: [PATCH] pkt_sched: Destroy gen estimators under rtnl_lock().
KOSAKI Motohiro
[bug?] tg3: Failed to load firmware "tigon/tg3_tso.bin"
Andi Kleen
Re: [PATCH] Disable TSO for non standard qdiscs
Alexey Dobriyan
[PATCH 01/33] nf_conntrack_sip: de-static helper pointers
Latest forum posts
usb mic not detected
1 hour ago
Applications and Utilities
linux-2.6.10
2 hours ago
Linux kernel
How to detect usb device insertioin and removal event ?
5 hours ago
Linux general
Extended configuration Read/Write
6 hours ago
Windows
Add ext2 inode field
16 hours ago
Linux kernel
the kernel how to power off the machine
1 day ago
Linux kernel
struct gendisk via request_queue
1 day ago
Linux kernel
page initialization during kernel initialization
2 days ago
Linux kernel
Read Transport Layer Data form network packets (tcp/ip)
2 days ago
Linux kernel
Getting blinking screen in Fedora 9
3 days ago
Linux general
Show all forums...
Recent Tags
bugs
-rc8
2.6.27
-rc
Linus Torvalds
quote
2.6.27-rc8
Linux
Intel
more tags
Colocation donated by:
Who's online
There are currently
4 users
and
1473 guests
online.
Online users
strcmp
kingneutron
thorfinn@kernel...
Jeremy
Syndicate
speck-geostationary