https://bugzilla.kernel.org/show_bug.cgi?id=15792 Summary: ext4_inode_inode->i_flags modification is racy Product: File System Version: 2.5 Kernel Version: 2.6.33-rc1 Platform: All OS/Version: Linux Tree: Mainline Status: NEW Severity: normal Priority: P1 Component: ext4 AssignedTo: fs_ext4@kernel-bugs.osdl.org ReportedBy: dmonakhov@openvz.org Regression: No Created an attachment (id=26025) --> (https://bugzilla.kernel.org/attachment.cgi?id=26025) dmesg I've got an OOPS on 8'cores host under heavy IO load -- Configure bugmail: https://bugzilla.kernel.org/userprefs.cgi?tab=email ------- You are receiving this mail because: ------- You are watching the assignee of the bug. --
https://bugzilla.kernel.org/show_bug.cgi?id=15792 --- Comment #1 from Dmitry Monakhov <dmonakhov@openvz.org> 2010-04-15 22:17:54 --- Created an attachment (id=26026) --> (https://bugzilla.kernel.org/attachment.cgi?id=26026) xfstest testcase patch This is trivial but proven to be useful fsstress based test-case. i'm run it like follows while true; do ./check 227 || break ; done -- Configure bugmail: https://bugzilla.kernel.org/userprefs.cgi?tab=email ------- You are receiving this mail because: ------- You are watching the assignee of the bug. --
https://bugzilla.kernel.org/show_bug.cgi?id=15792 --- Comment #2 from Dmitry Monakhov <dmonakhov@openvz.org> 2010-04-15 22:31:55 --- The oops happens because non tested branch was triggered. ./fs/ext4/extents.c 3477: if (unlikely(EXT4_I(inode)->i_flags & EXT4_EOFBLOCKS_FL)) { if (unlikely(!eh->eh_entries)) { EXT4_ERROR_INODE(inode, "eh->eh_entries == 0 ee_block %d", ex->ee_block); ### OOPS here because ex == NULL ^^^^^^^^^^^^^^^^^^^^^^^^ err = -EIO; goto out2; } Bug was introduced by following commit: commit 273df556b6ee2065bfe96edab5888d3dc9b108d8 Author: Frank Mayhar <fmayhar@google.com> Date: Tue Mar 2 11:46:09 2010 -0500 And in fact it is rather trivial to fix. But most interesting question what the hell we are doing on that error path? inode has EXT4_EOFBLOCKS_FL flag enabled but eh->eh_entries, and in fact after adding more debug information i've found that inode is simply blockless. i_blocks == 0, i_size == 0. I've collected per-inode i_flag modification history (see an attachment) -- Configure bugmail: https://bugzilla.kernel.org/userprefs.cgi?tab=email ------- You are receiving this mail because: ------- You are watching the assignee of the bug. --
https://bugzilla.kernel.org/show_bug.cgi?id=15792 --- Comment #3 from Dmitry Monakhov <dmonakhov@openvz.org> 2010-04-15 22:39:15 --- Created an attachment (id=26027) --> (https://bugzilla.kernel.org/attachment.cgi?id=26027) i_flags modification history -- Configure bugmail: https://bugzilla.kernel.org/userprefs.cgi?tab=email ------- You are receiving this mail because: ------- You are watching the assignee of the bug. --
https://bugzilla.kernel.org/show_bug.cgi?id=15792 Dmitry Monakhov <dmonakhov@openvz.org> changed: What |Removed |Added ---------------------------------------------------------------------------- Attachment #26027|application/octet-stream |text/plain mime type| | -- Configure bugmail: https://bugzilla.kernel.org/userprefs.cgi?tab=email ------- You are receiving this mail because: ------- You are watching the assignee of the bug. --
https://bugzilla.kernel.org/show_bug.cgi?id=15792 --- Comment #4 from Dmitry Monakhov <dmonakhov@openvz.org> 2010-04-15 22:52:22 --- Most interesting part is last lines [61] [2] truncate clr s:737987 d:737987 b:104 fl:80000 [62] [0] clr ext4_inode_blocks_set b:104 fl:480000 bit:40000 [63] [2] trunc_ext begin s:737987 d:737987 b:104 fl:480000 ml:1 I.E. CPU2: is doing EXT4_I(inode)->i_flags &= ~EXT4_EOFBLOCKS_FL CPU0: is doung ei->i_flags &= ~EXT4_HUGE_FILE_FL CPU2: Wow EXT4_EOFBLOCKS_FL appear again due to race with cpu0. So even if truncate holds i_mutex it is possible to modify i_flags. Seems that we have to modify i_flags via anomic bits operations. A fix is almost ready. Currently i'm testing it. -- Configure bugmail: https://bugzilla.kernel.org/userprefs.cgi?tab=email ------- You are receiving this mail because: ------- You are watching the assignee of the bug. --
https://bugzilla.kernel.org/show_bug.cgi?id=15792 Dmitry Monakhov <dmonakhov@openvz.org> changed: What |Removed |Added ---------------------------------------------------------------------------- Summary|ext4_inode_inode->i_flags |ext4_inode_info->i_flags |modification is racy |modification is racy -- Configure bugmail: https://bugzilla.kernel.org/userprefs.cgi?tab=email ------- You are receiving this mail because: ------- You are watching the assignee of the bug. --
https://bugzilla.kernel.org/show_bug.cgi?id=15792 --- Comment #5 from Dmitry Monakhov <dmonakhov@openvz.org> 2010-04-19 03:42:20 --- Ohh.. one more thing, which is not actually related with the name of the bug, But this peace of code was already mention here. Let's just document this here. fs/ext4/extents.c 3288:int ext4_ext_get_blocks(...) { .... 3477: if (unlikely(EXT4_I(inode)->i_flags & EXT4_EOFBLOCKS_FL)) { 3478: if (unlikely(!eh->eh_entries)) { 3479: EXT4_ERROR_INODE(inode, 3480: "eh->eh_entries == 0 ee_block %d", 3481: ex->ee_block); 3482: err = -EIO; 3483: goto out2; 3484: } 3485: last_ex = EXT_LAST_EXTENT(eh); 3486: if (iblock + ar.len > le32_to_cpu(last_ex->ee_block) 3487: + ext4_ext_get_actual_len(last_ex)) ##### ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ ##### If depth > 0 then last_ex may not be the latest extent of the file. ##### because "eh" is just a one of leafs(writing at the middle of a file) ##### We have to find latest extent by traversing a whole path again but ##### always chose the latest eh,ex. 3488: EXT4_I(inode)->i_flags &= ~EXT4_EOFBLOCKS_FL; for ( 3489: } -- Configure bugmail: https://bugzilla.kernel.org/userprefs.cgi?tab=email ------- You are receiving this mail because: ------- You are watching the assignee of the bug. --
https://bugzilla.kernel.org/show_bug.cgi?id=15792 --- Comment #6 from Dmitry Monakhov <dmonakhov@openvz.org> 2010-04-19 14:38:10 --- Proposed patches http://patchwork.ozlabs.org/patch/50468/ http://patchwork.ozlabs.org/patch/50469/ My hosts survived reasonable amount of time with these patches. If everybody agree with the first patch, I'll prepare similar one for ext3. -- Configure bugmail: https://bugzilla.kernel.org/userprefs.cgi?tab=email ------- You are receiving this mail because: ------- You are watching the assignee of the bug. --
https://bugzilla.kernel.org/show_bug.cgi?id=15792 Dmitry Monakhov <dmonakhov@openvz.org> changed: What |Removed |Added ---------------------------------------------------------------------------- Status|NEW |RESOLVED Resolution| |CODE_FIX -- Configure bugmail: https://bugzilla.kernel.org/userprefs.cgi?tab=email ------- You are receiving this mail because: ------- You are watching the assignee of the bug. --
