oops during unmount - ext3? (2.6.27-rc5)

From: Marcin Slusarz
Date: Thursday, September 4, 2008 - 12:14 pm

Hi,
Two days ago 2.6.27-rc5 oopsed on halt with this call trace:

dispose_list
invalidate_inodes
generic_shutdown_super
kill_block_super
? deactivate_super
mntput_no_expire
sys_umount
system_call_fastpath

Code: f8 ff 48 89 df e8 bd 19 01 00 48 83 bb 90 02 00 00 00 74 04 0f 0b eb fe 48 8b 83 b8 03 00 00 a8 20 75 04 0f 0b eb fe a8 40 74 04 <0f> 0b eb fe 48 c7 c7 7a a0 57 80 be 56 00 00 00 e8 56 31 f8 ff

RIP clear_inode
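
For readers unfamiliar with the Code: line: the byte wrapped in angle brackets marks where RIP pointed when the trap fired. Below is a minimal sketch of locating that byte. `trap_offset` is a hypothetical helper written for illustration; it is not part of the kernel's decodecode script.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>

/*
 * Hypothetical helper (not from the kernel tree): return the index of the
 * byte marked with '<' in an oops "Code:" byte string, or -1 if no byte is
 * marked. Bytes are hex tokens separated by spaces.
 */
static int trap_offset(const char *code)
{
	int index = 0;

	assert(code != NULL);
	for (const char *p = code; *p != '\0'; ) {
		if (*p == '<')
			return index;	/* the marked byte starts here */
		if (isxdigit((unsigned char)*p)) {
			/* consume one whole hex token and count it */
			while (isxdigit((unsigned char)*p))
				p++;
			index++;
		} else {
			p++;		/* separator or other character */
		}
	}
	return -1;
}
```

Applied to the Code: line above, this gives 43 (0x2b), which is why the first decodecode listing stops after the two-byte `je` at 0x29 and the second listing starts at the marked `ud2a`.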

Output of decodecode:
/tmp/tmp.To8z8HQ0uE.o:     file format elf64-x86-64

Disassembly of section .text:

0000000000000000 <.text>:
   0:   f8                      clc
   1:   ff 48 89                decl   -0x77(%rax)
   4:   df e8                   fucomip %st(0),%st
   6:   bd 19 01 00 48          mov    $0x48000119,%ebp
   b:   83 bb 90 02 00 00 00    cmpl   $0x0,0x290(%rbx)
  12:   74 04                   je     0x18
  14:   0f 0b                   ud2a
  16:   eb fe                   jmp    0x16
  18:   48 8b 83 b8 03 00 00    mov    0x3b8(%rbx),%rax
  1f:   a8 20                   test   $0x20,%al
  21:   75 04                   jne    0x27
  23:   0f 0b                   ud2a
  25:   eb fe                   jmp    0x25
  27:   a8 40                   test   $0x40,%al
  29:   74 04                   je     0x2f

/tmp/tmp.To8z8HQ0uE.o:     file format elf64-x86-64

Disassembly of section .text:

0000000000000000 <.text>:
   0:   0f 0b                   ud2a
   2:   eb fe                   jmp    0x2
   4:   48 c7 c7 7a a0 57 80    mov    $0xffffffff8057a07a,%rdi
   b:   be 56 00 00 00          mov    $0x56,%esi
  10:   e8 56 31 f8 ff          callq  0xfffffffffff8316b
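
Reading the two listings together: the marked `ud2a` is the one reached when `test $0x40,%al` finds the bit set, so the two earlier checks must have passed. The sketch below models that sequence in plain C. It only mirrors the three tests visible in the disassembly. Mapping offset 0x290 to inode->i_data.nrpages, offset 0x3b8 to inode->i_state, and the masks 0x20 and 0x40 to I_FREEING and I_CLEAR is an assumption based on the BUG_ON sequence in clear_inode() and should be checked against the 2.6.27-rc5 source. The listing also appears to start mid-instruction, so its first few lines (and the operand size of the compare) may be misdecoded.

```c
#include <assert.h>

enum tripped_check {
	CHECK_NONE,		/* all three tests pass, no trap */
	CHECK_FIELD_ZERO,	/* cmp $0x0,0x290(%rbx) ; je  -> traps if field != 0 */
	CHECK_BIT_0X20,		/* test $0x20,%al ; jne       -> traps if bit is clear */
	CHECK_BIT_0X40,		/* test $0x40,%al ; je        -> traps if bit is set */
};

/*
 * Model of the three checks, in the order they appear in the listing.
 * field_0x290 and word_0x3b8 are named after their offsets from %rbx;
 * what they are in struct inode is an assumption (see the text above).
 */
static enum tripped_check first_tripped_check(unsigned long field_0x290,
					      unsigned long word_0x3b8)
{
	if (field_0x290 != 0)
		return CHECK_FIELD_ZERO;
	if (!(word_0x3b8 & 0x20))
		return CHECK_BIT_0X20;
	if (word_0x3b8 & 0x40)
		return CHECK_BIT_0X40;
	return CHECK_NONE;
}

/* Usage: the state implied by the oops trips the third check. */
static void example_marked_trap(void)
{
	assert(first_tripped_check(0, 0x20 | 0x40) == CHECK_BIT_0X40);
	assert(first_tripped_check(0, 0x20) == CHECK_NONE);
}
```

Under those assumptions, the oops would correspond to an inode that had both bits 0x20 and 0x40 set when clear_inode() was entered.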

You can see partial screenshot and .config at http://www.kadu.net/~joi/kernel/2008.09.04/

It might be related to http://lkml.org/lkml/2008/9/3/405, but I'm not sure.
That makes 2 bugs related to VFS and/or ext3 in 2 days (I've been running .27 since rc1).

Marcin
--

From: Marcin Slusarz
Date: Sunday, September 7, 2008 - 4:27 am

Another one:

* Deactivating swap
* Unmounting filesystems
general protection fault: 0000 [1] PREEMPT
CPU 0
Modules linked in: af_packet usbhid tuner tea5767 tda8290 tuner_xc2028 xc5000 tda9887 tuner_simple tuner_types mt20xx tea5761 tda9875 uhci_hcd ehci_hcd usbcore bttv ir_common compat_ioctl32 ac97_bus videodev v4l1_compat v4l2_common videobuf_dma_sg videobuf_core btcx_risc tveeprom i2c_viapro soundcore [last unloaded: snd_page_alloc]
Pid: 10420, comm: umount Not tainted 2.6.27-rc5 #362
RIP: 0010:[<ffffffff802f0770>] [<ffffffff802f0770>] journal_invalidatepage+0x4d/0x373
RSP: 0018:ffff88003c923c68  EFLAGS: 00010203
RAX: 0000000005200000 RBX: 1000c20d02020000 RCX: 000000000000003c
RDX: 0000000000000002 RSI: ffff88000000ff30 RDI: ffff880001001340
RBP: ffff88003c923db8 R08: ffff88003c923cd8 R09: 0000000000000001
R10: ffff880035bc37b0 R11: ffff88003a0ca828 R12: 0000000000026358
R13: 0000000000000001 R14: ffff88003cddf4f8 R15: 0000000000000001
FS:  00007f2ce7744750(0000) GS:ffffffff80623200(0000) knlGS:00000000f74d56d0
CS:  0010 DS: 0000 ES: 0000 CR0: 000000008005003b
CR2: 00000000015c70d0 CR3: 000000003cd0e000 CR4: 00000000000006e0
DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
Process umount (pid: 10420, threadinfo ffff88003c922000, task ffff88003a8ca140)
Stack:  0000000000000000 ffffe20000294098 1000c20d02020000 0520000000000001
 ffff88000000ff30 ffffe20000294098 0000000000026358 0000000000000001
 0000000000026358 ffff880035bc3798 ffff88003c923cc8 ffffffff802e2da7
Call Trace:
 [<ffffffff802e2da7>] ext3_invalidatepage+0x3c/0x3e
 [<ffffffff80272ad0>] do_invalidatepage+0x28/0x2a
 [<ffffffff80272f7a>] truncate_complete_page+0x2e/0x53
 [<ffffffff8027307d>] truncate_inode_pages_range+0xde/0x36b
 [<ffffffff8027331c>] truncate_inode_pages+0x12/0x16
 [<ffffffff802a4f62>] dispose_list+0x55/0x103
 [<ffffffff802a5301>] invalidate_inodes+0xe9/0x107
 [<ffffffff802932fd>] generic_shutdown_super+0x3f/0xfd
 ...
--

From: Marcin Slusarz
Date: Sunday, September 7, 2008 - 4:47 am

A small correction to the decodecode output (at the end).
After correction:
/tmp/tmp.W6DvY3Lbtg.o:     file format elf64-x86-64

Disassembly of section .text:

0000000000000000 <.text>:
   0:   8b 06                   mov    (%rsi),%eax
   2:   a8 01                   test   $0x1,%al
   4:   75 04                   jne    0xa
   6:   0f 0b                   ud2a
   8:   eb fe                   jmp    0x8
   a:   f6 c4 08                test   $0x8,%ah
   d:   0f 84 2f 03 00 00       je     0x342
  13:   48 8b 45 b8             mov    -0x48(%rbp),%rax
  17:   48 8b 40 10             mov    0x10(%rax),%rax
  1b:   c7 45 c8 01 00 00 00    movl   $0x1,-0x38(%rbp)
  22:   48 89 45 d0             mov    %rax,-0x30(%rbp)
  26:   48 89 c3                mov    %rax,%rbx
  29:   31 c0                   xor    %eax,%eax

/tmp/tmp.W6DvY3Lbtg.o:     file format elf64-x86-64

Disassembly of section .text:

0000000000000000 <.text>:
   0:   8b 53 20                mov    0x20(%rbx),%edx
   3:   01 c2                   add    %eax,%edx
   5:   89 c0                   mov    %eax,%eax
   7:   48 39 45 b0             cmp    %rax,-0x50(%rbp)
   b:   89 55 cc                mov    %edx,-0x34(%rbp)
   e:   48 8b 53 08             mov    0x8(%rbx),%rdx
  12:   48                      rex.W
  13:   89                      .byte 0x89
  14:   55                      push   %rbp
--
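
As a reading aid for the first listing above: it tests two bits of the 32-bit word loaded from (%rsi), one through %al and one through %ah. Since %ah is bits 8..15 of %eax, `test $0x8,%ah` examines bit 11 of that word. The sketch below does only this arithmetic; both helpers are hypothetical. Which flags bits 0 and 11 name depends on the structure %rsi points to and on this kernel's flag layout, neither of which is confirmed in the thread.

```c
#include <assert.h>
#include <stdint.h>

/* Widen a mask applied to %ah (bits 8..15 of %eax) to a mask on %eax. */
static uint32_t ah_mask_to_eax(uint8_t ah_mask)
{
	return (uint32_t)ah_mask << 8;
}

/* Index of the lowest set bit in mask, or -1 if mask is zero. */
static int lowest_set_bit(uint32_t mask)
{
	for (int bit = 0; bit < 32; bit++)
		if (mask & (UINT32_C(1) << bit))
			return bit;
	return -1;
}

/* Usage: the two tests in the listing. */
static void example_listing_masks(void)
{
	assert(lowest_set_bit(0x1) == 0);			/* test $0x1,%al */
	assert(lowest_set_bit(ah_mask_to_eax(0x8)) == 11);	/* test $0x8,%ah */
}
```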

From: Jan Kara
Date: Monday, September 8, 2008 - 9:02 am

Hmm, from this disassembly it seems that somebody has overwritten our
page->private pointer with 1000c20d02020000, and then we obviously failed
to read bh->b_size. But I don't really see how this can happen. What also
puzzles me a bit is that I don't see BUG_ON(!PagePrivate(page)) in the
disassembly, but it should be there because of the page_buffers()
implementation... Does anyone have an idea?

								Honza
-- 
Jan Kara <jack@suse.cz>
SuSE CR Labs
--
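
A note on why the second report is a general protection fault rather than a page fault: the register dump shows RBX = 1000c20d02020000, the value identified above as the overwritten page->private. On x86-64 with 48-bit virtual addresses, bits 63..47 of a valid (canonical) address must all be equal, and dereferencing a non-canonical address raises a general protection fault. The sketch below checks that property. `is_canonical_48` is a hypothetical helper, and the 48-bit address width is an assumption about this machine's paging mode.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/*
 * Hypothetical helper: with 48-bit virtual addresses, bits 63..47 of a
 * canonical x86-64 address are all copies of bit 47.
 */
static bool is_canonical_48(uint64_t addr)
{
	uint64_t top = addr >> 47;	/* bits 63..47, 17 bits in total */

	return top == 0 || top == 0x1ffff;
}

/* Usage: values taken from the register dump in the second report. */
static void example_register_values(void)
{
	assert(!is_canonical_48(UINT64_C(0x1000c20d02020000)));	/* RBX */
	assert(is_canonical_48(UINT64_C(0xffff88003c923c68)));	/* RSP */
}
```

This is consistent with the bad pointer being garbage rather than a stale but once-valid kernel address, though it says nothing about who wrote it.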
