Re: Ugly rmap NULL ptr deref oopsie on hibernate (was Linux 2.6.34-rc3)

From: Linus Torvalds
Date: Tuesday, April 6, 2010 - 12:35 pm

On Tue, 6 Apr 2010, Linus Torvalds wrote:

I _have_ found what looks like a few clues, though.

In particular, the disassembly in Steinar Gunderson's case looks much more 
like the disassembly I get, and if I read that correctly, it's actually 
the _first_ iteration of the list_for_each_entry() loop that crashes.

Why do I think so?

In Steinar's oops, we have "RAX: ffff880169111fc8", which is clearly a 
kernel pointer. However, the code from Steinar's oops decodes to:

   0:	3b 56 10             	cmp    0x10(%rsi),%edx
   3:	73 1e                	jae    0x23
   5:	48 83 fa f2          	cmp    $0xfffffffffffffff2,%rdx
   9:	74 18                	je     0x23
   b:	4d 89 f8             	mov    %r15,%r8
   e:	48 8d 4d cc          	lea    -0x34(%rbp),%rcx
  12:	4c 89 e7             	mov    %r12,%rdi
  15:	e8 44 f2 ff ff       	callq  0xfffffffffffff25e
  1a:	41 01 c5             	add    %eax,%r13d
  1d:	83 7d cc 00          	cmpl   $0x0,-0x34(%rbp)
  21:	74 19                	je     0x3c
  23:	48 8b 43 20          	mov    0x20(%rbx),%rax
  27:	48 8d 58 e0          	lea    -0x20(%rax),%rbx
  2b:*	48 8b 43 20          	mov    0x20(%rbx),%rax     <-- trapping instruction
  2f:	0f 18 08             	prefetcht0 (%rax)
  32:	48 8d 43 20          	lea    0x20(%rbx),%rax
  36:	48 39 45 88          	cmp    %rax,-0x78(%rbp)
  3a:	75 a7                	jne    0xffffffffffffffe3
  3c:	41 fe 06             	incb   (%r14)
  3f:	e9                   	.byte 0xe9

which matches my code pretty well, and the point is, _if_ it went through 
the loop, then %rbx should be %rax-0x20 (that is what the mov/lea pair at 
0x23 and 0x27 leaves behind). And it's not.
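
To make that register argument concrete, here is a small userspace sketch (not kernel code). The entry type, its padding and the helper names are assumptions made up for illustration; the only things taken from the oops are the 0x20 displacement and the RAX value.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Userspace stand-in for the kernel's struct list_head. */
struct list_head {
	struct list_head *next, *prev;
};

/*
 * Hypothetical entry type: the padding only exists to put the list node
 * at offset 0x20, the displacement seen in the disassembly.
 */
struct chain_entry {
	unsigned char pad[0x20];
	struct list_head node;
};

#define NODE_OFFSET offsetof(struct chain_entry, node)

/* container_of() by hand: step back from the embedded node to its entry. */
static inline struct chain_entry *entry_of(struct list_head *n)
{
	return (struct chain_entry *)((char *)n - NODE_OFFSET);
}

/*
 * One loop advance, i.e. the pair
 *     mov 0x20(%rbx),%rax ; lea -0x20(%rax),%rbx
 * The loaded node pointer (%rax) is handed back through *rax.
 */
static inline struct chain_entry *advance(struct chain_entry *pos,
					  struct list_head **rax)
{
	*rax = pos->node.next;
	return entry_of(*rax);
}

/* Nonzero if the register pair looks like it just went through advance(). */
static inline int came_through_loop(uint64_t rbx, uint64_t rax)
{
	return rbx == rax - NODE_OFFSET;
}

/* Link two real entries in a ring, advance once, and check the pair. */
static inline int advance_leaves_consistent_pair(void)
{
	struct chain_entry a, b;
	struct list_head *rax;
	struct chain_entry *rbx;

	a.node.next = &b.node; a.node.prev = &b.node;
	b.node.next = &a.node; b.node.prev = &a.node;

	rbx = advance(&a, &rax);
	assert(rbx == &b);
	return came_through_loop((uint64_t)(uintptr_t)rbx,
				 (uint64_t)(uintptr_t)rax);
}
```

With the RAX from the oops (ffff880169111fc8), came_through_loop() only holds for an %rbx of ffff880169111fa8; any other %rbx means the trapping load was not reached via the advance step.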

IOW, the code you see above before the trapping instruction is the end of 
the loop: it's the

		referenced += page_referenced_one(page, vma, address,
				&mapcount, vm_flags);
		if (!mapcount)
			break;
	}

part (the "callq" and "add %eax" is that "referenced +=", and %r13d is 
"referenced").

What you cannot see from the code decode is the loop setup and _entry_, 
which looks like this for me:

        movl    12(%rbx), %eax  # <variable>.D.11299._mapcount.counter, D.33294
        xorl    %r12d, %r12d    # referenced
        incl    %eax    # tmp89
        movl    %eax, -52(%rbp) # tmp89, mapcount
        leaq    48(%r14), %rax  #,
        movq    48(%r14), %r13  # <variable>.head.next, <variable>.head.next
        movq    %rax, -128(%rbp)        #, %sfp
        subq    $32, %r13       #, avc
        jmp     .L167   #

where that "L167" is actually the oopsing instruction (ie the "while" loop 
has been turned around, and we jump to the end of the loop that does the 
loop end test).
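
For anybody not used to reading rotated loops, the shape can be sketched in plain C with gotos. This is a userspace illustration only: the entry type and helper names are hypothetical, and the "peek" load merely stands in for the first load through the entry pointer that the loop test does.

```c
#include <assert.h>
#include <stddef.h>

struct list_head {
	struct list_head *next, *prev;
};

/* Hypothetical entry type with the list node at offset 0x20. */
struct chain_entry {
	unsigned char pad[0x20];
	struct list_head node;
};

static inline struct chain_entry *entry_of(struct list_head *n)
{
	return (struct chain_entry *)((char *)n -
				      offsetof(struct chain_entry, node));
}

static inline void list_init(struct list_head *h)
{
	h->next = h;
	h->prev = h;
}

static inline void list_add_tail(struct list_head *n, struct list_head *head)
{
	n->prev = head->prev;
	n->next = head;
	head->prev->next = n;
	head->prev = n;
}

/* The loop as written in the source: test at the top. */
static inline int count_plain(struct list_head *head)
{
	int n = 0;
	struct chain_entry *pos;

	for (pos = entry_of(head->next); &pos->node != head;
	     pos = entry_of(pos->node.next))
		n++;
	return n;
}

/* The same loop in the rotated shape: setup, jump to the test at the end. */
static inline int count_rotated(struct list_head *head)
{
	int n = 0;
	struct list_head *peek;
	struct chain_entry *pos = entry_of(head->next);	/* setup: subq $32 */

	goto test;					/* jmp .L167 */
body:
	n++;
	pos = entry_of(pos->node.next);			/* mov + lea */
test:
	peek = pos->node.next;	/* first load through pos: the oops site */
	(void)peek;
	if (&pos->node != head)
		goto body;
	return n;
}

/* Build a list of up to 4 entries and check that both loops agree. */
static inline int loops_agree(int entries)
{
	struct chain_entry sentinel;	/* the head lives inside a real entry */
	struct chain_entry e[4];
	int i;

	list_init(&sentinel.node);
	for (i = 0; i < entries && i < 4; i++)
		list_add_tail(&e[i].node, &sentinel.node);

	return count_plain(&sentinel.node) == count_rotated(&sentinel.node) &&
	       count_plain(&sentinel.node) == i;
}
```

The thing to notice is that on entry the very first memory access through pos happens at the "test" label, before any loop body has run, so a bad initial pointer faults there.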

In other words, what is NULL here is not an anon_vma_chain entry, but
actually the initial "anon_vma->head.next" pointer.

The whole _head_ of the list has never been initialized, in other words.
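
The arithmetic behind that can be checked without dereferencing anything. What follows is a userspace sketch under stated assumptions: "struct owner" is a made-up stand-in for the object holding the list head (its 48 bytes of padding just mirror the 48(%r14) in my listing), and the addresses are computed as integers.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

struct list_head {
	struct list_head *next, *prev;
};

/* Hypothetical entry type with the list node at offset 0x20. */
struct chain_entry {
	unsigned char pad[0x20];
	struct list_head node;
};

/* Hypothetical owner of the list; head sits at offset 48. */
struct owner {
	unsigned char pad[48];
	struct list_head head;
};

/* What a proper initialization does: an empty list points at itself. */
static inline void list_init(struct list_head *h)
{
	h->next = h;
	h->prev = h;
}

/*
 * Address that the first "load pos->node.next" would touch, computed as
 * integers so that nothing is actually dereferenced.
 */
static inline uintptr_t first_load_address(const struct list_head *head)
{
	/* pos = head->next - 0x20 (the subq $32 in the loop setup) */
	uintptr_t pos = (uintptr_t)head->next -
			offsetof(struct chain_entry, node);

	/* the load is 0x20(pos), i.e. &pos->node.next */
	return pos + offsetof(struct chain_entry, node) +
	       offsetof(struct list_head, next);
}

/* A zero-filled owner, as if the head had never been set up. */
static inline uintptr_t zeroed_owner_first_load(void)
{
	struct owner o;

	memset(&o, 0, sizeof(o));
	return first_load_address(&o.head);
}

/* With an initialized head the first load just reads the head itself. */
static inline int initialized_owner_reads_head(void)
{
	struct owner o;

	list_init(&o.head);
	return first_load_address(&o.head) == (uintptr_t)&o.head;
}
```

So with a zeroed head the entry pointer becomes NULL minus 0x20 and the first load lands on address zero, while an initialized empty head makes the same load read the head itself and the loop terminates right away.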

So we can entirely ignore the 'anon_vma_chain' issues. We need to look at 
the initializations of the 'anon_vma's themselves.

			Linus
