[PATCH 1/1] tty: release_one_tty() forgets to put pids

Previous thread: File For Claims! by FOREIGN TRANSFER MANAGER on Saturday, March 27, 2010 - 4:07 am. (1 message)

Next thread: KVM bug, git bisected by Kent Overstreet on Saturday, March 27, 2010 - 5:43 am. (8 messages)
From: Tetsuo Handa
Date: Saturday, March 27, 2010 - 5:21 am

I got below report with 2.6.33.1 .

unreferenced object 0xde144600 (size 64):
  comm "init", pid 1, jiffies 4294678101 (age 291.508s)
  hex dump (first 32 bytes):
    02 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  ................
    00 00 00 00 04 76 ae de d1 76 43 c0 d6 08 00 00  .....v...vC.....
  backtrace:
    [<c0481704>] create_object+0x121/0x1ef
    [<c05f546b>] kmemleak_alloc+0x25/0x42
    [<c047e326>] kmemleak_alloc_recursive+0x1c/0x22
    [<c047e36e>] kmem_cache_alloc+0x42/0x68
    [<c0437701>] alloc_pid+0x19/0x288
    [<c0428acc>] copy_process+0x95a/0xdac
    [<c04290d8>] do_fork+0x129/0x261
    [<c0407de5>] sys_clone+0x1f/0x24
    [<c040292d>] ptregs_clone+0x15/0x28
    [<ffffffff>] 0xffffffff
unreferenced object 0xdfa96a40 (size 64):
  comm "login", pid 2259, jiffies 4294719437 (age 250.179s)
  hex dump (first 32 bytes):
    02 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  ................
    00 00 00 00 60 39 ae de d1 76 43 c0 bb 09 00 00  ....`9...vC.....
  backtrace:
    [<c0481704>] create_object+0x121/0x1ef
    [<c05f546b>] kmemleak_alloc+0x25/0x42
    [<c047e326>] kmemleak_alloc_recursive+0x1c/0x22
    [<c047e36e>] kmem_cache_alloc+0x42/0x68
    [<c0437701>] alloc_pid+0x19/0x288
    [<c0428acc>] copy_process+0x95a/0xdac
    [<c04290d8>] do_fork+0x129/0x261
    [<c0407de5>] sys_clone+0x1f/0x24
    [<c040292d>] ptregs_clone+0x15/0x28
    [<ffffffff>] 0xffffffff

This report is generated whenever /sbin/mingetty (invoked by SysVinit's
/sbin/init in accordance with /etc/inittab) is terminated.

Steps to reproduce.

(1) Go to console.
(2) Try to login. /sbin/mingetty will invoke /bin/login . Terminate /bin/login
    process by either "successful login and logout" or "login failure".
    /sbin/mingetty process will be respawned by /sbin/init after /bin/login
    terminates.
(3) Login as root.
(4) Run "echo scan > /sys/kernel/debug/kmemleak".
(5) Wait for a while.
(6) Run "cat /sys/kernel/debug/kmemleak".

I can find this report with 2.6.31.11 ...
From: Catalin Marinas
Date: Tuesday, March 30, 2010 - 8:31 am

I reported similar leaks last year -
http://lkml.org/lkml/2009/7/8/422. There is some analysis in the thread
above of the reference counting but I couldn't figure out where it goes
wrong. It looks to me like there isn't any reference to a struct pid
block but its reference count is 2.

There is a bugzilla entry as well -
https://bugzilla.kernel.org/show_bug.cgi?id=13868

-- 
Catalin
--

From: Andrew Morton
Date: Wednesday, March 31, 2010 - 3:17 pm

On Tue, 30 Mar 2010 16:31:13 +0100

Let's bug some people by cc'ing them ;)
--

From: Oleg Nesterov
Date: Thursday, April 1, 2010 - 9:52 am

Oh. It is hardly possibly to find the unbalanced get_pid() via grep.

IIRC, I sent the debugging patch which tracks get/put pid, but I can't
recall if anybody tried this patch. Hmm, and I can't find that patch
or the previous discussion in my maildir...

Catalin, Tetsuo, any chance you can remind me if this patch was used?

Oleg.

--

From: Serge E. Hallyn
Date: Thursday, April 1, 2010 - 10:21 am

[ probably sounding like a moron, but... ]

Looking through vt_ioctl.c, I get the feeling that the ttys will
hang onto vc->vt_pid until either getting a SAK or until someone
new logs in.  I don't see where logging out will cause a reset_vc().
So when the logged in task logs out, does vt_pid keep a ref to the
pid which now no longer exists?

Came to mind bc I notice that every trace you've sent has included
/bin/login or X...

-serge
--

From: Serge E. Hallyn
Date: Thursday, April 1, 2010 - 10:33 am

In particular, if vt_ioctl is called with VT_RELDISP, then
complete_change_console() will get called, which will kill vc->vt_pid,
but it will only call reset_vc(vc) if that kill_pid failed.  reset_vc()
is needed to do our last put_pid().

I could be way off base here, but...

-serge
--

From: Oleg Nesterov
Date: Friday, April 2, 2010 - 8:29 am

Serge, Catalin, thanks a lot for the debugging output you sent
me privately ;)


Yes, my first reaction was "it must be tty" too ;)

I seem to understand what happens. If I am right, it is possible
to leak the pid even without X.

I'll try to check my theory and send the patch soon...

Oleg.

--

From: Oleg Nesterov
Date: Friday, April 2, 2010 - 9:05 am

release_one_tty(tty) can be called when tty still has a reference
to pgrp/session. In this case we leak the pid.

The patch needs the ack from someone who understand tty magic.

Signed-off-by: Oleg Nesterov <oleg@redhat.com>
---

 drivers/char/tty_io.c |    2 ++
 1 file changed, 2 insertions(+)

--- TTT/drivers/char/tty_io.c~TTY_PID_LEAK	2010-03-17 16:00:59.000000000 +0100
+++ TTT/drivers/char/tty_io.c	2010-04-02 17:23:07.000000000 +0200
@@ -1423,6 +1423,8 @@ static void release_one_tty(struct work_
 	list_del_init(&tty->tty_files);
 	file_list_unlock();
 
+	put_pid(tty->pgrp);
+	put_pid(tty->session);
 	free_tty_struct(tty);
 }
 

--

From: Oleg Nesterov
Date: Friday, April 2, 2010 - 9:19 am

I am sorry, I forgot it needs

	Reported-by: Catalin Marinas <catalin.marinas@arm.com>

--

From: Linus Torvalds
Date: Friday, April 2, 2010 - 10:46 am

Hmm. Maybe we should have cleared this in tty_release() already. We 
already do some of the session clearing there (but we clear the session in 
the _tasks_ associated with the tty, not the tty session pointer).


I think the patch is simpler than worrying about the much more complex 
release logic. So I think I actually prefer this patch over something that 
tries to be clever in tty_release.

We might even push it into "free_tty_struct()", although I think that the 
only non-release_one_tty() callers of that are the ones that allocated the 
tty but due to some failure never connected it to anything. So on the 
whole I think you picked the right spot.

So I'll ACK it. But maybe Alan sees some problem/issue I didn't see.

			Linus
--

From: Eric W. Biederman
Date: Friday, April 2, 2010 - 11:22 am

I agree.   However we made it to release_one_tty with pids we need
to free them, before we free the tty structure itself.

My general paranoia would suggest setting the pids to NULL.  So that
we don't have the chance of a use after free.

Eric
--

From: Oleg Nesterov
Date: Friday, April 2, 2010 - 11:48 am

In this case, I don't think this is needed. We are doing
free_tty_struct()->kfree(tty) right after put_pid()s, nobody
can use these pointers or we have another bug.


Most probably this patch is correct (but perhaps it is not the best fix).
Every time tty does put_pid() it should also clear the pointer. But I am
not sure I grepped enough.

Oleg.

--

From: Oleg Nesterov
Date: Friday, April 2, 2010 - 11:43 am

Yes, probably we can free them earlier.

But I am very nervous about this change, I tried to defer put_pid()
as much as possible, in case something else uses ->prgp/session
before free_tty_struct().

Oleg.

--

From: Alan Cox
Date: Friday, April 2, 2010 - 1:09 pm

> So I'll ACK it. But maybe Alan sees some problem/issue I didn't see.

I have no idea. Someone with the time (not me right now) will have to
trace the error paths or perhaps play safe and NULL any pointers so that
they can free it in the destructor only if one was set ?

--

From: Oleg Nesterov
Date: Friday, April 2, 2010 - 9:04 am

Add cc's.


OK. I do not undertand ttys, absolutely. This means the patch
should not be applied without acks. And in fact I feel the patch
probably fixes the symptom, not the problem. But the logic in
disassociate_ctty() is beyond my understanding.

However, I think it is easy to explain the leak.

Catalin, Tetsuo, could you try this patch?

Oleg.

--

From: Tetsuo Handa
Date: Friday, April 2, 2010 - 7:40 pm

I confirmed using 2.6.34-rc3 that this patch solved the leak.

--

From: Linus Torvalds
Date: Friday, April 2, 2010 - 8:08 pm

Ok, I committed this, but after committing I realized that the patch 
didn't have the "Cc: stable@kernel.org", even if the email did.

So Greg et al: it's now commit 6da8d866d0d39e9509ff826660f6a86a6757c966 in 
mainline. 

		Linus
--

From: Greg KH
Date: Friday, April 2, 2010 - 10:15 pm

Thanks, I'll queue it up in the morning.

greg k-h
--

Previous thread: File For Claims! by FOREIGN TRANSFER MANAGER on Saturday, March 27, 2010 - 4:07 am. (1 message)

Next thread: KVM bug, git bisected by Kent Overstreet on Saturday, March 27, 2010 - 5:43 am. (8 messages)