Re: [PATCH] evdev: Release eventual input device grabs when getting disconnected

Previous thread: some PCI reads makes machine lock up by Pierre Ossman on Sunday, March 30, 2008 - 11:39 am. (1 message)

Next thread: The never ending BEEEEP/__smp_call_function_mask with 2.6.25-rc7 by Chr on Sunday, March 30, 2008 - 12:09 pm. (23 messages)
From: Björn
Date: Sunday, March 30, 2008 - 11:42 am

When getting disconnected we need to release eventual grabs on the
underlying input device as we also release the input device itself.
Otherwise, we would try to release the grab when the client that
requested it closes its handle, accessing the input device which
might already be freed.

Signed-off-by: Björn Steinbrink <B.Steinbrink@gmx.de>
---
I can't reproduce the bug on my UP box and currently can't afford
crashing my SMP box (all the oopses seem to come from SMP kernels, so I
guess it needs SMP to crash), so while this doesn't show any new
problems, I can't tell whether it actually fixes anything. Testers
welcome!

 drivers/input/evdev.c |    3 +++
 1 files changed, 3 insertions(+), 0 deletions(-)

diff --git a/drivers/input/evdev.c b/drivers/input/evdev.c
index 0727b0a..99562ce 100644
--- a/drivers/input/evdev.c
+++ b/drivers/input/evdev.c
@@ -853,6 +853,9 @@ static void evdev_cleanup(struct evdev *evdev)
 	evdev_hangup(evdev);
 	evdev_remove_chrdev(evdev);
 
+	if (evdev->grab)
+		evdev_ungrab(evdev, evdev->grab);
+
 	/* evdev is marked dead so no one else accesses evdev->open */
 	if (evdev->open) {
 		input_flush_device(handle, NULL);
-- 
1.5.5.rc2
--

From: Linus Torvalds
Date: Sunday, March 30, 2008 - 2:51 pm

Ok, I applied this because I will do an -rc8 today or tomorrow, but I 
really really hope somebody can figure out what made this all start to 
trigger. It does smell like some core device layer change, because we do 
not seem to have a lot of changes since 2.6.24 in evdev.c and input.c that 
seem relevant.

Greg, are there any refcounting changes that would cause the input devices 
to be free'd earlier or something?

		Linus
--

From: Greg KH
Date: Sunday, March 30, 2008 - 3:22 pm

Earlier?  No, not that I know of at all, as long as the reference
counting logic was correct originally.  All of the problems we have been
fixing were ones where we accidentally were grabbing too many references
and then wondering why things were not getting cleaned up properly as
the kobject rework exposed these problems making them more obvious.

I haven't heard of the opposite happening.  Anything that I can try to
test for here, I have a lot of removable input devices to test with.

thanks,

greg k-h
--

From: Linus Torvalds
Date: Sunday, March 30, 2008 - 3:35 pm

Ahh, you weren't cc'd earlier, and may have missed the discussion. It's on 
Linux-kernel in the thread called "Oops/Warning report for the week of 
March 28th 2008"

Johannes says this triggers for him:

	On the other hand, it should be easily reproducible by anyone else with
	the same trick, here's what I do:
	
	 * configure X to use /dev/input/event* devices
	 * in an xterm, do something like
	      rmmod usbhid ; modprobe usbhid
	 * switch to a VT
	 * watch kernel crash as X releases the grab on the event device

and I haven't tried it because none of my normal machines use modules, and 
I was lazy and really hoping somebody else who actually knows the device 
and input layers would take a peek.

		Linus


--

From: Björn
Date: Sunday, March 30, 2008 - 3:42 pm

Not freeing the input device at all would of course also hide any
access-after-free problems :-) So if that's the case, that might explain
the sudden exposure of the problem. IMHO, my patch is the right thing to
do anyway, because releasing a grab on the underlying input device from
within evdev clearly needs to happen before we release that device. So
AFAICT we're really just looking for "why do we see that bug now?" and

Sorry, forgot to set the In-Reply-To header when sending the patch. The
original thread, with a reproducing recipe is here:
http://lkml.org/lkml/2008/3/28/442
Message-Id: <1206742499.22530.90.camel@johannes.berg>

Björn
--

From: Arjan van de Ven
Date: Sunday, March 30, 2008 - 4:19 pm

one note.. you do want to enable slab poison, just to catch the bug right away...
--

From: Dmitry Torokhov
Date: Monday, March 31, 2008 - 1:46 pm

Hi Bjorn,


If device is being disconnected (rdestroyed) then we dont really need to
release grab since there won't be any input events coming through anyway,
so there is no "another bug". I am considering removing the call to
release device once we sort out the issue with lifetime rules change,
since it is not needed.

-- 
Dmitry
--

From: Dmitry Torokhov
Date: Sunday, March 30, 2008 - 11:15 pm

Hi Linus,


The following commit changed lifetime runes on kobjects breaking input:

commit 0f4dafc0563c6c49e17fe14b3f5f356e4c4b8806
Author: Kay Sievers <kay.sievers@vrfy.org>
Date:   Wed Dec 19 01:40:42 2007 +0100

    Kobject: auto-cleanup on final unref

    We save the current state in the object itself, so we can do proper
    cleanup when the last reference is dropped.

    If the initial reference is dropped, the object will be removed from
    sysfs if needed, if an "add" event was sent, "remove" will be send, and
    the allocated resources are released.

    This allows us to clean up some driver core usage as well as allowing us
    to do other such changes to the rest of the kernel.

    Signed-off-by: Kay Sievers <kay.sievers@vrfy.org>
    Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>

Before we dropped reference to kobject's parent only when child kobject
was released (in kobject_cleanup). The changeset above moves the release
to kobject_del() which is way too early in my opinion. The kobject is only
marked for deletion at that time, not really deleted.

I will look how to properly fix evdev and the rest of input interfaces
tomorrow.

-- 
Dmitry
--

From: Greg KH
Date: Monday, March 31, 2008 - 10:28 am

It was "deleted" from sysfs, and should have never been used again by
any callers.  If the reference count was dropped to zero with this call,
it would be cleaned up as well, it seems that you were assuming that it
would not be?  Perhaps you just need to grab another reference as this
would have caused you problems without this change anyway, but without

If you need any help, please let me know.

thanks,

greg k-h
--

From: Dmitry Torokhov
Date: Monday, March 31, 2008 - 11:01 am

Greg, please look at the change again. Before kobject_put(kobj->parent)
was done in kobject_cleanup() and so the parent would only be freed when
all its children are gone. Now parent is deleted early, even if its
children are still referenced by other users. This is lifetime rule
change and should really be announced as such.

If this change it intentional and is here to stay then I will just grab
the references myself, although I wonder what else might be broken by
it.

-- 
Dmitry
--

From: Linus Torvalds
Date: Monday, March 31, 2008 - 11:24 am

I do agree that this might want reverting, unless there is some rally good 
reason for it. People may have pefectly valid reasons to expect topology 
and reachability to remain valid - it's certainly what we guarantee in the 
VFS code for similar rules (ie the parent of a dentry is only free'd after 
all children have gone away).

Greg, is it possible to get the old lifetime rules back wrt his? They seem 
valid and sane..

			Linus
--

From: Benjamin Herrenschmidt
Date: Monday, March 31, 2008 - 4:12 pm

Looks like we are seeing something similar with suspend, I just got this
oops log. I think what happens is that appletouch suspend causes it to
disconnect and then X console switches or closes the evdev, whatever,
kaboom ...

sd 0:0:0:0: [sda] Synchronizing SCSI cache
sd 0:0:0:0: [sda] Result: hostbyte=0x01 driverbyte=0x00
usbcore: deregistering interface driver appletouch
input: appletouch disconnected
PM: Syncing filesystems ... done.
Oops: Kernel access of bad area, sig: 11 [#1]
PowerMac
Modules linked in: sg sd_mod binfmt_misc appletalk psnap llc hci_usb radeon drm rfcomm l2cap bluetooth cpufreq_userspace cpufreq_powersave cpufreq_ondemand cpufreq_conservative xt_tcpudp nf_conntrack_ipv4 xt_state nf_conntrack iptable_filter ip_tables x_tables fuse i2c_dev therm_adt746x sr_mod sbp2 apm_emu apm_emulation arc4 ecb snd_aoa_codec_tas snd_aoa_fabric_layout snd_aoa b43 mac80211 joydev pcmcia cfg80211 rng_core ohci1394 snd_aoa_i2sbus ieee1394 snd_pcm snd_page_alloc snd_seq_midi snd_rawmidi pmac_zilog serial_core snd_seq_midi_event snd_seq snd_timer snd_seq_device snd soundcore snd_aoa_soundbus yenta_socket rsrc_nonstatic pcmcia_core ssb uninorth_agp agpgart [last unloaded: appletouch]
NIP: c02932a8 LR: c01f961c CTR: c002ac78
REGS: d26e7dc0 TRAP: 0300   Not tainted  (2.6.25-rc7-p1)
MSR: 00009032 <EE,ME,IR,DR>  CR: 24000888  XER: 00000000
DAR: 00000000, DSISR: 42000000
TASK = d5be6d30[14676] 'Xorg' THREAD: d26e6000
GPR00: d26e7e78 d26e7e70 d5be6d30 ef065640 c037b570 c03a60d0 00000f40 00000001 
GPR08: 00000008 00000000 d26e7ea4 00000000 24000888 101fba44 101f3bc4 101f3bec 
GPR16: bffb0660 00000000 102187e4 10218ae4 102186e4 10218764 102189e4 bffb0574 
GPR24: 102187e4 101f3cf8 ef63c0d4 ef80cc20 ef152c1c ef065644 d5be6d30 ef065640 
NIP [c02932a8] __mutex_lock_slowpath+0x2c/0xc0
LR [c01f961c] input_release_device+0x24/0x48
Call Trace:
[d26e7e70] [c0380000] 0xc0380000 (unreliable)
[d26e7ea0] [c01f961c] input_release_device+0x24/0x48
[d26e7ec0] [c01fd9c4] evdev_ungrab+0x4c/0x64
[d26e7ed0] ...
From: Greg KH
Date: Monday, March 31, 2008 - 4:51 pm

Can you try it with the patch that was just posted by Dmitry for the
evdev code?

thanks,

greg k-h
--

From: Benjamin Herrenschmidt
Date: Monday, March 31, 2008 - 6:01 pm

Yup, it works, ship it ! :-)

Cheers,
Ben.


--

From: Greg KH
Date: Monday, March 31, 2008 - 1:42 pm

Ugh, this was done because of scsi, they required that if you really

Yes, if you need those references, you are going to have to hold on for
them, the kobject layer will not keep them around.  It now is a "does
what you ask for" type model :)

I fail to see where this affects the input code though, in glancing at
it, it looks like you are doing things properly.  Kay, any thoughts
here, I think you looked at the kobject input layer interaction in the
past.

thanks,

greg k-h
--

From: Dmitry Torokhov
Date: Monday, March 31, 2008 - 1:57 pm

Ok, I really liked the old behavior better, but if it is to stay then
we need something like this (not for inclusion yet as mousedev and joydev
need to be adjusted as well):

Signed-off-by: Dmitry Torokhov <dtor@mail.ru>
---
 drivers/input/evdev.c |    6 ++----
 1 file changed, 2 insertions(+), 4 deletions(-)

Index: linux/drivers/input/evdev.c
===================================================================
--- linux.orig/drivers/input/evdev.c
+++ linux/drivers/input/evdev.c
@@ -124,6 +124,7 @@ static void evdev_free(struct device *de
 {
 	struct evdev *evdev = container_of(dev, struct evdev, dev);
 
+	input_put_device(evdev->handle.dev);
 	kfree(evdev);
 }
 
@@ -853,9 +854,6 @@ static void evdev_cleanup(struct evdev *
 	evdev_hangup(evdev);
 	evdev_remove_chrdev(evdev);
 
-	if (evdev->grab)
-		evdev_ungrab(evdev, evdev->grab);
-
 	/* evdev is marked dead so no one else accesses evdev->open */
 	if (evdev->open) {
 		input_flush_device(handle, NULL);
@@ -896,7 +894,7 @@ static int evdev_connect(struct input_ha
 	evdev->exist = 1;
 	evdev->minor = minor;
 
-	evdev->handle.dev = dev;
+	evdev->handle.dev = input_get_device(dev);
 	evdev->handle.name = evdev->name;
 	evdev->handle.handler = handler;
 	evdev->handle.private = evdev;

-- 
Dmitry
--

From: Greg KH
Date: Monday, March 31, 2008 - 3:09 pm

Yes, that's the proper behavior anyway, as you are passing off a pointer
to a device, you need to keep the reference to that device around until
you are finished with it.

I'm amazed that this wasn't causing a problem before the kobject change,
as this should have been needed there as well.  Would running with slab

Acked-by: Greg Kroah-Hartman <gregkh@suse.de>

thanks,

greg k-h
--

From: Dmitry Torokhov
Date: Monday, March 31, 2008 - 8:30 pm

It worked because evdev (and mousedev, joydev) are direct children of
input_dev and prior to Kay change parent would stay till all childrens
are gone.

I will ask Linus to pull extended patch covering also joydev and mousedev
shortly.

-- 
Dmitry
--

From: Jiri Kosina
Date: Monday, March 31, 2008 - 2:27 pm

What is the exact meaning of "gone" here please?

-- 
Jiri Kosina
--

From: Kay Sievers
Date: Monday, March 31, 2008 - 3:46 pm

Gone means, that if you remove a device from sysfs, you drop the
implicit reference to the parent device, as this is no longer needed.

You are expected to keep a ref to the parent object (same way as to any
other used object) if you need to access the data. Removed objects are
isolated now, which means that you just pin their data and not their
parents.

This is the expected behavior and makes it possible to resolve refcount
loops (parent ref's child) which could not be released with the implicit
parent ref that was only released on object cleanup.

Thanks,
Kay

--

From: Johannes Berg
Date: Monday, March 31, 2008 - 1:21 pm

Unfortunately, I can't even bisect, I just tried compiling 2.6.25-rc1
and it failed to link because of __udivi3, __umodi3 and another one (or
similar). A quick look failed to tell me why I get that with -rc1 and
-rc2 but not -rc7.

Below is a simple test program. Run it on any event device, then rmmod
the module the device belongs to and abort the program with ctrl-c.

johannes

#include <stdlib.h>
#include <unistd.h>
#include <stdio.h>
#include <sys/types.h>
#include <sys/stat.h>
#include <fcntl.h>
#include <linux/input.h>

int main (int argc, char **argv)
{
	int fd;

	if (argc < 2) {
		printf("Usage: grab /dev/input/eventX\n");
		printf("Where X =3D input device number\n");
		exit(1);
	}

	if ((fd =3D open(argv[argc - 1], O_RDONLY)) < 0) {
		perror("grab[open]");
		exit(1);
	}

	if (ioctl(fd, EVIOCGRAB, 1)) {
		perror("grab[EVIOCGRAB]");
		exit(1);
	}

	printf("interrupt to exit\n");

	while (1)
		sleep(1000);
}

From: Björn
Date: Monday, March 31, 2008 - 12:05 pm

Hm, crap, can't reproduce on my x86_64 SMP box either. Tried with
various preemption models as well as rcu classic and preempt. Johannes,
is there anything "special" in your configuration?

Björn
--

From: Johannes Berg
Date: Tuesday, April 1, 2008 - 4:51 am

slab debugging turned on by default? without it, you'll most likely
never notice.

johannes
From: Björn
Date: Tuesday, April 1, 2008 - 8:20 am

I tried with slab+debugging, slub+debugging (SLUB_DEBUG_ON was set),
various preemption models, UP and SMP. No luck. I also wrote a test
program like yours to eliminate any effects caused by different X.org
versions, but it just doesn't want to crash :-/

Björn
--

From: Johannes Berg
Date: Wednesday, April 2, 2008 - 2:17 am

Strange. I don't know then, I used 'appletouch' to reproduce, maybe
there's as bug in appletouch's exit function?

johannes
From: Johannes Berg
Date: Monday, March 31, 2008 - 2:02 pm

You might want to insert a comment about why this is safe and doesn't
race since it's not entirely trivial to see because everything else that
manipulates the grab needs to take the mutex. In fact, I'm not entirely
sure it's race-free but at least it can't race against the ioctl handler
because by this time ->exist will be 0.

johannes
Previous thread: some PCI reads makes machine lock up by Pierre Ossman on Sunday, March 30, 2008 - 11:39 am. (1 message)

Next thread: The never ending BEEEEP/__smp_call_function_mask with 2.6.25-rc7 by Chr on Sunday, March 30, 2008 - 12:09 pm. (23 messages)