Re: [BUG] object: is not initialized, yet kobject_put() is being called

Previous thread: ARM: 2.6.3[45] PCI regression (IXP4xx and PXA?) by Krzysztof Halasa on Tuesday, August 10, 2010 - 1:36 pm. (25 messages)

Next thread: [PULL REQ] IXP4xx (ARM) for 2.6.36 by Krzysztof Halasa on Tuesday, August 10, 2010 - 2:28 pm. (1 message)
From: Heinz Diehl
Date: Tuesday, August 10, 2010 - 2:09 pm

Hi,

I'm getting this in my logs with the latest -git7/8/9 kernels. Thw apcups
is not defective, it works flawlessly with 2.6.34.X.

Aug 10 23:01:44 liesel apcupsd[7687]: apcupsd FATAL ERROR in linux-usb.c
at line 609 Cannot find UPS device -- For a link to detailed USB trouble
shooting information, please see <http://www.apcupsd.com/support.html>.
Aug 10 23:01:44 liesel apcupsd[7687]: apcupsd error shutdown completed
Aug 10 23:04:40 liesel kernel: ------------[ cut here ]------------

Aug 10 23:04:40 liesel kernel: WARNING: at lib/kobject.c:595 kobject_put+0x2f/0x50()
Aug 10 23:04:40 liesel kernel: Hardware name: GA-MA770-UD3 Aug 10 23:04:40 liesel kernel: kobject: '(null)' (ffff88022c651898): is
not initialized, yet kobject_put() is being called. Aug 10 23:04:40 liesel kernel: Modules linked in: xt_pkttype ipt_LOG
xt_limit aes_x86_64 aes_generic af_packet it87 hwmon_vid binfmt_misc snd_pcm_oss snd_mixer_oss snd_seq snd_seq_device cpufreq_conservative
cpufreq_ondemand cpufreq_userspace cpufreq_powersave powernow_k8 freq_table mperf xt_NOTRACK xt_state iptable_raw nf_conntrack_netbios_ns
nf_conntrack_ipv4 nf_conntrack nf_defrag_ipv4 ip6_tables ipt_REJECT iptable_filter ip_tables tcp_vegas uhci_hcd rt73usb rt2x00usb rt2x00lib
r8169 led_class mii mac80211 snd_hda_codec_realtek ppdev parport_pc cfg80211 parport snd_hda_intel snd_hda_codec snd_pcm snd_timer snd
soundcore sr_mod k10temp hwmon i2c_piix4 pcspkr cdrom button rfkill joydev snd_page_alloc i2c_core sg xts usbhid sd_mod ohci_hcd ehci_hcd rtc_cmos
rtc_core pata_atiixp rtc_lib usbcore ahci libahci hmac dm_snapshot dm_crypt dm_mod gf128mul sha512_generic sha256_generic twofish_x86_64
twofish_common serpent cbc tgr192 loop ecb arc4 fuse edd ide_pci_generic amd74xx ide_core fan pata_amd sata_nv libata scsi
Aug 10 23:04:40 liesel kernel: Pid: 1115, comm: khubd Not tainted 2.6.35-git9 #1
Aug 10 23:04:40 liesel kernel: Call Trace:
Aug 10 23:04:40 liesel kernel:  [<ffffffff812a620f>] ? kobject_put+0x2f/0x50
Aug 10 23:04:40 liesel kernel:  ...
From: Markus Trippelsdorf
Date: Tuesday, August 10, 2010 - 3:32 pm

I'm seeing something similar today:

hub 3-0:1.0: port 1 disabled by hub (EMI?), re-enabling...
usb 3-1: USB disconnect, address 2
------------[ cut here ]------------
WARNING: at lib/kobject.c:595 kobject_put+0x37/0x4b()
Hardware name: System Product Name
kobject: '(null)' (ffff88011e9cd898): is not initialized, yet kobject_put() is being called.
Pid: 310, comm: khubd Not tainted 2.6.35-06113-gf6cec0a-dirty #4
Call Trace:
 [<ffffffff8105e7f7>] warn_slowpath_common+0x80/0x98
 [<ffffffff8105e8a2>] warn_slowpath_fmt+0x46/0x48
 [<ffffffff8118fbad>] kobject_put+0x37/0x4b
 [<ffffffff812aa88d>] put_device+0x17/0x19
 [<ffffffff8132fef1>] hid_destroy_device+0x41/0x45
 [<ffffffff81335e5f>] usbhid_disconnect+0x3e/0x4b
 [<ffffffff812fbd0d>] usb_unbind_interface+0x4c/0xd3
 [<ffffffff812ad746>] __device_release_driver+0x6c/0xb5
 [<ffffffff812adc33>] device_release_driver+0x23/0x30
 [<ffffffff812ace3b>] bus_remove_device+0x89/0x99
 [<ffffffff812aaec6>] device_del+0x12d/0x17f
 [<ffffffff812facc4>] usb_disable_device+0x86/0xfd
 [<ffffffff812f446a>] usb_disconnect+0x8e/0xf6
 [<ffffffff812f62fd>] hub_thread+0x3eb/0xcb5
 [<ffffffff8140170b>] ? _raw_spin_unlock_irq+0x16/0x31
 [<ffffffff81077ea3>] ? autoremove_wake_function+0x0/0x39
 [<ffffffff812f5f12>] ? hub_thread+0x0/0xcb5
 [<ffffffff8107750e>] kthread+0x82/0x8a
 [<ffffffff8102f794>] kernel_thread_helper+0x4/0x10
 [<ffffffff8107748c>] ? kthread+0x0/0x8a
 [<ffffffff8102f790>] ? kernel_thread_helper+0x0/0x10
---[ end trace 5707e7a0642da347 ]---
------------[ cut here ]------------
WARNING: at lib/kobject.c:595 kobject_put+0x37/0x4b()
Hardware name: System Product Name
kobject: '(null)' (ffff88011e9fd898): is not initialized, yet kobject_put() is being called.
Pid: 310, comm: khubd Tainted: G        W   2.6.35-06113-gf6cec0a-dirty #4
Call Trace:
 [<ffffffff8105e7f7>] warn_slowpath_common+0x80/0x98
 [<ffffffff8105e8a2>] warn_slowpath_fmt+0x46/0x48
 [<ffffffff8118fbad>] kobject_put+0x37/0x4b
 [<ffffffff812aa88d>] put_device+0x17/0x19
 ...
From: Jiri Kosina
Date: Thursday, August 12, 2010 - 2:11 am

Addin Alan Ott to CC, who has reported similar problem.

I am not able to reproduce the problem here with current Linus' tree 
(5af568cbd55f60b5). This is what I see in dmesg when I do two rounds of 
disconnect-connect with my random USB mouse

usb 3-2: USB disconnect, address 7
usb 3-2: new low speed USB device using uhci_hcd and address 8
usb 3-2: New USB device found, idVendor=05b8, idProduct=3091
usb 3-2: New USB device strings: Mfr=1, Product=2, SerialNumber=0
usb 3-2: Product: USB mouse with wheel
usb 3-2: Manufacturer: mouse
input: mouse USB mouse with wheel as /devices/pci0000:00/0000:00:10.1/usb3/3-2/3-2:1.0/input/input16
generic-usb 0003:05B8:3091.000C: input,hidraw0: USB HID v1.10 Mouse [mouse USB mouse with wheel] on usb-0000:00:10.1-2/input0
input: mouse USB mouse with wheel as /devices/pci0000:00/0000:00:10.1/usb3/3-2/3-2:1.1/input/input17
generic-usb 0003:05B8:3091.000D: input,hidraw1: USB HID v1.10 Keyboard [mouse USB mouse with wheel] on usb-0000:00:10.1-2/input1
usb 3-2: USB disconnect, address 8
usb 3-2: new low speed USB device using uhci_hcd and address 9
usb 3-2: New USB device found, idVendor=05b8, idProduct=3091
usb 3-2: New USB device strings: Mfr=1, Product=2, SerialNumber=0
usb 3-2: Product: USB mouse with wheel
usb 3-2: Manufacturer: mouse
input: mouse USB mouse with wheel as /devices/pci0000:00/0000:00:10.1/usb3/3-2/3-2:1.0/input/input18
generic-usb 0003:05B8:3091.000E: input,hidraw0: USB HID v1.10 Mouse [mouse USB mouse with wheel] on usb-0000:00:10.1-2/input0
input: mouse USB mouse with wheel as /devices/pci0000:00/0000:00:10.1/usb3/3-2/3-2:1.1/input/input19
generic-usb 0003:05B8:3091.000F: input,hidraw1: USB HID v1.10 Keyboard [mouse USB mouse with wheel] on usb-0000:00:10.1-2/input1

I.e. now BUG from kobject refcounting, and hidraw numbers are stable as 
well.

Guys, could you please send me your .config with which you are 
experiencing the problem, so that I can try to reproduce it with it?

-- 
Jiri Kosina
SUSE Labs, Novell Inc.
--

From: Markus Trippelsdorf
Date: Thursday, August 12, 2010 - 2:16 am

Attached.
-- 
»A man who doesn't know he is in prison can never escape.«
William S. Burroughs
From: Jiri Kosina
Date: Thursday, August 12, 2010 - 4:18 am

So I had to add support for HW of mine (SATA_VIA, PATA_VIA, disabled the 
radeon firmware stuff, and, most importantly, enabled UHCI, which is what 
is on machine I am using now for testing), and disconnect-connect cycle 
works nicely:

usb 3-2: USB disconnect, address 3
usb 3-2: new low speed USB device using uhci_hcd and address 4
input: mouse USB mouse with wheel as /devices/pci0000:00/0000:00:10.1/usb3/3-2/3-2:1.0/input/input6
generic-usb 0003:05B8:3091.0004: input,hidraw0: USB HID v1.10 Mouse [mouse USB mouse with wheel] on usb-0000:00:10.1-2/input0
input: mouse USB mouse with wheel as /devices/pci0000:00/0000:00:10.1/usb3/3-2/3-2:1.1/input/input7
generic-usb 0003:05B8:3091.0005: input,hidraw1: USB HID v1.10 Keyboard [mouse USB mouse with wheel] on usb-0000:00:10.1-2/input1

so still no BUG from kobject refcounting, and the hidraw numbers are the 
same. The major difference seems to be that I am using UHCI and you have 
your HID devices connected through OHCI, is that correct?

Alan, does the system you are seeing this bug on also have HID devices 
connected through OHCI?

Thanks,

-- 
Jiri Kosina
SUSE Labs, Novell Inc.
--

From: Alan Ott
Date: Thursday, August 12, 2010 - 4:26 am

Hi Jiri,

Yes, I'm OHCI as well.

Alan.


--

From: Markus Trippelsdorf
Date: Thursday, August 12, 2010 - 4:36 am

Correct. And I should add that it very easy to reproduce here; just
disconnecting the usb receiver triggers the WARNING.
-- 
»A man who doesn't know he is in prison can never escape.«
William S. Burroughs
--

From: Markus Trippelsdorf
Date: Thursday, August 12, 2010 - 4:45 am

Disabling USB_HIDDEV in my .config solves the problem (there is no
WARNING anymore when disconnecting the receiver).

-- 
»A man who doesn't know he is in prison can never escape.«
William S. Burroughs
--

From: Markus Trippelsdorf
Date: Thursday, August 12, 2010 - 4:56 am

Reverting bd25f4dd6972755579d0ea50d1a5ace2e9b00d1a with USB_HIDDEV
enabled also solves the problem. So there must be a bug in that patch:

commit bd25f4dd6972755579d0ea50d1a5ace2e9b00d1a
Author: Arnd Bergmann <arnd@arndb.de>
Date:   Sun Jul 11 15:34:05 2010 +0200

    HID: hiddev: use usb_find_interface, get rid of BKL
    
    This removes the private hiddev_table in the usbhid
    driver and changes it to use usb_find_interface
    instead.
    
    The advantage is that we can avoid the race between
    usb_register_dev and usb_open and no longer need the
    big kernel lock.
    
    This doesn't introduce race condition -- the intf pointer could be
    invalidated only in hiddev_disconnect() through usb_deregister_dev(),
    but that will block on minor_rwsem and not actually remove the device
    until usb_open().
    
    Signed-off-by: Arnd Bergmann <arnd@arndb.de>
    Cc: Jiri Kosina <jkosina@suse.cz>
    Cc: "Greg Kroah-Hartman" <gregkh@suse.de>
    Signed-off-by: Jiri Kosina <jkosina@suse.cz>


-- 
»A man who doesn't know he is in prison can never escape.«
William S. Burroughs
--

From: Jiri Kosina
Date: Thursday, August 12, 2010 - 5:07 am

Thanks, I had my suspicion about this commit (that's why I asked to you 
verify with CONFIG_USB_HIDDEV turned off as well).

Adding Arnd to CC and will look into it myself as well soon.

Arnd, the complete thread can be found at 
http://lkml.org/lkml/2010/8/10/280

-- 
Jiri Kosina
SUSE Labs, Novell Inc.
--

From: Jiri Kosina
Date: Thursday, August 12, 2010 - 5:57 am

Could please those of you, who are able to reproduce the problem, verify 
whether the patch below (completely untested) makes everything behave 
again? Thanks.



 drivers/hid/usbhid/hiddev.c |    5 +++--
 1 files changed, 3 insertions(+), 2 deletions(-)

diff --git a/drivers/hid/usbhid/hiddev.c b/drivers/hid/usbhid/hiddev.c
index 254a003..0dcb79a 100644
--- a/drivers/hid/usbhid/hiddev.c
+++ b/drivers/hid/usbhid/hiddev.c
@@ -266,13 +266,15 @@ static int hiddev_open(struct inode *inode, struct file *file)
 {
 	struct hiddev_list *list;
 	struct usb_interface *intf;
+	struct hid_device *hid;
 	struct hiddev *hiddev;
 	int res;
 
 	intf = usb_find_interface(&hiddev_driver, iminor(inode));
 	if (!intf)
 		return -ENODEV;
-	hiddev = usb_get_intfdata(intf);
+	hid = usb_get_intfdata(intf);
+	hiddev = hid->hiddev;
 
 	if (!(list = kzalloc(sizeof(struct hiddev_list), GFP_KERNEL)))
 		return -ENOMEM;
@@ -888,7 +890,6 @@ int hiddev_connect(struct hid_device *hid, unsigned int force)
 	hid->hiddev = hiddev;
 	hiddev->hid = hid;
 	hiddev->exist = 1;
-	usb_set_intfdata(usbhid->intf, usbhid);
 	retval = usb_register_dev(usbhid->intf, &hiddev_class);
 	if (retval) {
 		err_hid("Not able to get a minor for this device.");


-- 
Jiri Kosina
SUSE Labs, Novell Inc.
--

From: Markus Trippelsdorf
Date: Thursday, August 12, 2010 - 6:04 am

Yes, everything behaves again. Thanks Jiri.

-- 
»A man who doesn't know he is in prison can never escape.«
William S. Burroughs
--

From: Alan Ott
Date: Thursday, August 12, 2010 - 6:09 am

Seconded. Works good. Thanks for your support Jiri.



--

From: Jiri Kosina
Date: Thursday, August 12, 2010 - 6:11 am

Thanks for your testing. I still need to think a little bit more about 
whether this is indeed an appropriate fix or there might be still some 
problem.

Once I will be applying this, I will put

	Reported-and-tested-by: 

tag for both of you Alan and Markus, is that fine?

Thanks,

-- 
Jiri Kosina
SUSE Labs, Novell Inc.
--

From: Alan Ott
Date: Thursday, August 12, 2010 - 6:14 am

That's fine with me.

Alan.


--

From: Markus Trippelsdorf
Date: Thursday, August 12, 2010 - 6:17 am

It's also fine with me.

-- 
»A man who doesn't know he is in prison can never escape.«
William S. Burroughs
--

From: Heinz Diehl
Date: Thursday, August 12, 2010 - 9:40 am

Thirded! This fixes the warning here.

Thanks,
Heinz.
--

From: Jiri Kosina
Date: Thursday, August 19, 2010 - 12:41 am

The fix is now in Linus' tree, and thus will be in -rc2. Thanks everyone 
for testing.

-- 
Jiri Kosina
SUSE Labs, Novell Inc.
--

From: Heinz Diehl
Date: Monday, September 20, 2010 - 11:11 am

Unfortunately, I have to take up this problem a second time. 
I considered the issue fixed, and in fact it was using Jiris patch.

However, the same problem came back while testing 2.6.36-rc4 and the
following -git's. /var/log/warn shows the same warning as mentioned in
the first mail in this thread:

 Sep 20 08:18:42 liesel apcupsd[5436]: apcupsd FATAL ERROR in linux-usb.c at line 609 
 Cannot find UPS device -- For a link to detailed USB trouble
 shooting information, please see <http://www.apcupsd.com/support.html>.
 Sep 20 08:18:42 liesel apcupsd[5436]: apcupsd error shutdown completed

No additional oops this time, only the warning, and my UPS is no longer
working.

Jiris patch obviously didn't made it into 2.6.36-rc4, and it doesn't
apply nay longer either.

Thanks,
Heinz.
--

From: Jiri Kosina
Date: Tuesday, September 21, 2010 - 7:55 am

This seems like a completely different problem -- apcupsd is not finding 
the UPS, yes, but kernel doesn't WARN/BUG here, so it seems like similar 
symptoms, but very likely a different cause.

Could you perhaps strace apcupsd to see what it is doing before it fails?

-- 
Jiri Kosina
SUSE Labs, Novell Inc.
--

From: Heinz Diehl
Date: Tuesday, September 21, 2010 - 8:09 am

Yes, of course. The output is attached this mail.

From: Jiri Kosina
Date: Tuesday, September 21, 2010 - 8:14 am

strace -f would likely be more helpful.

-- 
Jiri Kosina
SUSE Labs, Novell Inc.
--

From: Heinz Diehl
Date: Tuesday, September 21, 2010 - 8:27 am

Oops, yes, of course. Output is attached.

From: Jiri Kosina
Date: Tuesday, September 21, 2010 - 8:53 am

Looks like the /dev/usb/hiddev* nodes are not being created at all. Could 
you please attach output of dmesg and ls -l /dev/usb

-- 
Jiri Kosina
SUSE Labs, Novell Inc.
--

From: Heinz Diehl
Date: Tuesday, September 21, 2010 - 9:11 am

Dmesg output is attached. 

/dev/usb is not existent on my systems (opensuse 11.1 and 11.2).
It doesn't exist with 2.6.35.4 either, which works perfectly.

From: Heinz Diehl
Date: Tuesday, September 21, 2010 - 9:20 am

And these are dmesg output and strace -f from the working 2.6.35.4
kernel. No /dev/usb here either.







From: Jiri Kosina
Date: Tuesday, September 21, 2010 - 9:24 am

OK, so the problem is

21206 open("/dev/hiddev0", O_RDWR|O_NOCTTY) = -1 ENODEV (No such device)
21206 open("/dev/hiddev1", O_RDWR|O_NOCTTY) = -1 ENODEV (No such device)
21206 open("/dev/hiddev2", O_RDWR|O_NOCTTY) = -1 ENODEV (No such device)

Do you have 8fe294caf8c868edd9046251824a0af91991bf43 commit applied? (it 
is included in -rc5).

-- 
Jiri Kosina
SUSE Labs, Novell Inc.
--

From: Heinz Diehl
Date: Tuesday, September 21, 2010 - 11:48 am

That's it! All works fine now.
Many thanks for your patience and help!

Heinz
--

From: Jiri Kosina
Date: Thursday, August 12, 2010 - 4:55 am

Hmm ... there is a certain interface of the device which doesn't get 
connected to input, right? (0003:046D:C52B.0008).

What happens if you compile without CONFIG_HIDRAW and/or 
CONFIG_USB_HIDDEV?

-- 
Jiri Kosina
SUSE Labs, Novell Inc.
--

From: Alan Ott
Date: Thursday, August 12, 2010 - 4:23 am

So this is weird. I tried it with _my_ random USB mouse, and it works 
fine. It's my custom HID device that breaks it. Joystick also seemed to 
work fine. I wonder if it's somehow related to devices which only show 
up as hidraw devices (ie: they don't also show up as /dev/js or /dev/mouse.)

I re-configured my programmable USB device (PIC32 USB Starter Board) to 
be a mouse, and it works fine. When I configure it to be my custom HID 

Attached.


From: Heinz Diehl
Date: Thursday, August 12, 2010 - 9:37 am

Of course, here's mine. The problem persists in -git10.


#
# Automatically generated make config: don't edit
# Linux kernel version: 2.6.35-git9
# Tue Aug 10 19:58:33 2010
#
CONFIG_64BIT=y
# CONFIG_X86_32 is not set
CONFIG_X86_64=y
CONFIG_X86=y
CONFIG_INSTRUCTION_DECODER=y
CONFIG_OUTPUT_FORMAT="elf64-x86-64"
CONFIG_ARCH_DEFCONFIG="arch/x86/configs/x86_64_defconfig"
CONFIG_GENERIC_CMOS_UPDATE=y
CONFIG_CLOCKSOURCE_WATCHDOG=y
CONFIG_GENERIC_CLOCKEVENTS=y
CONFIG_GENERIC_CLOCKEVENTS_BROADCAST=y
CONFIG_LOCKDEP_SUPPORT=y
CONFIG_STACKTRACE_SUPPORT=y
CONFIG_HAVE_LATENCYTOP_SUPPORT=y
CONFIG_MMU=y
CONFIG_ZONE_DMA=y
CONFIG_NEED_DMA_MAP_STATE=y
CONFIG_NEED_SG_DMA_LENGTH=y
CONFIG_GENERIC_ISA_DMA=y
CONFIG_GENERIC_IOMAP=y
CONFIG_GENERIC_BUG=y
CONFIG_GENERIC_BUG_RELATIVE_POINTERS=y
CONFIG_GENERIC_HWEIGHT=y
CONFIG_ARCH_MAY_HAVE_PC_FDC=y
# CONFIG_RWSEM_GENERIC_SPINLOCK is not set
CONFIG_RWSEM_XCHGADD_ALGORITHM=y
CONFIG_ARCH_HAS_CPU_IDLE_WAIT=y
CONFIG_GENERIC_CALIBRATE_DELAY=y
CONFIG_GENERIC_TIME_VSYSCALL=y
CONFIG_ARCH_HAS_CPU_RELAX=y
CONFIG_ARCH_HAS_DEFAULT_IDLE=y
CONFIG_ARCH_HAS_CACHE_LINE_SIZE=y
CONFIG_HAVE_SETUP_PER_CPU_AREA=y
CONFIG_NEED_PER_CPU_EMBED_FIRST_CHUNK=y
CONFIG_NEED_PER_CPU_PAGE_FIRST_CHUNK=y
CONFIG_HAVE_CPUMASK_OF_CPU_MAP=y
CONFIG_ARCH_HIBERNATION_POSSIBLE=y
CONFIG_ARCH_SUSPEND_POSSIBLE=y
CONFIG_ZONE_DMA32=y
CONFIG_ARCH_POPULATES_NODE_MAP=y
CONFIG_AUDIT_ARCH=y
CONFIG_ARCH_SUPPORTS_OPTIMIZED_INLINING=y
CONFIG_ARCH_SUPPORTS_DEBUG_PAGEALLOC=y
CONFIG_HAVE_EARLY_RES=y
CONFIG_GENERIC_HARDIRQS=y
CONFIG_GENERIC_HARDIRQS_NO__DO_IRQ=y
CONFIG_GENERIC_IRQ_PROBE=y
CONFIG_GENERIC_PENDING_IRQ=y
CONFIG_USE_GENERIC_SMP_HELPERS=y
CONFIG_X86_64_SMP=y
CONFIG_X86_HT=y
CONFIG_X86_TRAMPOLINE=y
CONFIG_ARCH_HWEIGHT_CFLAGS="-fcall-saved-rdi -fcall-saved-rsi -fcall-saved-rdx -fcall-saved-rcx -fcall-saved-r8 -fcall-saved-r9 -fcall-saved-r10 -fcall-saved-r11"
# CONFIG_KTIME_SCALAR is not ...
Previous thread: ARM: 2.6.3[45] PCI regression (IXP4xx and PXA?) by Krzysztof Halasa on Tuesday, August 10, 2010 - 1:36 pm. (25 messages)

Next thread: [PULL REQ] IXP4xx (ARM) for 2.6.36 by Krzysztof Halasa on Tuesday, August 10, 2010 - 2:28 pm. (1 message)