Re: 2.6.21-rc6-mm1 USB related boot hang

From: Andrew Morton
Date: Sunday, April 8, 2007 - 2:35 pm

ftp://ftp.kernel.org/pub/linux/kernel/people/akpm/patches/2.6/2.6.21-rc6/2.6.21-rc6-mm1/


- Lots of x86 updates

- This is a 25MB diff against mainline, which is rather large.



Boilerplate:

- See the `hot-fixes' directory for any important updates to this patchset.

- To fetch an -mm tree using git, use (for example)

  git-fetch git://git.kernel.org/pub/scm/linux/kernel/git/smurf/linux-trees.git tag v2.6.16-rc2-mm1
  git-checkout -b local-v2.6.16-rc2-mm1 v2.6.16-rc2-mm1

- -mm kernel commit activity can be reviewed by subscribing to the
  mm-commits mailing list.

        echo "subscribe mm-commits" | mail majordomo@vger.kernel.org

- If you hit a bug in -mm and it is not obvious which patch caused it, it is
  most valuable if you can perform a bisection search to identify which patch
  introduced the bug.  Instructions for this process are at

        http://www.zip.com.au/~akpm/linux/patches/stuff/bisecting-mm-trees.txt

  But beware that this process takes some time (around ten rebuilds and
  reboots), so consider reporting the bug first and if we cannot immediately
  identify the faulty patch, then perform the bisection search.

- When reporting bugs, please try to Cc: the relevant maintainer and mailing
  list on any email.

- When reporting bugs in this kernel via email, please also rewrite the
  email Subject: in some manner to reflect the nature of the bug.  Some
  developers filter by Subject: when looking for messages to read.

- Occasional snapshots of the -mm lineup are uploaded to
  ftp://ftp.kernel.org/pub/linux/kernel/people/akpm/mm/ and are announced on
  the mm-commits list.


Changes since 2.6.21-rc5-mm4:

 origin.patch
 git-acpi.patch
 git-alsa.patch
 git-agpgart.patch
 git-arm.patch
 git-avr32.patch
 git-cifs.patch
 git-cpufreq.patch
 git-powerpc.patch
 git-drm.patch
 git-dvb.patch
 git-gfs2-nmw.patch
 git-hid.patch
 git-ia64.patch
 git-ieee1394.patch
 git-infiniband.patch
 git-input.patch
 git-jfs.patch
 ...
From: Steve Fox
Date: Tuesday, April 10, 2007 - 3:21 pm

Since 2.6.21-rc5-mm1, one of the test.kernel.org machines (elm3b239) has
not been able to boot because it cannot find the SCSI device. You can
view http://test.kernel.org/abat/82623/debug/console.log for the latest
boot log (rc6-mm1).

I tracked this down to the git-scsi-misc patch in the -mm tree and then
bisected the scsi-misc git tree until I reached the commit below from
Mark Salyzyn:

fe76df4235986cfacc2d3b71cef7c42bc1a6dd6c

[SCSI] aacraid: Fix blocking issue with container probing function (cast update)

This is a pretty big patch, so hopefully Mark can take a look at it.
lspci shows

01:02.0 RAID bus controller: Adaptec AAC-RAID (rev 02)
0f:02.0 SCSI storage controller: Adaptec AIC-9410W SAS (Razor ASIC
non-RAID) (rev 08)
1d:02.0 SCSI storage controller: Adaptec AIC-9410W SAS (Razor ASIC
non-RAID) (rev 08)
2b:02.0 SCSI storage controller: Adaptec AIC-9410W SAS (Razor ASIC
non-RAID) (rev 08)

on 2.6.21-rc6. Let me know if I can provide more details.

-- 

Steve Fox
IBM Linux Technology Center

-

From: Salyzyn, Mark
Date: Friday, April 13, 2007 - 5:35 am

Thanks to the help from Steve Fox and Duane Cox in investigating this
issue, I'd like to report that we found the problem. The patch Steve Fox
isolated does not accommodate older adapters properly and issues a
command they do not support when retrieving storage parameters about the
arrays. This simple patch resolves the problem (and more accurately
mimics the logic of the original code before the patch).

ObligatoryDisclaimer: Please accept my condolences regarding Outlook's
handling of patches.

The attached patch is against current scsi-misc-2.6 and can apply to
2.6.21-rc6-mm1. Please consider it for expedited inclusion.

Signed-off-by: Mark Salyzyn <aacraid@adaptec.com>

---

From: Borislav Petkov
Date: Monday, April 9, 2007 - 4:13 am

fixes the issue in a slightly better way.

-- 
Regards/Gruß,
    Boris.
-

From: Rafael J. Wysocki
Date: Monday, April 9, 2007 - 9:08 am

The cpuidle thing tends to hang my x86-64 machines on boot.

Greetings,
Rafael
-

From: Pallipadi, Venkatesh
Date: Monday, April 9, 2007 - 9:14 am

Hi Rafael,

At what point during boot does it hang? Can you send me the last few
messages before the hang? A full dmesg from a boot without cpuidle
configured will help as well.

Thanks,
Venki
-

From: Rafael J. Wysocki
Date: Monday, April 9, 2007 - 10:40 am

When mounting the root filesystem.  It hangs completely, even the magic SysRq

Freeing unused kernel memory: 240k freed
Write protecting the kernel read-only data: 4356k
PM: Adding info for No Bus:vcs1
PM: Adding info for No Bus:vcsa1
ACPI: Invalid PBLK length [0]
cpuidle: driver acpi_idle failed to attach to cpu 0
cpuidle: using driver acpi_idle
ACPI: Thermal Zone [THRM] (59 C)
ACPI: Fan [FN00] (on)
Attempting manual resume
swsusp: Resume From Partition 22:3
PM: Checking swsusp image.
PM: Resume from disk failed.
kjournald starting.  Commit interval 5 seconds
EXT3 FS on hdc6, internal journal

Attached.

Greetings,
Rafael
From: Venki Pallipadi
Date: Tuesday, April 10, 2007 - 3:20 pm

Rafael: Below patch should fix the hang.
Len: Please include this patch in acpi-test.

Thanks,
Venki

Prevent hang on x86-64, when ACPI processor driver is added as a module on
a system that does not support C-states.

x86-64 expects all idle handlers to enable interrupts before returning from
the idle handler. This is due to enter_idle(), exit_idle() races. Make
cpuidle_idle_call() conform to this when there is no pm_idle_old.

Also, make cpuidle look at the return values of cpuidle_attach_driver() and
set cpuidle_curr_driver to NULL if attach fails on all CPUs.

Signed-off-by: Venkatesh Pallipadi <venkatesh.pallipadi@intel.com>

Index: linux-2.6.21-rc6-mm1/drivers/cpuidle/cpuidle.c
===================================================================
--- linux-2.6.21-rc6-mm1.orig/drivers/cpuidle/cpuidle.c
+++ linux-2.6.21-rc6-mm1/drivers/cpuidle/cpuidle.c
@@ -43,6 +43,8 @@ static void cpuidle_idle_call(void)
 	if (dev->status != CPUIDLE_STATUS_DOIDLE) {
 		if (pm_idle_old)
 			pm_idle_old();
+		else
+			local_irq_enable();
 		return;
 	}
 
Index: linux-2.6.21-rc6-mm1/drivers/cpuidle/driver.c
===================================================================
--- linux-2.6.21-rc6-mm1.orig/drivers/cpuidle/driver.c
+++ linux-2.6.21-rc6-mm1/drivers/cpuidle/driver.c
@@ -107,11 +107,20 @@ int cpuidle_switch_driver(struct cpuidle
 	cpuidle_curr_driver = drv;
 
 	if (drv) {
+		int ret = 1;
 		list_for_each_entry(dev, &cpuidle_detected_devices, device_list)
-			cpuidle_attach_driver(dev);
-		if (cpuidle_curr_governor)
+			if (cpuidle_attach_driver(dev) == 0)
+				ret = 0;
+
+		/* If attach on all devices fail, switch to NULL driver */
+		if (ret)
+			cpuidle_curr_driver = NULL;
+
+		if (cpuidle_curr_driver && cpuidle_curr_governor) {
+			printk(KERN_INFO "cpuidle: using driver %s\n",
+					drv->name);
 			cpuidle_install_idle_handler();
-		printk(KERN_INFO "cpuidle: using driver %s\n", drv->name);
+		}
 	}
 
 	return 0;
-
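
The driver.c hunk above can be modelled in plain userspace C. This is only a sketch of the decision it makes, not the kernel code. The names model_driver, model_curr_driver and model_switch_driver are made up for illustration, and the governor check from the real patch is left out.

```c
#include <stddef.h>

/* Hypothetical stand-in for the kernel's cpuidle driver type. */
struct model_driver {
	const char *name;
	/* Returns 0 when the driver attaches to the given CPU, nonzero on failure. */
	int (*attach)(int cpu);
};

/* Hypothetical stand-in for cpuidle_curr_driver. */
static const struct model_driver *model_curr_driver;

/*
 * Model of the patched logic: keep the driver only if it attached to at
 * least one CPU, otherwise fall back to having no driver at all.
 * Returns 1 if an idle handler would be installed, 0 otherwise.
 */
static int model_switch_driver(const struct model_driver *drv, int ncpus)
{
	int all_failed = 1;
	int cpu;

	model_curr_driver = drv;
	if (!drv)
		return 0;

	for (cpu = 0; cpu < ncpus; cpu++)
		if (drv->attach(cpu) == 0)
			all_failed = 0;

	/* If attach failed on every CPU, switch to the NULL driver. */
	if (all_failed)
		model_curr_driver = NULL;

	return model_curr_driver != NULL;
}

/* Two toy attach callbacks for exercising the model. */
static int attach_never(int cpu)
{
	(void)cpu;
	return -1;
}

static int attach_cpu1_only(int cpu)
{
	return cpu == 1 ? 0 : -1;
}
```

With attach_never the model ends up with no current driver, which corresponds to the "failed to attach to cpu 0" case in Rafael's log. With attach_cpu1_only the driver stays installed because one CPU accepted it.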

From: Rafael J. Wysocki
Date: Wednesday, April 11, 2007 - 12:28 pm

Yes, the box boots now, thanks.

Greetings,
Rafael
-

From: Nishanth Aravamudan
Date: Monday, April 9, 2007 - 5:50 pm

Get this Oops:

Unable to handle kernel NULL pointer dereference at 0000000000000000 RIP: 
 [<ffffffff802f9320>] hugetlbfs_set_page_dirty+0x4/0xc
PGD 414e067 PUD 4198067 PMD 0 
Oops: 0002 [1] SMP 
last sysfs file: devices/system/node/node0/cpumap
CPU 1 
Modules linked in: ipv6 hidp rfcomm l2cap bluetooth sunrpc video button battery asus_acpi ac lp parport_pc parport nvram amd_rng rng_core i2c_amd756 i2c_core
Pid: 6053, comm: readback Not tainted 2.6.21-rc6-mm1-autokern1 #1
RIP: 0010:[<ffffffff802f9320>]  [<ffffffff802f9320>] hugetlbfs_set_page_dirty+0x4/0xc
RSP: 0018:ffff810004145d90  EFLAGS: 00010282
RAX: 0000000000000000 RBX: ffff81003f1ad000 RCX: 000000000000003f
RDX: ffff810004771dc0 RSI: ffff810004145db0 RDI: ffff81003f1ad000
RBP: 8000000007800040 R08: 0000000001258020 R09: ffff81000160ad84
R10: 0000000000000282 R11: ffffffff802f931c R12: ffff8100035db7c0
R13: ffff810003675c38 R14: 00002aaaaae00000 R15: ffff810001022820
FS:  00002ac8d0bd6590(0000) GS:ffff81000160acc0(0000) knlGS:0000000000000000
CS:  0010 DS: 0000 ES: 0000 CR0: 000000008005003b
CR2: 0000000000000000 CR3: 00000000047b7000 CR4: 00000000000006e0
Process readback (pid: 6053, threadinfo ffff810004144000, task ffff81000177b140)
Stack:  ffffffff80283f95 ffff810004145d98 ffff810004145d98 ffff810000000000
 00002aaaaac00000 ffff810003675c38 00002aaaaae00000 00002aaaaac00000
 ffff8100047b68b8 00000036d5f18000 ffffffff80284060 ffff81003fc066c0
Call Trace:
 [<ffffffff80283f95>] __unmap_hugepage_range+0xcf/0x163
 [<ffffffff80284060>] unmap_hugepage_range+0x37/0x57
 [<ffffffff802761e4>] unmap_vmas+0xf6/0x744
 [<ffffffff8027a197>] exit_mmap+0x78/0xed
 [<ffffffff802313bc>] mmput+0x45/0xb7
 [<ffffffff80236636>] do_exit+0x23d/0x811
 [<ffffffff80236c86>] sys_exit_group+0x0/0xe
 [<ffffffff80209b6e>] system_call+0x7e/0x83


Code: f0 0f ba 28 04 31 c0 c3 48 89 c8 48 c7 c1 5f 9b 2f 80 48 89 
RIP  [<ffffffff802f9320>] hugetlbfs_set_page_dirty+0x4/0xc
 RSP <ffff810004145d90>
CR2: 0000000000000000
Fixing recursive fault but ...
From: William Lee Irwin III
Date: Monday, April 9, 2007 - 6:07 pm

Thanks for cleaning this up.


-- wli
-

From: Christoph Lameter
Date: Monday, April 9, 2007 - 5:56 pm

Correct. 

Acked-by: Christoph Lameter <clameter@sgi.com>

Who is off to look for more of these.
-

From: Joseph Fannin
Date: Tuesday, April 10, 2007 - 4:28 am

I'm seeing this while booting:

ima (ima_init): No TPM chip found(rc = -19), activating TPM-bypass!

=========================
[ BUG: held lock freed! ]
-------------------------
swapper/1 is freeing memory c04c7660-c04c76a3, with a lock still held there!
 (ima_queue_lock){--..}, at: [<c0202710>] ima_create_htable+0x10/0x90
1 lock held by swapper/1:
 #0:  (ima_queue_lock){--..}, at: [<c0202710>] ima_create_htable+0x10/0x90

stack backtrace:
 [<c0105959>] dump_trace+0x1d9/0x210
 [<c01059aa>] show_trace_log_lvl+0x1a/0x30
 [<c0106612>] show_trace+0x12/0x20
 [<c01066d6>] dump_stack+0x16/0x20
 [<c014fd3a>] debug_check_no_locks_freed+0x17a/0x180
 [<c014cdbf>] debug_mutex_init+0x1f/0x50
 [<c0145451>] __mutex_init+0x41/0x50
 [<c020277d>] ima_create_htable+0x7d/0x90
 [<c020286f>] ima_init+0x3f/0x270
 [<c051b765>] init_evm+0x1f5/0x250
 [<c05015d2>] kernel_init+0x132/0x320
 [<c010532f>] kernel_thread_helper+0x7/0x18
 =======================

    I saw this in -rc5-mm4 also.

    I couldn't find a contact address in MAINTAINERS, so I've CC'd the
two authors listed on top of ima_create_htable.c , as well as the
first submitter of the IMA stuff I found in my LKML archive.

    As an aside, this computer does have (some sort of) TPM chip, but
the driver is built as a module, and not loaded at this point (not a
worry for me, I don't intend to use it).

--
Joseph Fannin
jfannin@gmail.com || jhf@columbus.rr.com
From: Cornelia Huck
Date: Tuesday, April 10, 2007 - 5:24 am

On Sun, 8 Apr 2007 14:35:59 -0700,

Add the missing arch_trampoline_kprobe() for s390.

Signed-off-by: Cornelia Huck <cornelia.huck@de.ibm.com>

---
 arch/s390/kernel/kprobes.c |    7 +++++++
 1 files changed, 7 insertions(+)

--- linux-2.6.21-rc6-mm1.orig/arch/s390/kernel/kprobes.c
+++ linux-2.6.21-rc6-mm1/arch/s390/kernel/kprobes.c
@@ -662,3 +662,10 @@ int __init arch_init_kprobes(void)
 {
 	return register_kprobe(&trampoline_p);
 }
+
+int __kprobes arch_trampoline_kprobe(struct kprobe *p)
+{
+	if (p->addr == (kprobe_opcode_t *) & kretprobe_trampoline)
+		return 1;
+	return 0;
+}
-

From: Ananth N Mavinakayanahalli
Date: Tuesday, April 10, 2007 - 5:38 am

From: Adrian Bunk
Date: Tuesday, April 10, 2007 - 2:08 pm

This patch makes the needlessly global struct proc_kpagemap static.

Signed-off-by: Adrian Bunk <bunk@stusta.de>

---
--- linux-2.6.21-rc6-mm1/fs/proc/proc_misc.c.old	2007-04-10 00:52:35.000000000 +0200
+++ linux-2.6.21-rc6-mm1/fs/proc/proc_misc.c	2007-04-10 00:52:49.000000000 +0200
@@ -732,7 +732,7 @@
 	return ret;
 }
 
-struct proc_dir_entry *proc_kpagemap;
+static struct proc_dir_entry *proc_kpagemap;
 static struct file_operations proc_kpagemap_operations = {
 	.llseek = mem_lseek,
 	.read = kpagemap_read,

-

From: Matt Mackall
Date: Tuesday, April 10, 2007 - 2:09 pm

Acked-by: Matt Mackall <mpm@selenic.com>

-- 
Mathematics is the supreme nostalgia of our time.
-

From: Adrian Bunk
Date: Tuesday, April 10, 2007 - 2:08 pm

is_exported() can now become static.

Signed-off-by: Adrian Bunk <bunk@stusta.de>

---

 include/linux/module.h |    7 -------
 kernel/module.c        |    2 +-
 2 files changed, 1 insertion(+), 8 deletions(-)

--- linux-2.6.21-rc6-mm1/include/linux/module.h.old	2007-04-10 01:04:03.000000000 +0200
+++ linux-2.6.21-rc6-mm1/include/linux/module.h	2007-04-10 01:05:09.000000000 +0200
@@ -382,8 +382,6 @@
 /* Look for this name: can be of form module:name. */
 unsigned long module_kallsyms_lookup_name(const char *name);
 
-int is_exported(const char *name, const struct module *mod);
-
 extern void __module_put_and_exit(struct module *mod, long code)
 	__attribute__((noreturn));
 #define module_put_and_exit(code) __module_put_and_exit(THIS_MODULE, code);
@@ -558,11 +556,6 @@
 	return 0;
 }
 
-static inline int is_exported(const char *name, const struct module *mod)
-{
-	return 0;
-}
-
 static inline int register_module_notifier(struct notifier_block * nb)
 {
 	/* no events will happen anyway, so this can always succeed */
--- linux-2.6.21-rc6-mm1/kernel/module.c.old	2007-04-10 01:05:16.000000000 +0200
+++ linux-2.6.21-rc6-mm1/kernel/module.c	2007-04-10 01:05:36.000000000 +0200
@@ -1746,7 +1746,7 @@
 }
 
 #ifdef CONFIG_KALLSYMS
-int is_exported(const char *name, const struct module *mod)
+static int is_exported(const char *name, const struct module *mod)
 {
 	if (!mod && lookup_symbol(name, __start___ksymtab, __stop___ksymtab))
 		return 1;

-

From: Adrian Bunk
Date: Tuesday, April 10, 2007 - 2:08 pm

This patch makes the following needlessly global functions static:
- aops.c: ocfs2_write_data_page()
- dlmglue.c: ocfs2_dump_meta_lvb_info()
- file.c: ocfs2_set_inode_size()

Signed-off-by: Adrian Bunk <bunk@stusta.de>

---

 fs/ocfs2/aops.c    |    6 ++---
 fs/ocfs2/dlmglue.c |   54 ++++++++++++++++++++++++---------------------
 fs/ocfs2/dlmglue.h |    7 -----
 fs/ocfs2/file.c    |    8 +++---
 fs/ocfs2/file.h    |    5 ----
 5 files changed, 36 insertions(+), 44 deletions(-)

--- linux-2.6.21-rc6-mm1/fs/ocfs2/aops.c.old	2007-04-10 00:38:47.000000000 +0200
+++ linux-2.6.21-rc6-mm1/fs/ocfs2/aops.c	2007-04-10 00:38:55.000000000 +0200
@@ -934,9 +934,9 @@
  * Returns a negative error code or the number of bytes copied into
  * the page.
  */
-int ocfs2_write_data_page(struct inode *inode, handle_t *handle,
-			  u64 *p_blkno, struct page *page,
-			  struct ocfs2_write_ctxt *wc, int new)
+static int ocfs2_write_data_page(struct inode *inode, handle_t *handle,
+				 u64 *p_blkno, struct page *page,
+				 struct ocfs2_write_ctxt *wc, int new)
 {
 	int ret, copied = 0;
 	unsigned int from = 0, to = 0;
--- linux-2.6.21-rc6-mm1/fs/ocfs2/dlmglue.h.old	2007-04-10 00:41:39.000000000 +0200
+++ linux-2.6.21-rc6-mm1/fs/ocfs2/dlmglue.h	2007-04-10 00:47:06.000000000 +0200
@@ -119,11 +119,4 @@
 struct ocfs2_dlm_debug *ocfs2_new_dlm_debug(void);
 void ocfs2_put_dlm_debug(struct ocfs2_dlm_debug *dlm_debug);
 
-/* aids in debugging and tracking lvbs */
-void ocfs2_dump_meta_lvb_info(u64 level,
-			      const char *function,
-			      unsigned int line,
-			      struct ocfs2_lock_res *lockres);
-#define mlog_meta_lvb(__level, __lockres) ocfs2_dump_meta_lvb_info(__level, __PRETTY_FUNCTION__, __LINE__, __lockres)
-
 #endif	/* DLMGLUE_H */
--- linux-2.6.21-rc6-mm1/fs/ocfs2/dlmglue.c.old	2007-04-10 00:42:19.000000000 +0200
+++ linux-2.6.21-rc6-mm1/fs/ocfs2/dlmglue.c	2007-04-10 00:44:23.000000000 +0200
@@ -103,6 +103,35 @@
 static void ocfs2_dentry_post_unlock(struct ocfs2_super ...
From: Helge Hafting
Date: Wednesday, April 11, 2007 - 12:42 pm

2.6.21-rc6-mm1 locks up during boot.
The last message is:
usbcore: registered new interface driver hiddev

Then it hangs so hard that not even sysrq+B has any effect.

With 2.6.18-rc5-mm1, the next messages I normally get are:
usbcore: registered new interface driver usbhid
drivers/usb/input/hid-core.c: v2.6:USB HID core driver
usbcore: registered new interface driver usbserial

This is an x86-64 single-processor machine.

Helge Hafting
-

From: Andrew Morton
Date: Wednesday, April 11, 2007 - 1:43 pm

On Wed, 11 Apr 2007 21:42:27 +0200

OK.  If you add initcall_debug to the kernel boot command line, what's the
last thing we call?

-

From: Helge Hafting
Date: Wednesday, April 11, 2007 - 4:07 pm

The last messages (handwritten, somewhat shortened)
calling hid_init+0x0/0x10()
returned 0
ran for 0 msec
calling hid_init+0x0/0x50()
usbcore registered new interface driver hiddev

and then it hangs completely.

Helge Hafting
-
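
Reading an initcall_debug transcript like the one above comes down to finding the last call that was announced but never reported as returned. The sketch below does that over an array of console lines. It assumes the shortened "calling ..." / "returned ..." prefixes from Helge's handwritten notes; real kernel output is worded differently and varies between versions. The function name is made up for illustration.

```c
#include <stddef.h>
#include <string.h>

/*
 * Scan console lines in order and return the most recent "calling ..."
 * line that was not followed by a "returned ..." line, or NULL if every
 * announced call also returned. Lines with other prefixes are ignored.
 */
static const char *last_unfinished_initcall(const char *const *lines, size_t n)
{
	const char *pending = NULL;
	size_t i;

	for (i = 0; i < n; i++) {
		if (strncmp(lines[i], "calling ", 8) == 0)
			pending = lines[i];
		else if (strncmp(lines[i], "returned ", 9) == 0)
			pending = NULL;
	}
	return pending;
}
```

Fed the five lines quoted in the message above, this picks out the second hid_init call, the one that printed the hiddev registration and never returned.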

From: Andrew Morton
Date: Wednesday, April 11, 2007 - 4:25 pm

On Thu, 12 Apr 2007 01:07:00 +0200

OK, thanks.  If it happens to be, I'll bisect it down.  Chances are it
won't, and it gets merged, and we get to futz around with it for a week or
two while holding up 2.6.22.  I can only think we must enjoy doing it this way.
-

From: Jiri Kosina
Date: Thursday, April 12, 2007 - 12:50 am

Hi Helge,

2.6.21-rc6 (without any -mm patches) works fine?

Could you please

- try booting without any HID devices plugged in (i.e. usb mice, usb 
  keyboards) if the problem persists?
- recompile 2.6.21-rc6-mm1 with git-hid.patch reverted to see if it helps?

I am unfortunately not able to reproduce it here on x86_64.

Thanks,

-- 
Jiri Kosina
-

From: Helge Hafting
Date: Thursday, April 12, 2007 - 12:22 pm

Pulled the usb mouse - this moved the crash around.
usbhid was registered anyway, but later than usual.

The last messages:

md:  <...>
cpuidle: <...>
sdhci: <...>
sdhci: <...>
usbcore: registered new interface hiddev
usbcore: registered new interface usbhid
drivers/hid/usbhid/hid_core.c v2.6 USB HID coredriver
Advanced linux sound architecture <...>
ACPI: PCI Interrupt 0000:00:06.0[A]->GSI 17 (level,low)->IRQ 17

And then it hung. Rebooting into rc5mm4, I got this as the next msgs:
gameport: Trident 4DWave is pci0000:00:06.0/gameport0, speed 1966kHz
ALSA device list:
  #0: Trident TRID4DWAVENX PCI Audio at 0x9400, irq 17
oprofile: using NMI interrupt.
Netfilter messages via NETLINK v0.30.

Just downloaded it. Unfortunately, it will not revert cleanly:
$ patch -p1 -R --dry-run < ../git-hid.patch
 
patching file drivers/hid/Kconfig
patching file drivers/hid/Makefile
patching file drivers/hid/hid-core.c
Hunk #1 succeeded at 30 (offset -1 lines).
Hunk #2 succeeded at 871 (offset -1 lines).
Hunk #3 succeeded at 968 (offset -1 lines).
Hunk #4 succeeded at 984 (offset -1 lines).
patching file drivers/hid/hid-input.c
Hunk #1 succeeded at 433 (offset 2 lines).
Hunk #2 succeeded at 533 (offset 2 lines).
patching file drivers/hid/hidraw.c
patching file drivers/hid/usbhid/Kconfig
patching file drivers/hid/usbhid/Makefile
patching file drivers/hid/usbhid/hid-core.c
Unreversed patch detected!  Ignore -R? [n] 
Apply anyway? [n] 
Skipping patch.
1 out of 1 hunk ignored -- saving rejects to file drivers/hid/usbhid/hid-core.c.rej
patching file drivers/hid/usbhid/hid-ff.c
patching file drivers/hid/usbhid/hid-lgff.c
patching file drivers/hid/usbhid/hid-pidff.c
patching file drivers/hid/usbhid/hid-plff.c
patching file drivers/hid/usbhid/hid-tmff.c
patching file drivers/hid/usbhid/hid-zpff.c
patching file drivers/hid/usbhid/hiddev.c
patching file drivers/hid/usbhid/usbhid.h
patching file drivers/hid/usbhid/usbkbd.c
patching file drivers/hid/usbhid/usbmouse.c
patching file ...
From: Jiri Kosina
Date: Thursday, April 12, 2007 - 1:02 am

Do you compile with CONFIG_HIDRAW?

-- 
Jiri Kosina
-

From: Helge Hafting
Date: Thursday, April 12, 2007 - 4:42 am

No, that one is not set. 

I did use the new SLUB thing - could that possibly be the cause?
Going back to SLAB is easy enough. 

Helge Hafting
-

From: Andrew Morton
Date: Thursday, April 12, 2007 - 9:47 am

Yes, please try that.
-

From: Helge Hafting
Date: Thursday, April 12, 2007 - 11:56 am

Went back to SLAB, got a compile error. Did a make clean
and compiled again. Got some warnings:

  LD      vmlinux
  SYSMAP  System.map
  SYSMAP  .tmp_System.map
  MODPOST vmlinux
WARNING: init/built-in.o - Section mismatch: reference to .init.text:kernel_init from .text.rest_init after 'rest_init' (at offset 0xe)
WARNING: mm/built-in.o - Section mismatch: reference to .init.text: from .text.kmem_cache_create after 'kmem_cache_create' (at offset 0x40b)
WARNING: mm/built-in.o - Section mismatch: reference to .init.text: from .text.kmem_cache_create after 'kmem_cache_create' (at offset 0x568)
  AS      arch/x86_64/boot/bootsect.o
  LD      arch/x86_64/boot/bootsect
  AS      arch/x86_64/boot/setup.o
  LD      arch/x86_64/boot/setup
  AS      arch/x86_64/boot/compressed/head.o
  CC      arch/x86_64/boot/compressed/misc.o
  OBJCOPY arch/x86_64/boot/compressed/vmlinux.bin
  GZIP    arch/x86_64/boot/compressed/vmlinux.bin.gz
  LD      arch/x86_64/boot/compressed/piggy.o
  LD      arch/x86_64/boot/compressed/vmlinux
  OBJCOPY arch/x86_64/boot/vmlinux.bin
  HOSTCC  arch/x86_64/boot/tools/build
  BUILD   arch/x86_64/boot/bzImage
Root device is (8, 49)
Boot sector 512 bytes.
Setup is 7302 bytes.
System is 3075 kB
Kernel: arch/x86_64/boot/bzImage is ready  (#11)


Then I booted this - and it hung exactly the same way.

I thought SLUB was reasonably safe; it is new but not marked experimental.

Helge Hafting
-

From: Jiri Kosina
Date: Thursday, April 12, 2007 - 8:31 am

Helge,

with your .config, my machine hangs upon IPMI initialization, the last 
thing I see before total freeze is 

ipmi_si: Trying PCI-specified kcs state machine at mem address 0xd0121000, slave address 0x0, irq 5

(this was run on 32bit machine)

When I turn IPMI off, I can't reproduce your hang; everything runs
smoothly. Could you please try recompiling the kernel with IPMI disabled,
in case it is related?

Corey added to CC.

Thanks,

-- 
Jiri Kosina
-

From: Andrew Morton
Date: Thursday, April 12, 2007 - 9:55 am

Was that with ipmi linked into vmlinux?  (Please send the output of grep
IPMI .config)

I thought we fixed that.

-

From: Greg KH
Date: Thursday, April 12, 2007 - 10:25 am

I thought we fixed that too :(

Can you run with the "print out what init function is running" option
and see if it really is the ipmi driver that is dying or not?

thanks,

greg k-h
-

From: Jiri Kosina
Date: Thursday, April 12, 2007 - 10:49 am

Confirmed. 2.6.21-rc6-mm1 with

CONFIG_IPMI_SI=y

hangs upon boot on the already mentioned printk from ipmi_si. With

CONFIG_IPMI_SI=m

the boot succeeds. When manually trying to modprobe ipmi_si after that, 
the modprobe itself hangs, but the machine remains usable otherwise.

I still wonder if this could be related to what Helge was originally 
reporting.

-- 
Jiri Kosina
-

From: Greg KH
Date: Thursday, April 12, 2007 - 10:58 am

Does this same .config hang in 2.6.21-rc6 without the -mm stuff?

thanks,

greg k-h
-

From: Jiri Kosina
Date: Thursday, April 12, 2007 - 11:17 am

Actually, after approximately 6 minutes 30 seconds, the modprobe finishes
with -ENODEV and the following is printed to dmesg:

ipmi_si: There appears to be no BMC at this location
ACPI: PCI interrupt for device 0000:02:00.4 disabled
ipmi_si: Unable to find any System Interface(s)

Anyway I just checked that I get precisely the same behavior with plain 
2.6.21-rc6, so we can rule out -mm with this issue.

It's possible that this system has some broken KCS. I will try to narrow 
this down.

Anyway, the USB-related hang Helge is seeing is therefore a different 
story.

-- 
Jiri Kosina
-

From: Corey Minyard
Date: Thursday, April 12, 2007 - 2:06 pm

My guess is that this system spaces out its KCS registers, but there 
appears to be no way to specify register spacing or offsets with PCI.  
That would mean that the configuration register appears operational to 
the driver, but the data register is returning bogus data.  Thus it 
appears "sort of" working to the driver, and it takes a long time to 
time out.

I'm pretty sure it's possible to test to figure out where the registers 
are really located.  However, I have no way to test this change.  All 
the other configuration methods have a way to discover this information.

Jiri, we should probably take this offline if you want to continue to 
work on it.

Thanks,

-corey
-
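
The register-spacing guess can be illustrated with a little address arithmetic. This is only a sketch: reg_addr is a hypothetical helper, and nothing in the thread says which spacing the hardware really uses.

```c
/*
 * Address of register `index` in a block that starts at `base` when
 * consecutive registers are `spacing` bytes apart (hypothetical helper).
 */
static unsigned long reg_addr(unsigned long base, unsigned int index,
			      unsigned int spacing)
{
	return base + (unsigned long)index * spacing;
}
```

Taking the base address 0xd0121000 from the ipmi_si message quoted earlier, register 0 is at the same address whatever the spacing. Register 1 is at 0xd0121001 with a spacing of 1 but at 0xd0121004 with a spacing of 4. A driver assuming the wrong spacing would therefore reach one register correctly and read unrelated bytes for the other, which would fit an interface that looks "sort of" working.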

From: Helge Hafting
Date: Thursday, April 12, 2007 - 1:19 pm

Removed IPMI, recompiled, rebooted, crashed the same way.

Helge Hafting
-

From: Corey Minyard
Date: Thursday, April 12, 2007 - 9:01 am

Jiri, can you send me the output of "lspci -x" ?

-corey


-

From: Jiri Kosina
Date: Thursday, April 12, 2007 - 11:32 am

OK, so it hangs somewhere near usbhid's hid_init(), and
usb_register() has already been invoked. Could you please apply the
superstupid patch below and send me the output up to the point it hangs? I
am curious to know whether it hangs somewhere inside usb_register(), or
elsewhere.

Thanks.

diff --git a/drivers/hid/usbhid/hid-core.c b/drivers/hid/usbhid/hid-core.c
index 1ddca31..d930f62 100644
--- a/drivers/hid/usbhid/hid-core.c
+++ b/drivers/hid/usbhid/hid-core.c
@@ -1550,15 +1550,22 @@ static int __init hid_init(void)
 	retval = hiddev_init();
 	if (retval)
 		goto hiddev_init_fail;
+	printk(KERN_DEBUG "hid_init: before usb_register()\n");
 	retval = usb_register(&hid_driver);
+	printk(KERN_DEBUG "hid_init: after usb_register(), retuned %d\n", retval);
 	if (retval)
 		goto usb_register_fail;
 	info(DRIVER_VERSION ":" DRIVER_DESC);
 
+	printk(KERN_DEBUG "hid_init: returning 0\n");
+	dump_stack();
 	return 0;
 usb_register_fail:
+	printk(KERN_DEBUG "hid_init: calling hiddev_exit()\n");
 	hiddev_exit();
 hiddev_init_fail:
+	printk(KERN_DEBUG "hid_init: returning %d\n", retval);
+	dump_stack();
 	return retval;
 }
 
-

From: Helge Hafting
Date: Thursday, April 12, 2007 - 1:25 pm

Are you sure this is the correct patch - against 2.6.21-rc6-mm1 ?
-

From: Jiri Kosina
Date: Thursday, April 12, 2007 - 4:16 pm

Well I am pretty sure:

box:~/scratch # wget ftp://ftp.kernel.org/pub/linux/kernel/people/akpm/patches/2.6/2.6.21-rc6/2.6.21-rc6-mm... 2>&1
box:~/scratch # wget ftp://ftp.kernel.org/pub/linux/kernel/v2.6/linux-2.6.20.tar.bz2>/dev/null 2>&1
box:~/scratch # wget ftp://ftp.kernel.org/pub/linux/kernel/v2.6/testing/patch-2.6.21-rc6.bz2>/dev/null 2>&1 
box:~/scratch # tar xf linux-2.6.20.tar.bz2
box:~/scratch # cd linux-2.6.20/
box:~/scratch/linux-2.6.20 # mv ../patch-2.6.21-rc6.bz2 .
box:~/scratch/linux-2.6.20 # bunzip2 patch-2.6.21-rc6.bz2
box:~/scratch/linux-2.6.20 # patch -p1 < patch-2.6.21-rc6 >/dev/null 2>&1; echo $?
0
box:~/scratch/linux-2.6.20 # mv ../2.6.21-rc6-mm1.bz2 .
box:~/scratch/linux-2.6.20 # bunzip2 2.6.21-rc6-mm1.bz2
box:~/scratch/linux-2.6.20 # patch -p1 < 2.6.21-rc6-mm1 >/dev/null 2>&1; echo $?
0
box:~/scratch/linux-2.6.20 # cat tmp.patch
diff --git a/drivers/hid/usbhid/hid-core.c b/drivers/hid/usbhid/hid-core.c
index 1ddca31..d930f62 100644
--- a/drivers/hid/usbhid/hid-core.c
+++ b/drivers/hid/usbhid/hid-core.c
@@ -1550,15 +1550,22 @@ static int __init hid_init(void)
        retval = hiddev_init();
        if (retval)
                goto hiddev_init_fail;
+       printk(KERN_DEBUG "hid_init: before usb_register()\n");
        retval = usb_register(&hid_driver);
+       printk(KERN_DEBUG "hid_init: after usb_register(), retuned %d\n", retval);
        if (retval)
                goto usb_register_fail;
        info(DRIVER_VERSION ":" DRIVER_DESC);

+       printk(KERN_DEBUG "hid_init: returning 0\n");
+       dump_stack();
        return 0;
 usb_register_fail:
+       printk(KERN_DEBUG "hid_init: calling hiddev_exit()\n");
        hiddev_exit();
 hiddev_init_fail:
+       printk(KERN_DEBUG "hid_init: returning %d\n", retval);
+       dump_stack();
        return retval;
 }
box:~/scratch/linux-2.6.20 # patch -p1 < tmp.patch
patching file drivers/hid/usbhid/hid-core.c
box:~/scratch/linux-2.6.20 #

So I guess you are ...
From: Helge Hafting
Date: Wednesday, April 25, 2007 - 2:54 am

Jiri Kosina wrote:
I don't know about 2.6.21-rc6, but 2.6.21-rc7
 (from fresh sources) is good.  It boots up without hanging,
and my USB devices work too.

Should I test rc7-mm1 then?

Helge Hafting

-

From: Jiri Kosina
Date: Wednesday, April 25, 2007 - 4:28 am

That would also be useful.

But really, identifying the offending patch using bisection would help most.
And it should be pretty easy and not too time-consuming for you, as
the bug triggers immediately upon boot in your case.

In case you are not comfortable with "bisecting by hand" Andrew's quilt
patchset, don't forget that it is also possible to obtain the -mm tree through
git, which provides very convenient means for bisecting. This is what I
usually do.

-- 
Jiri Kosina
-

From: Helge Hafting
Date: Wednesday, April 25, 2007 - 5:45 am

If there is an offending patch at all - my rc6-mm1 kernel must
have been built from messed-up sources - we saw that when your
patch did not apply.  So my source had errors - right in the USB part.

I haven't tested a correct rc6-mm1, so I don't even know if it
Indeed - it is easy to spot. :-)

Helge Hafting
-

From: Helge Hafting
Date: Thursday, April 26, 2007 - 11:38 am

I recompiled 2.6.21-rc6-mm1 from fresh sources.
It still hangs initializing USB, but this time your
patch applied.

I rebooted with your patch, and got:

Detailed lists of all the USB devices found
(printer,mouse,...)
Then usbcore registered various drivers, such as
usblp, usb-storage, libusual, usbserial, ipaq
These messages were intermixed with messages from
the md raid system initializing

The three last lines were:
sdhci: Secure digital host controller interface driver
sdhci: copyright Pierre Ossman
usbcore: registered new interface driver hiddev

And then the machine hung completely.  I'll have
a look at bisecting. :-(
-

From: Helge Hafting
Date: Thursday, April 26, 2007 - 3:28 pm

2.6.21-rc6 boots up fine.  Both rc6 and rc7 have a different problem - the
machine tends to hang after some minutes of work in X.  That hang is
unusual in that moving the mouse still moves the X cursor, but
everything else stops and sysrq fails me. But that is another story.

rc6 boots, rc6-mm1 hangs at the "usbcore registered hiddev" message.
Bisection:
1, 2, 3: the first three hang at "usbcore registered hiddev"
4, 5, 6: the next three hang at a message about ACPI  PCI[A]->IRQ17
I decided to keep marking these hanging kernels as "bad"; I don't really know
if this could be the same thing or completely different issues.  If they are
different, then one problem will mask the other anyway, so
calling every hanging kernel "bad" will at least find the first broken
patch.
7: boots up ok!
8,9,10: hang at the above-mentioned ACPI message
The (first) "hanging" patch in 2.6.21-rc6-mm1 is: git-acpi.patch
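
The search described above can be sketched as a binary search over the
ordered patch series. This is only an illustration: first_bad_patch(), the
is_bad callback and bad_from_threshold() are hypothetical names, and the
callback stands in for "apply the first N patches, build, boot, and report
whether the kernel hangs". It assumes the tree with no -mm patches boots, the
full series hangs, and every hang is treated as "bad", as in the report.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical callback: nonzero if a kernel built with the first
 * `applied` patches of the series hangs (any hang counts as "bad"). */
typedef int (*is_bad_fn)(size_t applied, void *ctx);

/*
 * Returns the 1-based index of the first patch whose addition makes the
 * tree bad. Assumes 0 patches applied is good and all n applied is bad.
 * If `steps` is not NULL, it receives the number of builds that were tested.
 */
size_t first_bad_patch(size_t n, is_bad_fn is_bad, void *ctx, unsigned *steps)
{
	size_t good = 0;	/* largest prefix known to boot */
	size_t bad = n;		/* smallest prefix known to hang */
	unsigned tested = 0;

	assert(n > 0 && is_bad != NULL);

	while (bad - good > 1) {
		size_t mid = good + (bad - good) / 2;

		if (is_bad(mid, ctx))
			bad = mid;
		else
			good = mid;
		tested++;
	}
	if (steps)
		*steps = tested;
	return bad;
}

/* Stand-in for a real build-and-boot test: every prefix that contains
 * patch number *ctx (or more) is reported as hanging. */
int bad_from_threshold(size_t applied, void *ctx)
{
	return applied >= *(const size_t *)ctx;
}
```

Ten good/bad answers are enough to narrow down a series of up to 1024
patches, since each answer halves the remaining range.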

Helge Hafting
-

From: Jiri Kosina
Date: Thursday, April 26, 2007 - 3:39 pm

Hi Helge,

thanks for the effort. If you take stock rc6-mm1 and revert just 
git-acpi.patch, does the machine behave correctly?

-- 
Jiri Kosina
-

From: Andrew Morton
Date: Thursday, April 26, 2007 - 4:13 pm

It would be easier and would produce a clearer result to test just

	2.6.21-rc7
+	2.6.21-rc7-mm2's origin.patch
+	2.6.21-rc7-mm2's acpi.patch

from
ftp://ftp.kernel.org/pub/linux/kernel/people/akpm/patches/2.6/2.6.21-rc7/2.6.21-rc7-mm...
-

From: Helge Hafting
Date: Friday, April 27, 2007 - 2:04 pm

Just compiled & booted such a kernel - it came up fine!
So it looks like USB is fine then, and the problem is in
that ACPI patch.

Helge Hafting
-

From: Andrew Morton
Date: Friday, April 27, 2007 - 3:41 pm

On Fri, 27 Apr 2007 23:04:58 +0200

OK, thanks.  Len&co: we've established that 2.6.21-rc6-mm1's git-acpi.patch
-

From: Mattia Dongili
Date: Friday, April 13, 2007 - 4:45 pm

On Sun, Apr 08, 2007 at 02:35:59PM -0700, Andrew Morton wrote:

after bisecting I can finally say what breaks resume from STR here:

tadaaaaa: CPU_IDLE.
I first spotted the git-acpi.patch then reapplied it and disabled
CPU_IDLE, now my laptop resumes.

Any useful information I should add?

$ cat /sys/devices/system/cpu/cpuidle/*
acpi_idle 
no governors
acpi_idle
no governor

$ cat /proc/cpuinfo
processor	: 0
vendor_id	: GenuineIntel
cpu family	: 6
model		: 15
model name	: Intel(R) Core(TM)2 CPU         T5600  @ 1.83GHz
stepping	: 6
cpu MHz		: 1000.000
cache size	: 2048 KB
physical id	: 0
siblings	: 2
core id		: 0
cpu cores	: 2
fdiv_bug	: no
hlt_bug		: no
f00f_bug	: no
coma_bug	: no
fpu		: yes
fpu_exception	: yes
cpuid level	: 10
wp		: yes
flags		: fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush dts acpi mmx fxsr sse sse2 ss ht tm pbe nx lm constant_tsc pni monitor ds_cpl vmx est tm2 ssse3 cx16 xtpr lahf_lm
bogomips	: 3671.24
clflush size	: 64

processor	: 1
vendor_id	: GenuineIntel
cpu family	: 6
model		: 15
model name	: Intel(R) Core(TM)2 CPU         T5600  @ 1.83GHz
stepping	: 6
cpu MHz		: 1000.000
cache size	: 2048 KB
physical id	: 0
siblings	: 2
core id		: 1
cpu cores	: 2
fdiv_bug	: no
hlt_bug		: no
f00f_bug	: no
coma_bug	: no
fpu		: yes
fpu_exception	: yes
cpuid level	: 10
wp		: yes
flags		: fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush dts acpi mmx fxsr sse sse2 ss ht tm pbe nx lm constant_tsc pni monitor ds_cpl vmx est tm2 ssse3 cx16 xtpr lahf_lm
bogomips	: 15805.85
clflush size	: 64

-- 
-

From: Shaohua Li
Date: Sunday, April 15, 2007 - 7:40 pm

please check if the patch at
http://marc.info/?l=linux-acpi&m=117523651630038&w=2 fixed the issue

Thanks,
Shaohua
-

From: Joshua Wise
Date: Monday, April 16, 2007 - 7:50 pm

I have the same system as Mattia, and when I applied this patch and turned
CPU_IDLE back on, I got a panic on boot. Unfortunately, the EIP scrolled off
screen, so I can't get a line number.

(I had the same STR breakage as him; STR did not work with CPU_IDLE turned
on, and it did work with CPU_IDLE turned off.)

I'm running +rc6+mm(April 11) on a Sony VAIO SZ.

joshua
-

From: Shaohua Li
Date: Monday, April 16, 2007 - 7:50 pm

Is it possible for you to get the log from a serial console? I thought you
could at least see some log info on the screen; if you don't have a serial
console, please write it down. The boot panic surprises me, as it works here.

Thanks,
Shaohua
-

From: Shaohua Li
Date: Monday, April 16, 2007 - 11:47 pm

Looks like there is an init order issue with the sysfs files. The new
refreshed patch should fix your bug.

Signed-off-by: Shaohua Li <shaohua.li@intel.com>

Index: 21-rc6-mm1/drivers/acpi/processor_idle.c
===================================================================
--- 21-rc6-mm1.orig/drivers/acpi/processor_idle.c	2007-04-17 13:41:29.000000000 +0800
+++ 21-rc6-mm1/drivers/acpi/processor_idle.c	2007-04-17 14:03:56.000000000 +0800
@@ -624,7 +624,7 @@ int acpi_processor_cst_has_changed(struc
 		return -ENODEV;
 
 	acpi_processor_get_power_info(pr);
-	return cpuidle_force_redetect(&per_cpu(cpuidle_devices, pr->id));
+	return cpuidle_force_redetect(per_cpu(cpuidle_devices, pr->id));
 }
 
 /* proc interface */
Index: 21-rc6-mm1/drivers/cpuidle/cpuidle.c
===================================================================
--- 21-rc6-mm1.orig/drivers/cpuidle/cpuidle.c	2007-04-17 13:41:29.000000000 +0800
+++ 21-rc6-mm1/drivers/cpuidle/cpuidle.c	2007-04-17 14:42:17.000000000 +0800
@@ -18,7 +18,7 @@
 
 #include "cpuidle.h"
 
-DEFINE_PER_CPU(struct cpuidle_device, cpuidle_devices);
+DEFINE_PER_CPU(struct cpuidle_device *, cpuidle_devices);
 EXPORT_PER_CPU_SYMBOL_GPL(cpuidle_devices);
 
 DEFINE_MUTEX(cpuidle_lock);
@@ -34,13 +34,13 @@ static void (*pm_idle_old)(void);
  */
 static void cpuidle_idle_call(void)
 {
-	struct cpuidle_device *dev = &__get_cpu_var(cpuidle_devices);
+	struct cpuidle_device *dev = __get_cpu_var(cpuidle_devices);
 
 	struct cpuidle_state *target_state;
 	int next_state;
 
 	/* check if the device is ready */
-	if (dev->status != CPUIDLE_STATUS_DOIDLE) {
+	if (!dev || dev->status != CPUIDLE_STATUS_DOIDLE) {
 		if (pm_idle_old)
 			pm_idle_old();
 		return;
@@ -117,19 +117,32 @@ static int cpuidle_add_device(struct sys
 	int cpu = sys_dev->id;
 	struct cpuidle_device *dev;
 
-	dev = &per_cpu(cpuidle_devices, cpu);
+	dev = per_cpu(cpuidle_devices, cpu);
 
-	dev->cpu = cpu;
 	mutex_lock(&cpuidle_lock);
 	if (cpu_is_offline(cpu)) {
 ...
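
The part of the patch visible above turns the per-CPU device from an embedded
struct into a pointer and makes the idle path check that pointer for NULL
before using it. A minimal userspace model of that check, not kernel code:
every model_* name is hypothetical, and a plain array stands in for the
kernel's per-CPU variable.

```c
#include <assert.h>
#include <stddef.h>

#define MODEL_NR_CPUS 4

enum model_status { MODEL_STATUS_DETECTED, MODEL_STATUS_DOIDLE };
enum model_idle_path { MODEL_IDLE_FALLBACK, MODEL_IDLE_CPUIDLE };

struct model_idle_device {
	int cpu;
	enum model_status status;
};

/* One slot per CPU; a slot stays NULL until a device is registered. */
static struct model_idle_device *model_devices[MODEL_NR_CPUS];

void model_register(struct model_idle_device *dev)
{
	assert(dev != NULL && dev->cpu >= 0 && dev->cpu < MODEL_NR_CPUS);
	model_devices[dev->cpu] = dev;
}

/*
 * Mirrors the shape of the patched check: fall back to the old idle
 * routine when no device exists yet or when it is not ready.
 */
enum model_idle_path model_idle_call(int cpu)
{
	struct model_idle_device *dev;

	assert(cpu >= 0 && cpu < MODEL_NR_CPUS);
	dev = model_devices[cpu];

	if (!dev || dev->status != MODEL_STATUS_DOIDLE)
		return MODEL_IDLE_FALLBACK;
	return MODEL_IDLE_CPUIDLE;
}
```

With a pointer per CPU, an idle call that runs before registration sees NULL
and takes the fallback path instead of reading an uninitialized device.
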
-

From: Joshua Wise
Date: Wednesday, April 18, 2007 - 4:00 pm

Yes, that did fix the hang on resume from STR -- that now works fine.

However:
joshua@rebirth:/sys/devices/system/cpu/cpuidle$ cat available_drivers current_driver

<NULL>
joshua@rebirth:/sys/devices/system/cpu/cpuidle$ cat available_governors current_governor
ladder
ladder

Is this correct? For reference, my config is http://joshuawise.com/config.gz
-- I didn't see any options for cpuidle drivers to access ACPI states...

joshua
-

From: Torsten Kaiser
Date: Monday, April 9, 2007 - 12:03 pm

drivers/ieee1394/ieee1394_transactions.c fails for me if CONFIG_SMP=n

gcc complains:
  CC      drivers/ieee1394/ieee1394_transactions.o
drivers/ieee1394/ieee1394_transactions.c: In function 'hpsb_get_tlabel':
drivers/ieee1394/ieee1394_transactions.c:183: error:
'TASK_INTERRUPTIBLE' undeclared (first use in this function)
drivers/ieee1394/ieee1394_transactions.c:183: error: (Each undeclared
identifier is reported only once
drivers/ieee1394/ieee1394_transactions.c:183: error: for each function
it appears in.)
drivers/ieee1394/ieee1394_transactions.c:183: warning: implicit
declaration of function 'signal_pending'
drivers/ieee1394/ieee1394_transactions.c:183: warning: implicit
declaration of function 'schedule'
drivers/ieee1394/ieee1394_transactions.c: In function 'hpsb_free_tlabel':
drivers/ieee1394/ieee1394_transactions.c:213: error:
'TASK_INTERRUPTIBLE' undeclared (first use in this function)
make[2]: *** [drivers/ieee1394/ieee1394_transactions.o] Error 1
make[1]: *** [drivers/ieee1394] Error 2
make: *** [drivers] Error 2


I fixed this by adding #include <linux/sched.h> before #include <linux/wait.h>.
That is probably not the correct fix, but it gives me a working kernel.

Diff between a working .config and a failing one:
(created by switching SMP off with menuconfig)
--- config.works        2007-04-09 20:54:30.182374075 +0200
+++ .config     2007-04-09 20:54:47.317863059 +0200
@@ -3,3 +3,3 @@
 # Linux kernel version: 2.6.21-rc6-mm1
-# Mon Apr  9 16:01:11 2007
+# Mon Apr  9 20:54:47 2007
 #
@@ -36,3 +36,3 @@
 CONFIG_EXPERIMENTAL=y
-CONFIG_LOCK_KERNEL=y
+CONFIG_BROKEN_ON_SMP=y
 CONFIG_INIT_ENV_ARG_LIMIT=32
@@ -57,3 +57,2 @@
 CONFIG_IKCONFIG_PROC=y
-CONFIG_CPUSETS=y
 # CONFIG_SYSFS_DEPRECATED is not set
@@ -104,3 +103,2 @@
 CONFIG_KMOD=y
-CONFIG_STOP_MACHINE=y

@@ -151,5 +149,3 @@
 CONFIG_MTRR=y
-CONFIG_SMP=y
-# CONFIG_SCHED_SMT is not set
-CONFIG_SCHED_MC=y
+# CONFIG_SMP is not set
 CONFIG_PREEMPT_NONE=y
@@ -157,21 +153,12 @@
 # CONFIG_PREEMPT is not ...
-

From: Stefan Richter
Date: Monday, April 9, 2007 - 2:42 pm

Thanks, I'll add this to linux1394-2.6.git (which exposed the problem)
ASAP.  On the other hand, the culprit is actually include/linux/wait.h
which IMO should include the headers it needs for itself.
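
One way a header can depend on another without including it is through
macros: a macro body is only compiled where the macro is expanded, so the
names it uses must be visible at the point of use, not at the point of
definition. The single-file sketch below models that mechanism. It assumes
this is what happens with wait.h here, which the compiler output suggests but
does not prove; all model_* and MODEL_* names are hypothetical.

```c
#include <assert.h>

/* --- stand-in for a "wait.h"-like header ---
 * The macro body names MODEL_TASK_INTERRUPTIBLE and
 * model_signal_pending(), but nothing in this part declares them.
 * That is legal as long as the macro is never expanded. */
#define model_wait_interrupted(state) \
	((state) == MODEL_TASK_INTERRUPTIBLE && model_signal_pending())

/* --- stand-in for a "sched.h"-like header --- */
#define MODEL_TASK_INTERRUPTIBLE 1
static int model_pending;
static int model_signal_pending(void)
{
	return model_pending;
}

/* --- user code ---
 * The expansion below compiles only because the "sched.h" part is
 * visible here. Remove that part and the compiler reports the macro's
 * names as undeclared at this line, not inside the "wait.h" part. */
int model_check(int state, int pending)
{
	assert(pending == 0 || pending == 1);
	model_pending = pending;
	return model_wait_interrupted(state);
}
```

A header that includes what its own macros need does not depend on the
include order chosen by each user.
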
-- 
Stefan Richter
-=====-=-=== -=-- -=--=
http://arcgraph.de/sr/
-

From: Stefan Richter
Date: Monday, April 9, 2007 - 3:01 pm

And while I am at it:


From: Stefan Richter <stefanr@s5r6.in-berlin.de>
Subject: ieee1394: some more includes

Signed-off-by: Stefan Richter <stefanr@s5r6.in-berlin.de>
---
 drivers/ieee1394/ieee1394_transactions.c |    3 +++
 1 file changed, 3 insertions(+)

Index: linux/drivers/ieee1394/ieee1394_transactions.c
===================================================================
--- linux.orig/drivers/ieee1394/ieee1394_transactions.c
+++ linux/drivers/ieee1394/ieee1394_transactions.c
@@ -10,13 +10,16 @@
  */
 
 #include <linux/bitops.h>
+#include <linux/compiler.h>
 #include <linux/hardirq.h>
 #include <linux/spinlock.h>
+#include <linux/string.h>
 #include <linux/sched.h>  /* because linux/wait.h is broken if CONFIG_SMP=n */
 #include <linux/wait.h>
 
 #include <asm/bug.h>
 #include <asm/errno.h>
+#include <asm/system.h>
 
 #include "ieee1394.h"
 #include "ieee1394_types.h"


-- 
Stefan Richter
-=====-=-=== -=-- -=--=
http://arcgraph.de/sr/

-

From: J.A.
Date: Tuesday, April 24, 2007 - 1:10 am

Has something related to PTYs changed in this kernel?
I have to enable legacy PTY handling on a couple of boxes to get ssh working.
Otherwise I got openpty() errors, and neither sshd nor virtual terminals (aterm)
were able to get a terminal.

User space (udev) is the same on all three boxes; one works and two fail.
I had /dev/ptmx everywhere and /dev/pts mounted.

Any idea?
TIA

--
J.A. Magallon <jamagallon()ono!com>     \               Software is like sex:
                                         \         It's better when it's free
Mandriva Linux release 2008.0 (Cooker) for i586
Linux 2.6.20-jam10 (gcc 4.1.2 20070302 (prerelease) (4.1.2-1mdv2007.1)) #1 SMP PREEMPT
-

From: Andrew Morton
Date: Tuesday, April 24, 2007 - 4:58 am

Not as far as I know, but there were some kobject_uevent changes which

I have CONFIG_PM_LEGACY unset in at least one of my test configs and it

Nope.  Can you please check 2.6.21-rc7-mm1, see if that fixed it?  If so,
it might have been the kobject_uevent thing.

-

From: J.A.
Date: Tuesday, April 24, 2007 - 6:43 am

I will, thanks.

A couple of questions (since udev behaviour is so distro-dependent):
- What should I have in /dev if I don't use legacy ptys? As I understand
  it, only /dev/ptmx and /dev/pts/*, no /dev/tty* nor /dev/pty*?
- If my setup, for whatever strange reason, has /dev/tty* stored anywhere
  (/dev/.udev, links.conf...) and they get created, I suppose that opening
  /dev/tty will give an ENODEV?

TIA

--
J.A. Magallon <jamagallon()ono!com>     \               Software is like sex:
                                         \         It's better when it's free
Mandriva Linux release 2008.0 (Cooker) for i586
Linux 2.6.20-jam10 (gcc 4.1.2 20070302 (prerelease) (4.1.2-1mdv2007.1)) #4 SMP PREEMPT
-

From: Andrew Morton
Date: Tuesday, April 24, 2007 - 10:22 am

My FC5 CONFIG_LEGACY_PTYS=n box has no /dev/ptmx, /dev/pts/*, all of
/dev/tty0 through /dev/tty63 and no /dev/pty*.

I'm not sure where all the /dev/tty*'s came from - perhaps a static udev

well, /dev/tty is attached to your current tty and /dev/tty2 will get you
talking to the second VT.  I can't immediately think what /dev/tty22 is
attached to.

-

From: J.A.
Date: Wednesday, April 25, 2007 - 1:50 pm

Linux has traditionally used the BSD-like names /dev/ptyxx for
masters and /dev/ttyxx for slaves of pseudo terminals. This scheme
has a number of problems. The GNU C library glibc 2.1 and later,
however, supports the Unix98 naming standard: in order to acquire a
pseudo terminal, a process opens /dev/ptmx; the number of the pseudo
terminal is then made available to the process and the pseudo
terminal slave can be accessed as /dev/pts/<number>. What was
traditionally /dev/ttyp2 will then be /dev/pts/2, for example.
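
The Unix98 sequence described above can be sketched in C as follows. This is
a minimal, Linux-specific illustration: open_unix98_pty() is a hypothetical
helper name, and it uses the TIOCGPTN and TIOCSPTLCK ioctls directly to show
where the slave number comes from. Portable programs would normally call
posix_openpt(), grantpt(), unlockpt() and ptsname() instead.

```c
#include <assert.h>
#include <fcntl.h>
#include <stdio.h>
#include <sys/ioctl.h>
#include <unistd.h>

/*
 * Opens a Unix98 pty master and writes the matching slave path
 * ("/dev/pts/<number>") into slave_path. Returns the master file
 * descriptor, or -1 on failure (for example if /dev/ptmx is missing).
 */
int open_unix98_pty(char *slave_path, size_t len)
{
	unsigned int number;
	int unlock = 0;
	int master;
	int n;

	assert(slave_path != NULL && len > 0);

	master = open("/dev/ptmx", O_RDWR | O_NOCTTY);
	if (master < 0)
		return -1;

	/* Ask the kernel which pseudo terminal number this master got. */
	if (ioctl(master, TIOCGPTN, &number) < 0)
		goto fail;

	/* Clear the lock so that the slave side may be opened. */
	if (ioctl(master, TIOCSPTLCK, &unlock) < 0)
		goto fail;

	n = snprintf(slave_path, len, "/dev/pts/%u", number);
	if (n < 0 || (size_t)n >= len)
		goto fail;

	return master;

fail:
	close(master);
	return -1;
}
```

The slave path is only usable if a devpts filesystem is actually mounted and
visible on /dev/pts; the helper does not check that.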

So if all userspace is Unix98-aware, you would be fine with just
/dev/ptmx and /dev/pts/*. In your setup it looks like you are not able
to use Unix98 PTYs, but as udev has created tty*, things work.

I supposed it was something like this: you always open /dev/tty, but kernel+glibc
redirect you to /dev/ttyXX, which is your _real_ terminal.
I will try to check the docs...

--
J.A. Magallon <jamagallon()ono!com>     \               Software is like sex:
                                         \         It's better when it's free
Mandriva Linux release 2008.0 (Cooker) for i586
Linux 2.6.20-jam10 (gcc 4.1.2 20070302 (prerelease) (4.1.2-1mdv2007.1)) #4 SMP PREEMPT
-

From: J.A.
Date: Wednesday, April 25, 2007 - 2:39 pm

Oops, no, /dev/tty?? are for virtual consoles.

But I think I found the problem.
In short, /dev/pts is mounted before /dev. I remounted it and ssh worked
fine again.
I'll dig through Mandriva's rc scripts to check this...

Anyway, I see no plain 'mount' command in /sbin/start_udev; all are
'mount --move' commands. So I think it assumes /dev/pts is already mounted and
tries to move it.
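
A rough way to spot this kind of ordering problem is to walk the mount table
and see whether a mount on /dev appears after the mount on /dev/pts. The
sketch below does that with getmntent(). mounted_before_parent() is a
hypothetical helper, and it assumes the table lists mounts in the order they
were made, which may not hold after 'mount --move'.

```c
#include <assert.h>
#include <mntent.h>
#include <stdio.h>
#include <string.h>

/*
 * Scans a mount table (fstab/mtab format) from top to bottom.
 * Returns 1 if a mount on `child` is followed by a later mount on
 * `parent`, which would hide the child mount; returns 0 otherwise.
 */
int mounted_before_parent(FILE *table, const char *parent, const char *child)
{
	struct mntent *ent;
	int child_seen = 0;
	int hidden = 0;

	assert(table != NULL && parent != NULL && child != NULL);

	while ((ent = getmntent(table)) != NULL) {
		if (strcmp(ent->mnt_dir, child) == 0) {
			/* A (re)mount of the child on top is visible again. */
			child_seen = 1;
			hidden = 0;
		} else if (strcmp(ent->mnt_dir, parent) == 0 && child_seen) {
			/* Parent mounted after the child: child is covered. */
			hidden = 1;
		}
	}
	return hidden;
}
```

On a running system the table would typically come from
setmntent("/proc/self/mounts", "r"), closed afterwards with endmntent(), with
"/dev" as parent and "/dev/pts" as child.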

--
J.A. Magallon <jamagallon()ono!com>     \               Software is like sex:
                                         \         It's better when it's free
Mandriva Linux release 2008.0 (Cooker) for i586
Linux 2.6.20-jam10 (gcc 4.1.2 20070302 (prerelease) (4.1.2-1mdv2007.1)) #4 SMP PREEMPT
-

From: J.A.
Date: Wednesday, April 25, 2007 - 3:26 pm

As (in)famous last words, I think Unix98 PTYs really don't like mount --move
for /dev/pts. If I mount it manually after boot, everything works fine.

--
J.A. Magallon <jamagallon()ono!com>     \               Software is like sex:
                                         \         It's better when it's free
Mandriva Linux release 2008.0 (Cooker) for i586
Linux 2.6.20-jam10 (gcc 4.1.2 20070302 (prerelease) (4.1.2-1mdv2007.1)) #4 SMP PREEMPT
-
