Re: Problems with -git14

Previous thread: Hack to fix not working spindown over Firewire by Tino Keitel on Tuesday, April 29, 2008 - 4:26 pm. (21 messages)

Next thread: [2.6 patch] fix arch/frv/mm/unaligned.o build error by Adrian Bunk on Tuesday, April 29, 2008 - 4:57 pm. (4 messages)
From: J.A.
Date: Tuesday, April 29, 2008 - 4:56 pm

Hi...

I have a couple problems with latest git (-14):

- It only recognises 2 processors out of 4 (dual Xeon HT)
- It oopses on the swapper process just on boot...

Difference in dmesg is below. If full correct dmesg or config is
needed, please ask for them. The kernel was built copying old
2.6.25 config to .config && make oldconfig. I filled the missing
gaps like PAT and others...

--- dm.txt	2008-04-30 01:47:55.000000000 +0200
+++ dm-git14.txt	2008-04-30 01:47:29.000000000 +0200
@@ -1,4 +1,4 @@
-Linux version 2.6.25-jam01 (root@werewolf) (gcc version 4.2.3 (4.2.3-6mnb1)) #6 SMP PREEMPT Mon Apr 21 20:28:14 CEST 2008
+Linux version 2.6.25-jam02 (root@werewolf) (gcc version 4.2.3 (4.2.3-6mnb1)) #1 SMP PREEMPT Wed Apr 30 00:49:19 CEST 2008
 BIOS-provided physical RAM map:
  BIOS-e820: 0000000000000000 - 000000000009f400 (usable)
  BIOS-e820: 000000000009f400 - 00000000000a0000 (reserved)
@@ -8,11 +8,9 @@
  BIOS-e820: 000000007fee3000 - 000000007fef0000 (ACPI data)
  BIOS-e820: 000000007fef0000 - 000000007ff00000 (reserved)
  BIOS-e820: 00000000fec00000 - 0000000100000000 (reserved)
+x86 PAT enabled: cpu 0, old 0x7040600070406, new 0x7010600070106
 1150MB HIGHMEM available.
 896MB LOWMEM available.
-Scan SMP from c0000000 for 1024 bytes.
-Scan SMP from c009fc00 for 1024 bytes.
-Scan SMP from c00f0000 for 65536 bytes.
 found SMP MP-table at [c00f57c0] 000f57c0
 Entering add_active_range(0, 0, 524000) 0 entries of 256 used
 Zone PFN ranges:
@@ -42,13 +40,13 @@
 ACPI: PM-Timer IO Port: 0x408
 ACPI: Local APIC address 0xfee00000
 ACPI: LAPIC (acpi_id[0x00] lapic_id[0x00] enabled)
-Processor #0 15:2 APIC version 20
+BIOS bug, APIC version is 0 for CPU#0! fixing up to 0x10. (tell your hw vendor)
 ACPI: LAPIC (acpi_id[0x01] lapic_id[0x06] enabled)
-Processor #6 15:2 APIC version 20
-ACPI: LAPIC (acpi_id[0x02] lapic_id[0x07] enabled)
-Processor #7 15:2 APIC version 20
-ACPI: LAPIC (acpi_id[0x03] lapic_id[0x01] enabled)
-Processor #1 15:2 APIC version 20
+BIOS bug, APIC ...
From: Hugh Dickins
Date: Wednesday, April 30, 2008 - 8:17 am

Yes, I've been getting this for some days too: only 2 processors on
dual Xeon HT in 32-bit mode; whereas x86_64 finds all 4 just fine.
Ran lots of testing on 2.6.25-mm1 before I noticed it there.

I bisected for a while, but it got confusing (arrived at a bisect
point which gave only 1 processor: smpboot code getting rearranged),
so I was forced (quel horreur!) to investigate properly.  I've just
now had success with the patch below, please give it a try:

I presume this warning and backtrace is what you report as an oops:
I think you'll find Linus included a fix for this one overnight, and
it should have gone away in 2.6.25-git15 (but I didn't see it myself).

Hugh

[PATCH] x86_32: fix HT cpu booting

Since recent smpboot 32/64-bit merge, my dual Xeon with HT has been
booting only 2 of its 4 cpus (when running an i386 kernel; but x86_64
is okay).  J.A. Magallón reports the same.

native_cpu_up: bad cpu 2
native_cpu_up: bad cpu 3

The mach-default cpu_present_to_apicid() was just returning cpu number
(2, 3) instead of apicid (6, 7): looks like we now need the x86_64 code
even for the i386 case.

Comparing with other versions of cpu_present_to_apicid(), it seems a
good idea to include an NR_CPUS test too, since cpu_present() doesn't
include that; but that wasn't a problem here, and may be no problem at all.
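
A minimal userspace sketch of the mapping problem described above, not the actual patch (which is truncated in this archive). The table name cpu_to_apicid_table, its contents, the present[] array, and both function names are hypothetical; the values are chosen only so that cpus 2 and 3 map to apicids 6 and 7 as in the description.

```c
#include <assert.h>
#include <stdbool.h>

#define NR_CPUS    4      /* assumed build-time limit on logical cpu numbers */
#define BAD_APICID 0xFF   /* sentinel for "no such cpu" in this sketch */

/* Hypothetical stand-in for the firmware-provided cpu -> apicid map.
 * The index is the logical cpu number (bounded by NR_CPUS); the stored
 * value is the apicid, which may be larger than NR_CPUS - 1. */
static const int cpu_to_apicid_table[NR_CPUS] = { 0, 1, 6, 7 };

/* Hypothetical stand-in for cpu_present(): all four cpus are present. */
static const bool present[NR_CPUS] = { true, true, true, true };

/* Behaviour described as broken: the cpu number is handed back as if it
 * were the apicid, so cpus 2 and 3 yield 2 and 3 instead of 6 and 7. */
static int old_cpu_present_to_apicid(int cpu)
{
	return cpu;
}

/* Behaviour described as wanted: bounds-check against NR_CPUS, check
 * presence, then look the apicid up instead of assuming identity. */
static int new_cpu_present_to_apicid(int cpu)
{
	if (cpu >= 0 && cpu < NR_CPUS && present[cpu])
		return cpu_to_apicid_table[cpu];
	return BAD_APICID;
}
```

With this table, old_cpu_present_to_apicid(2) yields 2 while new_cpu_present_to_apicid(2) yields 6, and an out-of-range cpu number yields BAD_APICID rather than indexing past the table.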

One point worth noting - is it a worry?  Prior to that smpboot merge,
my Xeon booted the two HT siblings on one physical first, then the
two siblings on the other physical after - when i386, but alternated
them when x86_64.  Since the merge, the x86_64 sequence is unchanged,
but the i386 sequence is now like x86_64.  I prefer this consistency,
and I prefer the new sequence: booting with maxcpus=2 then uses the
independent physicals without HT sharing; but surprises in store?

Signed-off-by: Hugh Dickins <hugh@veritas.com>

--- 2.6.25-git/include/asm-x86/mach-default/mach_apic.h	2008-04-23 07:24:16.000000000 +0100
+++ ...
From: Arjan van de Ven
Date: Tuesday, April 29, 2008 - 10:22 am

On Wed, 30 Apr 2008 16:17:46 +0100 (BST)

this is how it always was supposed to be!
At least this is how Intel specifies it to the BIOS vendors (but remember, it's the bios
that pretty much sets the cpu order; some will be weird).

Exactly for the maxcpus=2 reason... (and for systems where the cpu scheduler does 
load balancing "from 0 up" it also makes sense)
--

From: Glauber Costa
Date: Wednesday, April 30, 2008 - 8:54 am

Hugh, thanks for tracing this. The patch looks sane to me

However, since this problem was raised, I'm still concerned that visws
may have the same problem, since it uses the same logic that i386
mach_default used to (and I did not touch it, since the x86_64 code went
into mach_default).

Could anyone with access to such hardware give it a try ?
--

From: Andrey Panin
Date: Monday, May 5, 2008 - 12:13 am

Unfortunately, no. Visws port is in non-bootable state currently, 2.6.25 hangs
in calibrate_delay_direct() and I still do not know why :(

-- 
Andrey Panin		| Linux and UNIX system administrator
pazke@donpac.ru		| PGP key: wwwkeys.pgp.net
From: Thomas Gleixner
Date: Monday, May 5, 2008 - 12:49 am

What was the last kernel which booted on visws ?

Thanks,
	tglx
--

From: Ingo Molnar
Date: Wednesday, April 30, 2008 - 11:13 am

applied. Thanks Hugh for the detective work! I love fixes that only 
remove code ;-)

	Ingo
--

From: Ingo Molnar
Date: Wednesday, April 30, 2008 - 11:27 am

yep - but right now our view is that such surprises are worth having in 
this case. Basically we now did the more or less "mechanical" 
unification of the two SMP boot codebases, while trying to keep the 
stability track record of both, as much as possible.

The 64-bit SMP boot code (the one which came from arch/x86_64) is 
cleaner, so the plan is to slowly but surely gravitate towards the 
cleaner code - such as with your fix - while not dropping quirks and 
legacies.

It's now possible to do that without impacting the stability (and 
legacy) track record of the 32-bit code too much, because now the code 
is placed next to each other and the differences are plainly visible via 
ugly #ifdefs. Whatever change we do is small and revertible.

... but it's still not easy to modify the engine of a race car, in the 
middle of the race - so please be on the lookout and any help is welcome 
;-)

	Ingo
--

From: J.A.
Date: Wednesday, April 30, 2008 - 12:40 pm

One question that always bugs me: do I have to set NR_CPUS=8 to pick
apicid number 7, and let the map just flag the ones I have, or will
it work with NR_CPUS=4 ?

-- 
J.A. Magallon <jamagallon()ono!com>     \               Software is like sex:
                                         \         It's better when it's free
Mandriva Linux release 2008.1 (Cooker) for i586
Linux 2.6.23-jam05 (gcc 4.2.2 20071128 (4.2.2-2mdv2008.1)) SMP PREEMPT
--

From: Hugh Dickins
Date: Thursday, May 1, 2008 - 5:04 am

Yes, that works (and it's intended to work that way).

Hugh
From: J.A.
Date: Wednesday, April 30, 2008 - 5:50 pm

-git16 plus your patch gives me my 4 cpus again.
But I still get the warning:

scsi0 : Adaptec AIC79XX PCI-X SCSI HBA DRIVER, Rev 3.0
        <Adaptec 29320 Ultra320 SCSI adapter>
        aic7902: Ultra320 Wide Channel A, SCSI Id=7, PCI 33 or 66Mhz, 512 SCBs
ACPI: PCI Interrupt 0000:03:0a.1[B] -> GSI 23 (level, low) -> IRQ 23
scsi1 : Adaptec AIC79XX PCI-X SCSI HBA DRIVER, Rev 3.0
        <Adaptec 29320 Ultra320 SCSI adapter>
        aic7902: Ultra320 Wide Channel B, SCSI Id=7, PCI 33 or 66Mhz, 512 SCBs
scsi 1:0:0:0: Direct-Access     SEAGATE  ST336807LW       0C01 PQ: 0 ANSI: 3
------------[ cut here ]------------
WARNING: at include/linux/blkdev.h:431 blk_queue_init_tags+0x110/0x11f()
Modules linked in:
Pid: 1, comm: swapper Not tainted 2.6.25-jam03 #1
 [<c011e618>] warn_on_slowpath+0x4d/0x66
 [<c0132e68>] __atomic_notifier_call_chain+0x27/0x48
 [<c0132ea0>] atomic_notifier_call_chain+0x17/0x1b
 [<c024a7be>] notify_update+0x1f/0x23
 [<c024aa07>] vt_console_print+0x1dd/0x2ab
 [<c024a82a>] vt_console_print+0x0/0x2ab
 [<c0132a2e>] up+0x9/0x2b
 [<c01f314c>] init_tag_map+0x4e/0x95
 [<c01f3271>] __blk_queue_init_tags+0x27/0x4e
 [<c01f33a8>] blk_queue_init_tags+0x110/0x11f
 [<c0285b9b>] ahd_platform_set_tags+0x177/0x1a7
 [<c0286247>] ahd_linux_slave_configure+0xcd/0x131
 [<c025e89e>] scsi_probe_and_add_lun+0x706/0x964
 [<c025ecc7>] __scsi_scan_target+0xd9/0x5d9
 [<c019ce63>] sysfs_ilookup_test+0x0/0xd
 [<c019d8ea>] sysfs_addrm_finish+0x14/0x1c7
 [<c019d16d>] sysfs_find_dirent+0x1e/0x27
 [<c019d1b1>] sysfs_add_one+0x3b/0x8b
 [<c019ccfc>] sysfs_add_file_mode+0x4d/0x76
 [<c019e4cc>] internal_create_group+0xd7/0x196
 [<c032a1b2>] klist_next+0x53/0x90
 [<c025f23f>] scsi_scan_channel+0x78/0x8d
 [<c025f2fc>] scsi_scan_host_selected+0xa8/0xd1
 [<c025f38d>] do_scsi_scan_host+0x68/0x6a
 [<c028540c>] ahd_linux_register_host+0x267/0x2db
 [<c0287a6f>] ahd_pci_map_int+0x2c/0x50
 [<c0280d5a>] ahd_pci_config+0x533/0x824
 [<c01fce96>] kobject_get+0xf/0x13
 [<c0251635>] ...
From: Hugh Dickins
Date: Thursday, May 1, 2008 - 5:11 am

But sorry to hear this.  That warning has undergone several revisions
already: I expect yours is another false positive not to worry about,
but it still needs to be fixed.  I won't meddle in there, Cc'ed Jens
and Nick who will know what's appropriate.

> sd 1:0:1:0: [sdb] Attached SCSI disk
From: Nick Piggin
Date: Thursday, May 1, 2008 - 6:41 pm

Thanks for the heads up Hugh. I think we're OK at this point because
we're running in allocation/setup code so there should be no concurrency
on queue_flags. I remember following this call chain and making this
conclusion (hopefully correct, Jens?)... however I don't know how I
--

From: Jens Axboe
Date: Tuesday, May 6, 2008 - 12:13 pm

That's USUALLY correct, but not always. If blk_queue_init_tags() is
called for resizing depth, then it's a running queue and we should not
use _unlocked() for that. So basically they can all be _unlocked() due
to lack of concurrency at init time, but not this one:

        } else if (q->queue_tags) {
                rc = blk_queue_resize_tags(q, depth);
                if (rc)
                        return rc;
                queue_flag_set(QUEUE_FLAG_QUEUED, q);
                return 0;
        } ...

So if a driver ever re-calls blk_queue_init_tags() with a tag map
already set, then it needs to hold the queue lock.
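
To make the distinction concrete, here is a small userspace model of the two paths, assuming simplified stand-ins for the block layer. struct queue, its fields, the warnings counter, and init_tags() are hypothetical; only the names queue_flag_set, the _unlocked variant, and QUEUE_FLAG_QUEUED come from the discussion above, and the flag's bit value is made up.

```c
#include <assert.h>
#include <stdbool.h>

#define QUEUE_FLAG_QUEUED 0x1u  /* illustrative bit value */

/* Hypothetical, heavily simplified model of a request queue. */
struct queue {
	bool lock_held;   /* stands in for "queue lock is held" */
	bool has_tags;    /* stands in for q->queue_tags != NULL */
	int depth;        /* current tag depth */
	unsigned flags;
	int warnings;     /* counts where a WARN would have fired */
};

/* Unlocked variant: fine while nothing else can touch the queue. */
static void queue_flag_set_unlocked(struct queue *q, unsigned flag)
{
	q->flags |= flag;
}

/* Locked variant: complains if the caller does not hold the lock. */
static void queue_flag_set(struct queue *q, unsigned flag)
{
	if (!q->lock_held)
		q->warnings++;
	q->flags |= flag;
}

/* Model of the two cases: first-time setup versus resize of a queue
 * that already has a tag map and may be running. */
static int init_tags(struct queue *q, int depth)
{
	if (!q->has_tags) {
		/* Setup path: no concurrency yet, unlocked is acceptable. */
		q->has_tags = true;
		q->depth = depth;
		queue_flag_set_unlocked(q, QUEUE_FLAG_QUEUED);
		return 0;
	}
	/* Resize path: queue may be live, so the lock must be held. */
	q->depth = depth;
	queue_flag_set(q, QUEUE_FLAG_QUEUED);
	return 0;
}
```

In this model the first init_tags() call never warns, a second call without lock_held set bumps warnings once, and a call with lock_held set resizes quietly.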

-- 
Jens Axboe

--
