Re: [BUILD-FAILURE] 2.6.27-rc1-mm1 - allyesconfig build fails on powerpc

Previous thread: Re: high latency NFS by Neil Brown on Thursday, July 31, 2008 - 3:03 am. (14 messages)

Next thread: linux-next: sparc32/gcc-3.4.5/ld-2.15 build failure by Stephen Rothwell on Thursday, July 31, 2008 - 3:16 am. (4 messages)
To: <linux-kernel@...>
Date: Thursday, July 31, 2008 - 3:03 am

ftp://ftp.kernel.org/pub/linux/kernel/people/akpm/patches/2.6/2.6.27-rc1...

- Something in linux-next has broken my X server

- My Vaio is vacationing on the other side of the continent. Ignorance
is bliss.

- Lots of people are vacationing at present (certain x86 people, for
example). I'll still be around so please be sure to Cc me on things.

But I am unlikely to want to be buried in x86 patches, so please just
give those an extra week or two's testing.

Boilerplate:

- See the `hot-fixes' directory for any important updates to this patchset.

- To fetch an -mm tree using git, use (for example)

git-fetch git://git.kernel.org/pub/scm/linux/kernel/git/smurf/linux-trees.git tag v2.6.16-rc2-mm1
git-checkout -b local-v2.6.16-rc2-mm1 v2.6.16-rc2-mm1

- -mm kernel commit activity can be reviewed by subscribing to the
mm-commits mailing list.

echo "subscribe mm-commits" | mail majordomo@vger.kernel.org

- If you hit a bug in -mm and it is not obvious which patch caused it, it is
most valuable if you can perform a bisection search to identify which patch
introduced the bug. Instructions for this process are at

http://www.zip.com.au/~akpm/linux/patches/stuff/bisecting-mm-trees.txt

But beware that this process takes some time (around ten rebuilds and
reboots), so consider reporting the bug first and if we cannot immediately
identify the faulty patch, then perform the bisection search.

- When reporting bugs, please try to Cc: the relevant maintainer and mailing
list on any email.

- When reporting bugs in this kernel via email, please also rewrite the
email Subject: in some manner to reflect the nature of the bug. Some
developers filter by Subject: when looking for messages to read.

- Occasional snapshots of the -mm lineup are uploaded to
ftp://ftp.kernel.org/pub/linux/kernel/people/akpm/mm/ and are announced on
the mm-commits list. These probably are at least compilable.

- More-than-daily -mm sna...
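
For readers unfamiliar with the bisection search mentioned in the boilerplate above, here is a minimal sketch of the idea, not the procedure from bisecting-mm-trees.txt. It assumes the patches apply in a fixed order, that the tree with no patches applied is good, and that the tree with all patches applied is bad. The patch names, the culprit position, and the `is_bad` callback are hypothetical stand-ins for "apply the first n patches, rebuild, reboot, test".

```python
from typing import Callable, Sequence, Tuple


def first_bad_patch(patches: Sequence[str],
                    is_bad: Callable[[int], bool]) -> Tuple[str, int]:
    """Return (name of the first bad patch, number of trees tested).

    is_bad(n) reports whether a tree with the first n patches applied
    shows the bug. Assumes is_bad(0) is False and is_bad(len(patches))
    is True, so neither endpoint is tested again.
    """
    good, bad = 0, len(patches)
    tests = 0
    while bad - good > 1:
        mid = (good + bad) // 2
        tests += 1                # one rebuild-and-reboot cycle
        if is_bad(mid):
            bad = mid             # bug already present: look earlier
        else:
            good = mid            # still fine: look later
    return patches[bad - 1], tests


# Hypothetical series of 1000 patches where patch 637 introduces the bug.
series = [f"patch-{i:04d}" for i in range(1, 1001)]
culprit, builds = first_bad_patch(series, lambda n: n >= 637)
print(culprit, builds)
```

Each test halves the remaining range, so a series of about a thousand patches needs at most ten rebuilds, which matches the "around ten rebuilds and reboots" estimate.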

To: Peter Oberparleiter <peter.oberparleiter@...>
Cc: Andrew Morton <akpm@...>, <linux-kernel@...>, <kernel-testers@...>
Date: Tuesday, August 5, 2008 - 4:26 pm

Hello Peter,

I'm seeing GCOV problems similar to the ones you fixed for 2.6.26-rc5-mm1.
This is the same x86_64 box and again it was unable to boot with gcov enabled.
A quick look revealed that arch/x86/tsc_64.c and arch/x86/tsc_32.c code was
unified. Unfortunately, the simple change of

GCOV_tsc_32.o := n
GCOV_tsc_64.o := n

to

GCOV_tsc.o := n

did not help. Given the number of combinations of files that might fail with
GCOV enabled, I was rather fortunate: after a few hours I was able to pinpoint
exactly two files which need GCOV disabled to make my x86_64 boot.
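
One way such a search could be organised is sketched below. This is not necessarily how the two files were found. It assumes that excluding more objects from profiling never breaks a boot that already worked, and it uses a hypothetical `boots` callback in place of a real rebuild and reboot. The candidate object names are the ones mentioned in this thread.

```python
from typing import Callable, FrozenSet, Iterable, Set


def minimal_exclusions(objects: Iterable[str],
                       boots: Callable[[FrozenSet[str]], bool]) -> Set[str]:
    """Greedily shrink the set of objects built with GCOV disabled.

    Start with every candidate excluded, then try re-enabling profiling
    for one object at a time and keep the change if the kernel still boots.
    """
    excluded = set(objects)
    if not boots(frozenset(excluded)):
        raise ValueError("kernel does not boot even with all candidates excluded")
    for obj in sorted(excluded):
        trial = excluded - {obj}
        if boots(frozenset(trial)):
            excluded = trial      # profiling this object is harmless
    return excluded


# Hypothetical stand-in for "rebuild, reboot, see if it comes up":
# the boot succeeds only if both problem objects are excluded.
CULPRITS = {"tsc.o", "vsyscall_64.o"}


def fake_boot(excluded: FrozenSet[str]) -> bool:
    return CULPRITS <= excluded


candidates = ["tsc.o", "vsyscall_64.o", "rtc.o", "hpet.o", "paravirt.o"]
needed = minimal_exclusions(candidates, fake_boot)
print(sorted(needed))
```

This costs one boot per candidate, so it only pays off once the candidate list is short.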

If you want to try to figure out what is wrong with them please feel free to send
me patches to test. If not then how about this patch? Compile and run tested.

Signed-off-by: Mariusz Kozlowski <m.kozlowski@tuxland.pl>

--- linux-2.6.27-rc1-mm1/arch/x86/kernel/Makefile 2008-08-01 18:05:04.000000000 +0200
+++ linux-2.6.27-rc1-mm1-dirty/arch/x86/kernel/Makefile 2008-08-05 21:49:21.000000000 +0200
@@ -13,8 +13,8 @@ CFLAGS_REMOVE_rtc.o = -pg
 CFLAGS_REMOVE_paravirt.o = -pg
 endif

-GCOV_tsc_32.o := n
-GCOV_tsc_64.o := n
+GCOV_vsyscall_64.o := n
+GCOV_tsc.o := n

 #
 # vsyscalls (which work on the user stack) should have
Mariusz
--

To: Mariusz Kozlowski <m.kozlowski@...>
Cc: Andrew Morton <akpm@...>, <linux-kernel@...>, <kernel-testers@...>
Date: Wednesday, August 6, 2008 - 3:23 am

Your patch looks good. I don't think I will be able to refine the
list of files to be excluded any better than you already did, so this
should go into -mm with the other gcov patches.

For future reference, there are other object files which "stand out" in
the respective Makefile, namely rtc.o, hpet.o and paravirt.o. Just like
the two files that you identified as causing problems with gcov
profiling, these are explicitly excluded from either FTRACE profiling
or stack-protector checks or both. If there should be further run-time
problems, these are good candidates to check, though I'd like to refrain
from removing them at this point in time without them causing any
apparent problems.

Regards,
Peter
--

To: Peter Oberparleiter <peter.oberparleiter@...>
Cc: Andrew Morton <akpm@...>, <linux-kernel@...>, <kernel-testers@...>
Date: Tuesday, August 5, 2008 - 4:29 pm

Make that arch/x86/kernel/tsc_64.c and arch/x86/kernel/tsc_32.c

Mariusz
--

To: Andrew Morton <akpm@...>, <davem@...>
Cc: <linux-kernel@...>, <kernel-testers@...>, <sparclinux@...>
Date: Saturday, August 2, 2008 - 11:17 am

Hello,

$ uname -a
Linux sparc64 2.6.27-rc1-mm1 #1 SMP PREEMPT Sat Aug 2 15:51:55 CEST 2008 sparc64 sun4u TI UltraSparc II (BlackBird) GNU/Linux

Logs get a little flooded with:

BUG: using smp_processor_id() in preemptible [00000000] code: emerge/3217
caller is smp_call_function_mask+0x1c/0x180
Call Trace:
[0000000000486374] smp_call_function_mask+0x14/0x180
[0000000000447f94] tsb_grow+0x2d4/0x420
[000000000040796c] sparc64_realfault_common+0x10/0x20
[000000000045b604] schedule_tail+0x64/0xa0
[0000000000406150] ret_from_syscall+0x8/0x48
BUG: using smp_processor_id() in preemptible [00000000] code: emerge/3220
caller is smp_call_function_mask+0x1c/0x180
Call Trace:
[0000000000486374] smp_call_function_mask+0x14/0x180
[0000000000447f94] tsb_grow+0x2d4/0x420
[000000000040796c] sparc64_realfault_common+0x10/0x20
[000000000045b604] schedule_tail+0x64/0xa0
[0000000000406150] ret_from_syscall+0x8/0x48
BUG: using smp_processor_id() in preemptible [00000000] code: rsync/3220
caller is smp_call_function_mask+0x1c/0x180
Call Trace:
[0000000000486374] smp_call_function_mask+0x14/0x180
[0000000000447f94] tsb_grow+0x2d4/0x420
[000000000040796c] sparc64_realfault_common+0x10/0x20
BUG: using smp_processor_id() in preemptible [00000000] code: rsync/3220
caller is smp_call_function_mask+0x1c/0x180
Call Trace:
[0000000000486374] smp_call_function_mask+0x14/0x180
[0000000000447f94] tsb_grow+0x2d4/0x420
[000000000040796c] sparc64_realfault_common+0x10/0x20
BUG: using smp_processor_id() in preemptible [00000000] code: rsync/3224
caller is smp_call_function_mask+0x1c/0x180
Call Trace:
[0000000000486374] smp_call_function_mask+0x14/0x180
[0000000000447f94] tsb_grow+0x2d4/0x420
[000000000040796c] sparc64_realfault_common+0x10/0x20
[000000000045b604] schedule_tail+0x64/0xa0
[0000000000406150] ret_from_syscall+0x8/0x48
BUG: using smp_processor_id() in preemptible [00000000] code: file/3246
caller is smp_call_function_mask+0x1c/0x180
Call Trace:
[0000000000486374] smp_call...

To: <m.kozlowski@...>
Cc: <akpm@...>, <linux-kernel@...>, <kernel-testers@...>, <sparclinux@...>
Date: Sunday, August 3, 2008 - 3:02 am

From: Mariusz Kozlowski <m.kozlowski@tuxland.pl>

Thanks for the report and sample patch.

I've decided to put the preemption disabled call at the smp_tsb_sync() call
site so that smp_tsb_sync() can still invoke smp_call_function_mask() as
a tail-call.
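
The placement described above can be illustrated with a toy model. This is plain Python, not kernel code: the `Cpu` class, its counter, and the function names are hypothetical stand-ins, and the warning counter only mimics the "using smp_processor_id() in preemptible" message from the report.

```python
from contextlib import contextmanager


class Cpu:
    """Toy model of a per-task preemption counter."""

    def __init__(self):
        self.preempt_count = 0
        self.warnings = 0

    @contextmanager
    def preempt_disabled(self):
        # Models a preempt_disable()/preempt_enable() pair; nesting is allowed.
        self.preempt_count += 1
        try:
            yield
        finally:
            self.preempt_count -= 1

    def call_function_mask(self):
        # Stand-in for a function that requires preemption to be disabled.
        if self.preempt_count == 0:
            self.warnings += 1    # models the BUG message in the log


def tsb_sync(cpu):
    # The callee stays a plain tail call; the guard lives at the call site.
    return cpu.call_function_mask()


cpu = Cpu()
tsb_sync(cpu)                     # unguarded call: one warning
with cpu.preempt_disabled():
    tsb_sync(cpu)                 # guarded call: no further warning
print(cpu.warnings, cpu.preempt_count)
```

Keeping the guard in the caller leaves the callee free of bookkeeping, which is the point of the tail-call remark.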

Thanks again!

sparc64: Need to disable preemption around smp_tsb_sync().

Based upon a bug report by Mariusz Kozlowski

It uses smp_call_function_mask() now, which has a preemption-disabled
requirement.

Signed-off-by: David S. Miller <davem@davemloft.net>
---
arch/sparc64/mm/tsb.c | 5 ++++-
1 files changed, 4 insertions(+), 1 deletions(-)

diff --git a/arch/sparc64/mm/tsb.c b/arch/sparc64/mm/tsb.c
index 3547937..587f8ef 100644
--- a/arch/sparc64/mm/tsb.c
+++ b/arch/sparc64/mm/tsb.c
@@ -1,9 +1,10 @@
 /* arch/sparc64/mm/tsb.c
  *
- * Copyright (C) 2006 David S. Miller <davem@davemloft.net>
+ * Copyright (C) 2006, 2008 David S. Miller <davem@davemloft.net>
  */

 #include <linux/kernel.h>
+#include <linux/preempt.h>
 #include <asm/system.h>
 #include <asm/page.h>
 #include <asm/tlbflush.h>
@@ -415,7 +416,9 @@ retry_tsb_alloc:
 	tsb_context_switch(mm);

 	/* Now force other processors to do the same. */
+	preempt_disable();
 	smp_tsb_sync(mm);
+	preempt_enable();

 	/* Now it is safe to free the old tsb. */
 	kmem_cache_free(tsb_caches[old_cache_index], old_tsb);
--
1.5.6.GIT

--

To: Andrew Morton <akpm@...>, <kernel-testers@...>, <bzolnier@...>
Cc: <linux-kernel@...>, <linux-ide@...>
Date: Saturday, August 2, 2008 - 5:23 am

Hi,

rmmod on ide-cd_mod causes this oops:

BUG: unable to handle kernel paging request at 83535683
IP: [<c0246ffa>] ide_device_put+0xc/0x33
*pde = 00000000
Oops: 0000 [#1] PREEMPT
last sysfs file: /sys/devices/pci0000:00/0000:00:01.0/0000:01:05.0/resource
Modules linked in: radeon drm nfsd lockd sunrpc exportfs pcmcia uhci_hcd ehci_hcd usbcore snd_ali5451 yenta_socket pcspkr snd_ac97_codec ac97_bus rsrc_nonstatic snd_pcm snd_timer ati_agp agpgart snd soundcore snd_page_alloc ide_cd_mod(-) cdrom 8139too psmouse sony_laptop backlight floppy rtc

Pid: 3890, comm: rmmod Not tainted (2.6.27-rc1-mm1 #2)
EIP: 0060:[<c0246ffa>] EFLAGS: 00010286 CPU: 0
EIP is at ide_device_put+0xc/0x33
EAX: 83535657 EBX: dc927a00 ECX: 00000003 EDX: 00000001
ESI: dec34e34 EDI: dec34e34 EBP: d9f46ee0 ESP: d9f46edc
DS: 007b ES: 007b FS: 0000 GS: 0033 SS: 0068
Process rmmod (pid: 3890, ti=d9f46000 task=dd88e780 task.ti=d9f46000)
Stack: dc927c00 d9f46eec dec2e202 dc927c00 d9f46ef8 dec2e225 dd9138dc d9f46f00
c02469e0 d9f46f10 c024156f dd9138dc dd9139f4 d9f46f24 c024162c 00000880
dec34e34 c0397dc0 d9f46f38 c0240a33 00000880 dec34e34 00000000 d9f46f48
Call Trace:
[<dec2e202>] ? ide_cd_put+0x26/0x33 [ide_cd_mod]
[<dec2e225>] ? ide_cd_remove+0x16/0x19 [ide_cd_mod]
[<c02469e0>] ? generic_ide_remove+0x1a/0x1e
[<c024156f>] ? __device_release_driver+0x59/0x7f
[<c024162c>] ? driver_detach+0x97/0x99
[<c0240a33>] ? bus_remove_driver+0x6f/0x8b
[<c02419f1>] ? driver_unregister+0x2f/0x33
[<dec31331>] ? ide_cdrom_exit+0xd/0xf [ide_cd_mod]
[<c014265a>] ? sys_delete_module+0x10d/0x1e2
[<c015fedc>] ? do_munmap+0x1d7/0x234
[<c01e8684>] ? trace_hardirqs_on_thunk+0xc/0x10
[<c0103015>] ? sysenter_do_call+0x12/0x35
=======================
Code: ff ff 89 44 24 08 c7 44 24 04 a7 de 35 c0 89 34 24 e8 cb ce f9 ff 31 c0 83 c4 0c 5b 5e 5d c3 55 89 e5 53 89 c3 8b 40 24 8b 40 10 <8b> 40 2c 85 c0 74 12 8b 80 44 01 00 0...

To: Mariusz Kozlowski <m.kozlowski@...>
Cc: Andrew Morton <akpm@...>, <kernel-testers@...>, <linux-kernel@...>, <linux-ide@...>
Date: Saturday, August 2, 2008 - 12:15 pm

Hi,

Unfortunately, I'm unable to reproduce it here with 2.6.27-rc1-mm1.

Could you please check whether it is drive->hwif or hwif->host exploding?

--

To: Bartlomiej Zolnierkiewicz <bzolnier@...>
Cc: Andrew Morton <akpm@...>, <kernel-testers@...>, <linux-kernel@...>, <linux-ide@...>
Date: Saturday, August 2, 2008 - 7:25 pm

It's ALI M15x3 chipset. .config is attached.

# lspci
00:00.0 Host bridge: ATI Technologies Inc RS200/RS200M AGP Bridge [IGP 340M] (rev 02)
00:01.0 PCI bridge: ATI Technologies Inc PCI Bridge [IGP 340M]
00:03.0 Modem: ALi Corporation M5457 AC'97 Modem Controller
00:04.0 Multimedia audio controller: ALi Corporation M5451 PCI AC-Link Controller Audio Device (rev 02)
00:06.0 Bridge: ALi Corporation M7101 Power Management Controller [PMU]
00:07.0 ISA bridge: ALi Corporation M1533/M1535 PCI to ISA Bridge [Aladdin IV/V/V+]
00:0a.0 CardBus bridge: Ricoh Co Ltd RL5c476 II (rev aa)
00:0a.1 CardBus bridge: Ricoh Co Ltd RL5c476 II (rev aa)
00:0a.2 FireWire (IEEE 1394): Ricoh Co Ltd R5C552 IEEE 1394 Controller (rev 02)
00:0c.0 USB Controller: VIA Technologies, Inc. VT82xxxxx UHCI USB 1.1 Controller (rev 50)
00:0c.1 USB Controller: VIA Technologies, Inc. VT82xxxxx UHCI USB 1.1 Controller (rev 50)
00:0c.2 USB Controller: VIA Technologies, Inc. USB 2.0 (rev 51)
00:0f.0 IDE interface: ALi Corporation M5229 IDE (rev c4)
00:12.0 Ethernet controller: Realtek Semiconductor Co., Ltd. RTL-8139/8139C/8139C+ (rev 10)

I saw it exploding in two ways. I added simple debugging stuff:

--- linux-2.6.27-rc1-mm1/drivers/ide/ide.c 2008-08-02 11:42:05.000000000 +0200
+++ linux-2.6.27-rc1-mm1-dirty/drivers/ide/ide.c 2008-08-02 23:26:52.000000000 +0200
@@ -714,6 +714,21 @@ EXPORT_SYMBOL_GPL(ide_device_get);
void ide_device_put(ide_drive_t *drive)
{
#ifdef CONFIG_MODULE_UNLOAD
+ void *tmp;
+
+ tmp = drive;
+ printk("drive: 0x%p\n", tmp);
+ tmp = drive->hwif;
+ printk("drive->hwif: 0x%p\n", tmp);
+ tmp = drive->hwif->host;
+ printk("drive->hwif->host: 0x%p\n", tmp);
+ tmp = drive->hwif->host->dev;
+ printk("drive->hwif->host->dev: 0x%p\n", tmp);
+ tmp = drive->hwif->host->dev[0];
+ printk("drive->hwif->host->dev[0]: 0x%p\n", tmp);
+ tmp = drive->hwif->host->dev...

To: Mariusz Kozlowski <m.kozlowski@...>
Cc: Andrew Morton <akpm@...>, <kernel-testers@...>, <linux-kernel@...>, <linux-ide@...>
Date: Sunday, August 3, 2008 - 10:41 am

[...]

Thanks for debugging this. I see the problem now: previous reference
counting fix was totally fscked up and introduced access to cd->drive
after putting last reference on cd (time to re-supply brown paper bag
stock). The incremental fix (for 2.6.27-rc1-mm1) attached, the fixed

Does it still happen with the 1) fixed?
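
The ordering problem described above can be shown with a small model. This is a Python sketch, not the driver code: the classes, the `hdc` name, and the way "freed" memory is modelled (the field is cleared when the last reference goes away) are all assumptions made for illustration.

```python
class Drive:
    def __init__(self, name):
        self.name = name
        self.put_calls = 0


class CdInfo:
    """Toy stand-in for a refcounted object that points at a drive."""

    def __init__(self, drive):
        self.drive = drive
        self.refcount = 1

    def put(self):
        self.refcount -= 1
        if self.refcount == 0:
            # Model the release callback freeing the object:
            # its fields no longer hold anything valid.
            self.drive = None


def device_put(drive):
    if drive is None:
        raise RuntimeError("used cd->drive after the last reference was dropped")
    drive.put_calls += 1


def cd_put_buggy(cd):
    cd.put()
    device_put(cd.drive)          # reads cd after it may have been released


def cd_put_fixed(cd):
    drive = cd.drive              # save the pointer while cd is still valid
    cd.put()
    device_put(drive)


drive = Drive("hdc")
try:
    cd_put_buggy(CdInfo(drive))
    buggy_ok = True
except RuntimeError:
    buggy_ok = False
cd_put_fixed(CdInfo(drive))
print(buggy_ok, drive.put_calls)
```

Saving the pointer in a local before dropping the reference is the same pattern the patch below applies to each driver's put helper.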

---
drivers/ide/ide-cd.c | 4 +++-
drivers/ide/ide-disk.c | 4 +++-
drivers/ide/ide-floppy.c | 4 +++-
drivers/ide/ide-tape.c | 4 +++-
drivers/scsi/ide-scsi.c | 4 +++-
5 files changed, 15 insertions(+), 5 deletions(-)

Index: b/drivers/ide/ide-cd.c
===================================================================
--- a/drivers/ide/ide-cd.c
+++ b/drivers/ide/ide-cd.c
@@ -78,9 +78,11 @@ static struct cdrom_info *ide_cd_get(str

 static void ide_cd_put(struct cdrom_info *cd)
 {
+	ide_drive_t *drive = cd->drive;
+
 	mutex_lock(&idecd_ref_mutex);
 	kref_put(&cd->kref, ide_cd_release);
-	ide_device_put(cd->drive);
+	ide_device_put(drive);
 	mutex_unlock(&idecd_ref_mutex);
 }

Index: b/drivers/ide/ide-disk.c
===================================================================
--- a/drivers/ide/ide-disk.c
+++ b/drivers/ide/ide-disk.c
@@ -74,9 +74,11 @@ static struct ide_disk_obj *ide_disk_get

 static void ide_disk_put(struct ide_disk_obj *idkp)
 {
+	ide_drive_t *drive = idkp->drive;
+
 	mutex_lock(&idedisk_ref_mutex);
 	kref_put(&idkp->kref, ide_disk_release);
-	ide_device_put(idkp->drive);
+	ide_device_put(drive);
 	mutex_unlock(&idedisk_ref_mutex);
 }

Index: b/drivers/ide/ide-floppy.c
===================================================================
--- a/drivers/ide/ide-floppy.c
+++ b/drivers/ide/ide-floppy.c
@@ -179,9 +179,11 @@ static struct ide_floppy_obj *ide_floppy

static void ide_floppy_put(struct ide_floppy_obj *floppy)
{
+ ide_drive_t *drive = floppy->drive;
+
mutex_lock(&idefloppy_ref_mutex);
kref_put(&floppy->kref, idef...

To: Bartlomiej Zolnierkiewicz <bzolnier@...>
Cc: Andrew Morton <akpm@...>, <kernel-testers@...>, <linux-kernel@...>, <linux-ide@...>
Date: Sunday, August 3, 2008 - 11:45 am

No. I applied your incremental fix and tested it for some time. It doesn't
oops anymore in any way in spite of my best efforts :)

Tested-by: Mariusz Kozlowski <m.kozlowski@tuxland.pl>

Thanks,

--

To: Andrew Morton <akpm@...>
Cc: <linux-kernel@...>, <linuxppc-dev@...>, Kernel Testers List <kernel-testers@...>, Sam Ravnborg <sam@...>, Andy Whitcroft <apw@...>
Date: Thursday, July 31, 2008 - 8:43 am

Hi Andrew,

make allyesconfig with 2.6.27-rc1-mm1 kernel on powerpc fails with build error

LD .tmp_vmlinux1
ld: drivers/built-in.o section .devexit.text exceeds stub group size
ld: sound/built-in.o section .devinit.text exceeds stub group size
ld: drivers/built-in.o section .devinit.text exceeds stub group size
ld: net/built-in.o section .exit.text exceeds stub group size
ld: drivers/built-in.o section .exit.text exceeds stub group size
ld: net/built-in.o section .init.text exceeds stub group size
ld: sound/built-in.o section .init.text exceeds stub group size
ld: drivers/built-in.o section .init.text exceeds stub group size
ld: fs/built-in.o section .init.text exceeds stub group size
ld: mm/built-in.o section .init.text exceeds stub group size
ld: kernel/built-in.o section .init.text exceeds stub group size
ld: arch/powerpc/platforms/built-in.o section .init.text exceeds stub group size
ld: arch/powerpc/kernel/built-in.o section .init.text exceeds stub group size
ld: init/built-in.o section .init.text exceeds stub group size
ld: kernel/built-in.o section .sched.text exceeds stub group size
ld: net/built-in.o section .text exceeds stub group size
ld: arch/powerpc/oprofile/built-in.o section .text exceeds stub group size
ld: sound/built-in.o section .text exceeds stub group size
ld: drivers/built-in.o section .text exceeds stub group size
ld: lib/built-in.o section .text exceeds stub group size
ld: tests/built-in.o section .text exceeds stub group size
ld: block/built-in.o section .text exceeds stub group size
ld: crypto/built-in.o section .text exceeds stub group size
ld: security/built-in.o section .text exceeds stub group size
ld: ipc/built-in.o section .text exceeds stub group size
ld: fs/built-in.o section .text exceeds stub group size
ld: mm/built-in.o section .text exceeds stub group size
ld: kernel/built-in.o section .text exceeds stub group size
ld: arch/powerpc/xmon/built-in.o section .text exceeds stub group size
ld: arch/powerpc/platforms/built-in.o section .te...

To: Kamalesh Babulal <kamalesh@...>
Cc: Andrew Morton <akpm@...>, <linuxppc-dev@...>, Kernel Testers List <kernel-testers@...>, Sam Ravnborg <sam@...>, <linux-kernel@...>, Peter Oberparleiter <peter.oberparleiter@...>
Date: Friday, August 1, 2008 - 1:29 am

<snip>

Turning off GCOV "fixes" this. Not really the best solution but at
least it narrows down the search effort.

Peter,
Can you have a look at how this can be fixed, if at all?

Yours Tony

linux.conf.au http://www.marchsouth.org/
Jan 19 - 24 2009 The Australian Linux Technical Conference!

--

To: Tony Breeds <tony@...>
Cc: Andrew Morton <akpm@...>, <linuxppc-dev@...>, Kernel Testers List <kernel-testers@...>, Sam Ravnborg <sam@...>, <linux-kernel@...>, Peter Oberparleiter <peter.oberparleiter@...>
Date: Friday, August 1, 2008 - 12:16 pm

Peter,

--
Thanks & Regards,
Kamalesh Babulal,
Linux Technology Center,
IBM, ISTL.
--

To: Tony Breeds <tony@...>
Cc: Andrew Morton <akpm@...>, Kamalesh Babulal <kamalesh@...>, Kernel Testers List <kernel-testers@...>, <linux-kernel@...>, <linuxppc-dev@...>, Sam Ravnborg <sam@...>
Date: Friday, August 1, 2008 - 11:44 am

I did some testing with a cross-compiler myself and I don't think
there is a general solution to this problem. It's not one particular
file that is causing the problem but seemingly the sheer size of the
resulting vmlinux file - even though the total vmlinux.o size is
"merely" up about 100 MiB (from around 1.03 GiB to 1.13 GiB).

I think I'll need help from people with knowledge of the powerpc
toolchain here.

Regards,
Peter
--

To: Tony Breeds <tony@...>
Cc: Kamalesh Babulal <kamalesh@...>, <linuxppc-dev@...>, Kernel Testers List <kernel-testers@...>, Sam Ravnborg <sam@...>, <linux-kernel@...>, Peter Oberparleiter <peter.oberparleiter@...>
Date: Friday, August 1, 2008 - 2:12 am

Am not terribly happy with the state of the gcov patches. They STILL
leave thousands of dead symlinks lying around after `make mrproper' and
generally seem to muck up the kbuild system a bit, although nothing
that a bit of Sam love wouldn't fix.

Plus it breaks the build on a few architectures (branch out of range,
mainly), but that's a fairly minor thing which could even be worked
around in Kconfig (disable the offending code if gcov is enabled)

--

To: Andrew Morton <akpm@...>
Cc: Tony Breeds <tony@...>, Kamalesh Babulal <kamalesh@...>, <linuxppc-dev@...>, Kernel Testers List <kernel-testers@...>, <linux-kernel@...>, Peter Oberparleiter <peter.oberparleiter@...>
Date: Friday, August 1, 2008 - 12:45 pm

Have not had time / energy to get around to it.
Other things continue to pop up and time is limited at the moment
(as in more limited than usual).

Sam
--

To: Andrew Morton <akpm@...>
Cc: Kamalesh Babulal <kamalesh@...>, Kernel Testers List <kernel-testers@...>, <linux-kernel@...>, <linuxppc-dev@...>, Sam Ravnborg <sam@...>, Tony Breeds <tony@...>
Date: Friday, August 1, 2008 - 11:40 am

This is caused by patch

gcov-create-links-to-gcda-files-in-build-directory.patch

which can be simply removed as it is no longer needed since patch

gcov-add-gcov-profiling-infrastructure-revert-link-changes.patch

Hm, by now the only change to kbuild is the addition of gcc options
-fprofile-arcs/-ftest-coverage depending on the respective config
symbols. If there is anything else that should be changed, please

Some of the problems caused/uncovered by enabling gcov profiling for
a kernel build on some architectures simply cannot be fixed by a change
to the kernel patch itself. I'm wondering if it would be possible
to disable this configuration option when specifying allyesconfig. That
way at least generic testing wouldn't be affected.

Regards,
Peter
--

To: Andrew Morton <akpm@...>
Cc: <linux-kernel@...>
Date: Thursday, July 31, 2008 - 8:23 am

The only suspicious thing so far is 100% CPU during "time-schedule" test
from LTP:

[...]
pth_str03 0 INFO : thread 0 exiting, depth=4, status=0, addr=0xf2b010
pth_str03 0 INFO : The sum of tree (breadth 4, depth 3) is 3570
pth_str03 1 PASS : Test passed
<<<execution_status>>>
duration=0 termination_type=exited termination_id=0 corefile=no
cutime=0 cstime=3
<<<test_end>>>
<<<test_start>>>
tag=time-schedule01 stime=1217504927
cmdline=" time-schedule"

[100% CPU here, reproducible]

"strace -p" kicks the test to completion, and shows plenty of
sched_yield() calls, IIRC.

--

To: Andrew Morton <akpm@...>
Cc: <linux-kernel@...>, Hugh Dickins <hugh@...>, Paul Menage <menage@...>
Date: Thursday, July 31, 2008 - 4:10 am

Andrew,

Please don't do so. We did discuss this, and while Paul and Hugh have
opposed the patches, there is no alternative to memory overcommit
handling for cgroups. Claiming that no one supports overcommit is not
a valid argument. Apache (from what I've seen) can decide rlimits for
each of its children. Without the overcommit feature, a cgroup would
be prone to either excessive swapping or OOM (if badly configured). A
friendly feature that allows us to control and fail allocations is
much nicer.

I've resolved most of the issues reported, except for the last one by
Hugh. The infrastructure also allows me to build a mlock controller. I
am just back from Canada, I hope to get cracking at the problem soon.

Balbir
--
