ftp://ftp.kernel.org/pub/linux/kernel/people/akpm/patches/2.6/2.6.25-rc8/2.6.25-rc8-mm...

- Compilation is busted on mips due to the page-flags patches
- Compilation is busted on sparc64 due to the page-flags patches
- Compilation is partially busted on powerpc due to the page-flags patches.
  It compiles for my g5 but allmodconfig fails.
- git-drm is dropped due to build errors
- git-xfs is dropped due to failure to get a clean git diff
- git-slub has been temporarily replaced by git-pekka. Pekka is standing in
  while Christoph is away.
- git-kvm is dropped due to clashes with git-s390
- Many patches which weren't in 2.6.25-rc8-mm1's git-x86.patch and
  git-sched.patch have now (belatedly) been introduced. x86 works for me,
  but Rafael is reporting some crashes.

Boilerplate:

- See the `hot-fixes' directory for any important updates to this patchset.
- To fetch an -mm tree using git, use (for example)
  git-fetch git://git.kernel.org/pub/scm/linux/kernel/git/smurf/linux-trees.git tag v2.6.16-rc2-mm1
  git-checkout -b local-v2.6.16-rc2-mm1 v2.6.16-rc2-mm1
- -mm kernel commit activity can be reviewed by subscribing to the mm-commits
  mailing list.
  echo "subscribe mm-commits" | mail majordomo@vger.kernel.org
- If you hit a bug in -mm and it is not obvious which patch caused it, it is
  most valuable if you can perform a bisection search to identify which patch
  introduced the bug. Instructions for this process are at
  http://www.zip.com.au/~akpm/linux/patches/stuff/bisecting-mm-trees.txt
  But beware that this process takes some time (around ten rebuilds and
  reboots), so consider reporting the bug first and if we cannot immediately
  identify the faulty patch, then perform the bisection search.
- When reporting bugs, please try to Cc: the relevant maintainer and mailing
  list on any email.
- When reporting bugs in this kernel via email, please also rewrite the email
  Subject: in some manner to reflect the nature of ...
$ cat /var/lib/rpm/Conflictname Killed BUG: unable to handle kernel paging request at fffff0002004c1b0 IP: [<ffffffff80296df7>] __dentry_open+0xe7/0x2d0 PGD 0 Oops: 0000 [6] SMP last sysfs file: /sys/devices/virtual/net/tun0/statistics/collisions CPU 1 Modules linked in: ipv6 tun bitrev test arc4 ecb crypto_blkcipher cryptomgr crypto_algapi ath5k mac80211 crc32 rtc_cmos usbhid sr_mod ohci1394 hid rtc_core cfg80211 rtc_lib ehci_hcd cdrom ieee1394 ff_memless floppy Pid: 4388, comm: cat Tainted: G D 2.6.25-rc8-mm2_64 #399 RIP: 0010:[<ffffffff80296df7>] [<ffffffff80296df7>] __dentry_open+0xe7/0x2d0 RSP: 0018:ffff810028ebbd98 EFLAGS: 00010206 RAX: fffff0002004c1b0 RBX: ffff81001a62d6c0 RCX: 0000000000000000 RDX: ffff81001a62d6c0 RSI: ffff81001a62d6c0 RDI: ffff81001a62d728 RBP: ffff810028ebbdc8 R08: 0000000000000000 R09: 0000000000000000 R10: 00000000000000e6 R11: 0000000000000246 R12: ffff81002004c0a0 R13: 0000000000000000 R14: ffffffff80296770 R15: ffff81001c6583e8 FS: 00007fb9b575b6f0(0000) GS:ffff81007d006580(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 000000008005003b CR2: fffff0002004c1b0 CR3: 00000000268ea000 CR4: 00000000000006a0 DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 DR3: 0000000000000000 DR6: 00000000ffff4ff0 DR7: 0000000000000400 Process cat (pid: 4388, threadinfo ffff810028eba000, task ffff810024500000) Stack: ffff81007c5d4500 ffff81001a62d6c0 0000000000000000 0000000000000004 ffff810028ebbe48 0000000000008000 ffff810028ebbde8 ffffffff802970c4 0000000000000004 0000000000000000 ffff810028ebbf28 ffffffff802a56cb Call Trace: [<ffffffff802970c4>] nameidata_to_filp+0x44/0x60 [<ffffffff802a56cb>] do_filp_open+0x1eb/0x990 [<ffffffff80296aec>] ? get_unused_fd_flags+0x8c/0x140 [<ffffffff80296c16>] do_sys_open+0x76/0x110 [<ffffffff80296cdb>] sys_open+0x1b/0x20 [<ffffffff8020b88b>] system_call_after_swapgs+0x7b/0x80 Code: 4d 85 f6 0f 84 9b 01 00 00 4...
OK, I found the issue I was chasing when I noticed the WARN from genapic_64.c. After seeing that I'd gotten all the way to remove-div_long_long_rem.patch, I took a close look at the remaining several dozen patches, took a wild stab in the dark... and ta-da: PROFILE_LIKELY broke between -rc8-mm1 and -rc8-mm2. Setting it to 'y' gets me an instant reboot similar to what I was seeing against -rc2-mm.
These changed between -mm1 and -mm2 (datestamp of 04/08):
patches/profile-likely-unlikely-macros.patch
patches/profile-likely-unlikely-macros-fix.patch
I found the profile-likely-unlikely-macros-fix-2.patch from last time around, but the code has changed a bunch since then, so it isn't a clean apply (in fact, it's so different that it's not even easily hand-patchable; stuff like !!(foo) has appeared).
I'm pretty sure this one is for Ingo and Steven to sort out; their names are all over git-sched.patch for this code.... :)
The following config will actually build on x86_64:
CONFIG_HAVE_FTRACE=y
# CONFIG_FTRACE is not set
CONFIG_FTRACE_STARTUP_TEST=y
However, at boot time, it dies a quick and horrid ker-splat and hangs without doing anything visible at all, without even the decency of rebooting. I'm going to guess that the startup test fandangos on memory that wasn't set up by the non-present CONFIG_FTRACE.
I got into this state by saying 'y' to 'startup test' in make oldconfig, then deciding I didn't want ftrace, so I turned that *one* entry off in make menuconfig, which left the startup test dangling. The easy local workaround was to just turn the test off too, so you guys can hash this one out at your leisure...
Grrr, I was hunting for oopses in dup_fd and nearby code that had been plaguing one
box here for far too long, and hit the one below.
What happened: a freshly booted box (probably not all init scripts had finished),
X already started. I ssh'ed in from another box and rebooted from that session.
(gdb) p __kmalloc
$1 = {void *(size_t, gfp_t)} 0xffffffff80286890 <__kmalloc>
(gdb) l *(0xffffffff80286890 + 0x69)
0xffffffff802868f9 is in __kmalloc (mm/slub.c:1663).
1658
1659 object = __slab_alloc(s, gfpflags, node, addr, c);
1660
1661 else {
1662 object = c->freelist;
1663 ===> c->freelist = object[c->offset]; <===
1664 stat(c, ALLOC_FASTPATH);
1665 }
1666 local_irq_restore(flags);
BUG: unable to handle kernel paging request at 0000000500000500
IP: [<ffffffff802868f9>] __kmalloc+0x69/0x110
PGD 17e04a067 PUD 0
Oops: 0000 [1] SMP DEBUG_PAGEALLOC
last sysfs file: /sys/devices/pci0000:00/0000:00:1e.0/0000:05:02.0/resource
CPU 1
Modules linked in: nf_conntrack_irc ipt_MASQUERADE iptable_nat nf_nat nf_conntrack_ipv4 xt_state nf_conntrack iptable_filter ip_tables x_tables usblp ehci_hcd uhci_hcd usbcore sr_mod cdrom
Pid: 4966, comm: depscan.sh Not tainted 2.6.25-rc8-mm2 #20
RIP: 0010:[<ffffffff802868f9>] [<ffffffff802868f9>] __kmalloc+0x69/0x110
RSP: 0018:ffff81017cba9c68 EFLAGS: 00010006
RAX: 0000000000000000 RBX: ffffffff805c3950 RCX: ffff81017e7bb278
RDX: ffff81017c868000 RSI: 0000000000000001 RDI: ffffffff802868db
RBP: ffff81017cba9c98 R08: 0000000000000000 R09: 0000000000000001
R10: 0000000005050561 R11: 00000000036c00b1 R12: 0000000500000500
R13: 0000000000000282 R14: 00000000000080d0 R15: ffff810001070360
FS: 00007fc9d17276f0(0000) GS:ffff81017fc44600(0000) knlGS:0000000000000000
CS: 0010 DS: 0000 ES: 0000 CR0: 000000008005003b
CR2: 0000000500000500 CR3: 000000017c9c2000 CR4: 00000000000006e0
DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
DR3...
If this is easily reproducible, I would appreciate it if you could give the 'for-linus' branch of my tree a spin:
git://git.kernel.org/pub/scm/linux/kernel/git/penberg/slab-2.6.git
Pekka --
Looks like freelist corruption where c->freelist is 0x0000000500000500. I assume you have CONFIG_SLUB_DEBUG enabled but it was left out by the grep? --
c->offset being zero is okay. This could be object free pointer corruption, where the first word of the object is overwritten after free. You need to run with slub_debug on the command line or with CONFIG_SLUB_DEBUG_ON to debug this. Anyone know what the possible meaning of 0x0000000500000500 is? I do not see anything in poison.h. --
0x500000000 is 21474836480. So a 21GB boundary? Some sort of device? --
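The arithmetic in the guess above can be checked in a couple of lines. This is only a sketch of the number conversion; it says nothing about where the value came from. Note that the faulting address has a non-zero low half as well, so it is not exactly on the rounded boundary.

```python
# Faulting address from the oops (CR2 / R12) and the rounded value
# discussed in the reply.
fault_addr = 0x0000000500000500
rounded = 0x500000000

# Split the 64-bit value into its two 32-bit halves.
high_half = fault_addr >> 32
low_half = fault_addr & 0xFFFFFFFF

print(rounded)                      # decimal byte count
print(rounded / 2**30)              # the same value in GiB
print(hex(high_half), hex(low_half))
```

In binary units the rounded value is 20 GiB; "21GB" is the decimal (10^9) reading of the same number.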
--
I can reproduce semi-reliably (by kernel standards) corruption in kmalloc-2048. No idea if this can explain all the "struct file" related oopses I saw, or the SLUB free pointer corruption Pekka and Christoph are looking into.
8139too and atl1 drivers are in use. 8139too connects to the outer world, atl1 to the laptop collecting netconsole logs. However, I never managed to collect late oopses with netconsole, even if the init scripts which shut down interfaces are disabled. :-(
Transcribed from photo:
8000 flags=0x8000000000002082
INFO: Object 0xffff81017ff9d2d0 @offset=21200 fp=0xffff81017ff9ca88
Bytes b4 0xffff81017ff9d2c0: 62 ea ff ff 00 00 00 00 5a 5a 5a 5a 5a 5a 5a 5a
Object 0xffff81017ff9d2d0: 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b
Object 0xffff81017ff9d2e0: 6b 6b 00 18 f3 a2 9f 90 00 1b 38 af 22 49 08 00
Object 0xffff81017ff9d2f0: 45 10 00 4c ff 59 40 00 40 11 86 ac c0 a8 00 2a
Object 0xffff81017ff9d300: 50 fa a2 be 91 43 00 7b 00 38 54 d4 23 00 00 00
Object 0xffff81017ff9d310: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
Object 0xffff81017ff9d320: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
Object 0xffff81017ff9d330: 00 00 00 00 4c ff 10 44 74 7f 6f 9d e4 c8 a2 4f
Object 0xffff81017ff9d340: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
Redzone 0xffff81017ff9dad0: bb bb bb bb bb bb bb bb
Padding 0xffff81017ff9db10: 5a 5a 5a 5a 5a 5a 5a 5a
Pid: 6168, comm: reboot Not tainted 2.6.25-rc8-mm2 #28
Call Trace:
print_trailer
check_bytes_and_report
check_object
__free_slab
discard_slab
__slab_free
? skb_release_data
kfree
? skb_release_data
skb_release_all
__kfree_skb
kfree_skb
atl1_clean_rx_ring
atl1_down
atl1_close
dev_close
dev_change_flags
devinet_ioctl
? trace_hardirqs_on
inet_ioctl
sock_ioctl
vfs_ioctl
do_vfs_ioctl
sys_ioctl
system_call_after_swapgs
FIX kmalloc-2048: Restoring 0xffff81017ff9d2e2-0xffff81017ff9d8d9=0x6b --
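One way to read the bytes that replaced the 0x6b poison is to parse them as a network frame. The sketch below assumes the overwritten region starts with a plain Ethernet II header followed by IPv4 and UDP; that layout is a guess based on the 08 00 and 45 bytes, not something stated in the report. The hex string is copied from the "Object" rows above, starting at the first non-poison byte.

```python
import struct

# Bytes from the dump, beginning at 0xffff81017ff9d2e2 (first overwritten byte).
raw = bytes.fromhex(
    "0018f3a29f90"                              # assumed destination MAC
    "001b38af2249"                              # assumed source MAC
    "0800"                                      # assumed EtherType
    "4510004cff594000401186acc0a8002a50faa2be"  # assumed 20-byte IPv4 header
    "9143007b003854d4"                          # assumed 8-byte UDP header
)

def fmt_mac(b):
    return ":".join(f"{x:02x}" for x in b)

def fmt_ip(b):
    return ".".join(str(x) for x in b)

dst_mac, src_mac, ethertype = struct.unpack("!6s6sH", raw[:14])
(ver_ihl, tos, total_len, ident, frag, ttl,
 proto, ip_csum, src_ip, dst_ip) = struct.unpack("!BBHHHBBH4s4s", raw[14:34])
sport, dport, udp_len, udp_csum = struct.unpack("!HHHH", raw[34:42])

# Size and position of the range SLUB reported as restored.
first, last, obj = 0xffff81017ff9d2e2, 0xffff81017ff9d8d9, 0xffff81017ff9d2d0
restored_len = last - first + 1
offset_in_object = first - obj

print(fmt_mac(src_mac), "->", fmt_mac(dst_mac), hex(ethertype))
print(fmt_ip(src_ip), "->", fmt_ip(dst_ip), "proto", proto)
print("udp", sport, "->", dport, "len", udp_len)
print("restored", restored_len, "bytes at object offset", offset_in_object)
```

Under those assumptions the data looks like a received UDP datagram to port 123 sitting where a receive buffer used to be, which would be consistent with the "skb corruption" reading in the reply that follows.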
Looks like skb corruption. Would be helpful to have the complete output though. Does the data in the restored range trigger any memories? --
No. I'm currently tracing this bug and 2.6.24 also has it. :-( --
OK, nailed it. It's commit 5f08e46b621a769e52a9545a23ab1d5fb2aec1d4 aka "atl1: disable broken 64-bit DMA". With this commit in tree, I can reproduce either a) kmalloc-2048 corruption after initscripts shutdown eth0 http://marc.info/?l=linux-kernel&m=120820360221261&w=2 b) or oopses at filp_close() first reported long ago (sorry, can't find that email) c) or hard hang after initscripts shutdown eth0 with even SysRq not working. http://marc.info/?l=linux-kernel&m=120795046008115&w=2 I have two boxes one with atl1, 4G RAM with 2G remapped after 4G boundary, another with r8169 connected with just ethernet cable. NICs agree on 1Gbps speed. So, it's enough to scp 200 MB git archive and immediately start rebooting sequence for horrors described above to appear. It's not 100% reproducible but more like 90%. I tested 10 times kernel one commit before and it doesn't have these issues and reboots reliably. CONFIG_IOMMU is in use, dmesg, lspci, /proc/mtrr below: 03:00.0 Ethernet controller [0200]: Attansic Technology Corp. L1 Gigabit Ethernet Adapter [1969:1048] (rev b0) Subsystem: ASUSTeK Computer Inc. Unknown device [1043:8226] Control: I/O- Mem+ BusMaster+ SpecCycle- MemWINV- VGASnoop- ParErr- Stepping- SERR- FastB2B- DisINTx+ Status: Cap+ 66MHz- UDF- FastB2B- ParErr- DEVSEL=fast >TAbort- <TAbort- <MAbort- >SERR- <PERR- INTx- Latency: 0, Cache Line Size: 32 bytes Interrupt: pin A routed to IRQ 319 Region 0: Memory at fe9c0000 (64-bit, non-prefetchable) [size=256K] Expansion ROM at fe9a0000 [disabled] [size=128K] Capabilities: [40] Power Management version 2 Flags: PMEClk- DSI- D1- D2- AuxCurrent=0mA PME(D0-,D1-,D2-,D3hot+,D3cold+) Status: D0 PME-Enable- DSel=0 DScale=0 PME- Capabilities: [48] Message Signalled Interrupts: Mask- 64bit+ Queue=0/0 Enable+ Address: 00000000fee0300c Data: 4161 Capabilities: [58] Express (v1) Endpoint, MSI 00 DevCap: MaxPayload 128 bytes, PhantFunc 0, Latency L0s <4us, L1 unlimited Ext...
On Sat, 19 Apr 2008 18:45:35 +0400 Do I understand correctly that these failures occur only while the network interface is going down? Jay --
Yep. During up or running there were no problems with this card. --
On Sun, 20 Apr 2008 15:14:53 +0400 One more question: Does it happen whether or not you're using atl1 as a netconsole? --
Without netconsole the bugs happen too. --
On Sun, 20 Apr 2008 16:26:31 +0400 I can't duplicate this error, but it's probably because my machine doesn't have 4GB of memory. I have one report from February 2008 of another user encountering strange oopses in 2.6.23.12 and 2.6.24 whenever he downed the interface. I suspect your experience is a repeat of that. Just to be clear, you transfer about 200MB to the NIC (Rx direction), then immediately reboot, right? Can you duplicate the problem if you simply ifconfig down instead of rebooting after the transfer? Thanks for your help. Jay --
Aha, ifconfig down is enough. Here is what the reproducer looks like now: ./sync-linux-linus && ssh core2 "sudo /sbin/ifconfig eth0 down" where the first script is basically scp(1). Also, booting with 1G or 2G of RAM (mem=1024m) makes the issue go away. A printk at dev_close() time shows that NETIF_F_HIGHDMA had not somehow been enabled. --
Does the problem go away with iommu=nomerge? If so, I suspect we're not properly flushing an iowrite somewhere. -- Chris --
nomerge doesn't help. --
On Mon, 21 Apr 2008 00:55:00 +0400
Alexey, can you please try this (very minimally tested) patch?
diff --git a/drivers/net/atlx/atl1.c b/drivers/net/atlx/atl1.c
index 5586fc6..07fe5c0 100644
--- a/drivers/net/atlx/atl1.c
+++ b/drivers/net/atlx/atl1.c
@@ -1115,9 +1115,6 @@ static void atl1_free_ring_resources(struct atl1_adapter *adapter)
 	struct atl1_rrd_ring *rrd_ring = &adapter->rrd_ring;
 	struct atl1_ring_header *ring_header = &adapter->ring_header;
 
-	atl1_clean_tx_ring(adapter);
-	atl1_clean_rx_ring(adapter);
-
 	kfree(tpd_ring->buffer_info);
 	pci_free_consistent(pdev, ring_header->size, ring_header->desc,
 		ring_header->dma);
@@ -3423,6 +3420,8 @@ static int atl1_set_ringparam(struct net_device *netdev,
 		adapter->rrd_ring = rrd_old;
 		adapter->tpd_ring = tpd_old;
 		adapter->ring_header = rhdr_old;
+		atl1_clean_tx_ring(adapter);
+		atl1_clean_rx_ring(adapter);
 		atl1_free_ring_resources(adapter);
 		adapter->rfd_ring = rfd_new;
 		adapter->rrd_ring = rrd_new;
--
On Mon, 21 Apr 2008 21:08:21 -0500 Alexey, have you found time to try this patch yet? --
Looking at other netdevice drivers: 8139too and others check netif_running() in the interrupt handler. r8169 has a scary "50k$" question comment about irqs being disabled after interacting with hardware. But the r8169 case should be fixed by atlx_irq_disable()? Writes to REG_IMR and REG_ISR are commented out in atl1_reset_hw(); why? (I'll test that soon.) Do we have a theory why changing from a 64-bit DMA mask to a 32-bit mask resurrects the bug? The NIC here never showed any sort of corruption described in the commit which banned 64-bit DMA. --
On Mon, 5 May 2008 01:15:07 +0400
I've tried all the stuff you mentioned above, and more, to prevent the memory corruption, all to no avail.
I booted with mem=4000M and didn't hit the bug. I diffed dmesg between booting with mem=4000M and booting without it, and found that iommu was being disabled when booting with full memory:
--- dmesg-4000.txt	2008-05-06 10:14:07.000000000 -0500
+++ dmesg-4096.txt	2008-05-06 10:09:19.000000000 -0500
@@ -1,5 +1,5 @@
 Linux version 2.6.26-rc1 (jcliburn@finch.hogchain.net) (gcc version 4.1.2 20070925 (Red Hat 4.1.2-27)) #4 SMP Mon May 5 18:03:48 CDT 2008
-Command line: ro root=LABEL=/1 console=ttyS0,38400 console=tty0 slub_debug=FZPU mem=4000M
+Command line: ro root=LABEL=/1 console=ttyS0,38400 console=tty0 slub_debug=FZPU
[...]
+Looks like a VIA chipset. Disabling IOMMU. Override with iommu=allowed
[...]
So I then booted with iommu=allowed. No errors. Can't hit the bug to save my life.
Why would disabling iommu cause the atl1 driver to write over poisoned memory?
Alexey, can you please try booting with iommu=allowed and see if you avoid the problem?
Thanks, Jay --
Hmmm, there was a wonderful oops on interface stop here when the other end of atl1 cable was physically unplugged (but there was traffic before): atl1_down atl1_clean_rx_ring swiotlb_unmap_single swiotlb_unmap_single_attrs memcpy_c --
Intel chip, or AMD? -- Chris --
Intel(R) Core(TM)2 CPU 6400 @ 2.13GHz Asus P5B-E motherboard. --
[trimmed cc list slightly]
On Sat, 10 May 2008 00:07:15 +0400
I see the same thing with a Socket AM2-based board (Asus M2V) with 4GB RAM installed. The problem occurs only when SWIOTLB is active, which happens automatically at boot (in arch/x86/kernel/pci-swiotlb.c) when the page frame number exceeds 1048576 (corresponding to 2^32 bytes).
I thought for a while that the problem went away with iommu=allowed, but I was wrong.
The bug appears to be a "simple" skb write-after-free that happens only when bounce buffers are in use, but I'll be damned if I can find the cause of it.
<continues looking>
=============================================================================
BUG kmalloc-2048: Poison overwritten
-----------------------------------------------------------------------------
INFO: 0xffff81010004297a-0xffff810100042f71. First byte 0x0 instead of 0x6b
INFO: Allocated in dev_alloc_skb+0x16/0x2c age=5813 cpu=0 pid=3029
INFO: Freed in skb_release_data+0xa8/0xad age=201 cpu=0 pid=0
INFO: Slab 0xffffe20005801600 objects=15 used=0 fp=0xffff810100045b18 flags=0x8000000000002082
INFO: Object 0xffff810100042968 @offset=10600 fp=0xffff8101000418d8
Bytes b4 0xffff810100042958: aa 91 fd ff 00 00 00 00 5a 5a 5a 5a 5a 5a 5a 5a ........ZZZZZZZZ
Object 0xffff810100042968: 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b kkkkkkkkkkkkkkkk
Object 0xffff810100042978: 6b 6b 00 17 31 4e 9d 41 00 0f db bc af 14 08 00 kk..1N.A........
Object 0xffff810100042988: 45 00 00 4e 87 5e 00 00 40 11 6e 82 c0 a8 01 fe E..N.^..@.n.....
Object 0xffff810100042998: c0 a8 01 70 00 89 00 89 00 3a 3b 67 00 09 00 00 ...p.....:;g....
Object 0xffff8101000429a8: 00 01 00 00 00 00 00 00 20 43 4b 41 41 41 41 41 .........CKAAAAA
Object 0xffff8101000429b8: 41 41 41 41 41 41 41 41 41 41 41 41 41 41 41 41 AAAAAAAAAAAAAAAA
Object 0xffff8101000429c8: 41 41 41 41 41 41 41 41 41 00 00 21 00 01 f0 53 AAAAAAAAA..!...S
Object 0xffff8101000429d8: 56 17 df 3e 3b 9...
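The page-frame threshold quoted in the message above is easy to sanity-check. The sketch assumes 4 KiB pages (the usual x86_64 base page size); the helper names are made up for illustration and are not kernel functions.

```python
PAGE_SIZE = 4096          # assumption: 4 KiB base pages
PFN_THRESHOLD = 1048576   # page frame number quoted in the message above

def bytes_covered(pfn, page_size=PAGE_SIZE):
    """Physical bytes spanned by page frames 0 .. pfn-1."""
    return pfn * page_size

def reachable_with_32bit_mask(phys_addr):
    """True if a device limited to a 32-bit DMA mask could address phys_addr."""
    return phys_addr < 2**32

print(bytes_covered(PFN_THRESHOLD) == 2**32)
print(reachable_with_32bit_mask(2**32 - 1), reachable_with_32bit_mask(2**32))
```

So any page frame past that number lies above what a 32-bit DMA mask can reach, which is the situation in which bounce buffering comes into play for a device restricted to 32-bit addresses.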
Try this patch! If scared, remove swiotlb poisoning, I'm not entirely
sure it's correct, but it makes aforementioned second oops
deterministic.
--- a/drivers/net/atlx/atl1.c
+++ b/drivers/net/atlx/atl1.c
@@ -2027,6 +2029,7 @@ rrd_ok:
/* Good Receive */
pci_unmap_page(adapter->pdev, buffer_info->dma,
buffer_info->length, PCI_DMA_FROMDEVICE);
+ buffer_info->dma = 0;
skb = buffer_info->skb;
length = le16_to_cpu(rrd->xsz.xsum_sz.pkt_size);
diff --git a/lib/swiotlb.c b/lib/swiotlb.c
index d568894..f6165ed 100644
--- a/lib/swiotlb.c
+++ b/lib/swiotlb.c
@@ -399,12 +399,14 @@ unmap_single(struct device *hwdev, char *dma_addr, size_t size, int dir)
/*
* First, sync the memory before unmapping the entry
*/
- if (buffer && ((dir == DMA_FROM_DEVICE) || (dir == DMA_BIDIRECTIONAL)))
+ if (buffer && ((dir == DMA_FROM_DEVICE) || (dir == DMA_BIDIRECTIONAL))) {
/*
* bounce... copy the data back into the original buffer * and
* delete the bounce buffer.
*/
memcpy(buffer, dma_addr, size);
+ io_tlb_orig_addr[index] = (void *)0x9a9a9a9a9a9a9a9aUL;
+ }
/*
* Return the buffer to the free list by setting the corresponding
--
On Sat, 10 May 2008 23:31:07 +0400
Seems to fix it for me. Nicely done, Alexey! Thanks! I looked at that blasted unmap a thousand times, but never noticed the missing buffer_info->dma clear. I'll get input from one more tester, and if it's positive, I'll submit this to Jeff. --
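Why a missing `buffer_info->dma = 0` can turn into a write-after-free when bounce buffers are in use may be easier to see in a toy model. Everything below is hypothetical scaffolding (`ToyBounce`, `run`, the dictionary standing in for a ring entry); it is not the driver or swiotlb code. It assumes, as the oops trace through atl1_clean_rx_ring and swiotlb_unmap_single suggests, that the cleanup path unmaps every ring entry whose DMA handle is still non-zero, and that unmapping copies the bounce buffer back to the remembered original address.

```python
POISON = 0x6B  # SLUB's free-object poison byte, as seen in the dumps

class ToyBounce:
    """Toy stand-in for a bounce-buffer layer: unmap copies data back."""
    def __init__(self):
        self.orig = {}     # handle -> original buffer (never forgotten here)
        self.bounce = {}   # handle -> bounce buffer
        self.next_handle = 1

    def map(self, buf):
        handle = self.next_handle
        self.next_handle += 1
        self.orig[handle] = buf
        self.bounce[handle] = bytearray(len(buf))
        return handle

    def device_write(self, handle, data):
        self.bounce[handle][:len(data)] = data

    def unmap(self, handle):
        # Copy bounce data back into whatever the original address was.
        self.orig[handle][:] = self.bounce[handle]

def run(clear_dma_after_unmap):
    iommu = ToyBounce()
    buf = bytearray(8)
    entry = {"dma": iommu.map(buf)}
    iommu.device_write(entry["dma"], b"\x45\x10\x00\x4c")

    # Receive path: unmap, hand the packet up the stack.
    iommu.unmap(entry["dma"])
    if clear_dma_after_unmap:
        entry["dma"] = 0          # the one-line fix from the patch above

    # The stack frees the packet; the allocator poisons the memory.
    buf[:] = bytes([POISON]) * len(buf)

    # Interface goes down: cleanup unmaps anything still marked as mapped.
    if entry["dma"]:
        iommu.unmap(entry["dma"])
        entry["dma"] = 0
    return bytes(buf)

print(run(clear_dma_after_unmap=False).hex())
print(run(clear_dma_after_unmap=True).hex())
```

In the model, the run without the clear ends with packet bytes sitting in freed, poisoned memory, while the run with the clear leaves the poison intact. The model also suggests why the swiotlb poisoning hunk made the second oops deterministic: a stale copy-back would then target an obviously bogus address instead of silently scribbling on freed memory.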
On Mon, 5 May 2008 01:15:07 +0400 We had multiple reports of users who encountered repeated memory corruption when transferring large files while running with a 64-bit DMA mask. Chris Snook noticed the upper 32 bits of the descriptor address register are shared among five other registers, each containing the low bits for one of five descriptors. All the descriptors, therefore, have to live within the same 4GB address space. I'll keep poking at it as time permits through the week, but I probably won't be able to devote a whole lot of time to it until next weekend. --
On Sun, 4 May 2008 19:31:28 -0500 Make that "...within the same 2GB address space." --
I've tried it and it doesn't help. --
Patch doesn't help unfortunately.
BTW, below is clean corruption trace:
atl1 0000:03:00.0: eth0 link is up 1000 Mbps full duplex
=============================================================================
BUG kmalloc-2048: Poison overwritten
-----------------------------------------------------------------------------
INFO: 0xffff81017ed7a97a-0xffff81017ed7af71. First byte 0x0 instead of 0x6b
INFO: Allocated in dev_alloc_skb+0x18/0x30 age=23894 cpu=1 pid=30255
INFO: Freed in skb_release_data+0x7a/0xc0 age=20700 cpu=0 pid=0
INFO: Slab 0xffffe200053bf240 used=12 fp=0xffff81017ed7a968 flags=0x17c000000040c3
INFO: Object 0xffff81017ed7a968 @offset=10600 fp=0xffff81017ed7ca88
Bytes b4 0xffff81017ed7a958: 14 09 a7 01 01 00 00 00 5a 5a 5a 5a 5a 5a 5a 5a ..§.....ZZZZZZZZ
Object 0xffff81017ed7a968: 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b kkkkkkkkkkkkkkkk
Object 0xffff81017ed7a978: 6b 6b 00 18 f3 a2 9f 90 00 1b 38 af 22 49 08 00 kk..ó¢....8¯"I..
Object 0xffff81017ed7a988: 45 10 00 4c a4 9f 40 00 40 11 d2 fe c0 a8 00 2a E..L¤.@.@.ÒþÀ¨.*
Object 0xffff81017ed7a998: 59 6f a8 b1 9d e9 00 7b 00 38 58 29 23 00 00 00 Yo¨±.é.{.8X)#...
Object 0xffff81017ed7a9a8: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................
Object 0xffff81017ed7a9b8: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................
Object 0xffff81017ed7a9c8: 00 00 00 00 1e 31 61 fa 08 5e 9a 73 de cf ce 94 .....1aú.^.sÞÏÎ.
Object 0xffff81017ed7a9d8: 63 64 65 66 67 68 6a 69 6b 6c 6d 6e 6f 70 71 72 cdefghjiklmnopqr
Redzone 0xffff81017ed7b168: bb bb bb bb bb bb bb bb »»»»»»»»
Padding 0xffff81017ed7b1a8: 5a 5a 5a 5a 5a 5a 5a 5a ZZZZZZZZ
Pid: 31677, comm: ifconfig Not tainted 2.6.25-3925e6fc1f774048404fdd910b0345b06c699eb4 #5
Call Trace:
[<ffffffff80288277>] print_trailer+0xe7/0x170
[<ffffffff802883a5>] check_bytes_and_report+0xa5/0xd0
[<ffffffff80288678>] check_object+...
Yes, I don't think the slub changes are ready for prime-time. There is a fix in ftp://ftp.kernel.org/pub/linux/kernel/people/akpm/patches/2.6/2.6.25-rc8/2.6.25-rc8-mm... but it won't help this crash. --
At this point there is no tie-in with the slub changes. This is free pointer corruption that is typical of writing to an object after freeing it. Enabling slub debugging is needed to figure out when the object was overwritten. --
On Sun, Apr 13, 2008 at 11:53 PM, Andrew Morton Indeed. I now dropped the SLUB defragmentation patches from the 'for-mm' branch so Andrew can you please pull the new branch to -mm? --
Pekka fixed SLUB for me, and now core2 box survives up and including to not finding / : Setup is SATA disk with plain old partitions, nothing lvmancy: /dev/sda2 on / type ext3 (rw,noatime) CONFIG_ATA=y CONFIG_ATA_ACPI=y CONFIG_SATA_AHCI=y CONFIG_ATA_PIIX=y CONFIG_PATA_JMICRON=y sda1 is for swap. [ 3.920000] NET: Registered protocol family 1 [ 3.920000] VFS: Cannot open root device "sda2" or unknown-block(0,0) [ 3.920000] Please append a correct "root=" boot option; here are the available partitions: [ 3.920000] Kernel panic - not syncing: VFS: Unable to mount root fs on unknown-block(0,0) [ 3.920000] Pid: 1, comm: swapper Not tainted 2.6.25-rc8-mm2 #19 [ 3.920000] [ 3.920000] Call Trace: [ 3.920000] [<ffffffff8022ff80>] panic+0xa0/0x180 [ 3.920000] [<ffffffff8044d8c9>] ? mutex_unlock+0x9/0x10 [ 3.920000] [<ffffffff805d8e03>] ? printk_all_partitions+0x23/0x180 [ 3.920000] [<ffffffff8044d8c9>] ? mutex_unlock+0x9/0x10 [ 3.920000] [<ffffffff805d8efc>] ? printk_all_partitions+0x11c/0x180 [ 3.920000] [<ffffffff8044ee46>] ? _read_unlock+0x26/0x30 [ 3.920000] [<ffffffff805c8e82>] mount_block_root+0x102/0x2a0 [ 3.920000] [<ffffffff805c9076>] mount_root+0x56/0x60 [ 3.920000] [<ffffffff805c90cc>] prepare_namespace+0x4c/0x160 [ 3.920000] [<ffffffff805c8bce>] kernel_init+0x23e/0x2f0 [ 3.920000] [<ffffffff80228b17>] ? finish_task_switch+0x67/0xe0 [ 3.920000] [<ffffffff8020c358>] child_rip+0xa/0x12 [ 3.920000] [<ffffffff803184d2>] ? acpi_os_acquire_lock+0x9/0xb [ 3.920000] [<ffffffff805c8990>] ? kernel_init+0x0/0x2f0 [ 3.920000] [<ffffffff8020c34e>] ? child_rip+0x0/0x12 The very same sequence with lockdep on (I originally thought it's the culprit): [ 12.770000] eth1: link up, 100Mbps, full-duplex, lpa 0x45E1 [ 17.820000] atl1 0000:03:00.0: eth0 link is up 1000 Mbps full duplex [ 3.500000] ahci 00...
The winner is partly me, partly git-libata-all. The latter introduced the CONFIG_ATA_SFF option and put more or less every SATA and PATA driver under it. The former honestly answered N when ATA_SFF popped up and failed to check for the existence of ATA_PIIX and PATA_JMICRON in the failing .config. Now raise your hands, those who knew that your ATA controller is SFF compliant. --
Is there any technical reason why we have to bother users with the
ATA_SFF option at all?
It sounds like a perfect candidate for being select'ed.
cu
Adrian
--
"Is there not promise of rain?" Ling Tan asked suddenly out
of the darkness. There had been need of rain for many days.
"Only a promise," Lao Er said.
Pearl S. Buck - Dragon Seed
--
'default y' is appropriate, but the option is used to disable a major swath of legacy code unneeded on modern FIS-based SATA platforms like AHCI. Jeff --
Heh.. yeah, but I have to admit SFF support is cryptic. We can definitely use some friendly explanation there. -- tejun --
I think you didn't understand my suggestion.
I didn't want to get the code enabled unconditionally; I want kconfig users not to be needlessly bothered with an option where we could determine the correct setting automatically.
cu
Adrian
<-- snip -->
Making ATA_SFF a user-visible option with the drivers needing it depending on it caused the following problems:
- people lose their driver when accidentally disabling it
- people not requiring it needlessly enable it
Fortunately, we don't have to bother the user with this option at all since we can simply select it when it's required.
Signed-off-by: Adrian Bunk <bunk@kernel.org>
---
 drivers/ata/Kconfig | 79 +++++++++++++++++++++++++++++++++++++++-----
 1 file changed, 71 insertions(+), 8 deletions(-)
--- linux-2.6.25-rc8-mm2/drivers/ata/Kconfig.old	2008-04-13 01:59:27.000000000 +0300
+++ linux-2.6.25-rc8-mm2/drivers/ata/Kconfig	2008-04-13 02:05:42.000000000 +0300
@@ -73,17 +73,12 @@
 	  If unsure, say N.
 
 config ATA_SFF
-	bool "ATA SFF support"
-	default y
-	help
-	  This option adds support for ATA controllers with SFF
-	  compliant or similar programming interface.
-
-if ATA_SFF
+	bool
 
 config SATA_SVW
 	tristate "ServerWorks Frodo / Apple K2 SATA support"
 	depends on PCI
+	select ATA_SFF
 	help
 	  This option enables support for Broadcom/Serverworks/Apple K2
 	  SATA support.
@@ -93,6 +88,7 @@
 config ATA_PIIX
 	tristate "Intel ESB, ICH, PIIX3, PIIX4 PATA/SATA support"
 	depends on PCI
+	select ATA_SFF
 	help
 	  This option enables support for ICH5/6/7/8 Serial ATA
 	  and support for PATA on the Intel ESB/ICH/PIIX3/PIIX4 series
@@ -103,6 +99,7 @@
 config SATA_MV
 	tristate "Marvell SATA support (HIGHLY EXPERIMENTAL)"
 	depends on EXPERIMENTAL
+	select ATA_SFF
 	help
 	  This option enables support for the Marvell Serial ATA family.
 	  Currently supports 88SX[56]0[48][01] chips.
@@ -112,6 +109,7 @@
 config SATA_NV
 	tristate "NVIDIA SATA support"
 	dep...
On Thu, 10 Apr 2008 20:33:54 -0700 On ia64/NUMA box (which has empty nodes.) CONFIG_SLAB ..... booted well CONFIG_SLUB ..... can't boot CONFIG_SLUB + CONFIG_SLUB_DEBUG_ON .... booted. Hmm? I'll dig more if I can. 2.6.25-rc8-mm1 had no troubles. Thanks, -Kame --
On Fri, 11 Apr 2008 18:57:03 +0900 with slub_nomerge , booted. Thanks, -Kame --
What happens when it doesn't boot? Does it hang or do you get an oops? Can you reproduce it with the 'for-mm' branch of: git://git.kernel.org/pub/scm/linux/kernel/git/penberg/slab-2.6.git Pekka --
On Fri, 11 Apr 2008 13:34:18 +0300 will try. Thanks, -Kame --
On Fri, 11 Apr 2008 19:57:38 +0900 slab-2.6.git booted well :( Hmm, It seems I have to dig somewhere different... Thanks, -Kame --
On Fri, 11 Apr 2008 20:17:24 +0900
Sorry, I tested *master* branch ;), under *testing* branch it reproduced.
bisected. (see below)
I'm sorry I can't use my box for next 2 days. I can test possible fix
on Monday (in Japan).
==bisect result==
831d78b552aade2c383cf8d75b180dd35f81a4e3 is first bad commit
commit 831d78b552aade2c383cf8d75b180dd35f81a4e3
Author: Christoph Lameter <clameter@sgi.com>
Date: Tue Apr 8 22:26:30 2008 +0300
SLUB: Add KICKABLE to avoid repeated kick() attempts
Add a flag KICKABLE to be set on slabs with a defragmentation method
Clear the flag if a kick action is not successful in reducing the
number of objects in a slab. This will avoid future attempts to
kick objects out.
The KICKABLE flag is set again when all objects of the slab have been
allocated (Occurs during removal of a slab from the partial lists).
Reviewed-by: Rik van Riel <riel@redhat.com>
Signed-off-by: Christoph Lameter <clameter@sgi.com>
== bisect log ==
git-bisect start
# good: [0e81a8ae37687845f7cdfa2adce14ea6a5f1dd34] Linux 2.6.25-rc8
git-bisect good 0e81a8ae37687845f7cdfa2adce14ea6a5f1dd34
# bad: [28e4b71a66881df1ac343f13d06395fa01021e8e] slub: use typedefs for ->get and ->kick functions
git-bisect bad 28e4b71a66881df1ac343f13d06395fa01021e8e
# good: [9597362d354f8655ece324b01d0c640a0e99c077] Merge git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/usb-2.6
git-bisect good 9597362d354f8655ece324b01d0c640a0e99c077
# good: [28b8383d5d4d9b636c3734c993563bafdc2ab3c3] Merge git://git.kernel.org/pub/scm/linux/kernel/git/mchehab/v4l-dvb
git-bisect good 28b8383d5d4d9b636c3734c993563bafdc2ab3c3
# good: [869ab5147e1eead890245cfd4f652ba282b6ac26] SELinux: more GFP_NOFS fixups to prevent selinux from re-entering the fs code
git-bisect good 869ab5147e1eead890245cfd4f652ba282b6ac26
# good: [acd49c885e03f087c31f49e7c42ccb8befbf4009] slub: Make the order configurable for each slab cache
git-bisect good acd49c885e03f087c31f49e7c...
My bad, sorry. Fixed and pushed out.
From: Pekka Enberg <penberg@cs.helsinki.fi>
Date: Fri, 11 Apr 2008 17:17:43 +0300
Subject: [PATCH] slub: add missing slab_unlock() to __kmem_cache_shrink()
If page is not kickable, remember to slab_unlock() before continuing the loop.
Signed-off-by: Pekka Enberg <penberg@cs.helsinki.fi>
---
mm/slub.c | 4 +++-
1 files changed, 3 insertions(+), 1 deletions(-)
diff --git a/mm/slub.c b/mm/slub.c
index 4b694a7..f09f1fb 100644
--- a/mm/slub.c
+++ b/mm/slub.c
@@ -2926,8 +2926,10 @@ static unsigned long __kmem_cache_shrink(struct kmem_cache *s, int node,
continue;
if (page->inuse) {
- if (!SlabKickable(page))
+ if (!SlabKickable(page)) {
+ slab_unlock(page);
continue;
+ }
if (page->inuse * 100 >=
s->defrag_ratio * page->objects) {
--
1.5.2.5
--
On Fri, 11 Apr 2008 17:24:11 +0300
Works well. Thank you.
Tested-by: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
-Kame --
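The bug class fixed by the patch above (leaving a loop iteration with `continue` while still holding a per-object lock) can be shown with a small stand-alone model. The `Page` class and `shrink` function are illustrative stand-ins, not the SLUB code, and the non-blocking acquire at the top of the loop is an assumption about how the real loop takes the slab lock.

```python
import threading

class Page:
    """Hypothetical stand-in for a slab page with a per-page lock."""
    def __init__(self, inuse, kickable):
        self.inuse = inuse
        self.kickable = kickable
        self.lock = threading.Lock()

def shrink(pages, unlock_before_continue):
    """Walk the pages; return how many are still locked afterwards."""
    for page in pages:
        if not page.lock.acquire(blocking=False):
            continue                      # could not lock: skip this page
        if page.inuse:
            if not page.kickable:
                if unlock_before_continue:
                    page.lock.release()   # the added slab_unlock()
                continue                  # without the release, the lock leaks
            # (a real implementation would consider defragmenting here)
        page.lock.release()
    return sum(1 for page in pages if page.lock.locked())

def make_pages():
    return [Page(3, False), Page(0, True), Page(2, True)]

print(shrink(make_pages(), unlock_before_continue=False))
print(shrink(make_pages(), unlock_before_continue=True))
```

In the model the leaked lock only shows up as a counter; in a kernel, the next path that needs that page's lock would spin or deadlock, which fits a boot that hangs rather than oopses.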
Is this due to the generic_find_next_le_bit compile error I reported as
2.6.25 regression (there seems to be some discussion recently how to
fix it - hopefully for 2.6.25) or is there even more breakage in -mm?
cu
Adrian
--
"Is there not promise of rain?" Ling Tan asked suddenly out
of the darkness. There had been need of rain for many days.
"Only a promise," Lao Er said.
Pearl S. Buck - Dragon Seed
--
Yes, it's the bitops screwup. I saw some related linux-ext4 email float past yesterday, so something might be happening. --
It would really help if we could get some m68k people to look at the patch I posted. http://article.gmane.org/gmane.comp.file-systems.ext4/5944 -aneesh --
I'd love to - only I can't seem to compile 2.6.25 at all. What's the minimum gcc version I should use? Michael --
My cross-compiler is gcc version 4.1.2 20061115 (prerelease) (Debian 4.1.1-21).
Gr{oetje,eeting}s,
Geert
P.S. I'll look into it when I find some time, but you may beat me to it...
--
Geert Uytterhoeven -- There's lots of Linux beyond ia32 -- geert@linux-m68k.org
In personal conversations with technical people, I call myself a hacker. But
when I'm talking to journalists I just say "programmer" or something like that.
-- Linus Torvalds
--
I bisected a boot hang after "ACPI: using IOAPIC for interrupt routing"
down to git-pekka.
Normal dmesg from 2.6.25-rc8-something and .config snippets for -mm follow.
# CONFIG_SLUB_DEBUG is not set (hey, I thought this was on!)
CONFIG_SLUB=y
# CONFIG_SLUB_STATS is not set
# CONFIG_DEBUG_DRIVER is not set
# CONFIG_DEBUG_DEVRES is not set
# CONFIG_DEBUG_FS is not set
CONFIG_DEBUG_KERNEL=y
CONFIG_DEBUG_SHIRQ=y
CONFIG_DEBUG_RT_MUTEXES=y
CONFIG_DEBUG_PI_LIST=y
CONFIG_DEBUG_SPINLOCK=y
CONFIG_DEBUG_MUTEXES=y
CONFIG_DEBUG_LOCK_ALLOC=y
# CONFIG_DEBUG_LOCKDEP is not set
CONFIG_DEBUG_SPINLOCK_SLEEP=y
# CONFIG_DEBUG_LOCKING_API_SELFTESTS is not set
# CONFIG_DEBUG_KOBJECT is not set
CONFIG_DEBUG_BUGVERBOSE=y
# CONFIG_DEBUG_INFO is not set
CONFIG_DEBUG_VM=y
CONFIG_DEBUG_WRITECOUNT=y
CONFIG_DEBUG_LIST=y
CONFIG_DEBUG_SG=y
# CONFIG_DEBUG_STACKOVERFLOW is not set
# CONFIG_DEBUG_STACK_USAGE is not set
CONFIG_DEBUG_PAGEALLOC=y
CONFIG_DEBUG_PER_CPU_MAPS=y
CONFIG_DEBUG_RODATA=y
# CONFIG_DEBUG_RODATA_TEST is not set
# CONFIG_DEBUG_NX_TEST is not set
P.S.: I now suspect that one bisection point was wrong, however,
make-module_sect_attrs-private-to-kernel-modulec-checkpatch-fixes
was definitely good.
[ 0.000000] Linux version 2.6.25-rc8 (ad@martell) (gcc version 4.1.2 (Gentoo 4.1.2 p1.0.2)) #2 SMP Fri Apr 11 00:54:03 MSD 2008
[ 0.000000] Command line: root=/dev/sda2 netconsole=@192.168.0.1/eth0,9353@192.168.0.42/00:1b:38:af:22:49 ignore_loglevel
[ 0.000000] BIOS-provided physical RAM map:
[ 0.000000] BIOS-e820: 0000000000000000 - 000000000009fc00 (usable)
[ 0.000000] BIOS-e820: 000000000009fc00 - 00000000000a0000 (reserved)
[ 0.000000] BIOS-e820: 00000000000e4000 - 0000000000100000 (reserved)
[ 0.000000] BIOS-e820: 0000000000100000 - 000000007ff90000 (usable)
[ 0.000000] BIOS-e820: 000000007ff90000 - 000000007ff9e000 (ACPI data)
[ 0.000000] BIOS-e820: 000000007ff9e000 - 000000007ffe0000 (ACPI NVS)
[ 0.000000] BIOS-e820: 000000007ffe0000 - 0...
Hi Alexey,
That's odd. I don't immediately see anything there that can cause this... Can you see the hang with the 'for-mm' branch of my tree:
git://git.kernel.org/pub/scm/linux/kernel/git/penberg/slab-2.6.git
If so, can you do a git bisect? Does sysrq-t tell us anything?
Pekka
--
Alexey, can you try passing the 'slub_nomerge' option to the kernel to see if the hang goes away with that?
Pekka
--
slub_nomerge doesn't help, and neither does turning on combinations of SLUB debug options.
--
Does the following patch fix it?
Pekka
From 7c7e7e5e7ec07c0a47705b2d21c779c39ba02252 Mon Sep 17 00:00:00 2001
From: Pekka Enberg <penberg@cs.helsinki.fi>
Date: Fri, 11 Apr 2008 17:17:43 +0300
Subject: [PATCH] slub: add missing slab_unlock() to __kmem_cache_shrink()
If page is not kickable, remember to slab_unlock() before continuing the loop.
Signed-off-by: Pekka Enberg <penberg@cs.helsinki.fi>
---
mm/slub.c | 4 +++-
1 files changed, 3 insertions(+), 1 deletions(-)
diff --git a/mm/slub.c b/mm/slub.c
index 4b694a7..f09f1fb 100644
--- a/mm/slub.c
+++ b/mm/slub.c
@@ -2926,8 +2926,10 @@ static unsigned long __kmem_cache_shrink(struct kmem_cache *s, int node,
 			continue;
 
 		if (page->inuse) {
-			if (!SlabKickable(page))
+			if (!SlabKickable(page)) {
+				slab_unlock(page);
 				continue;
+			}
 
 			if (page->inuse * 100 >=
 					s->defrag_ratio * page->objects) {
--
1.5.2.5
--
Yes, it helps. Now I have some more bugs to report. :-(
--