I've spent quite a while hunting that crap down; reverting the VFS fix
mentioned in the original thread *does* get rid of the symptoms, but so does
the patch below.
What happens is this: if ->follow_link() (usually reached via something like
stat("/proc/2/fd", ...) done by pidof(8)) returns ERR_PTR(-....), we return
to __do_follow_link() and do the following:
	*p = dentry->d_inode->i_op->follow_link(dentry, nd);
	error = PTR_ERR(*p);
	if (!IS_ERR(*p)) {
		char *s = nd_get_link(nd);
		error = 0;
		if (s)
			error = __vfs_follow_link(nd, s);
		else if (nd->last_type == LAST_BIND) {
			error = force_reval_path(&nd->path, nd);
			if (error)
				path_put(&nd->path);
		}
	}
	return error;
We _should_ return a non-zero value; IS_ERR(ERR_PTR(-n)) is 1 and
PTR_ERR(ERR_PTR(-n)) is -n. What happens instead is that this thing
actually returns 0. And no, it's not a miscompile. Patch below
removes the symptoms of the bug, but only if both parts are present.
I.e. *not* doing "report = 1" in proc_pid_follow_link() gives us
visible breakage, despite the fact that report is initialized as
1 and nothing except proc_pid_follow_link() ever tries to assign
anything to it. Seeing that fs/namei.c and fs/proc/base.c are
compiled separately, we can exclude gcc problems.
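
For reference, the error-pointer convention involved can be modelled in
userspace. This is only a sketch, not the kernel code: ERR_PTR(), PTR_ERR()
and IS_ERR() are re-implemented here on the assumption that error codes occupy
the top 4095 addresses, and follow_link_result() is a hypothetical stand-in
for the tail of __do_follow_link(). With a correctly compiled callee, an
ERR_PTR(-ENOENT) coming out of ->follow_link() has to come back as -ENOENT,
never as 0.

```c
#include <assert.h>
#include <errno.h>
#include <stdint.h>

#define MAX_ERRNO 4095

/* Userspace re-implementations of the kernel helpers (assumed layout). */
static inline void *ERR_PTR(long error) { return (void *)(intptr_t)error; }
static inline long PTR_ERR(const void *ptr) { return (long)(intptr_t)ptr; }
static inline int IS_ERR(const void *ptr)
{
	return (uintptr_t)ptr >= (uintptr_t)-MAX_ERRNO;
}

/* Hypothetical stand-in for the tail of __do_follow_link(). */
static int follow_link_result(void *cookie)
{
	int error = (int)PTR_ERR(cookie);

	if (!IS_ERR(cookie))
		error = 0;	/* the real code would walk the link body here */
	return error;
}

/* Sanity check of the model: errors propagate, valid pointers give 0. */
static void check_model(void)
{
	static char link_body[] = "target";

	assert(follow_link_result(ERR_PTR(-ENOENT)) == -ENOENT);
	assert(follow_link_result(link_body) == 0);
}
```

Calling check_model() from a test program passes on a sane toolchain; the
breakage described here corresponds to the first assertion failing.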
The cheapest way to reproduce is to boot with init=/bin/sh, then
mount /proc and have stat("/proc/2/exe", &st) called; if stat()
returns 0, we are fscked. The critical part is between return
from proc_exe_link() (we'll leave it via if (!mm) return -ENOENT;)
to return from __do_follow_link() -> do_follow_link() -> link_path_walk().
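
The check itself can be wrapped in a few lines of C. A minimal sketch, assuming
pid 2 is a kernel thread without an mm (so proc_exe_link() bails out with
-ENOENT); the helper names stat_result() and looks_miscompiled() are made up
for illustration.

```c
#include <assert.h>
#include <errno.h>
#include <sys/stat.h>

/*
 * Returns 0 if stat() succeeded on path, or the negated errno otherwise.
 * stat() follows symlinks, so it goes through ->follow_link(); lstat()
 * would not.
 */
static int stat_result(const char *path)
{
	struct stat st;

	assert(path != 0);
	if (stat(path, &st) == 0)
		return 0;
	return -errno;
}

/* Returns 1 when stat() on a link that must fail reports success. */
static int looks_miscompiled(const char *kthread_exe)
{
	return stat_result(kthread_exe) == 0;
}
```

On an affected kernel looks_miscompiled("/proc/2/exe") would be expected to
return 1; on a healthy one stat_result() gives -ENOENT for that path.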
If somebody familiar with aranym guts is up to debugging that, more
power to them. If I had seen it on real hardware, I'd suspect
something weird going on with caches, but...
FWIW, it's observable on amd64 host; I ...
--
I booted 2.6.36-rc7-atari-00360-g0dd2e6a (my current private test kernel) with
init=/bin/sh, mounted /proc, and tried
for i in $(seq 1000); do stat /proc/2/exe; done
a few times, but I didn't see any ida_remove messages.
It cannot read the /proc/2/exe symlink, though.
This is on aranym-0.9.9-1 from Ubuntu/amd64.
Gr{oetje,eeting}s,
Geert
--
Geert Uytterhoeven -- There's lots of Linux beyond ia32 -- geert@linux-m68k.org
In personal conversations with technical people, I call myself a hacker. But
when I'm talking to journalists I just say "programmer" or something like that.
-- Linus Torvalds
--
stat -L /proc/2/exe, otherwise you'll hit lstat() instead of stat().
And FWIW 0.9.10-1 squeeze/amd64 also triggers here...
--
Still, just "stat: cannot stat `proc/2/exe': No such file or directory" here...
Gr{oetje,eeting}s,
Geert
--
Argh... OK, going through aranym with debugger has exonerated it. My
apologies ;-/ It *is* gcc in sid. Testcase:
extern int foo(int);
void *bar(int n)
{
	return (void *)foo(n);
}
and gcc -S -O2 turns that into
bar:
	link.w %fp,#0
	unlk %fp
	jra foo
Spot the obvious bug... BTW, why on the Earth does debian-ports m68k tree
use gcc-4.3 with Cthulhu-scaring 700Kb gzipped patch and does *not* have
gcc-4.4?
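
For checking a given compiler at run time rather than by reading the assembly,
the testcase can be turned into a self-checking one. A minimal sketch: foo()
gets a body here (the original only declares it), the noinline attribute is a
GCC extension used to keep the call a real call, and check_bar() is a
hypothetical helper. The intptr_t casts are only there to keep 64-bit hosts
quiet; on m68k int and pointers have the same size.

```c
#include <assert.h>
#include <stdint.h>

/* noinline keeps the call to foo() an actual call at -O2. */
__attribute__((noinline)) int foo(int n)
{
	return -n;
}

/* Same shape as the testcase: an int result handed back as a pointer. */
__attribute__((noinline)) void *bar(int n)
{
	return (void *)(intptr_t)foo(n);
}

/* Abort if bar() does not hand back what foo() computed. */
void check_bar(int n)
{
	assert((intptr_t)bar(n) == (intptr_t)foo(n));
}
```

Built with the suspect cross-compiler at -O2 and run on the target,
check_bar() would be expected to trip the assertion if the result gets lost
on the way back.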
--
Yes, indeed. I’m working on gcc-4.4 but am stalled because, after finally
getting a kernel to build, it has no support for nfeth. Finn Thain has
kindly provided me with an eglibc+TLS sysroot tarball, which I can use to
bootstrap gcc-4.4+TLS, then Debian’s eglibc+TLS.

They don’t function at the moment because 70% of the archive is either
outdated or uninstallable on m68k. I’m mainly playing (as in game) buildd
here because of that fact, since I didn’t want mksh to not show up on all
architectures ;-) My goal is to get cowbuilder working, then re-bootstrap
enough of Debian/m68k to get the autobuilders working again.

[x] send unidiff

I’ll include that in my gcc build and then forward it to Debian once I got
a working gcc, unless you want to push that to them already.

bye,
//mirabilos
--
I believe no one can invent an algorithm. One just happens to hit upon it
when God enlightens him. Or only God invents algorithms, we merely copy them.
If you don't believe in God, just consider God as Nature if you won't deny
existence.
	-- Coywolf Qi Hunt
--
http://gcc.gnu.org/git/gitweb.cgi?p=gcc.git;a=commitdiff;h=d09fd72c630c4886367f1977cdb...

Andreas.
--
Andreas Schwab, schwab@linux-m68k.org
GPG Key fingerprint = 58CA 54C7 6D53 942B 1756 01D3 44D5 214B 8276 4ED5
"And now for something completely different."
--
Huh? Oh, you mean the net_device_ops transition missed in
arch/m68k/emu/nfeth.c? Just add
static const struct net_device_ops nfeth_netdev_ops = {
	.ndo_open		= nfeth_open,
	.ndo_stop		= nfeth_stop,
	.ndo_get_stats		= nfeth_get_stats,
	.ndo_start_xmit		= nfeth_xmit,
	.ndo_tx_timeout		= nfeth_tx_timeout,
	.ndo_validate_addr	= eth_validate_addr,
};
in it and replace the assignments to ->open, etc. in nfeth_probe() with
	dev->netdev_ops = &nfeth_netdev_ops;
and the sucker will work.
Below is what I'm using on top of mainline kernel; it's a combination of
a couple of patches in debian m68k kernel plus compile fixes. At least it
works well enough for booting with /dev/hda getting contents from file on
host and network working well enough for ssh. With gcc-4.1 that seems
to be enough. With 4.3... slapping assignment to global variable right
before the return from proc_pid_follow_link() gets it to boot and work
well enough for apt-get and compiles, but I wouldn't bet a dime on the
correctness around failure exits all over the tree. This kind of
miscompile is definitely triggered in a lot of places - anything that
does return ERR_PTR(error) right after error = foo(...); is going to
get fscked and it's not a rare thing. Amazing that it doesn't fall
apart much harder...
BTW, now that I've tried allmodconfig build, 4.3 gives a bunch of ICE
on e.g. ntfs. Cross-build, so it's not an underlying kernel breakage...
diff --git a/arch/m68k/Kconfig b/arch/m68k/Kconfig
index 8030e24..6a6893a 100644
--- a/arch/m68k/Kconfig
+++ b/arch/m68k/Kconfig
@@ -248,6 +248,37 @@ config SUN3
 	  If you don't want to compile a kernel exclusively for a Sun 3, say N.
+config NATFEAT
+	bool "ARAnyM emulator support"
+	depends on ATARI
+	help
+	  This option enables support for ARAnyM native features, such as
+	  access to a disk image as /dev/hda. Useful with the ARANYM option.
+
+config NFETH
+	tristate "NatFeat Ethernet ...
--
Only if they are optimized into tail calls.

Andreas.
--
And with -O2 4.3 does exactly that.

BTW, any comments on signal patchset? Seems to work here, including the
stack expansion fixes, but that's on aranym. I'll try to resurrect the real
hardware, but that may take a while. If somebody could give it a beating
in the meanwhile...
--
Oh, lovely... One more signal bug (and a lot more on m68knommu): if we
strace a process and signal is delivered during pagefault handling, you'll
lose the second call of syscall_trace() on sigreturn(). If the signal
is delivered during a syscall or during an interrupt, syscall_trace() is
called twice on sigreturn() (as it does on all platforms). Fortunately,
that's easy to fix - same as on alpha (calling syscall_trace in ret_from_signal
if we are getting traced, just before doing RESTORE_SWITCH_STACK). Will test
and post...
FWIW, on other targets we either have sys_{rt_,}sigreturn() done as normal
functions (in which case the normal logic will take care of that), or
have them return to place in (common) syscall exit path earlier than
conditional call of syscall trace (mips, score), or check flags and do
call ourselves (sparc, alpha since it had been fixed). AFAICS, m68k and
m68knommu are the only ones buggered that way. On alpha we used to have
it even worse - there we did only one call on sigreturn() unconditionally...
--
I tried it on my Amiga 4000/040.
Without your patches, gdb gets stuck in state D+ when reaching a breakpoint:
| cassandra:~# gdb /tmp/hello
| GNU gdb 6.4.90-debian
| Copyright (C) 2006 Free Software Foundation, Inc.
| GDB is free software, covered by the GNU General Public License, and you are
| welcome to change it and/or distribute copies of it under certain conditions.
| Type "show copying" to see the conditions.
| There is absolutely no warranty for GDB. Type "show warranty" for details.
| This GDB was configured as "m68k-linux-gnu"...Using host libthread_db library "/lib/libthread_db.so.1".
|
| (gdb) break main
| Breakpoint 1 at 0x800003fc: file /home/geert/hello.c, line 6.
| (gdb) run
| Starting program: /tmp/hello
|
| Breakpoint 1, main (argc=1, argv=0xefbeedb4) at /home/geert/hello.c:6
With your patches, it works a bit better:
| cassandra:~# gdb /tmp/hello
| GNU gdb 6.4.90-debian
| Copyright (C) 2006 Free Software Foundation, Inc.
| GDB is free software, covered by the GNU General Public License, and you are
| welcome to change it and/or distribute copies of it under certain conditions.
| Type "show copying" to see the conditions.
| There is absolutely no warranty for GDB. Type "show warranty" for details.
| This GDB was configured as "m68k-linux-gnu"...Using host libthread_db library "/lib/libthread_db.so.1".
|
| (gdb) break main
| Breakpoint 1 at 0x800003fc: file /home/geert/hello.c, line 6.
| (gdb) run
| Starting program: /tmp/hello
|
| Breakpoint 1, main (argc=1, argv=0xefcc5db4) at /home/geert/hello.c:6
| 6 printf("Hello, world! [C]\n");
| (gdb) cont
| Continuing.
| Hello, world! [C]
|
| Program exited normally.
| (gdb) run
| Starting program: /tmp/hello
|
| Breakpoint 1, main (argc=1, argv=0xef85bdb4) at /home/geert/hello.c:6
| 6 printf("Hello, world! [C]\n");
| (gdb) next
| Hello, world! [C]
After which gdb is stuck in S+, and /tmp/hello in t.
Gr{oetje,eeting}s,
Geert
--
Thorsten Glaser writes:
> Mikael Pettersson dixit:
>
> >It's gcc PR41302 which was fixed for gcc trunk on November 4 2009
> >in r153890. The patch backports easily to gcc-4.4 and solves the
> >test case there (manual inspection using a cross). It also backports
>
> [x] send unidiff
>
> I’ll include that in my gcc build and then forward it to Debian
> once I got a working gcc, unless you want to push that to them
> already.
This is what I have had in my repo since November last year,
but it's only been minimally tested in a cross. Please push
if/once it passes bootstrap + regtest.
gcc/

2009-11-10  Mikael Pettersson  <mikpe@it.uu.se>

	Backport from mainline:
	2009-11-04  Maxim Kuvyrkov  <maxim@codesourcery.com>

	PR target/41302
	* config/m68k/m68k.c (m68k_reg_present_p): New static function.
	(m68k_ok_for_sibcall_p): Handle different result return locations.

gcc/testsuite/

2009-11-10  Mikael Pettersson  <mikpe@it.uu.se>

	Backport from mainline:
	2009-11-04  Carlos O'Donell  <carlos@codesourcery.com>

	PR target/41302
	* gcc.target/m68k/pr41302.c: New test.

--- gcc-4.4.2/gcc/config/m68k/m68k.c.~1~ 2008-11-19 17:24:10.000000000 +0100
+++ gcc-4.4.2/gcc/config/m68k/m68k.c 2009-11-10 00:10:06.000000000 +0100
@@ -1374,6 +1374,30 @@ flags_in_68881 (void)
   return cc_status.flags & CC_IN_68881;
 }
+/* Return true if PARALLEL contains register REGNO. */
+static bool
+m68k_reg_present_p (const_rtx parallel, unsigned int regno)
+{
+  int i;
+
+  if (REG_P (parallel) && REGNO (parallel) == regno)
+    return true;
+
+  if (GET_CODE (parallel) != PARALLEL)
+    return false;
+
+  for (i = 0; i < XVECLEN (parallel, 0); ++i)
+    {
+      const_rtx x;
+
+      x = XEXP (XVECEXP (parallel, 0, i), 0);
+      if (REG_P (x) && REGNO (x) == regno)
+	return true;
+    }
+
+  return false;
+}
+
 /* Implement TARGET_FUNCTION_OK_FOR_SIBCALL_P. */
 static bool
@@ -1386,6 +1410,26 @@ m68k_ok_for_sibcall_p (tree decl, tree e
   if ...
--
I presume the bug is that foo put the return value in %d0 while bar
should have its return value in %a0. This function isn't eligible for
the optimization being used due to this need to move the result.

I believe that gcc-4.4 for m68k is being held up by the TLS support
patches. While I haven't been personally involved to any great degree,
I got the impression that the work is pretty much done other than
getting it included.

	Brad Boyer
	flar@allandria.com
--
Al Viro writes:
> On Mon, Oct 11, 2010 at 12:52:56AM +0100, Al Viro wrote:
> > On Sun, Oct 10, 2010 at 10:18:03PM +0200, Geert Uytterhoeven wrote:
> > > >> This is on aranym-0.9.9-1 from Ubuntu/amd64.
> > > >
> > > > stat -L /proc/2/exe, otherwise you'll hit lstat() instead of stat().
> > > > And FWIW 0.9.10-1 squeeze/amd64 also triggers here...
> > >
> > > Still, just "stat: cannot stat `proc/2/exe': No such file or directory" here...
> >
> > Interesting... Which gcc version is used?
>
> Argh... OK, going through aranym with debugger has exonerated it. My
> apologies ;-/ It *is* gcc in sid. Testcase:
>
> extern int foo(int);
> void *bar(int n)
> {
> 	return (void *)foo(n);
> }
>
> and gcc -S -O2 turns that into
> bar:
> 	link.w %fp,#0
> 	unlk %fp
> 	jra foo
>
> Spot the obvious bug... BTW, why on the Earth does debian-ports m68k tree
> use gcc-4.3 with Cthulhu-scaring 700Kb gzipped patch and does *not* have
> gcc-4.4?
I can confirm that the bug exists in gcc-4.3.4 and gcc-4.4.5,
but it has been fixed in gcc-4.5.1 which generates:
bar:
	link.w %fp,#0
	move.l 8(%fp),-(%sp)
	jsr foo
	move.l %d0,%a0
	unlk %fp
	rts
I don't yet know the gcc PR number or svn commit # for the fix
(in case people want a backport).
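
One way to read the two listings, assuming the convention visible in the
fixed code (integer results in %d0, pointer results in %a0), is the toy model
below. It is an illustration only, not an emulator: struct regs and every
function name in it are invented here, and foo() is given an arbitrary body.

```c
#include <assert.h>

/* Toy register file: only the two return registers matter here. */
struct regs {
	long d0;	/* integer return value */
	long a0;	/* pointer return value */
};

/* foo() returns int, so its result lands in d0 only. */
static void foo(struct regs *r, long n)
{
	r->d0 = -n;
}

/* Broken bar(): "jra foo", so nothing copies d0 into a0 afterwards. */
static void bar_sibcall(struct regs *r, long n)
{
	foo(r, n);
}

/* Fixed bar(): "jsr foo" followed by "move.l %d0,%a0". */
static void bar_fixed(struct regs *r, long n)
{
	foo(r, n);
	r->a0 = r->d0;
}

/* The caller of bar() expects a pointer, so it reads a0. */
static long caller_sees(void (*bar)(struct regs *, long), long n, long stale_a0)
{
	struct regs r = { .d0 = 0, .a0 = stale_a0 };

	assert(bar != 0);
	bar(&r, n);
	return r.a0;
}
```

In this model caller_sees(bar_sibcall, ...) returns whatever happened to be
in a0 before the call, which would be consistent with unrelated changes to the
callee making the symptom come and go.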
--
It's gcc PR41302 which was fixed for gcc trunk on November 4 2009
in r153890. The patch backports easily to gcc-4.4 and solves the
test case there (manual inspection using a cross). It also backports
easily to gcc-4.3 but I haven't tested it there.
--
See http://gcc.gnu.org/PR41302

Andreas.
--
Good old gcc version 4.1.2 20061115 (prerelease) (Ubuntu 4.1.1-21)
Gr{oetje,eeting}s,
Geert
