Re: AMD64 version of GNAT Ada compiler broken due to libthr

From: John Marino
Date: Friday, December 31, 2010 - 4:46 am

For several months I have been getting the GNAT Ada compiler to work 
properly on the four major BSDs.  The i386 FreeBSD, the i386 DragonFly 
BSD, and the x86_64 DragonFly BSD ports are currently perfect.  The i386 
and x86_64 ports of NetBSD are nearly perfect, lacking only a 
functional DWARF2 unwind mechanism, and the OpenBSD ports are in pretty 
good shape too.  Progress on this work can be seen at 
http://www.dragonlace.net

However, the AMD64 FreeBSD version is unusable, and it's due to libthr.  
I'm not sure why the i386 version works with libthr and AMD64 version 
doesn't.  For all four BSDs, there is no configuration difference for 
threading between architectures.

The problem seems to be with the pthread_cond_wait functionality.

I've logged a test case segfault via gdb7.1 below.  I would greatly 
appreciate some help in determining where the problem lies.  If this 
problem can be solved, it will likely result in a perfect port of the 
GNAT Ada compiler for FreeBSD AMD64, something that has not existed before.

Regards,
John

 

Starting program: /usr/home/marino/test_gnat/test_c9a009c/c9a009c
[New LWP 100051]
[New Thread 800a041c0 (LWP 100051)]
[New Thread 800a0ae40 (LWP 100073)]
[New Thread 800a64c80 (LWP 100080)]

Program received signal SIGSEGV, Segmentation fault.
[Switching to Thread 800a64c80 (LWP 100080)]
0x00007fffffbfeb19 in ?? ()
* 4 Thread 800a64c80 (LWP 100080)  0x00007fffffbfeb19 in ?? ()
  3 Thread 800a0ae40 (LWP 100073)  0x00000008006923cc in _umtx_op_err ()
    at /usr/src/lib/libthr/arch/amd64/amd64/_umtx_op_err.S:37
  2 Thread 800a041c0 (LWP 100051)  0x00000008006923cc in _umtx_op_err ()
    at /usr/src/lib/libthr/arch/amd64/amd64/_umtx_op_err.S:37
[Switching to thread 3 (Thread 800a0ae40 (LWP 100073))]#0  0x00000008006923cc in _umtx_op_err () at /usr/src/lib/libthr/arch/amd64/amd64/_umtx_op_err.S:37
37    RSYSCALL_ERR(_umtx_op)
#0  0x00000008006923cc in _umtx_op_err ()
    at ...
From: John Marino
Date: Friday, December 31, 2010 - 5:35 am

Hi Kostik,
You're right, that was an oversight.  I'm using release 8.1, but I tried 
troubleshooting this months ago on 8.0 and the result was identical.

I'm in well over my head here.  I don't know what I should be looking for.
Here's the disassembled _umtx_op_err function, along with the 
backtraces of the other two threads.  They didn't look that interesting 
to me the first time.

-- john



Starting program: /usr/home/marino/test_gnat/test_c9a009c/c9a009c
[New LWP 100086]
[New Thread 800a041c0 (LWP 100086)]
[New Thread 800a0ae40 (LWP 100051)]
[New Thread 800a64c80 (LWP 100073)]

Program received signal SIGSEGV, Segmentation fault.
[Switching to Thread 800a64c80 (LWP 100073)]
0x00007fffffbfeb19 in ?? ()
Cannot set lwp 100073 registers: Invalid argument

An error occurred while in a function called from GDB.
Evaluation of the expression containing the function
(_umtx_op_err) will be abandoned.
When the function is done executing, GDB will silently stop.
Dump of assembler code for function _umtx_op_err:
    0x00000008006923c0 <+0>:	mov    $0x1c6,%rax
    0x00000008006923c7 <+7>:	mov    %rcx,%r10
    0x00000008006923ca <+10>:	syscall
    0x00000008006923cc <+12>:	retq
    0x00000008006923cd <+13>:	nop
    0x00000008006923ce <+14>:	nop
    0x00000008006923cf <+15>:	nop
    0x00000008006923d0 <+16>:	mov    0x102d09(%rip),%rax        # 0x8007950e0
    0x00000008006923d7 <+23>:	push   %rbx
    0x00000008006923d8 <+24>:	cmp    $0xffffffffffffffff,%rax
    0x00000008006923dc <+28>:	je     0x8006923f5 <_umtx_op_err+53>
    0x00000008006923de <+30>:	lea    0x102cfb(%rip),%rbx        # 0x8007950e0
    0x00000008006923e5 <+37>:	callq  *%rax
    0x00000008006923e7 <+39>:	mov    -0x8(%rbx),%rax
    0x00000008006923eb <+43>:	sub    $0x8,%rbx
    0x00000008006923ef <+47>:	cmp    $0xffffffffffffffff,%rax
    0x00000008006923f3 <+51>:	jne    0x8006923e5 <_umtx_op_err+37>
    0x00000008006923f5 <+53>:	pop    %rbx
    0x00000008006923f6 <+54>:	retq
    0x00000008006923f7 ...
From: Kostik Belousov
Date: Friday, December 31, 2010 - 5:52 am

The instruction pointer is right after the syscall instruction (at the
retq), so I do think that the thread was executing the syscall.

Backtrace for LWP 100073 indeed looks interesting, because the address
0x00007fffffbfeb19 belongs to the area used for stack(s), including
the thread stacks.

FreeBSD amd64 currently provides non-executable stacks for non-main
threads, but an executable stack for the main thread. i386 has no support
for the NX bit on non-PAE kernels.

As a useful experiment, go to src/lib/libthr/thread/thr_stack.c, find
the following fragment

		if ((stackaddr = mmap(stackaddr, stacksize+guardsize,
		     PROT_READ | PROT_WRITE, MAP_STACK,
		     -1, 0)) != MAP_FAILED &&

and change the flags from PROT_READ | PROT_WRITE to
PROT_READ | PROT_WRITE | PROT_EXEC. Then recompile and reinstall libthr,
From: John Marino
Date: Friday, December 31, 2010 - 6:08 am

Hi Kostik,
The result is that the test passes.  A small gdb log follows to prove it.
So what does this mean?

-- John


Starting program: /usr/home/marino/test_gnat/test_c9a009c/c9a009c
[New LWP 100064]
[New Thread 800a041c0 (LWP 100064)]
[New Thread 800a0ae40 (LWP 100051)]
[New Thread 800a64c80 (LWP 100073)]
[New Thread 800aa1ac0 (LWP 100090)]
[Thread 800aa1ac0 (LWP 100090) exited]
Invalid selected thread.
[Switching to thread 2 (Thread 800a041c0 (LWP 100064))]#0 0x00000008006923cc in _umtx_op_err () at /usr/src/lib/libthr/arch/amd64/amd64/_umtx_op_err.S:37
37	RSYSCALL_ERR(_umtx_op)
Continuing.
[Thread 800a64c80 (LWP 100073) exited]
Invalid selected thread.
[Switching to thread 2 (Thread 800a041c0 (LWP 100064))]#0 0x00000008006923cc in _umtx_op_err () at /usr/src/lib/libthr/arch/amd64/amd64/_umtx_op_err.S:37
37	RSYSCALL_ERR(_umtx_op)
Continuing.
[Thread 800a0ae40 (LWP 100051) exited]
Invalid selected thread.
[Switching to thread 2 (Thread 800a041c0 (LWP 100064))]#0 0x00000008006923cc in _umtx_op_err () at /usr/src/lib/libthr/arch/amd64/amd64/_umtx_op_err.S:37
37	RSYSCALL_ERR(_umtx_op)
Continuing.

Program exited normally.




_______________________________________________
freebsd-threads@freebsd.org mailing list
http://lists.freebsd.org/mailman/listinfo/freebsd-threads
To unsubscribe, send any mail to "freebsd-threads-unsubscribe@freebsd.org"
From: Kostik Belousov
Date: Friday, December 31, 2010 - 6:27 am

This means that the Ada compiler or tasking library uses on-stack
trampolines for something. Since FreeBSD threads on amd64 get
non-executable stacks, the tasking fails.

The proper solution is to provide support for conditional
non-executable stacks, as described in
http://lists.freebsd.org/pipermail/freebsd-arch/2010-November/010826.html
The latest WIP patch is
http://people.freebsd.org/~kib/misc/nxstacks.3.patch
From: John Marino
Date: Friday, December 31, 2010 - 6:37 am

Yeah, that's kind of what I was getting at.  Would this patch get into 
FreeBSD 8.2, and would that mean that GNAT would start working properly 
starting with FreeBSD 8.2 if that happened?

I guess that also means the other BSD's have been allowing executable 
stacks all along.

Thanks!


From: Kostik Belousov
Date: Friday, December 31, 2010 - 6:44 am

Definitely not in 8.2.
Might be in 8.3, if successfully landed in HEAD.

Besides the patch for the base system, the compiler must be configured
to properly mark the objects that need executable thunks on the stack.
Alternatively, there is a compiler configuration that prevents the use
of thunks on the stack.
From: John Marino
Date: Friday, December 31, 2010 - 12:37 pm

Hi Kostik,

Thanks for pointing me in the right direction.  After some research, I 
discovered that only DragonFly BSD allows execution on the stack by 
default.  NetBSD and OpenBSD (and Solaris and Darwin) all were specially 
configured within gcc to execute mprotect first to enable this 
functionality.  FreeBSD never had this gcc configuration code and 
frankly it looks like it should have already been there.

I created my own __enable_execute_stack macro function based on these 
previous works and now GNAT has passed all tests!  Since i386 always 
worked, I only applied the macro to the AMD64 configuration header.

You've been a great help!  Once I understood what the issue was, 
everything fell into place.

-- John


From: Kostik Belousov
Date: Friday, December 31, 2010 - 12:46 pm

You need the same application of mprotect() for i386 too, since
From: John Marino
Date: Friday, December 31, 2010 - 1:19 pm

Ah, interesting.  I didn't realize the ramifications of AMD64-only 
application of mprotect().  It's easy enough to apply the same macro to 
both architectures.

As far as pushing it upstream, I've got literally a few dozen patches, 
and the majority of them should be contributed back.  I haven't gone 
through the absurdly difficult and time-consuming process of assigning 
copyright over to the FSF, partly because I reside in France with a 
Dutch employer and nobody I work for would sign the legal documents FSF 
requests (if I even wanted to share with my employers what I do in my 
own time.)

I may go through the process some day if we can leave my employers out 
of it, but it's not a priority at this moment.  I'm not philosophically 
opposed to giving back, although I am dismayed at the number of offered 
patches that are never reviewed by the gcc developers and die on the 
vine.  If I could find a way to "fast-track" these patches in where I 
wouldn't be wasting my time, I'd do it.  It's a pain to maintain a 
parallel fork and I'd love to reduce the number of differences between 
the code bases.

Obviously, if you have any ideas for getting my FreeBSD work into gcc 
efficiently, I'm all ears.

Regards,
John


From: Daniel Eischen
Date: Friday, December 31, 2010 - 2:11 pm

I've got FSF paperwork on file, specifically to submit my original
FreeBSD and VxWorks GNAT ports to AdaCore (which they then upstreamed
to GCC).  It's been a few years since I submitted the paperwork,
however, and I'm not sure if they require resubmittal at periodic
intervals.  It may be possible for you to explain your changes to
me, without me looking at your original code or changes.

-- 
DE
From: John Marino
Date: Saturday, January 1, 2011 - 1:21 am

Hi Daniel,
First, thanks for the offer.  I might come back to you on that.
Secondly, I should have mentioned that the majority of my patches are 
GNAT specific, and very few are like this one which might apply to all 
FreeBSD/GCC users.

I have already created 7 new FreeBSD ports that include this "GNAT AUX", 
the GNAT Programming Studio, and the Ada Web Server.  I will work with 
the FreeBSD ports people shortly to get these ports into the tree, and 
also to prune some of the previous GNAT ports, such as gnat-gcc44.  If I 
recall, your port was GNAT GPL, which is a different beast.

Anyway, the FreeBSD/Ada users will have the benefit of my work shortly. 
So maybe we should only focus on non-Ada patches.  I don't have very 
many of those, and the majority are for other BSD systems, not FreeBSD.

John

From: Daniel Eischen
Date: Saturday, January 1, 2011 - 10:41 am

Well, it doesn't matter if they are GNAT or GCC specific,
I believe you need FSF paperwork on file for either of them
to be upstreamed.  There really isn't much of a difference
between the GPL version of GNAT (from AdaCore) or the GCC
GNAT - the GPL version is released from some stable GCC
version.  AdaCore eventually upstreams all of their changes
into GCC.

If you notice, the FreeBSD port of GNAT-GPL no longer
has any run time files as local patches because they
have been upstreamed.  There are only small patches to
change the binary names (e.g., gcc -> gnatgcc) or
other minor configuration changes.

Anyway, it would be really nice to upstream your changes,
to make the ports simpler, and so that GNAT-GPL will
also eventually inherit AMD64 support.

But regardless, thank you for your work!

-- 
DE
From: Kostik Belousov
Date: Friday, December 31, 2010 - 5:22 am

First, you did not specify which version of the base system you use.

Second, I suspect that the backtrace you have shown is not from the
thread that generated SIGSEGV. Switch to other threads and see their
backtraces, I am almost sure that there will be something more interesting.

Just to be sure, in gdb, disassemble _umtx_op_err() and see which
instruction is executed when SIGSEGV is generated. I think that the thread