Re: sbrk(2) broken

To: <freebsd-current@...>
Cc: Poul-Henning Kamp <phk@...>
Date: Thursday, January 3, 2008 - 2:38 am

Poul-Henning noticed today that xchat fails to start if malloc uses sbrk
internally. This failure happens during the first call to malloc, with
the following message:

Fatal error 'Can't allocate initial thread' at line 335 in file
/usr/src/lib/libthr/thread/thr_init.c (errno = 12)

This can be worked around with MALLOC_OPTIONS=dM.

The problem does not appear to be specific to jemalloc; I reverted
src/lib/libc/stdlib/malloc.c to revision 1.92 (last phkmalloc revision),
which also uses sbrk, and the failure mode is the same.

The failure occurs on both i386 and amd64. It appears that sbrk(0)
returns an address that is in the address range normally used by mmap.
So, the first call to sbrk with a non-zero increment is fantastically
wrong. On i386 (ktrace output):

1013 xchat CALL break(0x28200000)
1013 xchat RET break -1 errno 12 Cannot allocate memory

On amd64 (truss output):

break(0x800900000) ERR#12 'Cannot allocate memory'

sbrk is not a true system call, so it seems like the problem should have
something to do with the _end data symbol. I looked at it in gdb though
and never saw an unreasonable value, despite bogus sbrk(0) results. I
do not know offhand how to get the addresses of .minbrk and .curbrk
(register inspection within gdb while stepping through sbrk?), which are
what sbrk actually uses (see src/lib/libc/amd64/sys/sbrk.S). Perhaps
the loader isn't initializing them correctly...
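
As a rough aid for checking this outside gdb, the sketch below compares the
current break with the linker-provided `end` symbol. It assumes a platform
that exports `end` (FreeBSD and Linux do); the function name is made up for
illustration, and it does not inspect .minbrk or .curbrk themselves.

```c
#define _DEFAULT_SOURCE
#include <assert.h>
#include <stdint.h>
#include <unistd.h>

/* Provided by the linker: first address past the program's data and bss. */
extern char end;

/*
 * Distance in bytes from `end` to the current break (hypothetical helper).
 * A non-negative value is the healthy case; a break that lands in the
 * region normally used by mmap would show up as a huge positive offset.
 */
intptr_t
break_offset_from_end(void)
{
    void *brk_now = sbrk(0);    /* sbrk(0) reports the break, moves nothing */

    assert(brk_now != (void *)-1);
    return (intptr_t)((uintptr_t)brk_now - (uintptr_t)&end);
}
```

Printing the result next to the raw sbrk(0) value on an affected system
should show whether the break has drifted far from the end of the data
segment.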

I am quite pressed for time at the moment, and cannot look into this in
any more detail for at least a couple of weeks. If anyone knows what
the problem is, please let me know.

Thanks,
Jason
_______________________________________________
freebsd-current@freebsd.org mailing list
http://lists.freebsd.org/mailman/listinfo/freebsd-current
To unsubscribe, send any mail to "freebsd-current-unsubscribe@freebsd.org"

To: Jason Evans <jasone@...>
Cc: <freebsd-current@...>, Poul-Henning Kamp <phk@...>
Date: Friday, January 4, 2008 - 8:21 am

malloc() itself knows the amount of memory _really_ in use by a program and
could check that it doesn't go beyond the limits, but for this it needs a
run-time check via a getrlimit() call on each malloc() call (a program can
call setrlimit() by itself). Tracking direct mmap()s and sbrk()s outside of
malloc() is also needed.
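
A minimal sketch of that idea, assuming a wrapper around malloc() that keeps
its own byte count and re-reads RLIMIT_DATA on every call. The names
would_exceed() and checked_malloc() are hypothetical, and the sketch ignores
free() and any mmap()/sbrk() calls made behind the allocator's back.

```c
#define _DEFAULT_SOURCE
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>
#include <sys/resource.h>

/* Bytes handed out so far by checked_malloc(); illustrative accounting. */
static size_t bytes_in_use;

/* Pure policy check: would `request` more bytes push us past `limit`? */
int
would_exceed(size_t in_use, size_t request, rlim_t limit)
{
    if (limit == RLIM_INFINITY)
        return 0;
    /* Arranged so that in_use + request is never computed (no overflow). */
    return in_use > limit || request > limit - in_use;
}

void *
checked_malloc(size_t n)
{
    struct rlimit rl;
    void *p;

    /* Re-read the limit each time, since the program may call setrlimit(). */
    if (getrlimit(RLIMIT_DATA, &rl) != 0)
        return NULL;
    if (would_exceed(bytes_in_use, n, rl.rlim_cur))
        return NULL;
    p = malloc(n);
    if (p != NULL)
        bytes_in_use += n;
    return p;
}
```

The per-call getrlimit() is the cost being discussed here: it is one extra
system call on every allocation.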

--
http://ache.pp.ru/

To: Andrey Chernov <ache@...>
Cc: <freebsd-current@...>, Jason Evans <jasone@...>
Date: Friday, January 4, 2008 - 8:57 am

No, the VM system has a much better idea about this.

You need to think about this the right way:

There is address space allocated to the process (via sbrk/mmap)

A subset of this, is address space allocated by the program (via malloc)

...and then there is memory actually in use, which is an entirely different
thing, of which we currently only have some kind of clue in the VM
system.
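
The gap between address space and memory in use can be observed from
userland with mincore(2). The sketch below is only an illustration: the
helper names are invented, and it assumes an anonymous private mapping whose
pages become resident when first written.

```c
#define _DEFAULT_SOURCE
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>
#include <sys/mman.h>
#include <unistd.h>

/* Map `npages` of anonymous memory: address space only, nothing touched. */
void *
reserve_pages(size_t npages)
{
    size_t len = npages * (size_t)sysconf(_SC_PAGESIZE);
    void *p = mmap(NULL, len, PROT_READ | PROT_WRITE,
        MAP_PRIVATE | MAP_ANON, -1, 0);

    return p == MAP_FAILED ? NULL : p;
}

/* Count the pages in the range that the kernel reports as resident. */
long
resident_pages(void *addr, size_t npages)
{
    size_t pagesz = (size_t)sysconf(_SC_PAGESIZE);
    unsigned char *vec = calloc(npages, 1);
    long count = 0;

    if (vec == NULL)
        return -1;
    /* The vector's element type differs between systems, hence the cast. */
    if (mincore(addr, npages * pagesz, (void *)vec) != 0) {
        free(vec);
        return -1;
    }
    for (size_t i = 0; i < npages; i++)
        count += vec[i] & 1;    /* low bit: page is in core */
    free(vec);
    return count;
}
```

Calling resident_pages() before and after writing to a few pages of the
reserved range shows the count rising while the mapped size stays the same.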

--
Poul-Henning Kamp | UNIX since Zilog Zeus 3.20
phk@FreeBSD.ORG | TCP/IP since RFC 956
FreeBSD committer | BSD since 4.3-tahoe
Never attribute to malice what can adequately be explained by incompetence.

To: Poul-Henning Kamp <phk@...>
Cc: <freebsd-current@...>, Jason Evans <jasone@...>
Date: Friday, January 4, 2008 - 9:12 am

Then, we need sysctl to fetch that "memory actually in use" from the
kernel and compare that with getrlimit() which allows malloc() to return
0 when needed.

--
http://ache.pp.ru/

To: Andrey Chernov <ache@...>
Cc: <freebsd-current@...>
Date: Friday, January 4, 2008 - 9:25 am

That won't help much -- malloc could have allocated some address space that
hasn't (yet) been touched by the process. Just returning 0 when the
amount of memory "in use" hits a limit wouldn't stop the process from
then touching all the memory it has previously been allocated and
exceeding the limit.

--
David Taylor

To: <freebsd-current@...>
Date: Friday, January 4, 2008 - 10:22 am

In that case the process is subject to being killed by the system if it
exceeds its limits.
But... this is not a malloc() problem at all; malloc() is designed to detect
an overflow situation, not prevent it. The malloc() problem is not returning 0.

--
http://ache.pp.ru/

To: Andrey Chernov <ache@...>
Cc: <freebsd-current@...>, Jason Evans <jasone@...>
Date: Friday, January 4, 2008 - 9:28 am

[Empty message]

To: Jason Evans <jasone@...>
Cc: <freebsd-current@...>
Date: Thursday, January 3, 2008 - 4:39 pm

I cannot say definitively what happened, but please note that the _end
symbol is defined by the linker script, and it shall be present in all
executables and shared objects. The value you reported would naturally be
the _end value for some shared object.

I tried both the RELENG_7 and HEAD, and sbrk(0) correctly returns a
seemingly valid value like 0x8049644.

#include <sys/types.h>
#include <unistd.h>
#include <stdio.h>

int
main(int argc, char *argv[])
{
	void *p;

	/* sbrk(0) returns the current break without moving it. */
	p = sbrk(0);
	printf("%p\n", p);

	return (0);
}

To: Jason Evans <jasone@...>
Cc: <freebsd-current@...>, Poul-Henning Kamp <phk@...>
Date: Thursday, January 3, 2008 - 3:21 pm

The real question is why we would revert perfectly good code (jemalloc)
from using a modern interface to using one that has been obsolete for
twenty years, and marked as such in the man page for seven years.

If rwatson@ wants malloc() to respect resource limits, he can bloody
well fix mmap(). Until he does, the datasize limit is a joke anyway, as
anyone can circumvent it by either using mmap() instead of malloc() or
setting _malloc_options before calling malloc().

DES
--
Dag-Erling Smørgrav - des@des.no

To: Dag-Erling Smørgrav <des@...>
Cc: <freebsd-current@...>, Jason Evans <jasone@...>, Poul-Henning Kamp <phk@...>
Date: Thursday, January 3, 2008 - 8:26 pm

The issue here was that there were a number of reports that out-of-control
applications were toasting systems that weren't getting toasted under 6.x. I
experienced this on my web server, but the ports build cluster has been
running into it for months. The symptom is that a single application exhausts
swap, causing all sorts of things to break (tm), killing of other large
processes, etc. To be clear, in the new world order, instead of getting NULL
back from malloc(3), SIGKILL is delivered to large processes.

When I e-mailed Jason Evans and Alan Cox about it, I suggested that we
actually teach malloc(3) to enforce an allocation limit itself by querying a
limit once at process startup, and then using its own accounting to decide
when to start failing requests. As an alternative model that would require
some more infrastructural changes, I suggested a new mmap() flag that hinted
to the kernel that the page should count against a swap/anonymous memory
limit, but that we should avoid more serious changes at the last minute before
a release. Alan suggested the model Jason ended up implementing as a
lower risk way to restore the 6.x resource limits non-disruptively. As it
turned out, this proved much more complicated than expected.

The right answer is presumably to introduce a new LIMIT_SWAP, which limits the
allocation of anonymous memory by processes, and size it to something like 90%
of swap space by default. Since that won't be happening before 7.0, I believe
the consensus is to simply not MFC the changes for 7 and proceed with the
release. However, having a resource limit on swap use in order to prevent the
above scenario is actually quite important: SIGKILL of arbitrary processes is
not a good way to deal with one run-away process, and the virtual memory size
limit, while also useful, prevents you from limiting the allocation of swap
without also ...

To: Robert Watson <rwatson@...>
Cc: Dag-Erling Smørgrav <des@...>, <freebsd-current@...>, Jason Evans <jasone@...>
Date: Friday, January 4, 2008 - 6:41 am

Huh??? Again, huh???

To: Igor Mozolevsky <igor@...>
Cc: Dag-Erling Smørgrav <des@...>, <freebsd-current@...>, Jason Evans <jasone@...>
Date: Friday, January 4, 2008 - 7:19 am

FreeBSD allows memory overcommit, both overcommit of physical memory resulting
in paging, and overcommit of swap space. For the last few years, resource
limits on the data segment size, previously observed by malloc(), have
prevented processes from mallocing enough memory individually to exhaust swap
on 32-bit systems. This is arguably a bug, because you actually want a single
process to be able to allocate enough memory to fill its address space, but
because the data segment size is used to make address space layout decisions
from the inception of the process, is rather innate to using sbrk(). Jason's
new malloc uses mmap() of anonymous memory, which isn't affected by the data
segment limit, and hence, as a feature, isn't limited by the resource limit.
This turns out to be awkward if you have a run-away process, as where
previously it would simply get back an error when it tried to exceed its
resource limit, now it simply consumes all your swap, which then results in
overcommit.

My hope was that we could re-introduce a resource limit on malloc'd memory
without large changes, but that appears to have been more tricky than hoped.
The goal is not to prevent overcommit, which is invaluable in UNIX systems due
to the fork() model which pretty much pre-supposes it by design, rather, to
prevent exhaustion of swap by a single process if not specifically allowed by
the administrator (in the same way we limit all sorts of other things, like
open files, mbufs, socket buffer memory, etc). The right way to do it is to
provide a specifically configurable process limit on swap use, the same way we
did for data segment size, only not data segment size, but that was considered
likely too risky for 7.0.

Robert N M Watson
Computer Laboratory
University of Cambridge

To: Igor Mozolevsky <igor@...>
Cc: <freebsd-current@...>, Robert Watson <rwatson@...>, Jason Evans <jasone@...>
Date: Friday, January 4, 2008 - 6:55 am

For the same reason as it has for the last 20 years or so: memory
overcommit, which means that malloc() allocates address space, not
memory. Actual memory is allocated on-demand when the address space is
used (read from or written to). If there is no RAM left and none can be
freed by swapping out, the process gets killed. The process that gets
killed is not necessarily the memory hog, it is merely the process that
is unlucky enough to touch a new page at the wrong moment, i.e. when all
RAM and swap is exhausted *or* everything in RAM is wired down and
unswappable.

Of course, if you're afraid of memory overcommit and you know in advance
how much memory you need, you can simply allocate a sufficient amount of
address space at startup and touch it all. This way, you will either be
killed right away, or be guaranteed to have sufficient memory for the
rest of your (process) lifetime. Alternatively, do what Varnish does:
create a large file, mmap it, and allocate everything you need from that
area, so you have your own private swap space. Just make sure to
actually allocate the disk space you need (by filling the file with
zeroes, or at the minimum writing a zero to the file every sb.st_blksize
bytes, preferably sequentially to avoid excessive fragmentation) or you
may run into the same problem as with malloc() if the disk fills up
while your backing file is still sparse.
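
The backing-file approach can be sketched roughly as follows. This is an
illustration of the description above, not Varnish's actual code: the
function name is hypothetical, it assumes a fresh file (existing bytes at
block starts are overwritten with zeroes), and it assumes that one write per
st_blksize is enough to force allocation on the filesystem in use.

```c
#define _DEFAULT_SOURCE
#include <assert.h>
#include <fcntl.h>
#include <stddef.h>
#include <sys/mman.h>
#include <sys/stat.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Create or open `path`, write one zero byte per st_blksize from front to
 * back so the file is not left sparse, then map it shared.
 * Returns the mapping, or NULL on any failure.
 */
void *
map_backing_file(const char *path, size_t size)
{
    struct stat sb;
    const char zero = 0;
    void *p = MAP_FAILED;
    int fd;

    if (size == 0)
        return NULL;
    fd = open(path, O_RDWR | O_CREAT, 0600);
    if (fd < 0)
        return NULL;
    if (fstat(fd, &sb) != 0 || sb.st_blksize <= 0)
        goto out;
    /* One real write per block, in ascending order. */
    for (off_t off = 0; (size_t)off < size; off += sb.st_blksize)
        if (pwrite(fd, &zero, 1, off) != 1)
            goto out;
    /* Extend the file to at least `size` bytes. */
    if (pwrite(fd, &zero, 1, (off_t)size - 1) != 1)
        goto out;
    p = mmap(NULL, size, PROT_READ | PROT_WRITE, MAP_SHARED, fd, 0);
out:
    close(fd);    /* the mapping, if any, outlives the descriptor */
    return p == MAP_FAILED ? NULL : p;
}
```

An allocator would then carve its objects out of the returned region instead
of calling mmap() with MAP_ANON.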

The ability to specify a backing file to use instead of anonymous
mappings would be a cool addition to jemalloc.

DES
--
Dag-Erling Smørgrav - des@des.no

To: Dag-Erling Smørgrav <des@...>
Cc: <freebsd-current@...>, Robert Watson <rwatson@...>, Jason Evans <jasone@...>
Date: Friday, January 4, 2008 - 7:18 am

Broadcasting SIGDANGER would be a much better option; followed by
SIGTERM to the memory hogger (to allow for graceful termination) and
only then SIGKILL. I can imagine a few (legitimate) scenarios when a

That would be really cool and even better if it allocated it in a
contiguous chunk.

Igor :-)

To: Igor Mozolevsky <igor@...>
Cc: Dag-Erling Smørgrav <des@...>, <freebsd-current@...>, Robert Watson <rwatson@...>, Jason Evans <jasone@...>
Date: Friday, January 4, 2008 - 11:26 pm

That would create a nicely sized 'hole' in the starting blocks. What
Dag-Erling describes is the correct(TM) way of making sure that all
blocks have been allocated from the backing store of the file.


To: Igor Mozolevsky <igor@...>
Cc: <freebsd-current@...>, Robert Watson <rwatson@...>, Jason Evans <jasone@...>
Date: Friday, January 4, 2008 - 8:45 am

We don't currently have SIGDANGER, but the signal code was rewritten
years ago to allow more than 32 signals precisely for the purpose of
implementing an AIX-like SIGDANGER. This wasn't done, however, and
eventually SIGTHR was the first new signal to take advantage of the

No. First of all, you're thinking of lseek(), not fseek(). Second, an
lseek() beyond the end of a file will not actually extend the file.
Third, ftruncate() (which *will* extend a file if it is shorter than the
requested length) or lseek() followed by write() will not allocate
physical disk space except for the data actually written; it will create
a sparse file, which when later written to will become fragmented,
resulting in horrible performance.
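
These points can be checked with stat(2). The sketch below uses an invented
struct and function name, and assumes a filesystem that supports sparse
files and reports st_blocks in 512-byte units.

```c
#define _DEFAULT_SOURCE
#include <assert.h>
#include <fcntl.h>
#include <sys/stat.h>
#include <sys/types.h>
#include <unistd.h>

struct sparse_report {
    off_t size_after_lseek;     /* st_size after seeking past EOF */
    off_t size_after_ftruncate; /* st_size after ftruncate() */
    off_t bytes_allocated;      /* st_blocks * 512 after ftruncate() */
};

/* Create `path`, seek and then truncate to `len`, record what fstat sees. */
int
probe_sparse(const char *path, off_t len, struct sparse_report *r)
{
    struct stat sb;
    int fd = open(path, O_RDWR | O_CREAT | O_TRUNC, 0600);
    int ok = 0;

    if (fd < 0)
        return -1;
    /* Seeking alone moves the file offset; the file itself stays empty. */
    if (lseek(fd, len, SEEK_SET) == len && fstat(fd, &sb) == 0) {
        r->size_after_lseek = sb.st_size;
        /* ftruncate() extends the length but allocates no data blocks. */
        if (ftruncate(fd, len) == 0 && fstat(fd, &sb) == 0) {
            r->size_after_ftruncate = sb.st_size;
            r->bytes_allocated = (off_t)sb.st_blocks * 512;
            ok = 1;
        }
    }
    close(fd);
    unlink(path);
    return ok ? 0 : -1;
}
```

On a typical filesystem the report shows a size of zero after the lseek(),
the full length after ftruncate(), and far fewer allocated bytes than the
length.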

DES
--
Dag-Erling Smørgrav - des@des.no

To: Dag-Erling Smørgrav <des@...>
Cc: <freebsd-current@...>, Robert Watson <rwatson@...>, Jason Evans <jasone@...>, Igor Mozolevsky <igor@...>
Date: Friday, January 4, 2008 - 8:53 am

In message <86myrlahee.fsf@ds4.des.no>, Dag-Erling Smørgrav wr

SIGDANGER is not what we need.

What we need is an intelligent mechanism to tell applications what
the overall situation is, so that jemalloc and aware applications can
tune their usage pattern to the availability of physical and virtual
memory.

Instead of the binary "SIGDANGER" indication we need a more gradual
state, at the very least three states: "plenty", "getting a bit
tight" and "crunchtime".

Having a signal to indicate changes of the state may make sense,
but in a crunch, you don't want to wake all processes and page them
in, just to tell them that you're short on memory, it would have
to be a signal that doesn't schedule the recipient process until
something else does.

--
Poul-Henning Kamp | UNIX since Zilog Zeus 3.20
phk@FreeBSD.ORG | TCP/IP since RFC 956
FreeBSD committer | BSD since 4.3-tahoe
Never attribute to malice what can adequately be explained by incompetence.

To: Poul-Henning Kamp <phk@...>
Cc: Dag-Erling Smørgrav <des@...>, <freebsd-current@...>, Robert Watson <rwatson@...>, Jason Evans <jasone@...>
Date: Friday, January 4, 2008 - 9:03 am

This makes memory management in the userland hideously and
unnecessarily complicated. It's simpler to have SIGDANGER (meaning,
free all you can) -> SIGTERM (terminate gracefully) -> SIGKILL (too
late, I'm killing you anyway); and maybe a MIB in sysctl like
...vm.overcommit_action ='soft' being SIGDANGER->SIGTERM->SIGKILL and
= 'hard' being SIGKILL, so the sysadmin at least has a choice
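
The SIGTERM -> SIGKILL tail of that escalation can be sketched in userland
as below. SIGDANGER does not exist on FreeBSD, so that stage is left out;
the function names and the polling interval are assumptions made for the
example, and the caller is assumed to be the parent of `pid`.

```c
#define _DEFAULT_SOURCE
#include <assert.h>
#include <signal.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <time.h>
#include <unistd.h>

/*
 * Poll for roughly `ms` milliseconds. Returns the signal that killed the
 * child, 0 if it exited on its own, or -1 if it is still alive (or the
 * wait failed).
 */
static int
wait_for_death(pid_t pid, int ms)
{
    const struct timespec tick = { 0, 10 * 1000 * 1000 };   /* 10 ms */
    int status;

    for (int waited = 0; waited <= ms; waited += 10) {
        pid_t r = waitpid(pid, &status, WNOHANG);

        if (r == pid)
            return WIFSIGNALED(status) ? WTERMSIG(status) : 0;
        if (r < 0)
            return -1;
        nanosleep(&tick, NULL);
    }
    return -1;
}

/* Ask politely with SIGTERM, fall back to SIGKILL after `grace_ms`. */
int
terminate_gracefully(pid_t pid, int grace_ms)
{
    int sig;

    if (kill(pid, SIGTERM) != 0)
        return -1;
    sig = wait_for_death(pid, grace_ms);
    if (sig >= 0)
        return sig;
    if (kill(pid, SIGKILL) != 0)
        return -1;
    return wait_for_death(pid, 1000);
}
```

A child with the default SIGTERM disposition dies from SIGTERM; one that
ignores it survives the grace period and is then killed outright.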

Igor

To: Igor Mozolevsky <igor@...>
Cc: Poul-Henning Kamp <phk@...>, <freebsd-current@...>, Robert Watson <rwatson@...>, Jason Evans <jasone@...>
Date: Friday, January 4, 2008 - 9:12 am

You don't seem to understand what Poul-Henning was trying to point out,
which is that broadcasting SIGDANGER can make a bad situation much, much
worse by waking up and paging in every single process in the system,
including processes that are blocked and wouldn't otherwise run for
several minutes, hours or even days (getty, inetd, sshd, mountd, even
nfsd / nfsiod in some cases can sleep for days at a time waiting for
I/O)

DES
--
Dag-Erling Smørgrav - des@des.no

To: Dag-Erling Smørgrav <des@...>
Cc: Poul-Henning Kamp <phk@...>, <freebsd-current@...>, Robert Watson <rwatson@...>, Jason Evans <jasone@...>, Igor Mozolevsky <igor@...>
Date: Friday, January 4, 2008 - 9:48 am

By making the default action for SIGDANGER SIG_IGN, this problem
would be mostly solved. Only processes that actually care about SIGDANGER
and install a handler for it would need to perform some non-trivial and
resource-hungry operation.
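
A sketch of what opting in might look like for one process. Since SIGDANGER
is not defined, SIGUSR1 stands in for it here (note that SIGUSR1's real
default action is to terminate, not to ignore); the cache and all function
names are hypothetical.

```c
#define _DEFAULT_SOURCE
#include <assert.h>
#include <signal.h>
#include <stddef.h>
#include <stdlib.h>

/* Stand-in only: there is no SIGDANGER to use. */
#define SIG_LOWMEM SIGUSR1

static volatile sig_atomic_t lowmem_pending;
static void *cache;         /* hypothetical discardable cache */
static size_t cache_size;

static void
on_lowmem(int sig)
{
    (void)sig;
    lowmem_pending = 1;     /* just set a flag; free() is not signal-safe */
}

/* Opt in. Processes that never call this never run a handler. */
int
lowmem_subscribe(void)
{
    struct sigaction sa = { 0 };

    sa.sa_handler = on_lowmem;
    sigemptyset(&sa.sa_mask);
    return sigaction(SIG_LOWMEM, &sa, NULL);
}

/* Replace the cache contents with a fresh buffer of `n` bytes. */
int
cache_fill(size_t n)
{
    free(cache);
    cache = malloc(n);
    cache_size = cache != NULL ? n : 0;
    return cache != NULL ? 0 : -1;
}

/* Called from the main loop; returns the number of bytes released. */
size_t
lowmem_poll(void)
{
    size_t freed = 0;

    if (lowmem_pending) {
        lowmem_pending = 0;
        freed = cache_size;
        free(cache);
        cache = NULL;
        cache_size = 0;
    }
    return freed;
}
```

The actual release happens in lowmem_poll(), outside the handler, so a
sleeping process does no work until it is next scheduled for its own reasons.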

To: Kostik Belousov <kostikbel@...>
Cc: Dag-Erling Smørgrav <des@...>, <freebsd-current@...>, Robert Watson <rwatson@...>, Jason Evans <jasone@...>, Igor Mozolevsky <igor@...>
Date: Monday, January 7, 2008 - 5:08 am

In message <20080104134829.GA57756@deviant.kiev.zoral.com.ua>, Kostik Belousov

This is a non-starter, if SIGDANGER is to have any effect, all
processes that use malloc(3) should react to it.

--
Poul-Henning Kamp | UNIX since Zilog Zeus 3.20
phk@FreeBSD.ORG | TCP/IP since RFC 956
FreeBSD committer | BSD since 4.3-tahoe
Never attribute to malice what can adequately be explained by incompetence.

To: Poul-Henning Kamp <phk@...>
Cc: Kostik Belousov <kostikbel@...>, <freebsd-current@...>
Date: Monday, January 7, 2008 - 5:58 am

This depends on what SIGDANGER is supposed to indicate. IMO, a single
signal is inadequate - you need a "free memory is less than desirable,
please reduce memory use if possible" and one (or maybe several levels
of) "memory is really short, if you're not important, please die".

The former could reasonably default to SIG_IGN - processes that are
in a position to release memory on demand could provide a handler to
do so. (This could potentially include malloc returning space on
its freelist to the kernel).

The latter should default to "terminate process" and a process that
considers itself "important" enough can trap it.

-- 
Peter Jeremy
Please excuse any delays as the result of my ISP's inability to implement
an MTA that is either RFC2821-compliant or matches their claimed behaviour.

To: Peter Jeremy <peterjeremy@...>
Cc: Kostik Belousov <kostikbel@...>, <freebsd-current@...>
Date: Monday, January 7, 2008 - 6:05 am

That's what I have been advocating for the last 10 years...

--
Poul-Henning Kamp | UNIX since Zilog Zeus 3.20
phk@FreeBSD.ORG | TCP/IP since RFC 956
FreeBSD committer | BSD since 4.3-tahoe
Never attribute to malice what can adequately be explained by incompetence.

To: Poul-Henning Kamp <phk@...>
Cc: Kostik Belousov <kostikbel@...>, Peter Jeremy <peterjeremy@...>, <freebsd-current@...>
Date: Monday, January 7, 2008 - 9:15 am

That makes the userland side of things unnecessarily overcomplicated. If a
process handles SIGDANGER then let it do so and assume it's important
enough to be left alone, if a process doesn't handle SIGDANGER then
send SIGTERM to them then SIGKILL; but in any case SIGTERM *should*
precede SIGKILL - the processes ought to be allowed to terminate
gracefully.

Igor :-)

To: Igor Mozolevsky <igor@...>
Cc: Kostik Belousov <kostikbel@...>, Peter Jeremy <peterjeremy@...>, <freebsd-current@...>
Date: Monday, January 7, 2008 - 9:18 am

In message <a2b6592c0801070515g37735475kc0922af8f93723ca@mail.gmail.com>, "Igor

Yes, but you will not see this complication, it will be hidden
in the implementation of malloc(3).

Every problem has a simple, easy to understand solution that does
not work. SIGDANGER is one of these. It didn't work well on
AIX and it won't do so on FreeBSD either.

The problem simply requires more than one bit of feedback information
to get a sensible regulation.

--
Poul-Henning Kamp | UNIX since Zilog Zeus 3.20
phk@FreeBSD.ORG | TCP/IP since RFC 956
FreeBSD committer | BSD since 4.3-tahoe
Never attribute to malice what can adequately be explained by incompetence.

To: Poul-Henning Kamp <phk@...>
Cc: Kostik Belousov <kostikbel@...>, Peter Jeremy <peterjeremy@...>, <freebsd-current@...>, Igor Mozolevsky <igor@...>
Date: Monday, January 7, 2008 - 7:19 pm

On Mon, 07 Jan 2008 13:18:47 +0000

How could you hide it inside malloc? Would malloc start
returning 0 after receiving the "less mem than desirable"
signal? Would it ever go back to returning non-zero?

I thought that the idea of things like SIGDANGER was that
applications would be written to have a mode where they could
shut down some aspect of their operation, and free resources. I
don't see how you can do that, autonomously, from within malloc?

Maybe introduce a special flavour of pointer value, returned by a
special version of malloc for "cache" objects, that the system is
allowed to automatically reclaim? Then programs would need to be
able to handle SIGSEGV when accessing those...

Cheers,

--
Andrew

To: Andrew Reilly <andrew-freebsd@...>
Cc: Kostik Belousov <kostikbel@...>, Peter Jeremy <peterjeremy@...>, Poul-Henning Kamp <phk@...>, <freebsd-current@...>
Date: Monday, January 7, 2008 - 8:06 pm

I'm with Andrew on this one. The only (sensible) way I could see it
being hidden behind malloc() is if malloc() blocks until sufficient
memory becomes available.

I thought the real idea behind SIGDANGER was to tell the kernel "I
kind of know what I'm doing, so if you gonna kill something don't kill
me" and that was achieved by AIX not SIGKILLing processes that had
sigaction(SIGDANGER) != SIG_IGN.

Igor :-)

To: Igor Mozolevsky <igor@...>
Cc: Kostik Belousov <kostikbel@...>, Andrew Reilly <andrew-freebsd@...>, <freebsd-current@...>, Peter Jeremy <peterjeremy@...>
Date: Monday, January 7, 2008 - 8:17 pm

In message <a2b6592c0801071606g4c0dcb9ap117e345fda5e7e5f@mail.gmail.com>, "Igor

You should read some recent literature on malloc(3); my own and
Jason's papers are good places to start.

For performance reasons, malloc(3) will hold on to a number of pages
that theoretically could be given back to the kernel, simply because
it expects to need them shortly.

Such parameters and many others of the malloc implementation can
be tweaked to "waste" more or less memory, in response to a sensibly
granular indication from the kernel about how bad things are.

Also, many subsystems in the kernel could adjust their memory use
in response to a "memory pressure" indication, if memory is tight,
we could cache vnodes and inodes less aggressively, if things are
going truly bad, we can even ditch all non-active entries from
these caches.

If one implements this with three states:

Green - "all clear"

Yellow - "tight" - free one before you allocate one if you can.

Red - "all out" - free all that you sensibly can.

And implemented strategies like I propose above (and have proposed
for the last 10 years), then it is very unlikely that the system
would ever get into the red state, because the yellow state will
mitigate and reduce the memory pressure.
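
How an allocator might react to the three states can be sketched as a small
policy function. No such kernel indication exists yet, so the enum, the
function name and the exact policy are assumptions made for illustration.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical three-level indication supplied by the kernel. */
enum mem_pressure { MEM_GREEN, MEM_YELLOW, MEM_RED };

/*
 * How many cached free pages should the allocator hand back to the kernel
 * before satisfying a request for `wanted` new pages?
 */
size_t
pages_to_release(enum mem_pressure level, size_t cached, size_t wanted)
{
    switch (level) {
    case MEM_GREEN:
        return 0;                                   /* keep the cache */
    case MEM_YELLOW:
        return wanted < cached ? wanted : cached;   /* free one per alloc */
    case MEM_RED:
        return cached;                              /* free all we can */
    }
    return 0;
}
```

In the yellow state the process's footprint stays roughly flat while it
keeps allocating, which is the damping effect described above.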

Nothing prevents an intelligent process from listening in and
doing sensible things, firefox could ditch the memory cache of
pages for instance.

But we can't get anywhere until some VM wizard produces the
three "lamps" for us to look at in the first place, that's where
we have been stuck for the last 10 years.

--
Poul-Henning Kamp | UNIX since Zilog Zeus 3.20
phk@FreeBSD.ORG | TCP/IP since RFC 956
FreeBSD committer | BSD since 4.3-tahoe
Never attribute to malice what can adequately be explained by incompetence.

To: <freebsd-current@...>
Cc: Kostik Belousov <kostikbel@...>, Andrew Reilly <andrew-freebsd@...>, Poul-Henning Kamp <phk@...>, Igor Mozolevsky <igor@...>, Peter Jeremy <peterjeremy@...>
Date: Monday, January 7, 2008 - 9:37 pm

Although the primary concern is malloc(), I would like to point out that
various programs implementing copying garbage collection could more
efficiently give memory back to the system than malloc(), and could therefore
benefit more than malloc() from some kind of feedback from the kernel.

There was concern over the complexity involved with intelligently doing
something about the memory pressure hints in userspace, but this does not
apply here since the allocator/garbage collection would be the equivalent of
malloc() and complexity there would not affect application code.

The problem with malloc() being that, unless I am missing something, malloc
will never be able to give back memory to the kernel except insofar as the
memory mapped is contiguously unused between some location and the break (in
the case of sbrk()) or over the entire range (mmap()). malloc() cannot force
this to be the case, since pointers must remain valid. The possibility of
reclamation is then often going to be limited to completely unused space
being held by malloc() for future use, rather than also applying to areas
already used for allocation.
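
The sbrk() case can be illustrated with a toy model of the heap as an array
of equally sized blocks. This is a simplification for the sake of the
argument, not how any real malloc lays out memory; both function names are
made up.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/*
 * used[i] says whether heap block i (lowest address first) is live.
 * Only the run of free blocks that touches the top of the heap can be
 * returned by lowering the break.
 */
size_t
releasable_top_blocks(const bool *used, size_t nblocks)
{
    size_t n = 0;

    while (n < nblocks && !used[nblocks - 1 - n])
        n++;
    return n;
}

/* Total free blocks, releasable or not, for comparison. */
size_t
free_blocks(const bool *used, size_t nblocks)
{
    size_t n = 0;

    for (size_t i = 0; i < nblocks; i++)
        n += used[i] ? 0 : 1;
    return n;
}
```

A single live block at the top of the heap makes releasable_top_blocks()
return zero no matter how much free space lies below it.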

Programs implementing copying GC, or able for some other reason to move
allocated memory around, could compact the heap and give back left-over
memory. In some cases this would only entail a temporary improvement due to
defragmentation, but in others (such as a long-running program spiking in
memory use, only then to drop a lot of that memory) it could have a pretty
massive effect on memory use.

Where a malloc() using program might be unable to sbrk() or munmap() because
there happens to be some left-over non-free piece of memory at the top of the
mapped range, a GC could use indications from the system to ensure this is
not the case (depending on details of the implementation; for example,
compaction of tenured generations could be forced early, etc).

(This i...

To: Peter Schuller <peter.schuller@...>
Cc: Andrew Reilly <andrew-freebsd@...>, Peter Jeremy <peterjeremy@...>, Poul-Henning Kamp <phk@...>, <freebsd-current@...>, Igor Mozolevsky <igor@...>, Kostik Belousov <kostikbel@...>
Date: Tuesday, January 8, 2008 - 2:36 pm

Actually, malloc(3) can use madvise(2) to notify the kernel that
arbitrary pages in the arena are unused and can be discarded. The
current implementation will do so if the H option is specified.
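
A minimal sketch of that use of madvise(2), assuming an anonymous private
mapping as the arena. The function name is hypothetical, and the choice of
MADV_FREE with a MADV_DONTNEED fallback is an assumption made for
portability, not a statement about what the H option does internally.

```c
#define _DEFAULT_SOURCE
#include <assert.h>
#include <stddef.h>
#include <sys/mman.h>
#include <unistd.h>

/*
 * Tell the kernel that `npages` pages, starting at page index `first` of
 * the mapping at `base`, hold nothing worth keeping. The address range
 * stays mapped, so pointers into it remain usable afterwards.
 */
int
discard_pages(void *base, size_t first, size_t npages)
{
    size_t pagesz = (size_t)sysconf(_SC_PAGESIZE);
    char *start = (char *)base + first * pagesz;

#ifdef MADV_FREE
    if (madvise(start, npages * pagesz, MADV_FREE) == 0)
        return 0;
#endif
    /* Fallback for systems where MADV_FREE is missing or refused. */
    return madvise(start, npages * pagesz, MADV_DONTNEED);
}
```

Unlike lowering the break or calling munmap(), this works on pages in the
middle of the arena, which is the point being made here.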

DES
--
Dag-Erling Smørgrav - des@des.no

To: <freebsd-current@...>
Cc: Andrew Reilly <andrew-freebsd@...>, Peter Jeremy <peterjeremy@...>, Poul-Henning Kamp <phk@...>, Igor Mozolevsky <igor@...>, Kostik Belousov <kostikbel@...>, Dag-Erling <des@...>
Date: Wednesday, January 9, 2008 - 2:22 pm

Ah, interesting. I was not aware of that.

However, in this context it will likely only help partially since you still
need a full page to be free (and with a lot of programs many allocations will
be significantly smaller than that, and I have to assume no real-life malloc
will align all allocations to pages, or the overhead would be extreme).
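
The full-page requirement comes down to simple rounding, sketched below with
an invented helper: the start of a freed range is rounded up to a page
boundary and its end rounded down, and only what lies between can be
discarded.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Number of complete pages inside the byte range [start, start + len). */
size_t
whole_pages_in_range(uintptr_t start, size_t len, size_t pagesz)
{
    uintptr_t first = (start + pagesz - 1) / pagesz * pagesz;   /* round up */
    uintptr_t last = (start + len) / pagesz * pagesz;           /* round down */

    return last > first ? (last - first) / pagesz : 0;
}
```

With 4096-byte pages, a freed 4096-byte range that starts 100 bytes into a
page contains no complete page at all, so nothing can be given back.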

-- 
/ Peter Schuller

PGP userID: 0xE9758B7D or 'Peter Schuller <peter.schuller@infidyne.com>'
Key retrieval: Send an E-Mail to getpgpkey@scode.org
E-Mail: peter.schuller@infidyne.com Web: http://www.scode.org

To: Peter Schuller <peter.schuller@...>
Cc: Andrew Reilly <andrew-freebsd@...>, Peter Jeremy <peterjeremy@...>, Poul-Henning Kamp <phk@...>, <freebsd-current@...>, Igor Mozolevsky <igor@...>, Kostik Belousov <kostikbel@...>
Date: Thursday, January 10, 2008 - 6:04 am

Page-aligning every allocation would be supremely stupid, and jemalloc
does so only for allocations larger than a page.

DES
--
Dag-Erling Smørgrav - des@des.no

To: Peter Schuller <peter.schuller@...>
Cc: Andrew Reilly <andrew-freebsd@...>, Peter Jeremy <peterjeremy@...>, Poul-Henning Kamp <phk@...>, <freebsd-current@...>, Igor Mozolevsky <igor@...>, Kostik Belousov <kostikbel@...>
Date: Thursday, January 10, 2008 - 10:31 am

I misread your "no" as "any", so it seems we are in violent agreement.

However, most allocators these days are zone or slab allocators (or
similar in principle), and are pretty good at minimizing external
fragmentation except for pathological cases, which are surprisingly rare
in practice.

DES
--
Dag-Erling Smørgrav - des@des.no

To: Poul-Henning Kamp <phk@...>
Cc: Kostik Belousov <kostikbel@...>, Andrew Reilly <andrew-freebsd@...>, <freebsd-current@...>, Peter Jeremy <peterjeremy@...>
Date: Monday, January 7, 2008 - 8:57 pm

Can you provide some refs/links, unfortunately googling for

I don't think it's the kernel that is being ill-mannered (unless, of
course, it's running ZFS ;-)) by eating up the memory, it's the user

How do you propose they 'eavesdrop' on the kernel? Bearing in mind that
most apps nowadays are written for Linux and are hacked to be portable
afterwards (just look at the number of patches in the ports tree),
it's much simpler to write a signal handler than FreeBSD-kernel

I think the problem is not in providing the lamps to indicate the
state, but figuring out an algorithm for judging green->yellow and
yellow->green transitions...

Igor

To: Igor Mozolevsky <igor@...>
Cc: Kostik Belousov <kostikbel@...>, Andrew Reilly <andrew-freebsd@...>, <freebsd-current@...>, Peter Jeremy <peterjeremy@...>
Date: Tuesday, January 8, 2008 - 4:31 am

In message <a2b6592c0801071657s43fcc739jac09baedef7b7532@mail.gmail.com>, "Igor

http://phk.freebsd.dk/pubs

--
Poul-Henning Kamp | UNIX since Zilog Zeus 3.20
phk@FreeBSD.ORG | TCP/IP since RFC 956
FreeBSD committer | BSD since 4.3-tahoe
Never attribute to malice what can adequately be explained by incompetence.

To: Igor Mozolevsky <igor@...>
Cc: <freebsd-current@...>
Date: Monday, January 7, 2008 - 10:34 pm

On Tue, 8 Jan 2008 00:57:21 +0000
"Igor Mozolevsky" <igor@hybrid-lab.co.uk> wrote:

Try PHK+malloc or just phkmalloc for better results. Looking for
misspelled acronyms can be a frustrating and futile undertaking
indeed :)

-- 
Alexander Kabaev

To: Poul-Henning Kamp <phk@...>
Cc: Kostik Belousov <kostikbel@...>, Peter Jeremy <peterjeremy@...>, <freebsd-current@...>, Igor Mozolevsky <igor@...>
Date: Monday, January 7, 2008 - 8:28 pm

On Tue, 08 Jan 2008 00:17:04 +0000

Aah, OK, so there's some essentially system-level caching going
on behind the scenes, and that's readily malleable for this sort
of thing. I thought that you were proposing some way to
propagate the "yellow" or "red" conditions to user-program
activity through malloc, which seems hard, since the only
official out-of-band signal there is a zero return.

I'll have to track down your papers, though, because I thought
that the whole problem revolved around the fact that malloc(3)
doesn't hand out physical pages at all: that was left up to the
kernel vm pager to do as needed. Is it zeroed (and therefore

I agree. That sort of auto-tuning of the space/speed trade-off

I imagine that even if the accounting can be managed efficiently,
the specification of the specific thresholds would be fairly
tricky to specify...

Cheers,

--
Andrew

To: Dag-Erling Smørgrav <des@...>
Cc: Poul-Henning Kamp <phk@...>, <freebsd-current@...>, Jason Evans <jasone@...>, Igor Mozolevsky <igor@...>
Date: Friday, January 4, 2008 - 9:24 am

Another aspect of the problem is that applications have come to depend on
malloc(3) returning NULL when memory is getting tight, and while we have never
done exactly that, we have historically had malloc(3) return NULL when we get
close to the process data segment size.

Robert N M Watson
Computer Laboratory
University of Cambridge

To: Robert Watson <rwatson@...>
Cc: Poul-Henning Kamp <phk@...>, <freebsd-current@...>, Jason Evans <jasone@...>, Igor Mozolevsky <igor@...>
Date: Saturday, January 5, 2008 - 9:50 am

I don't do that any more. Unless the program I'm writing is intended to
run for a long time and can gracefully handle an out-of-memory situation
(such as denying client requests until the situation improves), I write
malloc() wrappers which zero the allocated region before returning to
the caller, to force a SIGSEGV and spare the caller from having to check
the return value.

I sometimes also allocate a little bit extra and stick a magic signature
and an allocation length in there so my free() wrapper can check for
bugs and zero the allocated memory before freeing it. I wouldn't need
any of this if my code only ran on FreeBSD, but most of my $DAYTIME_JOB
code these days runs on Linux first and FreeBSD second.
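
A minimal sketch of the kind of wrappers described above might look like the following. The names `xmalloc`, `xmalloc_len`, `xfree` and the `XMAGIC` value are invented here, not taken from any real code base. One deliberate difference: the sketch calls abort() on a NULL return instead of relying on the zero-fill to fault, since passing a null pointer to memset() is undefined behaviour in C.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

#define XMAGIC ((size_t)0x5ca1ab1eUL)	/* arbitrary signature for this sketch */

/* Header stored in front of each allocation; the union keeps the pointer
 * handed to the caller aligned for any type. */
union xhdr {
	struct {
		size_t magic;
		size_t len;
	} h;
	max_align_t align;
};

/* Allocate len bytes, zero them (touching every page), never return NULL. */
void *
xmalloc(size_t len)
{
	union xhdr *hdr;

	if (len > SIZE_MAX - sizeof(*hdr))
		abort();
	hdr = malloc(sizeof(*hdr) + len);
	if (hdr == NULL)
		abort();		/* assumption: die loudly instead of faulting */
	hdr->h.magic = XMAGIC;
	hdr->h.len = len;
	memset(hdr + 1, 0, len);
	return (hdr + 1);
}

/* Return the length recorded by xmalloc() for this pointer. */
size_t
xmalloc_len(const void *p)
{
	const union xhdr *hdr = (const union xhdr *)p - 1;

	assert(hdr->h.magic == XMAGIC);
	return (hdr->h.len);
}

/* Check the signature, scrub header and payload, then release the block. */
void
xfree(void *p)
{
	union xhdr *hdr;
	size_t total;

	if (p == NULL)
		return;
	hdr = (union xhdr *)p - 1;
	assert(hdr->h.magic == XMAGIC);	/* catches foreign or double frees */
	total = sizeof(*hdr) + hdr->h.len;
	memset(hdr, 0, total);		/* a compiler may elide this store */
	free(hdr);
}
```

Zeroing the region at allocation time is what matters on an overcommitting system: every page is touched immediately, so a shortage shows up at the allocation site rather than at some later first write.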

DES
--
Dag-Erling Smørgrav - des@des.no

To: Igor Mozolevsky <igor@...>
Cc: Dag-Erling Smørgrav <des@...>, <freebsd-current@...>, Robert Watson <rwatson@...>, Jason Evans <jasone@...>
Date: Friday, January 4, 2008 - 7:31 am

Do everyone a favour and research the topic in the archives, please.
Another thread on the subject will just waste everyone's time.

Kris


To: Igor Mozolevsky <igor@...>
Cc: Dag-Erling Smørgrav <des@...>, <freebsd-current@...>, Jason Evans <jasone@...>
Date: Friday, January 4, 2008 - 7:22 am

That will create a sparse file without file system blocks to back it, and is
effectively also over-commit. When the file system runs out of room, you will
get SIGSEGV when the vnode pager discovers it can't write a page to disk. If
you zero-fill it, the blocks are pre-allocated. In a more ideal world, we
might support an ioctl or system call to pre-allocate but not hook up the
blocks until they were written to, in order to avoid writing lots of zeros to
disk, but we don't live in that ideal world yet.
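
The difference between the two ways of extending a file can be sketched in C as follows. The function names and the scratch path passed by the caller are made up for this example, and the 512-byte unit for st_blocks is an assumption that holds on FreeBSD and Linux but is not guaranteed everywhere.

```c
#include <assert.h>
#include <fcntl.h>
#include <string.h>
#include <sys/stat.h>
#include <sys/types.h>
#include <unistd.h>

/* Overwrite the first size bytes of fd with zeros so blocks get allocated. */
int
zero_fill(int fd, off_t size)
{
	char buf[4096];
	off_t done = 0;

	memset(buf, 0, sizeof(buf));
	if (lseek(fd, 0, SEEK_SET) == -1)
		return (-1);
	while (done < size) {
		size_t want = sizeof(buf);
		ssize_t n;

		if ((off_t)want > size - done)
			want = (size_t)(size - done);
		n = write(fd, buf, want);
		if (n <= 0)
			return (-1);	/* e.g. ENOSPC, reported synchronously */
		done += n;
	}
	return (0);
}

/* Disk space backing the file, assuming st_blocks counts 512-byte units. */
off_t
backed_bytes(int fd)
{
	struct stat sb;

	if (fstat(fd, &sb) == -1)
		return (-1);
	return ((off_t)sb.st_blocks * 512);
}

/* Extend a scratch file sparsely, then zero-fill it, recording both sizes. */
int
compare_extend(const char *path, off_t size, off_t *sparse, off_t *filled)
{
	int fd, rc = -1;

	assert(size >= 0);
	fd = open(path, O_RDWR | O_CREAT | O_TRUNC, 0600);
	if (fd == -1)
		return (-1);
	if (ftruncate(fd, size) == 0 &&		/* hole: typically no blocks */
	    (*sparse = backed_bytes(fd)) >= 0 &&
	    zero_fill(fd, size) == 0 &&		/* force allocation now */
	    fsync(fd) == 0 &&
	    (*filled = backed_bytes(fd)) >= 0)
		rc = 0;
	close(fd);
	unlink(path);
	return (rc);
}
```

On a file system that stores holes, the sparse figure is usually close to zero while the zero-filled figure is close to the file size; a file system that compresses or elides zero blocks may report little difference.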

Allowing malloc to support alternative sources of pages for memory mapping,
such as specific files, would be very neat indeed.

Robert N M Watson
Computer Laboratory
University of Cambridge

To: Robert Watson <rwatson@...>
Cc: Dag-Erling Smørgrav <des@...>, <freebsd-current@...>, Jason Evans <jasone@...>
Date: Friday, January 4, 2008 - 7:30 am

Surely you should not be allowed to overcommit on fseek() followed by
write(,,1); zeroing out gigs of hdd space seems rather silly...

Igor

To: Igor Mozolevsky <igor@...>
Cc: Dag-Erling Smørgrav <des@...>, <freebsd-current@...>, Jason Evans <jasone@...>
Date: Friday, January 4, 2008 - 7:38 am

Sparse files are a feature. It just becomes inconvenient at that point
because you discover the lack of space asynchronously from a useful user
process event. When memory pressure gets high, the vnode pager decides it's
time to push a dirty page to disk, and then discovers that there are no free
blocks on the file system to write to. As I mentioned in my e-mail, it would
be nice if our file system supported a way to reserve blocks for files without
hooking them up to the file's visible address space (in order to avoid
zeroing them, which is required if you do want to hook them up for an
unprivileged process). However, that feature doesn't currently exist.

Many systems with sensitivity to on-demand allocation costs and without
security requirements allow files to be extended without zeroing. On systems
with security requirements, this becomes a privileged operation (such as on
Mac OS X) because exposing unzeroed pages from other files or processes not
explicitly shared is Not Allowed.

Robert N M Watson
Computer Laboratory
University of Cambridge

To: Robert Watson <rwatson@...>
Cc: <freebsd-current@...>, Jason Evans <jasone@...>, Igor Mozolevsky <igor@...>
Date: Friday, January 4, 2008 - 8:48 am

Even for files which are intended to be filled up immediately, telling
the file system ahead of time how much data will be written would allow
it to make much better layout decisions.

DES
--
Dag-Erling Smørgrav - des@des.no

To: Robert Watson <rwatson@...>
Cc: <freebsd-current@...>, Jason Evans <jasone@...>, Poul-Henning Kamp <phk@...>
Date: Friday, January 4, 2008 - 5:32 am

Not a good solution on its own. You need a per-process limit as well,
otherwise a malloc() bomb will still cause other processes to fail

Thank you :)

DES
--
Dag-Erling Smørgrav - des@des.no

To: Dag-Erling Smørgrav <des@...>
Cc: <freebsd-current@...>, Jason Evans <jasone@...>, Poul-Henning Kamp <phk@...>
Date: Friday, January 4, 2008 - 7:06 am

That was what I had in mind, the above should read RLIMIT_SWAP.

Robert N M Watson
Computer Laboratory
University of Cambridge

To: Robert Watson <rwatson@...>
Cc: Dag-Erling <des@...>, <freebsd-current@...>, Jason Evans <jasone@...>, Poul-Henning Kamp <phk@...>
Date: Friday, January 4, 2008 - 9:54 am

Robert Watson wrote:

To: Skip Ford <skip@...>
Cc: Dag-Erling <des@...>, <freebsd-current@...>, Robert Watson <rwatson@...>, Jason Evans <jasone@...>, Poul-Henning Kamp <phk@...>
Date: Friday, January 4, 2008 - 9:59 am

Oh, I thought that I was the sole user of the patch. What problems did you
encounter while testing it?

What do you mean by "do 90% of swap"?

To: Kostik Belousov <kostikbel@...>
Cc: Dag-Erling <des@...>, <freebsd-current@...>, Robert Watson <rwatson@...>, Jason Evans <jasone@...>, Poul-Henning Kamp <phk@...>
Date: Friday, January 4, 2008 - 10:11 am

> > > On Fri, 4 Jan 2008, Dag-Erling Sm

To: Skip Ford <skip@...>
Cc: Dag-Erling <des@...>, <freebsd-current@...>, Robert Watson <rwatson@...>, Jason Evans <jasone@...>, Poul-Henning Kamp <phk@...>
Date: Friday, January 4, 2008 - 10:18 am

Ok. The patch really imposes two kinds of limits:
- the total amount of anon memory that can be allocated in the whole
  system (this is what I called "disabling overcommit")
- a per-user RLIMIT_SWAP limit, which accounts allocations by uid. This
  has some obvious problems with the setuid(2) syscall. AFAIR, I ended up
  not moving the accounted numbers to the new uid.

Both limits can be turned on/off independently.

Maybe it is time to revive it.

To: Kostik Belousov <kostikbel@...>
Cc: Dag-Erling <des@...>, <freebsd-current@...>, Robert Watson <rwatson@...>, Jason Evans <jasone@...>, Poul-Henning Kamp <phk@...>
Date: Friday, January 4, 2008 - 10:58 am

> > > > > On Fri, 4 Jan 2008, Dag-Erling Sm

To: Skip Ford <skip@...>
Cc: Kostik Belousov <kostikbel@...>, <freebsd-current@...>, Robert Watson <rwatson@...>, Jason Evans <jasone@...>, Poul-Henning Kamp <phk@...>
Date: Saturday, January 5, 2008 - 10:01 am

Implementing a per-process limit would help fix the setuid() problem,
since the usage of the process calling setuid() would be known and could
be transferred to the new user. There could however be a problem when a
process creates a MAP_SHARED | MAP_ANON mapping, then fork()s, and the
child calls setuid() (think privilege separation). Hopefully, this case
is rare enough (malloc() always uses MAP_PRIVATE) that it can be handled
using the most restrictive interpretation possible rather than trying to
be painstakingly precise.

(BTW, Skip, I find your MUA's use of Mail-Followup-To: offensive; if you
don't want a copy of the followup, set the followup address to the list,
not to a random previous participant in the thread)

DES
--
Dag-Erling Smørgrav - des@des.no

To: Robert Watson <rwatson@...>
Cc: <freebsd-current@...>, Jason Evans <jasone@...>, Poul-Henning Kamp <phk@...>
Date: Friday, January 4, 2008 - 8:34 am

You don't want the default to be so high. You want a low default, with
the possibility for the admin to increase the limit for a particular
user in login.conf or similar without rebooting (which is currently not
possible since the default datasize == maxdsiz, which can only be
changed in the kernel config or loader.conf)

You may also want to have a collective limit for unprivileged users, so
root will still be able to log in if something goes wrong.

DES
--
Dag-Erling Smørgrav - des@des.no
_______________________________________________
freebsd-current@freebsd.org mailing list
http://lists.freebsd.org/mailman/listinfo/freebsd-current
To unsubscribe, send any mail to "freebsd-current-unsubscribe@freebsd.org"

To: Dag-Erling Smørgrav <des@...>
Cc: <freebsd-current@...>, Jason Evans <jasone@...>, Poul-Henning Kamp <phk@...>
Date: Friday, January 4, 2008 - 9:26 am

This will presumably only work for console logins, as sshd (etc) will depend
on unprivileged users, but perhaps that is fine. I'm less concerned with the
details of the implementation or policy than that we simply be able to support
even a basic policy and have it configured by default to prevent
foot-shooting.

Robert N M Watson
Computer Laboratory
University of Cambridge

To: Robert Watson <rwatson@...>
Cc: Dag-Erling Smørgrav <des@...>, <freebsd-current@...>, Jason Evans <jasone@...>, Poul-Henning Kamp <phk@...>
Date: Friday, January 4, 2008 - 2:27 am

I'm not sure that I like that very much. At least, not the way that
it has been explained here, so correct me if I misunderstood.

I have long-lived processes that continuously handle very valuable
data and potentially get very large (several GB). I'd like that
process to be able to make a rational decision about what happens to its
memory contents when an allocation fails rather than having the
proverbial rug pulled out from under it. Rug pulling at any point
can cost an annual salary or two.

Ian

--
Ian Freislich


To: Ian FREISLICH <ianf@...>
Cc: <freebsd-current@...>
Date: Friday, January 4, 2008 - 5:51 am

[Empty message]
To: Peter Jeremy <peterjeremy@...>
Cc: Ian FREISLICH <ianf@...>, <freebsd-current@...>
Date: Friday, January 4, 2008 - 8:47 am

I need to make a slight correction there:

some time ago the patch at
http://people.freebsd.org/~kib/overcommit/index.html
worked, at least I believe so. I implemented an overcommit turn-off knob
and did exact anonymous memory accounting. Quite possibly, the code has
rotted since then.

To: Dag-Erling Smørgrav <des@...>
Cc: <freebsd-current@...>, Jason Evans <jasone@...>
Date: Thursday, January 3, 2008 - 6:23 pm

That is a pretty damning argument in my mind. Why make such a major
change right before the release when it's effectively useless?

Scott

To: <freebsd-current@...>
Cc: Dag-Erling <des@...>, Jason Evans <jasone@...>
Date: Thursday, January 3, 2008 - 6:46 pm

The motivation for the change is to preserve POLA, as malloc() does honor
RLIMIT_DATA in previous releases (4.x, 6.x, etc.). That said, I think
RLIMIT_VMEM is probably more useful going forward. I know at work we have
lots of hacks to deal with maxdsiz and to let apps that use both large
malloc() and large mmap() cooperate. Having one resource limit for malloc
+ mmap is probably best for the future.
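
As a sketch of what a single address-space limit looks like from a program's point of view, the C fragment below temporarily lowers the soft RLIMIT_AS (the portable spelling of the limit FreeBSD also calls RLIMIT_VMEM), attempts one allocation, and restores the old value. The helper name `malloc_under_limit` is invented, and the example assumes a system that actually enforces this limit.

```c
#include <assert.h>
#include <stdlib.h>
#include <sys/resource.h>

/*
 * Lower the soft address-space limit to `limit` bytes (clamped to the hard
 * limit), try one malloc(request), then restore the previous soft limit.
 * Returns 1 if the allocation succeeded, 0 if malloc() returned NULL,
 * -1 if the limit could not be read or changed.
 */
int
malloc_under_limit(rlim_t limit, size_t request)
{
	struct rlimit saved, lowered;
	volatile char *p;
	int ok;

	assert(request > 0);
	if (getrlimit(RLIMIT_AS, &saved) == -1)
		return (-1);
	lowered = saved;
	if (limit > saved.rlim_max)
		limit = saved.rlim_max;	/* soft limit may not exceed hard */
	lowered.rlim_cur = limit;
	if (setrlimit(RLIMIT_AS, &lowered) == -1)
		return (-1);

	p = malloc(request);
	ok = (p != NULL);
	if (ok) {
		p[0] = 1;		/* touch it so the call is not elided */
		free((void *)p);
	}

	if (setrlimit(RLIMIT_AS, &saved) == -1)
		return (-1);
	return (ok);
}
```

With a limit of a few hundred megabytes, a small request is expected to succeed, while a request larger than the limit is expected to fail with NULL whether the allocator obtains its memory through sbrk() or mmap().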

--
John Baldwin

To: John Baldwin <jhb@...>
Cc: Dag-Erling Smørgrav <des@...>, <freebsd-current@...>, Jason Evans <jasone@...>
Date: Thursday, January 3, 2008 - 7:08 pm

If it were happening on a stable branch, I'd agree more with the POLA
argument. The tradeoff between last minute destabilization, which is exactly
what happened here, and the highly imperfect and antiquated justification,
is pretty bogus.

Scott


To: Scott Long <scottl@...>
Cc: Dag-Erling Smørgrav <des@...>, <freebsd-current@...>, Jason Evans <jasone@...>
Date: Thursday, January 3, 2008 - 8:31 pm

The reason I'm more of a fan of introducing LIMIT_SWAP is that I'd like to be
able to specifically avoid swap exhaustion by a process without preventing it

When Alan proposed this as the approach, it was presumably under the
assumption that it would be non-disruptive. As it has proven highly
disruptive, it's obviously not getting MFC'd for the release. Instead we'll
have to work on a solution for after .0, but make sure to document that the
default swap resource limits effectively enforced in all prior FreeBSD
releases are *not* enforced on 7.0, and that administrators wanting to prevent
users from exhausting swap accidentally with something like the following:

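#include <err.h>
#include <stdlib.h>
#include <unistd.h>

int
main(int argc, char *argv[])
{
	char *c;

	/* Allocate and touch one page per iteration, never freeing. */
	while (1) {
		c = malloc(getpagesize());
		if (c == NULL)
			err(-1, "malloc");
		*c = 'a';
	}
}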

will need to now manually set the virtual memory limit in login.conf. Note
that the above strongly resembles frequently run CGI scripts written by many
naive CGI script authors, so is something that we'd like to be robust against
in the same way we prefer to be robust against:

#include <unistd.h>

int
main(int argc, char *argv[])
{

	/* Classic fork bomb: every process keeps forking. */
	while (1) {
		fork();
	}
}

Smacking the user is obviously a good idea, but taking down the multi-user web
server is not.

Robert N M Watson
Computer Laboratory
University of Cambridge

To: <freebsd-current@...>
Cc: Dag-Erling <des@...>, Jason Evans <jasone@...>
Date: Thursday, January 3, 2008 - 5:00 pm

Also, may I humbly inject a user centric view here - it is pretty annoying to
be limited to 500 MB of mallocable memory on 32 bit machines when you expect
3 GB to be usable (with 1 GB mapped to the kernel).

I scratched my head for a long time as to why I was getting out of memory
errors in spite of carefully setting resource limits and ensuring virtual
memory was available; at some later point in time I discovered the hard-coded
distinction between sbrk():able and mmap():able memory. I am not sure where I
was supposed to find this in the documentation (I found it by chance
Googling).

If sbrk() is indeed to be used by the default malloc, one definitely user
visible annoyance will be the 500 MB limit. At least with mmap() that will be
2.5 GB, unless I am mistaken, which is much closer to what one might expect
and thus less likely to cause problems in the common case.

Changing maxdsize to be > 500 MB is probably bad too, from a user centric
view, since you don't want to cause the equivalent problems for programs that
do not use malloc(), but are indeed coded with "modern virtual memory" (as
the man page calls it) in mind. Better to leave this problem to those
programs that use sbrk() directly.

Another consequence is that if the sysadmin really wants a maximum amount of
mmap():able memory, the maxdsize can presumably be lowered quite heftily
without affecting the vast majority of applications. With malloc() use of
sbrk() however, you will have mutual exclusivity between the common case
(malloc() users), and special purpose applications that *do* try to be nice,
modern and use mmap() instead of sbrk(). With mutual exclusivity between
malloc() users and sbrk() users, at least you can kinda blame the sbrk() user
for using an obsolete interface.

-- 
/ Peter Schuller

PGP userID: 0xE9758B7D or 'Peter Schuller <peter.schuller@infidyne.com>'
Key retrieval: Send an E-Mail to g...

To: Peter Schuller <peter.schuller@...>
Cc: Dag-Erling Smørgrav <des@...>, <freebsd-current@...>, Jason Evans <jasone@...>
Date: Thursday, January 3, 2008 - 5:08 pm

amen. :-( Has anyone tried upgrading a system from i386 to amd64
with any success? maxdsize tweaks and reboots are disappointing at best.

To: Jason Fesler <jfesler@...>
Cc: <freebsd-current@...>, Peter Schuller <peter.schuller@...>, Jason Evans <jasone@...>
Date: Friday, January 4, 2008 - 5:07 am

"Sidegrading" is supposed to work now in HEAD; with a little hacking,
you can build an amd64 world and kernel on the i386 world, install the
kernel, reboot, and install world. AFAIK, the required hacking involves
copying /libexec/ld-elf.so.1 to /libexec/ld-elf32.so.1 before rebooting
so the new kernel will be able to run the old binaries. It should also
be possible to install an amd64 world *before* rebooting, in which case
you don't need the aforementioned hackery (installworld will do it for
you) but you may have trouble doing anything at all after installworld
since your new world will not run on the old kernel. The install
process itself doesn't care, since it copies all the i386 binaries and
libraries it needs before installing anything.

DES
--
Dag-Erling Smørgrav - des@des.no

To: Dag-Erling Smørgrav <des@...>
Cc: Peter Schuller <peter.schuller@...>, <freebsd-current@...>, Jason Evans <jasone@...>
Date: Friday, January 4, 2008 - 1:55 pm

I wonder when we'll have to standardize /libexec/<arch>/ to support
multiple architectures for things like ld-elf.so.1. It used to only
be a concern for those rare people running diskless over multiple
architectures, but the case of i386 binaries on amd64 is a little
more common.

On the other hand, if ld-elf.so.1 is fairly unique in this
concern, it might be simpler to rename it to:
ld-elf-{i386,amd64,ppc,...}.so.1

Tim Kientzle

To: Tim Kientzle <kientzle@...>
Cc: Dag-Erling Smørgrav <des@...>, <freebsd-current@...>, Peter Schuller <peter.schuller@...>, Jason Evans <jasone@...>
Date: Friday, January 4, 2008 - 5:25 pm

Good point, it's silly that an i386 binary requires ld-elf32.so.1 when
running on an amd64 kernel, but ld-elf.so.1 when running on an i386 kernel.
It adds unneeded complexity for running an i386 jail or chroot on amd64,
for example.

I wonder if we can do what Tim said - rename the dynamic loader to actually
include the architecture name. I am pretty sure it would allow us to remove
quite a few special cases from the kernel elf/emulation code and possibly
from the cross build logic.

-Maxim

To: Maxim Sobolev <sobomax@...>
Cc: Dag-Erling Smørgrav <des@...>, Tim Kientzle <kientzle@...>, Peter Schuller <peter.schuller@...>, Jason Evans <jasone@...>, <freebsd-current@...>
Date: Friday, January 4, 2008 - 5:42 pm

While this doesn't count as an explicit vote against the rename, we can
solve the chroot problem easily. I did this once already, but for some
reason never got around to committing it.

However, renaming ld-elf.so.1 is a bad idea in general. Yes, it would have
been better to have had the arch name in there from the start, but it
doesn't. It is unfortunate, but I feel that changing it will cause far more
pain across the board than it would solve for the specific case of chrooting
i386 binaries. I don't think it is worth it.

There are a whole bunch of references to the ld-elf.so.1 name. Not just in
our tree, but in external 3rd party code. Even things like gdb "know" how
to handle ld-elf.so.1. Getting those upstream folks to add additional
strcmp()'s for ld-elf-i386.so.1, ld-elf-amd64.so.1 etc will be hard enough,
and it will add another hurdle that minor platform maintainers have to
overcome. ld-elf-mips-be-4Kc.so.1 anybody? (ok, that last one is a
stretch)

Anyway, I'm not absolutely against it, but I think it will be a net loss
overall. We'll have more pain than I think it is worth, especially since
the alternatives are much easier.

-Peter

To: Peter Wemm <peter@...>
Cc: Jason Evans <jasone@...>, <freebsd-current@...>, Dag-Erling Smørgrav <des@...>, Peter Schuller <peter.schuller@...>
Date: Friday, January 4, 2008 - 11:51 pm

Details? Does your approach also solve the problem of
sharing /usr across different architectures (either in
a diskless NFS environment or a dual-boot scenario with

I'm not sure that I see the problem. What am I missing?
1) gdb is built to debug binaries for a particular architecture.
(gdb/ARM can't debug gdb/i386 binaries)
2) gdb therefore only needs to check for "ld-elf-"`uname -m`".so.1",
which is easy to handle when gdb itself is built.

I can see some subtleties for cross-builds, but nothing
outrageous.

It also seems that your argument applies just as well to
ld-elf.so.1 and ld-elf32.so.1. Either way, there's more
than one ld-elf.so.1, and therefore more than one name
to keep track of.

I'm not championing the rename by any means, just trying
to better understand the issues. The fact that amd64 can
run i386 binaries but not vice-versa has a lot of subtle
implications. Also, this is the first time that FreeBSD
has really had large user bases on two fundamentally
different architectures, so it's the first time we've
really had to confront some of these support issues
(such as the shared /usr scenario).

Tim Kientzle

To: Tim Kientzle <kientzle@...>
Cc: <freebsd-current@...>, Peter Schuller <peter.schuller@...>, Jason Evans <jasone@...>, Peter Wemm <peter@...>
Date: Saturday, January 5, 2008 - 10:16 am

We don't embed ld-elf32.so.1 in 32-bit binaries; if we did, we couldn't
run unmodified i386 binaries on amd64, or move i386 binaries built on an
amd64 system to a real i386 system. Instead, the kernel automagically
translates ld-elf.so.1 to ld-elf32.so.1 for 32-bit binaries, and gdb is
none the wiser.

(see src/sys/sys/imgact_elf.h, src/sys/kern/imgact_elf.c, and the
various instances of Elf_Brandinfo, Elf32_Brandinfo and Elf64_Brandinfo
in the kernel for the precise details of how this is done)
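
The translation described above can be modelled in a few lines of C. This is a deliberately simplified stand-in, not the kernel's Elf_Brandinfo or its lookup code: the structure, its field names as used here, and `resolve_interp` are assumptions made for illustration only.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Simplified model of a brand entry; NOT the real kernel structure. */
struct brand_model {
	const char *interp_path;	/* interpreter path recorded in binaries */
	const char *interp_newpath;	/* path to load instead, or NULL */
};

/* 32-bit binaries on a 64-bit kernel: substitute the compat loader. */
const struct brand_model compat32_brand = {
	"/libexec/ld-elf.so.1", "/libexec/ld-elf32.so.1"
};

/* Native binaries: use the recorded path unchanged. */
const struct brand_model native_brand = {
	"/libexec/ld-elf.so.1", NULL
};

/*
 * Return the loader path to use for a binary that asks for `requested`
 * under brand `b`.  The binary itself is never modified.
 */
const char *
resolve_interp(const struct brand_model *b, const char *requested)
{
	assert(b != NULL && requested != NULL);
	if (b->interp_newpath != NULL &&
	    strcmp(requested, b->interp_path) == 0)
		return (b->interp_newpath);
	return (requested);
}
```

The point of the model is that the name embedded in the binary stays "ld-elf.so.1" everywhere, so tools that recognise that name keep working, and only the file actually opened differs.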

DES
--
Dag-Erling Smørgrav - des@des.no

To: Dag-Erling Smørgrav <des@...>
Cc: <freebsd-current@...>, Peter Schuller <peter.schuller@...>, Jason Evans <jasone@...>, Peter Wemm <peter@...>
Date: Sunday, January 13, 2008 - 3:33 pm

Ah! I see. So let me see if I understand:

* Peter Wemm's concern about gdb is that the library
reference in the compiled binaries always be "ld-elf.so.1"
so that the debugger and other system tools can identify
references to this special library.
* The kernel already has logic to translate refs to this
library to "ld-elf<platform>.so.1" for the single
special case of i386 on amd64 (even though "i386" seems to
be spelled "32" in this case).
* "Side-grades" from i386 to amd64 have the complication of
having to rename ld-elf.so.1 at some point in the process.
Failure to do this at the correct point risks breaking the
entire system.

It still seems that renaming ld-elf.so.1 to ld-elf-<platform>.so.1
on disk would solve the side-grade problem (nothing to be renamed,
only a new ld-elf-amd64.so.1 to install), and the existing
kernel translation logic could be generalized to allow all
binaries to refer to ld-elf.so.1, thus addressing the gdb
problem in the same way it's been handled for this case for
some time.

I suppose the question boils down to:
* If the kernel translates "ld-elf.so.1" to "ld-elf32.so.1"
for i386 binaries on amd64, why should it not do so for i386
binaries on i386 as well? This special case seems to be the
root cause of at least some of the side-grade problems being

Thanks for the pointers; this isn't an area I've looked closely
at before...

Variant symlinks may also provide a solution to this, but I'm
not as familiar with the mechanism behind that, the current
implementation status, or how that would interact with
issues such as running i386 binaries on amd64 systems.

Of course, variant symlinks could also solve the more
general problem of handling other shared libraries correctly
in multi-architecture environments. (Maybe even provide
a cleaner solution to the problem of Linux binaries on
FreeBSD? Or is that too much to hope for? ;-)

Tim Kientzle

To: <freebsd-current@...>
Cc: Dag-Erling <des@...>, Tim Kientzle <kientzle@...>, Peter Schuller <peter.schuller@...>, Jason Evans <jasone@...>, Peter Wemm <peter@...>
Date: Monday, January 14, 2008 - 9:51 am

It's not an arch name. The '32' is because it is part of the
freebsd32 compat ABI which provides an alternate syscall table
just like ABIs for SVR4, IBCS2, and Linux. freebsd32 is
not i386-specific, but instead is split into an MI portion
that provides generic 32-bit wrapping for 64-bit platforms
and MD backends (ia32 ABI on amd64 and ia64 for i386 currently).

If you had foo32 and foo64 archs then freebsd32 (and thus
ld-elf32.so.1) would be used on foo64 systems to run foo32

I think the side-grade is such a special case (are you going to
side-grade from ia64 to alpha?) that it doesn't warrant changing
the rest of the system. We don't have /usr/<arch>-bin instead
of /usr/bin.

--
John Baldwin

To: Tim Kientzle <kientzle@...>
Cc: <freebsd-current@...>, Peter Schuller <peter.schuller@...>, Jason Evans <jasone@...>, Peter Wemm <peter@...>
Date: Sunday, January 13, 2008 - 3:48 pm

Not only that, but also for 32-bit Linux binaries on i386, and 32-bit
and 64-bit Linux binaries on amd64.

DES
--
Dag-Erling Smørgrav - des@des.no

To: <freebsd-current@...>
Cc: Dag-Erling <des@...>, Tim Kientzle <kientzle@...>, Peter Schuller <peter.schuller@...>, Jason Evans <jasone@...>, Peter Wemm <peter@...>
Date: Monday, January 14, 2008 - 9:46 am

Not for Linux, but it is for any 32-bit FreeBSD apps on a 64-bit kernel.
So currently i386 on amd64 and i386 on ia64, but it could also be used
to run 32-bit sparc binaries (if we had any) on sparc64.

--
John Baldwin

To: Dag-Erling <des@...>
Cc: Peter Wemm <peter@...>, <freebsd-current@...>, Jason Evans <jasone@...>, Tim Kientzle <kientzle@...>, Peter Schuller <peter.schuller@...>
Date: Monday, January 14, 2008 - 7:03 am

Quoting Dag-Erling Smørgrav <des@des.no> (from Sun, 13 Jan 2008

Could you please point out where we have 64-bit Linux stuff? I'm only
aware of 32-bit Linux stuff (on i386 and amd64).

Bye,
Alexander.

--
"It was nice of you to let me reattach your arm."
--Zoidberg

http://www.Leidinger.net Alexander @ Leidinger.net: PGP ID = B0063FE7
http://www.FreeBSD.org netchild @ FreeBSD.org : PGP ID = 72077137

To: Tim Kientzle <kientzle@...>
Cc: Dag-Erling Smørgrav <des@...>, Peter Wemm <peter@...>, Jason Evans <jasone@...>, <freebsd-current@...>, Peter Schuller <peter.schuller@...>
Date: Saturday, January 5, 2008 - 3:32 am

The main issue is NOT sharing / or /usr or /usr/local; that is peanuts.
root and usr come to less than 500 MB, and /usr/local, though big, is handled
neatly by amd (the automounter).
Cross building is one issue, but the real problem is sharing users' binaries.
On Apple systems one can compile a binary for both i386 and ppc, and the binary
is twice as big. Side note: I compiled such a program, but by mistake chose
two different binaries to be joined, and imagine my surprise when it acted
differently from expected.
We have come a long way since the days when a wrong-architecture a.out would
just coredump.
In the old days, we had ~/bin/$arch in our path to keep different
binaries; those were the days of VAX/Sun, but since i386 arrived, this has been
forgotten. Now we are considering deploying amd64, and it would be nice
if it could be a two-way street: amd64 can run i386, but i386 should run the
i386 version ...

just blabbering before coffee.
danny


To: Danny Braniss <danny@...>
Cc: Dag-Erling Smørgrav <des@...>, <freebsd-current@...>, Jason Evans <jasone@...>, Tim Kientzle <kientzle@...>, Peter Schuller <peter.schuller@...>
Date: Saturday, January 5, 2008 - 6:24 pm

It isn't very hard to do this at all. I did it as a proof-of-concept a few
months ago:

peter@overcee[2:18pm]/tmp/demo-218> cat foo.c
#include <stdio.h>

main()
{
#ifdef __i386__
printf("Platform = i386\n");
#endif
#ifdef __amd64__
printf("Platform = amd64\n");
#endif
}
peter@overcee[2:18pm]/tmp/demo-219> ./foo_i386
Platform = i386
peter@overcee[2:19pm]/tmp/demo-220> ./foo_amd64
Platform = amd64
peter@overcee[2:19pm]/tmp/demo-221> cat foo.c
#include <stdio.h>

main()
{
#ifdef __i386__
printf("Platform = i386\n");
#endif
#ifdef __amd64__
printf("Platform = amd64\n");
#endif
}
peter@overcee[2:19pm]/tmp/demo-222> which cc
/usr/bin/cc
peter@overcee[2:19pm]/tmp/demo-223> cc -o foo_amd64 foo.c
peter@overcee[2:19pm]/tmp/demo-224> cc -m32 -o foo_i386 foo.c
peter@overcee[2:19pm]/tmp/demo-225> file foo_*
foo_amd64: ELF 64-bit LSB executable, x86-64, version 1 (FreeBSD), for
FreeBSD 8.0 (800006), dynamically linked (uses shared libs), FreeBSD-style,
not stripped
foo_i386: ELF 32-bit LSB executable, Intel 80386, version 1 (FreeBSD), for
FreeBSD 8.0 (800006), dynamically linked (uses shared libs), FreeBSD-style,
not stripped
peter@overcee[2:19pm]/tmp/demo-226> ./foo_i386
Platform = i386
peter@overcee[2:19pm]/tmp/demo-227> ./foo_amd64
Platform = amd64
peter@overcee[2:19pm]/tmp/demo-228> uname -m
amd64

What I did was a half-dozen lines of a hack to our bmake glue for gcc. It
is a hack though because I did it as specs overrides rather than have it
figure the correct #include paths. This means my version doesn't interact
with -nostdinc mode correctly. Doing it to correctly handle the paths isn't
much harder.

-Peter

To: <freebsd-current@...>
Cc: Dag-Erling Smørgrav <des@...>, Tim Kientzle <kientzle@...>, Peter Schuller <peter.schuller@...>, Jason Evans <jasone@...>
Date: Sunday, January 6, 2008 - 3:56 am

What Apple has is one file that will run the appropriate binary when run
on an i386 or a ppc, not two different files: a universal binary, not Rosetta.

[macbook:system/danny/tmp] danny% uname -p
i386
[macbook:system/danny/tmp] danny% gcc -arch i386 foo.c -o foo_i386
[macbook:system/danny/tmp] danny% gcc -arch ppc foo.c -o foo_ppc
[macbook:system/danny/tmp] danny% lipo -create -arch ppc foo_ppc -arch i386 foo_i386 -output foo
[macbook:system/danny/tmp] danny% file foo
foo: Mach-O universal binary with 2 architectures
foo (for architecture ppc7400): Mach-O executable ppc
foo (for architecture i386): Mach-O executable i386
[macbook:system/danny/tmp] danny% ./foo
Platform = i386
[macbook:system/danny/tmp] danny% ls -lsi foo
17768042 57 -rwxr-xr-x 1 danny wheel 28972 Jan 6 09:32 foo
===================================
twister> uname -p
powerpc
twister> ./foo
Platform = ppc
twister> ls -lsi foo
17768042 57 -rwxr-xr-x 1 danny wheel 28972 Jan 6 09:32 foo

danny


To: Danny Braniss <danny@...>
Cc: <freebsd-current@...>, Dag-Erling Smørgrav <des@...>, Jason Evans <jasone@...>, Tim Kientzle <kientzle@...>, Peter Schuller <peter.schuller@...>
Date: Sunday, January 6, 2008 - 4:42 pm

On Sun, 06 Jan 2008 09:56:32 +0200

Sure, but that's got a bunch of different driving factors. I
don't know, for example, whether you can build a four-way
executable (ia32, x86_64, ppc, ppc64). Well, you probably can,
but I'd be a bit surprised if anyone has. FreeBSD supports even
more architectures: it just doesn't scale. The best bet for
something that has to run everywhere is probably LLVM or TNEF.

The advantage that Unix has over MacOS is that we aren't trying
to squeeze everything into single "application" directories. So
it's reasonable to have "share", and select executables on the
basis of PATH. That's how it has worked before. Most sites
don't have more than two or three different architectures to
support, anyway.
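
A minimal sketch of the PATH-based approach, assuming a per-user layout of `<home>/bin/<machine>` as in the ~/bin/$arch convention mentioned earlier in the thread. The function names are hypothetical; only snprintf(3) and uname(3) are real interfaces.

```c
#include <stdio.h>
#include <sys/utsname.h>

/*
 * Write "<home>/bin/<machine>" into buf.
 * Returns 0 on success, -1 if the result does not fit.
 */
static int
arch_bin_dir(const char *home, const char *machine, char *buf, size_t len)
{
	int n = snprintf(buf, len, "%s/bin/%s", home, machine);

	return (n < 0 || (size_t)n >= len) ? -1 : 0;
}

/* Same, but take the machine name from the running kernel via uname(3). */
static int
host_bin_dir(const char *home, char *buf, size_t len)
{
	struct utsname u;

	if (uname(&u) != 0)
		return -1;
	return arch_bin_dir(home, u.machine, buf, len);
}
```

A login script could prepend the resulting directory to PATH, so the same shared home directory serves a different set of binaries on each architecture.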

If we do get much further with multi-architecture bin and lib,
and people actively use these on diskless setups or
multi-architecture hosts (amd64/ia32, or other 64/32 bit
combinations being the most common) then perhaps it would be nice
to have a share/bin where platform-independent scripts (shell,
perl, python) as well as dynamic-translated binaries (JVM, LLVM,
etc) can live?

Cheers,

--
Andrew

To: Andrew Reilly <andrew-freebsd@...>
Cc: Dag-Erling Smørgrav <des@...>, <freebsd-current@...>, Jason Evans <jasone@...>, Tim Kientzle <kientzle@...>, Peter Schuller <peter.schuller@...>
Date: Monday, January 7, 2008 - 5:42 am

Two-way i386 + amd64 executables would be very useful, since they can
run on the same hardware with just a change of kernel.

DES
--
Dag-Erling Smørgrav - des@des.no

To: Dag-Erling <des@...>
Cc: Dag-Erling Smørgrav <des@...>, Jason Evans <jasone@...>, Tim Kientzle <kientzle@...>, <freebsd-current@...>, Peter Schuller <peter.schuller@...>
Date: Monday, January 7, 2008 - 7:30 pm

On Mon, 07 Jan 2008 10:42:49 +0100

How is that useful? I386 executables can run on the same hardware
with the same changes of kernel. If you're not planning to
change kernel, then you can use amd64-only. I thought that the
whole fat-binary issue revolved around binary distribution (also
by networked file systems) to *different* architectures. Well,
that's what Apple and NeXT seem to have used them for. Apollo,
Sun, MIPS/SGI, HP(?) always seemed to manage with PATH
configurations and/or variant symlinks. I can't see why that
would be any harder for FreeBSD?

Cheers,

--
Andrew

To: Andrew Reilly <andrew-freebsd@...>
Cc: Dag-Erling Smørgrav <des@...>, <freebsd-current@...>, Jason Evans <jasone@...>, Tim Kientzle <kientzle@...>, Peter Schuller <peter.schuller@...>
Date: Tuesday, January 8, 2008 - 5:27 am

...but they cannot take advantage of the full capabilities of amd64 (not
just address space, but larger number of general-purpose registers etc.)
Even further, an i386 binary built for maximum compatibility cannot
assume SSE2 support, while an amd64 binary can. Conversely, there are
(admittedly not many, but some) workloads that run faster on i386 than
on amd64.

Imagine having a single binary distribution and a single install CD or
DVD that runs unmodified on i386 and amd64 - that would cover 90% or
more of our user base.

DES
--
Dag-Erling Smørgrav - des@des.no

To: Peter Wemm <peter@...>
Cc: Jason Evans <jasone@...>, <freebsd-current@...>, Dag-Erling Smørgrav <des@...>, Tim Kientzle <kientzle@...>, Peter Schuller <peter.schuller@...>
Date: Friday, January 4, 2008 - 7:42 pm

P.S. I wonder why gdb(1) and friends need those strcmp() calls and don't use
the appropriate field from the ELF header. I am pretty sure that the dynamic
linker name is embedded in there at link time. At least that's the very
first string that is returned by invoking strings(1) on any dynamic binary.
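
For reference, the linker records the interpreter path in a PT_INTERP program header, which a tool can read directly. The sketch below assumes a 64-bit ELF image already loaded into memory in the host's byte order, and a system that provides <elf.h>; the function name is made up for illustration.

```c
#include <elf.h>
#include <stddef.h>
#include <string.h>

/*
 * Return the PT_INTERP path stored in a 64-bit ELF image,
 * or NULL if the image is malformed or has no interpreter.
 */
static const char *
elf64_interp(const unsigned char *image, size_t size)
{
	Elf64_Ehdr eh;
	Elf64_Phdr ph;
	size_t i, off;

	if (size < sizeof(eh))
		return NULL;
	memcpy(&eh, image, sizeof(eh));	/* copy out to avoid alignment traps */
	if (memcmp(eh.e_ident, ELFMAG, SELFMAG) != 0 ||
	    eh.e_ident[EI_CLASS] != ELFCLASS64 ||
	    eh.e_phentsize != sizeof(ph))
		return NULL;

	for (i = 0; i < eh.e_phnum; i++) {
		off = eh.e_phoff + i * sizeof(ph);
		if (off > size || size - off < sizeof(ph))
			return NULL;
		memcpy(&ph, image + off, sizeof(ph));
		if (ph.p_type != PT_INTERP)
			continue;
		/* The path must lie inside the image and be NUL-terminated. */
		if (ph.p_filesz == 0 || ph.p_offset > size ||
		    size - ph.p_offset < ph.p_filesz ||
		    image[ph.p_offset + ph.p_filesz - 1] != '\0')
			return NULL;
		return (const char *)(image + ph.p_offset);
	}
	return NULL;
}
```

On a FreeBSD dynamic binary this would be expected to yield a path ending in ld-elf.so.1, which is the string the debugger could compare against instead of a hard-coded name.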

-Maxim

To: Peter Wemm <peter@...>
Cc: Jason Evans <jasone@...>, <freebsd-current@...>, Dag-Erling Smørgrav <des@...>, Tim Kientzle <kientzle@...>, Peter Schuller <peter.schuller@...>
Date: Friday, January 4, 2008 - 7:38 pm

I see. What about moving it into /libexec/<arch>/? Is that a better approach?

-Maxim

To: Maxim Sobolev <sobomax@...>
Cc: <freebsd-current@...>, Tim Kientzle <kientzle@...>, Peter Schuller <peter.schuller@...>, Jason Evans <jasone@...>, Peter Wemm <peter@...>
Date: Saturday, January 5, 2008 - 10:03 am

I'd rather see us implement variant symlinks, have /libexec symlink to
/libexec.%ARCH%, and let the kernel sort'em out...
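
A rough sketch of the expansion step such a variant symlink would need, written as a userland function. The %ARCH% token comes from the suggestion above; the function name and the idea of doing the substitution on a plain string are assumptions for illustration, not an existing FreeBSD interface.

```c
#include <stddef.h>
#include <string.h>

/*
 * Expand every "%ARCH%" in target into arch, writing the result to out.
 * Returns 0 on success or -1 if out is too small.
 */
static int
expand_variant(const char *target, const char *arch, char *out, size_t outlen)
{
	static const char token[] = "%ARCH%";
	const size_t toklen = sizeof(token) - 1;
	const size_t archlen = strlen(arch);
	size_t used = 0;

	if (outlen == 0)
		return -1;
	while (*target != '\0') {
		const char *src = target;
		size_t n = 1;

		if (strncmp(target, token, toklen) == 0) {
			src = arch;	/* substitute the whole token */
			n = archlen;
			target += toklen;
		} else {
			target++;	/* copy one ordinary character */
		}
		if (n >= outlen - used)	/* keep one byte for the NUL */
			return -1;
		memcpy(out + used, src, n);
		used += n;
	}
	out[used] = '\0';
	return 0;
}
```

With arch set to "amd64", the target "/libexec.%ARCH%" would resolve to "/libexec.amd64"; a 32-bit process on the same machine could be handed a different arch string and land in a different directory.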

DES
--
Dag-Erling Smørgrav - des@des.no

To: <freebsd-current@...>
Date: Saturday, January 5, 2008 - 4:56 pm

Been listening to Terry recently? ;-)
--
Ollivier ROBERT -=- FreeBSD: The Power to Serve! -=- roberto@keltia.freenix.fr
Darwin sidhe.keltia.net Version 8.10.1: Wed May 23 16:33:00 PDT 2007 i386
