KernelTrap: No Updates Through January 8th

Submitted by Jeremy
on December 30, 2005 - 3:00am

I won't be posting updates to the KernelTrap front page through January 8th, as I'll be away on a short vacation. The many tasks that have kept me too busy to maintain the site regularly over the past few months are finally wrapping up, so the new year should see much more frequent updates again. Until then, be sure to visit the forums and journals, which continue to be updated.

I'll be in Portland visiting family and friends. In Florida my house is finally on the market. Hopefully it'll sell quickly, as then I'll be set to buy a sailboat. KernelTrap is only a hobby that I do in my spare time, but I'm setting myself up to have a lot more spare time, which should bode well for the site. I've also been researching satellite phones so I can stay online even when I'm out on the ocean - not cheap or fast, but it seems there are viable options.

The next big trip I have planned is for the last ten days of January: we've rented a boat on Abaco and will spend the time sailing around the Bahamas. It's a dry run to be sure we really want to own a boat and spend some serious time on the water under wind power. Beyond that, I may take a little time off in February and/or March to travel to some foreign lands, but that depends on the house selling well.

Progress on upgrading KernelTrap has been slow because of all the time I've put into getting my house ready. But I'm still hoping to upgrade in January, including a shiny new theme. I've got a number of good plans underway to make the site more usable, and I've given some thought to new forms of original content that should be interesting to KernelTrap readers. So please be patient, as there is good stuff coming.

Report as spam?

on
December 31, 2005 - 10:34am

Just out of curiosity... does it make sense to have a "report as spam" link for articles that make the KernelTrap front page?

OT

Shy (not verified)
on
December 31, 2005 - 9:51pm

Hey guys, look what I found! Jeremy's diary... I can't wait for the next post...

2006 ToDo: B+star-Tree in dyn. fileswap for AMD64 only.

Anonymous (not verified)
on
January 1, 2006 - 6:38pm

It's important to remove the 2 GiB barrier of the old swapspace.

There are many reasons:
People want Boehm's GC and GCJ-4.1 to work in a normal environment without 'out of memory' crashes.

Known problems recently:
http://gcc.gnu.org/ml/java/2005-12/msg00033.html
http://gcc.gnu.org/ml/gcc/2005-12/msg00046.html
http://gcc.gnu.org/ml/gcc/2005-12/msg00778.html
http://gcc.gnu.org/ml/gcc/2005-12/msg00780.html

For ease of implementation, the main target is x86-64 because it's the most popular 64-bit processor.

The AMD64 power in the desktop:
* easy multithreading, with no small per-thread memory zones.
* apparently unlimited virtual memory when using a huge, cheap hard disk (e.g. 250 GB).
* 16 general-purpose registers vs. 8.
* Long software-hardware life cycle (how long has i386 lasted since 198x?).

You want to fix this by throw

Anonymous (not verified)
on
January 2, 2006 - 6:31am

You want to fix this by throwing more memory at it?
Sure, I can see why one might want to remove the 2G limit, but it sounds like gcc/gcj is just fucked when it comes to memory usage.

It's actually make's fault.

on
January 2, 2006 - 12:43pm

Actually, all the links he posted point to problems with make, not GCC or GCJ. It's just that GCJ's huge dependency set sends make on a bender.

Hope all goes well

Inhibit (not verified)
on
January 1, 2006 - 11:54pm

Thanks for running such a great Linux resource. Having all the threads in a concise format so I can both read and link to them is a real time saver.

I hope all goes well with the home sale and sailing!

[OT] Is the moniker GCC 3.slow officially dead now?

on
January 2, 2006 - 8:03pm

I just ran, for a second mind-numbingly boring time, a benchmark between GCC 2.96 (which is basically 2.95 plus some bugfix patches) and GCC 3.4.5 compiling the Linux 2.6.14 kernel on a Pentium 60. For each compiler I did:

make clean; make oldconfig; time make bzImage

Thus, the time reported is just for "make bzImage".

GCC 2.96:

real    198m8.239s
user    186m34.780s
sys     10m1.290s

GCC 3.4.5:

real    178m34.919s
user    168m56.160s
sys     7m50.960s

Looks like the name GCC 3.slow only applies if the previous name was GCC 2.slower.

I know I posted similar data from my first run in the "Dropping support for GCC 2.95" article yesterday, but I figured I would post the second run a little higher on the main page so that someone might actually see it.

A retraction is in order.

on
January 3, 2006 - 10:02pm

I was so surprised myself by the GCC 2.96 results that I wondered if 2.96 was the minimal delta I claimed it to be. Looking through Andrew Morton's patch to remove 2.95 support hinted it might be closer to 3.0 than 2.95. Sure enough, the benchmark numbers seem to agree.

GCC 2.95.3:

real    120m43.057s
user    110m54.690s
sys     7m57.820s

So, going by wall-clock time, GCC 3.4.5 takes about 10% less time than 2.96, and about 48% more time than 2.95.3.
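For anyone who wants to check the comparison, here is a minimal Python sketch that converts the "real" times quoted in this thread to seconds and computes the relative change in build time. The function and variable names are mine, purely for illustration. It uses only wall-clock time, so a comparison based on "user" time will come out slightly different.

```python
import re

# "real" (wall-clock) times for "make bzImage" as quoted in this thread.
REAL_TIMES = {
    "2.95.3": "120m43.057s",
    "2.96": "198m8.239s",
    "3.4.5": "178m34.919s",
}


def to_seconds(stamp: str) -> float:
    """Convert a time(1)-style stamp such as '178m34.919s' to seconds."""
    match = re.fullmatch(r"(\d+)m(\d+(?:\.\d+)?)s", stamp)
    if match is None:
        raise ValueError(f"unrecognised time stamp: {stamp!r}")
    minutes, seconds = match.groups()
    return int(minutes) * 60 + float(seconds)


def percent_change(new: str, old: str) -> float:
    """Percentage change in build time when moving from compiler `old` to `new`.

    Negative means `new` builds the kernel faster, positive means slower.
    """
    t_new = to_seconds(REAL_TIMES[new])
    t_old = to_seconds(REAL_TIMES[old])
    return (t_new / t_old - 1.0) * 100.0


if __name__ == "__main__":
    for old in ("2.96", "2.95.3"):
        change = percent_change("3.4.5", old)
        print(f"GCC 3.4.5 vs GCC {old}: {change:+.1f}% build time")
```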

Taking about 50% longer is pretty noticeable. With this only quasi-scientific analysis, it seems GCC 3.x does deserve, at least somewhat, the moniker GCC 3.slow. Granted, on a faster machine, the difference should narrow. (Other factors of the build such as disk access speed make more difference if you're less CPU bound.) And on a machine with more RAM it should narrow further with "make -j2 bzImage," but nonetheless there *is* a large compilation time delta here.

Well, but it is supposed to p

Anonymous (not verified)
on
January 4, 2006 - 4:50am

Well, but it is supposed to produce better code, isn't it?
TANSTAAFL.

This is true.

on
January 4, 2006 - 9:05am

This is very true. At the same time, though, the GCC crew has optimized other portions of GCC itself to try to offset increases in compile time. Also, there have been several complaints that GCC got slower with no actual benefit in some cases.

If people *really* care about build time for an edit-compile-regress cycle, they could start by turning the optimization level down. The truly psycho about build time could use TCC.

is the code it produces 50% b

tu (not verified)
on
January 4, 2006 - 2:44pm

is the code it produces 50% better?

Is that really the appropriate measure?

on
January 4, 2006 - 3:00pm

Really, what matters is the total time saved vs. the total time compiling and running the program.

The typical end user compiles the software at most one time (zero times if they download binaries), and then uses it many, many times. The ratio of compiles to runs nears zero. A small boost in performance in the end app can therefore justify a fairly large increase in compilation time, and have it be an overall win.

Developers, though, sit in an edit-compile-debug loop. Thus, the ratio of compiles to runs is closer to 1.
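To make the compiles-versus-runs argument concrete, here is a rough Python sketch. All the numbers in it are hypothetical, picked only to illustrate the shape of the trade-off, and the function names are my own.

```python
import math


def total_time(compile_s: float, run_s: float, compiles: int, runs: int) -> float:
    """Total seconds spent compiling and running a program."""
    return compiles * compile_s + runs * run_s


def break_even_runs(extra_compile_s: float, saved_per_run_s: float,
                    compiles: int = 1) -> int:
    """Number of runs needed before a slower compiler that emits faster code
    has paid back its extra compile time."""
    if saved_per_run_s <= 0:
        raise ValueError("the optimised build must save time on every run")
    return math.ceil(compiles * extra_compile_s / saved_per_run_s)


if __name__ == "__main__":
    # Hypothetical end user: one compile costing 60 s extra, each run 0.5 s faster.
    print("end user breaks even after", break_even_runs(60.0, 0.5), "runs")
    # Hypothetical developer: recompiles before every run, so 100 compiles.
    print("developer breaks even after", break_even_runs(60.0, 0.5, compiles=100), "runs")
```

With one compile and many runs the break-even point arrives quickly. When every run is preceded by a fresh compile, it keeps moving away, which is the developer's situation.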

I remember seeing benchmarks of the Linux kernel compiled with and without heavy optimization enabled. (This was years ago.) The resulting kernel performance moved only slightly, which makes sense. The goal of the kernel is to run as little as possible anyway, leaving most things to userspace. Performance should be dominated by the workload, whether it's the app, disk I/O or network I/O.

These days, it seems more problematic to compile the kernel w/ optimization disabled. I seem to recall certain macros/inline assembly don't compile correctly below -O2. 'Tis a shame.

Only -O2 or -Os.

on
January 4, 2006 - 7:58pm

The kernel only uses -O2, or -Os if you enable the optimise-for-size option. With either option there are numerous workarounds for flags that don't work properly within the kernel, so it is never purely -O2 or -Os. The performance drop of going to -O0 would be considerable, by the way, and the kernel doesn't actually compile or work if you try it. -Os is turning out to be the new favourite, with FC using it by default, for example, due to its effect on icache footprint. The pure "cycles" advantage of -O2 is offset by the smaller RAM footprint of the kernel with -Os: in this case the kernel is optimised to leave RAM for the rest of the operating system, instead of trying to speed up the kernel itself, which has minimal effect.

Okay, thanks.

on
January 8, 2006 - 5:06am

I wasn't sure what the current state was with the kernel.

I remember reading a while back that someone did compile a kernel largely without optimization, and they showed around a 5% shift in overall system performance as a result. That makes sense: Ideally, the kernel's algorithms (and their O(x) characteristics) have the largest impact on performance, especially the algorithms for scheduling tasks and I/O. The *implementation* of those algorithms (aside from details that affect big-Oh) should have minimal impact in most cases.

I'd expect microbenchmarks that focus on specific kernel paths (such as loopback-crypto-filesystems and raw pipe bandwidth) to suffer if the kernel was built without optimization. I would not expect overall system performance to shift dramatically otherwise. Under most loads I deal with (like kernel compilation and random web browsing), the kernel uses less than 10% of the system's cycles, and most often uses far less than 5%. Even if the kernel slowed down 300% or more, that'd still shift most userland performance figures minimally.

For example, suppose a given workload benchmarked on an optimized kernel resulted in 10% of the cycles spent in the kernel not waiting for I/O. Now suppose the kernel's execution time slowed down by 3x because we disabled optimization. Then the overall benchmark would be at most 20% slower--that is, take 1.2x as long.
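The arithmetic behind that example can be sketched in a few lines of Python. The function name is mine, and the model assumes that only the kernel's share of the run time is affected and everything else stays the same.

```python
def overall_slowdown(kernel_fraction: float, kernel_slowdown: float) -> float:
    """Factor by which total run time grows when only the kernel's share of it
    (kernel_fraction of the original run time) becomes kernel_slowdown times slower."""
    if not 0.0 <= kernel_fraction <= 1.0:
        raise ValueError("kernel_fraction must be between 0 and 1")
    if kernel_slowdown <= 0.0:
        raise ValueError("kernel_slowdown must be positive")
    # Userland time is unchanged; only the kernel's share is scaled.
    return (1.0 - kernel_fraction) + kernel_fraction * kernel_slowdown


if __name__ == "__main__":
    # The case from the comment: 10% of cycles in the kernel, kernel 3x slower.
    print(f"{overall_slowdown(0.10, 3.0):.2f}x")
```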

I hear you on -Os vs. -O2. I work with embedded processors in my line of work. Program cache footprint is a huge concern for our customers, and program size, layout and so on have a gigantic impact on the ultimate performance of their application. Indeed, probably the strangest outcome I've witnessed is the impact of instruction-level parallelism in a deeply pipelined processor: If your code isn't highly parallel, then you won't notice instruction cache misses quite so much. But that's a topic for different forums. :-)

For every case?

kfz versicherung (not verified)
on
February 2, 2006 - 11:42pm

I doubt anyone could improve GCC enough to make it produce code 4x as fast as it does now.

Maybe twofold improvement, but 4x?

Nice weblog! Thank you!

Stan (not verified)
on
May 14, 2006 - 3:05pm

Nice weblog! Thank you!
