Re: [PATCH 00/32] Swap over NFS - v19

Previous thread: [PATCH 23/32] netvm: prevent a stream specific deadlock by Peter Zijlstra on Thursday, October 2, 2008 - 6:05 am. (1 message)

Next thread: [PATCH 18/32] net: sk_allocation() - concentrate socket related allocations by Peter Zijlstra on Thursday, October 2, 2008 - 6:05 am. (3 messages)
From: Peter Zijlstra
Date: Thursday, October 2, 2008 - 6:05 am

Patches are against: v2.6.27-rc5-mm1

This release features more comments and (hopefully) better Changelogs.
Also the netns stuff got sorted and ipv6 will now build and not oops
on boot ;-)

The first 4 patches are cleanups and can go in if the respective maintainers
agree.

The code is lightly tested but seems to work on my default config.

Let's get this ball rolling...

--

From: Andrew Morton
Date: Thursday, October 2, 2008 - 12:47 pm

I don't think we're really able to get any MM balls rolling until we
get all the split-LRU stuff landed.  Is anyone testing it?  Is it good?

--

From: Lee Schermerhorn
Date: Thursday, October 2, 2008 - 1:59 pm

Andrew:

Up until the mailing list traffic and patches slowed down, I was testing
it continuously with a heavy stress load that would bring the system to
its knees before the splitlru and unevictable changes.  Once it would
run for days without error [96 hours was my longest run] and no further
patches came, I concentrated on other things.

Rik and Kosaki-san have run some performance oriented tests, reported
here a while back.  Maybe they have more info.

Lee

--

From: Nick Piggin
Date: Thursday, October 2, 2008 - 11:53 pm

Peter's patches are very orthogonal to that work and shouldn't
actually change those kinds of reclaim heuristics at all.
--

From: Rik van Riel
Date: Friday, October 3, 2008 - 12:38 pm

On Thu, 2 Oct 2008 12:47:48 -0700

I've done some testing on it on my two test systems and have not
found performance regressions against the mainline VM.

As for stability, I think we have done enough testing to conclude
that it is stable by now.

-- 
All rights reversed.
--

From: KOSAKI Motohiro
Date: Saturday, October 4, 2008 - 8:05 am

I haven't found any regression in my testing either.
In my experience, the split-lru patches also improve performance stability.

What is performance stability?
For example, HPC parallel computation uses many processes that communicate
with each other.
The overall system performance is then determined by the slowest process.

So not only peak and average performance are important;
worst-case performance is important too.

In particular, split-lru outperforms mainline on workloads that mix anon and file pages.


For example, I ran the Himeno benchmark.
(This is one of the most famous HPC benchmarks in Japan; it does
 matrix calculations on large memory, i.e. it uses anon memory only.)

machine
-------------
CPU IA64 x8
MEM 8G

benchmark setting
----------------
# of parallel processes: 4
memory used: 1.7G x4 (nearly all of memory)


first:
results with no other process running (unit: MFLOPS)

                per-process result
                  1    2    3    4   worst  average
---------------------------------------------------------
2.6.27-rc8:     217  213  217  154     154      200
mmotm 02 Oct:   217  214  217  217     214      216

OK, these are almost the same.


next:
results while another I/O process is running (unit: MFLOPS)
(the I/O load was an infinite loop of dd commands)

                per-process result
                  1    2    3    4   worst  average
---------------------------------------------------------
2.6.27-rc8:      34  205   69  196      34      126
mmotm 02 Oct:   162  179  146  178     146      166


Wow, the worst case differs significantly.
(This result is reproducible.)

This is because reclaim processing in the mainline VM is too slow,
so a process that ends up calling direct reclaim loses a lot of performance.
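
As a sanity check on the worst and average columns in the two tables above, here is a minimal Python sketch that recomputes them from the per-process MFLOPS figures. The numbers are copied from the tables; the `summarize` helper and the rounded-down integer average are assumptions made for illustration and are not part of the benchmark itself.

```python
# Per-process MFLOPS copied from the two result tables in this mail.
RESULTS = {
    "idle": {
        "2.6.27-rc8": [217, 213, 217, 154],
        "mmotm 02 Oct": [217, 214, 217, 217],
    },
    "with dd loop": {
        "2.6.27-rc8": [34, 205, 69, 196],
        "mmotm 02 Oct": [162, 179, 146, 178],
    },
}


def summarize(per_process):
    """Return (worst, average) for one row.

    The average is rounded down to an integer (an assumption that happens
    to reproduce the figures in the tables).
    """
    return min(per_process), sum(per_process) // len(per_process)


if __name__ == "__main__":
    for scenario, kernels in RESULTS.items():
        for kernel, values in kernels.items():
            worst, average = summarize(values)
            print(f"{scenario:13} {kernel:13} worst={worst:4d} average={average:4d}")
```

The point of looking at `min()` rather than only the mean is the one made earlier in this mail: in a parallel job the slowest process sets the pace.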


This characteristic is useful not only for HPC but also for the desktop:
if the X server (or another critical process) ends up calling direct reclaim,
it can easily hurt the end-user experience.


Yup,
I know many people want other benchmark results too.
I'll try to measure other benchmarks next ...

--

From: KOSAKI Motohiro
Date: Tuesday, October 7, 2008 - 7:26 am

I ran another benchmark today.
I chose dbench because it is one of the most famous I/O benchmarks and resembles a real workload.


% dbench client.txt 4000

mainline:  Throughput 13.4231 MB/sec  4000 clients  4000 procs  max_latency=1421988.159 ms
mmotm(*):  Throughput  7.0354 MB/sec  4000 clients  4000 procs  max_latency=2369213.380 ms

(*) mmotm 2/Oct + Hugh's recent slub fix


Wow!
mmotm is much slower than mainline (about half the throughput).

Therefore, I also measured a "mainline + split-lru(only)" build.


mainline + split-lru(only): Throughput 14.4062 MB/sec  4000 clients  4000 procs  max_latency=1152231.896 ms


OK!
split-lru outperforms mainline in terms of both throughput and latency :)
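
For a quick sense of scale, the three dbench runs above can be expressed relative to mainline. This is only a sketch: the figures are copied from the dbench output in this mail, and the `relative_to` helper is a hypothetical name introduced for illustration.

```python
# Throughput (MB/sec) and max_latency (ms) copied from the dbench output above.
RUNS = {
    "mainline": (13.4231, 1421988.159),
    "mmotm": (7.0354, 2369213.380),
    "mainline + split-lru(only)": (14.4062, 1152231.896),
}


def relative_to(baseline, runs):
    """Return {name: (throughput_ratio, latency_ratio)} relative to runs[baseline]."""
    base_tp, base_lat = runs[baseline]
    return {name: (tp / base_tp, lat / base_lat) for name, (tp, lat) in runs.items()}


if __name__ == "__main__":
    for name, (tp_ratio, lat_ratio) in relative_to("mainline", RUNS).items():
        print(f"{name:28} throughput x{tp_ratio:.2f}  max_latency x{lat_ratio:.2f}")
```

With these numbers mmotm lands at roughly 0.52x mainline throughput, while the split-lru-only build is at roughly 1.07x with a lower max latency. These are single runs of a benchmark that is known to be noisy, so the ratios are indicative only.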



However, I don't understand why this regression happened.
Do you have any suggestions?




--

From: Andrew Morton
Date: Tuesday, October 7, 2008 - 1:17 pm

On Tue,  7 Oct 2008 23:26:54 +0900 (JST)

erk.

dbench is pretty chaotic and it could be that a good change causes


One of these:

vmscan-give-referenced-active-and-unmapped-pages-a-second-trip-around-the-lru.patch
vm-dont-run-touch_buffer-during-buffercache-lookups.patch

perhaps?
--

From: Rik van Riel
Date: Tuesday, October 7, 2008 - 2:28 pm

Worth a try, but it could just as well be a CPU scheduler change
that happens to indirectly impact locking :)

-- 
All rights reversed.
--

From: Nick Piggin
Date: Thursday, October 2, 2008 - 11:49 pm

I know it's not too helpful for me to say this, but I am spending
time looking at this stuff. I have commented on it in the past,
but I want to get a good handle on the code before I chime in again.
--

From: Luiz Fernando N. Capitulino
Date: Friday, October 3, 2008 - 10:17 am

On Thu, 02 Oct 2008 15:05:04 +0200
Peter Zijlstra <a.p.zijlstra@chello.nl> wrote:

| Patches are against: v2.6.27-rc5-mm1
| 
| This release features more comments and (hopefully) better Changelogs.
| Also the netns stuff got sorted and ipv6 will now build and not oops
| on boot ;-)
| 
| The first 4 patches are cleanups and can go in if the respective maintainers
| agree.
| 
| The code is lightly tested but seems to work on my default config.
| 
| Let's get this ball rolling...

 What's the best way to test this? Create a swap file on an NFS mount
point and stress it?

-- 
Luiz Fernando N. Capitulino
--

From: Peter Zijlstra
Date: Saturday, October 4, 2008 - 3:13 am

What I do is boot with mem=256M, then swapoff -a;
swapon /net/host/$path/file.swp;

the file.swp I created using dd and mkswap on the remote host.

I then run 2 cyclic loops on anonymous memory sized 96MB, and run 2
cyclic loops on file-backed memory on the same NFS mount
(e.g. /net/host/$path/file[12]), also sized 96MB.

That gives a memory footprint of 4*96=384MB and will thus rely on paging
quite heavily.
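
The actual loop programs are not shown in this mail, so the following is only a rough sketch of what such a cyclic loop could look like, written in Python for brevity. The function name `cycle_pages`, the `passes` bound and the one-byte-per-page write pattern are all assumptions; the real loops presumably run until killed.

```python
import mmap
import os


def cycle_pages(size_bytes, passes, path=None):
    """Write one byte into every page of a mapping, `passes` times over.

    path=None maps private anonymous memory (swap-backed); otherwise the
    file at `path` is created/resized and mapped shared (file-backed).
    Returns the number of page writes performed.
    """
    if path is None:
        buf = mmap.mmap(-1, size_bytes, flags=mmap.MAP_PRIVATE | mmap.MAP_ANONYMOUS)
    else:
        fd = os.open(path, os.O_RDWR | os.O_CREAT, 0o600)
        try:
            os.ftruncate(fd, size_bytes)
            buf = mmap.mmap(fd, size_bytes)  # MAP_SHARED by default on Unix
        finally:
            os.close(fd)  # the mapping keeps its own duplicate of the descriptor
    writes = 0
    try:
        for n in range(passes):
            # Touch each page in order, then wrap around for the next pass.
            for offset in range(0, size_bytes, mmap.PAGESIZE):
                buf[offset] = n & 0xFF
                writes += 1
    finally:
        buf.close()
    return writes
```

For the setup described above one would run four of these in parallel with `size_bytes = 96 * 1024 * 1024`: two with `path=None` and two with `path` pointing at a file on the NFS mount. With only 256M of RAM the combined working set cannot stay resident, which is the point of the exercise.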

While this is ongoing you can have a little daemon that listens and
accepts connections and reads from them.

On a 3rd machine, start say 1000 connections to this daemon that
continuously write stuff to it.
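
The daemon itself is not included in this mail. A minimal sketch of such a "listen, accept, read and discard" daemon might look like the following; the names `make_listener` and `drain`, the port number 5000 and the `should_stop` hook are assumptions introduced for illustration, not the tool actually used.

```python
import selectors
import socket


def make_listener(host="0.0.0.0", port=5000, backlog=1024):
    """Return a non-blocking listening TCP socket (port 5000 is an arbitrary choice)."""
    srv = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    srv.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
    srv.bind((host, port))
    srv.listen(backlog)
    srv.setblocking(False)
    return srv


def drain(srv, should_stop=lambda total: False, poll_interval=0.2):
    """Accept connections on `srv` and discard everything they send.

    Runs until should_stop(total_bytes_read) returns true, then closes the
    accepted connections (not `srv`) and returns the byte count.
    """
    sel = selectors.DefaultSelector()
    sel.register(srv, selectors.EVENT_READ)
    total = 0
    try:
        while not should_stop(total):
            for key, _events in sel.select(timeout=poll_interval):
                sock = key.fileobj
                if sock is srv:
                    try:
                        conn, _addr = srv.accept()
                    except BlockingIOError:
                        continue
                    conn.setblocking(False)
                    sel.register(conn, selectors.EVENT_READ)
                    continue
                try:
                    data = sock.recv(65536)
                except BlockingIOError:
                    continue
                except ConnectionError:
                    data = b""
                if data:
                    total += len(data)
                else:
                    # Peer closed (or reset) the connection.
                    sel.unregister(sock)
                    sock.close()
    finally:
        for key in list(sel.get_map().values()):
            if key.fileobj is not srv:
                key.fileobj.close()
        sel.close()
    return total


# Example (runs until interrupted): drain(make_listener(port=5000))
```

The writer side on the third machine is not shown; any client that opens many connections and keeps sending data would do. The idea is to keep ordinary socket traffic flowing into the box while it is short on memory and its swap device is unreachable.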

Then on your NFS host do something like: /etc/init.d/nfs stop

go for lunch

and when you're back do: /etc/init.d/nfs start

and see if all comes back up again ;-)

--

From: Suresh Jayaraman
Date: Sunday, October 5, 2008 - 11:04 pm

Except for this one I think ;-)

net/netfilter/core.c: In function ‘nf_hook_slow’:
net/netfilter/core.c:191: error: ‘pskb’ undeclared (first use in this

The culprit is emergency-nf_queue.patch. The following change fixes the
build error for me.

Index: linux-2.6.26/net/netfilter/core.c
===================================================================
--- linux-2.6.26.orig/net/netfilter/core.c
+++ linux-2.6.26/net/netfilter/core.c
@@ -184,9 +184,12 @@ next_hook:
                ret = 1;
                goto unlock;
        } else if (verdict == NF_DROP) {
+drop:
                kfree_skb(skb);
                ret = -EPERM;
        } else if ((verdict & NF_VERDICT_MASK) == NF_QUEUE) {
+               if (skb_emergency(skb))
+                       goto drop;
                if (!nf_queue(skb, elem, pf, hook, indev, outdev, okfn,
                              verdict >> NF_VERDICT_BITS))
                        goto next_hook;


Thanks,

-- 
Suresh Jayaraman
--
