Bottleneck in httpd. I need help to address capacity issues on max parallel and rate connections

I am trying to improve performance and fix my problem with httpd, but it
looks like I am hitting the roof regardless of whether I test in the lab
on an old 850MHz i386 or a new AMD64 at 1.6GHz. Both have more than 2GB
of RAM, and both show the same issue: I can't get past ~300 to 325
simultaneous httpd processes before timeouts jump sharply.

So I guess the limit may be in the connection handling of the TCP
stack more than in httpd itself, but I am at a loss as to where to
look. Tested on both 4.1 and 3.9 just to see.

Where are the OS bottlenecks that I could maybe improve here?

Please read on for more details; more can be provided as well.

I need some help, as I even went as far as ordering 4x X4100 with 2x
dual-core 2.4GHz processors, 2x 10K SAS drives and 8GB of RAM each, so
4GB per processor, and I am afraid of hitting the same limitations.
There isn't any reason that I shouldn't be able to pass these limits.

I don't have the new Suns yet, it may be a week before I have them, but
I am trying to get ahead of the setup to fix my problem and test in the
lab. It really is a capacity issue, and it looks like throwing more
powerful hardware at it will not fix it.

I have:

# sysctl kern.maxproc
kern.maxproc=2048

Both also have noatime set on the partition that the web files are
served from, and I even send the httpd logs to /dev/null to be sure it's
not log writes that slow it down.

I use http_load to test my configuration and changes, but I have not
succeeded in improving it further. It looks like connections are timing
out and I can't get more than ~300 httpd processes serving. Yes, I have
also recompiled httpd to raise the hard limit of 250, and I can start
1500 httpd processes if I want and they do run, but they do not seem to
serve traffic and I am still getting timeouts.

Even if I set "StartServers 2500" to be sure I don't run out of httpd
processes, or that starting additional ones is not the limit here, I
can't get more the...

To: Daniel Ouellet <daniel@...>
Cc: <misc@...>
Date: Wednesday, May 9, 2007 - 1:10 am

Look at the memory usage. 300 httpd processes could take up 3000M
easily, especially with stuff like php. In that case, the machine
starts swapping and you hit the roof. As a general rule, do not allow
more httpd processes than your machine can handle without swapping.
Also, a long KeepAliveTimeout can work against you, by holding slots.
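
To illustrate how a long KeepAliveTimeout holds slots, here is a rough
back-of-the-envelope sketch. It assumes every connection keeps one httpd
process occupied for its service time plus the full keep-alive timeout
(Little's law: busy processes = arrival rate x time held). The function
name, the arrival rate and the service time are made-up illustration
values, not measurements from this thread.

```python
def busy_processes(conn_per_sec, service_sec, keepalive_timeout_sec):
    """Estimate httpd processes tied up at steady state (Little's law)."""
    # Each connection occupies a process while it is served and then
    # while the process waits out the keep-alive timeout.
    return conn_per_sec * (service_sec + keepalive_timeout_sec)


# Hypothetical load: 20 new connections/sec, 0.05 s to serve each one.
for timeout in (15, 5):
    held = busy_processes(20, 0.05, timeout)
    print(f"KeepAliveTimeout {timeout:>2}s: about {held:.0f} processes held")
```

Under these assumptions, cutting the timeout frees slots roughly in
proportion, which is why the timeout matters more than the raw process
limit on a busy server.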

-Otto

To: Otto Moerbeek <otto@...>
Cc: <misc@...>
Date: Wednesday, May 9, 2007 - 1:30 am

Thanks Otto,

I am still testing and tweaking, but I checked for swapping, and the
same for keep-alive in httpd.conf, and I even changed it in sysctl:

net.inet.tcp.keepinittime=10
net.inet.tcp.keepidle=30
net.inet.tcp.keepintvl=30

For testing only. I am not saying the values above are any good, but I am
testing multiple things and reading a lot on sysctl and what each one does.

KeepAliveTimeout is at 5 seconds.

No swapping is happening, even with 1000 httpd running.

load averages: 123.63, 39.74, 63.3285 01:26:47
1064 processes: 1063 idle, 1 on processor
CPU states: 0.8% user, 0.0% nice, 3.1% system, 0.8% interrupt, 95.4% idle
Memory: Real: 648M/1293M act/tot Free: 711M Swap: 0K/4096M used/tot


How does this server do with 1000 non-httpd processes running? Perhaps
I need a newer Nemeth et al, but in my 3rd edition, pg 759 middle of the
page says "Modern systems do not deal well with load averages over
about 6.0".

Could your bottleneck be in context-switching between so many processes?
With so many, the memory cache will be faulting during the context
switching and data will have to be retrieved from main memory. I don't
think that such slow-downs appear in top, and I don't know about vmstat.
I don't know if there's a tool to measure this on i386.

I've never run httpd but it looks to me like a massively parallelized
problem where each connection is trivial to serve (hence low CPU usage,
no disk-io waiting) but there are just so many of them.

How does the server do with other connection services, e.g. pop or ftp?

Doug.


Be careful when reading these numbers. Don't forget that I am doing
this in the lab, abusing the server on purpose; I am trying to push it
as hard as I can. In production, I do see some servers reaching 10 or
18, and sometimes I have seen up to 25, but those were all extreme
cases. Most of the time it's below 10.

I can't answer this question with proper knowledge here as I don't
pretend to know that answer. May be someone else can speak knowingly

Wasn't. However yes there is and I can see faulting. I checked both
vmstat and iostat to see what's up. Obviously the numbers are higher on
older hardware as it runs out of horsepower. But the problem was being
able to handle more than 300 parallel connections, and why it got 3x
slower when only 2 more processes were added. So, no, I don't think the
context-switching had anything to do with it here.

You will see when I post the changes I did and the test I did. Some are

One multi core and multi processor hardware with proper memory, it

I only run one application per server, always did and most likely
always will. So a mail server is only a mail server, and a web server is
only a web server here. Even the DNS servers run only DNS, etc.


Here are more tests, always with repeatable results.

I increase the number of concurrent connections by only 5, from 305 to
310, and the response gets 3 times slower for the same work, every
time. Very consistent, and from different clients as well.

You can do any variation of 10 to 300 connections and you will always
get the same results, or very close to it. See that at the end as well
for proof.

So, I know I am hitting a hard limit someplace, but can't find where.

Note that I use a difference of 5 here, but I can reproduce the results
almost all the time just by increasing the number of connections by 1.
Going from 307 to 308, I get the same results as below 75% of the time,
meaning sometimes it's 6.7 seconds for the same transfer and other
times it's 18.1 seconds.

See below. Always the same transfer size, always the same amount of
requests, always 100% success, but 3x slower.

Also, if I continue to increase it further, I start to get dropped
replies as well, etc.

So far I have played with 26 different sysctl settings that may affect
this, based on various possibilities from the man pages and Google. I
can improve it some, but not to the point of being able to use 500
connections or more, for example.

What is it that really limits the number of connections that badly and
that hard?

===================
305 parallel

# http_load -parallel 305 -fetches 500 -timeout 30 /tmp/test
500 fetches, 305 max parallel, 6.549e+06 bytes, in 6.71609 seconds
13098 mean bytes/connection
74.4481 fetches/sec, 975121 bytes/sec
msecs/connect: 1813.57 mean, 6007.53 max, 0.418 min
msecs/first-response: 509.309 mean, 1685.92 max, 3.606 min
HTTP response codes:
code 200 -- 500
# http_load -parallel 305 -fetches 500 -timeout 30 /tmp/test
500 fetches, 305 max parallel, 6.549e+06 bytes, in 6.8586 seconds
13098 mean bytes/connection
72.9012 fetches/sec, 954860 bytes/sec
msecs/connect: 1957.35 mean, 6007.17 max, 0.445 min
msecs/first-response: 485.676 mean, 1559.27 max, 3.317 ...
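
When comparing many runs like these, it can help to pull the summary
line into numbers instead of eyeballing it. The following is a minimal
sketch, assuming the summary line always has the shape shown in the
output above; the regular expression and the function name are my own
and are not part of http_load.

```python
import re

# Matches e.g. "500 fetches, 305 max parallel, 6.549e+06 bytes, in 6.71609 seconds"
SUMMARY = re.compile(
    r"(\d+) fetches, (\d+) max parallel, ([\d.e+]+) bytes, in ([\d.]+) seconds"
)


def parse_summary(line):
    """Extract counts and derived rates from one http_load summary line."""
    match = SUMMARY.search(line)
    if match is None:
        raise ValueError("not an http_load summary line")
    fetches, parallel = int(match.group(1)), int(match.group(2))
    total_bytes, seconds = float(match.group(3)), float(match.group(4))
    return {
        "fetches": fetches,
        "parallel": parallel,
        "seconds": seconds,
        "fetches_per_sec": fetches / seconds,
        "bytes_per_sec": total_bytes / seconds,
    }


# The two 305-parallel runs quoted above.
runs = [
    "500 fetches, 305 max parallel, 6.549e+06 bytes, in 6.71609 seconds",
    "500 fetches, 305 max parallel, 6.549e+06 bytes, in 6.8586 seconds",
]
for result in map(parse_summary, runs):
    print(f"{result['parallel']} parallel: {result['seconds']:.2f} s, "
          f"{result['fetches_per_sec']:.1f} fetches/sec")
```

Feeding it the lines from runs at different -parallel values makes the
jump in elapsed time easy to tabulate.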

To: Daniel Ouellet <daniel@...>
Cc: <misc@...>
Date: Wednesday, May 9, 2007 - 7:46 am

You've assumed that Apache is the bottleneck, but perhaps your
benchmark tool could be limited in some way. I suggest you try with
apache benchmark or some other tool just to verify the results.

Apache (especially in the prefork model) is known to have concurrency
issues. I doubt that there are knobs you can twist OpenBSD-wise that
will compensate for Apache and somehow magically make it scale.


Actually I have found a few things that fix it tonight.

I spent the last 24 hours reading like crazy and all night testing and
reading more.

I can now have two clients using 1000 parallel connections each to one
i386 850MHz server, the old one I was testing with, and it handles all
of that with no problem now. No delay, and I can even push it more, but
I figure that at 2000 parallel connections I should get some breathing
room now.

I will send the results soon.

All only in sysctl.conf

Now, I am still seeing some drops, not many, when I put pf into
action. So that would be the next step I guess, but not now. I need
some sleep.

Thanks

Daniel


I've spent considerable time tuning apache on OpenBSD to consume all
available resources. Here are the relevant httpd.conf sections:

Timeout 300
KeepAlive On
MaxKeepAliveRequests 5000
KeepAliveTimeout 15

MinSpareServers 20
MaxSpareServers 30
StartServers 50
MaxClients 5000
MaxRequestsPerChild 0

I had statically compiled php into my httpd binary and obviously
raised HARD_LIMIT to 5000, using OpenBSD's apache.

This netted me the ability to serve a maximum of about 3000
requests per second on a 1.6GHz Athlon with 256MB of memory.

hth.


Thanks. My configuration is more aggressive than yours, and I can tell
you for a fact that the problem and limitations were not in the httpd
configuration but in the OS, in my case anyway.

Some of your values, I think, could crash your system. Especially:

MaxKeepAliveRequests 5000
MaxClients 5000

I don't think you could reach that high. Why? Simply from a memory usage
standpoint. That was my next exploration, but it's possible that one
apache process could take as much as 11MB:

6035 www 2 0 11M 9392K sleep netcon 0:56 0.00% httpd

Obviously not all processes would use that much. It really depends on
content. If it's small images and lots of them, then each process uses
less memory. But if it serves only big files, then it's possible to use
a good amount of memory per process. I don't have that answer here and I
am not sure how to come up with some logic for it, but even if each
process used only 1MB, 5000 of them would need 5GB of RAM, which is more
than OpenBSD supported until not so long ago, so you would start to swap
and god knows what would happen then.
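
The memory arithmetic above can be sketched as a small calculation. This
is only a rough estimate: it treats the per-process size as fully
private memory, while in practice httpd children share pages, so it
overstates real usage. The function names and the 256MB reserved for
the rest of the system are assumptions for illustration.

```python
def ram_needed_mb(processes, per_process_mb):
    """Worst-case RAM if every process uses per_process_mb privately."""
    return processes * per_process_mb


def max_clients_for_ram(ram_mb, reserved_mb, per_process_mb):
    """Largest process count that fits in RAM without swapping."""
    return max(0, (ram_mb - reserved_mb) // per_process_mb)


# Figures from the discussion: 5000 processes at 1MB and at 11MB each.
print(ram_needed_mb(5000, 1), "MB needed at 1MB per process")
print(ram_needed_mb(5000, 11), "MB needed at 11MB per process")

# Hypothetical 2GB machine with 256MB kept back for the OS.
print(max_clients_for_ram(2048, 256, 11), "processes fit at 11MB each")
```

Measuring the real per-process footprint under your own content is the
only way to turn this into a trustworthy MaxClients value.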

I use KeepAliveTimeout 5 and I am considering reducing it.

If you think about your suggestion here, you have KeepAliveTimeout 15
and then MaxKeepAliveRequests 5000; don't you see the paradox?

If your server is really busy, with lots of images on one page for
example, then you would have a lot of processes stuck in the
KeepAliveTimeout stage. That's most likely why you increased MaxClients
to 5000 to compensate, but that's wrong I believe. It makes your server
use more resources and be slower to react.

I use some logic to pick these values.

MaxKeepAliveRequests, I think, should be set based on how many
additional requests a single URL could generate from a browser that
supports keep-alive and multiple requests at once. How many? Well, I
think it's based on how many elements your web page can have. That's the...


Hi,

I am passing around my findings on the sysctl.conf configuration that
removed the bottleneck I found in httpd, where I couldn't get more than
300 httpd processes without it failing badly, and above that the server
simply went out of whack.

Everything is a default install, and the tests were done with an old
server; dmesg at the end in case you are interested. This is on OpenBSD
4.0, and I picked that server just to see what's possible, as it's not
a very powerful one.

You can also see the iostat output and the vmstat as well with the
changes in place.

You can certainly see a few page faults, as I am really pushing the
server hard, but even then I get decent results and the bottleneck was
removed, even with 2000 parallel connections. In that case I had to use
two different clients, as http_load only supports up to 1021 parallel
connections, so to test past that I use more than one client to push
the server harder.

All in all, the results are much better than a few days ago, and it now
looks like we get more bang for the buck: more powerful hardware will
be put to better use instead of suffering the same limitations.

I also list the values changed in sysctl.conf to arrive at this final
setup.

I am not saying the values are the best possible choice, but they work
well in the test situation, and there are many as you will see. Some are
very surprising to me, like the change in net.inet.ip.portfirst. Yes, I
know, but if I leave it at the default, then I can't get full success in
the test below: I get timeouts and some errors, and efficiency is not as
good. Maybe that's because of the random port range calculations, I
can't say, but in any case the effect is there and tested.

I tried to stay safe in my choices and comments are welcome, but I have
to point out as well that ALL the values below need to be changed to the
new value for it to work well. If even one of them is not at the level
below, the test results start to be affected pretty badly at times.

...


What does netstat -m tell you about the peak usage of clusters is it

Is httpd really so slow in accepting sockets that you had to increase this

Are you sure you need to tune the IP fragment queue? You are using TCP
which does PMTU discovery and sets the DF flag by default so no IP

These values are super aggressive; the keepidle and keepintvl values
especially are doubtful for your test. Is your benchmark using
SO_KEEPALIVE? I doubt that, and so these two values have no effect and
are actually

This is another knob that should not be changed unless you really know
what you are doing. The mss calculation uses this value as a safe default
that is always accepted. Pushing that up to this value may have unpleasant
side effects for people behind IPSec tunnels. The used mss is the max
between mssdflt and the MTU of the route to the host minus IP and TCP

If you need to tune the syncache in such extreme ways you should consider
adjusting TCP_SYN_HASH_SIZE and leaving synbucketlimit as is. The
synbucketlimit is there to limit attacks on the hash list that overload
the bucket list. On your system it may be necessary to traverse 420 nodes
on a lookup. Honestly, the syncachelimit and synbucketlimit knobs are
totally useless. If anything, we should allow resizing the hash and
calculate both limits from there.


You are right again! (;>

# netstat -m
14140 mbufs in use:
1098 mbufs allocated to data
12527 mbufs allocated to packet headers
515 mbufs allocated to socket names and addresses
585/694/4096 mbuf clusters in use (current/peak/max)
4976 Kbytes allocated to network (94% in use)
0 requests for memory denied
0 requests for memory delayed
0 calls to protocol drain routines

I was not looking at the right place. Back to default value.
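
For anyone wanting to check the same thing on their own machine, here is
a minimal sketch that pulls the cluster line out of netstat -m output and
reports how close the peak came to the limit. It assumes the line has
the "current/peak/max" shape shown above; the function name is my own.

```python
import re


def cluster_usage(netstat_m_output):
    """Return (current, peak, max, peak/max) from `netstat -m` text."""
    match = re.search(r"(\d+)/(\d+)/(\d+) mbuf clusters in use",
                      netstat_m_output)
    if match is None:
        raise ValueError("no mbuf cluster line found")
    current, peak, maximum = map(int, match.groups())
    return current, peak, maximum, peak / maximum


# The line from the output above.
sample = "585/694/4096 mbuf clusters in use (current/peak/max)"
current, peak, maximum, ratio = cluster_usage(sample)
print(f"peak {peak} of {maximum} clusters ({ratio:.1%} of the limit)")
```

With the peak this far below the maximum, raising the cluster limit
would not be expected to change anything, which matches going back to
the default.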

Thanks for the help!

Daniel


I will do an other series of tests in the next few days and be sure of
it before putting my foot in my mouth. But at 10000, I was getting drops

Yes, I was doing tests using a few clients and pushing the server at
2000 parallel connections to test with. That was in lab test and in real
life, I assume that half should be fine. But I wanted to be safe. So,

With smaller queue I was getting slower responses and drop. May be a

Yes, aggressive I was/am. Keep Alive was/is in use, yes. I will have more
to play with in the lab to see if I was too aggressive, as it looks like
you think I am. The default values don't give me results as good, however.
More tests are needed specifically on this and I will do them. Maybe the
defaults are fine; I will see if I can find a way to be more objective

I will review and read more on it. I based my changes on results seen
with the setup under heavy load. There is always room for improvement.

Interesting! I will retest with that in mind. Didn't see that
explanation in my reading so far. Thanks for this!

You are most helpful, and this gives me something to research more. I
sure appreciate your time in passing along the information.

Looks like a few more days of testing needed.

Many thanks!

Daniel

To: Daniel Ouellet <daniel@...>
Cc: <misc@...>
Date: Thursday, May 10, 2007 - 1:24 am

never mind the rest, but these two really make no sense. none.

To: Ted Unangst <ted.unangst@...>
Cc: <misc@...>
Date: Thursday, May 10, 2007 - 2:31 am

Make no sense in the test and in improving results, or make no sense to
set them as such here?

net.inet.ip.redirect=0

This is to disable ICMP routing redirects. Otherwise, your system could
have its routing table misadjusted by an attacker. Wouldn't it be wise
to do so? Maybe if PF is turned on there is no reason for this, but with
PF on I get drops and need to address that. Didn't pursue it yet as dead
however.

As for net.bpf.bufsize, I am looking again at my notes and tests.
It's used by the Berkeley Packet Filter (BPF), which maintains an
internal kernel buffer for storing packets received off the wire.

Yes, in that case it makes sense not to have that here. I redid the tests
with the default value and yes, you are right! This one is wrong here.
Maybe lack of sleep. (;> Thanks for correcting me!

I also have to revise my statement on the effect of
net.inet.ip.portfirst=32768. In a new series of tests, it doesn't have
the impact noted in the first test runs. So I would keep it at the
default value as well now. Maybe it had more of an impact when PF was
enabled, but my notes are not clear on that specific one.

Anything else you see that may be questionable in what I sent? I am
doing more tests with different hardware to be sure these are all sane
values in the end.

Otherwise, many thanks for having taken the time to look it over and
give me your feedback on it!

I sure appreciate it big time!

Best

Daniel


As requested a few times in private, here are the results, with what
works for me. Hope this helps somewhat anyway.

Use what makes sense to you based on your setup, hardware and traffic.

The final values in use after testing are now set as follows for me,
assuming a good amount of memory to allow so many processes to run. I
use a minimum of 2GB; some servers have 4GB.

Recompile httpd with a higher upper limit for processes. I put 2048 to
allow more room in the future if needed, but I still want to be safe and
limit the processes to fewer than that. If php is in use, for example,
static compilation would improve things, but I chose to keep the system
as close as possible to the default for many reasons, including
maintenance, support and regular upgrades. Your choice may vary.

In fstab
========
A partition for the files used by the sites, with noatime set on it
to avoid updating the last access time for each file. This definitely
improves access time a lot under heavy load!

httpd logs could be on their own partition as well, mounted softdep to
gain some efficiency in log updates for very busy sites.

For httpd.conf
==============
Timeout 300
KeepAlive On
MaxKeepAliveRequests 100
KeepAliveTimeout 5
MinSpareServers 50
MaxSpareServers 100
StartServers 75
MaxClients 768
MaxRequestsPerChild 0

In sysctl.conf
==============
# Below are values added to improve performance of httpd after
# testing with http_load under parallel and rate setting.

kern.maxclusters=12000 # The maximum number of mbuf(9) clusters
# that may be allocated.

kern.maxfiles=4096 # The maximum number of open files that
# may be open in the system.

kern.maxproc=2048 # The maximum number of simultaneous
# processes the system will allow.

kern.seminfo.semmni=1024 # The maximum number of semaphore
# identifiers allowed.

kern.seminfo.semmns=4096 # The maximum number of semaphores
# allowed in the system.

kern.shminfo.shmall=16384 # The maximum amount of total shared
# memory allowed i...


net.inet.ip.redirect only has an effect if you enable
net.inet.ip.forwarding. As you are running a server and not a router I
doubt this is the case. Additionally net.inet.ip.redirect does not modify

With many short-lived connections you have a lot of sockets in TIME_WAIT.
Because you are testing from one host only, you start to hit these entries
more and more often, which often results in a retry from the client.
Additionally, by filling all available ports the port allocation algorithm
starts to get slower, but that's a problem that you will only see on

I think there are a few knobs that you should reconsider. I will write
another mail about that.
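
The TIME_WAIT point can be made concrete with a rough calculation: a
single client IP talking to one server IP and port can only open as many
new connections per second as it has ephemeral ports, divided by how
long each port stays tied up in TIME_WAIT. The port range and the
60-second TIME_WAIT duration below are illustrative assumptions, not
values confirmed for OpenBSD in this thread, and the function name is
hypothetical.

```python
def max_conn_rate(port_first, port_last, time_wait_sec):
    """Sustainable new connections/sec from one client IP to one
    server IP:port before every ephemeral port sits in TIME_WAIT."""
    ports = port_last - port_first + 1
    return ports / time_wait_sec


# Assumed ephemeral range 49152-65535 and a 60 s TIME_WAIT.
narrow = max_conn_rate(49152, 65535, 60)
# Assumed wider range, e.g. after lowering the first port to 32768.
wide = max_conn_rate(32768, 65535, 60)
print(f"narrow range: about {narrow:.0f} connections/sec per client")
print(f"wide range:   about {wide:.0f} connections/sec per client")
```

This is one reason a benchmark driven from a single host can stall in
ways that traffic from many real clients would not.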


More reading in the man pages did the trick on that one and yes you are

I did test it with a few more hosts and as stated, the OpenBSD default

That sure would be welcome. I would be curious to see what else, or what
differences, you may see. I did lots of tests in different setups, but I
am always happy to see improvements.

For now I have my somewhat final version done and it looks pretty good.
Much better than before, for sure. Now I can enjoy seeing traffic
coming in instead of worrying about complaints. (;>

But more improvements and suggestions with explanations would be
welcome, for my own understanding anyway.

Many thanks!

Daniel

To: Daniel Ouellet <daniel@...>
Cc: <misc@...>
Date: Wednesday, May 9, 2007 - 2:15 am

These parameters do not have a lot to do with what you are seeing.

I was talking about the KeepAliveTimeout of apache. It's 15s by
default. With a long timeout, any process that has served a request will
wait 15s to see if the client issues more requests on the same
connection before it becomes available to serve other requests. For

To: OpenBSD <misc@...>, Daniel Ouellet <daniel@...>
Date: Tuesday, May 8, 2007 - 5:30 pm

Daniel,

Maybe I am about to say something really stupid, but ok, here I go:
are you testing from one location only? Maybe that host is the
bottleneck itself.

Wijnand

To: Wijnand Wiersma <wijnand@...>
Cc: OpenBSD <misc@...>
Date: Tuesday, May 8, 2007 - 5:50 pm

Nothing is stupid to me right now. I am looking for any ideas that can
help. Even if it looks stupid, I am willing to test it.

As for the setup for the test, all servers and client are connected to
the same Cisco switch directly.

To: Daniel Ouellet <daniel@...>
Cc: OpenBSD <misc@...>
Date: Tuesday, May 8, 2007 - 6:52 pm

I meant the client being the bottleneck ;-)
Sorry for not being clear.

Wijnand

To: Wijnand Wiersma <wijnand@...>
Cc: OpenBSD <misc@...>
Date: Tuesday, May 8, 2007 - 7:13 pm

Nope. I sent updates on that too, with a more powerful server. And I am
doing tests now with three clients at once, and I can get a few more
processes running on the server side, but still no more output from
that server.

It is capped somehow and I am not sure what does it yet.


I'm new at this so please ignore if it's not helpful.

Is this a bandwidth (hardware) limitation on the computer itself? If so
then a faster processor won't help. Bus contention?

Doug.


It could always be a possibility, but if you take the data sent and the
time spent sending it, you will see that one server looks capped at
around 5.8Mb/sec in all tests and the other one at 9.0Mb/sec. These
numbers are way too low to be a bus problem. Even for drive speed, it
seems to me that drives these days can deliver data a lot faster than
this.
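
As a quick sanity check on those figures, here is a small sketch that
converts an http_load bytes/sec value into megabits per second and
compares it with a link speed. The 727177 bytes/sec figure is taken from
one of the http_load runs in this thread; the 100 Mbit/s link speed is
an assumption, since the switch port speed is not stated, and the
function names are my own.

```python
def mbit_per_sec(bytes_per_sec):
    """Convert bytes/sec to megabits/sec (1 Mbit = 1,000,000 bits)."""
    return bytes_per_sec * 8 / 1_000_000


def link_utilization(bytes_per_sec, link_mbit):
    """Fraction of the link used at the given transfer rate."""
    return mbit_per_sec(bytes_per_sec) / link_mbit


rate = 727177          # bytes/sec reported by one http_load run
link = 100             # assumed link speed in Mbit/s
print(f"{mbit_per_sec(rate):.2f} Mbit/s, "
      f"{link_utilization(rate, link):.1%} of an assumed {link} Mbit/s link")
```

At a few percent of even a modest link, the network, bus and disks all
have plenty of headroom, which points back at connection handling rather
than raw bandwidth.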

I am trying so many different things without success so far. But I am
sure there has to be something I am overlooking here. It doesn't make
sense to me that a server would be capped at that level. I don't believe
it anyway, but on the other hand, I am running out of ideas to check and
Google doesn't give me much more to try that I haven't done already.

I am sure Henning can get more out of his servers than this, but I am
not sure how he does it, to be honest.

To: Daniel Ouellet <daniel@...>
Cc: <misc@...>
Date: Tuesday, May 8, 2007 - 5:27 pm

first, are you sure you are testing the server and not the client?

second, what happens if you start another web server on port 8080 and
test simultaneously?

To: Ted Unangst <ted.unangst@...>
Cc: <misc@...>
Date: Tuesday, May 8, 2007 - 6:35 pm

Even run locally, the numbers don't look much better. Even in this case,
it looks like it can't reach the requested number of parallel connections:

old i386
# http_load -parallel 200 -fetches 2500 -timeout 60 /tmp/www2
2500 fetches, 94 max parallel, 1.37816e+07 bytes, in 20.7814 seconds
5512.65 mean bytes/connection
120.3 fetches/sec, 663172 bytes/sec
msecs/connect: 326.667 mean, 6062.79 max, 1.248 min
msecs/first-response: 36.5991 mean, 6071.86 max, 3.419 min
HTTP response codes:
code 200 -- 2500
# http_load -parallel 400 -fetches 2500 -timeout 60 /tmp/www2
2500 fetches, 90 max parallel, 1.38708e+07 bytes, in 20.9679 seconds
5548.31 mean bytes/connection
119.23 fetches/sec, 661525 bytes/sec
msecs/connect: 346.224 mean, 6130.06 max, 1.228 min
msecs/first-response: 43.7965 mean, 6055.29 max, 3.392 min
HTTP response codes:
code 200 -- 2500

new amd64
# http_load -parallel 200 -fetches 2500 -timeout 60 /tmp/www1
2500 fetches, 64 max parallel, 1.33453e+07 bytes, in 14.2911 seconds
5338.11 mean bytes/connection
174.934 fetches/sec, 933819 bytes/sec
msecs/connect: 107.002 mean, 6016.89 max, 0.802 min
msecs/first-response: 19.2824 mean, 512.538 max, 1.706 min
HTTP response codes:
code 200 -- 2500
# http_load -parallel 400 -fetches 2500 -timeout 60 /tmp/www1
2500 fetches, 63 max parallel, 1.37396e+07 bytes, in 14.1811 seconds
5495.84 mean bytes/connection
176.291 fetches/sec, 968869 bytes/sec
msecs/connect: 106.943 mean, 6022.11 max, -8.932 min
msecs/first-response: 21.5082 mean, 3041.49 max, 1.716 min
HTTP response codes:
code 200 -- 2500

To: Ted Unangst <ted.unangst@...>
Cc: <misc@...>
Date: Tuesday, May 8, 2007 - 6:04 pm

Yes, confirmed, it's not the client. I just ran it from an IBM e365 with
a dual-core processor. dmesg is further down, but the results below for
the Sun and the IBM look similar. So, no client issue that I can see:

IBM e365 client:

# http_load -parallel 200 -fetches 2500 -timeout 60 /tmp/www2
2500 fetches, 200 max parallel, 1.33069e+07 bytes, in 19.0603 seconds
5322.74 mean bytes/connection
131.163 fetches/sec, 698146 bytes/sec
msecs/connect: 140.559 mean, 6014.22 max, -7.799 min
msecs/first-response: 919.846 mean, 8114.42 max, -3.572 min
HTTP response codes:
code 200 -- 2500

# http_load -parallel 400 -fetches 2500 -timeout 60 /tmp/www2
2500 fetches, 400 max parallel, 1.39552e+07 bytes, in 18.2373 seconds
5582.08 mean bytes/connection
137.082 fetches/sec, 765203 bytes/sec
msecs/connect: 814.221 mean, 18006.5 max, -7.838 min
msecs/first-response: 1248.39 mean, 11165.7 max, -3.433 min
HTTP response codes:
code 200 -- 2500

Sun V120 client:

# http_load -parallel 200 -fetches 2500 -timeout 60 /tmp/www2
2500 fetches, 200 max parallel, 1.37375e+07 bytes, in 19.137 seconds
5494.99 mean bytes/connection
130.637 fetches/sec, 717851 bytes/sec
msecs/connect: 232.358 mean, 6005.86 max, 0.439 min
msecs/first-response: 872.213 mean, 10733.2 max, 3.409 min
HTTP response codes:
code 200 -- 2500

# http_load -parallel 400 -fetches 2500 -timeout 60 /tmp/www2
2500 fetches, 400 max parallel, 1.37627e+07 bytes, in 18.6019 seconds
5505.09 mean bytes/connection
134.395 fetches/sec, 739854 bytes/sec
msecs/connect: 1182 mean, 18013.3 max, 0.502 min
msecs/first-response: 1001.47 mean, 9873.65 max, 3.435 min
HTTP response codes:
code 200 -- 2500

http_load Client dmesg:

# dmesg
OpenBSD 4.0 (GENERIC.MP) #967: Sat Sep 16 20:38:15 MDT 2006
deraadt@amd64.openbsd.org:/usr/src/sys/arch/amd64/compile/GENERIC.MP
real mem = 1072672768 (1047532K)
avail mem = 907272192 (886008K)
using 22937 buffers containing 107474944 bytes (104956K) of memory
mainbus0 (root)
bios0 at mainbus0: ...


Just a question - what do you see when trying from localhost? That
would eliminate quite a few networking issues, at least.

Joachim

--
TFMotD: factor, primes (6) - factor a number, generate primes


Not that much different. I would even say that it may be not as good
locally. Plus I sent another example for two different servers with the
test done locally as well. Should show up on marc very soon. Not there yet.

Local:
# http_load -parallel 400 -fetches 2500 -timeout 60 /tmp/www2
2500 fetches, 52 max parallel, 1.42596e+07 bytes, in 20.8623 seconds
5703.82 mean bytes/connection
119.833 fetches/sec, 683507 bytes/sec
msecs/connect: 107.61 mean, 6061.48 max, 1.224 min
msecs/first-response: 39.1055 mean, 6008.52 max, 3.384 min
HTTP response codes:
code 200 -- 2500

# http_load -parallel 200 -fetches 2500 -timeout 60 /tmp/www2
2500 fetches, 82 max parallel, 1.35499e+07 bytes, in 20.7909 seconds
5419.97 mean bytes/connection
120.245 fetches/sec, 651724 bytes/sec
msecs/connect: 290.4 mean, 6059.02 max, 1.253 min
msecs/first-response: 33.4435 mean, 6004.2 max, 3.459 min
HTTP response codes:
code 200 -- 2500

Remote:

# http_load -parallel 400 -fetches 2500 -timeout 60 /tmp/www2
2500 fetches, 400 max parallel, 1.34383e+07 bytes, in 18.4801 seconds
5375.32 mean bytes/connection
135.281 fetches/sec, 727177 bytes/sec
msecs/connect: 1016.4 mean, 18012.9 max, 0.406 min
msecs/first-response: 1104.19 mean, 10505.5 max, 3.455 min
HTTP response codes:
code 200 -- 2500
# http_load -parallel 200 -fetches 2500 -timeout 60 /tmp/www2
2500 fetches, 200 max parallel, 1.36846e+07 bytes, in 23.4292 seconds
5473.85 mean bytes/connection
106.704 fetches/sec, 584083 bytes/sec
msecs/connect: 391.978 mean, 6006.38 max, 0.486 min
msecs/first-response: 742.048 mean, 10497.9 max, 3.403 min
HTTP response codes:
code 200 -- 2500

To: Ted Unangst <ted.unangst@...>
Cc: <misc@...>
Date: Tuesday, May 8, 2007 - 5:47 pm

I will try a different server. For now, I use a Sun V120 with nothing
running on it as the client. I will use a beefier one to be sure and
report back.

Also PF is not running on either client or servers for the tests.

I also tried these tests:

net.inet.ip.maxqueue=300 -> 1000

and

kern.somaxconn: 128 -> 512

In any case, what I see is that I can't pass 5.8Mb/sec on the old i386
server and 9.0Mb/sec on the HP145 AMD64 one, regardless of whether I use
100 parallel connections or 400. More than 400 really puts all numbers down

No, but I will. I am really looking for any ideas as I am at a loss, and
I will use heavier clients to be sure that's not the problem here.
