We have a setup in which an Elastic LB spreads load over two Apache servers, A1 and A2. These Apache servers render a few PHP pages and primarily forward the API requests to the Tomcat application servers T1 and T2, as per the following diagram:
Incoming Request
       |
       |
       v
       LB
      /  \
     /    \
   A1      A2
   | \    / |
   |  \  /  |
   |   \/   |
   |   /\   |
   |  /  \  |
   T1      T2
We have recently started to notice delays between Apache and Tomcat. Here are example log lines from the Apache mod_slow log and the Tomcat access log for the same request:
APACHE_MOD_SLOW: VNSdtwoAAJkAACnXb-cAAACJ [06/Feb/2015:16:25:51 +0530] elapsed: 50.58 cpu: 0.00(usr)/0.00(sys) pid: 10711 ip: 10.0.0.153 host: www.example.com:443 reqinfo: GET /data/v1/url?url=test-508324 HTTP/1.1
TOMCAT: [06/Feb/2015:16:26:42 +0530] "GET /data/v1/url?url=test-508324 HTTP/1.1" 200 65 10
Apache says the request arrived at 06/Feb/2015:16:25:51 +0530 and took about 50 s to process, whereas Tomcat says it received the request at 06/Feb/2015:16:26:42 +0530 and took only 10 ms to process it.
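The gap between the two log timestamps can be checked with a few lines of Python. This is only a sketch: the function name and the format constant are made up for illustration, and the two timestamps are copied from the log lines above.

```python
from datetime import datetime

# Apache/Tomcat access-log timestamp layout, e.g. "06/Feb/2015:16:25:51 +0530".
# %b assumes English month abbreviations (the default C locale).
LOG_TIME_FORMAT = "%d/%b/%Y:%H:%M:%S %z"


def seconds_between(start: str, end: str) -> float:
    """Return the number of seconds from `start` to `end` (both log timestamps)."""
    t0 = datetime.strptime(start, LOG_TIME_FORMAT)
    t1 = datetime.strptime(end, LOG_TIME_FORMAT)
    return (t1 - t0).total_seconds()


apache_start = "06/Feb/2015:16:25:51 +0530"  # from the mod_slow line
tomcat_stamp = "06/Feb/2015:16:26:42 +0530"  # from the Tomcat access log line
gap = seconds_between(apache_start, tomcat_stamp)
print(f"gap between the two log timestamps: {gap:.0f} s")
```

The 51 s difference is in line with the 50.58 s that mod_slow reports as elapsed, given that both logs only have one-second resolution.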
This means it took nearly 50 s for Apache to connect and send the whole request to Tomcat. Apache is using mod_proxy_ajp to connect to Tomcat. Here is the configuration:
<Proxy balancer://prod>
BalancerMember ajp://127.0.0.1:8009 route=jvmRoute-8009 connectiontimeout=1 retry=300
BalancerMember ajp://10.0.0.153:8009 route=jvmRoute-8009 connectiontimeout=1 retry=300
ProxySet lbmethod=byrequests
</Proxy>
Here is the connector configuration from tomcat:
<Connector port="8009" protocol="AJP/1.3" redirectPort="8443" maxThreads="4096" minSpareThreads="25" maxSpareThreads="75"/>
Given the connectiontimeout value, I am assuming Apache shouldn't take more than 1 second to establish a connection. Since Apache and Tomcat are on the same machine, there shouldn't be much lag once the connection is established.
If it helps, we are using https requests, but I don't think that has anything to do with this. We ran ab tests to compare performance over https, over http, and when connecting to Tomcat directly. Here are the stats:
ab -n5000 -c5 https://example.com/test/100001
Requests per second: 13.67 [#/sec] (mean)
Time per request: 365.851 [ms] (mean)
Time per request: 73.170 [ms] (mean, across all concurrent requests)
Transfer rate: 79.96 [Kbytes/sec] received
Connection Times (ms)
min mean[+/-sd] median max
Connect: 236 267 95.5 247 3401
Processing: 83 98 58.6 89 1959
Waiting: 82 96 57.5 87 1959
Total: 319 365 134.0 338 3571
Percentage of the requests served within a certain time (ms)
50% 338
66% 347
75% 356
80% 364
90% 399
95% 477
98% 689
99% 869
100% 3571 (longest request)
ab -n5000 -c5 http://example.com/test/100001
Time per request: 186.015 [ms] (mean)
Time per request: 37.203 [ms] (mean, across all concurrent requests)
Transfer rate: 155.55 [Kbytes/sec] received
Connection Times (ms)
min mean[+/-sd] median max
Connect: 74 79 33.4 76 1278
Processing: 83 107 82.3 91 3964
Waiting: 82 105 60.9 89 940
Total: 157 186 90.1 168 4042
Percentage of the requests served within a certain time (ms)
50% 168
66% 174
75% 180
80% 184
90% 211
95% 259
98% 379
99% 507
100% 4042 (longest request)
ab -n5000 -c5 http://IP:8080/test/100001
Requests per second: 31.32 [#/sec] (mean)
Time per request: 159.624 [ms] (mean)
Time per request: 31.925 [ms] (mean, across all concurrent requests)
Transfer rate: 181.30 [Kbytes/sec] received
Connection Times (ms)
min mean[+/-sd] median max
Connect: 71 76 68.4 73 3079
Processing: 78 84 13.1 81 594
Waiting: 77 83 6.5 81 185
Total: 149 159 71.2 154 3313
Percentage of the requests served within a certain time (ms)
50% 154
66% 157
75% 160
80% 161
90% 166
95% 171
98% 177
99% 189
100% 3313 (longest request)
These observations lead me to believe that the bad performance depends on a particular sequence of events, because none of the requests in these tests performed that badly.
Versions:
Apache: 2.2.4
Tomcat: 8.0.16
Any ideas where the lag is coming from and how to reduce it?
I am trying to apply limit_conn to my server for test purposes, but it doesn't work.
Here is my nginx.conf:
events {}
http {
    include mime.types;
    limit_conn_zone $binary_remote_addr zone=addr:10m;
    server {
        listen 80;
        location /downloads/ {
            limit_conn addr 1;
            return 200 "Hello";
        }
    }
}
Then I run ApacheBench:
ab -n 10 -c 10 http://MY_IP_ADDRESS_OF_SERVER/downloads/
I get this output data:
This is ApacheBench, Version 2.3 <$Revision: 1757674 $>
Copyright 1996 Adam Twiss, Zeus Technology Ltd, http://www.zeustech.net/
Licensed to The Apache Software Foundation, http://www.apache.org/
Benchmarking MY_IP_ADDRESS_OF_SERVER (be patient).....done
Server Software: nginx/1.16.1
Server Hostname: MY_IP_ADDRESS_OF_SERVER
Server Port: 80
Document Path: /downloads/
Document Length: 16 bytes
Concurrency Level: 10
Time taken for tests: 0.004 seconds
Complete requests: 10
Failed requests: 0
Total transferred: 1760 bytes
HTML transferred: 160 bytes
Requests per second: 2693.97 [#/sec] (mean)
Time per request: 3.712 [ms] (mean)
Time per request: 0.371 [ms] (mean, across all concurrent requests)
Transfer rate: 463.03 [Kbytes/sec] received
Connection Times (ms)
min mean[+/-sd] median max
Connect: 1 2 0.3 2 2
Processing: 0 1 0.6 1 2
Waiting: 0 1 0.5 1 2
Total: 2 3 0.3 3 3
Percentage of the requests served within a certain time (ms)
50% 3
66% 3
75% 3
80% 3
90% 3
95% 3
98% 3
99% 3
100% 3 (longest request)
But I expected the output to contain something like this:
Complete requests: 10
Failed requests: 1
Non-2xx responses: 9
It seems that limit_conn doesn't work on my server. Why doesn't it work, and how can I solve this problem?
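To compare an actual run against the expected counters without reading the whole report, the summary lines can be pulled out of ab's text output. The following is a minimal sketch; the function name and the sample string are illustrative, and it relies on ab printing the "Non-2xx responses" line only when at least one such response occurred.

```python
import re

# Summary lines of interest in ab's report; "Non-2xx responses" is only
# printed by ab when at least one such response occurred.
FIELDS = ("Complete requests", "Failed requests", "Non-2xx responses")


def ab_counts(report: str) -> dict:
    """Extract the integer counters named in FIELDS from ab's text output."""
    counts = {name: 0 for name in FIELDS}
    for name in FIELDS:
        match = re.search(rf"^{re.escape(name)}:\s+(\d+)", report, re.MULTILINE)
        if match:
            counts[name] = int(match.group(1))
    return counts


# A few lines in the shape of the report above, used only as sample input.
sample = """\
Complete requests:      10
Failed requests:        0
Total transferred:      1760 bytes
"""
print(ab_counts(sample))
```

A counter that is missing from the report is returned as 0, which matches the run above where no "Non-2xx responses" line was printed.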
I have an Angular website with static assets of around 1.5 MB (around 400 KB gzipped). I use nginx as my web server and as a reverse proxy to the API server. When I test nginx with the Apache benchmark tool, I see a huge drop in performance on the https site compared to http (https is over 10 times slower), while CPU and memory utilization are not high at all (CPU 30%, memory only 1 MB!).
I have been searching for hours and have tried every tuning suggestion I could find, but none worked. As far as I have read, https should not be that much slower on modern web servers (for nginx I get around 1500 req/sec over http and 46 req/sec over https). Most of the difference comes from the very high connect time of nginx over https, but I have no clue how to fix it.
Can someone advise how to improve this?
(Also, to my surprise, Apache performs much better in both cases, but it stops responding if I set the concurrency above 200.) This is not an nginx vs. Apache question; I am just describing my situation.
Important note:
I am not comparing the two web servers; that is not the point of this site. But they generally have comparable performance, so if https in nginx is 10 times slower than in Apache, I suspect that something is wrong in my nginx configuration, and I want to fix it.
All tests were run on my Windows machine (i7, 16 GB RAM).
Nginx http only:
C:\Apache24\bin>ab -n 5000 -c 200 http://localhost:8100/abc/index.html?param=abc
This is ApacheBench, Version 2.3 <$Revision: 1826891 $>
Server Software: nginx/1.15.4
Server Hostname: localhost
Server Port: 8100
Document Path: /abc/index.html?param=abc
Document Length: 1099 bytes
Concurrency Level: 200
Time taken for tests: 3.246 seconds
Complete requests: 5000
Failed requests: 0
Total transferred: 6665000 bytes
HTML transferred: 5495000 bytes
Requests per second: 1540.32 [#/sec] (mean)
Time per request: 129.843 [ms] (mean)
Time per request: 0.649 [ms] (mean, across all concurrent requests)
Transfer rate: 2005.12 [Kbytes/sec] received
Connection Times (ms)
min mean[+/-sd] median max
Connect: 0 0 1.3 0 16
Processing: 31 87 12.8 94 124
Waiting: 0 87 13.7 94 124
Total: 31 87 12.8 94 124
Percentage of the requests served within a certain time (ms)
50% 94
66% 94
75% 94
80% 94
90% 99
95% 109
98% 109
99% 113
100% 124 (longest request)
Nginx https (with http2 enabled)
C:\Apache24\bin>abs -n 5000 -c 200 https://localhost:8200/abc/index.html?param=abc
This is ApacheBench, Version 2.3 <$Revision: 1826891 $>
Server Software: nginx/1.15.4
Server Hostname: localhost
Server Port: 8200
SSL/TLS Protocol: TLSv1.2,ECDHE-RSA-AES256-GCM-SHA384,2048,256
TLS Server Name: localhost
Document Path: /abc/index.html?param=abc
Document Length: 1099 bytes
Concurrency Level: 200
Time taken for tests: 108.985 seconds
Complete requests: 5000
Failed requests: 0
Total transferred: 6780000 bytes
HTML transferred: 5495000 bytes
Requests per second: 45.88 [#/sec] (mean)
Time per request: 4359.386 [ms] (mean)
Time per request: 21.797 [ms] (mean, across all concurrent requests)
Transfer rate: 60.75 [Kbytes/sec] received
Connection Times (ms)
min mean[+/-sd] median max
Connect: 16 4201 506.8 4251 4755
Processing: 0 32 12.6 31 88
Waiting: 0 32 12.6 31 88
Total: 62 4232 506.9 4283 4800
Percentage of the requests served within a certain time (ms)
50% 4283
66% 4342
75% 4413
80% 4439
90% 4484
95% 4547
98% 4694
99% 4727
100% 4800 (longest request)
Compared to Apache http (here CPU is around 90 to 100% utilized)
C:\Apache24\bin>ab -n 5000 -c 200 http://localhost:6200/abc/index.html?param=abc
Server Software: Apache/2.4.33
Server Hostname: localhost
Server Port: 6200
Document Path: /abc/index.html?param=abc
Document Length: 1099 bytes
Concurrency Level: 200
Time taken for tests: 1.781 seconds
Complete requests: 5000
Failed requests: 0
Total transferred: 6810000 bytes
HTML transferred: 5495000 bytes
Requests per second: 2806.99 [#/sec] (mean)
Time per request: 71.251 [ms] (mean)
Time per request: 0.356 [ms] (mean, across all concurrent requests)
Transfer rate: 3733.51 [Kbytes/sec] received
Connection Times (ms)
min mean[+/-sd] median max
Connect: 0 0 1.6 0 16
Processing: 16 69 16.0 63 125
Waiting: 0 57 16.0 63 125
Total: 16 69 16.0 63 125
Percentage of the requests served within a certain time (ms)
50% 63
66% 78
75% 78
80% 78
90% 94
95% 94
98% 94
99% 109
100% 125 (longest request)
Apache https is as follows (HTTP/1.1). Note that switching nginx to HTTP/1.1 didn't improve its performance:
C:\Apache24\bin>abs -n 5000 -c 200 https://localhost:7200/abc/index.html?param=abc
This is ApacheBench, Version 2.3 <$Revision: 1826891 $>
Server Software: Apache/2.4.33
Server Hostname: localhost
Server Port: 7200
SSL/TLS Protocol: TLSv1.2,ECDHE-RSA-AES256-GCM-SHA384,2048,256
TLS Server Name: localhost
Document Path: /abc/index.html?param=abc
Document Length: 1099 bytes
Concurrency Level: 200
Time taken for tests: 8.747 seconds
Complete requests: 5000
Failed requests: 0
Total transferred: 6810000 bytes
HTML transferred: 5495000 bytes
Requests per second: 571.60 [#/sec] (mean)
Time per request: 349.894 [ms] (mean)
Time per request: 1.749 [ms] (mean, across all concurrent requests)
Transfer rate: 760.27 [Kbytes/sec] received
Connection Times (ms)
min mean[+/-sd] median max
Connect: 0 198 42.7 188 391
Processing: 62 145 39.1 140 385
Waiting: 0 76 28.3 78 250
Total: 62 343 63.0 331 615
Percentage of the requests served within a certain time (ms)
50% 331
66% 369
75% 380
80% 389
90% 422
95% 465
98% 500
99% 536
100% 615 (longest request)
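To put the four runs side by side, the reported "Requests per second" figures can be turned into ratios. This is a small sketch using only the numbers printed by ab above; the dictionary layout and function name are illustrative.

```python
# Requests per second (mean) as reported by ab in the four runs above.
rps = {
    ("nginx", "http"): 1540.32,
    ("nginx", "https"): 45.88,
    ("apache", "http"): 2806.99,
    ("apache", "https"): 571.60,
}


def slowdown(server: str) -> float:
    """How many times slower https is than http for one server."""
    return rps[(server, "http")] / rps[(server, "https")]


nginx_slowdown = slowdown("nginx")
apache_slowdown = slowdown("apache")
https_gap = rps[("apache", "https")] / rps[("nginx", "https")]

print(f"nginx  http/https: {nginx_slowdown:.1f}x")
print(f"apache http/https: {apache_slowdown:.1f}x")
print(f"apache https / nginx https: {https_gap:.1f}x")
```

On these figures nginx loses a factor of roughly 34 when TLS is enabled, against roughly 5 for Apache, and Apache over https is about 12 times faster than nginx over https. In the nginx https run the mean Connect time (4201 ms) accounts for almost all of the mean Total (4232 ms).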
My nginx configuration:
worker_processes auto;
events {
    worker_connections 1024;
}
http {
    include mime.types;
    default_type application/octet-stream;
    sendfile on;
    keepalive_timeout 65;
    server {
        listen 8100;
        server_name localhost;
        location / {
            root html;
            index index.html index.htm;
        }
        #error_page 404 /404.html;
        # redirect server error pages to the static page /50x.html
        #
        error_page 500 502 503 504 /50x.html;
        location = /50x.html {
            root html;
        }
    }
    server {
        listen 8200 ssl http2;
        server_name localhost;
        ssl_certificate C:/nginx-1.13.12/conf/server.crt;
        ssl_certificate_key C:/nginx-1.13.12/conf/server.key;
        ssl_session_cache shared:SSL:10m;
        ssl_session_timeout 10m;
        ssl_ciphers HIGH:!aNULL:!MD5;
        ssl_prefer_server_ciphers on;
        gzip on;
        gzip_comp_level 1;
        gzip_vary on;
        gzip_types
            text/css
            text/javascript
            text/xml
            text/plain
            text/x-component
            application/javascript
            application/json
            application/xml
            application/rss+xml
            font/truetype
            font/opentype
            application/vnd.ms-fontobject
            image/svg+xml;
        gzip_static on;
        location /ipo_reits/ {
            root html;
            index index.html index.htm;
            ## here we redirect to the homepage in case of nginx 404
            try_files $uri $uri/ /ipo_reits/index.html;
            # error_page 404 =301 /;
        }
        location /api/ {
            proxy_pass https://localhost:7001/;
        }
    }
}
I hope this helps someone else. It seems the problem is specific to nginx on Windows: I wrongly assumed that nginx performs similarly on Windows and Linux, but clearly it does not.
I ran the benchmark again with nginx on Linux on the same machine and got excellent performance, as shown below:
ab -n 5000 -c 200 https://localhost:8200/abc/index?param=abc
This is ApacheBench, Version 2.3 <$Revision: 1706008 $>
Finished 5000 requests
Server Software: nginx/1.10.3
Server Hostname: localhost
Server Port: 8200
SSL/TLS Protocol: TLSv1.2,ECDHE-RSA-AES256-GCM-SHA384,2048,256
Document Path: /abc/index?param=abc
Document Length: 1099 bytes
Concurrency Level: 200
Time taken for tests: 4.179 seconds
Complete requests: 5000
Failed requests: 0
Total transferred: 6825000 bytes
HTML transferred: 5495000 bytes
Requests per second: 1196.37 [#/sec] (mean)
Time per request: 167.173 [ms] (mean)
Time per request: 0.836 [ms] (mean, across all concurrent requests)
Transfer rate: 1594.77 [Kbytes/sec] received
Connection Times (ms)
min mean[+/-sd] median max
Connect: 15 141 185.3 106 1322
Processing: 1 22 13.1 20 82
Waiting: 1 14 9.5 13 81
Total: 24 163 185.7 128 1351
Percentage of the requests served within a certain time (ms)
50% 128
66% 142
75% 148
80% 155
90% 208
95% 260
98% 1100
99% 1164
100% 1351 (longest request)
Also for sustained higher load & concurrency, performance was still the same:
ab -n 25000 -c 1000 https://localhost:8200/abc/index?param=abc
This is ApacheBench, Version 2.3 <$Revision: 1706008 $>
Benchmarking localhost (be patient)
Completed 2500 requests
....
Completed 25000 requests
Finished 25000 requests
Server Software: nginx/1.10.3
Server Hostname: localhost
Server Port: 8200
SSL/TLS Protocol: TLSv1.2,ECDHE-RSA-AES256-GCM-SHA384,2048,256
Document Path: /abc/index?param=abc
Document Length: 1099 bytes
Concurrency Level: 1000
Time taken for tests: 20.149 seconds
Complete requests: 25000
Failed requests: 0
Total transferred: 34125000 bytes
HTML transferred: 27475000 bytes
Requests per second: 1240.76 [#/sec] (mean)
Time per request: 805.960 [ms] (mean)
Time per request: 0.806 [ms] (mean, across all concurrent requests)
Transfer rate: 1653.94 [Kbytes/sec] received
Connection Times (ms)
min mean[+/-sd] median max
Connect: 7 687 711.8 492 7694
Processing: 2 89 50.1 81 516
Waiting: 0 57 48.9 41 509
Total: 15 776 723.4 600 7756
Percentage of the requests served within a certain time (ms)
50% 600
66% 812
75% 1095
80% 1186
90% 1397
95% 1631
98% 3183
99% 3442
100% 7756 (longest request)
Avoid Old Cipher Suites
HTTP/2 has a huge blacklist of old and insecure ciphers, so we must avoid them. A cipher suite is a set of cryptographic algorithms that describes how the transferred data should be encrypted.
We will use a really popular cipher set, whose security was approved by Internet giants like CloudFlare. It does not allow MD5-based cipher suites (MD5 has been known to be insecure since 1996, yet its use remains widespread to this day).
Open the following configuration file:
sudo nano /etc/nginx/nginx.conf
Add this line after ssl_prefer_server_ciphers on;.
/etc/nginx/nginx.conf
ssl_ciphers EECDH+CHACHA20:EECDH+AES128:RSA+AES128:EECDH+AES256:RSA+AES256:EECDH+3DES:RSA+3DES:!MD5;
Save the file, and exit the text editor.
Once again, check the configuration for syntax errors:
sudo nginx -t
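As an optional check, the cipher string from the ssl_ciphers line can be expanded against the local OpenSSL build to see which suites it actually enables. The sketch below uses Python's standard ssl module; the function name is illustrative, the result depends on the OpenSSL version Python is linked against, and TLS 1.3 suites appear in the list regardless of the string because they are configured separately.

```python
import ssl

# Cipher string from the ssl_ciphers line above (OpenSSL cipher-list syntax).
CIPHER_STRING = (
    "EECDH+CHACHA20:EECDH+AES128:RSA+AES128:EECDH+AES256:"
    "RSA+AES256:EECDH+3DES:RSA+3DES:!MD5"
)


def expand_ciphers(cipher_string: str) -> list:
    """Return the cipher suite names the local OpenSSL enables for the string."""
    context = ssl.SSLContext(ssl.PROTOCOL_TLS_SERVER)
    context.set_ciphers(cipher_string)  # raises ssl.SSLError if nothing matches
    return [suite["name"] for suite in context.get_ciphers()]


enabled = expand_ciphers(CIPHER_STRING)
for name in enabled:
    print(name)
```

No suite name in the printed list should contain MD5. Suites that the local build no longer ships (3DES on some systems, for example) are simply absent.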
I have been working with Apache Bench for a while now, and until now it worked just fine. However, today I started getting several Non-2xx responses. To investigate further, I tried a test against a simple website, so I ran:
ab -n 100 -c 10 http://www.yahoo.com/
And this is what I got:
This is ApacheBench, Version 2.3 <$Revision: 1796539 $> Copyright 1996
Adam Twiss, Zeus Technology Ltd, http://www.zeustech.net/ Licensed to
The Apache Software Foundation, http://www.apache.org/
Benchmarking www.yahoo.com (be patient).....done
Server Software: ATS
Server Hostname: www.yahoo.com
Server Port: 80
Document Path: /
Document Length: 8 bytes
Concurrency Level: 10
Time taken for tests: 4.898 seconds
Complete requests: 100
Failed requests: 0
Non-2xx responses: 100
Total transferred: 36875 bytes
HTML transferred: 800 bytes
Requests per second: 20.42 [#/sec] (mean)
Time per request: 489.817 [ms] (mean)
Time per request: 48.982 [ms] (mean, across all concurrent requests)
Transfer rate: 7.35 [Kbytes/sec] received
Connection Times (ms)
min mean[+/-sd] median max
Connect: 39 48 6.3 47 66
Processing: 50 416 89.3 415 521
Waiting: 49 254 121.0 261 512
Total: 93 464 92.1 460 575
Percentage of the requests served within a certain time (ms)
50% 460
66% 476
75% 511
80% 541
90% 569
95% 572
98% 574
99% 575
100% 575 (longest request)
As the output shows, even with an external url, I get 100% Non-2xx responses. Does anyone know how I could fix this?
Thank you!
It might be because Yahoo.com redirects you when you access it, so you get 3xx responses rather than 2xx ones directly.
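One way to test the redirect explanation is to issue a single request that does not follow redirects and look at the status code and Location header. The sketch below uses Python's http.client, which never follows redirects on its own; the function names are illustrative. ab counts every response whose status is not 2xx under "Non-2xx responses", so a 301 or 302 would be tallied there.

```python
import http.client


def fetch_status(host: str, port: int = 80, path: str = "/"):
    """Issue one GET without following redirects; return (status, Location)."""
    connection = http.client.HTTPConnection(host, port, timeout=10)
    try:
        connection.request("GET", path)
        response = connection.getresponse()
        response.read()  # drain the body so the connection closes cleanly
        return response.status, response.getheader("Location")
    finally:
        connection.close()


def counts_as_non_2xx(status: int) -> bool:
    """True for any status outside 200-299, which is what ab tallies as Non-2xx."""
    return not 200 <= status <= 299
```

Calling fetch_status("www.yahoo.com") from the machine that runs ab would show whether the site answers plain http requests with a redirect; if the Location header points to an https URL, benchmarking that URL directly should avoid the Non-2xx count.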
In the book Tomcat: The Definitive Guide by Jason Brittain with Ian F. Darwin, in the part on benchmarking with the ab tool, the authors say:
you should benchmark by running a minimum of 100,000 HTTP requests.
Also, you may configure the test client to spawn as many client threads as you would like,
but you will not get helpful results if you set it higher than the maxThreads you set for your Connector in your Tomcat's conf/server.xml file.
By default, it is set to 150.
The authors then recommend 149.
In my case, with 149 client threads, the running result is:
[user#apachetomcat ~]$ ab -k -n 100000 -c 149 http://10.138.0.2:8080/test.html
This is ApacheBench, Version 2.3 <$Revision: 1430300 $>
Copyright 1996 Adam Twiss, Zeus Technology Ltd, http://www.zeustech.net/
Licensed to The Apache Software Foundation, http://www.apache.org/
Benchmarking 10.138.0.2 (be patient)
Completed 10000 requests
Completed 20000 requests
Completed 30000 requests
Completed 40000 requests
Completed 50000 requests
Completed 60000 requests
Completed 70000 requests
Completed 80000 requests
Completed 90000 requests
Completed 100000 requests
Finished 100000 requests
Server Software:
Server Hostname: 10.138.0.2
Server Port: 8080
Document Path: /test.html
Document Length: 13 bytes
Concurrency Level: 149
Time taken for tests: 45.527 seconds
Complete requests: 100000
Failed requests: 0
Write errors: 0
Keep-Alive requests: 99106
Total transferred: 23195530 bytes
HTML transferred: 1300000 bytes
Requests per second: 2196.48 [#/sec] (mean)
Time per request: 67.836 [ms] (mean)
Time per request: 0.455 [ms] (mean, across all concurrent requests)
Transfer rate: 497.54 [Kbytes/sec] received
Connection Times (ms)
min mean[+/-sd] median max
Connect: 0 1 6.8 0 70
Processing: 66 67 5.6 67 870
Waiting: 66 67 5.6 67 870
Total: 66 68 8.8 67 870
Percentage of the requests served within a certain time (ms)
50% 67
66% 67
75% 67
80% 67
90% 67
95% 68
98% 69
99% 133
100% 870 (longest request)
After increasing to 1000 client threads, the result is:
[user#apachetomcat ~]$ ab -k -n 100000 -c 1000 http://10.138.0.2:8080/test.html
This is ApacheBench, Version 2.3 <$Revision: 1430300 $>
Copyright 1996 Adam Twiss, Zeus Technology Ltd, http://www.zeustech.net/
Licensed to The Apache Software Foundation, http://www.apache.org/
Benchmarking 10.138.0.2 (be patient)
Completed 10000 requests
Completed 20000 requests
Completed 30000 requests
Completed 40000 requests
Completed 50000 requests
Completed 60000 requests
Completed 70000 requests
Completed 80000 requests
Completed 90000 requests
Completed 100000 requests
Finished 100000 requests
Server Software:
Server Hostname: 10.138.0.2
Server Port: 8080
Document Path: /test.html
Document Length: 13 bytes
Concurrency Level: 1000
Time taken for tests: 7.205 seconds
Complete requests: 100000
Failed requests: 0
Write errors: 0
Keep-Alive requests: 99468
Total transferred: 23197340 bytes
HTML transferred: 1300000 bytes
Requests per second: 13879.80 [#/sec] (mean)
Time per request: 72.047 [ms] (mean)
Time per request: 0.072 [ms] (mean, across all concurrent requests)
Transfer rate: 3144.28 [Kbytes/sec] received
Connection Times (ms)
min mean[+/-sd] median max
Connect: 0 1 8.1 0 68
Processing: 66 69 22.3 67 1141
Waiting: 66 69 22.3 67 1141
Total: 66 70 27.5 67 1141
Percentage of the requests served within a certain time (ms)
50% 67
66% 67
75% 68
80% 68
90% 69
95% 71
98% 87
99% 139
100% 1141 (longest request)
Requests per second increases from 2196.48/sec to 13879.80/sec, so I think this change is meaningful.
Why do the authors think it is not helpful to set it higher than maxThreads?
What does the increase in requests per second mean in my case?
I'm confused about requests per second, and it is important that I understand it in order to follow the authors' benchmarks in the later chapters of the book.
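One way to read these numbers: ab's "Requests per second" and "Time per request (mean)" are two views of the same measurement, linked by the concurrency level. The sketch below recomputes the throughput of both runs from the concurrency and the mean time per request printed above; the function name is illustrative.

```python
def requests_per_second(concurrency: int, mean_time_per_request_ms: float) -> float:
    """Throughput implied by ab's concurrency level and "Time per request (mean)"."""
    return concurrency * 1000.0 / mean_time_per_request_ms


# Figures taken from the two runs above.
run_149 = requests_per_second(149, 67.836)
run_1000 = requests_per_second(1000, 72.047)

print(f"-c 149:  {run_149:.0f} requests/s")
print(f"-c 1000: {run_1000:.0f} requests/s")
```

Both values agree with the reported 2196.48 and 13879.80 requests per second. Read this way, the jump mostly reflects having more requests in flight at once while each one still takes roughly 67 to 72 ms; it does not by itself explain why the book advises staying below maxThreads.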
I have a 1 GB VPS, and Apache slows to a crawl almost from startup. I ran ApacheBench against a static.html file and the results are no better. However, the site will have both MySQL and PHP and a high volume of AJAX requests, so I'd like to tune for that.
When I restart, error logs show this almost immediately:
[error] server reached MaxClients setting, consider raising the MaxClients setting
ab -n 1000 -c 1000
shows:
Document Path: /static.html
Document Length: 7 bytes
Concurrency Level: 1000
Time taken for tests: 57.784 seconds
Complete requests: 1000
Failed requests: 64
(Connect: 0, Receive: 0, Length: 64, Exceptions: 0)
Write errors: 0
Total transferred: 309816 bytes
HTML transferred: 6552 bytes
Requests per second: 17.31 [#/sec] (mean)
Time per request: 57784.327 [ms] (mean)
Time per request: 57.784 [ms] (mean, across all concurrent requests)
Transfer rate: 5.24 [Kbytes/sec] received
Connection Times (ms)
min mean[+/-sd] median max
Connect: 0 25 13.4 25 48
Processing: 1070 16183 15379.4 9601 57737
Waiting: 0 14205 15176.5 9591 42516
Total: 1070 16208 15385.0 9635 57783
Percentage of the requests served within a certain time (ms)
50% 9635
66% 20591
75% 20629
80% 36357
90% 42518
95% 42538
98% 42556
99% 42560
100% 57783 (longest request)
If I run ab against a PHP file, it sometimes finishes, but most of the time it doesn't, and sometimes I get errors like
apr_socket_recv: Connection reset by peer (104)
and
socket: No buffer space available (105)
httpd.conf items:
Timeout 10
KeepAlive On
MaxKeepAliveRequests 100
KeepAliveTimeout 1
<IfModule prefork.c>
StartServers 3
MinSpareServers 5
MaxSpareServers 9
ServerLimit 40
MaxClients 40
MaxRequestsPerChild 5000
</IfModule>
Top... (CPU and Load 1min are very erratic during testing):
top - 10:44:51 up 11:50, 3 users, load average: 0.17, 0.42, 0.90
Tasks: 84 total, 2 running, 82 sleeping, 0 stopped, 0 zombie
Cpu(s): 2.8%us, 3.1%sy, 0.0%ni, 94.1%id, 0.0%wa, 0.0%hi, 0.0%si, 0.0%st
Mem: 1793072k total, 743604k used, 1049468k free, 0k buffers
Swap: 0k total, 0k used, 0k free, 0k cached
PID USER PR NI VIRT RES SHR S %CPU %MEM TIME+ COMMAND
21831 mysql 18 0 506m 71m 6688 S 0.7 4.1 4:03.18 mysqld
1828 root 15 0 113m 52m 2052 S 0.0 3.0 0:02.85 spamd
1830 popuser 18 0 113m 51m 956 S 0.0 2.9 0:00.00 spamd
8012 apache 15 0 327m 35m 17m S 3.7 2.0 0:11.83 httpd
8041 apache 15 0 320m 28m 15m S 0.0 1.6 0:11.83 httpd
8022 apache 15 0 321m 27m 14m S 2.3 1.6 0:11.05 httpd
8033 apache 15 0 320m 27m 14m S 1.7 1.6 0:10.06 httpd
Is there something obviously wrong here? If not, what would be my next step in troubleshooting?
Sounds like you don't have enough memory -- 1GB isn't much when you're running PHP with prefork and MySQL on the same server. Your MaxClients should probably be 10-20, not 40.
A few weeks ago I wrote a script to tune Apache httpd that would probably help determine the maximum values for your server. You can find the weblog entry here http://surniaulula.com/2012/11/09/check-apache-httpd-mpm-config-limits/ and the script is on Google Code as well.
Enjoy!
js.
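The memory reasoning behind a lower MaxClients can be sketched as a rough calculation: divide the RAM left over after the other services by the size of one httpd child. The function below is only an illustration, not the script mentioned above; the 400 MB reserve is an assumed figure, and the per-process sizes are the RES values from the top output in the question.

```python
def estimate_max_clients(total_mb: float, reserved_mb: float, httpd_res_mb: list) -> int:
    """Rough ceiling on prefork children that fit in RAM left after other services."""
    average_child_mb = sum(httpd_res_mb) / len(httpd_res_mb)
    return int((total_mb - reserved_mb) // average_child_mb)


# RES column of the four httpd processes in the top output above, in MB.
httpd_res_mb = [35, 28, 27, 27]

# ASSUMPTION: 1024 MB total (the "1 GB VPS") with 400 MB set aside for MySQL,
# spamd and the OS. The 400 MB figure is illustrative, not measured.
limit = estimate_max_clients(1024, 400, httpd_res_mb)
print(f"estimated MaxClients ceiling: {limit}")
```

With these inputs the ceiling comes out at 21, in the same range as the 10-20 suggested above. RES includes pages shared between children, which overstates the cost per child, while PHP processes tend to grow under load, which understates it, so the result is a starting point rather than a tuned value.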