Apache requests waterfall - apache

Under normal load, my Apache mod_status page shows this:
CPU Usage: u118.45 s9.79 cu0 cs0 - 14% CPU load
7.96 requests/sec - 18.1 kB/second - 2331 B/request
1 requests currently being processed, 29 idle workers
._.._._..._.___...._____...._.____.._.._.._....___._._____W.....
................................................................
................................................................
................................................................
................................................................
................................................................
................................................................
....................................................
But periodically (about every 1-3 minutes) my site "lags", and the Apache status looks like this:
CPU Usage: u222.29 s18.89 cu0 cs0 - 9.58% CPU load
7.77 requests/sec - 23.3 kB/second - 3064 B/request
20 requests currently being processed, 10 idle workers
WW.WW_W..WWC.._W.WCWW._._.W....WW_W..W__.W__...._...W...........
................................................................
................................................................
................................................................
................................................................
................................................................
................................................................
....................................................
During these moments the requests themselves are ordinary, the same kind as at any other time. There are just a lot of them.
I don't have any cron jobs that are this frequent.
Increasing the prefork SpareServers settings (MinSpareServers/MaxSpareServers) might help, but I want to know why these waves are happening.
My current config is
Timeout 60
KeepAlive On
MaxKeepAliveRequests 200
KeepAliveTimeout 20
<IfModule prefork.c>
StartServers 100
MinSpareServers 10
MaxSpareServers 30
ServerLimit 500
MaxClients 500
MaxRequestsPerChild 10000
</IfModule>
The hardware is strong enough.
Sorry for my bad English.
Any advice would be helpful.
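To compare the two snapshots above numerically, the scoreboard rows can be tallied with a few lines of Python. This is only a minimal sketch: the function name summarize_scoreboard is made up for illustration, and it treats every symbol other than "_" (idle) and "." (open slot) as a busy worker.

```python
from collections import Counter

# Scoreboard symbols as printed by mod_status.
IDLE = "_"
OPEN_SLOT = "."


def summarize_scoreboard(scoreboard: str) -> dict:
    """Count busy workers, idle workers, and open slots in a scoreboard string.

    summarize_scoreboard is a hypothetical helper, not part of Apache.
    """
    # Ignore whitespace so that several rows can be passed as one string.
    counts = Counter(ch for ch in scoreboard if not ch.isspace())
    idle = counts[IDLE]
    open_slots = counts[OPEN_SLOT]
    busy = sum(counts.values()) - idle - open_slots
    return {"busy": busy, "idle": idle, "open": open_slots, "by_state": dict(counts)}


if __name__ == "__main__":
    # First scoreboard row of the "lag" snapshot above.
    lag_row = "WW.WW_W..WWC.._W.WCWW._._.W....WW_W..W__.W__...._...W..........."
    summary = summarize_scoreboard(lag_row)
    print(summary["busy"], "busy,", summary["idle"], "idle")
```

For the row above this prints 20 busy, 10 idle, which agrees with the "20 requests currently being processed, 10 idle workers" line in the status output.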

Related

Apache Server Many requests stuck in "R" Reading Request

Below is the apache2ctl status output with almost no users online.
For over 5 years we (a cloud ERP supplier) have deployed instances on Google Cloud running Apache with mod_perl.
This week our largest server became slow and unresponsive. No idle workers were available. It turned out that increasing both MaxRequestWorkers and ServerLimit from 150 to 400 in mpm_prefork.conf made our server fast again.
I'm wondering why so many requests stay in "R" (Reading Request), at least 10 times more than there actually should be.
We did further checking, and DoS does not seem to be the issue: on other servers, in different clouds such as AWS or Alibaba, we see the same ratio of 10 between requests that stay in Reading mode and requests actually being processed (R/W/K).
What could cause this?
sudo /usr/sbin/apache2ctl status
Apache Server Status for localhost (via 127.0.0.1)
Server Version: Apache/2.4.7 (Ubuntu) PHP/5.5.9-1ubuntu4.29 OpenSSL/1.0.1f
mod_perl/2.0.8 Perl/v5.18.2
Server MPM: prefork
Server Built: Apr 3 2019 18:04:25
Current Time: Saturday, 29-Feb-2020 10:15:35 CET
Restart Time: Thursday, 27-Feb-2020 09:45:48 CET
Parent Server Config. Generation: 1
Parent Server MPM Generation: 0
Server uptime: 2 days 29 minutes 47 seconds
Server load: 0.75 0.77 0.75
Total accesses: 1581181 - Total Traffic: 8.6 GB
CPU Usage: u30.32 s9.64 cu0 cs0 - .0229% CPU load
9.06 requests/sec - 51.5 kB/second - 5.7 kB/request
96 requests currently being processed, 9 idle workers
RRKRRRK_RKRKKRRRRRK_RRRRKRCK_RRRC_CKK_KCRKCRK_RCR__CKKCCRCRRRRRR
RRRRR.RRRKRRRKRRR_RR..R.K.RCRKR.CKK.RRKKR.W.RRKR.....RR.........
................................................................
................................................................
................................................................
................................................................
................
Scoreboard Key:
"_" Waiting for Connection, "S" Starting up, "R" Reading Request,
"W" Sending Reply, "K" Keepalive (read), "D" DNS Lookup,
"C" Closing connection, "L" Logging, "G" Gracefully finishing,
"I" Idle cleanup of worker, "." Open slot with no current process

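To watch the share of workers stuck in "R" over time rather than eyeballing the scoreboard, the machine-readable mod_status output (the ?auto variant) can be parsed. The following is a rough sketch under assumptions: the URL http://127.0.0.1/server-status is only the conventional location and may differ on your servers, and the helper names are invented for illustration.

```python
from collections import Counter
from urllib.request import urlopen


def fetch_status_auto(url: str = "http://127.0.0.1/server-status?auto") -> str:
    """Fetch the plain-text mod_status page. The default URL is an assumption."""
    with urlopen(url, timeout=5) as resp:
        return resp.read().decode("ascii", errors="replace")


def parse_scoreboard(status_text: str) -> Counter:
    """Return a count of worker states from the 'Scoreboard:' line."""
    for line in status_text.splitlines():
        key, sep, value = line.partition(":")
        if sep and key.strip() == "Scoreboard":
            return Counter(value.strip())
    raise ValueError("no Scoreboard line found")


def reading_share(counts: Counter) -> float:
    """Fraction of non-idle, non-empty slots that are in state R."""
    active = sum(n for state, n in counts.items() if state not in "_.")
    return counts["R"] / active if active else 0.0


if __name__ == "__main__":
    counts = parse_scoreboard(fetch_status_auto())
    print(dict(counts), "share in R: %.2f" % reading_share(counts))
```

Running this from cron every few seconds and logging the result would show whether the R share is constant or comes in bursts.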
Nginx https very high connect time and is much slower (32 times) than Nginx http & 12 times slower than Apache https

I have an Angular website with static assets of around 1.5 MB (around 400 KB gzipped). I use nginx as my web server and as a reverse proxy to the API server. When I test nginx with the ApacheBench tool, I see a huge drop in performance for the https site compared to http (https is 10 times slower), while CPU and memory utilization are not high at all (CPU 30%, memory only 1 MB!!).
I have been searching for hours and tried every enhancement I could find, but none worked. As far as I have read, https should not be that much slower on modern web servers (for nginx, http is around 1500 req/sec and https is 46 req/sec). Most of this comes from the very high https connect time in nginx, but I have no clue how to solve it.
Can someone advise how to improve this?
(Also, to my surprise, Apache performs much better in both cases, but it doesn't respond if I set concurrent connections to more than 200. This is not nginx vs. Apache; I am just stating my situation.)
Important note:
I am not comparing the two web servers; that is not the point of this site. But they generally have comparable performance, so if https in nginx is 10 times slower than in Apache, I feel that something is wrong in my nginx configuration and I want to fix it.
All tests are on my Windows machine (i7, 16 GB RAM).
Nginx http only:
C:\Apache24\bin>ab -n 5000 -c 200 http://localhost:8100/abc/index.html?param=abc
This is ApacheBench, Version 2.3 <$Revision: 1826891 $>
Server Software: nginx/1.15.4
Server Hostname: localhost
Server Port: 8100
Document Path: /abc/index.html?param=abc
Document Length: 1099 bytes
Concurrency Level: 200
Time taken for tests: 3.246 seconds
Complete requests: 5000
Failed requests: 0
Total transferred: 6665000 bytes
HTML transferred: 5495000 bytes
Requests per second: 1540.32 [#/sec] (mean)
Time per request: 129.843 [ms] (mean)
Time per request: 0.649 [ms] (mean, across all concurrent requests)
Transfer rate: 2005.12 [Kbytes/sec] received
Connection Times (ms)
min mean[+/-sd] median max
Connect: 0 0 1.3 0 16
Processing: 31 87 12.8 94 124
Waiting: 0 87 13.7 94 124
Total: 31 87 12.8 94 124
Percentage of the requests served within a certain time (ms)
50% 94
66% 94
75% 94
80% 94
90% 99
95% 109
98% 109
99% 113
100% 124 (longest request)
Nginx https (with http2 enabled)
C:\Apache24\bin>abs -n 5000 -c 200 https://localhost:8200/abc/index.html?param=abc
This is ApacheBench, Version 2.3 <$Revision: 1826891 $>
Server Software: nginx/1.15.4
Server Hostname: localhost
Server Port: 8200
SSL/TLS Protocol: TLSv1.2,ECDHE-RSA-AES256-GCM-SHA384,2048,256
TLS Server Name: localhost
Document Path: /abc/index.html?param=abc
Document Length: 1099 bytes
Concurrency Level: 200
Time taken for tests: 108.985 seconds
Complete requests: 5000
Failed requests: 0
Total transferred: 6780000 bytes
HTML transferred: 5495000 bytes
Requests per second: 45.88 [#/sec] (mean)
Time per request: 4359.386 [ms] (mean)
Time per request: 21.797 [ms] (mean, across all concurrent requests)
Transfer rate: 60.75 [Kbytes/sec] received
Connection Times (ms)
min mean[+/-sd] median max
Connect: 16 4201 506.8 4251 4755
Processing: 0 32 12.6 31 88
Waiting: 0 32 12.6 31 88
Total: 62 4232 506.9 4283 4800
Percentage of the requests served within a certain time (ms)
50% 4283
66% 4342
75% 4413
80% 4439
90% 4484
95% 4547
98% 4694
99% 4727
100% 4800 (longest request)
Compared to Apache http (here CPU is around 90 to 100% utilized)
C:\Apache24\bin>ab -n 5000 -c 200 http://localhost:6200/abc/index.html?param=abc
Server Software: Apache/2.4.33
Server Hostname: localhost
Server Port: 6200
Document Path: /abc/index.html?param=abc
Document Length: 1099 bytes
Concurrency Level: 200
Time taken for tests: 1.781 seconds
Complete requests: 5000
Failed requests: 0
Total transferred: 6810000 bytes
HTML transferred: 5495000 bytes
Requests per second: 2806.99 [#/sec] (mean)
Time per request: 71.251 [ms] (mean)
Time per request: 0.356 [ms] (mean, across all concurrent requests)
Transfer rate: 3733.51 [Kbytes/sec] received
Connection Times (ms)
min mean[+/-sd] median max
Connect: 0 0 1.6 0 16
Processing: 16 69 16.0 63 125
Waiting: 0 57 16.0 63 125
Total: 16 69 16.0 63 125
Percentage of the requests served within a certain time (ms)
50% 63
66% 78
75% 78
80% 78
90% 94
95% 94
98% 94
99% 109
100% 125 (longest request)
And Apache https is as follows (HTTP/1.1). Note that switching nginx to HTTP/1.1 didn't improve its performance:
C:\Apache24\bin>abs -n 5000 -c 200 https://localhost:7200/abc/index.html?param=abc
This is ApacheBench, Version 2.3 <$Revision: 1826891 $>
Server Software: Apache/2.4.33
Server Hostname: localhost
Server Port: 7200
SSL/TLS Protocol: TLSv1.2,ECDHE-RSA-AES256-GCM-SHA384,2048,256
TLS Server Name: localhost
Document Path: /abc/index.html?param=abc
Document Length: 1099 bytes
Concurrency Level: 200
Time taken for tests: 8.747 seconds
Complete requests: 5000
Failed requests: 0
Total transferred: 6810000 bytes
HTML transferred: 5495000 bytes
Requests per second: 571.60 [#/sec] (mean)
Time per request: 349.894 [ms] (mean)
Time per request: 1.749 [ms] (mean, across all concurrent requests)
Transfer rate: 760.27 [Kbytes/sec] received
Connection Times (ms)
min mean[+/-sd] median max
Connect: 0 198 42.7 188 391
Processing: 62 145 39.1 140 385
Waiting: 0 76 28.3 78 250
Total: 62 343 63.0 331 615
Percentage of the requests served within a certain time (ms)
50% 331
66% 369
75% 380
80% 389
90% 422
95% 465
98% 500
99% 536
100% 615 (longest request)
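To put the four ab runs above side by side, the headline figures can be recomputed from "Complete requests", "Time taken for tests", and the concurrency level. This is just a back-of-the-envelope sketch; the function names are made up, and the results differ slightly from ab's own output because ab uses an unrounded elapsed time.

```python
def requests_per_second(complete: int, seconds: float) -> float:
    """Mean throughput, as in ab's 'Requests per second'."""
    return complete / seconds


def mean_time_per_request_ms(concurrency: int, complete: int, seconds: float) -> float:
    """ab's 'Time per request (mean)': concurrency * elapsed / complete, in ms."""
    return concurrency * seconds / complete * 1000.0


# (complete requests, time taken in seconds), copied from the ab output above.
runs = {
    "nginx http": (5000, 3.246),
    "nginx https": (5000, 108.985),
    "apache http": (5000, 1.781),
    "apache https": (5000, 8.747),
}

rps = {name: requests_per_second(n, t) for name, (n, t) in runs.items()}

if __name__ == "__main__":
    for name, value in rps.items():
        print(f"{name:13s} {value:8.2f} req/sec")
    print("nginx https vs nginx http:  %.1fx slower" % (rps["nginx http"] / rps["nginx https"]))
    print("nginx https vs apache https: %.1fx slower" % (rps["apache https"] / rps["nginx https"]))
```

In the nginx https run, the mean connect time (4201 ms) accounts for almost all of the mean total time (4232 ms), so the slowdown sits in connection setup rather than in serving the file.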
My nginx configuration:
worker_processes auto;
events {
    worker_connections 1024;
}
http {
    include mime.types;
    default_type application/octet-stream;
    sendfile on;
    keepalive_timeout 65;
    server {
        listen 8100;
        server_name localhost;
        location / {
            root html;
            index index.html index.htm;
        }
        #error_page 404 /404.html;
        # redirect server error pages to the static page /50x.html
        #
        error_page 500 502 503 504 /50x.html;
        location = /50x.html {
            root html;
        }
    }
    server {
        listen 8200 ssl http2;
        server_name localhost;
        ssl_certificate C:/nginx-1.13.12/conf/server.crt;
        ssl_certificate_key C:/nginx-1.13.12/conf/server.key;
        ssl_session_cache shared:SSL:10m;
        ssl_session_timeout 10m;
        ssl_ciphers HIGH:!aNULL:!MD5;
        ssl_prefer_server_ciphers on;
        gzip on;
        gzip_comp_level 1;
        gzip_vary on;
        gzip_types
            text/css
            text/javascript
            text/xml
            text/plain
            text/x-component
            application/javascript
            application/json
            application/xml
            application/rss+xml
            font/truetype
            font/opentype
            application/vnd.ms-fontobject
            image/svg+xml;
        gzip_static on;
        location /ipo_reits/ {
            root html;
            index index.html index.htm;
            ## here we redirect to the homepage in case of nginx 404
            try_files $uri $uri/ /ipo_reits/index.html;
            # error_page 404 =301 /;
        }
        location /api/ {
            proxy_pass https://localhost:7001/;
        }
    }
}
I hope this helps someone else. The problem seems to be specific to nginx on Windows: I wrongly assumed that nginx performs similarly on Windows and Linux, but clearly it does not.
I ran the benchmark again with nginx on Linux on the same machine and got excellent performance, as shown below:
ab -n 5000 -c 200 https://localhost:8200/abc/index?param=abc
This is ApacheBench, Version 2.3 <$Revision: 1706008 $>
Finished 5000 requests
Server Software: nginx/1.10.3
Server Hostname: localhost
Server Port: 8200
SSL/TLS Protocol: TLSv1.2,ECDHE-RSA-AES256-GCM-SHA384,2048,256
Document Path: /abc/index?param=abc
Document Length: 1099 bytes
Concurrency Level: 200
Time taken for tests: 4.179 seconds
Complete requests: 5000
Failed requests: 0
Total transferred: 6825000 bytes
HTML transferred: 5495000 bytes
Requests per second: 1196.37 [#/sec] (mean)
Time per request: 167.173 [ms] (mean)
Time per request: 0.836 [ms] (mean, across all concurrent requests)
Transfer rate: 1594.77 [Kbytes/sec] received
Connection Times (ms)
min mean[+/-sd] median max
Connect: 15 141 185.3 106 1322
Processing: 1 22 13.1 20 82
Waiting: 1 14 9.5 13 81
Total: 24 163 185.7 128 1351
Percentage of the requests served within a certain time (ms)
50% 128
66% 142
75% 148
80% 155
90% 208
95% 260
98% 1100
99% 1164
100% 1351 (longest request)
Under sustained higher load and concurrency, performance stayed the same:
ab -n 25000 -c 1000 https://localhost:8200/abc/index?param=abc
This is ApacheBench, Version 2.3 <$Revision: 1706008 $>
Benchmarking localhost (be patient)
Completed 2500 requests
....
Completed 25000 requests
Finished 25000 requests
Server Software: nginx/1.10.3
Server Hostname: localhost
Server Port: 8200
SSL/TLS Protocol: TLSv1.2,ECDHE-RSA-AES256-GCM-SHA384,2048,256
Document Path: /abc/index?param=abc
Document Length: 1099 bytes
Concurrency Level: 1000
Time taken for tests: 20.149 seconds
Complete requests: 25000
Failed requests: 0
Total transferred: 34125000 bytes
HTML transferred: 27475000 bytes
Requests per second: 1240.76 [#/sec] (mean)
Time per request: 805.960 [ms] (mean)
Time per request: 0.806 [ms] (mean, across all concurrent requests)
Transfer rate: 1653.94 [Kbytes/sec] received
Connection Times (ms)
min mean[+/-sd] median max
Connect: 7 687 711.8 492 7694
Processing: 2 89 50.1 81 516
Waiting: 0 57 48.9 41 509
Total: 15 776 723.4 600 7756
Percentage of the requests served within a certain time (ms)
50% 600
66% 812
75% 1095
80% 1186
90% 1397
95% 1631
98% 3183
99% 3442
100% 7756 (longest request)
Avoid Old Cipher Suites
HTTP/2 has a huge blacklist of old and insecure ciphers, so we must avoid them. Cipher suites are sets of cryptographic algorithms that describe how the transferred data should be encrypted.
We will use a really popular cipher set, whose security was approved by Internet giants like CloudFlare. It does not allow the use of MD5 (which has been known to be insecure since 1996 but is still in widespread use to this day).
Open the following configuration file:
sudo nano /etc/nginx/nginx.conf
Add this line after ssl_prefer_server_ciphers on;.
/etc/nginx/nginx.conf
ssl_ciphers EECDH+CHACHA20:EECDH+AES128:RSA+AES128:EECDH+AES256:RSA+AES256:EECDH+3DES:RSA+3DES:!MD5;
Save the file, and exit the text editor.
Once again, check the configuration for syntax errors:
sudo nginx -t

Apache 2.4.10 hangs AH00485: scoreboard is full, not at MaxRequestWorkers

The Apache server stays up for a random amount of time, usually days, but eventually enters a hung state. When it hangs, the CPU load on the machine gradually climbs and the web server stops responding to new requests.
Error logs typically contain lots of these:
[Wed Jan 28 16:06:58.667188 2015] [mpm_event:error] [pid 25336:tid 1] AH00485: scoreboard is full, not at MaxRequestWorkers
Environment:
LDOM (VM) SunOS myhostname 5.10 Generic_118833-36 sun4v sparc SUNW,Sun-Fire-T200
http Conf:
StartServers 8
MinSpareServers Not set
MaxSpareServers Not set
ServerLimit 256
MaxRequestWorkers 100
MaxConnectionsPerChild 1000
KeepAlive On
TimeOut 3000
MaxKeepAliveRequests 50
KeepAliveTimeout 2
Current non-hung Score Board:
Server Version: Apache/2.4.10 (Unix)
Server MPM: event
Server Built: Oct 30 2014 16:29:03
Current Time: Wednesday, 28-Jan-2015 10:59:39 PST
Restart Time: Wednesday, 28-Jan-2015 09:49:21 PST
Parent Server Config. Generation: 1
Parent Server MPM Generation: 0
Server uptime: 1 hour 10 minutes 17 seconds
Server load: 0.60 0.46 0.41
Total accesses: 1134 - Total Traffic: 2.2 GB
CPU Usage: u9.07 s16.94 cu609.51 cs69.31 - 16.7% CPU load
.269 requests/sec - 0.5 MB/second - 2.0 MB/request
1 requests currently being processed, 99 idle workers
PID Connections Threads Async connections
total accepting busy idle writing keep-alive closing
25337 0 yes 1 24 0 0 0
25338 1 yes 0 25 1 0 0
25339 1 yes 0 25 0 0 1
25340 1 yes 0 25 0 0 1
Sum 3 1 99 1 0 2
Any thoughts on http conf tuning, OS patches, apache bug fixes appreciated.
Yes I have seen the open ASF bugzilla for the same error message.
This is a production server, so you can imagine, having it go down at random times (usually when I am asleep) is not fun!
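Since the hang shows up at random times, it may help to see when the AH00485 messages start piling up and line that up with traffic or deploys. Here is a small sketch that counts them per hour from the error log. The function name and the regular expression are my own and assume the default Apache 2.4 error log timestamp format shown above.

```python
import re
from collections import Counter

# Matches e.g. "[Wed Jan 28 16:06:58.667188 2015] ... AH00485: ..."
# Captured groups: month, day, hour, year.
LINE_RE = re.compile(
    r"^\[\w{3} (\w{3}) +(\d{1,2}) (\d{2}):\d{2}:\d{2}(?:\.\d+)? (\d{4})\].*AH00485"
)


def count_ah00485_per_hour(lines):
    """Return a Counter mapping 'YYYY Mon DD HH:00' to the number of AH00485 lines.

    count_ah00485_per_hour is a hypothetical helper, not an Apache tool.
    """
    per_hour = Counter()
    for line in lines:
        match = LINE_RE.match(line)
        if match:
            month, day, hour, year = match.groups()
            per_hour[f"{year} {month} {int(day):02d} {hour}:00"] += 1
    return per_hour


if __name__ == "__main__":
    import sys

    # Usage: python count_ah00485.py /path/to/error_log
    with open(sys.argv[1], errors="replace") as log:
        for bucket, count in sorted(count_ah00485_per_hour(log).items()):
            print(bucket, count)
```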

Possible reasons for "mysql server has gone away" error (php 5.4, mysqlnd)

We recently upgraded one of our webservers from PHP 5.3 (Debian Squeeze package, using libmysqlclient and APC) to PHP 5.4 (Debian Wheezy, Dotdeb package, using mysqlnd, Opcache and APCu). After working fine for almost one day, we experienced "mysql server has gone away" errors for every request. All other servers with the same load which still run PHP 5.3 with libmysqlclient using the same MySQL server had no problem at all. On all servers we use:
max_execution_time = 60
default_socket_timeout = 60
On our PHP 5.3 servers we did not change any mysql/my.cnf timeouts. We know about problems with read_timeout (mysql), wait_timeout (mysql), default_socket_timeout (php) and max_execution_time (php), but only in context of batch scripts with long running queries. Our webservers usually respond in about 300ms, so those timeouts should not be an issue here.
It became really strange when we removed the server from load balancing: there was no load anymore, but we still had 180 busy Apache processes. Even an apache2ctl graceful did not change anything; hours later, apache2ctl status still said:
Apache Server Status for localhost
Server Version: Apache/2.2.22 (Debian)
Server Built: Jun 16 2014 03:51:14
__________________________________________________________________
Current Time: Tuesday, 22-Jul-2014 10:17:44 CEST
Restart Time: Monday, 21-Jul-2014 18:43:37 CEST
Parent Server Generation: 26
Server uptime: 15 hours 34 minutes 6 seconds
Total accesses: 596973 - Total Traffic: 1.6 GB
CPU Usage: u6288.72 s463.96 cu.01 cs0 - 12% CPU load
10.7 requests/sec - 30.8 kB/second - 2962 B/request
176 requests currently being processed, 99 idle workers
GGGGGG_GGGGGGGGG_GG_GGGGGGGGGGGGGGGGGGGG_GGGGGG_GGGGGGGG_GGGGGGG
GGGGGGGGGG_G_GGGGGGGG_G_GG__GGGGGG_GGGGG_GGG___GG_GGGGGGGG_G_GGG
GGGGGGGGGGGG_G_GG__GG_GGG_GGGGGGGGG__GGG_GGG_G_G_GG_G_GGGGGGGGGG
GGG_GGG_GG_GGG_GG_G_GGG_______________.___._W___________________
____.___________.______.........................................
................................................................
................................................................
................................................................
................................................................
................................................................
................................................................
................................................................
................................................................
................................................................
................................................................
................................................................
................................................................
................................................................
................................................................
................................................................
................................................................
................................................................
................................................................
................................................................
................................................................
................................................................
................................................................
................................................................
................................................................
................................................................
................................................................
................................................................
................................................................
................................................................
................................................................
................................................................
................................................................
................................................................
................................................................
....
Scoreboard Key:
"_" Waiting for Connection, "S" Starting up, "R" Reading Request,
"W" Sending Reply, "K" Keepalive (read), "D" DNS Lookup,
"C" Closing connection, "L" Logging, "G" Gracefully finishing,
"I" Idle cleanup of worker, "." Open slot with no current process
Only apache2ctl restart solved the issue and everything worked fine again. The MySQL error is the only "useful" error message we found so far.
Could it be an issue with mysqlnd, opcache or apcu and PHP 5.4.30? Are there any known problems which could result in the behavior we have experienced?
Or do you have an idea how to debug the "mysql server has gone away" issue?
We probably found out what causes the "mysql server has gone away" error: on the MySQL server we configured a wait_timeout of 30 seconds, which is less than the 60-second max_execution_time. So under certain conditions something seems to take more than 30 seconds while we are reading a result-set from MySQL, and the server closes the connection while we are still trying to get data from it. That leads us to the next questions:
What function consumes so much time while we are in a loop reading a result-set from mysql?
Why does apache2ctl graceful not restart the Apache Processes, even when max_execution_time should abort the scripts after 60 seconds?
I think the answer to both questions is a bug in APCu. If I look at the hanging Apache children, I get FUTEX_WAIT from strace:
[pid 28354] futex(0x7f3a8c3d2094, FUTEX_WAIT, 69, NULL <unfinished ...>
If I look at such a process using gdb it seems to hang at pthread_rwlock_wrlock(), what I get is:
0x00007f3adcd18abd in pthread_rwlock_wrlock () from /lib/x86_64-linux-gnu/libpthread.so.0
OK, pthread_rwlock is used for locking in APCu, and a problem with the locking mechanism is a really good explanation for what we see here. We definitely have code that reads from and writes to APCu inside loops over MySQL result-sets, and if there is a locking problem (which has already been a problem for us in the past with APC as well), such a loop could take more than 30 but less than 60 seconds, which produces the MySQL error we see. After that, something in APCu goes really wrong, so the PHP script can neither be aborted by max_execution_time nor restarted by apache2ctl graceful.
In APCu issue tracker I could find very similar issues:
https://github.com/krakjoe/apcu/issues/19
But we found another hint. The crash always happens when there are about 70k keys in APCu, and it does not depend on apc.shm_size. We also found that our APCu monitoring script produces "PHP Fatal error: Allowed memory size of 134217728 bytes exhausted (tried to allocate 78 bytes)" errors when calling apcu_cache_info() at line 47 at the same time as we see the crash. So we have to look into why the script consumes so much memory. AFAIR it reads all the data to calculate memory fragmentation; perhaps we should remove that part...
We had a lot of problems with APC in the past; we switched to APCu/Opcache only because we got segfaults with the latest APC and PHP 5.4.30, and the issue mentioned above has been open for a year now. We are happy to see recent activity on yac; perhaps a lockless cache is a more stable option. If we can't fix this by removing the problematic parts of our monitoring script, we will switch to local memcached instances; it will be slower, but we know it's very stable.
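The timeout mismatch described above (wait_timeout of 30 seconds on the MySQL side versus max_execution_time of 60 seconds on the PHP side) is easy to check mechanically across servers. The snippet below is only a sketch of that one comparison; the function name is invented, and it assumes you have already read the two values from my.cnf and php.ini yourself.

```python
from typing import List


def timeout_warnings(wait_timeout: int, max_execution_time: int) -> List[str]:
    """Flag the case where MySQL may drop a connection before PHP gives up.

    timeout_warnings is a hypothetical helper; both arguments are in seconds.
    """
    warnings = []
    if wait_timeout < max_execution_time:
        warnings.append(
            "MySQL wait_timeout (%ds) is shorter than PHP max_execution_time (%ds): "
            "a script that pauses between reads for longer than wait_timeout can see "
            "'mysql server has gone away'" % (wait_timeout, max_execution_time)
        )
    return warnings


if __name__ == "__main__":
    # Values from the setup described above.
    for message in timeout_warnings(wait_timeout=30, max_execution_time=60):
        print("WARNING:", message)
```

This does not explain the hanging processes, only why the MySQL error appears once something stalls for more than 30 seconds.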

Apache performance tuning with 1 GB with httpd.conf

I have a 1 GB VPS and Apache slows to a crawl almost from start up. I ran ApacheBench on a static.html file and things don't differ. However, the site will have both MySQL and PHP and a high volume of AJAX requests, so I'd like to tune for that.
When I restart, error logs show this almost immediately:
[error] server reached MaxClients setting, consider raising the MaxClients setting
ab -n 1000 -c 1000
shows:
Document Path: /static.html
Document Length: 7 bytes
Concurrency Level: 1000
Time taken for tests: 57.784 seconds
Complete requests: 1000
Failed requests: 64
(Connect: 0, Receive: 0, Length: 64, Exceptions: 0)
Write errors: 0
Total transferred: 309816 bytes
HTML transferred: 6552 bytes
Requests per second: 17.31 [#/sec] (mean)
Time per request: 57784.327 [ms] (mean)
Time per request: 57.784 [ms] (mean, across all concurrent requests)
Transfer rate: 5.24 [Kbytes/sec] received
Connection Times (ms)
min mean[+/-sd] median max
Connect: 0 25 13.4 25 48
Processing: 1070 16183 15379.4 9601 57737
Waiting: 0 14205 15176.5 9591 42516
Total: 1070 16208 15385.0 9635 57783
Percentage of the requests served within a certain time (ms)
50% 9635
66% 20591
75% 20629
80% 36357
90% 42518
95% 42538
98% 42556
99% 42560
100% 57783 (longest request)
If I run ab against a PHP file, it sometimes finishes, but most of the time it doesn't, and sometimes it fails with errors like
apr_socket_recv: Connection reset by peer (104)
and
socket: No buffer space available (105)
httpd.conf items:
Timeout 10
KeepAlive On
MaxKeepAliveRequests 100
KeepAliveTimeout 1
<IfModule prefork.c>
StartServers 3
MinSpareServers 5
MaxSpareServers 9
ServerLimit 40
MaxClients 40
MaxRequestsPerChild 5000
</IfModule>
Top... (CPU and Load 1min are very erratic during testing):
top - 10:44:51 up 11:50, 3 users, load average: 0.17, 0.42, 0.90
Tasks: 84 total, 2 running, 82 sleeping, 0 stopped, 0 zombie
Cpu(s): 2.8%us, 3.1%sy, 0.0%ni, 94.1%id, 0.0%wa, 0.0%hi, 0.0%si, 0.0%st
Mem: 1793072k total, 743604k used, 1049468k free, 0k buffers
Swap: 0k total, 0k used, 0k free, 0k cached
PID USER PR NI VIRT RES SHR S %CPU %MEM TIME+ COMMAND
21831 mysql 18 0 506m 71m 6688 S 0.7 4.1 4:03.18 mysqld
1828 root 15 0 113m 52m 2052 S 0.0 3.0 0:02.85 spamd
1830 popuser 18 0 113m 51m 956 S 0.0 2.9 0:00.00 spamd
8012 apache 15 0 327m 35m 17m S 3.7 2.0 0:11.83 httpd
8041 apache 15 0 320m 28m 15m S 0.0 1.6 0:11.83 httpd
8022 apache 15 0 321m 27m 14m S 2.3 1.6 0:11.05 httpd
8033 apache 15 0 320m 27m 14m S 1.7 1.6 0:10.06 httpd
Is there something obviously wrong here? If not, what would be my next step in troubleshooting?
Sounds like you don't have enough memory -- 1GB isn't much when you're running PHP with prefork and MySQL on the same server. Your MaxClients should probably be 10-20, not 40.
A few weeks ago I wrote a script to tune Apache httpd that would probably help determine the maximum values for your server. You can find the weblog entry here http://surniaulula.com/2012/11/09/check-apache-httpd-mpm-config-limits/ and the script is on Google Code as well.
Enjoy!
js.
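As a rough illustration of the sizing advice above (MaxClients of 10-20 rather than 40), the usual rule of thumb is to divide the memory left over for Apache by the memory of one httpd child. The sketch below uses that rule with assumed numbers: 1024 MB total, 400 MB reserved for MySQL, spamd and the OS (a guess, not a measurement), and 35 MB per child (the largest httpd RES value in the top output). The function name is made up.

```python
def estimate_max_clients(total_ram_mb: float, reserved_mb: float, per_child_mb: float) -> int:
    """Rule-of-thumb prefork MaxClients: spare RAM divided by per-child memory.

    estimate_max_clients is a hypothetical helper; all inputs are in MB.
    """
    if per_child_mb <= 0:
        raise ValueError("per_child_mb must be positive")
    available = total_ram_mb - reserved_mb
    # Never return less than one worker, even if the reservation is too large.
    return max(1, int(available // per_child_mb))


if __name__ == "__main__":
    # Assumed inputs, see the note above; adjust to your own measurements.
    print(estimate_max_clients(total_ram_mb=1024, reserved_mb=400, per_child_mb=35))
```

With these assumed inputs the estimate is 17, which falls inside the 10-20 range suggested above. RES includes shared pages, so the per-child figure is on the conservative side.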