I have a LAMP server (Quad Core Debian with 4GB RAM, Apache 2.2 and PHP 5.3) with Rackspace which is used as an API Server. I would like to know what is the best KeepAlive option for Apache given our setup.
The API server hosts a single PHP file which responds with plain JSON. This is a fairly hefty file which performs some MySql reads/writes and quite a few Memcache lookups.
We have about 90 clients that are logged into the system at any one time.
Roughly 1/3rd of clients would be idle.
Of the active clients (roughly 60) they send a request to the API every 3 seconds.
Clients switch from active to idle and vice versa every 15 or 20 minutes or so.
With KeepAlive On, the server goes nuts and memory peaks at close to 4GB (swap is engaged etc).
With KeepAlive Off, the memory sits at 3GB however I notice that Apache is constantly killing and creating new processes to handle each connection.
So, my three options are:
KeepAlive On and KeepAliveTimeout Default - In this case I guess I will just need to get more RAM.
KeepAlive On and KeepAliveTimeout Low (perhaps 10 seconds?) If KeepAliveTimeout is set at 10 seconds, will a client maintain a constant connection to that one process by accessing the resource at regular 3 second intervals? When that client becomes idle for longer than 10 seconds will the process then be killed? If so I guess option 2 looks like the best one to go for?
KeepAlive Off This is clearly best for RAM, but will it have an impact on the response times due to the work involved in setting up a new process for each request?
Which option is best?
It looks like your php script is leaking memory. Before making them long running processes you should get to grips with that.
If you have not a good idea of the memory usage per request and from request to request adding memory is not a real solution. It might help for now and break again next week.
I would keep running separate processes till memory management is under control. If you have response problems currently your best bet is add another server to spread load.
The very first thing you should be checking is whether the clients are actually using the keepalive functioality at all. I'm not sure what you mean by an 'API server' but if its some sort of webservice then (IME) its rather difficult to implement well behaved clients using keepalives.(See %k directive for mod_log_config).
ALso, we really need to know what your objectives and constraints are? Performance / capacity / low cost?
Is this running over HTTP or HTTPS - there's a big difference in latency.
I'd have said that a keeplive time of 10 seconds is ridiculously high - not low at all.
Even if you've got 90 clients holding connections open, 4Gb seems a rather large amount of memory for them to be using - I'e run systems with 150-200 concurrent connections to complex PHP scripts using approx 0.5Gb over resting usage. Your figures of 250 + 90 x 20M only gives you a footprint of about 2Gb (I know is not that simple - but its not much more complicated).
For the figures you've given I wouldn't expect any benefit - but a significantly bigger memory footprint - using anything over 5 seconds for the keepalive. You could probably use a keepalive time of 2 seconds without any significant loss of throughput, But there's no substitute for measuring the effectiveness of various configs - and analysing the data to find the optimal config.
Certainly if you find that your clients are able to take advantage of keepalives and get a measurable benefit from doing so then you need to find the best way of accomodating that. Using a threaded server might help a little with memory usage, but you'll probably find a lot more benefit in running a reverse proxy in front of the webserver - particularly which SSL.
Besides that you may get significant benefits through normal tuning - code profiling, output compression etc.
Instead of managing the KeepAlive settings, which clearly have no real advantage in your particular situation between the 3 options, you should consider switching the Apache to an event or a thread based MPM where you could easily use KeepAlive On and set the Timeout value high.
I would go as far as also considering the switch to Apache on Windows. The benefit here is that it's MPM is completely thread based and takes advantage of Windows preference for threads over processes. You can easily do 512 threads with KeepAlive On and Timeout of 3-10 seconds on 1-2GB of RAM.
WampDeveloper Pro -
Xampp -
WampServer
Otherwise, your only other options are to switch MPM from Prefork to Worker...
http://httpd.apache.org/docs/2.2/mod/worker.html
Or to Event (which also got better with Apache 2.4)...
http://httpd.apache.org/docs/2.2/mod/event.html
Related
For this test, I have a simple Java servlet that reads data in and calculates the CRC32 for it. When making serial requests of 512MB each, I get about 600MB/sec. That makes sense since I can't use all 24 cores available to me to calculate a CRC. The program driving this I/O is sitting on the local box to eliminate the possibility of networking issues. I am running Tomcat 8.0.24.0 on FreeBSD using OpenJDK 64-Bit Server VM (build 25.45-b02, mixed mode).
Next, I attempt the same test with 6 concurrent requests, expecting that the performance per request might be lower than 600MB/sec, but that the aggregate performance across all 6 requests would be significantly higher.
What I see is the CPU has some idle time at ALL times (so it doesn't appear that I'm CPU-bound). I also see that all processing threads in Tomcat are running concurrently as anticipated. However, it looks like I'm only getting around 800MB/sec in aggregate. The threads in Tomcat spend most of their time waiting to read from the socket, as shown below.
I would appreciate any thoughts on how to improve Tomcat throughput / why so much time is spent waiting for more data (which I assume is what's going on below).
at sun.misc.Unsafe.park(Native Method)
at java.util.concurrent.locks.LockSupport.parkNanos(LockSupport.java:215)
at java.util.concurrent.locks.AbstractQueuedSynchronizer.doAcquireSharedNanos(AbstractQueuedSynchronizer.java:1037)
at java.util.concurrent.locks.AbstractQueuedSynchronizer.tryAcquireSharedNanos(AbstractQueuedSynchronizer.java:1328)
at java.util.concurrent.CountDownLatch.await(CountDownLatch.java:277)
at org.apache.tomcat.util.net.NioEndpoint$KeyAttachment.awaitLatch(NioEndpoint.java:1386)
at org.apache.tomcat.util.net.NioEndpoint$KeyAttachment.awaitReadLatch(NioEndpoint.java:1388)
at org.apache.tomcat.util.net.NioBlockingSelector.read(NioBlockingSelector.java:185)
at org.apache.tomcat.util.net.NioSelectorPool.read(NioSelectorPool.java:251)
at org.apache.tomcat.util.net.NioSelectorPool.read(NioSelectorPool.java:232)
at org.apache.coyote.http11.InternalNioInputBuffer.fill(InternalNioInputBuffer.java:133)
at org.apache.coyote.http11.InternalNioInputBuffer$SocketInputBuffer.doRead(InternalNioInputBuffer.java:177)
at org.apache.coyote.http11.filters.IdentityInputFilter.doRead(IdentityInputFilter.java:110)
at org.apache.coyote.http11.AbstractInputBuffer.doRead(AbstractInputBuffer.java:416)
at org.apache.coyote.Request.doRead(Request.java:469)
at org.apache.catalina.connector.InputBuffer.realReadBytes(InputBuffer.java:342)
at org.apache.tomcat.util.buf.ByteChunk.substract(ByteChunk.java:395)
at org.apache.catalina.connector.InputBuffer.read(InputBuffer.java:367)
at org.apache.catalina.connector.CoyoteInputStream.read(CoyoteInputStream.java:190)
...
Varnish has parameter sess_timeout(docs here), by default it is set to 5 seconds. Which means that after 5 seconds the session will be closed, and next page load will require will require extra 100ms (in average) to connect to server (I've described this issue here).
Why this parameter is so low by default?
If I increase it to 60 seconds, will it cause any problems on the server?
Does it matter what do I use behind the Varnish - nginx or apache? Or varnish optimizes the connections by itself?
What's the recommended value for average website (e.g. Magento store with 500 active users at a time)?
sess_timeout is tuned to avoid keeping state around when it is not needed. Worker threads are (in high traffic situations) a precious resource, and having one waiting around doing nothing isn't productive.
For all HTTP clients I know, manual netcat/telnet excluded, it does not take 5s to push through the 100-150 byte long HTTP request.
You can safely increase this to 60s if you feel you need to. If you are using this for long-running connections, you should probably use return(pipe) instead; different timers apply there.
I am traying to reduce memory usage by Apache on the server.
My actual Max Connections Per Child is 10k
According to the following recommendation
the Max Connections Per Child should be reduced to 1000
http://www.lophost.com/tutorials/how-to-reduce-high-memory-usage-by-apache-httpd-on-a-cpanel-server/
What is the recommended max value for Max Connections Per Child in Apache configuration?
The only time when this directive affects anything is when your Apache workers are leaking memory. One way this happens is that memory is allocated (via malloc() or whatever) and never freed. It's the result of design/implementation flaws in Apache or its modules.
This directive is somewhat of a hack, really -- but if there's some module that's loaded into Apache that leaks, say, 8 bytes every request, then after a lot of requests, you'll run out of memory. So the quick fix is to just kill the process every MaxConnectionsPerChild requests and start a new one.
This will only affect your memory usage if you see it gradually increase over the span of lots of requests when setting MaxConnectionsPerChild to zero.
The default is 0 (which implies no maximum connections per child) so unless you have memory leakage I'm unaware of any need to change this setting - I agree with Hut8.
Sharing here FYI from the Apache 2.4 Performance Tuning page:
Related to process creation is process death induced by the MaxConnectionsPerChild setting. By default this is 0, which means that there is no limit to the number of connections handled per child. If your configuration currently has this set to some very low number, such as 30, you may want to bump this up significantly. If you are running SunOS or an old version of Solaris, limit this to 10000 or so because of memory leaks.
And from the Apache 2.4 docs on MaxConnectionsPerChild:
Setting MaxConnectionsPerChild to a non-zero value limits the amount of memory that process can consume by (accidental) memory leakage.
I'm pretty new to all of this but I'm OCD about optimization.
I'm trying to optimize my web server running a LEMP setup for wordpress.
I'm using WP hypercache instead of w3 total cache as it seems to perform phenomenaly in comparison with my setup
I'm using blitz.io to test and throw 450 users at the domain for 60 seconds starting with the full 450.
This is my results:
Spike at 5 sec is errors and timeouts
http://i.imgur.com/CdpBz.png
htop during the spike:
http://i.imgur.com/OhEyS.png
It's a vps w/ 2 cpu at 2.5Ghz and 2.5GB memory, as you can see memory usage is low.
nginx: worker_processes 1; worker_connections 1024;
php-fpm: dynamic, pm.max_children = 10, pm.start_servers = 2, pm.max_spare_servers = 2, ;pm.max_requests = 500 default value = 0
I've increased the nginx worker_processes to 2 with no change, and I've messed with my php-fpm settings with no change. Any ideas what I should be looking at?
This does not look too bad. ~40 timeouts out of 19k requests is normal. I got similar results.
As for tuning:
look into http://wiki.nginx.org/HttpFastcgiModule#fastcgi_cache - using this avoids touching php at all and nginx does all the caching. You can also look at batcache (http://evansolomon.me/notes/faster-wordpress-multisite-nginx-batcache/)
look into apc/memcached for object caching. this makes the non-cached requests faster and also the backend is more responsive. apc also reduces the memory footprint of php. For day to day use this makes more of a difference. This also helps if a lot of your requests are not cacheable (e.g. lot's of new comments).
consider using php5.4 it's a lot of faster and requires less memory
enable the mysql query cache. http://mysqltuner.com is a nice little script to configure your server.
Measuring peak transfers is not a good indicator for scalability most of the time. real users behave probably different.
edit: try blitz.io on a static nginx page. If there are still timeouts the problem is probably at blitz.io or somehwere else. Also activate gzip compression for your pages.
The only time we notice this value appears to be when the service crashes because the value is too low. The quick way to fix this is to set it to some very large number. Then no problem.
What I was wondering about is are there any negative consiquences to setting this value high?
I can see that it can potentially give some protection from a denial of service attack, but does it have any other function?
It helps limit the strain on your WCF server. If you allow 1'000 connections, and each connection is allowed to send you 1 MB of data - you potentially need 1 GB of RAM in your server - or a lot of swapping / trashing might occur.
The limit on the message size (and the limit on the concurrent connections / calls) helps keep that RAM usage (and also CPU usage) to a manageable level.
It also allows you to scale, depending on your server. If you have a one-core CPU and 4 GB or RAM, you probably won't be able to handle quite as much traffic as if you have a 16-way CPU and 32 GB of RAM or more. With the various settings, including the MaxReceivedMessageSize, you can tweak your WCF environment to the capabilities of your underlying hardware.
And of course, as you already mention: many settings in WCF are kept OFF or set to a low value specifically to thwart malicious users from flooding your server with DoS attacks and shutting it down.