Server configuration for high traffic website - apache

I'm managing a hosting server and one of my customers will launch a high traffic PHP website. It's a penny auction website and we expect between 25k and 30k visitors per day.
Can you tell me please what should I change in my server configuration (PHP and Apache) to avoid problems? I'm afraid that the server crash with a large number of visitors.
Thank you

Using a lighter web server like nginx as a reverse proxy and a static content server should keep the Apache memory and CPU usage to a minimum which will be a problem on larger sites.
APC as an opcode cache will also be useful in a large site because compiling the PHP scripts to opcode is expensive.
Which Apache forking model are you using for the server? Event and Worker MPM's will probably work better for larger sites with higher concurrent connections.
How is PHP setup within Apache, i.e. FastCGI/CGI/DSO/SuPHP/FPM? SuPHP will be slowest while FastCGI, FPM and DSO will give you much better performance and allow you to use opcode caches.
If you don't need SSL support on the site a free service like https://www.cloudflare.com/ will also lessen the load on your servers.

You could put an opcode cache into use, eAccelerator is a good one for this purpose.
You may also want to consider creating Apache vHosts for static content like images/CSS/javascript to be served from. If these can be put into a CDN, then even better.
There are other tools available for benchmarking, including the Apache benchmarking tool "ab". You can use this to stress-test your site.
There are several areas in which tuning can take place, not just PHP.

Related

Lighttpd instead of Apache

No doubt Apache is the most popular web server to use with PHP and definately it works great. However I'm curious to know what are advantages (if any) to use Lighttpd instead of Apache.
Thanks.
Theoretically, because of a smaller footprint Lighttpd should allow more users to visit site at the same time using exactly the same resources as Apache would.
As example (just to prove the point, this is not the real numbers)
On the same hardware Apache would allow 100 users to view your page at the same time,
while Lighttpd would allow 150.
Lighttpd also has a different scheme of mapping processes, so it would serve better when the number of visitors is spiking.
Every server, and webpage is written differently, so it is very hard to predict how each of these servers would perform on :
a) your specific hardware, it is good to contact your hosting company and ask what they advice to use on their hardware
b) your software, Plesk of CPanel would perform differently than clean Apache of Lighttpd installation
c) Your site content, site with a lot of pictures has different fingerprint than server which serves video.
d) Your processor cores
They claim to scale better as their main advantage http://www.lighttpd.net/benchmark/

nginx/apache/php vs nginx/php

I currently have one server with nginx that reverse_proxy to apache (same server) for processing php requests. I'm wondering if I drop apache so I'd run nginx/fastcgi to php if I'd see any sort of performance increases. I'm assuming I would since Apache's pretty bloated up, but at the same time I'm not sure how reliable fastcgi/php is especially in high traffic situations.
My sites gets around 200,000 unique visitors a month, with around 6,000,000 page crawls from the search engines monthly. This number is steadily increasing so I'm looking at perfomrance options.
My site is very optimized code wise and there isn't any caching (don't want that either), each page has a max of 2 sql queries without any joins on other tables, indexes are perfect as well.
In a year or so I'll be rewriting everything to use ClearSilver for the templates, and then probably use python or else c++ for extreme performance.
I suppose I'm more or less looking for any advice from anyone who is familiar with nginx/fastcgi and if willing to provide some benchmarks. My sites are one server with 1 quad core xeon, 8gb ram, 150gb velociraptor drive.
nginx will definitely work faster than Apache. I can't tell about fastcgi since I never used it with nginx but this solution seems to make more sense on several servers (one for static contents and one for fastcgi/PHP).
If you are really targeting performance -and even consider C/C++- then you should give a try to G-WAN, an all-in-one server which provides (very fast) C scripts.
Not only G-WAN has a ridiculously small memory footprint (120 KB) but it scales like nothing else. There's work ahead of you if you migrate from PHP, but you can start with the performance-critical tasks and migrate progressively.
We have made the jump and cannot consider to go back to Apache!
Here is a chart showing the respective performances of nginx, apache and g-wan:
g-wan.com/imgs/gwan-lighttpd-nginx-cherokee.png
apache does not seem to lead the pack (and that's a -Quad XEON # 3GHz).
Here is an independent benchmark for g-wan vs nginx, varnish and others http://nbonvin.wordpress.com/2011/03/14/apache-vs-nginx-vs-varnish-vs-gwan/
g-wan handles much more requests per second with much less CPU time.
NGINX is the best choice as a webserver now a days.
The main difference between Apache and NGINX lies in their design
architecture. Apache uses a process-driven approach and creates a
new thread for each request. Whereas NGINX uses an event-driven
architecture to handle multiple requests within one thread.
As far as Static content is concerned, Nginx overpasses Apache.
Both are great at processing Dynamic content.
Apache runs on all operating systems such as UNIX, Linux or BSD and
has full support for Microsoft Windows & NGINX also runs on several
modern Unix-like systems and has support for Windows, but its
performance on Windows is not as stable as that on UNIX platforms.
Apache allows additional configuration on a per-directory basis via
.htaccess files. Where Nginx doesn’t allow additional configuration.
Request Interpretation-Apache pass file System location. Nginx
Passes URI to interpret requests.
Apache have 60 official dynamically loadable modules that can be
turned On/Off.Nginx have 3rd Party core modules (not dynamically
loadable).NGINX provides all of the core features of a web server,
without sacrificing the lightweight and high-performance qualities
that have made it successful.
Apache Supports customization of web server through dynamic modules.
Nginx is not flexible enough to support dynamic modules and loading.
Apache makes sure that all the website that runs on its server are
safe from any harm and hackers. Apache offers configuration tips for
DDoS attack handling, as well as the mod_evasive module for
responding to HTTP DoS, DDoS, or brute force attacks.
When Choose Apache over NGINX?
When needs .htaccess files, you can override system-wide settings on
a per-directory basis.
In a shared hosting environment, Apache works better because of its
.htaccess configuration.
In case of functionality limitations – use Apache
When Choose NGINX over Apache?
Fast Static Content Processing
Great for High Traffic Websites
When Use Both of them -Together
User can use Nginx in front of Apache as a server proxy.

At enterprise level, is Apache Tomcat used standalone or with Apache server?

Which one of these two is most commonly used scenario? I want to use the same scenario in my learning process. thanks.
Don't know about the rest of the industry, but where I work we have Apache HTTPD front-ending for Tomcat.
Any static content is directly provided by HTTPD for performance. Pain in the neck to separate every app out, but there is a noticeable payoff.
Also, HTTPD has some nice code for cookie handling, URL rewriting, clustering and so on.
Only if we determine that there's dynamic, database-bound data to show do we forward to Tomcat, which does an admirable job there.
Has been working well for us for almost a decade. Others too, I would wager.

Why use Apache over NGINX/Cherokee/Lighttpd?

Apache has been the de facto standard web server for over a decade, but recent years have brought us web servers that consume less RAM and handle many more requests per second using fewer threads and asynchronous i/o. In my opinion, I also find the configuration of these servers to be more straightforward and minimal.
Why do people use Apache when asynchronous servers are so much more lightweight? Is there any clear benefit?
Ubiquity, "good enough", and familiarity.
Apache's .htaccess provides flexible configuration. This allows users on a shared host to customize certain settings of an apache without having to alter the core apache configs.
It is the standard server bundled in typical LAMP setups, although, many services use other web servers for in conjunction (like static files, video streaming, etc.).
Since Apache is popular, it's easy to find a solution to any problems.
Other than that, other solutions would probably be better.
Apache IS asynchronous if you want it to be with the Event MPM. Unlike Nginx and Cherokee, etc., it is not the default.
Apache's made some important moves in 2.4 so it can be more competitive — esp. as it pertains to serving static requests using the Event MPM. Various benchmarks don't speak well of this, but:
It's very difficult to ascertain how much slower Apache is in 2.4 because Apache's out of the 'box' configuration is detrimental to performance and legacy holds it back in some respects. For example, .htaccess requires stat/reading a multitude of files on every request, which may include many rules and regexes. Nginx doesn't have this problem, nor does Cherokee. Litespeed has .htaccess support in Apache's style, but only if you pay for it. Most benchmarks don't turn off features like those.
Most of the benchmarks are also ridiculous in that they're run locally and over a GbE network or similar. A real web server has to cope well with various speeds, including 3G phones. It could be that Apache's performance is better in the real world.
I doubt it.
Nginx is still faster, and I might choose it, but Apache isn't asleep.

apache + lighttpd front-proxy concept

In order to lighten Apache's load people often suggest using lighttpd to serve up static content.
e.g. http://www.linux.com/feature/51673
In this setup Apache passes requests for static content back to lighttpd via mod_proxy, while serving dynamic requests itself.
My question is: how does this reduce the load on the server? Since you still have an apache process spawned for every request that comes in, how does this positively impact the load? From what I can see the size of the Apache process proxying its request through lighttpd is as large as it would be if it were serving the file itself.
Running Lighttpd behind Apache to serve static files certainly seems braindead to me. Apache still has to unpack the HTTP packets and parse the request through its parse tree, send proxy requests, and then Lighttpd has to re-unpack, hit the filesystem and send the files back through Apache. I've never heard of anyone using a setup like this in production.
What you will see, is people using a lightweight webserver like Nginx as a frontend server to serve static files and proxy dynamic URLs to Apache. Or, you can run Varnish or Squid as a caching reverse proxy frontend, so that all your high-traffic static files (i.e. images, CSS etc. and any dynamic pages you're willing to send cache-friendly headers for) are served out of memory.
Apache can also be optimized to serve static files -- so often when I hear people complain about Apache, they really don't know how to configure it. They've only ever used the prefork MPM (vs. threaded or worker) and have all sorts of modules enabled (usually they're running from a Linux distribution's kitchen-sink Apache package that builds everything as modules and defaults to enabling 10-20 modules or more). Tune Apache by turning off unneeded modules/stupid features like support for .htaccess (which makes Apache scan the filesystem on every request!) first. (You can also run two instances of Apache, with a "light" Apache as frontend that proxies to a "heavy" Apache for dynamic requests ... maybe your frontend is threaded but your backend is prefork because you have to run thread-unsafe external modules like mod_php.)
Re:
Since you still have an apache process
spawned for every request that comes
in, how does this positively impact
the load? From what I can see the size
of the Apache process proxying its
request through lighttpd is as large
as it would be if it were serving the
file itself.
If you're spawning processes on every request, then that means you're using the prefork MPM. Keep in mind that when the OS reports memory usage for each of these processes, not all that memory is wired, a lot of those processes are idle. And when you're talking about speed, you're concerned more with request parsing and internal code branches for a given request (how much processing is the server doing?) than with memory usage reported by the OS.
For example, if you enable something like mod_php, then each of those worker processes is going to instantly go up by about 20-40M (depending on what's enabled in your PHP interpreter), but that doesn't mean Apache is using that memory on static requests. Of course if you're optimizing your server for maximum concurrency on small static files, then enabling mod_php would still be very bad, you're not going to be able to fit nearly as many prefork processes into RAM.
I probably could come up with a "nightmare configuration" for Apache that would make it actually slower serving static files than proxying those requests to a backend Lighttpd, but it would involve enabling expensive features like .htaccess in Apache that are disabled in Lighttpd, so it wouldn't really be fair.
If you still have the power to serve static and dynamic content from the same machine (as they in your referenced article do), then I really see no point in that setup.
Maybe it does reduce the Load of Apache, because it doesn't have to do IO to the disk, but it will increase the Load of Lighttpd on the same machine and thus reducing the available load to apache ...
Maybe Lighttpd IO access is lighter, than that of Apache 1.3, but why not just switch to Apache 2 or Lighttpd completely? And if the performance really start to suck, host the static files on another machine (media.yourdomain.com).
I small introduction to how you can make a performant setup is found here:
Deploying Django -> scroll to Scaling some page before the end
I don't know much about internal workings of Apache, but one explanation I've seen is about memory pressure. In short, Apache tries to balance the memory it uses for caching and for dynamic pages; but usually ends up with too much cache and too little for apps. If you separate them to different processes, each one will optimize for the kind of load.
Currently, what I'm doing is using nginx as front end. It's really fast and light, and specifically designed as a frontend proxy; but also serves static files. In fact, since it can also call FastCGI processes, you could get rid of Apache and still get the benefits of split file/app processes. (and there's some extra memcached magic that looks absolutely genius)
(Yes, lighttpd can also be used as frontend to Apache and/or FastCGI)
You don't have an Apache process spawned for each request - static files (images and the like) are fetched directly by lighttpd.
Use Apache MPM Worker fastcgi this will lower you server memory usage. MPM worker serves static content better then Prefork and is nearly on par with lighttpd when it comes to static content.