How much memory and/or other resources does Apache web server use?
How much more are lightweight servers efficient?
Say appache vs. Mongoose Web Server
Neil Butterworth you out there?
Thanks.
Yes, lightweight servers are more efficient with memory and resources, as the term 'lightweight' would indicate. nginx is a popular one.
Apache's memory and resource usage depends a lot on what you're doing with it - which modules are loaded, what your PHP etc. scripts are doing. There's no single answer.
You have to take into account your specific task, and also the fact that almost every web server has some sort of specialization (a niche).
Apache is configurable and stable.
nginx is extremely fast, but works only with static context.
lighttpd is small, fast and does both static and dynamic context.
Mongoose is embeddable, small and easy to use.
There are many more web servers, I won't go through the whole list here. You need to decide which features do you require for your task, and make a choice accordingly.
Apache Httpd is great if you need lots of flexibility that is provided via various mods. If you're looking for straight-up file serving or proxying, then some lightweight options might be better. I manage the Maven Central repo that gets millions of hits a day and I have some experience with Nginx.
Related
I am working on a project which has following requirements:
Perform sticky based load balancing(based on SOAP session ID) onto multiple backend servers.
Possibility to plugin my own custom based load balancer.
Easy to write and deploy.
A central configuration file(Possibly an XML), to take care of all the backend servers.
Easy extraction of a node from this configuration file(Possibly with xpath).
I tried working with camel for a while but, wasn't able to do perform certain task with it.
So thought of giving a try to Akka.
Will akka be possibly able to satisfy the above requirements?
If so is there a load balancing example in akka or proxy example?
Would really appreciate some feedbacks.
You can do everything you've described with Akka.
You don't mention what language you're working with, Scala or Java. I've included links to the Scala documentation.
Before you do anything with Akka you HAVE TO read the documentation and understand how Akka works.
http://doc.akka.io/docs/akka/2.0.3/
Doing so, you'll find Akka is perfect for the project you've described with some minor caveats.
Once you read the documentation the following answers should make a lot of sense.
Perform sticky based load balancing(based on SOAP session ID) onto multiple backend servers.
Load balancing is already part of the framework (it's called Routing in Akka http://doc.akka.io/docs/akka/2.0.3/scala/routing.html) and Remoting (http://doc.akka.io/docs/akka/2.0.3/scala/remoting.html) will take care of the backend servers. You can easily combine the two.
To my knowledge the idea of sticky load balancing is not a part of Akka but I can envision this being accomplished with a Map using the session ID as the key and the Actor name (or path) as the value. A quick actorFor will take care of the rest. Not well thought out but should give you a good idea of where to start.
Possibility to plugin my own custom based load balancer.
Refer to the Routing documentation.
Easy to write and deploy.
This depends on your aptitude and effort but after you read certain parts of the documentation you should be build a proof of concept in a couple of hours.
Deployment can be a bit frustrating mostly because the documentation isn't really great with respect to deploying Akka networks with remote components. However, there are enough examples on the web that you can figure out how to get it done...eventually. Once you do it once it's no big deal.
A central configuration file(Possibly an XML), to take care of all the backend servers.
Akka uses Typesafe Config (https://github.com/typesafehub/config) which is a lot easier to work with than XML (but I hate XML so take that with a grain of salt). As far as a central configuration, I'm not sure what you're trying to accomplish but it sounds like something that can be solved using remote actor creation. Again, see the Remoting documentation.
Easy extraction of a node from this configuration file(Possibly with xpath).
Akka provides a lookup method .actorFor. There's no need to go to the configuration file once the system is up and running.
If so is there a load balancing example in akka or proxy example?
Google is your friend.
Apache httpd has done me well over the years, just rock solid and highly performant in a legacy custom LAMP stack application I've been maintaining (read: trying to escape from)
My LAMP stack days are now numbered and am moving on to the wonderful world of polyglot:
1) Scala REST framework on Jetty 8 (on the fence between Spray & Scalatra)
2) Load balancer/Static file server: Apache Httpd, Nginx, or ?
3) MySQL via ScalaQuery
4) Client-side: jQuery, Backbone, 320 & up or Twitter Bootstrap
Option #2 is the focus of this question. The benchmarks I have seen indicate that Nginx, Lighthttpd, G-WAN (in particular) and friends blow away Apache in terms of performance, but this blowing away appears to manifest more in high-load scenarios where the web server is handling many simultaneous connections. Given that our server does max 100gb bandwidth per month and average load is around 0.10, the high-load scenario is clearly not at play.
Basically I need the connection to the application server (Jetty) and static file delivery by the web server to be both reliable and fast. Finally, the web server should double duty as a load balancer for the application server (SSL not required, server lives behind an ASA). I am not sure how fast Apache Httpd is compared to the alternatives, but it's proven, road warrior tested software.
So, if I roll with Nginx or other Apache alternative, will there be any difference whatsoever in terms of visible performance? I assume not, but in the interest of achieving near instant page loads, putting the question out there ;-)
if I roll with Nginx or other Apache alternative, will there be any difference whatsoever in terms of visible performance?
Yes, mostly in terms of latency.
According to Google (who might know a thing or tow about latency), latency is important both for the user experience, high search-engine rankings, and to survive high loads (success, script kiddies, real attacks, etc.).
But scaling on multicore and/or using less RAM and CPU resources cannot hurt - and that's the purpose of these Web server alternatives.
The benchmarks I have seen indicate that Nginx, Lighthttpd, G-WAN (in particular) and friends blow away Apache in terms of performance, but this blowing away appears to manifest more in high-load scenarios where the web server is handling many simultaneous connections
The benchmarks show that even at low numbers of clients, some servers are faster than others: here are compared Apache 2.4, Nginx, Lighttpd, Varnish, Litespeed, Cherokee and G-WAN.
Since this test has been made by someone independent from the authors of those servers, these tests (made with virtualization and 1,2,4,8 CPU Cores) have clear value.
There will be a massive difference. Nginx wipes the floor with Apache for anything over zero concurrent users. That's assuming you properly configure everything. Check out the following links for some help diving into it.
http://wiki.nginx.org/Main
http://michael.lustfield.net/content/dummies-guide-nginx
http://blog.martinfjordvald.com/2010/07/nginx-primer/
You'll see improvements in terms of requests/second but you'll also see significantly less RAM and CPU usage. One thing I like is the greater control over what's going on with a more simple configuration.
Apache made a claim that apache 2.4 will offer performance as good or better than nginx. They made a bold claim calling out nginx and when they made that release it kinda bit them in the ass. They're closer, sure, but nginx still wipes the floor in almost every single benchmark.
We would like to evaluate whether the SVN protocol works better for our team than HTTP, but we don't want to commit to a full switch just yet.
Right now we have an Apache sever serving up our main repository. Can we safely use svnserve.exe to with the same repository so that a few of our developers can test it? My initial guess is that we can, but we don't want to risk corrupting our repository.
Yes, it's possible. The official SVN book has a chapter devoted to this situation:
http://svnbook.red-bean.com/en/1.5/svn.serverconfig.multimethod.html . There are some pitfalls but they have more to do with permission settings.
Exactly, Subversion is designed to support concurrent access via multiple protocols, something which causes major problems with CVS. Not only can you use http:// and svn://, but also file:// (if you happen to be working locally on the machine, for example with a continuous integration tool or other post-commit hook) https://, svn+ssh://, etc.
In my experience, one method hasn't proven to be objectively "better" than the other, but there are certain benefits to each. For example, Apache is extremely adept at handling lots of accesses at one. On the other hand, if you're not already using Apache, or don't want to make it handle SVN traffic, the svnserve daemon is lightweight and quite performant. On my Macs, I set up svnserve using launchd to start up only when a request comes in, so it doesn't use any resources when there is no repository activity. What works best will largely be a factor of the access patterns you see in practice.
How is Apache in respect to handling the c10k problem under normal conditions ?
Say while running very small scripts with little data, or do I need to scale out if I use Apache?
In the background heavy lifting is done by a few servers running specialized software that processes the requests but I'd like to use Apache as a front. Is this a viable plan?
I consider Apache to be more of an origin server - running something like mod_php or mod_perl to generate the content and being smart about routing to the appropriate system.
If you are getting thousands of concurrent hits to the front of your site, with a mix of types of data (static and dynamic) being returned, you may find it useful to put a more optimised system in front of it though.
The classic post-optimisation problem with Apache isn't generating the dynamic content (or at least, that can be optimised for early in the process), but simply waiting for a slow client to be able to receive the bytes that are being sent. It can therefore be a significant advantage to put a reverse proxy, in the form of Squid or Nginx, in front of the servers to take over the 'spoon-feeding' of the slow network clients, while allowing the content production to happen at full speed, and at local network speeds - 100Mb/sec or even gigabit speeds - if it even has to traverse a network at all.
I'm assuming you've probably seen this data, but if not, it might give you some idea.
Guys, imagine that you are running web server with 10K connections (simultaneous). How could it be?
You've got many many connections per second
Dynamic content
Are you sure that your CPU can handle that many PHP sessions for example? I guess no, so why are you thinking about C10K problem? :D
Static content - small files
And still soo many connections? On single server? Probably you've got problems with networking/throughput too or you are future competitor of Google. Use lighttpd which addresses C10K problem and is stable - fly light. Using Apache for only static files for large sites is obvious.
Your clients are downloading large files for a large time - static content
ISO images, archives etc
If you are doing it via web server - FTP may be more appropriate.
Video streaming
Use lighttpd or specialized software. And still... What about other resources?
I am using Linux Virtual Server as load balancer in front of apache servers (with specific patches for LVS-NAT) and I am happy :) This string is an answer you want to hear.
I currently have a cluster of 4 Apache web servers which are used to serve up static files of up to 30Mb in size. Generally, I can expect up to 5000 concurrent connections to these servers. What performance improvement would I expect to get by moving this to lighttpd?
I would expect it to handle the concurrency with much more ease and less memory overhead. I've stopped deploying Apache pretty much everywhere I can.
You may also consider nginx for a comparison.
If you are using Apache with MPM with worker or event you probably won't see much of a difference. If you haven't moved to using them I would give that a try. There isn't really any problem with lighttpd though either. I think today it is just a matter of picking one and going with it.
If I where serving that type of file I would push it out to a CDN and not have to worry about it. There are plenty of cheap ones now like CacheFly and Amazon's Cloudfront.
From the top of my head:
Smaller memory footprint
Quicker file reads
Definitely check out the benchmark at their site, they provide a lot of information on this topic: http://www.lighttpd.net/benchmark