Glassfish v3 cache and Varnish cache

I am looking into ways to speed up my site, which serves a decent number of small images at a time (so my site primarily does I/O reads). I use Glassfish v3, and I have configured it to cache static resources. Is it sufficient to just use the GF cache? Will Varnish give me a significant improvement over the GF cache? Does GF work well with Varnish?

Varnish may be a better option; rather than trying to manage its own set of files in memory and on disk, it works with the underlying caching system of the OS instead of fighting against it, which is why it often outperforms other caching technologies such as Squid.
I've found Varnish to be very simple to set up and have used it in the past to help a number of client sites survive "slashdottings".
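One rough way to see whether a caching layer such as Varnish is actually answering requests is to look at the response headers. The sketch below is a heuristic added for illustration, not something from the answer above: it treats a positive `Age` header (a standard HTTP header that caches set) as evidence of a cache hit. The function name is made up.

```python
def served_from_cache(headers):
    """Heuristic: a positive Age header suggests a cache answered.

    `headers` is a plain dict of response header names to string values.
    A missing header or an Age of 0 is treated as "not known to be cached".
    """
    normalized = {name.lower(): value for name, value in headers.items()}
    try:
        return int(normalized.get("age", "0")) > 0
    except ValueError:
        # Malformed Age value: do not guess.
        return False
```

A cache hit within the first second can still report an Age of 0, so treat a negative result as "unknown" rather than "not cached".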

Varnish is tested against Apache Traffic Server (also a cache server), Nginx and Lighttpd here:
http://nbonvin.wordpress.com/2011/03/24/serving-small-static-files-which-server-to-use/
The charts show CPU and memory consumption as well as performance.

Related

Apache or nginx? I would like to understand the basic workflow of Nginx and its advantages and disadvantages

What are the pros and cons of Apache versus nginx, and how do they work internally to maximize resource utilization?
Can I use Apache and Nginx together? If I use only Nginx, what problems might I face?
Apache has some disadvantages, especially when it is used with the PHP module.
Apache's process model is such that each connection uses a separate process. Each process carries all the overhead of PHP and any other modules you may have loaded with it. An Apache process might run a PHP script or serve static content for one request. If the PHP has a memory leak (which does happen sometimes), the process continues to grow in size. Also, when KeepAlive is enabled, which is usually recommended, that process stays alive for a few seconds after the connection, consuming a "slot" that another client might be able to use and helping the server to reach its MaxClients sooner.
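The effect of KeepAlive on MaxClients can be estimated with Little's law (slots in use is roughly the arrival rate times the time each connection holds a slot). This is a back-of-the-envelope sketch; the function name is invented and the traffic numbers in the note below are hypothetical, not measurements.

```python
def busy_slots(new_connections_per_sec, service_time_sec, keepalive_timeout_sec):
    """Estimate how many Apache process slots are occupied on average.

    Assumes every connection holds its process for the time spent serving
    requests plus the full idle KeepAlive timeout (a worst-case simplification).
    """
    hold_time = service_time_sec + keepalive_timeout_sec
    return new_connections_per_sec * hold_time
```

With a hypothetical 50 new connections per second, 0.2 s of work per connection and a 5 s KeepAlive timeout, `busy_slots(50, 0.2, 5)` comes to about 260 occupied slots, against about 10 for `busy_slots(50, 0.2, 0)`.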
Nginx is an alternative webserver that normally uses the Linux "epoll" API to process requests in a non-blocking mode. This means that one single process can handle many simultaneous connections. Epoll is an efficient way to tell the single process which connection(s) it needs to deal with and which can wait. Nginx has a goal of solving the "C10k" problem - how to have 10,000 concurrent connections.
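The single-process, non-blocking model can be illustrated with Python's standard `selectors` module, which uses epoll on Linux when it is available. This is only a toy sketch of the idea, not how nginx itself is implemented; the function name and the uppercase-echo behaviour are invented for the example.

```python
import selectors
import socket


def run_event_loop(sel, max_messages):
    """Serve many sockets from one process; return the number of messages handled.

    Sockets registered with data="listener" are accepted from; all others are
    treated as client connections whose data is echoed back in uppercase.
    """
    handled = 0
    while handled < max_messages:
        events = sel.select(timeout=1.0)
        if not events:
            break  # nothing is ready; stop the sketch instead of spinning
        for key, _mask in events:
            if key.data == "listener":
                conn, _addr = key.fileobj.accept()
                conn.setblocking(False)
                sel.register(conn, selectors.EVENT_READ, data="client")
            else:
                chunk = key.fileobj.recv(4096)
                if chunk:
                    # Small replies fit in the socket buffer; a real server
                    # would also wait for writability before sending.
                    key.fileobj.sendall(chunk.upper())
                    handled += 1
                else:
                    # Empty read means the peer closed the connection.
                    sel.unregister(key.fileobj)
                    key.fileobj.close()
    return handled
```

The loop only touches connections the kernel reports as ready, which is why idle connections cost almost nothing in this model.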
This naturally goes hand in hand with php-fpm, the FastCGI Process Manager. Nginx itself does not have PHP built-in. When it receives a request for a PHP script, it makes a call out to php-fpm to run the script, which then returns the result to nginx, which returns it to the client.
This all uses a lot less memory than a similar Apache+mod_php configuration.
There are a couple more huge advantages of php-fpm over mod_php:
It uses different "pools", each of which can run as a separate Linux user. This provides a simple and effective way of isolating websites (for example, if they are run by different customers who should not read each other's code) without the overhead or nastiness of suexec or suphp.
It has a slow log feature where it can dump a PHP stack trace of any script that has been running for greater than X seconds. This can help diagnose slow code issues.
Php-fpm can be run with Apache, and in fact this allows you to take advantage of Apache's more efficient Worker MPM (or Event in Apache 2.4). However, my experience is that configuring it in Apache is significantly more complex than configuring it in nginx, and even with Worker, it still is not quite as efficient as nginx.
Disadvantages of moving to nginx - not many, but things to keep in mind:
It does not support .htaccess files. I think this is a good thing personally as .htaccess files must be parsed by Apache for every request, which can cause significant overhead.
Configuration files need to be re-written. If you have many complex site configurations, this could take some doing. For simple cases it is not usually a big deal.
Features of Nginx
Nginx is fast because it does not need to create a new process for each new request.
HTTP proxy and Web server features
Ability to handle more than 10,000 simultaneous connections with a low memory footprint (~2.5 MB per 10k inactive HTTP keep-alive connections)
Handling of static files, index files, and auto-indexing
Reverse proxy with caching
Load balancing with in-band health checks
Fault tolerance
Nginx uses very little memory, especially for static Web pages.
FastCGI, SCGI, uWSGI support with caching
Name- and IP address-based virtual servers
IPv6-compatible
SPDY protocol support
FLV and MP4 streaming
Web page access authentication
gzip compression and decompression
URL rewriting with its own rewrite engine
Custom logging with on-the-fly gzip compression
Response rate and concurrent requests limiting
Bandwidth throttling
Server Side Includes
IP address-based geolocation
User tracking
WebDAV
XSLT data processing
Embedded Perl scripting
Nginx is highly scalable and performs well even on modest hardware.
With only Nginx, you lose a whole bunch of Apache-specific features such as all the mod_dav stuff. Effectively, you lose a lot of modules.
Conclusion
The best use for nginx is in front of Apache if you need Apache modules. Use it as a load-balancer if you might, between multiple Apache instances, and you suddenly have a mixed set-up that is rather

Low latency web server/load balancer for the non-Twitters of the world

Apache httpd has done me well over the years, just rock solid and highly performant in a legacy custom LAMP stack application I've been maintaining (read: trying to escape from).
My LAMP stack days are now numbered and I am moving on to the wonderful world of polyglot:
1) Scala REST framework on Jetty 8 (on the fence between Spray & Scalatra)
2) Load balancer/Static file server: Apache Httpd, Nginx, or ?
3) MySQL via ScalaQuery
4) Client-side: jQuery, Backbone, 320 & up or Twitter Bootstrap
Option #2 is the focus of this question. The benchmarks I have seen indicate that Nginx, Lighttpd, G-WAN (in particular) and friends blow away Apache in terms of performance, but this blowing away appears to manifest more in high-load scenarios where the web server is handling many simultaneous connections. Given that our server does at most 100 GB of bandwidth per month and average load is around 0.10, the high-load scenario is clearly not at play.
Basically I need the connection to the application server (Jetty) and static file delivery by the web server to be both reliable and fast. Finally, the web server should do double duty as a load balancer for the application server (SSL not required, server lives behind an ASA). I am not sure how fast Apache Httpd is compared to the alternatives, but it's proven, road warrior tested software.
So, if I roll with Nginx or other Apache alternative, will there be any difference whatsoever in terms of visible performance? I assume not, but in the interest of achieving near instant page loads, putting the question out there ;-)
"if I roll with Nginx or other Apache alternative, will there be any difference whatsoever in terms of visible performance?"
Yes, mostly in terms of latency.
According to Google (who might know a thing or two about latency), latency is important for the user experience, for high search-engine rankings, and for surviving high loads (success, script kiddies, real attacks, etc.).
But scaling on multicore and/or using less RAM and CPU resources cannot hurt - and that's the purpose of these Web server alternatives.
"The benchmarks I have seen indicate that Nginx, Lighttpd, G-WAN (in particular) and friends blow away Apache in terms of performance, but this blowing away appears to manifest more in high-load scenarios where the web server is handling many simultaneous connections"
The benchmarks show that even at low numbers of clients, some servers are faster than others; one comparison covers Apache 2.4, Nginx, Lighttpd, Varnish, Litespeed, Cherokee and G-WAN.
Since the tests were run by someone independent of the authors of those servers (using virtualization with 1, 2, 4 and 8 CPU cores), they have clear value.
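Published benchmarks aside, latency on your own setup is easy to sample. The following is a minimal sketch using only the Python standard library; the function names are invented, requests are sequential (so it measures low-load latency, not throughput), and any URL you pass in is your own choice.

```python
import statistics
import time
import urllib.request


def time_requests(url, count):
    """Fetch `url` `count` times sequentially; return latencies in milliseconds."""
    samples = []
    for _ in range(count):
        start = time.perf_counter()
        with urllib.request.urlopen(url) as response:
            response.read()  # include the body transfer in the timing
        samples.append((time.perf_counter() - start) * 1000.0)
    return samples


def summarize(samples_ms):
    """Return (median, worst) latency so two servers can be compared."""
    return statistics.median(samples_ms), max(samples_ms)
```

Running `summarize(time_requests(url, 50))` against the same static file on each candidate server gives a like-for-like comparison at the load level this question is about.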
There will be a massive difference. Nginx wipes the floor with Apache for anything over zero concurrent users. That's assuming you properly configure everything. Check out the following links for some help diving into it.
http://wiki.nginx.org/Main
http://michael.lustfield.net/content/dummies-guide-nginx
http://blog.martinfjordvald.com/2010/07/nginx-primer/
You'll see improvements in terms of requests/second but you'll also see significantly less RAM and CPU usage. One thing I like is the greater control over what's going on with a simpler configuration.
Apache claimed that Apache 2.4 would offer performance as good as or better than nginx. It was a bold claim calling out nginx, and when they made that release it kinda bit them in the ass. They're closer, sure, but nginx still wipes the floor in almost every single benchmark.

Apache resource usage vs Mongoose or other lightweight web server

How much memory and/or other resources does Apache web server use?
How much more efficient are lightweight servers?
Say, Apache vs. Mongoose Web Server.
Neil Butterworth you out there?
Thanks.
Yes, lightweight servers are more efficient with memory and resources, as the term 'lightweight' would indicate. nginx is a popular one.
Apache's memory and resource usage depends a lot on what you're doing with it - which modules are loaded, what your PHP etc. scripts are doing. There's no single answer.
You have to take into account your specific task, and also the fact that almost every web server has some sort of specialization (a niche).
Apache is configurable and stable.
nginx is extremely fast, but serves only static content itself (dynamic content has to be passed to a backend such as FastCGI).
lighttpd is small, fast and does both static and dynamic content.
Mongoose is embeddable, small and easy to use.
There are many more web servers; I won't go through the whole list here. You need to decide which features you require for your task, and make a choice accordingly.
Apache Httpd is great if you need lots of flexibility that is provided via various mods. If you're looking for straight-up file serving or proxying, then some lightweight options might be better. I manage the Maven Central repo that gets millions of hits a day and I have some experience with Nginx.

Apache and the c10k

How does Apache fare with the C10k problem under normal conditions?
Say, while running very small scripts with little data, or do I need to scale out if I use Apache?
In the background, heavy lifting is done by a few servers running specialized software that processes the requests, but I'd like to use Apache as a front end. Is this a viable plan?
I consider Apache to be more of an origin server - running something like mod_php or mod_perl to generate the content and being smart about routing to the appropriate system.
If you are getting thousands of concurrent hits to the front of your site, with a mix of types of data (static and dynamic) being returned, you may find it useful to put a more optimised system in front of it though.
The classic post-optimisation problem with Apache isn't generating the dynamic content (or at least, that can be optimised for early in the process), but simply waiting for a slow client to be able to receive the bytes that are being sent. It can therefore be a significant advantage to put a reverse proxy, in the form of Squid or Nginx, in front of the servers to take over the 'spoon-feeding' of the slow network clients, while allowing the content production to happen at full speed, and at local network speeds - 100Mb/sec or even gigabit speeds - if it even has to traverse a network at all.
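The spoon-feeding cost is easy to put numbers on. This sketch simply divides response size by link speed, ignoring latency and protocol overhead; the function name and the figures in the note below are hypothetical, apart from the 100Mb/sec LAN speed mentioned above.

```python
def transfer_seconds(response_bytes, link_bits_per_sec):
    """Time to push a response over a link, ignoring latency and protocol overhead."""
    return response_bytes * 8 / link_bits_per_sec
```

For a hypothetical 100 KB response, `transfer_seconds(100_000, 1_000_000)` is 0.8 s for a client on a 1 Mbit/s link, while `transfer_seconds(100_000, 100_000_000)` is 0.008 s on a 100 Mbit/s local network. With a reverse proxy in front, the Apache process is tied up for the shorter time and the proxy absorbs the rest.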
I'm assuming you've probably seen this data, but if not, it might give you some idea.
Guys, imagine that you are running a web server with 10K simultaneous connections. How could that happen?
You've got a great many connections per second
Dynamic content
Are you sure that your CPU can handle that many PHP sessions, for example? I guess not, so why are you thinking about the C10K problem? :D
Static content - small files
And still so many connections? On a single server? Probably you've got problems with networking/throughput too, or you are a future competitor of Google. Use lighttpd, which addresses the C10K problem and is stable - fly light. Using Apache for only static files for large sites is obvious.
Your clients are downloading large files for a long time - static content
ISO images, archives etc
If you are doing it via web server - FTP may be more appropriate.
Video streaming
Use lighttpd or specialized software. And still... What about other resources?
I am using Linux Virtual Server as load balancer in front of apache servers (with specific patches for LVS-NAT) and I am happy :) This string is an answer you want to hear.

What performance improvement will I get by moving to lighttpd from Apache?

I currently have a cluster of 4 Apache web servers which are used to serve up static files of up to 30Mb in size. Generally, I can expect up to 5000 concurrent connections to these servers. What performance improvement would I expect to get by moving this to lighttpd?
I would expect it to handle the concurrency with much more ease and less memory overhead. I've stopped deploying Apache pretty much everywhere I can.
You may also consider nginx for a comparison.
If you are using Apache with the worker or event MPM you probably won't see much of a difference. If you haven't moved to using them I would give that a try. There isn't really any problem with lighttpd either, though. I think today it is just a matter of picking one and going with it.
If I were serving that type of file I would push it out to a CDN and not have to worry about it. There are plenty of cheap ones now like CacheFly and Amazon's CloudFront.
Off the top of my head:
Smaller memory footprint
Quicker file reads
Definitely check out the benchmark at their site, they provide a lot of information on this topic: http://www.lighttpd.net/benchmark