Following Yahoo's performance team's advice, I decided to enable mod_deflate on Apache. When checking the results (using HttpWatch), the gzipped responses took on average about 100 milliseconds longer than the non-gzipped ones.
The server is under average load, using <5% of CPU, and the compression level is set to the minimum.
Have you experienced results like this, or read about it? I very much appreciate any input. Thanks.
What kind of responses are you sending? You won't notice any benefits in compressing certain kinds of binary data, e.g. images, Flash animations and other such assets; GZip works best for text.
Also, compressing data will incur a slight performance overhead on both server and client, but you expected that, right?
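For reference, a minimal mod_deflate sketch that limits compression to text-like responses (the exact MIME types and level are assumptions to adapt to your content):

# Compress only text-like responses; leave images, Flash and other binary assets alone.
AddOutputFilterByType DEFLATE text/html text/plain text/css text/xml
AddOutputFilterByType DEFLATE application/javascript application/json
# Optional: raise the level once you've confirmed CPU headroom (1 = fastest, 9 = smallest).
DeflateCompressionLevel 6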
I don't think Yahoo's point is that gzipping will be faster. It's that if you look at the marginal cost of bandwidth versus CPU power, you're better off using more CPU if it allows you to use less bandwidth.
I'd agree with Rob that you need to figure out whether the delay is due to Apache not serving the file as quickly because it has to go through compression, or whether it's something else. Just watching the HTTP response is not going to tell you WHY it's slower, just that it is.
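One rough way to separate server-side time from transfer time (a quick check, not a proper benchmark; example.com is a placeholder) is to time the same URL with and without Accept-Encoding:

curl -s -o /dev/null -w "%{time_starttransfer} %{time_total}\n" http://example.com/page
curl -s -o /dev/null -H "Accept-Encoding: gzip" -w "%{time_starttransfer} %{time_total}\n" http://example.com/page

If time_starttransfer grows while time_total shrinks, the server is spending extra time compressing while the actual transfer gets cheaper.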
Say I have 100 images that are each 10KB in size. What are the benefits of putting all of those into a single spritesheet? I understand there are fewer HTTP requests, and therefore less load on the server, but I'm curious about the specifics. With modern pipelining, is it still worth the performance gains? How significant are the gains? Does it result in a faster load time for the client as well as less load on the server, or the same load time but less server load?
Are there any test cases anyone can point to that answers these questions?
Basically, what I'm asking is -- is it worth it?
Under HTTP/1.1 (which most sites are still using) there is a massive overhead to downloading many small resources compared to one big one. This is why spriting became popular as an optimisation technique. HTTP/2 mostly solves that, so there is less need for spriting (and in fact it's now considered an anti-pattern). I'm not sure what you mean by "modern pipelining", but that mostly means HTTP/2, as the pipelining in HTTP/1.1 isn't as fully featured or widely used.
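For context, a sprite is just one big image that each element crops client-side; a minimal CSS sketch (the file name, class names and offsets are made up):

/* One sheet containing several 32x32 icons; each class shows a different slice. */
.icon { background-image: url(sprites.png); width: 32px; height: 32px; }
.icon-home { background-position: 0 0; }
.icon-search { background-position: -32px 0; }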
How bad a performance hit is it over HTTP/1.1? Pretty shockingly bad actually - it can make load time 10 times as slow on an example site I created. It doesn't really impact server or client load too much - the same amount of data needs to be sent either way - but does massively impact load time.
That said, there are downsides to spriting images (and to concatenating text files, which is similar): you have to download the whole sprite even if you only use one image, updating it invalidates the old version in the cache, it requires a build step... etc.
Ultimately the best test is to try it, as it will differ from site to site. However, once HTTP/2 becomes ubiquitous, spriting will become a lot less common.
More discussion on this topic in this answer: Optimizing File Cacheing and HTTP2
I've been playing with ImageResizer for a bit now, and in trying to do something I'm having trouble understanding the way to go about it.
Mainly I would like to stick to the idea of using the pipeline, and not trying to cheat it.
So... let's say I have a pretty standard use of ImageResizer, for something like:
giants_logo.jpg?w=280&h=100
The file is giants_logo.jpg, and the processing request is for a resized version with 'w=280&h=100'.
In a clustered environment, the problem is what happens if this same request is served by 3 machines.
All 3 would end up doing the resize and then storing their cached version in a local folder on disk. I could leverage a shared drive or something, but that has its own limitations.
What I am looking to do, is get the processed file, and then copy it back up to the DB or S3 where the main images are served from.
My thought is... I might have to write something like DiskCache, but with completely different guts, using the DB or S3 as the back end instead of the file system.
I realize the point of caching is speed, and what I am suggesting negates that aspect... but maybe that's not the case if we layer things.
Anyway, what I am focused on is keeping track of the files generated, as well as avoiding processing on multiple servers.
Any thoughts on the route I should look at to accomplish this?
TLDR; When DiskCache actually stops working well (usually between 1 and 20 million unique images), then switch to a CDN (unless it's too expensive), or a reverse proxy (unless your data set is really too huge to be bound by mortal infrastructure).
For petabyte data sets on the cheap when performance isn't king, it's a good plan. But for most people, it's premature. Even users with upwards of 20TB (source images) still use DiskCache. Really. Terabyte drives are cheap.
Latency is the killer.
To make this work you would need a central Redis server. MSSQL won't cut it (at least not on a VM or commodity hardware, we've tried). Given a Redis server, you can track what is done and stored (and perhaps even what is in progress, to de-duplicate effort in real time, as DiskCache does).
If you can track it, you can reuse it, and you can delete it. Reuse will be slower, since you're doubling the network traffic, moving the result twice. (But also decreasing it linearly with the number of servers in the cluster for source image fetches).
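As a sketch of the de-duplication idea at the Redis level (the key names and the S3 path are invented), each server could claim a variant with SETNX before resizing, then record where the result lives:

SETNX resize:lock:giants_logo.jpg:w=280&h=100 server-3     # returns 1 only for the first server to ask
EXPIRE resize:lock:giants_logo.jpg:w=280&h=100 60          # so a crashed worker doesn't hold the claim forever
SET resize:done:giants_logo.jpg:w=280&h=100 s3://bucket/cache/giants_logo_280x100.jpg

Anything that finds the done key serves the stored copy instead of resizing again.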
If bandwidth saturation is your bottleneck (very common), this could make performance worse. In fact, unless your read/write ratio is write and CPU heavy, you'll likely see worse performance than duplicated CPU effort under individual disk caches.
If you have the infrastructure to test it, put DiskCache on a SAN or shared drive; this will give you a solid estimate of the performance you can expect (assuming said drive and your blob storage system have comparable IO perf).
However, it's a fair amount of work, and you're essentially duplicating a subset of the functionality of reverse proxy (but with worse performance, since every response has to be proxied through the unlucky cluster server, instead of being spooled directly from disk).
CDNs and Reverse proxies to the rescue
Amazon CloudFront or Varnish can serve quite well as reverse proxies/caches for a web farm or cluster. Now, you'll have a bit less control over the 'garbage collection' process, but... also less code to maintain.
There's also ARR, but I've heard neither success nor failure stories about it.
But it sounds fun!
Send me a Github link and I'll help out.
I'd love to get a Redis-coordinated, cloud-agnostic poor-man's blob cache system out there. You bring the petabytes and infrastructure, I'll help you with the integration and troublesome bits. Efficient HTTP proxying is probably the hardest part; the rest is state management and basic threading.
You might want to have a look at a modified AzureReader2 plugin at https://github.com/orbyone/Sensible.ImageResizer.Plugins.AzureReader2
This implementation stores the transformed image back to the Azure blob container on the initial request, so subsequent requests are redirected to that copy.
My Rails application keeps reaching the disk I/O rate threshold set by my VPS at Linode. It's set at 3000 (I upped it from 2000), and every hour or so I get a notification that it has reached 4000-5000+.
What methods can I use to minimize the disk I/O rate? I mostly use Sphinx (the Thinking Sphinx plugin) and latitude/longitude distance search.
What are the methods to avoid?
I'm using Rails 2.3.11 and MySQL.
Thanks.
Did you check whether your server is swapping itself to death? What does "top" say?
Your Linode may have limited RAM, and it is very likely swapping like crazy to keep things running.
If you see red in the IO graph, that is swapping activity! You need to upgrade your Linode to more RAM, or limit the number/size of processes that are running. You should also add approximately 2x the RAM size as swap space (a swap partition).
http://tinypic.com/view.php?pic=2s0b8t2&s=7
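To confirm whether it really is swap (a rough check; run it on the Linode itself):

free -m       # look at the swap "used" column
vmstat 1 5    # non-zero si/so columns mean the box is actively swapping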
Your question is too vague to answer concisely, but this is generally a sign of one of a few things:
Your data set is too large because of historical data that you could prune. Delete what is no longer relevant.
Your tables are not indexed properly and you are hitting a lot of table scans. Check each of your slow queries with EXPLAIN (see the sketch after this list).
Your data structure is not optimized for the way you are using it, and you are doing too many joins. Some tactical de-normalization would help here. Make sure all your JOIN queries are strictly necessary.
You are retrieving more data than is required to service the request. It is, sadly, all too common that people load enormous TEXT or BLOB columns from a user table when displaying only a list of user names. Load only what you need.
You're being hit by some kind of automated scraper or spider robot that's systematically downloading your entire site, page by page. You may want to alter your robots.txt if this is an issue, or start blocking troublesome IPs.
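As an illustration of the EXPLAIN step (the table and column names are made up), a full table scan shows up as type: ALL, and an index usually fixes it:

-- Hypothetical slow query behind a listing page
EXPLAIN SELECT id, name FROM places WHERE city_id = 42;
-- If the plan shows type: ALL (a full scan), add an index:
CREATE INDEX index_places_on_city_id ON places (city_id);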
Is it going high and staying high for a long time, or is it just spiking temporarily?
There aren't going to be specific methods to avoid (other than not writing to disk).
You could try using a production profiler like New Relic to get more insight into your performance. A profiler will highlight the actions that are taking a long time, and when you examine the specific algorithm used in such an action, you might discover what's inefficient about it.
I wrote a JSON-API in NodeJS for a small project, running behind an Apache webserver. Now I'd like to improve performance by adding caching and compression. Basically, the question is what should be done in NodeJS itself and what is better handled by Apache:
a) The API calls have unique URLs (e.g. /api/user-id/content) and I want to cache them for at least 60 seconds.
b) I want the output to be served as gzip (if it's understood by the client). Node's HTTP module usually delivers content as "chunked". As I'm only writing the response in one place, is it enough to adjust the Content-Encoding header and serve it as one piece, so it can be compressed and cached?
a) I recommend caching, but without a timer; just let the replacement strategy remove entries. I don't know what you are actually serving; maybe caching the actual JSON or its source data would be useful. Here is a simple cache I wrote, including a small unit test, to give you some inspiration:
Simple Cache
b) How big is your JSON data? You have to compress it yourself, and keep in mind not to do it in a blocking way. You can stream-compress it and deliver it as you go; I've never done that with Node myself.
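A minimal sketch of non-blocking gzip with Node's built-in zlib module (the buildJson helper and the port are made up for the example):

// Sketch: gzip a JSON response when the client advertises support for it.
var http = require('http');
var zlib = require('zlib');

// Stand-in for however you actually produce the payload.
function buildJson(req) {
  return { url: req.url, served: Date.now() };
}

http.createServer(function (req, res) {
  var body = JSON.stringify(buildJson(req));
  var acceptsGzip = /\bgzip\b/.test(req.headers['accept-encoding'] || '');

  if (!acceptsGzip) {
    res.writeHead(200, { 'Content-Type': 'application/json' });
    return res.end(body);
  }

  // zlib.gzip is asynchronous, so compression does not block the event loop.
  zlib.gzip(body, function (err, compressed) {
    if (err) {
      res.writeHead(200, { 'Content-Type': 'application/json' });
      return res.end(body);
    }
    res.writeHead(200, {
      'Content-Type': 'application/json',
      'Content-Encoding': 'gzip',
      'Content-Length': compressed.length
    });
    res.end(compressed);
  });
}).listen(8080);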
> I wrote a JSON-API in NodeJS for a small project, running behind an
> Apache webserver.
I would just run the API on a different port, not behind Apache (as a proxy??). If you want to proxy, I would advise you to use NGINX; see Ryan Dahl's slides discussing Apache vs NGINX (slides 8+). NGINX can also do compression/caching (fast). Maybe you should not compress all your JSON (how big is it? a few KB?). I recommend you read the "Minimum payload size" section of Google's Page Speed docs (a good read!), which explains this and which I quote below:
Note that gzipping is only beneficial for larger resources. Due to the
overhead and latency of compression and decompression, you should only
gzip files above a certain size threshold; we recommend a minimum
range between 150 and 1000 bytes. Gzipping files below 150 bytes can
actually make them larger.
> Now I'd like to improve performance by adding caching and compression
You could do compression/caching via NGINX (+ memcached), which is going to be very fast. Even more preferable would be a CDN (for static files), as CDNs are optimized for this purpose. I don't think you should be doing any compression in node.js, although some modules are available via npm search (search for gzip), for example https://github.com/saikat/node-gzip
For caching I would advise you to have a look at Redis, which is extremely fast. It is even going to be faster than most client libraries, because node.js's fast client library (node_redis) uses hiredis (written in C). For this it is important to also install hiredis via npm:
npm install hiredis redis
Some benchmarks with hiredis
PING: 20000 ops 46189.38 ops/sec 1/4/1.082
SET: 20000 ops 41237.11 ops/sec 0/6/1.210
GET: 20000 ops 39682.54 ops/sec 1/7/1.257
INCR: 20000 ops 40080.16 ops/sec 0/8/1.242
LPUSH: 20000 ops 41152.26 ops/sec 0/3/1.212
LRANGE (10 elements): 20000 ops 36563.07 ops/sec 1/8/1.363
LRANGE (100 elements): 20000 ops 21834.06 ops/sec 0/9/2.287
> The API calls have unique URLs (e.g. /api/user-id/content) and I want
> to cache them for at least 60 seconds.
You can achieve this caching easily thanks to Redis's SETEX command. This is going to be extremely fast.
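A hedged sketch of what that could look like with node_redis (the key prefix, the buildJson callback and the 60-second TTL are assumptions for illustration):

// Sketch: cache a rendered JSON payload per URL for 60 seconds using SETEX.
var redis = require('redis');
var client = redis.createClient();

function getCachedJson(url, buildJson, callback) {
  var key = 'jsonapi:' + url; // assumed key naming scheme

  client.get(key, function (err, cached) {
    if (!err && cached) {
      return callback(null, cached); // cache hit
    }
    var fresh = JSON.stringify(buildJson());
    // SETEX stores the value and expires it after 60 seconds in one round trip.
    client.setex(key, 60, fresh, function () {
      callback(null, fresh);
    });
  });
}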
OK, as my API has only very basic usage, I'll go with a little in-memory key/value store as a basic cache (based on the inspiration Simple Cache gave me). For this little development experiment, that should be enough. For an API in production use, I'd stick to Alfred's tips.
For the compression I'll use Apache's mod_deflate. It's robust and I don't need async gzipping at this point. Furthermore you can change compression settings without changing the app itself.
Thank you both for your help!
What are "Hits & Misses" in reference to APC opcode caching? I've installed APC and it's running great, but I've got "some" misses and I'm wondering if that's "bad". Also, I am running Openx and, as such, am filling up the "Cache full count(s)" pretty quickly. What do I need to change in the configuration to minimize that? Any recommended configurations?
Some misses are to be expected.
Hits = things are in cache
Miss = things not (yet) in cache. New or less-used things will always be a miss, so you'll always expect some.
You may need to tune how much memory you're dedicating to APC. It's sort of a guessing game, balancing how much memory your machine has against how much you 'usually' have filled in APC (it should tell you an amount or percent full). You'll have to tweak various values to see. An OK baseline is the size of a compressed version of all your source code at something like gzip level 2; assume you're taking out comments and variable names and such, and you'll never get above that. Then you can figure out how much to dedicate to the cache.
If you're using APC for key-value caching as well, that will fill up faster than just code caching - and you'll expect to fill it up eventually. You'll then need to find an amount that gives a miss ratio you're comfortable with.
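As a starting point (the values here are guesses to adjust against your own code size and hit rate), the relevant apc.ini settings look like this:

apc.shm_size = 64M    ; total shared memory for the cache; raise it if "Cache full count" keeps climbing
apc.ttl = 7200        ; lets APC evict stale entries instead of dumping the whole cache when it fills
apc.stat = 1          ; keep checking file mtimes in development; set to 0 in production for speed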