Cost of http request vs file size, rule of thumb? - optimization

This sort of question has been asked before (HTTP Requests vs File Size?), but I'm hoping for a better answer. In that linked question, the answerer did a pretty good job of answering the question with the nifty formula of latency + transfer time, using an estimated latency of 80 ms and transfer speed of 5 Mb/s. But it seems flawed in at least one respect: don't multiple requests and transfers happen simultaneously in a normal browsing experience? That's what it looks like when I examine the Network tab in Chrome. Doesn't this mean that request latency isn't such a terrible thing?
Are there any other things to consider? Obviously latency and bandwidth will vary, but are 80 ms and 5 Mb/s a good rule of thumb? I thought of an analogy and I wonder if it is correct. Imagine a train station with only one track in and one track out (or maybe one for both). HTTP requests are like sending an engine out to get a bunch of cars at another station. They return pulling a long train of railway cars, which represents the requested file being downloaded. So you could send one engine out and have it bring back a massive load, or you could send multiple engines out and have each bring back a smaller load; of course, they would all have to wait their turn coming back into the station, and some engines couldn't be sent out until others had come in. Is this a flawed analogy?
I guess the big question, then, is how you can predict how much overlap there will be in HTTP requests, so that you can know, for example, whether it is generally worth it to have two big PNG files on your page, or instead a webp image plus the Webpjs js and swf files for incompatible browsers. That doubles the number of requests but more than halves the total file size (say a 200 kB savings).
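As a rough illustration of how the linked answer's formula plays out with and without parallelism, here is a toy Python model; the 80 ms and 5 Mb/s figures come from that answer, and the file sizes are invented to mirror the PNG-vs-webp example:

```python
# Toy load-time model based on the linked answer's rule of thumb:
#   time per request = latency + size / bandwidth
# The 80 ms and 5 Mb/s figures are that answer's estimates; the file
# sizes below are invented for illustration.

LATENCY = 0.080            # seconds per request
BANDWIDTH = 5_000_000 / 8  # 5 Mb/s expressed in bytes per second

def fetch_time(size_bytes):
    """Estimated seconds to fetch a single resource on an idle link."""
    return LATENCY + size_bytes / BANDWIDTH

# Two 200 kB PNGs fetched one after another:
pngs_serial = 2 * fetch_time(200_000)

# Four 50 kB files fetched fully in parallel; optimistically ignoring
# that parallel fetches share the link, the total is just the slowest
# single fetch.
small_parallel = max(fetch_time(50_000) for _ in range(4))

print(f"2 x 200 kB, serial:   {pngs_serial:.3f} s")
print(f"4 x 50 kB, parallel:  {small_parallel:.3f} s")
```

The parallel case is the best case; real browsers cap concurrency and share bandwidth, which is exactly the overlap question above.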

Your analogy is not bad in general terms. Obviously, if you want to be really precise in every respect, there are things that are oversimplified or incorrect (but that happens with almost all analogies).
Your estimate of 80 ms and 5 Mb/s might sound reasonable, but even though most of us like theory, you should approach this kind of problem in another way.
In order to make good estimates, you should measure to get some data and analyze it. Every estimation depends on some context and you should not ignore it.
Estimating latency and bandwidth is not the same for a 3G connection, an ADSL connection in Japan, or an ADSL connection in a less technologically developed country. Are clients accessing from the other end of the world or from the same country? As with your good observation about simultaneous connections on the client, there are millions of possible questions to ask yourself and very few good-quality answers without doing some measurement.
I know I'm not answering your question exactly, because I think it is unanswerable without many details about the domain (plus constraints, and a huge etc.).
You seem to have some ideas about how to design your solution. My best advice is to implement each one of those and profile them. Make measurements, try to identify what your bottlenecks are and see if you have some control about them.
In some problems these kinds of questions might have an optimal solution, but the difference between optimal and suboptimal could be negligible in practice.

This is the kind of answer I'm looking for. I did some simplistic tests to get a feel for the speed of many small files vs. one large file.
I created html pages that loaded a bunch of random sized images from placekitten.com. I loaded them in Chrome with the Network tab open.
Here are some results:
# of imgs   Total Size (KB)   Load Times (ms)
1           465               4000, 550
1           307               3000, 800, 350, 550, 400
30          192               1200, 900, 800, 900
30          529               7000, 5000, 6500, 7500
So one major thing to note is that single files load much more quickly after they have been loaded once. (The comma-separated lists of times are page reloads.) I did normal refreshes and also Empty Cache and Hard Reload. Strangely, it didn't seem to make much difference which way I refreshed.
My connection had a latency (round-trip time) of around 120-130 ms, and my download speed varied between 4 and 8 Mbps. Chrome seemed to do about 6 requests at a time.
Looking at these few tests, it seems that, at least in this range of file sizes, it is obviously better to have fewer requests when the file sizes are equal; but if you could cut the file size in half, even at the expense of increasing the number of HTTP requests by 30, it would be worth it, at least for a fresh page load.
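For what it's worth, a toy model along these lines roughly reproduces the shape of the numbers above, assuming waves of 6 parallel requests and a shared 5 Mb/s link (both simplifications; it ignores TCP slow start, caching, and server think time):

```python
# Toy model of the test above: requests go out in waves of 6 (what
# Chrome appeared to do), each wave paying the ~130 ms latency, while
# the 5 Mb/s link is shared, so transfer time depends only on total
# bytes transferred.
import math

def page_load_time(n_files, total_kb, latency=0.130, mbps=5, parallel=6):
    waves = math.ceil(n_files / parallel)
    transfer = (total_kb * 8) / (mbps * 1000)  # kilobits over kb/s
    return waves * latency + transfer

print(f"1 file, 465 kB:   {page_load_time(1, 465):.2f} s")
print(f"30 files, 529 kB: {page_load_time(30, 529):.2f} s")
print(f"30 files, 265 kB: {page_load_time(30, 265):.2f} s")
```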
Any comments or better answers would be appreciated.

Related

Loading 1 1MB large image (spritesheet) vs loading 100 10KB images

Say I have 100 images that are each 10KB in size. What are the benefits of putting all those into a single spritesheet? I understand there are fewer HTTP requests, and therefore less of a load on the server, but I'm curious as to the specifics. With modern pipelining, is it still worth the performance gains? How significant are the performance gains? Does it result in faster load time for the client, as well as less of a load on the server or just the same amount of load time, but less of a load on the server?
Are there any test cases anyone can point to that answers these questions?
Basically, what I'm asking is -- is it worth it?
Under HTTP/1.1 (which most sites are still using) there is a massive overhead to downloading many small resources compared to one big one. This is why spriting became popular as an optimisation technique. HTTP/2 mostly solves that, so there is less need for spriting (and in fact it's now considered an anti-pattern). I'm not sure what you mean by "modern pipelining", but that mostly means HTTP/2, as the pipelining in HTTP/1.1 isn't as fully featured or as widely used.
How bad a performance hit is it over HTTP/1.1? Pretty shockingly bad actually - it can make load time 10 times as slow on an example site I created. It doesn't really impact server or client load too much - the same amount of data needs to be sent either way - but does massively impact load time.
That said, there are downsides to spriting images (and to concatenating text files, which is similar): you have to download the whole sprite even if you only use one image, updating it invalidates the old version in the cache, it requires a build step... etc.
Ultimately the best test is to try it, as it will be different from site to site. However once HTTP/2 becomes ubiquitous this will become a lot less common.
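A back-of-envelope model of why the gap is so large: under HTTP/1.1 with 6 connections and no pipelining, each wave of requests pays roughly a round trip, while HTTP/2 multiplexes all streams on one connection. The 100 ms RTT, 10 kB images, and 5 Mb/s link below are illustrative numbers, not measurements from the example site mentioned above:

```python
# 100 requests served in waves of 6 (HTTP/1.1, no pipelining) pay a
# round trip per wave; HTTP/2 pays roughly one round trip in total
# because all 100 streams share one connection.
import math

RTT = 0.100
KB_PER_S = 5000 / 8  # 5 Mb/s in kB/s

def http11_time(n, kb_each, conns=6):
    waves = math.ceil(n / conns)
    return waves * RTT + n * kb_each / KB_PER_S

def http2_time(n, kb_each):
    return RTT + n * kb_each / KB_PER_S

print(f"100 x 10 kB, HTTP/1.1: {http11_time(100, 10):.2f} s")
print(f"100 x 10 kB, HTTP/2:   {http2_time(100, 10):.2f} s")
print(f"1 x 1000 kB sprite:    {http2_time(1, 1000):.2f} s")
```

The model understates HTTP/1.1's cost if anything, since it ignores per-connection TCP slow start and header overhead.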
More discussion of this topic in this answer: Optimizing File Cacheing and HTTP2

With http2, does number of XHRs have any effect on performance if overall data size is the same?

As far as I know, HTTP/2 no longer uses separate TCP connections for every request, which is the main performance-booster of the protocol.
Does that mean it doesn't matter whether I use 10 XHRs with 10kB of content each or one XHR with 100kB and then split the parts client-side?
A precise answer would require a benchmark for your specific case.
In more general terms, from the client's point of view, if you can make the 10 XHRs at the same time (for example, in a tight loop), then those 10 requests will leave the client more or less at the same time, incur the latency between the client and the server, and be processed on the server (more or less in parallel depending on the server architecture), so the result could be similar to a single XHR, although I would expect the single request to be more efficient.
From the server point of view, however, things may be different.
If you multiply by 10 what could have been done with a single request, now your server sees a 10x increase in request rate.
Reading from the network, request parsing and request dispatching are all activities that are heavily optimized in servers, but they do have a cost, and a 10x increase in that cost may be noticeable.
And that 10x increase in request rate to the server may impact the database as well, the filesystem, etc. so there may be ripple effects that can only be noticed by actually performing the benchmark.
Other things that you have to weigh are the amount of work that you need to do in the server to aggregate things, and to split them on the client; along with other less measurable things like code clarity and maintainability, and so forth.
I would say that common pragmatic judgement applies here: if you can do the same work with one request, why make 10 requests? Do you have a more specific example?
If you are in doubt, measure.
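For illustration, the aggregate-then-split pattern being discussed might look like this sketch; the JSON envelope with a "parts" key is a made-up format, not a standard:

```python
# Server side and client side of the aggregate-then-split pattern: one
# response body carries all ten parts, and the client splits them.
import json

def make_batched_response(parts):
    """Server: bundle many small payloads into one response body."""
    return json.dumps({"parts": parts})

def split_batched_response(body):
    """Client: recover the individual payloads from the one response."""
    return json.loads(body)["parts"]

parts = [f"chunk-{i}" for i in range(10)]
body = make_batched_response(parts)
assert split_batched_response(body) == parts
```

The aggregation work on the server and the splitting work on the client are exactly the costs the answer says you have to weigh against the saved requests.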

Centralized storage for large text files

What the system should do: store and manage large (100-400 MB) centralized text files.
What to store: lines from the text file (for some files the lines must be unique); metadata about the file (filename, comment, last update, etc.); and a position in the file (the same file may have different positions for different applications).
Operations: concurrent reads of lines from a file (100-400 lines per query) and adding lines (also 100-400 lines); exporting is not critical and can be scheduled.
So which storage should I use? An SQL DBMS is too slow, I think; maybe a NoSQL solution?
NoSQL: Cassandra is an option (you can store it line by line or in groups of lines, I guess), Voldemort is not too bad, and you might even get away with using MongoDB, though I'm not sure it fits the "large files" requirement.
400 MiB will be served completely from the caches on every non-ridiculous database server. As such, the choice of database does not really matter too much; any database will be able to deliver fast (though there are different kinds of "fast", depending on what you need).
If you are really desperate for raw speed, you can go with something like redis. Again, 400 MiB is no challenge for that.
SQL might be slightly slower (but not that much) but has the huge advantage of being flexible. Flexibility, generality, and the presence of a "built-in programming language" are not free, but they should not have a too bad impact, because either way returning data from the buffer cache works more or less at the speed of RAM.
If you ever figure that you need a different database at a later time, SQL will let you do it with a few commands, or if you ever want something else you've not planned for, SQL will do. There is no guarantee that doing something different will be feasible with a simple key-value store.
Personally, I wouldn't worry about performance for such rather "small" datasets. Really, every kind of DB will serve that well, worry not. Come again when your datasets are several dozens of gigabytes in size.
If you are 100% sure that you will definitively never need the extras that a fully blown SQL database system offers, go with NoSQL to shave off a few microseconds. Otherwise, just stick with it to be on the safe side.
EDIT:
To elaborate, consider that a "somewhat lower class" desktop has upwards of 2 GiB (usually 4 GiB) of RAM nowadays, and a typical "no big deal" server has something like 32 GiB. In that light, 400 MiB is nothing. A typical network uplink on a server (unless you are willing to pay extra) is 100 Mbit/s.
A 400 MiB text file might have somewhere around a million lines. That boils down to 6-7 memory accesses for a "typical SQL server", and to 2 memory accesses plus the time needed to calculate a hash for a "typical NoSQL server". Which is, give or take a few dozen cycles, the same in either case: something around half a microsecond on a relatively slow system.
Add to that a few dozen microseconds the first time a query is executed, because it must be parsed, validated, and optimized, if you use SQL.
Network latency is somewhere around 2 to 3 milliseconds if you're lucky. That's 3 to 4 orders of magnitude more for establishing a connection, sending a request to the server, and receiving an answer. Compared to that, it seems ridiculous to worry whether the query takes 517 or 519 microseconds. If there are 1-2 routers in between, it becomes even more pronounced.
The same is true for bandwidth. You can in theory push around 119 MiB/s over a 1 Gibit/s link assuming maximum sized frames and assuming no ACKs and assuming absolutely no other traffic, and zero packet loss. RAM delivers in the tens of GiB per second without trouble.
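Spelling out that bandwidth arithmetic (standard Ethernet framing constants; the exact goodput figure varies slightly with how you count overhead):

```python
# Gigabit Ethernet moves 125,000,000 bytes/s raw; with maximum-sized
# frames, per-frame overhead (preamble, headers, inter-frame gap)
# leaves roughly 95% of that as TCP payload, versus RAM bandwidth in
# the tens of GiB/s.

RAW_BYTES_PER_S = 1_000_000_000 / 8  # 1 Gbit/s in bytes per second
WIRE_FRAME = 1538                    # bytes on the wire per full frame
TCP_PAYLOAD = 1460                   # TCP payload bytes per full frame

goodput = RAW_BYTES_PER_S * TCP_PAYLOAD / WIRE_FRAME
print(f"best-case TCP goodput: {goodput / 1e6:.1f} MB/s")
print(f"time to push 400 MiB:  {400 * 2**20 / goodput:.1f} s")
```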

Redis mimic MASTER/MASTER? or something else?

I have been reading a lot of the posts on here and surfing the web, but maybe I am not asking the right question. I know that Redis is currently Master/slave until Cluster becomes available. However, I was wondering if someone can tell me how I would want to configure Redis logistically to meet my needs (or if its not the right tool).
Scenario:
We have 2 sites on opposite ends of the US. We want clients to be able to write at each site at high volume. We then want each client to be able to perform reads at their site as well. However, we want the data from a write at the sister site to be available in < 50 ms, given that we have plenty of bandwidth. Is there a way to configure Redis to meet our needs? Our writes' maximum size would be on the order of 5k, usually much less. The main point is: how can I have 2 masters that sync to one another, even if that is not supported by default?
The catch with Tom's answer is that you are not running any sort of cluster, you are just writing to two servers. This is a problem if you want to ensure consistency between them. Consider what happens when your client fails a write to the remote server. Do you undo the write to local? What happens to the application when you can't write to the remote server? What happens when you can't read from the local?
The second catch is the fundamental physics issue Joshua raises. For a round trip you are talking a theoretical minimum of 38ms leaving a theoretical maximum processing time on both ends (of three systems) of 12ms. I'd say that expectation is a bit too much and bandwidth has nothing to do with latency in this case. You could have a 10GB pipe and those timings are still extant. That said, transferring 5k across the continent in 12ms is asking a lot as well. Are you sure you've got the connection capacity to transfer 5k of data in 50ms, let alone 12? I've been on private no-utilization circuits across the continent and seen ping times exceeding 50ms - and ping isn't transferring 5k of data.
How will you keep the two unrelated servers in-sync? If you truly need sub-50ms latency across the continent, the above theoretical best-case means you have 12ms to run synchronization algorithms. Even one query to check the data on the other server means you are outside the 50ms window. If the data is out of sync, how will you fix it? Given the above timings, I don't see how it is possible to synchronize in under 50ms.
I would recommend revisiting the fundamental design requirements. Specifically, why this requirement? Latency requirements of 50 ms round trip across the continent are usually a sign of marketing or a lack of attention to detail. I'd wager that if you analyze the requirements you'll find that this 50 ms window is excessive and unnecessary. If it isn't, and data synchronization is actually important (likely), then someone will need to determine whether the significant extra effort to write synchronization code is worth it, or even whether staying within the 50 ms window is possible. Cross-continent sub-50ms data sync is not a simple issue.
If you have no need for synchronization, why not simply run one server? You could use a slave on the other side of the continent for recovery-only purposes. Of course, that still means that in the best case you have 12 ms to get the data over there and back. I would not count on 50 ms round trips for operations + latency + 5k/10k data transfer across the continent.
It's about 19ms at the speed of light to cross the US. <50ms is going to be hard to achieve.
http://www.wolframalpha.com/input/?i=new+york+to+los+angeles
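The arithmetic behind that figure, as a sketch (the distance and the in-fiber speed factor are approximations):

```python
# Great-circle New York to Los Angeles is roughly 3,940 km, and light
# in optical fiber travels at about two thirds of c, so a one-way trip
# cannot beat ~20 ms, before routers, serialization, or the return leg.

C = 299_792_458          # speed of light in vacuum, m/s
FIBER_FACTOR = 0.66      # glass slows light to about 2/3 of c
NY_TO_LA_M = 3_940_000   # approximate great-circle distance, meters

one_way_s = NY_TO_LA_M / (C * FIBER_FACTOR)
print(f"one-way fiber latency: {one_way_s * 1000:.1f} ms")
print(f"round trip:            {one_way_s * 2000:.1f} ms")
```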
This is probably best handled as part of your client - just have the client write to both nodes. Writes generally don't need to be synchronous, so sending the extra command shouldn't affect the performance you get from having a local node.
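A minimal sketch of that dual-write idea, with plain dicts standing in for the two Redis connections; a real client would use a Redis library and would likely fire the remote write asynchronously so local latency is unaffected:

```python
class DualWriter:
    """Write each key to both a local and a remote store."""

    def __init__(self, local, remote):
        self.local = local
        self.remote = remote

    def set(self, key, value):
        self.local[key] = value       # synchronous: local reads see it
        try:
            self.remote[key] = value  # with real Redis: send asynchronously
        except Exception:
            # deliberately unhandled: a failed remote write is exactly
            # the consistency gap the other answer warns about
            pass

east, west = {}, {}
writer = DualWriter(east, west)
writer.set("user:42", "hello")
assert east["user:42"] == west["user:42"] == "hello"
```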

How online-game clients are able to exchange data through internet so fast?

Let's imagine a really simple game... We have a labyrinth and two players trying to find the exit in real time over the internet.
On every move, the game client should send the player's coordinates to the server and receive the current coordinates of the other client. How is it possible to make this exchange so fast (as all modern games do)?
OK, we can use memcache or a similar technology to reduce data-mining operations on the server side. We can also use the fastest web server, etc., but we will still have problems with timing.
So, the questions are...
What protocol do game clients usually use to exchange information with the server?
What server technologies exist to solve this problem?
What algorithms are used to fight delays during the game, etc.?
Usually with network interpolation and prediction. Gamedev is a good resource: http://www.gamedev.net/reference/list.asp?categoryid=30
Also check out this one: http://developer.valvesoftware.com/wiki/Source_Multiplayer_Networking
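A minimal sketch of the interpolation part: render the remote player slightly in the past, blending between the two most recent server snapshots so movement looks smooth between updates. The (timestamp, x, y) snapshot layout and the timings are invented for the example:

```python
def interpolate(snap_a, snap_b, render_time):
    """Linearly blend two (t, x, y) snapshots at render_time."""
    t0, x0, y0 = snap_a
    t1, x1, y1 = snap_b
    alpha = (render_time - t0) / (t1 - t0)
    return (x0 + alpha * (x1 - x0), y0 + alpha * (y1 - y0))

# Snapshots 40 ms apart (a 25 updates/s server); render halfway between.
pos = interpolate((0.00, 10.0, 5.0), (0.04, 12.0, 5.0), 0.02)
print(pos)  # (11.0, 5.0)
```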
use UDP, not TCP
use a custom protocol, usually a single byte defining a "command", and as few subsequent bytes as possible containing the command arguments
prediction is used to make the other players' movements appear smooth without having to get an update for every single frame
hint: prediction is used anyway to smooth the fast screen update (~60fps) since the actual game speed is usually slower (~25fps).
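To make the "single command byte plus a few argument bytes" idea concrete, here is a sketch of packing a move message into 5 bytes; the command id and field layout are invented, since every game defines its own protocol:

```python
# A move message packed into 5 bytes: one command byte plus x and y as
# unsigned 16-bit integers in network byte order.
import struct

CMD_MOVE = 0x01
FMT = "!BHH"  # command byte, then two unsigned 16-bit arguments

def pack_move(x, y):
    return struct.pack(FMT, CMD_MOVE, x, y)

def unpack_move(data):
    cmd, x, y = struct.unpack(FMT, data)
    assert cmd == CMD_MOVE
    return x, y

msg = pack_move(312, 877)
# a 5-byte payload, versus hundreds of bytes for a typical HTTP request
print(len(msg), unpack_move(msg))
```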
The other answers haven't addressed a couple of important misconceptions in the original post, namely that these games aren't websites and operate quite differently. In particular:
There is little or no "data-mining" that needs to be sped up. The fastest online games (e.g. first-person shooters) typically are not saving anything to disk during a match. Slower online games, such as MMOs, may use a database, primarily for storing player information, but for the most part they hold their player and world data in memory, not on disk.
They don't use web servers. HTTP is a relatively slow protocol, and even TCP alone can be too slow for some games. Instead they have bespoke servers written just for that particular game. Often these servers are tuned for low latency rather than throughput, because they typically don't serve up big documents like a web server would, but many tiny messages (e.g. measured in bytes rather than kilobytes).
With those two issues covered, your speed problem largely goes away. You can send a message to a server and get a reply in under 100ms and can do that several times per second.