What's the difference between Expires and Cache-Control: max-age? - http-headers

Could you tell me the difference between Expires and Cache-Control: max-age?

Expires was defined in the HTTP/1.0 specifications, and Cache-Control in the HTTP/1.1 specifications.
I would suggest defining both, so that you cater to both the older clients that only understand HTTP/1.0 and the newer ones.

Expires was specified in the HTTP 1.0 specification, whereas Cache-Control: max-age was introduced in the early HTTP 1.1 specification. The value of the Expires header has to be in a very specific date and time format, and any error in it will make your resources non-cacheable. The value of the Cache-Control: max-age header is a number of seconds, so there is much less room for error.
Since you can specify only one of the two headers in your web.config file, I'd suggest going with the Cache-Control: max-age header because of the flexibility it offers in setting a relative timespan from the present date to a date in the future. You can basically set and forget, as compared to the case with Expires header, whose value you will have to remember to update at least once every year. And if you set both headers programmatically from within your code, know that the value of Cache-Control: max-age header will take precedence over Expires header. So, something to keep in mind there as well.
From Setting Expires and Cache-Control: max-age headers for static resources in ASP.NET
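As a rough illustration of the difference in format, here is a minimal Python sketch that derives both headers from a single lifetime. The function name caching_headers and its parameters are made up for this example; it only shows that max-age is a plain number of seconds, while Expires needs a correctly formatted absolute HTTP-date.

```python
from datetime import datetime, timedelta, timezone
from email.utils import format_datetime


def caching_headers(lifetime_seconds, now=None):
    """Return equivalent Cache-Control and Expires headers for one lifetime.

    Hypothetical helper for illustration; not part of any framework.
    """
    now = now or datetime.now(timezone.utc)
    expires_at = now + timedelta(seconds=lifetime_seconds)
    return {
        # Relative lifetime in seconds; no date formatting involved.
        "Cache-Control": f"max-age={lifetime_seconds}",
        # Absolute HTTP-date; it has to be recomputed for every response.
        "Expires": format_datetime(expires_at, usegmt=True),
    }


if __name__ == "__main__":
    print(caching_headers(3600))
```

Because the Expires value is computed relative to the current time on each call, it does not need to be updated by hand, which is the maintenance problem described above for a hard-coded date.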

Related

Do any browsers support trailers sent in chunked encoding responses?

HTTP/1.1 specifies that a response sent as Transfer-Encoding: chunked can include optional trailers (ie. what would normally be sent as headers, but for whatever reason can't be calculated before the content, so they can be appended to the end), for example:
Request:
GET /trailers.html HTTP/1.1
TE: chunked, trailers
Response:
HTTP/1.1 200 OK
Transfer-Encoding: chunked
Trailer: My-Test-Trailer
D\r\n
All your base\r\n
B\r\n
 are belong\r\n
6\r\n
 to us\r\n
0\r\n
My-Test-Trailer: something\r\n
\r\n
This request specifies in the TE header that it's expecting a chunked response, and will be looking for trailers after the final chunk.
The response specifies in the Trailer header the list of trailers it will be sending (in this case, just one: My-Test-Trailer)
Each chunk is sent as:
size of chunk in hex (D = 13), followed by a CRLF
chunk data (All your base), followed by a CRLF
A zero size chunk (0\r\n) indicates the end of the body.
Then the trailer(s) are specified (My-Test-Trailer: something\r\n), followed by a final CRLF.
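The framing described above can be sketched in a few lines of Python. This is only a minimal illustration under simplifying assumptions (ASCII trailer fields, a complete body held in memory, no error handling); encode_chunked and decode_chunked are names invented for this example, not functions of any HTTP library.

```python
def encode_chunked(chunks, trailers):
    """Encode body chunks and trailer fields as an HTTP/1.1 chunked body."""
    out = b""
    for chunk in chunks:
        # Chunk size in hex, CRLF, chunk data, CRLF.
        out += f"{len(chunk):X}".encode("ascii") + b"\r\n" + chunk + b"\r\n"
    out += b"0\r\n"  # zero-size chunk marks the end of the body
    for name, value in trailers.items():
        out += f"{name}: {value}\r\n".encode("ascii")
    return out + b"\r\n"  # final CRLF ends the trailer section


def decode_chunked(data):
    """Decode a complete chunked body, returning (body, trailers)."""
    body, pos = b"", 0
    while True:
        line_end = data.index(b"\r\n", pos)
        # Ignore any chunk extension that follows a ";".
        size = int(data[pos:line_end].split(b";")[0], 16)
        pos = line_end + 2
        if size == 0:
            break
        body += data[pos:pos + size]
        pos += size + 2  # skip the chunk data and its trailing CRLF
    trailers = {}
    for line in data[pos:].split(b"\r\n"):
        if line:
            name, _, value = line.decode("ascii").partition(":")
            trailers[name.strip()] = value.strip()
    return body, trailers


if __name__ == "__main__":
    wire = encode_chunked(
        [b"All your base", b" are belong", b" to us"],
        {"My-Test-Trailer": "something"},
    )
    print(wire)
    print(decode_chunked(wire))
```

Note that the second and third chunks carry a leading space, which is what makes their sizes B (11) and 6.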
Now, from everything I've read so far, trailers are rarely (if ever) used. Most discussions here and elsewhere concerning trailers typically start with "but why do you want to use trailers anyway?".
Putting aside the question of why, out of curiosity I've been trying to simulate an HTTP request/response exchange that uses trailers; but so far I have not been able to get it to work, and I'm not sure if there's something wrong with the response I'm generating, or whether (as some have suggested) there are simply no clients that look for trailing headers.
Clients I've tried include: curl, wfetch, Chrome + jQuery.
In all cases, the client receives and correctly reconstructs the chunked response (All your base are belong to us); and I can see in the response headers that Trailer: My-Test-Trailer is being sent; but I'm not seeing My-Test-Trailer: something returned either in the response headers, or anywhere else.
It's unclear whether a trailing header like this should appear in the client as a normal response header after the entire response has been received and the connection closed.
Interestingly, the curl change logs appear to suggest that curl does support optional trailers, and that curl will process any trailers it finds into the normal header stream.
So does anybody know:
of a valid URL that I could ping, which sends trailers in a chunked response? (so that I can confirm whether it's just my test response that's not working); and
which clients are known to support (and access/display) trailers sent by the server?
No common browsers support HTTP/1.1 trailers. Look at the column "Headers in trailer" in the "Network" tab of browserscope.
Chrome: No, and won't fix (bug). Supports H/2 trailers (bug).
Firefox: No, and I don't see a ticket in bugzilla for it. Appears to support H/2.
IE: No
Edge: No
Safari: No
Opera: Old versions only (v10 - 12, removed in 14)
As you've discovered, a number of non-browser clients support it.
Over 5 years since asking this question, I can now definitively answer it myself.
Mozilla just announced that they will be supporting the new Server-Timing field as an HTTP trailing header (their first ever support for trailers).
https://bugzilla.mozilla.org/show_bug.cgi?id=1413999
However, more importantly, they confirm that it will be whitelisted so that Server-Timing is the only supported value (emphasis mine):
Server-Timing is an HTTP trailer, not a header. :mcmanus tells me we currently parse trailers, but then silently throw them away. We don't want to change that behavior in general (we don't want to encourage trailers), so we'll want to whitelist the Server-Timing trailer, store it somewhere (probably even just a mServerTiming header will work for now, since it's the only trailer we support) and then make it available via some new channel.getTrailers() call.
So I guess that confirms it once and for all: trailing headers are not supported (and never likely to be in a general sense) by Moz, and presumably the same stance is taken by all other browser vendors.
Since this commit, the Jodd HTTP Java client supports trailer headers.
On the first question, I haven't yet found any live response that uses them ;)
As of May 2022, all browsers support the Trailer response header: https://caniuse.com/mdn-http_headers_trailer.
Library support:
Node.js
Jodd
Just recently, according to caniuse.com, it is claimed that most major browser providers now support the Trailer response header in their latest versions (e.g. Firefox-88, Safari-14.1, Chrome-88, etc).
However, it seems that this is only due to their support of Server-Timing (which can use trailers, as mentioned in another answer), and there does not currently seem to be a general way to access a Trailer header from browser JavaScript: it's currently an open issue for the Fetch API, and the bug request for trailers in Chrome is marked wontfix.

Understanding caching strategies of a dynamically generated search page

While studying the caching strategies adopted by various search engine websites and Stackoverflow itself, I can't help but notice the subtle differences in the response headers:
Google Search
Cache-Control: private, max-age=0
Expires: -1
Yahoo Search
Cache-Control: private
Connection: Keep-Alive
Keep-Alive: timeout=60, max=100
Stackoverflow Search
Cache-Control: private
There must be some logical explanation behind the settings adopted. Could someone explain the differences so that all of us can learn and benefit?
From RFC2616 HTTP/1.1 Header Field Definitions, 14.9.1 What is Cacheable:
private
Indicates that all or part of the response message is intended for a single
user and MUST NOT be cached by a shared cache. This allows an origin server
to state that the specified parts of the response are intended for only one
user and are not a valid response for requests by other users. A private
(non-shared) cache MAY cache the response.
max-age=0 means that the response is considered fresh for 0 seconds, i.e. it is stale immediately. In effect, a cache should not reuse the response without first revalidating it with the server.
Expires: -1 should be ignored when max-age is present; in any case, -1 is an invalid date and should be parsed as a value in the past (meaning already expired).
From RFC2616 HTTP/1.1 Header Field Definitions, 14.21 Expires:
Note: if a response includes a Cache-Control field with the max-age directive
(see section 14.9.3), that directive overrides the Expires field
HTTP/1.1 clients and caches MUST treat other invalid date formats, especially
including the value "0", as in the past (i.e., "already expired").
The Connection: Keep-Alive and Keep-Alive: timeout=60, max=100 headers configure settings for persistent connections. All connections using HTTP/1.1 are persistent unless otherwise specified, but these headers set explicit timeout values instead of relying on the browser's default (which varies greatly).
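To make the precedence rules concrete, here is a simplified Python sketch of how a cache might derive a freshness lifetime from these headers. The function freshness_lifetime is a hypothetical name, headers is assumed to be a plain dict, and response_time stands in for the value of the response's Date header; real caches handle many more directives than this.

```python
from datetime import datetime, timezone
from email.utils import parsedate_to_datetime


def freshness_lifetime(headers, response_time):
    """Return the freshness lifetime in seconds, or None if unspecified."""
    for directive in headers.get("Cache-Control", "").split(","):
        name, _, value = directive.strip().partition("=")
        if name.lower() == "max-age" and value.isdigit():
            return int(value)  # max-age overrides Expires
    expires = headers.get("Expires")
    if expires is None:
        return None  # no explicit expiration information
    try:
        expires_at = parsedate_to_datetime(expires)
    except (TypeError, ValueError):
        return 0  # invalid dates such as "-1" or "0" count as already expired
    if expires_at.tzinfo is None:
        expires_at = expires_at.replace(tzinfo=timezone.utc)
    return max(0, int((expires_at - response_time).total_seconds()))


if __name__ == "__main__":
    now = datetime.now(timezone.utc)
    print(freshness_lifetime({"Cache-Control": "private, max-age=0", "Expires": "-1"}, now))
    print(freshness_lifetime({"Cache-Control": "private"}, now))
```

For the Google headers above, the lifetime is 0 because of max-age=0, and Expires: -1 would give the same result on its own. For the Yahoo and Stack Overflow headers there is no explicit lifetime at all, so a private cache would have to fall back on its own heuristics.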

What are the valid values for the HTTP Pragma header

What are the valid values for the HTTP Pragma header? I know no-cache is one, but I want to enable caching, so what should I set it to? I did some googling, and all I found was that most clients ignore this header, with no info on the other values it accepts.
Surprisingly, there is only one directive defined by default, which is no-cache, and no new Pragma directives will be defined in HTTP, as per the RFC.
ref: http://www.w3.org/Protocols/rfc2616/rfc2616-sec14.html#sec14.32
Moreover, you will need to use the Cache-Control header for managing the caching behaviors rather than the Pragma directive which seems to be still included only to support the legacy HTTP/1.0.
ref: http://www.w3.org/Protocols/rfc2616/rfc2616-sec14.html#sec14.9
Bonus: http://www.mnot.net/cache_docs/
You're probably looking for Cache-Control; it is supported in HTTP/1.1 and defines more states than Pragma.
Some more information that might help people who are less interested in caching and more interested in HTTP headers in general, i.e. the literal interpretation of the original question, "what are the valid values for the http header pragma"?
The reference in the accepted answer (https://stackoverflow.com/a/7376516/3246928) is the RFC http://www.w3.org/Protocols/rfc2616/rfc2616-sec14.html#sec14.32. It defines the syntax as:
Pragma = "Pragma" ":" 1#pragma-directive
pragma-directive = "no-cache" | extension-pragma
extension-pragma = token [ "=" ( token | quoted-string ) ]
This implies that any 'token=value' pair is acceptable (with the value being optional). The spec goes on to say
No new Pragma directives will be defined in HTTP.
and I would guess this is also meant to cover the "extension-pragma" part, but I wish they had been more unambiguous here.
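To see what that grammar accepts, here is a small Python sketch of a parser for it. The name parse_pragma is made up for this example, and the parser takes a shortcut: it splits on commas first, so a quoted-string containing a comma would not be handled correctly.

```python
import re

# token per RFC 2616: one or more CHARs except CTLs and separators
TOKEN = r"[!#$%&'*+\-.^_`|~0-9A-Za-z]+"
# pragma-directive: token, optionally followed by "=" and a token or quoted-string
DIRECTIVE = re.compile(rf'({TOKEN})(?:=({TOKEN}|"(?:[^"\\]|\\.)*"))?')


def parse_pragma(value):
    """Parse a Pragma header value into {directive: argument-or-None}."""
    directives = {}
    for part in value.split(","):
        part = part.strip()
        if not part:
            continue  # the "#" list rule tolerates empty elements
        match = DIRECTIVE.fullmatch(part)
        if match is None:
            raise ValueError(f"invalid pragma-directive: {part!r}")
        directives[match.group(1).lower()] = match.group(2)
    return directives


if __name__ == "__main__":
    print(parse_pragma("no-cache"))
    print(parse_pragma('no-cache, x-mode=fast, note="hello world"'))
```

The x-mode and note directives are invented here purely to exercise the extension-pragma rule; as far as the grammar is concerned they are syntactically valid, even though nothing is known to act on them.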
This header does not seem to be specifically created for caching; the description in the RFC says:
The Pragma general-header field is used to include implementation-
specific directives that might apply to any recipient along the
request/response chain
So, in theory, you could add things here, and they could work. However, despite much searching, I have not found any reference to any other values ever being used here. It is effectively a dead and embarrassing part of http/1.
It seems like the normal thing to do is:
Only use Pragma with the no-cache directive. This is the only value anyone should ever use. (And of course you should also use the Cache-Control header for your caching to behave as expected.)
If you want to put some special information into an HTTP header, i.e. if you want to "include implementation-specific directives that might apply to any recipient along the request/response chain", then create a custom HTTP header. Google and Amazon, for example, do this:
http://docs.aws.amazon.com/AmazonS3/latest/dev/UsingMetadata.html and
https://cloud.google.com/storage/docs/reference-headers
Note the naming convention on the http header. The "x-" prefix is deprecated
by https://www.rfc-editor.org/rfc/rfc6648, but everyone seems to use it
anyway.

Cache Control Question

If I set this for cache control on my site:
Header unset Pragma
FileETag None
Header unset ETag
# 1 YEAR
<FilesMatch "\.(ico|pdf|flv|jpg|jpeg|png|gif|swf|mp3|mp4)$">
Header set Cache-Control "public"
Header set Expires "Thu, 15 Apr 2010 20:00:00 GMT"
Header unset Last-Modified
</FilesMatch>
# 2 HOURS
<FilesMatch "\.(html|htm|xml|txt|xsl)$">
Header set Cache-Control "max-age=7200, must-revalidate"
</FilesMatch>
# CACHED FOREVER
# MOD_REWRITE TO RENAME EVERY CHANGE
<FilesMatch "\.(js|css)$">
Header set Cache-Control "public"
Header set Expires "Thu, 15 Apr 2010 20:00:00 GMT"
Header unset Last-Modified
</FilesMatch>
...then what if I update any CSS, image or other files? Will the user's browser still use the cached version until it expires (a year later)?
Thanks
Your css, js and image files will never be cached, as you are setting a date in the past.
I assume this is a mistake, and you intended to set it for a year in the future; this is one reason to favour max-age over Expires.
If this was the case, then your images will be cached up to a year. It's allowable to drop something out of the cache at any time, for example to clean out less-frequently used entries to reduce the size on disk that the cache is taking up.
There are two possible approaches to reducing the risk of staleness. One is to set a much lower expiry time, and use ETags and modification dates so that after that expiry time has passed you can send a 304 if there is no change, so the server need send only a few bytes rather than the entire entity.
The other is to keep the expiry at a year, but to change the URI used when you change. This can be useful in the case of e.g. a large file that is used on almost every page on your site. It requires that you change all references to that resource when it does change (because you are essentially changing to use a new resource), which can be fiddly and therefore is only advised as an optimisation in a few hotspot cases. If a file ignores query attributes (e.g. it's just served straight from a file) the browser won't know that, hence you could use something like /scripts/bigScript.js?version=1.2.3 and then change to /scripts/bigScript.js?version=1.2.4 when you change bigScript.js. This will have no effect on bigScript.js, but will cause the browser to get a new file, as for all it knows it's a completely different resource.
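One way to avoid bumping the version number by hand is to derive it from the file's content. This is a variation on the approach described above rather than exactly what the answer proposes, and versioned_url is a hypothetical helper name; a minimal Python sketch:

```python
import hashlib
from urllib.parse import urlencode


def versioned_url(path, content):
    """Append a version query parameter derived from the file's content."""
    # A short content hash changes whenever the file's bytes change.
    digest = hashlib.sha256(content).hexdigest()[:8]
    query = urlencode({"version": digest})
    return f"{path}?{query}"


if __name__ == "__main__":
    print(versioned_url("/scripts/bigScript.js", b"console.log('v1');"))
    print(versioned_url("/scripts/bigScript.js", b"console.log('v2');"))
```

The URL stays the same for as long as the content does, so the long expiry remains effective, and it changes automatically when the file is edited. All references to the resource still have to be generated through the same helper.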
Yes, a response with an expiration date in the future will be considered as fresh until the expiration date:
The Expires entity-header field gives the date/time after which the response is considered stale. […]
The presence of an Expires header field with a date value of some time in the future on a response that otherwise would by default be non-cacheable indicates that the response is cacheable, unless indicated otherwise by a Cache-Control header field (section 14.9).
Note that an expiration date more than one year in the future may be interpreted as never expires:
To mark a response as "never expires," an origin server sends an Expires date approximately one year from the time the response is sent. HTTP/1.1 servers SHOULD NOT send Expires dates more than one year in the future.
So if a cache has the response stored, it will probably take the response from the cache even without revalidating the cached response before sending it.
Now if you change a resource that is already stored in caches and still fresh, there is no way to invalidate them:
[…] although they might continue to be "fresh," they do not accurately reflect what the origin server would return for a new request on that resource.
There is no way for the HTTP protocol to guarantee that all such cache entries are marked invalid. For example, the request that caused the change at the origin server might not have gone through the proxy where a cache entry is stored.
This is why such never-expiring resources use a unique version number in the URL (e.g. style-v123.css) that is changed with each update. This is also what I recommend in this case.
By the way, declaring the response with Cache-Control as public doesn’t do anything in this case. This is only used when a response that required authorization should be cacheable:
public  –  Indicates that the response MAY be cached by any cache, even if it would normally be non-cacheable or cacheable only within a non- shared cache. (See also Authorization, section 14.8, for additional details.)
For further information on HTTP caching:
HTTP 1.1 specification – Caching in HTTP
Mark Nottingham’s Caching Tutorial

Does the `Expires` HTTP header need to be consistent across multiple cold-cache requests?

I'm implementing a custom web server of a kind, and am looking into adding support for the Expires header. However, I'm a little unsure of how exactly to implement it.
If multiple cold-cache requests are made to the same unchanged resource on the server and the server returns a different Expires header each time (say it uses relative time to calculate the exact value of the Expires date, e.g. +6 hours from the request time), does that invalidate the cache on all the proxy servers in between as well? Or can this not happen (per the spec)?
Does the Expires HTTP header need to be consistent across multiple cold-cache requests?
Ok, never mind, found the relevant information under the Cache Revalidation and Reload Controls section of the HTTP Spec
Basically, you can serve all the different validators you want but you must be aware that in such case proxies may have a set of different validators from their own cache and from various user agents communicating with the proxy. They may choose to send one to you and that might not be the correct or the most optimal one for the end-users. However, a "best approach" has been suggested in the spec.
I suppose this should cover Expires headers as well as ETags, Cache-Control and whatnot.
Here's the relevant excerpt, in case anyone's interested:
When an intermediate cache is forced, by means of a max-age=0 directive, to revalidate its own cache entry, and the client has supplied its own validator in the request, the supplied validator might differ from the validator currently stored with the cache entry. In this case, the cache MAY use either validator in making its own request without affecting semantic transparency.
However, the choice of validator might affect performance. The best approach is for the intermediate cache to use its own validator when making its request. If the server replies with 304 (Not Modified), then the cache can return its now validated copy to the client with a 200 (OK) response. If the server replies with a new entity and cache validator, however, the intermediate cache can compare the returned validator with the one provided in the client's request, using the strong comparison function. If the client's validator is equal to the origin server's, then the intermediate cache simply returns 304 (Not Modified). Otherwise, it returns the new entity with a 200 (OK) response.
If a request includes the no-cache directive, it SHOULD NOT include min-fresh, max-stale, or max-age.
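The "best approach" in the excerpt can be read as a small decision procedure. Below is a rough Python sketch of it, assuming the validators are ETag strings and that the intermediate cache has already revalidated with its own validator; strong_match and respond_to_client are names invented for this illustration, and the weak-tag check (the W/ prefix) reflects my reading of the strong comparison function.

```python
def strong_match(etag_a, etag_b):
    """Strong comparison: both validators must be strong and identical."""
    if etag_a is None or etag_b is None:
        return False
    if etag_a.startswith("W/") or etag_b.startswith("W/"):
        return False  # weak validators never match under strong comparison
    return etag_a == etag_b


def respond_to_client(client_etag, origin_status, origin_etag):
    """Status code the intermediate cache sends back to the client.

    Assumes the cache made its request to the origin with its own validator.
    """
    if origin_status == 304:
        # The cache's stored copy was validated; return it with a 200.
        return 200
    # The origin sent a new entity: compare its validator with the client's.
    if strong_match(client_etag, origin_etag):
        return 304  # the client already holds the current entity
    return 200  # forward the new entity


if __name__ == "__main__":
    print(respond_to_client('"v1"', 304, None))
    print(respond_to_client('"v2"', 200, '"v2"'))
    print(respond_to_client('"v1"', 200, '"v2"'))
```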