How do ETags in the HTTP header actually work? - apache

I don't know if I am misunderstanding how the caching aspect of ETags works, or if there is some other issue I am dealing with, but I'll walk you through my situation.
From my understanding, an ETag is a unique hash created from the file's information and sent as part of the response headers to identify that version of the file. If the file is updated, that information changes, and hence the ETag for the file changes as well.
In my project, I need a fresh JS file to be fetched every time I make changes to the file. I can't use version tags or unique hashes as part of the file name. I thought an ETag would work like this:
Http Request
GET myFile.js
Client ------------------> SERVER
Http Response 200
Http Response Header
accept-ranges: bytes
cache-control: max-age=86400, public
etag: "a7-58c3bb52101c4"
......
myFile.js
Client <------------------ SERVER
// myFile.js has not been changed
Http Request
GET myFile.js
Client ------------------> SERVER
Http Response 304
Http Response Header
accept-ranges: bytes
cache-control: max-age=86400, public
etag: "a7-58c3bb52101c4"
......
Client uses cached version of file
// myFile has been changed
Http Request
GET myFile.js
Client ------------------> SERVER
Http Response 200
Http Response Header
accept-ranges: bytes
cache-control: max-age=86400, public
etag: "88-58c3a1cb8474f" // new etag generated
......
myFile.js
Client <------------------ SERVER
So, if you request the file again and no changes have been made, the ETag will remain the same, and you'll get a 304 indicating that the cached version should be used.
If the file has been changed, the ETag will be different as well, and a fresh copy of the file will be sent by the server.
This is how I expected it to work.
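One detail the diagram above glosses over: the 304 only happens because the browser sends the cached validator back to the server on the follow-up request. A sketch of that revalidation request, reusing the ETag value from the example above (the browser adds If-None-Match automatically once it has a cached copy):
Http Request
GET myFile.js
If-None-Match: "a7-58c3bb52101c4"
Client ------------------> SERVER
// server compares the validator to the file's current ETag; they match,
// so it answers 304 Not Modified with an empty body
Http Response 304
Client <------------------ SERVER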
MY PROBLEM:
When I update myFile.js it seems like I never get the new ETag hash back. It just defaults to the cached version of the file. If I clear the cache, then I get the latest file and the new ETag. This seems to defeat the purpose to me. Is this how it works, or am I misunderstanding something here?
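For what it's worth, one way to separate server behavior from browser behavior is to replay the conditional request by hand with curl (a debugging sketch; the URL and ETag value are placeholders, substitute the real ones from your first response):
curl -i http://localhost/myFile.js
curl -i -H 'If-None-Match: "a7-58c3bb52101c4"' http://localhost/myFile.js
If the second command returns 304 before you edit the file and 200 with a new ETag afterwards, the server side is working, and the stale copy is coming from the browser cache; note that with cache-control: max-age=86400 the browser is allowed to serve the cached file for a full day without revalidating the ETag at all.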

Related

How to force increase the size of a range-bytes response for videos in Apache?

The newest version of Safari (mobile & desktop) buffers videos about 4x slower than other browsers because it sends many small range-bytes requests as opposed to a few large ones. An example request and response is below (Safari keeps requesting small 64 KB chunks until enough data is loaded for the video to play; in Chrome, Firefox, and other browsers the range-bytes requests are much larger, so the data is delivered much faster in one stream).
Is it possible to get around this issue by forcing my web server (Apache) to ignore Safari's small 64 KB range-bytes requests and instead send a larger amount of data (about 5 MB)? The request is made directly to the video file.
Summary
URL: http://example.org/video.mp4?rand=942824
Status: 206 Partial Content
Source: Network
Request
GET /video.mp4 HTTP/1.1
Accept: */*
Connection: keep-alive
Range: bytes=0-65535
Accept-Encoding: identity
Response
HTTP/1.1 206 Partial Content
Content-Type: video/mp4
Content-Range: bytes 0-65535/467342440
Accept-Ranges: 0-467342440
Content-Length: 65536
Connection: keep-alive
Server: nginx/1.2.1
UPDATE: I managed to change the request Range header using the code below. However, even though the 5 MB is downloaded quickly, Safari continues sending these small 64 KB range requests and ignores the 5 MB that was already downloaded, so this is not a solution.
SetEnvIf Range bytes=0-65535 HAVE_MyRequestHeader
RequestHeader unset Range env=HAVE_MyRequestHeader
RequestHeader set Range bytes=0-5000000 env=HAVE_MyRequestHeader
No, you cannot change it server side. The client makes a request, and the server fulfills that request. Sending data the client didn't ask for will likely cause errors.
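To illustrate why (an informal sketch of the range contract, not actual captured traffic): a 206 response has to describe exactly the bytes the client asked for, so a server that substitutes its own range leaves the client's bookkeeping wrong:
// what Safari asked for
Range: bytes=0-65535
// the only answer it expects
HTTP/1.1 206 Partial Content
Content-Range: bytes 0-65535/467342440
Content-Length: 65536
A rewritten response covering bytes 0-4999999 describes a range the client never requested, so Safari discards the extra data and issues its next 64 KB request anyway, which matches the behavior seen in the UPDATE above.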

How and where to set Access-Control-Expose-Headers for koa-cors

I am attempting to get some headers sent from my server to my front end via a fetch request.
In the controller function, I am explicitly sending some headers like this:
exports.getItems = async (ctx) => {
ctx.set('Search-type', 'category');
};
In postman, when I make a get request to my server I get these headers:
Connection →keep-alive
Content-Length →6442
Content-Type →application/json; charset=utf-8
Date →Thu, 19 Apr 2018 16:10:54 GMT
Search-type →category
However, when I try to access the header in the fetch request from the front end, I can only log the Content-Type. How do I get Search-type from my fetch?
After some googling, I found this issue on github which seems very similar to mine. This led me to another github issue page with the suggestion that I need to 'expose some explicitly needed headers'.
In the koa/cors documentation, there is an option allowHeaders for Access-Control-Allow-Headers. What I want to know is: how do I expose the headers so I can get them on my front end?
In the response to the GET, in addition to adding the Access-Control-Allow-Origin response header, you also need to include the Access-Control-Expose-Headers: <comma-separated-list-of-headers> response header.
If that header isn't returned by the server, then even though the headers reach the browser, the browser blocks JavaScript from accessing any non-safelisted headers. So you can see Content-Type (because it's a CORS-safelisted response header), but not Search-type.
Basically, you need to ensure that the server responds with this
Access-Control-Expose-Headers: Search-type
(in addition to any other CORS response headers, like Access-Control-Allow-Origin, of course).
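Assuming you're using the @koa/cors middleware, the documented option for this header is exposeHeaders. A minimal sketch (the port and response body are placeholders):
const Koa = require('koa');
const cors = require('@koa/cors');

const app = new Koa();

// adds Access-Control-Expose-Headers: Search-type to responses,
// alongside Access-Control-Allow-Origin
app.use(cors({ exposeHeaders: ['Search-type'] }));

app.use(async (ctx) => {
  ctx.set('Search-type', 'category');
  ctx.body = { items: [] };
});

app.listen(3000);
After that, response.headers.get('Search-type') in the front-end fetch handler should return 'category'.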

Can browser caching be controlled by HTTP headers alone w/o using hash names for asset files?

I'm reading it in Webpack docs:
The way it works has a pitfall: if we don’t change filenames of our resources when deploying a new version, browser might think it hasn’t been updated and client will get a cached version of it.
I'm curious, is it mandatory to use this mechanism with ugly file names like main.55e783391098c2496a8f.js for assets in order to inform a browser that an asset file has changed?
Can it be controlled by HTTP headers only? There are multiple HTTP headers in the standard to control how browser caches assets, like:
Cache-Control: no-cache, no-store, must-revalidate
Pragma: no-cache
Expires: 0
Date: Wed, 24 Aug 2020 18:32:02 GMT
Last-Modified: Tue, 15 Nov 2024 12:45:26 GMT
ETag: x234dff
Cache-Control: max-age=12345
So can I use those headers alone? Or do I still have to bother about hash parts in file names main.55e783391098c2496a8f.js?
When a user agent opens a page, it must always get the correct version of the source code. You have two options to achieve this (header sketches for both follow below):
Set Cache-Control, Expires, and a strong validator (ETag) in the response headers. This way you instruct the user agent to perform a relatively lightweight conditional request on each page load.
Embed a version in the source code file's URL and set Cache-Control and Expires response headers. This way you instruct the user agent to cache the source code with that particular version forever.
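A sketch of what the response headers could look like for each option (the values are placeholders, and immutable is a widely supported Cache-Control extension rather than a core directive):
Option 1 - revalidate on every load:
Cache-Control: no-cache
ETag: "x234dff"
Option 2 - versioned URL, cache effectively forever:
GET /main.55e783391098c2496a8f.js
Cache-Control: public, max-age=31536000, immutable
Note that Cache-Control: no-cache does not mean "never store"; it means "store, but revalidate with the server before reuse", which is exactly what makes the ETag round trip work.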
For more information, check the HTTP Caching article by Ilya Grigorik, the HTTP conditional requests MDN page, and this StackOverflow answer about resource revalidation.

Strange case of HTTP headers: If-Modified-Since and If-None-Match

As part of a test application, I'm making HTTP requests to webpages. On receiving a response, I save the current date/time (GMT) and ETag header for subsequent requests. However, for some strange reason, some host servers are not validating the If-Modified-Since and If-None-Match headers on subsequent requests.
One such example is this webpage: www.foxsports.com/nba/cleveland-cavaliers-team-news (running Apache). It always returns the full body with a 200 OK HTTP status when a 304 Not Modified status is expected, hence ignoring the If-Modified-Since and If-None-Match headers sent in the request. I tested it using curl and the online tool Hurl.
Any ideas why the sent request headers are not validated by some host servers?
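For reference, the test can be reproduced from the command line like this (a sketch; the validator values are placeholders that should come from a previous response's Last-Modified and ETag headers):
curl -i https://www.foxsports.com/nba/cleveland-cavaliers-team-news \
  -H 'If-Modified-Since: Thu, 19 Apr 2018 16:10:54 GMT' \
  -H 'If-None-Match: "x234dff"'
In practice, a page that regenerates its content (and therefore its ETag and Last-Modified values) on every request will never produce a matching validator, so such a server always answers with a full 200 response.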

Multipart form file upload, Nginx is "eating" it and not passing it on to the handler

I am new to nginx, but am familiar with low-level HTTP and Apache. When I try to do a multipart/form-data file upload, nginx writes some of the client request body to disk, but it never finishes and never passes the body on to the upstream script.
My specific setup is nginx with /dyn redirected to localhost:1337, where a Node.js instance is listening. It works... except for the file upload handler. Also in the config is /debug, which is redirected to localhost:1338 and goes to a simple dump server.
I changed the error log level to 'info'. It reports storing the client body to a file, and when I examine that file it is almost what I expected:
--boundary_.oOo._MjM5NzEwOTkxMzU2MjA0NjM5MTQxNDA3MjYwOA==
Content-Type: image/jpeg
Content-Disposition: form-data; name="file"; filename="dccde7b5-25aa-4bb2-96a6-81e9358f2252.jpg"
<binary data, ~89k>
The problem is that this file is too short: only 81,920 bytes when the uploaded file is 88,963 bytes; it should be 88,963 bytes plus the header above. And that is literally only half of it: there are 2 files (about the same size) coming in, so I would expect the temp file to be about ~160k. What nginx does is reassign the request's HTTP-level Content-Length and pass that header on to the script; that's it, and of course my script then complains that it never finds --boundary_.oOo._MjM5NzEwOTkxMzU2MjA0NjM5MTQxNDA3MjYwOA==. When I do the same request to my debug service without nginx in the middle, it correctly sends the data, and the HTTP Content-Length is an appropriate 186,943 bytes (both files are around 80k, so this makes sense).
My nginx config is default aside from what I've mentioned here.
Edit: After some more experimenting, all the files in the client body directory are 81,920 bytes.
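For anyone hitting something similar, these are the nginx directives that govern how request bodies are buffered and spilled to disk (a sketch of knobs to inspect, not a confirmed fix for this exact case; the values are examples):
# largest request body nginx will accept at all (413 beyond this)
client_max_body_size 20m;

# in-memory buffer; bodies larger than this are written to a temp file
client_body_buffer_size 128k;

# directory where those temp files (the 81,920-byte files above) land
client_body_temp_path /var/cache/nginx/client_temp;

# in newer nginx versions, stream the body straight to the upstream
# instead of buffering the whole thing first
proxy_request_buffering off;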