Does Amazon S3 need time to update CORS settings? How long? - amazon-s3

Recently I enabled Amazon S3 + CloudFront to serve as a CDN for my Rails application. In order to use font assets and display them in Firefox or IE, I have to enable CORS on my S3 bucket:
<?xml version="1.0" encoding="UTF-8"?>
<CORSConfiguration xmlns="http://s3.amazonaws.com/doc/2006-03-01/">
  <CORSRule>
    <AllowedOrigin>*</AllowedOrigin>
    <AllowedMethod>GET</AllowedMethod>
    <AllowedMethod>POST</AllowedMethod>
    <AllowedMethod>PUT</AllowedMethod>
    <MaxAgeSeconds>3000</MaxAgeSeconds>
    <AllowedHeader>*</AllowedHeader>
  </CORSRule>
</CORSConfiguration>
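For reference, the same rule can also be applied from the command line with the AWS CLI, which takes JSON rather than XML. A minimal sketch, assuming the bucket name from the question:

aws s3api put-bucket-cors --bucket small-read-staging-assets \
  --cors-configuration '{
    "CORSRules": [{
      "AllowedOrigins": ["*"],
      "AllowedMethods": ["GET", "POST", "PUT"],
      "AllowedHeaders": ["*"],
      "MaxAgeSeconds": 3000
    }]
  }'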
Then I ran curl -I https://small-read-staging-assets.s3.amazonaws.com/staging/assets/settings_settings-312b7230872a71a534812e770ec299bb.js.gz and got:
HTTP/1.1 200 OK
x-amz-id-2: Ovs0D578kzW1J72ej0duCi17lnw+wZryGeTw722V2XOteXOC4RoThU8t+NcXksCb
x-amz-request-id: 52E934392E32679A
Date: Tue, 04 Jun 2013 02:34:50 GMT
Cache-Control: public, max-age=31557600
Content-Encoding: gzip
Expires: Wed, 04 Jun 2014 08:16:26 GMT
Last-Modified: Tue, 04 Jun 2013 02:16:26 GMT
ETag: "723791e0c993b691c442970e9718d001"
Accept-Ranges: bytes
Content-Type: text/javascript
Content-Length: 39140
Server: AmazonS3
Should I see 'Access-Control-Allow-Origin' somewhere? Does S3 take time to update CORS settings? Can I force the headers to expire if it's caching them?

Try sending the Origin header:
$ curl -v -H "Origin: http://example.com" -X GET https://small-read-staging-assets.s3.amazonaws.com/staging/assets/settings_settings-312b7230872a71a534812e770ec299bb.js.gz > /dev/null
The output should then show the CORS response headers you are looking for:
< Access-Control-Allow-Origin: http://example.com
< Access-Control-Allow-Methods: GET
< Access-Control-Allow-Credentials: true
< Vary: Origin, Access-Control-Request-Headers, Access-Control-Request-Method
Additional information about how to debug CORS requests with cURL can be found here:
How can you debug a CORS request with cURL?
Note that there are different types of CORS requests (simple and preflight), a nice tutorial about the differences can be found here:
http://www.html5rocks.com/en/tutorials/cors/
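To exercise the preflight path specifically, send an OPTIONS request and tell S3 which method the real request will use; a sketch against the object from the question (the Origin value is illustrative):

curl -v -X OPTIONS \
  -H "Origin: http://example.com" \
  -H "Access-Control-Request-Method: GET" \
  https://small-read-staging-assets.s3.amazonaws.com/staging/assets/settings_settings-312b7230872a71a534812e770ec299bb.js.gz

A matching CORS rule should produce the same Access-Control-Allow-* headers in the response.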
Hope this helps!

Try these:
Try scoping down the domain names you want to allow access to; S3 doesn't like *.
CloudFront + S3 doesn't handle the CORS configuration correctly out of the box. A kludge is to append a query string containing the name of the referring domain, and to explicitly enable support for query strings in your CloudFront distribution settings (see the example below).
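A sketch of that kludge; the distribution domain and asset path are hypothetical. The query string makes the cached copy unique per referring site, so stale CORS headers aren't served across origins:

# Distribution domain and asset path are hypothetical
curl -I "https://d1234abcd.cloudfront.net/assets/fontawesome-webfont.woff?origin=example.com"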

To answer the actual question in the title:
No, S3 does not seem to take any time to propagate the CORS settings. (as of 2019)
However, if you're using Chrome (and possibly other browsers), CORS responses may be cached by the browser, so you won't necessarily see the changes you expect after an ordinary refresh. Instead, right-click the refresh button and choose "Empty Cache and Hard Reload" (as of Chrome 73). The new CORS settings then take effect within about 5 seconds of making the change in the AWS console. (It may be much faster than that; I haven't tested.) This applies to a plain S3 bucket; I don't know how CloudFront affects things.
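An easy way to take the browser cache out of the equation entirely is to check the headers from the command line; a sketch using the bucket from the question (the Origin value is illustrative):

curl -s -D - -o /dev/null \
  -H "Origin: http://example.com" \
  https://small-read-staging-assets.s3.amazonaws.com/staging/assets/settings_settings-312b7230872a71a534812e770ec299bb.js.gz

If Access-Control-Allow-Origin shows up here but not in the browser, the browser cache is the culprit.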
(I realize this question is 6 years old and may have involved additional technical issues that other people have long since answered, but when you search for the simple question of propagation times for CORS changes, this question is what pops up first, so I think it deserves an answer that addresses that.)

You have a few problems with the way you test CORS.
Your CORS configuration does not include a HEAD method.
Your curl command does not send an Origin header with -H.
I am able to fetch your data using curl as follows (it dumps garbage to the screen, because the payload is gzip-compressed):
curl --request GET https://small-read-staging-assets.s3.amazonaws.com/staging/assets/settings_settings-312b7230872a71a534812e770ec299bb.js.gz -H "Origin: http://google.com"

Related

CkRest.AddHeader function does not add a header using Chilkat C++ ("Content-MD5" header using fullRequestBinary PUT)

We are using Chilkat 9.5.0.80 C++ library.
There is a certain HTTP header we cannot add to our requests: "Content-MD5". When we add this header like this:
m_ckRest.AddHeader("Content-MD5", "any-value-here");
and examine the resulting request*, the "Content-MD5" header is NOT present.
However, when we add a header of a different name:
m_ckRest.AddHeader("Content-Type", "application/octet-stream");
... the resulting request DOES contain that header. We are using the "fullRequestBinary" method, for example:
const char* responseStrPtr = m_ckRest.fullRequestBinary( "PUT", encodedObjectName.c_str(), ckByteDataBuffer);
* We are examining our requests using a Proxy (using "Fiddler" as an http proxy in between us and Amazon S3 for example to test the upload of a "part" in a multipart AWS S3 upload) and in every attempt, the "Content-MD5" header is NOT present, while other headers are present.
Is this a bug? We found an old forum post from 2013 referencing a very similar sounding problem: http://www.chilkatforum.com/questions/2901/addheader-range-does-not-appear-to-be-effective Does Chilkat remove or ignore our attempt to add a "Content-MD5" header? Is this bug fixed in a version newer than the one we are using? Is there a workaround? Here is an example of the headers in a PUT request:
PUT https://our-bucket.s3.us-west-1.amazonaws.com/somefile?partNumber=4&uploadId=tJJYIXdxG_7X8elzSJrKt32A_rH46Y0Yk1vyzZgwxpvmK5uCrcE82k_F9UmytVHWuxXfc6tX5o3w.SRnnYcD7VBskcLrr0xC13bHHVDx62iGGQ3eIzkv5J5d1F4_DkcW HTTP/1.1
Content-Length: 5266235
x-amz-date: 20200921T201943Z
x-amz-content-sha256: 90fa8fc564dd558d0c2eac92e367d94101f4ca9570c970795b9fdb2aa96d6666
Host: our-bucket.s3.us-west-1.amazonaws.com
Content-Type: application/octet-stream
Date: Mon, 21 Sep 2020 20:19:43 GMT
Authorization: AWS4-HMAC-SHA256 Credential=AKIAIBYS55OSD2FIOBFUS/20200921/us-west-1/s3/aws4_request,SignedHeaders=content-type;host;x-amz-content-sha256;x-amz-date,Signature=8ea74cb7769d8e158e5ccc0604cc2cdb096703b10c3c8d9323d0746debbdUUU
In correspondence with Chilkat support, it turns out that Chilkat versions 9.5.0.80 and 9.5.0.83 intentionally remove the Content-MD5 header when authenticating with AWS Signature V4. Instead, Chilkat calculates the SHA-256 hash and places it in the x-amz-content-sha256 header (when authenticating with the older AWS Signature V2, it calculates Content-MD5 instead, I'm told). So, contrary to the comment from Chilkat Software, this had not been fixed in a later version as of the writing of that comment, and the removal is intentional.
This is not terrible, but it stems from the misunderstanding that a SHA-256 hash of the content is necessary to construct a valid AWS Signature V4, when in fact it is not. While SHA-256 is perfectly adequate for content verification, it is also more expensive than MD5.
The AWS C++ SDK itself does not put a SHA-256 hash in the x-amz-content-sha256 header when uploading a part. I've confirmed that it uses x-amz-content-sha256: UNSIGNED-PAYLOAD, computes the less costly MD5 hash instead, and puts it in the Content-MD5 header (see the AWS documentation at https://docs.aws.amazon.com/AmazonS3/latest/API/sig-v4-header-based-auth.html):
Unsigned payload option – You include the literal string UNSIGNED-PAYLOAD when constructing a canonical request, and set the same value as the x-amz-content-sha256 header value when sending the request to Amazon S3.
Here is an example of an Amazon AWS UploadPart request using Content-MD5 for content verification, and NOT using SHA-256 for signing the request (captured from a request made with the AWS SDK for C++):
PUT https://mybucket.s3.us-west-1.amazonaws.com/somefile.mfs01?partNumber=1&uploadId=6CHL6tPKFcRSoxD4iysjKMgQCNfcFAt87bn4fsduV1YI5_aFIz9e36BxFURH_iEX8EChUtQm06qT9oyIUDbAnA.2M.novpBBKsnGl_NqNvVllQ7L1VK6x1PiLlqq46tH HTTP/1.1
Cache-Control: no-cache
Connection: Keep-Alive
Pragma: no-cache
Content-Type: binary/octet-stream
Content-MD5: PV204S0m8zJY8zu9Q3EF+w==
Accept: */*
Authorization: AWS4-HMAC-SHA256 Credential=AKIAIBYS55OSD2FOBFUSC/20200923/us-west-1/s3/aws4_request, SignedHeaders=amz-sdk-invocation-id;amz-sdk-request;content-length;content-md5;content-type;host;x-amz-content-sha256;x-amz-date, Signature=d013028d77e45f3dcce5f46f3fb53cdeeb3c9cfbd931371e69a9925047e61cd3
Host: nuix-nov-dev.s3.us-west-1.amazonaws.com
User-Agent: aws-sdk-cpp/1.7.333 Windows/10.0.19041.329 x86 MSVC/1927
amz-sdk-invocation-id: E57D09A7-B5E7-4E2A-8B2D-B493147F06D7
amz-sdk-request: attempt=1
x-amz-content-sha256: UNSIGNED-PAYLOAD
x-amz-date: 20200923T212738Z
Content-Length: 5242880
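For reference, both digest values can be recomputed locally over the exact bytes of the uploaded part (the file name here is hypothetical). Content-MD5 is the base64-encoded binary MD5 digest of the body (RFC 1864), while x-amz-content-sha256 is the lowercase hex SHA-256:

# Value for the Content-MD5 header, e.g. PV204S0m8zJY8zu9Q3EF+w==
openssl md5 -binary part-0001.bin | base64
# Value for the x-amz-content-sha256 header (when the payload is signed)
openssl dgst -sha256 part-0001.bin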
Chilkat gave us a new "beta" build that allows us to specify the Content-MD5 header even with AWS Signature V4 and won't remove it. However, it is added in addition to the automatically calculated SHA-256 in x-amz-content-sha256, so the content is hashed twice unnecessarily; it would be better to be able to specify UNSIGNED-PAYLOAD for the purposes of the AWS signature.
If there is a content mismatch error with the Content-MD5 value, AWS returns this (with a status 400):
<?xml version="1.0" encoding="UTF-8"?>
<Error>
  <Code>InvalidDigest</Code>
  <Message>The Content-MD5 you specified was invalid.</Message>
  <Content-MD5>thisisbad</Content-MD5>
  <RequestId>8274DC9566D4AAA8</RequestId>
  <HostId>H6kSy4cl+54nMon1Hq6AGjmTX/MfTVMQQr8vEVNXUnPlfMtIt8HPdObfusckhBpwpG/CJ6ORWv16c=</HostId>
</Error>
If there is a content mismatch with x-amz-content-sha256, AWS returns the following error, which I had difficulty finding on the web and which is slightly different, so I'm pasting it here (also status 400):
Status:400 : AWSCode: XAmzContentSHA256Mismatch : AWSMessage: The provided 'x-amz-content-sha256' header does not match what was computed.
This problem should have already been fixed in a later version of Chilkat.

Google Cloud Function injecting 'Cache-Control: Private' header in response to Bingbot User Agent independent of function code

We have a live Google Cloud Function which essentially returns the correct redirects for us from a now-defunct site using a very simple Python script, backed by a CDN that caches the responses to avoid triggering the function more than necessary.
We're not having any problems with how the function itself works. However, we have noticed that in response to a specific User-Agent (Bingbot) being passed with the request, Google Cloud Functions injects a Cache-Control: private header into the response, independent of the function code (which does not set a Cache-Control header on the 301 response it sends back). This causes all requests from Bingbots to be passed to the backend every time, making our Cloud Function usage, and therefore our costs, much higher than they would ordinarily be.
This also causes changes to Content and Transfer Encoding, although we are less concerned about this.
We tested this by stripping out the User-Agent header at the CDN level before the request reaches the backend (the function), and confirmed that without the Bingbot header we get zero persistent passes; allowing the header back through recreated the issue of far more passes than we should be seeing.
We've begun stripping all User-Agent headers for now, which has solved the issue superficially, but we are concerned that this is undocumented behaviour and that we have no information about when Cloud Functions may inject or manipulate response headers in other circumstances.
To confirm this isn't coming from our Python script, the relevant portion returning our response is as follows:
try:
    return flask.redirect(redirect_dict[request.path], code=301)
except:
    return flask.redirect(os.environ.get('FALLBACK_URL'), code=301)
Curl with Bingbot UA (actual URL & host obscured):
curl -v -X GET $function/$path -H 'User-Agent: Mozilla/5.0 (compatible; bingbot/2.0; +http://www.bing.com/bingbot.htm)' -H $host
And the relevant response:
< HTTP/1.1 301 Moved Permanently
< Content-Type: text/html; charset=utf-8
< Function-Execution-Id: jzlrm3k4ndhv
< Location: $redirectURL
< X-Cloud-Trace-Context: 83841aa8390d4ea4c1c8349c3aca21be
< Content-Encoding: gzip
< Date: Mon, 20 May 2019 13:02:22 GMT
< Server: Google Frontend
< Cache-Control: private
< Alt-Svc: quic=":443"; ma=2592000; v="46,44,43,39"
< Transfer-Encoding: chunked
Without Bingbot UA, the response is:
< HTTP/1.1 301 Moved Permanently
< Content-Type: text/html; charset=utf-8
< Function-Execution-Id: t8frc9wsdvzp
< Location: $redirectURL
< X-Cloud-Trace-Context: 1f817eecdc84ad4a7542fba5898caf50;o=1
< Date: Mon, 20 May 2019 13:02:37 GMT
< Server: Google Frontend
< Content-Length: 319
< Alt-Svc: quic=":443"; ma=2592000; v="46,44,43,39"
We would expect responses to be the same as we are not injecting any cache-control headers in response to queries. Clearly varying the User Agent causes Google Cloud Functions to inject additional headers, vary encoding and otherwise transform responses. The concern is that there is no documentation or other information about this (unless I've missed it). If someone could point me at any kind of explanation, or if someone from Google could explain why this happens and any other settings we could use to prevent it, that would be the ideal outcome here.

Can browser caching be controlled by HTTP headers alone w/o using hash names for asset files?

I'm reading this in the Webpack docs:
The way it works has a pitfall: if we don’t change filenames of our resources when deploying a new version, browser might think it hasn’t been updated and client will get a cached version of it.
I'm curious: is it mandatory to use this mechanism, with ugly file names like main.55e783391098c2496a8f.js for assets, in order to inform the browser that an asset file has changed?
Can it be controlled by HTTP headers only? There are multiple HTTP headers in the standard to control how browser caches assets, like:
Cache-Control: no-cache, no-store, must-revalidate
Pragma: no-cache
Expires: 0
Date: Wed, 24 Aug 2020 18:32:02 GMT
Last-Modified: Tue, 15 Nov 2024 12:45:26 GMT
ETag: x234dff
max-age: 12345
So can I use those headers alone? Or do I still have to bother about hash parts in file names main.55e783391098c2496a8f.js?
When a user agent opens a page, it must always get the correct version of the source code. You have two options to achieve this:
Set Cache-Control, Expires, and a strong validator (ETag) in the response headers. This way you instruct the user agent to perform a relatively lightweight conditional request on each page load.
Embed a version in the source code file URL, and set Cache-Control and Expires response headers. This way you instruct the user agent to cache the source code of that particular version forever.
For more information, check the HTTP Caching article by Ilya Grigorik, the HTTP conditional requests MDN page, and this StackOverflow answer about resource revalidation.
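A quick way to see the first option in action from the command line (the URL is hypothetical; the ETag value is borrowed from the question's header list):

# First request: note the validator the server hands back
curl -sI https://example.com/main.js | grep -i '^etag'
# Revalidation: a matching If-None-Match should yield 304 Not Modified
curl -sI -H 'If-None-Match: "x234dff"' https://example.com/main.js | head -n 1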

RabbitMQ HTTP API call to aliveness-test returns 404 but other calls work

When using the HTTP API I am trying to make a call to the aliveness-test for monitoring purposes. At the moment I am testing using curl and the following command:
curl -i http://guest:guest@localhost:55672/api/aliveness-test/
And I get the following response:
HTTP/1.1 404 Object Not Found
Server: MochiWeb/1.1 WebMachine/1.9.0 (someone had painted it blue)
Date: Mon, 05 Nov 2012 17:18:58 GMT
Content-Type: text/html
Content-Length: 193
<HTML><HEAD><TITLE>404 Not Found</TITLE></HEAD><BODY><H1>Not Found</H1>The requested document was not found on this server.<P><HR><ADDRESS>mochiweb+webmachine web server</ADDRESS></BODY></HTML>
When making a request just to list the users or vhosts, the request returns successfully:
$ curl -I http://guest:guest@localhost:55672/api/users
HTTP/1.1 200 OK
Server: MochiWeb/1.1 WebMachine/1.9.0 (someone had painted it blue)
Date: Mon, 05 Nov 2012 17:51:44 GMT
Content-Type: application/json
Content-Length: 11210
Cache-Control: no-cache
I'm using the latest stable version (2.8.7) of RabbitMQ, and obviously have the management plugin installed since the users call works (that response is left out because it contains company data, but it is just regular JSON as expected).
There isn't much on the internet about this call failing so I am wondering if anyone has seen this before?
Thanks,
Kristian
Turns out that the '/' at the beginning of vhost names is not implicit, even as part of a URL. To get this to work I simply changed my request from:
curl -i http://guest:guest@localhost:55672/api/aliveness-test/
To
curl -i http://guest:guest@localhost:55672/api/aliveness-test/%2F
As %2F is the URL-encoded form of '/', my request now queries the vhost named '/' and returns a 200 response which looks like:
{"status":"ok"}

Terrible Apache Bench results on Custom CMS

Please note: this is not a complaint about a shoddy CMS.
I was just toying with Apache Bench and got terrible results with our custom CMS; more exactly, I got:
Requests per second: 0.37 [#/sec] (mean)
When I ran another test with a plain PHP file I got:
Requests per second: 4786.07 [#/sec] (mean)
Another test with a previous version of the CMS:
Requests per second: 6068.66 [#/sec] (mean)
The website(s) are working fine with no problems detected, and Google's Webmaster Tools reports our sites as faster than 80% of pages, which is fine, I think.
The test was:
ab -t 30 -c 10 http://example.com/
Maybe some kind of Apache problem? Bad .htaccess config, or similar?
Update:
I just ran a simple test with sockets and the results are similar: the page loads very, very slowly. If I run my script against another website, everything is fine.
Also, there's a small hint of a chunk-length problem (bad Apache headers, or line endings?).
The site is gzipped, and with verbose logging turned on I see these lines in the response:
LOG: Response code = 200
LOG: header received:
HTTP/1.1 200 OK
Date: Tue, 04 Oct 2011 13:10:49 GMT
Server: Apache
Set-Cookie: PHPSESSID=ibnfoqir9fee2koirfl5mhm633; path=/
Expires: Sat, 26 Jul 1997 05:00:00 GMT
Cache-Control: no-store, no-cache, must-revalidate
Pragma: no-cache
Cache-Control: post-check=0, pre-check=0
Vary: Accept-Encoding
Transfer-Encoding: chunked
Content-Type: text/html; charset=UTF-8
2ef6
It always appears at the same place, in the middle of the HTML source, followed by <!DOCTYPE HTML> again.
Please help.
Update #2:
Just checked my HTTP headers with Rex Swain's HTTP Viewer and got these results:
HTTP/1.1·200·OK(CR)(LF)
Date:·Wed,·05·Oct·2011·08:33:51·GMT(CR)(LF)
Server:·Apache(CR)(LF)
Set-Cookie:·PHPSESSID=n88g3qcvv9p6irm1fo0qfse8m2;·path=/(CR)(LF)
Expires:·Sat,·26·Jul·1997·05:00:00·GMT(CR)(LF)
Cache-Control:·no-store,·no-cache,·must-revalidate(CR)(LF)
Pragma:·no-cache(CR)(LF)
Cache-Control:·post-check=0,·pre-check=0(CR)(LF)
Vary:·Accept-Encoding(CR)(LF)
Connection:·close(CR)(LF)
Transfer-Encoding:·chunked(CR)(LF)
Content-Type:·text/html;·charset=UTF-8(CR)(LF)
(CR)(LF)
Do you notice anything unusual?
If it works well with ordinary web browsers (as you mentioned in the comments), then the CMS handles requests from Apache Bench differently.
A quick checklist:
AFAIK Apache Bench just sends simple requests without any cookie handling, so try setting -C with a valid cookie (copy the values from a web browser; see the example after this list).
Try to send exactly the same headers to the CMS as the web browser sends. Save a dump of a valid request with netcat, HttpFox, or a packet sniffer, and set the missing headers with -H.
Profile the CMS on the server while you're sending it a request with Apache Bench; maybe you'll find the bottleneck. Two poor man's error_log calls with a timestamp in the first and last lines of index.php (or the tested script's entry point) could show how fast the PHP script is and help calculate the overhead of the Apache HTTP Server and the network.
If you run socket tests and browser tests from different machines, it could be a DNS issue (turn off HostnameLookups in Apache). Try to run them from the same machine.
Try ab -k ... or ab -H "Connection: close" ....
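As an example of the first item, a sketch that replays the session cookie from the question's Set-Cookie header so the CMS can reuse one session:

ab -t 30 -c 10 -C "PHPSESSID=ibnfoqir9fee2koirfl5mhm633" http://example.com/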
My first guess is that the CMS does some costly initialization when it creates the session, and this happens while it processes the first request. Since Apache Bench does not send cookies back, the CMS creates a new session for every request, and that is the cause of the slow answers.
A second guess is that the CMS handles the incoming HTTP headers differently, and the headers Apache Bench sends (or the lack of them) trigger some costly/slow processing. That looks more plausible given the Google Webmaster Tools report.
Apache Bench sends HTTP/1.0 requests, for example:
GET / HTTP/1.0
Host: localhost:9100
User-Agent: ApacheBench/2.3
Accept: */*
It looks to me like your server does not send any HTTP header about its keep-alive settings, yet assumes the client uses keep-alive even though the client speaks HTTP/1.0. That is not RFC-compliant behaviour:
From RFC 2616, 19.6.2 Compatibility with HTTP/1.0 Persistent Connections:
Some clients and servers might wish to be compatible with some previous implementations of persistent connections in HTTP/1.0 clients and servers. Persistent connections in HTTP/1.0 are explicitly negotiated as they are not the default behavior.
By default, Apache Bench doesn't use keep-alive, so after the response arrives it waits for the server to close the socket. The server closes it after 15 seconds of idle time. Downloading the main page with wget also takes 15 seconds, and wget likewise uses HTTP/1.0 for the request.
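To roughly reproduce the wait outside of ab (an assumption worth verifying on your server), time a plain HTTP/1.0 request; it should stall for the server's ~15-second idle timeout before the socket closes:

time curl --http1.0 -s -o /dev/null http://example.com/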
I think it's a bug in the PHP code of the CMS, since ab works well on the same server with a plain PHP file. Anyway, you can work around it by using keep-alive connections (-k):
ab -k -t 30 -c 10 http://example.com/
or with explicitly disabling persistent connections:
ab -H "Connection: close" -t 30 -c 10 http://example.com/
but it's still a server-side issue, and your original ab command was correct.
Please note that this bug probably affects only HTTP/1.0 clients (like Apache Bench and wget); users with regular browsers will not notice it.