Cloudflare refresh resource - cloudflare

I am thinking to use cloudflare to cache a resource generated by a REST API endpoint.
Because the API can take time to return the result, I am wondering if it is possible to configure cloudflare to refresh the resource in background returning always the cached resource to clients.

You can use page rules on the API endpoint to cache the result for X hours (or days, etc).
I think it will have to be GET though, I don't think POST is ever cached.

Related

Cloudflare adds too much latency to API calls

Trying out cloudflare for ddos protection of an SPA webapp, using free tier for testing.
Static contents loading is fine, but API calls became very slow.
From original <50ms for each api call to around 450~500ms each.
My apis are called via a subdomain eg apiXXX.mydomain.xyz
Any idea the problems or alternative fast ddos protection solution?
Cloudflare has created a page to explain the configuration you have to do when proxyfing APIs :
You have to create a new Page Rule in which you have to bypass cache, turn off Always Online and Browser Integrity Check options.
If you do not configure this, you may have slow response time because all those options are enabled by default on API calls.
Here's the link to create the configuration : Cloudflare Page Rule for API

Requests to an API endpoint denied due to Testcafe prepending request URLs - solutions?

When Testcafe runs against our local site, every request it makes during the test steps are prepended with something like http://192.168.1.182:59304/http://localhost:3000 (port number varies per run).
For the most part this works, but our web application makes calls to certain APIs during a user journey, and within TestCafe they might look like: http://192.168.1.182:59304/http://www.example.com/api/v2/customers/1 which come back with a 401 and response body of 'unauthorized'. Some API calls are fine, however.
I guess my question is:
Are there any way to get around this from my side, such as rewrite certain requests, or do I need to contact the API provider - and if so, what would they be potentially looking to do to allow these requests to go ahead?
You have faced this issue: https://github.com/DevExpress/testcafe-hammerhead/issues/2344. It was fixed. Try to run your tests with the latest TestCafe version (1.8.8-alpha.3).

Cloufdront Masks User Agent and Remote Address

We are running an SPA that communicates with an API. Both are exposed to the public via Cloudfront.
We now have the issue that the requests we see in the backend are masked by Cloudfront. Meaning:
The Remote Address we see is the address of the AWS Cloud
The User Agent Header field is set to "Amazon Cloudfront" and not the browser of the user
So Cloudfront somehow intercepts the request in a way we didn't anticipate.
I already went through these steps: https://aws.amazon.com/premiumsupport/knowledge-center/configure-cloudfront-to-forward-headers/ but ended up cutting the connection between the API and the frontend.
We don't care about caching implications (we don't have a lot of traffic), we just need to the right fields to show up in the backend.
By default, most request headers are removed, because CloudFront's default behaviors are generally designed around optimal caching. CloudFront's default header handling behavior is documented.
If you need to see specific headers at the origin, whitelist those headers for forwarding in the cache distribution. The documentation refers to this as “Selecting the Headers on Which You Want CloudFront to Base Caching” -- and that is what it does -- but that description masks what's actually happening. CloudFront removes the rest of the headers because it has no way of knowing for certain whether a specific header with a certain value might change the response that the origin generates. If it didn't remove these headers by default, there would be confusion in the other direction when users saw the "wrong" responses served from the cache.
In your case, you almost certainly don't want to include the Host header in what you are whitelisting for forwarding.
When testing, especially, be sure you also set the Error Caching Minimum TTL to 0, because the default value is 300 seconds... so you can't see whether the problem is fixed for up to 5 minutes after you fix it. This default is also by design, a protective measure to avoid overloading your origin with requests that are likely to continue to fail.
When examining responses from CloudFront, keep an eye on the Age response header, which is present any time the response is served from cache. It tells you how long it's been (in seconds) since CloudFront obtained the response it is currently returning to you.
If you want to disable CloudFront caching, you can set Maximum, Minimum, and Default TTL all to 0 (this only affects 2xx and 3xx HTTP responses -- errors are cached for a different time window, as noted above), or your origin can consistently return Cache-Control: s-maxage=0, which will prevent CloudFront from caching the response.

Does Amazon pass custom headers to origin?

I am using CloudFront to front requests to our service hosted outside of amazon. The service is protected and we expect an "Authorization" header to be passed by the applications invoking our service.
We have tried invoking our service from Cloud Front but looks like the header is getting dropped by cloud front. Hence the service rejects the request and client gets 401 forbidden response.
For some static requests, which do not need authorization, we are not getting any error and are getting proper response from CloudFront.
I have gone through CloudFront documentation and there is no specific information available on how headers are handled and hence was hoping that they will be passed as is, but looks like thats not the case. Any guidance from you folks?
The list of the headers CF drops or modifies can be found here
http://docs.aws.amazon.com/AmazonCloudFront/latest/DeveloperGuide/RequestAndResponseBehaviorCustomOrigin.html#RequestCustomRemovedHeaders
CloudFront does drop Authorization headers by default and will not pass it to the origin.
If you would like certain headers to be sent to the origin, you can setup a whitelist of headers under CloudFront->Behavior Settings->Forward headers. Just select the headers that you would like to be forwarded and CloudFront will do the job for you. I have tested it this way for one of our location based services and it works like a charm.
One thing that I need to verify is if the Authorization header will be included in the cache key and if its safe to do that?? That is something you might want to watch out for as well.
It makes sense CF drops the Authorization header, just imagine 2 users asking for the same object, the first one will grant access, CF will cache the object, then the second user will get the object as it was previously cached by CloudFront.
Great news are using forward headers you can forward the Authorization header to the origin, that means the object will be cached more than once as the header value is part of the cache "key"
For exmple user A GETS private/index.html
Authorization: XXXXXXXXXXXXX
The object will be cached as private/index.html + XXXXXXXXXXXXX (this is the key to cahce the object in CF)
Now when the new request from a diferent user arrives to CloudFront
GET private/index.html
Authorization: YYYYYYYYYYYY
The object will be passed to the origin as the combinaiton of private/index.html + YYYYYYYYYYYY is not in CF cache.
Then Cf will be cached 2 diferent objects with the same name (but diferent hash combinaiton name).
In addition to specifying them under the Origin Behaviour section, you can also add custom headers to your origin configuration. In the AWS documentation for CloudFront custom headers:
If the header names and values that you specify are not already present in the viewer request, CloudFront adds them. If a header is present, CloudFront overwrites the header value before forwarding the request to the origin.
The benefit of this is that you can then use an All/wildcard setting for whitelisting your headers in the behaviour section.
It sounds like you are trying to serve up dynamic content from CloudFront (at least in the sense that the content is different for authenticated vs unauthenticated users) which is not really what it is designed to do.
CloudFront is a Content Distribution Network (CDN) for caching content at distributed edge servers so that the data is served near your clients rather than hitting your server each time.
You can configure CloudFront to cache pages for a short time if it changes regularly and there are some use cases where this is worthwhile (e.g. a high volume web site where you want to "micro cache" to reduce server load) but it doesn't sound like this is the way you are trying to use it.
In the case you describe:
The user will hit CloudFront for the page.
It won't be in the cache so CloudFront will try to pull a copy from the origin server.
The origin server will reply with a 401 so CloudFront will not cache it.
Even if this worked and headers were passed back and forth in some way, there is is simply no point in using CloudFront if every page is going to hit your server anyway; you would just make the page slower because of the extra round trip to your server.

Is it possible to use both S3 Query String Authentication and HTTP caching?

I have the following requirements for a (Rails) web application that uses S3/Cloudfront for image storage:
A user may only see an image if they are logged in. If the user sends an image URL to a friend, it will not work.
If a user has seen an image, it should be cached by their browser, so they don't have to download it again.
…
Requirement 1 can solved with S3's Query String Authentication (QSA)
(e.g. with 30 second expiry).
Requirement 2 can be solved using HTTP
caching.
Is it possible to use them both together?
The challenge I'm facing is that QSA effectively changes the URL of the image after expiry, even though a perfectly good copy may reside in the browser cache.