Cloudfront serving png/jpg vs webp based on request headers - amazon-s3

I have Cloudfront in front of S3 serving images (png and jpg).
I have all png and jpg images in webp format in the same directory with .webp extension. For example:
png: /path/to/file.png
webp: /path/to/file.png.webp
I'd like to serve the webp file dynamically without changing the markup.
Since browsers flag webp support via Accept header, what i need to do is: if the user has support for webp (via Accept header) Cloudfront would pull the webp version (filename.png.webp), if not it should serve the original file (filename.png)
Is this possible to achieve?

Making Cloudfront serve different resources is easy (when you have done it a couple of times), but my concern is whether the entity making the request (i.e. browser) and possible caching elements between (proxies etc) expects to have different media types on the same request URI. But that is a bit beyond your question. I believe the usual way to handle this problem is with a element where the browser is free to choose an image from different media types like this:
<picture>
<source type="image/svg+xml" srcset="pyramid.svg" />
<source type="image/webp" srcset="pyramid.webp" />
<img
src="pyramid.png"
alt="regular pyramid built from four equilateral triangles" />
</picture>
But if you still want to serve different content from Cloufront for the same URL this is how you do it:
Cloudfront has 4 different points where you can inject a lamdba function for request manipulation (Lambda#Edge).
For your use case we need to create a Lambda#Edge function at the Origin Request location then associate this function with your Cloudfront Distribution.
Below is an example from AWS docs that looks on device type and does URL manipulation. For your use case, something similar can be done by looking at the "Accept" header.
'use strict';
/* This is an origin request function */
exports.handler = (event, context, callback) => {
const request = event.Records[0].cf.request;
const headers = request.headers;
/*
* Serve different versions of an object based on the device type.
* NOTE: 1. You must configure your distribution to cache based on the
* CloudFront-Is-*-Viewer headers. For more information, see
* the following documentation:
* https://docs.aws.amazon.com/console/cloudfront/cache-on-selected-headers
* https://docs.aws.amazon.com/console/cloudfront/cache-on-device-type
* 2. CloudFront adds the CloudFront-Is-*-Viewer headers after the viewer
* request event. To use this example, you must create a trigger for the
* origin request event.
*/
const desktopPath = '/desktop';
const mobilePath = '/mobile';
const tabletPath = '/tablet';
const smarttvPath = '/smarttv';
if (headers['cloudfront-is-desktop-viewer']
&& headers['cloudfront-is-desktop-viewer'][0].value === 'true') {
request.uri = desktopPath + request.uri;
} else if (headers['cloudfront-is-mobile-viewer']
&& headers['cloudfront-is-mobile-viewer'][0].value === 'true') {
request.uri = mobilePath + request.uri;
} else if (headers['cloudfront-is-tablet-viewer']
&& headers['cloudfront-is-tablet-viewer'][0].value === 'true') {
request.uri = tabletPath + request.uri;
} else if (headers['cloudfront-is-smarttv-viewer']
&& headers['cloudfront-is-smarttv-viewer'][0].value === 'true') {
request.uri = smarttvPath + request.uri;
}
console.log(`Request uri set to "${request.uri}"`);
callback(null, request);
};
Next you need to tell Cloudfront that you want to use the Accept header as a part of your cache key (otherwise Cloudfront would only execute your Origin Request lambda once and also not expose this header to your function).
You do this nowadays with cache and origin request policies. Or with legacy settings (Edit Behaviour under your Cloudfront distribution settings) such as:
Worth to note here is that, if you get low cache hit ratio due to different variants of the Accept header you need to pre-process / clean it. The way I would do it is with a Viewer Request Lamdba that gets executed for each request. This new Lambda would then check if the Accept header supports Webp and then add a single NEW header to the request that it passes on to the Origin Request above. That way the Origin Request can cache on this new header (which only has two different possible values)
There's more config/setup needed such as IAM policies to get Lamdba to run etc, but there's lots of great material out there that walks you through the steps. Maybe start here?

Related

SignatureDoesNotMatch on GET request to pre-signed URL (hellosign)

i am attempting to grab a pdf file for e-signature using pre-signed URLs. i use Amplify Storage to generate the URL. then i provide the URL to the HelloSign function to bring the file into the embedding window. i am able to open the document using window.open but this other method brings the SignatureDoesNotMatchThe request signature we calculated does not match the signature you provided. error.
The request signature we calculated does not match the signature you provided. Check your key and signing method.
key: string = 'bucketname/path/to/file.pdf'
expires: number = 60
const url = await Storage.get(key, {
level: 'public',
cacheControl: 'no-cache',
signatureVersion: 'v4',
ContentType: 'application/pdf',
expires: 60,
customPrefix: {
public: '',
protected: '',
private: '',
},
})
hellosign.open(url, {
clientId,
skipDomainVerification: true,
})
one fix i found on Github was to include the signatureVersion when initializing the S3 class instance using the SDK. i am not using the SDK however, so i tried providing it when configuring Amplify like so:
AWSS3: {
bucket,
region,
signatureVersion: 'v4',
},
needless to say it did not fix the issue. i could not find any references to this in the docs, as the Amplify.configure function takes any in their docs.
i tried clicking the pre-signed URL and was able to download the doc without issue. i inspected the request and saw the correct headers that match the outgoing request originating from hellosign.open. any ideas what i can try?
hellosign.open() is only expected to load embedded URLs. These would be URLs returned from requests to the following endpoints:
/embedded/sign_url/ (sign URL - embedded signing)
/template/create_embedded_draft (edit URL - embedded template creation)
/embedded/edit_url/ (edit URL - embedded template editing)
/unclaimed_draft/create_embedded (claim URL embedded requesting)
/unclaimed_draft/create_embedded_with_template (claim URL embedded requesting)
If you'd like to collect signatures on a document in an iframe on your site, embedded signing will likely be what you're looking for. In that case, you'd want to create your signature request with a request to:
/signature_request/create_embedded
and then use the URL you're creating via Amplify Storage as the value of the file_url[] request parameter.
Once you've created your request, locate the signature ID for your signer in the response object, and use that in a request to /embedded/sign_url/
This will return a sign URL that you can load using hellosign.open().

Control cloudflare origin server using workers

I'm trying to use a cloudflare worker to dynamically set the origin based on the requesting IP (so we can serve a testing version of the website internally)
I have this
addEventListener('fetch', event => {
event.respondWith(handleRequest(event.request))
})
async function handleRequest(request) {
if (request.headers.get("cf-connecting-ip") == '185.X.X.X')
{
console.log('internal request change origin');
}
const response = await fetch(request)
console.log('Got response', response)
return response
}
I'm not sure what to set. The request object doesn't seem to have any suitable parameters to change.
Thanks
Normally, you should change the request's URL, like this:
// Parse the URL.
let url = new URL(request.url)
// Change the hostname.
url.hostname = "test-server.example.com"
// Construct a new request with the new URL
// and all other properties the same.
request = new Request(url, request)
Note that this will affect the Host header seen by the origin (it'll be test-server.example.com). Sometimes people want the Host header to remain the same. Cloudflare offers a non-standard extension to accomplish that:
// Tell Cloudflare to connect to `test-server.example.com`
// instead of the hostname specified in the URL.
request = new Request(request,
{cf: {resolveOverride: "test-server.example.com"}})
Note that for this to be allowed, test-server.example.com must be a hostname within your domain. However, you can of course configure that host to be a CNAME.
The resolveOverride feature is documented here: https://developers.cloudflare.com/workers/reference/apis/request/#the-cf-object
(The docs claim it is an "Enterprise only" feature, but this seems to be an error in the docs. Anyone can use this feature. I've filed a ticket to fix that...)

Correct code to upload local file to S3 proxy of API Gateway

I created an API function to work with S3. I imported the template swagger. After deployment, I tested with a Node.js project by the npm module aws-api-gateway-client.
It works well with: get bucket lists, get bucket info, get one item, put a bucket, put a plain text object, however I am blocked with put a binary file.
firstly, I ensure ACL is allowed with all permissions on S3. secondly, binary support also added
image/gif
application/octet-stream
The code snippet is as below. The behaviors are:
1) after invokeAPI, the callback function is never hit, after sometime, the Node.js project did not respond. no any error message. The file size (such as an image) is very small.
2) with only two times, the uploading seemed to work, but the result file size is bigger (around 2M bigger) than the original file, so the file is corrupt.
Could you help me out? Thank you!
var filepathname = './items/';
var filename = 'image1.png';
fs.stat(filepathname+filename, function (err, stats) {
var fileSize = stats.size ;
fs.readFile(filepathname+filename,'binary',function(err,data){
var len = data.length;
console.log('file len' + len);
var pathTemplate = '/my-test-bucket/' +filename ;
var method = 'PUT';
var params = {
folder: '',
item:''
};
var additionalParams = {
headers: {
'Content-Type': 'application/octet-stream',
//'Content-Type': 'image/gif',
'Content-Length': len
}
};
var result1 = apigClient.invokeApi(params,pathTemplate,method,additionalParams,data)
.then(function(result){
//never hit :(
console.log(result);
}).catch( function(result){
//never hit :(
console.log(result);
});;
});
});
We encountered the same problem. API Gateway is meant for limited data (10MB as of now), limits shown here,
http://docs.aws.amazon.com/apigateway/latest/developerguide/limits.html
Self Signed URL to S3:
Create an S3 self signed URL for POST from the lambda or the endpoint where you are trying to post.
How do I put object to amazon s3 using presigned url?
Now POST the image directly to S3.
Presigned POST:
Apart from posting the image if you want to post additional properties, you can post it in multi-form format as well.
http://docs.aws.amazon.com/AWSJavaScriptSDK/latest/AWS/S3.html#createPresignedPost-property
If you want to process the file after delivering to S3, you can create a trigger from S3 upon creation and process with your Lambda or anypoint that need to process.
Hope it helps.

Force reload cached image with same url after dynamic DOM change

I'm developping an angular2 application (single page application). My page is never "reloaded", but it's content changes according to user interactions.
I'm having some cache problems especially with images.
Context :
My page contains an editable image list :
<ul>
<li><img src="myImageController/1">Edit</li>
<li><img src="myImageController/2">Edit</li>
<li><img src="myImageController/3">Edit</li>
</ul>
When i want to edit an image (Edit link), my dom content is completly changed to show another angular component with a fileupload component.
The myImageController returns the LastModified header, and cache-control : no-cache and must-revalidate.
After a refresh (hit F5), my page does a request to get all img src, which is correct : if image has been modified, it is downloaded, if not, i just get a 304 which is fine.
Note : my images are stored in database as blob fields.
Problem :
When my page content is dynamically reloaded with my single page app, containing img tags, the browser do not call a GET http request, but immediatly take image from cache. I assume this a browser optimization to avoid getting the same resource on the same page multiple times.
Wrong solutions :
The first solution is to add something like ?time=(new Date()).getTime() to generate unique urls and avoid browser cache. This won't send the If-Modified-Since header in the request, and i will download my image every time completly.
Do a "real" refresh : the first page load in angular apps is quite slow, and i don't to refresh all.
Tests
To simplify the problem, i trying to create a static html page containing 3 images with the exact same link to my controller : /myImageController/1. With the chrome developper tool, i can see that only one get request is called. If i manage to get mulitple server calls in this case, it would probably solve my problem.
Thank you for your help.
5th version of HTML specification describes this behavior. Browser may reuse images regardless of cache related HTTP headers. Check this answer for more information. You probably need to use XMLHttpRequest and blobs. In this case you also need to consider Same-origin policy.
You can use following function to make sure user agent performs every request:
var downloadImage = function ( imgNode, url ) {
var xhr = new XMLHttpRequest();
xhr.open("GET", url, true);
xhr.responseType = "blob";
xhr.onreadystatechange = function () {
if (xhr.readyState == 4) {
if (xhr.status == 200 || xhr.status == 304) {
var blobUrl = URL.createObjectURL(xhr.response);
imgNode.src = blobUrl;
// You can also use imgNode.onload callback to release blob resources.
setTimeout(function () {
URL.revokeObjectURL(blobUrl);
}, 1000);
}
}
};
xhr.send();
};
For more information check New Tricks in XMLHttpRequest2 article by Eric Bidelman, Working with files in JavaScript, Part 4: Object URLs article by Nicholas C. Zakas and URL.createObjectURL() MDN page and Same-origin policy MDN page.
You can use the random ID trick. This changes the URL so that the browser reloads the image. Not that this can be done in the query parameters to force a full cache break or in the hash to allow the browser to re-validate the image from the cache (and avoid re-downloading it if unchanged).
function reloadWithCache(img: HTMLImageElement, url: string) {
img.src = url.replace(/#.*/, "") + "#" + Math.random();
}
function reloadBypassCache(img: HTMLImageElement, url: string) {
let sep = img.indexOf("?") == -1? "?" : "&";
img.src = url + sep + "nocache=" + Math.random()
}
Note that if you are using reloadBypassCache regularly you are better off fixing your cache headers. This function will always hit your origin server leading to higher running costs and making CDNs ineffective.

XmlHttpRequest origin header is null in webkit

I am trying to integrate 2 webapps to let one (new one) send a url to another (old one) that it will load into a text area. The URL is actually to a file in S3 (with the multiple redirects that come with that) and I am trying to do this in the receiver using an XMLHTTPRequest like so;
var client = new XMLHttpRequest();
client.withCredentials = true;
client.open('GET', geneIdFileUrl, true );
client.setRequestHeader("Content-type", "text/plain");
client.onreadystatechange = function() {
document.getElementById('geneList').value = client.responseText;
}
client.send();
This is fine in firefox but in webkit browsers (chrome, safari) the request is being sent out with the request header
Origin null
instead of the real origin of the page making the request. The get from S3 is coming as a ContentDisposition attachment so I can't just dump it into an iframe and read the contents from there.
Why do the webkit browsers not set the origin properly and is it possible to coecre them into doing so?