Google Cloud Storage support of S3 multipart upload

Currently, I'm using GCS in "interoperability mode" to make it accept S3 API requests. Using the official multipart upload example (with the appropriate endpoint set), the initial POST request that starts the upload:
POST /bucket/object?uploads HTTP/1.1
Host: storage.googleapis.com
Authorization: AWS KEY:SIGNATURE
Date: Wed, 07 Jan 2015 13:34:04 GMT
User-Agent: aws-sdk-java/1.7.5 Linux/3.13.0-43-generic Java_HotSpot(TM)_64-Bit_Server_VM/24.72-b04/1.7.0_72
Content-Type: application/x-www-form-urlencoded; charset=utf-8
Transfer-Encoding: chunked
Connection: Keep-Alive
results in this response:
HTTP/1.1 400 Bad Request
Content-Length: 55
Date: Wed, 07 Jan 2015 13:34:05 GMT
Server: UploadServer ("Built on Dec 19 2014 ...")
Content-Type: text/html; charset=UTF-8
Alternate-Protocol: 443:quic,p=0.02
The request's content type is not accepted on this URL.
Could that be an AWS client issue, or does GCS not support S3's multipart upload yet?
Most of the other actions I have tried (download object, list bucket objects, etc.) seem to work fine.

Google Cloud Storage (GCS) now supports the S3-style multipart upload API. As such, use cases like the one in this question should just work.

Update: As of May 2021, Google Cloud Storage (GCS) supports S3-compatible multipart uploads.
https://cloud.google.com/storage/docs/multipart-uploads
The AWS SDK will work seamlessly once you configure the appropriate endpoint.
GCS doesn't support the S3 multipart upload interface.
If you want to perform a chunk-parallel upload, you can use object composition; see https://cloud.google.com/storage/docs/composite-objects
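To make the composition route a bit more concrete, here is a minimal sketch that only plans the byte ranges parallel workers would upload as separate temporary objects before a final compose call. It assumes the limit of 32 source objects per compose request; `plan_chunks` and `MAX_COMPOSE_SOURCES` are hypothetical names, and the actual upload and compose calls are deliberately left out.

```python
from typing import List, Tuple

# Assumption: a single GCS compose request accepts at most 32 source objects.
MAX_COMPOSE_SOURCES = 32


def plan_chunks(total_size: int, chunk_size: int) -> List[Tuple[int, int]]:
    """Return (offset, length) pairs covering total_size bytes.

    The chunk size is increased when needed so that the number of chunks
    never exceeds MAX_COMPOSE_SOURCES and one compose call is enough.
    """
    if total_size <= 0 or chunk_size <= 0:
        raise ValueError("total_size and chunk_size must be positive")
    # Smallest chunk size that still fits into MAX_COMPOSE_SOURCES pieces.
    min_chunk = -(-total_size // MAX_COMPOSE_SOURCES)  # ceiling division
    chunk_size = max(chunk_size, min_chunk)
    chunks = []
    offset = 0
    while offset < total_size:
        length = min(chunk_size, total_size - offset)
        chunks.append((offset, length))
        offset += length
    return chunks
```

Each (offset, length) pair would be uploaded as its own object by one worker, and the resulting objects composed in offset order.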

Related

How to serve a wbn (WebPackage/WebBundle) file from a web server?

Does anyone know how to serve a web bundle so that it loads, rather than just downloading as a file?
Some disambiguation: There is a format called WebPackage (not to be confused with webpack), also called a Web Bundle. Files typically have the .wbn suffix. A bundle contains HTML and JS files and can be used to view websites offline. This is useful for, e.g., archiving websites or making websites that work well with intermittent network access. Download the file once, and you have all the assets you need for at least basic operation of the site.
The standard on how to serve a .wbn file is here:
https://wicg.github.io/webpackage/draft-yasskin-wpack-bundled-exchanges.html
However, when I add the required headers in the web server, the .wbn file is just downloaded. If I drag the downloaded file onto my browser (Google Chrome), the file is displayed as the website it contains, so unless there is some very subtle bug in there, I believe that the format of the bundle is OK.
Here is a sample request:
Request URL: http://localhost/bundle/www-signed.wbn
Request Method: GET
Status Code: 200 OK
Remote Address: [::1]:80
Referrer Policy: strict-origin-when-cross-origin
and the server response:
Accept-Ranges: bytes
Connection: keep-alive
Content-Length: 4300
Content-Type: application/webbundle <-- Required by the standard
Date: Thu, 02 Sep 2021 12:00:24 GMT
ETag: "612ef7cb-10cc"
Last-Modified: Wed, 01 Sep 2021 03:47:23 GMT
Server: nginx/1.18.0 (Ubuntu)
X-Content-Type-Options: nosniff <-- required by the standard
If anyone has this working on a website or knows how to do it, I would love to have a look.
I had the same problem: the .wbn file was just downloaded instead of being loaded.
I had to enable the Web Bundles feature in Chrome, even though my Chrome version is 96+.
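For anyone who wants to reproduce the server side of the question locally, here is a minimal sketch using only the Python standard library. It serves .wbn files with the two response headers shown in the question. `WebBundleHandler` and `make_server` are hypothetical names, and this does not address whether the browser has the feature enabled.

```python
import functools
import http.server


class WebBundleHandler(http.server.SimpleHTTPRequestHandler):
    """Static file handler that serves .wbn files with the headers from the question."""

    # Map the .wbn suffix to the media type named in the question.
    extensions_map = {
        **http.server.SimpleHTTPRequestHandler.extensions_map,
        ".wbn": "application/webbundle",
    }

    def end_headers(self):
        # Sent with every response so the browser does not sniff the type.
        self.send_header("X-Content-Type-Options", "nosniff")
        super().end_headers()


def make_server(directory, port=0):
    """Create (but do not start) a server for `directory`; port 0 picks a free port."""
    handler = functools.partial(WebBundleHandler, directory=directory)
    return http.server.ThreadingHTTPServer(("127.0.0.1", port), handler)
```

Calling `make_server("bundle", 8000).serve_forever()` would serve the files in a local `bundle` directory on port 8000; both the directory name and the port are placeholders.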

cache-control key set in s3 and present in response header, but images are not caching

I set the Cache-Control key on the bucket to this value: public, max-age=86400. This shows up in the response headers, but the images are not being cached; they come back with status 200. I'm using Active Storage, S3, and CloudFront.
/// response header below ///
Accept-Ranges: bytes
Cache-Control: public, max-age=86400
Content-Disposition: inline; filename="xhbtr_ba6c7e14-cab0-44c2-8701-e48aa75ab3f7_w1200.jpg"; filename*=UTF-8''xhbtr_ba6c7e14-cab0-44c2-8701-e48aa75ab3f7_w1200.jpg
Content-Length: 1444501
Content-Type: image/jpeg
Date: Mon, 14 Sep 2020 15:15:21 GMT
ETag: "18c45e75803b2a82c0a85dc4dde7bba4"
Last-Modified: Mon, 14 Sep 2020 14:00:44 GMT
Server: AmazonS3
x-amz-id-2: /6ynYC2FE4QwOaVJe1uJOBeCsJhfKFzNbMu+X7r0L5pRyax2JoxzJ5qoyO0Sb7dr09yFxZYj5/iE=
x-amz-request-id: 415DC4198F2ED5F1
Server: AmazonS3
You are querying S3. S3 has no cache. (The Cache-Control directive is meant for the client, not the server.)
What you probably want is to use CloudFront as an edge cache. CloudFront loosely adheres to the Cache-Control directive for edge caching (depending on how you configure it). Since you mention CloudFront in your question, I assume you have already set it up. So instead of calling the S3 URL, use the URL you’ve set up in the CloudFront configuration.
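To illustrate what the directive asks of a client or edge cache, here is a deliberately simplified sketch of a freshness check based on max-age. `max_age_seconds` and `is_fresh` are hypothetical helpers; real caches also consider no-cache, no-store, Expires and other rules that are ignored here.

```python
import re
from typing import Mapping, Optional


def max_age_seconds(cache_control: str) -> Optional[int]:
    """Extract max-age from a Cache-Control value, or None if it is absent."""
    match = re.search(r"(?:^|,)\s*max-age\s*=\s*(\d+)", cache_control, re.IGNORECASE)
    return int(match.group(1)) if match else None


def is_fresh(headers: Mapping[str, str], seconds_since_stored: int) -> bool:
    """Simplified freshness check: the age so far must be below max-age.

    Ignores no-cache, no-store, Expires and heuristic freshness on purpose.
    """
    max_age = max_age_seconds(headers.get("Cache-Control", ""))
    if max_age is None:
        return False
    # Age is added by intermediaries such as a CDN; treat it as 0 if missing.
    age_at_receipt = int(headers.get("Age", "0"))
    return age_at_receipt + seconds_since_stored < max_age
```

With the header from the question (public, max-age=86400), a stored copy would count as fresh for one day under this simplified rule.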

AWS S3 + CloudFront: Fonts not loading (CORS Problem)

I added some custom fonts to my website and uploaded them to AWS S3 + CloudFront.
A lot of topics here describe this problem, but none of them solves my issue.
Using curl I get this output:
curl --head https://cdn.mzguru.de/fonts/sourcesanspro/source-sans-pro-v12-latin-ext_latin-700.woff2
HTTP/1.1 200 OK
Content-Type: binary/octet-stream
Content-Length: 25348
Connection: keep-alive
Date: Tue, 22 Oct 2019 11:54:18 GMT
Access-Control-Allow-Origin: *
Access-Control-Allow-Methods: GET
Access-Control-Max-Age: 3000
Last-Modified: Fri, 12 Apr 2019 10:54:26 GMT
ETag: "639c2738552a0376c91e7d485e476fda"
Cache-Control: max-age=62208000
Accept-Ranges: bytes
Server: AmazonS3
X-Cache: Hit from cloudfront
Via: 1.1 bae3e24625567f5728a5caa96d6b7669.cloudfront.net (CloudFront)
X-Amz-Cf-Pop: FRA53
X-Amz-Cf-Id: iAy-QTfuV9ZqwmaRjXE0ramVSgsZkA6MtRmQOKDSonf6I8OabrpLZA==
Age: 12818
Within Chrome I get this error:
Access to font at 'https://cdn.mzguru.de/fonts/sourcesanspro/source-sans-pro-v12-latin-ext_latin-700.woff2' from origin 'https://www.monteurzimmerguru.de' has been blocked by CORS policy: No 'Access-Control-Allow-Origin' header is present on the requested resource.
This is the point where I do not understand the problem. The error message says: "No 'Access-Control-Allow-Origin' header is present"
But in the curl request I see this header. What is wrong?
Thank you
EDIT
I have attached a screenshot with the error messages.
EDIT 2: AWS Interface changed (2022)
Please take a look at @James Dean's post.
1.) Do I need to tick the OPTIONS box?
2.) I cannot find the settings you describe. I guess the UI has changed in the meantime.
Your S3 CORS configuration is correct, based on the output below:
>curl -vk "https://cdn.mzguru.de/fonts/sourcesanspro/source-sans-pro-v12-latin-ext_latin-700.woff2" -H "Origin: https://www.monteurzimmerguru.de"
< HTTP/2 200
< content-type: binary/octet-stream
< content-length: 25348
< date: Thu, 24 Oct 2019 12:19:41 GMT
< access-control-allow-origin: *
< access-control-allow-methods: HEAD, GET
< access-control-max-age: 3000
< last-modified: Fri, 12 Apr 2019 10:54:26 GMT
< etag: "639c2738552a0376c91e7d485e476fda"
< cache-control: max-age=62208000
< accept-ranges: bytes
< server: AmazonS3
< x-cache: Hit from cloudfront
However, the browser sends an OPTIONS (preflight) request to CloudFront, and OPTIONS requests are currently not allowed on your distribution. Only HEAD and GET are allowed:
curl -X OPTIONS "https://cdn.mzguru.de/fonts/sourcesanspro/source-sans-pro-v12-latin-ext_latin-700.woff2" -H "Origin: https://www.monteurzimmerguru.de"
>This distribution is not configured to allow the HTTP request method that was used for this request
To fix this, do the following:
In the CloudFront cache behavior, allow GET, HEAD and OPTIONS.
In the cache behavior, under caching based on selected headers, select Origin.
Invalidate the cache once and test again.
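Why caching on the Origin header matters can be shown with a toy model. The sketch below is not CloudFront or S3 code; `origin_server` and `ToyEdgeCache` are hypothetical stand-ins. It assumes the origin only adds CORS headers when the request carries an Origin header, so a cache that ignores Origin in its key may hand a browser a copy that was stored without them.

```python
from typing import Dict, Optional, Tuple


def origin_server(origin: Optional[str]) -> Dict[str, str]:
    """Toy stand-in for S3: CORS headers are only added when Origin is sent."""
    headers = {"Content-Type": "binary/octet-stream"}
    if origin is not None:
        headers["Access-Control-Allow-Origin"] = "*"
    return headers


class ToyEdgeCache:
    """Toy edge cache; `vary_on_origin` mimics caching based on the Origin header."""

    def __init__(self, vary_on_origin: bool) -> None:
        self.vary_on_origin = vary_on_origin
        self.store: Dict[Tuple[str, Optional[str]], Dict[str, str]] = {}

    def get(self, url: str, origin: Optional[str] = None) -> Dict[str, str]:
        # Without the Origin header in the key, all callers share one entry.
        key = (url, origin if self.vary_on_origin else None)
        if key not in self.store:
            self.store[key] = origin_server(origin)
        return self.store[key]
```

In this model, whichever request reaches the cache first decides what everyone else sees unless Origin is part of the cache key, which is also why an invalidation is needed after changing the setting.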
You have to set
Query String Forwarding and Caching to "Forward all, cache based on all"
in the CloudFront Cache Behavior Settings (CloudFront -> select one -> edit).

Can browser caching be controlled by HTTP headers alone w/o using hash names for asset files?

I read this in the Webpack docs:
The way it works has a pitfall: if we don’t change filenames of our resources when deploying a new version, browser might think it hasn’t been updated and client will get a cached version of it.
I'm curious: is it mandatory to use this mechanism, with ugly file names such as main.55e783391098c2496a8f.js for assets, in order to inform the browser that an asset file has changed?
Can it be controlled by HTTP headers only? There are multiple HTTP headers in the standard to control how browser caches assets, like:
Cache-Control: no-cache, no-store, must-revalidate
Pragma: no-cache
Expires: 0
Date: Wed, 24 Aug 2020 18:32:02 GMT
Last-Modified: Tue, 15 Nov 2024 12:45:26 GMT
ETag: x234dff
max-age: 12345
So can I use those headers alone? Or do I still have to bother with hash parts in file names such as main.55e783391098c2496a8f.js?
When a user agent opens a page, it must always get the correct version of the source code. You have two options to achieve this:
Set the Cache-Control, Expires and strong validator (ETag) response headers. This way you instruct the user agent to perform a relatively lightweight conditional request on each page load.
Embed a version in the source code file URL and set the Cache-Control and Expires response headers. This way you instruct the user agent to cache source code with a particular version forever.
For more information, see the HTTP Caching article by Ilya Grigorik, the MDN page on HTTP conditional requests, and this Stack Overflow answer about resource revalidation.
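The two options can be sketched side by side. This is only an illustration under assumptions: `etag_for`, `conditional_get` and `versioned_name` are hypothetical helpers, and SHA-256 truncated to 20 hex characters is an arbitrary choice, not what Webpack or any particular server does.

```python
import hashlib
from typing import Optional, Tuple


def etag_for(content: bytes) -> str:
    """Strong validator derived from the bytes of the asset."""
    return '"' + hashlib.sha256(content).hexdigest()[:20] + '"'


def conditional_get(content: bytes, if_none_match: Optional[str]) -> Tuple[int, bytes]:
    """Option 1: revalidate on each load; 304 with an empty body if unchanged."""
    if if_none_match == etag_for(content):
        return 304, b""
    return 200, content


def versioned_name(stem: str, suffix: str, content: bytes) -> str:
    """Option 2: put a content hash in the file name, e.g. main.<hash>.js."""
    digest = hashlib.sha256(content).hexdigest()[:20]
    return f"{stem}.{digest}{suffix}"
```

With option 1 every page load still costs a round trip, even if the answer is a short 304. With option 2 a changed file gets a new URL, so the old URL can be cached for as long as you like.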

GET Bucket op response + AWS S3 + Content-Length header

I just wanted to know whether the GET Bucket operation's response ever omits the Content-Length header. I tested this and saw that there was no Content-Length header in the response for the GET Bucket operation.
How does an application reading the response know where the body ends if the response doesn't contain a Content-Length header?
Request-Response Snippet:
GET /?max-keys=1000&prefix&delimiter=%2F HTTP/1.1
Date: Sat, 09 Apr 2016 18:27:23 GMT
x-amz-request-payer: requester
Authorization: AWS AKIAIP3KAUILC4GG7A2A:UG3bGvIjayrxrkxEX1mfrvETy/M=
Connection: Keep-Alive
User-Agent: Cyberduck/4.9.19632 (Mac OS X/10.10.5) (x86_64)
HTTP/1.1 200 OK
x-amz-id-2: yg76HSq5j0mi0oR6dXF8ZfGq722kHBWiMQmNvXPqiLxr1S4nGj5GVn1RVrPQrOUfNynxxaMSYEY=
x-amz-request-id: B4468E68E10B6AEF
Date: Sat, 09 Apr 2016 18:27:25 GMT
x-amz-bucket-region: us-east-1
Content-Type: application/xml
Server: AmazonS3
Connection: close
<?xml version="1.0" encoding="UTF-8"?>
<ListBucketResult xmlns="http://s3.amazonaws.com/doc/2006-03-01/">......</ListBucketResult>
Thanks!
The Content-Length header is optional in a response. Even when it is present, it may not match the size of the content you end up with: for a gzipped response it gives the compressed size. So to answer the question: when no Content-Length is received (and the response is not chunked), the client keeps reading until the server closes the connection, which the Connection: close header in your response announces.
In Java, keep calling InputStream.read() until it returns -1.
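The same read-until-close loop, as a minimal Python sketch on a raw socket. `read_until_close` is a hypothetical helper, and it only covers the case where neither Content-Length nor chunked encoding delimits the body.

```python
import socket


def read_until_close(sock: socket.socket, bufsize: int = 4096) -> bytes:
    """Read from sock until the peer closes the connection (recv returns b'')."""
    chunks = []
    while True:
        data = sock.recv(bufsize)
        if not data:  # empty bytes means end of stream
            break
        chunks.append(data)
    return b"".join(chunks)
```

In practice an HTTP client library does this for you; the loop just shows what "read until the server closes the connection" means at the socket level.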
Is the Content-Length header required for a HTTP/1.0 response?