We have a number of Google Cloud Storage Transfer jobs that sync from AWS S3 buckets to Google Cloud Storage buckets. I am assuming that they use HTTPS to transfer the data, but where can I get confirmation that they do? Where can I get information about the minimum TLS version used in these transfer jobs?
Regarding Cloud Storage TLS, in this document you can find the TLS information for gsutil commands, whose requests are made via the JSON API. These requests are made over HTTPS only, and the same API is used by the Cloud Console too.
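The Storage Transfer Service connections themselves are managed by Google, so you can't inspect them directly, but if you also want to confirm what TLS version your own clients negotiate with the Cloud Storage endpoint, a quick probe like this works (it simply connects to the public JSON API host and prints the negotiated protocol):

```java
import javax.net.ssl.SSLSocket;
import javax.net.ssl.SSLSocketFactory;

public class TlsCheck {
    public static void main(String[] args) throws Exception {
        // Open a TLS connection to the public Cloud Storage endpoint and
        // print the protocol version that gets negotiated.
        SSLSocketFactory factory = (SSLSocketFactory) SSLSocketFactory.getDefault();
        try (SSLSocket socket = (SSLSocket) factory.createSocket("storage.googleapis.com", 443)) {
            socket.startHandshake();
            System.out.println("Negotiated protocol: " + socket.getSession().getProtocol());
        }
    }
}
```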
I am working on designing a system to upload files to the server. The request to upload a file must go through the API gateway. The request will be a REST API POST request and the request body is form-data of file type (i.e., the location of the file to upload). The upload of a single file should be replicated on a quorum of file servers. For example, if I have 3 file servers, the client should be acknowledged of a successful upload only after the file has been written to at least 2 file servers. The actual file upload (data transfer) should happen directly between the client and the file servers and not through the API gateway (or any proxy server in the path).
My solution - the API gateway returns the list of file server URLs to write to, and the client library orchestrates the uploads and makes sure the upload happens on a quorum of file servers. But this creates a thick client that contains all the orchestration logic and is hard to maintain across different languages.
Is there any better way to solve this? How is this done in production systems? For example, AWS S3/Azure Blob Storage or any other production-grade system must be sending the request to an API gateway (or proxy) first; how are they handling this?
It looks like you're trying to build a serverless solution, which I'm not an expert on.
One way I can think of is to use an S3 bucket as a proxy storage server (I know you said no proxy server, but you did mention S3 🤷🏼♂️). You can then set up a Lambda function to act on S3 upload completion. That Lambda function will then be responsible for uploading the S3 object to whichever file servers will be hosting the uploaded file.
At least this way, the client only needs to concern itself with uploading the file once. If the client needs to check that at least 2 file servers have the file, it can poll with HEAD requests, since it would have the endpoint URLs from the initial request.
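A minimal sketch of that polling step, assuming the gateway handed back plain HTTPS URLs for the object on each file server (all URLs below are hypothetical):

```java
import java.net.HttpURLConnection;
import java.net.URL;
import java.util.List;

public class QuorumCheck {
    // Returns true once at least `quorum` of the given file-server URLs
    // respond 200 to a HEAD request for the uploaded object.
    static boolean hasQuorum(List<String> objectUrls, int quorum) {
        int found = 0;
        for (String u : objectUrls) {
            try {
                HttpURLConnection conn = (HttpURLConnection) new URL(u).openConnection();
                conn.setRequestMethod("HEAD");
                conn.setConnectTimeout(2000);
                conn.setReadTimeout(2000);
                if (conn.getResponseCode() == 200) {
                    found++;
                }
                conn.disconnect();
            } catch (Exception e) {
                // Treat an unreachable server as "not yet replicated".
            }
        }
        return found >= quorum;
    }

    public static void main(String[] args) {
        // Hypothetical endpoints returned by the API gateway with the initial request.
        List<String> urls = List.of(
                "https://fileserver-1.example.com/files/report.pdf",
                "https://fileserver-2.example.com/files/report.pdf",
                "https://fileserver-3.example.com/files/report.pdf");
        System.out.println("Quorum reached: " + hasQuorum(urls, 2));
    }
}
```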
I'm not sure if this is a workable solution for you. If not, hopefully someone with more experience in serverless architecture can give you a better answer.
Trying to send records to Amazon S3 with Flink; however, these records need to be sent with an AES256 SSE header to request server-side encryption.
See the AWS documentation:
If you need server-side encryption for all of the objects that are stored in a bucket, use a bucket policy. For example, a bucket policy can deny permission to upload an object unless the request includes the x-amz-server-side-encryption header to request server-side encryption.
Is this something that can be set for specific file sinks? I have not found any documentation on the matter and am beginning to think a forwarding Lambda will be needed to transform the data.
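For comparison, outside Flink the header is easy to add; with the plain AWS SDK for Java it is just object metadata (bucket, key and file below are hypothetical), so what I'm missing is only how to do the equivalent from a Flink file sink:

```java
import com.amazonaws.services.s3.AmazonS3;
import com.amazonaws.services.s3.AmazonS3ClientBuilder;
import com.amazonaws.services.s3.model.ObjectMetadata;
import com.amazonaws.services.s3.model.PutObjectRequest;
import java.io.File;

public class SsePutExample {
    public static void main(String[] args) {
        AmazonS3 s3 = AmazonS3ClientBuilder.defaultClient();

        // Setting the SSE algorithm on the object metadata makes the SDK send
        // the x-amz-server-side-encryption: AES256 header, so the upload
        // passes a bucket policy that denies unencrypted puts.
        ObjectMetadata metadata = new ObjectMetadata();
        metadata.setSSEAlgorithm(ObjectMetadata.AES_256_SERVER_SIDE_ENCRYPTION);

        s3.putObject(new PutObjectRequest(
                "my-encrypted-bucket",             // hypothetical bucket name
                "records/part-0001.json",          // hypothetical key
                new File("/tmp/part-0001.json"))   // hypothetical local file
                .withMetadata(metadata));
    }
}
```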
When a client app is on-prem and AWS is set up with Direct Connect to the corporate on-prem network, how exactly can the client app gain access to S3 objects?
For example, suppose a client app simply wants to obtain jpg images which live in an S3 bucket.
What type of configuration do I need to make to the S3 bucket permissions?
What configuration do I need to do at the VPC level?
I'd imagine that since Direct Connect is set up, this would greatly simplify an on-prem app gaining access to an S3 bucket. Correct?
Would VPC endpoints come into play here?
Also, one constraint here: the client app is not within my control; it simply needs a URL it can reach for the image. It cannot easily be changed to support sending credentials in the request, unfortunately. This may be a very important constraint worth mentioning.
Any insight is appreciated. Thank you so much.
You might want to consider these:
https://aws.amazon.com/blogs/aws/new-vpc-endpoint-for-amazon-s3/
https://aws.amazon.com/premiumsupport/knowledge-center/s3-private-connection-no-authentication/
And for troubleshooting, try this
https://aws.amazon.com/premiumsupport/knowledge-center/connect-s3-vpc-endpoint/
If you need to access S3 over Direct Connect:
S3-DirectConnect
//BR
P.S. Let me know if that works for you. :)
I had a very similar issue to solve, also searching, like you, for how to force the client to use Direct Connect to download content from S3.
In my case, the client is an internet-facing on-prem load balancer that needed to serve content hosted on S3 (CloudFront was not possible).
The two articles already mentioned are important to take into account but not sufficient:
Direct connect for virtual private interface
https://aws.amazon.com/premiumsupport/knowledge-center/s3-bucket-access-direct-connect/
=> Needed to set up all the VPC endpoints and routing between on-prem and AWS.
https://docs.aws.amazon.com/AmazonS3/latest/userguide/privatelink-interface-endpoints.html#accessing-bucket-and-aps-from-interface-endpoints
=> Partially explains how to access a bucket using VPC endpoints.
The information missing from the latter AWS page is the URL structure you need to use to connect to your S3 endpoint; here is the structure I found to work:
https://bucket.[vpc-endpoint-id].s3.[region].vpce.amazonaws.com/[bucket-name]/[key]
With that scheme, you can address any object in an S3 bucket through the S3 VPC endpoint using a normal web request.
We use that concept to securely serve files hosted in an S3 bucket via our on-prem load balancer and a specific domain name, using our Direct Connect capacity.
The LB just rewrites the URL and gets the files directly from the S3 bucket. The real client doesn't even know that the file is actually served from S3 on the backend.
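For illustration, a plain HTTPS GET through that URL scheme looks roughly like the sketch below; the endpoint ID, region, bucket and key are hypothetical placeholders, and the bucket policy still has to allow the request (for example, unauthenticated reads restricted to the VPC endpoint):

```java
import java.io.InputStream;
import java.net.HttpURLConnection;
import java.net.URL;
import java.nio.file.Files;
import java.nio.file.Paths;
import java.nio.file.StandardCopyOption;

public class VpceS3Fetch {
    public static void main(String[] args) throws Exception {
        // Hypothetical placeholders: substitute your own VPC endpoint ID,
        // region, bucket name and object key.
        String vpceId = "vpce-0123456789abcdef0";
        String region = "eu-west-1";
        String bucket = "my-image-bucket";
        String key = "images/photo.jpg";

        // URL structure from the answer above:
        // https://bucket.[vpc-endpoint-id].s3.[region].vpce.amazonaws.com/[bucket-name]/[key]
        String url = String.format("https://bucket.%s.s3.%s.vpce.amazonaws.com/%s/%s",
                vpceId, region, bucket, key);

        HttpURLConnection conn = (HttpURLConnection) new URL(url).openConnection();
        if (conn.getResponseCode() == 200) {
            try (InputStream in = conn.getInputStream()) {
                Files.copy(in, Paths.get("photo.jpg"), StandardCopyOption.REPLACE_EXISTING);
            }
        } else {
            System.out.println("Request failed with HTTP " + conn.getResponseCode());
        }
    }
}
```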
If I'm using the AmazonS3Client to put and fetch files, is my connection encrypted? This seems basic, but my googling returns things about encrypting the S3 storage at rest rather than whether the transmission from this client is secure. If it's not secure, is there a setting to make it secure?
Amazon S3 endpoints support both HTTP and HTTPS. It is recommended that you communicate via HTTPS to ensure your data is encrypted in transit.
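On the client-setting part of the question: the AWS SDK for Java defaults to HTTPS, but you can pin the protocol explicitly so the choice is visible in code. A minimal sketch (region, bucket and key are hypothetical):

```java
import com.amazonaws.ClientConfiguration;
import com.amazonaws.Protocol;
import com.amazonaws.services.s3.AmazonS3;
import com.amazonaws.services.s3.AmazonS3ClientBuilder;

public class HttpsS3Client {
    public static void main(String[] args) {
        // Explicitly require HTTPS for every request made by this client.
        ClientConfiguration clientConfig = new ClientConfiguration()
                .withProtocol(Protocol.HTTPS);

        AmazonS3 s3 = AmazonS3ClientBuilder.standard()
                .withRegion("us-east-1")                  // hypothetical region
                .withClientConfiguration(clientConfig)
                .build();

        s3.putObject("my-bucket", "hello.txt", "hello");  // hypothetical bucket/key
    }
}
```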
You can also create a Bucket Policy that enforces communication via HTTPS. See:
Stackoverflow: Force SSL on Amazon S3
Sample policy: s3BucketPolicyEncryptionSSL.json
Is there any way to serve a static website (an SPA, actually) located on Google Cloud Storage via SSL, so users see that nice HTTPS address and lock icon?
Amazon allows this via CloudFront SNI.
Yes!
Using GCS directly via CNAME redirects only allows HTTP traffic.
To use HTTPS with your own domain, you'll need to set up Google Cloud Load Balancer, and optionally you'll want to set up Google Cloud CDN as well. While it adds a bit of complexity, Google Cloud Load Balancer allows you to fill a domain with all sorts of content. Some resources could be served by a GCS bucket, but you could also have servers in GCE serving dynamic content for other paths.
There are instructions for setting this up here: https://cloud.google.com/compute/docs/load-balancing/http/using-http-lb-with-cloud-storage.
An alternative would be to host your domain's DNS at Cloudflare. They provide a free HTTPS-to-HTTP proxying service.
More Info:
https://www.cloudflare.com/ssl/
Adding HTTPS For Free With CloudFlare
As of April 2019: https://cloud.google.com/storage/docs/troubleshooting#https
HTTPS serving
Issue: I want my content served through HTTPS.
Solution: While you can serve your content through HTTPS using direct URIs such as https://storage.googleapis.com/my-bucket/my-object, when hosting a static website using a CNAME redirect, Cloud Storage only supports HTTP. To serve your content through a custom domain over SSL, set up a load balancer, use a third-party Content Delivery Network with Cloud Storage, or serve your static website content from Firebase Hosting instead of Cloud Storage.
Pretty shocking in this day and age, with Let's Encrypt everywhere, that they have not figured out how to do this.
An alternative would be to host your SPA on Firebase. All apps have SSL included by default, even those with custom domains. They also have a CLI that makes it easy to deploy!
If you're not tied to Cloud Storage, another alternative is to host your SPA directly on App Engine, using static files.
Follow this tutorial for something more comprehensive.
If you still want your SPA to be stored in a Cloud Storage bucket, you can use this project to serve it through App Engine. You can host multiple websites with a single app, in fact.
Using App Engine either way, you'll get a free managed certificate, and a free monthly allowance.
For simplicity, use Firebase; the command to deploy is firebase deploy. I've done a few thousand HTML files in a matter of seconds.
I would also recommend the free service Cloudflare provides for an extra level of protection.