Prevent view of individual file in AWS S3 bucket - amazon-s3

I'm currently looking to host an app with the Angular frontend in a AWS S3 bucket connecting to a PHP backend using the AWS Elastic Beanstalk. I've got it set up and it's working nicely.
However, using S3 to create a static website, anyone can view your code, including the various Angular JS files. This is mostly fine, but I want to create either a file or folder to keep sensitive information in that cannot be viewed by anyone, but can be included/required by all other files. Essentially I want a key that I can attach to all calls to the backend to make sure only authorised requests get through.
I've experimented with various permissions but always seems to be able to view all files, presumably because the static website hosting bucket policy ensures everything is public.
Any suggestions appreciated!
Cheers.

The whole idea of static website hosting on S3 means the content to be public, for example, you have maintenance of your app/web, so you redirect users to the S3 static page notifying there is maintenance ongoing.
I am not sure what all have you tried when you refer to "experimented with various permissions", however, have you tried to setup a bucket policy or maybe setup the bucket as a CloudFront origin and set a Signed URL. This might be a bit tricky considering you want to call these sensitive files by other files. But the way to hide those sensitive files will either be by using some sort of bucket policy or by restricting using some sort of signed URL in my opinion.

Related

Google Cloud Storage: Alternative to signed URLs for folders

Our application data storage is backed by Google Cloud Storage (and S3 and Azure Blob Storage). We need to give access to this storage to random outside tools (upload from local disk using CLI tools, unload from analytical database like Redshift, Snowflake and others). The specific use case is that users need to upload multiple big files (you can think about it much like m3u8 playlists for streaming videos - it's m3u8 playlist and thousands of small video files). The tools and users MAY not be affiliated with Google in any way (may not have Google account). We also absolutely need to data transfer to be directly to the storage, outside of our servers.
In S3 we use federation tokens to give access to a part of the S3 bucket.
So model scenario on AWS S3:
customer requests some data upload via our API
we give customers S3 credentials, that are scoped to s3://customer/project/uploadId, allowing upload of new files
client uses any tool to upload the data
client uploads s3://customer/project/uploadId/file.manifest, s3://customer/project/uploadId/file.00001, s3://customer/project/uploadId/file.00002, ...
other data (be it other uploadId or project) in the bucket is safe because the given credentials are scoped
In ABS we use STS token for the same purpose.
GCS does not seem to have anything similar, except for Signed URLs. Signed URLs have a problem though that they refer to a single file. That would either require us to know in advance how many files will be uploaded (we don't know) or the client would need to request each file's signed URL separately (strain on our API and also it's slow).
ACL seemed to be a solution, but it's only tied to Google-related identities. And those can't be created on demand and fast. Service users are also and option, but their creation is slow and generally they are discouraged for this use case IIUC.
Is there a way to create a short lived credentials that are limited to a subset of the CGS bucket?
Ideal scenario would be that the service account we use in the app would be able to generate a short lived token that would only have access to a subset of the bucket. But nothing such seems to exist.
Unfortunately, no. For retrieving objects, signed URLs need to be for exact objects. You'd need to generate one per object.
Using the * wildcard will specify the subdirectory you are targeting and will identify all objects under it. For example, if you are trying to access objects in Folder1 in your bucket, you would use gs://Bucket/Folder1/* but the following command gsutil signurl -d 120s key.json gs://bucketname/folderName/** will create a SignedURL for each of the files inside your bucket but not a single URL for the entire folder/subdirectory
Reason : Since subdirectories are just an illusion of folders in a bucket and are actually object names that contain a ‘/’, every file in a subdirectory gets its own signed URL. There is no way to create a single signed URL for a specific subdirectory and allow its files to be temporarily available.
There is an ongoing feature request for this https://issuetracker.google.com/112042863. Please raise your concern here and look for further updates.
For now, one way to accomplish this would be to write a small App Engine app that they attempt to download from instead of directly from GCS which would check authentication according to whatever mechanism you're using and then, if they pass, generate a signed URL for that resource and redirect the user.
Reference : https://stackoverflow.com/a/40428142/15803365

CloudFront or S3 ACL: This XML file does not appear to have... AccessDenied / Failed to contact the origin

I'm using custom domain and CloudFront for S3 static hosting site to serve https.
It's working fine when I open the pages through the app's internal buttons or link,
but if I input direct URL in the address bar, or click the browser refresh button, it shows
This XML file does not appear to have any style information associated with it. The document tree is shown below.... Access Denied error screen.
I searched related answers and tried to /index.html in the CloudFront general setting as Default Root Object but it didn't work. (Before this try, it was index.html)
When I updated it as /index.html, even the domain itself didn't work.
I have another S3 static hosting site without CloudFront and certificate just for testing.
This site is working fine even I input direct url or click the refresh button.
Above two S3 bucket have same settings (root object is index.html and error document is also index.html)
After this, I changed CloudFront Origin Domain Name from REST endpoint to website endpoint referred to this docs (https://aws.amazon.com/premiumsupport/knowledge-center/s3-website-cloudfront-error-403/)
But now getting this error when I refresh the screen.
All the object in S3 is owned to bucket owner and has public access.
This app is made by React and using react-router-dom.
Could you give me any hint or advice?
Thanks.
Solved...
My S3 bucket region requires . instead of - when I use website endpoint for cloudfront.
And FYI..
In my case, there are some little difference with the document and some tutorial. My CloudFront distribution doesn't need to use default root object, and individual objects in S3 has no public access but the bucket has it.
There are some specific endpoints to be used for website hosting buckets, which are listed in the Amazon Simple Storage Service endpoints and quotas document. For example, when hosting in eu-west-1, cloudfront will prepopulate the dropdown with example.s3.eu-west-1.amazonaws.com, but if you look into the bucket settings, Static website hosting section, it will show you the correct url example.s3-website-eu-west-1.amazonaws.com
Carefully read the table! The url scheme is not fully consistent, eg. s3-website.us-east-2.amazonaws.com but s3-website-us-east-1.amazonaws.com - just to make your day a bit more joyful.
So I had the exact same issue and was able to resolve it after taking the s3 bucket endpoint located in the properties of the s3 bucket and then pasting it into the cloudfront origins section into the origin domain. I removed the beginning of the endpoint for example: "http://website.com.s3-website.us-east-2.amazonaws.com" you would just remove the "http://" and then post the rest into the cloudfront origin domain and click save. That should solve the problem!
I tried all kinds of different options such as making sure every object was public as well in the s3 bucket. Make sure your s3 bucket is also publicly available.
Certain regions do have different endpoints for your s3 buckets. Here is a link that shows more of that: https://aws.amazon.com/premiumsupport/knowledge-center/s3-rest-api-cloudfront-error-403/

amazon s3 for downloads how to handle security

I'm building a web application and am looking into using Amazon S3 to store user uploads.
My concern is, I dont want user A to see his download link for a document he uploaded is urltoMyS3/doc1234.pdf and try urltoMyS3/doc1235.pdf and get another users document.
The only way I can think of to do this, is to only allow the web application to connect to S3, then check if the user has access to a file on the web application, have the web app download the file, and then serve it to the client. The problem with this method is the application would have to download the file first and would inevitably slow the download process down for the user.
How is user files typically handled with Amazon S3? Or is it simply not typically used in a scenario where the files should not be public? Is there another service for something like this?
Thanks
You can implement Query String Authentication, which will solve your problem.
Query string authentication is useful for giving HTTP or browser
access to resources that would normally require authentication. The
signature in the query string secures the request. Query string
authentication requests require an expiration date. You can specify
any future expiration time in epoch or UNIX time (number of seconds
since January 1, 1970).
You can do this by generating the appropriate links, see the following
https://docs.aws.amazon.com/AmazonS3/latest/dev/RESTAuthentication.html#RESTAuthenticationQueryStringAuth
If time-bound authentication will not work for (as suggested in other answers). You could consider implementing something like s3fs to mount your S3 bucket as a drive on your web application server. In this manner you can simply make your authentication and then serve up the file directly to the user, without them having any idea that the file resides in S3. Similarly, you can simply write uploaded files directly to this s3fs mount.
S3fs, also allows you to configure a local cache of the S3 directory on your machine for faster access.
This works nicely in a cluster web server environment as well, as you can just have each server mount the s3fs drive and perform/read/writes on it independently.
A link with more info

Does Amazon S3 help anything in this case?

I'm thinking about whether to host uploaded media files (video and audio) on S3 instead of locally. I need to check user's permissions on each download.
So there would be an action like get_file, which first checks the user's permissions and then gets the file from S3 and sends it using send_file to the user.
def get_file
if #user.can_download(params[:file_id])
# first, download the file from S3 and then send it to the user using send_file
end
end
But in this case, the server (unnecessarily) downloads the file first from S3 and then sends it to the user. I thought the use case for S3 was to bypass the Rails/HTTP server stack for reduced load.
Am I thinking this wrong?
PS. I'm using CarrierWave for file uploads. Not sure if that's relevant.
Amazon S3 provides something called RESTful authenticated reads, which are basically timeoutable URLs to otherwise protected content.
CarrierWave provides support for this. Simply declare S3 access policy to authenticated read:
config.s3_access_policy = :authenticated_read
and then model.file.url will automatically generate the RESTful URL.
Typically you'd embed the S3 URL in your page, so that the client's browser fetches the file directly from Amazon. Note however that this exposes the raw unprotected URL. You could name the file with a long hash instead of something predictable, so it's at least not guessable -- but once that URL is exposed, it's essentially open to the Internet. So if you absolutely always need access control on the files, then you'll need to proxy it like you're currently doing. In that case, you may decide it's just better to store the file locally.

Create my own error page for Amazon S3

I was wondering if it's possible to create my own error pages for my S3 buckets. I've got CloudFront enabled and I am using my own CNAME to assign the S3 to a subdomain for my website. This helps me create tidy links that reference my domain name.
When someone tries to access a file that has perhaps been deleted or the link isn't quite correct, they get the XML S3 error page which is ugly and not very helpful to the user.
Is there a way to override these error pages so I can display a helpful HTML page instead?
If you configure your bucket as a 'website', you can create custom error pages.
For more details see the Amazon announcement of this feature and the AWS developer guide.
There are however some caveats with this approach, a major one being that your objects need to be publicly available.
It also works with Cloudfront, but the same public access limitations apply. See https://forums.aws.amazon.com/ann.jspa?annID=921.
If you want, you can try these out
right away by configuring your Amazon
S3 bucket as a website and making the
new Amazon S3 website endpoint a
custom origin for your CloudFront
distribution. A few notes when you do
this. First, you must set your custom
origin protocol policy to “http-only.”
Second, you’ll need to use a tool that
supports CloudFront’s custom origin
feature – the AWS Management Console
does not at this point offer this
feature. Finally, note that when you
use Amazon S3’s static website
feature, all the content in your S3
bucket must be publicly accessible, so
you cannot use CloudFront’s private
content feature with that bucket. If
you would like to use private content
with S3, you need to use the S3 REST
endpoint (e.g., s3.amazonaws.com).