I am using Heroku and Amazon S3, for storage.
I'm trying to make the download dialogue appear for the audio file, instead of the browser playing it.
In one of my controllers, I have:
response.content_type = 'application/octet-stream'
response.headers['Content-Disposition'] = "attachment; filename=#audio.filename"
response.headers['X-Accel-Redirect'] = #audio.encoded_file_url
render :nothing => true
#audio.encoded_file_url returns http://bucket_name.s3.amazonaws.com/uploads/19/test.mp3.
Which seems to work on my local machine. However, I am wondering if this approach will block an entire HTTP request handler, freezing the app until the download completes.
In Heroku, a HTTP request handler is one Dyno. And having several Dynos is expensive.
I'm not sure that you can rely on nginx being used (X-Accel-Redirect is an nginx-ism) - the heroku docs imply that it's not always used.
In addition, X-Accel-Redirect is, to my knowledge only for redirecting to files actually on the server, not for externally hosted files. Why not do a normal redirect to the S3 hosted file (using an authenticated URL if needed) ?
If you need to set headers like content disposition, this can be done either at time of upload or afterwards. If you use fog to do your s3 business you could do it like this (assuming storage is a Fog::Storage object)
storage.copy_object("your_bucket", "filename","your_bucket","filename", "x-amz-metadata-directive" => 'REPLACE', 'Content-Disposition' => '...')
Note that this overwrites all the metadata - if you have other fields such as Content-Type, Cache-Control etc. then make sure to set them here or they will be lost.
I would really recommend against letting users download files from your application via the dynos that your using to server your pages. Any static assets should really be served from S3, which you can then direct users to for file downloads.
Whilst the user is downloading, your dyno is effectively just feeding that file to them, and thus unable to do anything else.
Related
I'm trying to add a test coverage badge to the Readme of a private repository on GitHub. Our continuous integration process saves out the image to a secured Google Cloud Storage bucket that's not accessible to the public, and should remain that way.
Google's authorization layer is smart enough that if I go to the URL for the image, I'm automatically redirected to the resource with a valid auto-generated signed URL.
E.g., if I go to http://storage.cloud.google.com/secret-files/mysecretfile.png, then if I'm logged in and allowed to view it, I'm automatically redirected to something like https://blahblah-apidata.googleusercontent.com/download/storage/v1/b/secret-files/o/mysecretfile.png?key=verylongkey, where I can load the image.
This seemed perfect. Reference the canonical path in the GitHub Readme, authenticated users see the image, unauthenticated users are still blocked, we don't have to make the file public, and we don't have to do anything complicated.
Except that GitHub is proxying the image request, meaning that it will always be unauthenticated. My browser is loading something like https://camo.githubusercontent.com/mysecretimage.png.
Is there a clever way to work around this? Or do I need to go back to the drawing board?
All images on github.com are proxied using the Camo image proxy. There are a couple reasons for this:
It preserves the privacy of users. It isn't possible for a document to track users by directing them to a different site or using cookies to track them.
It means images can be cached and served at an appropriate size.
GitHub can have a very strict content security policy that does not allow loading from untrusted sites, which means that any sort of accidental security problem (like an XSS) is a lot less likely to work.
Note the last part. Even if you found some sneaky way to get another image URL to render properly in the website, your browser wouldn't load it because it violates the Content-Security-Policy header the site sent, and moreover, your browser would tattle about that to the reporting URL that GitHub provided.
So any image URL you provide will need to be readable by GitHub's image proxy and it won't be possible to serve different content to different users.
I'm currently looking to host an app with the Angular frontend in a AWS S3 bucket connecting to a PHP backend using the AWS Elastic Beanstalk. I've got it set up and it's working nicely.
However, using S3 to create a static website, anyone can view your code, including the various Angular JS files. This is mostly fine, but I want to create either a file or folder to keep sensitive information in that cannot be viewed by anyone, but can be included/required by all other files. Essentially I want a key that I can attach to all calls to the backend to make sure only authorised requests get through.
I've experimented with various permissions but always seems to be able to view all files, presumably because the static website hosting bucket policy ensures everything is public.
Any suggestions appreciated!
Cheers.
The whole idea of static website hosting on S3 means the content to be public, for example, you have maintenance of your app/web, so you redirect users to the S3 static page notifying there is maintenance ongoing.
I am not sure what all have you tried when you refer to "experimented with various permissions", however, have you tried to setup a bucket policy or maybe setup the bucket as a CloudFront origin and set a Signed URL. This might be a bit tricky considering you want to call these sensitive files by other files. But the way to hide those sensitive files will either be by using some sort of bucket policy or by restricting using some sort of signed URL in my opinion.
I'm building a web application and am looking into using Amazon S3 to store user uploads.
My concern is, I dont want user A to see his download link for a document he uploaded is urltoMyS3/doc1234.pdf and try urltoMyS3/doc1235.pdf and get another users document.
The only way I can think of to do this, is to only allow the web application to connect to S3, then check if the user has access to a file on the web application, have the web app download the file, and then serve it to the client. The problem with this method is the application would have to download the file first and would inevitably slow the download process down for the user.
How is user files typically handled with Amazon S3? Or is it simply not typically used in a scenario where the files should not be public? Is there another service for something like this?
Thanks
You can implement Query String Authentication, which will solve your problem.
Query string authentication is useful for giving HTTP or browser
access to resources that would normally require authentication. The
signature in the query string secures the request. Query string
authentication requests require an expiration date. You can specify
any future expiration time in epoch or UNIX time (number of seconds
since January 1, 1970).
You can do this by generating the appropriate links, see the following
https://docs.aws.amazon.com/AmazonS3/latest/dev/RESTAuthentication.html#RESTAuthenticationQueryStringAuth
If time-bound authentication will not work for (as suggested in other answers). You could consider implementing something like s3fs to mount your S3 bucket as a drive on your web application server. In this manner you can simply make your authentication and then serve up the file directly to the user, without them having any idea that the file resides in S3. Similarly, you can simply write uploaded files directly to this s3fs mount.
S3fs, also allows you to configure a local cache of the S3 directory on your machine for faster access.
This works nicely in a cluster web server environment as well, as you can just have each server mount the s3fs drive and perform/read/writes on it independently.
A link with more info
I am attempting to use an S3 bucket as a deployment location for an internal, auto-updating application's files. It would be the location where the new version's files are dumped for the application to puck up on an update. Since this is an internal application, I was hoping to have the URL be private, but to be able to access it using only a URL. I was hoping to look into using third party auto updating software, which means I can't use the Amazon API to access it.
Does anyone know a way to get a URL to a private bucket on S3?
You probably want to use one of the available AWS Software Development Kits (SDKs), which all implement the respective methods to generate these URLs by means of the GetPreSignedURL() method (e.g. Java: generatePresignedUrl(), C#: GetPreSignedURL()):
The GetPreSignedURL operations creates a signed http request. Query
string authentication is useful for giving HTTP or browser access to
resources that would normally require authentication. When using query
string authentication, you create a query, specify an expiration time
for the query, sign it with your signature, place the data in an HTTP
request, and distribute the request to a user or embed the request in
a web page. A PreSigned URL can be generated for GET, PUT and HEAD
operations on your bucket, keys, and versions.
There are a couple of related questions already and e.g. Why is my S3 pre-signed request invalid when I set a response header override that contains a “+”? contains a working sample in C# (aside from the content type issue Ragesh is experiencing of course).
Good luck!
I'm thinking about whether to host uploaded media files (video and audio) on S3 instead of locally. I need to check user's permissions on each download.
So there would be an action like get_file, which first checks the user's permissions and then gets the file from S3 and sends it using send_file to the user.
def get_file
if #user.can_download(params[:file_id])
# first, download the file from S3 and then send it to the user using send_file
end
end
But in this case, the server (unnecessarily) downloads the file first from S3 and then sends it to the user. I thought the use case for S3 was to bypass the Rails/HTTP server stack for reduced load.
Am I thinking this wrong?
PS. I'm using CarrierWave for file uploads. Not sure if that's relevant.
Amazon S3 provides something called RESTful authenticated reads, which are basically timeoutable URLs to otherwise protected content.
CarrierWave provides support for this. Simply declare S3 access policy to authenticated read:
config.s3_access_policy = :authenticated_read
and then model.file.url will automatically generate the RESTful URL.
Typically you'd embed the S3 URL in your page, so that the client's browser fetches the file directly from Amazon. Note however that this exposes the raw unprotected URL. You could name the file with a long hash instead of something predictable, so it's at least not guessable -- but once that URL is exposed, it's essentially open to the Internet. So if you absolutely always need access control on the files, then you'll need to proxy it like you're currently doing. In that case, you may decide it's just better to store the file locally.