I am using AWS S3 for storing files and returning links to those files. It's very convenient, but I only know how to generate a pre-signed URL, which only lasts for a set amount of time.
Is there any way to generate a default URL which lasts permanently? I need the user to be able to access the photo from within their app. They take a photo, it's uploaded. They view it. Pretty standard.
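For reference, a pre-signed GET URL of the kind described above is typically generated along these lines with the aws-sdk-s3 gem; the bucket and key here are placeholders, and with SigV4 signing the expiry can be at most 7 days, which is why a truly permanent link needs a public object or a different approach:

    require 'aws-sdk-s3'

    # Placeholder bucket/key; presigned_url always needs an expiry (SigV4 caps it at 7 days).
    obj = Aws::S3::Resource.new(region: 'us-east-1')
                           .bucket('my-app-photos')
                           .object('uploads/photo-123.jpg')

    url = obj.presigned_url(:get, expires_in: 7 * 24 * 3600)
    puts url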
You have to use Handlers (DOI, PURL, PURLZ, persistent URIs).
All are free of charge, except for DOIs.
You create that handler, then you add it to your project.
I have image upload in my system. I am struggling to understand the logic of serving images.
If I upload directly to wwwroot, the files will be accessible to everyone, which is not what I want.
I understand I could save the file contents in the database as base64 but those can be big files, and I would like them on the server in files.
I could convert them on the fly when requested, most probably getting the path to the file, loading it into a memory stream, and spitting out the base64. But that seems like overkill and not an elegant solution. I use AutoMapper for most data, and I would have to write some crazy custom mappers, which I will if there is no other way.
I could create a virtual path, which from what I understand maps a physical path on the server to a URL, but that doesn't seem any different from option 1.
I fancy there is a way to produce a link/URL that this user (or at least logged-in users) has access to, which can be passed to the app so it can load the image. Is this impossible or unreasonable? Or am I missing something?
What is the correct way of doing this in general?
Also, what is a quick way to do it without spending days for setup?
To protect specific static files, you can try the solutions explained in this official doc.
Solution A: Store the static files you want to authorize outside of wwwroot, call UseStaticFiles to specify a path and other StaticFileOptions after calling UseAuthorization, and then set the fallback authorization policy.
Solution B: Store the static files you want to authorize outside of wwwroot, and serve them via a controller action method to which authorization is applied and which returns a FileResult object.
I am trying to make sure I did not miss anything in the AWS CloudFront documentation or anywhere else ...
I have a (non-public) S3 bucket configured as the origin in a CloudFront web distribution (I don't think it matters, but I am using signed URLs).
Let's say I have a file in an S3 path like
/someRandomString/someCustomerName/someProductName/somevideo.mp4
So the URL generated by CloudFront would be something like:
https://my.domain.com/someRandomString/someCustomerName/someProductName/somevideo.mp4?Expires=1512062975&Signature=unqsignature&Key-Pair-Id=keyid
Is there a way to obfuscate the path to the actual file in the generated URL? All 3 parts before the filename can change, so I prefer not to use "Origin Path" in the Origin Settings to hide the beginning of the path. With that approach, I would have to create a lot of origins mapped to the same bucket but different paths. If that's the only way, then the limit of 25 origins per distribution would be a problem.
Ideally, I would like to get something like
https://my.domain.com/someRandomObfuscatedPath/somevideo.mp4?Expires=1512062975&Signature=unqsignature&Key-Pair-Id=keyid
Note: I am also using my own domain/CNAME.
Thanks
Cris
One way could be to use a Lambda function that receives the S3 file's path, copies it into an obfuscated directory (maybe with a simple mapping from the source path to the obfuscated one), and then returns the signed URL of the copied file. This ensures that only the obfuscated path is visible externally.
Of course, this will (potentially) double the data storage, so you need some way to clean up the obfuscated folders. That could be done in a time-based manner: if each signed URL is expected to expire after 24 hours, you could create folders based on date, and each of the obfuscated directories could be deleted every other day.
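A rough sketch of that copy-to-an-obfuscated-path idea (plain Ruby with the aws-sdk-s3 gem; the bucket name, region, and key layout are assumptions, and the date-based prefix is what makes the later cleanup easy):

    require 'aws-sdk-s3'
    require 'securerandom'
    require 'date'

    BUCKET = 'my-media-bucket' # placeholder

    # Copy the real object under a dated, random key so only that key is ever exposed;
    # the signed URL is then generated for the copy, and old date prefixes can be purged.
    def obfuscated_copy(source_key)
      client = Aws::S3::Client.new(region: 'us-east-1')
      obfuscated_key = "#{Date.today}/#{SecureRandom.urlsafe_base64(12)}/#{File.basename(source_key)}"
      client.copy_object(bucket: BUCKET,
                         copy_source: "#{BUCKET}/#{source_key}",
                         key: obfuscated_key)
      obfuscated_key
    end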
Alternatively, you could use a service like tinyurl.com or something similar to create a mapping. It would be much easier, save on storage, etc. The only downside would be that it would not reflect your domain name.
If you have the ability to modify the routing of your domain then this is a non-issue, but I presume that's not an option.
Obfuscation is not a form of security.
If you wish to control which objects users can access, you should use Pre-Signed URLs or Cookies. This way, you can grant access to private objects via S3 or CloudFront and not worry about people obtaining access to other objects.
See: Serving Private Content through CloudFront
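A hedged sketch of what generating such a CloudFront signed URL looks like with the aws-sdk-cloudfront gem (the key pair ID, private key path, domain, and object path are all placeholders):

    require 'aws-sdk-cloudfront'

    # Placeholders: use your own CloudFront key pair ID and private key.
    signer = Aws::CloudFront::UrlSigner.new(
      key_pair_id: 'APKAEXAMPLE',
      private_key_path: '/path/to/cloudfront_private_key.pem'
    )

    url = signer.signed_url(
      'https://my.domain.com/someRandomString/someCustomerName/someProductName/somevideo.mp4',
      expires: Time.now + 3600 # canned policy, link valid for one hour
    )
    puts url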
I have a small communications app in Ember and the server is polled every 10 seconds for new data.
Next to each thread, or reply under a thread, there is a user area that displays the user's email, avatar, and badges. The avatars are hosted on S3, and every time the site sends down JSON data, a GET request is sent off for each avatar on S3.
There are other images alongside the avatars, like badge icons, but those are hosted in the Rails public folder and do not trigger GET requests.
Why is this happening? Is there a way to cache or suppress the GET requests every time JSON is returned from the server?
Why is this happening?
Sounds like some of your views are re-rendering when they get new data. It may be possible to prevent this by changing how your templates are structured, but that is probably overkill for this situation. If you want to see what's going on, try adding instrumentation to your app to console.log when views get re-rendered: How to profile the page render performance of ember?
Assuming the avatars and other images are both getting re-rendered, the S3 ones probably trigger a new GET request because the headers being returned by S3 are not set to allow browser caching.
is there a way to cache or suppress the get requests every time JSON is returned from the server?
Yes, adjust the settings on your S3 images so that the browser will cache them.
In the S3 management console, go to the Properties -> Metadata section for each file you want to update.
Set an Expires header for your files.
Other headers to look at are Cache-Control and Last-Modified.
See How to use browser caching with Amazon S3?
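If you would rather set those headers from code than click through the console, something along these lines with the aws-sdk-s3 gem should work (bucket and key are placeholders); updating metadata on an existing object means copying it over itself with metadata_directive: 'REPLACE':

    require 'aws-sdk-s3'

    client = Aws::S3::Client.new(region: 'us-east-1')

    # Rewrites the avatar in place with caching headers so the browser can reuse it.
    client.copy_object(
      bucket: 'my-avatar-bucket',
      copy_source: 'my-avatar-bucket/avatars/user-123.png',
      key: 'avatars/user-123.png',
      metadata_directive: 'REPLACE',
      content_type: 'image/png',
      cache_control: 'public, max-age=86400', # cache for a day
      expires: Time.now + 86_400              # the legacy Expires header
    )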
Just wondering if there is a recommended strategy for storing different types of assets/files in separate S3 buckets or just put them all in one bucket? The different types of assets that I have include: static site images, user's profile images, user-generated content like documents, files, and videos.
As far as how to group files into buckets goes, that is really not that critical of an issue unless you want to have different domain names or CNAMEs for different types of content, in which case you would need a separate bucket for each domain name you want to use.
I would tend to group them by functionality. For example, static files used in your application that you have full control over might go into a separate bucket from content that is going to be user-generated. Or you might want to have video in a different bucket than images, etc.
To add to my earlier comments about S3 metadata: it is going to be a critical part of optimizing how you serve up content from S3/CloudFront.
Basically, S3 metadata consists of key-value pairs. So you could have Content-Type as a key with a value of image/jpeg, for example, if the file is a .jpg. This will automatically send the appropriate Content-Type header, corresponding to your value, for requests made directly to the S3 URL or via CloudFront. The same is true of Cache-Control metadata. You can also use your own custom metadata keys. For example, I use a custom metatag named x-amz-meta-md5 to store an MD5 hash of the file. It is used for simple bucket comparisons against content stored in a revision control system, so we don't have to compute checksums of each file in the bucket on the fly. We use this for pushing differential content updates to the buckets (i.e. only push those that have changed).
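As a small illustration of the above (aws-sdk-s3 gem; bucket, key, and file path are placeholders), this is roughly how those headers and the custom md5 metatag would be set at upload time:

    require 'aws-sdk-s3'
    require 'digest'

    client = Aws::S3::Client.new(region: 'us-east-1')
    path   = 'bigimage.jpg' # local file, placeholder

    client.put_object(
      bucket: 'my-static-bucket',
      key: 'images/bigimage.jpg',
      body: File.open(path, 'rb'),
      content_type: 'image/jpeg',                # returned as the Content-Type header
      cache_control: 'public, max-age=31536000', # long expiry; pair with versioned names (below)
      metadata: { 'md5' => Digest::MD5.file(path).hexdigest } # stored as x-amz-meta-md5
    )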
As far as revision control goes, I would HIGHLY recommend using versioned file names. In other words, say you have bigimage.jpg and you want to make an update: call it bigimage1.jpg and change your code to reflect this. Why? Because optimally you would like to set long expiration time frames in your Cache-Control headers. Unfortunately, if you then want to deploy a file of the same name and you are using CloudFront, it becomes problematic to invalidate the edge caching locations. Whereas if you have a new file name, CloudFront will just begin to populate the edge nodes and you don't have to worry about invalidating the cache at all.
Similarly, for user-produced content, you might want to include an MD5 hash or some other (mostly) unique identifier in the name, so that each video/image can have its own unique filename and place in the cache.
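One possible way to build such versioned, content-derived names (plain Ruby; the prefix and file names are placeholders):

    require 'digest'

    # Produces a key like "videos/somevideo-5f4dcc3b.mp4"; a changed file gets a new
    # name and URL, so long Cache-Control lifetimes never need a CloudFront invalidation.
    def versioned_key(prefix, local_path)
      digest = Digest::MD5.file(local_path).hexdigest[0, 8]
      ext    = File.extname(local_path)
      "#{prefix}/#{File.basename(local_path, ext)}-#{digest}#{ext}"
    end

    versioned_key('videos', 'somevideo.mp4')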
For your reference, here is a link to the AWS documentation on setting up streaming in CloudFront:
http://docs.amazonwebservices.com/AmazonCloudFront/latest/DeveloperGuide/CreatingStreamingDistributions.html
So I have been reading about signed URLs and some of their benefits, especially the part about hotlinking. Our app doesn't allow users to embed media (photo, video, audio) from our site, so signed URLs look like the right direction, mostly to prevent hotlinking.
So now that I know my requirements, I have a few questions.
1. Does this mean I have to add a policy to my bucket, denying read-write access to any of the files or folders in the bucket?
2. Do I have to create signed URLs for each page visit? So let's say 100 users visit the same page where the song can be played. Does this mean I have to create 100 signed URLs?
3. Is creating S3 signed URLs free?
Touching on point #2: is it normal practice for Amazon S3 to create several signed URLs? I mean, what happens if 1,000 users end up coming to the same song page?
Your thoughts?
REFERENCE:
For anyone interested in how I was able to generate signed URLs, based on the https://github.com/appoxy/aws gem and the docs at http://rubydoc.info/gems/aws/2.4.5/frames:
    require 'aws' # appoxy aws gem

    # Credentials come from app config; the expiry (1.hour) needs ActiveSupport/Rails.
    s3 = Aws::S3.new(APP_CONFIG['amazon_access_key_id'], APP_CONFIG['amazon_secret_access_key'])
    bucket_gen = Aws::S3Generator::Bucket.create(s3, 'bucket_name')
    signed_url = bucket_gen.get(URI.unescape(URI.parse(URI.escape('http://bucket_name.s3.amazonaws.com/uploads/foobar.mp3')).path[1..-1]), 1.hour)
By default, your bucket will be set to private. When you upload files to S3, you can set the ACL (permissions); in your case, you'll want to make sure the files are private.
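As a rough modern equivalent of the snippet above, using the current aws-sdk-s3 gem (bucket, key, and file name are placeholders): keep the object private at upload time and hand each visitor a short-lived signed URL instead of the raw S3 link.

    require 'aws-sdk-s3'

    obj = Aws::S3::Resource.new(region: 'us-east-1')
                           .bucket('bucket_name')
                           .object('uploads/foobar.mp3')

    # Upload privately; nobody can fetch the raw S3 URL directly.
    obj.upload_file('foobar.mp3', acl: 'private')

    # Each visitor gets their own short-lived link (signing happens locally, no S3 call).
    signed_url = obj.presigned_url(:get, expires_in: 3600)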
The simplest solution is to create new signed URLs for each visitor. You could do something like generate new URLs every day, store them somewhere, and then use those, but that adds complexity for little benefit. The one place where you might need this, though, is to enable client-side caching. Every time you create a new URL, the browser sees it as a different file and will download a fresh copy. If this isn't the behaviour you want, you need to generate URLs that expire far in the future and reuse those, but that will reduce the effectiveness of preventing hotlinking.
Yes, generating URLs is free. They are generated on the client and don't touch S3. I suppose there is a time/processing cost, but I have created pages with hundreds of URLs that are generated on each visit and have not noticed any performance issues.