List of reserved parameter names for AWS S3

It seems that the GET parameter location is a reserved parameter on AWS S3. Say I have a resource on an S3 bucket, accessible via the web:
http://my-bucket.s3.amazonaws.com/index.html
If I simply append the GET parameter location to it, I get an HTTP 403:
http://my-bucket.s3.amazonaws.com/index.html?location=US
It works so long as I change the parameter name to something else. For example:
http://my-bucket.s3.amazonaws.com/index.html?loc=US
So clearly location is a reserved word in AWS S3. My question is: is there a list of all reserved words I shouldn't try to use as GET parameters with S3?
I searched the docs but couldn't find any such list.

location in the query tells S3 that you're asking for the location of a bucket. It's one of several "subresources" (things that are not objects) in S3 that are accessed via query string parameters.
You could probably compile a nearly complete list by reviewing the entire API reference documentation, but here's a partial list found in some older docs (Signature Version 2):
The subresources that must be included when constructing the CanonicalizedResource Element are acl, lifecycle, location, logging, notification, partNumber, policy, requestPayment, torrent, uploadId, uploads, versionId, versioning, versions, and website.
https://docs.aws.amazon.com/AmazonS3/latest/dev/RESTAuthentication.html
They periodically add new ones, like select, delete, and tagging, so an exhaustive list is not future-proof.
Your safest bet is to use parameters beginning with x- (but not beginning with x-amz since these may be reserved or carry other implications). This is mentioned in the logging documentation:
You can include custom information to be stored in the access log record for a request by adding a custom query-string parameter to the URL for the request. Amazon S3 ignores query-string parameters that begin with "x-", but includes those parameters in the access log record for the request, as part of the Request-URI field of the log record.
https://docs.aws.amazon.com/AmazonS3/latest/dev/LogFormat.html
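As a minimal sketch of the difference, using the bucket from the question and a made-up custom parameter name:

from urllib.parse import urlencode

base = "https://my-bucket.s3.amazonaws.com/index.html"

# Reserved: S3 treats ?location as the bucket-location subresource, hence the 403
reserved_url = base + "?" + urlencode({"location": "US"})

# Ignored by S3 but written to the access log: "x-" prefix (not "x-amz-")
logged_url = base + "?" + urlencode({"x-campaign": "newsletter"})

print(reserved_url)
print(logged_url)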

Related

Hiding S3 path in AWS CloudFront URL

I am trying to make sure I did not miss anything in the AWS CloudFront documentation or anywhere else ...
I have a (not public) S3 bucket configured as the origin in a CloudFront web distribution (I don't think it matters, but I am using signed URLs).
Let's say I have a file at an S3 path like
/someRandomString/someCustomerName/someProductName/somevideo.mp4
So the URL generated by CloudFront would be something like:
https://my.domain.com/someRandomString/someCustomerName/someProductName/somevideo.mp4?Expires=1512062975&Signature=unqsignature&Key-Pair-Id=keyid
Is there a way to obfuscate the path to the actual file in the generated URL? All 3 parts before the filename can change, so I prefer not to use "Origin Path" in the Origin Settings to hide the beginning of the path. With that approach, I would have to create a lot of origins mapped to the same bucket but with different paths. If that's the only way, then the limit of 25 origins per distribution would be a problem.
Ideally, I would like to get something like
https://my.domain.com/someRandomObfuscatedPath/somevideo.mp4?Expires=1512062975&Signature=unqsignature&Key-Pair-Id=keyid
Note: I am also using my own domain/CNAME.
Thanks
Cris
One way could be to use a Lambda function that receives the S3 file's path, copies it into an obfuscated directory (maybe keeping a simple mapping from source to destination), and then returns the signed URL of the copied file. This ensures that only the obfuscated path is visible externally.
Of course, this will (potentially) double the data storage, so you need some way to clean up the obfuscated folders. That could be done in a time-based manner: if each signed URL is expected to expire after 24 hours, you could create folders based on date, and each of the obfuscated directories could be deleted every other day.
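A rough sketch of that copy-then-sign idea, assuming a Python Lambda using boto3; the bucket name and key layout are hypothetical, and a real setup on a custom domain would sign a CloudFront URL rather than an S3 one:

import uuid
import boto3

s3 = boto3.client("s3")
BUCKET = "my-video-bucket"  # hypothetical bucket name

def handler(event, context):
    # e.g. "someRandomString/someCustomerName/someProductName/somevideo.mp4"
    source_key = event["key"]
    filename = source_key.rsplit("/", 1)[-1]
    obfuscated_key = f"obfuscated/{uuid.uuid4().hex}/{filename}"

    # Copy the object under an opaque prefix so only that path is ever exposed
    s3.copy_object(
        Bucket=BUCKET,
        Key=obfuscated_key,
        CopySource={"Bucket": BUCKET, "Key": source_key},
    )

    # Hand back a time-limited URL for the copy; a production setup would
    # sign a CloudFront URL on the custom domain here instead of an S3 one
    return s3.generate_presigned_url(
        "get_object",
        Params={"Bucket": BUCKET, "Key": obfuscated_key},
        ExpiresIn=24 * 3600,
    )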
Alternatively, you could use a service like tinyurl.com or something similar to create a mapping. It would be much easier, save on storage, etc. The only downside would be that it would not reflect your domain name.
If you have the ability to modify the routing of your domain then this is a non-issue, but I presume that's not an option.
Obfuscation is not a form of security.
If you wish to control which objects users can access, you should use Pre-Signed URLs or Cookies. This way, you can grant access to private objects via S3 or CloudFront and not worry about people obtaining access to other objects.
See: Serving Private Content through CloudFront
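For reference, here is a hedged sketch of generating a CloudFront signed URL with botocore's CloudFrontSigner (it needs the cryptography package); the key-pair ID and private-key path are placeholders, and the URL is the obfuscated example from the question:

import datetime
from botocore.signers import CloudFrontSigner
from cryptography.hazmat.primitives import hashes, serialization
from cryptography.hazmat.primitives.asymmetric import padding

KEY_PAIR_ID = "APKAEXAMPLEKEYID"         # placeholder CloudFront key pair ID
PRIVATE_KEY_PATH = "cf-private-key.pem"  # placeholder path to the signing key

def rsa_signer(message):
    # Sign the policy with the CloudFront key pair's private key
    with open(PRIVATE_KEY_PATH, "rb") as f:
        key = serialization.load_pem_private_key(f.read(), password=None)
    return key.sign(message, padding.PKCS1v15(), hashes.SHA1())

signer = CloudFrontSigner(KEY_PAIR_ID, rsa_signer)
signed_url = signer.generate_presigned_url(
    "https://my.domain.com/someRandomObfuscatedPath/somevideo.mp4",
    date_less_than=datetime.datetime.utcnow() + datetime.timedelta(hours=24),
)
print(signed_url)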

CloudFront query strings not logging

I have set up a CloudFront distribution to deal with image resizing for an app.
I have my image resize function sitting in AWS Lambda, with an API Gateway call wrapped around it. In order to call this function, the following URL is used:
/images?url=&width=&height=
and here is an example:
/images?height=300&width=300&url=smodlEMvQc
When I add this onto the end of my CloudFront URL as follows:
examplecloudfront.net/images?height=300&width=300&url=smodlEMvQc
the query strings never appear in the Popular Objects report, which indicates the URLs are not being cached.
I have ticked the Forward Query Strings option, so the query-string-inclusive URLs should be showing up in the popular objects report; I have tested the same URLs many times without any success.
For the sake of completeness:
The query strings will appear in the CloudFront access logs.
Enable logging for the CloudFront distribution you are using from the AWS console.
This will direct the logs into an S3 bucket.
The logs are tab-delimited text files that show the URL and the query string for each request.
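If you want to inspect them programmatically, here is a small sketch; the bucket and prefix names are hypothetical, and standard CloudFront access logs are assumed to be gzipped, tab-delimited files:

import gzip
import boto3

s3 = boto3.client("s3")
LOG_BUCKET = "my-cloudfront-logs"   # hypothetical logging bucket
LOG_PREFIX = "cdn/"                 # hypothetical log prefix

paginator = s3.get_paginator("list_objects_v2")
for page in paginator.paginate(Bucket=LOG_BUCKET, Prefix=LOG_PREFIX):
    for obj in page.get("Contents", []):
        body = s3.get_object(Bucket=LOG_BUCKET, Key=obj["Key"])["Body"].read()
        for line in gzip.decompress(body).decode("utf-8").splitlines():
            if line.startswith("#"):      # skip version/field header lines
                continue
            fields = line.split("\t")
            # cs-uri-stem and cs-uri-query (indexes assume the standard field order)
            print(fields[7], fields[11])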

Allowing multiple content types in HTTP POST Amazon S3 upload policy document

Does anybody know how to allow multiple content types in an Amazon S3 upload policy when uploading using HTTP POST? I can't seem to find the answer to this anywhere.
I am aware that I can restrict an upload to any file with a MIME type that starts with "image/" as follows:
{"expiration": "2015-02-28T00:00:00Z",
"conditions": [
["starts-with", "$Content-Type", "image/*"]
]
}
But how would I go about allowing only a certain few MIME types which might not all start with the same characters?
This isn't supported. It's either a single pattern match (including a wildcard), or you have to allow all.
Depending on how the form is being generated -- dynamically, one assumes -- you might be able to simply tell the application the content-type of the file you intend to upload when requesting the resource that builds the form, hence, telling the application what content-type value to use on the form and when generating the policy document.
If the application doesn't find that content-type in its list of acceptable values, it could just refuse to render the form, and refuse to create and sign a matching policy statement.
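As one possible shape for that, here is a sketch using boto3's generate_presigned_post; the allow-list, bucket, and key are made up. The server pins the exact Content-Type the client asked for and refuses to sign anything else:

import boto3

ALLOWED_TYPES = {"image/jpeg", "image/png", "application/pdf"}  # made-up allow-list

def build_upload_form(bucket: str, key: str, content_type: str):
    # Sign a POST policy only for a content type we accept
    if content_type not in ALLOWED_TYPES:
        raise ValueError(f"refusing to sign a policy for {content_type!r}")

    s3 = boto3.client("s3")
    return s3.generate_presigned_post(
        Bucket=bucket,
        Key=key,
        Fields={"Content-Type": content_type},
        # Pin the exact value so the signed policy matches only this one type
        Conditions=[{"Content-Type": content_type}],
        ExpiresIn=600,
    )

# form = build_upload_form("my-upload-bucket", "uploads/photo.jpg", "image/jpeg")
# form["url"] is the POST endpoint; form["fields"] are the hidden form inputs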
Depending on the application, there may be little point in worrying too much about the Content-Type field here, because this is not actually restricting the content types that can be uploaded... it's only restricting the value submitted in the form's Content-Type field. That's all this actually restricts.
There's no validation of whether that value accurately represents the MIME type of the payload that is being uploaded, so the policy document isn't restricting what kind of content you can upload. It's only restricting what kind of content you can claim you are uploading.
It may also be more appropriate to just accept otherwise-unusable uploads and handle the problem on the back-end, after the fact.

FineUploader: Harvest original last modified date when uploading to Amazon S3

I would like to send the last modified date of the uploaded file to the server. I have the JavaScript snippet to get it using the File API ($(this).fineUploaderS3('getFile', id).lastModifiedDate). I would like to send this information when the uploadSuccess endpoint is called, but I cannot find the right callback in the Events | Fine Uploader documentation, and I cannot find a way to inject the data.
These are submitted as POST parameters to my server when the upload to S3 finishes: key, uuid, name, bucket. I would like to inject the lastModified date here somehow.
Option 2:
Asking the Amazon S3 service about the last modification date does not help directly, because the uploaded file has the current date, not the file's original date. It would be great if we could inject the information into the FineUploader->S3 communication in a way that S3 would use it to set its own last modified date for the uploaded file.
Other perspective I considered:
If I use onSubmit and setParams, the Amazon S3 server will take it as 'x-amz-meta-lastModified'. The problem is that when I upload larger files (which are uploaded in chunks, with a different dance), I get a signing error: ...<Code>SignatureDoesNotMatch</Code><Message>The request signature we calculated does not match the signature you provided. Check your key and signing method.</Message>....
EDIT
The "Other perspective I considered" works. The bottleneck was the name of the custom metadata field which I used at setParams. It cannot contain capital letters, otherwise the signing fails. I did not find any reference documentation for this. For one, I checked Object Key and Metadata - Amazon Simple Storage Service. If someone could find me a reference, I would include it here.
The original question (when and how to send last modified date to the server component) remains.
(Server is PHP.)
EDIT2
Option 2 will not work; as far as my research went, the "Last Modified" entry cannot be manually altered in Amazon S3.
If the S3 API does not return the expected last modified date, you can check the value of the lastModifiedDate on the File object associated with the upload (provided the browser supports the file API) and send that value as a parameter to the upload success endpoint. See the documentation for the setUploadSuccessParams API method for more details.

S3 Bucket Types

Just wondering if there is a recommended strategy for storing different types of assets/files in separate S3 buckets or just put them all in one bucket? The different types of assets that I have include: static site images, user's profile images, user-generated content like documents, files, and videos.
As far as how to group files into buckets goes, that is really not that critical of an issue unless you want to have different domain names or CNAMEs for different types of content, in which case you would need a separate bucket for each domain name you want to use.
I would tend to group them by functionality. For example, static files used in your application that you have full control over might be deployed into a separate bucket from content that is going to be user-generated. Or you might want to have video in a different bucket than images, etc.
To add to my earlier comments about S3 metadata: it is going to be a critical part of optimizing how you serve up content from S3/CloudFront.
Basically, S3 metadata consists of key-value pairs. So you could have Content-Type as a key with a value of image/jpeg, for example, if the file is a .jpg. This will automatically send the appropriate Content-Type header corresponding to your value for requests made directly to the S3 URL or via CloudFront. The same is true of Cache-Control metadata. You can also use your own custom metadata. For example, I use a custom metadata key named x-amz-meta-md5 to store an md5 hash of the file. It is used for simple bucket comparisons against content stored in a revision control system, so we don't have to compute checksums of each file in the bucket on the fly. We use this for pushing differential content updates to the buckets (i.e. only push those that have changed).
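As an illustrative sketch of that approach (boto3, with a made-up bucket and a hard-coded content type; the x-amz-meta-md5 convention is just the one described above, not anything S3 requires):

import hashlib
import boto3

s3 = boto3.client("s3")

def upload_with_metadata(bucket: str, key: str, path: str):
    with open(path, "rb") as f:
        data = f.read()

    s3.put_object(
        Bucket=bucket,
        Key=key,
        Body=data,
        ContentType="image/jpeg",                 # served back as the Content-Type header
        CacheControl="max-age=31536000, public",  # long expiry, pairs with versioned names
        # Stored as x-amz-meta-md5; used later to diff the bucket against source control
        Metadata={"md5": hashlib.md5(data).hexdigest()},
    )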
As far as revision control goes, I would HIGHLY recommend using versioned file names. In other words, say you have bigimage.jpg and you want to make an update: call it bigimage1.jpg and change your code to reflect this. Why? Because optimally, you would like to set long expiration time frames in your Cache-Control headers. Unfortunately, if you then want to deploy a file of the same name and you are using CloudFront, it becomes problematic to invalidate the edge caching locations. Whereas if you have a new file name, CloudFront will just begin to populate the edge nodes and you don't have to worry about invalidating the cache at all.
Similarly for user-produced content, you might want to include an md5 or some other (mostly) unique identifier scheme, so that each video/image can have its own unique filename and place in the cache.
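A tiny sketch of that kind of content-hashed naming (purely illustrative; the prefix and hash length are arbitrary choices):

import hashlib
import pathlib

def versioned_key(path: str, prefix: str = "static/") -> str:
    p = pathlib.Path(path)
    digest = hashlib.md5(p.read_bytes()).hexdigest()[:12]
    # e.g. static/bigimage.3f2c9a1b77de.jpg -- a new name whenever the content
    # changes, so long Cache-Control lifetimes never need a CloudFront invalidation
    return f"{prefix}{p.stem}.{digest}{p.suffix}"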
For your reference, here is a link to the AWS documentation on setting up streaming in CloudFront:
http://docs.amazonwebservices.com/AmazonCloudFront/latest/DeveloperGuide/CreatingStreamingDistributions.html