Block POST requests via S3 + CloudFlare? - amazon-s3

I'm using an S3 bucket to host a static site, and CloudFlare as a CDN.
I'm noticing a large amount of POST requests (seems like spamming attempts), and I'm trying to block them so that I don't have to pay for that traffic.
Is there a way to block all POST requests on Bucket Policy (that would be the best choice probably)? Perhaps a way to block POST requests on CloudFlare?

CloudFlare has a firewall app you can use to whitelist or blacklist IPs, IP ranges or countries. There doesn't seem to be a way to set rules based on HTTP request type so see if you can blacklist the IP addresses instead.
Check if you have anything listed already on the Threat Control dashboard. In theory it should already challenge suspicious requests.

Related

How to use Akamai infront of S3 buckets?

I have a static website that is currently hosted in apache servers. I have an akamai server which routes requests to my site to those servers. I want to move my static websites to Amazon S3, to get away from having to host those static files in my servers.
I created a S3 bucket in amazon, gave it appropriate policies. I also set up my bucket for static website hosting. It told me that I can access the site at
http://my-site.s3-website-us-east-1.amazonaws.com
I modified my akamai properties to point to this url as my origin server. When I goto my website, I get Http 504 errors.
What am i missing here?
Thanks
K
S3 buckets don't support HTTPS?
Buckets support HTTPS, but not directly in conjunction with the static web site hosting feature.
See Website Endpoints in the S3 Developer Guide for discussion of the feature set differences between the REST endpoints and the web site hosting endpoints.
Note that if you try to directly connect to your web site hosting endpoint with your browser, you will get a timeout error.
The REST endpoint https://your-bucket.s3.amazonaws.com will work for providing HTTPS between bucket and CDN, as long as there are no dots in the name of your bucket
Or if you need the web site hosting features (index documents and redirects), you can place CloudFront between Akamai and S3, encrypting the traffic inside CloudFront as it left the AWS network on its way to Akamai (it would still be in the clear from S3 to CloudFront, but this is internal traffic on the AWS network). CloudFront automatically provides HTTPS support on the dddexample.cloudfront.net hostname it assigns to each distribution.
I admit, it sounds a bit silly, initially, to put CloudFront behind another CDN but it's really pretty sensible -- CloudFront was designed in part to augment the capabilities of S3. CloudFront also provides Lambda#Edge, which allows injection of logic at 4 trigger points in the request processing cycle (before and after the CloudFront cache, during the request and during the response) where you can modify request and response headers, generate dynamic responses, and make external network requests if needed to implement processing logic.
I faced this problem currently and as mentioned by Michael - sqlbot, putting the CloudFront between Akamai and S3 Bucket could be a workaround, but doing that you're using a CDN behind another CDN. I strongly recommend you to configure the redirects and also customize the response when origin error directly in Akamai (using REST API endpoint in your bucket). You'll need to create three rules, but first, go to CDN > Properties and select your property, Edit New Version based on the last one and click on Add Rule in Property Configuration Settings section. The first rule will be responsible for redirect empty paths to index.html, create it just like the image below:
builtin.AK_PATH is an Akamai's variable. The next step is responsible for redirect paths different from the static ones (html, ico, json, js, css, jpg, png, gif, etc) to \index.html:
The last step is responsible for customize an error response when origin throws an HTTP error code (just like the CloudFront Error Pages). When the origin returns 404 or 403 HTTP status code, the Akamai will call the Failover Hostname Edge Server (which is inside the Akamai network) with the /index.html path. This setup will be triggered when refreshing pages in the browser and when the application has redirection links (which opens new tabs for example). In the Property Hostnames section, add a new hostname that will work as the Failover Hostname Edge Server, the name should has less than 16 characters, then, add the -a.akamaihd.net suffix to it (that's the Akamai pattern). For example: failover-a.akamaihd.net:
Finally, create a new empty rule just like the image below (type the hostname that you just created in the Alternate Hostname in This Property section):
Since you are already using Akamai as a CDN, you could simply use their NetStorage product line to achieve this in a simplified manner.
All you would need to do is to move the content from s3 to Akamai and it would take care of the rest(hosting, distribution, scaling, security, redundancy).
The origin settings on Luna control panel could simply point to the Netstorage FTP location. This will also remove the network latency otherwise present when accessing the S3 bucket from the Akamai Network.

How to serve HLS streams from S3 in secure way (authorized & authenticated)

Problem:
I am storing number of HLS streams in S3 with given file structure:
Video1
├──hls3
├──hlsv3-master.m3u8
├──media-1
├──media-2
├──media-3
├──media-4
├──media-5
├──hls4
├──hlsv4-master.m3u8
├──media-1
├──media-2
├──media-3
├──media-4
├──media-5
In my user API I know which exactly user has access to which video content
but I also need to ensure that video links are not sharable and only accessible
by users with right permissions.
Solutions:
1) Use signed / temp S3 urls for private S3 content. Whenever client wants to play some specific video it is
sending request to my API. If user has right permissions the API is generating signed url
and returning it back to client which is passing it to player.
The problem I see here is that real video content is stored in dozen of segments files in media-* directories
and I do not really see how can I protect all of them - would I need to sign each of the segment file urls separately?
2) S3 content is private. Video stream requests made by players are going through my API or separate reverse-proxy.
So whenever client decides to play specific video, API / reverse-proxy is getting the request, doing authentication & authorization
and passing the right content (master play list files & segments).
In this case I still need to make S3 content private and accessible only by my API / reverse-proxy. What should be the recommended way here?
S3 rest authentication via tokens?
3) Use encryption with protected key. In this case all of video segments are encrypted and publicly available. The key is also stored in S3
but is not publicly available. Every key request made by player is authenticated & authorized by my API / reverse-proxy.
These are 3 solutions I have in my mind right now. Not convinced on all of them. I am looking for something simple and bullet proof secure. Any recommendations / suggestions?
Used technology:
ffmpeg for video encoding to different bitrates
bento4 for video segmentation
would I need to sign each of the segment file urls separately?
If the player is requesting directly from S3, then yes. So that's probably not going to be the ideal approach.
One option is CloudFront in front of the bucket. CloudFront can be configured with an Origin Access Identity, which allows it to sign requests and send them to S3 so that it can fetch private S3 objects on behalf of an authorized user, and CloudFront supports both signed URLs (using a different algorithm than S3, with two important differences that I will explain below) or with signed cookies. Signed requests and cookies in CloudFront work very similarly to each other, with the important difference being that a cookie can be set once, then automatically used by the browser for each subsequent request, avoiding the need to sign individual URLs. (Aha.)
For both signed URLs and signed cookies in CloudFront, you get two additional features not easily done with S3 if you use a custom policy:
The policy associated with a CloudFront signature can allow a wildcard in the path, so you could authorize access to any file in, say /media/Video1/* until the time the signature expires. S3 signed URLs do not support wildcards in any form -- an S3 URL can only be valid for a single object.
As long as the CloudFront distribution is configured for IPv4 only, you can tie a signature to a specific client IP address, allowing only access with that signature from a single IP address (CloudFront now supports IPv6 as an optional feature, but it isn't currently compatible with this option). This is a bit aggressive and probably not desirable with a mobile user base, which will switch source addresses as they switch from provider network to Wi-Fi and back.
Signed URLs must still all be generated for all of the content links, but you can generate and sign a URL only once and then reuse the signature, just string-rewriting the URL for each file making that option computationally less expensive... but still cumbersome. Signed cookies, on the other hand, should "just work" for any matching object.
Of course, adding CloudFront should also improve performance through caching and Internet path shortening, since the request hops onto the managed AWS network closer to the browser than it typically will for requests direct to S3. When using CloudFront, requests from the browser are sent to whichever of 60+ global "edge locations" is assumed to be nearest the browser making the request. CloudFront can serve the same cached object to different users with different URLs or cookies, as long as the sigs or cookies are valid, of course.
To use CloudFront signed cookies, at least part of your application -- the part that sets the cookie -- needs to be "behind" the same CloudFront distribution that points to the bucket. This is done by declaring your application as an additional Origin for the distribution, and creating a Cache Behavior for a specific path pattern which, when requested, is forwarded by CloudFront to your application, which can then respond with the appropriate Set-Cookie: headers.
I am not affiliated with AWS, so don't mistake the following as a "pitch" -- just anticipating your next question: CloudFront + S3 is priced such that the cost difference compared to using S3 alone is usually negligible -- S3 doesn't charge you for bandwidth when objects are requested through CloudFront, and CloudFront's bandwidth charges are in some cases slightly lower than the charge for using S3 directly. While this seems counterintuitive, it makes sense that AWS would structure pricing in such a way as to encourage distribution of requests across its network rather than to focus them all against a single S3 region.
Note that no mechanism, either the one above or the one below is completely immune to unauthorized "sharing," since the authentication information is necessarily available to the browser, and thus to the user, depending on the user's expertise... but both approaches seem more than sufficient to keep honest users honest, which is all you can ever hope to do. Since signatures on signed URLs and cookies have expiration times, the duration of the share-ability is limited, and you can identify such patterns through CloudFront log analysis, and react accordingly. No matter what approach you take, don't forget the importance of staying on top of your logs.
The reverse proxy is also a good idea, probably easily implemented, and should perform quite acceptably with no additional data transport charges or throughput issues, if the EC2 machines running the proxy are in the same AWS region as the bucket, and the proxy is based on solid, efficient code like that found in Nginx or HAProxy.
You don't need to sign anything in this environment, because you can configure the bucket to allow the reverse proxy to access the private objects because it has a fixed IP address.
In the bucket policy, you do this by granting "anonymous" users the s3:getObject privilege, only if their source IPv4 address matches the IP address of one of the proxies. The proxy requests objects anonymously (no signing needed) from S3 on behalf of authorized users. This requires that you not be using an S3 VPC endpoint, but instead give the proxy an Elastic IP address or put it behind a NAT Gateway or NAT instance and have S3 trust the source IP of the NAT device. If you do use an S3 VPC endpoint, it should be possible to allow S3 to trust the request simply because it traversed the S3 VPC Endpoint, though I have not tested this. (S3 VPC Endpoints are optional; if you didn't explicitly configure one, then you don't have one, and probably don't need one).
Your third option seems weakest, if I understand it correctly. An authorized but malicious user gets the key an can share it all day long.

Wildcard subdomains point to appropriate S3/CloudFront subdirectories

I need multiple subdomains to point to individual buckets/subdirectories on Amazon S3 (synched to CloudFront distribution), where I'm hosting some static files.
So that ANY
SUBDOMAINNAME.example.com
automatically points to
s3.amazonaws.com/somebucket/SUBDOMAINNAME
or
somedistributionname.cloudfront.net/SUBDOMAINNAME
Is there a way to accomplish this without running a server for redirection?
Can it be done without changing DNS records for each new subdomain or, if not, adding the DNS rules programmatically?
What is the most efficient way of doing it, in terms of resource usage. (There might be hundreds of subdomains with 100s of daily requests for each)
update: this answer was correct when written, and the techniques described below are still perfectly viable but potentially less desirable since Lambda#Edge can now be used to accomplish this objective, as I explained in my answer to Serving a multitude of static sites from a wildcard domain in AWS.
No, there is no way to do this automatically.
Is there a way to accomplish this without running a server for redirection?
Technically, it isn't redirection that you'd need, to accomplish this. You'd need path rewriting, and that's why the answer to your ultimate question is "no" -- because Route 53 (and DNS in general) can't do anything related to paths.
Route 53 does support wildcard DNS, but that's of limited help without CloudFront and/or S3 supporting a mechanism to put the host header from the HTTP request into the path (which they don't).
Now, this could easily be accomplished in a "zero-touch" mode with a single Route 53 * wildcard entry, a single CloudFront distribution configured for *.example.com, and one or more EC2 instances running HAProxy to do the request path rewriting and proxy the request onward to the S3 bucket. A single line in a basic configuration file would accomplish that request rewrite:
http-request set-path /%[req.hdr(host)]%[path]
Then you'd need the proxy to send the the actual bucket endpoint hostname to S3, instead of the hostname supplied by the browser:
http-request set-header Host example-bucket.s3.amazonaws.com
The proxy would send the modified request to S3, return S3's response to CloudFront, which would return the response to the browser.
However, if you don't want to take this approach, since a server would be required, then the alternative solution looks like this:
Configure a CloudFront distribution for each subdomain, setting the alternate domain name for the distribution to match the specific subdomain.
Configure the Origin for each subdomain's distribution to point to the same bucket, setting the origin path to /one-specific-subdomain.example.com. CloudFront will change a request for GET /images/funny-cat.jpg HTTP/1.1 to GET /one-specific-subdomain.example.com/images/funny-cat.jpg HTTP/1.1 before sending the request to S3, resulting in the behavior you described. (This is the same net result as the behavior I described for HAProxy, but it is static, not dynamic, hence one distribution per subdomain; in neither case would this be a "redirect" -- so the address bar would not change).
Configure an A-record Alias in Route 53 for each subdomain, pointing to the subdomain's specific CloudFront distribution.
This can all be done programmatically through the APIs, using any one of the the SDKs, or using aws-cli, which is a very simple way to test, prototype, and script such things without writing much code. CloudFront and Route 53 are both fully automation-friendly.
Note that there is no significant disadvantage to each site using its own CloudFront distribution, because your hit ratio will be no different, and distributions do not have a separate charge -- only request and bandwidth charges.
Note also that CloudFront has a default limit of 200 distributions per AWS account but this is a soft limit that can be increased by sending a request to AWS support.
Since Lambda#edge this can be done with a lambda function triggered by the Cloud Front "Viewer Request" event.
Here is an example of such a Lambda function where a request like foo.example.com/index.html will return the file /foo/index.html from your origin.
You will need a CF distribution with the CNAME *.example.com, and an A record "*.example.com" pointing to it
exports.handler = (event, context, callback) => {
const request = event.Records[0].cf.request;
const subdomain = getSubdomain(request);
if (subdomain) {
request.uri = '/' + subdomain + request.uri;
}
callback(null, request);
};
function getSubdomain(request) {
const hostItem = request.headers.host.find(item => item.key === 'Host');
const reg = /(?:(.*?)\.)[^.]*\.[^.]*$/;
const [_, subdomain] = hostItem.value.match(reg) || [];
return subdomain;
}
As for the costs take a look at lambda pricing. At current pricing is 0.913$ per million requests
A wildcard works on S3. I just put an A record * that points to an IP and it worked.

AWS CloudFront to host SSL and forward on to sendgrid not working

I am trying to have links in my emails from my application register as SSL/HTTPS secure links. This helps deliverability and other things email clients may do treating links as http vs https.
Our application is using SendGrid to send emails, which also supports click tracking on our links for us. In order to do this SendGrid, and most other email sender services replace the original link we put in, which was an https://blahblah.com link with their own link, http://clicktrack.sendgrid.net or something that is not https, but rather http.
SendGrid supports "white labeling" the click tracking link with something like
http://subdomain.blahblah.com and also https version if we set it up properly. SendGrids requirements for https/ssl link are shown here
https://sendgrid.com/docs/Classroom/Build/Add_Content/content_delivery_networks.html
Basically they are asking us to setup a CDN or other server that will host our SSL certificates, terminate the SSL, and then forward the request on to their servers. Once that is in place they can "turn on" ssl on their end for our email links.
I tried setting this up in AWS CloudFront with the origin as sendgrid.net and the distribution having our SSL certificate and a route 53 CNAME pointing to our distribution. So the subdomain.blahblah.com points to distribution CDN, CDN points to sendgrid, and all should work.
Testing this though it does NOT work. If I go to the http version of subdomain it does work, CDN forwards properly. AWS support has suggested it was an issue related to host headers and the CDN not being able to validate the origin when I had a 2nd CNAME for the origin on my subdomain2.blahblah.com. That led me to remove 2nd cname and direclty put sendgrid as origin, but that hasn't worked and they haven't provided a solution yet. I get error like this..
ERROR
The request could not be satisfied.
CloudFront wasn't able to connect to the origin.
Generated by cloudfront (CloudFront)
Request ID: pl1bS3OObC6mUd2vyyhM6bNFt3xyLsfzVIqNmiPkEO7mQgJyQCn_pA==
Any ideas welcome or a different way to do this?
The issue was in behaviors I was forwarding all headers. Should NOT forward "Host" header in this situation or the origin ssl call will break as it wont match expected. AWS support did finally figure this out and recommend to me :)

How can I hide a custom origin server from the public when using AWS CloudFront?

I am not sure if this exactly qualifies for StackOverflow, but since I need to do this programatically, and I figure lots of people on SO use CloudFront, I think it does... so here goes:
I want to hide public access to my custom origin server.
CloudFront pulls from the custom origin, however I cannot find documentation or any sort of example on preventing direct requests from users to my origin when proxied behind CloudFront unless my origin is S3... which isn't the case with a custom origin.
What technique can I use to identify/authenticate that a request is being proxied through CloudFront instead of being directly requested by the client?
The CloudFront documentation only covers this case when used with an S3 origin. The AWS forum post that lists CloudFront's IP addresses has a disclaimer that the list is not guaranteed to be current and should not be relied upon. See https://forums.aws.amazon.com/ann.jspa?annID=910
I assume that anyone using CloudFront has some sort of way to hide their custom origin from direct requests / crawlers. I would appreciate any sort of tip to get me started. Thanks.
I would suggest using something similar to facebook's robots.txt in order to prevent all crawlers from accessing all sensitive content in your website.
https://www.facebook.com/robots.txt (you may have to tweak it a bit)
After that, just point your app.. (eg. Rails) to be the custom origin server.
Now rewrite all the urls on your site to become absolute urls like :
https://d2d3cu3tt4cei5.cloudfront.net/hello.html
Basically all urls should point to your cloudfront distribution. Now if someone requests a file from https://d2d3cu3tt4cei5.cloudfront.net/hello.html and it does not have hello.html.. it can fetch it from your server (over an encrypted channel like https) and then serve it to the user.
so even if the user does a view source, they do not know your origin server... only know your cloudfront distribution.
more details on setting this up here:
http://blog.codeship.io/2012/05/18/Assets-Sprites-CDN.html
Create a custom CNAME that only CloudFront uses. On your own servers, block any request for static assets not coming from that CNAME.
For instance, if your site is http://abc.mydomain.net then set up a CNAME for http://xyz.mydomain.net that points to the exact same place and put that new domain in CloudFront as the origin pull server. Then, on requests, you can tell if it's from CloudFront or not and do whatever you want.
Downside is that this is security through obscurity. The client never sees the requests for http://xyzy.mydomain.net but that doesn't mean they won't have some way of figuring it out.
[I know this thread is old, but I'm answering it for people like me who see it months later.]
From what I've read and seen, CloudFront does not consistently identify itself in requests. But you can get around this problem by overriding robots.txt at the CloudFront distribution.
1) Create a new S3 bucket that only contains one file: robots.txt. That will be the robots.txt for your CloudFront domain.
2) Go to your distribution settings in the AWS Console and click Create Origin. Add the bucket.
3) Go to Behaviors and click Create Behavior:
Path Pattern: robots.txt
Origin: (your new bucket)
4) Set the robots.txt behavior at a higher precedence (lower number).
5) Go to invalidations and invalidate /robots.txt.
Now abc123.cloudfront.net/robots.txt will be served from the bucket and everything else will be served from your domain. You can choose to allow/disallow crawling at either level independently.
Another domain/subdomain will also work in place of a bucket, but why go to the trouble.