How to make Serverless return 404 instead of 403 for non-existing endpoints? - serverless-framework

I tried the Serverless framework following the instructions to create the Hello World application. Everything works well, calling the [url]/dev/hello-world returns 200 response with the json output as expected.
By default, it looks like the response for non-existing endpoints is a 403 HTTP status code with the JSON body {"message":"Missing Authentication Token"}.
I'd like to host a website using the framework.
Is there any way to make the Serverless return 404 instead of 403 for non-existing endpoints?

Returning a 403 instead of 404 is a deliberate design decision.
This is a pattern that is used in many other AWS APIs (most notably S3). In S3, if the user would have had permission to see the presence of the key (via the ListBucket permission), a 404 will be returned; otherwise a 403 will be returned. Because API Gateway enables permissions at the method level, we can't know whether or not the user should be permitted to know that the API resource exists, and we default to the 403 as a result.
You can elect to catch all missing API methods using a {proxy+} pattern.
events:
  - http:
      path: '{proxy+}' # catch any path not specified elsewhere (quoted so YAML reads it as a string)
      method: get # or change to any method if you prefer
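The function attached to that catch-all event then has to produce the 404 itself. A minimal sketch in Python, assuming the Lambda proxy integration (so the return value is a dict with statusCode, headers and body); the handler name not_found and the response body are hypothetical, not something from the framework:

```python
import json


def not_found(event, context):
    """Catch-all handler: answer any otherwise unmatched path with a JSON 404."""
    # With the Lambda proxy integration the requested path arrives in
    # event["path"]; fall back to an empty string if the key is absent.
    path = event.get("path", "")
    return {
        "statusCode": 404,
        "headers": {"Content-Type": "application/json"},
        "body": json.dumps({"message": "Not Found", "path": path}),
    }
```

Paths that match no event at all (for example a method the catch-all does not cover) would presumably still get API Gateway's own 403.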

I did something a bit different. It's not relevant to API calls, but it is relevant to the final goal of using Serverless to host a website: in the CloudFrontDistribution section I added this.
CustomErrorResponses:
  - ErrorCode: 403
    ResponseCode: 404
    ResponsePagePath: /404.html

Related

How to pass original URI, with arguments, to Traefik ErrorPage handler specified in `query`?

I'm trying to use nginx to serve a custom error page using the Error Page middleware so that 404 requests to a lambda service (which I don't control) can be handled with a custom error page. I want to be able to get the context of this original request on that error page, either in Nginx for further forwarding, or else as a header for further handling e.g. in PHP or whatnot so I can provide contextual links on the 404 page.
However, right now after the redirection to Nginx in Traefik's ErrorPage middleware it seems the request has lost all the headers and data from the original service query.
The relevant part of my dockerfile:
traefik.port=8080
traefik.protocol=http
traefik.docker.network=proxy
traefik.frontend.rule=PathPrefix:/myservice;ReplacePathRegex:^/myservice/(.*) /newprefix/$$1
traefik.frontend.errors.myservice.status=404
traefik.frontend.errors.myservice.service=nginx
traefik.frontend.errors.myservice.query=/myservice-{status}
Nginx receives the forwarded 404 request, but the request URI comes through as nothing more than the path /myservice-404 specified in query (or /, if I omit traefik.frontend.errors.myservice.query). After the ReplacePathRegex I have the path of the original request available in the HTTP_X_REPLACED_PATH header, but any query arguments are no longer accessible in any header, and nginx can't see anything else about the original URI. For example, if I requested mysite.com/myservice/some/subpath?with=parameters, the HTTP_X_REPLACED_PATH header will show /myservice/some/subpath but not include the parameters.
Is it possible in Traefik to pass another service the complete context about the original request?
What I'm really looking for is something like try_files, where I could say "if this traefik request fails, try this other path instead", but I'd settle for being able to access the original, full request arguments within the handling backend server. If there was a way to send Nginx a request with the full path and query received by Traefik, that would be ideal.
tl;dr:
I am routing a request to a specific service in Traefik
If that request 404s, I want to be able to pass that request to Nginx for further processing / a contextual error page
I want Nginx and/or the page which receives the ErrorPage redirect to be able to know about the request that 404'd in the service
Unfortunately this is not possible with Traefik. I tried to achieve something similar but I realized that the only information that we are able to pass to the error page is the HTTP code, that's it.
The only options available are mentioned in their docs: https://doc.traefik.io/traefik/middlewares/errorpages/

Redirecting all 403 forbidden request to 404 page in aem

I am trying to redirect all forbidden requests to a 404 'not found' page.
Url I am trying to access.
http://localhost:4503/content/mysite/home.html (it is working fine).
But when I try to access,
http://localhost:4503/content/mysite (it is forbidden here).
My site is developed in Adobe Experience Manager and I don't see any config or setting related to redirecting, so I have to do something on the web server, which is Apache here.
I am not very familiar with Apache or with creating rules in it.
Is there anything that would redirect any forbidden request to a 404 not found page?
There are different options that you can try.
If the intent is to display some friendly message instead of the default forbidden message, you can define your own 403 error handler in AEM.
Overlay the 403.jsp at /apps/sling/servlet/errorhandler/ and add your custom html for displaying a relevant error message. HTTP response status code would still be 403 in this case.
Examples can be found in this Adobe article: https://helpx.adobe.com/experience-manager/6-3/sites/developing/using/customizing-errorhandler-pages.html
If you do not want the 403 HTTP status code in the response, you can try to override the status code in the aforementioned 403.jsp. In the JSP code, if the response is not already committed, you can use the HttpServletResponse.setStatus API to set the 404 status code. If the response is already committed, this will not work, as described in the Sling documentation: https://sling.apache.org/documentation/the-sling-engine/errorhandling.html
You can override it in the webserver using mod_rewrite or PHP. This SO question shows the options to achieve this.
You can apply a simple rule in dispatcher using /filter section to specify the HTTP requests that Dispatcher accepts. All other requests are sent back to the web server with a 404 error code.
In your case, it could be something like.
/filter {
  /0001 { /glob "*" /type "deny" }
  /0002 { /type "allow" /method "POST" /url "/content/mysite/[.]*.html" }
}
This will first deny access to all files and then allow access to specific content, which in this case is the *.html pages under /content/mysite.

API Gateway Redirect 302

I've got a service I'm proxying with gateway. A GET request to / will return a 302 with a Location header. The problem is the value of the Location header which I'm referencing in "integration.response.header.Location" is /login.
What this ends up doing is breaking my proxy by removing the stageName from the AWS provided URL for the API.
Instead of "{AWS_URL}/local/login", the redirect is going to "{AWS_URL}/login" which causes a 403 Forbidden from API Gateway.
If I manually modify the header mapping expression to use 'local/login' all works fine, but, the above should work, no?
Is there some hackery to maybe concat values into a header mapping expression?
Any help is greatly appreciated!
Thanks!
Moved to AWS Forums as it may be more appropriate - https://forums.aws.amazon.com/thread.jspa?threadID=228457

Return code for wrong HTTP method in REST API?

Our API user can get the root document (collection list) by sending GET request to root API address. If he sends POST, we should return something. The same question applies for other resource paths, like e.g. sending PATCH on query path etc. Not all methods have meaning on some paths.
As I read the HTTP RFCs, we should return 405 Method Not Allowed and send back the Allow response header with the list of allowed methods.
I see that e.g. GitHub API returns 404: Not found in the case I explained above (sending POST to root).
What would be the proper response? 404 or 405? I see 405 more developer-friendly, so is there any reason not to use it?
The expected behavior in this case, as per the HTTP spec and by REST guidelines, would be to return 405 Method Not Allowed. The resource is there, since a GET works, so a 404 Not Found would be confusing.
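To make the distinction concrete, here is a minimal sketch in Python of how a dispatcher might choose between the two codes. The ROUTES table and the check_method function are hypothetical names invented for illustration; they do not come from any particular framework.

```python
# Hypothetical routing table: resource path -> methods it supports.
ROUTES = {
    "/": {"GET", "HEAD"},
    "/query": {"GET", "POST"},
}


def check_method(path, method):
    """Return (status, headers) for a request before dispatching it."""
    allowed = ROUTES.get(path)
    if allowed is None:
        # No resource at this path at all.
        return 404, {}
    if method.upper() not in allowed:
        # The resource exists but does not support this method:
        # answer 405 and list the supported methods in the Allow header.
        return 405, {"Allow": ", ".join(sorted(allowed))}
    return 200, {}
```

With this table, check_method("/", "POST") gives (405, {"Allow": "GET, HEAD"}), while check_method("/nope", "POST") gives (404, {}).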
I'm not familiar with the GitHub API but in some cases I see that for 403 Forbidden it also returns 404 Not Found:
Requests that require authentication will return 404 Not Found, instead of 403 Forbidden, in some places. This is to prevent the accidental leakage of private repositories to unauthorized users.
Maybe the behavior on the root address is part of a bigger mechanism that handles such cases generally, who knows. Maybe you could ask?

404 vs 403 when directory index is missing

This is mostly a philosophical question about the best way to interpret the HTTP spec. Should a directory with no directory index (e.g. index.html) return 404 or 403? (403 is the default in Apache.)
For example, suppose the following URLs exist and are accessible:
http://example.com/files/file_1/
http://example.com/files/file_2/
But there's nothing at:
http://example.com/files/
(Assume we're using 301s to force trailing slashes for all URLs.)
I think several things should be taken into account:
By default, Apache returns 403 in this scenario. That's significant to me. They've thought about this stuff, and they made the decision to use 403.
According to W3C, 403 means "The server understood the request, but is refusing to fulfill it." I take that to mean you should return 403 if the URL is meaningful but nonetheless forbidden.
403 might result in information disclosure if the client correctly guesses that the URL maps to a real directory on disk.
http://example.com/files/ isn't a resource, and the fact that it internally maps to a directory shouldn't be relevant to the status code.
If you interpret the URL scheme as defining a directory structure from the client's perspective, the internal implementation is still irrelevant, but perhaps the outward appearance should indeed have some bearing on the status codes. Maybe, even if you created the same URL structure without using directories internally, you should still use 403s, because it's about the client's perception of a directory structure.
In the balance, what do you think is the best approach? Should we just say "a resource is a resource, and if it doesn't exist, it's a 404?" Or should we say, "if it has slashes, it looks like a directory to the client, and therefore it's a 403 if there's no index?"
If you're in the 403 camp, do you think you should go out of your way to return 403s even if the internal implementation doesn't use directories? Suppose, for example, that you have a dynamic web app with this URL: http://example.com/users/joe, which maps to some code that generates the profile page for Joe. Assuming you don't write something that lists all users, should http://example.com/users/ return 403? (Many if not all web frameworks return 404 in this case.)
The first step to answering this is to refer to RFC 2616: HTTP/1.1. Specifically the sections talking about 403 Forbidden and 404 Not Found.
10.4.4 403 Forbidden
The server understood the request, but is refusing to fulfill it. Authorization will not help and the request SHOULD NOT be repeated. If the request method was not HEAD and the server wishes to make public why the request has not been fulfilled, it SHOULD describe the reason for the refusal in the entity. If the server does not wish to make this information available to the client, the status code 404 (Not Found) can be used instead.
10.4.5 404 Not Found
The server has not found anything matching the Request-URI. No indication is given of whether the condition is temporary or permanent. The 410 (Gone) status code SHOULD be used if the server knows, through some internally configurable mechanism, that an old resource is permanently unavailable and has no forwarding address. This status code is commonly used when the server does not wish to reveal exactly why the request has been refused, or when no other response is applicable.
My interpretation of this is that 404 is the more general error code that just says "there's nothing there". 403 says "there's nothing there, don't try again!".
One reason why Apache might return 403 on directories without explicit index files is that auto-indexing (i.e. listing all files in it) is disabled (a.k.a "forbidden"). In that case saying "listing all files in this directory is forbidden" makes more sense than saying "there is no directory".
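That reasoning can be sketched as a small decision function. This is an illustration in Python under stated assumptions, not Apache's actual implementation: the name status_for_path, the INDEX_FILES tuple, and the autoindex and hide_forbidden flags are all hypothetical, and path sanitising is left out for brevity.

```python
import os

INDEX_FILES = ("index.html",)  # assumed index file names


def status_for_path(docroot, url_path, autoindex=False, hide_forbidden=False):
    """Pick a status code for a GET on url_path under docroot."""
    target = os.path.join(docroot, url_path.lstrip("/"))
    if os.path.isfile(target):
        return 200
    if not os.path.isdir(target):
        return 404  # nothing matches the request URI
    if any(os.path.isfile(os.path.join(target, name)) for name in INDEX_FILES):
        return 200  # the index file would be served
    if autoindex:
        return 200  # a generated listing would be served
    # The directory exists, but listing it is refused. RFC 2616 allows
    # answering 404 instead when the server does not want to say why.
    return 404 if hide_forbidden else 403
```

The hide_forbidden flag corresponds to the last sentence of section 10.4.4 quoted above: a server that does not wish to reveal the refusal can answer 404 instead.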
Another argument for why 404 is preferable: Google Webmaster Tools.
For a 404, Google Webmaster Tools displays the referrer (allowing you to clean up the bad link to the directory), whereas for a 403 it doesn't.