Redirecting all 403 Forbidden requests to a 404 page in AEM / Apache

I am trying to redirect all forbidden requests to a 404 'Not Found' page.
This URL works fine:
http://localhost:4503/content/mysite/home.html
But this one returns 403 Forbidden:
http://localhost:4503/content/mysite
My site is built on Adobe Experience Manager, and I don't see any configuration or setting related to redirects, so I assume I have to handle this on the web server, which is Apache here.
I am not very familiar with Apache or with writing rules for it.
Is there a way to redirect any forbidden request to a 404 Not Found page?

There are several options you can try.
If the intent is to display a friendly message instead of the default forbidden message, you can define your own 403 error handler in AEM.
Overlay the 403.jsp at /apps/sling/servlet/errorhandler/ and add your custom HTML to display a relevant error message. The HTTP response status code would still be 403 in this case.
Examples can be found in Adobe's documentation: https://helpx.adobe.com/experience-manager/6-3/sites/developing/using/customizing-errorhandler-pages.html
If you do not want the 403 HTTP status code in the response, you can try to override the status code in the aforementioned 403.jsp. In the JSP code, if the response is not already committed, you can use the HttpServletResponse.setStatus API to set the 404 status code. If the response is already committed, this will not work, as described in the Sling documentation: https://sling.apache.org/documentation/the-sling-engine/errorhandling.html
You can also override it in the web server using mod_rewrite or PHP. This SO question shows the options to achieve this.

You can apply a simple rule in the Dispatcher, using the /filter section to specify the HTTP requests that the Dispatcher accepts. All other requests are sent back to the web server with a 404 error code.
In your case, it could be something like this:
/filter {
  # deny everything first
  /0001 { /glob "*" /type "deny" }
  # then allow page requests (GET) for .html pages under /content/mysite
  /0002 { /type "allow" /method "GET" /url "/content/mysite/[.]*.html" }
}
This first denies access to all files and then allows access to specific content, which is the *.html pages under /content/mysite in this case.

Related

Is it possible to obfuscate or eliminate the HTTP 403 status code?

A web developer reached out to me to ask whether I could prevent the 403 Forbidden status from showing on a Drupal site. Of course I thought they just wanted a redirect to a 404 page or to the home page, but that wasn't it. They wanted to know whether I could make the 403 status code something else or prevent it from being sent to the browser.
Example: When someone browses to mysite.com/contact, they are sent to mysite.com/homepage by default, because I changed how ErrorDocument handles 403 and 404 errors in Apache. However, if you open devtools in any browser you can see that a 403 error is returned.
The developer would like that indication of the error code to be removed or replaced by something else. I am pretty sure it isn't possible, but I have been wrong in the past, so I am asking. I have done some Googling and can't find anything that shows where the code is generated server-side so that I can manipulate it. Any help in finding out whether this is possible would be appreciated.
I agree with you, it is not possible to "override" 403 and 404 at the server-side level.
What is possible is to override the 403 and 404 error pages through a Twig template, but that is only a front-end option.

How to make Serverless return 404 instead of 403 for non-existing endpoints?

I tried the Serverless framework, following the instructions to create the Hello World application. Everything works well: calling [url]/dev/hello-world returns a 200 response with the JSON output as expected.
By default, it looks like the response for non-existing endpoints is a 403 HTTP status code with the JSON body {"message":"Missing Authentication Token"}.
I'd like to host a website using the framework.
Is there any way to make Serverless return 404 instead of 403 for non-existing endpoints?
Returning a 403 instead of 404 is a deliberate design decision.
This is a pattern that is used in many other AWS APIs (most notably S3). In S3, if the user has permission to see the presence of the key (via the ListBucket permission), a 404 is returned; otherwise a 403 is returned. Because API Gateway enables permissions at the method level, we can't know whether the user should be permitted to know about the existence of the API resource, so we default to the 403 as a result.
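The S3 rule described above boils down to a single decision. Here is a minimal Python sketch of that reasoning; the function name and the boolean flag are hypothetical, and this is an illustration of the rule, not AWS code:

```python
def status_for_missing_key(can_list_bucket: bool) -> int:
    """Status returned for a key that does not exist.

    can_list_bucket is a hypothetical flag standing in for the
    ListBucket permission mentioned above.
    """
    # Only callers allowed to see which keys exist get an honest 404;
    # everyone else gets 403, so a missing key leaks nothing.
    return 404 if can_list_bucket else 403
```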
You can elect to catch all missing API methods using a {proxy+} pattern:
events:
  - http:
      path: '{proxy+}' # catch any path not specified elsewhere (quoted so YAML reads it as a string)
      method: get # or change to any if you prefer
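The catch-all route still needs a function behind it that actually answers with 404. A minimal sketch in Python, assuming the Lambda proxy integration response shape (statusCode, headers, body); the handler name not_found and the response message are hypothetical:

```python
import json


def not_found(event, context):
    """Hypothetical handler wired to the {proxy+} catch-all route."""
    # With the proxy integration, the requested path arrives in the event.
    path = event.get("path", "")
    return {
        "statusCode": 404,
        "headers": {"Content-Type": "application/json"},
        "body": json.dumps({"message": "Not Found", "path": path}),
    }
```

Any path that no other function claims then falls through to this handler and gets a 404 instead of the default 403.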
I did something a bit different. It's not relevant to API calls, but it is relevant to the final goal of using Serverless to host a website. In the CloudFrontDistribution section I added this:
CustomErrorResponses:
  - ErrorCode: 403
    ResponseCode: 404
    ResponsePagePath: /404.html

GCM URL and 302 redirect response

My Question is regarding GCM URL and 302 redirect response.
When I do curl -v url = https://gcm-http.googleapis.com/gcm/send, I get a 302 response with a new URL populated in location header. My question is, why can't I use the new URL received in 302 redirect always? What is the reason for Google responding with 302 redirect? I would really appreciate detailed explanation.
Many Thanks,
Sushil
Based on this article, a 302 response means "Resource temporarily located elsewhere according to the Location header." This seems to be a previously reported issue with GCM:
https://groups.google.com/forum/#!topic/android-gcm/WwEg6buc-K0
IO Exception while accessing Google Cloud message?
The suggested action is to re-run the request against the provided (temporary, alternate) URL.

Return code for wrong HTTP method in REST API?

A user of our API can get the root document (a collection list) by sending a GET request to the root API address. If they send a POST instead, we should return something. The same question applies to other resource paths, e.g. sending PATCH to a query path. Not all methods are meaningful on all paths.
As I read the HTTP RFCs, we should return 405 Method Not Allowed and send back the Allow response header with the list of allowed methods.
I see that the GitHub API, for example, returns 404 Not Found in the case I described above (sending POST to the root).
What would be the proper response, 404 or 405? I find 405 more developer-friendly, so is there any reason not to use it?
The expected behavior in this case, as per the HTTP spec and by REST guidelines, would be to return 405 Method Not Allowed. The resource is there, since a GET works, so a 404 Not Found would be confusing.
I'm not familiar with the GitHub API, but I see that in some cases it also returns 404 Not Found where 403 Forbidden would apply:
Requests that require authentication will return 404 Not Found, instead of 403 Forbidden, in some places. This is to prevent the accidental leakage of private repositories to unauthorized users.
Maybe the behavior on the root address is part of a bigger mechanism that handles such cases generally, who knows. Maybe you could ask?
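To make the 404 vs. 405 distinction concrete, here is a minimal Python sketch. The routing table and the respond function are hypothetical and not taken from any framework; the only part that comes from the HTTP spec is that a 405 response carries an Allow header listing the supported methods:

```python
# Hypothetical routing table: path -> methods that are meaningful there.
ROUTES = {
    "/": {"GET", "HEAD"},
    "/query": {"GET", "POST"},
}


def respond(method, path):
    """Return (status, headers) for a request."""
    allowed = ROUTES.get(path)
    if allowed is None:
        # The resource itself does not exist.
        return 404, {}
    if method.upper() not in allowed:
        # The resource exists but does not support this method;
        # tell the client which methods it does support.
        return 405, {"Allow": ", ".join(sorted(allowed))}
    return 200, {}
```

With this table, respond("POST", "/") yields 405 with Allow: GET, HEAD, while respond("GET", "/missing") yields 404.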

How can I get the POST body when using Apache mod_auth_form?

I'm trying to use mod_auth_form using the mode described in the documentation as "Inline Login with Body Preservation". In the documentation they mention using mod_include or a CGI as the ErrorDocument in order to generate the login form, e.g.:
ErrorDocument 401 /cgi-bin/login.cgi
The scenario is if a user wants to POST from either a non-authenticated page, or from an authenticated page with a timed-out session.
The POST hits the target url, is intercepted by mod_auth_form, which invokes the ErrorDocument 401, the user enters credentials. On the login form page, a "special" hidden form variable httpd_body can be added (and httpd_method) which will be processed by the authentication handler to create a POST body to the original target.
The problem is that login.cgi doesn't get the POST data, since (apparently) Apache doesn't pass the POST data to an ErrorDocument. The alternative to ErrorDocument is the AuthFormLoginRequiredLocation directive; however, this does a plain 302 redirect, and of course the POST data is lost.
It seems the feature for httpd_body is impossible to use, as it is impossible to capture the original POST data. Even in the case of a GET, one would have to parse the referrer to get the GET variables.
Is there a way in Apache to read the POST data and store it somewhere before the authentication hook is run? Or some other solution I've missed?