HTTPS redirect fails with .htaccess rewrite for certain URL length - apache

I have an .htaccess file for showing a default image if the requested URL does not exist. I simplified it to this:
RewriteEngine on
RewriteCond %{REQUEST_FILENAME} !-f
RewriteRule . default.png [L]
Using HTTPS, this suddenly stopped working if the URL exceeds a certain length (connection closed).
HTTP always works.
It used to work like this for years and it still does on other servers.
It also seems that the kind of characters matters:
not working:
https://server.abc/images/01234567890123456789012345678901234567890123456789abc.png
https://server.abc/images/012345678901234567890123456789012345678901234567890123456789.png
working:
https://server.abc/images/01234567890123456789012345678901234567890123456789.png
https://server.abc/images/01234567890123456789012345678901234567890123456789123.png
https://server.abc/images/0123456789012345678901234567890123456789012345678912345.png
The redirect works if the condition is removed (second line), so it seems like it has something to do with REQUEST_FILENAME, HTTPS and the byte size (encoding?) of the filename/URL string.
This occurs with Apache/2.4.46 and macOS/10.15.7. It might have started after one of the latest security updates.
Any idea where this is coming from or what kind of configuration could cause this?
Thanks for your help!

RewriteCond %{REQUEST_FILENAME} !-f
RewriteRule . default.png [L]
It's not clear why this would "fail" for only certain requests over HTTPS only. A "security" update (particularly if it involves mod_security) is a likely cause - although an unusual one.
However, you shouldn't really be doing it this way to begin with. Any request for a non-existent URL is served /default.png with a "200 OK" response, so these URLs risk being indexed by search engines and abused by a malicious user.
What you are doing here is essentially setting a custom 404 response to an image, which you could do with the following instead and which will also return the "correct" 404 status.
ErrorDocument 404 /default.png
Now, any request that does not map to a file (or directory) will be served the image /default.png but with a 404 "Not Found" HTTP response code, so search engines/bots get the "correct" response.
This also naturally gets around the REQUEST_FILENAME issue, assuming these "not working" URLs do ultimately result in a 404 and not some other response (due to the "security" update).

Related

POST information getting lost in .htaccess redirect

So, I have a fully working CRUD. The problem is that, because of my file structure, my URLs looked something like https://localhost/myapp/resources/views/add-product.php, which was too ugly. After some research and another post here, I was able to use a .htaccess file to make the links look like https://localhost/myapp/add-product (removing the .php extension and the directories), and I'm also using it to enforce HTTPS. Most of the views work fine, but my Mass Delete view uses POST data from a form on my index. After restructuring the code so that the redirect works, the Mass Delete view receives an empty array. If I remove the redirect and use the "ugly" URLs, it works fine. Here's what my .htaccess file looks like:
Options +FollowSymLinks +MultiViews
RewriteEngine On
RewriteBase /myapp/
RewriteRule ^resources/views/(.+)\.php$ $1 [L,NC,R=301]
RewriteCond %{DOCUMENT_ROOT}/myapp/resources/views/$1.php -f
RewriteRule ^(.+?)/?$ resources/views/$1.php [END]
RewriteCond %{HTTPS} off
RewriteRule .* https://%{HTTP_HOST}%{REQUEST_URI} [L,R=301]
I didn't actually write any of it; it's a mix of answered questions and my own research. I did try changing the L flag to a P, as suggested in this post: Is it possible to redirect post data?, but that gave me the following error:
Internal Server Error
The server encountered an internal error or misconfiguration and was unable to complete your request.
Please contact the server administrator at admin@example.com to inform them of the time this error occurred, and the actions you performed just before this error.
More information about this error may be available in the server error log.
Apache/2.4.52 (Win64) OpenSSL/1.1.1m PHP/8.1.2 Server at localhost Port 443
POST information getting lost in .htaccess redirect
You shouldn't be redirecting the form submission in the first place. Ideally, you should be linking directly to the "pretty" URL in your form action. If you are unable to change the form action in the HTML then include an exception in your .htaccess redirect to exclude this particular URL from being redirected.
Redirecting the form submission is not really helping anyone here. Users and search engines can still see the "ugly" URL (it's in the HTML source) and you are doubling the form submission that hits your server (and doubling the user's bandwidth).
"Redirects" like this are only for when search engines have already indexed the "ugly" URL and/or it is linked to by external third parties that you have no control over. This is in order to preserve SEO, just like when you change any URL structure. All internal "ugly" URLs should have already been converted to the "pretty" version. The "ugly" URLs are then never exposed to users or search engines.
So, using a 307 (temporary) or 308 (permanent) status code to get the browser to preserve the request method across the redirect should not be necessary in the first place. For redirects like this it is common to see an exception for POST requests (because the form submission shouldn't be redirected). Or only target GET requests. For example:
RewriteCond %{REQUEST_METHOD} GET
:
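Applied to the extension-removing redirect from the question, the condition might look like the following sketch. It assumes the rest of the file stays as posted; only the RewriteCond line is new.

```apache
# Only redirect "ugly" URLs that are requested with GET, so a form that
# POSTs to /myapp/resources/views/<something>.php is never redirected
# and therefore keeps its POST body.
RewriteCond %{REQUEST_METHOD} GET
RewriteRule ^resources/views/(.+)\.php$ $1 [L,NC,R=301]
```

A POST to the "ugly" URL then falls through to the file itself, because the later rewrite only applies when resources/views/$1.php exists for the requested path.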
Changing this redirect to a 307 or 308 is a workaround, not a solution. And if this redirect is for SEO (which is the only reason it should exist) then it should be a 308 (permanent), not a 307 (temporary).
Aside:
RewriteCond %{HTTPS} off
RewriteRule .* https://%{HTTP_HOST}%{REQUEST_URI} [L,R=301]
Your HTTP to HTTPS redirect is in the wrong place. This needs to go as the first rule, or make sure you are redirecting to HTTPS in the current first rule and include this as the second rule, before the rewrite (to ensure you never get a double redirect).
With this rule placed last, HTTP requests to /resources/views/<something>.php (or /<something>) are never upgraded to HTTPS, because an earlier rule has already matched and stopped processing.
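One possible ordering, sketched from the rules in the question. The rules themselves are unchanged; only the HTTPS redirect has moved to the top, and the comments are added for illustration.

```apache
Options +FollowSymLinks +MultiViews
RewriteEngine On
RewriteBase /myapp/

# 1. Upgrade every plain-HTTP request first.
RewriteCond %{HTTPS} off
RewriteRule .* https://%{HTTP_HOST}%{REQUEST_URI} [L,R=301]

# 2. Externally redirect "ugly" URLs to the "pretty" form.
RewriteRule ^resources/views/(.+)\.php$ $1 [L,NC,R=301]

# 3. Internally rewrite "pretty" URLs back to the real file.
RewriteCond %{DOCUMENT_ROOT}/myapp/resources/views/$1.php -f
RewriteRule ^(.+?)/?$ resources/views/$1.php [END]
```

With this ordering an HTTP request for an "ugly" URL is redirected twice (first to HTTPS, then to the pretty URL). The alternative described above, redirecting straight to https:// in the "ugly" URL rule and keeping the HTTPS rule second, avoids the double redirect.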

How to redirect 404 errors (and 403) to index.html with a 200 response

I am building a static website that uses JS to parse a URL in order to work out what to display.
I need every URL to actually open index.html where the JS can pull apart the path and act accordingly.
For example http://my.site/action/params will be parsed as an action with some parameters params.
Background, this will be served from AWS S3 via CloudFront using custom error redirection - and this works fine on AWS.
I am, however, trying to build a dev environment under Ubuntu running apache and want to emulate the redirection locally.
I have found a couple of pages that come close, but not quite.
This page shows how to do the redirect to a custom error page on the server housed in a file called "404". As 404 is the actual error response code, the example looks a bit confusing and I am having trouble modifying the example to point to index.html.
The example in the accepted answer suggests:
Redirect 200 /404
ErrorDocument 404 /404
which I have modified to:
Redirect 200 /index.html
ErrorDocument 404 /index.html
However this returns a standard 404 Not Found error page.
If I remove the Redirect line, leaving just the ErrorDocument line, I get the index.html page returned as required, but the HTTP status of the response is still 404 where I need it to be 200.
If I leave the Redirect line as per the example I actually get the same result as my modified version, so I suspect this is the line that is incorrect, but I can't figure it out.
(I'm using the Chrome Dev Tools console to see the status codes etc).
I think I have found a solution using Rewrite rules instead of error docs.
RewriteEngine On
RewriteCond %{REQUEST_FILENAME} !-f
RewriteCond %{REQUEST_FILENAME} !-d
RewriteRule . /index.html [L]
The key I was missing in this approach seems to be not including an R=??? status response code at the end of the rewrite rule. It took me a while to find that!
As it uses mod_rewrite rather than defining error pages I assume that the mechanism is different to how CloudFront does it, but for my dev system needs it seems that the result is the same - which means I can work on the site without having to invalidate the CloudFront cache after every code change and upload.
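The effect of leaving out the R flag can be sketched as follows. The commented-out rule is shown only for comparison and is an illustration, not something from the original setup.

```apache
RewriteEngine On
RewriteCond %{REQUEST_FILENAME} !-f
RewriteCond %{REQUEST_FILENAME} !-d

# Internal rewrite (no R flag): Apache serves /index.html in place of the
# requested path. The URL in the browser stays the same and the status is
# 200, so the JS can still read the original path.
RewriteRule . /index.html [L]

# For comparison only (do not enable together with the rule above):
# adding R turns this into an external redirect. The browser receives a
# 302 and then requests /index.html itself, so the original path is lost.
# RewriteRule . /index.html [R=302,L]
```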

Apache rewrite XSS protection issues

We are developing a website for a trucking company and it has recently been subjected to penetration testing. One of the attacks was injecting an XSS script into the request URL:
ourcompanyhostname.com/abc/authorize<script>alert('xss');</script>
Since our web server is Apache, we fixed the issue by adding the following to the httpd.conf file. Basically, rather than reflecting the script in the 404 error response, a generic 400 response is returned instead.
RewriteRule ^/abc/authorize/.*[^A-Za-z0-9./\-_]+ "-" [L,R=400]
The issue is that when the attack was changed to the one below, it was no longer caught:
ourcompanyhostname.com/abc/authorize%3c%3cSCRIPT%3ealert(%22XSS%22)%3b%2f%2f%3c%3c%2fSCRIPT%3e
The response was still 404 instead of 400.
Is there another way to achieve what we want? We have already tried the rules below, but they still don't work. We just want the server to return an HTTP 400 error when an XSS attack is attempted.
RewriteCond %{REQUEST_URI} ^.*(\*|;|<|>|\)|%0A|%0D|%3C|%3E|%00).* [NC]
RewriteCond %{REQUEST_URI} abc
RewriteRule ^(.*)$ "-" [L,R=400]
I don't think the encoding matters; mod_rewrite sees the URL path after decoding.
I think you may have missed that your original rule requires a slash after "authorize", and the new malicious request doesn't have one.
Your final rule works fine for me. If you get an unexpected result for a particular URL, you have to study the RewriteLog (or LogLevel trace8) output.
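Both points can be sketched together for server or virtual-host context in httpd.conf. This assumes Apache 2.4, where RewriteLog was replaced by per-module LogLevel settings. The relaxed pattern is an illustration only, not a tested fix for the second URL; whether it fires for that request is exactly what the trace output should show.

```apache
# Apache 2.4: write detailed mod_rewrite tracing to the error log.
# (Apache 2.2 used RewriteLog and RewriteLogLevel instead.)
LogLevel warn rewrite:trace8

RewriteEngine On

# Same whitelist as the original rule, but without requiring a "/" after
# "authorize", so a path such as /abc/authorize<script>... also matches.
RewriteRule ^/abc/authorize.*[^A-Za-z0-9./\-_] - [L,R=400]
```

Trace logging at this level is very verbose, so it is usually switched off again once the failing request has been examined.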
If the 404 is generated by Apache, just use a custom ErrorDocument for 404.
If it is generated by your EE server, do the same in your web.xml.

mod_rewrite force internal redirect

My goal is to reduce the visibility of my app's signature. This is not security by obscurity, just a superficial bit of defence in depth, so that at first glance an attacker cannot tell if it is a static site or not. (Also cosmetic; it just feels "cleaner" to hide app details even if they would never become visible in normal operation). Therefore I want to deny access to some directories without revealing that they exist, so I must give the exact same 404 response my app would give if the user requested a non-existent page.
In an .htaccess file, I have the following:
RewriteEngine on
RewriteCond "%{REQUEST_FILENAME}" "!-f"
RewriteCond "%{REQUEST_FILENAME}" "!-d"
RewriteRule "^(.*)" "index.php?page=$1"
RewriteRule "^(secret_dir1|secret_dir2)(/.*)?$" "index.php?page=404"
where index.php renders a nice pretty webpage according to the value of the "page" GET parameter; if "page" does not correspond to a page at the app level, or "page" is set to 404, the script renders a pretty 404 page with proper headers and everything.
Here's where the problem happens. "App-level" 404s work as expected; a 404 page is rendered. However, if the user requests mydomain.com/dir_i_am_trying_to_hide, they are given a 301 redirect to mydomain.com/dir_i_am_trying_to_hide/?page=404: an external redirect instead of an internal rewrite.
Why is it sending out an external redirect instead of just rewriting the url? How am I supposed to avoid this properly? Barring that, is there a way to force the server to do an internal rewrite instead? (The Apache docs seem to indicate you can force a RewriteRule to be external, but not the other way around)
It turns out my rewrite rule was not causing the external redirect; Apache's DirectorySlash was. When I requested hostname/secret_dir1, it sent a redirect to hostname/secret_dir1/.
I'm not sure why the query string was changed, but adding DirectorySlash Off fixed it.
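A minimal sketch of the resulting .htaccess, assuming the rules from the question are otherwise unchanged and that the server allows DirectorySlash to be overridden in .htaccess:

```apache
# Stop mod_dir from issuing a 301 to add the trailing slash when a request
# names an existing directory without one (e.g. /secret_dir1).
DirectorySlash Off

RewriteEngine on
RewriteCond "%{REQUEST_FILENAME}" "!-f"
RewriteCond "%{REQUEST_FILENAME}" "!-d"
RewriteRule "^(.*)" "index.php?page=$1"
RewriteRule "^(secret_dir1|secret_dir2)(/.*)?$" "index.php?page=404"
```

The Apache documentation warns that turning off the trailing-slash redirect can expose directory listings in some configurations, so it is worth checking that other directories still behave as intended.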

Frontloading mod_rewrite rule is causing index.php to load twice

I've been working on a project that uses a frontloader to handle all requests (Routing domain.com/args/go/here to Index.php?req=args/go/here), and it's worked very well... Or I should say, I thought it did - I recently added a new logger, and to test it I placed a test log message in index.php. This message was being written to my log file twice, every time I reloaded the page, and after much debugging I found the cause to be my .htaccess file - for whatever reason, it loads index.php twice for every request.
Here's my .htaccess:
RewriteEngine On
# I added this after I discovered the bug
RewriteBase /site/beta/
RewriteCond %{REQUEST_FILENAME} !-f
RewriteCond %{REQUEST_FILENAME} !-d
# This too. Doesn't work
RewriteCond %{REQUEST_URI} !^index\.php$
RewriteRule ^(.*)$ index.php?args=$1 [L]
I've also tried:
FallbackResource /site/beta/index.php
This doesn't work either (index.php just doesn't load if you try to go to, say, 127.0.0.1/site/beta/admin/controls/, but it does if you go directly to /index.php), and it still loads twice.
Is anyone able to help me? I spent a few hours in IRC, and no one could come up with a solution that worked. (The two above are the only ones suggested)
Are you completely sure it's a mod_rewrite bug? If you enable the RewriteLog file with a high RewriteLogLevel (9), do you see the same request handled twice?
For me every time I see the 'same request done 2 times' I think about another strange web bug: The empty IMG src bug.
If you have somewhere in your HTML an
<IMG SRC="">
or in one of the css (harder to find) a:
url()
Then you've got it. The HTTP protocol dictates that an empty GET URL (an image or url() in CSS is an implicit GET request) MUST be a call to the same URL as the one that rendered the original page (and it can be a POST as well if you got your page via a POST request).
There are very few reasons for mod_rewrite to respond twice to a single request. Check with Firebug or LiveHTTP Requests that the browser is not in fact sending the index.php request twice. Or test your server with a telnet-mode HTTP request, by hand, as this will certainly send only one request.
This can also be the browser trying to load favicon.ico (invisibly, it seems, unless you check the Apache access logs). I had the same issue until I put one in my site's root directory. I know the issue was resolved for the original asker; I'm putting this here for people like myself looking for an answer to the same question.