Apache RewriteCond %{REQUEST_FILENAME} !-f failing when file exists - apache

I'm pretty sure the issue is that %{REQUEST_FILENAME} does not reference the URI as it has been changed, but instead as it was requested.
I have some code like this:
<IfModule mod_rewrite.c>
RewriteEngine On
RewriteCond %{REQUEST_FILENAME} !-f
RewriteCond %{REQUEST_FILENAME} !-d
RewriteRule app/?(.*)$ /some-site/map-app/$1 [NC,QSA]
</IfModule>
<IfModule mod_rewrite.c>
RewriteEngine On
RewriteCond %{REQUEST_FILENAME} !-f
RewriteCond %{REQUEST_FILENAME} !-d
RewriteRule some-site/map-app/?(.*)$ /some-site/map-app/index.html [NC,L,QSA]
</IfModule>
The effect is supposed to be that:
1. /app goes to -> /some-site/map-app when searching for files
2. if that fails (as it will often because it's an SPA), it goes to /some-site/map-app/index.html
For some reason it's rewriting every path to the index.html fallback. This means that #1 is occurring enough to meet the RewriteRule condition, but for some reason the RewriteCond are not working.
If I remove the logic for #2, the files resolve fine so the issue is not that the paths it produces are bad.
I've read the docs on "RewriteCond Specials" (https://httpd.apache.org/docs/2.4/mod/mod_rewrite.html)
Why are these two not failing for paths that exist -- and have been produced by the first logic block -- within the second logic block?
RewriteCond %{REQUEST_FILENAME} !-f
RewriteCond %{REQUEST_FILENAME} !-d
Some examples and the desired outcome:
/app/ -> /some-site/map-app/index.html
/app/map/id-3 -> /some-site/map-app/index.html
/app/app.js -> /some-site/map-app/app.js
/app/assets/img/1.png -> /some-site/assets/img/1.png (RewriteEngine logic not included in post, but example included in case it changes potential answers)

Immediately after the first rewrite, the REQUEST_FILENAME server variable contains the rewritten URL-path only (eg. /some-site/map-app/foo), not the absolute filesystem path that the rewritten URL would map to. So, attempting to do a filesystem check on REQUEST_FILENAME at this stage will always fail.
The request needs to be remapped back to the filesystem for the REQUEST_FILENAME variable to be updated to the absolute filesystem path. This only occurs at the start of (and before each pass through) the rewriting engine.
You can force the rewriting engine to start over by simply including the L (last) flag on the first rule. This ends the current round of processing and passes the rewritten URL back into the rewrite engine, at which point the rewritten URL is remapped to the filesystem and REQUEST_FILENAME is updated.
Alternatively, don't use REQUEST_FILENAME in the second rule and instead manually construct the absolute filename from the DOCUMENT_ROOT and rewritten URL-path.
For example:
# If the previously rewritten URL does not map to a file (or directory)...
RewriteCond %{DOCUMENT_ROOT}/$0 !-f
RewriteCond %{DOCUMENT_ROOT}/$0 !-d
RewriteRule ^some-site/map-app(?:$|/(.*)) /some-site/map-app/index.html [NC,L]
Where $0 is a backreference to the entire URL-path that the RewriteRule pattern would match.
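To make the file check concrete, here is a rough Python sketch (not Apache code) of what the two conditions test. The helper name spa_target and the temporary document root are assumptions for illustration only; the sketch mimics just the %{DOCUMENT_ROOT}/$0 file/directory test and none of mod_rewrite's other processing.

```python
import os
import tempfile

def spa_target(doc_root, url_path):
    """url_path plays the role of $0: the already-rewritten URL-path, no leading slash."""
    candidate = os.path.join(doc_root, url_path)   # %{DOCUMENT_ROOT}/$0
    if os.path.isfile(candidate) or os.path.isdir(candidate):
        return "/" + url_path                      # a condition fails, so no rewrite
    return "/some-site/map-app/index.html"         # SPA fallback

def demo():
    # Hypothetical document root containing only index.html and app.js
    with tempfile.TemporaryDirectory() as root:
        app_dir = os.path.join(root, "some-site", "map-app")
        os.makedirs(app_dir)
        for name in ("index.html", "app.js"):
            open(os.path.join(app_dir, name), "w").close()
        return (
            spa_target(root, "some-site/map-app/app.js"),
            spa_target(root, "some-site/map-app/map/id-3"),
        )
```

In this model the existing app.js is left alone, while the client-side route map/id-3 falls through to index.html.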
There is an issue with your existing regex (RewriteRule pattern). The regex app/?(.*)$ matches appanything since the slash is optional, which I'm sure is not the intention. Presumably you want to match app or app/ or app/<something>? This should also be anchored to the start of the URL-path, otherwise, it will also match /some-site/map-app/<something> (which is intended for the second rule). The same applies to the second rule (updated above).
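If you want to see the difference between the two patterns, a quick check with Python's re module is enough (Python's regex syntax is close enough to PCRE for these patterns; re.IGNORECASE stands in for the NC flag). The sample paths are the ones discussed above.

```python
import re

OLD = re.compile(r"app/?(.*)$", re.IGNORECASE)        # original, unanchored pattern
NEW = re.compile(r"^app(?:$|/(.*))", re.IGNORECASE)   # anchored replacement

def matches(pattern, url_path):
    # In .htaccess the RewriteRule pattern is tested against the URL-path
    # without its leading slash, and may match anywhere unless anchored.
    return pattern.search(url_path) is not None

paths = ["app", "app/", "app/map/id-3", "appanything", "some-site/map-app/foo"]
results = {p: (matches(OLD, p), matches(NEW, p)) for p in paths}
```

Each entry in results is an (old, new) pair: both patterns accept app, app/ and app/map/id-3, but only the old one also accepts appanything and some-site/map-app/foo.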
So, try the following instead (if not using the L flag on the first rule):
RewriteEngine On
RewriteCond %{REQUEST_FILENAME} !-f
RewriteCond %{REQUEST_FILENAME} !-d
RewriteRule ^app(?:$|/(.*)) /some-site/map-app/$1 [NC]
# If the previously rewritten URL does not map to a file (or directory)...
RewriteCond %{DOCUMENT_ROOT}/$0 !-f
RewriteCond %{DOCUMENT_ROOT}/$0 !-d
RewriteRule ^some-site/map-app(?:$|/(.*)) /some-site/map-app/index.html [NC,L]
No need for the <IfModule> wrapper or repeating the RewriteEngine directive. The QSA (Query String Append) flags were also superfluous, since this is the default action. I would also be wary of using the NC flag (on an internal rewrite) since this permits both /app and /APP to map to the same URL (minor duplicate content issue).
I also wonder whether the directory checks are really necessary?

Related

Remove trailing slash from next js static site [duplicate]

Found a problem with my site on NextJS. During development, I navigated the site using buttons and manually changing the browser address bar. It happened that I accidentally added a slash to the end, but my localhost server removed it and everything worked fine.
But everything changed when I uploaded my static application to the hosting. It automatically began to add these slashes when reloading the page. Because of this, my pictures on the site break.
As far as I understand, you need to correctly configure the .htaccess file.
Here is what it looks like now:
RewriteEngine On
RewriteRule ^([^/]+)/$ $1.html
RewriteCond %{REQUEST_FILENAME} !-f
RewriteRule ^([^/]+)/$ $1.html
RewriteRule ^([^/]+)/([^/]+)/$ /$1/$2.html
RewriteCond %{REQUEST_FILENAME} !-f
RewriteCond %{REQUEST_FILENAME} !-d
RewriteCond %{REQUEST_URI} !(\.[a-zA-Z0-9]{1,5}|/)$
RewriteRule (.*)$ /$1/ [R=301,L]
RewriteRule ^([^/]+)/$ $1.html
RewriteCond %{REQUEST_FILENAME} !-f
RewriteRule ^([^/]+)/$ $1.html
RewriteRule ^([^/]+)/([^/]+)/$ /$1/$2.html
RewriteCond %{REQUEST_FILENAME} !-f
RewriteCond %{REQUEST_FILENAME} !-d
RewriteCond %{REQUEST_URI} !(\.[a-zA-Z0-9]{1,5}|/)$
RewriteRule (.*)$ /$1/ [R=301,L]
Your existing rules are all expecting (or forcing) a trailing slash on all your URLs. So, if the canonical URL (and the URL you are linking to) does not include a trailing slash then all these rules essentially need to be reversed. However, there are other issues here (the first rule, for instance, is unconditionally rewriting the request to append the .html extension, which is repeated in the next rule with a condition.)
Try the following instead:
RewriteEngine On
# (OPTIONAL) Remove trailing slash if it happens to be on the request
# Exclude physical directories (which must end in a slash)
RewriteCond %{REQUEST_FILENAME} !-d
RewriteRule (.+)/$ /$1 [R=301,L]
# Rewrite request to corresponding ".html" file if it exists
RewriteCond %{DOCUMENT_ROOT}/$1.html -f
RewriteRule ^([^.]+)$ $1.html [L]
Your original directives only handled URLs with one or two path depth (eg. /foo/ or /foo/bar/). The second rule above handles any path depth (if so required). eg. /foo, /foo/bar, /foo/bar/baz etc. (no trailing slash).
As an optimisation I've assumed your URLs that require rewriting do not contain dots (that are otherwise used to delimit the file extension).
Note that the RewriteRule pattern (first argument) matches against the URL-path only (not the query string). If there is any query string on the initial request then this is simply passed through by default. (With regards to the rewrite and client-side JS, the query string is available on the initial request and should be parsed as before.)
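As a sanity check of the logic (not of Apache itself), the two rules can be modelled in a few lines of Python. The function name resolve, the sample file names and the temporary document root are all hypothetical; the sketch only mirrors the directory test, the two regexes and the ".html exists" test.

```python
import os
import re
import tempfile

def resolve(doc_root, url_path):
    """url_path has no leading slash, as a RewriteRule in a root .htaccess sees it."""
    full = os.path.join(doc_root, url_path)
    # Rule 1: external 301 that strips a trailing slash, unless it is a real directory
    m = re.search(r"(.+)/$", url_path)
    if m and not os.path.isdir(full):
        return ("redirect", "/" + m.group(1))
    # Rule 2: internal rewrite to "<path>.html" when that file exists
    m = re.search(r"^([^.]+)$", url_path)
    if m and os.path.isfile(os.path.join(doc_root, m.group(1) + ".html")):
        return ("rewrite", "/" + m.group(1) + ".html")
    return ("pass", "/" + url_path)

def demo():
    # Hypothetical site: about.html, foo/bar.html and a real directory foo/
    with tempfile.TemporaryDirectory() as root:
        os.makedirs(os.path.join(root, "foo"))
        for name in ("about.html", os.path.join("foo", "bar.html")):
            open(os.path.join(root, name), "w").close()
        return [resolve(root, p) for p in ("about/", "about", "foo/bar", "foo/", "logo.png")]
```

In this model /about/ is redirected to /about, /about and /foo/bar are rewritten to their .html files, and the real directory /foo/ and the dotted /logo.png are left untouched.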
"Because of this, my pictures on the site break."
This will happen if you are using relative URLs to your images. You should really be using root-relative (starting with a slash) or absolute URLs to resolve this issue. See also:
404 not found - broken links to CSS, images
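To illustrate why a trailing slash breaks relative image URLs, here is a small Python sketch using urllib.parse.urljoin, which follows the same resolution rules a browser applies. The example.com URLs and image paths are made up for the example.

```python
from urllib.parse import urljoin

def resolve_image(page_url, src):
    """Resolve an <img src> value against the URL of the page that contains it."""
    return urljoin(page_url, src)

relative = "images/photo.png"        # hypothetical relative URL
root_relative = "/images/photo.png"  # hypothetical root-relative URL

without_slash = resolve_image("https://example.com/about", relative)
with_slash = resolve_image("https://example.com/about/", relative)
fixed = resolve_image("https://example.com/about/", root_relative)
```

With the trailing slash the relative URL resolves under /about/ (so the image 404s), whereas the root-relative URL resolves to the same place regardless of the slash.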

Redirecting all urls, including no path, to a file in subdirectory

I have checked a large amount of existing answers regarding .htaccess redirects. However none of them have helped me.
What I want to accomplish is redirecting all request urls to /api/init.php. However I've only gotten so far to where my index page www.example.com simply gives me a file listing because of the missing index.php file, while every url request with a path is working.
How can I accomplish this with .htaccess without ending up with a directory listing on my landing page?
This is as far as I got:
RewriteEngine on
RewriteCond %{REQUEST_FILENAME} !-f
RewriteCond %{REQUEST_FILENAME} !-d
RewriteRule ^(.*)$ /api/init.php?path=$1 [NC,L,QSA]
Well your site root is a directory, so this rule you have excludes existing directories. What you could do is only exclude existing files, and allow existing directories to be handled by the PHP script. Like this:
RewriteEngine on
RewriteCond %{REQUEST_URI} !=/api/init.php
RewriteCond %{REQUEST_FILENAME} !-f
RewriteRule ^(.*)$ /api/init.php?path=$1 [L,QSA]
I removed the NC flag as it's not needed. I added a condition to prevent an unnecessary file-system check.
You don't have to pass the path on in a URL parameter, as you could get it from $_SERVER['REQUEST_URI'] in PHP (not the same as REQUEST_URI in mod_rewrite, in PHP it always has the original URI). If you wanted to do that then your rule becomes nice and simple:
RewriteEngine on
RewriteCond %{REQUEST_URI} !=/api/init.php
RewriteCond %{REQUEST_FILENAME} !-f
RewriteRule ^ /api/init.php [L]
Because the query string will just be passed on unaffected (so QSA is not needed).
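To see why the site root slipped through the original conditions, here is a rough Python model of the two condition sets. The function names and the temporary document root are assumptions for illustration; the empty string stands for a request to the site root.

```python
import os
import tempfile

def original_rule_applies(doc_root, url_path):
    # RewriteCond !-f plus !-d: skips existing files AND existing directories
    full = os.path.join(doc_root, url_path)
    return not os.path.isfile(full) and not os.path.isdir(full)

def revised_rule_applies(doc_root, url_path):
    # RewriteCond %{REQUEST_URI} !=/api/init.php, then !-f only
    if "/" + url_path == "/api/init.php":
        return False
    return not os.path.isfile(os.path.join(doc_root, url_path))

def demo():
    with tempfile.TemporaryDirectory() as root:
        os.makedirs(os.path.join(root, "api"))
        open(os.path.join(root, "api", "init.php"), "w").close()
        return {p: (original_rule_applies(root, p), revised_rule_applies(root, p))
                for p in ("", "users/42", "api/init.php")}
```

The root request ("") is a directory, so only the revised conditions send it to the script; api/init.php itself is excluded in both cases.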

Why doesn't my .htaccess work when the URL includes a real file name?

I'm trying to pass all requests within a certain subdirectory to a file called "handler.php". The .htaccess I have in place works if the URL is not an actual file name, but not if I enter the name of a real file; it instead loads that file directly, never hitting handler.php.
Could someone explain to me what I'm doing wrong here? The .htaccess file looks like this:
RewriteEngine On
RewriteRule ^$ handler.php?url=$1
RewriteCond %{REQUEST_FILENAME} !-f
RewriteCond %{REQUEST_FILENAME} !-d
RewriteRule ^(.*)$ /handler.php?url=$1 [L,QSA]
Is there something I'm doing wrong here? I want all requests to pass the url into handler.php.
The rule doesn't match real files, because the conditions say so
RewriteCond %{REQUEST_FILENAME} !-f
RewriteCond %{REQUEST_FILENAME} !-d
These conditions match if the request does not (!) map to a real file (-f) and does not map to a real directory (-d).
If you want to handle all requests no matter what, remove these conditions. You can also remove the first rule, because it is handled by this one too. This leaves just the second rule, but you must prevent a rewrite loop with another condition:
RewriteCond %{REQUEST_URI} !^/handler\.php
RewriteRule ^(.*)$ /handler.php?url=$1 [L,QSA]

RewriteCond strange behavior (file exists check)

I can't understand why redirect depends on RewriteRule (not on RewriteCond).
My .htaccess:
Options +FollowSymLinks +SymLinksIfOwnerMatch
<IfModule mod_rewrite.c>
RewriteEngine on
RewriteCond %{REQUEST_FILENAME} -f
RewriteRule ^(.*)$ true.txt
RewriteCond %{REQUEST_FILENAME} !-f
RewriteRule ^(.*)$ false.txt
</IfModule>
Root folder contains:
true.txt (contains 'true')
false.txt (contains 'false')
test.txt (contains 'test')
If I try to open test.txt I get true, and if I try to open nonexist.txt I get true too.
Now I change my .htaccess:
...
RewriteCond %{REQUEST_FILENAME} -f
RewriteRule ^(.*)$ $1
...
And now if I try to open test.txt I get test, and if I try to open nonexist.txt I get false.
UPDATE: Thanks for answers, I understood how it works but one problem still exists.
If I try to check 'if file exists' in another directory it always returns false.
/files/test.txt
/script/.htaccess
/script/false.txt
/script/true.txt
now my .htaccess looks like
RewriteCond %{REQUEST_FILENAME} .*(true|false).*$
RewriteRule .* - [S=2]
RewriteCond %{DOCUMENT_ROOT}/files/%{REQUEST_FILENAME} -f
RewriteRule ^(.*)$ true.txt [L]
RewriteCond %{DOCUMENT_ROOT}/files/%{REQUEST_FILENAME} !-f
RewriteRule ^(.*)$ false.txt [L]
I always get false.
I also tried RewriteCond ../files/%{REQUEST_FILENAME} and also always get a false result.
If I move test.txt into the script folder and change the condition back to RewriteCond %{REQUEST_FILENAME}, everything works fine.
It's because of the way mod_rewrite works: the user requests test.txt, mod_rewrite catches the requests and rewrites the URI to false.txt, then it makes a second pass, by sending an internal request for false.txt, which is caught and rewritten to true.txt. Then a third pass is made, the request is caught and rewritten to true.txt, but since the URI stays the same, no more passes are made.
It's rather counter-intuitive, but there's logic to it. The mod_rewrite docs include a control flow diagram of this process.
The [L] flag is often advertised as a magic bullet to stop the recursion, but in fact it just ensures that once a request matches a pattern, then the execution stops and no further processing will take place in that pass, but the internal request will be sent out anyhow, so a second pass is made through the same ruleset. The execution stops only if the URI is unchanged after a pass.
re: update
Your problem is, the REQUEST_FILENAME environmental variable actually holds a path (by default the full filesystem path, but there are a few twists to that), so %{DOCUMENT_ROOT}/files/%{REQUEST_FILENAME} ends up being something horrible.
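To see what that "something horrible" looks like, here is a tiny Python sketch of the string expansion. The /var/www document root and the file name are assumed values, not taken from your server.

```python
import posixpath

doc_root = "/var/www"                            # hypothetical DOCUMENT_ROOT
request_filename = "/var/www/script/test.txt"    # REQUEST_FILENAME is already absolute

# What %{DOCUMENT_ROOT}/files/%{REQUEST_FILENAME} expands to
naive = doc_root + "/files/" + request_filename

# What was intended: only the file's own name under files/
intended = posixpath.join(doc_root, "files", posixpath.basename(request_filename))
```

Here naive is /var/www/files//var/www/script/test.txt, a path that does not exist, which is why the -f test never succeeds.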
As for a solution... well, it's tricky, I think. It'd be a lot easier if the .htaccess were in root. The only solution I can think of right now is:
RewriteEngine on
RewriteCond %{REQUEST_URI} script/(.*)$
RewriteCond %{DOCUMENT_ROOT}/files/%1 -f
RewriteRule .* true.txt [L]
RewriteCond %{REQUEST_URI} !(true.txt)|(false.txt)
RewriteRule .* false.txt [L]
It's rather ugly, and not very scalable or portable. In the first condition I get the file's name, in the second I check if it exists, and if it does, it's true. Everything else is false. Then again, if the files directory is also in the scope of the .htaccess, it's easier and nicer by magnitudes.
RewriteEngine on
RewriteCond %{REQUEST_FILENAME} -f
RewriteCond %{REQUEST_URI} !(true|false)\.txt$
RewriteRule .* true.txt [L]
RewriteCond %{REQUEST_FILENAME} !-f
RewriteRule .* false.txt [L]
Note the second RewriteCond, which prevents the true.txt and false.txt files themselves from being rewritten, and the L flag on the rules, which stops rule execution for the current pass.
Both are there to prevent a rewrite loop.
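A simplified Python model of these two rules shows how the guard condition lets the passes settle. It assumes the three files from the question exist and treats .htaccess processing as "repeat the ruleset until the URI stops changing"; real mod_rewrite has more moving parts, so take it as a sketch only.

```python
import re

FILES = {"true.txt", "false.txt", "test.txt"}   # files assumed to exist
GUARD = re.compile(r"(true|false)\.txt$")       # the second RewriteCond

def one_pass(uri):
    """Apply both rules once; with [L], the first rule that fires ends the pass."""
    exists = uri in FILES
    if exists and not GUARD.search(uri):
        return "true.txt"
    if not exists:
        return "false.txt"
    return uri

def rewrite(uri, max_passes=10):
    """Repeat passes until the URI is unchanged (simplified model)."""
    for _ in range(max_passes):
        new_uri = one_pass(uri)
        if new_uri == uri:
            return uri
        uri = new_uri
    raise RuntimeError("rewrite loop")
```

In this model rewrite("test.txt") settles on true.txt and rewrite("nonexist.txt") settles on false.txt, each after one extra pass that changes nothing.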
UPDATE:
%{REQUEST_FILENAME} is the full path, so if you append it to another path the check will always be false (it will essentially try to match this: /var/www/subfolder/var/www/filename.txt).
To match a file in another folder you will need a match vs URI part...
Here's how you can do it:
RewriteEngine on
RewriteBase /
RewriteCond %{REQUEST_URI} ^/([^/]+)$
RewriteCond %{DOCUMENT_ROOT}/files/%1 -f
RewriteRule .* files/$0 [L]
The first condition checks whether the request was for a filename in the root directory (it checks that the URI starts with a /, but does not contain any more slashes).
Note that the first condition encloses everything after the leading slash in parentheses; this matched subpattern is used later.
The second condition ensures that the file whose name is saved in subpattern %1 (matched by the first condition) exists in the subfolder files/ inside %{DOCUMENT_ROOT}.
If both conditions match, the request is rewritten to that file (internally; the browser is not redirected).
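The same two conditions can be sketched in Python to check the logic. The function name rewrite_to_files and the temporary document root are hypothetical; only the regex and the files/ existence test mirror the directives above.

```python
import os
import re
import tempfile

def rewrite_to_files(doc_root, request_uri):
    # Condition 1: a single path segment directly under the root, captured as %1
    m = re.search(r"^/([^/]+)$", request_uri)
    if not m:
        return request_uri
    # Condition 2: %{DOCUMENT_ROOT}/files/%1 must be an existing regular file
    if os.path.isfile(os.path.join(doc_root, "files", m.group(1))):
        return "/files/" + m.group(1)
    return request_uri

def demo():
    with tempfile.TemporaryDirectory() as root:
        os.makedirs(os.path.join(root, "files"))
        open(os.path.join(root, "files", "test.txt"), "w").close()
        return [rewrite_to_files(root, u)
                for u in ("/test.txt", "/missing.txt", "/files/test.txt")]
```

Only /test.txt is rewritten here; /missing.txt fails the file check and /files/test.txt already contains a second slash, so the first condition rejects it.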
Instead of using "RewriteCond %{REQUEST_FILENAME} !-f" you can try:
"RewriteCond %{REQUEST_URI} !-U", which checks whether the address exists (via a subrequest).
Sometimes the file path and the address where the file is served are different, making the former unusable.
Example:
RewriteEngine On
RewriteCond %{REQUEST_URI} !-U
RewriteRule ^(.*/media/.*)\.(gif|png|jpe?g)$ https://xyz.company.com$1.$2 [NC,L,R=301]

Whitelist in .htaccess

Instead of blacklisting inaccessible directories (like with deny all) I want to use a whitelist. Basically, I need this functionality:
1. If the URI requests a file that exists in the /public directory, display it;
2. Otherwise route the request to /public/index.php;
3. The 'public' string is not needed in the request string: http://site.com/flower.jpg displays the DOCUMENT_ROOT/public/flower.jpg file from the file system;
Example:
Directory structure:
public\
    flower.jpg
    index.php
data\
    secret_file.crt
Request string and expected result:
site.com/flower.jpg
    flower.jpg is displayed
site.com/data/secret_file.crt
site.com/public/flower.jpg
site.com/public
site.com/data
site.com/any/random_url
    request is routed to public/index.php
What I have now:
(and even that with outside help)
# the functionality described in #1 above
RewriteCond %{DOCUMENT_ROOT}/public%{REQUEST_URI} -f
RewriteRule .* public%{REQUEST_URI} [L]
# I'd like to take out the following line so ALL other requests route to index.php
RewriteCond %{REQUEST_FILENAME} !-f
RewriteRule .* public/index.php
If I remove the
RewriteCond %{REQUEST_FILENAME} !-f
line, it ceases to work. I've experimented with countless configurations and read the mod_rewrite docs, but I can't figure out why this simple thing refuses to work.
Can anyone help me out or point in the right direction?
Complete final solution for reference
RewriteEngine On
# following line stops mod_rewrite from looping because this rule has already been applied
RewriteCond %{REQUEST_URI} !^/public/index.php
RewriteCond %{DOCUMENT_ROOT}/public%{REQUEST_URI} -f
RewriteRule .* /public%{REQUEST_URI} [L]
# don't apply this rule if the first rule has been applied
RewriteCond %{REQUEST_URI} !^/public/
RewriteRule .* /public/index.php [L]
It's a little more complicated when the application is in a subdirectory, like http://site.com/uk/, but this works great.
Ok, this is going to be a little confusing to explain. The problem you are having is that when mod_rewrite rewrites something, without the [R] or [P], it redirects internally, and all the rewrite rules get applied again. This keeps happening until the rewritten uri is the same as the un-rewritten uri. So the first rule you have is getting rewritten by the second rule. You need to prevent that from happening.
First, let's look at the first rule. What you had is totally fine, except you need to add a condition for the caveat site.com/public/flower.jpg rerouted to public/index.php. This means if the request itself has a /public/ in it, it will not serve the request (and let the 2nd rule handle things). An additional caveat here is if you have a directory "public" inside "/public", as in DOCUMENT_ROOT/public/public/, it will be inaccessible.
# Make sure the request itself isn't for /public/
RewriteCond %{THE_REQUEST} !^[A-Z]+\ /public/
# Make sure the filename exists.
RewriteCond %{DOCUMENT_ROOT}/public%{REQUEST_URI} -f
RewriteRule ^ /public%{REQUEST_URI} [L]
Here we've done the extra check for a request starting with something like GET /public/flower.jpg, if it matches, we skip this rule entirely. Also, this rule will break if you try to access a directory in /public/. For example, if you have a directory "stuff" inside "/public" and try to access it via the request site.com/stuff/, this rule will not allow you to see the contents (even if there is an index.html file in /stuff/) because you are not checking if directories exist. You can do that by adding this condition for -d, like so:
# Make sure the request itself isn't for /public/
RewriteCond %{THE_REQUEST} !^[A-Z]+\ /public/
# Make sure the filename/directory exists.
RewriteCond %{DOCUMENT_ROOT}/public%{REQUEST_URI} -f [OR]
RewriteCond %{DOCUMENT_ROOT}/public%{REQUEST_URI} -d
RewriteRule ^ /public%{REQUEST_URI} [L]
The -d condition along with the [OR] of the -f says: if %{DOCUMENT_ROOT}/public%{REQUEST_URI} is a regular file OR a directory. (See the RewriteCond docs)
Now for the second rule, and this is going to look a bit confusing because we have to handle the negation of the first rule's conditions. If the first rule passes and the URI is rewritten, 2 things happen:
The request doesn't start with something like: GET /public/
The uri got rewritten to "/public/[something]"
So we'll have 2 conditions to deal with that. If the first rule rewrote the URI, we don't want to touch it again. This solves the problem that I mentioned in the first paragraph. Additionally, we don't want the URI to get re-rewritten, causing a rewrite loop. So we need to add a condition to stop rewriting if the 2nd rule has already been applied, which means the URI is now /public/index.php. Here is the combination of those conditions:
# stops mod_rewrite from looping because this rule has already been applied
RewriteCond %{REQUEST_URI} !^/public/index.php
# don't apply this rule if the first rule has been applied
RewriteCond %{THE_REQUEST} ^[A-Z]+\ /public/ [OR]
RewriteCond %{REQUEST_URI} !^/public/
RewriteRule ^ /public/index.php [L]
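Putting the two rules together, the intended routing can be sketched as a single decision in Python. This is a simplified model under stated assumptions: the function name route and the temporary directory layout are hypothetical, and the sketch ignores the multi-pass behaviour that the extra conditions above exist to handle.

```python
import os
import re
import tempfile

def route(doc_root, request_line):
    """Return the path served for a raw request line such as 'GET /flower.jpg HTTP/1.1'."""
    uri = request_line.split(" ")[1]
    # Same test as %{THE_REQUEST} ^[A-Z]+\ /public/ (the space needs no escape here)
    asked_for_public = re.search(r"^[A-Z]+ /public/", request_line) is not None
    target = os.path.join(doc_root, "public", uri.lstrip("/"))
    # Rule 1: serve files (or directories) that exist under public/
    if not asked_for_public and (os.path.isfile(target) or os.path.isdir(target)):
        return "/public" + uri
    # Rule 2: everything else goes to the front controller
    return "/public/index.php"

def demo():
    # Directory layout from the question, created in a throwaway document root
    with tempfile.TemporaryDirectory() as root:
        os.makedirs(os.path.join(root, "public"))
        os.makedirs(os.path.join(root, "data"))
        for rel in ("public/flower.jpg", "public/index.php", "data/secret_file.crt"):
            open(os.path.join(root, rel), "w").close()
        uris = ("/flower.jpg", "/data/secret_file.crt", "/public/flower.jpg",
                "/public", "/data", "/any/random_url")
        return {u: route(root, "GET " + u + " HTTP/1.1") for u in uris}
```

With the question's layout, only /flower.jpg is served from public/; every other request in the list ends up at /public/index.php, which matches the expected results table.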
This may work:
RewriteCond %{DOCUMENT_ROOT}/public%{REQUEST_FILENAME} -f [OR]
RewriteCond %{DOCUMENT_ROOT}/%{REQUEST_FILENAME} -f [OR]
RewriteCond %{DOCUMENT_ROOT}%{REQUEST_FILENAME} -f
RewriteRule (.*) public$1 [QSA,L]
RewriteRule .* public/index.php
The optimized version may work too but I'm not sure:
RewriteCond %{DOCUMENT_ROOT}(/public|public|)%{REQUEST_FILENAME} -f
RewriteRule (.*) public$1 [QSA,L]
RewriteRule .* public/index.php
By the way your logic is weird: the following rule:
RewriteCond %{REQUEST_FILENAME} !-f
RewriteRule .* public/index.php
Means: "if the request is not a file, rewrite to public/index.php". The problem is here: if it's a file, what's going on? Nothing. The RewriteRule is ignored. This is not safe, imagine if it's a file that you may not want the user to access? Just remove this rule, it's useless, and without it, it's safer (from my point of view).
May I ask you to tell me if the optimized version worked?
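Please try the RewriteLog directive, which helps you track down such problems. Note that it only exists in Apache 2.2 and earlier (in 2.4 it was replaced by LogLevel with a rewrite:trace level) and that it belongs in the server or virtual host config, not in .htaccess: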
# Trace:
# (!) file gets big quickly, remove in prod environments:
RewriteLog "/web/logs/mywebsite.rewrite.log"
RewriteLogLevel 9
RewriteEngine On
Tell me if it works.
I'm a bit confused by your first set of rules, since %{REQUEST_URI} would be /public/flower.jpg if I'm not mistaken. I would have done it this way:
RewriteCond public/%{REQUEST_FILENAME} -f
RewriteRule ^.*$ public/%{REQUEST_FILENAME} [L]
RewriteCond public/%{REQUEST_FILENAME} !-f
RewriteRule ^.*$ public/index.php [L]
I'm not sure of the behaviour if %{REQUEST_FILENAME} is empty, but basically the rules say:
If the filename exists in public, rewrite the URI to that file; if it does not, rewrite to index.php.
Would that work for you?
Have you considered programmatically creating your .htaccess file to blacklist anything that isn't on a whitelist that you set in whatever file you use to create it? If you ask me, you can't get much simpler.