RewriteCond strange behavior (file exists check) - apache

I can't understand why the redirect depends on the RewriteRule (and not on the RewriteCond).
My .htaccess:
Options +FollowSymLinks +SymLinksIfOwnerMatch
<IfModule mod_rewrite.c>
RewriteEngine on
RewriteCond %{REQUEST_FILENAME} -f
RewriteRule ^(.*)$ true.txt
RewriteCond %{REQUEST_FILENAME} !-f
RewriteRule ^(.*)$ false.txt
</IfModule>
Root folder contains:
true.txt (contains 'true')
false.txt (contains 'false')
test.txt (contains 'test')
If I try to open test.txt I get true, and if I try to open nonexist.txt I get true too.
Now I change my .htaccess:
...
RewriteCond %{REQUEST_FILENAME} -f
RewriteRule ^(.*)$ $1
...
And now if I try to open test.txt I get test, and if I try to open nonexist.txt I get false.
UPDATE: Thanks for the answers. I understand how it works now, but one problem remains.
If I check whether a file exists in another directory, the check always returns false.
/files/test.txt
/script/.htaccess
/script/false.txt
/script/true.txt
Now my .htaccess looks like this:
RewriteCond %{REQUEST_FILENAME} .*(true|false).*$
RewriteRule .* - [S=2]
RewriteCond %{DOCUMENT_ROOT}/files/%{REQUEST_FILENAME} -f
RewriteRule ^(.*)$ true.txt [L]
RewriteCond %{DOCUMENT_ROOT}/files/%{REQUEST_FILENAME} !-f
RewriteRule ^(.*)$ false.txt [L]
I always get false.
I also tried RewriteCond ../files/%{REQUEST_FILENAME} and always get false as well.
If I move test.txt into the script folder and change the condition back to RewriteCond %{REQUEST_FILENAME}, everything works fine.

It's because of the way mod_rewrite works: the user requests test.txt, mod_rewrite catches the request and rewrites the URI to false.txt, then it makes a second pass by sending an internal request for false.txt, which is caught and rewritten to true.txt. Then a third pass is made; the request is caught and rewritten to true.txt again, but since the URI stays the same, no more passes are made.
It's rather counter-intuitive, but there's logic to it. The mod_rewrite docs include a control flow diagram that illustrates this.
The [L] flag is often advertised as a magic bullet to stop the recursion. In fact it only ensures that once a request matches a rule, execution stops and no further rules are processed in that pass. The internal request is still sent out, so a second pass is made through the same ruleset. Processing stops only when the URI is unchanged after a pass.
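On Apache 2.4 (the flag was added in 2.3.9), the [END] flag stops the current pass like [L] and also prevents any further rewrite passes in per-directory context. A minimal sketch of the question's rules using it, assuming the same three files sit in the same folder as the .htaccess:

```apache
RewriteEngine On

# Requested file exists: serve true.txt and stop all further rewriting.
RewriteCond %{REQUEST_FILENAME} -f
RewriteRule ^(.*)$ true.txt [END]

# Requested file does not exist: serve false.txt and stop.
RewriteCond %{REQUEST_FILENAME} !-f
RewriteRule ^(.*)$ false.txt [END]
```

With this, test.txt should yield true and nonexist.txt should yield false, because the rewritten URI is not run through the ruleset again.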
re: update
Your problem is, the REQUEST_FILENAME environmental variable actually holds a path (by default the full filesystem path, but there are a few twists to that), so %{DOCUMENT_ROOT}/files/%{REQUEST_FILENAME} ends up being something horrible.
As for a solution... well, it's tricky, I think. It'd be a lot easier if the .htaccess were in root. The only solution I can think of right now is:
RewriteEngine on
RewriteCond %{REQUEST_URI} script/(.*)$
RewriteCond %{DOCUMENT_ROOT}/files/%1 -f
RewriteRule .* true.txt [L]
RewriteCond %{REQUEST_URI} !(true.txt)|(false.txt)
RewriteRule .* false.txt [L]
It's rather ugly, and not very scalable or portable. In the first condition I get the file's name, in the second I check if it exists, and if it does, the result is true. Everything else is false. Then again, if the files directory is also in the scope of the .htaccess, it's easier and nicer by orders of magnitude.

RewriteEngine on
RewriteCond %{REQUEST_FILENAME} -f
RewriteCond %{REQUEST_URI} !(true|false)\.txt$
RewriteRule .* true.txt [L]
RewriteCond %{REQUEST_FILENAME} !-f
RewriteRule .* false.txt [L]
Note the second RewriteCond, which prevents true.txt and false.txt themselves from being rewritten, and the L flag on the rules, which stops rule processing for the current pass.
Together these prevent a rewrite loop.
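Another common loop guard, sketched here as an alternative and not part of the rules above, is to test the REDIRECT_STATUS environment variable, which is empty on the initial request and set once an internal rewrite has happened:

```apache
RewriteEngine On

# Only act on the original request, not on internally rewritten ones.
RewriteCond %{ENV:REDIRECT_STATUS} ^$
RewriteCond %{REQUEST_FILENAME} -f
RewriteRule .* true.txt [L]

RewriteCond %{ENV:REDIRECT_STATUS} ^$
RewriteCond %{REQUEST_FILENAME} !-f
RewriteRule .* false.txt [L]
```

Unlike the version above, this sketch does not exempt direct requests for true.txt or false.txt; both exist, so both would be answered with true.txt.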
UPDATE:
%{REQUEST_FILENAME} is the full filesystem path, so if you append it to another path the check will fail (it essentially tests for something like /var/www/subfolder/var/www/filename.txt).
To check for a file in another folder, you need to match against part of the URI instead.
Here's how you can do it:
RewriteEngine on
RewriteBase /
RewriteCond %{REQUEST_URI} ^/([^/]+)$
RewriteCond %{DOCUMENT_ROOT}/files/%1 -f
RewriteRule .* files/$0 [L]
The first condition checks whether the request is for a file name directly in the root directory (the URI starts with a / but contains no further slashes).
Note that the first condition encloses everything after the leading slash in parentheses; this captured subpattern is used later.
The second condition ensures that the file whose name was captured as %1 (by the first condition) exists in the files/ subfolder inside %{DOCUMENT_ROOT}.
If both conditions match, the request is rewritten to that file (an internal rewrite; the browser is not redirected).

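Instead of using "RewriteCond %{REQUEST_FILENAME} !-f" you can try:
"RewriteCond %{REQUEST_URI} !-U", which checks whether the URL is accessible (Apache tests this with an internal subrequest, so it can cost some performance).
Note that %{THE_REQUEST} holds the whole request line (for example "GET /path HTTP/1.1"), so it is not a suitable test string for -U.
Sometimes the file path and the address where the file is served are different, making the former unusable.
Example:
RewriteEngine On
RewriteCond %{REQUEST_URI} !-U
RewriteRule ^(.*/media/.*)\.(gif|png|jpe?g)$ https://xyz.company.com$1.$2 [NC,L,R=301]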

Related

Apache RewriteCond %{REQUEST_FILENAME} !-f failing when file exists

I'm pretty sure the issue is that %{REQUEST_FILENAME} does not reference the URI as it has been changed, but instead as it was requested.
I have some code like this:
<IfModule mod_rewrite.c>
RewriteEngine On
RewriteCond %{REQUEST_FILENAME} !-f
RewriteCond %{REQUEST_FILENAME} !-d
RewriteRule app/?(.*)$ /some-site/map-app/$1 [NC,QSA]
</IfModule>
<IfModule mod_rewrite.c>
RewriteEngine On
RewriteCond %{REQUEST_FILENAME} !-f
RewriteCond %{REQUEST_FILENAME} !-d
RewriteRule some-site/map-app/?(.*)$ /some-site/map-app/index.html [NC,L,QSA]
</IfModule>
The effect is supposed to be that
1. /app goes to -> /some-site/map-app when searching for files
2. If that fails (as it often will, because it's an SPA), it goes to /some-site/map-app/index.html
For some reason it's rewriting every path to the index.html fallback. This means that #1 is occurring enough to meet the RewriteRule condition, but for some reason the RewriteCond are not working.
If I remove the logic for #2, the files resolve fine so the issue is not that the paths it produces are bad.
I've read the docs on "RewriteCond Specials" (https://httpd.apache.org/docs/2.4/mod/mod_rewrite.html)
Why are these two not failing for paths that exist -- and have been produced by the first logic block -- within the second logic block?
RewriteCond %{REQUEST_FILENAME} !-f
RewriteCond %{REQUEST_FILENAME} !-d
Some examples and the desired outcome:
/app/ -> /some-site/map-app/index.html
/app/map/id-3 -> /some-site/map-app/index.html
/app/app.js -> /some-site/map-app/app.js
/app/assets/img/1.png -> /some-site/assets/img/1.png (RewriteEngine logic not included in post, but example included in case it changes potential answers)
Immediately after the first rewrite, the REQUEST_FILENAME server variable contains the rewritten URL-path only (eg. /some-site/map-app/foo), not the absolute filesystem path that the rewritten URL would map to. So, attempting to do a filesystem check on REQUEST_FILENAME at this stage will always fail.
The request needs to be remapped back to the filesystem for the REQUEST_FILENAME variable to be updated to the absolute filesystem path. This only occurs at the start of (and before each pass through) the rewriting engine.
You can force the rewriting engine to start over by simply including the L (last) flag on the first rule. This ends the current round of processing and passes the rewritten URL back into the rewrite engine, at which point the rewritten URL is remapped to the filesystem and REQUEST_FILENAME is updated.
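A sketch of that first option, assuming the rules live in the document root's .htaccess. The anchored patterns are an assumption here, not the question's original regex:

```apache
RewriteEngine On

# [L] ends this pass; the rewritten URL is then mapped back to the
# filesystem before the next pass, so REQUEST_FILENAME is current again.
RewriteCond %{REQUEST_FILENAME} !-f
RewriteCond %{REQUEST_FILENAME} !-d
RewriteRule ^app(?:$|/(.*)) /some-site/map-app/$1 [NC,L]

# Evaluated on the next pass, against the rewritten URL.
RewriteCond %{REQUEST_FILENAME} !-f
RewriteCond %{REQUEST_FILENAME} !-d
RewriteRule ^some-site/map-app(?:$|/) /some-site/map-app/index.html [NC,L]
```

Once the request has been rewritten to index.html, the file check in the second rule fails on the following pass, so the rewriting stops there.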
Alternatively, don't use REQUEST_FILENAME in the second rule and instead manually construct the absolute filename from the DOCUMENT_ROOT and rewritten URL-path.
For example:
# If the previously rewritten URL does not map to a file (or directory)...
RewriteCond %{DOCUMENT_ROOT}/$0 !-f
RewriteCond %{DOCUMENT_ROOT}/$0 !-d
RewriteRule ^some-site/map-app(?:$|/(.*)) /some-site/map-app/index.html [NC,L]
Where $0 is a backreference to the entire URL-path that the RewriteRule pattern would match.
There is an issue with your existing regex (RewriteRule pattern). The regex app/?(.*)$ matches appanything since the slash is optional, which I'm sure is not the intention. Presumably you want to match app or app/ or app/<something>? This should also be anchored to the start of the URL-path, otherwise, it will also match /some-site/map-app/<something> (which is intended for the second rule). The same applies to the second rule (updated above).
So, try the following instead (if not using the L flag on the first rule):
RewriteEngine On
RewriteCond %{REQUEST_FILENAME} !-f
RewriteCond %{REQUEST_FILENAME} !-d
RewriteRule ^app(?:$|/(.*)) /some-site/map-app/$1 [NC]
# If the previously rewritten URL does not map to a file (or directory)...
RewriteCond %{DOCUMENT_ROOT}/$0 !-f
RewriteCond %{DOCUMENT_ROOT}/$0 !-d
RewriteRule ^some-site/map-app(?:$|/(.*)) /some-site/map-app/index.html [NC,L]
No need for the <IfModule> wrapper or repeating the RewriteEngine directive. The QSA (Query String Append) flags were also superfluous: when the substitution contains no query string of its own, the original query string is passed through by default. I would also be wary of using the NC flag (on an internal rewrite) since this permits both /app and /APP to map to the same URL (minor duplicate content issue).
I also wonder whether the directory checks are really necessary?

Why doesn't my .htaccess work when the URL includes a real file name?

I'm trying to pass all requests within a certain subdirectory to a file called "handler.php". The .htaccess I have in place works if the URL is not an actual file name, but not if I enter the name of a real file; it instead loads that file directly, never hitting handler.php.
Could someone explain to me what I'm doing wrong here? The .htaccess file looks like this:
RewriteEngine On
RewriteRule ^$ handler.php?url=$1
RewriteCond %{REQUEST_FILENAME} !-f
RewriteCond %{REQUEST_FILENAME} !-d
RewriteRule ^(.*)$ /handler.php?url=$1 [L,QSA]
Is there something I'm doing wrong here? I want all requests to pass the url into handler.php.
The rule doesn't match real files, because the conditions say so:
RewriteCond %{REQUEST_FILENAME} !-f
RewriteCond %{REQUEST_FILENAME} !-d
These conditions match, if the request does not (!) match a real file (-f) and if it doesn't match a real directory (-d).
If you want to handle all requests no matter what, remove these conditions. You can also remove the first rule, because its case is handled by the second one too. This leaves just the second rule, but you must prevent a rewrite loop with another condition:
RewriteCond %{REQUEST_URI} !^/handler\.php
RewriteRule ^(.*)$ /handler.php?url=$1 [L,QSA]

Complex(?) htaccess rewriting / redirecting

It seems every few weeks I have to ask more .htaccess rewriting/redirecting questions. Every time I think I understand it, another wrench gets thrown into my project that shows that I don't.
EDIT: My original question wasn't very clear so the following is an attempt to be more concise.
As it stands, all of the .html files live in the root directory. eg: http://example.com/about.html
There aren't any sub-directories with the exception of normal ones like img, css, etc.
For tracking purposes, if someone types in http://example.com/random/ where "random" can be any string of characters, I'd want them to see the index.html file, without modifying the url. The directory "random" doesn't actually exist on the server at all.
The same goes for other pages like about.html. If someone types in http://example.com/random/about.html I'd want them to see the about.html page.
At the same time, I'd like http://example.com/random/about or http://example.com/about (missing file extension) to also show the about page.
However, if someone typed in a page that doesn't exist, I'd like for it to use the ErrorDocument
Example: I don't have a file named "pickups.html" so the following would all be 404s:
http://example.com/pickups.html
http://example.com/pickups
http://example.com/random/pickups.html
http://example.com/random/pickups
It would be nice if the end redirect/rewrite did have the file extension stripped off (because it looks nicer).
My thoughts are that any request ending with a / would just serve up the index.html file that exists at the site root. So that leaves the files.
My thought process is:
strip the file extension off of the request
check if that file with an extension exists at site root
if yes, display that page.
if no, 404.
My initial code (had help on it) was this:
RewriteCond %{REQUEST_FILENAME} !-d
RewriteCond %{REQUEST_FILENAME} !-f
RewriteRule ^(.+)/(.*)$ /$2 [R=301,L]
I understand that in that code I'm grabbing everything after the last slash and serving it from the document root. Unfortunately, it doesn't account for files that do not exist.
Starting with existing files, they will be passed through unchanged. This also prevents rewrite loops.
RewriteCond %{REQUEST_FILENAME} -f
RewriteRule ^ - [L]
Next are existing files, requested as part of an optional, virtual subdirectory
RewriteCond %{DOCUMENT_ROOT}/$2 -f
RewriteRule ^(.+/)?(.+)$ /$2 [L]
RewriteCond %{DOCUMENT_ROOT}/$2.html -f
RewriteRule ^(.+/)?(.+)$ /$2.html [L]
This splits the request into an optional prefix (.+/)? and the file part. If this file part exists, maybe with an appended .html, you're done.
Next comes anything with a trailing slash, just rewrite to index.html
RewriteRule /$ /index.html [L]
Anything else will be requests for non-existing files, which yield a 404 status code.
In order to remove an optional .html extension and remove an optional trailing slash / for existing files, we must insert two rules at the beginning
RewriteCond %{ENV:REDIRECT_STATUS} ^$
RewriteCond %{DOCUMENT_ROOT}/$2.html -f
RewriteRule ^(.+/)?(.+?)\.html/?$ /$1$2 [R,L]
RewriteCond %{ENV:REDIRECT_STATUS} ^$
RewriteCond %{DOCUMENT_ROOT}/$2.html -f
RewriteRule ^(.+/)?(.+?)/$ /$1$2 [R,L]
These rules are similar to the other rules, except that they do an external redirect (the R|redirect flag) instead of an internal rewrite, and they have an additional condition to prevent a loop.
Putting everything together gives
RewriteCond %{ENV:REDIRECT_STATUS} ^$
RewriteCond %{DOCUMENT_ROOT}/$2.html -f
RewriteRule ^(.+/)?(.+?)\.html/?$ /$1$2 [R,L]
RewriteCond %{ENV:REDIRECT_STATUS} ^$
RewriteCond %{DOCUMENT_ROOT}/$2.html -f
RewriteRule ^(.+/)?(.+?)/$ /$1$2 [R,L]
RewriteCond %{REQUEST_FILENAME} -f
RewriteRule ^ - [L]
RewriteCond %{DOCUMENT_ROOT}/$2 -f
RewriteRule ^(.+/)?(.+)$ /$2 [L]
RewriteCond %{DOCUMENT_ROOT}/$2.html -f
RewriteRule ^(.+/)?(.+)$ /$2.html [L]
RewriteRule /$ /index.html [L]

Whitelist in .htaccess

Instead of blacklisting inaccessible directories (like with deny all) I want to use a whitelist. Basically, I need this functionality:
1. If the URI requests a file that exists in the /public directory, display it;
2. Otherwise route the request to /public/index.php;
3. The 'public' string is not needed in the request string: http://site.com/flower.jpg displays the DOCUMENT_ROOT/public/flower.jpg file from the file system.
Example:
Directory structure:
public\
flower.jpg
index.php
data\
secret_file.crt
Request string and expected result:
site.com/flower.jpg
flower.jpg is displayed
site.com/data/secret_file.crt
site.com/public/flower.jpg
site.com/public
site.com/data
site.com/any/random_url
request is routed to public/index.php
What I have now:
(and even that with outside help)
# the functionality described in #1 above
RewriteCond %{DOCUMENT_ROOT}/public%{REQUEST_URI} -f
RewriteRule .* public%{REQUEST_URI} [L]
# I'd like to take out the following line so ALL other requests route to index.php
RewriteCond %{REQUEST_FILENAME} !-f
RewriteRule .* public/index.php
If I remove the
RewriteCond %{REQUEST_FILENAME} !-f
line, it ceases to work. I've experimented with countless configurations and read the mod_rewrite docs, but can't figure out why this simple thing refuses to work.
Can anyone help me out or point in the right direction?
Complete final solution for reference
RewriteEngine On
# following line stops mod_rewrite from looping because this rule has already been applied
RewriteCond %{REQUEST_URI} !^/public/index.php
RewriteCond %{DOCUMENT_ROOT}/public%{REQUEST_URI} -f
RewriteRule .* /public%{REQUEST_URI} [L]
# don't apply this rule if the first rule has been applied
RewriteCond %{REQUEST_URI} !^/public/
RewriteRule .* /public/index.php [L]
It's a little more complicated when the application is in a subdirectory, like http://site.com/uk/, but this works great.
Ok, this is going to be a little confusing to explain. The problem you are having is that when mod_rewrite rewrites something, without the [R] or [P], it redirects internally, and all the rewrite rules get applied again. This keeps happening until the rewritten uri is the same as the un-rewritten uri. So the first rule you have is getting rewritten by the second rule. You need to prevent that from happening.
First, let's look at the first rule. What you had is totally fine, except you need to add a condition for the caveat site.com/public/flower.jpg rerouted to public/index.php. This means if the request itself has a /public/ in it, it will not serve the request (and let the 2nd rule handle things). An additional caveat here is if you have a directory "public" inside "/public", as in DOCUMENT_ROOT/public/public/, it will be inaccessible.
# Make sure the request itself isn't for /public/
RewriteCond %{THE_REQUEST} !^[A-Z]+\ /public/
# Make sure the filename exists.
RewriteCond %{DOCUMENT_ROOT}/public%{REQUEST_URI} -f
RewriteRule ^ /public%{REQUEST_URI} [L]
Here we've done the extra check for a request starting with something like GET /public/flower.jpg, if it matches, we skip this rule entirely. Also, this rule will break if you try to access a directory in /public/. For example, if you have a directory "stuff" inside "/public" and try to access it via the request site.com/stuff/, this rule will not allow you to see the contents (even if there is an index.html file in /stuff/) because you are not checking if directories exist. You can do that by adding this condition for -d, like so:
# Make sure the request itself isn't for /public/
RewriteCond %{THE_REQUEST} !^[A-Z]+\ /public/
# Make sure the filename/directory exists.
RewriteCond %{DOCUMENT_ROOT}/public%{REQUEST_URI} -f [OR]
RewriteCond %{DOCUMENT_ROOT}/public%{REQUEST_URI} -d
RewriteRule ^ /public%{REQUEST_URI} [L]
The -d condition along with the [OR] of the -f says: if %{DOCUMENT_ROOT}/public%{REQUEST_URI} is a regular file OR a directory. (See the RewriteCond docs)
Now for the second rule, and this is going to look a bit confusing because we have to handle the negation of the first rule's conditions. If the first rule passes and the URI is rewritten, 2 things happen:
The request doesn't start with something like: GET /public/
The uri got rewritten to "/public/[something]"
So we'll have 2 conditions to deal with that. If the first rule rewrote the URI, we don't want to touch it again. This solves the problem that I mentioned in the first paragraph. Additionally, we don't want the URI to get re-rewritten, causing a rewrite loop. So we need to add a condition to stop rewriting if the 2nd rule has already been applied, which means the URI is now /public/index.php. Here is the combination of those conditions:
# stops mod_rewrite from looping because this rule has already been applied
RewriteCond %{REQUEST_URI} !^/public/index.php
# don't apply this rule if the first rule has been applied
RewriteCond %{THE_REQUEST} ^[A-Z]+\ /public/ [OR]
RewriteCond %{REQUEST_URI} !^/public/
RewriteRule ^ /public/index.php [L]
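Putting the two rules from this answer together, a complete .htaccess might look like the following sketch (assuming it sits in the document root, and using the variant that also checks for directories):

```apache
RewriteEngine On

# Rule 1: serve existing files or directories from /public,
# unless the client explicitly asked for /public/...
RewriteCond %{THE_REQUEST} !^[A-Z]+\ /public/
RewriteCond %{DOCUMENT_ROOT}/public%{REQUEST_URI} -f [OR]
RewriteCond %{DOCUMENT_ROOT}/public%{REQUEST_URI} -d
RewriteRule ^ /public%{REQUEST_URI} [L]

# Rule 2: everything else goes to the front controller.
RewriteCond %{REQUEST_URI} !^/public/index.php
RewriteCond %{THE_REQUEST} ^[A-Z]+\ /public/ [OR]
RewriteCond %{REQUEST_URI} !^/public/
RewriteRule ^ /public/index.php [L]
```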
This may work:
RewriteCond %{DOCUMENT_ROOT}/public%{REQUEST_FILENAME} -f [OR]
RewriteCond %{DOCUMENT_ROOT}/%{REQUEST_FILENAME} -f [OR]
RewriteCond %{DOCUMENT_ROOT}%{REQUEST_FILENAME} -f
RewriteRule (.*) public$1 [QSA,L]
RewriteRule .* public/index.php
The optimized version may work too but I'm not sure:
RewriteCond %{DOCUMENT_ROOT}(/public|public|)%{REQUEST_FILENAME} -f
RewriteRule (.*) public$1 [QSA,L]
RewriteRule .* public/index.php
By the way your logic is weird: the following rule:
RewriteCond %{REQUEST_FILENAME} !-f
RewriteRule .* public/index.php
Means: "if the request is not a file, rewrite to public/index.php". The problem is here: if it's a file, what's going on? Nothing. The RewriteRule is ignored. This is not safe, imagine if it's a file that you may not want the user to access? Just remove this rule, it's useless, and without it, it's safer (from my point of view).
May I ask you to tell me if the optimized version worked?
Please try to use the RewriteLog directive: it helps you to track down such problems:
# Trace:
# (!) file gets big quickly, remove in prod environments:
RewriteLog "/web/logs/mywebsite.rewrite.log"
RewriteLogLevel 9
RewriteEngine On
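Note that RewriteLog and RewriteLogLevel exist only in Apache 2.2 and earlier. On Apache 2.4 the equivalent tracing is configured through LogLevel, and the output goes to the error log. A minimal sketch, assuming you can edit the server or virtual host configuration (these logging directives are not allowed in .htaccess):

```apache
# Apache 2.4+: per-module log level for mod_rewrite (trace1 .. trace8).
# Higher trace levels are very verbose; lower this again in production.
LogLevel alert rewrite:trace3
```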
Tell me if it works.
I'm a bit confused by your first set of rules, since %{REQUEST_URI} would be /public/flower.jpg if I'm not mistaken. I would have done it this way:
RewriteCond public/%{REQUEST_FILENAME} -f
RewriteRule ^.*$ public/%{REQUEST_FILENAME} [L]
RewriteCond public/%{REQUEST_FILENAME} !-f
RewriteRule ^.*$ public/index.php [L]
I'm not sure of the behaviour if %{REQUEST_FILENAME} is empty, but basically the rules say:
If the filename exists in public, rewrite the URI to that file; if it does not, rewrite to index.php.
Would that work for you?
Have you considered programmatically creating your .htaccess file to blacklist anything that isn't on a whitelist that you set in whatever file you use to create it? If you ask me, you can't get much simpler.

.htaccess question - URL-rewriting

I have an issue with URL-rewriting in .htaccess. Here is .htaccess file:
RewriteEngine On
RewriteBase /community/
RewriteCond %{REQUEST_FILENAME} !-f
RewriteCond %{REQUEST_FILENAME} !-d
RewriteRule ^view-all-results$ forums/index.php?view=view-all-results [R=302]
RewriteRule ^view-all-results/$ forums/index.php?view=view-all-results [R=302]
I need to rewrite url like "/community/view-all-results?u=2" to "community/forums/index.php?view=view-all-results&u=2".
But according to the above rule I'll get "community/forums/index.php?view=view-all-results".
I tried to change RewriteRule to
RewriteRule ^view-all-results?(.*)$ forums/index.php?view=view-all-results&$1 [R=302]
But it doesn't work properly. It still rewrites URL to "community/forums/index.php?view=view-all-results".
When I changed the rule (putting + instead of *):
RewriteRule ^view-all-results?(.+)$ forums/index.php?view=view-all-results&$1 [R=302]
I got a URL like "community/forums/index.php?view=view-all-results&s". I don't understand this behavior.
I would greatly appreciate any suggestions.
The magic flag is in the docs: [QSA], which will add the original querystring to your url.
Normal matching is only done against the path, not against the query string, which you would find in the magic variable %{QUERY_STRING}. Matching this variable can be done in a RewriteCond condition. You could also append this variable to the resulting URL, but QSA is infinitely more user-friendly here.
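For cases where QSA is not enough, here is a sketch of matching the query string in a RewriteCond and reusing the captured value as %1. The parameter name u comes from the question; requiring it to be numeric is an assumption for illustration:

```apache
RewriteEngine On
RewriteBase /community/

# Capture the value of u from a query string such as "u=2".
RewriteCond %{QUERY_STRING} (?:^|&)u=(\d+)(?:&|$)
# %1 refers to the group captured by the RewriteCond above.
RewriteRule ^view-all-results/?$ forums/index.php?view=view-all-results&u=%1 [L]
```

Because the substitution carries its own query string and QSA is not set, only view and u reach index.php; any other original parameters are dropped.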
Give this a try...
RewriteEngine On
RewriteBase /community/
RewriteCond %{REQUEST_FILENAME} !-f
RewriteCond %{REQUEST_FILENAME} !-d
RewriteRule ^view-all-results/?$ forums/index.php?view=view-all-results [QSA]
Basically the first half of a RewriteRule doesn't match against the QUERY_STRING, so your second example will never match against it. The main thing your first code was missing was the QSA flag, which tells it to pass the QUERY_STRING it receives along with the newly created QUERY_STRING. I also removed the R=302, as I assume you don't want the URL to change.
Edit: Oh, I also combined the rules by making the trailing slash optional.