Apache does not rewrite request if file on path exists - apache

I'm doing a rewrite with mod_rewrite on every request that does not match an existing file or directory. This is my configuration:
RewriteEngine On
RewriteCond %{REQUEST_FILENAME} !-f
RewriteCond %{REQUEST_FILENAME} !-d
RewriteRule ^.*$ /index.php [NC,L]
This is used to map URLs like /abc/foo or /abc/foo/10 to my app. And it works just fine.
To improve performance, my app now stores the result of a call to /abc/foo in a file named foo in the corresponding directory /abc, so that after the first call the rewrite conditions no longer apply (the file now exists) and Apache serves the data directly without invoking the app. This works fine as well.
The problem is: requesting /abc/foo/10 no longer causes the URL to be rewritten; instead I get a "404 File Not Found" error. The log entries state that the rewrite condition !-f is no longer true, although the file /abc/foo/10 does not exist. /abc/foo exists, but it is a file, not a directory.
How can I get this to work?
(MultiViews is disabled)

This is because foo exists as a file: Apache maps the request to foo and treats the trailing /10 as additional path information (path-info), not as part of the file name. So your application would have to add some code to the foo file that checks whether the request includes an additional URL component and, if so, handles creating the directory "foo" and the file 10.

You must be in per-directory (.htaccess) context with AcceptPathInfo On.
In that case REQUEST_FILENAME holds the part of the path that actually exists on disk, so it is not the same as REQUEST_URI.
If you don't care where the request was previously mapped, use the REQUEST_URI variable in your RewriteCond instead.
In per-virtual-host context, these two variables are always the same.
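A minimal sketch of the REQUEST_URI variant, assuming the .htaccess file sits in the document root and that DOCUMENT_ROOT points at the real directory on disk (both are assumptions, not stated in the question). Prefixing REQUEST_URI with DOCUMENT_ROOT makes the -f and -d tests look at the full requested path, including /10, rather than at the part Apache has already mapped to a file:

```apache
RewriteEngine On
# Test the full requested path, not the already-mapped file.
# For /abc/foo/10 this checks <docroot>/abc/foo/10, which does not exist
# even when <docroot>/abc/foo is a regular file.
RewriteCond %{DOCUMENT_ROOT}%{REQUEST_URI} !-f
RewriteCond %{DOCUMENT_ROOT}%{REQUEST_URI} !-d
RewriteRule ^ /index.php [L]
```

With this, /abc/foo should still be served from the cached file, while /abc/foo/10 falls through to index.php.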

The project design is slightly off. As others have already pointed out, it does not scale: how could you cache a request to /abc/foo/10 if there is already a file at /abc/foo?
The answer to that, and to your problem, is to use subfolders instead of files.
So instead of cache structure of:
/abc/foo
/abc/bar
...?
use:
/abc/index.html
/abc/foo/index.html
/abc/bar/index.html
/abc/foo/10/index.html
and create a new directory with an index.html each time.
This time Apache finds that there is an /abc/foo folder but no /abc/foo/10 file in it, so the RewriteCond applies.
edit
You could also try a different way - to modify url with mod_rewrite, changing urls:
/abc/foo
/abc/bar
/abc/foo/10
to something like:
/cache/abc~foo
/cache/abc~bar
/cache/abc~foo~10
htaccess rules (roughly):
# redirecting to cache folder and removing last '/'
RewriteCond %{REQUEST_URI} ^/(abc|cde)
RewriteRule ^(.*?)/?$ /cache/$1 [L]
# recursive replacing '/' with '~'
RewriteCond %{REQUEST_URI} ^/cache/.*/
RewriteRule cache/(.*)/(.*)$ /cache/$1~$2 [L]
Your standard htaccess rules should follow

Related

htaccess not working but check successful localhost

I have a simple .htaccess file
DirectoryIndex index.php
RewriteEngine On
RewriteRule ^v4r.info/(.*)/(.*) v4r.info/NGOplus/index.php?NGO=$1&page=$2 [L,QSA]
I tested the file in htaccess.madewithlove.com, it gives a correct result and copy&pasting the result works flawlessly. (http://localhost/v4r.info/NGOplus/index.php?NGO=action-for-woman&page=board.list.php&ff=710;;;;;&startdate=2017-11-11)
But htaccess fails on localhost with an error:
File does not exist:
/var/www/html/public_html/v4r.info/action-for-woman/board.list.php
The test URL is
localhost/v4r.info/NGOplus/index.php?NGO=action-for-woman&page=board.list.php&ff=710;;;;;&startdate=2017-11-11
htaccess is active. (rubbish line gives "internal server error")
in another directory htaccess is working fine.
apache.conf seems ok (AllowOverride All)
Added:
The .htaccess file is not in the base directory but in the first subdirectory (v4r.info).
What works is htaccess in v4r.info/NGOplus with a symlink 'action-for-woman' to NGOplus
RewriteRule ^(.+?)/?$ index.php?page=$1 [L,QSA]
Here, apache does a «local» rewrite, i.e. just the last part of the URL (the directory name 'action-for-woman' I have to extract from $_SERVER ...)
my .htaccess file is in v4r.info directory what is not the root directory.
In that case, your rule will never match. The RewriteRule pattern matches a URL-path relative to the directory that contains the .htaccess file.
But anyhow, rewriting is not recursive afaik.
Yes, it is "recursive" in a directory context (ie. .htaccess). In that the rewrite engine "loops" repeatedly until the URL passes through unchanged, or you have explicitly set END (Apache 2.4).
Try the following instead:
RewriteCond %{ENV:REDIRECT_STATUS} ^$
RewriteCond %{REQUEST_URI} !index\.php$
RewriteRule ^([^/]+)/([^/]+)$ /v4r.info/NGOplus/index.php?NGO=$1&page=$2 [L,QSA]
The check against the REDIRECT_STATUS environment variable is to ensure that only direct requests are rewritten and not already rewritten requests.
However, this pattern is still far too generic as it matches any two path segments. I put the 2nd condition that checks index.php just so you can request /v4r.info/NGOplus/index.php directly (as you were doing in your tests). However, this could be avoided by making the regex more specific.
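One way the regex could be made more specific is sketched below. It assumes Apache 2.4 (for the END flag), that every page name ends in .php as in the example URL, and that NGOplus is the only real directory that must be left alone; none of this is confirmed by the question.

```apache
RewriteEngine On
# Leave direct requests for the real application directory untouched.
RewriteCond %{REQUEST_URI} !^/v4r\.info/NGOplus/
# Match exactly two path segments, the second ending in .php (assumption).
# END stops all further rewrite processing, so no REDIRECT_STATUS check is needed.
RewriteRule ^([^/]+)/([^/]+\.php)$ /v4r.info/NGOplus/index.php?NGO=$1&page=$2 [END,QSA]
```

With these rules in /v4r.info/.htaccess, a request for /v4r.info/action-for-woman/board.list.php would be rewritten to index.php with NGO=action-for-woman and page=board.list.php, while /v4r.info/NGOplus/index.php stays directly reachable.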

How does Apache handle index.php/some_text webpage requests? It returns http status 200 instead of expected 404

I have a website on a shared server with some very basic php pages in the public_html directory, as well as some sub-directories with other pages in:
index.php
test.php
subdir1/index.php
subdir2/index.php
Looking at my visitor logs, I'm getting visits to index.php/some_text and index.php/some_other_text and so on. Naively I would expect those to receive an http status 404 as a) there is no directory called index.php and b) no files exist called some_text and some_other_text. However Apache is returning the file index.php with an http status 200.
Is there something I can set in .htaccess that will return a 404 status in these cases, without restricting the valid subdirectories?
I found some suggestions to set "DirectorySlash Off" but that made no difference. I've also tried
RewriteEngine On
RewriteCond %{REQUEST_FILENAME} !-d
RewriteCond %{REQUEST_FILENAME} !-f
RewriteRule ^(.*)$ - [R=404,L]
But that too made no difference.
Thanks.
I'm getting visits to index.php/some_text and index.php/some_other_text and so on.
The part of the URL that starts with a slash and follows a physical file is called additional pathname information (or path-info). So, /some_text (in your example) is path-info.
In this case index.php receives the request and /some_text is passed to the script via the PATH_INFO environment variable (in PHP this is available in the $_SERVER['PATH_INFO'] superglobal).
By default, whether path-info is valid on the URL is dependent on the handler responsible for the request. PHP files allow path-info by default, but .html files do not. So, by default /index.html/some-text will result in a 404.
You can disable path-info by setting AcceptPathInfo Off in your Apache config / .htaccess file. By doing this, a request for /index.php/some-text will now result in a 404.
Conversely, if you set AcceptPathInfo On then /index.html/some-text will also be permitted.
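As a sketch of the first option, assuming the server allows this directive in .htaccess (it needs AllowOverride FileInfo, which a shared host may or may not grant) and that PHP is run through a handler that honours it:

```apache
# .htaccess in the document root
# Reject any request that carries trailing path-info, e.g. /index.php/some_text
AcceptPathInfo Off
```

The setting is inherited by subdirectories. Plain requests such as /subdir1/index.php carry no path-info and are therefore unaffected.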
Alternatively, you can use mod_rewrite in .htaccess to explicitly trigger a 404 for such URLs. For example, to target .php files (anywhere) only:
RewriteEngine On
RewriteRule \.php/ - [R=404]
Or, just .php files in the document root:
RewriteRule ^[^/]+\.php/ - [R=404]
Or, you can explicitly check the PATH_INFO server variable to block any URL that includes path-info. For example:
RewriteCond %{PATH_INFO} .
RewriteRule . - [R=404]
Note that some frameworks use path-info to route requests in a front-controller pattern (as opposed to using a query string or parsing the requested URI directly).
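For illustration, a front controller that routes through path-info is commonly set up along these lines (a generic sketch, not taken from any particular framework's documentation):

```apache
RewriteEngine On
# Only route requests that do not map to a real file or directory.
RewriteCond %{REQUEST_FILENAME} !-f
RewriteCond %{REQUEST_FILENAME} !-d
# /blog/42 becomes /index.php/blog/42; the script reads PATH_INFO to pick a route.
RewriteRule ^(.*)$ index.php/$1 [L]
```

Blocking path-info on .php files would break this kind of routing, since every routed URL ends up in the form index.php/<path>.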
Reference:
https://httpd.apache.org/docs/2.4/mod/core.html#acceptpathinfo
I found some suggestions to set "DirectorySlash Off"
That has nothing to do with this issue. Setting DirectorySlash Off prevents mod_dir from appending trailing slashes to requests for directories.
I have since tried
RewriteEngine On
RewriteCond %{REQUEST_URI} ^/[^/]+\.php/.*$
RewriteRule ^(.*)$ - [R=404,L]
This should then only affect *.php files in the root directory, leaving any subdirectories alone. I think. It produces the behaviour I want, but it doesn't feel like a good solution.

htaccess - rewrite rule not working when requested URL is a folder on my system

All requests to my site should be rewritten to index.php?page=blah, where blah is the page that's requested (except for css, js, jp(e)g, gif and png files).
This is how my .htaccess file looks like:
RewriteEngine On
RewriteCond %{REQUEST_URI} !\.(?:css|js|jpe?g|gif|png)$ [NC]
RewriteRule ^(.*)$ index.php?page=$1 [L,QSA]
The .htaccess is in this directory: localhost:8080/example/, so when I go to localhost:8080/example/abc, it is (internally) rewritten to localhost:8080/example/index.php?page=abc.
However, when I go to localhost:8080/example/res, I get redirected to localhost:8080/example/res/?page=res. I found out that this only happens with directories; when I go to localhost:8080/example/core (also a folder on my file system), I get redirected to localhost:8080/example/core/?page=core, while it should be internally rewritten to localhost:8080/example/index.php?page=core and the URL visible to the user should stay localhost:8080/example/core/
EDIT:
Thanks to @w3dk, who solved the problem stated above. But I found another problem, which may be related to the problem above:
When I go to:
localhost:8080/example/index/a, it's internally rewritten to localhost:8080/example/index.php?page=index.php/a, while it should be rewritten to localhost:8080/example/index.php?page=index/a.
I found out that this happens when index is a file, because when I go to localhost:8080/example/exampleFile/abc, it's redirected to localhost:8080/example/index.php?page=exampleFile.php/abc, which shouldn't be the case.
The 2 files in my directory are:
index.php (everything should be directed to this file)
example.php
Apache seems to ignore the php file extension, because this also works for exampleFile.txt
This is probably happening because of a conflict with mod_dir. The default behaviour (DirectorySlash On) is for mod_dir to automatically "fix" the URL when you request a physical directory without a trailing slash. It does this with an external 301 redirect, before your rule is processed. Your rule then fires, which modifies the target URL, a Location header gets returned to the client and the browser redirects.
This won't happen if you include the trailing slash on the original request. eg. localhost:8080/example/core/. mod_dir then does not need to "fix" the URL and issue a redirect. Although this may not be desirable for you?
Since you are wanting to internally rewrite all directories then the simple fix is to disable this behaviour in .htaccess:
DirectorySlash Off
You will need to clear your browser cache before testing, as the earlier 301s by mod_dir will have been cached locally.
Reference (note the security warning):
https://httpd.apache.org/docs/current/mod/mod_dir.html#directoryslash
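A hedged sketch of how this could look in the .htaccess file. The Options -Indexes line is an addition motivated by the security warning in the referenced documentation: with DirectorySlash Off, a request for a directory without the trailing slash can produce a directory listing when mod_autoindex is active. Whether Options may be set in .htaccess depends on the server's AllowOverride setting, which is not known here.

```apache
# Stop mod_dir from redirecting /example/core to /example/core/
DirectorySlash Off
# Assumption: avoid exposing directory listings for slash-less directory requests
Options -Indexes

# ... the existing RewriteEngine / RewriteCond / RewriteRule lines follow unchanged
```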
You can use this .htaccess file.
Note: The directory folder1 must be unique in the URL. It won't work for http://domain.com/folder1/folder1.html. The directory folder1 must exist and have content in it.
RewriteEngine On
RewriteCond %{HTTP_HOST} domain\.com$ [NC]
# Skip URLs that already contain folder1 (avoids a redirect loop)
RewriteCond %{REQUEST_URI} !folder1
RewriteRule ^(.*)$ http://domain.com/folder1/$1 [R=301,L]

How to fwd urls to existing paths AND one more path with apache's mod_rewrite?

My current .htaccess looks like this:
RewriteEngine On
# RewriteCond %{REQUEST_URI} !^/_project
RewriteCond %{REQUEST_FILENAME} !-d
RewriteCond %{REQUEST_FILENAME} !-f
# RewriteCond %{REQUEST_FILENAME} !-l
RewriteRule ^.*$ index.php [QSA,L]
The uncommented lines are pretty straightforward:
The two Conds make sure the Rule isn't applied to existing files (!-f) and folders (!-d).
The Rule sends everything else to index.php
The uncommented lines I took from somewhere. I believe it's the best way to do what I require: 'pretty urls'.
Basically it works. Existing files (e.g. /css/general.css) are requestable and non-existing paths (e.g. /admin/login) are routed to index.php. Existing and non-existing paths must be able to work alongside each other: /css/all.css is sometimes a buffered existing css file and sometimes (when it doesn't exist) it's handled by PHP. /css/general.css is always a file. /css/club_N.css (N is a number) is always a PHP script.
/_project/ is an existing folder with Basic HTTP Auth protection. For instance /_project/phpinfo.php works as well. In the _project folder I have created a (valid) symlink to the backups folder: /_project/backups/. Somehow the (existing) files in the backups folder can't be reached. For instance /_project/backups/today.bz2 is routed to index.php =( The same happens with either or both commented lines uncommented.
What's wrong with the htaccess config? If I remove the Rewrite stuff entirely, I get a 403 Forbidden. Probably something with the .htaccess in the _project folder (?).
PS. Obviously I can't show you the actual website. People wouldn't like it if you could download their backups =)
.htaccess files are hierarchical in scope, any such files in parent directories apply to their children.
The Basic Auth in /_project/ will apply to subdirectories unless you switch it off in those directories, as will the RewriteRule declaration. Often it is wise to add RewriteEngine off in the .htaccess of the child directory structure to stop the rules applying there, or possibly add a conditional blocking that structure on the original rule set.
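Both suggestions can be sketched as follows. These are two alternatives for two different files; use one or the other. The file locations are assumptions based on the paths mentioned in the question.

```apache
# Alternative 1: in /_project/.htaccess
# Turn rewriting off for this directory and everything below it.
RewriteEngine Off

# Alternative 2: in the top-level .htaccess instead
# Exclude the whole /_project/ tree from the catch-all rule.
RewriteEngine On
RewriteCond %{REQUEST_URI} !^/_project/
RewriteCond %{REQUEST_FILENAME} !-d
RewriteCond %{REQUEST_FILENAME} !-f
RewriteRule ^.*$ index.php [QSA,L]
```

The second alternative is the same condition that is commented out in the question's configuration, with a trailing slash added so that it only matches the folder itself.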

.htaccess mod_rewrite issue

Almost in any project I work on, some issues with .htaccess occur. I usually just find the easiest solution and leave it because I don't have any knowledge or understanding for Apache, servers etc. But this time I thought I would ask you guys.
This is the files and folders in my (simplified) setup:
/modrewrite-test
    .htaccess
    /config
    /inc
    /lib
    /public_html
        .htaccess
        /cms
            /navigation
                index.php
                edit.php
            /pages
                index.php
                edit.php
        login.php
        page.php
The "config", "inc" and "lib" folders are meant to be "hidden" from the root of the website. I try to accomplish this with a .htaccess file in the root that redirects the user to "public_html". The .htaccess file contains this:
RewriteEngine On
RewriteRule (.*) public_html/$1
This works perfect. If I type "http://localhost/modrewrite-test/login.php" in my browser, I end up in public_html/login.php which is my intention. So this works fine. The .htaccess-file in "public_html" contains this:
RewriteEngine On
# Root
RewriteRule ^$ page.php [L]
# Login
RewriteRule ^(admin|login)/?$ login.php [L]
# Page (if not a file/directory)
RewriteCond %{REQUEST_FILENAME} !-d
RewriteCond %{REQUEST_FILENAME} !-f
RewriteRule ^(.*)$ page.php?url=$1 [L]
The first rewrite just redirects me to public_html/page.php if I try to reach "http://localhost/modrewrite-test/". The next rewrite is just for the convenience of users trying to log in - so if they try to reach "http://localhost/modrewrite-test/admin" or "http://localhost/modrewrite-test/login" they will end up at the login.php file. The third and last rewrite handles the rest of the requests. If I try to reach "http://localhost/modrewrite-test/bla/bla/bla" it will just redirect me to public_html/page.php (with the 'url' GET variable set) instead of looking for a folder called "bla", containing a folder named "bla", and so on.
All of these things work perfectly, but a minor issue occurs when I, for instance, try to reach "http://localhost/modrewrite-test/cms/navigation" without a slash at the end of the URL. When I try to reach that page the browser is somehow redirected to "http://localhost/modrewrite-test/public_html/cms/navigation/". The correct page is shown, but why does it get redirected and add the "public_html" part to the URL? The desired behavior is that the URL stays intact and that the page public_html/cms/navigation/index.php is shown.
The files and folders in the (simplified) setup can be found at http://highbars.com/modrewrite-test.zip
I ran into the same problem with "strange" redirects when trying to access an existing directory without a trailing slash. In my case the redirect was issued by Apache's mod_dir module. To disable it I used the DirectorySlash directive. Try putting the following line in your .htaccess files:
DirectorySlash Off
RewriteBase may help. Try this in public_html/.htaccess:
RewriteEngine On
RewriteBase /
Add the following to /modrewrite-test/.htaccess:
RewriteBase /modrewrite-test
Just to be on the safe side, I'd add the same rule also to /modrewrite-test/public_html/.htaccess. I found that having RewriteBase always set prevents a lot of potential problems in the future. This however means that you might need to update the values if you change the URI structure of your site.
Update:
I don't think that this is possible with your current folder structure. I believe that the problem is that existing subdirectories prevent rewrite rules from firing. Note the behavior please - everything works fine while you are working with non-existent files and directories, thanks to these two conditions:
RewriteCond %{REQUEST_FILENAME} !-d
RewriteCond %{REQUEST_FILENAME} !-f
However, if you try to open any index file from an existing subdirectory, you get redirected to .../public_html/.... Since you can properly open /modrewrite-test/cms/navigation/edit.php, I can only assume that the request is being overridden by some Apache core directive, which adds slashes at the end of folder URLs. Notice that everything works fine if you have a trailing slash on each URL (i.e. the Apache core directive does not need to "correct" your URL, thus everything gets rewritten by your own rewrite rules).
Suggested solution (unless anyone can advise better):
Change /modrewrite-test/public_html/.htaccess as follows:
RewriteEngine On
RewriteBase /modrewrite-test
# Page (if not a file/directory)
RewriteCond %{REQUEST_FILENAME} !-d
RewriteCond %{REQUEST_FILENAME} !-f
RewriteRule ^(.*)$ page.php?url=$1 [L]
Then remove all PHP files from subfolders and use the Front Controller pattern, i.e. route all requests through your main page.php file and do not delegate anything down below.
You can then use the Factory pattern to initiate individual UIs (i.e. navigation/edit.php) directly from your main page.php file based on contents of $_GET['url'] (make sure to properly sanitize that).
Update #2:
This other post on StackOverflow advises on project structure used by Zend Framework - it essentially shows the approach which I suggested above. It is a valuable information asset regardless if you use Zend Framework or not.