.htaccess rewrite url for missing resources - apache

I would like to rewrite files that don't exist to a php handler. I am currently using this .htaccess file, but it doesn't work as I'd like:
RewriteEngine On
RewriteCond %{REQUEST_FILENAME} !-f
RewriteCond %{REQUEST_FILENAME} !-d
RewriteRule ^(.*)$ /app/index.php?_url=/$1
When I have no files, it works: requests for a resource that does not exist are rewritten to my app, and I am able to capture them.
When I have an exactly matching file (e.g. test.txt exists and I request /test.txt), it loads test.txt correctly.
However, when I have a partial match (e.g. test.txt exists, but I request /test), the request is not rewritten at all. In fact, it gives me the standard Apache 404. I want this to actually rewrite to my app, so I can deal with the request in a different manner.
I'm using pretty much default apache 2.2.22 from Debian. Is there some configuration I am missing, or is this the intended behaviour of the rewrite? Is there a way to achieve what I want?

I think you have MultiViews enabled in your Apache by default. Try adding this line on top of your .htaccess:
Options -MultiViews
With MultiViews enabled, Apache does its own rewrites, and that usually conflicts with mod_rewrite.
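As a minimal sketch, this is the .htaccess from the question with the suggested line added at the top. Nothing else is changed. It assumes the server config allows .htaccess to set Options (AllowOverride Options or All), otherwise the Options line will cause an error.

```apache
# Disable content negotiation so mod_negotiation does not try to
# resolve /test to test.txt before mod_rewrite runs
Options -MultiViews

RewriteEngine On
# Only rewrite requests that do not map to an existing file or directory
RewriteCond %{REQUEST_FILENAME} !-f
RewriteCond %{REQUEST_FILENAME} !-d
RewriteRule ^(.*)$ /app/index.php?_url=/$1
```

With this in place, a request for /test (where only test.txt exists) should reach the RewriteRule like any other missing resource.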

Related

How to avoid the need of typing .php on the url?

I'm on macOS Big Sur, using Apache and PHP. What I want is to load my files without having to put .php on the end of the URL.
For instance, instead of typing this on the URL:
127.0.0.1/public_html/home.php
I want just to type
127.0.0.1/public_html/home
To achieve this, I'm using this code in .htaccess:
RewriteEngine On
Options -Indexes
DirectoryIndex home.php index.php
RewriteCond %{REQUEST_FILENAME} !-d
RewriteCond %{REQUEST_FILENAME}\.php -f
RewriteRule ^(.+)$ $1.php [L]
The code above works on my hosting, but for some reason, it does not work on my development machine. Instead, I get a 404 error.
The .htaccess file with the code is in the root of the public_html folder.
What am I missing?
By typing some "nonsense" at the top of the .htaccess file and not getting an error (ordinarily you would get a 500 Internal Server Error) it would seem that .htaccess overrides were not enabled on the server. So, .htaccess files were effectively disabled - which they are by default on Apache 2.4.
To enable .htaccess overrides (to allow .htaccess to override the server config) you need to set the AllowOverride directive in the appropriate <Directory> container in the server config (or <VirtualHost> container). The default on Apache 2.4 is AllowOverride None.
With the directives as posted you would need a minimum of:
AllowOverride FileInfo Indexes Options
FileInfo for mod_rewrite, Indexes for DirectoryIndex and Options for Options and related directives.
Although it is common (and easier) to just set:
AllowOverride All
Reference:
https://httpd.apache.org/docs/2.4/mod/core.html#allowoverride
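For illustration, a sketch of what that might look like in the server config. The directory path is a placeholder (an assumption, not taken from the question), so substitute the real filesystem path of the public_html folder.

```apache
# Goes in the main server config or a <VirtualHost>, not in .htaccess.
# "/path/to/public_html" is a hypothetical path.
<Directory "/path/to/public_html">
    # FileInfo: mod_rewrite directives
    # Indexes:  DirectoryIndex
    # Options:  Options -Indexes and related directives
    AllowOverride FileInfo Indexes Options
</Directory>
```

Changes to the server config only take effect after Apache is restarted or reloaded.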
RewriteCond %{REQUEST_FILENAME} !-d
RewriteCond %{REQUEST_FILENAME}\.php -f
RewriteRule ^(.+)$ $1.php [L]
These directives are not strictly correct. Whilst they may work OK for the URLs you are testing, they would result in a rewrite-loop (500 error response) if you simply append a slash to your URLs (and there is no directory by that name), eg. /home/ (or /home/<anything>). This is because your condition that tests for the presence of the .php file is not necessarily the same as the URL-path you are rewriting to. See my answer to the following question on ServerFault for a thorough explanation of this issue: https://serverfault.com/questions/989333/using-apache-rewrite-rules-in-htaccess-to-remove-html-causing-a-500-error
Also, there's no need to check that the request does not map to a directory to then check if the request + .php extension maps to a file. If the request maps to a file then it can not also be a directory, so if the 2nd condition is true, the 1st condition must also be true and is therefore superfluous.
And there's no need to backslash-escape literal dots in the RewriteCond TestString - this is an "ordinary" string, not a regex.
So, these directives should be written like this instead:
RewriteCond %{DOCUMENT_ROOT}%{REQUEST_URI}.php -f
RewriteRule (.+) $1.php [L]
(RewriteBase should not be used here.)
You can further optimise this by excluding requests that already contain what looks like a file extension (assuming your URLs that need rewriting do not contain a dot near the end of the URL-path). For example:
RewriteCond %{DOCUMENT_ROOT}%{REQUEST_URI}.php -f
RewriteRule !\.\w{2,4}$ %{REQUEST_URI}.php [L]
(With this 2nd version, it does not matter if RewriteBase is set - it is not used.)
DirectoryIndex home.php index.php
You gave an example URL of /public_html/home (to which .php is appended). However, this DirectoryIndex directive allows home.php to also be served when simply requesting the directory /public_html/. It should be one or the other, not both.

.htaccess rewrite - remove all extensions

I would like to do a rewrite rule that removes all extensions - regardless of filename
https://example.com/filename.extension -> https://example.com/filename
for example:
https://example.com/horses.txt -> https://example.com/horses
https://example.com/icecream.json -> https://example.com/icecream
I tried:
<IfModule mod_rewrite.c>
RewriteEngine On
RewriteRule ^(.*)\.*$ - [L]
RewriteCond %{REQUEST_FILENAME} !-f
RewriteCond %{REQUEST_FILENAME} !-d
RewriteRule ^(.+)$ *? [QSA,L]
</IfModule>
but it is not working as it should.
You can only reasonably do what you are asking with MultiViews.
For example, as simple as:
Options +MultiViews
You need to remove your existing mod_rewrite directives.
Now, a request for example.com/horses will be correctly routed to /horses.txt, or whatever file extensions you are using. MultiViews uses mod_negotiation.
This isn't so easy to do with mod_rewrite, since you need to test each file extension in turn in order to work out what file you need to rewrite back to in order to route the request correctly. E.g. should a request for example.com/horses route to /horses.txt or /horses.jpg? MultiViews does this comparison for you.
I would like to do a rewrite rule that removes all extensions
Although, you need to actually remove the file extension in the HTML source. This isn't something you do in .htaccess, unless you need to preserve SEO or backlinks that have already linked back to the old URLs.
UPDATE: Perhaps I wasn't clear enough, I would like the url to display without the extension even if it is linked to it, or to go to that file if linked without the extension
Well, you need to actually remove the file extension on all your internal links. You can issue a "redirect" in .htaccess to remove the extension for the benefit of search engines and 3rd party links - but if you rely on this for your internal links then it will potentially slow users and your site as you are doubling the number of requests hitting your server.
To remove the file extension for direct requests (SEO / 3rd party links), you could do something like this:
RewriteEngine On
RewriteCond %{ENV:REDIRECT_STATUS} ^$
RewriteCond %{REQUEST_FILENAME} -f
RewriteRule ^([^.]+)\.[\w]{2,4}$ /$1 [R=302,L]
This does assume that the only dot in the URL-path is the one that delimits the file extension.
The difficult part is then internally rewriting the request back to the underlying file with an extension - that's where MultiViews comes in (first part of my answer).

Using mod_rewrite for clean URLs with Let's Encrypt

I have enabled Let's Encrypt on a server running Apache on Ubuntu 14.04 and used the auto option to re-direct all http requests to https. This is working fine.
However, I now want to use mod_rewrite to get clean URLs on my site - all I need to do is remove the .php extension from all filenames. (e.g. https://example.com/contact routes to https://example.com/contact.php)
I have tried adding the following rewrite rule to the .htaccess file:
RewriteEngine on
RewriteCond %{REQUEST_FILENAME} !-d
RewriteCond %{REQUEST_FILENAME}\.php -f
RewriteRule ^(.*)$ $1.php
This configuration works fine on my localhost setup (without SSL) but does not work on the instance running Let's Encrypt.
I have tested that the .htaccess is working by adding this rule which works as expected (redirecting all www requests to the root domain)
RewriteEngine On
RewriteBase /
RewriteCond %{HTTP_HOST} ^www\.(.*)$ [NC]
RewriteRule ^(.*)$ http://%1/$1 [R=301,L]
I suspect that there may be some conflict between the Let's Encrypt auto setup option and my mod_rewrite rule but I am stuck as to how to make them both work together.
Any help would be much appreciated.
Disable MultiViews in .htaccess:
Options -MultiViews
MultiViews (part of mod_negotiation) is likely resulting in a conflict. This does something very similar to what you are trying to achieve using mod_rewrite. With MultiViews enabled (possibly enabled in the server config, although the default is disabled), a request for /filename will result in Apache looking for a file that matches (and that would return the appropriate mime-type) by stepping through the files in that directory (essentially trying various extensions where the basename matches).
I have checked what REQUEST_FILENAME is returning - it is the path to the filename (e.g. [REQUEST_FILENAME] => /var/www/sitename/public_html/output.php)
Yeah, that's the problem. MultiViews has already "fixed" the URL (output to output.php) before mod_rewrite has been able to do its thing.
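As a rough sketch, the .htaccess from the question with MultiViews disabled at the top. The rewrite directives are left exactly as posted, and this assumes the server config permits Options in .htaccess.

```apache
# Stop mod_negotiation from resolving /output to output.php
# before mod_rewrite sees the request
Options -MultiViews

RewriteEngine on
# Rewrite only when the request is not a directory and a matching .php file exists
RewriteCond %{REQUEST_FILENAME} !-d
RewriteCond %{REQUEST_FILENAME}\.php -f
RewriteRule ^(.*)$ $1.php
```

After this change, REQUEST_FILENAME for a request to /output should no longer already end in .php when the conditions are evaluated.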

Rewriting URL over a file .htaccess

Here is my problem.
I know how to rewrite a URL only if the file doesn't exist.
But I came across a problem that I have never encountered before.
Given the URL: http://www.my-host.com/agences/my-agencies
With 2 files at the directory root:
agences.php
.htaccess
In the .htaccess:
RewriteEngine on
RewriteCond %{REQUEST_FILENAME} !-f
RewriteCond %{REQUEST_FILENAME} !-d
RewriteRule ^agences/(.*) /agences.php?agence=$1
This does not rewrite to /agences.php; the rule does not even seem to be processed.
If I change the RewriteRule to:
RewriteRule ^agences/(.*) $1
the rewrite rules are still not processed.
The same happens even if I prepend a slash to the pattern, like this:
RewriteRule ^/agences/(.*) $1
I am running Apache 2.4.10, with AllowOverride All configured in the vhost.
Thanks for the help.
Add this at the top of your .htaccess:
Options -MultiViews
The effect of MultiViews is as follows: if the server receives a request for /some/dir/foo, if /some/dir has MultiViews enabled, and /some/dir/foo does not exist, then the server reads the directory looking for files named foo.*, and effectively fakes up a type map which names all those files, assigning them the same media types and content-encodings it would have if the client had asked for one of them by name. It then chooses the best match to the client's requirements.
http://httpd.apache.org/docs/2.0/en/content-negotiation.html

mod_rewrite problem: RewriteCond %{REQUEST_FILENAME} !-f matches even when REQUEST_FILENAME shouldn't (fully) match

For some reason this rule
RewriteCond %{REQUEST_FILENAME} !-f
RewriteCond %{REQUEST_FILENAME} !-d
RewriteRule ^(.*)$ ./rewrite.php?p=$1&%{QUERY_STRING} [L]
doesn't work for URLs like this http://site.com/index/var/val
All other URLs work but this doesn't. It starts working when I either remove the !-f part or rename the index.php file located in the root to something else (e.g. test.php). So somehow site.com/index seems to be equal to site.com/index.php in the eyes of mod_rewrite? The files are located in the root so there shouldn't be any other (upper) .htaccess files involved. This doesn't only happen with index; for example, if I create /something.xml, test.com/something/... will suddenly stop working. This happens on some servers only.
Does anyone know why this could be happening?
PS. /index directory is not present on this server
The faulty module is mod_negotiation, not mod_rewrite.
On Debian:
a2dismod negotiation
Edit:
To be a little more specific, this is the effect of MultiViews, which is handled by mod_negotiation. So you could keep the module and turn off MultiViews handling with:
Options -MultiViews
From documentation:
A MultiViews search is enabled by the MultiViews Options. If the server receives a request for /some/dir/foo and /some/dir/foo does not exist, then the server reads the directory looking for all files named foo.*, and effectively fakes up a type map which names all those files, assigning them the same media types and content-encodings it would have if the client had asked for one of them by name. It then chooses the best match to the client's requirements, and returns that document.
I've also solved this problem by removing the MultiViews keyword from the <Directory> section of my server configuration.
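A sketch of what such a section might look like once the keyword is removed. The path and the remaining options are assumptions for illustration, so keep whatever other options your own configuration already lists.

```apache
# Server config; "/path/to/site/root" is a hypothetical path
<Directory "/path/to/site/root">
    # MultiViews deliberately left out of the Options list
    Options Indexes FollowSymLinks
    AllowOverride All
</Directory>
```

This disables MultiViews for the whole directory tree without needing Options -MultiViews in each .htaccess file.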
Hope this helps.
I believe %{REQUEST_FILENAME} treats the path as if the file were being served to the browser directly.
I had a similar problem with this:
/content/detailed-page (Rewritten URL and parsed by php)
The file was returned to me the same way as:
/content/detailed-page.html (real file)