httpd RewriteRule not working as expected - apache

On my server I am running awstats, a script that I can currently access via the following URL:
https://stats.example.com/bin/awstats.pl/?config=global
I am trying to use rewrite rules such that I can just use
https://stats.example.com/global
This is what I have defined for a rewrite rule
RewriteRule ^(.*)$ bin/awstats.pl/?config=$1 [NC,L]
httpd virtual host
# Address
ServerName stats.example.com
# Rewrite
RewriteRule ^(.*)$ bin/awstats.pl/?config=$1 [NC,L]
Options ExecCGI
AddHandler cgi-script .cgi .pl
Alias /awstatsstuff "/path/to/awstatsstuff/"
<Directory "/path/to/awstatsstuff">
Options ExecCGI
AllowOverride None
Order allow,deny
Allow from all
</Directory>
The problem is that anything I try to access (besides the index) gives me a 400, and my Apache logs show no errors.
Should this rule work as written, or do I have a different configuration issue? Am I missing something? Yes, RewriteEngine is on.
edit
Based on Michael Berkowski's comment I determined that it is in fact an issue with resources also being directed to the pl script. I have since modified the rule and am using the following:
RewriteCond %{REQUEST_FILENAME} !-d
RewriteCond %{REQUEST_FILENAME} !-f
RewriteRule ^/([0-9a-z]+\.[0-9a-z]+\.[0-9a-z]+)$ bin/awstats.pl/?config=$1 [NC,L]
I can now load the page again using
https://stats.example.com/bin/awstats.pl/?config=www.example.com
This means that all resources can be loaded correctly, however
https://stats.example.com/www.example.com
will return a 400. This does not come from the pl script, which returns a 200 and an error message if the specified config file cannot be found. Again, there are no error messages in the logs.
another edit
In changing [NC,L] to [R=302], I am provided with the correct redirect upon request,
curl -k "https://stats.example.com/a.b.c"
...
<p>The document has moved here.</p>
...
Using [R=403] proves that the rewrite rule is working as expected
The problem that I am now facing is that when using [NC,L], I am still receiving a 400, with no errors available in the httpd log.

I strongly suspect requests for documents other than the index are being mistakenly trapped by the very permissive (.*) and sent to config= in error. The 400 (bad request) probably then results from awstats tripping over values it cannot handle there.
Two things should take place. First, you need to exclude real existing files and directories from the rewrite, conventionally done with a pair of RewriteCond. Then, instead of the very general (.*) matcher, use a matcher more specific to the values that should actually be considered valid for config=.
# If the requested document is not a known
# file or directory on disk...
RewriteCond %{REQUEST_FILENAME} !-f
RewriteCond %{REQUEST_FILENAME} !-d
# Rewrite patterns matching only the expected
# config= values for awstats
RewriteRule ^([a-z0-9]+\.[a-z0-9]+\.[a-z0-9]+)$ bin/awstats.pl?config=$1 [L,NC]
Above I have used [a-z0-9]+\. to match 3 part FQDN strings as mentioned in the comment thread. That may need additional refinement. To also support the string "global" for example, you could expand it to
RewriteRule ^(global|[a-z0-9]+\.[a-z0-9]+\.[a-z0-9]+)$ bin/awstats.pl?config=$1 [L,NC]
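Since the question places its rules directly in the virtual host, here is a hedged sketch of how the same idea might look in that context. It relies on three documented differences from .htaccess: the pattern is matched against the URL-path including its leading slash, REQUEST_FILENAME is not yet a filesystem path (so the check is built from DOCUMENT_ROOT instead), and [PT] makes the result be treated as a URL-path rather than a file path. The leading slash on the target and the use of [PT] are my assumptions about this setup and are untested.

```apache
# Sketch for <VirtualHost> (server) context; not verified against the
# asker's configuration.
RewriteEngine On

# In server context REQUEST_FILENAME still equals REQUEST_URI, so build
# the filesystem path explicitly for the existence checks.
RewriteCond %{DOCUMENT_ROOT}%{REQUEST_URI} !-f
RewriteCond %{DOCUMENT_ROOT}%{REQUEST_URI} !-d

# The URL-path starts with "/" here. [PT] passes the rewritten URL back
# through URL mapping (Alias, ScriptAlias) and implies [L].
RewriteRule ^/(global|[a-z0-9]+\.[a-z0-9]+\.[a-z0-9]+)$ /bin/awstats.pl?config=$1 [PT,NC]
```

If the rules are moved into a .htaccess file or a <Directory> block instead, the versions above without the leading slash in the pattern are the ones to use.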

Related

How to avoid the need of typing .php on the url?

I'm on macOS Big Sur, using Apache and PHP. What I want is to not need .php on the end of my file names in order to load them.
For instance, instead of typing this on the URL:
127.0.0.1/public_html/home.php
I want just to type
127.0.0.1/public_html/home
To achieve this, I'm using this code in .htaccess:
RewriteEngine On
Options -Indexes
DirectoryIndex home.php index.php
RewriteCond %{REQUEST_FILENAME} !-d
RewriteCond %{REQUEST_FILENAME}\.php -f
RewriteRule ^(.+)$ $1.php [L]
The code above works on my hosting, but for some reason it does not work on my development machine. Instead, I get a 404 error.
The .htaccess file with the code is on the root of public_html folder.
What am I missing?
By typing some "nonsense" at the top of the .htaccess file and not getting an error (ordinarily you would get a 500 Internal Server Error) it would seem that .htaccess overrides were not enabled on the server. So, .htaccess files were effectively disabled - which they are by default on Apache 2.4.
To enable .htaccess overrides (to allow .htaccess to override the server config) you need to set the AllowOverride directive in the appropriate <Directory> container in the server config (or <VirtualHost> container). The default on Apache 2.4 is AllowOverride None.
With the directives as posted you would need a minimum of:
AllowOverride FileInfo Indexes Options
FileInfo for mod_rewrite, Indexes for DirectoryIndex and Options for Options and related directives.
Although it is common (and easier) to just set:
AllowOverride All
Reference:
https://httpd.apache.org/docs/2.4/mod/core.html#allowoverride
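As a rough sketch, the override could be enabled in the server config like this. The directory path is a placeholder (it is not given in the question) and must be replaced with the real location of public_html. Apache needs a restart or reload after the change.

```apache
# Hypothetical path: substitute the directory that contains public_html.
<Directory "/path/to/public_html">
    # Minimum needed for the posted .htaccess:
    #   FileInfo -> mod_rewrite directives
    #   Indexes  -> DirectoryIndex
    #   Options  -> Options
    AllowOverride FileInfo Indexes Options
</Directory>
```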
RewriteCond %{REQUEST_FILENAME} !-d
RewriteCond %{REQUEST_FILENAME}\.php -f
RewriteRule ^(.+)$ $1.php [L]
These directives are not strictly correct. Whilst they may work OK for the URLs you are testing, they would result in a rewrite-loop (500 error response) if you simply append a slash to your URLs (and there is no directory by that name), eg. /home/ (or /home/<anything>). This is because your condition that tests for the presence of the .php file is not necessarily the same as the URL-path you are rewriting to. See my answer to the following question on ServerFault for a thorough explanation of this issue: https://serverfault.com/questions/989333/using-apache-rewrite-rules-in-htaccess-to-remove-html-causing-a-500-error
Also, there's no need to check that the request does not map to a directory to then check if the request + .php extension maps to a file. If the request maps to a file then it can not also be a directory, so if the 2nd condition is true, the 1st condition must also be true and is therefore superfluous.
And there's no need to backslash-escape literal dots in the RewriteCond TestString - this is an "ordinary" string, not a regex.
So, these directives should be written like this instead:
RewriteCond %{DOCUMENT_ROOT}%{REQUEST_URI}.php -f
RewriteRule (.+) $1.php [L]
(RewriteBase should not be used here.)
You can further optimise this by excluding requests that already contain what looks like a file extension (assuming your URLs that need rewriting do not contain a dot near the end of the URL-path). For example:
RewriteCond %{DOCUMENT_ROOT}%{REQUEST_URI}.php -f
RewriteRule !\.\w{2,4}$ %{REQUEST_URI}.php [L]
(With this 2nd version, it does not matter if RewriteBase is set - it is not used.)
DirectoryIndex home.php index.php
You gave an example URL of /public_html/home (to which .php is appended). However, this DirectoryIndex directive allows home.php to also be served when simply requesting the directory /public_html/. It should be one or the other, not both.
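Putting the points above together, the .htaccess in public_html might end up looking something like the sketch below. Keeping only index.php in DirectoryIndex is my assumption about which of the two behaviours is wanted. Adjust it if /public_html/ itself should serve home.php.

```apache
Options -Indexes

# Assumption: the bare directory serves index.php, while home.php is
# reached through the extensionless URL /public_html/home.
DirectoryIndex index.php

RewriteEngine On

# Rewrite only when <request>.php really exists, and skip anything that
# already looks like it has a file extension.
RewriteCond %{DOCUMENT_ROOT}%{REQUEST_URI}.php -f
RewriteRule !\.\w{2,4}$ %{REQUEST_URI}.php [L]
```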

List directory contents with Apache

Let me start by saying that my knowledge of Apache is almost none, so I apologize if I am not using the correct terminology.
I have a website written in Vue, and the routing is taken care of by Vue Router. In their documentation, they specify that in order for the router to work correctly, you have to put this in the .htaccess file of your website:
<IfModule mod_rewrite.c>
RewriteEngine On
RewriteBase /
RewriteRule ^index\.html$ - [L]
RewriteCond %{REQUEST_FILENAME} !-f
RewriteCond %{REQUEST_FILENAME} !-d
RewriteRule . /index.html [L]
</IfModule>
From what I have been able to understand, all requests are sent back to the index.html file which will take care of loading the correct component based on the path.
My goal is to now allow my website to have a path (let's say /documents) which is not picked up by Vue, but instead shows the contents of the directory and allows you to both navigate and download the contents (Like this).
I have tried a few ways, but they all return a 403 or 500 (possibly due to a mistake in my config). I understand that I need to add a RewriteRule but all of those that I tried return weird errors.
Thanks in advance
You can have multiple sets of rewrite rules, depending on what the RewriteBase is. In your current set, the rules apply to the root of the host.
You can add another rule with RewriteBase /documents/. More info: What does RewriteBase do and how to use it?
I recommend reading the docs: https://httpd.apache.org/docs/2.4/mod/mod_rewrite.html
The RewriteCond directive defines a rule condition.
So here a dirty explanation:
RewriteCond %{REQUEST_FILENAME} !-f
RewriteCond %{REQUEST_FILENAME} !-d
So your RewriteConds say that if the given path/URL is not a file (!-f) and not a directory (!-d), then the next rewrite rule (RewriteRule . /index.html [L]) takes effect.
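RewriteRule . /index.html [L]
"." matches any single character, so every non-empty URL-path matches and is internally rewritten (not redirected) to index.html.
The [L] flag stops processing of any further rules (https://httpd.apache.org/docs/2.4/rewrite/flags.html#flag_l)
The RewriteRule ^index\.html$ - [L] stops processing if the URL is already index.html.
So your rewrite rules already fulfill your requirements and seem correct: an existing directory such as /documents is skipped by the !-d condition.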
When you get a 403 you maybe need to add Options +Indexes to your config or htaccess.
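For illustration, here is a sketch of how the two pieces might fit together. The explicit pass-through rule for /documents is my addition and is not strictly needed, since the !-d condition already skips existing directories, but it makes the intent visible. It assumes mod_autoindex is loaded and that AllowOverride permits Options.

```apache
# Root .htaccess (sketch): the Vue Router rules plus an explicit pass-through.
<IfModule mod_rewrite.c>
  RewriteEngine On
  RewriteBase /
  # Assumption: the listing lives under /documents. "-" means "no
  # substitution", so these requests are never handed to index.html.
  RewriteRule ^documents(/|$) - [L]
  RewriteRule ^index\.html$ - [L]
  RewriteCond %{REQUEST_FILENAME} !-f
  RewriteCond %{REQUEST_FILENAME} !-d
  RewriteRule . /index.html [L]
</IfModule>

# documents/.htaccess (a separate file) would then contain:
#   Options +Indexes
```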
In the end, after looking through the docs, I was not able to understand how to set it up. I found this page, and using option #2 I was able to get the directory to at least show up.
I then added the auth to the folder through the .htaccess file and added the .htpasswd file with the username/password combo
TLDR
Create the folder in the location you want. In my case it was in httpdocs/documents
Create a .htaccess file where you put the following contents:
# Omit this section if you do not need the auth
AuthType Basic
AuthName "restricted area"
AuthUserFile /path/to/your/.htpasswd
require valid-user
Order allow,deny
Allow from all
Options +Indexes
Create the .htpasswd file in the location you specified above. To generate the username/password combo I used this
Any corrections are welcome!
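One possible correction, offered as a sketch: Order and Allow are Apache 2.2 access-control directives and on Apache 2.4 only work through mod_access_compat. Assuming Apache 2.4, the same .htaccess could probably be reduced to the following, since Require valid-user already decides who gets in.

```apache
# Omit the auth section if you do not need it.
AuthType Basic
AuthName "restricted area"
AuthUserFile /path/to/your/.htpasswd
Require valid-user

# Let mod_autoindex generate the directory listing.
Options +Indexes
```

Without the auth section, Apache 2.4 would use Require all granted in place of the Order/Allow pair.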

Using mod_rewrite for clean URLs with Let's Encrypt

I have enabled Let's Encrypt on a server running Apache on Ubuntu 14.04 and used the auto option to re-direct all http requests to https. This is working fine.
However, I now want to use mod_rewrite for clean URLs on my site. All I need to do is remove the .php extension from all filenames (e.g. https://example.com/contact routes to https://example.com/contact.php).
I have tried adding the following rewrite rule to the .htaccess file:
RewriteEngine on
RewriteCond %{REQUEST_FILENAME} !-d
RewriteCond %{REQUEST_FILENAME}\.php -f
RewriteRule ^(.*)$ $1.php
This configuration works fine on my localhost setup (without SSL) but does not work on the instance running Let's Encrypt.
I have tested that the .htaccess is working by adding this rule which works as expected (redirecting all www requests to the root domain)
RewriteEngine On
RewriteBase /
RewriteCond %{HTTP_HOST} ^www\.(.*)$ [NC]
RewriteRule ^(.*)$ http://%1/$1 [R=301,L]
I suspect that there may be some conflict between the Let's Encrypt auto setup option and my mod_rewrite rule, but I am stuck as to how to make them both work together.
Any help would be much appreciated.
Disable MultiViews in .htaccess:
Options -MultiViews
MultiViews (part of mod_negotiation) is likely causing a conflict, because it does something very similar to what you are trying to achieve with mod_rewrite. With MultiViews enabled (possibly in the server config, although it is disabled by default), a request for /filename makes Apache search that directory for files with the same basename, essentially trying the available extensions, and serve the one that best matches the request (returning the appropriate mime-type).
I have checked what REQUEST_FILENAME is returning - it is the path to the filename (e.g. [REQUEST_FILENAME] => /var/www/sitename/public_html/output.php)
Yeah, that's the problem. MultiViews has already "fixed" the URL (output to output.php) before mod_rewrite has been able to do its thing.
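As a sketch, the complete .htaccess with MultiViews switched off might look like this. It assumes AllowOverride permits Options for this directory. The [L] flag is my addition and the backslash before .php is dropped, because the RewriteCond test string is a plain string rather than a regex.

```apache
# Stop mod_negotiation from resolving /contact to contact.php by itself.
Options -MultiViews

RewriteEngine On
# Not a directory, and <request>.php exists on disk...
RewriteCond %{REQUEST_FILENAME} !-d
RewriteCond %{REQUEST_FILENAME}.php -f
# ...so append .php internally.
RewriteRule ^(.*)$ $1.php [L]
```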

Rewriting URL over a file .htaccess

Here is my problem.
I know how to rewrite a URL only if the file doesn't exist.
But I came across a problem that I have never encountered before.
Given the URL : http://www.my-host.com/agences/my-agencies
With at the directory root 2 files :
agences.php
.htaccess
In the .htaccess :
RewriteEngine on
RewriteCond %{REQUEST_FILENAME} !-f
RewriteCond %{REQUEST_FILENAME} !-d
RewriteRule ^agences/(.*) /agences.php?agence=$1
This does not redirect to the /agences.php and is not even interpreted.
If I change the RewriteRule by:
RewriteRule ^agences/(.*) $1
It doesn't even process the rewrite rules.
The same happens even if I prepend a slash to the pattern, like this:
RewriteRule ^/agences/(.*) $1
I am running Apache 2.4.10, with AllowOverride All configured in the vhost.
Thanks for the help.
Add this at the beginning of the .htaccess file:
Options -MultiViews
The effect of MultiViews is as follows: if the server receives a
request for /some/dir/foo, if /some/dir has MultiViews enabled, and
/some/dir/foo does not exist, then the server reads the directory
looking for files named foo.*, and effectively fakes up a type map
which names all those files, assigning them the same media types and
content-encodings it would have if the client had asked for one of
them by name. It then chooses the best match to the client's
requirements. http://httpd.apache.org/docs/2.0/en/content-negotiation.html
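Applied to the question, the .htaccess would then look roughly like the sketch below. With MultiViews on, a request for /agences/my-agencies is resolved to agences.php before the rules run, so the !-f condition fails. The [L] flag and the trailing $ are small additions of mine.

```apache
# Prevent /agences/... from being auto-resolved to agences.php.
Options -MultiViews

RewriteEngine on
RewriteCond %{REQUEST_FILENAME} !-f
RewriteCond %{REQUEST_FILENAME} !-d
# Hand everything below /agences/ to the script as the "agence" parameter.
RewriteRule ^agences/(.*)$ /agences.php?agence=$1 [L]
```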

Apache DirectorySlash Off - Site breaks

If I set DirectorySlash Off in my .htaccess file and call the directory without the trailing slash, I get a 403 Forbidden from my server. If I call it with the slash, everything works fine.
Could anyone explain why? Here is my fully anonymized .htaccess:
# GLOBAL CONFIG
Options +FollowSymlinks
DirectorySlash Off
AddDefaultCharset utf-8
php_value post_max_size 256M
php_value upload_max_filesize 256M
# BEGIN WordPress
RewriteEngine On
RewriteBase /folder/
RewriteRule ^index\.php$ - [L]
RewriteCond %{REQUEST_FILENAME} !-f
RewriteCond %{REQUEST_FILENAME} !-d
RewriteRule . /folder/index.php [L]
# END WordPress
# REMOVE WWW
RewriteCond %{HTTP_HOST} ^([^.]+)\.domain\.com$ [NC]
RewriteRule ^(.*)$ http://domain.com$1 [R=301,L]
As you know per the documentation, when DirectorySlash is set to Off, requests to /folder do not have DirectoryIndex evaluated. This means that the request will not be automatically mapped to /folder/index.php.
mod_dir performs this check in the "fixup" phase of the request processing. mod_rewrite, which is responsible for your RewriteRule definitions, also performs its processing in this phase when you specify the rules in a .htaccess file.
However, it was programmed with an awareness of modules like mod_dir, and includes a check to make sure that the current directory was requested with a trailing slash. If not, it declines to handle the request, since doing so might lead to undefined behaviour.
The request then moves on to the content-generation phase, which, since the request was not mapped to a real file, is handled by mod_autoindex. Given that Indexes are disabled on your host by default, mod_autoindex returns 403 Forbidden which is what you see.
Note that since DirectoryIndex is not evaluated, even if mod_rewrite were to process the request, it would still fail, because no auto-resolution to index.php would occur, and your rule
RewriteRule . /folder/index.php [L]
wouldn't match, because the . requires a match on something (but the request would be blank).
With DirectorySlash enabled, the request is first corrected to include the trailing slash, so none of the failures described above occur. The empty-path case from the last note is covered as well, because DirectoryIndex then maps the request to index.php anyway.
With Apache 2.4 you can allow rewrites in .htaccess files by setting RewriteOptions AllowNoSlash.
Changes with Apache 2.3.16
...
*) mod_rewrite: Add the AllowNoSlash RewriteOption, which makes it possible
for RewriteRules to be placed in .htaccess files that match the directory
with no trailing slash. PR 48304.
[Matthew Byng-Maddick <matthew byng-maddick bbc.co.uk>]
...
See Apache documentation of mod_rewrite
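An untested sketch of how this could be applied to the .htaccess from the question, assuming Apache 2.3.16 or later. The extra rule for the empty path is my assumption, added because, as noted in the other answer, the "." pattern cannot match a blank request.

```apache
DirectorySlash Off

RewriteEngine On
# Allow the rules below to run for "/folder" without a trailing slash.
RewriteOptions AllowNoSlash
RewriteBase /folder/

# Assumption: the bare directory should be served by index.php. The path
# seen by the rule is empty in that case, so match it explicitly.
RewriteRule ^$ index.php [L]

RewriteRule ^index\.php$ - [L]
RewriteCond %{REQUEST_FILENAME} !-f
RewriteCond %{REQUEST_FILENAME} !-d
RewriteRule . /folder/index.php [L]
```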
I think it is because turning DirectorySlash off disables the automatic correction of the URL, so Apache tries to show the directory listing. Fortunately you have probably disabled listings somewhere (or through file permissions), so it sends a 403 Forbidden. I guess that when you turn it on, it works normally.
From what I understand from the docs, turning DirectorySlash off is not recommended for security reasons.
http://httpd.apache.org/docs/2.1/mod/mod_dir.html
As Tom already answered, there is a special option for RewriteOptions, but only in Apache 2.3.16+. If you, like me, have an older version of Apache, you cannot rewrite the URL for the directory itself, because Apache does not associate the request with that directory.
Example:
"GET /somedir" will point to <Directory /var/www/html/public> in the rewrite log, but(!) the requested filename (%f) in the access log will still be /var/www/html/public/somedir/, which is crazy Apache logic. Apache will then show you either a 403 (without Options +Indexes) or a directory listing (with it) containing wrong URLs such as /subdir/ instead of /somedir/subdir/.
So, I've found only one worked solution for me - using aliases:
AliasMatch "/somedir$" "/var/www/html/public/somedir/index.html"
Hope this helps someone else in 2020+ :D