htaccess rewrites causing side effects - apache

I have the following in my .htaccess file.
RewriteEngine On
RewriteRule ^([^/]+)?$ /member/profile.php?user=$1 [L]
RewriteRule ^assets(/.*)?$ /member/assets$1 [L]
RewriteRule ^images(/.*)?$ /member/images$1 [L]
RewriteRule ^php(/.*)?$ /member/php$1 [L]
The desired effect is:
https://example.com/username -> https://example.com/member/profile.php?user=$1
This works. However, it has two undesired side effects.
First: https://example.com and https://example.com/ return 404 errors but https://example.com/index.php works just fine.
Second: https://example.com/username/ ends up forwarding to https://example.com/member/php/?user=username and returning a 404 error.
I have also attempted
DirectoryIndex index.htm index.html index.php
But this seems to have no effect on the issue
My actual desired end result would look more like:
https://example.com -> https://example.com/index.php
https://example.com/ -> https://example.com/index.php
https://example.com/username -> https://example.com/member/profile.php?user=$1
https://example.com/username/ -> https://example.com/member/profile.php?user=$1

RewriteRule ^([^/]+)?$ /member/profile.php?user=$1 [L]
First: https://example.com and https://example.com/ return 404 errors but https://example.com/index.php works just fine.
The first rule will catch the request (since it allows an empty URL-path) and will rewrite the request to /member/profile.php?user=. So, presumably it is your script that is triggering the 404?
In fact, it looks like you are missing a slash before ? to match an optional trailing slash (ie. /username or /username/), rather than making the entire pattern optional! ie. ^([^/]+)/?$
You would also need the NS (nosubreq) flag to prevent the subrequest by mod_dir for the DirectoryIndex (ie. index.php) also being caught by this rule. However, this rule is arguably matching too much, as it will also catch direct requests for index.php (and any other files you might have in the root). So, maybe you need to be more restrictive in what characters are allowed in usernames? For example, at a minimum, exclude dots (as well as slashes) with ^([^/.]+)/?$? Or allow only letters and numbers (and underscores), eg. ^(\w+)/?$. (\w is a shorthand character class that represents [0-9a-zA-Z_].)
Note that the first rule will also match assets, images and php - so these are valid usernames. Is that intentional? You could reverse the rules so this does not happen, but you would need to ensure that there are no usernames that match these strings.
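To compare the three candidate patterns side by side, here is a minimal Python sketch. It uses Python's re module as a stand-in for Apache's regex engine (an assumption, though these simple patterns should behave the same way), and the sample paths are made up for illustration. In a root .htaccess file the leading slash is already stripped before matching, so a request for / is seen as an empty string.

```python
import re

# re.ASCII keeps \w limited to [0-9a-zA-Z_], as described in the answer.
ORIGINAL = re.compile(r"^([^/]+)?$")          # pattern from the question
TRAILING_SLASH = re.compile(r"^([^/]+)/?$")   # optional trailing slash instead
WORD_ONLY = re.compile(r"^(\w+)/?$", re.ASCII)  # letters, digits, underscore only


def matches(pattern, path):
    """Return True if the pattern matches the per-directory URL-path."""
    return pattern.search(path) is not None


# Hypothetical request paths, leading slash already removed.
SAMPLE_PATHS = ["", "username", "username/", "index.php", "assets"]

for path in SAMPLE_PATHS:
    print(
        repr(path),
        matches(ORIGINAL, path),
        matches(TRAILING_SLASH, path),
        matches(WORD_ONLY, path),
    )
```

The original pattern accepts the empty path (the homepage) and rejects username/, which is the reverse of what is wanted. Only the \w version leaves index.php alone, and all three still treat assets as a username.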
NB: https://example.com and https://example.com/ are exactly the same request. (The browser effectively appends the slash after the hostname to make a valid HTTP request. See the following question on the Webmasters stack: Is trailing slash automagically added on click of home page URL in browser?)
Second: https://example.com/username/ ends up forwarding to https://example.com/member/php/?user=username and returning a 404 error.
I can't see how that would happen with the directives as posted. None of your rules would match /username/ (with a trailing slash), unless the username is "assets", "images" or "php" - but that still wouldn't result in the stated rewrite? However, /username/ would result in a 404 because nothing actually happens to rewrite the URL!
Your rules should perhaps be written like this instead:
RewriteEngine On
RewriteRule ^(\w+)/?$ member/profile.php?user=$1 [L]
RewriteRule ^assets(/.*) member/assets$1 [L]
RewriteRule ^images(/.*) member/images$1 [L]
RewriteRule ^php(/.*) member/php$1 [L]
The capturing subpattern in rules 2, 3 and 4 is not optional, so I've removed the trailing ?$.
I've also removed the slash prefix on the substitution string, to make it a relative file-path.
Which could also be further "simplified" to:
RewriteEngine On
RewriteBase /member
RewriteRule ^(\w+)/?$ profile.php?user=$1 [L]
RewriteRule ^((assets|images|php)/.*) $1 [L]

Related

SilverStripe 4 - Remove slash at the end of the URL

I've been trying for some time to delete the slash at the end of the URL link, but it doesn't work. I searched a lot of examples but none of them solve my problem.
I'm using Silverstripe 4 and currently running on a local server.
I have to remove the slash for SEO reasons.
My current URL is:
www.example.com/
// Need to be like below
www.example.com
I tried via .htaccess
Example from a Stack Overflow question
I put this in /public/.htaccess
RewriteCond %{REQUEST_FILENAME} !-d
RewriteRule ^(.*)/$ /$1 [L,R] # <- for test, for prod use [L,R=301]
but when I visit the homepage, the slash is still there at the end.
I tried via code in SiteTree
public function Link($action = null)
{
    return rtrim(parent::Link($action), '/');
}
The above code removes the trailing slash from all pages, but on the homepage it is still there.
www.example.com/about-us (here removed)
www.example.com/ (here exists)
I also tried via the config file
Director::config()->set('alternate_base_url', rtrim(Environment::getEnv('SS_BASE_URL'), '/'));
But again the slash is still at the end of the URL on the homepage.
Does anyone have a solution for this? Thanks!
Here is my full htaccess file
<IfModule mod_rewrite.c>
    # Turn off index.php handling requests to the homepage fixes issue in apache >=2.4
    <IfModule mod_dir.c>
        DirectoryIndex disabled
        DirectorySlash On
    </IfModule>
    SetEnv HTTP_MOD_REWRITE On
    RewriteEngine On
    # Enable HTTP Basic authentication workaround for PHP running in CGI mode
    RewriteRule .* - [E=HTTP_AUTHORIZATION:%{HTTP:Authorization}]
    # Deny access to potentially sensitive files and folders
    RewriteRule ^vendor(/|$) - [F,L,NC]
    RewriteRule ^\.env - [F,L,NC]
    RewriteRule silverstripe-cache(/|$) - [F,L,NC]
    RewriteRule composer\.(json|lock) - [F,L,NC]
    RewriteRule (error|silverstripe|debug)\.log - [F,L,NC]
    # Process through SilverStripe if no file with the requested name exists.
    RewriteCond %{REQUEST_FILENAME} !-f
    RewriteRule .* index.php
    # REMOVE SLASH AT THE END OF THE URL
    RewriteCond %{REQUEST_FILENAME} !-d
    RewriteRule ^(.*)/$ /$1 [L,R] # <- for test, for prod use [L,R=301]
</IfModule>
www.example.com/
// Need to be like below
www.example.com
What you are trying to do is not possible! You are trying to remove the slash at the start of the URL-path, immediately after the hostname. This is not the same as the slash at the end of the URL-path.
There is always a slash at the start of the URL-path, even if you don't always see this in the browser's address bar (the browser often "prettifies" the URL you see in the address bar). This is necessary in order to form a valid HTTP request.
Whether you request www.example.com (no slash) or www.example.com/ (with slash), the user-agent/browser actually makes the exact same request to the server, ie. www.example.com/ (note that Google Chrome never displays the trailing slash after the hostname in the address bar, aka "omnibox", even if you type it in). If you look at the first line of the HTTP request headers you will see something like the following in both cases:
GET / HTTP/1.1
Note the first slash (delimited by spaces) - that represents the URL-path. It is not valid to have nothing here (eg. GET HTTP/1.1 is not a valid HTTP request).
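As a rough illustration of why both URLs produce the same request, here is a small Python sketch that builds the request line a user-agent would send. The function name request_line is made up for this example, and it only mimics the normalization described above rather than reproducing any particular browser's code.

```python
from urllib.parse import urlsplit


def request_line(url, method="GET"):
    """Build the first line of the HTTP request for an absolute URL."""
    parts = urlsplit(url)
    # An empty URL-path is not valid in a request line, so "/" is sent instead.
    target = parts.path or "/"
    if parts.query:
        target += "?" + parts.query
    return f"{method} {target} HTTP/1.1"


print(request_line("https://www.example.com"))
print(request_line("https://www.example.com/"))
print(request_line("https://www.example.com/about-us/"))
```

The first two calls give the same request line, so the server has no way to tell them apart. The third shows a trailing slash at the end of a non-empty URL-path, which is an ordinary character that a redirect can remove.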
This is different to removing the slash at the end of the URL-path, eg. www.example.com/about-us/ to www.example.com/about-us. In this case the trailing slash is just another character. (Although there is naturally a complication when the URL-path maps to a physical directory since mod_dir will (by default) always append a trailing slash in this instance.)
See also my answer to the following question on the Webmasters Stack for more detail:
https://webmasters.stackexchange.com/questions/35643/is-trailing-slash-automagically-added-on-click-of-home-page-url-in-browser
Further reference:
https://www.rfc-editor.org/rfc/rfc2616#section-5.1.2
I tried via .htaccess
Attempting to remove the slash from the start of the URL-path will result in a redirect loop since the user-agent will correct the request each time. For the request to have reached your server then there must have been a slash at the start of the URL-path (ie. after the hostname).
RewriteCond %{REQUEST_FILENAME} !-d
RewriteRule ^(.*)/$ /$1 [L,R]
The homepage, ie. document root, is a directory so the condition fails and the rule is not processed.
But the pattern ^(.*)/$ will only successfully match a non-empty URL-path. And /$1 naturally redirects with a slash prefix. To omit the slash you would need to do something like this:
# DON'T DO THIS
RewriteRule ^$ https://www.example.com [R,L]
But this is nonsense and will result in a redirect loop, for the reasons mentioned above.

Mod Rewrite -- redirect all content from subdirectory

I have a scenario where there is a site whose subdirectories, content, etc. were originally in a subdirectory, /main.
The site and all content have been moved back to the root and are working fine.
We need a rewrite so that any HTTP call to /main/, /main/page1, /main/page2, etc. is redirected back to the / directory while keeping the rest of the URI (/page1, /page2, etc.).
This is what we have so far
RewriteCond %{REQUEST_URI} ^/main/.*
RewriteRule ^/main/(.*) /$1 [L]
Any comments welcome
Thanks very much
In .htaccess context, the URL that is matched against the first argument of RewriteRule doesn't include a leading slash and doesn't include the query string. A pattern with a leading slash will therefore never match. In your case the RewriteCond is unnecessary, as it matches exactly what the RewriteRule would match. Change your rule to the following and it should work. Please note that this is an internal rewrite (the client won't see this change). If you need a redirect (so the client displays the URL without main in the address bar), add the [R] flag to the rule.
RewriteRule ^main/(.*)$ $1 [L]
See the documentation.
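The effect of the stripped leading slash can be sketched in Python. This is only an approximation: per_directory_path is a hypothetical helper that imitates the prefix stripping mod_rewrite performs for a .htaccess file in the document root, and Python's re module stands in for Apache's regex engine.

```python
import re


def per_directory_path(url_path, directory_prefix="/"):
    """Strip the directory prefix, as happens before matching in .htaccess."""
    if url_path.startswith(directory_prefix):
        return url_path[len(directory_prefix):]
    return url_path


WITH_SLASH = re.compile(r"^/main/(.*)")     # pattern from the question
WITHOUT_SLASH = re.compile(r"^main/(.*)$")  # pattern from the answer

path = per_directory_path("/main/page1")    # what the RewriteRule actually sees

print(path)
print(WITH_SLASH.search(path))
print(WITHOUT_SLASH.search(path).group(1))
```

The pattern with the leading slash never matches, because the string it is tested against is main/page1, while the corrected pattern captures page1 for use as $1.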

.htaccess mod_rewrite linking to wrong page

I have in my .htaccess the following code:
RewriteEngine On
RewriteRule ^/?([^/\.]+)/?$ $1.php [L]
RewriteRule ^/?([^/\.]+).php$ $1/ [R,L]
RewriteRule ^/?([^/\.]+)/?$ $1.php [L] is working fine. What this is doing is taking a url like http://www.example.com/whatever and making it read the page as http://www.example.com/whatever.php.
However, what I'd like to be able to do is take a URL like http://www.example.com/whatever.php and automatically send it to http://www.example.com/whatever, hence the second line of the code. However, this isn't working. What it's doing now is, as soon as it comes across a link ending in .php, the URL becomes http://localhost/C:/Sites/page/whatever/, and it pulls up a 403: Forbidden page.
All I want to know is what I can do so that http://www.example.com/whatever.php will be read as http://www.example.com/whatever, and that if http://www.example.com/whatever.php is entered into the URL bar, it will automatically redirect to http://www.example.com/whatever.
Does that make any sense?
EDIT
Ok, so it appears I wasn't all too clear.. basically, I want /whatever/ to read as whatever.php while the URL still stays as /whatever/, right? However, if the URL was /whatever.php, I want it to actually redirect the users URL to /whatever/, and then once again read it as whatever.php. Is this possible?
If your rules are inside an .htaccess file, you can omit the leading slash when you match against a URI:
RewriteRule ^([^/\.]+)/?$ /$1.php [L]
Also note that a leading slash is included in the target (/$1.php); this makes sure /whatever/ gets rewritten to /whatever.php. When you redirect without this leading slash, Apache prepends the document root to the target. Thus /whatever.php gets redirected to a URL containing the document root, C:/Sites/page/whatever/. Even if you include the leading slash, this will never work, because you're going to cause a redirect loop:
1. Enter "http://www.example.com/whatever.php" in your address bar
2. Apache redirects you to "http://www.example.com/whatever/"
3. Apache gets the URI whatever/ and applies the first rule and the URI gets rewritten to /whatever.php
4. The URI gets put through the rewrite engine again
5. The URI /whatever.php matches the second rule and redirects the browser to "http://www.example.com/whatever/"
6. Repeat steps 3-5
You need to add a condition that the actual request is for /whatever.php:
RewriteCond %{THE_REQUEST} ^(GET|POST|HEAD)\ /([^/\.]+)\.php
RewriteRule ^ /%2/ [R,L]
So altogether, you'll have:
RewriteEngine On
RewriteRule ^([^/\.]+)/?$ /$1.php [L]
RewriteCond %{THE_REQUEST} ^(GET|POST|HEAD)\ /([^/\.]+)\.php
RewriteRule ^ /%2/ [R,L]
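The loop and the THE_REQUEST fix can be simulated with a short Python sketch. This is a simplified model, not how Apache is implemented: handle is a hypothetical function, the rules are hard-coded, and the only behavior it reproduces is that an internal rewrite sends the new path through the rules again while the original request line never changes.

```python
import re

REWRITE = re.compile(r"^([^/.]+)/?$")                         # first rule
NAIVE_REDIRECT = re.compile(r"^([^/.]+)\.php$")               # second rule from the question
THE_REQUEST = re.compile(r"^(GET|POST|HEAD) /([^/.]+)\.php")  # condition from the answer


def handle(request_line, guarded, max_passes=10):
    """Return ("serve", path) or ("redirect", location) for one HTTP request."""
    _method, target, _protocol = request_line.split(" ")
    path = target.lstrip("/")  # .htaccess patterns see no leading slash
    for _ in range(max_passes):
        match = REWRITE.search(path)
        if match:
            path = match.group(1) + ".php"
            continue  # [L] ends this pass, then the rules run again on the new path
        if guarded:
            # The condition looks at the unchanged request line, not the rewritten path.
            match = THE_REQUEST.search(request_line)
            if match:
                return ("redirect", "/" + match.group(2) + "/")
        else:
            match = NAIVE_REDIRECT.search(path)
            if match:
                return ("redirect", "/" + match.group(1) + "/")
        return ("serve", "/" + path)
    raise RuntimeError("too many internal passes")


print(handle("GET /whatever/ HTTP/1.1", guarded=False))
print(handle("GET /whatever/ HTTP/1.1", guarded=True))
print(handle("GET /whatever.php HTTP/1.1", guarded=True))
```

Without the condition, a request for /whatever/ is answered with a redirect to /whatever/, which is the loop. With the condition, /whatever/ is served from /whatever.php and only a direct request for /whatever.php is redirected.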
You're making a relative path substitution in a per-directory context (.htaccess is a per-directory context). This requires RewriteBase. Per-directory rewrites are done in a later stage of processing, when URLs have already been mapped to paths. But the rewrite must produce a URL, which is processed again. I think that without RewriteBase to supply the URL prefix, you end up with a filesystem prefix instead of the URL. That may be why you're getting the C:/Sites thing. Try RewriteBase. But even with a correct RewriteBase supplying the URL prefix to put in front of the relative rewritten part, I'm afraid you will have a rewrite loop, because you're rewriting whatever.php to whatever, and whatever to whatever.php.
Reference: http://httpd.apache.org/docs/current/rewrite/tech.html

Apache rewrite exception for / index but not second and deeper indexes?

I want to have an exception for...
http://localhost/
...while rewriting the index of any directories underneath...
http://localhost/directory1/
http://localhost/directory2/
Having an empty item below (the first item, between the characters ( and | on the third line) creates an exception for ALL indexes. How can I make the exception NOT apply to localhost/ itself using this copy of .htaccess?
http://localhost/.htaccess
RewriteEngine on
RewriteRule ^(|directory2/|directory2/) - [L]
RewriteRule !\.(css|xhtml|xml|zip)$ rewrite.php
...and I can not mess with server configuration. Additionally this question is not redirect related.
Try using a RewriteCond to match your directory, and then apply the RewriteRule to anything that matches.
EDIT: Also, I think your ! line might be causing some problems. I tested with the rewrite rule tester and tweaked my suggested fix to look like this:
RewriteCond %{REQUEST_URI} ^/(directory1|directory2)/
RewriteRule \.(css|xhtml|xml|zip)$ - [L]
RewriteRule .* rewrite.php
This is generally how I match things -- if you have some things you don't want to process, match them and stop processing rules with the [L] directive, then continue ahead for anything else.
This rule makes an exception for the root index. Since there is nothing between ^ (starts with) and $ (ends with), the pattern matches only an empty URL-path, that is, the request for http://localhost/ exactly. The substitution index.php after the space (change the extension to whatever you use) rewrites the request to the index file itself. It does not appear to loop.
RewriteRule ^$ index.php [QSA]

How to prevent mod_rewrite from rewriting URLs more than once?

I want to use mod_rewrite to rewrite a few human-friendly URLs to arbitrary files in a folder called php (which is inside the web root, since mod_rewrite apparently won't let you rewrite to files outside the web root).
/ --> /php/home.php
/about --> /php/about_page.php
/contact --> /php/contact.php
Here are my rewrite rules:
Options +FollowSymlinks
RewriteEngine On
RewriteRule ^$ php/home.php [L]
RewriteRule ^about$ php/about_page.php [L]
RewriteRule ^contact$ php/contact.php [L]
However, I also want to prevent users from accessing files in this php directory directly. If a user enters any URL beginning with /php, I want them to get a 404 page.
I tried adding this extra rule at the end:
RewriteRule ^php php/404.php [L]
...(where 404.php is a file that outputs 404 headers and a "Not found" message.)
But when I access / or /about or /contact, I always get redirected to the 404. It seems the final RewriteRule is applied even to the internally rewritten URLs (as they now all start with /php).
I thought the [L] flag (on the first three RewriteRules) was supposed to prevent further rules from being applied? Am I doing something wrong? (Or is there a smarter way to do what I'm trying to do?)
The [L] flag should be used only in the last rule.
L (Last Rule) stops the rewriting process here and doesn't apply any more rewriting rules, and because of that you are facing issues.
I had a similar problem. I have a content management system written in PHP and based on the Model-View-Controller paradigm. Its most basic part is mod_rewrite. I've successfully prevented access to PHP files globally. The trick is called THE_REQUEST.
What's the problem?
The rewrite module rewrites the URI. If the URI matches a rule, it is rewritten and the other rules are applied to the new, rewritten URI. But! If the matched rule ends with [L], the engine doesn't in fact terminate, but starts again. The new URI then no longer matches the rule ending with [L], so processing continues and it matches the last rule. The result? The programmer starts swearing at the unexpected 404 error page. However, the computer does what you say, not what you want. I had this in my .htaccess file:
RewriteEngine On
RewriteBase /
RewriteRule ^plugins/.* pluginLoader.php [L]
RewriteCond %{REQUEST_URI} \.php$
RewriteRule .* index.php [L]
That's wrong. Even the URIs beginning with plugins/ are rewritten to index.php.
Solution
You need to apply the rule if and only if the original (not rewritten) URI matches the rule. Regrettably, mod_rewrite does not provide a variable containing the original URI, but it does provide the THE_REQUEST variable, which contains the first line of the HTTP request. This variable is invariant: it doesn't change while the rewrite engine is working.
...
RewriteCond %{THE_REQUEST} \s.*\.php\s
RewriteRule \.php$ index.php [L]
The regular expression is different. It is applied not to the URI alone but to the entire first line of the request, that is, to something like GET /script.php HTTP/1.1. This time the critical rule is applied only if the user is explicitly requesting a PHP script directly. The rewritten URI is not used.
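To check what the THE_REQUEST condition does and does not catch, here is a minimal Python sketch. It assumes Python's re module behaves like Apache's regex engine for this pattern, and the request lines are made-up examples.

```python
import re

# Same expression as in the RewriteCond above.
DIRECT_PHP = re.compile(r"\s.*\.php\s")


def is_direct_php_request(request_line):
    """Return True if the original request line asks for a .php file."""
    return DIRECT_PHP.search(request_line) is not None


# Hypothetical request lines as they would appear in THE_REQUEST.
print(is_direct_php_request("GET /script.php HTTP/1.1"))
print(is_direct_php_request("GET /php/home.php HTTP/1.1"))
print(is_direct_php_request("GET /about HTTP/1.1"))
print(is_direct_php_request("GET /script.php?x=1 HTTP/1.1"))
```

A request for /about is not flagged even after it has been internally rewritten to a .php file, because the request line still reads /about. Note that in this sketch a direct request with a query string is not flagged either, since .php is then followed by ? rather than whitespace, so the pattern may need adjusting if that case matters.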