mod_rewrite: hide real urls but keep available as different files - apache

Possibly this question has already been answered, but I didn't find an answer after hours of searching.
I need to put the site into "maintenance mode" and redirect/rewrite all requests to site_down.html, but at the same time I need the site to remain available when I enter the address as if the files were in a subfolder.
For example:
If I type http://example.com/login.php I need site_down.html to be displayed,
but if I request http://example.com/test/login.php I need the real login.php to be displayed.
I need this to be done with rewrite, so copying everything to another directory isn't a solution.
I tried a couple of dozen combinations, but I'm still unable to achieve what I need.
This is one version of my .htaccess file:
DirectoryIndex site_down.html
<IfModule mod_rewrite.c>
RewriteEngine On
RewriteRule ^test\/(.*)$ $1 [S=1]
RewriteRule ^(.*\.php)$ site_down.html
RewriteRule .* - [L]
</IfModule>
This code should rewrite all requests for "test/*" to the parent folder, skip the next rewrite rule, and then terminate rewriting at RewriteRule .* - [L]. If there is no "test/" in the URL, all requests should be rewritten to site_down.html.
What am I doing wrong?
Could you suggest any valid solutions, please?
Thank you.

Essentially, you are looking for two rules. One rule translates the virtual subdirectory to the working files. The other rule translates direct URLs for the working files to the splash page. We just have to make sure that if the first rule matches, the second rule doesn't. We can do this by checking that " /test/" (including the leading space) is not in THE_REQUEST, which is the string the client sent to the server to request the page, in the form GET /test/mypage.php?apes=bananas HTTP/1.1. THE_REQUEST doesn't change on a rewrite, which makes it perfect for this.
Skipping a rule like you did usually doesn't have the effect you expect, because mod_rewrite makes multiple passes through .htaccess until the resulting URL no longer changes, or until it hits a limit and throws an error. On the first pass it skips the rule, but on the second pass the URL no longer starts with test/, so nothing is skipped and the .php rule applies.
RewriteCond %{THE_REQUEST} !\ /test/
RewriteRule \.php site_down.html [L]
RewriteRule ^test/(.*)$ $1 [L]
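If the server runs Apache 2.4 or later, the END flag is another way to avoid the second pass, because it stops all further rewrite processing for the request instead of only the current pass. The following is a minimal sketch under that assumption, and it also assumes the .htaccess file sits in the document root. It is an alternative to the THE_REQUEST check, not part of the rules above.

```apache
RewriteEngine On

# Virtual /test/ prefix: serve the real file and stop rewriting entirely.
# END (Apache 2.4+) prevents the rewritten URL from being run through
# the rules again, so the splash rule below never sees it.
RewriteRule ^test/(.+)$ $1 [END]

# Any other request for a .php file gets the maintenance page.
RewriteRule \.php$ site_down.html [END]
```

With these rules, /test/login.php would be served from login.php, while a direct request for /login.php would show site_down.html.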

Related

Scalable URL Rewrite Rule

I am organizing sub-websites into different folders on a LAMP server for ease of maintenance, but do not want the end user to know they are organized into those folders via the URL.
Directory structure example:
/category1/website1
/category1/website2
/category2/website3
/category2/website4
/category2/website5
/category3/website6
/category4/website7
etc.
Currently it shows as http://www.example.com/category1/website1 however I want it to show as http://www.example.com/website1 all the time - even if they put the category name in there.
The trick is I need to do this from a .htaccess file with X amount of categories having X amount of sub-websites in them. Currently I am using
Redirect 301 /website1 /category1/website1
for each website to allow users to use the shorter link, but ultimately they end up seeing the category in the address and the .htaccess file is long with 200+ sub-websites involved. :(
Any help would be appreciated!
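Using mod_rewrite it's pretty straightforward if you are going to enter them manually (in .htaccess the pattern is matched without the leading slash, so the slash is optional here):
RewriteEngine on
RewriteRule ^/?website1$ /category1/website1 [L,NC]
RewriteRule ^/?website2$ /category1/website2 [L,NC]
It's possible to create a RewriteMap as well, but note that a RewriteMap has to be defined in the server or virtual host configuration, not in .htaccess. There's a lot of helpful information in the documentation.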
Hope that helps.
Since there is no discernible "pattern" that links the website to the category subdirectory and you are limited to .htaccess then you have no choice but to manually list all the mappings in your .htaccess file, in the same way you have listed the redirects.
Assuming you want to internally rewrite a URL of the form /website1/<something> to /category1/website1/<something> (prefix matching, in the same way the Redirect directive works), you could do something like the following in .htaccess:
RewriteEngine On
RewriteRule ^(website1/.*) /category1/$1 [L]
RewriteRule ^(website2/.*) /category1/$1 [L]
$1 is a backreference to the captured group in the RewriteRule pattern. For example, if you request /website1/foo, it is rewritten to /category1/website1/foo.
This does require that you request at least /website1/, with a trailing slash (since this is a directory of sorts).
You can potentially group sites together that are in the same category. For example:
RewriteRule ^((website3|website4|website5)/.*) /category2/$1 [L]
"I want it to show as http://www.example.com/website1 all the time - even if they put the category name in there"
For this you would need to implement the reverse... a redirect to remove the category from the URL. However, an added complication is that you only want to redirect direct requests and not rewritten requests (as above).
This will need to go before the internal rewrites above.
Unless there is a discernible pattern that distinguishes category names then you will need a rule for each category. For example:
RewriteCond %{ENV:REDIRECT_STATUS} ^$
RewriteRule ^category1/(.*) /$1 [R=302,L]
...and repeat for each category.
The check against the REDIRECT_STATUS environment variable ensures that we are only targeting direct requests and not rewritten requests.
Change 302 (temporary) to 301 (permanent) - if that is the intention - only once you have confirmed that it works OK in order to avoid caching issues.
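Putting the two parts together, a root .htaccess might be ordered as in the sketch below. It only reuses the category and website names from the question's example. The grouping of website1 and website2 under category1 follows that example's directory listing, and the list would still need one redirect block per category.

```apache
RewriteEngine On

# 1. External redirects: strip the category from URLs the client
#    requested directly. REDIRECT_STATUS is empty on the initial request
#    and set after an internal rewrite, so rewritten requests are skipped.
RewriteCond %{ENV:REDIRECT_STATUS} ^$
RewriteRule ^category1/(.*) /$1 [R=302,L]

RewriteCond %{ENV:REDIRECT_STATUS} ^$
RewriteRule ^category2/(.*) /$1 [R=302,L]

# 2. Internal rewrites: map the short URL back to the real location.
RewriteRule ^((website1|website2)/.*) /category1/$1 [L]
RewriteRule ^((website3|website4|website5)/.*) /category2/$1 [L]
```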
However, this may be easier (and more "scalable") to implement with an additional .htaccess file in each of the category subdirectories instead. And this would be the same directives in each category subdirectory:
RewriteEngine On
RewriteCond %{ENV:REDIRECT_STATUS} ^$
RewriteRule (.*) /$1 [R=302,L]
Since the URL-path matched by the RewriteRule pattern is relative to the directory containing the .htaccess file, the $1 backreference contains the URL-path less the /category/ prefix.

.htaccess mod_rewrite linking to wrong page

I have in my .htaccess the following code:
RewriteEngine On
RewriteRule ^/?([^/\.]+)/?$ $1.php [L]
RewriteRule ^/?([^/\.]+).php$ $1/ [R,L]
RewriteRule ^/?([^/\.]+)/?$ $1.php [L] is working fine. What this is doing is taking a url like http://www.example.com/whatever and making it read the page as http://www.example.com/whatever.php.
However, what I'd like to be able to do is take a URL like http://www.example.com/whatever.php and automatically send it to http://www.example.com/whatever, hence the second line of the code. However, this isn't working. What it's doing now is, as soon as it comes across a link ending in .php, the URL becomes http://localhost/C:/Sites/page/whatever/ and I get a 403 Forbidden page.
All I want to know is what I can do so that http://www.example.com/whatever.php will be read as http://www.example.com/whatever, and so that if http://www.example.com/whatever.php is entered into the URL bar, it will automatically redirect to http://www.example.com/whatever.
Does that make any sense?
EDIT
Ok, so it appears I wasn't all too clear.. basically, I want /whatever/ to read as whatever.php while the URL still stays as /whatever/, right? However, if the URL was /whatever.php, I want it to actually redirect the users URL to /whatever/, and then once again read it as whatever.php. Is this possible?
If your rules are inside an .htaccess file, you can omit the leading slash when you match against a URI:
RewriteRule ^([^/\.]+)/?$ /$1.php [L]
Also note that a leading slash is included in the target (/$1.php), this makes sure /whatever/ gets rewritten to /whatever.php. When you redirect, if you are missing this leading slash, apache prepends the document root to it. Thus /whatever.php gets redirected to the document root C:/Sites/page/whatever/. Even if you include the leading slash, this will never work because you're going to cause a redirect loop:
1. Enter "http://www.example.com/whatever.php" in your address bar.
2. Apache redirects you to "http://www.example.com/whatever/".
3. Apache gets the URI whatever/, applies the first rule, and the URI gets rewritten to /whatever.php.
4. The URI gets put through the rewrite engine again.
5. The URI /whatever.php matches the second rule and redirects the browser to "http://www.example.com/whatever/".
6. Repeat steps 3-5.
You need to add a condition that the actual request is for /whatever.php:
RewriteCond %{THE_REQUEST} ^(GET|POST|HEAD)\ /([^/\.]+)\.php
RewriteRule ^ /%2/ [R,L]
So altogether, you'll have:
RewriteEngine On
RewriteRule ^([^/\.]+)/?$ /$1.php [L]
RewriteCond %{THE_REQUEST} ^(GET|POST|HEAD)\ /([^/\.]+)\.php
RewriteRule ^ /%2/ [R,L]
You're making a relative path substitution in a per-directory context (.htaccess is a per-directory context). This requires RewriteBase. Per-directory rewrites are done in a later stage of processing, when URLs have been mapped to paths. But the rewrite must produce a URL, which is processed again. I think without the RewriteBase to supply the URL prefix, you end up with a filesystem prefix instead of the URL. That may be why you're getting the C:/Sites thing. Try RewriteBase. But after a correct RewriteBase to specify the correct URL prefix to be tacked in front to the relative rewritten part, I'm afraid you will have the rewrite loop, because you're rewriting whatever.php to whatever; and whatever to whatever.php.
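As a rough sketch of how RewriteBase and a guard against the loop could be combined, assuming the .htaccess file is in the document root (so the base URL is /), the rules might look like this. The THE_REQUEST condition is borrowed from the other answer, and the [A-Z]+ method match is a simplification introduced here.

```apache
RewriteEngine On
# URL prefix that is put in front of relative substitutions.
RewriteBase /

# Redirect only when the client itself asked for /name.php.
# THE_REQUEST is the original request line and does not change after
# an internal rewrite, so the rewritten /name.php is not redirected again.
RewriteCond %{THE_REQUEST} ^[A-Z]+\s/([^/.]+)\.php[?\s]
RewriteRule ^ %1/ [R,L]

# Internally serve /name or /name/ from name.php.
RewriteRule ^([^/.]+)/?$ $1.php [L]
```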
Reference: http://httpd.apache.org/docs/current/rewrite/tech.html

How to prevent mod_rewrite from rewriting URLs more than once?

I want to use mod_rewrite to rewrite a few human-friendly URLs to arbitrary files in a folder called php (which is inside the web root, since mod_rewrite apparently won't let you rewrite to files outside the web root).
/ --> /php/home.php
/about --> /php/about_page.php
/contact --> /php/contact.php
Here are my rewrite rules:
Options +FollowSymlinks
RewriteEngine On
RewriteRule ^$ php/home.php [L]
RewriteRule ^about$ php/about_page.php [L]
RewriteRule ^contact$ php/contact.php [L]
However, I also want to prevent users from accessing files in this php directory directly. If a user enters any URL beginning with /php, I want them to get a 404 page.
I tried adding this extra rule at the end:
RewriteRule ^php php/404.php [L]
...(where 404.php is a file that outputs 404 headers and a "Not found" message.)
But when I access / or /about or /contact, I always get redirected to the 404. It seems the final RewriteRule is applied even to the internally rewritten URLs (as they now all start with /php).
I thought the [L] flag (on the first three RewriteRules) was supposed to prevent further rules from being applied? Am I doing something wrong? (Or is there a smarter way to do what I'm trying to do?)
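The [L] flag does not do what you expect here.
L (last rule) stops processing of the remaining rules for the current pass only. In .htaccess the rewritten URL is then run through the rules again from the top, and on that pass it matches your final ^php rule, which is why you are facing issues.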
I had a similar problem. I have a content management system written in PHP and based on the Model-View-Controller paradigm. Its most basic part is mod_rewrite. I've successfully prevented access to PHP files globally. The trick is called THE_REQUEST.
What's the problem?
The rewrite module rewrites the URI. If the URI matches a rule, it is rewritten and the remaining rules are applied to the new, rewritten URI. But if the matched rule ends with [L], the engine does not actually terminate; it starts again with the new URI. This time the new URI no longer matches the rule ending with [L], so processing continues and it matches the last rule. The result? The programmer starts swearing at the unexpected 404 error page. The computer does what you say, not what you want. I had this in my .htaccess file:
RewriteEngine On
RewriteBase /
RewriteRule ^plugins/.* pluginLoader.php [L]
RewriteCond %{REQUEST_URI} \.php$
RewriteRule .* index.php [L]
That's wrong. Even the URIs beginning with plugins/ are rewritten to index.php.
Solution
You need to apply the rule if and only if the original (not rewritten) URI matches the rule. Regrettably, mod_rewrite does not provide a variable containing just the original URI, but it does provide the THE_REQUEST variable, which contains the first line of the HTTP request. This variable is invariant: it doesn't change while the rewrite engine is working.
...
RewriteCond %{THE_REQUEST} \s.*\.php\s
RewriteRule \.php$ index.php [L]
The regular expression is different. It is applied not to the URI alone but to the entire first line of the request, that is, to something like GET /script.php HTTP/1.1. This time the critical rule is applied only if the user explicitly requests a PHP script directly. The rewritten URI is not used.
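Applied to the rules in the question, the same idea might look like the sketch below. It deliberately swaps the php/404.php script for Apache's own 404 response via [R=404], which is a different technique from the one in the question. The [A-Z]+ method match is likewise an assumption made for brevity.

```apache
Options +FollowSymlinks
RewriteEngine On

# Answer 404 only when the client itself requested a URL beginning with /php.
# Internally rewritten URLs (php/home.php etc.) keep the original request
# line in THE_REQUEST, so they do not satisfy this condition.
RewriteCond %{THE_REQUEST} ^[A-Z]+\s/php
RewriteRule ^php - [R=404,L]

RewriteRule ^$ php/home.php [L]
RewriteRule ^about$ php/about_page.php [L]
RewriteRule ^contact$ php/contact.php [L]
```

With a status outside the 3xx range, the substitution is dropped and rewriting stops, so the "-" here is only a placeholder.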

How can you ignore the end of a URL using mod_rewrite?

I'd like to structure my website like this:
domain.com/person/edit/1
domain.com/person/edit/2
domain.com/person/edit/3
etc.
I have a page to which all these requests should go:
domain.com/person/edit.html
The JavaScript will look at the trailing part of the url when the page is loaded so I want the server to internally ignore it.
I've got this rewrite rule:
RewriteRule ^person/view/(.*)$ person/view.html [L]
I'm sure that I'm missing something obvious but when I visit one of the pages above I get this 404 message:
The requested URL /person/view.html/1 was not found on this server.
As far as I understood it the [L] means that if this rule applies Apache should stop rewriting and serve up the alternate page. Instead it seems to be applying the rule at the earliest possible moment and then appending the rest of the unmatched url to the re-written one.
How do I get these re-writes to work properly?
"As far as I understood it the [L] means that if this rule applies Apache should stop rewriting and serve up the alternate page."
Well, the [L] flag tells Apache to stop checking the other rules in the current iteration. The rewrite then goes to the next iteration, where the URL is checked against all the rules again (that is how it works).
Try this "recipe" (put it somewhere at the top of your .htaccess):
Options +FollowSymLinks -MultiViews
# activate rewrite engine
RewriteEngine On
# Do not do anything for already existing files
RewriteCond %{REQUEST_FILENAME} -f [OR]
RewriteCond %{REQUEST_FILENAME} -d
RewriteRule .+ - [L]
Another idea to try: add the DPI flag to your [L], giving [L,DPI].
If the Options line does not help, then the rewrite rule should. But it all depends on your Apache configuration. If the above does not work, please post your whole .htaccess (update your question).
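For illustration, the rule from the question with the DPI suggestion applied might look like this. DPI (discard path info) tells mod_rewrite not to re-append the trailing /1 to the rewritten URL, which is where the /person/view.html/1 in the error message comes from. The sketch keeps the person/view paths exactly as they appear in the question's rule.

```apache
Options +FollowSymLinks -MultiViews
RewriteEngine On

# Serve view.html for anything below person/view/ and drop the
# trailing path info instead of appending it to the new URL.
RewriteRule ^person/view/ person/view.html [L,DPI]
```

The browser's address bar still shows the full URL, so the JavaScript can read the trailing part as before.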

Why would mod_rewrite rewrite twice?

I only recently found out about URL rewriting, so I've still got a lot to learn.
While following the Easy Mod Rewrite tutorial, the results of one of their examples is really confusing me.
RewriteBase /
RewriteRule (.*) index.php?page=$1 [QSA,L]
Rewrites /home as /index.php?page=index.php&page=home.
I thought the duplicates might have had been caused by something in my host's configs, but a clean install of XAMPP does the same.
So, does anyone know why this seems to parse twice?
And, to me this seems like, if it's going to do this, it would be an infinite loop -- why does it stop at 2 cycles?
From Example 1 on this page, which is part of the tutorial linked in your question:
Assume you are using a CMS system that rewrites requests for everything to a single index.php script.
RewriteRule ^(.*)$ index.php?PAGE=$1 [L,QSA]
Yet every time you run that, regardless of which file you request, the PAGE variable always contains "index.php".
Why? You will end up doing two rewrites. Firstly, you request test.php. This gets rewritten to index.php?PAGE=test.php. A second request is now made for index.php?PAGE=test.php. This still matches your rewrite pattern, and in turn gets rewritten to index.php?PAGE=index.php.
One solution would be to add a RewriteCond that checks if the file is already "index.php". A better solution that also allows you to keep images and CSS files in the same directory is to use a RewriteCond that checks if the file exists, using -f.
(The link is to the Internet Archive, since the tutorial website appears to be offline.)
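The file-exists check mentioned in the quoted tutorial could be sketched as follows. It assumes index.php exists in the same directory as the .htaccess file. Note that any other existing file, including other PHP scripts, would then be served directly instead of going through index.php.

```apache
RewriteEngine On
RewriteBase /

# Skip the rewrite when the request maps to an existing file.
# On the second pass the URL is index.php, which exists, so the
# rule no longer applies and PAGE keeps its original value.
RewriteCond %{REQUEST_FILENAME} !-f
RewriteRule ^(.*)$ index.php?PAGE=$1 [L,QSA]
```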
From the Apache Module mod_rewrite documentation:
'last|L' (last rule)
[…] if the RewriteRule generates an internal redirect […] this will reinject the request and will cause processing to be repeated starting from the first RewriteRule.
To prevent this you could either use an additional RewriteCond directive:
RewriteCond %{REQUEST_URI} !^/index\.php$
RewriteRule (.*) index.php?page=$1 [QSA,L]
Or you alter the pattern to not match index.php and use the REQUEST_URI variable, either in the redirect or later in PHP ($_SERVER['REQUEST_URI']).
RewriteRule !^index\.php$ index.php?page=%{REQUEST_URI} [QSA,L]