Scalable URL Rewrite Rule - apache

I am organizing sub-websites into different folders on a LAMP server for ease of maintenance, but do not want the end user to know they are organized into those folders via the URL.
Directory structure example:
/category1/website1
/category1/website2
/category2/website3
/category2/website4
/category2/website5
/category3/website6
/category4/website7
etc.
Currently it shows as http://www.example.com/category1/website1; however, I want it to show as http://www.example.com/website1 all the time, even if they put the category name in there.
The trick is I need to do this from a .htaccess file with X amount of categories having X amount of sub-websites in them. Currently I am using
Redirect 301 /website1 /category1/website1
for each website to allow users to use the shorter link, but ultimately they end up seeing the category in the address and the .htaccess file is long with 200+ sub-websites involved. :(
Any help would be appreciated!

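Using mod_rewrite it's pretty straightforward if you are going to enter them manually (note that in .htaccess the pattern is matched without a leading slash):
RewriteEngine on
RewriteRule ^website1$ /category1/website1 [L,NC]
RewriteRule ^website2$ /category1/website2 [L,NC]
It's possible to create a RewriteMap as well, although the map itself has to be declared in the server or virtual host configuration rather than in .htaccess. There's a lot of helpful information in the documentation.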
Hope that helps.
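As a rough sketch of the RewriteMap idea, assuming one line can be added to the server or virtual host configuration: the map name sitecat and the file path below are hypothetical, and the rules are untested. The map file lists one "website category" pair per line, so adding a sub-website means adding a line to the map rather than a rule to .htaccess.

```apache
# --- Server or virtual host config (RewriteMap is not allowed in .htaccess) ---
# Hypothetical text map, one "website category" pair per line, e.g.:
#   website1 category1
#   website3 category2
RewriteMap sitecat "txt:/etc/apache2/site-categories.txt"

# --- .htaccess in the document root ---
RewriteEngine On
# Look up the first path segment in the map. The condition only matches
# when the lookup returns a non-empty category, which is captured as %1.
RewriteCond ${sitecat:$1} ^(.+)$
RewriteRule ^([^/]+)/(.*)$ /%1/$1/$2 [L]
```

On the next pass the first path segment is the category name, which is not a key in the map, so the rule does not fire again. This assumes no category shares its name with a sub-website.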

Since there is no discernible "pattern" that links the website to the category subdirectory and you are limited to .htaccess then you have no choice but to manually list all the mappings in your .htaccess file, in the same way you have listed the redirects.
Assuming you want to internally rewrite a URL of the form /website1/<something> to /category1/website1/<something>, in the same way the Redirect directive works (which is prefix-matching), you could do something like the following in .htaccess:
RewriteEngine On
RewriteRule ^(website1/.*) /category1/$1 [L]
RewriteRule ^(website2/.*) /category1/$1 [L]
$1 is a backreference to the captured group in the RewriteRule pattern. For example, if you request /website1/foo then it is rewritten to /category1/website1/foo.
This does require that you request at least /website1/, with a trailing slash (since this is a directory of sorts).
You can potentially group sites together that are in the same category. For example:
RewriteRule ^((website3|website4|website5)/.*) /category2/$1 [L]
"I want it to show as http://www.example.com/website1 all the time - even if they put the category name in there"
For this you would need to implement the reverse... a redirect to remove the category from the URL. However, an added complication is that you only want to redirect direct requests and not rewritten requests (as above).
This will need to go before the internal rewrites above.
Unless there is a discernible pattern that distinguishes category names then you will need a rule for each category. For example:
RewriteCond %{ENV:REDIRECT_STATUS} ^$
RewriteRule ^category1/(.*) /$1 [R=302,L]
...and repeat for each category.
The check against the REDIRECT_STATUS environment variable ensures that we are only targeting direct requests and not rewritten requests.
Change 302 (temporary) to 301 (permanent) - if that is the intention - only once you have confirmed that it works OK in order to avoid caching issues.
However, this may be easier (and more "scalable") to implement with an additional .htaccess file in each of the category subdirectories instead. And this would be the same directives in each category subdirectory:
RewriteEngine On
RewriteCond %{ENV:REDIRECT_STATUS} ^$
RewriteRule (.*) /$1 [R=302,L]
Since the URL-path matched by the RewriteRule pattern is relative to the directory containing the .htaccess file, the $1 backreference contains the URL-path less the /category/ prefix.
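To make the ordering concrete, here is a minimal sketch of how the root .htaccess pieces from this answer could fit together, using the category and website names from the question. It is an illustration rather than a tested configuration.

```apache
RewriteEngine On

# 1. External redirects first: strip the category from URLs the client
#    requested directly. REDIRECT_STATUS is empty on the initial request
#    and is set once an internal rewrite has happened.
RewriteCond %{ENV:REDIRECT_STATUS} ^$
RewriteRule ^category1/(.*) /$1 [R=302,L]
RewriteCond %{ENV:REDIRECT_STATUS} ^$
RewriteRule ^category2/(.*) /$1 [R=302,L]

# 2. Internal rewrites second: map the short URL back to the real folder.
RewriteRule ^((?:website1|website2)/.*) /category1/$1 [L]
RewriteRule ^((?:website3|website4|website5)/.*) /category2/$1 [L]
```

Each RewriteCond applies only to the RewriteRule that immediately follows it, which is why the condition is repeated for every category.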

Related

mod_rewrite: hide real urls but keep available as different files

Possibly this question has already been answered, but I didn't find any answer after hours of searching.
I need to put the site under "maintenance mode" and redirect/rewrite all requests to site_down.html, but at the same time I need the site to be available if I enter the address like files are in a subfolder.
ex:
if I type http://example.com/login.php I need site_down.html to be displayed.
but if I specify http://example.com/test/login.php I need the real login.php to be displayed.
I need this to be done with rewrite, so copying everything to another directory isn't a solution.
I tried a couple dozen combinations, but I'm still unable to achieve what I need.
This is one version of my .htaccess file:
DirectoryIndex site_down.html
<IfModule mod_rewrite.c>
RewriteEngine On
RewriteRule ^test\/(.*)$ $1 [S=1]
RewriteRule ^(.*\.php)$ site_down.html
RewriteRule .* - [L]
</IfModule>
This code should rewrite all requests with "test/*" to "parent folder" and skip next rewrite rule and then terminate rewriting at RewriteRule .* - [L]. If there is no "test/" in url - all request should be rewritten to site_down.html
What am I doing wrong?
Could you suggest any valid solutions, please?
Thank you.
Essentially, you are looking for two rules. One rule translates the virtual subdirectory to the working files. The other rule translates URLs for the working files to a splash page. We just have to make sure that if the first rule matches, the second one doesn't.
We can do this by checking that " /test/" (including the leading space) is not in THE_REQUEST, the string that the client sent to the server to request a page, in the form GET /test/mypage.php?apes=bananas HTTP/1.1. THE_REQUEST doesn't change on a rewrite, which makes it perfect for this.
Skipping a rule like you did usually doesn't have the effect you expect, because mod_rewrite makes multiple passes through .htaccess until the resulting URL no longer changes, or until it hits a limit and throws an error. On the first pass it skips the rule, but on the second pass it does not, because the rewritten URL no longer starts with test/.
RewriteCond %{THE_REQUEST} !\ /test/
RewriteRule \.php site_down.html [L]
RewriteRule ^test/(.*)$ $1 [L]

In Apache, how do I redirect from a specific path and query string?

I want to redirect from, e.g.,
http://mystore.com/category.php?id=123
to
http://mystore.com/categories/foo
and also from, e.g.,
http://mystore.com/product.php?id=456
to
http://mystore.com/products/bar
These will be permanent (301) redirects and there will be about a dozen of them. I don't need to extract any information from the paths or query strings, I just need to match them exactly. And I would like to avoid specifying absolute URLs if at all possible.
I figure this can be done with mod_rewrite and some combination of RewriteConds and RewriteRules, but I'm already doing some URL rewriting and my attempts so far have had undesired results.
Here's an anonymised excerpt from my .htaccess file before any modifications:
RewriteBase /
RewriteRule sitemap.xml index.php?route=sitemap [L]
# skip files and directories
RewriteCond %{REQUEST_FILENAME} !-f
RewriteCond %{REQUEST_FILENAME} !-d
RewriteRule ^([^?]*) index.php?route=$1 [L,QSA]
This works as intended. I've tried adding several different combinations of conditions and rules just before the last line, most recently
RewriteCond %{QUERY_STRING} id=123
RewriteRule category.php categories/foo [L,R=301]
Something about that last rule causes problems. The home page loads, but style sheets, images, and other resources do not.
At this point, I'm considering just creating PHP scripts named category.php and product.php to handle the redirects.... Am I just a few characters away from the solution?
The resources (styles sheets, images etc.) are not loaded because there might be relative paths which have become invalid.
The problem is that the browser considers "categories" to be a folder and so the path to the resources is not valid.
A quick fix (if you are running on a domain/subdomain and not in a folder) is to put / in the path of all your resources.
For example: change style.css to /style.css so it is still included when you are on the categories page.
I never did figure out the problem, but I solved it by changing the order of the directives and nothing else. I moved the new redirects to just after the RewriteBase directive and everything works perfectly.
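A likely explanation for why the order mattered: RewriteCond directives apply only to the RewriteRule that immediately follows them. Inserting the new condition and rule just before the last line places them between the two "skip files and directories" conditions and the catch-all rule, so the catch-all loses its conditions and starts rewriting style sheets and images to index.php as well. A sketch of the working order, using the paths from the question (the anchored patterns and the trailing ? that drops the old query string are additions, not part of the original attempt):

```apache
RewriteBase /

# Exact-match redirects, kept above the catch-all and its conditions.
RewriteCond %{QUERY_STRING} ^id=123$
RewriteRule ^category\.php$ /categories/foo? [R=301,L]

RewriteCond %{QUERY_STRING} ^id=456$
RewriteRule ^product\.php$ /products/bar? [R=301,L]

RewriteRule sitemap.xml index.php?route=sitemap [L]

# skip files and directories
RewriteCond %{REQUEST_FILENAME} !-f
RewriteCond %{REQUEST_FILENAME} !-d
RewriteRule ^([^?]*) index.php?route=$1 [L,QSA]
```

The targets are root-relative paths, so no absolute URL has to be written out.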

Apache .htaccess RewriteRule

Here's my situation. I have a web root and several subdirectories, let's say:
/var/www
/var/www/site1
/var/www/site2
Due to certain limitations, I need the ability to keep one single domain and have separate folders like this. This will work fine for me, but many JS and CSS references in both sites point to things like:
"/js/file.js"
"/css/file.css"
Because these files are referenced absolutely, they are looking for the 'js' and 'css' directories in /var/www, which of course does not exist. Is there a way to use RewriteRules to redirect requests for absolutely referenced files to point to the correct subdirectory? I have tried doing things like:
RewriteEngine on
RewriteRule ^/$ /site1
or
RewriteEngine on
RewriteRule ^/js/(.*)$ /site1/js/$1
RewriteRule ^/css/(.*)$ /site1/css/$1
But neither of these work, even redirecting to only one directory, not to mention handling both site1 and site2. Is what I'm trying possible?
EDIT: SOLUTION
I ended up adapting Jon's advice to fit my situation. I have the ability to programmatically make changes to my .htaccess file whenever a new subdirectory is added or removed. For each "site" that I want, I have the following section in my .htaccess:
RewriteCond %{REQUEST_URI} !^/$
RewriteCond %{REQUEST_URI} !^/index.php$
RewriteCond %{HTTP_COOKIE} sitename=site1
RewriteCond %{REQUEST_URI} !^/site1/
RewriteRule ^(.*)$ /site1/$1 [L]
Index.php is a file that lists all my sites, deletes the "sitename" cookie, and sets a cookie of "sitename=site#" when a particular one is selected. My RewriteConds check,
If the request is not for /
If the request is not for /index.php
If the request contains the cookie "sitename=site1"
If the request does not start with "/site1/"
If all of these conditions are met, then the request is rewritten to prepend "/site1/" before the request. I tried having a single set of Conds/Rules that would match (\w+) instead of "site1" in the third Condition, and then refer to %1 in the fourth Condition and in the Rule, but this did not work. I gave up and settled for this.
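The single rule set described above probably failed because %1 is only expanded in a RewriteCond's test string, not in its pattern. A common workaround is to move the captured name into the test string and compare it against the URL with a regex-internal backreference (\1). The following is an untested sketch under that assumption, with the "::" separator chosen arbitrarily:

```apache
# Not / and not /index.php
RewriteCond %{REQUEST_URI} !^/(index\.php)?$
# Capture the site name from the cookie as %1
RewriteCond %{HTTP_COOKIE} sitename=(\w+)
# Test string is "<site>::<request URI>"; skip URLs already under /<site>/
RewriteCond %1::%{REQUEST_URI} !^([^:]+)::/\1/
RewriteRule ^(.*)$ /%1/$1 [L]
```

Because the last condition is negated, it does not overwrite the captures, so %1 in the substitution still refers to the site name taken from the cookie.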
If the RewriteRules are in your .htaccess file, you need to remove the leading slashes in your match (apache strips them before sending it to mod_rewrite). Does this work?
RewriteEngine on
RewriteRule ^js/(.*)$ /site1/js/$1
RewriteRule ^css/(.*)$ /site1/css/$1
EDIT: To address the comment:
Yes, that works, but when I do RewriteRule ^(.*)$ /site1/$1, it causes Apache to issue internal server errors. But to me, it seems like that should just be a generic equivalent of the individual rules!
What's happening with that rule is when /something/ gets rewritten to /site/something/, and apache internally redirects, it gets rewritten again, to /site/site/something/, then again, then again, etc.
You'd need to add a condition to that, something like:
RewriteCond %{REQUEST_URI} !^/site/
RewriteRule ^(.*)$ /site/$1 [L]
You need to set up symlinks, which the rewrite rules will use so your absolute links at the server level can follow the symbolic links to the central site hosting account.

How to do a mod_rewrite redirection to relative URL

I am trying to achieve a basic URL redirection for pretty-URLs, and due to images, CSS etc. also residing in the same path I need to make sure that if the URL is accessed without a trailing slash, it is added automatically.
This works fine if I put the absolute URL like this:
RewriteRule ^myParentDir/([A-Z0-9_-]+)$ http://www.mydomain.com/myParentDir/$1/ [R,nc,L]
But if I change this to a relative URL, so that I don't have to change it each time I move things in folders, this simply doesn't work.
These are what I tried and all do not work, or redirect me to the actual internal directory path of the server like /public_html/... :
RewriteRule ^myParentDir/([A-Z0-9_-]+)$ ./myParentDir/$1/ [R,nc,L]
RewriteRule ^myParentDir/([A-Z0-9_-]+)$ myParentDir/$1/ [R,nc,L]
What is the right way to do a URL redirection so that if the user enters something like:
http://www.mydomain.com/somedir/myVirtualParentDir/myVirtualSubdir
he gets redirected to (via HTTP 301 or 302):
http://www.mydomain.com/somedir/myVirtualParentDir/myVirtualSubdir/
Thanks.
EDIT: Adding some more details because it does not seem to be clear.
Lets say I am implementing a gallery, and I want to have pretty URLs using mod_rewrite.
So, I would like to have URLs as follows:
http://www.mydomain.com/somedir/galleries/cats
which shows thumbnails of cats, while:
http://www.mydomain.com/somedir/galleries/cats/persian
which shows one image from the thumbnails of all cats, named persian.
So in actual fact the physical directory structure and rewriting would be as follows:
http://www.domain.com/somedir/gallery.php?category=cats&image=persian
So what I want to do is put a .htaccess file in /somedir which catches all requests made to /galleries and depending on the virtual subdirectories following it, use them as placeholders in the rewriting, with 2 rewrite rules:
RewriteRule ^galleries/([A-Z0-9_-]+)/$ ./gallery.php?category=$1 [nc]
RewriteRule ^galleries/([A-Z0-9_-]+)/+([A-Z0-9_-]+)$ ./gallery.php?category=$1&image=$2 [nc]
Now the problem is that the gallery script in fact needs some CSS, Javascript and Images, located at http://www.domain.com/somedir/css, http://www.domain.com/somedir/js, and http://www.domain.com/somedir/images respectively.
I don't want to hardcode any absolute URLs, so the CSS, JS and Images will be referred to using relative URLs, (./css, ./js, ./images etc.). So I can do rewriting URLs as follows:
RewriteRule ^galleries/[A-Z0-9_-]+/css/(.*)$ ./css/$1 [nc]
The problem is that since http://www.domain.com/somedir/galleries/cats is a virtual directory, the above only works if the user types:
http://www.domain.com/somedir/galleries/cats/
If the user omits the trailing slash mod_dir will not add it because in actual fact this directory does not actually exist.
If I put a redirect rewrite with the absolute URL it works:
RewriteRule ^galleries/([A-Z0-9_-]+)$ http://www.mydomain.com/subdir/galleries/$1/ [R,nc,L]
But I don't want to have the URL prefix hardcoded because I want to be able to put this on whatever domain I want in whatever subdir I want, so I tried this:
RewriteRule ^galleries/([A-Z0-9_-]+)$ galleries/$1/ [R,nc,L]
But instead it redirects to:
http://www.mydomain.com/home/myaccount/public_html/subdir/galleries/theRest
which obviously is not what I want.
EDIT: Further clarifications
The solution I am looking for is to avoid hardcoding the domain name or folder paths in .htaccess. I am looking for a solution where if I package the .htaccess with the rest of the scripts and resources, wherever the user unzips it on his web server it works out of the box. All works like that apart from this trailing slash issue.
So any solution which involves hardcoding the parent directory or the webserver's path in .htaccess in any way is not what I am looking for.
Here's a solution straight from the Apache Documentation (under "Trailing Slash Problem"):
RewriteCond %{REQUEST_FILENAME} -d
RewriteRule ^(.+[^/])$ $1/ [R]
Here's a solution that tests the REQUEST_URI for a trailing slash, then adds it:
RewriteCond %{REQUEST_URI} !(/$|\.)
RewriteRule (.+) http://www.example.com/$1/ [R=301,L]
Here's another solution that allows you to exempt certain REQUEST_URI patterns:
RewriteCond %{REQUEST_FILENAME} !-f
RewriteCond %{REQUEST_URI} !example.php
RewriteCond %{REQUEST_URI} !(.*)/$
RewriteRule ^(.*)$ http://domain.com/$1/ [L,R=301]
Hope these help. :)
This rule should add a trailing slash to any URL which is not a real file/directory (which is, I believe, what you need since Apache usually does the redirect automatically for existing directories).
RewriteCond %{REQUEST_FILENAME} !-f
RewriteCond %{REQUEST_FILENAME} !-d
RewriteRule ^(.+[^/])$ $1/ [L,R=301]
Edit:
In order to prevent Apache from appending the path relative to the document root, you have to use RewriteBase. So, for instance, in the folder meant to be your application's root, you add the following, which overrides the physical path:
RewriteBase /
This might work:
RewriteRule ^myParentDir/[A-Z0-9_-]+$ %{REQUEST_URI}/ [NC,NS,L,R=301]
However, I'm not sure why you think you need this at all. Just make your CSS / JS / image file rewrite rule look something like this:
RewriteRule ^galleries/([A-Za-z0-9_-]+/)*(css|js|images)/(.*)$ ./$2/$3
and everything should work just fine regardless of whether the browser requests /somedir/galleries/css/whatever.css or /somedir/galleries/cats/css/whatever.css or even /somedir/galleries/cats/persian/calico/css/whatever.css.
Ps. One problem with this rule is that it prevents you from having any galleries names "css", "js" or "images". You might want to fix that by naming those virtual directories something like ".css", ".js" and ".images", or using some other naming scheme that doesn't conflict with valid gallery names.
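Applied to the gallery layout from the question, the %{REQUEST_URI} approach could look like the sketch below. It assumes the .htaccess file lives in the application folder (/somedir in the question) and uses 302 while testing. Since %{REQUEST_URI} is the full URL-path the client requested, neither the domain nor the parent folder is hardcoded.

```apache
RewriteEngine On
# /somedir/galleries/cats  ->  /somedir/galleries/cats/
# The pattern is relative to the folder holding this .htaccess file;
# the substitution reuses the requested URL-path and appends the slash.
RewriteRule ^galleries/[A-Za-z0-9_-]+$ %{REQUEST_URI}/ [R=302,L]
```

Any query string on the original request is carried over to the redirect target automatically.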
I'm not sure I completely understand your problem.
The trailing slash redirection is done automatically on most Apache installations by the mod_dir module (there is a 99% chance you have mod_dir enabled).
You may need to add:
DirectorySlash On
But it's the default value.
So, if you access foo/bar and bar is not a file in the foo directory but a subdirectory, then mod_dir performs the redirection to foo/bar/.
The only thing I know of that could break this is the MultiViews option, which may be trying to find a bar.php or a bar.a-mime-extension-known-by-apache in the directory. So you could try to add:
Options -MultiViews
And remove all RewriteRules. If you do not get this default Apache behavior you may have to look at mod_rewrite, but that's like using a nuclear bomb to kill a spider, and nuclear bombs are tricky to use well.
EDIT:
For the trailing slash problem with mod-rewrite you can check this documentation howto, stating this should work:
RewriteEngine on
RewriteBase /myParentDir/
RewriteCond %{REQUEST_FILENAME} -d
RewriteRule ^(.+[^/])$ $1/ [R]

How to prevent mod_rewrite from rewriting URLs more than once?

I want to use mod_rewrite to rewrite a few human-friendly URLs to arbitrary files in a folder called php (which is inside the web root, since mod_rewrite apparently won't let you rewrite to files outside the web root).
/ --> /php/home.php
/about --> /php/about_page.php
/contact --> /php/contact.php
Here are my rewrite rules:
Options +FollowSymlinks
RewriteEngine On
RewriteRule ^$ php/home.php [L]
RewriteRule ^about$ php/about_page.php [L]
RewriteRule ^contact$ php/contact.php [L]
However, I also want to prevent users from accessing files in this php directory directly. If a user enters any URL beginning with /php, I want them to get a 404 page.
I tried adding this extra rule at the end:
RewriteRule ^php php/404.php [L]
...(where 404.php is a file that outputs 404 headers and a "Not found" message.)
But when I access / or /about or /contact, I always get redirected to the 404. It seems the final RewriteRule is applied even to the internally rewritten URLs (as they now all start with /php).
I thought the [L] flag (on the first three RewriteRules) was supposed to prevent further rules from being applied? Am I doing something wrong? (Or is there a smarter way to do what I'm trying to do?)
The [L] flag should be used only on the last rule.
L (Last Rule) stops the rewriting process at that point and doesn't apply any more rewriting rules, and because of that you are facing issues.
I had a similar problem. I have a content management system written in PHP and based on the Model-View-Controller paradigm. Its most basic part is mod_rewrite. I've successfully prevented access to PHP files globally. The trick is called THE_REQUEST.
What's the problem?
The rewrite module rewrites the URI. If the URI matches a rule, it is rewritten and the other rules are applied to the new, rewritten URI. But if the matched rule ends with [L], the engine doesn't actually terminate; it starts again. The new URI then no longer matches the rule ending with [L], so processing continues and matches the last rule. The result? The programmer starts cursing at the unexpected 404 error page. However, the computer does what you say, not what you want. I had this in my .htaccess file:
RewriteEngine On
RewriteBase /
RewriteRule ^plugins/.* pluginLoader.php [L]
RewriteCond %{REQUEST_URI} \.php$
RewriteRule .* index.php [L]
That's wrong. Even the URIs beginning with plugins/ are rewritten to index.php.
Solution
You need to apply the rule if and only if the original (not rewritten) URI matches the rule. Regrettably, mod_rewrite does not provide a variable containing just the original URI, but it does provide the THE_REQUEST variable, which contains the first line of the HTTP request. This variable is invariant: it doesn't change while the rewrite engine is working.
...
RewriteCond %{THE_REQUEST} \s.*\.php\s
RewriteRule \.php$ index.php [L]
The regular expression is different. It is applied not to the URI alone but to the entire first line of the request, that is, to something like GET /script.php HTTP/1.1. This time the critical rule is applied only if the user explicitly requests a PHP script directly. The rewritten URI is not used.
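Carried over to the /php folder from the question, the same idea might look like the following untested sketch. It answers with a plain 404 status through the R=404 flag rather than rewriting to php/404.php, so the error page comes from whatever ErrorDocument is configured; that substitution is an assumption, not part of the original rules.

```apache
Options +FollowSymlinks
RewriteEngine On

# Return 404 only when the client itself asked for /php or /php/...
# THE_REQUEST holds the original request line, e.g. "GET /php/home.php HTTP/1.1",
# and is not changed by the internal rewrites below.
RewriteCond %{THE_REQUEST} ^\S+\s+/php(?:[/?\s]|$)
RewriteRule ^php(/|$) - [R=404,L]

RewriteRule ^$ php/home.php [L]
RewriteRule ^about$ php/about_page.php [L]
RewriteRule ^contact$ php/contact.php [L]
```

Requests for /, /about, or /contact are rewritten into /php/ internally, but their request line never starts with /php, so the 404 rule leaves them alone on the second pass.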