Apache mod_rewrite a subdomain to a subfolder (via internal redirect)

I'm trying to write a set of mod_rewrite rules that let my users do development on different projects from a single folder, without having to add a vhost for every single project.
My idea is to set up a "global vhost" for each user who needs this ability (only 3-4 people); the vhost would be something like: .my-domain.com. From there, I want to encourage my users to write code as if it were on its own domain, and not in a subfolder. For example, if bob is working on a project named 'gnome', I'd like the URL that bob (and anyone else on our internal network) loads to reach this project to be: http://gnome.bob.my-domain.com. What I'd like Apache to do is recognize that "gnome" is a project and map the request, internally, to bob.my-domain.com/gnome/.
I've got what I thought would work, and it's quite simple, but it doesn't work! The request goes into an infinite loop and keeps prefixing the subdomain onto the rewritten request URI.
The mod_rewrite code I have is:
RewriteEngine On
RewriteCond %{HTTP_HOST} ^([^.]+)\.bob\.my-domain\.com
RewriteCond %{REQUEST_URI} !^/%1.*
RewriteRule ^(.*)$ /%1/$1 [L]
I've googled around a bit about this, but I've yet to find any real solutions that work. Has anyone tried this - or maybe, does anyone have a better idea? One that doesn't involve making a virtual host for every project (I've got designers..I think everyone would agree that a designer shouldn't be making virtual hosts..)
Thanks!
Here is a snippet from the rewrite_log:
[rid#838dc88/initial] (3) [perdir /home/bob/http/] strip per-dir prefix: /home/bob/http/index.html -> index.html
[rid#838dc88/initial] (3) [perdir /home/bob/http/] applying pattern '^(.*)$' to uri 'index.html'
[rid#838dc88/initial] (4) [perdir /home/bob/http/] RewriteCond: input='gnome.bob.my-domain.com' pattern='^([^.]+)\.bob\.my-domain\.com' => matched
[rid#838dc88/initial] (4) [perdir /home/bob/http/] RewriteCond: input='/index.html' pattern='!^/%1.*' => matched
[rid#838dc88/initial] (2) [perdir /home/bob/http/] rewrite 'index.html' -> '/gnome/index.html'
[rid#838dc88/initial] (1) [perdir /home/bob/http/] internal redirect with /gnome/index.html [INTERNAL REDIRECT]
[rid#8392f30/initial/redir#1] (3) [perdir /home/bob/http/] strip per-dir prefix: /home/bob/http/gnome/index.html -> gnome/index.html
[rid#8392f30/initial/redir#1] (3) [perdir /home/bob/http/] applying pattern '^(.*)$' to uri 'gnome/index.html'
[rid#8392f30/initial/redir#1] (4) [perdir /home/bob/http/] RewriteCond: input='gnome.bob.my-domain.com' pattern='^([^\.]+)\.bob\.my-domain\.com' => matched
[rid#8392f30/initial/redir#1] (4) [perdir /home/bob/http/] RewriteCond: input='/gnome/index.html' pattern='!^/%1.*' => matched
[rid#8392f30/initial/redir#1] (2) [perdir /home/bob/http/] rewrite 'gnome/index.html' -> '/gnome/gnome/index.html'
[rid#8392f30/initial/redir#1] (1) [perdir /home/bob/http/] internal redirect with /gnome/gnome/index.html [INTERNAL REDIRECT]
[rid#8397970/initial/redir#2] (3) [perdir /home/bob/http/] add path info postfix: /home/bob/http/gnome/gnome -> /home/bob/http/gnome/gnome/index.html
This is just a snippet. There are anywhere from a few dozen to a hundred or so more lines of Apache rewriting /gnome/index.html to /gnome/gnome/gnome/gnome/gnome/index.html, and so on, before Apache hits its internal redirect limit, gives up, and throws a 500 error.

After a few years of ignoring this problem and coming back to it at various points, I finally found a workable solution.
RewriteEngine on
RewriteCond %{HTTP_HOST} ^([^.]+)\.bob\.my-domain\.com
RewriteCond %1::%{REQUEST_URI} !^(.*?)::/\1/
RewriteRule ^(.*)$ /%1/$1 [L]
What I found was that back-references from a previous RewriteCond directive are not expanded in the CondPattern parameter of a later RewriteCond. If you want to use a back-reference from a previous RewriteCond directive, you can only use it in the TestString parameter.
The directives above build a TestString by prepending the subdomain matched in the first RewriteCond to the request URI, delimited by ::. The CondPattern (the regex) then re-captures the subdomain name and uses a back-reference within the same regex (\1) to check that the actual request URI doesn't already begin with that subdomain as a folder.
This sounds a lot more confusing than it really is, and I can't take the credit for discovering the answer. I found the answer as a response to another question here, %N backreference inside RewriteCond. Thanks to Jon Lin for answering that question, and unknown to him, my question too!
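To see why the pattern works, here is a minimal sketch that runs the same regex in Python. It assumes Python's re module behaves like Apache's PCRE for this particular pattern, and the function name should_rewrite is made up for the illustration; it is not part of Apache.

```python
import re

# The CondPattern from the RewriteCond above, without the leading "!" negation.
COND_PATTERN = re.compile(r"^(.*?)::/\1/")


def should_rewrite(subdomain, request_uri):
    """Return True when the negated condition holds, i.e. the rule would fire."""
    # TestString as Apache would expand "%1::%{REQUEST_URI}".
    test_string = f"{subdomain}::{request_uri}"
    return COND_PATTERN.search(test_string) is None


if __name__ == "__main__":
    for uri in ("/index.html", "/gnome/index.html", "/gnomes/index.html"):
        print(uri, should_rewrite("gnome", uri))
```

The first and third URIs report True (rewrite them), while /gnome/index.html reports False, which is what stops the loop after the first pass. The trailing slash after \1 is what keeps /gnomes/ from being mistaken for the already-rewritten /gnome/ folder.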

You might want to check
http://httpd.apache.org/docs/2.2/vhosts/mass.html
it deals with the DocumentRoot problem that you were experiencing.
Rule goes something like this
VirtualDocumentRoot /var/www/%1/
You can change the %1 for whatever suits you (http://httpd.apache.org/docs/2.0/mod/mod_vhost_alias.html)
Cheers

Some Questions:
You said "map internally" -- do you NOT want to use a redirect?
Are you using the same VirtualHost for gnome.bob.mysite.com and bob.mysite.com?
Did you remember to create a ServerAlias for *.bob.mysite.com?
Here is a rough version that you could modify to work. It will capture the subdomain and requested URL, and do a redirect to the main domain with the subdomain as the first part of the path, followed by the requested path, followed by the query string.
ServerName www.mysite.com
ServerAlias *.mysite.com
RewriteCond %{HTTP_HOST} ^([a-zA-Z0-9-]+)\.mysite\.com$
RewriteRule ^/(.*) http://www.mysite.com/%1/$1 [R=301,L]

Have you tried using another rewrite rule to process the one before it?
RewriteEngine On
RewriteCond %{HTTP_HOST} ^([^.]+)\.bob\.my-domain\.com
RewriteCond %{REQUEST_URI} !^/%1.*
RewriteRule ^(.*)$ /%1/$1 [C]
RewriteRule ^/(.*)\.bob\.my-domain\.com/(.*) /$1/$2 [L]
But I think your bigger problem is that your application doesn't know it is being served under a different name.
It thinks it is running in the /gnome/ directory while the browser thinks it is running in the / directory. So any relative URLs that you have are going to cause issues.
What you need is a filter that runs all the URLs in your page through a processor and changes them from /gnome/ to /.

Related

Rewrite subdomain to subdirectory in Apache .htaccess file

Suppose I have a domain called example.com and I want to use rewrite rules in the .htaccess file of Apache to rewrite:
https://office.example.com/index.html
to
https://example.com/office/index.html.
How would I do that? I checked lots of answers here, and the solution seems to be something like this:
RewriteEngine on
RewriteCond %{HTTP_HOST} ^office.example.com$
RewriteRule ^(.*)$ https://example.com/office/$1 [L,NC,QSA]
This works when I test it here:
https://htaccess.madewithlove.be?share=b40ca72f-86d3-5452-a04b-ac9f24812c57
Regrettably, it does generate an error 500 on my server. I enabled logging and found that this seems to be a recursion problem:
AH00124: Request exceeded the limit of 10 internal redirects.
In the logs it seems to add office to office endlessly: /office/office/office/.... I have no idea why this is happening. The rewritten URL doesn't meet the rewrite condition, so why would it do this?
I have found a way to make it "work". If I add R=301 to the RewriteRule attributes it does a redirect, and works, but I would prefer if the original URL remained in the address bar.
Here's the log for the first 2 redirects:
init rewrite engine with requested uri /
applying pattern '^(.*)$' to uri '/'
applying pattern '^(.*)$' to uri '/'
applying pattern '^(.*)$' to uri '/'
pass through /
[perdir /var/www/vhosts/example.com/httpdocs/] strip per-dir prefix: /var/www/vhosts/example.com/httpdocs/ ->
[perdir /var/www/vhosts/example.com/httpdocs/] applying pattern '^(.*)$' to uri ''
[perdir /var/www/vhosts/example.com/httpdocs/] rewrite '' -> 'https://example.com/office/'
reduce https://example.com/office/ -> /office/
[perdir /var/www/vhosts/example.com/httpdocs/] internal redirect with /office/ [INTERNAL REDIRECT]
#1 init rewrite engine with requested uri /office/
#1 applying pattern '^(.*)$' to uri '/office/'
#1 applying pattern '^(.*)$' to uri '/office/'
#1 applying pattern '^(.*)$' to uri '/office/'
#1 pass through /office/
#1 [perdir /var/www/vhosts/example.com/httpdocs/] strip per-dir prefix: /var/www/vhosts/example.com/httpdocs/office/
#1 [perdir /var/www/vhosts/example.com/httpdocs/] applying pattern '^(.*)$' to uri 'office/'
#1 [perdir /var/www/vhosts/example.com/httpdocs/] rewrite 'office/' -> 'https://example.com/office/office/'
#1 reduce https://example.com/office/office/ -> /office/office/
#1 [perdir /var/www/vhosts/example.com/httpdocs/] internal redirect with /office/office/ [INTERNAL REDIRECT]
#2 init rewrite engine with requested uri /office/office/
#2 applying pattern '^(.*)$' to uri '/office/office/'
#2 applying pattern '^(.*)$' to uri '/office/office/'
#2 applying pattern '^(.*)$' to uri '/office/office/'
#2 pass through /office/office/
#2 [perdir /var/www/vhosts/example.com/httpdocs/] add path info postfix: /var/www/vhosts/example.com/httpdocs/office
#2 [perdir /var/www/vhosts/example.com/httpdocs/] strip per-dir prefix: /var/www/vhosts/example.com/httpdocs/office/
#2 [perdir /var/www/vhosts/example.com/httpdocs/] applying pattern '^(.*)$' to uri 'office/office/'
#2 [perdir /var/www/vhosts/example.com/httpdocs/] rewrite 'office/office/' -> 'https://example.com/office/office/off
#2 reduce https://example.com/office/office/office/ -> /office/office/office/
#2 [perdir /var/www/vhosts/example.com/httpdocs/] internal redirect with /office/office/office/ [INTERNAL REDIRECT]
RewriteCond %{HTTP_HOST} ^office.example.com$
RewriteRule ^(.*)$ https://example.com/office/$1 [L,NC,QSA]
Rather confusingly, this should implicitly trigger an external 302 (temporary) redirect, not an internal rewrite, because the substitution string specifies a different host from the one being requested. (Although in my experience, any absolute URL in the substitution string triggers an external redirect.)
If it does trigger an internal rewrite (as indicated by the logs) then the requested hostname does not change (since this is not a separate request) and you will indeed get a rewrite loop.
However, if "the subdomain is an alias of the main domain" and a rewrite is what's required, then there is no need to specify a hostname in the substitution string and you will indeed need to make additional checks to prevent an internal rewrite loop (500 error).
Try the following instead:
RewriteEngine on
RewriteCond %{HTTP_HOST} ^office\.example\.com [NC]
RewriteRule !^office office%{REQUEST_URI} [L]
...to exclude any requests (including rewritten requests) that already start /office.
No need for the NC and QSA flags.
Alternatively, to only target direct requests (not rewritten requests) you could check the REDIRECT_STATUS environment variable instead (which is empty on the initial request and set to "200", as in 200 OK, after the first successful rewrite).
For example:
RewriteCond %{HTTP_HOST} ^office\.example\.com [NC]
RewriteCond %{ENV:REDIRECT_STATUS} ^$
RewriteRule (.*) office/$1 [L]
This has the added "benefit" that you can potentially have a sub-subdirectory called /office as well, i.e. /office/office.
UPDATE: A third version is to check against the REQUEST_URI server variable. However, I would not expect this to be any different from the first version above.
RewriteCond %{HTTP_HOST} ^office\.example\.com [NC]
RewriteCond %{REQUEST_URI} !^/office
RewriteRule ^ office%{REQUEST_URI} [L]
Sadly enough, both your suggestions gave the same error as before.
Two things to try...
Add a slash prefix to the substitution string, i.e. /office%{REQUEST_URI} and /office/$1 respectively. This changes the substitution string into a URL-path, rather than a relative filesystem path. However, I wouldn't necessarily expect this to make any difference in this respect. (It would be required for an external redirect.)
Use the END flag instead of L on the RewriteRule directives - this is an Apache 2.4 addition that should halt all processing. The L flag "only" ends the current pass before restarting the rewriting process (hence the need for additional checks to prevent rewrite loops).
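As a rough illustration of why L alone is not enough in .htaccess, here is a small Python sketch that re-runs a rule on its own output, the way a per-directory rewrite is re-processed after each internal redirect. This is a simplified model, not Apache code: the function names are invented, paths are written without the leading slash as in per-directory context, and the limit of 10 is assumed from the error message in the question (it is also Apache's default LimitInternalRecursion).

```python
MAX_INTERNAL_REDIRECTS = 10  # assumed limit, matching the AH00124 message above


def unguarded(path):
    """Model of `RewriteRule (.*) office/$1 [L]`: always prepends the folder."""
    return "office/" + path


def guarded(path):
    """Model of `RewriteRule !^office office%{REQUEST_URI} [L]`: skips rewritten paths."""
    return path if path.startswith("office") else "office/" + path


def process(path, rule, limit=MAX_INTERNAL_REDIRECTS):
    """Re-run the rule on its own output until the path stops changing."""
    redirects = 0
    while True:
        rewritten = rule(path)
        if rewritten == path:
            return path, redirects
        redirects += 1
        if redirects > limit:
            raise RuntimeError(
                f"exceeded the limit of {limit} internal redirects: {rewritten}"
            )
        path = rewritten


if __name__ == "__main__":
    print(process("index.html", guarded))
    try:
        process("index.html", unguarded)
    except RuntimeError as err:
        print("500:", err)
```

The guarded rule settles on office/index.html after one internal redirect, while the unguarded one keeps growing office/office/... until the limit is hit, which is the 500 error described above.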
But now any other file (IMG, CSS) gives an 404.
The above rewrites everything, so it will naturally rewrite all static resources if they don't already start /office. (If they already start /office then they should already be excluded by the above rules.)
To exclude common resources, you could make an exception (an additional RewriteCond directive) to exclude specific file extensions. For example:
RewriteCond %{REQUEST_URI} !\.(css|js|png|jpg|gif)$
And/or add an additional RewriteCond directive to exclude requests that already map to physical files (although this is "marginally" more expensive). For example:
RewriteCond %{REQUEST_FILENAME} !-f
Summary:
RewriteEngine On
RewriteCond %{HTTP_HOST} ^office\.example\.com [NC]
RewriteCond %{ENV:REDIRECT_STATUS} ^$
RewriteCond %{REQUEST_URI} !\.(css|js|png|jpg|gif)$
RewriteCond %{REQUEST_FILENAME} !-f
RewriteRule (.*) office/$1 [END]

Unable to ignore mod_rewrite internal redirects with NS flag

I have defined a couple mod_rewrite rules in an .htaccess file, one to rewrite the URL path from /rwtest/source.html to /rwtest/target.html, and another to prohibit direct access to /rwtest/target.html. That is, all users wishing to see the content of /rwtest/target.html must enter /rwtest/source.html in their URL bar.
I was trying to use the NS flag in the forbid rule to prevent rewritten URLs from being denied as well, but it appears this flag does not distinguish between the first request and the internal redirect. It would seem that NS should do the job, but I'm sure I'm misunderstanding something.
Can someone please clarify this behavior? What exactly makes this internal redirect not an internal subrequest that the NS flag can ignore?
Details:
Here's my full .htaccess file:
Options +FollowSymLinks -Multiviews
RewriteEngine on
RewriteBase /rwtest
# Forbid rule. Prohibit direct access to target.html. Note the NS flag.
RewriteRule ^target.html$ - [F,NS]
# Rewrite rule. Rewrite source.html to target.html.
RewriteRule ^source.html$ target.html
I'm running Apache 2.4.9 on Windows 7 x64, but I've observed similar behavior on Apache 2.4.3 on Linux. Here's the log output for a request to /rwtest/source.html.
[rewrite:trace3] [rid#20b6200/initial] [perdir C:/Apache24/htdocs/rwtest/] strip per-dir prefix: C:/Apache24/htdocs/rwtest/source.html -> source.html
[rewrite:trace3] [rid#20b6200/initial] [perdir C:/Apache24/htdocs/rwtest/] applying pattern '^target.html$' to uri 'source.html'
[rewrite:trace3] [rid#20b6200/initial] [perdir C:/Apache24/htdocs/rwtest/] strip per-dir prefix: C:/Apache24/htdocs/rwtest/source.html -> source.html
[rewrite:trace3] [rid#20b6200/initial] [perdir C:/Apache24/htdocs/rwtest/] applying pattern '^source.html$' to uri 'source.html'
[rewrite:trace2] [rid#20b6200/initial] [perdir C:/Apache24/htdocs/rwtest/] rewrite 'source.html' -> 'target.html'
[rewrite:trace3] [rid#20b6200/initial] [perdir C:/Apache24/htdocs/rwtest/] add per-dir prefix: target.html -> C:/Apache24/htdocs/rwtest/target.html
[rewrite:trace2] [rid#20b6200/initial] [perdir C:/Apache24/htdocs/rwtest/] trying to replace prefix C:/Apache24/htdocs/rwtest/ with /rwtest
[rewrite:trace5] [rid#20b6200/initial] strip matching prefix: C:/Apache24/htdocs/rwtest/target.html -> target.html
[rewrite:trace4] [rid#20b6200/initial] add subst prefix: target.html -> /rwtest/target.html
[rewrite:trace1] [rid#20b6200/initial] [perdir C:/Apache24/htdocs/rwtest/] internal redirect with /rwtest/target.html [INTERNAL REDIRECT]
[rewrite:trace3] [rid#20ba360/initial/redir#1] [perdir C:/Apache24/htdocs/rwtest/] strip per-dir prefix: C:/Apache24/htdocs/rwtest/target.html -> target.html
[rewrite:trace3] [rid#20ba360/initial/redir#1] [perdir C:/Apache24/htdocs/rwtest/] applying pattern '^target.html$' to uri 'target.html'
[rewrite:trace2] [rid#20ba360/initial/redir#1] [perdir C:/Apache24/htdocs/rwtest/] forcing responsecode 403 for C:/Apache24/htdocs/rwtest/target.html
Workarounds
I've posted a few workarounds below.
There are several workarounds for this, each with their pros and cons. As a disclaimer, I've only tested them in an .htaccess context.
Workaround 1. Check for empty REDIRECT_STATUS
Add a RewriteCond checking to see if %{ENV:REDIRECT_STATUS} is empty. If it is empty, then the current request is not an internal redirect.
Pros
Most direct way to determine internal redirect.
Cons
Lack of documentation. The page on Custom Error Responses mentions this variable briefly:
REDIRECT_ environment variables are created from the environment variables which existed prior to the redirect. They are renamed with a REDIRECT_ prefix, i.e., HTTP_USER_AGENT becomes REDIRECT_HTTP_USER_AGENT. REDIRECT_URL, REDIRECT_STATUS, and REDIRECT_QUERY_STRING are guaranteed to be set, and the other headers will be set only if they existed prior to the error condition.
I've tried every other REDIRECT_ variable in RewriteCond, yet all of them except REDIRECT_STATUS were empty for internal redirects. Why REDIRECT_STATUS is the special one in mod_rewrite remains a mystery.
Example
# Forbid rule. Prohibit direct access to target.html.
RewriteCond %{ENV:REDIRECT_STATUS} ^$
RewriteRule ^target.html$ - [F]
# Rewrite rule. Rewrite source.html to target.html.
RewriteRule ^source.html$ target.html
Credits for this approach go to URL rewrite : internal server error.
Workaround 2. Halt rewrite rule processing with END
Unlike the L flag, END halts rewrite rules even for internal redirects.
Pros
Simple. Just an extra flag.
Cons
Does not give you enough control over which rules to process and which to skip.
Example
# Forbid rule. Prohibit direct access to target.html.
RewriteRule ^target.html$ - [F]
# Rewrite rule. Rewrite source.html to target.html.
RewriteRule ^source.html$ target.html [END]
For more information see END flag.
Workaround 3. Match against original URL in THE_REQUEST
%{THE_REQUEST}
The full HTTP request line sent by the browser to the server (e.g., "GET /index.html HTTP/1.1").
THE_REQUEST does not change with internal redirects, so you can match against it.
Pros
Can be used to match against the original URL even in the second round of URL processing.
Cons
Significantly more complicated than the other approaches. Forces the use of RewriteCond where just one RewriteRule would have been sufficient.
Matches against the full URL which has not been unescaped (decoded), unlike most other variables.
Inconvenient to use in multiple RewriteRules. RewriteConds can be copied above every RewriteRule or the value can be exported to an environment variable (see example). Both hacky alternatives.
Example
# Forbid rule. Prohibit direct access to target.html.
# Extract the path from the request line.
RewriteCond %{THE_REQUEST} "^[^ ]+ ([^ ?]*)"
RewriteCond %1 ^/rwtest/target.html$
RewriteRule ^ - [F]
# Rewrite rule. Rewrite source.html to target.html.
RewriteRule ^source.html$ target.html
Or, export the path to an environment variable and use it in multiple RewriteRules.
# Extract the original URL and save it to ORIG_URL.
# Extract the path from the request line.
RewriteCond %{THE_REQUEST} "^[^ ]+ ([^ ?]*)"
RewriteRule ^ - [E=ORIG_URL:%1]
# Forbid rule. Prohibit direct access to target.html.
RewriteCond %{ENV:ORIG_URL} ^/rwtest/target.html$
RewriteRule ^ - [F]
# Rewrite rule. Rewrite source.html to target.html.
RewriteCond %{ENV:ORIG_URL} ^/rwtest/source.html$
RewriteRule ^ target.html
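For anyone unsure what the THE_REQUEST pattern in Workaround 3 actually captures, here is a small Python sketch that applies the same regex to a raw request line. It assumes Python's re module matches this pattern the same way Apache's regex engine does; the function name original_path is made up for the example.

```python
import re

# Same pattern as in the RewriteCond: skip the method, capture the path up to
# the first space or "?".
REQUEST_LINE_PATH = re.compile(r"^[^ ]+ ([^ ?]*)")


def original_path(request_line):
    """Return the path part of a request line, or None if it does not match."""
    match = REQUEST_LINE_PATH.match(request_line)
    return match.group(1) if match else None


if __name__ == "__main__":
    print(original_path("GET /rwtest/source.html?x=1 HTTP/1.1"))
    print(original_path("GET /rwtest/a%20b.html HTTP/1.1"))
```

The first call yields /rwtest/source.html with the query string dropped. The second yields /rwtest/a%20b.html, still percent-encoded, which is the "not unescaped" caveat listed in the cons.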

Why does this mod_rewrite rule not cause a redirect loop?

Apache Version: Apache/2.2.22 (Ubuntu)
I have the following rewrite rule defined in .htaccess file
RewriteRule ^goto/(.*)$ goto/index.php?q=$1 [L,QSA]
And it is working fine (meaning it reaches index.php in the goto folder). But my thinking is that it should generate a redirect loop.
Suppose the URL is http://example.com/goto/foo. In the first iteration it becomes http://example.com/goto/index.php?q=foo. In the second iteration it should again match the RewriteRule pattern goto/(.*) and so cause a redirect loop.
My question is: how is it avoiding the redirect loop?
My .htaccess file contains only the following.
RewriteEngine on
RewriteRule ^goto/(.*)$ goto/index.php?q=$1 [L,QSA]
And inside goto folder there is only index.php. No other files there.
EDIT
I have also tested this using wamp 2.2
Apache version 2.2.2
Below is the rewrite log
[perdir D:/wamp/www/test/blog/] strip per-dir prefix: D:/wamp/www/test/blog/goto/ddfd -> goto/ddfd
[perdir D:/wamp/www/test/blog/] applying pattern '^goto/(.*)$' to uri 'goto/ddfd'
[perdir D:/wamp/www/test/blog/] rewrite 'goto/ddfd' -> 'goto/index.php?q=ddfd'
split uri=goto/index.php?q=ddfd -> uri=goto/index.php, args=q=ddfd
[perdir D:/wamp/www/test/blog/] add per-dir prefix: goto/index.php -> D:/wamp/www/test/blog/goto/index.php
[perdir D:/wamp/www/test/blog/] strip document_root prefix: D:/wamp/www/test/blog/goto/index.php -> /test/blog/goto/index.php
[perdir D:/wamp/www/test/blog/] internal redirect with /test/blog/goto/index.php [INTERNAL REDIRECT]
[perdir D:/wamp/www/test/blog/] strip per-dir prefix: D:/wamp/www/test/blog/goto/index.php -> goto/index.php
[perdir D:/wamp/www/test/blog/] applying pattern '^goto/(.*)$' to uri 'goto/index.php'
[perdir D:/wamp/www/test/blog/] rewrite 'goto/index.php' -> 'goto/index.php?q=index.php'
split uri=goto/index.php?q=index.php -> uri=goto/index.php, args=q=index.php&q=ddfd
[perdir D:/wamp/www/test/blog/] add per-dir prefix: goto/index.php -> D:/wamp/www/test/blog/goto/index.php
[perdir D:/wamp/www/test/blog/] initial URL equal rewritten URL: D:/wamp/www/test/blog/goto/index.php [IGNORING REWRITE]
The last entry says it is IGNORING REWRITE. So what configuration is actually instructing Apache to ignore the rewrite in this case?
On my setup, this rule does cause an infinite loop. It eventually gives a 500 Internal Server Error, because it exceeds the maximum number of internal redirects. In .htaccess the [L] flag only stops the current cycle of rewrites; it won't stop a new cycle from happening. Rewriting only stops once the URL stops changing. (This is different from the behaviour in httpd.conf outside of a per-directory context, where the [L] flag stops rewriting completely.)
There are a couple of ways you can stop the infinite loop.
#1. Any url with index.php in it will not be rewritten
RewriteRule index\.php - [L]
RewriteRule ^goto/(.*)$ goto/index.php?q=$1 [L,QSA]
Every url with index.php in it will match the first rule. Because the url is not rewritten, it will not initiate a new cycle.
#2. Exclude
RewriteCond %{REQUEST_URI} !^/goto/index\.php$
RewriteRule ^goto/(.*)$ goto/index.php?q=$1 [L,QSA]
Use a condition to check that the current URL is not already /goto/index.php.
#3. Use the END flag
RewriteRule ^goto/(.*)$ goto/index.php?q=$1 [END,QSA]
Use the END flag. Please note that this flag is only available from Apache 2.3.9 and up. It will stop rewriting completely in .htaccess context.
Actually, this is due to the absence of a leading slash in the substitution of this rule:
RewriteRule ^goto/(.*)$ goto/index.php?q=$1 [L,QSA]
Now try this rule with a leading slash before goto:
RewriteRule ^goto/(.*)$ /goto/index.php?q=$1 [L,QSA]
Now you will see the infinite loop error.
The reason is that in the first case the target URI does not begin with a slash. In that case mod_rewrite is smart enough to prevent infinite looping: it detects that the original request for goto/index.php is the same as the result of the internal rewrite to goto/index.php after the first pass, so it does not perform the further rewrite and writes this message to the RewriteLog:
[IGNORING REWRITE]
Thanks for the update. Since the first rewritten URI up to (but not including) the query string is the same as the second rewritten URI up to the same point, it is being discarded as redundant. This is a feature of mod_rewrite designed to prevent infinite rewrite loops. If you replace your rule with:
RewriteRule ^goto/(.*)$ /goto/index.php?q=$1 [L,QSA]
(notice the '/' in front of the substitution), assuming goto is in DocumentRoot, you will get the loop you expected. This is because "/goto/index.php" (the result) is different from "goto/index.php" (the originally matched URI).
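The comparison can be sketched in a few lines of Python. This is only a simplified model inferred from the log lines in the question ("add per-dir prefix" followed by "initial URL equal rewritten URL"), not mod_rewrite's actual implementation, and the function names are invented for the illustration.

```python
def rewritten_filename(per_dir_prefix, substitution):
    """Relative substitutions get the per-directory prefix; absolute ones stay URL-paths."""
    if substitution.startswith("/"):
        return substitution
    return per_dir_prefix + substitution


def is_ignored(per_dir_prefix, initial_filename, substitution):
    """True when the rewrite result equals the initial filename, so it is discarded."""
    return rewritten_filename(per_dir_prefix, substitution) == initial_filename


if __name__ == "__main__":
    # Values taken from the second pass in the rewrite log above.
    prefix = "D:/wamp/www/test/blog/"
    initial = "D:/wamp/www/test/blog/goto/index.php"
    print(is_ignored(prefix, initial, "goto/index.php"))
    print(is_ignored(prefix, initial, "/goto/index.php"))
```

With the relative substitution the two filenames are identical on the second pass, so the rewrite is dropped. With the leading slash they differ, nothing is dropped, and the rule keeps firing.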
-- Original answer follows --
Can you verify that you are actually rewriting anything at all? I ask because this is a rewrite from and to a relative path, and you have no RewriteBase directive in the .htaccess file. Unless .htaccess is in the server's (or virtual host's) DocumentRoot directory, the rewrite engine requires RewriteBase to determine how to finish rewriting it.
Further, even if this is in the DocumentRoot directory (which is implied by the fact that the original URL to be rewritten is http://example.com/goto/foo), unless there is an AllowOverride directive giving FileInfo override permission, AND an Options directive specifying FollowSymLinks, effective for the directory, mod_rewrite directives in .htaccess files are ignored.
As for "reaching index.php" - without seeing the configuration, I really can't comment on why it's successfully serving a page, despite the lack of rewriting. It could be that an ErrorDocument directive gives you a page that matches what you expect. There could even be some rewriting rules in the server or virtual host configuration files that handle the case where /([^/]+)/index.php exists, rewriting requests that would otherwise generate a 404 error code to use the appropriate index.php file, instead.
References:
http://httpd.apache.org/docs/2.2/mod/mod_rewrite.html#rewriterule - explains what conditions must be satisfied for rewriting to work.
http://httpd.apache.org/docs/2.2/mod/core.html#allowoverride - documents the FileInfo keyword for the AllowOverride directive.
http://codex.wordpress.org/htaccess - describes several ways to manage rewriting to index.php

Apache Mod Rewrite for Pretty URLs isn't working

I'm trying to figure out how to do an apache mod_rewrite to remap $_GET.
What I'm trying to accomplish:
Currently, to get to the page one would have to go to
http://www.domain.com/index.php?URL=pages/the-page.php
I would like this to work in 2 ways:
If someone goes to domain.com/the-page, it takes them to the above but keeps it looking like this. Secondly, if someone goes to the http://www.domain.com/index.php?URL=pages/the-page.php, it will still show as domain.com/the-page, keeping the URL short and clean.
Most Recently Tried Code
Options +FollowSymlinks
RewriteEngine on
RewriteCond %{REQUEST_URI} ^/index\.php$
RewriteCond %{QUERY_STRING} URL=pages/([a-z0-9-_]+)\.php$
RewriteRule ^(.*) /%1
I'm pretty sure I setup everything right in the apache httpd.conf. I'm using XAMPP to test locally, restarted apache on changes, still nothing. Where am I going wrong?
I would prefer to handle this in .htaccess
I am using XAMPP localhost and trying on live server.
Log File:
127.0.0.1 - - [05/Apr/2013:16:50:43 --0400] [localhost/sid#2f3140][rid#3b14068/initial] (3) [perdir C:/xampp/htdocs/cdi/] strip per-dir prefix: C:/xampp/htdocs/cdi/index.php -> index.php
127.0.0.1 - - [05/Apr/2013:16:50:43 --0400] [localhost/sid#2f3140][rid#3b14068/initial] (3) [perdir C:/xampp/htdocs/cdi/] applying pattern '^(.*)' to uri 'index.php'
127.0.0.1 - - [05/Apr/2013:16:50:43 --0400] [localhost/sid#2f3140][rid#3b14068/initial] (1) [perdir C:/xampp/htdocs/cdi/] pass through C:/xampp/htdocs/cdi/index.php
Updated log with Olaf's script (last rule commented out)
127.0.0.1 - - [05/Apr/2013:20:02:24 --0400] [localhost/sid#2e3140][rid#3b14090/initial] (3) [perdir C:/xampp/htdocs/cdi/] strip per-dir prefix: C:/xampp/htdocs/cdi/index.php -> index.php
127.0.0.1 - - [05/Apr/2013:20:02:24 --0400] [localhost/sid#2e3140][rid#3b14090/initial] (3) [perdir C:/xampp/htdocs/cdi/] applying pattern '^' to uri 'index.php'
127.0.0.1 - - [05/Apr/2013:20:02:24 --0400] [localhost/sid#2e3140][rid#3b14090/initial] (3) [perdir C:/xampp/htdocs/cdi/] strip per-dir prefix: C:/xampp/htdocs/cdi/index.php -> index.php
127.0.0.1 - - [05/Apr/2013:20:02:24 --0400] [localhost/sid#2e3140][rid#3b14090/initial] (3) [perdir C:/xampp/htdocs/cdi/] applying pattern '^index\.php$' to uri 'index.php'
127.0.0.1 - - [05/Apr/2013:20:02:24 --0400] [localhost/sid#2e3140][rid#3b14090/initial] (2) [perdir C:/xampp/htdocs/cdi/] rewrite 'index.php' -> '/newhome?'
127.0.0.1 - - [05/Apr/2013:20:02:24 --0400] [localhost/sid#2e3140][rid#3b14090/initial] (3) split uri=/newhome? -> uri=/newhome, args=<none>
127.0.0.1 - - [05/Apr/2013:20:02:24 --0400] [localhost/sid#2e3140][rid#3b14090/initial] (2) [perdir C:/xampp/htdocs/cdi/] explicitly forcing redirect with http://localhost/newhome <--this one seems to be causing the issue
127.0.0.1 - - [05/Apr/2013:20:02:24 --0400] [localhost/sid#2e3140][rid#3b14090/initial] (1) [perdir C:/xampp/htdocs/cdi/] escaping http://localhost/newhome for redirect
127.0.0.1 - - [05/Apr/2013:20:02:24 --0400] [localhost/sid#2e3140][rid#3b14090/initial] (1) [perdir C:/xampp/htdocs/cdi/] redirect to http://localhost/newhome [REDIRECT/302]
Thank you everyone that is helping. I've spent 2 days trying to get this to work!!!
Basically, you need two rules. One rule to redirect the client to a clean URL and another to internally rewrite the pretty URL to the real content via index.php.
Assuming index.php and .htaccess are in a directory cdi:
RewriteEngine on
# prevent endless loop
RewriteCond %{ENV:REDIRECT_STATUS} 200
RewriteRule ^ - [L]
# redirect the client
RewriteCond %{QUERY_STRING} URL=pages/(.+?)\.php
RewriteRule ^index\.php$ /cdi/%1? [R,L]
# exclude rewriting all files located in /cdi/files
RewriteCond %{REQUEST_URI} !^/cdi/files/
# rewrite to real content
RewriteRule ^.*$ /cdi/index.php?URL=pages/$0.php [L]
Update:
When the request is /cdi/index.php?URL=pages/abc.php, the second rule extracts the needed URL part and redirects the client to the new URL path. The client then requests the new URL /cdi/abc and the third rule takes this and does an internal rewrite to the real content.
This all works as it should, but on its own it would rewrite and redirect indefinitely. To break this endless loop, the first rule checks the environment variable REDIRECT_STATUS (via %{ENV:...}) to see whether the request has already been rewritten, and if so stops the cycle with the RewriteRule
RewriteRule ^ - [L]
which matches everything (^), does no substitution, and ends the rewrite cycle with the [L] flag.
Instead of using the system-provided environment variable REDIRECT_STATUS, you can also set a variable yourself, with the flag E=SEO:1 for example, and then test for this variable with
RewriteCond %{ENV:REDIRECT_SEO} 1
For the REDIRECT_ prefix, see Available Variables.
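The two directions of the mapping can be checked outside Apache with a short Python sketch. It only mirrors the regex and the substitution strings of the redirect and rewrite rules above, under the same assumption that everything lives in /cdi; the function names pretty_url and real_url are hypothetical.

```python
import re

# Same pattern as the RewriteCond on QUERY_STRING in the redirect rule.
QUERY_PATTERN = re.compile(r"URL=pages/(.+?)\.php")


def pretty_url(query_string):
    """Redirect direction: index.php?URL=pages/abc.php becomes /cdi/abc."""
    match = QUERY_PATTERN.search(query_string)
    return f"/cdi/{match.group(1)}" if match else None


def real_url(pretty_path):
    """Rewrite direction: the per-directory path abc becomes the real index.php URL."""
    return f"/cdi/index.php?URL=pages/{pretty_path}.php"


if __name__ == "__main__":
    print(pretty_url("URL=pages/abc.php"))
    print(real_url("abc"))
```

The first call gives /cdi/abc and the second gives /cdi/index.php?URL=pages/abc.php, so each rule undoes the other. That is exactly why the REDIRECT_STATUS guard is needed to stop them from chasing each other.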
You could try this (in .htaccess the pattern is matched without the leading slash):
RewriteRule ^([a-z0-9_-]{1,40})/?$ index.php?URL=pages/$1.php
Though ideally you might want to get rid of the "pages/" part of the query string variable, as this fixed constant could be handled by the index.php script.
Your approach seems fine but your RewriteCond doesn't match your requirements:
RewriteCond %{REQUEST_URI} ^index.php?URL=pages
means "rewrite the URL if someone requests something that starts with 'index.php'", but that's not what anyone will be requesting. You want your visitors to request pretty URLs.
If your server only needs to serve those requests for /the-page, you can drop the condition entirely. Then any URL will be rewritten. (Note: This might not be what you want!)
Otherwise, the condition should read something like this:
RewriteCond %{REQUEST_URI} ^/[a-z0-9_-]{1,40}
If you don't want to mess with regular expressions, you could also try this:
RewriteCond %{REQUEST_FILENAME} !-f
which means "if the user requests a URL for which no file can be found, rewrite the URL according to the upcoming RewriteRule."
If you want the group ([0-9]+) to match letters, change it to ([a-z]+); for alphanumeric, change it to ([a-z0-9]+); and to also allow a hyphen and an underscore, use ([a-z0-9-_]+). If you want to set a length limit yourself, use the form ([a-z0-9-_]{1,40}). Note that the plus sign is gone: + allows one or more characters, while {1,40} allows between 1 and 40. You can use either.
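A quick way to see the difference between the two quantifiers is to try them in Python. This is just a sketch: it assumes Python's re module treats these character classes the same way as Apache's regex engine, puts the hyphen last in the class to avoid any ambiguity, and uses fullmatch to stand in for anchoring the pattern with ^ and $. The helper name accepts is made up.

```python
import re

ONE_OR_MORE = re.compile(r"[a-z0-9_-]+")
ONE_TO_FORTY = re.compile(r"[a-z0-9_-]{1,40}")


def accepts(pattern, slug):
    """True when the whole slug matches the pattern."""
    return pattern.fullmatch(slug) is not None


if __name__ == "__main__":
    long_slug = "a" * 41
    print(accepts(ONE_OR_MORE, "the-page"), accepts(ONE_TO_FORTY, "the-page"))
    print(accepts(ONE_OR_MORE, long_slug), accepts(ONE_TO_FORTY, long_slug))
```

Both patterns accept the-page, but only the + version accepts the 41-character slug; neither accepts an empty string or uppercase letters.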
Do you know what the real problem is? Is my stress.. Imagine even I know that you want to remap /$var into /index.php?URL=pages/$var.php I'm still trying giving you a wrong information that will rewrite /index.php?URL=pages/$var.php into /$var. I just have realize that after my 4 hours sleep. Did you see what's happening when the time of your sleep isn't right? Maybe a rule I would gives to you when my brain's in functioning well, was:
RewriteRule ^([a-z0-9-_]+)/?$ /index.php?URL=pages/$1.php
Why did readers let this happen? My previous code deserves to be voted down.

Help understanding rewrite log (want to internally rewrite a page when requested from specific HTTP_HOST)

I have a Drupal site, site.com, and our client has a campaign that they're promoting for which they've bought a new domain name, campaign.com. I'd like it so that a request for campaign.com internally rewrites to a particular page of the Drupal site. Note Drupal uses an .htaccess file in the document root.
The normal Drupal rewrite is
# Rewrite URLs of the form 'x' to the form 'index.php?q=x'.
RewriteCond %{REQUEST_FILENAME} !-f
RewriteCond %{REQUEST_FILENAME} !-d
RewriteCond %{REQUEST_URI} !=/favicon.ico
RewriteRule ^(.*)$ index.php?q=$1 [L,QSA]
I added the following before the normal rewrite.
# Custom URLS (eg. microsites) go here
RewriteCond %{HTTP_HOST} =campaign.com
RewriteCond %{REQUEST_URI} =/
RewriteRule ^ index.php?q=node/22 [L]
Unfortunately it doesn't work, it just shows the homepage. Turning on the rewrite log I get this.
1. [rid#2da8ea8/initial] (3) [perdir D:/wamp/www/] strip per-dir prefix: D:/wamp/www/ ->
2. [rid#2da8ea8/initial] (3) [perdir D:/wamp/www/] applying pattern '^' to uri ''
3. [rid#2da8ea8/initial] (2) [perdir D:/wamp/www/] rewrite '' -> 'index.php?q=node/22'
4. [rid#2da8ea8/initial] (3) split uri=index.php?q=node/22 -> uri=index.php, args=q=node/22
5. [rid#2da8ea8/initial] (3) [perdir D:/wamp/www/] add per-dir prefix: index.php -> D:/wamp/www/index.php
6. [rid#2da8ea8/initial] (2) [perdir D:/wamp/www/] strip document_root prefix: D:/wamp/www/index.php -> /index.php
7. [rid#2da8ea8/initial] (1) [perdir D:/wamp/www/] internal redirect with /index.php [INTERNAL REDIRECT]
8. [rid#2da7770/initial/redir#1] (3) [perdir D:/wamp/www/] strip per-dir prefix: D:/wamp/www/index.php -> index.php
9. [rid#2da7770/initial/redir#1] (3) [perdir D:/wamp/www/] applying pattern '^' to uri 'index.php'
10. [rid#2da7770/initial/redir#1] (3) [perdir D:/wamp/www/] strip per-dir prefix: D:/wamp/www/index.php -> index.php
11. [rid#2da7770/initial/redir#1] (3) [perdir D:/wamp/www/] applying pattern '^(.*)$' to uri 'index.php'
12. [rid#2da7770/initial/redir#1] (1) [perdir D:/wamp/www/] pass through D:/wamp/www/index.php
I'm not used to mod_rewrite, so I might be missing something, but comparing the logs from a call to http://site.com/node/3 and from http://campaign.com/ I can't see any meaningful difference. Specifically uri and args on line 4 seem correct, the internal redirect on line 7 seems right, and the pass through on line 12 seems right (because the file index.php exists). But for some reason it seems the query string's been discarded/ignored around the time of the internal redirect. I'm completely stumped.
Also, if anyone could provide a reference on understanding the rewrite log, that might help. It'd be great if there's a way to track the query string through the internal redirect.
FWIW I'm using WampServer 2.1 with Apache 2.2.17.
Thanks for asking this question, it's something that I need to do too. I don't know the way to do this by means of the .htaccess, and hope that someone here can answer that.
But I do the same thing by using Drupal's menu system with this code in a custom module:
function mymodule_menu() {
  // Register a path whose only job is to dispatch by host name.
  $items = array();
  $items['domain_redirect'] = array(
    'page callback' => 'domain_redirect',
    'type' => MENU_NORMAL_ITEM,
    'access arguments' => array('access content'),
  );
  return $items;
}

// Page callback: choose the destination based on the requested host.
function domain_redirect() {
  switch ($_SERVER['SERVER_NAME']) {
    case "campaign.com":
      $goto = "node/22";
      break;
    default:
      // Every other host goes to "/".
      $goto = "/";
  }
  drupal_goto($goto);
}
Then set the frontpage to domain_redirect.