Why would mod_rewrite rewrite twice? - apache

I only recently found out about URL rewriting, so I've still got a lot to learn.
While following the Easy Mod Rewrite tutorial, the results of one of their examples is really confusing me.
RewriteBase /
RewriteRule (.*) index.php?page=$1 [QSA,L]
Rewrites /home as /index.php?page=index.php&page=home.
I thought the duplicates might have had been caused by something in my host's configs, but a clean install of XAMPP does the same.
So, does anyone know why this seems to parse twice?
And, to me this seems like, if it's going to do this, it would be an infinite loop -- why does it stop at 2 cycles?

From Example 1 on this page, which is part of the tutorial linked in your question:
Assume you are using a CMS system that rewrites requests for everything to a single index.php script.
RewriteRule ^(.*)$ index.php?PAGE=$1 [L,QSA]
Yet every time you run that, regardless of which file you request, the PAGE variable always contains "index.php".
Why? You will end up doing two rewrites. Firstly, you request test.php. This gets rewritten to index.php?PAGE=test.php. A second request is now made for index.php?PAGE=test.php. This still matches your rewrite pattern, and in turn gets rewritten to index.php?PAGE=index.php.
One solution would be to add a RewriteCond that checks if the file is already "index.php". A better solution that also allows you to keep images and CSS files in the same directory is to use a RewriteCond that checks if the file exists, using -f.
1the link is to the Internet Archive, since the tutorial website appears to be offline

From the Apache Module mod_rewrite documentation:
'last|L' (last rule)
[…] if the RewriteRule generates an internal redirect […] this will reinject the request and will cause processing to be repeated starting from the first RewriteRule.
To prevent this you could either use an additional RewriteCond directive:
RewriteCond %{REQUEST_URI} !^/index\.php$
RewriteRule (.*) index.php?page=$1 [QSA,L]
Or you alter the pattern to not match index.php and use the REQUEST_URI variable, either in the redirect or later in PHP ($_SERVER['REQUEST_URI']).
RewriteRule !^index\.php$ index.php?page=%{REQUEST_URI} [QSA,L]

Related

mod_rewrite: hide real urls but keep available as different files

Possible this question has already been answered but I didn't find any answer after hours of searching.
I need to put the site under "maintenance mode" and redirect/rewrite all requests to site_down.html, but at the same time I need the site to be available if I enter the address like files are in a subfolder.
ex:
if I type http://example.com/login.php I need site_down.html to be displayed.
but if I specify http://example.com/test/login.php I need real login.php do be displayed.
I need this to be done with rewrite, so copying everything to another directory isn't a solution.
I tried a couple dozens of combinations, but I'm still unable to achieve what I need
This is one version of my .htaccess file ():
DirectoryIndex site_down.html
<IfModule mod_rewrite.c>
RewriteEngine On
RewriteRule ^test\/(.*)$ $1 [S=1]
RewriteRule ^(.*\.php)$ site_down.html
RewriteRule .* - [L]
</IfModule>
This code should rewrite all requests with "test/*" to "parent folder" and skip next rewrite rule and then terminate rewriting at RewriteRule .* - [L]. If there is no "test/" in url - all request should be rewritten to site_down.html
What am I doing wrong?
Could you suggest any valid solutions, please?
Thank you.
Essentially, you are searching for 2 rules. One rule will translate a virtual subdirectory to the working files. The other rule will translate the url to the working files to a splash page. We just have to make sure that if the first rule matches, the second rule doesn't match. We can do this by making sure " /test/" (including that leading space) was not in THE_REQUEST (or the string that the client sent to the server to request a page; something in the form of GET /test/mypage.php?apes=bananas HTTP/1.1). THE_REQUEST doesn't change on a rewrite, which makes it perfect for that. Skipping a rule like you did usually doesn't have the effect you expect, because mod_rewrite makes multiple passes through .htaccess until the resulting url doesn't change anymore, or it hits a limit and throws an error. The first time it will skip the rule, but the second time it will not do that.
RewriteCond %{THE_REQUEST} !\ /test/
RewriteRule \.php site_down.html [L]
RewriteRule ^test/(.*)$ $1 [L]

Redirect loop with simple htaccess rule

I have been pulling my air out over this. It worked before the server migration!
Ok so basically it's as simple as this:
I have a .php file that I want to view the content of using a SEO friendly URL via a ReWrite rule.
Also to canonicalise and to prevent duplicate content I want to 301 the .php version to the SEO friendly version.
This is what I used and has always worked till now on the new server:
RewriteRule ^friendly-url/$ friendly-url.php [L,NC]
RewriteRule ^friendly-url.php$ /friendly-url/$1 [R=301,L]
However disaster has struck and now it causes a redirect loop.
Logically I can only assume that in this version of Apache it is tripping up as it's seeing that the script being run is the .php version and so it tries the redirect again.
How can I re-work this to make it work? Or is there a config I need to switch in WHM?
Thanks!!
This is how your .htaccess should look like:
Options +FollowSymLinks -MultiViews
RewriteEngine On
RewriteBase /
# To externally redirect /friendly-url.php to /friendly-url/
RewriteCond %{THE_REQUEST} ^[A-Z]{3,}\s/+(friendly-url)\.php [NC]
RewriteRule ^ /%1/? [R=302,L]
## To internally redirect /anything/ to /anything.php
RewriteCond %{REQUEST_FILENAME} !-d
RewriteCond %{DOCUMENT_ROOT}/$1\.php -f
RewriteRule ^(.+?)/$ $1.php [L]
Note how I am using R=302, because I don't want the rule to cache on my browser until I confirm its working as expected, then, once I can confirm its working as expected I switch from R=302 to R=301.
Keep in mind you may have also been cached from previous attempts since you're using R=301, so you better of trying to access it from a different browser you have used just to make sure its working.
However disaster has struck and now it causes a redirect loop.
It causes a redirect loop because your redirecting it to itself, the different on my code is that I capture the request, and redirect the php files from there to make it friendly and then use the internal redirect.
The exact same .htaccess file will work differently depending on where it's placed because the [L]ast flag means something different depending on location. In ...conf, [L]ast means all finished processing so get out, but in .htaccess the exact same [L]ast flag means start all over at the top of this file.
To work as expected when moving a block of code from ...conf to .htaccess, most .htaccess files will need one or the other of these tweaks:
Change the [L]ast flags to [END]. (Problem is, the [END] flag is only available in newer [version 2.3.9 and later] Apaches, and won't even "fall back" in earlier versions.)
Add boilerplate code like this at the top of each of your .htaccess files:
*
RewriteCond %{ENV:REDIRECT_STATUS} !^[\s/]*$
RewriteRule ^ - [L]

Apache .htaccess RewriteRule

Here's my situation. I have a web root and several subdirectories, let's say:
/var/www
/var/www/site1
/var/www/site2
Due to certain limitations, I need the ability to keep one single domain and have separate folders like this. This will work fine for me, but many JS and CSS references in both sites point to things like:
"/js/file.js"
"/css/file.css"
Because these files are referenced absolutely, they are looking for the 'js' and 'css' directories in /var/www, which of course does not exist. Is there a way to use RewriteRules to redirect requests for absolutely referenced files to point to the correct subdirectory? I have tried doing things like:
RewriteEngine on
RewriteRule ^/$ /site1
or
RewriteEngine on
RewriteRule ^/js/(.*)$ /site1/js/$1
RewriteRule ^/css/(.*)$ /site1/css/$1
But neither of these work, even redirecting to only one directory, not to mention handling both site1 and site2. Is what I'm trying possible?
EDIT: SOLUTION
I ended up adapting Jon's advice to fit my situation. I have the ability to programatically make changes to my .htaccess file whenever a new subdirectory is added or removed. For each "site" that I want, I have the following section in my .htaccess:
RewriteCond %{REQUEST_URI} !^/$
RewriteCond %{REQUEST_URI} !^/index.php$
RewriteCond %{HTTP_COOKIE} sitename=site1
RewriteCond %{REQUEST_URI} !^/site1/
RewriteRule ^(.*)$ /site1/$1 [L]
Index.php is a file that lists all my sites, deletes the "sitename" cookie, and sets a cookie of "sitename=site#" when a particular one is selected. My RewriteConds check,
If the request is not for /
If the request is not for /index.php
If the request contains the cookie "sitename=site1"
If the request does not start with "/site1/"
If all of these conditions are met, then the request is rewritten to prepend "/site1/" before the request. I tried having a single set of Conds/Rules that would match (\w+) instead of "site1" in the third Condition, and then refer to %1 in the fourth Condition and in the Rule, but this did not work. I gave up and settled for this.
If the RewriteRules are in your .htaccess file, you need to remove the leading slashes in your match (apache strips them before sending it to mod_rewrite). Does this work?
RewriteEngine on
RewriteRule ^js/(.*)$ /site1/js/$1
RewriteRule ^css/(.*)$ /site1/css/$1
EDIT: To address the comment:
Yes, that works, but when I do RewriteRule ^(.*)$ /site1/$1, it causes Apache to issue internal server errors. But to me, it seems like that should just be a generic equivalent of the individual rules!
What's happening with that rule is when /something/ gets rewritten to /site/something/, and apache internally redirects, it gets rewritten again, to /site/site/something/, then again, then again, etc.
You'd need to add a condition to that, something like:
RewriteCond %{REQUEST_URI} !^/site/
RewirteRule ^(.*)$ /site/$1 [L]
You need to set up symlinks, which the rewrite rules will use so your absolute links at the server level can follow the symbolic links to the central site hosting account.

RewriteRule and php download counter

(1) I have a site that serves up MP3 files:
http://domain/files/1234567890.mp3
(2) I have a php script that tracks file download counts:
http://domain/modules/download_counter.php?file=/files/1234567890.mp3
After download_counter.php records the download, it redirects to the original file:
Header("Location: $FQDN_url");
(3) I'd like all my public links to be presented as the direct file urls from (1). I'm trying to use Apache to redirect the requests to download_counter.php:
RewriteRule ^files/(.+\.mp3)$ /modules/download_counter.php?file=/files/$1 [L]
I'm currently stuck on (3), as it results in a redirect loop, since download_counter.php simply redirects the request back to the original file (rather than streaming the file contents).
I'm also motivated to use download_counter.php as is (without modifying it's redirect behaviour). This is because the script is part of a larger CMS module, and I'd like to avoid complicating my upgrade path.
Perhaps there is no solution to my problem (other than modifying the download_counter script). WDYT?
If this is not about the strongest protection ever (as I can see, it is not), then just have your script to redirect browser not to the file, but to the
http://domain/files/1234567890.mp3/redirected
Ensure your webserver will still serve such request correctly as a file download. If it will, then just add negative RewriteCond that will ensure, that redirection is done if and only if the link is not ending with /redirected
UPDATED ANSWER
i think you are into a lot of troubles because your pseudo url are actually real urls: they lead to the file. So you should change your pseudo url to something like domain.com/downloads/file.mp3 and then just check whether the requested file does not exist, so that the redirect does not loop.
RewriteCond %{REQUEST_FILENAME} !-f
RewriteCond %{REQUEST_FILENAME} !-d
RewriteRule ^downloads/(.+\.mp3)$ /modules/download_counter.php?file=/files/$1 [L]
I first thought something that would use the referer would work:
RewriteCond %{HTTP_REFERER} !download_conter\.php
RewriteCond %{HTTP_REFERER} !=""
RewriteRule ^files/(.+\.mp3)$ /modules/download_counter.php?file=/files/$1 [L]
However, the browser does not cooperate. Let's say you click some link in file.html to something.mp3 and then are forwarded to download_counter.php. When the php script does the forward it sets as referer not download_counter.php but file.html.
The only way I see you could do this would be using an external rewriting program that would keep some state -- the first time the user requested the file it would save that information and make the rewrite, the second time it would know it had made the rewrite in the first place and would pass through the request unmodified. See the documentation of RewriteMap.
Put in http://domain/files/ this .htaccess file...
RewriteEngine On
RewriteOptions Inherit
RewriteRule ^(.*).(mp3)$ /modules/download_counter.php?file=$1.$2 [R,L]
This should do the trick...

Apache URL Rewriting,

I am trying to get URL rewriting to work on my website. Here is the contents of my .htaccess:
RewriteEngine On
RewriteRule ^blog/?$ index.php?page=blog [L]
RewriteRule ^about/?$ index.php?page=about [L]
RewriteRule ^portfolio/?$ index.php?page=portfolio [L]
#RewriteRule ^.*$ index.php?page=blog [L]
Now the 3 uncommented rewrite rules work perfectly, if I try http://www.mysite.com/blog/, I get redirected to http://www.mysite.com/index.php?page=blog, the same for "about" and "portfolio". However, if I mistype blog, say I try http://www.mysite.com/bloh/, then obviously I get a 404 error. The last rule, the commented one, was to help prevent that. Any URL should get redirected to the blog, but of course this rule is still parsed even if we have successfully used a previous one, so I used the "last" flag ([L]). If I uncomment my last rule, anything, including blog, about, and portfolio, redirect to blog. Shouldn't the "last" flag stop the execution as soon as it finds a matching rule?
Thanks.
Yes, the Last flag means it won't apply any of the rules following this rule in this request.
After rewriting the URL, it makes an internal request using the new rewritten URL which would match your last RewriteRule and thus your redirects go into an infinite loop.
Use the RewriteCond directive to limit rewriting to URLs that don't start with index.php, and you should be fine.
You could add a condition like:
RewriteCond %{REQUEST_URI} !^index\.php
I'll also mention that using RewriteRule ^.*$ is a good way to break all of your media requests (css, js, images) as well. You might want to add some conditions like:
RewriteCond %{REQUEST_FILENAME} !-d
RewriteCond %{REQUEST_FILENAME} !-f
To make sure you're not trying to rewrite actual files or directories that exist on your server. Otherwise they'll be unreachable unless index.php serves those too!
From apache's mod_rewrite docs
'last|L' (last rule)
Stop the rewriting process here and don't apply any more rewrite
rules. This corresponds to the Perl
last command or the break command in
C. Use this flag to prevent the
currently rewritten URL from being
rewritten further by following rules.
Remember, however, that if the
RewriteRule generates an internal
redirect (which frequently occurs when
rewriting in a per-directory context),
this will reinject the request and
will cause processing to be repeated
starting from the first RewriteRule.
You could use
ErrorDocument 404 /index.php?page=blog
but you should be aware of the fact that it doesn't return 404 error code, but a redirect one and I don't know if that is such a good practice.
After you [L]eave processing for the request, the whole processing runs again for the new (rewritten) URL. You could get out of that loop by using this before your other rules:
RewriteRule ^index.php - [L]
which means "for index.php, don't rewrite and leave processing."