simple 301 redirect with variable not working, why? - apache

Here's what I got so far. The first part works but not the redirect itself.
What do I need to do to make it work?
RewriteEngine On
RewriteRule ^([^/\.]+)/?$ page.php?name=$1 [L]
RewriteRule ^page.php?name=([^/\.]+)/?$ /$1 [R=301,L]
Also, if I have several of these rules, do I keep the [L] only on the last one?

Besides the first rule overriding the second one, your second rule also won't work because you're trying to match the query string in a RewriteRule. Try something like this instead:
RewriteEngine On
RewriteBase /
RewriteCond %{QUERY_STRING} ^name=([^/.&]+)/?$
RewriteCond %{ENV:REDIRECT_LOOP} !1
RewriteRule ^page\.php$ /%1? [NS,R=301,L]
RewriteRule ^([^/.]+)/?$ page.php?name=$1 [NS,QSA,E=LOOP:1]
(I included the QSA flag so that a URL like /foobar?foo=bar will be rewritten to /page.php?name=foobar&foo=bar instead of just /page.php?name=foobar. If you don't want that, leave it out.)
Note: The second RewriteCond is there to keep the first rule from matching again after the second one has matched. The problem is that, in .htaccess context, mod_rewrite acts more or less as if all rules had the PT flag, causing the ruleset to be rerun from the start after every rewrite, even internal ones. Or, to quote the documentation:
"If you are using RewriteRule in either .htaccess files or in <Directory> sections, it is important to have some understanding of how the rules are processed. The simplified form of this is that once the rules have been processed, the rewritten request is handed back to the URL parsing engine to do what it may with it. It is possible that as the rewritten request is handled, the .htaccess file or <Directory> section may be encountered again, and thus the ruleset may be run again from the start. Most commonly this will happen if one of the rules causes a redirect - either internal or external - causing the request process to start over."
The workaround I'm using is to set a custom environment variable with E=LOOP:1 when the internal rewrite triggers, and check for it before doing the external rewrite. Note that, when the request processing restarts after the internal rewrite, Apache prepends REDIRECT_ to the names of all environment variables set during the previous pass, so even though the variable we set is named just LOOP, the one we need to check for is REDIRECT_LOOP.
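An alternative to the environment-variable guard is to test THE_REQUEST, which holds the original request line and does not change across internal rewrites. The following is only a sketch under that assumption; the exact pattern for the request line is mine, not part of the answer above, and may need adjusting for your URLs:

```apache
RewriteEngine On
RewriteBase /

# Redirect externally only when the client itself asked for page.php?name=...
# THE_REQUEST looks like: GET /page.php?name=foobar HTTP/1.1
RewriteCond %{THE_REQUEST} ^[A-Z]+\s/page\.php\?name=([^/.&\s]+)/?\s
RewriteRule ^page\.php$ /%1? [R=301,L]

# Internal rewrite of the friendly URL; it never changes THE_REQUEST,
# so the redirect rule above does not fire on the next pass.
RewriteRule ^([^/.]+)/?$ page.php?name=$1 [QSA,L]
```

With this variant no E=LOOP flag is needed, because the condition looks at what the browser sent rather than at the current (possibly rewritten) URL.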

Related

mod_rewrite rule matched, but chain is not left

Please help me find a solution for this behavior, which seems very strange to me.
Here is my .htaccess:
<IfModule mod_rewrite.c>
RewriteEngine On
RewriteRule ^emltr\.gif$ aaa.html [L]
RewriteCond %{REQUEST_FILENAME} !-f
RewriteCond %{REQUEST_FILENAME} !-d
RewriteRule ^(.*)$ index.php [L,QSA]
</IfModule>
Note the L flag: I would expect rule processing to stop once the "gif" rule matches.
But it does not.
The requested URL is "emltr.gif".
If the "catch-all" rule and its conditions are commented out, the "gif" rule is applied correctly. ("aaa.html" does not exist; this is a test to rule out unwanted circular behavior.)
If the "catch-all" rule is uncommented, it is applied instead of the first rule. Why is the second rule taken rather than the first? Or, put differently: why does processing continue to the second rule despite the L flag?
Thank you
This is documented in the manual, under the L|last flag:
The [L] flag causes mod_rewrite to stop processing the rule set. In most contexts, this means that if the rule matches, no further rules will be processed.
If you are using RewriteRule in either .htaccess files or in <Directory> sections, ... The simplified form of this is that once the rules have been processed, the rewritten request is handed back to the URL parsing engine to do what it may with it. It is possible that as the rewritten request is handled, the .htaccess file or <Directory> section may be encountered again, and thus the ruleset may be run again from the start. Most commonly this will happen if one of the rules causes a redirect - either internal or external - causing the request process to start over.
An alternative flag, [END], can be used to terminate not only the current round of rewrite processing but prevent any subsequent rewrite processing from occurring in per-directory (htaccess) context. This does not apply to new requests resulting from external redirects.
In short, when rewrite rules live in an .htaccess file or inside a <Directory> section, the request is processed again whenever the current round rewrote it. Processing stops only once a round performs no further rewrite.
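As a sketch of the [END] alternative quoted above, assuming Apache 2.3.9 or later (where the flag is available), the rules from the question could be written like this:

```apache
<IfModule mod_rewrite.c>
RewriteEngine On
# END stops this round and also prevents any later per-directory
# rewrite pass, so the catch-all below never sees aaa.html.
RewriteRule ^emltr\.gif$ aaa.html [END]
RewriteCond %{REQUEST_FILENAME} !-f
RewriteCond %{REQUEST_FILENAME} !-d
RewriteRule ^(.*)$ index.php [L,QSA]
</IfModule>
```

Since aaa.html does not exist in the question's test setup, a request for emltr.gif should then fail on aaa.html instead of being handed to index.php.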

mod_rewrite - stop repeating for each directory

I have spent hours trying to get a simple rewrite working; there must be an error in my fundamental understanding of mod_rewrite.
I want a rule that does the following substitution:
www.example.com/fr/ -> www.example.com/?lang=fr
which I have working, but for subdirectories:
www.example.com/fr/other/directories/ -> www.example.com/other/directories/?lang=fr&lang=fr&lang=fr
It seems the rule is being applied once for every subdirectory (it should be applied only once).
Also, a request without a trailing slash causes another lang=fr to be appended to the query string.
The rule is located in the <VirtualHost> block, not within a <Directory> section:
RewriteRule ^/(en|fr|zh|gr|it)/(.*)$ /$2?lang=$1 [QSA]
I am also using the directive DirectoryIndex /index.php index.php.
Many thanks.
You could try adding a / just after $2 in the substitution, to see whether it affects the extra additions.
Also try adding [L] to the rule's flags; it might stop the rule repeating for every directory.
RewriteRule ^/(en|fr|zh|gr|it)/(.*)$ /$2/?lang=$1 [QSA,L]
However, I've heard [L] behaves differently in .htaccess than in httpd.conf. I'm no expert on it, I'm afraid.
To be certain a RewriteRule is applied only once, add this condition before it:
RewriteCond %{ENV:REDIRECT_STATUS} ^$
RewriteRule ^/(en|fr|zh|gr|it)/(.*)$ /$2?lang=$1 [QSA,L]
This works because any internal redirect made by mod_rewrite (and you'll have a lot of them, even with the [L] flag) sets the REDIRECT_STATUS environment variable.
The [L] flag stops the current chain of RewriteRules, in case other rules follow. But the result of a rewrite in your directory is always tested again against its new destination (here the new destination is the same directory), and the rules are applied to the rewritten URL as if it were a new request, except that this environment variable is now set.
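If the repeated lang=fr comes from internal subrequests rather than internal redirects (an assumption: the DirectoryIndex lookup is one plausible source, but I have not verified it for this setup), REDIRECT_STATUS may not be set for them. A minimal sketch of a variant that skips subrequests with the NS flag:

```apache
# Inside the <VirtualHost> block, as in the question.
RewriteEngine On
# NS (nosubreq): do not apply this rule to internal subrequests.
RewriteRule ^/(en|fr|zh|gr|it)/(.*)$ /$2?lang=$1 [QSA,L,NS]
```

Both guards can be combined; which one is actually needed depends on what triggers the extra passes on your server.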

How to prevent mod_rewrite from rewriting URLs more than once?

I want to use mod_rewrite to rewrite a few human-friendly URLs to arbitrary files in a folder called php (which is inside the web root, since mod_rewrite apparently won't let you rewrite to files outside the web root).
/ --> /php/home.php
/about --> /php/about_page.php
/contact --> /php/contact.php
Here are my rewrite rules:
Options +FollowSymlinks
RewriteEngine On
RewriteRule ^$ php/home.php [L]
RewriteRule ^about$ php/about_page.php [L]
RewriteRule ^contact$ php/contact.php [L]
However, I also want to prevent users from accessing files in this php directory directly. If a user enters any URL beginning with /php, I want them to get a 404 page.
I tried adding this extra rule at the end:
RewriteRule ^php php/404.php [L]
...(where 404.php is a file that outputs 404 headers and a "Not found" message.)
But when I access / or /about or /contact, I always get redirected to the 404. It seems the final RewriteRule is applied even to the internally rewritten URLs (as they now all start with /php).
I thought the [L] flag (on the first three RewriteRules) was supposed to prevent further rules from being applied? Am I doing something wrong? (Or is there a smarter way to do what I'm trying to do?)
The [L] flag should be used only on the last rule.
L (Last Rule) stops the rewriting process at that point and doesn't apply any more rewriting rules, and that is why you are facing these issues.
I had a similar problem. I have a content management system written in PHP and based on the Model-View-Controller paradigm. Its most basic part is mod_rewrite. I have successfully prevented direct access to PHP files globally. The trick is called THE_REQUEST.
What's the problem?
The rewrite module rewrites the URI. If the URI matches a rule, it is rewritten, and the remaining rules are applied to the new, rewritten URI. But if the matched rule ends with [L], the engine does not actually terminate; it starts again. The new URI no longer matches the rule ending with [L], so processing continues and matches the last rule. The result? The programmer starts cursing at the unexpected 404 error page. The computer does what you say, not what you want. I had this in my .htaccess file:
RewriteEngine On
RewriteBase /
RewriteRule ^plugins/.* pluginLoader.php [L]
RewriteCond %{REQUEST_URI} \.php$
RewriteRule .* index.php [L]
That's wrong: even URIs beginning with plugins/ end up rewritten to index.php.
Solution
You need to apply the rule if and only if the original (not rewritten) URI matches it. Regrettably, mod_rewrite does not provide a variable containing the original URI, but it does provide THE_REQUEST, which contains the first line of the HTTP request. This variable is invariant: it does not change while the rewrite engine is working.
...
RewriteCond %{THE_REQUEST} \s.*\.php\s
RewriteRule \.php$ index.php [L]
The regular expression is different: it is applied not to the URI alone but to the entire first line of the request, that is, to something like GET /script.php HTTP/1.1. This time the critical rule is applied only if the user explicitly requests a PHP script directly; the rewritten URI is not considered.
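Applied to the /php directory from the question, the same idea might look like the sketch below. The THE_REQUEST pattern is my own assumption about what counts as a direct request for /php, so treat it as a starting point:

```apache
Options +FollowSymlinks
RewriteEngine On

RewriteRule ^$ php/home.php [L]
RewriteRule ^about$ php/about_page.php [L]
RewriteRule ^contact$ php/contact.php [L]

# Send to the 404 script only when the client itself requested /php...
# After an internal rewrite, THE_REQUEST still reads e.g. "GET /about HTTP/1.1",
# so the condition fails and the rewritten php/about_page.php is served.
RewriteCond %{THE_REQUEST} ^[A-Z]+\s/php(/|\?|\s)
RewriteRule ^php php/404.php [L]
```

A request for /about is rewritten to php/about_page.php, and on the next pass the last rule's pattern matches but its condition does not, so nothing further happens.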

Apache URL Rewriting

I am trying to get URL rewriting to work on my website. Here are the contents of my .htaccess:
RewriteEngine On
RewriteRule ^blog/?$ index.php?page=blog [L]
RewriteRule ^about/?$ index.php?page=about [L]
RewriteRule ^portfolio/?$ index.php?page=portfolio [L]
#RewriteRule ^.*$ index.php?page=blog [L]
Now the 3 uncommented rewrite rules work perfectly, if I try http://www.mysite.com/blog/, I get redirected to http://www.mysite.com/index.php?page=blog, the same for "about" and "portfolio". However, if I mistype blog, say I try http://www.mysite.com/bloh/, then obviously I get a 404 error. The last rule, the commented one, was to help prevent that. Any URL should get redirected to the blog, but of course this rule is still parsed even if we have successfully used a previous one, so I used the "last" flag ([L]). If I uncomment my last rule, anything, including blog, about, and portfolio, redirect to blog. Shouldn't the "last" flag stop the execution as soon as it finds a matching rule?
Thanks.
Yes, the Last flag means it won't apply any of the rules following this rule in this request.
After rewriting the URL, Apache makes an internal request for the new, rewritten URL, which matches your last RewriteRule again, so every request ends up being rewritten by the catch-all.
Use the RewriteCond directive to limit rewriting to URLs that don't start with index.php, and you should be fine.
You could add a condition like this (note that REQUEST_URI begins with a slash):
RewriteCond %{REQUEST_URI} !^/index\.php
I'll also mention that using RewriteRule ^.*$ is a good way to break all of your media requests (css, js, images) as well. You might want to add some conditions like:
RewriteCond %{REQUEST_FILENAME} !-d
RewriteCond %{REQUEST_FILENAME} !-f
These make sure you're not rewriting requests for files or directories that actually exist on your server. Otherwise they'll be unreachable unless index.php serves those too!
From Apache's mod_rewrite docs:
'last|L' (last rule)
Stop the rewriting process here and don't apply any more rewrite rules. This corresponds to the Perl last command or the break command in C. Use this flag to prevent the currently rewritten URL from being rewritten further by following rules. Remember, however, that if the RewriteRule generates an internal redirect (which frequently occurs when rewriting in a per-directory context), this will reinject the request and will cause processing to be repeated starting from the first RewriteRule.
You could use
ErrorDocument 404 /index.php?page=blog
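but be aware of which status the client sees: with a local path like this, the response keeps the 404 status unless index.php itself sends a different one, whereas a full URL (one starting with http://) would turn it into a redirect. I don't know whether serving the blog page under a 404 status is good practice.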
After you [L]eave processing for the request, the whole processing runs again for the new (rewritten) URL. You could get out of that loop by using this before your other rules:
RewriteRule ^index\.php - [L]
which means "for index.php, don't rewrite and leave processing."
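Putting the suggestions from this thread together, a complete .htaccess might look like the following sketch. It assumes index.php sits in the same directory as the .htaccess file, and the catch-all pattern ^ (match anything) is my own simplification of ^.*$:

```apache
RewriteEngine On

# Leave index.php alone so the catch-all cannot re-match it on the next pass.
RewriteRule ^index\.php - [L]

RewriteRule ^blog/?$ index.php?page=blog [L]
RewriteRule ^about/?$ index.php?page=about [L]
RewriteRule ^portfolio/?$ index.php?page=portfolio [L]

# Catch-all: anything that is not an existing file or directory goes to the blog.
RewriteCond %{REQUEST_FILENAME} !-f
RewriteCond %{REQUEST_FILENAME} !-d
RewriteRule ^ index.php?page=blog [L]
```

With the first rule in place, /blog/ is rewritten once to index.php?page=blog and the second pass stops immediately, while /bloh/ falls through to the catch-all.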

Why would mod_rewrite rewrite twice?

I only recently found out about URL rewriting, so I've still got a lot to learn.
While following the Easy Mod Rewrite tutorial, the results of one of their examples is really confusing me.
RewriteBase /
RewriteRule (.*) index.php?page=$1 [QSA,L]
Rewrites /home as /index.php?page=index.php&page=home.
I thought the duplicates might have been caused by something in my host's configs, but a clean install of XAMPP does the same.
So, does anyone know why this seems to be parsed twice?
Also, if it is going to do this, I would expect an infinite loop. Why does it stop at two cycles?
From Example 1 on this page, which is part of the tutorial linked in your question:
Assume you are using a CMS system that rewrites requests for everything to a single index.php script.
RewriteRule ^(.*)$ index.php?PAGE=$1 [L,QSA]
Yet every time you run that, regardless of which file you request, the PAGE variable always contains "index.php".
Why? You will end up doing two rewrites. Firstly, you request test.php. This gets rewritten to index.php?PAGE=test.php. A second request is now made for index.php?PAGE=test.php. This still matches your rewrite pattern, and in turn gets rewritten to index.php?PAGE=index.php.
One solution would be to add a RewriteCond that checks if the file is already "index.php". A better solution that also allows you to keep images and CSS files in the same directory is to use a RewriteCond that checks if the file exists, using -f.
(Note: the link is to the Internet Archive, since the tutorial website appears to be offline.)
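The -f approach described in the quoted tutorial could look roughly like this for the rule from the question. It is a sketch that assumes index.php exists as a real file next to the .htaccess:

```apache
RewriteBase /
# Rewrite only requests that do not map to an existing file.
# On the second pass the target is index.php, which exists, so the
# condition fails and page keeps the value from the first pass.
RewriteCond %{REQUEST_FILENAME} !-f
RewriteRule (.*) index.php?page=$1 [QSA,L]
```

This also leaves images and CSS files in the same directory reachable, since they exist on disk and are therefore never rewritten.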
From the Apache Module mod_rewrite documentation:
'last|L' (last rule)
[…] if the RewriteRule generates an internal redirect […] this will reinject the request and will cause processing to be repeated starting from the first RewriteRule.
To prevent this you could either use an additional RewriteCond directive:
RewriteCond %{REQUEST_URI} !^/index\.php$
RewriteRule (.*) index.php?page=$1 [QSA,L]
Or you can alter the pattern so that it does not match index.php, and use the REQUEST_URI variable, either in the substitution or later in PHP ($_SERVER['REQUEST_URI']):
RewriteRule !^index\.php$ index.php?page=%{REQUEST_URI} [QSA,L]