mod_rewrite - stop repeating for each directory - apache

I have spent hours trying to get a simple rewrite working, so there must be an error in my fundamental understanding of mod_rewrite.
I want a rule that does the following substitution:
www.example.com/fr/ -> www.example.com/?lang=fr
which I have working, but for subdirectories:
www.example.com/fr/other/directories/ -> www.example.com/other/directories/?lang=fr&lang=fr&lang=fr
It seems the rule is being applied once for every sub-directory (it should only be applied once).
Also, a request without a trailing slash causes another lang=fr to be appended to the query string.
The rule is located in the <VirtualHost> block and not within a <Directory> section:
RewriteRule ^/(en|fr|zh|gr|it)/(.*)$ /$2?lang=$1 [QSA]
I am also using the directive DirectoryIndex /index.php index.php.
Many thanks.

You could try adding a / just after $2 in the substitution (the second argument of the rule) to see whether it affects the extra additions.
Also try adding the [L] flag; it might stop the rule repeating for every directory.
RewriteRule ^/(en|fr|zh|gr|it)/(.*)$ /$2/?lang=$1 [QSA,L]
However, I've heard that [L] behaves differently in .htaccess than in httpd.conf. I'm no expert on this, I'm afraid.

To be certain a RewriteRule will only be applied once, add this condition before it:
RewriteCond %{ENV:REDIRECT_STATUS} ^$
RewriteRule ^/(en|fr|zh|gr|it)/(.*)$ /$2?lang=$1 [QSA,L]
This works because any internal redirection made by mod_rewrite (and you'll have a lot, even with the [L] flag) alters the REDIRECT_STATUS environment variable.
The [L] flag means: stop processing the chain of RewriteRules here, in case you have other rules after this one. But the result of the mod_rewrite rewrite in your directory will always be tested against its new destination (and here the new destination is the same directory), and the rules will be applied to the rewritten URL as if it were a new request (except that this environment variable is set).
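A different guard, not taken from the answer above, is to skip the rule whenever the query string already carries a lang parameter. This is only a minimal sketch under that assumption; the rule itself is the one from the question.

```apache
# Sketch for the <VirtualHost> block from the question.
RewriteEngine On

# Skip the rule if lang= is already present, either at the start of the
# query string or after an & separator.
RewriteCond %{QUERY_STRING} !(^|&)lang=
RewriteRule ^/(en|fr|zh|gr|it)/(.*)$ /$2?lang=$1 [QSA,L]
```

The trade-off is that a request such as /fr/page?lang=en is left untouched, so this only suits sites where clients never send lang themselves.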

Related

Apache Rules for JS/CSS versioning using mod_rewrite

I've been trying this for the last 2h.
how can I use mod rewrite to do this:
/assets/app.min.4364736473.js -> /assets/app.min.js
/assets/app.min.4364736473.css -> /assets/app.min.css
Not that there's anything wrong with nikoshr's answer, but perhaps this is more versatile, working anywhere in the site that the format is used, since it's not likely to be used for any usual extension. In case it is, it checks that the file with the number doesn't exist before executing. The .min is optional, it works with or without it:
RewriteEngine on
RewriteCond %{REQUEST_FILENAME} !-f
RewriteRule ^(.+)\.\d+(\.(?:js|css))$ $1$2
Finally, I'm not using the [L] flag, since other rules might need to be applied to the rewritten URL and there's no need to stop that from happening.
It will only work properly in a .htaccess or <Directory> context. Otherwise (in a root config or <VirtualHost> context) it will need modifying for the file test to work or existing files won't be noticed and it will execute anyway.
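For the server config or <VirtualHost> case, a minimal sketch of the modified file test might look like this. It assumes the files are served straight from the DocumentRoot (no Alias or similar mapping), because at that stage REQUEST_FILENAME has not yet been mapped to a filesystem path and the path has to be built by hand.

```apache
RewriteEngine on

# In server/virtual-host context the pattern sees the full URL path
# (with its leading slash) and REQUEST_FILENAME is not yet a real path,
# so test DOCUMENT_ROOT + REQUEST_URI instead.
RewriteCond %{DOCUMENT_ROOT}%{REQUEST_URI} !-f
RewriteRule ^(.+)\.\d+(\.(?:js|css))$ $1$2
```

With this, /assets/app.min.4364736473.js would be rewritten to /assets/app.min.js unless a file with the numbered name really exists under the document root.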
It could be
RewriteEngine on
RewriteRule /assets/app\.min\.\d+\.js$ /assets/app.min.js [L]
RewriteRule /assets/app\.min\.\d+\.css$ /assets/app.min.css [L]

simple 301 redirect with variable not working, why?

Here's what I got so far. The first part works but not the redirect itself.
What do I need to do to make it work?
RewriteEngine On
RewriteRule ^([^/\.]+)/?$ page.php?name=$1 [L]
RewriteRule ^page.php?name=([^/\.]+)/?$ /$1 [R=301,L]
Also if I have multiple of these rules do I leave the [L] only on the last one?
Besides the first rule overriding the second one, your second rule also won't work because you're trying to match the query string in a RewriteRule. Try something like this instead:
RewriteEngine On
RewriteBase /
RewriteCond %{QUERY_STRING} ^name=([^/.&]+)/?$
RewriteCond %{ENV:REDIRECT_LOOP} !1
RewriteRule ^page\.php$ /%1? [NS,R=301,L]
RewriteRule ^([^/.]+)/?$ page.php?name=$1 [NS,QSA,E=LOOP:1]
(I included the QSA flag so that an URL like /foobar?foo=bar will be rewritten to /page.php?name=foobar&foo=bar instead of just /page.php?name=foobar. If you don't want that, leave it out.)
Note: The second RewriteCond is there to keep the first rule from matching again after the second one has matched. The problem is that, in .htaccess context, mod_rewrite acts more or less as if all rules had the PT flag, causing the ruleset to be rerun from the start after every rewrite, even internal ones. Or, to quote the documentation:
"If you are using RewriteRule in either .htaccess files or in <Directory> sections, it is important to have some understanding of how the rules are processed. The simplified form of this is that once the rules have been processed, the rewritten request is handed back to the URL parsing engine to do what it may with it. It is possible that as the rewritten request is handled, the .htaccess file or <Directory> section may be encountered again, and thus the ruleset may be run again from the start. Most commonly this will happen if one of the rules causes a redirect - either internal or external - causing the request process to start over."
The workaround I'm using is to set a custom environment variable with E=LOOP:1 when the internal rewrite triggers, and check for it before doing the external rewrite. Note that, when the request processing restarts after the internal rewrite, Apache prepends REDIRECT_ to the names of all environment variables set during the previous pass, so even though the variable we set is named just LOOP, the one we need to check for is REDIRECT_LOOP.
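An alternative sketch that avoids the environment variable is to test THE_REQUEST instead of using the E=LOOP approach. THE_REQUEST holds the request line the client actually sent, so it never shows page.php after an internal rewrite. The exact pattern below is illustrative and not part of the answer above.

```apache
RewriteEngine On
RewriteBase /

# External redirect only when the client itself asked for /page.php?name=...
# THE_REQUEST looks like: GET /page.php?name=foo HTTP/1.1
RewriteCond %{THE_REQUEST} ^[A-Z]+\s/page\.php\?name=([^/.&\s]+)
RewriteRule ^page\.php$ /%1? [R=301,L]

# Internal rewrite of the pretty URL. THE_REQUEST still shows /foo here,
# so the redirect rule above does not fire on the next pass.
RewriteRule ^([^/.]+)/?$ page.php?name=$1 [QSA,L]
```

As in the answer's version, the trailing ? in the redirect target drops the original query string from the redirected URL.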

How to prevent mod_rewrite from rewriting URLs more than once?

I want to use mod_rewrite to rewrite a few human-friendly URLs to arbitrary files in a folder called php (which is inside the web root, since mod_rewrite apparently won't let you rewrite to files outside the web root).
/ --> /php/home.php
/about --> /php/about_page.php
/contact --> /php/contact.php
Here are my rewrite rules:
Options +FollowSymlinks
RewriteEngine On
RewriteRule ^$ php/home.php [L]
RewriteRule ^about$ php/about_page.php [L]
RewriteRule ^contact$ php/contact.php [L]
However, I also want to prevent users from accessing files in this php directory directly. If a user enters any URL beginning with /php, I want them to get a 404 page.
I tried adding this extra rule at the end:
RewriteRule ^php php/404.php [L]
...(where 404.php is a file that outputs 404 headers and a "Not found" message.)
But when I access / or /about or /contact, I always get redirected to the 404. It seems the final RewriteRule is applied even to the internally rewritten URLs (as they now all start with /php).
I thought the [L] flag (on the first three RewriteRules) was supposed to prevent further rules from being applied? Am I doing something wrong? (Or is there a smarter way to do what I'm trying to do?)
The [L] flag should be used only in the last rule.
L (Last Rule) stops the rewriting process here and doesn't apply any more rewriting rules, and because of that you are facing issues.
I had a similar problem. I have a content management system written in PHP and based on the Model-View-Controller paradigm. Its most basic part is mod_rewrite. I've successfully prevented access to PHP files globally. The trick is called THE_REQUEST.
What's the problem?
The rewrite module rewrites the URI. If the URI matches a rule, it is rewritten and the other rules are applied to the new, rewritten URI. But if the matched rule ends with [L], the engine doesn't actually terminate; it starts again. The new URI then no longer matches the rule ending with [L], so processing continues and matches the last rule. The result? The programmer starts cursing at the unexpected 404 error page. However, the computer does what you say, not what you want. I had this in my .htaccess file:
RewriteEngine On
RewriteBase /
RewriteRule ^plugins/.* pluginLoader.php [L]
RewriteCond %{REQUEST_URI} \.php$
RewriteRule .* index.php [L]
That's wrong. Even the URIs beginning with plugins/ are rewritten to index.php.
Solution
You need to apply the rule if and only if the original (not rewritten) URI matches the rule. Regrettably, mod_rewrite does not provide a variable containing the original URI, but it does provide the THE_REQUEST variable, which contains the first line of the HTTP request (the request line). This variable is invariant: it doesn't change while the rewrite engine is working.
...
RewriteCond %{THE_REQUEST} \s.*\.php\s
RewriteRule \.php$ index.php [L]
The regular expression is different. It is applied not only to the URI but to the entire first line of the request, that is, to something like GET /script.php HTTP/1.1. This time the critical rule is applied only if the user explicitly requests a PHP script directly. The rewritten URI is not used.
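Applied to the rules from the question, the same idea could be sketched as follows. The THE_REQUEST pattern is an assumption about the request format (method, a space, then a path under /php) and is not taken from the answer above.

```apache
Options +FollowSymlinks
RewriteEngine On

RewriteRule ^$ php/home.php [L]
RewriteRule ^about$ php/about_page.php [L]
RewriteRule ^contact$ php/contact.php [L]

# Serve the 404 page only when the client itself requested a URL under /php.
# After an internal rewrite THE_REQUEST still holds e.g. "GET /about HTTP/1.1",
# so the condition fails and the rewritten php/... target is left alone.
RewriteCond %{THE_REQUEST} ^[A-Z]+\s/php(/|\?|\s)
RewriteRule ^php php/404.php [L]
```

A direct request for /php/contact.php matches the condition and gets the 404 page, while /contact is rewritten internally and served normally.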

Apache Rewrite Directory With Exceptions

I am trying to set up a rewrite that will catch any pages in the news folder, with the exception of index.shtml and template.shtml (where template.shtml will have a GET variable news in it). All other pages should rewrite to template.shtml?news=(same name as news/name).
What I have so far is:
RewriteCond %{REQUEST_URI} !^/news/((index|template).shtml)?$
RewriteRule ^news/(.*) /news/template.shtml?news=$1
This seems to exclude the main /news/, but not template.shtml and the rewrite seems to loop.
How can I resolve this? Any help would be appreciated.
Thanks.
Well -- this one works just fine for me:
RewriteCond %{REQUEST_URI} !^/news/(index|template)\.shtml$
RewriteRule ^news/(.+)$ /news/template.shtml?news=$1 [L,QSA]
This rule will ignore requests to /news/index.shtml and /news/template.shtml.
It will also do nothing when requesting just /news/ (as I have changed .* to .+ to be on the safe side).
Anything else will be rewritten to /news/template.shtml?news=whatever
I've also added the QSA flag to preserve any existing query string (useful for keeping referral data, e.g. /news/hello-pink-kitten?source=google will be rewritten as /news/template.shtml?news=hello-pink-kitten&source=google)

Why would mod_rewrite rewrite twice?

I only recently found out about URL rewriting, so I've still got a lot to learn.
While following the Easy Mod Rewrite tutorial, the results of one of their examples is really confusing me.
RewriteBase /
RewriteRule (.*) index.php?page=$1 [QSA,L]
Rewrites /home as /index.php?page=index.php&page=home.
I thought the duplicates might have been caused by something in my host's configs, but a clean install of XAMPP does the same.
So, does anyone know why this seems to parse twice?
Also, if it's going to do this, it seems to me it should be an infinite loop. Why does it stop after 2 cycles?
From Example 1 on this page, which is part of the tutorial linked in your question:
Assume you are using a CMS system that rewrites requests for everything to a single index.php script.
RewriteRule ^(.*)$ index.php?PAGE=$1 [L,QSA]
Yet every time you run that, regardless of which file you request, the PAGE variable always contains "index.php".
Why? You will end up doing two rewrites. Firstly, you request test.php. This gets rewritten to index.php?PAGE=test.php. A second request is now made for index.php?PAGE=test.php. This still matches your rewrite pattern, and in turn gets rewritten to index.php?PAGE=index.php.
One solution would be to add a RewriteCond that checks if the file is already "index.php". A better solution that also allows you to keep images and CSS files in the same directory is to use a RewriteCond that checks if the file exists, using -f.
(Note: the link is to the Internet Archive, since the tutorial website appears to be offline.)
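The -f check suggested in the quoted tutorial could be sketched like this. The rule is the tutorial's; the condition is the suggested fix, and the sketch assumes a .htaccess file in the document root.

```apache
RewriteEngine On
RewriteBase /

# Only rewrite requests that do not map to an existing file, so index.php
# itself (and images, CSS, etc.) is served as-is on the second pass.
RewriteCond %{REQUEST_FILENAME} !-f
RewriteRule ^(.*)$ index.php?PAGE=$1 [L,QSA]
```

Because index.php exists as a file, the condition fails on the second pass and PAGE keeps the originally requested name, provided that name is not itself an existing file.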
From the Apache Module mod_rewrite documentation:
'last|L' (last rule)
[…] if the RewriteRule generates an internal redirect […] this will reinject the request and will cause processing to be repeated starting from the first RewriteRule.
To prevent this you could either use an additional RewriteCond directive:
RewriteCond %{REQUEST_URI} !^/index\.php$
RewriteRule (.*) index.php?page=$1 [QSA,L]
Or you alter the pattern to not match index.php and use the REQUEST_URI variable, either in the redirect or later in PHP ($_SERVER['REQUEST_URI']).
RewriteRule !^index\.php$ index.php?page=%{REQUEST_URI} [QSA,L]