Ignore requests from internal redirects - apache

RewriteRule ^resources/.+$ - [L]
RewriteRule .? index.php?t=$0 [QSA,L]
These rules produce a 500 Internal Server Error: the second rule is applied again and again, because every internally redirected request is treated exactly like the original one. The result is an endless chain such as index.php?t=index.php&t=index.php&t=index.php&[...infinite more...]&t=test.php
But in my opinion this is not much better:
RewriteRule ^resources/.+$ - [L]
RewriteCond %{QUERY_STRING} !t=
RewriteCond %{REQUEST_URI} !^index\.php$
RewriteRule .? index.php?t=$0 [QSA,L]
Because now the user could enter index.php?t=test.php as the address, bypass the rewriting, and get the same content as if they had requested test.php. I don't like that.
So how do I execute the first one without the issue of repeating internal redirects? Surely, a flag VL - Very Last would do the trick but sadly it does not exist.

First, let us look at all the variables available to the rules that could indicate whether this is a chained (internally redirected) request or not. We need either 1) a variable that changes on chained requests independently of the rewritten URI, or 2) the opposite: a variable that depends on the URI but does not change, so that it can be compared against the ones that did change.
The problem is that almost all of them are updated according to the applied RewriteRules.
IS_SUBREQ (1) and THE_REQUEST (2) are the only interesting candidates, but sadly internal redirects are not treated as subrequests, so IS_SUBREQ is ruled out. Only THE_REQUEST stays unchanged and contains the path as originally requested, so we have found our entry point.
With this in mind, here is the annoyingly complex solution:
RewriteEngine On
# Set SCRIPT_URI and SUBREQ
# MUST be the first statements in the file
# SCRIPT_URI is the original browser-requested path
# SUBREQ is "true" once the original browser-requested path has been overridden by a rewrite
RewriteCond %{ENV:REQUEST_PARSED} !true
RewriteCond %{THE_REQUEST} ^\s*\w+\s+(http://[^\s/]+/|/?)([^\s\?]*)[\s\?$]
RewriteRule .? - [E=SCRIPT_URI:/%2,C]
RewriteRule .? - [E=REQUEST_PARSED:true]
RewriteCond %{ENV:SCRIPT_URI} ^(.*?)/\.($|/.*$)
RewriteRule .? - [E=SCRIPT_URI:%1%2,N]
RewriteCond %{ENV:SCRIPT_URI} ^(.*?)/[^/]+/\.\.($|/.*$)
RewriteRule .? - [E=SCRIPT_URI:%1%2,N]
RewriteCond %{ENV:SCRIPT_URI} ^(.*?)//\.\.($|/.*$)
RewriteRule .? - [E=SCRIPT_URI:%1/%2,N]
RewriteCond %{ENV:SCRIPT_URI}#%{REQUEST_URI} !^/*(.*)#/*\1$
RewriteRule .? - [E=SUBREQ:true]
# SCRIPT_URI and SUBREQ are set now. Actual content follows:
RewriteCond %{ENV:SUBREQ} !true
RewriteRule ^resources/.+$ - [L]
RewriteCond %{ENV:SUBREQ} !true
RewriteRule .? index.php?t=$0 [QSA,L]
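If the server runs Apache 2.4 (the END flag was introduced in 2.3.9), the "very last" behaviour wished for in the question is available directly. A minimal sketch, assuming the rules live in a per-directory (.htaccess) context and that depending on 2.4-only syntax is acceptable:

```apache
RewriteEngine On

# Leave requests for static resources untouched and stop all rewriting.
RewriteRule ^resources/.+$ - [END]

# Route everything else through index.php. Unlike [L], [END] also
# prevents another rewrite pass over the internally redirected request,
# so this rule is applied only once per browser request.
RewriteRule .? index.php?t=$0 [QSA,END]
```

On older versions, where END is not available, the THE_REQUEST workaround above remains the fallback.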

Related

Combination of rewrites in .htaccess doesn't work

I have different rules in my .htaccess file which work fine individually but combined in one file they don't.
Here are some examples of my file:
# take care of %C2%A0
RewriteRule ^(.+)\xc2\xa0(.+)$ $1-$2 [L,NE]
# executes **repeatedly** as long as there are more than 1 spaces in URI
RewriteRule "^(\S*) +(\S* .*)$" $1-$2 [L,NE]
# executes when there is exactly 1 space in URI
RewriteRule "^productdetails/617/6/(\S*) (\S*?)/?$" /$1-$2/302 [L,R=302,NE]
Also I've got the following:
RewriteCond %{QUERY_STRING} ^pid=617/?$
RewriteRule ^productdetails\.asp$ /Casio-CDP120-Digital-Piano-in-Black/302? [L,NC,R=301]
which still works fine.
I have now added the following:
RewriteRule "^categories/3/Kawai Digital Pianos/?$" /Compare/Kawai-Digital-Pianos [L,NC,R=301]
which used to rewrite
mysite.co.uk/categories/3/Kawai%20Digital%20Pianos/ to mysite.co.uk/Compare/Kawai-Digital-Pianos
but this no longer works.
Any help getting the last rule to work in combination with the others would be great.
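You just need to make sure the order of the rules is correct. If the new rule comes after the space-replacing rules, those rules have already turned "Kawai Digital Pianos" into a hyphenated form before the new pattern, which expects spaces, is tested. Put the specific redirects first. For your examples the following order should work: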
RewriteRule "^categories/3/Kawai Digital Pianos/?$" /Compare/Kawai-Digital-Pianos [L,NC,R=301]
RewriteCond %{QUERY_STRING} ^pid=617/?$
RewriteRule ^productdetails\.asp$ /Casio-CDP120-Digital-Piano-in-Black/302? [L,NC,R=301]
# take care of %C2%A0
RewriteRule ^(.+)\xc2\xa0(.+)$ $1-$2 [L,NE]
# executes **repeatedly** as long as there are more than 1 spaces in URI
RewriteRule "^(\S*) +(\S* .*)$" $1-$2 [L,NE]
# executes when there is exactly 1 space in URI
RewriteRule "^productdetails/617/6/(\S*) (\S*?)/?$" /$1-$2/302 [L,R=301,NE]

RewriteCond to match query string parameters in any order

I have a URL which may contain three parameters:
?category=computers
&subcategory=laptops
&product=dell-inspiron-15
I need to 301 redirect this URL to its friendly version:
http://store.example.com/computers/laptops/dell-inspiron-15/
I have this, but cannot make it work if the query string parameters are in any other order:
RewriteCond %{QUERY_STRING} ^category=(\w+)&subcategory=(\w+)&product=(\w+) [NC]
RewriteRule ^index\.php$ http://store.example.com/%1/%2/%3/? [R,L]
You can achieve this in multiple steps: detect one parameter, forward to the next step, and finally redirect to the final destination:
RewriteEngine On
RewriteCond %{QUERY_STRING} ^category=([^&]+) [NC,OR]
RewriteCond %{QUERY_STRING} &category=([^&]+) [NC]
RewriteRule ^index\.php$ $0/%1
RewriteCond %{QUERY_STRING} ^subcategory=([^&]+) [NC,OR]
RewriteCond %{QUERY_STRING} &subcategory=([^&]+) [NC]
RewriteRule ^index\.php/[^/]+$ $0/%1
RewriteCond %{QUERY_STRING} ^product=([^&]+) [NC,OR]
RewriteCond %{QUERY_STRING} &product=([^&]+) [NC]
RewriteRule ^index\.php/([^/]+/[^/]+)$ http://store.example.com/$1/%1/? [R,L]
To avoid the OR and double condition, you can use
RewriteCond %{QUERY_STRING} (?:^|&)category=([^&]+) [NC]
as @TrueBlue suggested.
Another approach is to prefix the TestString QUERY_STRING with an ampersand & and then always check for &name=
RewriteCond &%{QUERY_STRING} &category=([^&]+) [NC]
This technique (prefixing the TestString) can also be used to carry already found parameters forward to the next RewriteCond. This lets us simplify the three rules to just one:
RewriteCond &%{QUERY_STRING} &category=([^&]+) [NC]
RewriteCond %1!&%{QUERY_STRING} (.+)!.*&subcategory=([^&]+) [NC]
RewriteCond %1/%2!&%{QUERY_STRING} (.+)!.*&product=([^&]+) [NC]
RewriteRule ^index\.php$ http://store.example.com/%1/%2/? [R,L]
The ! is only used to separate the already found and reordered parameters from the QUERY_STRING.
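To make the carry-forward mechanics concrete, here is the same rule annotated with a worked trace for one sample query string. The sample request and the use of R=301 (which the question asked for) are illustrative choices, not part of the answer above:

```apache
# Sample request (hypothetical):
#   /index.php?product=dell-inspiron-15&category=computers&subcategory=laptops

# TestString: &product=dell-inspiron-15&category=computers&subcategory=laptops
# Captures:   %1 = computers
RewriteCond &%{QUERY_STRING} &category=([^&]+) [NC]

# TestString: computers!&product=dell-inspiron-15&category=computers&subcategory=laptops
# Captures:   %1 = computers, %2 = laptops
RewriteCond %1!&%{QUERY_STRING} (.+)!.*&subcategory=([^&]+) [NC]

# TestString: computers/laptops!&product=dell-inspiron-15&category=computers&subcategory=laptops
# Captures:   %1 = computers/laptops, %2 = dell-inspiron-15
RewriteCond %1/%2!&%{QUERY_STRING} (.+)!.*&product=([^&]+) [NC]

# Only the captures of the last matched condition are visible here, so
# %1/%2 expands to computers/laptops/dell-inspiron-15.
RewriteRule ^index\.php$ http://store.example.com/%1/%2/? [R=301,L]
```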
I take a slightly different approach for this sort of thing, leveraging ENV VARs set and read by mod_rewrite. I find it more readable / maintainable to refer to the backreferences by name like this, and these ENV VARs can be reused later in request processing too. Overall I think it's a more powerful and flexible approach than the accepted answer here. In any case, it works well for me. I've copied my gist below in its entirety:
From https://gist.github.com/cweekly/5ee064ddd551e1997d4c
# Mod_rewrite is great at manipulating HTTP requests.
# Using it to set and read temp env vars is a helpful technique.
#
# This example walks through fixing a query string:
# Extract good query params, discard unwanted ones, reorder good ones, append one new one.
#
# Before: /before?badparam=here&baz=w00t&foo=1&bar=good&mood=bad
# After: /after?foo=1&bar=good&baz=w00t&mood=happy
#
# Storing parts of the request (or anything you want to insert into it) in ENV VARs is convenient.
# Note the special RewriteRule target of "-" which means "no redirect; simply apply side effects"
# This lets you manipulate the request at will over multiple steps.
#
# In a RewriteRule, set custom temp ENV VARs via [E=NAME:value]
# Note it's also possible to set multiple env vars
# like [E=VAR_ONE:hi,E=VAR_TWO:bye]
#
# You can read these values using %{ENV:VAR_NAME}
#
# Tangent:
# Note you can also read these env vars the same way, if you set them via SetEnvIf[NoCase]
# (It won't work to use SetEnv, which runs too late for mod_rewrite to see it.)
#
# Regex details:
# (?:) syntax means "match but don't store group in %1 backreference"
# so (?:^|&) is simply the ^ beginning or an & delimiter
# (the only 2 possibilities for the start of a qs param)
# ([^&]+) means 1 or more chars that are not an & delimiter
RewriteCond %{QUERY_STRING} (?:^|&)foo=([^&]+)
RewriteRule ^/before - [E=FOO_VAL:%1]
RewriteCond %{QUERY_STRING} (?:^|&)bar=([^&]+)
RewriteRule ^/before - [E=BAR_VAL:%1]
RewriteCond %{QUERY_STRING} (?:^|&)baz=([^&]+)
RewriteRule ^/before - [E=BAZ_VAL:%1]
RewriteRule ^/before /after?foo=%{ENV:FOO_VAL}&bar=%{ENV:BAR_VAL}&baz=%{ENV:BAZ_VAL}&mood=happy [R=301,L]
P.S. This is not a copy/pasteable solution to your question, but rather shows exactly how to handle this kind of problem. Armed with this understanding, leveraging it for your example will be completely trivial. :)
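For reference, this is roughly what the environment-variable technique could look like when applied to the question's three parameters. It is a sketch rather than a tested drop-in: the variable names CAT, SUBCAT and PROD are made up for illustration, the rules are assumed to sit in the .htaccess next to index.php, and the values are read back with mod_rewrite's %{ENV:NAME} syntax:

```apache
RewriteEngine On

# Capture each parameter independently of its position in the query string.
RewriteCond %{QUERY_STRING} (?:^|&)category=([^&]+) [NC]
RewriteRule ^index\.php$ - [E=CAT:%1]
RewriteCond %{QUERY_STRING} (?:^|&)subcategory=([^&]+) [NC]
RewriteRule ^index\.php$ - [E=SUBCAT:%1]
RewriteCond %{QUERY_STRING} (?:^|&)product=([^&]+) [NC]
RewriteRule ^index\.php$ - [E=PROD:%1]

# Redirect only when all three values were found; the trailing "?"
# drops the original query string from the target URL.
RewriteCond %{ENV:CAT} !^$
RewriteCond %{ENV:SUBCAT} !^$
RewriteCond %{ENV:PROD} !^$
RewriteRule ^index\.php$ http://store.example.com/%{ENV:CAT}/%{ENV:SUBCAT}/%{ENV:PROD}/? [R=301,L]
```

Because (?:^|&) anchors each name at the start of a parameter, the category condition does not accidentally match inside subcategory.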
1) If you just need to check that all parameters are present in the URL:
RewriteCond %{QUERY_STRING} (^|&)category\=computers($|&)
RewriteCond %{QUERY_STRING} (^|&)subcategory\=laptops($|&)
RewriteCond %{QUERY_STRING} (^|&)product\=dell\-inspiron\-15($|&)
RewriteRule ^$ http://store.example.com/computers/laptops/dell-inspiron-15/? [R=301,L]
2) If you need exactly this set of parameters and nothing else:
RewriteCond %{QUERY_STRING} ^&*(?:category\=computers|subcategory\=laptops|product\=dell\-inspiron\-15)(?!.*&\1(?:&|$))(?:&+(category\=computers|subcategory\=laptops|product\=dell\-inspiron\-15)(?!.*&\1(?:&|$))){2}&*$
RewriteRule ^$ http://store.example.com/computers/laptops/dell-inspiron-15/? [R=301,L]
This rule was generated by a 301 redirect generator.

Why does mod_rewrite process the rules after the [L] flag

Here are the rules:
RewriteEngine on
RewriteCond %{REQUEST_FILENAME} !-f
RewriteCond %{REQUEST_FILENAME} !-d
RewriteRule (.+) dir/index.php?$1 [L]
RewriteRule dir/index\.php.* - [F]
Why is the last rule processed, so that Forbidden is returned for all requests?
What I need is that, if the file or directory is not found, the next rule is not processed.
The next example doesn't work for me either:
RewriteEngine on
RewriteCond %{REQUEST_FILENAME} !-f
RewriteCond %{REQUEST_FILENAME} !-d
RewriteRule .? - [S=1]
RewriteRule dir/index\.php.* - [F]
RewriteRule (.+) dir/index.php?$1
It still returns Forbidden for all requests.
Why is the last rule processed, so that Forbidden is returned for all requests?
When the URL foobar is requested:
The two conditions (line 2, 3) match
Pattern matches, the resulting URL becomes dir/index.php?foobar (line 4)
The [L] flag causes the rewriting to stop -- it does not stop Apache from having another go at the rewritten URL since it has changed (see below).
With dir/index.php as the input URL:
The condition does not match (line 2) since file exists
Jumps to line 5
Pattern matches, hence the Forbidden error
When directory or filename changes, Apache has to re-evaluate various configuration sections (e.g. Directory and Files) and the .htaccess file for the "re-written" path. This is why Apache might perform another iteration even when the previous one was ended by [L] flag.
The last line is supposed to restrict direct access to the URL handler.
Direct access means requesting the file through a link like: domain.com/dir/index.php
I think adding another condition before line 5 should work:
RewriteCond %{THE_REQUEST} dir/index\.php\x20HTTP/\d\.\d$
RewriteRule . - [F]
The THE_REQUEST server variable contains the request sent by the browser without any rewriting applied. This could be useful to detect what page was originally requested by the browser.
THE_REQUEST
The full HTTP request line sent by the browser to the server (e.g.,
"GET /index.html HTTP/1.1"). This does not include any additional
headers sent by the browser. This value has not been unescaped
(decoded), unlike most other variables below.
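Another commonly used way to tell the browser's own request from an internally rewritten one is the REDIRECT_STATUS environment variable, which is empty on the initial request and set once an internal redirect has happened. A minimal sketch, assuming the rules live in the document root's .htaccess and the file layout from the question:

```apache
RewriteEngine on

# Forbid dir/index.php only when the browser asked for it directly.
# REDIRECT_STATUS is empty on the first pass and non-empty after the
# internal redirect caused by the rule below.
RewriteCond %{ENV:REDIRECT_STATUS} ^$
RewriteRule ^dir/index\.php - [F]

# Send requests for non-existent files and directories to the handler.
RewriteCond %{REQUEST_FILENAME} !-f
RewriteCond %{REQUEST_FILENAME} !-d
RewriteRule (.+) dir/index.php?$1 [L]
```

With this ordering, a request for foobar is rewritten on the first pass, and on the second pass the [F] rule is skipped because REDIRECT_STATUS is no longer empty.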
I am not exactly sure of what you meant by "the next rule".
But if you don't want some rules to be executed when a non-existent file is requested, then using the following structure may help. (The following piece of code is copied from the Apache RewriteRule Flags Page)
# Is the request for a non-existent file?
RewriteCond %{REQUEST_FILENAME} !-f
RewriteCond %{REQUEST_FILENAME} !-d
# If so, skip these two RewriteRules
RewriteRule .? - [S=2]
RewriteRule (.*\.gif) images.php?$1
RewriteRule (.*\.html) docs.php?$1
And also using [R] for redirecting instead of [L] might help with the problem of returning Forbidden for all requests.

mod_rewrite: Prevent multiple rewrites using an environment variable

I'm currently returning a 404 error for *.php and internally redirecting all requests to a PHP file if one exists, using the following:
RewriteCond %{REQUEST_URI} /(?!index$)
RewriteCond %{REQUEST_FILENAME}\.php -f
RewriteRule ^(.+)$ /$1.php [QSA,L,E=norewrite:1]
RewriteCond %{ENV:norewrite} !1
RewriteCond %{REQUEST_URI} \.php$
RewriteRule ^(.+)$ - [R=404]
This works fine. However, I wish to have the ability to serve up PHP (or other) source (with the appropriate extension), from files e.g. index.php.src, while having index.php.src also return a 404 if accessed directly.
RewriteCond %{REQUEST_URI} /(?!index$)
RewriteCond %{REQUEST_FILENAME}\.php -f
RewriteRule ^(.+)$ /$1.php [QSA,L,E=norewrite:1]
RewriteCond %{ENV:norewrite} !1
RewriteCond %{REQUEST_FILENAME}\.src -f
RewriteRule ^(.+)$ /$1.src [L,E=norewrite:1]
RewriteCond %{ENV:norewrite} !1
RewriteCond %{REQUEST_URI} \.php$ [OR]
RewriteCond %{REQUEST_URI} \.src$
RewriteRule ^(.+)$ - [R=404]
This does not appear to work. It internally redirects to index.php, then to index.php.src, then 404s.
What's interesting is that, in the first sample, the environment variable DOES prevent the second ruleset from executing, and the page loads as expected. When I add that middle ruleset you see in the second sample, the environment variable no longer seems to have any effect.
If I remove that second ruleset from the second sample, leaving the additional lines in the last ruleset, as is, it behaves just like the first sample (except that requesting e.g. index.php.src returns a 404, which is what I want).
For various reasons, it would be unacceptable to use a query string for this purpose, it must be an environment variable.
How can I make this work? What am I doing wrong?
Edit:
In case I explained it poorly (I'm fairly sure I did)...
The following two files exist: 'index.php' and 'index.php.src'
If I request http://domain.com/ with the first set of rules, I get my homepage (as expected). With the second set of rules, I get a 404. With the second set of rules, minus the second stanza, I get my homepage (as expected).
If I request http://domain.com/index with either set of rules, I get a 404, as expected.
If I request http://domain.com/index.php with either set of rules, I get a 404. This is expected with the first set, but with the second set I expect to be served the contents of 'index.php.src'.
If I request http://domain.com/index.php.src with the first set of rules, I get the contents of 'index.php.src', as expected since the rule to 404 on *.src isn't in that set. I get a 404 as expected with the second set, with or without the second stanza.
The problem appears to be in the second stanza, but I can't make out what's wrong...
Here's what I did that worked:
RewriteCond %{IS_SUBREQ} false
RewriteCond %{REQUEST_URI} /(?!index$)
RewriteCond %{ENV:REDIRECT_STOP} !1
RewriteCond %{REQUEST_FILENAME}\.php -f
RewriteRule ^(.+)$ /$1.php [QSA,L,E=STOP:1]
RewriteCond %{IS_SUBREQ} false
RewriteCond %{ENV:REDIRECT_STOP} !1
RewriteCond %{REQUEST_FILENAME}\.src -f
RewriteRule ^(.+)$ /$1.src [L,E=STOP:1]
RewriteCond %{IS_SUBREQ} false
RewriteCond %{ENV:REDIRECT_STOP} !1
RewriteCond %{REQUEST_URI} \.php$ [OR]
RewriteCond %{REQUEST_URI} \.src$
RewriteRule ^(.+)$ - [R=404]
You'll notice I added the %{IS_SUBREQ} bits, which helped with the homepage redirect issue that was causing it to 404. At that point, the query string method became acceptable, and I did get it working with that method, but I'm not the type to just leave it at that. I knew this could be done and I was going to do it (I did!).
Aside from changing the variable name from 'norewrite' to 'STOP', which I did for clarity, I learned that environment variables set by mod_rewrite are prefixed with 'REDIRECT_' when an internal redirect occurs. That's why setting the value of 'norewrite' ('STOP'), then checking that same variable, was not working. When I prepended 'REDIRECT_' to it in the check lines, it behaved as expected.
Requesting '/awesome' will process 'awesome.php' and return its output
Requesting '/awesome.php' will return the content of 'awesome.php.src' (which is a symlink to 'awesome.php')
Requesting '/awesome.php.src' will return a 404
This is exactly what I wanted! Hopefully this will help someone else, as well.
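Stripped down to a single rule, the renaming behaviour described above can be sketched like this. The file names are hypothetical, and the comments describe the two passes Apache makes over the .htaccess file:

```apache
RewriteEngine On

# Pass 1 (browser request, e.g. /page): REDIRECT_STOP is not set, so the
# rule fires, rewrites to /page.php and sets STOP=1 for this pass.
# Pass 2 (after the internal redirect): the variable is now visible as
# REDIRECT_STOP rather than STOP, so the condition below fails and the
# rule is skipped.
RewriteCond %{ENV:REDIRECT_STOP} !1
RewriteCond %{REQUEST_FILENAME}\.php -f
RewriteRule ^(.+)$ /$1.php [L,E=STOP:1]
```

Checking %{ENV:STOP} instead would never see the value on the second pass, which matches the behaviour observed with 'norewrite' in the question.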

Mod_rewrite in two directions?

An existing page is called /foo/bar.php. What I have done is a rewrite so that when a user types /foobar, it loads the contents of /foo/bar.php (while keeping /foobar in the URL bar).
But I also want the opposite - when a user clicks on a link or types /foo/bar.php, I want to have /foobar in the url. The reason is to avoid manually changing all the links.
How could I do that (if possible without an http redirect, but via some rewrite magic)? And is it possible for those two rules to co-exist?
Edit - After the first response, I realized my description of the problem was not proper. /foobar is not supposed to be a concatenation of foo, bar of /foo/bar.php, but an arbitrary string (/whatever).
Edit 2:
I have now added RewriteRule ^whatever/?$ /foo/bar.php [L] in the / .htaccess. Then I added RewriteRule bar\.php$ /whatever [R=302,L] in the /foo .htaccess. The problem is that this is a circular reference and it fails.
Thanks,
John
RewriteEngine On
RewriteCond %{REQUEST_URI} ^/foo/[^/]+\.php$
RewriteCond %{IS_SUBREQ} !true
RewriteRule ^/foo/([^/]+)\.php$ /foo$1 [R,L]
RewriteCond %{REQUEST_URI} ^/foo[^/]
RewriteRule ^/foo(.*) /foo/$1.php [L]
The first part matches /foo/something.php and transforms it into /foosomething, but only if it is not a sub-request.
The second part takes any /foosomething and transforms it into /foo/something.php, via sub-request.
You can try matching against %{THE_REQUEST} and only do the redirect when the actual request is for the php file:
RewriteEngine On
RewriteCond %{THE_REQUEST} ^[A-Z]{3,9}\ /foo/bar\.php
RewriteRule bar\.php$ /whatever [R=302,L]
RewriteRule ^whatever/?$ /foo/bar.php [L]