Apache mod_rewrite .htaccess rules not matching - escaped characters - apache

I'm trying to do some relatively simple .htaccess rules to 301 redirect some URLs. The rules are not matching. Here is what I have:
RewriteEngine On
RewriteBase /
# This works fine
RewriteCond %{QUERY_STRING} ^id=2$ [NC]
RewriteRule ^products\.php$ /? [R=301,NE,NC,L]
# This doesn't match at all
RewriteRule ^products/-Smart-Smoker-\'Storm\'-White-Manual-Electronic-Cigarette-\(510\)\.html http://www.smartsmoker.co.uk/products/-Smart-Smoker-Storm-White-Manual-Electronic-Cigarette-%28510%29.html [NC,R=301]
# Neither does this
RewriteRule ^products/Christmas-Cracker-%252d-FREE-Shipping\.html$ http://www.smartsmoker.co.uk/categories/Electronic-Cigarette-Kits/Breeze-Mini-Electronic-Cigarette/ [L,R=301]

The best approach when a rewrite rule misbehaves is to reduce it to the shortest pattern that matches and then expand from there, keeping in mind that conditions and rules higher up in the file may be affecting what you are trying to test. Having said that, I can't see any problem with the first rule; the minimal example below works for me:
RewriteEngine on
RewriteBase /
RewriteRule ^products/-Smart-Smoker-\'Storm\'-White-Manual-Electronic-Cigarette-\(510\)\.html$ /test.html [L]
Also the URLs you redirect to don't have to be URL encoded.
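One thing that may be worth checking here: mod_rewrite applies a RewriteRule pattern to the URL path after percent-decoding it, so the pattern has to describe the decoded text rather than what appears in the address bar. The Python sketch below only approximates that decoding step with urllib.parse.unquote (an assumption; Apache's own decoder is not being called), and the two request paths are guesses at what a browser might send for the URLs in the question.

```python
import re
from urllib.parse import unquote

def path_seen_by_rule(raw_path: str) -> str:
    """Approximate what a .htaccess RewriteRule in the document root is matched
    against: the percent-decoded path without the leading slash."""
    return unquote(raw_path).lstrip("/")

# Hypothetical raw request paths (assumed, not taken from a server log).
storm_request = "/products/-Smart-Smoker-%27Storm%27-White-Manual-Electronic-Cigarette-%28510%29.html"
cracker_request = "/products/Christmas-Cracker-%252d-FREE-Shipping.html"

# The first pattern from the question, with a closing $ added.
storm_pattern = re.compile(
    r"^products/-Smart-Smoker-'Storm'-White-Manual-Electronic-Cigarette-\(510\)\.html$",
    re.IGNORECASE,
)
# The second pattern as written in the question, and a variant using the decoded text.
cracker_as_written = re.compile(r"^products/Christmas-Cracker-%252d-FREE-Shipping\.html$")
cracker_decoded = re.compile(r"^products/Christmas-Cracker-%2d-FREE-Shipping\.html$")

print(path_seen_by_rule(cracker_request))
print(bool(storm_pattern.search(path_seen_by_rule(storm_request))))
print(bool(cracker_as_written.search(path_seen_by_rule(cracker_request))))
print(bool(cracker_decoded.search(path_seen_by_rule(cracker_request))))
```

If the second URL really is requested with %252d in it, the pattern would need %2d instead. The rewrite log is the place to confirm which path the rule actually receives.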

Related

What's wrong with this Apache mod_rewrite rule?

I'm attempting to write some redirects for an Apache site, but my mod_rewrite skills are rusty. What's wrong with this rule?
RewriteRule ^/en/(.*)$ /$1 [R=301,L]
I expect it to redirect http://example.com/en/whatevs.html to http://example.com/whatevs.html, but it does not seem to match.
RewriteEngine On
RewriteRule ^en/(.*)$ /$1 [R=301,L]
You were close. With mod_rewrite it's hard to remember when to use the leading / and when not to. I also added RewriteEngine On in case it had slipped your mind to include it.
For example:
RewriteEngine On
RewriteCond %{REQUEST_URI} ^/en/.*$
RewriteRule ^en/(.*) /$1 [R=301,L]
Note which has the slash and which does not.
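There are also some subtle differences depending on whether your rules are in httpd.conf or in .htaccess files. The main one: in .htaccess (per-directory context) the directory prefix, including the leading slash, is stripped from the path before the RewriteRule pattern is applied, while in the server config the pattern sees the full URL path starting with /.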
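A rough way to picture the slash difference is to model which string the RewriteRule pattern is tested against in each context. The Python sketch below is only a model under that assumption; rule_input and matches are hypothetical helper names and nothing here calls Apache itself.

```python
import re

def rule_input(url_path: str, context: str, dir_prefix: str = "/") -> str:
    """Return the string a RewriteRule pattern would be tested against.

    context="server":   httpd.conf or <VirtualHost>; the pattern sees the full URL path.
    context="htaccess": per-directory; the directory prefix (assumed "/") is stripped first.
    """
    if context == "server":
        return url_path
    if context == "htaccess":
        if url_path.startswith(dir_prefix):
            return url_path[len(dir_prefix):]
        return url_path
    raise ValueError(f"unknown context: {context!r}")

WITH_SLASH = re.compile(r"^/en/(.*)$")     # the pattern from the question
WITHOUT_SLASH = re.compile(r"^en/(.*)$")   # the pattern from the answer

def matches(pattern: re.Pattern, url_path: str, context: str) -> bool:
    return pattern.search(rule_input(url_path, context)) is not None

for context in ("server", "htaccess"):
    print(context,
          matches(WITH_SLASH, "/en/whatevs.html", context),
          matches(WITHOUT_SLASH, "/en/whatevs.html", context))
```

In this model the question's pattern only matches in server context and the answer's pattern only matches in .htaccess, which is consistent with the behaviour described above.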

Remove all Subfolders Condition Rewrite .htaccess Apache 1.3.42

I have done a lot of research on removing subfolders, but I cannot find a way to create an .htaccess rule that removes all subfolders in my root directory. Example below:
www.domain.com/dan/dan changes to www.domain.com/dan
www.domain.com/pam/pam changes to www.domain.com/pam
www.domain.com/jam/jam changes to www.domain.com/jam
The .htaccess rule should keep this pattern up through infinity without me having to add the names of the subfolders to my rule, kind of like a wildcard condition or catchall scenario.
However, there is one condition: only remove the subfolder if the file has the same name, as illustrated in my example above.
I’m on Apache 1.3.42, so I will need a solution that does not depend on the newer versions, please.
Check out my .htaccess file below. I’ve done a lot of SEO work on it, as you can see:
AddType application/x-httpd-php .html
RewriteEngine On
RewriteBase /
#non www to www
RewriteCond %{HTTP_HOST} !^www\.
RewriteRule ^(.*)$ http://www.%{HTTP_HOST}/$1 [R=301,L]
#removing trailing slash
RewriteCond %{REQUEST_FILENAME} !-d
RewriteRule ^(.*)/$ $1 [R=301,L]
#html
RewriteCond %{REQUEST_FILENAME} !-f
RewriteCond %{REQUEST_FILENAME} !-d
RewriteRule ^([^\.]+)$ $1.html [NC,L]
#index redirect
#directory remove index.html
RewriteCond %{THE_REQUEST} ^[A-Z]{3,9}\ /index\.html\ HTTP/
RewriteRule ^index\.html$ http://www.arkiq.com/ [R=301,L]
#directory remove index
RewriteCond %{THE_REQUEST} ^[A-Z]{3,9}\ /index\ HTTP/
RewriteRule ^index http://www.arkiq.com/ [R=301,L]
#sub-directory remove index.html
RewriteCond %{THE_REQUEST} /index\.html
RewriteRule ^(.*)/index\.html$ /$1 [R=301,L]
#sub-directory remove index
RewriteCond %{THE_REQUEST} /index
RewriteRule ^(.*)/index /$1 [R=301,L]
#remove .html
RewriteCond %{THE_REQUEST} \.html
RewriteRule ^(.*)\.html$ /$1 [R=301,L]
Let me know if you know how to forward all subfolders to their respectively named files with one rule as that would be superb.
I have no setup here to test this rule with a real installation of apache, but I am pretty sure you can achieve this by using a positive lookahead with a capture group.
RewriteRule ^(.*?)([^/]+)/(?=\2(/|$))([^/]+)/?$ /$1$4 [R,L]
What does this do?
^(.*?) matches everything before the last two path segments. For example.com/test/test it matches nothing at all.
([^/]+) matches the first segment we want to test and puts it in capture group 2.
(?=\2(/|$)) is the positive lookahead. A lookahead 'peeks' at the next characters but does not consume any. \2 stands for the content of the second capture group, and (/|$) matches either a slash or the end of the string.
The last ([^/]+) matches the second segment (capture group 4, because the group inside the lookahead counts as group 3).
/? makes sure the URL is matched even if it ends with a /.
After applying this rule, this should happen:
example.com/test/test --> example.com/test
example.com/test/test2 --> no rewrite, because '2' does not match '/' or the end of the string
example.com/test/test/ --> example.com/test
example.com/sub/test/test --> example.com/sub/test
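The four cases above can be checked outside Apache. This is a minimal sketch using Python's re module, on the assumption that it treats the lazy quantifier, the backreference and the lookahead in this pattern the same way as the PCRE engine used by Apache 2.x; apply_rule is a hypothetical helper that mimics the /$1$4 substitution.

```python
import re
from typing import Optional

# Same pattern as the RewriteRule above.
RULE = re.compile(r"^(.*?)([^/]+)/(?=\2(/|$))([^/]+)/?$")

def apply_rule(path: str) -> Optional[str]:
    """Return the redirect target for a per-directory path (no leading slash),
    or None when the rule does not match."""
    m = RULE.search(path)
    if m is None:
        return None
    # Mirrors the substitution /$1$4 (group 3 is the group inside the lookahead).
    return "/" + m.group(1) + m.group(4)

for path in ("test/test", "test/test2", "test/test/", "sub/test/test"):
    print(path, "-->", apply_rule(path))
```

This says nothing about Apache 1.3, whose regex engine may not accept the pattern at all.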
Debugging this rule
If you get an internal server error, please go to your apache error log and read what error it gives. Here is proof it works on a clean .htaccess on Apache 2.4.4 and, while it takes 1 minute to check an error log, it takes me several hours to read all patch notes for all Apache versions of the last 3 years.
External redirect, internal rewrite, preventing infinite loop
Assuming that above rule works on your version of mod_rewrite/apache/regex, the following construction will work to externally redirect your request, then internally rewrite it back. Please note that /test/test will not do anything sensible, unless you tell apache how to execute such a file. Proof of concept.
#The external redirect
RewriteCond %{THE_REQUEST} ^(GET|POST)\ /(.*?)([^/]+)/(?=\3(/|\ ))
RewriteRule ^(.*?)([^/]+)/(?=\2(/|$))([^/]+)/?$ /$1$4 [R,L]
#The internal rewrite
RewriteCond %{REQUEST_URI} !^/(.*?)([^/]+)/(?=\2(/|$))([^/]+)/?$
RewriteCond %{REQUEST_FILENAME} -d
RewriteCond %{REQUEST_FILENAME} !-f
RewriteRule ^(.*/|)([^/]+)/?$ /$1$2/$2 [L]
You mention DirectorySlash Off. Please note that on current versions of Apache this would only get applied to an actual external request. While doing internal rewrites you are safe. In both examples above, in Apache 2.4.4, even though I redirect to an url without a trailing slash, Apache will still append a slash in a second redirect. I am clueless how this was handled in 1.3.
If Apache 1.3 doesn't support backreferences or lookaround in its regex engine, which I still can't test, there is no real way to check via mod_rewrite whether a URL contains two identical segments. You'll either need to use a custom router page or write out a rule for every URL (which can cause performance issues, as there are likely a lot of them). Rewriting to a router page goes like this:
RewriteRule ^(.*)$ /myrouter.php?url=$1 [L]
This router page in a language of your choice can send the 301 or 302 header too with a custom location. It will need to handle all other requests too that are matched by the rewriterule above.
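As an illustration of the decision such a router page has to make, here is a minimal sketch in Python rather than PHP. The function name redirect_target is hypothetical, and the sketch only computes the target path; actually sending the 301 or 302 header and handling every other request is left to whatever framework or CGI setup the router runs in.

```python
from typing import Optional

def redirect_target(url: str) -> Optional[str]:
    """Given the value passed as ?url=..., return the path to redirect to when the
    last two segments are identical, otherwise None (serve the request normally)."""
    # Ignore empty segments so a trailing slash does not matter.
    segments = [s for s in url.split("/") if s]
    if len(segments) >= 2 and segments[-1] == segments[-2]:
        return "/" + "/".join(segments[:-1])
    return None

for url in ("dan/dan", "sub/test/test", "test/test2", "test/test/"):
    print(url, "-->", redirect_target(url))
```

Because the comparison is done in ordinary code, it does not depend on what the regex engine of the Apache version supports.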

.htaccess rewrite all 'other' URLs

Hopefully this is a simple one. I have a really basic .htaccess that rewrites any request to /admin (or /admin/etc) to /_admin/index.php. So far so good:
#Options All -Indexes
RewriteEngine On
RewriteBase /admin
RewriteRule ^admin/$ /_admin/index.php [QSA]
RewriteRule ^admin$ /_admin/index.php [QSA]
RewriteRule ^admin/(.+)$ /_admin/index.php [QSA]
What I also want is a generic "catch all else" rule that rewrites any other url (/users, /turnips/, /some/other/path and so forth) back to /index.php
I can't seem to get that to work: it's either 500 server errors or /admin also gets rewritten to the root page. I'm sure I just need to do something with RewriteCond but I can't figure it out.
Thanks!
Add this after the other rules. It would be the default rule if the previous rules are not applied.
RewriteBase /
RewriteCond %{REQUEST_URI} !index\.php [NC]
RewriteRule .* index.php [L,QSA]
First of all I suggest you add the L flag to your rewrites so you're sure to avoid unintended matches after matching a rewrite (unless intended of course).
Secondly WordPress uses the following code to rewrite all URLs that are not matching index.php OR a file that already exists. This way files accessed directly like images, text files, downloads etc are not rewritten. Note that originally it also included the line RewriteCond %{REQUEST_FILENAME} !-d to also not rewrite directories but you seem to want that behaviour:
RewriteBase /
RewriteCond %{REQUEST_FILENAME} !-f
RewriteRule . /index.php [L]
Please see if this fits your needs.
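To see how the ordering is meant to work, here is a simplified Python model of a single pass through the rules: the /admin rules come first and stop processing (as if they carried the L flag), and the catch-all only fires for anything that does not already contain index.php. It is a sketch of the intended outcome, not of how mod_rewrite re-runs per-directory rules internally, and route is a hypothetical helper name.

```python
import re

# The admin rules from the question, treated as if each had the L flag.
ADMIN_RULES = [
    (re.compile(r"^admin/?$"), "/_admin/index.php"),
    (re.compile(r"^admin/(.+)$"), "/_admin/index.php"),
]

def route(request_uri: str) -> str:
    """Return the path that would be served for request_uri in this simplified model."""
    path = request_uri.lstrip("/")  # per-directory patterns see no leading slash
    for pattern, target in ADMIN_RULES:
        if pattern.search(path):
            return target  # first match wins
    # Catch-all, guarded like: RewriteCond %{REQUEST_URI} !index\.php [NC]
    if not re.search(r"index\.php", request_uri, re.IGNORECASE):
        return "/index.php"
    return request_uri

for uri in ("/admin", "/admin/etc", "/users", "/turnips/", "/some/other/path", "/index.php"):
    print(uri, "-->", route(uri))
```

The guard on index.php is what keeps the already rewritten /index.php and /_admin/index.php from being caught by the catch-all again.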

htaccess file to redirect non-www and https requests, and rewrite URLs

I'm trying to create my first .htaccess to do a URL rewrite and a few other things, but I've wasted two days searching for information and trying to make it work with no success.
What I would like to achieve is the following:
1) redirect all the non-www URLs to the www URL (it seems a best practice for seo?)
domain.com --> www.domain.com
2) redirect all the https requests to http ones:
https://www.domain.com --> http://www.domain.com
https://domain.com --> http://www.domain.com
3) rewrite my URLs to be SEO friendly, and eventually strip any trailing slash at the end of the address:
www.domain.com/abc --> www.domain.com/index.php?page=abc
www.domain.com/abc/ --> www.domain.com/abc
What I have so far, are three snippets which taken individually work:
# 1) This should be correct, I hope!
RewriteEngine On
RewriteCond %{HTTP_HOST} !^www\.
RewriteRule ^(.*)$ http://www.%{HTTP_HOST}/$1 [R=301,L]
# 2) This seems to work 100% right
RewriteEngine On
RewriteCond %{HTTPS} =on
RewriteRule .* http://%{SERVER_NAME}%{REQUEST_URI} [R,L]
#3) Major issues here
RewriteEngine On
RewriteRule ^([^/\.]+)/?$ index.php?page=$1 [L]
This works partially: it does what I want, but I get all relative paths broken when I put a trailing slash after the SEO friendly URL:
www.domain.com/abc OK
www.domain.com/abc/ page gets displayed but all relative URL for css and images are broken
I don't even know how to combine the different rules in a single htaccess file to make them interact correctly.
Sorry for the long explanation, this problem is becoming quite frustrating for me.
The problem with the relative paths to CSS files and images is, of course, caused by the trailing slash. The browser thinks it is in a subfolder and appends the relative paths of the CSS files to your current location, which results in
www.domain.com/abc/styles/style.css
instead of
www.domain.com/styles/style.css
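The way a browser resolves the relative path can be reproduced with Python's urllib.parse.urljoin, which follows the same URL resolution rules. This is a minimal sketch; the resolve helper is a hypothetical wrapper and the stylesheet path is the one from the example above.

```python
from urllib.parse import urljoin

def resolve(page_url: str, reference: str) -> str:
    """Resolve a link found on page_url the way a browser would."""
    return urljoin(page_url, reference)

# Relative reference, page requested with and without the trailing slash.
print(resolve("http://www.domain.com/abc/", "styles/style.css"))
# http://www.domain.com/abc/styles/style.css
print(resolve("http://www.domain.com/abc", "styles/style.css"))
# http://www.domain.com/styles/style.css

# A root-relative reference resolves the same way in both cases.
print(resolve("http://www.domain.com/abc/", "/styles/style.css"))
# http://www.domain.com/styles/style.css
```

The last call shows why paths that start at the site root are not affected by the trailing slash.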
There are two ways to work around this:
1) My preferred solution is to make all paths in the HTML absolute.
2) Redirect to the same path without the trailing slash. Since you seem to ignore the trailing slash anyway, this should have no side effects. A matching rule would look like this (the slash in the pattern is required, otherwise URLs without a trailing slash would be redirected to themselves):
RewriteRule ^([^/\.]+)/$ http://%{HTTP_HOST}/$1 [R=301,L]
Your complete htaccess-file would be this one:
RewriteEngine On
RewriteCond %{HTTP_HOST} !^www\.
RewriteRule ^(.*)$ http://www.%{HTTP_HOST}/$1 [R=301,L]
RewriteCond %{HTTPS} =on
RewriteRule .* http://%{SERVER_NAME}%{REQUEST_URI} [R,L]
RewriteRule ^([^/\.]+)/$ http://%{HTTP_HOST}/$1 [R=301,L]
RewriteRule ^([^/\.]+)$ index.php?page=$1 [L]

mod_rewrite equivalent of a RedirectMatch rule

I have a RedirectMatch rule in my .htaccess file that works fine, but I'm having trouble coming up with a comparable rule using mod_rewrite.
My goal is to have this URL: mysite.com/anything/print show this page: mysite.com/anything?view=print.
The rule that works to Redirect it is this:
RedirectMatch 301 ^(.*)/print/?$ http://mysite.com/$1?view=print
But now I'd like to change this from a visible 301 redirect to an "invisible" rewrite using mod_rewrite. I have tried many different variations on this (with and without RewriteBase), and none have worked:
RewriteEngine On
RewriteBase /
RewriteRule ^(.*)/print/? $1?view=print
What am I doing wrong? Mod_rewrite is definitely enabled, and there are functioning Wordpress-based mod_rewrite rules in the same .htaccess file.
UPDATE
Using tips from @Nathan, I now have this. However, I still get a 404 when I visit mypost/print.
RewriteEngine On
RewriteBase /
RewriteCond %{REQUEST_URI} !^/index\.php
RewriteRule ^(.*)/print/?$ /index.php/$1?view=print [L]
When I append /print/ to a permalink, the WP_Debug plugin indicates the following:
Request: myposttype/mypost/print
Query String: attachment=print
Matched Rewrite Rule: myposttype/[^/]+/([^/]+)/?$
Matched Rewrite Query: attachment=print
If you have typical wordpress rules in the htaccess file, your rules should come before the # BEGIN WordPress block. Otherwise the wordpress rules will stop the rewrite matching before your rules get called.
Also, you should add $ after your regex pattern unless you also want to match something like: http://domain.com/page/print/something/else/here
Lastly, in order for Wordpress to recognize the change in the URL without redirecting the page, you need to append the URL to index.php/ so that Wordpress will use path info permalinks.
E.g.
RewriteEngine On
RewriteBase /
RewriteCond %{REQUEST_URI} !^/index\.php
RewriteRule ^(.*)/print/?$ /index.php/$1?view=print [L]
# BEGIN WordPress
<IfModule mod_rewrite.c>
RewriteEngine On
RewriteBase /
RewriteRule ^index\.php$ - [L]
RewriteCond %{REQUEST_FILENAME} !-f
RewriteCond %{REQUEST_FILENAME} !-d
RewriteRule . /index.php [L]
</IfModule>
# END WordPress
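The effect of the closing $ mentioned above can be checked with a small Python sketch. It assumes Python's re module treats these two simple patterns the same way as Apache's regex engine; the test paths are the ones used in this question, written as a .htaccess rule would see them (no leading slash).

```python
import re

UNANCHORED = re.compile(r"^(.*)/print/?")   # the pattern from the question
ANCHORED = re.compile(r"^(.*)/print/?$")    # with the closing $ added

def captured(pattern: re.Pattern, path: str):
    """Return what $1 would contain, or None when the pattern does not match."""
    m = pattern.search(path)
    return m.group(1) if m else None

for path in ("mypost/print", "mypost/print/", "page/print/something/else/here"):
    print(path, "|", captured(UNANCHORED, path), "|", captured(ANCHORED, path))
```

Only the unanchored pattern matches the longer path, so without the $ that request would be rewritten as well.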
RewriteEngine on
RewriteRule ^(.*)/print/$ /$1?view=print [L]
or, for URLs without a "/" at the end:
RewriteRule ^(.*)/print$ /$1?view=print [L]