mod_rewrite 301 redirect from old urls to new - apache

The website has changed its URL names for SEO reasons. For example, an old URL was:
/category/filter1/f00/filter2/123/filter3/100-500/filter4/36.html
now:
/category/color/red/size/big/price/100-500/style/classic.html
I know the old and new names; they're fixed. Please help me build a rewrite rule that results in a 301 redirect from the old URLs to the new ones. I did some research and, as far as I can see, I cannot use RewriteMap for this, so I ended up with something like RewriteRule (.*)filter1(.*) $1color$2 [L], and so on. Not only do I dislike the way it looks, it also doesn't give me a 301 redirect.
UPDATE: Note that at the moment I have several rules, one per filter name/value, e.g.:
RewriteEngine on
# make sure it's a catalog URL, not anything else
RewriteCond %{REQUEST_URI} !^/(category1|category2|category3|category4)
RewriteRule .* - [L]
# rewrite filter names
RewriteRule (.*)filter1(.*) $1color$2 [L]
RewriteRule (.*)filter2(.*) $1price$2 [L]
...etc...
It works as expected, changing all the names in the URL, but setting the R flag makes processing stop at the first rule and redirect to a URL like:
/var/www/vhosts/site/htdocs/category/color/red/filter2/123/ etc...
I separated the rules because any of the filters may or may not be present in the URL. I would greatly appreciate a better solution.

Here is my own answer: it can be done with environment variables. We need to replace the old filter names and values with the new ones, and then issue a single 301 redirect to the new URL. Here is what I've done using mod_rewrite and environment variables:
RewriteEngine on
RewriteRule /filter1/ - [E=filters:/color/]
RewriteRule /f00[.\/] - [E=filters:%{ENV:filters}red]
RewriteRule /0f0[.\/] - [E=filters:%{ENV:filters}green]
RewriteRule /00f[.\/] - [E=filters:%{ENV:filters}blue]
RewriteRule /filter2/ - [E=filters:%{ENV:filters}/size/]
RewriteRule /123[.\/] - [E=filters:%{ENV:filters}big]
RewriteRule /32[.\/] - [E=filters:%{ENV:filters}small]
RewriteRule /filter3/([^/.]+) - [E=filters:%{ENV:filters}/price/$1]
RewriteRule /filter4/ - [E=filters:%{ENV:filters}/style/]
RewriteRule /36[.\/] - [E=filters:%{ENV:filters}classic]
RewriteRule /37[.\/] - [E=filters:%{ENV:filters}urban]
RewriteCond %{REQUEST_URI} ^/(category1|category2|category3|category4)/
RewriteCond %{ENV:filters} !^$
RewriteRule ^([^/]+)/ /$1%{ENV:filters}.html [L,R=301]
Basically, I rebuilt the whole URL in the environment variable filters, then checked that the request is for a category and not some other part of the website, and finally redirected to the category plus the filters variable, with .html appended at the end.

Even though the new URL looks prettier to a human, I'm not sure if there's a need to change the existing URL for SEO reasons.
To get a redirect instead of a rewrite, you must use the R|redirect flag. So your rule would look like
RewriteRule (.*)filter1(.*) $1color$2 [R,L]
But if you have multiple redirects, this might hurt your SEO results; see "Chained 301 redirects should be avoided for SEO, but Google will follow 2 or 3 stacked redirects":
Remember that ideally you shouldn’t have any stacked redirects or even a single redirect if you can help it, but if required Google will follow chained redirects
But every additional redirect will make it more likely that Google won’t follow the redirects and pass PageRank
For Google keep it to two and at a maximum three redirects if you have to
Bing may not support chained redirects at all
This means you should try to replace multiple filters at once:
RewriteRule ^(.*)/filter1/(.*)/filter2/(.*)$ $1/color/$2/size/$3 [R,L]
and so on.
When the filters may come in an arbitrary order, you may use several rules and do a redirect at the end
RewriteRule ^(.*)filter1(.*)$ $1color$2 [L]
RewriteRule ^(.*)filter2(.*)$ $1price$2 [L]
RewriteRule ^(.*)filter3(.*)$ $1size$2 [L]
RewriteCond %{ENV:REDIRECT_STATUS} 200
RewriteRule ^ %{REQUEST_URI} [R,L]
RewriteCond with REDIRECT_STATUS is there to prevent an endless loop.
Once it works as it should, you may replace R with R=301. Never test with R=301, because browsers cache permanent redirects and will keep following a wrong one.
A final note, be very careful with these experiments. I managed to kill my machine twice (it became unresponsive and I had to switch off) during tests.

Related

.htaccess RewriteRule from long url to show short url

I'm trying to rewrite a long URL to a short one but can't wrap my head around it.
My survey rewrite works wonderfully, but after completing my survey, PHP redirects to www.example.com/survey_thank_you.php?survey_id=1
but I would like to show a URL like www.example.com/thank_you
I'm not even sure if this is possible.
I'm new to .htaccess and I have tried almost everything.
.htaccess
Options +FollowSymLinks
Options -MultiViews
RewriteEngine on
RewriteCond %{REQUEST_FILENAME} -f [OR]
RewriteCond %{REQUEST_FILENAME} -d
RewriteRule ^ - [L]
RewriteRule ^survey_thank_you.php?survey_name=([0-9a-zA-Z]+)/?$ Thank_you [L,NC,QSA]
RewriteRule ^([0-9a-zA-Z]+)/?$ survey_form.php?survey_name=$1 [L,NC,QSA] #works like charm.
Any help or directions will be highly appreciated.
Solution:
Options +FollowSymLinks
Options -MultiViews
RewriteEngine on
RewriteCond %{ENV:REDIRECT_STATUS} ^$
RewriteCond %{QUERY_STRING} ^survey_id=([0-9a-zA-Z]+)/?$
RewriteRule ^survey_thank_you\.php$ /%1/thank_you [R,L,QSD]
RewriteRule ^([0-9a-zA-Z]+)/thank_you$ survey_thank_you.php?survey_id=$1 [L,NC,QSA]
RewriteCond %{REQUEST_FILENAME} -f [OR]
RewriteCond %{REQUEST_FILENAME} -d
RewriteRule ^ - [L]
RewriteRule ^([0-9a-zA-Z]+)/?$ survey_form.php?survey_name=$1 [L,NC,QSA]
but after completing my survey, PHP redirects to www.example.com/survey_thank_you.php?survey_id=1
You need to "correct" the URL that PHP is redirecting you to after the survey. If the desired URL is /thank_you (or /Thank_you?) then PHP should be redirecting to that URL.
You then use mod_rewrite in .htaccess to internally rewrite /thank_you back into the URL that your application understands, i.e. /survey_thank_you.php?survey_id=1. However, therein lies another problem: where does the 1 (survey_id) in the query string come from? Presumably you don't want to hardcode this? So this would need to be passed in the requested URL, e.g. /1/thank_you or perhaps /thank_you/1?
However, is this really necessary? The resulting "thank you" page is not a page that should be indexed or a page that is normally navigated to by the user, so implementing a user-friendly URL here doesn't seem to be a worthwhile exercise?
RewriteRule ^survey_thank_you.php?survey_name=([0-9a-zA-Z]+)/?$ Thank_you [L,NC,QSA]
RewriteRule ^([0-9a-zA-Z]+)/?$ survey_form.php?survey_name=$1 [L,NC,QSA] #works like charm.
You are using a survey_name URL parameter (referencing an alphanumeric value) in your directives, but a survey_id ("numeric"?) URL parameter in your earlier example? So, which is it? Or are these rules unrelated?
You state that the second rule "works like charm", but how? What URL are you requesting? That would seem to rewrite /Thank_you to survey_form.php?survey_name=Thank_you - but that does not look correct?
As mentioned in comments, the RewriteRule pattern matches against the URL-path only. To match against the query string you need an additional condition that matches against the QUERY_STRING server variable. This would also need to be an external 3xx redirect, not an internal rewrite (in order to change the URL that the user sees). Therein lies another problem... if you don't change the URL that your PHP script is redirecting to then users will experience two redirects after submitting the form.
You also need to be careful to avoid a redirect loop, since you are internally rewriting the request in the opposite direction. You need to prevent the redirect being triggered after the request is rewritten, i.e. only direct requests from the user should be redirected.
So, to answer your specific question, it should be rewritten something like this instead:
RewriteCond %{ENV:REDIRECT_STATUS} ^$
RewriteCond %{QUERY_STRING} ^survey_name=[0-9a-zA-Z]+/?$
RewriteRule ^survey_thank_you\.php$ /Thank_you [QSD,R,L]
The check against the REDIRECT_STATUS environment variable ensures that only direct requests are processed, not internally rewritten requests by the later rewrite. REDIRECT_STATUS is empty on the initial request and set to the string 200 (as in 200 OK status) after the first successful rewrite.
The QSD flag (Apache 2.4) is necessary to discard the original query string from the redirect response.
So the above would redirect /survey_thank_you.php?survey_name=<something> to /Thank_you.
But this is losing the "survey_name" (or survey_id?), so should perhaps be more like the following, in order to preserve the "survey_name":
RewriteCond %{ENV:REDIRECT_STATUS} ^$
RewriteCond %{QUERY_STRING} ^survey_name=([0-9a-zA-Z]+)/?$
RewriteRule ^survey_thank_you\.php$ /%1/Thank_you [QSD,R,L]
Where %1 is a backreference to the value of the survey_name URL parameter captured in the preceding CondPattern.
However, you would then need to modify your rewrite that turns this back into an understandable URL.
(But you should probably not be doing this in the first place without first changing the actual URLs in the application.)
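As a rough sketch of that reverse mapping, assuming the /<survey_name>/Thank_you URL shape produced by the redirect above (the first rule is illustrative; the second is the existing rule from the question):

```apache
# Map /<survey_name>/Thank_you back to the real script.
# Internal rewrite only: no R flag, so the visible URL stays short.
RewriteRule ^([0-9a-zA-Z]+)/Thank_you$ survey_thank_you.php?survey_name=$1 [L]

# Existing rule from the question: /<survey_name> -> survey form
RewriteRule ^([0-9a-zA-Z]+)/?$ survey_form.php?survey_name=$1 [L,NC,QSA]
```

The REDIRECT_STATUS condition on the external redirect is what keeps the rewritten survey_thank_you.php request from being redirected again.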

Htaccess Redirect URL with two forward slashes (not double) won't work

I want to redirect from one domain to a new domain. At the same time, the URL structure has changed.
Old: https://www.olddomain.com/parentpage/oldtitle/
New: https://www.newdomain.com/newtitle
This is wordpress, and I placed this code above the Wordpress stuff, as well as tested it here: https://htaccess.madewithlove.be/
I tried this, which doesn't work:
Redirect 301 /parentpage/title https://www.newdomain.com/newtitle
Also, when testing it at https://htaccess.madewithlove.be/, I do have this redirect:
Redirect 301 /parentpage https://www.newdomain.com/parentpage
The tester would skip my preferred redirect above, and use this one, leaving me with this, which does not exist:
https://www.newdomain.com/parentpage/oldtitle
Even when I place the preferred redirect above this one. I need both, unfortunately.
Have also tried the following RewriteRules (not all at the same time)
ReWriteRule https://www.olddomain.com/parentpage/oldtitle/ https://www.newdomain.com/newtitle
ReWriteRule /parentpage/oldtitle/ https://www.newdomain.com/newtitle
ReWriteRule "https://www.olddomain.com/parentpage/oldtitle/" "https://www.newdomain.com/newtitle"
I think it has something to do with that second forward slash separating the parentpage name and page title, but I can't figure out how to fix it.
A RewriteRule pattern matches only the URL-path, so it will never match the http:// or https:// scheme or the host name. You may try the following.
Please make sure you clear your browser cache before testing your URLs.
RewriteEngine ON
RewriteCond %{HTTP_HOST} ^(?:www\.)?olddomain\.com [NC]
RewriteCond %{REQUEST_URI} ^/parentpage/oldtitle/?$ [NC]
RewriteRule ^(.*)$ https://www.newdomain.com/newtitle [R=301,L]
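The question also needs the broader /parentpage redirect to keep working. A minimal sketch of both redirects as mod_rewrite rules, assuming the .htaccess file sits in the document root of olddomain.com and the rules are placed above the WordPress block; the more specific rule is listed first so the general one cannot claim the request:

```apache
RewriteEngine On

# Specific page first: /parentpage/oldtitle/ -> /newtitle on the new domain
RewriteCond %{HTTP_HOST} ^(?:www\.)?olddomain\.com [NC]
RewriteRule ^parentpage/oldtitle/?$ https://www.newdomain.com/newtitle [R=301,L]

# Everything else under /parentpage keeps its path on the new domain
RewriteCond %{HTTP_HOST} ^(?:www\.)?olddomain\.com [NC]
RewriteRule ^parentpage(/.*)?$ https://www.newdomain.com/parentpage$1 [R=301,L]
```

In a .htaccess file the RewriteRule pattern is matched against the path without its leading slash, which is why neither pattern starts with /.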

How to add "everything else" rule to mod_rewrite

How can I make mod_rewrite redirect to a certain page or probably just throw 404 if no other rules have been satisfied? Here's what I have in my .htaccess file:
RewriteEngine on
RewriteRule ^\. / [F,QSA,L]
RewriteRule ^3rdparty(/.*)$ / [F,QSA,L]
RewriteCond %{REQUEST_FILENAME} -f
RewriteRule ^((images|upload)/.+|style.css)$ $1 [L]
RewriteRule ^$ special [QSA]
RewriteRule ^(special|ready|building|feedback)/?$ $1.php [QSA,L]
RewriteRule ^(ready|building)/(\d+)/?$ show_property.php?type=$1&property_id=$2 [QSA,L]
RewriteRule . error.php?code=404 [QSA,L]
This is supposed, among other things, to send user to error.php if he tries to access anything that was not explicitly specified here (by the way, what is the proper way to throw 404?). However, instead it sends user from every page to error.php. If I remove the last rule, everything else works.
What am I doing wrong?
What is happening is that after a rewrite, the new URL is run through these rewrite rules again. Eventually no other rule is triggered, so the request reaches the final rule and always ends up at the error.php page.
So you need to put some rewrite conditions in place to make this not happen.
The rewrite engine loops, so you need to pass through successful rewrites before finally rewriting to error.php. Maybe something like:
RewriteCond %{REQUEST_URI} !^/$
RewriteCond %{REQUEST_URI} !^/(special|ready|building|feedback|show_property)\.php
RewriteCond %{REQUEST_URI} !^/((images|upload)/.+|style.css)$
RewriteRule ^ error.php?code=404 [QSA,L,R=404]
Each condition makes sure the URI isn't one of the ones your other rules have rewritten to.
The R=404 will redirect to the error.php page as a "404 Not Found".
Unfortunately, it didn't work - it allows access to all files on the server (presumably because all conditions need to be satisfied). I tried an alternate solution:
Something else must be slipping through, even though when I tested your rules plus these at the end in a blank htaccess file, they seemed to work. Here is something else you can try; it is a little less nice, but since you don't actually redirect the browser anywhere, it would be hidden from clients.
You have a QSA flag at the end of all your rules, you could add a unique param to the query string after you've applied a rule, then just check against that. Example:
RewriteCond %{REQUEST_FILENAME} -f
RewriteRule ^((images|upload)/.+|style.css)$ $1?_ok [L,QSA]
then at the end:
RewriteCond %{QUERY_STRING} !_ok
RewriteRule ^ error.php?code=404&_ok [QSA,L,R=404]
In theory if none of the rules are matched (and the requested URL does not exist), it's already a 404. So I think the simplest solution is to use an ErrorDocument, then rewrite it:
RewriteEngine On
ErrorDocument 404 /404.php
RewriteRule ^404.php$ error.php?code=404 [L]
# All your other rules here...
You can do the same for any other HTTP error code.
The problem here is that after mod_rewrite finishes rewriting the URL, the result is resubmitted to mod_rewrite for another pass. So the [L] flag only makes the rule the last one for the current pass. As explained much better in this question, mod_rewrite supports another flag starting from Apache version 2.3.9, [END], which makes the current mod_rewrite pass the last one. For Apache 2.2 a number of solutions are offered, but since one of them was a bit clumsy and another didn't work, my current solution is to add two more rules that allow a specific set of files to be accessed while sending 404 for everything else:
RewriteRule ^((images|upload)/.+|style.css|(special|ready|building|feedback|property).php)$ - [QSA,L]
RewriteRule .* - [QSA,L,R=404]
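Where Apache 2.3.9 or later is available, the [END] flag mentioned above can replace the whitelist workaround. The following is a sketch rather than a tested drop-in: it reuses the rules from the question, folds the ^$ rule into a direct rewrite to special.php (an assumption about the intent), and finishes with a catch-all that returns 404:

```apache
RewriteEngine On

# Forbidden paths ([F] already stops processing)
RewriteRule ^\. - [F]
RewriteRule ^3rdparty(/.*)$ - [F]

# Existing static files are served as-is; END stops all further rewriting
RewriteCond %{REQUEST_FILENAME} -f
RewriteRule ^((images|upload)/.+|style\.css)$ - [END]

# Application URLs; END keeps the rewritten target away from the catch-all
RewriteRule ^$ special.php [QSA,END]
RewriteRule ^(special|ready|building|feedback)/?$ $1.php [QSA,END]
RewriteRule ^(ready|building)/(\d+)/?$ show_property.php?type=$1&property_id=$2 [QSA,END]

# Anything that reached this point matched none of the rules above
RewriteRule ^ - [R=404]
```

Because [END] stops all further per-directory rewrite processing for the request, the internally rewritten special.php or show_property.php never reaches the final rule. A direct request for /special.php would, under this sketch, receive the 404 as well.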
I think your last rule should be
RewriteRule ^(.*)$ error.php?code=404&query=$1 [QSA,L]
You could leave out the parentheses and the $1 parameter, but it may be useful to know what the user was trying to reach.
Hope this does the trick!

Replace one URL with another by using mod_rewrite

Current situation
RewriteRule ^$ /index.php?page=Portal [R=301,L]
When a user comes to the website and goes to the "root" url of the domain (RegExp "^$") he's redirected to /index.php?page=Portal
That's working.
Now we have "index.php?page=Portal" in the google index and we have tons of links to that page on various locations all over the internet.
Intended new situation
We want the portal page to show up on the root url - no redirect. That's no problem... Just remove the redirect:
RewriteRule ^$ /index.php?page=Portal [L]
Now we also want the old url to redirect to the new location, and that's where I fail but can't see why:
RewriteCond %{QUERY_STRING} ^page=Portal$
RewriteRule ^index.php$ http://www.jacatu.de/? [R=301,L]
As soon as I do this I end up in a redirect loop:
(When I change to 302 in .htaccess I see 302 redirects, so the loop really seems to be caused by mod_rewrite)
But why? All rules are marked as last [L] - so I think I can rule out that rule 2 triggers rule 1.
I enabled logging as suggested by Jacek Prucia, and in fact it looked like having [L] on the rule doesn't stop execution. Both rules were processed.
I now changed the first rewrite to
RewriteRule ^$ /index.php?page=Portal&int=1 [L]
so that it doesn't match the RewriteCond of the internal rewrite so theoretically my problem is solved. It would be nice to know, though, why it did what it did. :)
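A likely explanation, consistent with the log output described above: in a .htaccess file [L] only ends the current pass, and the rewritten /index.php?page=Portal is then run through the rules again, where it matches the redirect rule. Instead of adding a dummy int=1 parameter, the redirect can be limited to requests that have not been internally rewritten yet. A sketch, assuming the rules live in the document root's .htaccess:

```apache
RewriteEngine On

# Redirect only the client's original request. REDIRECT_STATUS is empty
# on the first pass and set (to 200) after an internal rewrite.
RewriteCond %{ENV:REDIRECT_STATUS} ^$
RewriteCond %{QUERY_STRING} ^page=Portal$
RewriteRule ^index\.php$ http://www.jacatu.de/? [R=301,L]

# Serve the portal on the root URL without changing the visible address.
RewriteRule ^$ /index.php?page=Portal [L]
```

The trailing ? on the redirect target drops the original query string, as in the rule from the question.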

Redirecting to same page with .htaccess

From my .htaccess file:
RewriteRule ^showPAGE.php page [NC,R=301]
RewriteRule ^page showPAGE.php [NC,L]
I want users going to url domain.com/showPAGE.php to be redirected to domain.com/page .
When domain.com/page is being entered, I want it to show the content of the file showPAGE.php.
Is that possible to do?
The above results an infinite redirection loop.
Thanks
You're trying to do something that's very tricky. The problem is that, by design, a RewriteRule that rewrites the URL triggers the complete set of rules again. You can only get out of the loop when you obtain a final URL that does not match any of the rules, and that's the tricky part since you are reusing the showPAGE.php name.
My best attempt so far involves adding a fake hidden string:
RewriteEngine On
RewriteCond %{REQUEST_URI} ^/showPAGE\.php
RewriteCond %{QUERY_STRING} !^internal
RewriteRule ^ http://%{HTTP_HOST}/page [NC,R=301,L]
RewriteRule ^page$ showPAGE.php?internal [NC,L]
It works but it's not pleasant. It's definitely easier to handle the redirection from within PHP or simply to pick another name.
The redirect from showPAGE.php to page needs to have [L] so that it will stop processing and redirect at once, rather than going on and applying other rules (which at once map it back to showPAGE.php). Try this:
RewriteRule ^showPAGE.php page [NC,R=301,L]
RewriteRule ^page showPAGE.php [NC,L]
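If these rules sit in a .htaccess file, adding [L] alone may not be enough, because the internally rewritten showPAGE.php passes through the rule set again and matches the redirect. One common guard is to test THE_REQUEST, which holds the request line as the browser sent it and is not altered by internal rewrites. A sketch under that assumption:

```apache
RewriteEngine On

# Redirect only when the browser itself asked for /showPAGE.php
RewriteCond %{THE_REQUEST} \s/showPAGE\.php[?\s] [NC]
RewriteRule ^showPAGE\.php$ /page [R=301,L,NC]

# Internally serve showPAGE.php for /page
RewriteRule ^page$ showPAGE.php [L,NC]
```

This avoids the hidden query-string marker while keeping the showPAGE.php file name.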