Isapi Rewrite 301 redirect resolves as 404 - circular reference? - seo

I'm trying to use IIS Isapi Rewrite to do the following...
I need seo-friendly URLs to be (silently) converted back to application friendly URLs like so:
RewriteRule ^/seo-friendly-url/ /test/index.cfm [I,L]
Simple enough.
But I also need URLs already indexed in search engines (for example) to be 301 redirected to the seo-friendly version. Like so:
RewriteRule ^/test/index.cfm /seo-friendly-url/ [I,R=301]
Each of these works fine in isolation. But when I have both in my .ini file I end up with /seo-friendly-url/ showing in my browser address bar but I'm being served a 404. (Yes, /test/index.cfm definitely exists!)
I know it looks like a circular reference, but the first rule only rewrites the URL between IIS and the application - there's no redirect, so I'm not hitting Isapi Rewrite a second time. Or am I wrong about that?
I've enabled logging on Isapi Rewrite and I'm seeing the following:
HttpFilterProc SF_NOTIFY_PREPROC_HEADERS
DoRewrites
New Url: '/seo-friendly-url/'
ApplyRules (depth=0)
Rule 1 : 1
Result (length 15): /test/index.cfm
ApplyRules (depth=1)
Rule 1 : -1
Rule 2 : 1
Result (length 18): /seo-friendly-url/
ApplyRules: returning 301
ApplyRules: returning 1
Rewrite Url to: '/seo-friendly-url/'
Anyone got any ideas?

You have two different things here, and they should work together if set up correctly.
The first rule is a rewrite that the client user agent never sees: it requests /seo-friendly-url/, and you rewrite the URL internally and respond.
The second rule is not really a rewrite but a redirect. You send it back to the client, which then re-requests /seo-friendly-url/. I think you need to use [R=301,L] to say that this is the end of the line: L stops rule processing, so the redirect is returned immediately.

Through some trial and error I've come up with a solution for this.
Specify that the redirect match is at the end of the string using the $ symbol:
RewriteRule ^/test/index.cfm$ /seo-friendly-url/ [I,R=301]
Make the rewritten URL trivially different from the redirect match string - in this case adding an unnecessary "?":
RewriteRule ^/seo-friendly-url/ /test/index.cfm? [I,L]
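
To see why these two changes break the cycle, here is a minimal Python sketch of the pattern matching only. It assumes, as the depth=1 pass in the log above suggests, that ISAPI Rewrite runs the rules again on the internally rewritten URL. Python's re module stands in for the filter's regex engine, and the helper name is made up for illustration.

```python
import re

# Redirect patterns: the original one and the anchored one from the fix.
# re.IGNORECASE stands in for the [I] flag.
UNANCHORED = re.compile(r"^/test/index.cfm", re.IGNORECASE)
ANCHORED = re.compile(r"^/test/index.cfm$", re.IGNORECASE)

def redirect_would_fire(pattern, rewritten_url):
    """Return True if the redirect rule matches the internally rewritten URL."""
    return pattern.search(rewritten_url) is not None

# Original pair: the rewrite target matches the redirect rule, so the 301 fires.
print(redirect_would_fire(UNANCHORED, "/test/index.cfm"))   # True
# Fixed pair: the trailing "?" keeps the anchored pattern from matching.
print(redirect_would_fire(ANCHORED, "/test/index.cfm?"))    # False
# A direct request for the old URL is still redirected.
print(redirect_would_fire(ANCHORED, "/test/index.cfm"))     # True
```

Both changes are needed in this model: without the $ anchor, the pattern would still match the rewritten URL even with the extra "?" appended.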


.htaccess RewriteRule based on anchors

I am trying to update a RewriteRule. Previously, the redirect looked like this:
https://mywebsite.com/docs/#/en/introduction → https://manual.mywebsite.com/#/en/introduction
I would like to use /docs/ for something else now but I would like to keep redirecting requests containing a forward slash after the # to the manual subdomain. This is what I would like to achieve:
The old redirects continue working as usual:
https://mywebsite.com/docs/#/en/introduction → https://manual.mywebsite.com/#/en/introduction
This would not get redirected since there is no forward slash following the #:
https://mywebsite.com/docs/#overview
Here is what I have:
The .htaccess file containing the following existing rule which redirects everything:
RewriteRule ^docs/(.*) https://manual.mywebsite.com/$1 [L,R=301]
I tried this but it did not work (I tried with https://htaccess.madewithlove.com/ which says that my rule does not match the URL I entered):
RewriteRule ^docs/#/(.*) https://manual.mywebsite.com/#/$1 [L,R=301]
I also read about the NE (no escape) flag (https://httpd.apache.org/docs/2.2/rewrite/advanced.html#redirectanchors) which did not help either.
I am also sure that the server is actually using the file.
To summarize, my problem is that I want to match a URL containing /docs/#/ and redirect it to a subdomain, keeping the /#/ and everything that follows it.
Anchors (URL fragments) are not part of the URL that is transmitted to the server. They stay within the browser, so you can't build a rewrite rule that takes anchors into account.
Inspect your access logs to see what the server actually receives.
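
A quick way to convince yourself is to split the two URLs from the question the way a browser does before sending the request. This is only an illustrative Python sketch using the standard urllib.parse module; the helper name is hypothetical.

```python
from urllib.parse import urlsplit

def request_target(url):
    """Return the path (plus query, if any) that ends up on the request line."""
    parts = urlsplit(url)
    target = parts.path or "/"
    if parts.query:
        target += "?" + parts.query
    return target  # parts.fragment is deliberately left out: it is never sent

print(request_target("https://mywebsite.com/docs/#/en/introduction"))  # /docs/
print(request_target("https://mywebsite.com/docs/#overview"))          # /docs/
```

Both URLs reach the server as /docs/, so no server-side rule can tell them apart.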

.htaccess 301 redirect old to new domain, from hosting root to subfolder, working but why?

The scenario: I've moved a WordPress website to a new domain and want to 301 redirect all the pages from the old domain to the new domain. Both sites are on the same hosting account running Apache. The old site is at the root level (public_html), and the new site is in a subfolder (below/inside the root).
I've managed to make this work, but I'd like to learn and understand why it works. So below is a quick overview of my 'journey' and solution, together with three specific questions.
First I tried to do the redirects like this (code added to the root .htaccess file):
# 301 Page Redirects - not working - causes redirect loop
redirect 301 / https://new-domain.com/
redirect 301 /services/ https://new-domain.com/services/
redirect 301 /recipes/ https://new-domain.com/recipes/
But this causes a redirect loop. I'm guessing because the .htaccess file with these rules is at the root level and therefore also affects the subfolders.
Question 1: Is my assumption above about the reason for the redirect loop correct?
Then I tried to be more specific and put this code in the root .htaccess file instead:
# 301 Page Redirects - not working - does nothing at all - not sure why
redirect 301 https://old-domain.com/ https://new-domain.com/
redirect 301 https://old-domain.com/services/ https://new-domain.com/services/
redirect 301 https://old-domain.com/recipes/ https://new-domain.com/recipes/
I was hoping the above code would do the trick, because it's more specific about the old domain. My thinking was that it specifies the old domain exactly and so would circumvent the redirect loop. But instead this code seems to have no effect at all. The redirect loop was gone, but now no redirects were happening anymore at all.
Question 2: Why would the above code not produce any redirects at all?
Then I found this answer and applied the code from that, which works perfectly and creates all the redirects. Plus it's much more elegant than my previous attempts above. This is the code:
# 301 Redirects from old-domain.com to new-domain.com - THIS CODE WORKS - Yay!
RewriteEngine On
RewriteCond %{HTTP_HOST} ^old-domain.com$ [OR]
RewriteCond %{HTTP_HOST} ^www.old-domain.com$
RewriteRule (.*)$ https://new-domain.com/$1 [R=301,L]
Question 3: Why does this code not cause any redirect loops when I place it in the root .htaccess?
I realise I'm copy/pasting code without fully understanding why it works. So I'd love an explanation in simple terms about these behaviours. Thank you.
Answer 1:
From the info that you have provided, I would say no. You did not specify whether the new-domain.com website is configured (in the Apache configuration) with its document root being public_html or public_html/subfolder (judging by the described behaviour I would say it is the former). In that case, when you request https://old-domain.com/anything, the server will (because of the unconditional redirect in your first rule) respond with a redirect to https://new-domain.com/anything. The client browser will then request that URL, which hits the same Apache and the same .htaccess and results in the same redirect again, causing the loop.
Answer 2:
Redirect syntax:
Redirect [status] [URL-path] URL
The old URL-path is a case-sensitive (%-decoded) path beginning with a slash.
In your rule, you are specifying [URL-path] as https://old-domain.com/, which is wrong: it can be /, /services/, or /recipes/, but not https://old-domain.com/ or https://old-domain.com/services/. The URL-path of the request never matches the [URL-path] specified in your rule, so the redirect never happens.
Answer 3:
This basically does the same thing as your first attempt (discussed in Answer 1), with one important difference: the server responds with a redirect only if the hostname in the request (or, to be more precise, the content of the Host: header) is old-domain.com or www.old-domain.com. That prevents the loop, since the client's second request uses the new-domain.com hostname.
Also, from the above, it seems that your "new" website in the subfolder will never be served: whether old-domain.com or new-domain.com is requested, the site from the public_html folder will be shown (and only the hostname in the client's browser address bar will change).
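
The difference between the looping and the non-looping setup can be sketched as a tiny simulation. It assumes, as Answer 1 does, that both hostnames are served from the same document root and therefore the same .htaccess; the function names and the hop limit are made up for illustration.

```python
def unconditional(host):
    # Like "redirect 301 / https://new-domain.com/": fires for every hostname.
    return "new-domain.com"

def host_conditional(host):
    # Like the RewriteCond pair: fires only for the old hostnames.
    if host in ("old-domain.com", "www.old-domain.com"):
        return "new-domain.com"
    return None

def follow(host, rule, max_hops=5):
    """Follow redirects; return (hosts visited, whether it still looped at the limit)."""
    visited = [host]
    for _ in range(max_hops):
        target = rule(host)
        if target is None:        # no redirect: the server would serve content
            return visited, False
        host = target
        visited.append(host)
    return visited, True          # still redirecting after max_hops: a loop

print(follow("old-domain.com", unconditional)[1])   # True
print(follow("old-domain.com", host_conditional))   # (['old-domain.com', 'new-domain.com'], False)
```

With the host check, the second request arrives under new-domain.com, no rule fires, and the chain ends after one hop.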

Remove and block unwanted postback string

Google webmaster page found duplicate content due to the following:
If we take this dynamic search page example.com/armin-music-page-1
google found post back string after "page-1" as shown in example below
example.com/armin-music-page-1$dneix
example.com/armin-music-online-page-1&q=sa=x&ei=-a
example.com/music-dance-club-mix-page-1%balbla
example.com/armin-search-page-1#einx
and many random postback strings
My question, how do i remove or redirect to 404 anything that is generated after "page-1" via apache mod_rewrite .htaccess so google finds clean url only
Thank you in advance!
You can get rid of the stuff after page-1 by redirecting to the URL where that's removed:
RewriteRule ^(.+-page-1)(.+)$ /$1? [L,R=301]
(rule needs to be near the top of the htaccess file)
Or if you want to send to 404:
RewriteRule ^(.+-page-1)(.+)$ - [L,R=404]
But one thing you can't do is deal with requests that look like this:
example.com/armin-search-page-1#einx
because the #einx part of the URL is never sent to the server, so there's no way for the server to match against it. All apache and mod_rewrite sees is /armin-search-page-1.
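
If you want to check what the pattern above would catch before deploying it, the same regex can be tried in Python. This is only a sketch: Python's re module stands in for mod_rewrite's regex engine, the helper name is hypothetical, and the paths are written without a leading slash because that is how .htaccess rules see them.

```python
import re

# Same pattern as the RewriteRule above.
PATTERN = re.compile(r"^(.+-page-1)(.+)$")

def clean_target(path):
    """Return the redirect target for a path with trailing junk, or None if no match."""
    m = PATTERN.match(path)
    return "/" + m.group(1) if m else None

print(clean_target("armin-music-page-1$dneix"))                # /armin-music-page-1
print(clean_target("armin-music-online-page-1&q=sa=x&ei=-a"))  # /armin-music-online-page-1
print(clean_target("armin-music-page-1"))                      # None
print(clean_target("armin-music-page-10"))                     # /armin-music-page-1
```

Note the last case: if pages such as page-10 exist on the site, this pattern would redirect them to page-1 as well. A possible adjustment, untested here against a live server, is to require a non-digit right after page-1, for example ^(.+-page-1)([^0-9].*)$.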

multiple folder redirect

I have been trying variations of the following without success:
Redirect permanent /([0-9]+)/([0-9]+)/(.?).html http://example.com/($3)
It seems to have no effect. I have also tried rewrite with similar lack of results.
I want all links similar to: http://example.com/2002/10/some-long-title.html
to redirect the browser and spiders to: http://example.com/some-long-title
When I save this to my server, and visit a link with the nested folders, it just returns a 404 with the original URL unchanged in the address bar. What I want is the new location in the address bar (and the content of course).
I guess this is more or less what you are looking for:
RewriteEngine On
RewriteRule ^/([0-9]+)/([0-9]+)/(.+)\.html$ http://example.com/$3 [L,R=301]
This can be used inside the central Apache configuration. If you have to use .htaccess files because you don't have access to the Apache configuration, the syntax is slightly different: there the pattern is matched against the path without its leading slash, so drop the / after ^.
Using mod_alias, you want the RedirectMatch, not the regular Redirect directive:
RedirectMatch permanent ^/([0-9]+)/([0-9]+)/(.+)\.html$ http://example.com/$3
Your last grouping needs to be (.+), which matches one or more characters; what you had before, (.?), matches only zero or one character. Also, the last backreference doesn't need the parentheses: write $3, not ($3).
Using mod_rewrite, it looks similar:
RewriteEngine On
RewriteRule ^/([0-9]+)/([0-9]+)/(.+)\.html$ http://example.com/$3 [L,R=301]
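
The effect of (.?) versus (.+) is easy to check outside Apache. The following Python sketch uses the re module as a stand-in for the server's regex engine and the example URL from the question; the function name is hypothetical.

```python
import re

ZERO_OR_ONE = re.compile(r"^/([0-9]+)/([0-9]+)/(.?)\.html$")
ONE_OR_MORE = re.compile(r"^/([0-9]+)/([0-9]+)/(.+)\.html$")

def new_location(pattern, path):
    """Return the redirect target built from the third group, or None if no match."""
    m = pattern.match(path)
    return "http://example.com/" + m.group(3) if m else None

path = "/2002/10/some-long-title.html"
print(new_location(ZERO_OR_ONE, path))  # None
print(new_location(ONE_OR_MORE, path))  # http://example.com/some-long-title
```

With (.?) the rule can only match titles of at most one character, which is why the original request fell through to a 404.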

Redirect URL using Query String parameter in URL

I have a bunch of URLs from an old site that I recently updated to EE2. These URLs look like this:
http://www.example.com/old/path.asp?dir=Folder_Name
These URLs will need to redirect to:
http://www.example.com/new_path/folders/folder_name
Not all Folder_Name strings match up with folder_name URL segments, so these will most likely need to be static redirects.
I tried the following Rule for a particular folder called "Example_One" which maps to a page on the new site called "example1":
Redirect 301 /old/path.asp?dir=Example_One http://www.example.com/new_path/folders/example1
But this doesn't work. Instead of redirecting I just get a 404 error telling me that http://www.example.com/old/path.asp?dir=Example_One cannot be found.
EDIT:
There's a secondary problem here too which may or may not be related: I have a catch-all rule that looks like this:
redirect 301 /old/path.asp http://www.example.com/new_path
Using the rule, requests like the first one above will be redirected to:
http://www.example.com/new_path?dir=Folder_Name
which triggers a 404 error.
Just had to scour Google a bit more to find the proper syntax for mod_rewrite.
Using the example from above:
RewriteCond %{QUERY_STRING} ^dir=Example_One$
RewriteRule ^/old/path\.asp$ /new_path/folders/example1? [R=301,L]
This fixes both of the problems above -- as to why EE is 404ing with one parameter in the Query String, that problem is still unsolved, but this works as a workaround to that problem.
You can also redirect URLs to a specific page where the parameter may have a different value each time. One example of this is Google UTM Campaign tracking (in situations like this where the tracking query string triggers a 404):
Link: http://www.example.com/?utm_source=xxx&..... (triggers 404)
Should Redirect to: http://www.example.com
RewriteCond %{QUERY_STRING} ^utm_source=
RewriteRule ^$ http://www.example.com? [R=301,L]
Note: This will only redirect those requests to the homepage, as defined by ^$. If you want to redirect utm requests to a different page, you'll need to change the first part of that RewriteRule.
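
The reason the query string needs its own RewriteCond is that the rule pattern only ever sees the path. Here is a rough Python model of the Example_One rule pair above, with urllib.parse splitting the URL the way the server does; the function name is made up, and the model ignores everything else mod_rewrite does.

```python
import re
from urllib.parse import urlsplit

def redirect_for(url):
    """Return the redirect target for the Example_One rule pair, or None."""
    parts = urlsplit(url)
    # RewriteCond %{QUERY_STRING} ^dir=Example_One$
    if not re.search(r"^dir=Example_One$", parts.query):
        return None
    # The RewriteRule pattern is tested against the path only.
    if not re.search(r"^/old/path\.asp$", parts.path):
        return None
    # The trailing "?" in the substitution drops the original query string.
    return "/new_path/folders/example1"

print(redirect_for("http://www.example.com/old/path.asp?dir=Example_One"))  # /new_path/folders/example1
print(redirect_for("http://www.example.com/old/path.asp?dir=Other"))        # None
```

The same split explains why the plain Redirect attempt never matched: the path is /old/path.asp, and ?dir=Example_One is not part of it.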