How to remove % from end of url in apache rewrite - apache

I've noticed an increase of requested URLs with a % at the end.
For example: http://sample.com/countries/usa%
We have an Apache rewrite rule that converts a properly formed request to the desired page on the server:
RewriteRule ^countries/([a-zA-Z]+)$ /index.php?c=$1
However, when the user (or a bot?) adds the % symbol to the end, it forces a 400 error.
Google's Webmaster Tools has found an increase in these types of errors and I haven't a clue how to get rid of them. I can't do it in PHP because the error is happening at the Apache level.
Any help will be most appreciated.

I would add something like this at the beginning of your .htaccess file:
RewriteRule ^(.*)\%$ /$1 [R=301,L]
This causes a permanent redirect (because of R=301) to the page without the '%'.
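To see what the pattern in that rule captures, here is a minimal Python sketch. It uses Python's `re` module as a stand-in for mod_rewrite's regex engine, and the helper name `strip_trailing_percent` is made up for illustration. It only checks the pattern itself; it does not show whether Apache passes such a request to mod_rewrite in the first place.

```python
import re

# Same pattern as the rule above (the backslash before % is optional in a regex).
# Python's re module is only a stand-in for mod_rewrite's regex engine here.
TRAILING_PERCENT = re.compile(r"^(.*)%$")


def strip_trailing_percent(path):
    """Return the path without its trailing '%', or None if the pattern does not match."""
    match = TRAILING_PERCENT.match(path)
    return match.group(1) if match else None


if __name__ == "__main__":
    print(strip_trailing_percent("countries/usa%"))  # countries/usa
    print(strip_trailing_percent("countries/usa"))   # None
```

A path without the trailing % is left alone, so well-formed requests would still reach the existing countries rule.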

Related

.htaccess RewriteRule based on anchors

I am trying to update a RewriteRule. Previously, the redirect looked like this:
https://mywebsite.com/docs/#/en/introduction → https://manual.mywebsite.com/#/en/introduction
I would like to use /docs/ for something else now but I would like to keep redirecting requests containing a forward slash after the # to the manual subdomain. This is what I would like to achieve:
The old redirects continue working as usual:
https://mywebsite.com/docs/#/en/introduction → https://manual.mywebsite.com/#/en/introduction
This would not get redirected since there is no forward slash following the #:
https://mywebsite.com/docs/#overview
Here is what I have:
The .htaccess file contains the following existing rule, which redirects everything:
RewriteRule ^docs/(.*) https://manual.mywebsite.com/$1 [L,R=301]
I tried this but it did not work (I tried with https://htaccess.madewithlove.com/ which says that my rule does not match the URL I entered):
RewriteRule ^docs/#/(.*) https://manual.mywebsite.com/#/$1 [L,R=301]
I also read about the NE (no escape) flag (https://httpd.apache.org/docs/2.2/rewrite/advanced.html#redirectanchors) which did not help either.
I am also sure that the server is actually using the file.
To summarize, my problem is that I want to match a URL containing /docs/#/ and redirect it to a subdomain, keeping the /#/ and everything that follows it.
Anchors are not part of the URL that's transmitted to the server. They stay within the browser, so you can't build a rewrite rule that takes anchors into account.
Inspect your access logs to see what the server actually receives.
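As a rough illustration of that point, the following Python sketch splits a URL the way a client does before sending a request. The helper name `request_target` is hypothetical; the URLs are the ones from the question.

```python
from urllib.parse import urlsplit


def request_target(url):
    """Split a URL into what goes on the request line and what stays in the browser."""
    parts = urlsplit(url)
    target = parts.path or "/"
    if parts.query:
        target += "?" + parts.query
    # The fragment is returned separately because it is never sent to the server.
    return target, parts.fragment


if __name__ == "__main__":
    for url in (
        "https://mywebsite.com/docs/#/en/introduction",
        "https://mywebsite.com/docs/#overview",
    ):
        print(request_target(url))
```

Both URLs produce the same request target, /docs/, so a server-side rule has nothing to tell them apart.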

hashtag in apache .htaccess

I am using the following .htaccess:
RewriteEngine On
RewriteRule ^([0-9]+)/([0-9]+)$ /api/web/index.html#$1/$2 [R=301,NC,L]
When a user types the following URL in their browser:
http://localhost:8080/1/2
I expect Apache to perform the redirection and also change the URL displayed in the browser (through R=301) to:
http://localhost:8080/api/web/index.html#1/2
Changing the URL displayed in the browser is important, because it ensures that index.html's JavaScript can parse the URL correctly.
However, what I actually get is:
http://localhost:8082/api/web/index.html%231/2
This results in an Apache error.
Apache wrongly assumes that I want to fetch a file named 2 located in the directory api/web/index.html%231/
Is there any way to solve this by modifying .htaccess only?
The # is getting encoded as %23. Try using the NE flag in your rule:
RewriteRule ^([0-9]+)/([0-9]+)$ /api/web/index.html#$1/$2 [R=301,NC,L,NE]
The NE (noescape) flag tells mod_rewrite not to escape special characters such as # in the result of the rewrite.
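The effect can be illustrated by analogy in Python. This is only a sketch: `urllib.parse.quote` is not what mod_rewrite uses internally, and the helper names `escaped` and `not_escaped` are made up, but the resulting strings match the ones in the question.

```python
from urllib.parse import quote

TARGET = "/api/web/index.html#1/2"


def escaped(target):
    """Percent-encode the target, treating '#' as an ordinary character (like the default)."""
    return quote(target)


def not_escaped(target):
    """Leave '#' untouched, analogous to what the NE flag asks for."""
    return quote(target, safe="/#")


if __name__ == "__main__":
    print(escaped(TARGET))      # /api/web/index.html%231/2
    print(not_escaped(TARGET))  # /api/web/index.html#1/2
```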

Remove and block unwanted postback string

Google Webmaster Tools found duplicate content due to the following.
Take this dynamic search page as an example: example.com/armin-music-page-1
Google found postback strings after "page-1", as shown in the examples below:
example.com/armin-music-page-1$dneix
example.com/armin-music-online-page-1&q=sa=x&ei=-a
example.com/music-dance-club-mix-page-1%balbla
example.com/armin-search-page-1#einx
and many other random postback strings.
My question: how do I remove, or redirect to 404, anything that is generated after "page-1" via Apache mod_rewrite in .htaccess, so that Google finds only the clean URLs?
Thank you in advance!
You can get rid of the stuff after page-1 by redirecting to the URL where that's removed:
RewriteRule ^(.+-page-1)(.+)$ /$1? [L,R=301]
(This rule needs to be near the top of the .htaccess file.)
Or if you want to send to 404:
RewriteRule ^(.+-page-1)(.+)$ - [L,R=404]
But one thing you can't do is deal with requests that look like this:
example.com/armin-search-page-1#einx
because the #einx part of the URL is never sent to the server, so there's no way for the server to match against it. All Apache and mod_rewrite see is /armin-search-page-1.
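A quick way to check what the pattern in those rules matches is to try it against the sample paths. The sketch below uses Python's `re` module as an approximation of mod_rewrite's matching, applied to the raw path strings from the question; the helper name `clean_path` is hypothetical.

```python
import re

# Same pattern as in the rules above; Python's re is only an approximation here.
AFTER_PAGE_1 = re.compile(r"^(.+-page-1)(.+)$")


def clean_path(path):
    """Return the redirect target for a path with trailing junk, or None if no rule fires."""
    match = AFTER_PAGE_1.match(path)
    return "/" + match.group(1) if match else None


if __name__ == "__main__":
    for path in (
        "armin-music-page-1$dneix",
        "armin-music-online-page-1&q=sa=x&ei=-a",
        "music-dance-club-mix-page-1%balbla",
        "armin-music-page-1",
    ):
        print(path, "->", clean_path(path))
```

The clean URL does not match, so it is not redirected to itself. Note that under this pattern a path such as armin-music-page-12 also matches (the trailing 2 counts as extra text), which may or may not be what you want if other page numbers exist.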

mod_rewrite to redirect url not working

I cannot seem to get a mod_rewrite rule to work. We have a domain name that has already been printed here, there and everywhere from when the website was Flash. It has a # in its path, /#login.php, and we want people who enter this URL to be redirected to /login.php. I have already tried this rule but can't get it to work:
RewriteEngine On
RewriteRule ^/#login.php$ /login.php
I have also checked that the rewrite engine is working by using a redirect to Google. I just need the out-of-date #login.php to go to the new login.php.
Thanks
The # part of the URL (the "fragment") is not sent to the server; it's purely for the client side to point to some part of the page. If you see http://hostname.com/#login.php in your address bar, the only thing the server gets is a request for /. You may need to employ some JavaScript on the page to look at the browser's address bar, find the fragment, and maybe send it to the server as a query string.
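The string transformation such a script would need is small. The sketch below shows it in Python purely to make the logic concrete; in practice it would run in the browser, and the helper name `fragment_to_path` is made up for illustration. It turns the fragment into the path directly, which is what the question asks for, rather than into a query string.

```python
from urllib.parse import urlsplit, urlunsplit


def fragment_to_path(url):
    """Turn http://host/#login.php into http://host/login.php; leave other URLs alone."""
    parts = urlsplit(url)
    if not parts.fragment:
        return url
    new_path = "/" + parts.fragment.lstrip("/")
    # Rebuild the URL with the fragment promoted to the path and no fragment left.
    return urlunsplit((parts.scheme, parts.netloc, new_path, parts.query, ""))


if __name__ == "__main__":
    print(fragment_to_path("http://hostname.com/#login.php"))
    print(fragment_to_path("http://hostname.com/login.php"))
```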
Try:
RewriteEngine On
RewriteBase /
RewriteRule ^#login\.php$ /login.php [QSA,L]
Is mod_rewrite enabled and available?

Isapi Rewrite 301 redirect resolves as 404 - circular reference?

I'm trying to use IIS Isapi Rewrite to do the following...
I need seo-friendly URLs to be (silently) converted back to application friendly URLs like so:
RewriteRule ^/seo-friendly-url/ /test/index.cfm [I,L]
Simple enough.
But I also need URLs already indexed in search engines (for example) to be 301 redirected to the seo-friendly version. Like so:
RewriteRule ^/test/index.cfm /seo-friendly-url/ [I,R=301]
Each of these works fine in isolation. But when I have both in my .ini file I end up with /seo-friendly-url/ showing in my browser address bar but I'm being served a 404. (Yes, /test/index.cfm definitely exists!)
I know it looks like a circular reference, but the first rule only rewrites the URL between IIS and the application - there's no redirect, so I'm not hitting Isapi Rewrite a second time. Or am I wrong about that?
I've enabled logging on Isapi Rewrite and I'm seeing the following:
HttpFilterProc SF_NOTIFY_PREPROC_HEADERS
DoRewrites
New Url: '/seo-friendly-url/'
ApplyRules (depth=0)
Rule 1 : 1
Result (length 15): /test/index.cfm
ApplyRules (depth=1)
Rule 1 : -1
Rule 2 : 1
Result (length 18): /seo-friendly-url/
ApplyRules: returning 301
ApplyRules: returning 1
Rewrite Url to: '/seo-friendly-url/'
Anyone got any ideas?
You have two different rewrites here, and it should work if you do it right.
The first one is never seen by the client user agent. It requests /seo-friendly and you rewrite it internally and respond.
The second one is not really a rewrite but a redirect. You send that back to the client and it re-requests /seo-friendly. I think you need to use [R=301,L] to say that this is the end of the line and the response should just be returned (L does that).
Through some trial and error I've come up with a solution for this.
Specify that the redirect match is at the end of the string using the $ symbol:
RewriteRule ^/test/index.cfm$ /seo-friendly-url/ [I,R=301]
Make the rewritten URL trivially different from the redirect match string - in this case adding an unnecessary "?":
RewriteRule ^/seo-friendly-url/ /test/index.cfm? [I,L]
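The interaction between the two rules can be modelled with a small Python sketch. It assumes, as the log output in the question suggests, that the rules are applied again to the result of an internal rewrite. The `apply_rules` helper and its rule tuples are hypothetical and only approximate ISAPI Rewrite's behaviour.

```python
import re

# Each rule is (pattern, target, is_redirect). The [I] flag is modelled with re.IGNORECASE.
ORIGINAL_RULES = [
    (r"^/seo-friendly-url/", "/test/index.cfm", False),
    (r"^/test/index.cfm", "/seo-friendly-url/", True),
]
FIXED_RULES = [
    (r"^/test/index.cfm$", "/seo-friendly-url/", True),
    (r"^/seo-friendly-url/", "/test/index.cfm?", False),
]


def apply_rules(url, rules, max_depth=5):
    """Apply rules in order; after an internal rewrite, run them again on the result."""
    for _ in range(max_depth):
        for pattern, target, is_redirect in rules:
            if re.search(pattern, url, re.IGNORECASE):
                if is_redirect:
                    return ("301", target)
                url = target
                break  # re-run the rules on the rewritten URL
        else:
            return ("serve", url)  # no rule matched, so this URL is served
    return ("loop", url)


if __name__ == "__main__":
    print(apply_rules("/seo-friendly-url/", ORIGINAL_RULES))
    print(apply_rules("/seo-friendly-url/", FIXED_RULES))
    print(apply_rules("/test/index.cfm", FIXED_RULES))
```

In this model the original rules turn a request for /seo-friendly-url/ into a 301 back to itself, matching the log. With the anchored pattern and the trailing "?", the rewritten URL no longer matches the redirect rule, while a direct request for /test/index.cfm is still redirected.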