Apache Redirect // to no //

An email went out with the wrong link (https://www.digitalmarketer.com/digital-marketing/content-marketing-strategy//) and we need to redirect the // to (https://www.digitalmarketer.com/digital-marketing/content-marketing-strategy/) but no matter what I try, the redirect isn't working.
They also want every URL to be redirected so that it always begins with https://www and never ends with index.html, so the .htaccess file already contains:
RewriteEngine On
RewriteCond %{HTTPS} !=on
RewriteRule ^(.*)$ https://%{HTTP_HOST}%{REQUEST_URI} [L,R=301]
RewriteCond %{HTTP_HOST} !^$
RewriteCond %{HTTP_HOST} !^www\. [NC]
RewriteCond %{HTTPS}s ^on(s)|
RewriteRule ^ http%1://www.%{HTTP_HOST}%{REQUEST_URI} [R=301,L]
RewriteRule ^content\-marketing\-strategy/index\.html$ /digital-marketing/content-marketing-strategy/? [L,R=301]
I've tried adding a new RewriteRule, but this won't work:
RewriteRule ^content\-marketing\-strategy//$ /digital-marketing/content-marketing-strategy/? [L,R=301]
I'm very new to Apache and redirects so any help is much appreciated! Thank you!
Edit: Of note, this is in an .htaccess file inside of the digital-marketing folder (https://www.digitalmarketer.com/digital-marketing/.htaccess) which was done so all the above rules would only apply to the digital-marketing folder.

You can insert this rule after your other rules to collapse multiple // into /:
RewriteCond %{THE_REQUEST} //
RewriteRule ^.*$ /digital-marketing/$0 [R=301,L,NE]
Apache automatically collapses multiple // into a single / in the path that the RewriteRule pattern is matched against, so the captured value $0 already has every // converted to /.
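One caveat with the condition above: %{THE_REQUEST} holds the whole request line (for example GET /path?query HTTP/1.1), so a bare // test also matches a // that appears only in the query string, such as ?next=http://..., and the rule would then redirect to the same URL again. A minimal sketch of a stricter variant, assuming the same .htaccess inside /digital-marketing/ and a PCRE-based Apache 2.x; the tightened regex is my own addition and is untested against this site:

```apache
# Only fire when the // is in the path part of the request line,
# i.e. somewhere before the first "?" or the space that ends the URL.
RewriteCond %{THE_REQUEST} ^[A-Z]+\s[^?\s]*//
# $0 is the per-directory path with repeated slashes already collapsed;
# NE stops mod_rewrite from escaping special characters in the result.
RewriteRule ^.*$ /digital-marketing/$0 [R=301,L,NE]
```

The rule itself is unchanged; only the condition is narrowed.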

You can write a wildcard rule to remove trailing slashes. The rule below matches any URL path that ends in a forward slash and redirects to the path without it. Because (.*) is greedy, each redirect strips one slash, so several trailing slashes are removed over successive redirects:
RewriteRule ^(.*)/+$ $1 [R=301,L]
For more on 301 redirects, see: Best Practice: 301 Redirect HTTP to HTTPS (Standard Domain)
Good luck!

I see nothing in the way the rule is written that would stop it from rewriting. However, you have multiple rules with the L flag, which might stop processing at an earlier point than you expect. From the documentation:
The [L] flag causes mod_rewrite to stop processing the rule set. In most contexts, this means that if the rule matches, no further rules will be processed.
(https://httpd.apache.org/docs/current/rewrite/flags.html).
You can test all your rules together at http://htaccess.mwl.be/. You might have to rewrite them a bit to work with that page: it is not aware of the directory level your .htaccess file is at, so you will have to rewrite all your rules to match from the root. For example:
RewriteRule ^digital\-marketing/content\-marketing\-strategy//$ /digital-marketing/content-marketing-strategy/? [L,R=301]

Related

Rewrite condition to particular URL and redirect

I have a redirect rule that redirects a url of pattern /abcd/* to http://myweb.com/abcd/*.
I want to apply this redirect except /abcd/index.html.
I have tried this.
RewriteCond %{THE_REQUEST} !^/abcd/index
RewriteCond %{REQUEST_URI} !^/abcd/index [NC]
RewriteRule ^/abcd/(.+)$ http://myweb.com/abcd/$1 [NC,R=301,L]
I am not sure if it is correct.
Please suggest the correct way of doing this.
Try this rule without leading slash in RewriteRule:
RewriteCond %{REQUEST_URI} !^/abcd/index [NC]
RewriteRule ^abcd/(.+)$ http://myweb.com/abcd/$1 [NC,R=301,L]
The difference is ^abcd/(.+)$ instead of ^/abcd/(.+)$.
Your two conditions were redundant, so I reduced them to one.
.htaccess is a per-directory context, and Apache strips the current directory path (and with it the leading slash) from the URI before matching it against the RewriteRule pattern.
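To make the prefix stripping concrete, here is a sketch of how the same redirect would be written in each context. The file locations are assumptions for illustration, and only one of these rules would be used at a time. The RewriteCond that excludes /abcd/index is left out for brevity; it would precede whichever rule is used and needs no change, because %{REQUEST_URI} always contains the full path with its leading slash.

```apache
# Server or virtual-host config: the pattern sees the full URL path,
# including the leading slash.
RewriteRule ^/abcd/(.+)$ http://myweb.com/abcd/$1 [NC,R=301,L]

# .htaccess in the document root: the leading "/" has been stripped.
RewriteRule ^abcd/(.+)$ http://myweb.com/abcd/$1 [NC,R=301,L]

# .htaccess inside /abcd/ (hypothetical placement): "abcd/" is stripped too.
RewriteRule ^(.+)$ http://myweb.com/abcd/$1 [NC,R=301,L]

# Portable form that matches with or without the leading slash.
RewriteRule ^/?abcd/(.+)$ http://myweb.com/abcd/$1 [NC,R=301,L]
```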

Remove all Subfolders Condition Rewrite .htaccess Apache 1.3.42

I have done a lot of research about removing subfolders; however, I cannot find a way to create an .htaccess rule to remove all subfolders in my root directory. Example below:
www.domain.com/dan/dan changes to www.domain.com/dan
www.domain.com/pam/pam changes to www.domain.com/pam
www.domain.com/jam/jam changes to www.domain.com/jam
The .htaccess rule should keep this pattern up through infinity without me having to add the names of the subfolders to my rule, kind of like a wildcard condition or catchall scenario.
However, there is one condition, only remove subfolder if the file has the same name as I have illustrated above in my example.
I’m on Apache 1.3.42 so will need a solution that is not for the newer versions please.
Check out my .htaccess file below; I’ve done a lot of SEO work on it, as you can see:
AddType application/x-httpd-php .html
RewriteEngine On
RewriteBase /
#non www to www
RewriteCond %{HTTP_HOST} !^www\.
RewriteRule ^(.*)$ http://www.%{HTTP_HOST}/$1 [R=301,L]
#removing trailing slash
RewriteCond %{REQUEST_FILENAME} !-d
RewriteRule ^(.*)/$ $1 [R=301,L]
#html
RewriteCond %{REQUEST_FILENAME} !-f
RewriteCond %{REQUEST_FILENAME} !-d
RewriteRule ^([^\.]+)$ $1.html [NC,L]
#index redirect
#directory remove index.html
RewriteCond %{THE_REQUEST} ^[A-Z]{3,9}\ /index\.html\ HTTP/
RewriteRule ^index\.html$ http://www.arkiq.com/ [R=301,L]
#directory remove index
RewriteCond %{THE_REQUEST} ^[A-Z]{3,9}\ /index\ HTTP/
RewriteRule ^index http://www.arkiq.com/ [R=301,L]
#sub-directory remove index.html
RewriteCond %{THE_REQUEST} /index\.html
RewriteRule ^(.*)/index\.html$ /$1 [R=301,L]
#sub-directory remove index
RewriteCond %{THE_REQUEST} /index
RewriteRule ^(.*)/index /$1 [R=301,L]
#remove .html
RewriteCond %{THE_REQUEST} \.html
RewriteRule ^(.*)\.html$ /$1 [R=301,L]
Let me know if you know how to forward all subfolders to their respectively named files with one rule as that would be superb.
I have no setup here to test this rule with a real installation of apache, but I am pretty sure you can achieve this by using a positive lookahead with a capture group.
RewriteRule ^(.*?)([^/]+)/(?=\2(/|$))([^/]+)/?$ /$1$4 [R,L]
What does this do?
^(.*?) matches everything before the last two path segments. If you go to example.com/test/test, it matches nothing at all.
([^/]+) matches the first segment we want to test and puts it in capture group 2.
(?=\2(/|$)) is the positive lookahead. A lookahead 'peeks' at the next characters but does not consume any. \2 is replaced with the content of the second capture group, and (/|$) matches either a slash or the end of the string.
The last ([^/]+) matches the second segment, and /? makes sure the URL is matched even if it ends with a /.
After applying this rule, this should happen:
example.com/test/test --> example.com/test
example.com/test/test2 --> no rewrite, because '2' does not match '/' or the end of the string
example.com/test/test/ --> example.com/test
example.com/sub/test/test --> example.com/sub/test
Debugging this rule
If you get an internal server error, please go to your apache error log and read what error it gives. Here is proof it works on a clean .htaccess on Apache 2.4.4 and, while it takes 1 minute to check an error log, it takes me several hours to read all patch notes for all Apache versions of the last 3 years.
External redirect, internal rewrite, preventing infinite loop
Assuming that above rule works on your version of mod_rewrite/apache/regex, the following construction will work to externally redirect your request, then internally rewrite it back. Please note that /test/test will not do anything sensible, unless you tell apache how to execute such a file. Proof of concept.
#The external redirect
RewriteCond %{THE_REQUEST} ^(GET|POST)\ /(.*?)([^/]+)/(?=\3(/|\ ))
RewriteRule ^(.*?)([^/]+)/(?=\2(/|$))([^/]+)/?$ /$1$4 [R,L]
#The internal rewrite
RewriteCond %{REQUEST_URI} !^/(.*?)([^/]+)/(?=\2(/|$))([^/]+)/?$
RewriteCond %{REQUEST_FILENAME} -d
RewriteCond %{REQUEST_FILENAME} !-f
RewriteRule ^(.*/|)([^/]+)/?$ /$1$2/$2 [L]
You mention DirectorySlash Off. Please note that on current versions of Apache this would only get applied to an actual external request. While doing internal rewrites you are safe. In both examples above, in Apache 2.4.4, even though I redirect to an url without a trailing slash, Apache will still append a slash in a second redirect. I am clueless how this was handled in 1.3.
If Apache 1.3 doesn't support backreferences or lookaround in its regex engine, which I still can't test, there is no real way to check via mod_rewrite whether a URL contains two identical segments. You'll either need to use a custom router page or write out every URL (which can cause performance issues, as there are likely a lot of them). Rewriting to a router page goes like this:
RewriteRule ^(.*)$ /myrouter.php?url=$1 [L]
This router page, written in a language of your choice, can also send the 301 or 302 header with a custom Location. It will also need to handle all other requests that are matched by the RewriteRule above.
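As a sketch of the router approach, the catch-all rule is often guarded so that the router script is not rewritten onto itself and existing files are still served directly. The script name myrouter.php comes from the rule above; the two conditions are my assumption about what is wanted, and I have not tested them on Apache 1.3.

```apache
# Leave requests for the router itself alone.
RewriteCond %{REQUEST_URI} !^/myrouter\.php
# Leave real files (images, stylesheets, ...) alone.
RewriteCond %{REQUEST_FILENAME} !-f
RewriteRule ^(.*)$ /myrouter.php?url=$1 [L]
```

Dropping the second condition sends requests for static files through the router as well, which matches the unguarded rule more closely but gives the router more to handle.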

How to avoid chain redirections using mod_rewrite?

Basically, I'm working on making my site SEO-friendly. I want to achieve the following:
Rewrite URLs to pretty ones
Remove multiple slashes (e.g. example.com/////something/// to example.com/something/)
Redirect the www version to the non-www version
Hide the index.php file from all URLs
Redirect from old URLs (/?id=something/) to new URLs (/something/)
I came up with this .htaccess code:
RewriteCond %{THE_REQUEST} //
RewriteRule .* $0 [R=301]
RewriteCond %{HTTP_HOST} ^www\.example\.com$ [NC]
RewriteCond %{QUERY_STRING} ^id=([a-z0-9\/-]+)
RewriteRule ^(.*)$ http://example.com/%1? [R=301]
RewriteRule ^index.php(.*)$ /$1 [R=301]
RewriteRule ^([a-z0-9\/-]+)$ /?id=$1 [L]
...and though it's working it has a side effect: chain redirects, eg. example.com/?id=something////// -> example.com/something////// -> example.com/something/
So is there a way to rewrite or modify this code so it'll be redirecting just once to the preferred version of the url?
Trying to interpret what you want, let's look at the rules in your question:
1. Can't understand the purpose of this:
RewriteCond %{THE_REQUEST} //
RewriteRule .* $0 [R=301]
2. This rule set in your question removes www and converts the query string ?id=val to /val, but only when the incoming URI has www AND there is a query string, as both conditions must be met:
RewriteCond %{HTTP_HOST} ^www\.example\.com$ [NC]
RewriteCond %{QUERY_STRING} ^id=([a-z0-9\/-]+)
RewriteRule ^(.*)$ http://example.com/%1? [R=301]
3. This rule
RewriteRule ^index.php(.*)$ /$1 [R=301]
hides index.php, but only when it is in the root directory. Example:
http://www.example.com/index.php?id=val
It does not work when index.php is in a subdirectory. Example:
http://www.example.com/folder/index.php?id=val
4. Can't understand the purpose of this:
RewriteRule ^([a-z0-9\/-]+)$ /?id=$1 [L]
I suggest this instead:
RewriteEngine On
RewriteBase /
#Redirects all www to non-www
RewriteCond %{HTTP_HOST} www\.example\.com$ [NC]
RewriteRule ^(.*)/?$ http://example.com/$1 [R=301,L]
#Hides "index.php" keeping the query if present
RewriteRule ^(.*)/index\.php$ $1/ [R=301,QSA,L]
#Converts query string `?id=val` to `/val`
RewriteCond %{QUERY_STRING} id=([^/]+)
RewriteRule .* /%1? [R=301,L]
Remember that spiders will "adapt" to the correct new structure after a few months, and the problem may ultimately be a whole lot less severe than it looks initially. You can leave all the .htaccess code in place, knowing it will always be there to correct any "old" references yet will in fact hardly ever be used.
I've never found an easy way to avoid multiple round trips back to the client when "fixing up" a URL to be in some sort of canonical form. mod_rewrite seems to be more focussed on the "local" redirect case where the client has no idea that the content it got back came out of a file structure that doesn't perfectly match that implied by the URL.
It is possible to save up all the URL modifications locally, then provoke only one round trip to the client that delivers all the URL corrections at once, by recording everything in newly created "environment" variables and then asking at the end, basically, "has anything changed?" However, doing so is verbose, awkward, and error-prone, and has never become a "recommended technique".
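Short of the environment-variable approach, one link of the chain can often be removed by letting a single rule fix several things at once. The sketch below handles the example from the question, an old ?id= URL with extra trailing slashes on either host, in one hop by matching the raw request line. It assumes single-segment ids made of letters, digits and hyphens, and the host example.com from the question; it is untested, and the other rules would still be needed for the remaining cases.

```apache
RewriteEngine On

# "GET /?id=something////// HTTP/1.1" (with or without index.php, any host)
# -> http://example.com/something/ in a single 301.
# THE_REQUEST is the original request line, so the question's internal
# rewrite to /?id=... cannot re-trigger this rule.
RewriteCond %{THE_REQUEST} ^[A-Z]+\s/+(?:index\.php)?\?id=([a-z0-9-]+)/*\s [NC]
# The trailing "?" drops the old query string from the target.
RewriteRule ^ http://example.com/%1/? [R=301,L]
```

Placed before the www and slash rules, this takes the worst-case URL from the question straight to its final form instead of passing through each fix in turn.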

Using .htaccess to redirect to www. AND add trailing forward slash

I have searched around for a while and had a go at tweaking this file myself and I'm almost there but there is one case which I can't figure out...
How to get both a www. AND a forward slash at the same time
If I type in spectrl.com, it redirects to www.spectrl.com CORRECT - Adds www.
If I type in www.spectrl.com/ebaycalculator it redirects to www.spectrl.com/ebaycalculator/ CORRECT - Adds /
But if I type in spectrl.com/ebaycalculator I get a 404 error when it should go to www.spectrl.com/ebaycalculator/
Here's my .htaccess file, kept at the root:
RewriteBase /
Options +FollowSymlinks
RewriteEngine on
RewriteCond %{REQUEST_FILENAME} !-f
RewriteCond %{REQUEST_URI} !(.*)/$
RewriteRule ^(.*)$ http://spectrl.com/$1/ [L,R=301]
RewriteCond %{HTTP_HOST} !^www\.
RewriteRule ^ http://www.%{HTTP_HOST}%{REQUEST_URI} [L,R=301]
Thanks
#Kavi
Try this:
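RewriteEngine On
# Missing trailing slash: add it, and force www. at the same time.
RewriteCond "%{HTTP_HOST}" "^(?:www\.)?(.*)" [NC]
RewriteCond "%{REQUEST_URI}" "!/$"
RewriteRule "(.*)" "http://www.%1/$1/" [R=301,L]
# Trailing slash already present: only add www. if it is missing.
RewriteCond "%{HTTP_HOST}" "!^www\." [NC]
RewriteRule "(.*)" "http://www.%{HTTP_HOST}/$1" [R=301,L]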
The first RewriteCond captures the hostname (without any leading www.) in the reference %1. That condition will always succeed.
The second RewriteCond checks for the trailing slash; if not found, the next RewriteRule will be triggered.
That first RewriteRule uses the captured www.-less host name to construct a redirect that includes www. and the trailing /.
The second stanza will be triggered if the request falls through because it does have a trailing /. It checks for a leading www., and does the same sort of redirect (only without appending a slash, since there's already one there) as the first stanza.
At least, that's how it should work; I haven't tested it. :-)
After removing and re-uploading .htaccess and then clearing the cache, everything seems to be working as intended using my original code in the question.
Hope this will be helpful for someone else.

Apache Redirect problem in .htaccess

I am having problems getting a simple redirect statement to take effect on my GoDaddy account. I have the following statements in my .htaccess file:
RewriteEngine On
RewriteCond %{HTTP_HOST} ^www.mydomain.net$ [NC]
RewriteRule ^(.*)$ http://mydomain.net/$1 [R=301,L]
RewriteCond %{HTTP_HOST} ^mydomain.net$ [NC]
RewriteRule ^/lists/$ / [R=301]
RewriteCond %{HTTP_HOST} ^mydomain.net$ [NC]
RewriteRule ^/blog/$ http://myotherdomain.net/ [R=301]
The 1st redirect ALWAYS works. The 2nd and 3rd ones, however, NEVER work. I just get a 404 from the server. The Apache logs do not reveal any useful information, just a 404.
Any ideas, anybody?
Your help will be greatly appreciated.
Thanks
Per-directory Rewrites
When using the rewrite engine in .htaccess files the per-directory prefix (which always is the same for a specific directory) is automatically removed for the pattern matching and automatically added after the substitution has been done.
– http://httpd.apache.org/docs/2.2/mod/mod_rewrite.html#rewriterule
So just leave the leading slash out of the pattern.
For simple redirects like that, better use the simple RedirectMatch directives:
RedirectMatch 301 ^/lists/$ http://mydomain.net/
RedirectMatch 301 ^/blog/$ http://myotherdomain.net/
If you insist on using rewriting make sure you add the L flag to your rules.
Apache mod_rewrite Flags says :
You will almost always want to use [R] in conjunction with [L] (that is, use [R,L]) because on its own, the [R] flag prepends http://thishost[:thisport] to the URI, but then passes this on to the next rule in the ruleset, which can often result in 'Invalid URI in request' warnings.
Simply remove the slashes at the beginning. It might also be useful to make the slashes at the end optional.
RewriteCond %{HTTP_HOST} ^mydomain.net$ [NC]
RewriteRule ^lists/{0,1}$ / [R=301]
RewriteCond %{HTTP_HOST} ^mydomain.net$ [NC]
RewriteRule ^blog/{0,1}$ http://myotherdomain.net/ [R=301]
Put the first one last. Once it encounters a redirect match, it runs it and ignores the rest.
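Putting the answers together, a sketch of the whole file might look like this. It assumes the .htaccess sits in the document root and that only mydomain.net and www.mydomain.net are served from it, which is why the per-rule host conditions are dropped; adjust if that is not the case.

```apache
RewriteEngine On

# Specific redirects first; no leading slash in per-directory patterns,
# and [L] so processing stops once a redirect has been chosen.
RewriteRule ^lists/?$ http://mydomain.net/ [R=301,L]
RewriteRule ^blog/?$ http://myotherdomain.net/ [R=301,L]

# Catch-all www -> bare domain last (dots escaped so they match literally).
RewriteCond %{HTTP_HOST} ^www\.mydomain\.net$ [NC]
RewriteRule ^(.*)$ http://mydomain.net/$1 [R=301,L]
```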