htaccess redirect outcome not as expected - apache

I hope someone can help me with the following problem.
I have a multilingual site with the language as a folder, like
example.com/se/post
I want the languages separated by domain instead, like example.se.
So far no problem, using a DNS alias and the WPML plugin.
The problem I have is that I want to redirect example.com/se/post to example.se/post. I tried the rule below in the .htaccess file, but it changes the URL to example.se/se, keeping the /se prefix that I do not need. I'm not very familiar with the rewrite engine in .htaccess files.
<IfModule mod_headers.c>
RewriteEngine on
RewriteCond %{HTTP_HOST} ^(www\.)?nofairytales\.nl$ [NC]
RewriteCond %{REQUEST_URI} ^/sv(/.*)?$ [NC]
RewriteRule ^(.*)$ http://www.nofairytales.se%{REQUEST_URI} [L,R=301]
</IfModule>

RewriteCond %{HTTP_HOST} ^(www\.)?nofairytales\.nl$ [NC]
RewriteCond %{REQUEST_URI} ^/sv(/.*)?$ [NC]
RewriteRule ^(.*)$ http://www.example.se%{REQUEST_URI} [L,R=301]
This is close... you are capturing the URL-path (/post part) in the preceding condition but not using it in the substitution string. Instead, you are using REQUEST_URI which contains the full root-relative URL-path.
You are also matching sv in the URL-path, but redirecting to se in the TLD. The following should resolve the issue (with minimal changes):
RewriteCond %{REQUEST_URI} ^/se(/.*)?$ [NC]
RewriteRule ^(.*)$ http://www.example.se%1 [L,R=301]
Where %1 is a backreference to the captured subpattern in the preceding condition (the /post part).
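For context, the minimal fix might sit in a complete rule set like the sketch below. The hostname example.com in the first condition is an assumption taken from the question's prose (the quoted rule uses a different domain), and %1 refers to the capture in the last matched condition, not the hostname condition.

```apache
RewriteEngine On

# Only act on requests to the original (assumed) hostname
RewriteCond %{HTTP_HOST} ^(www\.)?example\.com$ [NC]

# Capture everything after the /se language prefix as %1 (e.g. "/post")
RewriteCond %{REQUEST_URI} ^/se(/.*)?$ [NC]

# Redirect to the .se domain, appending only the captured remainder
RewriteRule ^(.*)$ http://www.example.se%1 [L,R=301]
```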
However, you don't need the second (or even the first) condition, as it can all be done in the RewriteRule directive. There wouldn't seem to be a need to check the requested hostname either: if the language subdirectory is in the URL-path then you should presumably redirect anyway to remove it.
For example, the following should be sufficient to redirect a single language code:
# Language "se"
RewriteRule ^se(?:/(.*))?$ https://www.example.se/$1 [R=301,L]
The non-capturing group keeps the slash delimiter out of the captured path, and the slash is written explicitly in the substitution, so the target URL always has a slash after the hostname. The first rule above (using %1) omits that slash when the request is for /se alone and relies on the user-agent to "correct" the redirect response (which it does).
For multiple languages you can modify the same rule with regex alternation. For example:
# All supported languages
RewriteRule ^(se|uk|us|au|id)(?:/(.*))?$ https://www.example.$1/$2 [R=301,L]
This assumes all language codes map to a TLD using the same code. If not then you can implement a "mapping" (lang code -> TLD) in the rule itself or use a RewriteMap if you have access to the server config. This could also provide a "default" TLD.
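As a rough sketch of the RewriteMap approach, assuming you can edit the server or virtual-host config: the map name lang2tld, the file path and the TLD values below are all hypothetical and only illustrate the shape of the lookup.

```apache
# --- Server or virtual-host config (RewriteMap cannot be declared in .htaccess) ---
# Hypothetical map name and file path
RewriteMap lang2tld "txt:/etc/apache2/lang2tld.txt"

# --- Contents of the hypothetical lang2tld.txt (one "key value" pair per line) ---
#   se  se
#   uk  co.uk
#   au  com.au

# --- .htaccess (or the same virtual host) ---
RewriteEngine On

# Look up the TLD for the matched language code; fall back to "com" if the
# code has no entry in the map (here: us and id)
RewriteRule ^(se|uk|us|au|id)(?:/(.*))?$ https://www.example.${lang2tld:$1|com}/$2 [R=301,L]
```

Keeping the explicit alternation in the pattern means only known language codes are ever redirected, while the map decides where each one goes.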
You could be more generic and allow any two-character language code in the regex. eg. ^([a-z]{2})(?:/(.*))?$. And simply pass this through to the TLD. However, a request for an unknown language (eg. /xx/post) - which might have resulted from an error on your site - will now result in either a malformed redirect (since the domain won't resolve) or worse, a redirect to a competitor lying in wait. And this might go undetected unless you run an analysis of your redirects. So, being more restrictive with the regex/rule may be advisable.

Related

I changed domains and post slug structure at the same time for my WP site. Can I use 1 redirect to do so with htaccess?

I am planning a domain change from example1.com to example2.com. To add a twist to it, I also want to change my permalinks at the same time. My current permalinks for posts have the date and I want to remove it.
I'm a bit hesitant to test and lose SEO so I was hoping someone could confirm this would work before.
Here is what I was thinking:
after changing domains I use this code in my htaccess
RewriteEngine on
RewriteCond %{HTTP_HOST} ^example1.com [NC,OR]
RewriteCond %{HTTP_HOST} ^www.example1.com [NC]
RewriteRule ^\d{4}/\d{2}/(.*) https://example2.com/$1 [R=301,L]
then I found this rule to change dates:
RewriteRule ^[0-9]{4}/[0-9]{2}/(.*)$ https://example2.com/$1
I saw this one as well:
RewriteRule ^/(\d*)/(\d*)/([A-Za-z0-9-]*)$ https://example2.com/$4
I'm not sure what these rules specifically mean but I THINK I should be able to combine them like this?
RewriteEngine on
RewriteCond %{HTTP_HOST} ^example1.com [NC,OR]
RewriteCond %{HTTP_HOST} ^www.example1.com [NC]
RewriteRule ^[0-9]{4}/[0-9]{2}/(.*)$ http://example2.com/$1 [L,R=301,NC]
It doesn't seem quite right.
Or would simply changing the permalink structure in WordPress affect the change so that
https://www.example1.com/2019/01/how-to-write-about-cars/
redirects to
https://www.example2.com/how-to-write-about-cars/
UPDATE
Using MrWhite's answer below. I added this code:
RewriteEngine on
RewriteCond %{HTTP_HOST} ^example1.com [NC,OR]
RewriteCond %{HTTP_HOST} ^www.example1.com [NC]
RewriteRule ^/(\d*)/(\d*)/([A-Za-z0-9-]*)$ https://example2.com/$4
This is working now in the case of
https://www.example1.com/2019/01/how-to-write-about-cars/
which redirects to
https://www.example2.com/how-to-write-about-cars/
However
https://www.example2.com/2019/01/how-to-write-about-cars/
does NOT redirect to
https://www.example2.com/how-to-write-about-cars/
It just returns a 404. This likely isn’t an issue as nothing should be bookmarked but just in case, is there a way to fix that?
Or would simply changing the permalink structure in WordPress affect the change
I don't believe this would implement the redirect from the old to new URL structure, if that is what you are thinking. (At least not by default.)
RewriteCond %{HTTP_HOST} ^example1.com [NC,OR]
RewriteCond %{HTTP_HOST} ^www.example1.com [NC]
RewriteRule ^[0-9]{4}/[0-9]{2}/(.*)$ http://example2.com/$1 [L,R=301,NC]
This looks OK. Although if the new URLs at example2.com don't contain the date (ie. /YYYY/MM/ prefix) then there wouldn't seem to be any need to check the requested hostname.
This rule must also go at the top of the .htaccess file, before any of the existing WordPress directives (ie. before the # BEGIN WordPress comment marker).
You should first test with a 302 (temporary) redirect to avoid potential caching issues.
Final Solution
This can, however, be tidied a bit. The following one-liner should be sufficient:
RewriteRule ^\d{4}/\d{2}/(.*) https://example2.com/$1 [R=301,L]
You do not need any of the RewriteCond directives. (Just the RewriteEngine On directive, if it doesn't already appear elsewhere in the .htaccess file.)
Note the https on the target URL. \d (shorthand character class) is the same as [0-9]. The trailing $ on the regex is not required since regex is greedy by default. The NC flag is not required either, since there is nothing case specific in this regex.
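Putting the placement and testing advice together, the top of the .htaccess file might look like the following sketch while you are still testing. The WordPress block is shown only as a placeholder comment; its actual contents stay as they are.

```apache
RewriteEngine On

# Strip the /YYYY/MM/ prefix and send the request to the new domain.
# 302 (temporary) while testing; change to R=301 once it behaves as expected.
RewriteRule ^\d{4}/\d{2}/(.*) https://example2.com/$1 [R=302,L]

# BEGIN WordPress
# (existing WordPress directives remain here, unchanged)
# END WordPress
```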
Aside: (Don't use this!)
I saw this one as well:
RewriteRule ^/(\d*)/(\d*)/([A-Za-z0-9-]*)$ https://example2.com/$4
This rule, however, is very wrong! Due to the slash prefix on the RewriteRule pattern it will never match in .htaccess, so the rule does nothing. And even if it did match, there are only 3 capturing groups in the regex, so the $4 backreference would always be empty (everything would be redirected to the homepage, which would likely be treated as a soft-404 by search engines).

No trailing slash causes 404, how to fix using htaccess?

The URLs are:
(Doesn't work) http://example.com/seller/samsung
(Works) http://example.com/seller/samsung/
The .htaccess rule I have for these types of URLs looks like:
RewriteRule ^seller/[^/]+/(.*)$ ./$1
What can I do to make both of these URLs to go to the same page?
You could just force a trailing slash to appear on the end of your URLs. You can do that by using the following in your .htaccess:
RewriteCond %{REQUEST_URI} !(/$|\.)
RewriteRule (.*) %{REQUEST_URI}/ [R=301,L]
Just make sure you clear your cache before you test this.
EDIT:
RewriteCond %{REQUEST_URI} /+[^\.]+$
RewriteRule ^(.+[^/])$ %{REQUEST_URI}/ [R=301,L]
What does the above do? The condition checks that the end of the requested path contains no dot, so a directory-style URL such as /seller/samsung passes while a file such as /style.css does not. The RewriteRule pattern then matches only if the path does not already end in a /. When both hold, the rule appends a / to the requested URL using a 301 redirect, which leaves you with /seller/samsung/.
As for the L flag (taken from official documentation):
The [L] flag causes mod_rewrite to stop processing the rule set. In most contexts, this means that if the rule matches, no further rules will be processed. This corresponds to the last command in Perl, or the break command in C. Use this flag to indicate that the current rule should be applied immediately without considering further rules.
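A sketch of how the redirect might be combined with the rule from the question, on the assumption that the external redirect should run before the internal rewrite. The interaction with the rest of your .htaccess is untested here, so try it with R=302 first.

```apache
RewriteEngine On

# 1. Append a trailing slash to paths with no dot at the end of the path
#    (302 while testing; switch to 301 once confirmed)
RewriteCond %{REQUEST_URI} /+[^\.]+$
RewriteRule ^(.+[^/])$ %{REQUEST_URI}/ [R=302,L]

# 2. Existing rule from the question, unchanged; it now also receives
#    /seller/samsung/ after the redirect above has added the slash
RewriteRule ^seller/[^/]+/(.*)$ ./$1
```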

htaccess - Simple rewrite rule not triggering

I'm trying to match a simple rule to rewrite a url but it's just not matching. I want to redirect
https://example.com/web/thanks/
to
https://example.com/thanks.php
Here's what I've tried
RewriteEngine On
RewriteBase /
RewriteRule ^thanks/$ https://example.com/thanks.php [R=302,L]
RewriteEngine On
RewriteRule ^thanks/$ https://example.com/thanks.php [R=302,L]
RewriteEngine On
RewriteBase /
RewriteRule ^/(.*)/thanks/$ https://example.com/thanks.php [R=302,L]
RewriteEngine On
RewriteBase /
RewriteRule ^web/thanks/$ https://example.com/thanks.php [R=302,L]
and many more tiny variations but none of them are triggering. I tried using this online tool and it returns "This rule was not met". What am I doing wrong?
To rewrite, just use your last rule
RewriteRule ^web/thanks/?$ /thanks.php [L]
with the following changes
no RewriteBase, this is only relevant for some relative URLs
optional trailing slash /?, if you want both /web/thanks or /web/thanks/ to work
no domain name, because this might trigger a redirect instead of a rewrite, see RewriteRule
Absolute URL
If an absolute URL is specified, mod_rewrite checks to see whether the hostname matches the current host. If it does, the scheme and hostname are stripped out and the resulting path is treated as a URL-path. Otherwise, an external redirect is performed for the given URL. To force an external redirect back to the current host, see the [R] flag below.
no R|redirect flag, because this triggers a redirect instead of a rewrite
The pattern ^.*thanks/$ or ^(.*)thanks/$ also works, but it matches any URL ending in thanks/, like /hellothanks/, /areyousurethanks/, /some/path/thanks/, ...
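To make the rewrite/redirect distinction concrete, here is a minimal sketch of both variants, assuming the .htaccess file lives in the document root. Only one of the two rules should be active at a time.

```apache
RewriteEngine On

# Variant A: internal rewrite. The browser keeps showing /web/thanks/
# while the server answers with /thanks.php.
RewriteRule ^web/thanks/?$ /thanks.php [L]

# Variant B: external redirect. The browser's address bar changes to
# /thanks.php. Uncomment this and remove variant A to use it.
# RewriteRule ^web/thanks/?$ /thanks.php [R=302,L]
```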

What's going on with my mod_rewrite?

I have a simple mod_rewrite system set up on my site which basically converts
http://site.com/file -> http://site.com/file.php
Here's the .htaccess file
Options -MultiViews
RewriteEngine On
RewriteCond %{HTTP_HOST} ^www.site.com
RewriteRule ^(.*)$ http://site.com/$1 [R=301]
RewriteCond %{REQUEST_FILENAME} !-f
RewriteCond %{REQUEST_FILENAME} !-d
RewriteRule ^([a-z]+)/?$ http://site.com/$1.php [L]
This was working for a long time and then a couple of days ago I realized that while the RewriteRule was working, it was actually changing my URL in the status bar.
For instance, it would redirect /photos to /photos.php, but it would also change the URL to show the .php. This has never happened before and I'm not sure what happened to trigger the change.
Any ideas?
The first rewrite rule needs the [L] flag. From the mod_rewrite documentation for the [R] flag:
You will almost always want to use [R] in conjunction with [L] (that is, use [R,L]) because on its own, the [R] flag prepends http://thishost[:thisport] to the URI, but then passes this on to the next rule in the ruleset, which can often result in 'Invalid URI in request' warnings.
In this case, you don't get a warning, but appending the ".php" extension happens before issuing the redirect rather than when the second, redirected request comes in.
Also, remove the scheme and domain name from the substitution in the second rewrite rule. A full URL can cause an implicit redirect. From the documentation for RewriteRule:
The Substitution of a rewrite rule is the string that replaces the original URL-path that was matched by Pattern. The Substitution may be a:
[...]
Absolute URL
If an absolute URL is specified, mod_rewrite checks to see whether the hostname matches the current host. If it does, the scheme and hostname are stripped out and the resulting path is treated as a URL-path. Otherwise, an external redirect is performed for the given URL. To force an external redirect back to the current host, see the [R] flag below.
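Applying both suggestions to the .htaccess from the question gives something like the sketch below. The escaped dots, the $ anchor and the NC flag on the hostname condition are small additions that were not in the original file.

```apache
Options -MultiViews
RewriteEngine On

# Canonicalise www -> non-www. [L] stops processing here, so the redirect
# is sent before the .php rule below can alter the URL.
RewriteCond %{HTTP_HOST} ^www\.site\.com$ [NC]
RewriteRule ^(.*)$ http://site.com/$1 [R=301,L]

# Internally map /file to /file.php. No scheme or hostname in the
# substitution, so this stays an internal rewrite.
RewriteCond %{REQUEST_FILENAME} !-f
RewriteCond %{REQUEST_FILENAME} !-d
RewriteRule ^([a-z]+)/?$ /$1.php [L]
```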

How do I redirect a specific URL pattern when Drupal Clean URLs are on?

I have a Drupal 5.23 installation using clean URLs with Apache and the mod_rewrite module. I am using an .htaccess file for the clean URLs functionality with the following configuration:
<IfModule mod_rewrite.c>
RewriteEngine on
RewriteCond %{REQUEST_FILENAME} !-f
RewriteCond %{REQUEST_FILENAME} !-d
RewriteCond %{REQUEST_URI} !=/favicon.ico
RewriteRule ^(.*)$ index.php?q=$1 [L,QSA]
</IfModule>
I am going to be disabling the Localization/Internationalization plugins on the website, which is going to change every single page's URL on the website from http://www.example.com/en/url-to-a-page to http://www.example.com/url-to-a-page (the /en portion is being stripped out).
I would like to add a mod_rewrite rule to give an HTTP 301 Redirect response for any incoming URLs with the /en portion in the URL so they are directed to the correct page.
I've tried adding the following lines to my .htaccess file both above and below the existing rules, but in both cases visiting a page with /en results in an HTTP 404 Not Found response:
RewriteRule ^en/(.+)$ http://www.example.com/$1 [R=301]
If I comment out the existing rules, my rule works just fine. I've also tried to add a condition to the rule, but this doesn't appear to have an effect either:
RewriteCond %{REQUEST_URI} =/en/*
This came up for me when writing all of my custom redirects, and it turns out the solution was to add an "L" to the redirect line. Give the following a try:
RewriteRule ^en/(.+)$ http://www.example.com/$1 [L,R=301]
Note the "L" near the end of the line. That, according to the Apache RewriteRule docs, means "Stop the rewriting process here and don't apply any more rewrite rules".
In addition to what sillygwailo suggests, I'd recommend making sure that your RewriteCond (needed, I think) actually matches.
From the Apache docs:
'=CondPattern' (lexicographically equal)
Treats the CondPattern as a plain string and compares it lexicographically to TestString. True if TestString is lexicographically equal to CondPattern (the two strings are exactly equal, character for character). If CondPattern is "" (two quotation marks) this compares TestString to the empty string.
So your condition could possibly match only a URL that is exactly /en/* with a literal '*'. I'm not sure, but you could also try this:
RewriteCond %{REQUEST_URI} ^/en/.*
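Combining the answers above, the full block might look like this sketch, with the redirect placed before the clean-URL rule. The RewriteCond is left out on the assumption that the ^en/(.+)$ pattern already restricts the rule to /en/ URLs; add it back above the redirect if you find you need it.

```apache
<IfModule mod_rewrite.c>
  RewriteEngine on

  # Redirect /en/... to the same path without the language prefix.
  # [L] stops processing so the clean-URL rule below never sees this request.
  RewriteRule ^en/(.+)$ http://www.example.com/$1 [L,R=301]

  # Existing Drupal clean-URL rules, unchanged
  RewriteCond %{REQUEST_FILENAME} !-f
  RewriteCond %{REQUEST_FILENAME} !-d
  RewriteCond %{REQUEST_URI} !=/favicon.ico
  RewriteRule ^(.*)$ index.php?q=$1 [L,QSA]
</IfModule>
```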