.htaccess redirect for specific URL structures - apache

I have the following URL:
https://www.site-a.xyz/tutorials/post-name/2
I need it to redirect to the following URL
https://www.site-b.xyz/post-name/2
Essentially, if there is a trailing number element in the URL (in this case /2), I need the /tutorials/ part of the URL to be removed.
Note: ONLY if there is a trailing number

Try the following (using mod_rewrite) near the top of your .htaccess file at www.site-a.xyz:
RewriteEngine On
RewriteRule ^tutorials/([^/]+/\d+)$ https://www.site-b.xyz/$1 [R=302,L]
In this case, the trailing "number" can be 1 or more digits. If it is only ever a single digit (as in your example) then this can be simplified (change \d+ to \d). The $1 is a backreference to the captured group in the RewriteRule pattern.
Note that this is a 302 (temporary) redirect. If this is intended to be permanent, change it to 301 once you are sure it's working OK. 301s are cached by the browser, which can make testing problematic.
UPDATE: To allow for an optional trailing slash on the source URL then add /? near the end of the RewriteRule pattern, like so:
RewriteRule ^tutorials/([^/]+/\d+)/?$ https://www.site-b.xyz/$1 [R=302,L]
This notably strips that optional trailing slash from the redirect target (thus avoiding any duplicate content issues).
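For illustration, the two adjustments described above (a single trailing digit and an optional trailing slash) could be combined as follows. This is only a sketch built from the rules already shown. The example request paths in the comments are hypothetical.

```apache
RewriteEngine On

# \d (not \d+) accepts exactly one trailing digit; /? tolerates a trailing slash.
#   /tutorials/post-name/2    -> redirected to https://www.site-b.xyz/post-name/2
#   /tutorials/post-name/2/   -> redirected to https://www.site-b.xyz/post-name/2
#   /tutorials/post-name/12   -> not matched (two digits)
#   /tutorials/post-name      -> not matched (no trailing number)
RewriteRule ^tutorials/([^/]+/\d)/?$ https://www.site-b.xyz/$1 [R=302,L]
```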

Related

remove //xx after directory in URL

I need to redirect all URLs like this:
example.com/podcasts//rebt
to
example.com/podcasts
I am trying to adjust this code to do both but I can't get it to work:
RewriteCond %{REQUEST_METHOD} !=POST
RewriteCond %{REQUEST_URI} ^(.*?)(/{2,})(.*)$
RewriteRule . %1/%3 [R=301,L]
To remove //<something> at the end of the URL-path (eg. /podcasts//rebt to /podcasts), try the following instead at the top of the root .htaccess file:
RewriteEngine On
RewriteCond %{THE_REQUEST} \s([^?]+?)//
RewriteRule . %1 [R=301,L]
The THE_REQUEST server variable contains the first line of the initial request headers (eg. GET /podcasts//rebt HTTP/1.1) and does not change when the request is internally rewritten (unlike REQUEST_URI).
The regex \s([^?]+?)// captures the part of the URL-path before the first instance of a double slash. Everything from the double slash onwards is discarded. Because [^?] cannot match a question mark, the regex never matches a double slash that appears only in the query string (if any).
The %1 backreference contains the captured subpattern (ie. everything before the first double slash in the URL-path) from the preceding CondPattern.
Aside: Note that this will not work properly if the preceding URL-path maps to a physical directory, since it will result in two redirects. eg. /directory//something to /directory to /directory/ (by mod_dir). In this case, you should avoid removing the first trailing slash.
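If the part before the double slash does map to a physical directory, one way to keep the first slash is to move it inside the capturing group. The following is a sketch under that assumption, not a tested drop-in. The /directory path in the comments is hypothetical.

```apache
RewriteEngine On

# Capture everything up to AND including the first slash of the first "//".
#   GET /directory//something HTTP/1.1  ->  %1 is "/directory/"
# Redirecting straight to /directory/ avoids the second redirect from mod_dir.
RewriteCond %{THE_REQUEST} \s([^?]+?/)/
RewriteRule . %1 [R=302,L]
```

This uses a 302 while testing, in line with the caching advice that follows.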
You should test first with a 302 (temporary) redirect to avoid any potential caching issues and only change to a 301 (permanent) redirect when you are sure it's working as intended. You should clear your browser cache before testing.
A look at your existing rule...
RewriteCond %{REQUEST_METHOD} !=POST
RewriteCond %{REQUEST_URI} ^(.*?)(/{2,})(.*)$
RewriteRule . %1/%3 [R=301,L]
This code is intended to reduce multiple slashes to single slashes in the URL-path, not remove the double slash and remaining path entirely. eg. /podcasts//rebt to /podcasts/rebt. However, since it checks against the REQUEST_URI server variable (which can change throughout the request) it may not work as intended.
Also, the condition that checks against the REQUEST_METHOD would seem to be redundant, unless you are erroneously POSTing to double-slashed URLs internally? A 301 redirect removes any POST data (since the browser converts it to GET) - hence why the check may be necessary in certain cases.

Redirect an url keeping its queries and adding a fixed one

I am trying to redirect from an url to another url but keeping the queries.
For example, from
/oldurl?query1=yes&query2=yes&... (or any list of queries)
to
/newurl?fixedquery=yes&query1=yes&query2=yes&...
So in practice it would redirect the old url and its queries to a new url, keeping the old queries, plus a fixed query.
This is what I have been trying to use (unsuccessfully) in the .htaccess:
RedirectMatch 301 /oldurl/?$ newurl/?fixedquery=yes&$1
I also tried before using Rewrite
RewriteCond %{QUERY_STRING} ^fixedquery=yes$
RewriteRule ^oldurl/?$ newurl/? [R=301,L]
But this simply redirects for /oldurl (adding fixedquery) and gives a 404 if I pass a query to oldurl (e.g. /oldurl?var1=1).
Where am I wrong?
You need to use mod_rewrite in order to manipulate the query string. Try the following instead:
RewriteRule ^oldurl/?$ /newurl?fixedquery=yes [QSA,R=301,L]
Your example URL omits the trailing slash on the target URL, so I omitted it here. However, in your directives, you are including it?
The QSA flag appends/merges the original query string from the request. So the resulting URL is /newurl?fixedquery=yes&query1=yes&query2=yes&..., where query1=yes&query2=yes&... were passed on the initial request.
UPDATE: Note that this rule needs to go near the top of the file, before any existing rewrites. The order of directives in the .htaccess file can be important. Test first with 302 (temporary) redirects to avoid potential caching issues. And you will need to ensure the browser cache is cleared before testing.
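If the target URL is in fact meant to keep the trailing slash used in your own directives, a variant might look like the following. This is a sketch only. The trailing slash on /newurl/ is an assumption, since your example URL omits it.

```apache
RewriteEngine On

# Same rule as above, but the target path ends in a slash (assumed).
# With QSA, /oldurl?query1=yes&query2=yes is redirected to
#   /newurl/?fixedquery=yes&query1=yes&query2=yes
# 302 while testing; switch to 301 once confirmed.
RewriteRule ^oldurl/?$ /newurl/?fixedquery=yes [QSA,R=302,L]
```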
A look at your attempts...
RedirectMatch 301 /oldurl/?$ newurl/?fixedquery=yes&$1
The RedirectMatch (mod_alias) directive matches against the URL-path only, so it cannot see the query string. In addition, there is no capturing subgroup in the regex, so the $1 backreference is always empty.
RewriteCond %{QUERY_STRING} ^fixedquery=yes$
RewriteRule ^oldurl/?$ newurl/? [R=301,L]
This matches a request for /oldurl?fixedquery=yes and redirects to newurl/ - removing the query string entirely. However, this is also reliant on RewriteBase being set, otherwise this will result in a malformed redirect, exposing your directory structure.

htaccess rewrite condition that uses regex

I'm a noob when it comes to regex. What I'm trying to accomplish is:
https://www.example.com/shop/product-floating-front-rotor-kit/
should redirect to
https://www.example.com/shop/product-matching-front-rotor/
product should be the name of the product, I have to do this for multiple products.
Edit: This is what I have so far, am I even close?
RewriteEngine On
RewriteRule ^/shop/([a-z]+)-floating-front-rotor-kit/ ^/shop/$1-matching-front-rotor/
RewriteRule ^/shop/([a-z]+)-floating-front-rotor-kit/ ^/shop/$1-matching-front-rotor/
This is close, except that...
In .htaccess the URL-path matched by the RewriteRule pattern (first argument) does not start with a slash.
The substitution string has an erroneous ^ prefix. This should be an "ordinary" string, not a regex.
[a-z] does not match hyphens/dashes, which you state could occur in a product name.
You have not included an end-of-string anchor ($) on the end of the RewriteRule pattern, so the rule also matches URLs that have any trailing string after the slash, and that trailing string is discarded. (Is that the intention?)
This is an internal "rewrite", not a "redirect" as stated. You need to include the R flag. (An internal rewrite is unlikely to work here, since the target URL requires further rewriting.)
Try the following instead. This should go at the top of the .htaccess file, immediately after the RewriteEngine directive.
RewriteRule ^shop/([a-z-]+)-floating-front-rotor-kit/$ /shop/$1-matching-front-rotor/ [R=302,L]
This is a 302 (temporary) redirect.
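If product names can also contain digits, the character class would need widening. The question only shows lowercase letters, so this is an assumption, and the product name in the comment is hypothetical.

```apache
RewriteEngine On

# [a-z0-9-]+ additionally allows digits in the product name (assumed requirement).
#   /shop/brand-x2-floating-front-rotor-kit/ -> /shop/brand-x2-matching-front-rotor/
RewriteRule ^shop/([a-z0-9-]+)-floating-front-rotor-kit/$ /shop/$1-matching-front-rotor/ [R=302,L]
```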

htaccess RewriteRule : Problem to omit all after 1st argument

Goal: Want to rewrite all URLs of type
https://www.example.com/page/1234/?/blog/foo/bar/
to
https://www.example.com/page/1234/
In .htaccess I tried many variations along the line
RewriteEngine On
RewriteBase /
RewriteRule ^page/(\d+)/(.*)$ /page/$1 [R=301,L]
Using an .htaccess tester I see that at least the matching pattern is valid.
I would expect that the rewrite would not include anything after $1, but it does, and show the complete original URL.
What am I missing?
https://www.mypage.com/page/1234/?/blog/foo/bar/
Everything after the first ? is the query string part of the URL. By default, Apache passes the query string unaltered from the request to the target URL (unless you create a new query string yourself on the RewriteRule substitution). This explains why you are seeing the same query string on the target URL, without seemingly doing anything with it.
Incidentally, the RewriteRule pattern matches against the URL-path only, which notably excludes the query string. To match the query string in mod_rewrite you need an additional condition that checks the QUERY_STRING server variable.
On Apache 2.4+ you can use the QSD (Query String Discard) flag to remove the query string from the target URL. Or, specify an empty query string on the substitution - by including a trailing ? (the ? itself does not appear on the resulting URL).
For example (on Apache 2.4):
RewriteCond %{QUERY_STRING} .
RewriteRule ^page/(\d+)/ /page/$1/ [QSD,R=301,L]
The RewriteCond directive checks for the presence of a query string, which is necessary to prevent a redirect loop.
The trailing (.*)$ on the RewriteRule pattern was superfluous.
You had omitted the trailing slash on the end of the substitution (that is present on the example URL). This would have also prevented a redirect loop, but as mentioned, this is not as per your example. (Alternatively, you could include the slash in the captured backreference.)
If you are still on Apache 2.2 then you would need to include a trailing ? instead of the QSD flag. For example:
RewriteRule ^page/(\d+)/ /page/$1/? [R=301,L]
You will need to clear your browser cache before testing, as 301 (permanent) redirects are cached persistently by the browser. For this reason, it is often easier to first test with 302 (temporary) redirects.
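As mentioned above, the query string itself can only be tested with a condition. If you wanted to strip only query strings of the form shown in your example, rather than any query string, a sketch might look like this. The ^/blog/ prefix is taken from your example URL, and treating it as the only case to handle is an assumption.

```apache
RewriteEngine On

# Only act when the query string starts with /blog/ (as in ?/blog/foo/bar/).
RewriteCond %{QUERY_STRING} ^/blog/
# QSD (Apache 2.4+) drops the query string from the redirect target.
RewriteRule ^page/(\d+)/ /page/$1/ [QSD,R=302,L]
```

Requests with other query strings, or none, pass through untouched, so the redirected URL cannot match again and loop.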

.htaccess rewrite with slash

I'm trying to URL rewrite using .htaccess
from
example.com/daily.php to example.com/daily (and example.com/daily/)
with the following code:
Options +FollowSymLinks
RewriteEngine on
RewriteRule daily/$ daily.php
however:
example.com/daily/ = ok
example.com/daily = not ok
RewriteRule daily/$ daily.php
In the above RewriteRule directive, daily/$ is a regular expression (regex) that matches against the URL-path in the request. This regex contains a trailing slash (/), so this will clearly not match a URL that does not end in a slash.
If you want to match both /daily/ and /daily (although I would not recommend this - see note below) then you need to make the trailing slash optional in the regex. You make this character optional by following it with ? (question mark). For example:
RewriteRule ^daily/?$ daily.php [L]
I've also included a start-of-string anchor ^, so it only matches /daily and not /<anything>daily. You will probably want the L (last) flag, if you plan on adding any more directives.
Aside: If you allow both /daily/ and /daily, which are technically two different URLs, then you potentially have "duplicate content". You should choose one or the other as the canonical URL, and optionally redirect the non-canonical version to the canonical one.
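As a rough sketch of that canonicalisation, assuming you choose /daily (no trailing slash) as the canonical URL and that there is no physical directory named daily, you could redirect the slashed version and rewrite only the canonical one:

```apache
Options +FollowSymLinks
RewriteEngine On

# Externally redirect the non-canonical /daily/ to /daily
# (302 while testing; change to 301 once confirmed).
RewriteRule ^daily/$ /daily [R=302,L]

# Internally rewrite the canonical /daily to daily.php
RewriteRule ^daily$ daily.php [L]
```

The choice of the slash-less form is arbitrary here. The same approach works in reverse if you prefer /daily/ as the canonical URL.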