Redirect URLs with a trailing slash to URLs with no trailing slash via htaccess rule with [QSA,L] - apache

Recently I broke ties with WordPress and migrated all of my site's content to my own custom-made CMS. All works great except for one thing. All previous links to my site's blog posts have a trailing slash. Since none of my current URLs have a trailing slash, the previous links no longer work and my SEO is nearly non-existent. I've been attempting to find an htaccess rule that will redirect all trailing slash URLs to URLs with no trailing slash, but as of now, nothing works.

Use this redirect rule as your very first rule to remove trailing slash:
RewriteEngine On
RewriteCond %{REQUEST_FILENAME} !-d
RewriteRule ^(.+)/$ /$1 [NE,R=301,L]
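The pattern in the rule above can be sanity-checked outside Apache. What follows is a minimal Python sketch, assuming .htaccess context (where the leading slash is stripped before matching) and ignoring the `!-d` directory check. `strip_trailing_slash` is a hypothetical helper name, not something Apache provides.

```python
import re

# Same regex as the RewriteRule pattern above.
PATTERN = re.compile(r"^(.+)/$")

def strip_trailing_slash(url_path):
    """Return the redirect target for a per-directory URL-path,
    or None if the rule would not match."""
    match = PATTERN.match(url_path)
    if match is None:
        return None
    # Mirrors the substitution "/$1".
    return "/" + match.group(1)

print(strip_trailing_slash("blog/my-post/"))  # /blog/my-post
print(strip_trailing_slash("blog/my-post"))   # None (no trailing slash, no redirect)
```

Only one trailing slash is removed per pass, so a path ending in `//` would need a second redirect.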

You may not want to remove the trailing slash everywhere in WordPress, for SEO reasons.
But this shows how to do it if you want to. WordPress "/" fixes, from my WordPress Gist.
These can be converted to other setups, so they may be helpful.
Config File: nginx.conf
...
location /mirror/foo/ {
...
rewrite ^(.*[^/])$ $1/ permanent;
...
}
...
Description WordPress "/" PHP fixes
Retrieve trailing slash string, if blog set for adding trailing slashes.
Conditionally adds a trailing slash if the permalink structure has a trailing slash, strips the trailing slash if not. The string is passed through the ‘user_trailingslashit’ filter. Will remove trailing slash from string, if blog is not set to have them.
Usage
<?php user_trailingslashit( $string, $type_of_url ); ?>
Parameters
$string
(string) (false) URL with or without a trailing slash.
Default: None
$type_of_url
(string) (false) The type of URL being considered (e.g. single, category, etc) for use in the filter.
Default: None
Return Value (string)
Adds/removes a trailing slash based on the permalink structure.
https://codex.wordpress.org/Function_Reference/user_trailingslashit
2.
If you want to add "/"
<?php trailingslashit( $string ) ?>
Examples
<?php
$path = trailingslashit( '/home/julien/bin/dotfiles' );
?>
$path will now contain:
/home/julien/bin/dotfiles/
(Notice the trailing slash)
https://codex.wordpress.org/Function_Reference/trailingslashit
3.
This is new
The most important part of any test is the assertion. An assertion is a comparison between the value you expect to get from the system and the value that you actually get. The very simplest tests may consist of nothing but a single assertion. Example:
public function test_trailingslashit_should_add_slash_when_none_is_present() {
$this->assertSame( 'foo/', trailingslashit( 'foo' ) );
}
The assertSame() method accepts two parameters: the expected value (in this case, the hardcoded string 'foo/'), and the actual value (the value returned by trailingslashit()).
An annotated list of common assertions can be found below.
https://make.wordpress.org/core/handbook/testing/automated-testing/writing-phpunit-tests/#assertions
Try these rules (in www.example.com's server block):
rewrite ^/$ http://example.com permanent;
rewrite ^/main(.*)$ http://example.com$1 permanent;
rewrite ^(.*)$ http://blog.example.com$1 permanent;
Make sure you reload nginx.
With this config:
http://www.example.com/ redirects to http://example.com
http://www.example.com/main/something redirects to http://example.com/something
Everything else redirects to http://blog.example.com/
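The order of the three rewrite rules matters, because nginx returns the redirect as soon as a rule with an absolute `http://` replacement matches. Here is a rough Python sketch of that first-match behaviour, assuming only these three rules are present. `redirect_target` is a hypothetical helper, not an nginx API.

```python
import re

# (pattern, replacement) pairs in the order the rules are listed above.
# "\1" plays the role of nginx's "$1".
RULES = [
    (r"^/$", "http://example.com"),
    (r"^/main(.*)$", r"http://example.com\1"),
    (r"^(.*)$", r"http://blog.example.com\1"),
]

def redirect_target(uri):
    """Return the redirect target chosen by the first matching rule."""
    for pattern, replacement in RULES:
        match = re.search(pattern, uri)
        if match:
            return match.expand(replacement)
    return None

print(redirect_target("/"))                # http://example.com
print(redirect_target("/main/something"))  # http://example.com/something
print(redirect_target("/other"))           # http://blog.example.com/other
```

If the catch-all rule were listed first, it would match every request and the other two would never be reached.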

Related

rewrite request for /folder to folder/index.php without 301 redirect with apache

So I put an index.php in /pipe/index.php
I'd like to rewrite (internal, not redirect)
https://host/pipe?token=abc to https://host/pipe/index.php?token=abc
what I tried (caveat, assumes there is always a ? in the url):
RewriteEngine on
RewriteRule "^([^?]*)(.*)$" "$1/$2" [PT]
my hope was to split at the ? and just insert a / there.
But it seems Apache finds out that "oh, pipe is a folder" before checking my .htaccess (?) Because despite my [PT] it still redirects with 301 to /pipe/?token=abc, when I hoped for an internal rewrite.
But it seems Apache finds out that "oh, pipe is a folder" before checking my .htaccess (?)
Yes, mod_dir will append the trailing slash with a 301 redirect. Although this occurs after mod_rewrite has processed the URL (if indeed it is being processed at all - see below). (The PT flag is irrelevant in .htaccess, since the resulting rewrite is passed through as a URL-path by default.)
RewriteRule "^([^?]*)(.*)$" "$1/$2" [PT]
However, your existing rule (by itself) would result in a rewrite-loop (500 Internal Server Error) since it matches itself and repeatedly appends a slash. If you are seeing a 301 redirect as mentioned above then either this rule is not doing anything (are .htaccess overrides enabled?) or you have a conflict with other rules.
As you've stated, this rule also assumes that the query string (with leading ?) is also matched by the RewriteRule pattern. The RewriteRule directive matches against the URL-path only, not the query string. $2 in the above rule is therefore always empty (unless you have %3F in the URL-path, ie. a %-encoded ?).
The query string is contained in its own variable, QUERY_STRING. But you simply want to pass through the same query string, so you don't need to do anything special here, since that happens by default.
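To see why `$2` stays empty, it helps to split the URL the way mod_rewrite does. The following is a small Python sketch, assuming .htaccess context (no leading slash on the matched path). `rewrite_inputs` is a hypothetical helper name.

```python
import re
from urllib.parse import urlsplit

def rewrite_inputs(url):
    """Split a URL into the two parts mod_rewrite treats separately:
    the URL-path (what RewriteRule matches) and the query string."""
    parts = urlsplit(url)
    return parts.path, parts.query

path, query = rewrite_inputs("https://host/pipe?token=abc")
print(path)   # /pipe
print(query)  # token=abc

# The pattern from the question, applied to the URL-path only.
match = re.match(r"^([^?]*)(.*)$", path.lstrip("/"))
print(match.groups())  # ('pipe', '')
```

The second group has nothing left to capture because the `?token=abc` part never reaches the pattern.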
Solution
To prevent mod_dir appending the trailing slash, you need to set DirectorySlash Off at the top of the root .htaccess file.
Note that these directives must go in the .htaccess file in the root/parent directory, as opposed to the subdirectory that has the trailing slash omitted. This is because the mod_rewrite directives (that "fix" the URL by appending the trailing slash) would never actually be processed in the subdirectory .htaccess file. The trailing slash would seem to be required for mod_rewrite to function. (However, the mod_dir DirectorySlash Off directive would still be processed successfully, so the slash would not be appended.)
For example:
# Prevent mod_dir appending the trailing slash
DirectorySlash Off
# Must disable directory listings when "DirectorySlash Off" is set
Options -Indexes
However, you need to then manually append the trailing slash to any directory, where it is omitted, with an internal rewrite to "fix" the URL (and to correctly serve the DirectoryIndex document, ie. index.php).
# Ensure DirectoryIndex is set correctly
DirectoryIndex index.php
RewriteEngine On
# Append trailing slash to any directory where it has been omitted
RewriteCond %{REQUEST_FILENAME} -d
RewriteRule ^(.+[^/])$ $1/ [L]
The trailing slash on the directory (via the internal rewrite) is required in order to serve the DirectoryIndex document, otherwise, you get a 403 Forbidden, even if the DirectoryIndex document is present.
If the trailing slash is omitted and directory listings (mod_autoindex) are enabled (disabled above) then a directory listing would be generated even if a DirectoryIndex document is present in that directory. (Which is why directory listings must be disabled when DirectorySlash Off is set.)
NB: You will need to make sure the browser cache is cleared since the earlier 301 redirect by mod_dir to append the trailing slash will have been cached by the browser.
This probably is what you are looking for:
RewriteEngine on
RewriteRule ^/?pipe/?$ /pipe/index.php [QSA,L]
The QSA flag is actually redundant here, it is the default, but it makes things clearer if you compare it to that variant (both work):
RewriteEngine on
RewriteRule ^/?pipe/?$ /pipe/index.php?%{QUERY_STRING} [QSD,L]
The documentation for the rewriting module, and more specifically for the RewriteRule directive, clearly points out that the query string is not part of the path the rule's pattern is matched against.
If you want more control over the content of the query string you can use a RewriteCond:
RewriteEngine on
RewriteCond %{QUERY_STRING} ^token=(.*)$
RewriteRule ^/?pipe/?$ /pipe/index.php?token=%1 [QSD,L]
Also you might want to redirect the original URL:
RewriteEngine on
# END (Apache 2.4+) stops the rewritten URL from being matched again by the redirect rule
RewriteRule ^/?pipe/index\.php$ /pipe [QSA,R=301,END]
RewriteRule ^/?pipe/?$ /pipe/index.php [QSA,END]
And finally you might also want to take a look at the DirectoryIndex directive which might offer a solution without any rewriting at all, though this depends a bit on your setup ...

htaccess RewriteRule : Problem to omit all after 1st argument

Goal: Want to rewrite all URLs of type
https://www.example.com/page/1234/?/blog/foo/bar/
to
https://www.example.com/page/1234/
In .htaccess I tried many variations along the line
RewriteEngine On
RewriteBase /
RewriteRule ^page/(\d+)/(.*)$ /page/$1 [R=301,L]
Using an .htaccess tester I see that at least the matching pattern is valid.
I would expect the rewrite not to include anything after $1, but it does, and shows the complete original URL.
What am I missing?
https://www.example.com/page/1234/?/blog/foo/bar/
Everything after the first ? is the query string part of the URL. By default, Apache passes the query string unaltered from the request to the target URL (unless you create a new query string yourself on the RewriteRule substitution). This explains why you are seeing the same query string on the target URL, without seemingly doing anything with it.
Incidentally, the RewriteRule pattern matches against the URL-path only - this notably excludes the query string. To match the query string in mod_rewrite you need an additional condition that checks the QUERY_STRING server variable.
On Apache 2.4+ you can use the QSD (Query String Discard) flag to remove the query string from the target URL. Or, specify an empty query string on the substitution - by including a trailing ? (the ? itself does not appear on the resulting URL).
For example (on Apache 2.4):
RewriteCond %{QUERY_STRING} .
RewriteRule ^page/(\d+)/ /page/$1/ [QSD,R=301,L]
The RewriteCond directive checks for the presence of a query string, which is necessary to prevent a redirect loop.
The trailing (.*)$ on the RewriteRule pattern was superfluous.
You had omitted the trailing slash on the end of the substitution (that is present on the example URL). This would have also prevented a redirect loop, but as mentioned, this is not as per your example. (Alternatively, you could include the slash in the captured backreference.)
If you are still on Apache 2.2 then you would need to include a trailing ? instead of the QSD flag. For example:
RewriteCond %{QUERY_STRING} .
RewriteRule ^page/(\d+)/ /page/$1/? [R=301,L]
You will need to clear your browser cache before testing, as 301 (permanent) redirects are cached persistently by the browser. For this reason, it is often easier to first test with 302 (temporary) redirects.
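The need for the query-string condition can be illustrated with a toy simulation. This is a rough Python sketch and not how Apache is implemented: `redirect` and `count_redirects` are hypothetical helpers, and the hop limit stands in for the browser giving up on a redirect loop.

```python
import re

# Same regex as the RewriteRule pattern (per-directory path, no leading slash).
RULE = re.compile(r"^page/(\d+)/")

def redirect(path, query, require_query):
    """Apply the rule once; return the new (path, query) or None if no redirect is issued."""
    if require_query and query == "":
        return None  # the condition "RewriteCond %{QUERY_STRING} ." fails
    match = RULE.match(path.lstrip("/"))
    if match is None:
        return None
    # The query string is discarded on the target URL.
    return "/page/" + match.group(1) + "/", ""

def count_redirects(path, query, require_query, limit=10):
    """Follow redirects until none is issued or the hop limit is reached."""
    hops = 0
    while hops < limit:
        result = redirect(path, query, require_query)
        if result is None:
            break
        path, query = result
        hops += 1
    return hops

print(count_redirects("/page/1234/", "/blog/foo/bar/", require_query=True))   # 1
print(count_redirects("/page/1234/", "/blog/foo/bar/", require_query=False))  # 10 (loop, stopped by the limit)
```

With the condition in place the second request has no query string, so the rule is skipped and the page is served.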

.htaccess rewrite with slash

I'm trying to URL rewrite using .htaccess
from
example.com/daily.php to example.com/daily (and example.com/daily/)
with the following code:
Options +FollowSymLinks
RewriteEngine on
RewriteRule daily/$ daily.php
however:
example.com/daily/ = ok
example.com/daily = not ok
RewriteRule daily/$ daily.php
In the above RewriteRule directive, daily/$ is a regular expression (regex) that matches against the URL-path in the request. This regex contains a trailing slash (/), so this will clearly not match a URL that does not end in a slash.
If you want to match both /daily/ and /daily (although I would not recommend this - see note below) then you need to make the trailing slash optional in the regex. You make this character optional by following it with ? (question mark). For example:
RewriteRule ^daily/?$ daily.php [L]
I've also included a start-of-string anchor ^, so it only matches /daily and not /<anything>daily. You will probably want the L (last) flag, if you plan on adding any more directives.
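The difference between the two patterns can be checked with any regex engine. This is a quick Python sketch, assuming the URL-paths are as seen in .htaccess context (no leading slash); `xdaily/` is a made-up path used only to show the effect of the `^` anchor.

```python
import re

original = re.compile(r"daily/$")   # pattern from the question
fixed = re.compile(r"^daily/?$")    # anchored, trailing slash optional

# search() is used because Apache patterns are unanchored unless ^ or $ is given.
for path in ["daily/", "daily", "xdaily/"]:
    print(path, bool(original.search(path)), bool(fixed.search(path)))
# daily/ True True
# daily False True
# xdaily/ True False
```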
Aside: If you allow both /daily/ and /daily, which are technically two different URLs, then you potentially have "duplicate content". You should choose one or the other as the canonical URL, and optionally redirect the non-canonical version to the canonical one.

.htaccess redirect for specific URL structures

I have the following URL:
https://www.site-a.xyz/tutorials/post-name/2
I need it to redirect to the following URL
https://www.site-b.xyz/post-name/2
Essentially, if there is a trailing number element in the URL (in this case /2), I need the /tutorials/ part of the URL to be removed.
Note: ONLY if there is a trailing number
Try the following (using mod_rewrite) near the top of your .htaccess file at www.site-a.xyz:
RewriteEngine On
RewriteRule ^tutorials/([^/]+/\d+)$ https://www.site-b.xyz/$1 [R=302,L]
In this case, the trailing "number" can be 1 or more digits. If it is only a single digit (as in your example) then this should be simplified (change \d+ to \d). The $1 is a backreference to the captured group in the RewriteRule pattern.
Note that this is a 302 (temporary) redirect, if this is intended to be permanent then change to 301 when you are sure it's working OK. 301s are cached by the browser so can make testing problematic.
UPDATE: To allow for an optional trailing slash on the source URL then add /? near the end of the RewriteRule pattern, like so:
RewriteRule ^tutorials/([^/]+/\d+)/?$ https://www.site-b.xyz/$1 [R=302,L]
This notably strips that optional trailing slash from the redirect target. (Thus avoiding any duplicate content issues.)
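The capture-and-substitute step can be tried out in isolation. Below is a minimal Python sketch of the pattern with the optional trailing slash, assuming .htaccess context (no leading slash on the matched path). `redirect_target` is a hypothetical helper name.

```python
import re

# Same regex as the updated RewriteRule pattern.
PATTERN = re.compile(r"^tutorials/([^/]+/\d+)/?$")

def redirect_target(url_path):
    """Return the redirect URL, or None if the rule would not match."""
    match = PATTERN.match(url_path)
    if match is None:
        return None
    # Mirrors the substitution "https://www.site-b.xyz/$1".
    return "https://www.site-b.xyz/" + match.group(1)

print(redirect_target("tutorials/post-name/2"))   # https://www.site-b.xyz/post-name/2
print(redirect_target("tutorials/post-name/2/"))  # https://www.site-b.xyz/post-name/2
print(redirect_target("tutorials/post-name"))     # None
```

Because the optional slash sits outside the capturing group, it never makes it into the target URL.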

Apache rewrite slash

I want to create rewrite rule(s) that catch a couple of URLs and route them depending on whether the content is available in the first location. If it is not, then call a URL on the application so that it will regenerate it (and next time we can access it from the hard drive).
Let me insert the code here, so it will be easier to understand:
# I need to catch more than one page (and it has to work with and without the trailing slash!)
RewriteCond %{REQUEST_FILENAME} ^(/?|/page1/?|/page2/subpage/?)$ [NC]
# If the content exists
RewriteCond "%{DOCUMENT_ROOT}%{REQUEST_FILENAME}" -f
# Go to the exported folder and try to serve the page from there
# The first slash problem is here: if I have trailing slash, it will not work, because it will try to go here: /var/www/contentstatic/export/sites/default/$1//index.html
RewriteRule ^(.*)$ /var/www/contentstatic/export/sites/default/$1/index.html
# Otherwise run this rule (regenerate the file)
# This has to be changed (to something), because this will catch anything, but I need only the paths I defined earlier: ^(/?|/page1/?|/page2/subpage/?)$ <- Also I have to make sure that the last trailing slash is not there
RewriteRule ^(.*)$ http://application1:8080/export/sites/default/$1/index.html [P]
# At the bottom of the VirtualHost, there is another application that catches all the requests by default, so that's why I shouldn't use the "^(.*)$" in the previous RewriteRule
RewriteRule ^/(.*) http://application2:8080/$1 [P]
ProxyPassReverse / http://application2:8080/
The problems I have here:
This has to work with and without the trailing slash
I have to specify exactly what URLs to be served up from the /var/www/ folder or from the /export/sites/default folder, because if I don't do that the default application tries that, but it will fail
I also tried to remove the trailing slash from the url if it is there (in the first RewriteRule), but this rule:
[^/](.*)[^/]
changed the url from this: /page2/ to this: age2, so it removed the slashes and the first and last character.
Is it possible to use the same "^(/?|/page1/?|/page2/subpage/?)$" paths in the 3rd and 4th RewriteRule without repeating them?
Thanks