Remove .html and .html/amp extension with .htaccess only for files from directory - apache

After a site move I want to be able to remove the extension (if any) and query string (if any) to leave just the file name and keep the path
https://www.example.com/blog/anyfile.html
301 to >> https://www.example.com/blog/anyfile
https://example.com/blog/anyfile.html/amp
301 to >> https://www.example.com/blog/anyfile
https://www.example.com/blog/anyfile.html/amp?nonamp=1
301 to >> https://www.example.com/blog/anyfile
I tried something like this, but it doesn't keep the /blog/ folder:
RewriteEngine On
RewriteCond %{REQUEST_URI} ^/blog/
RewriteRule ^.*/([^/]+)\.html$ /$1? [L,NC,R]
Also, I can't find a way to remove /amp after .html.

Near the top of the root .htaccess file you could do something like the following to discard .html, .html/amp and .html/<anything> from the end of the URL-path, and discard the query string (if any) at the same time:
# Strip ".html" onwards from the end of the URL (and remove query string)
RewriteRule ^(.*)\.html(/.*)?$ https://www.example.com/$1 [QSD,R=301,L]
On Apache 2.4+, the QSD (Query String Discard) flag is the preferred way to remove the query string, rather than appending an empty query string (a trailing ?) to the substitution.
You need to hardcode the scheme + hostname if you wish to satisfy your second example and redirect from example.com to www.example.com. This could be generalised (without hardcoding the domain) if we know that your site is only accessible by the www subdomain or domain apex and this single domain.
However, the above won't catch URLs that only include a query string, but don't contain .html in the URL-path. For that you could implement an additional rule, following the rule above:
# Strip the query string from any URL.
RewriteCond %{QUERY_STRING} .
RewriteRule ^ https://www.example.com%{REQUEST_URI} [QSD,R=301,L]
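The first rule could be generalised along the following lines. This is only a sketch, and it assumes the site is reachable solely as the domain apex or the www subdomain of a single domain, as noted above. The condition captures the hostname without any leading www. and the %1 backreference reuses it in the substitution:

```apache
# Sketch: build the canonical www host from the request instead of hardcoding it.
# Assumes only example.com and www.example.com point at this site.
RewriteCond %{HTTP_HOST} ^(?:www\.)?(.+?)\.?$ [NC]
RewriteRule ^(.*)\.html(/.*)?$ https://www.%1/$1 [QSD,R=301,L]
```

If the Host header can carry a port number, %1 would include it, so the hardcoded form is the safer choice in that case.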
A look at your existing rule:
RewriteCond %{REQUEST_URI} ^/blog/
RewriteRule ^.*/([^/]+)\.html$ /$1? [L,NC,R]
You are only capturing the filename (anyfile in your example) and discarding the URL-path that precedes it (i.e. blog/), so the $1 backreference only contains anyfile. The pattern also only matches URLs that end in .html, not .html/amp.
Checking the URL-path in the RewriteCond directive is superfluous.
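Since the question only concerns files under /blog/, the directory check can be folded into the RewriteRule pattern itself. A minimal sketch, assuming the .htaccess file is in the document root and Apache 2.4+ (for QSD):

```apache
# Only strip ".html" (and anything after it) for URLs under /blog/
RewriteRule ^(blog/.+)\.html(/.*)?$ https://www.example.com/$1 [QSD,R=301,L]
```

Here $1 holds blog/anyfile for all three example URLs, so the /blog/ folder is kept in the redirect target.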

Related

htaccess redirect to exclude certain pages

My htaccess file currently redirects everything and has this
RewriteEngine On
RewriteCond %{SERVER_PORT} 80
RewriteRule ^(.*)$ https://www.domain.co.uk/$1 [R,L]
I need to exclude two urls that begin with "send"
I changed the last line to
RewriteRule !^send(.*) https://www.domain.co.uk/$1 [R,L]
It excluded the send urls but any url in a subfolder is redirected to the root index page.
RewriteRule !^send(.*) https://www.domain.co.uk/$1 [R,L]
Negated patterns don't capture anything (by definition), so the $1 backreference is always empty. In any case, trying to capture everything after send when send is not present in the URL doesn't make much sense.
You can do something like the following and use the REQUEST_URI server variable in the substitution instead of the backreference:
RewriteRule !^send https://www.domain.co.uk%{REQUEST_URI} [R,L]
Note that the REQUEST_URI server variable already contains the slash prefix.
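If you would rather keep the $1 backreference, one alternative is to move the exclusion into a condition and leave the RewriteRule pattern positive. A sketch, assuming the .htaccess file is in the document root:

```apache
RewriteEngine On
RewriteCond %{SERVER_PORT} 80
# Skip any URL-path that starts with /send
RewriteCond %{REQUEST_URI} !^/send
RewriteRule ^(.*)$ https://www.domain.co.uk/$1 [R,L]
```

Because the pattern is not negated, (.*) captures the full URL-path, so URLs in subfolders keep their path.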

In Apache how to do an external redirect to the slashless version of a URL with a subfolder .htaccess file

On Apache 2.4 I have an .htaccess (in a subfolder) which rewrites slashless requests inside that folder to appropriate index files:
DirectorySlash Off
RewriteEngine On
RewriteCond %{REQUEST_FILENAME} -d
RewriteCond %{REQUEST_URI} !/$
RewriteCond %{REQUEST_FILENAME}/index.html -f
RewriteRule (.*) $1/index.html [L]
This works for the slashless version exactly as expected. Now I want to redirect the slashed version externally to the slashless version. I tried adding the lines:
RewriteCond %{REQUEST_FILENAME} -d
RewriteCond %{REQUEST_URI} /$
RewriteRule ^(.*)/ $1 [R=302,L]
However this does not work: The redirect is issued, however it does not go to the slashless URL, but to a URL with a system specific part injected.
So, for a sample URL http://example.com/path/to/dir/ the redirected URL looks like this http://example.com/fs9e/username/sub/public/path/to/dir instead of just http://example.com/path/to/dir.
How can I fix this? Many thanks for any pointers!
PS: The real case is a little bit more complicated because I do a subdomain-to-folder rewrite in the root .htaccess, but I assume this is not relevant here.
RewriteRule ^(.*)/ $1 [R=302,L]
You are missing the slash prefix (/) on the substitution string (2nd argument) - to make the substitution root-relative. Or rather, /subfolder/ (since this .htaccess file is located in a subfolder). Since this is a relative substitution string (not starting with a slash or scheme+hostname), the directory-prefix*1 (which I assume is /fs9e/username/sub/public/path/) is added back (by default*2), resulting in a malformed redirect. (This is correct for internal rewrites, but not external redirects.)
It should be like this:
RewriteRule ^(.*)/$ /subfolder/$1 [R=302,L]
Note you were also missing the end-of-string anchor ($) on the RewriteRule pattern. (This also negates the need for the preceding condition that checks that REQUEST_URI ends in a slash.)
Note also that this "redirect" should go before the earlier "rewrite".
*1 The directory-prefix is the absolute filesystem path of the location of the .htaccess file.
*2 The alternative is to set a RewriteBase /subfolder - but that then affects all relative substitutions. You could also use an environment variable to apply a specific prefix only to some rules.
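Putting both rules together, the subfolder .htaccess might look like the sketch below. It assumes the file lives at /subfolder/ relative to the document root (a placeholder, as above) and has not been tested against the subdomain-to-folder rewrite mentioned in the question:

```apache
DirectorySlash Off
RewriteEngine On

# 1. External redirect: /subfolder/dir/ -> /subfolder/dir
RewriteCond %{REQUEST_FILENAME} -d
RewriteRule ^(.*)/$ /subfolder/$1 [R=302,L]

# 2. Internal rewrite: serve dir/index.html for the slashless URL
RewriteCond %{REQUEST_FILENAME} -d
RewriteCond %{REQUEST_URI} !/$
RewriteCond %{REQUEST_FILENAME}/index.html -f
RewriteRule (.*) $1/index.html [L]
```

The redirect comes first so that a slashed request is answered with the external redirect before the internal rewrite is considered.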

Can't pass parameters to another redirect website

I want to redirect to another website, and pass the parameters also.
Example: I go to my website: source.example/?code=12345
Then, I want it to redirect to target.example/?code=12345.
I am currently using this for my .htaccess file, since I figured out from other posts that if I query a certain parameter, it will get passed also:
RewriteEngine On
RewriteCond %{QUERY_STRING} ^code=[NS]$
RewriteRule "www.google.com" /$1 [R=302,L]
Also, I tried many different approaches looking at these stack questions:
simple .htaccess redirect : how to redirect with parameters?
Redirect and keep the parameter in the url on .htaccess
But I can't get it running :(
since I figured out from other posts that if I query a certain parameter, it will get passed also
This is not true. The query string is passed through by default - there is nothing extra you need to do if you want the same query string on the target URL.
RewriteCond %{QUERY_STRING} ^code=[NS]$
RewriteRule "www.google.com" /$1 [R=302,L]
This code won't match the source URL for many reasons:
"www.google.com" - The first argument to the RewriteRule directive is a regex that matches the source URL-path (less the slash prefix). In your example the URL-path is empty.
^code=[NS]$ matches either code=N or code=S - which is not the intention from your example. (The [NS] looks like a mangled RewriteRule flag?!)
/$1 - this is the substitution string, ie. the URL you want to redirect to. (The $1 backreference is always empty, so this is meaningless.)
To redirect from source.example/?code=<number> to https://target.example/?code=<number> then try the following instead:
RewriteCond %{HTTP_HOST} ^source\.example [NC]
RewriteCond %{QUERY_STRING} ^code=\d+$
RewriteRule ^$ https://target.example/ [R=302,L]
This only matches a query string of the form code=1234. It does not match code= or code=1234&foo=bar, etc.
The query string is passed through by default.
If source.example is the only domain being hosted at the current location then you can remove the first condition that explicitly checks the requested hostname.
The order of directives in the .htaccess file is important. An external redirect like this should go near the top.
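If the code parameter may arrive alongside other parameters, the condition could be loosened. This is a sketch rather than part of the original rule, and it redirects for any query string that contains a numeric code parameter:

```apache
# Match code=<digits> anywhere in the query string, e.g. foo=bar&code=1234
RewriteCond %{HTTP_HOST} ^source\.example [NC]
RewriteCond %{QUERY_STRING} (?:^|&)code=\d+(?:&|$)
RewriteRule ^$ https://target.example/ [R=302,L]
```

The whole query string, including any extra parameters, is still passed through to the target.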

removing directory in apache mod_rewrite

I have a PHP site which replaces an ASP site, so the path structure is different.
In the URLs, I need to match http://apache.site/Cartv3/Details.asp & redirect to another location. What is the correct syntax to match that URL fragment?
I've already tried
RewriteCond %{REQUEST_URI} CartV3/results1.asp?Category=60
RewriteRule ^(.*)$ home-study/A-Levels/1/page-1 [R=301,L]
and
RewriteRule ^CartV3/Details\.asp?ProductID=1004 home-study/A-Levels/1/page-1 [R=301,L]
You need to read more about mod_rewrite. Remember that RewriteRule doesn't match the query string. Your attempt needs to be rewritten as:
Options +FollowSymLinks -MultiViews
# Turn mod_rewrite on
RewriteEngine On
RewriteBase /
RewriteCond %{QUERY_STRING} ^Category=60$ [NC]
RewriteRule ^CartV3/results1\.asp$ /home-study/A-Levels/1/page-1? [R=302,L,NC]
Once you verify it is working fine, replace R=302 with R=301. Avoid using R=301 (Permanent Redirect) while testing your mod_rewrite rules.
PS: The ? after page-1 is special mod_rewrite syntax that strips the original query string. If you want to keep the original query string in the redirected URL, take out the trailing ?.
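The same approach should carry over to the other URL from the question. A sketch, taking the ProductID value and target path from the second attempt and assuming Apache 2.4+ so that the QSD flag can replace the trailing ?:

```apache
# Match the path in the RewriteRule and the query string in a RewriteCond
RewriteCond %{QUERY_STRING} ^ProductID=1004$ [NC]
RewriteRule ^CartV3/Details\.asp$ /home-study/A-Levels/1/page-1 [QSD,R=302,L,NC]
```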
The problem here is that you are trying to match the query string, which has to be done by a separate RewriteCond. If you want to match specifically "Category=60", then you can add it as a condition:
RewriteCond %{QUERY_STRING} Category=60
RewriteCond %{REQUEST_URI} ^/CartV3/results1\.asp$
RewriteRule .* /home-study/A-Levels/1/page-1? [R=302,L]
This will match http://example.com/CartV3/results1.asp?Category=60 and redirect. The ? at the end of the rule stops "?Category=60" being appended to the resulting URI.
If you don't care about the value in the query string, then you can remove the first condition.

.htaccess - redirect favicon

How do I redirect all requests for favicon.ico in root directory or any subdirectory to /images/favicon.ico
Try this rule:
RewriteEngine on
RewriteRule ^favicon\.ico$ /images/favicon.ico [L]
Edit: And for favicon.ico at an arbitrary path segment depth:
# $0 is the whole string matched by the RewriteRule pattern; skip the target itself
RewriteCond $0 !=images/favicon.ico
RewriteRule ^([^/]+/)*favicon\.ico$ /images/favicon.ico [L]
For a favicon at www.mysite.com/images/favicon.ico
the most robust method would be:
RewriteCond %{REQUEST_URI} !^/images/favicon\.ico$ [NC]
RewriteCond %{HTTP_HOST} (.+)
RewriteRule ^(.*)favicon\.(ico|gif|png|jpe?g)$ http://%1/images/favicon.ico [R=301,L,NC]
Explanation:
RewriteCond %{REQUEST_URI} !^/images/favicon\.ico$ [NC] :
- ensures that the redirect rule does NOT apply if the correct URI is requested (eg a 301 redirect will write the correct favicon URI to browser cache - and this line avoids processing the rule if the browser requests the correct URI)
- [NC] means it's not case sensitive
RewriteCond %{HTTP_HOST} (.+) :
- retrieves the http host name - to avoid hard coding the hostname into the RewriteRule
- this means you can copy your .htaccess file between local/test server and production server without problems (or the need to re-hardcode your new site base url into your RewriteRule)
RewriteRule ^(.*)favicon\.(ico|gif|png|jpe?g)$ http://%1/images/favicon.ico [R=301,L,NC] :
- ^ is the start of the regex
- (.*) is a wildcard group - which means that there can be zero or any number of characters before the word favicon in the URI (ie this is the part that allows root directory or any subdirectories to be included in the URI match)
- \.(ico|gif|png|jpe?g) checks that the URI extension matches any of .ico, .gif, .png, .jpg, .jpeg
- $ is the end of the regex
- http://%1/images/favicon.ico is the redirect url - and it injects the hostname we retrieved in the previous RewriteCond. Note that %1 is a RewriteCond backreference: it refers to the first captured group of the last matched RewriteCond (eg %2 would be the second captured group in that same condition).
- R=301 means it's a permanent redirect - which stores the redirect in the browser cache. Be careful when testing - you'll need to delete browser cache between code changes or the redirect won't update. Consider using R=302 until you know the rule works.
- L means it's the last rule processed for this request in the current pass - no later rules in this .htaccess file are applied. Note that it is line 1, not L, that prevents a redirect loop: without line 1 the redirect URL would keep satisfying the RewriteRule pattern on each new request. It's still a good idea to add the L if you have other rules following the favicon rules - since on a favicon.ico request, you can (probably) ignore any following rules.
You can test .htaccess rules at http://htaccess.mwl.be/
Final Note:
- be careful that you don't have any other RewriteRule in an .htaccess file located in any of your sub-directories.
- eg if you put this answer in your www.mysite.com/ root folder .htaccess file, a RewriteRule (.*) xxx type rule in your www.mysite.com/images/ folder can mess with the results.
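A variation on the rule above, as a sketch only: when the R flag is used with a root-relative substitution, mod_rewrite builds the absolute URL from the scheme and server name of the current request, so the HTTP_HOST condition is not strictly needed to avoid hardcoding the hostname. R=302 is used here for testing:

```apache
RewriteEngine On
# Leave the real favicon alone, redirect every other favicon request
RewriteCond %{REQUEST_URI} !^/images/favicon\.ico$ [NC]
RewriteRule ^(.*)favicon\.(ico|gif|png|jpe?g)$ /images/favicon.ico [R=302,L,NC]
```

This form also keeps https requests on https, whereas the http://%1 substitution always redirects to http.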
RewriteEngine on
RewriteRule ^(.*)favicon\.ico /images/favicon.ico [L]
I know the question is tagged .htaccess, but why not use a symlink?
ln -s images/favicon.ico favicon.ico
This quick rewrite should do the trick:
RewriteRule ^(.*)favicon\.ico /images/favicon.ico