What is "RewriteRule ^.* /sitename:::144.html? [L,R=301]" actually doing? - apache

What is the effect of the following code?
RewriteRule ^.* /site:::144.html? [L,R=301]
I couldn't find any matching entries on Google explaining it.

It matches any request ^.* (“starts with any number of any arbitrary character”), and redirects it to /site:::144.html.
The question mark at the end of the target means any existing query string of the original request will be discarded.
The L flag means this will be the last rule interpreted in the current round of rewriting (when configured in .htaccess, the rewriting process “loops”, until no more rules match the current internal request),
and R=301 means it will use status code 301 for a permanent redirect.
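To make the moving parts concrete, here is a minimal Python sketch of what the rule does to an incoming URL. It is only a model, not how Apache is implemented; the function name redirect_all and the example.com URL are made up for illustration.

```python
import re
from urllib.parse import urlsplit

def redirect_all(request_url):
    """Rough model of: RewriteRule ^.* /site:::144.html? [L,R=301]"""
    # In .htaccess the pattern is tested against the path without its leading slash.
    path = urlsplit(request_url).path.lstrip("/")
    if re.search(r"^.*", path):  # matches every path, including the empty one
        # The trailing "?" on the target means "empty query string",
        # so the original query string is not carried over.
        return 301, "/site:::144.html"
    return None

print(redirect_all("http://example.com/any/page.php?id=7"))
```

Whatever path or query string you pass in, the result is the same status code and the same target.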

Related

htaccess url redirect with get parameters ID and reduce value

I want to redirect a URL to a new domain, retrieving the ID parameter but only taking its first 4 characters. Does anyone know how to do this?
For example, an original url:
http://www.original.example/see/news/actualite.php?newsId=be9e836&newsTitle="blablabla"
To :
https://www.new.example/actualites/be9e
I have tested :
RewriteCond %{QUERY_STRING} ^newsId=(.*)$ [NC]
RewriteRule ^$ https://www.new.example/actualites/%1? [NC,L,R]
There are a couple of problems with this:
The regex ^$ in the RewriteRule pattern only matches the document root. The URL in your example is /see/news/actualite.php - so this rule will never match (and the conditions are never processed).
The regex ^newsId=(.*)$ is capturing everything after newsId=, including any additional URL parameters. You only need the first 4 characters of this particular URL param.
As an aside, your existing condition is dependent on newsId being the first URL parameter. Maybe this is always the case, maybe not. But it is relatively trivial to check for this URL parameter, regardless of order.
Also, do you need a case-insensitive match? Or is it always newsId as stated in your example. Only use the NC flag if this is necessary, not as a default.
Try the following instead:
RewriteCond %{QUERY_STRING} (?:^|&)newsId=([^&]{4})
RewriteRule ^see/news/actualite\.php$ https://www.new.example/actualites/%1 [QSD,R,L]
The %1 backreference now contains just the first 4 characters of the newsId URL parameter value (ie. non & characters), as denoted by the regex ([^&]{4}).
The QSD flag (Apache 2.4) discards the original query string from the redirect response. There is no need to append ? (an empty query string) to the substitution string, as would have been required in earlier versions of Apache.
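If you want to check the condition's regex outside Apache, a small Python sketch like the following behaves the same way for this pattern. The function name redirect_target is hypothetical, and the query strings are just test inputs.

```python
import re

# Same regex as the RewriteCond: newsId at the start or after "&",
# then capture exactly four non-"&" characters.
NEWS_ID = re.compile(r"(?:^|&)newsId=([^&]{4})")

def redirect_target(query_string):
    """Return the redirect URL, or None when the condition would fail."""
    match = NEWS_ID.search(query_string)
    if match is None:
        return None
    return "https://www.new.example/actualites/" + match.group(1)

print(redirect_target('newsId=be9e836&newsTitle="blablabla"'))
print(redirect_target("lang=fr&newsId=be9e836"))  # parameter order does not matter
print(redirect_target("newsId=be9"))              # fewer than 4 characters: no match
```

Note that a value shorter than 4 characters fails the condition, so such a request would not be redirected by this rule.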
UPDATE:
I have an anchor link (#) which is added at the end of the link, is there a possibility of deleting it to make a clean link? Example, currently I have: https://www.new.example/news/4565/#title Ideally : https://www.new.example/news/4565
The "problem" here is that the browser manages the "fragment identifier" (fragid) (ie. the "anchor link (#)") and preserves this through the redirect. In other words, the browser re-appends the fragid to the redirect response from the server. The fragid is never sent to the server, so we cannot detect this server side prior to issuing the HTTP redirect.
The only thing we can do is to append an empty fragid (ie. a trailing #) in the hope that the browser discards the original fragment. Unfortunately, you will likely end up with a trailing # on your redirected URLs (browser dependent).
For example (simplified):
:
RewriteRule .... https://example.com/# [R=301,NE,L]
Note that you will need the NE flag here to prevent Apache from URL-encoding the # in the redirect response.
Like I say above, browsers might handle this differently.
Further reading:
URL Fragment and 302 redirects
redirect is keeping hash
How to clear fragment identifier on 302 redirect?

What does # symbol mean in mod_rewrite rules

I was looking for a way to prevent my PDF files from being accessed by direct URL on my website and I found these htaccess rules:
RewriteEngine On
RewriteCond %{HTTP_HOST}##%{HTTP_REFERER} !^([^#]*)##http?://\1/.*
RewriteRule .*\.pdf [NC,F]
Even though it seems to work perfectly, I don't really understand what these # symbols mean on the RewriteCond rule. I've some basics with regex but I haven't found anything related to these on apache and regex docs, and the article where I found the rules doesn't provide any info.
Any ideas?
Short answer: It's basically a separator between the values of %{HTTP_HOST} and %{HTTP_REFERER}, so that the two values can be compared in a single RewriteCond.
Explained answer: Why put ## between the two Apache variables? We use this trick whenever we want to test whether two values are the same. The first value is caught in a capturing group, and the back-reference to that group then has to match inside the second value. If the two are not the same, the pattern does not match.
Now come on to this current scenario:
let's say our domain name is: www.example.com
and HTTP_REFERER value is: http://www.example.com/en-US/JavaScript
Then %{HTTP_HOST}##%{HTTP_REFERER} expands to:
www.example.com##http://www.example.com/en-US/JavaScript
Now come on the right side of Cond line:
!^([^#]*)##http?://\1/.*
The capturing group will hold www.example.com, and when we use it as \1 in http?://\1 we are checking whether the referer URL starts with http://www.example.com/. Because of the leading !, the condition is true only when it does NOT match, and then the RewriteRule is applied to the request.
We do it this way because a regex CondPattern cannot reference a second variable, so there is no direct way to check whether two values are equal.
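The back-reference trick can be tried out in Python, whose re module handles this pattern the same way. This is only a sketch: the helper name referer_is_own_host is made up, and the hosts are sample values.

```python
import re

# The CondPattern from the question, without the leading "!" negation.
SAME_HOST = re.compile(r"^([^#]*)##http?://\1/.*")

def referer_is_own_host(http_host, http_referer):
    """True when the pattern matches, i.e. the referer is on the same host."""
    test_string = f"{http_host}##{http_referer}"  # what the TestString expands to
    return SAME_HOST.search(test_string) is not None

print(referer_is_own_host("www.example.com", "http://www.example.com/en-US/JavaScript"))
print(referer_is_own_host("www.example.com", "http://other.example/page"))
print(referer_is_own_host("www.example.com", ""))  # direct access, no referer
print(referer_is_own_host("www.example.com", "https://www.example.com/page"))
```

One thing this shows: http? means "htt" followed by an optional "p", so as written the pattern does not match an https:// referer from your own site.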
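Suggestion on improving your rules (the RewriteRule needs a - substitution before the flags, and https? accepts both http and https referers, whereas http? only matches htt or http):
RewriteEngine On
RewriteCond %{HTTP_HOST}##%{HTTP_REFERER} !^([^#]*)##https?://\1/.*
RewriteRule .*\.pdf/?$ - [NC,F]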

Rewrite rule and the_request

How do I rewrite search/2 to index.php?search="x"&search_by="y"&page_no=2?
If I am not wrong, %{REQUEST_URI} is search/2, right? Also, what is %{THE_REQUEST} in this case?
The page where search/2 link is located is rewritten as just home_page.
%{REQUEST_URI} and %{THE_REQUEST} are variables in mod_rewrite. These variables contain the following:
%{REQUEST_URI} will contain everything behind the hostname and before the query string. In the url http://www.example.com/its/a/scary/polarbear?truth=false, %{REQUEST_URI} would contain /its/a/scary/polarbear. This variable updates after every rewrite.
%{THE_REQUEST} is a variable that contains the entire request as it was made to the server. This is something in the form of GET /its/a/scary/polarbear?truth=false HTTP/1.1. Since the request that was made to the server is static in the lifespan of one such request, this variable does not change when a rewrite is made. It is therefore helpful in certain situations where you only want to rewrite if an external request contained something. It is often used to prevent infinite loops from happening.
A complete list of variables can be found here.
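As a rough illustration of how the two variables relate, this Python sketch takes a raw request line and pulls out the part that corresponds to %{REQUEST_URI}. The function name split_request_line is hypothetical, and it only models the initial values, before any rewrite has happened.

```python
def split_request_line(the_request):
    """Split a raw HTTP request line into the pieces mod_rewrite exposes."""
    method, target, protocol = the_request.split(" ")
    path, _, query = target.partition("?")
    return {
        "THE_REQUEST": the_request,  # stays fixed for the whole request
        "REQUEST_URI": path,         # initial value; changes after a rewrite
        "QUERY_STRING": query,
    }

parts = split_request_line("GET /its/a/scary/polarbear?truth=false HTTP/1.1")
print(parts["REQUEST_URI"])
print(parts["QUERY_STRING"])
```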
In your case you will have a link to search/2?search=x&search_by=y. You want to internally rewrite this to index.php?search=x&search_by=y&page_no=2. You can do this with the following rule:
RewriteRule ^search/([0-9]+)$ /index.php?page_no=$1 [QSA,L]
The first argument matches the external request that comes in. It is then rewritten to /index.php?page_no=2. The QSA (query string append) flag appends the existing query string to the rewritten query string, so you end up with /index.php?page_no=2&search=x&search_by=y. The L flag stops this 'round' of rewriting; it's just an optimization.
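Here is a small Python model of that rule, in case it helps to see the QSA merge spelled out. The function name rewrite is made up, and this is a sketch of the behaviour, not Apache's code. With QSA, the query string from the substitution comes first and the original one is appended after it.

```python
import re

RULE = re.compile(r"^search/([0-9]+)$")

def rewrite(path, query_string):
    """Model of: RewriteRule ^search/([0-9]+)$ /index.php?page_no=$1 [QSA,L]"""
    match = RULE.search(path)
    if match is None:
        return path, query_string      # rule does not apply
    new_query = "page_no=" + match.group(1)
    if query_string:
        new_query += "&" + query_string  # QSA: append the original query string
    return "/index.php", new_query

print(rewrite("search/2", "search=x&search_by=y"))
```

The order of the parameters does not matter to PHP, so index.php sees search, search_by and page_no either way.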

How does one add something to the end of an url with mod_rewrite

I need to add ?lang=English to a URL such as /self-service/more if a visitor comes from a specific domain (that is, the visitor was on site A first and then clicked a link to my site).
How can I do that? I tried reading the manual but it's a bit over my head.
Try using a combination of a rewrite condition and a rewrite rule like this, placed at the top of your .htaccess:
RewriteCond %{HTTP_REFERER} ^(http|https)://best-graphic-design\.co\.uk [NC]
RewriteRule ^(.*)$ http://www.%{HTTP_HOST}/$1?lang=English [R=301,L,QSA]
A simple explanation
First check for the referer to your site, NC flag stands for nocase/ignore case
If condition matches, redirect to desired location, in your case, same host while appending the new parameter, flags used here are R - Redirect, L - Last, QSA - Query String Append.
Hope it finally helps.

How can I get mod_rewrite to match a rule just once

I have the following URL...
http://localhost/http.mygarble.com/foundationsofwebprogramming/86
...that I want to convert into the following:
http://localhost/http.mygarble.com/php/blog.php?subdomain=foundationsofwebprogramming&page=posts&label=86
I thought I could achieve this with the following rule:
RewriteRule ([^/]+)/([^/]+)$ php/blog.php?subdomain=$1&page=post&label=$2 [NC,L]
However what I find is that this rule is applied repeatedly, resulting in an internal server error. I understand that when the URI is transformed using this rule, the resulting URI will also match the rule, and therefore it is applied again ad-infinitum.
My previous (admittedly rather hazy) understanding was that the [L] flag would stop further processing, although I now understand that this simply means that only the remainder of the rules are skipped, and does not stop the rewrite engine running through the rules again.
I can fix this problem by adding the following condition...
RewriteCond $0 !php/blog.php
RewriteRule ([^/]+)/([^/]+)$ php/blog.php?subdomain=$1&page=post&label=$2 [NC,L]
...or by writing a more specific regular expression. But what I really want to do is find a way of stopping the rewrite engine from attempting ANY further matches once this rule is matched once. Is this possible?
Many thanks.
Usually 2 methods are used.
The first is a RewriteCond testing that the requested file is not a real file. When the internal recursion occurs, your php/blog.php is a real file, so the RewriteRule is not executed the second time. The side effect is that any request for a file that exists won't be rewritten (which can be a good side effect).
RewriteCond %{REQUEST_FILENAME} !-f
The second solution is to check you're not in an internal redirection with:
RewriteCond %{ENV:REDIRECT_STATUS} ^$
Side effect of this 2nd solution is that the rewriteRule cannot be applied if some other rules are applied before (if you want some internal redirection to run after a first pass of rewriting in fact).
Edit
For completeness, a third option is the [NS] (or [nosubreq]) flag. Note that it only stops the rule from being used on internal subrequests, which is not quite the same thing as the internal redirect that follows a rewrite in .htaccess, so it may not help in this case.
A fourth method is to upgrade Apache to 2.3.9 or higher and use the [END] flag instead of [L].
No side effects
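To see why the rule from the question keeps firing, and why a guard condition stops it, here is a deliberately simplified Python model of the per-directory rewrite loop. The names run and MAX_PASSES are invented, the pass limit is an arbitrary stand-in, and real Apache does more than this between passes.

```python
import re

# The pattern from the question; the NC flag becomes IGNORECASE.
RULE = re.compile(r"([^/]+)/([^/]+)$", re.IGNORECASE)
MAX_PASSES = 10  # hypothetical cap so the sketch terminates

def run(path, guard=False):
    """Count how many times the rule rewrites the path before things settle."""
    passes = 0
    while passes < MAX_PASSES:
        match = RULE.search(path)
        if match is None:
            break
        # Model of: RewriteCond $0 !php/blog.php
        if guard and "php/blog.php" in match.group(0):
            break
        path = "php/blog.php"  # the rewritten path matches the pattern again
        passes += 1
    return passes

print(run("foundationsofwebprogramming/86"))              # hits the cap
print(run("foundationsofwebprogramming/86", guard=True))  # rewritten once
```

The [L] flag corresponds to leaving the rule list within one pass; it does nothing about the next pass, which is why a guard condition or [END] is needed.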