What does # symbol mean in mod_rewrite rules - apache

I was looking for a way to prevent my PDF files from being accessed by direct URL on my website, and I found these htaccess rules:
RewriteEngine On
RewriteCond %{HTTP_HOST}##%{HTTP_REFERER} !^([^#]*)##http?://\1/.*
RewriteRule .*\.pdf [NC,F]
Even though it seems to work perfectly, I don't really understand what these # symbols mean in the RewriteCond rule. I know some regex basics, but I haven't found anything related to these in the Apache and regex docs, and the article where I found the rules doesn't provide any info.
Any ideas?

Short answer: It's basically a separator between the values of %{HTTP_HOST} and %{HTTP_REFERER}, so that the two values can be compared within a single RewriteCond directive.
Explained answer: Why put ## between the two Apache variables? We use this trick whenever we want to check that two values are EQUAL. The first value is caught in a capturing group, and a backreference to that group must then match the second value; if the second value is NOT the same, the pattern fails to match. The # itself has no special meaning in the regex; it is just a character that is not expected to appear in a hostname, so [^#]* stops at the separator.
Now to the current scenario.
Let's say our domain name is: www.example.com
and the HTTP_REFERER value is: http://www.example.com/en-US/JavaScript
Then %{HTTP_HOST}##%{HTTP_REFERER} expands to:
www.example.com##http://www.example.com/en-US/JavaScript
Now to the right-hand side of the RewriteCond line:
!^([^#]*)##http?://\1/.*
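The capturing group holds www.example.com, and using it as \1 in http?://\1/ checks whether the referer URL starts with http://www.example.com/. Because the pattern is negated with !, the condition succeeds only when the referer does NOT match the host, and only then does the RewriteRule fire and return 403 Forbidden (the F flag). Requests whose referer is your own site are let through.
We do it this way because the regex side of a RewriteCond cannot reference another server variable, so there is no direct way to test whether two variables are equal.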
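To make the backreference trick concrete, here is a minimal sketch in Python that replays the same pattern against the combined string. It is only an illustration: the function name is_blocked and the sample referers are made up for the example, and Python's re module stands in for the PCRE engine that Apache uses.

```python
import re

# Same regex as the CondPattern, without the leading "!" negation.
# \1 inside the pattern must repeat whatever group 1 captured (the host).
SAME_HOST = re.compile(r"^([^#]*)##http?://\1/.*")


def is_blocked(host, referer):
    """Return True when the negated condition would succeed (rule fires, 403)."""
    test_string = f"{host}##{referer}"
    return SAME_HOST.search(test_string) is None


print(is_blocked("www.example.com", "http://www.example.com/en-US/JavaScript"))  # False
print(is_blocked("www.example.com", "http://other.example.net/page"))            # True
print(is_blocked("www.example.com", ""))                                         # True (no referer)
# "http?" means "htt" plus an optional "p", so an https:// referer never matches:
print(is_blocked("www.example.com", "https://www.example.com/page"))             # True
```

The last call shows a side effect of writing http? rather than https?: visitors arriving from an https:// page on the same host are treated as foreign.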
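Suggestion for improving your rules (https? instead of http?, so that https:// referers match as well, and a - substitution, which RewriteRule requires before the flags):
RewriteEngine On
RewriteCond %{HTTP_HOST}##%{HTTP_REFERER} !^([^#]*)##https?://\1/.*
RewriteRule .*\.pdf/?$ - [NC,F]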

Related

.htaccess : Pretty URL with whatever number+names of parameters

Hello !
I know there already are a lot of topics about URL rewriting and I honestly swear I've spent a lot of time trying to apply them to my problem, but I can't see any of them perfectly applying to my situation (if you find otherwise, please give the link).
-----
Here's the problem :
I'm learning MVC model and URL rewriting and I have my URL like this :
http://localhost/blahblahblah/mywebsite/index.php?param1=value1&param2=value2&param3=value3 ... etc ...
What I want (for some MVC template goals) is to have this kind of URL :
http://localhost/blahblahblah/mywebsite/value1/value2/value3 ... etc ...
-----
This should work whatever the names of the parameters and whatever the values.
This is the most essential thing I can't find a solution for.
(Also, don't mind the localhost blahblahblah; this has to work on remote websites too, but I trust it will, as this part of the URL should have no bearing on what I want to do.)
Thanks a lot for your time if you can help me see more clearly what I need to do.
If the .htaccess file is located in the document root (ie. effectively at http://localhost/.htaccess) then you would need to do something like the following using mod_rewrite:
RewriteEngine On
RewriteRule ^(blahblahblah/mywebsite)/(\w+)$ $1/index.php?param1=$2 [L]
RewriteRule ^(blahblahblah/mywebsite)/(\w+)/(\w+)$ $1/index.php?param1=$2&param2=$3 [L]
RewriteRule ^(blahblahblah/mywebsite)/(\w+)/(\w+)/(\w+)$ $1/index.php?param1=$2&param2=$3&param3=$4 [L]
# etc.
Where $n is a backreference to the corresponding captured group in the preceding RewriteRule pattern (1st argument).
UPDATE: \w is a shorthand character class that matches a-z, A-Z, 0-9 and _ (underscore).
A new directive is required for every number of parameters. You could combine them into a single (complex) directive but you would have lots of empty parameters when only a few parameters were passed (rather than not passing those parameters at all).
I'm assuming your URLs do not end in a trailing slash.
If, however, the .htaccess file is located in the /blahblahblah/mywebsite directory then the directives could be simplified a bit:
RewriteRule ^(\w+)$ index.php?param1=$1 [L]
RewriteRule ^(\w+)/(\w+)$ index.php?param1=$1&param2=$2 [L]
RewriteRule ^(\w+)/(\w+)/(\w+)$ index.php?param1=$1&param2=$2&param3=$3 [L]
# etc.
Don't use URL parameters (alternative method)
An alternative approach is to not convert the path segments into URL parameters in .htaccess and instead just pass everything to index.php and let your PHP script split the URL into parameters. This allows for any number of parameters.
For example, your .htaccess file then becomes much simpler:
RewriteRule ^\w+(/\w+)*$ index.php [L]
(This assumes the .htaccess file is located in /blahblahblah/mywebsite directory, otherwise you need to add the necessary directory prefix as above.)
The RewriteRule pattern simply validates the request URL is of the form /value1 or /value1/value2 or /value1/value2/value3 etc. And the request is rewritten to index.php (the front-controller) to handle everything.
In index.php you then examine $_SERVER['REQUEST_URI'] and parse the requested URL.
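The splitting step itself is small. Below is a rough sketch of the logic, written in Python purely for illustration (in index.php the same idea would be applied to $_SERVER['REQUEST_URI']). The BASE prefix is taken from the question; the function name path_params is hypothetical.

```python
import re
from urllib.parse import urlsplit

# Directory prefix from the question; adjust to wherever the site lives.
BASE = "/blahblahblah/mywebsite/"
# Same shape as the RewriteRule pattern: value1, value1/value2, ...
VALID = re.compile(r"^\w+(/\w+)*$", re.ASCII)


def path_params(request_uri):
    """Split the part of the URL-path after BASE into a list of values."""
    path = urlsplit(request_uri).path  # drops any ?query=string
    if not path.startswith(BASE):
        return []
    relative = path[len(BASE):]
    if not VALID.match(relative):
        return []
    return relative.split("/")


print(path_params("/blahblahblah/mywebsite/value1/value2/value3"))
# ['value1', 'value2', 'value3']
```

The front controller can then decide what each position means, so adding a fourth or fifth value needs no change to .htaccess.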

Best practice for a .htaccess internal path rewrite?

We have spent a considerable amount of time looking for a solution elsewhere. We have read and tried the recommended threads. We most likely have a core misunderstanding as to why this, or something along these lines, does not work.
We get a request for a domain:
subdomain.domain.com/embed/34acb453bc4a53abc
We want to leave the URL as it is, but need to direct this to an internal vhost:
embed.example.com/34acb453bc4a53abc
Once the request is directed to this, our system can interpret the 34acb453bc4a53abc and return the appropriate data.
We tried the following (and variations of it), but we can't get anything to work.
RewriteCond ^embed\/(.*)$ [NC]
RewriteRule ^ https://embed.example.com%{REQUEST_URI} [L,NE,P]
Regarding the "internal path rewrite" in the title: just to clarify, you can't internally rewrite the request across different hosts. You need to configure a reverse proxy using mod_proxy and related modules. This is what the P flag on the RewriteRule directive is doing... it's passing the request to mod_proxy (provided this is already correctly configured in the server config).
RewriteCond ^embed\/(.*)$ [NC]
RewriteRule ^ https://embed.example.com%{REQUEST_URI} [L,NE,P]
However, this will send the request to https://embed.example.com/embed/34acb453bc4a53abc, not https://embed.example.com/34acb453bc4a53abc as you require.
You need to capture the part of the URL-path after /embed/ and use that instead. You are already capturing this in the RewriteCond directive, but you are not using it. You don't actually need the RewriteCond directive here.
Try the following instead:
RewriteCond %{HTTP_HOST} =subdomain.domain.com
RewriteRule ^embed/([a-z0-9]+)$ https://embed.example.com/$1 [P]
You state that the request is for subdomain.domain.com, so I've included that in the directive.
The L and NE flags are not required here. P implies L and there is nothing that requires the substitution to not be URL encoded. Slashes do not carry any special meaning in the regex, so do not need to be escaped.
I've also made the regex that matches the "code" more restrictive, rather than matching literally anything.
The $1 backreference then contains just the "code" that follows /embed/ in the URL-path.
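As a quick sanity check of the pattern, the following Python sketch mimics what the condition and rule compute: given the host and the URL-path as .htaccess sees it (no leading slash), it returns the URL that would be handed to mod_proxy. The function name proxy_target is made up for the example, and nothing here talks to Apache.

```python
import re

# Same regex as the RewriteRule pattern; no NC flag, so lowercase only.
EMBED = re.compile(r"^embed/([a-z0-9]+)$")


def proxy_target(host, relative_path):
    """Return the proxied URL, or None when the rule would not apply."""
    match = EMBED.match(relative_path)      # the RewriteRule pattern is tested first
    if match is None:
        return None
    if host != "subdomain.domain.com":      # RewriteCond %{HTTP_HOST} =subdomain.domain.com
        return None
    return "https://embed.example.com/" + match.group(1)  # $1


print(proxy_target("subdomain.domain.com", "embed/34acb453bc4a53abc"))
# https://embed.example.com/34acb453bc4a53abc
```

Anything that is not a single lowercase alphanumeric segment after embed/ falls through to whatever directives come next.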
Note that the order of directives is important. It needs to be before any directives that are likely to result in a conflict.
If the embed and subdomain hosts point to the same place on the filesystem then you can avoid the complexities and overhead of mod_proxy and simply "rewrite" the request on the same host.

Multiple RewriteRules for a single URL

I want to use multiple RewriteRules in .htaccess to modify a single URL, but only the last rule gets applied.
Example INPUT (link loaded by a browser):
http://example.com/aaa/foo/bar
.htaccess:
RewriteRule ^(.*)foo/(.*)$ $1nofoo/$2
RewriteRule ^(.*)bar(.*)$ $1nobar$2
EXPECTED OUTPUT (what Apache should actually look at):
http://example.com/aaa/nofoo/nobar
ACTUAL OUTPUT:
http://example.com/aaa/foo/nobar/
As you can see, only the last rule was applied. Is there any way to make it work the way I want? All suggestions are welcome.
PS. I want to avoid creating a static, ugly rule like
^(.*)foo/bar(.*) $1nofoo/nobar$2
I need all the modifications to work independently of each other.
UPDATE
So here is exactly what I am trying to achieve. I have some links to a backend server:
http://myserver.com/api/user/$userid/car/$carid/getSpeedRecordDetails
http://myserver.com/api/user/$userid/getUserDetails
http://myserver.com/api/car/$carid/getCarDetails
Where $userid and $carid are some unique 12-char-long strings.
And I want to transform them to these:
http://myserver.com/api/getSpeedRecordDetails.php?userid=$userid&carid=$carid
http://myserver.com/api/getUserDetails.php?userid=$userid
http://myserver.com/api/getCarDetails.php?carid=$carid
And I want to achieve it using the least RewriteRules possible (I am looking for a dynamic solution).
UPDATE #2
I love the SO community! Your patience and willingness to help is truly amazing :)
So the very reason why I am interested in modifying the URL using multiple RewriteRules is because I expect that my backend might soon need to implement hundreds (if not thousands) of user-friendly URLs, and mapping all of them individually would be a waste of time and money. Therefore, I want to take advantage of the fact that all the user-friendly URLs consist of repetitive chunks that can be easily translated. The calls below represent the general three types of the user-friendly URLs I need to manage. The only difference within each type is $userid, $carid, and XXXX_of_a_thousand_functions.
http://myserver.com/backend-api/user/$userid/car/$carid/first_of_a_thousand_functions.do
http://myserver.com/backend-api/user/$userid/second_of_a_thousand_functions.do
http://myserver.com/backend-api/car/$carid/sixteenth_of_a_thousand_functions.do
All of these calls (and remember, there will be hundreds, or even thousands of them) need to be translated into these:
http://myserver.com/backend-api/first_of_a_thousand_functions.php?USER_ID=$userid&CAR_ID=$carid
http://myserver.com/backend-api/second_of_a_thousand_functions.php?USER_ID=$userid
http://myserver.com/backend-api/sixteenth_of_a_thousand_functions.php?CAR_ID=$carid
Seeing that there is a simple pattern governing the translation (I hope it is also visible to you), I thought I could create some simple rules for translating different 'chunks' of the user-friendly URL into the internal URL. For example:
RewriteRule ^(.*)user/([A-Za-z0-9]+)/(.*)$ $1$3&USER_ID=$2
Would be responsible for translating the piece user/HSGRE8563LOS into &USER_ID=HSGRE8563LOS
And because some calls have more than one 'chunk' to process, I need to be able to use multiple RewriteRules on a single URL, which I hope somewhat correlates with the title of the question :)
UPDATE #3 - a future reference
So there are a couple of things I believe need to be said regarding this question.
Apparently .htaccess DOES by default apply all the rules whose conditions are met. However, some CGI/Fast-CGI installations ruin it, resulting in the kind of behaviour depicted in the first example.
Also, one thing I have NEVER seen mentioned anywhere is that Apache applies the RewriteRules not in the order in which they are listed in .htaccess, but it starts 'scanning' the URL from its beginning, and as soon as the conditions for ANY of the rules are met, the URL gets modified accordingly.
After UPDATE #1: to internally rewrite the stated friendly URLs using mod_rewrite, try the following directives in the root .htaccess file:
RewriteEngine On
# http://myserver.com/api/user/$userid/car/$carid/getSpeedRecordDetails
RewriteRule ^api/user/(\w{12})/car/(\w{12})/(getSpeedRecordDetails)$ /api/$3.php?userid=$1&carid=$2 [L]
# http://myserver.com/api/user/$userid/getUserDetails
RewriteRule ^api/user/(\w{12})/(getUserDetails)$ /api/$2.php?userid=$1 [L]
# http://myserver.com/api/car/$carid/getCarDetails
RewriteRule ^api/car/(\w{12})/(getCarDetails)$ /api/$2.php?carid=$1 [L]
\w{12} matches a 12 char long string of letters (upper/lower), numbers and underscore. However, this should be made as restrictive as possible. eg. if a valid id is only numeric then \d{12} would be preferable.
UPDATE#2 The process is almost the same as above....
RewriteBase /backend-api
# http://myserver.com/backend-api/user/$userid/car/$carid/<any>.do
RewriteRule ^backend-api/user/(\w{12})/car/(\w{12})/(\w+)\.do$ $3.php?USER_ID=$1&CAR_ID=$2 [L]
# http://myserver.com/backend-api/user/$userid/<any>.do
RewriteRule ^backend-api/user/(\w{12})/(\w+)\.do$ $2.php?USER_ID=$1 [L]
# http://myserver.com/backend-api/car/$carid/<any>.do
RewriteRule ^backend-api/car/(\w{12})/(\w+)\.do$ $2.php?CAR_ID=$1 [L]
Note the use of RewriteBase. This allows the /backend-api URL-path prefix to be omitted from the RewriteRule substitution.
If /backend-api exists as a physical directory then these rules could instead go in /backend-api/.htaccess. Then you could remove the RewriteBase directive and modify the RewriteRule pattern by removing the backend-api/ portion from near the start of the pattern.
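To see that three generic rules really cover an arbitrary number of function names, here is a small Python model of the UPDATE#2 directives. It is a sketch, not Apache: the RULES table and the rewrite function are illustrative names, the /backend-api/ prefix that RewriteBase would add is written out in the templates, and the second car id is invented for the example.

```python
import re

# (pattern, template) pairs in the same order as the RewriteRule directives.
# \g<n> in the template plays the role of $n in the substitution.
RULES = [
    (re.compile(r"^backend-api/user/(\w{12})/car/(\w{12})/(\w+)\.do$", re.ASCII),
     r"/backend-api/\g<3>.php?USER_ID=\g<1>&CAR_ID=\g<2>"),
    (re.compile(r"^backend-api/user/(\w{12})/(\w+)\.do$", re.ASCII),
     r"/backend-api/\g<2>.php?USER_ID=\g<1>"),
    (re.compile(r"^backend-api/car/(\w{12})/(\w+)\.do$", re.ASCII),
     r"/backend-api/\g<2>.php?CAR_ID=\g<1>"),
]


def rewrite(relative_path):
    """Apply the first matching rule (each rule has the L flag)."""
    for pattern, template in RULES:
        match = pattern.match(relative_path)
        if match:
            return match.expand(template)
    return None


print(rewrite("backend-api/user/HSGRE8563LOS/car/ABCDEF123456/first_of_a_thousand_functions.do"))
# /backend-api/first_of_a_thousand_functions.php?USER_ID=HSGRE8563LOS&CAR_ID=ABCDEF123456
```

The function name is captured rather than listed, which is what keeps the rule count at three regardless of how many .do endpoints exist.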
UPDATE#3 - a future reference
Apparently .htaccess DOES by default apply all the rules whose conditions are met. However, some CGI/Fast-CGI installations ruin it, resulting in the kind of behaviour depicted in the first example.
There are certainly some server (mis)configurations that appear to affect certain aspects of .htaccess/mod_rewrite, however, not in the way suggested earlier in your question and I have never encountered this directly myself. And this should have nothing to do with CGI/Fast-CGI. (?)
...Apache applies the RewriteRules not in the order in which they are listed in .htaccess, ...
Not sure exactly what you mean by this, but RewriteRules are processed "in the order in which they are listed in .htaccess" - this is fundamental to how mod_rewrite works. (However, different modules execute at different times, regardless of the order in .htaccess, but within each module the directives execute top-down, in order. eg. mod_rewrite executes before mod_alias (usually), so if you have a mod_alias Redirect before a mod_rewrite RewriteRule, the RewriteRule is still processed first - which is why it is a bad idea to mix redirects from both modules, as you can end up with confusing conflicts.)
...but it starts 'scanning' the URL from its beginning, and as soon as the conditions for ANY of the rules are met, the URL gets modified accordingly.
Exactly, in the order in which they appear in the file.
Note that it's the RewriteRule directives that are scanned (top - down), not the conditions (ie. RewriteCond directives) - in case that is what you were implying. If the RewriteRule pattern matches the URL-path then the preceding RewriteCond directives are processed and if all these pass then the substitution occurs. (Which is why it is always more efficient to match what you can with the RewriteRule and not rely totally on the RewriteCond directives - the RewriteRule is processed first, not the RewriteCond directives.)
Crucially (and this might be where you are tripping up?), if you have multiple RewriteRule directives then each following RewriteRule matches against the output/substitution of the previously matched RewriteRule (if any), not against the URL-path of the initial request. Only the first matched RewriteRule matches against the URL-path of the request. So, yes, RewriteRule directives do chain together.
The exception to this is the L (LAST) flag on the RewriteRule. This "breaks" the chain. Although not completely... it causes the current round of processing to stop, but then it all starts again from the top! Only when the URL passes through unchanged is processing finished. (However, in Apache 2.4 you do have the END flag - this does indeed halt processing completely.)
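The chaining described above can be modelled in a few lines. This Python sketch is a toy model only (no flags, no conditions, no loop limits), using the two rules from the question; the function name one_pass is made up. Each rule is tried in file order against the output of the previous one.

```python
import re

# The two rules from the question, in file order.
# \g<n> in the template plays the role of $n in the substitution.
RULES = [
    (re.compile(r"^(.*)foo/(.*)$"), r"\g<1>nofoo/\g<2>"),
    (re.compile(r"^(.*)bar(.*)$"), r"\g<1>nobar\g<2>"),
]


def one_pass(path):
    """Run every rule once, top down, feeding each result to the next rule."""
    for pattern, template in RULES:
        match = pattern.match(path)
        if match:
            path = match.expand(template)
    return path


first = one_pass("aaa/foo/bar")
print(first)            # aaa/nofoo/nobar
print(one_pass(first))  # aaa/nonofoo/nonobar
```

The first pass gives the expected aaa/nofoo/nobar. The second call shows why the restart matters in .htaccess: the rewritten path still contains foo/ and bar, so these particular patterns match their own output again on the next round.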

Apache mod_rewrite not persisting the name

I usually put my mod_rewrite conditions in an .htaccess file, but this is a case where it must go into the httpd.conf file.
I am confused because what I want to do seems simple:
The root of the site is a nested directory: mydomain.com/foo/bar/
It just has to be that way.
I want to write a rule so a person can enter:
mydomain.com/simple and it will show content from mydomain.com/foo/bar
Also, if a person clicks around the site, I want the mydomain.com/simple/some-other-page structure to persist.
The closest I've gotten is this:
<IfModule mod_rewrite.c>
RewriteEngine on
RewriteRule ^simple$ /foo/bar/$1 [PT]
</IfModule>
However, using this rule, when a person types mydomain.com/simple it rewrites the URI in the browser to mydomain.com/foo/bar
What am I doing wrong?
Thanks in advance.
First, there may be a problem with this rule:
RewriteRule ^simple$ /foo/bar/$1 [PT]
The expression ^simple$ will probably never match in server or virtual host context (httpd.conf), since the URL-path tested there starts with a /.
You're using $1 in the right-hand side of the rule, but there are no match groups in the left-hand side that will populate this. This means that a request for /simple would get you /foo/bar/, but a request for /simple/somethingelse wouldn't match the rule. If this isn't the behavior you want, you probably mean this:
RewriteRule ^/simple(.*)$ /foo/bar$1 [PT]
(Note that I've added the missing leading / here as well).
With these changes in place, this rule behaves on my system as I think you're expecting.
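A quick way to check which paths the corrected pattern maps where is to replay it outside Apache. The Python sketch below does that; map_path and the STRICT variant are illustrative additions and not part of the rule above.

```python
import re

LOOSE = re.compile(r"^/simple(.*)$")      # pattern from the corrected rule
STRICT = re.compile(r"^/simple(/.*)?$")   # hypothetical tighter variant


def map_path(path, pattern=LOOSE):
    """Return the internal path the rule would produce, or the path unchanged."""
    match = pattern.match(path)
    if match is None:
        return path
    return "/foo/bar" + (match.group(1) or "")


print(map_path("/simple"))                  # /foo/bar
print(map_path("/simple/some-other-page"))  # /foo/bar/some-other-page
print(map_path("/simpleton"))               # /foo/barton
print(map_path("/simpleton", STRICT))       # /simpleton
```

The third line shows that (.*) also swallows paths that merely start with /simple. If that matters on your site, requiring a slash (or nothing) after /simple, as in the STRICT variant, avoids it.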
Lastly, turning on the RewriteLog and setting RewriteLogLevel (assuming a pre-2.4 version of Apache; in 2.4 these were replaced by per-module logging, e.g. LogLevel alert rewrite:trace3) will help expose the details of exactly what's happening.

URL rewriting with mod_rewrite to provide RESTful URLs

The web server is Apache. I want to rewrite URL so a user won't know the actual directory. For example:
The original URL:
www.mydomainname.com/en/piecework/piecework.php?piecework_id=11
Expected URL:
piecework.mydomainname.com/en/11
I added the following statements in .htaccess:
RewriteCond %{HTTP_HOST} ^(?!www)([^.]+)\.mydomainname\.com$ [NC]
RewriteRule ^(w+)/(\d+)$ /$1/%1/%1.php?%1_id=$2 [L]
Of course I replaced mydomainname with my domain name.
.htaccess is placed in the site root, but when I access piecework.mydomainname.com/en/11, I get "Object not found".
I also tried the following statement in .htaccess:
RewriteRule ^/(.*)/en/piecework/(.*)piecework_id=([0-9]+)(.*) piecework.mydomainname.com/en/$3
What's wrong?
Try using RewriteLog in your vhost or server-conf in order to check the rewriting process. Right now you just seem to guess what mod_rewrite does.
By using RewriteLogLevel you can modify the extent of the logging. For starters I'd recommend level 5.
PS: Don't forget to reload/restart the server after modifying the config.
Here's a quick overview of what's happening:
RewriteCond %{HTTP_HOST} ^(?!www)([^.]+)\.mydomainname\.com$ [NC]
The (?!www) at the start is a negative lookahead: it asserts that the hostname does not begin with www, and it captures nothing.
([^.]+) is the only capturing group. It matches one or more characters that are not a dot, i.e. the subdomain, and is what %1 refers to in the RewriteRule.
Then it requires '.mydomainname.com' after the subdomain.
So this one condition does two jobs at once: excluding www and capturing the subdomain.
I'm not sure exactly how you're trying to set up your structure, but here is how I would write it based on your original and expected URL's:
RewriteCond %{HTTP_HOST} !^www\.mydomainname\.com$ [NC]
RewriteCond %{HTTP_HOST} ^(\w+)\.mydomainname\.com$ [NC]
RewriteRule ^(\w+)/(\d+)$ /$1/%1/%1.php?%1_id=$2 [L]
Basically, the first condition makes sure the host is not www.mydomainname.com (it's easier to read as a separate condition). The second condition captures whatever word is in front of your domain name. The RewriteRule then internally rewrites the request to the real script.
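If you want to check the regexes without touching the server, the following Python sketch replays the two conditions and the rule. It is only a model: the function name rewrite is made up, and Python's re module stands in for Apache's regex engine.

```python
import re

RULE = re.compile(r"^(\w+)/(\d+)$", re.ASCII)                  # RewriteRule pattern
IS_WWW = re.compile(r"^www\.mydomainname\.com$", re.IGNORECASE)
SUBDOMAIN = re.compile(r"^(\w+)\.mydomainname\.com$", re.IGNORECASE | re.ASCII)


def rewrite(host, relative_path):
    """Return the internal target, or None when the rule would not apply."""
    rule = RULE.match(relative_path)   # the RewriteRule pattern is checked first
    if rule is None:
        return None
    if IS_WWW.match(host):             # first RewriteCond is negated with "!"
        return None
    cond = SUBDOMAIN.match(host)       # second RewriteCond captures %1
    if cond is None:
        return None
    sub = cond.group(1)
    lang, item_id = rule.groups()      # $1 and $2
    return f"/{lang}/{sub}/{sub}.php?{sub}_id={item_id}"


print(rewrite("piecework.mydomainname.com", "en/11"))
# /en/piecework/piecework.php?piecework_id=11

# A pattern written as (w+) matches only literal "w" characters, so "en" fails:
print(re.match(r"^(w+)/(\d+)$", "en/11"))  # None
```

The last line is worth noting: a RewriteRule pattern written as ^(w+)/(\d+)$ can never match en/11, which would be enough on its own to produce a "not found" response.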
Get rid of the last .htaccess line there in your question...
The backslashes in \w and \d are required: without one, (w+) only matches literal w characters, so the (w+) in your RewriteRule can never match en.
You are doing it backwards. The idea is that you give people the friendly address, and the rewrite rule maps requests for this friendly, non-existent page to the real page without them seeing it. Right now your rule only handles what to do when they go to the ugly URL, but you are entering the friendly URL. Since no rule exists for when people enter the friendly URL directly, Apache looks for it and says "Object not found".
So add a line:
RewriteRule piecework.mydomainname.com/en/(*.) ^/$3/en/piecework/$3?piecework_id=([0-9]+)(.*)
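Sorry, that's not quite right (a RewriteRule pattern only sees the URL-path, never the hostname, and regex syntax belongs in the pattern, not the substitution), but the basic idea is: if they put in the URL you like, Apache is ready to rewrite to the real page without the browser seeing it.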
Update
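I'm way too sleepy to do regex correctly, so I had just tried my best to move your example around, sorry. I would try something simpler first, just to get the basic concept down. Try this hard-coded version, which tests the hostname in a RewriteCond (a RewriteRule pattern only sees the URL-path, not the hostname or the query string) and maps the friendly URL to the real one:
RewriteCond %{HTTP_HOST} ^piecework\.mydomainname\.com$ [NC]
RewriteRule ^en/11$ /en/piecework/piecework.php?piecework_id=11 [L]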
At the very least, it will be easier to see what isn't working if you get errors.