Pretty URLs rewrite sometimes with "?" sometimes without - apache

I hope someone can answer "why" this is the case:
There are times I can use:
...
RewriteRule ^(.*)$ index.php/$1 [L]
and then there are times where the above doesn't work and I must use:
...
RewriteRule ^(.*)$ index.php?/$1 [L]
the main difference being the addition of ? ... I typically see this happen on different system setups, fastcgi vs module vs cgi, but haven't done enough setups to see the pattern.
I am guessing that it is related to how the apache/setup parses path/path_info data. Any thoughts are welcomed, ideally I'd like to have a solid explanation of why this is and when it occurs.
On the same thread ... Sometimes Apache does not output PATH_INFO environment var which might be the root cause of this, but I wonder why that is.

The ? is the marker of the beginning of the query string.
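So basically your first rule rewrites a URL "x" to the path index.php/x, which reads as a file "x" inside a directory named index.php (in practice Apache serves index.php and hands it "/x" as path info, when path info is accepted), while the second rewrites a URL "x" to the index.php file with the query string "/x". (BTW the file can read a value with no name from the raw query string, e.g. $_SERVER['QUERY_STRING'] in PHP; usually you use ?var=value&var2=value2 etc.)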
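To make the difference between the two rewrite targets concrete, here is a minimal Python sketch of how each target splits into script, path info and query string. It is only a model, not Apache's own logic: the function name split_target is made up for this example, and it assumes the server accepts path info after index.php.

```python
from urllib.parse import urlsplit


def split_target(target, script="index.php"):
    """Split a rewrite target into script, path info and query string.

    Hypothetical helper: assumes the target starts with `script` and that
    the server accepts trailing path info after the script name.
    """
    parts = urlsplit(target)
    path = parts.path
    # Anything after the script name in the path is the path info.
    path_info = path[len(script):] if path.startswith(script) else ""
    return {
        "script": script,
        "path_info": path_info,
        "query_string": parts.query,
    }


# Target produced by: RewriteRule ^(.*)$ index.php/$1 [L]
with_path_info = split_target("index.php/users/42")
# Target produced by: RewriteRule ^(.*)$ index.php?/$1 [L]
with_query = split_target("index.php?/users/42")
```

With the first form the value arrives as path info (PATH_INFO), with the second as the query string (QUERY_STRING), which is why a script that reads only one of the two works on some setups and not on others.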

Related

.htaccess : Pretty URL with whatever number+names of parameters

Hello !
I know there are already a lot of topics about URL rewriting and I honestly swear I've spent a lot of time trying to apply them to my problem, but I can't see any of them perfectly applying to my situation (if you find otherwise, please give the link).
-----
Here's the problem :
I'm learning the MVC model and URL rewriting, and my URLs look like this:
http://localhost/blahblahblah/mywebsite/index.php?param1=value1&param2=value2&param3=value3 ... etc ...
What I want (for some MVC template goals) is to have this kind of URL :
http://localhost/blahblahblah/mywebsite/value1/value2/value3 ... etc ...
-----
Whatever the names of the parameters and whatever the values.
This is the most essential thing I can't find a solution for.
(Also don't mind the localhost blahblahblah; this has to work even on remote websites, but I trust it will work fine online, as this part of the URL should have no importance for what I want to do.)
Thanks a lot for your time if you can help me see more clearly what I need to do.
If the .htaccess file is located in the document root (ie. effectively at http://localhost/.htaccess) then you would need to do something like the following using mod_rewrite:
RewriteEngine On
RewriteRule ^(blahblahblah/mywebsite)/(\w+)$ $1/index.php?param1=$2 [L]
RewriteRule ^(blahblahblah/mywebsite)/(\w+)/(\w+)$ $1/index.php?param1=$2&param2=$3 [L]
RewriteRule ^(blahblahblah/mywebsite)/(\w+)/(\w+)/(\w+)$ $1/index.php?param1=$2&param2=$3&param3=$4 [L]
# etc.
Where $n is a backreference to the corresponding captured group in the preceding RewriteRule pattern (1st argument).
UPDATE: \w is a shorthand character class that matches a-z, A-Z, 0-9 and _ (underscore).
A new directive is required for every number of parameters. You could combine them into a single (complex) directive but you would have lots of empty parameters when only a few parameters were passed (rather than not passing those parameters at all).
I'm assuming your URLs do not end in a trailing slash.
If, however, the .htaccess file is located in the /blahblahblah/mywebsite directory then the directives could be simplified a bit:
RewriteRule ^(\w+)$ index.php?param1=$1 [L]
RewriteRule ^(\w+)/(\w+)$ index.php?param1=$1&param2=$2 [L]
RewriteRule ^(\w+)/(\w+)/(\w+)$ index.php?param1=$1&param2=$2&param3=$3 [L]
# etc.
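Since a new directive is needed for every number of parameters, the repetitive lines can be generated rather than typed. The following is a minimal Python sketch, assuming the .htaccess file sits in the /blahblahblah/mywebsite directory and the parameters are named param1, param2, etc. as above; the function name rewrite_rule_for is made up for this example.

```python
def rewrite_rule_for(n_params, script="index.php"):
    """Build the RewriteRule line for exactly n_params path segments.

    Hypothetical helper. mod_rewrite backreferences only go up to $9,
    so this sketch refuses anything outside 1..9.
    """
    if not 1 <= n_params <= 9:
        raise ValueError("n_params must be between 1 and 9")
    # One (\w+) capture group per path segment, separated by slashes.
    pattern = "^" + "/".join([r"(\w+)"] * n_params) + "$"
    # param1=$1&param2=$2 ... pairing each name with its backreference.
    query = "&".join(f"param{i}=${i}" for i in range(1, n_params + 1))
    return f"RewriteRule {pattern} {script}?{query} [L]"


# The lines to paste into .htaccess for one to three parameters.
lines = [rewrite_rule_for(n) for n in range(1, 4)]
```

The limit of nine parameters comes from the backreferences themselves, which is another reason to consider the front-controller approach when the number of segments is open-ended.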
Don't use URL parameters (alternative method)
An alternative approach is to not convert the path segments into URL parameters in .htaccess and instead just pass everything to index.php and let your PHP script split the URL into parameters. This allows for any number of parameters.
For example, your .htaccess file then becomes rather simpler:
RewriteRule ^\w+(/\w+)*$ index.php [L]
(This assumes the .htaccess file is located in /blahblahblah/mywebsite directory, otherwise you need to add the necessary directory prefix as above.)
The RewriteRule pattern simply validates the request URL is of the form /value1 or /value1/value2 or /value1/value2/value3 etc. And the request is rewritten to index.php (the front-controller) to handle everything.
In index.php you then examine $_SERVER['REQUEST_URI'] and parse the requested URL.
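The parsing step itself would live in index.php; as a language-neutral illustration, here is a minimal Python sketch of the same logic: drop the query string, strip the directory prefix, and split what is left on slashes. The function name split_path_params and the default base directory are assumptions made for this example.

```python
from urllib.parse import urlsplit, unquote


def split_path_params(request_uri, base="/blahblahblah/mywebsite"):
    """Return the path segments that follow `base` in a request URI.

    Hypothetical helper mirroring what the front controller would do
    with the value of REQUEST_URI.
    """
    path = urlsplit(request_uri).path  # drops any ?query part
    if path == base or path.startswith(base + "/"):
        path = path[len(base):]
    # Ignore empty segments (leading or trailing slash) and decode %XX.
    return [unquote(segment) for segment in path.split("/") if segment]


params = split_path_params("/blahblahblah/mywebsite/value1/value2/value3")
```

The script can then map the list positions to whatever parameter names the MVC router expects, so no change to .htaccess is needed when the number of segments grows.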

Apache Mod_Rewrite Question Mark

I need to redirect an incoming request with the following URL:
http://mywebsite.com/abc/mapserv.exe?map=123
to
http://mywebsite.com/abc/mapserv.exe?map=C:\Mapserver\ms4w\Apache\htdocs\Mapfiles\123.map
I already managed to do simple mod_rewrites but the question mark is killing this one all the time. I am not able to adapt common Query String examples to my case so I need help with this exact case.
Although you did not show your attempt, you could try this:
RewriteEngine On
RewriteCond %{QUERY_STRING} map=([0-9]+)$
RewriteRule . %{REQUEST_URI}?map=C:\\Mapserver\\ms4w\\Apache\\htdocs\\Mapfiles\\%1.map [NE,L]
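Rewrite flags used:
NE: noescape, so special characters in the substitution are not percent-encoded in the result,
L: last, so no further rules are processed in this pass.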
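To see why this rule does not keep rewriting its own output, here is a minimal Python sketch that models only the RewriteCond regex and the new query string (it does not model Apache's escaping or the rest of the request). The function name rewrite_map_query is made up for this example.

```python
import re

# Same regex as the RewriteCond above: digits at the very end of the query.
MAP_CONDITION = re.compile(r"map=([0-9]+)$")
MAPFILE_DIR = "C:\\Mapserver\\ms4w\\Apache\\htdocs\\Mapfiles\\"


def rewrite_map_query(query_string):
    """Return the rewritten query string, or None if the condition fails."""
    match = MAP_CONDITION.search(query_string)
    if match is None:
        return None
    # %1 in the RewriteRule is the first group captured by the RewriteCond.
    return "map=" + MAPFILE_DIR + match.group(1) + ".map"


first_pass = rewrite_map_query("map=123")
second_pass = rewrite_map_query(first_pass)  # condition no longer matches
```

After the first pass the query string ends in .map rather than in digits, so the condition fails on the next pass and the rewriting stops.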
I was still having trouble with the .exe URL since it is not accessible if you don't deliver the parameters right when you send the request, and then the redirect won't fire. So I made a dummy mapserver.php file which allows setting a parameter like so:
http://mywebsite.com/abc/mapserver.php?map=123
After hours of trying I ended up with the following RewriteRule:
RewriteCond %{QUERY_STRING} ^map=(.*)$
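# The pattern is matched against the URL-path only (never the query string),
# so the dot is escaped and no ? is needed in it.
RewriteRule ^mapserver\.php$ /cgi-bin/mapserv.exe?map=C://Mapserver//ms4w//Apache//htdocs//Mapfiles//%1.map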

Multiple RewriteRules for a single URL

I want to use multiple RewriteRules in .htaccess to modify a single URL, but only the last rule gets applied.
Example INPUT (link loaded by a browser):
http://example.com/aaa/foo/bar
.htaccess:
RewriteRule ^(.*)foo/(.*)$ $1nofoo/$2
RewriteRule ^(.*)bar(.*)$ $1nobar$2
EXPECTED OUTPUT (what Apache should actually look at):
http://example.com/aaa/nofoo/nobar
ACTUAL OUTPUT:
http://example.com/aaa/foo/nobar/
As you can see, only the last rule was applied. Is there any way to make it work the way I want? All suggestions are welcome.
PS. I want to avoid creating a static, ugly rule like
^(.*)foo/bar(.*) $1nofoo/nobar$2
I need all the modifications to work independently of each other.
UPDATE
So here is exactly what I am trying to achieve. I have some links to a backend server:
http://myserver.com/api/user/$userid/car/$carid/getSpeedRecordDetails
http://myserver.com/api/user/$userid/getUserDetails
http://myserver.com/api/car/$carid/getCarDetails
Where $userid and $carid are some unique 12-char-long strings.
And I want to transform them to these:
http://myserver.com/api/getSpeedRecordDetails.php?userid=$userid&carid=$carid
http://myserver.com/api/getUserDetails.php?userid=$userid
http://myserver.com/api/getCarDetails.php?carid=$carid
And I want to achieve it using the fewest RewriteRules possible (I am looking for a dynamic solution).
UPDATE #2
I love the SO community! Your patience and willingness to help are truly amazing :)
So the very reason why I am interested in modifying the URL using multiple RewriteRules is because I expect that my backend might soon need to implement hundreds (if not thousands) of user-friendly URLs, and mapping all of them individually would be a waste of time and money. Therefore, I want to take advantage of the fact that all the user-friendly URLs consist of repetitive chunks that can be easily translated. The calls below represent the general three types of the user-friendly URLs I need to manage. The only difference within each type is $userid, $carid, and XXXX_of_a_thousand_functions.
http://myserver.com/backend-api/user/$userid/car/$carid/first_of_a_thousand_functions.do
http://myserver.com/backend-api/user/$userid/second_of_a_thousand_functions.do
http://myserver.com/backend-api/car/$carid/sixteenth_of_a_thousand_functions.do
All of these calls (and remember, there will be hundreds, or even thousands of them) need to be translated into these:
http://myserver.com/backend-api/first_of_a_thousand_functions.php?USER_ID=$userid&CAR_ID=$carid
http://myserver.com/backend-api/second_of_a_thousand_functions.php?USER_ID=$userid
http://myserver.com/backend-api/sixteenth_of_a_thousand_functions.php?CAR_ID=$carid
Seeing that there is a simple pattern governing the translation (I hope it is also visible to you), I thought I could create some simple rules for translating different 'chunks' of the user-friendly URL into the internal URL. For example:
RewriteRule ^(.*)user/([A-Za-z0-9]+)/(.*)$ $1$3&USER_ID=$2
Would be responsible for translating the piece user/HSGRE8563LOS into &USER_ID=HSGRE8563LOS
And because some calls have more than one 'chunk' to process, I need to be able to use multiple RewriteRules on a single URL, which I hope somewhat correlates with the title of the question :)
UPDATE #3 - a future reference
So there are a couple of things I believe need to be said regarding this question.
Apparently .htaccess DOES by default apply all the rules whose conditions are met. However, some CGI/Fast-CGI installations ruin it, resulting in the kind of behaviour depicted in the first example.
Also, one thing I have NEVER seen mentioned anywhere is that Apache applies the RewriteRules not in the order in which they are listed in .htaccess, but it starts 'scanning' the URL from its beginning, and as soon as the conditions for ANY of the rules are met, the URL gets modified accordingly.
After UPDATE#1: to internally rewrite the stated friendly URLs using mod_rewrite, try the following directives in the root .htaccess file:
RewriteEngine On
# http://myserver.com/api/user/$userid/car/$carid/getSpeedRecordDetails
RewriteRule ^api/user/(\w{12})/car/(\w{12})/(getSpeedRecordDetails)$ /api/$3.php?userid=$1&carid=$2 [L]
# http://myserver.com/api/user/$userid/getUserDetails
RewriteRule ^api/user/(\w{12})/(getUserDetails)$ /api/$2.php?userid=$1 [L]
# http://myserver.com/api/car/$carid/getCarDetails
RewriteRule ^api/car/(\w{12})/(getCarDetails)$ /api/$2.php?carid=$1 [L]
\w{12} matches a 12 char long string of letters (upper/lower), numbers and underscore. However, this should be made as restrictive as possible. eg. if a valid id is only numeric then \d{12} would be preferable.
UPDATE#2 The process is almost the same as above....
RewriteBase /backend-api
# http://myserver.com/backend-api/user/$userid/car/$carid/<any>.do
RewriteRule ^backend-api/user/(\w{12})/car/(\w{12})/(\w+)\.do$ $3.php?USER_ID=$1&CAR_ID=$2 [L]
# http://myserver.com/backend-api/user/$userid/<any>.do
RewriteRule ^backend-api/user/(\w{12})/(\w+)\.do$ $2.php?USER_ID=$1 [L]
# http://myserver.com/backend-api/car/$carid/<any>.do
RewriteRule ^backend-api/car/(\w{12})/(\w+)\.do$ $2.php?CAR_ID=$1 [L]
Note the use of RewriteBase. This allows the URL-path to be removed from the RewriteRule substitution.
If /backend-api exists as a physical directory then these rules could instead go in /backend-api/.htaccess. Then you could remove the RewriteBase directive and modify the RewriteRule pattern by removing the backend-api/ portion from near the start of the pattern.
UPDATE#3 - a future reference
Apparently .htaccess DOES by default apply all the rules whose conditions are met. However, some CGI/Fast-CGI installations ruin it, resulting in the kind of behaviour depicted in the first example.
There are certainly some server (mis)configurations that appear to affect certain aspects of .htaccess/mod_rewrite, but not in the way suggested earlier in your question, and I have never encountered this directly myself. And this should have nothing to do with CGI/Fast-CGI. (?)
...Apache applies the RewriteRules not in the order in which they are listed in .htaccess, ...
Not sure exactly what you mean by this, but RewriteRules are processed "in the order in which they are listed in .htaccess" - this is fundamental to how mod_rewrite works. (However, different modules execute at different times, regardless of the order in .htaccess, but within each module the directives execute top-down, in order. eg. mod_rewrite executes before mod_alias (usually), so if you have a mod_alias Redirect before a mod_rewrite RewriteRule, the RewriteRule is still processed first - which is why it is a bad idea to mix redirects from both modules, as you can end up with confusing conflicts.)
...but it starts 'scanning' the URL from its beginning, and as soon as the conditions for ANY of the rules are met, the URL gets modified accordingly.
Exactly, in the order in which they appear in the file.
Note that it's the RewriteRule directives that are scanned (top - down), not the conditions (ie. RewriteCond directives) - in case that is what you were implying. If the RewriteRule pattern matches the URL-path then the preceding RewriteCond directives are processed and if all these pass then the substitution occurs. (Which is why it is always more efficient to match what you can with the RewriteRule and not rely totally on the RewriteCond directives - the RewriteRule is processed first, not the RewriteCond directives.)
Crucially (and this might be where you are tripping up?), if you have multiple RewriteRule directives then each following RewriteRule matches against the output/substitution of the previously matched RewriteRule (if any), not against the URL-path of the initial request. Only the first matched RewriteRule matches against the URL-path of the request. So, yes, RewriteRule directives do chain together.
The exception to this is the L (LAST) flag on the RewriteRule. This "breaks" the chain. Although not completely... it causes the current round of processing to stop, but then it all starts again from the top! Only when the URL passes through unchanged is processing finished. (However, in Apache 2.4 you do have the END flag - this does indeed halt processing completely.)
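As a rough illustration of the chaining described above, here is a minimal Python sketch (a model, not mod_rewrite itself) that applies rules top-down, feeding the output of each matched rule into the next one. The function name apply_rules and the tuple layout are made up for this example. It models a single pass only, so it does not reproduce the restart that follows L in .htaccess, and it assumes substitutions contain no backslashes.

```python
import re


def apply_rules(url_path, rules):
    """Apply (pattern, substitution, flags) rules top-down in one pass.

    Hypothetical model: each rule is matched against the output of the
    previously matched rule, and a matching rule replaces the whole path.
    """
    for pattern, substitution, flags in rules:
        match = re.search(pattern, url_path)
        if match is None:
            continue
        # Turn mod_rewrite style $1, $2 ... into Python's \g<1>, \g<2> ...
        template = re.sub(r"\$(\d)", r"\\g<\1>", substitution)
        url_path = match.expand(template)
        if "L" in flags:
            break  # stop this pass; later rules are not tried
    return url_path


chained = [
    (r"^(.*)foo/(.*)$", "$1nofoo/$2", set()),
    (r"^(.*)bar(.*)$", "$1nobar$2", set()),
]
result = apply_rules("aaa/foo/bar", chained)  # "aaa/nofoo/nobar"

with_last = [
    (r"^(.*)foo/(.*)$", "$1nofoo/$2", {"L"}),
    (r"^(.*)bar(.*)$", "$1nobar$2", set()),
]
stopped = apply_rules("aaa/foo/bar", with_last)  # "aaa/nofoo/bar"
```

In this model the two rules from the question do produce aaa/nofoo/nobar, which matches the point that rules chain by default and that a different result points to something else in the setup.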

Apache rewrite rule leading slash

Leading slash first argument: ignored?
What's the syntax difference between
RewriteRule help help.php?q=noslash [L] #1
RewriteRule /help help.php?q=withslash [L] #2
If I hit http://localhost/help, it goes to #1, if I hit http://localhost//help it still goes to #1.
Am I right in saying the leading slash in the first argument to RewriteRule is essentially ignored?
Leading slash second argument: error?
Also, why doesn't this rewrite rule work?
RewriteRule help /help.php [L] #1
Putting a leading slash in front of the second arg actually creates a 500 error for the server. Why?
I should note I'm using a .htaccess file to write these rules in
Strangely enough,
RewriteRule ^/help help.php?q=2 [L]
The above rule fails and never matches.
This rule:
RewriteRule ^help help.php?q=1 [L]
Matches http://localhost/help, http://localhost//help and http://localhost///help
It appears RewriteRule never sees leading slashes of the path, and as TheCoolah said they are collapsed (to 0.. when using a .htaccess file anyway) no matter how many there are.
For the second part of the question,
RewriteRule ^help /help.php
I'm getting the answer from Definitive Guide to Apache Mod_rewrite
... a rewrite target that does not begin with http:// or another protocol designator is assumed to be a file system path. File paths that do not begin with a slash are interpreted as being relative to the directory in which the rewriting is taking place.
So /help.php looks in the root of the system for a file called help.php, which on my system it cannot find.
To make /help.php appear as a relative URL (relative to the root of the site) you can use the [PT] flag:
RewriteRule ^help$ /help.php [PT]
That directs http://localhost/help to http://localhost/help.php.
Regarding double slashes: Most Web servers silently collapse multiple slashes into a single slash early in the request processing pipeline. This is true for at least Apache, Tomcat and Jetty. Most Unix-based file systems work the same way. If you really want to check for this, you need to do something like:
RewriteCond %{REQUEST_URI} ^(.*)//(.*)$
help matches "help" anywhere in the path.
/help matches nothing since the rewriterule directive omits the leading slash for matching purposes (i.e., you must use ^, not / or ^/, to reference the current directory).
(This can be very confusing if you've used %{REQUEST_URI} in rewritecond because %{REQUEST_URI} does begin with a leading slash. When matching against %{REQUEST_URI}, ^ and ^/ are equivalent and a directory name will always be preceded by a slash character regardless of whether or not it is in the top-level directory.)
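The matching behaviour described in the points above can be checked with a rough Python model. This is only a sketch: it assumes a .htaccess file in the document root, and it approximates the per-directory stripping by removing all leading slashes before matching. The function name htaccess_rule_matches is made up for this example.

```python
import re


def htaccess_rule_matches(pattern, request_path):
    """Rough model of RewriteRule matching in a document-root .htaccess.

    Hypothetical helper: the leading slash(es) are removed before the
    pattern is tried, and the pattern may match anywhere in the path.
    """
    return re.search(pattern, request_path.lstrip("/")) is not None


unanchored = htaccess_rule_matches("help", "/foo/helpdesk")   # True
with_slash = htaccess_rule_matches("^/help", "/help")         # False
double_slash = htaccess_rule_matches("^help", "//help")       # True
exact_only = htaccess_rule_matches("^help$", "/help.php")     # False
```

The last case shows why ^help$ is the safer pattern: it no longer matches help.php, so the rule cannot fire again on its own output.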
The server error is caused by an infinite loop. "help" becomes "/help.php" which is then matched by the same directive that did the rewriting. So, after the first match, "/help.php" becomes "/help.php" infinitely resulting in a URL that can't be resolved.
I believe such loops can be fixed with the END flag (i.e., [END]), but that flag requires Apache 2.3.9+ whereas Apache 2.2 seems to be more common in deployment. It'd probably be better to just fix the regular expression anyway; ^help$ would seem to be the better choice here.
The way RewriteRule works is that if the given regular expression matches any part of the path part of the URL (the part after the host and port but before the query string), then the entire path part is completely replaced with the given substitution. This explains the behaviour you're seeing in the first part of your question.
I'm not sure what could be causing the 500 errors on the second part; maybe the collapsing of doubled slashes doesn't happen after the rewrite engine has run and then generates a server error.
The reason for the 500 error is an infinite loop:
help gets rewritten to /help.php
/help.php gets stripped to help.php
help.php still matches the pattern and gets rewritten to /help.php
etc. until the internal redirect limit (LimitInternalRecursion, 10 by default) is hit -> 500
Whereas if the rule rewrites help to help, Apache is smart enough to abort rewriting at that point.

How to write this URL rewrite rule?

Using LAMP, is it possible to write rewrite rules to redirect URLs like the following?
http://example.com/topic/142 -> http://example.com/static/14/142.html
--Edit--
The rule is to get ID's first 2 numbers as folder name, then ID.html.
Try this rule:
RewriteEngine on
RewriteRule ^topic/(([0-9]{2})[0-9]*)$ static/$2/$1.html
Is it possible? Yes, surely.
RewriteRule /topic/(.+) /static/14/$1.html
However, this will give you the /14/ part every single time. As long as you don't have a hint where this part is encoded in your original URL, there is no way to change this.
RewriteEngine on
RewriteRule ^(([0-9]{1,2})[0-9]*)$ /$2/$1.html
Greedy matching means that the first selector will pick up two characters if they are available.
However, I'm not sure that your rule makes much sense, as pages 14, 140-149 and 1400-1499 will be in the same directory. Might it make more sense to put 0-99, 100-199, etc in the same directory?
RewriteEngine on
RewriteRule ^([0-9]{1,2})$ /0/$1.html
RewriteRule ^(([0-9]+)[0-9]{2})$ /$2/$1.html
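To compare the two directory schemes discussed in these answers, here is a minimal Python sketch that reuses the same regular expressions on the bare ID. The function names first_two_digits_path and hundreds_bucket_path are made up for this example, and the sketch only models the capture groups, not Apache itself.

```python
import re


def first_two_digits_path(topic_id):
    """Scheme from the question: folder = first two digits of the ID."""
    match = re.fullmatch(r"(([0-9]{2})[0-9]*)", topic_id)
    if match is None:
        return None
    return f"static/{match.group(2)}/{match.group(1)}.html"


def hundreds_bucket_path(topic_id):
    """Alternative scheme: folder = ID without its last two digits."""
    match = re.fullmatch(r"([0-9]{1,2})", topic_id)
    if match is not None:
        return f"/0/{match.group(1)}.html"  # IDs 0-99 share folder 0
    match = re.fullmatch(r"(([0-9]+)[0-9]{2})", topic_id)
    if match is None:
        return None
    return f"/{match.group(2)}/{match.group(1)}.html"


same_folder = [first_two_digits_path(i) for i in ("14", "142", "1400")]
buckets = [hundreds_bucket_path(i) for i in ("7", "142", "1400")]
```

In the first scheme 14, 142 and 1400 all land in folder 14, whereas in the second each folder holds one block of a hundred consecutive IDs.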