Why is this RewriteRule altering QUERY_STRING, but leaving REQUEST_URI untouched? - apache

I have a copy of Concrete5, a PHP-based CMS, running on example.com.
Concrete5 comes with the following basic instructions for pretty URLs (redirecting all URLs to a central index.php):
<IfModule mod_rewrite.c>
RewriteEngine On
RewriteBase /
RewriteCond %{REQUEST_URI} !^/c5.7
# Concrete5 is running in the c5.7/ subdirectory
RewriteRule ^.*$ c5.7/$0 [L]
</IfModule>
Pretty straightforward.
Now I have a certain set of URLs that take the form
/product/{productname}
that I need to forward to the Concrete5 (virtual) URL
/products/details?name={productname}
That URL is set up and works as expected when I enter it manually in the browser.
So I added a line to the htaccess file and it now looks like this:
<IfModule mod_rewrite.c>
RewriteEngine On
RewriteBase /
# New rule for products
RewriteCond %{REQUEST_URI} ^/product/
RewriteRule ^product/(.+)$ /products/details?name=$1 [QSA]
RewriteCond %{REQUEST_URI} !^/c5.7
RewriteRule ^.*$ c5.7/$0 [L]
</IfModule>
I can confirm the RewriteRule gets triggered when I choose a random, external URL as the redirection target.
But whenever the target is an internal rewrite like the one above, I get a 404 inside Concrete5. When I inspect what was passed to it, I see:
REQUEST_URI: /product/my-random-product
QUERY_STRING: name=my-random-product
So it appears that the rule is triggered and does some rewriting, but REQUEST_URI remains unchanged!
Why?
Is it because PHP 7.1 is running via CGI?
I have tried a zillion variations and all the flags in the book, with little success.

REQUEST_URI in PHP is not the same as REQUEST_URI within mod_rewrite: in PHP it always contains the originally requested URL, so a rewrite cannot change it. If your CMS routes off that value, this approach cannot work.
You should set up your CMS to use the URLs you want, rather than trying to augment your CMS's URL rewriting like this.
If you inspect REDIRECT_URL in PHP you will see the last rewritten URI.

REQUEST_URI in PHP will always be the original request URI.
Because this is already explained by LSerni and SuperDuperApps, I won't elaborate.
Instead, I'm offering a quick solution: modify the REQUEST_URI and add a name parameter in PHP instead of in .htaccess.
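Add the following code to the start of your Concrete5 index.php to make sure that REQUEST_URI is modified before any Concrete5 code runs:
// Map /product/{productname} onto the existing /products/details page.
if (preg_match('-^/product/([^?]*)-', $_SERVER['REQUEST_URI'], $matches)) {
    $_SERVER['REQUEST_URI'] = '/products/details';
    // REQUEST_URI is still percent-encoded, so decode the captured name.
    $_GET['name'] = rawurldecode($matches[1]);
}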

Your setup works on a PHP 7.1 machine (without Concrete5). It does call a script I just put in, which is in /c5.7/products/details. So the Apache part is working.
Inside the script, I see that REQUEST_URI is the old value prior to the rewrite.
So its value is normal and it not being rewritten is a red herring - it isn't supposed to be rewritten. The 404 error must be due to something else.
Your Concrete5 routing needs to support the real URL, not just the virtual one, because C5's routing itself relies on REQUEST_URI. If that is the case, you need to create a route for your short URLs:
Route::register('/product/{productname}' ...)
and an appropriate controller to get the parameters and invoke the "old" controller.
One possibility using .htaccess could be this, but I'm not too sure it will work since REQUEST_URI is still left unchanged:
# New rule for products
RewriteCond %{REQUEST_URI} ^/product/
RewriteRule ^product/(.+)$ c5.7/products/details?name=$1 [L,QSA]
Otherwise you need to do an external redirect, which will disclose the URL in the browser:
RewriteRule product/(.*)$ http://.../products/details?name=$1 [QSA]
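As a rough sketch of that external-redirect variant inside the question's .htaccess (the root-relative target, the R=302 status and the placement ahead of the catch-all are my assumptions, not something tested against Concrete5):

```apache
<IfModule mod_rewrite.c>
RewriteEngine On
RewriteBase /
# Assumption: send the browser to the URL that already works when typed manually.
# The redirect causes a fresh request, so PHP then sees the new REQUEST_URI.
RewriteRule ^product/(.+)$ /products/details?name=$1 [R=302,L,QSA]
# Existing catch-all that maps everything into the c5.7/ subdirectory
RewriteCond %{REQUEST_URI} !^/c5\.7
RewriteRule ^.*$ c5.7/$0 [L]
</IfModule>
```

The trade-off is the one already mentioned: the visitor ends up seeing /products/details?name=... in the address bar.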
See also this other question.

Related

Rewrite rule to prevent apache decoding url before reaching htaccess?

We have an .htaccess rule like this:
RewriteRule ^(.*)/(.*)/(.*) ../app$1/scripts/api/index.php?fn=$2&$3 [L]
This works fine in most cases, however, Apache decodes the url before it arrives at this rule, so a url like beta/list/&cat=red%20%26%20blue, is seen by htaccess as beta/list/&cat=red & blue so we get cat='red' and blue=null coming into index.php instead of cat='red & blue'.
I've read that the workaround for this issue is to use server variables like %{REQUEST_URI} %{THE_REQUEST} in the htaccess rule as these are not decoded before use, but it's difficult to implement. The question mark in the RewriteRule makes everything go crazy and I can't figure out how to escape it.
Can any experts out there help me fix the rule below to behave like the one above?
RewriteCond %{REQUEST_URI} ^(.*)/(.*)/(.*)
RewriteRule . ../app%1/scripts/api/index.php?fn=%2&%3 [L]
Indeed, the solution is to use the special server-variable called THE_REQUEST.
From mod_rewrite documentation:
THE_REQUEST
The full HTTP request line sent by the browser to the server (e.g.,
"GET /index.html HTTP/1.1"). This does not include any additional
headers sent by the browser. This value has not been unescaped
(decoded), unlike most other variables below.
Here is what your rule should look like:
# don't touch urls ending by index.php
RewriteRule index\.php$ - [L]
# user request matching /xxx/xxx/xxx (with optional query string)
RewriteCond %{THE_REQUEST} \s/([^/\?]+)/([^/\?]+)/([^\?]+)(?:\s|\?) [NC]
RewriteRule ^ ../app%1/scripts/api/index.php?fn=%2&%3 [L,QSA]
Please note that you shouldn't use a relative path for an internal rewrite, as it can lead to confusion. Instead, define a RewriteBase, use an absolute path or start from the domain root with a /.
UPDATE
Since you can have encoded forward slashes in your url, you need to set AllowEncodedSlashes to NoDecode (or On but it's unsafe). Note also that, due to a bug, you must put this directive inside a virtual host context, even if the server config context is said to be OK (otherwise, it is simply ignored). By default, AllowEncodedSlashes is set to Off. So, Apache handles encoded slashes automatically by itself and refuses them, without passing the request to mod_rewrite. See the official documentation here.
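A minimal sketch of where that directive could go, assuming you can edit the virtual host configuration (the ServerName and DocumentRoot values below are placeholders, not taken from the question):

```apache
<VirtualHost *:80>
    # Placeholder host name and document root, adjust to your setup
    ServerName example.com
    DocumentRoot "/var/www/example"

    # Let requests containing %2F through without decoding the encoded slashes
    AllowEncodedSlashes NoDecode
</VirtualHost>
```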

.htaccess rewrite returning Error 404

RewriteEngine on
RewriteCond %{QUERY_STRING} (^|&)public_url=([^&]+)($|&)
RewriteRule ^process\.php$ /api/%2/? [L,R=301]
Where domain.tld/app/process.php?public_url=abcd1234 is the actual location of the script.
But I am trying to get .htaccess to make the URL like this: domain.tld/app/api/acbd1234.
Essentially hides the process.php script and the get query ?public_url.
However the script above is returning error 404 not found.
I think this is what you are actually looking for:
RewriteEngine on
RewriteCond %{QUERY_STRING} (?:^|&)public_url=([^&]+)(?:$|&)
RewriteRule ^/?app/process\.php$ /app/api/%1 [R=301,QSD]
RewriteRule ^/?app/api/([^/]+)/?$ /app/process.php?public_url=$1 [END]
If you receive an internal server error (HTTP status 500) for that, check your HTTP server's error log file. Chances are that you operate a very old version of the Apache HTTP server; in that case you may have to replace the [END] flag with the [L] flag, which will probably work just fine in this scenario.
And a general hint: you should always prefer to place such rules inside the HTTP server's (virtual) host configuration instead of using dynamic configuration files (.htaccess style files). Those files are notoriously error prone, hard to debug and they really slow down the server. They are only supported as a last option for situations where you do not have control over the host configuration (read: really cheap hosting service providers) or if you have an application that relies on writing its own rewrite rules (which is an obvious security nightmare).
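For illustration, the same rules placed in a virtual host might look roughly like this. The ServerName and DocumentRoot are placeholders I made up; the rules themselves are unchanged, since the optional leading slash in the patterns (^/?) lets them match in both contexts:

```apache
<VirtualHost *:80>
    # Placeholder host name and document root, adjust to your setup
    ServerName example.com
    DocumentRoot "/var/www/example"

    RewriteEngine on
    # Old URL with ?public_url=... is redirected to the pretty URL
    RewriteCond %{QUERY_STRING} (?:^|&)public_url=([^&]+)(?:$|&)
    RewriteRule ^/?app/process\.php$ /app/api/%1 [R=301,QSD]
    # Pretty URL is mapped back to the real script internally
    RewriteRule ^/?app/api/([^/]+)/?$ /app/process.php?public_url=$1 [END]
</VirtualHost>
```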
UPDATE:
Based on your many questions in the comments below (we see again how important it is to be precise in the question itself ;-) ) I add this variant implementing a different handling of path components:
RewriteEngine on
RewriteCond %{QUERY_STRING} (?:^|&)public_url=([^&]+)(?:$|&)
RewriteRule ^/?app/process\.php$ /api/%1 [R=301,QSD]
RewriteRule ^/?api/([^/]+)/?$ /app/process.php?public_url=$1 [END]
"I am trying to get .htaccess to make the URL like this: example.com/app/api/acbd1234."
You don't do this in .htaccess. You change the URL in your application and then rewrite the new URL to the actual/old URL. (You only need to redirect this, if the old URLs have been indexed by search engines - but you need to watch for redirect loops.)
So, change the URL in your application to /app/api/acbd1234 and then rewrite this in .htaccess (which I assume is in your /app subdirectory). For example:
RewriteEngine On
# Rewrite new URL back to old
RewriteRule ^api/([^/]+)$ process.php?public_url=$1 [L]
You included a trailing slash in your earlier directive, but you omitted this in your example URL, so I've omitted it here also.
If you then need to also redirect the old URL for the sake of SEO, then you can implement a redirect before the internal rewrite:
RewriteEngine On
# Redirect old URL to new (if request by search engines or external links)
RewriteCond %{ENV:REDIRECT_STATUS} ^$
RewriteCond %{QUERY_STRING} (?:^|&)public_url=([^&]+)(?:$|&)
RewriteRule ^process\.php$ /app/api/%1? [R=302,L]
# Rewrite new URL back to old
RewriteRule ^api/([^/]+)$ process.php?public_url=$1 [L]
The check against REDIRECT_STATUS is to avoid a rewrite loop. ?: inside the parenthesised subpattern avoids the group being captured as a backreference.
Change the 302 (temporary) to 301 (permanent) only when you are sure it's working OK, to avoid erroneous redirects being cached by the browser.

How to setup request proxy using URL rewriting

I have an e-commerce site that resides in:
http://dev.gworks.mobi/
When a customer clicks on the signin link, the browser gets redirected to another domain, in order for authentication:
http://frock.gworks.mobi:8080/openam/XUI/#login/&goto=http%3A%2F%2Fdev.gworks.mobi%3A80%2Fcustomer%2Faccount%2Flogin%2Freferer%2FaHR0cDovL2Rldi5nd29ya3MubW9iaS8%2C%2F
I'm trying to rewrite http://dev.gworks.mobi/* to http://frock.gworks.mobi:8080/openam/*, without redirection.
I've tried this in the .htaccess of the dev.gworks.mobi site:
<IfModule mod_rewrite.c>
RewriteEngine On
RewriteCond %{REQUEST_URI} ^/openam(.*)$ [NC]
RewriteRule ^(.*)$ http://frock.gworks.mobi:8080/$1 [P,L]
</IfModule>
But when I access http://dev.gworks.mobi/openam, it shows a 404 "page not found" error.
Can anyone help me to achieve my use case?
Try this:
RewriteEngine on
RewriteBase /
# Make sure it's not an actual file being accessed
RewriteCond %{REQUEST_FILENAME} !-f
RewriteCond %{REQUEST_FILENAME} !-d
# Match the host
RewriteCond %{HTTP_HOST} ^dev\.gworks\.mobi
# Rewrite the request if it starts with "openam"
RewriteRule ^openam(.*)$ http://frock.gworks.mobi:8080/$1 [L,QSA]
Because the target is an absolute URL on another host and there is no P flag, this sends an external redirect for all requests to dev.gworks.mobi/openam over to frock.gworks.mobi:8080.
If you want to mask the URI in a way that it's not visible to the visitor that she's visiting the authentication app, you need to add a P flag. Please note that it needs Apache's mod_proxy module in place:
RewriteRule ^openam(.*)$ http://frock.gworks.mobi:8080/$1 [P,L,QSA]
Feel free to drop the L flag, if it's not the last rewrite rule. See RewriteRule Flags for more information.
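If you have access to the virtual host configuration, the same masking can also be sketched with mod_proxy's own directives instead of a RewriteRule with the P flag. This is a different technique from the rule above, and it assumes mod_proxy and mod_proxy_http are loaded and that the application lives under /openam on the backend, as the question's URLs suggest:

```apache
<VirtualHost *:80>
    ServerName dev.gworks.mobi

    # Forward /openam requests to the authentication host behind the scenes
    ProxyPass        "/openam" "http://frock.gworks.mobi:8080/openam"
    # Rewrite Location headers in backend responses back to this host
    ProxyPassReverse "/openam" "http://frock.gworks.mobi:8080/openam"
</VirtualHost>
```

Note that ProxyPass is not available in .htaccess files, so this only helps when you control the server configuration.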
The 404
If it's all in place and you're still getting a 404 error, make sure that the target URL is not throwing 404 errors in the first place.
Second, check if you're still getting the error with the correct referrer URI set. It might be designed in a way to throw a 404, if the referrer is not correctly set. If that's the case, which I suspect, you need to use the R flag and redirect instead of proxying the request.
Last thing that comes to my mind, some webapps are not built in a way to figure out the URI address. The host, as well as the port number, might be hard-coded somewhere in the config files. Make sure that the authentication app is able to be run from another URL without the need to edit the configs.

Redirect loop with simple htaccess rule

I have been pulling my hair out over this. It worked before the server migration!
Ok so basically it's as simple as this:
I have a .php file that I want to view the content of using a SEO friendly URL via a ReWrite rule.
Also to canonicalise and to prevent duplicate content I want to 301 the .php version to the SEO friendly version.
This is what I used and has always worked till now on the new server:
RewriteRule ^friendly-url/$ friendly-url.php [L,NC]
RewriteRule ^friendly-url.php$ /friendly-url/$1 [R=301,L]
However disaster has struck and now it causes a redirect loop.
Logically I can only assume that in this version of Apache it is tripping up as it's seeing that the script being run is the .php version and so it tries the redirect again.
How can I re-work this to make it work? Or is there a config I need to switch in WHM?
Thanks!!
This is what your .htaccess should look like:
Options +FollowSymLinks -MultiViews
RewriteEngine On
RewriteBase /
# To externally redirect /friendly-url.php to /friendly-url/
RewriteCond %{THE_REQUEST} ^[A-Z]{3,}\s/+(friendly-url)\.php [NC]
RewriteRule ^ /%1/? [R=302,L]
## To internally redirect /anything/ to /anything.php
RewriteCond %{REQUEST_FILENAME} !-d
RewriteCond %{DOCUMENT_ROOT}/$1\.php -f
RewriteRule ^(.+?)/$ $1.php [L]
Note that I am using R=302 because I don't want the redirect to be cached by my browser until I have confirmed it's working as expected; once I have, I switch from R=302 to R=301.
Keep in mind that your browser may also have cached the redirect from previous attempts, since you were using R=301, so you'd be better off testing from a different browser than the one you have been using, just to make sure it's working.
"However disaster has struck and now it causes a redirect loop."
It causes a redirect loop because you're redirecting it to itself. The difference in my code is that I match against the original request line (THE_REQUEST), redirect the .php URL from there to the friendly version, and then use the internal rewrite.
The exact same .htaccess file will work differently depending on where it's placed because the [L]ast flag means something different depending on location. In ...conf, [L]ast means all finished processing so get out, but in .htaccess the exact same [L]ast flag means start all over at the top of this file.
To work as expected when moving a block of code from ...conf to .htaccess, most .htaccess files will need one or the other of these tweaks:
Change the [L]ast flags to [END]. (Problem is, the [END] flag is only available in newer [version 2.3.9 and later] Apaches, and won't even "fall back" in earlier versions.)
Add boilerplate code like this at the top of each of your .htaccess files:
RewriteCond %{ENV:REDIRECT_STATUS} !^[\s/]*$
RewriteRule ^ - [L]
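Applied to the friendly-url rules from the question, the first tweak might look roughly like this. It is only a sketch, assuming Apache 2.3.9 or later and an .htaccess file in the document root:

```apache
RewriteEngine On
# External redirect: only a direct request for the .php URL reaches this rule
RewriteRule ^friendly-url\.php$ /friendly-url/ [R=301,L]
# Internal rewrite: END stops all further rewrite processing, so the
# redirect rule above is not evaluated again for the rewritten request
RewriteRule ^friendly-url/$ friendly-url.php [END]
```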

How can you ignore the end of a URL using mod_rewrite?

I'd like to structure my website like this:
domain.com/person/edit/1
domain.com/person/edit/2
domain.com/person/edit/3
etc.
I have a page to which all these requests should go:
domain.com/person/edit.html
The JavaScript will look at the trailing part of the url when the page is loaded so I want the server to internally ignore it.
I've got this rewrite rule:
RewriteRule ^person/view/(.*)$ person/view.html [L]
I'm sure that I'm missing something obvious but when I visit one of the pages above I get this 404 message:
The requested URL /person/view.html/1 was not found on this server.
As far as I understood it the [L] means that if this rule applies Apache should stop rewriting and serve up the alternate page. Instead it seems to be applying the rule at the earliest possible moment and then appending the rest of the unmatched url to the re-written one.
How do I get these re-writes to work properly?
"As far as I understood it the [L] means that if this rule applies Apache should stop rewriting and serve up the alternate page."
Well, the [L] flag tells Apache to stop checking the remaining rules in the current pass. The rewrite then goes to the next iteration, where the result is checked against all rules again (that is how it works in .htaccess).
Try this "recipe" (put it somewhere at the top of your .htaccess):
Options +FollowSymLinks -MultiViews
# activate rewrite engine
RewriteEngine On
# Do not do anything for already existing files
RewriteCond %{REQUEST_FILENAME} -f [OR]
RewriteCond %{REQUEST_FILENAME} -d
RewriteRule .+ - [L]
Another idea to try -- add DPI flag to your [L]: [L,DPI]
If the Options line does not help, then the rewrite rule should. But it all depends on your Apache configuration. If the above does not work, please post your whole .htaccess (update your question).
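Put together with the rule from the question, the two suggestions could be sketched like this. It assumes that MultiViews or leftover path info is what produces the /person/view.html/1 URL, which has not been confirmed:

```apache
# Assumption: content negotiation (MultiViews) maps /person/view/1 to
# view.html plus trailing path info, so switch it off first
Options +FollowSymLinks -MultiViews
RewriteEngine On
# DPI (discardpath) drops any PATH_INFO left over from the original request
# so it is not appended to the rewritten target
RewriteRule ^person/view/(.*)$ person/view.html [L,DPI]
```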