How can you ignore the end of a URL using mod_rewrite? - apache

I'd like to structure my website like this:
domain.com/person/edit/1
domain.com/person/edit/2
domain.com/person/edit/3
etc.
I have a page to which all these requests should go:
domain.com/person/edit.html
The JavaScript will look at the trailing part of the url when the page is loaded so I want the server to internally ignore it.
I've got this rewrite rule:
RewriteRule ^person/view/(.*)$ person/view.html [L]
I'm sure that I'm missing something obvious but when I visit one of the pages above I get this 404 message:
The requested URL /person/view.html/1 was not found on this server.
As far as I understood it the [L] means that if this rule applies Apache should stop rewriting and serve up the alternate page. Instead it seems to be applying the rule at the earliest possible moment and then appending the rest of the unmatched url to the re-written one.
How do I get these re-writes to work properly?

"As far as I understood it the [L] means that if this rule applies Apache should stop rewriting and serve up the alternate page."
Not quite. The [L] flag only tells Apache to stop checking the remaining rules in the current pass. In .htaccess context the rewritten URL then goes through another iteration, where it is checked against all the rules again (that is how it works).
Try this "recipe" (put it somewhere near the top of your .htaccess):
Options +FollowSymLinks -MultiViews
# activate rewrite engine
RewriteEngine On
# Do not do anything for already existing files
RewriteCond %{REQUEST_FILENAME} -f [OR]
RewriteCond %{REQUEST_FILENAME} -d
RewriteRule .+ - [L]
Another idea to try: add the DPI flag to your [L], i.e. [L,DPI].
If the Options line does not help, the rewrite rule should. But it all depends on your Apache configuration. If the above does not work, please post your whole .htaccess (update your question).
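As a minimal sketch of how the pieces above might fit together, assuming the .htaccess sits in the document root, that the target page is person/edit.html as described in the question (the rule quoted in the question says view instead), and that Apache is 2.2.12 or later so the DPI flag is available:

```apache
Options +FollowSymLinks -MultiViews
RewriteEngine On

# Leave requests for files and directories that really exist untouched.
RewriteCond %{REQUEST_FILENAME} -f [OR]
RewriteCond %{REQUEST_FILENAME} -d
RewriteRule ^ - [L]

# Serve one static page for every /person/edit/<anything>.
# DPI (discard path info) is meant to stop the trailing "/1" from being
# re-appended to person/edit.html on the next pass.
RewriteRule ^person/edit/ person/edit.html [L,DPI]
```

Because the rewrite is internal, the address in the browser does not change, so the JavaScript on the page can still read the trailing id from the URL.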

Related

Why is this RewriteRule altering QUERY_STRING, but leaving REQUEST_URI untouched?

I have a copy of Concrete5, a PHP-based CMS, running on example.com.
Concrete5 comes with the following basic instructions for pretty URLs (redirecting all URLs to a central index.php)
<IfModule mod_rewrite.c>
RewriteEngine On
RewriteBase /
RewriteCond %{REQUEST_URI} !^/c5.7
# Concrete5 is running in the c5.7/ subdirectory
RewriteRule ^.*$ c5.7/$0 [L]
</IfModule>
Pretty straightforward.
Now I have a certain set of URLs that take the form
/product/{productname}
that I need to forward to the Concrete5 (virtual) URL
/products/details?name={productname}
That URL is set up and works as expected when I enter it manually in the browser.
So I added a line to the htaccess file and it now looks like this:
<IfModule mod_rewrite.c>
RewriteEngine On
RewriteBase /
# New rule for products
RewriteCond %{REQUEST_URI} ^/product/
RewriteRule ^product/(.+)$ /products/details?name=$1 [QSA]
RewriteCond %{REQUEST_URI} !^/c5.7
RewriteRule ^.*$ c5.7/$0 [L]
</IfModule>
I can confirm the RewriteRule gets triggered when I choose a random, external URL as the redirection target.
But whenever it is an internal redirect like above, what happens is, I get a 404 inside Concrete5. When I inspect what was passed to it, I see:
REQUEST_URI: /product/my-random-product
QUERY_STRING: name=my-random-product
So it appears that the rule is triggered and does some rewriting, but REQUEST_URI remains unchanged!
Why?
Is it because PHP 7.1 is running via CGI?
I have tried a zillion variations and all the flags in the book, with little success.
The REQUEST_URI in PHP is not the same as the REQUEST_URI within mod_rewrite, so you can't do it like this. In PHP it always contains the original URL. So you can't change it like this if your CMS is working off that.
You should set up your CMS to use the URLs you want, rather than trying to augment your CMS's URL rewriting like this.
If you inspect REDIRECT_URL in PHP you will see the last rewritten URI.
REQUEST_URI in PHP will always be the original request URI.
Because this is already explained by LSerni and SuperDuperApps, I won't elaborate.
Instead, I'm offering a quick solution: modify the REQUEST_URI and add a name parameter in PHP instead of in .htaccess.
Add the following code to the start of your Concrete5 index.php to make sure that REQUEST_URI is modified
before any Concrete5 code runs:
// Map /product/{name} onto the URL that Concrete5 already knows how to route.
if (preg_match('#^/product/([^?]*)#', $_SERVER['REQUEST_URI'], $matches)) {
    $_SERVER['REQUEST_URI'] = '/products/details';
    $_GET['name'] = $matches[1];
}
Your setup works on a PHP 7.1 machine (without Concrete5). It does call a script I just put in, which is in /c5.7/products/details. So the Apache part is working.
Inside the script, I see that REQUEST_URI is the old value prior to the rewrite.
So its value is normal and it not being rewritten is a red herring - it isn't supposed to be rewritten. The 404 error must be due to something else.
Your Concrete5 routing should support the real URL, not just the virtual one, because C5's routing itself relies on REQUEST_URI. If so, you need to create a route for your short URLs:
Route::register('/product/{productname}' ...)
and an appropriate controller to get the parameters and invoke the "old" controller.
One possibility using .htaccess could be this, but I'm not too sure it will work since REQUEST_URI is still left unchanged:
# New rule for products
RewriteCond %{REQUEST_URI} ^/product/
RewriteRule ^product/(.+)$ c5.7/products/details?name=$1 [L,QSA]
Otherwise you need to do an external redirect (note the R flag), which will disclose the URL in the browser:
RewriteRule product/(.*)$ http://.../products/details?name=$1 [R=302,L,QSA]
See also this other question.

mod_rewrite: hide real urls but keep available as different files

This question may already have been answered, but I didn't find an answer after hours of searching.
I need to put the site under "maintenance mode" and redirect/rewrite all requests to site_down.html, but at the same time I need the site to be available if I enter the address like files are in a subfolder.
ex:
if I type http://example.com/login.php I need site_down.html to be displayed.
but if I specify http://example.com/test/login.php I need the real login.php to be displayed.
I need this to be done with rewrite, so copying everything to another directory isn't a solution.
I have tried a couple of dozen combinations, but I'm still unable to achieve what I need.
This is one version of my .htaccess file:
DirectoryIndex site_down.html
<IfModule mod_rewrite.c>
RewriteEngine On
RewriteRule ^test\/(.*)$ $1 [S=1]
RewriteRule ^(.*\.php)$ site_down.html
RewriteRule .* - [L]
</IfModule>
This code should rewrite all requests for "test/*" to the parent folder, skip the next rewrite rule, and then terminate rewriting at RewriteRule .* - [L]. If there is no "test/" in the URL, all requests should be rewritten to site_down.html.
What am I doing wrong?
Could you suggest any valid solutions, please?
Thank you.
Essentially, you need two rules. One rule maps the virtual subdirectory to the working files. The other maps direct requests for the working files to the splash page. We just have to make sure that when the first rule has matched, the second one doesn't. We can do this by checking that " /test/" (including the leading space) does not appear in THE_REQUEST, the string the client sent to the server to request the page, which has the form GET /test/mypage.php?apes=bananas HTTP/1.1. THE_REQUEST doesn't change on a rewrite, which makes it perfect for this.
Skipping a rule with [S=1] like you did usually doesn't have the effect you expect, because mod_rewrite makes multiple passes through .htaccess until the resulting URL no longer changes, or until it hits a limit and throws an error. On the first pass it skips the rule, but on the second pass the URL no longer starts with test/, so nothing is skipped and the splash-page rule applies.
RewriteCond %{THE_REQUEST} !\ /test/
RewriteRule \.php site_down.html [L]
RewriteRule ^test/(.*)$ $1 [L]
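If the server runs Apache 2.4 (the [END] flag does not exist before 2.3.9), an alternative sketch is to stop all further passes instead of testing THE_REQUEST. This is untested against the asker's setup and assumes the .htaccess sits in the document root:

```apache
RewriteEngine On

# /test/login.php is served from login.php. END prevents another pass,
# so the splash-page rule below never sees the rewritten URL.
RewriteRule ^test/(.*)$ $1 [END]

# Any other request for a .php file gets the splash page.
RewriteRule \.php$ site_down.html [END]
```

This is close to what the [S=1] attempt in the question was aiming for: the skip only lasts for one pass, whereas END ends rewrite processing for the request in .htaccess context.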

Redirect loop with simple htaccess rule

I have been pulling my hair out over this. It worked before the server migration!
Ok so basically it's as simple as this:
I have a .php file whose content I want to serve under an SEO friendly URL via a rewrite rule.
Also, to canonicalise and prevent duplicate content, I want to 301 the .php version to the SEO friendly version.
This is what I used, and it always worked until the move to the new server:
RewriteRule ^friendly-url/$ friendly-url.php [L,NC]
RewriteRule ^friendly-url.php$ /friendly-url/$1 [R=301,L]
However disaster has struck and now it causes a redirect loop.
Logically I can only assume that in this version of Apache it is tripping up as it's seeing that the script being run is the .php version and so it tries the redirect again.
How can I re-work this to make it work? Or is there a config I need to switch in WHM?
Thanks!!
This is how your .htaccess should look:
Options +FollowSymLinks -MultiViews
RewriteEngine On
RewriteBase /
# To externally redirect /friendly-url.php to /friendly-url/
RewriteCond %{THE_REQUEST} ^[A-Z]{3,}\s/+(friendly-url)\.php [NC]
RewriteRule ^ /%1/? [R=302,L]
## To internally redirect /anything/ to /anything.php
RewriteCond %{REQUEST_FILENAME} !-d
RewriteCond %{DOCUMENT_ROOT}/$1\.php -f
RewriteRule ^(.+?)/$ $1.php [L]
Note that I am using R=302, because I don't want the browser to cache the redirect until I have confirmed it works as expected. Once it does, I switch from R=302 to R=301.
Keep in mind that your browser may have cached the redirect from previous attempts, since you were using R=301, so you are better off testing from a different browser than the one you have been using, just to make sure it's working.
"However disaster has struck and now it causes a redirect loop."
It causes a redirect loop because you are redirecting the URL to itself. The difference in my code is that I match against the original request (THE_REQUEST), redirect the .php URL to the friendly one from there, and then use the internal rewrite.
The exact same rules can behave differently depending on where they are placed, because the [L]ast flag means something different depending on location. In ...conf, [L]ast means processing is finished, so get out; in .htaccess, the same [L]ast flag means stop this pass and, if the URL was rewritten, start all over at the top of the file.
To work as expected when moving a block of code from ...conf to .htaccess, most .htaccess files will need one or the other of these tweaks:
Change the [L]ast flags to [END]. (Problem is, the [END] flag is only available in newer [version 2.3.9 and later] Apaches, and won't even "fall back" in earlier versions.)
Add boilerplate code like this at the top of each of your .htaccess files:
RewriteCond %{ENV:REDIRECT_STATUS} !^[\s/]*$
RewriteRule ^ - [L]
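For the first option, a minimal sketch of the two rules from the question rewritten with [END] might look like this. It assumes Apache 2.4 and an .htaccess in the document root; R=302 is used only while testing, as suggested above:

```apache
RewriteEngine On

# External redirect from the .php URL to the friendly URL.
RewriteRule ^friendly-url\.php$ /friendly-url/ [R=302,L]

# Internal rewrite from the friendly URL to the script.
# END stops any further pass, so the redirect rule above is not
# applied to the rewritten friendly-url.php.
RewriteRule ^friendly-url/$ friendly-url.php [END,NC]
```

With plain [L] on the second rule, the rewritten friendly-url.php would be run through the rules again and hit the redirect, which is the loop described in the question.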

Apache URL Rewriting

I am trying to get URL rewriting to work on my website. Here is the contents of my .htaccess:
RewriteEngine On
RewriteRule ^blog/?$ index.php?page=blog [L]
RewriteRule ^about/?$ index.php?page=about [L]
RewriteRule ^portfolio/?$ index.php?page=portfolio [L]
#RewriteRule ^.*$ index.php?page=blog [L]
Now the 3 uncommented rewrite rules work perfectly, if I try http://www.mysite.com/blog/, I get redirected to http://www.mysite.com/index.php?page=blog, the same for "about" and "portfolio". However, if I mistype blog, say I try http://www.mysite.com/bloh/, then obviously I get a 404 error. The last rule, the commented one, was to help prevent that. Any URL should get redirected to the blog, but of course this rule is still parsed even if we have successfully used a previous one, so I used the "last" flag ([L]). If I uncomment my last rule, anything, including blog, about, and portfolio, redirect to blog. Shouldn't the "last" flag stop the execution as soon as it finds a matching rule?
Thanks.
Yes, the Last flag means it won't apply any of the rules following this rule in this request.
After rewriting the URL, Apache makes an internal request using the new rewritten URL, which matches your last RewriteRule, and thus every request ends up at index.php?page=blog.
Use the RewriteCond directive to limit rewriting to URLs that don't start with index.php, and you should be fine.
You could add a condition like:
RewriteCond %{REQUEST_URI} !^/index\.php
I'll also mention that using RewriteRule ^.*$ is a good way to break all of your media requests (css, js, images) as well. You might want to add some conditions like:
RewriteCond %{REQUEST_FILENAME} !-d
RewriteCond %{REQUEST_FILENAME} !-f
To make sure you're not trying to rewrite actual files or directories that exist on your server. Otherwise they'll be unreachable unless index.php serves those too!
From Apache's mod_rewrite docs:
'last|L' (last rule)
Stop the rewriting process here and don't apply any more rewrite rules. This corresponds to the Perl last command or the break command in C. Use this flag to prevent the currently rewritten URL from being rewritten further by following rules. Remember, however, that if the RewriteRule generates an internal redirect (which frequently occurs when rewriting in a per-directory context), this will reinject the request and will cause processing to be repeated starting from the first RewriteRule.
You could use
ErrorDocument 404 /index.php?page=blog
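but you should be aware that with a local path like this the response still carries the 404 status code unless index.php sets a different one (only a full http:// URL makes Apache send a redirect), and I don't know if serving the blog under a 404 status is such a good practice.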
After you [L]eave processing for the request, the whole processing runs again for the new (rewritten) URL. You could get out of that loop by using this before your other rules:
RewriteRule ^index\.php$ - [L]
which means "for index.php, don't rewrite and leave processing."
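Putting the suggestions from these answers together, a sketch of the complete .htaccess could look like the following. It assumes the file lives in the document root, so that REQUEST_URI for the front controller is /index.php:

```apache
RewriteEngine On

RewriteRule ^blog/?$ index.php?page=blog [L]
RewriteRule ^about/?$ index.php?page=about [L]
RewriteRule ^portfolio/?$ index.php?page=portfolio [L]

# Catch-all: skip index.php itself (already rewritten on an earlier pass)
# and anything that exists on disk, such as CSS, JS and images.
RewriteCond %{REQUEST_URI} !^/index\.php
RewriteCond %{REQUEST_FILENAME} !-f
RewriteCond %{REQUEST_FILENAME} !-d
RewriteRule ^ index.php?page=blog [L]
```

On the second pass the request is for /index.php, so the first condition fails and the query string set by the earlier rule is left alone.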

Why would mod_rewrite rewrite twice?

I only recently found out about URL rewriting, so I've still got a lot to learn.
While following the Easy Mod Rewrite tutorial, the results of one of their examples is really confusing me.
RewriteBase /
RewriteRule (.*) index.php?page=$1 [QSA,L]
Rewrites /home as /index.php?page=index.php&page=home.
I thought the duplicates might have been caused by something in my host's configs, but a clean install of XAMPP does the same.
So, does anyone know why this seems to parse twice?
And if it's going to do this, it seems to me it should be an infinite loop. Why does it stop after 2 cycles?
From Example 1 on this page, which is part of the tutorial linked in your question:
Assume you are using a CMS system that rewrites requests for everything to a single index.php script.
RewriteRule ^(.*)$ index.php?PAGE=$1 [L,QSA]
Yet every time you run that, regardless of which file you request, the PAGE variable always contains "index.php".
Why? You will end up doing two rewrites. Firstly, you request test.php. This gets rewritten to index.php?PAGE=test.php. A second request is now made for index.php?PAGE=test.php. This still matches your rewrite pattern, and in turn gets rewritten to index.php?PAGE=index.php.
One solution would be to add a RewriteCond that checks if the file is already "index.php". A better solution that also allows you to keep images and CSS files in the same directory is to use a RewriteCond that checks if the file exists, using -f.
1. The link is to the Internet Archive, since the tutorial website appears to be offline.
From the Apache Module mod_rewrite documentation:
'last|L' (last rule)
[…] if the RewriteRule generates an internal redirect […] this will reinject the request and will cause processing to be repeated starting from the first RewriteRule.
To prevent this you could either use an additional RewriteCond directive:
RewriteCond %{REQUEST_URI} !^/index\.php$
RewriteRule (.*) index.php?page=$1 [QSA,L]
Or you alter the pattern to not match index.php and use the REQUEST_URI variable, either in the redirect or later in PHP ($_SERVER['REQUEST_URI']).
RewriteRule !^index\.php$ index.php?page=%{REQUEST_URI} [QSA,L]
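The file-existence check mentioned in the tutorial excerpt above can be sketched like this. It assumes index.php exists in the same directory as the .htaccess, which is what stops the second pass from rewriting again:

```apache
RewriteEngine On
RewriteBase /

# Only rewrite requests that do not map to a real file or directory.
# index.php exists, so the second pass leaves it alone, and images
# and CSS files are served directly.
RewriteCond %{REQUEST_FILENAME} !-f
RewriteCond %{REQUEST_FILENAME} !-d
RewriteRule ^(.*)$ index.php?page=$1 [QSA,L]
```

With this in place a request for /home should reach index.php with a single page parameter, rather than the duplicated one shown in the question.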