Apache URL Rewriting, - apache

I am trying to get URL rewriting to work on my website. Here is the contents of my .htaccess:
RewriteEngine On
RewriteRule ^blog/?$ index.php?page=blog [L]
RewriteRule ^about/?$ index.php?page=about [L]
RewriteRule ^portfolio/?$ index.php?page=portfolio [L]
#RewriteRule ^.*$ index.php?page=blog [L]
Now the 3 uncommented rewrite rules work perfectly, if I try http://www.mysite.com/blog/, I get redirected to http://www.mysite.com/index.php?page=blog, the same for "about" and "portfolio". However, if I mistype blog, say I try http://www.mysite.com/bloh/, then obviously I get a 404 error. The last rule, the commented one, was to help prevent that. Any URL should get redirected to the blog, but of course this rule is still parsed even if we have successfully used a previous one, so I used the "last" flag ([L]). If I uncomment my last rule, anything, including blog, about, and portfolio, redirect to blog. Shouldn't the "last" flag stop the execution as soon as it finds a matching rule?
Thanks.

Yes, the Last flag means it won't apply any of the rules following this rule in this request.
After rewriting the URL, it makes an internal request using the new rewritten URL which would match your last RewriteRule and thus your redirects go into an infinite loop.
Use the RewriteCond directive to limit rewriting to URLs that don't start with index.php, and you should be fine.

You could add a condition like:
RewriteCond %{REQUEST_URI} !^index\.php
I'll also mention that using RewriteRule ^.*$ is a good way to break all of your media requests (css, js, images) as well. You might want to add some conditions like:
RewriteCond %{REQUEST_FILENAME} !-d
RewriteCond %{REQUEST_FILENAME} !-f
To make sure you're not trying to rewrite actual files or directories that exist on your server. Otherwise they'll be unreachable unless index.php serves those too!

From apache's mod_rewrite docs
'last|L' (last rule)
Stop the rewriting process here and don't apply any more rewrite
rules. This corresponds to the Perl
last command or the break command in
C. Use this flag to prevent the
currently rewritten URL from being
rewritten further by following rules.
Remember, however, that if the
RewriteRule generates an internal
redirect (which frequently occurs when
rewriting in a per-directory context),
this will reinject the request and
will cause processing to be repeated
starting from the first RewriteRule.
You could use
ErrorDocument 404 /index.php?page=blog
but you should be aware of the fact that it doesn't return 404 error code, but a redirect one and I don't know if that is such a good practice.

After you [L]eave processing for the request, the whole processing runs again for the new (rewritten) URL. You could get out of that loop by using this before your other rules:
RewriteRule ^index.php - [L]
which means "for index.php, don't rewrite and leave processing."

Related

The requested URL was not found on this server .htacess

I am trying to rewrite URL like:
example.com/speciality_details.php?id=23&name=ent
TO
example.com/specialities/23/ent
But I am getting this error:
Not Found
The requested URL was not found on this server.
Additionally, a 404 Not Found error was encountered while trying to use an ErrorDocument to handle the request.
This is my .htaccess file
RewriteEngine on
RewriteRule ^doctor/([0-9]+)/([^/.]+)$ doctor_details.php?id=$1&name=$2 [NC,L]
RewriteCond %{REQUEST_FILENAME} !-f
RewriteRule ^/specialities/([0-9]+)/([^/.]+)$ speciality_details.php?id=$1&name=$2 [NC,L]
The first RewriteRule working but the second one is not working
Please help me to know what the problem is. How should I rewrite the code?
RewriteCond %{REQUEST_FILENAME} !-f
RewriteRule ^/specialities/([0-9]+)/([^/.]+)$ speciality_details.php?id=$1&name=$2 [NC,L]
Just as in the first rule (which is "working"), you should not be matching a slash prefix on the URL-path. And the preceding condition (RewriteCond directive) is superfluous, since a URL of the form /specialities/23/ent could not possibly match a physical file (could it?).
In .htaccess, the URL-path matched by the RewriteRule pattern does not start with a slash since the directory-prefix (that always ends with a slash) has already been removed.
So, the rule should look like the following instead (and no RewriteCond directive):
RewriteRule ^specialities/([0-9]+)/([^/.]+)$ speciality_details.php?id=$1&name=$2 [NC,L]
This would match a URL of the form example.com/specialities/23/ent, as per your example. And assumes the file being rewritten to is speciality_details.php in the document root.
The NC (nocase) flag should also be superfluous, unless you are expecting mixed case versions of sPeCiAlItIeS? But if you are then that is better resolved with a redirect since the rewrite would potentially result in a duplicate content (SEO) issue.
Make sure you clear your browser cache before testing.
Although, from your earlier question edits it looks like you had already tried this without the slash prefix, but at the time you had /speciality/23/ent, not /specialities/23/ent as the example request URL - which would obviously not match.

How to add "everything else" rule to mod_rewrite

How can I make mod_rewrite redirect to a certain page or probably just throw 404 if no other rules have been satisfied? Here's what I have in my .htaccess file:
RewriteEngine on
RewriteRule ^\. / [F,QSA,L]
RewriteRule ^3rdparty(/.*)$ / [F,QSA,L]
RewriteCond %{REQUEST_FILENAME} -f
RewriteRule ^((images|upload)/.+|style.css)$ $1 [L]
RewriteRule ^$ special [QSA]
RewriteRule ^(special|ready|building|feedback)/?$ $1.php [QSA,L]
RewriteRule ^(ready|building)/(\d+)/?$ show_property.php?type=$1&property_id=$2 [QSA,L]
RewriteRule . error.php?code=404 [QSA,L]
This is supposed, among other things, to send user to error.php if he tries to access anything that was not explicitly specified here (by the way, what is the proper way to throw 404?). However, instead it sends user from every page to error.php. If I remove the last rule, everything else works.
What am I doing wrong?
What is happening is that when you are doing a rewrite, you then send the user to the new URL, where these rewrite rules are then evaluated again. Eventually no other redirectoin rules will be triggered and it will get to the final rule and always redirect to the error.php page.
So you need to put some rewrite conditions in place to make this not happen.
The rewrite engine loops, so you need to pasthrough successful rewrites before finally rewriting to error.php. Maybe something like:
RewriteCond %{REQUEST_URI} !^/$
RewriteCond %{REQUEST_URI} !^/(special|ready|building|feedback|show_property)\.php
RewriteCond %{REQUEST_URI} !^/((images|upload)/.+|style.css)$
RewriteRule ^ error.php?code=404 [QSA,L,R=404]
Each condition makes sure the URI isn't one of the ones your other rules have rewritten to.
The R=404 will redirect to the error.php page as a "404 Not Found".
Unfortunatelly, it didn't work - it allows access to all files on the server (presumably because all conditions need to be satisfied). I tried an alternate solution:
Something else must be slipping through, eventhough when I tested your rules plus these at the end in a blank htaccess file, it seems to work. Something else you can try which is a little less nice but since you don't actually redirect the browser anywhere, it would be hidden from clients.
You have a QSA flag at the end of all your rules, you could add a unique param to the query string after you've applied a rule, then just check against that. Example:
RewriteCond %{REQUEST_FILENAME} -f
RewriteRule ^((images|upload)/.+|style.css)$ $1?_ok [L,QSA]
then at the end:
RewriteCond %{QUERY_STRING} !_ok
RewriteRule ^ error.php?code=404&_ok [QSA,L,R=404]
In theory if none of the rules are matched (and the requested URL does not exist), it's already a 404. So I think the simplest solution is to use an ErrorDocument, then rewrite it:
RewriteEngine On
ErrorDocument 404 /404.php
RewriteRule ^404.php$ error.php?code=404 [L]
# All your other rules here...
You can do the same for any other HTTP error code.
The problem here is that after the mod_rewrite finishes rewriting the URL, it is resubmitted to the mod_rewrite for another pass. So, the [L] flag only makes the rule last for the current pass. As much better explained in this question, mod_rewrite starting from Apache version 2.3.9, now supports another flag - [END], that makes the current mod_rewrite pass the last one. For Apache 2.2 a number of solutions are offered, but since one of them was a bit clumsy and another didn't work, my current solution is to add another two rules that allow a specific set of files to be accessed while sending 404 for everything else:
RewriteRule ^((images|upload)/.+|style.css|(special|ready|building|feedback|property).php)$ - [QSA,L]
RewriteRule .* - [QSA,L,R=404]
I think your last rule should be
RewriteRule ^(.*)$ error.php?code=404&query=$1 [QSA,L]
You could leave out the parenthesis and the $1 parameter, but maybe it's useful to know, what the user tried to achieve.
Hope, this does the trick!

How to prevent mod_rewrite from rewriting URLs more than once?

I want to use mod_rewrite to rewrite a few human-friendly URLs to arbitrary files in a folder called php (which is inside the web root, since mod_rewrite apparently won't let you rewrite to files outside the web root).
/ --> /php/home.php
/about --> /php/about_page.php
/contact --> /php/contact.php
Here are my rewrite rules:
Options +FollowSymlinks
RewriteEngine On
RewriteRule ^$ php/home.php [L]
RewriteRule ^about$ php/about_page.php [L]
RewriteRule ^contact$ php/contact.php [L]
However, I also want to prevent users from accessing files in this php directory directly. If a user enters any URL beginning with /php, I want them to get a 404 page.
I tried adding this extra rule at the end:
RewriteRule ^php php/404.php [L]
...(where 404.php is a file that outputs 404 headers and a "Not found" message.)
But when I access / or /about or /contact, I always get redirected to the 404. It seems the final RewriteRule is applied even to the internally rewritten URLs (as they now all start with /php).
I thought the [L] flag (on the first three RewriteRules) was supposed to prevent further rules from being applied? Am I doing something wrong? (Or is there a smarter way to do what I'm trying to do?)
[L] flag should be used only in the last rule,
L - Last Rule - Stops the rewriting process here and don’t apply any more rewriting rules & because of that you are facing issues.
I had similar problem. I have a content management system written in PHP and based on Model-View-Control paradigm. The most base part is the mod_rewrite. I've successfully prevent access to PHP files globally. The trick has name THE_REQUEST.
What's the problem?
Rewriting modul rewrites the URI. If the URI matches a rule, it is rewritten and other rules are applied on the new, rewritted URI. But! If the matched rule ends with [L], the engine doesn't terminate in fact, but starts again. Then the new URI doesn't more match the rule ending with [L], continues and matches the last one. Result? The programmer stars saying bad words at the unexpected 404 error page. However computer does, what you say and doesn't do, what you want. I had this in my .htaccess file:
RewriteEngine On
RewriteBase /
RewriteRule ^plugins/.* pluginLoader.php [L]
RewriteCond %{REQUEST_URI} \.php$
RewriteRule .* index.php [L]
That's wrong. Even the URIs beginning with plugins/ are rewritten to index.php.
Solution
You need to apply the rule if and only if the original - not rewritten - URI matches the rule. Regrettably the mod_rewrite does not provide any variable containing the original URI, but it provides some THE_REQUEST variable, which contains the first line of HTTP request header. This variable is invariant. It doesn't change while rewrite engine is working.
...
RewriteCond %{THE_REQUEST} \s.*\.php\s
RewriteRule \.php$ index.php [L]
The regular expression is different. It is not applied on the URI only, but on entire first line of the header, that means on something like GET /script.php HTTP/1.1. But the critical rule is this time applied only if the user is explicitly requesting some PHP-script directly. The rewritten URI is not used.

Apache Rewrite Directory With Exceptions

I am trying to setup a rewrite that will get any pages that are in the news folder (with the exception of index.shtml and template.shtml (where template.shtml will have a get variable news in it). All other pages should rewrite to template.shtml?news=(same name as news/name).
What I have so far is:
RewriteCond %{REQUEST_URI} !^/news/((index|template).shtml)?$
RewriteRule ^news/(.*) /news/template.shtml?news=$1
This seems to exclude the main /news/, but not template.shtml and the rewrite seems to loop.
How can I resolve this? Any help would be appreciated.
Thanks.
Well -- this one works just fine for me:
RewriteCond %{REQUEST_URI} !^/news/(index|template)\.shtml$
RewriteRule ^news/(.+)$ /news/template.shtml?news=$1 [L,QSA]
This rule will ignore requests to /news/index.shtml and /news/template.shtml.
It will also do nothing when requesting just /news/ (as I have changed .* to .+ to be on a safer side).
Anything else will be rewritten to /news/template.shtml?news=whatever
I've also added the QSA flag to preserve any existing query string (useful for keeping referral data, e.g. /news/hello-pink-kitten?source=google will be rewritten as /news/template.shtml?news=hello-pink-kitten&source=google)

Why would mod_rewrite rewrite twice?

I only recently found out about URL rewriting, so I've still got a lot to learn.
While following the Easy Mod Rewrite tutorial, the results of one of their examples is really confusing me.
RewriteBase /
RewriteRule (.*) index.php?page=$1 [QSA,L]
Rewrites /home as /index.php?page=index.php&page=home.
I thought the duplicates might have had been caused by something in my host's configs, but a clean install of XAMPP does the same.
So, does anyone know why this seems to parse twice?
And, to me this seems like, if it's going to do this, it would be an infinite loop -- why does it stop at 2 cycles?
From Example 1 on this page, which is part of the tutorial linked in your question:
Assume you are using a CMS system that rewrites requests for everything to a single index.php script.
RewriteRule ^(.*)$ index.php?PAGE=$1 [L,QSA]
Yet every time you run that, regardless of which file you request, the PAGE variable always contains "index.php".
Why? You will end up doing two rewrites. Firstly, you request test.php. This gets rewritten to index.php?PAGE=test.php. A second request is now made for index.php?PAGE=test.php. This still matches your rewrite pattern, and in turn gets rewritten to index.php?PAGE=index.php.
One solution would be to add a RewriteCond that checks if the file is already "index.php". A better solution that also allows you to keep images and CSS files in the same directory is to use a RewriteCond that checks if the file exists, using -f.
1the link is to the Internet Archive, since the tutorial website appears to be offline
From the Apache Module mod_rewrite documentation:
'last|L' (last rule)
[…] if the RewriteRule generates an internal redirect […] this will reinject the request and will cause processing to be repeated starting from the first RewriteRule.
To prevent this you could either use an additional RewriteCond directive:
RewriteCond %{REQUEST_URI} !^/index\.php$
RewriteRule (.*) index.php?page=$1 [QSA,L]
Or you alter the pattern to not match index.php and use the REQUEST_URI variable, either in the redirect or later in PHP ($_SERVER['REQUEST_URI']).
RewriteRule !^index\.php$ index.php?page=%{REQUEST_URI} [QSA,L]