How to prevent mod_rewrite from rewriting URLs more than once? - apache

I want to use mod_rewrite to rewrite a few human-friendly URLs to arbitrary files in a folder called php (which is inside the web root, since mod_rewrite apparently won't let you rewrite to files outside the web root).
/ --> /php/home.php
/about --> /php/about_page.php
/contact --> /php/contact.php
Here are my rewrite rules:
Options +FollowSymlinks
RewriteEngine On
RewriteRule ^$ php/home.php [L]
RewriteRule ^about$ php/about_page.php [L]
RewriteRule ^contact$ php/contact.php [L]
However, I also want to prevent users from accessing files in this php directory directly. If a user enters any URL beginning with /php, I want them to get a 404 page.
I tried adding this extra rule at the end:
RewriteRule ^php php/404.php [L]
...(where 404.php is a file that outputs 404 headers and a "Not found" message.)
But when I access / or /about or /contact, I always get redirected to the 404. It seems the final RewriteRule is applied even to the internally rewritten URLs (as they now all start with /php).
I thought the [L] flag (on the first three RewriteRules) was supposed to prevent further rules from being applied? Am I doing something wrong? (Or is there a smarter way to do what I'm trying to do?)

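The [L] (Last Rule) flag stops the current pass through the rule set, so no later rules are applied in that pass. In a per-directory (.htaccess) context, however, the rewritten URL is then processed again from the first rule, which is why your final rule still matches and you end up at the 404.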

I had a similar problem. I have a content management system written in PHP and based on the Model-View-Controller paradigm. Its most basic part is mod_rewrite. I've successfully prevented access to PHP files globally. The trick is called THE_REQUEST.
What's the problem?
The rewrite module rewrites the URI. If the URI matches a rule, it is rewritten and the following rules are applied to the new, rewritten URI. But! If the matched rule ends with [L], the engine does not actually terminate; it starts again. This time the new URI no longer matches the rule ending with [L], so processing continues and matches the last rule. The result? The programmer starts swearing at the unexpected 404 error page. However, the computer does what you say, not what you want. I had this in my .htaccess file:
RewriteEngine On
RewriteBase /
RewriteRule ^plugins/.* pluginLoader.php [L]
RewriteCond %{REQUEST_URI} \.php$
RewriteRule .* index.php [L]
That's wrong. Even the URIs beginning with plugins/ are rewritten to index.php.
Solution
You need to apply the rule if and only if the original (not rewritten) URI matches the rule. Regrettably, mod_rewrite does not provide a variable containing the original URI, but it does provide the THE_REQUEST variable, which contains the first line of the HTTP request. This variable is invariant: it doesn't change while the rewrite engine is working.
...
RewriteCond %{THE_REQUEST} \s.*\.php\s
RewriteRule \.php$ index.php [L]
The regular expression is different. It is applied not to the URI alone but to the entire request line, that is, to something like GET /script.php HTTP/1.1. This time the critical rule is applied only if the user explicitly requests a PHP script directly. The rewritten URI is not used.
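Applied to the rules from the question, the same idea might look like the sketch below. It assumes the .htaccess file sits in the web root and that php/404.php exists as described in the question; the exact THE_REQUEST pattern is my own illustration, not something from the original post.

```apache
Options +FollowSymlinks
RewriteEngine On

# Friendly URLs, internally rewritten into the php/ folder
RewriteRule ^$ php/home.php [L]
RewriteRule ^about$ php/about_page.php [L]
RewriteRule ^contact$ php/contact.php [L]

# Send a request to the 404 page only if the client itself asked for /php...
# THE_REQUEST holds the original request line, e.g. "GET /php/home.php HTTP/1.1",
# and is not changed by the internal rewrites above.
RewriteCond %{THE_REQUEST} ^[A-Z]+\s/php(/|\s|\?)
RewriteRule ^php php/404.php [L]
```

With this, /about is rewritten to php/about_page.php, and on the next pass the condition fails because the request line still reads /about, so the 404 rule is skipped.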

Related

Clean URL and php file extension removal

I want to achieve two things with .htaccess:
1. Only show the value of the URL parameter "slug", making the URL clean.
2. For all other PHP pages on my site that don't have the URL parameter "slug", simply remove the file extension ".php".
I have the following .htaccess code which takes care of point 1 above:
RewriteEngine On
RewriteRule ^([a-zA-Z0-9_-]+)$ somepage.php?slug=$1
RewriteRule ^([a-zA-Z0-9_-]+)/$ somepage.php?slug=$1
The question is; how do I incorporate point 2 above in the same code without breaking what is already working in point one?
I have tried simply including the following code below the above but this gives me a 404 error when I go to example.com/somepage:
RewriteEngine On
RewriteCond %{REQUEST_FILENAME} !-f
RewriteRule ^([^\.]+)$ $1.php [NC,L]
You need to prioritise your extensionless .php URLs, so this rule needs to go before your existing rule that rewrites the "slug".
You also need to check that the corresponding .php file exists before rewriting the request. The rule you are currently proposing blindly rewrites the request when the request does not map to a file, so this would naturally catch requests that should otherwise be rewritten to the "slug".
Try it like this instead (in the root .htaccess file):
RewriteEngine On
# Handle extensionless ".php" files
RewriteCond %{DOCUMENT_ROOT}/$1.php -f
RewriteRule ^([^.]+)$ $1.php [L]
# Rewrite to "slug"
RewriteRule ^([\w-]+)/?$ somepage.php?slug=$1 [L]
I've combined your two rules that rewrite to "slug" into one, since the only difference is the trailing slash: just make it optional in the regex. The \w shorthand character class is the same as [a-zA-Z0-9_]. However, consider canonicalising the trailing slash instead (i.e. redirect to one form or the other), since by allowing an optional trailing slash you let two different URLs serve the same content, which potentially creates a duplicate-content issue.
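One way to canonicalise the trailing slash is sketched below. It assumes a root .htaccess file and that you prefer the form without the slash; the 302 status is a deliberately cautious choice while testing, and the directory check is there so that real directories keep their slash.

```apache
RewriteEngine On

# Canonicalise: redirect /something/ to /something, unless it is a real directory
RewriteCond %{REQUEST_FILENAME} !-d
RewriteRule ^(.+)/$ /$1 [R=302,L]

# Handle extensionless ".php" files
RewriteCond %{DOCUMENT_ROOT}/$1.php -f
RewriteRule ^([^.]+)$ $1.php [L]

# Rewrite to "slug" (the trailing slash no longer needs to be optional)
RewriteRule ^([\w-]+)$ somepage.php?slug=$1 [L]
```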
You wrote: "I have tried simply including the following code below the above but this gives me a 404 error when I go to example.com/somepage".
That request would have been rewritten to somepage.php?slug=somepage, so I assume it must have been somepage.php that generated the 404 response, rather than Apache?

.htaccess rewrite returning Error 404

RewriteEngine on
RewriteCond %{QUERY_STRING} (^|&)public_url=([^&]+)($|&)
RewriteRule ^process\.php$ /api/%2/? [L,R=301]
Where domain.tld/app/process.php?public_url=abcd1234 is the actual location of the script.
But I am trying to get .htaccess to make the URL like this: domain.tld/app/api/acbd1234.
Essentially, this hides the process.php script and the GET query ?public_url.
However, the rules above return a 404 Not Found error.
I think this is what you are actually looking for:
RewriteEngine on
RewriteCond %{QUERY_STRING} (?:^|&)public_url=([^&]+)(?:$|&)
RewriteRule ^/?app/process\.php$ /app/api/%1 [R=301,QSD]
RewriteRule ^/?app/api/([^/]+)/?$ /app/process.php?public_url=$1 [END]
If you receive an internal server error (HTTP status 500) for that, check your HTTP server's error log file. Chances are that you are running a very old version of the Apache HTTP server (the [END] flag is only available in version 2.3.9 and later); in that case replace the [END] flag with the [L] flag, which will probably work just fine in this scenario.
And a general hint: you should always prefer to place such rules inside the HTTP server's (virtual) host configuration instead of using dynamic configuration files (.htaccess style files). Those files are notoriously error prone, hard to debug and they really slow down the server. They are only supported as a last option for situations where you do not have control over the host configuration (read: really cheap hosting service providers) or if you have an application that relies on writing its own rewrite rules (which is an obvious security nightmare).
UPDATE:
Based on your many questions in the comments below (we see again how important it is to be precise in the question itself ;-) ) I add this variant implementing a different handling of path components:
RewriteEngine on
RewriteCond %{QUERY_STRING} (?:^|&)public_url=([^&]+)(?:$|&)
RewriteRule ^/?app/process\.php$ /api/%1 [R=301,QSD]
RewriteRule ^/?api/([^/]+)/?$ /app/process.php?public_url=$1 [END]
"I am trying to get .htaccess to make the URL like this: example.com/app/api/acbd1234."
You don't do this in .htaccess. You change the URL in your application and then rewrite the new URL to the actual/old URL. (You only need to redirect this, if the old URLs have been indexed by search engines - but you need to watch for redirect loops.)
So, change the URL in your application to /app/api/acbd1234 and then rewrite this in .htaccess (which I assume is in your /app subdirectory). For example:
RewriteEngine On
# Rewrite new URL back to old
RewriteRule ^api/([^/]+)$ process.php?public_url=$1 [L]
You included a trailing slash in your earlier directive, but you omitted this in your example URL, so I've omitted it here also.
If you then need to also redirect the old URL for the sake of SEO, then you can implement a redirect before the internal rewrite:
RewriteEngine On
# Redirect old URL to new (if requested by search engines or external links)
RewriteCond %{ENV:REDIRECT_STATUS} ^$
RewriteCond %{QUERY_STRING} (?:^|&)public_url=([^&]+)(?:$|&)
RewriteRule ^process\.php$ /app/api/%1? [R=302,L]
# Rewrite new URL back to old
RewriteRule ^api/([^/]+)$ process.php?public_url=$1 [L]
The check against REDIRECT_STATUS is to avoid a rewrite loop. ?: inside the parenthesised subpattern avoids the group being captured as a backreference.
Change the 302 (temporary) to 301 (permanent) only when you are sure it's working OK, to avoid erroneous redirects being cached by the browser.

mod_rewrite: hide real urls but keep available as different files

Possibly this question has already been answered, but I didn't find any answer after hours of searching.
I need to put the site under "maintenance mode" and redirect/rewrite all requests to site_down.html, but at the same time I need the site to be available if I enter the address like files are in a subfolder.
ex:
if I type http://example.com/login.php I need site_down.html to be displayed.
but if I specify http://example.com/test/login.php I need the real login.php to be displayed.
I need this to be done with rewrite, so copying everything to another directory isn't a solution.
I tried a couple of dozen combinations, but I'm still unable to achieve what I need.
This is one version of my .htaccess file:
DirectoryIndex site_down.html
<IfModule mod_rewrite.c>
RewriteEngine On
RewriteRule ^test\/(.*)$ $1 [S=1]
RewriteRule ^(.*\.php)$ site_down.html
RewriteRule .* - [L]
</IfModule>
This code should rewrite all requests with "test/*" to "parent folder" and skip next rewrite rule and then terminate rewriting at RewriteRule .* - [L]. If there is no "test/" in url - all request should be rewritten to site_down.html
What am I doing wrong?
Could you suggest any valid solutions, please?
Thank you.
Essentially, you are looking for two rules. One rule translates the virtual subdirectory to the working files. The other translates direct requests for the working files to a splash page. We just have to make sure that if the first rule matches, the second rule doesn't. We can do this by making sure " /test/" (including that leading space) is not in THE_REQUEST (the string that the client sent to the server to request a page; something in the form of GET /test/mypage.php?apes=bananas HTTP/1.1). THE_REQUEST doesn't change on a rewrite, which makes it perfect for that.
Skipping a rule like you did usually doesn't have the effect you expect, because mod_rewrite makes multiple passes through .htaccess until the resulting URL doesn't change anymore, or until it hits a limit and throws an error. The first time it will skip the rule, but the second time it will not.
RewriteCond %{THE_REQUEST} !\ /test/
RewriteRule \.php site_down.html [L]
RewriteRule ^test/(.*)$ $1 [L]
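If the server runs Apache 2.4, the [END] flag is another way to get the same effect, because it stops rewriting for good instead of only ending the current pass. This is a sketch under that assumption, using the file names from the question; it is an alternative of mine, not the approach given above.

```apache
DirectoryIndex site_down.html
<IfModule mod_rewrite.c>
RewriteEngine On

# /test/login.php is served from login.php; [END] prevents any further pass,
# so the maintenance rule below never sees the rewritten URL
RewriteRule ^test/(.+)$ $1 [END]

# Every other request for a PHP file gets the maintenance page
RewriteRule \.php$ site_down.html [END]
</IfModule>
```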

.htaccess mod_rewrite linking to wrong page

I have in my .htaccess the following code:
RewriteEngine On
RewriteRule ^/?([^/\.]+)/?$ $1.php [L]
RewriteRule ^/?([^/\.]+).php$ $1/ [R,L]
RewriteRule ^/?([^/\.]+)/?$ $1.php [L] is working fine. What this is doing is taking a url like http://www.example.com/whatever and making it read the page as http://www.example.com/whatever.php.
However, what I'd like to be able to do is take a URL like http://www.example.com/whatever.php and automatically send it to http://www.example.com/whatever, hence the second line of the code. However, this isn't working. What it's doing now is this: as soon as it comes across a link ending in .php, the URL becomes http://localhost/C:/Sites/page/whatever/, and I get a 403: Forbidden page.
All I want to know is what I can do so that http://www.example.com/whatever.php will be read as http://www.example.com/whatever, and that if http://www.example.com/whatever.php is entered into the URL bar, it will automatically redirect to http://www.example.com/whatever.
Does that make any sense?
EDIT
OK, so it appears I wasn't all too clear. Basically, I want /whatever/ to read as whatever.php while the URL still stays as /whatever/, right? However, if the URL was /whatever.php, I want it to actually redirect the user's URL to /whatever/, and then once again read it as whatever.php. Is this possible?
If your rules are inside an .htaccess file, you can omit the leading slash when you match against a URI:
RewriteRule ^([^/\.]+)/?$ /$1.php [L]
Also note that a leading slash is included in the target (/$1.php); this makes sure /whatever/ gets rewritten to /whatever.php. When you redirect, if you are missing this leading slash, Apache prepends the document root to it. Thus /whatever.php gets redirected to the document root C:/Sites/page/whatever/. Even if you include the leading slash, this will never work because you're going to cause a redirect loop:
1. Enter "http://www.example.com/whatever.php" in your address bar
2. Apache redirects you to "http://www.example.com/whatever/"
3. Apache gets the URI whatever/, applies the first rule, and the URI gets rewritten to /whatever.php
4. The URI gets put through the rewrite engine again
5. The URI /whatever.php matches the second rule and redirects the browser to "http://www.example.com/whatever/"
6. Repeat steps 3-5
You need to add a condition that the actual request is for /whatever.php:
RewriteCond %{THE_REQUEST} ^(GET|POST|HEAD)\ /([^/\.]+)\.php
RewriteRule ^ /%2/ [R,L]
So altogether, you'll have:
RewriteEngine On
RewriteRule ^([^/\.]+)/?$ /$1.php [L]
RewriteCond %{THE_REQUEST} ^(GET|POST|HEAD)\ /([^/\.]+)\.php
RewriteRule ^ /%2/ [R,L]
You're making a relative path substitution in a per-directory context (.htaccess is a per-directory context). This requires RewriteBase. Per-directory rewrites are done in a later stage of processing, when URLs have been mapped to paths. But the rewrite must produce a URL, which is processed again. I think without the RewriteBase to supply the URL prefix, you end up with a filesystem prefix instead of the URL. That may be why you're getting the C:/Sites thing. Try RewriteBase. But after a correct RewriteBase to specify the correct URL prefix to be tacked in front to the relative rewritten part, I'm afraid you will have the rewrite loop, because you're rewriting whatever.php to whatever; and whatever to whatever.php.
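To illustrate the RewriteBase suggestion, here is a sketch that combines it with a THE_REQUEST condition to avoid the loop. It assumes the .htaccess file is in the document root (hence RewriteBase /); the request-line pattern is my own and only one of several ways to write it.

```apache
RewriteEngine On
# URL prefix that is put in front of relative substitutions
RewriteBase /

# Redirect only when the client itself asked for /whatever.php
RewriteCond %{THE_REQUEST} ^[A-Z]+\s/([^/.\s?]+)\.php[\s?]
RewriteRule ^ %1/ [R=302,L]

# Internally map /whatever or /whatever/ to whatever.php
RewriteRule ^([^/.]+)/?$ $1.php [L]
```

Because THE_REQUEST still reads /whatever/ after the internal rewrite, the redirect rule does not fire on the second pass.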
Reference: http://httpd.apache.org/docs/current/rewrite/tech.html

Apache URL Rewriting

I am trying to get URL rewriting to work on my website. Here is the contents of my .htaccess:
RewriteEngine On
RewriteRule ^blog/?$ index.php?page=blog [L]
RewriteRule ^about/?$ index.php?page=about [L]
RewriteRule ^portfolio/?$ index.php?page=portfolio [L]
#RewriteRule ^.*$ index.php?page=blog [L]
Now the 3 uncommented rewrite rules work perfectly, if I try http://www.mysite.com/blog/, I get redirected to http://www.mysite.com/index.php?page=blog, the same for "about" and "portfolio". However, if I mistype blog, say I try http://www.mysite.com/bloh/, then obviously I get a 404 error. The last rule, the commented one, was to help prevent that. Any URL should get redirected to the blog, but of course this rule is still parsed even if we have successfully used a previous one, so I used the "last" flag ([L]). If I uncomment my last rule, anything, including blog, about, and portfolio, redirect to blog. Shouldn't the "last" flag stop the execution as soon as it finds a matching rule?
Thanks.
Yes, the Last flag means it won't apply any of the rules following this rule in this request.
After rewriting the URL, it makes an internal request using the new rewritten URL which would match your last RewriteRule and thus your redirects go into an infinite loop.
Use the RewriteCond directive to limit rewriting to URLs that don't start with index.php, and you should be fine.
You could add a condition like the following (note that REQUEST_URI includes the leading slash):
RewriteCond %{REQUEST_URI} !^/index\.php
I'll also mention that using RewriteRule ^.*$ is a good way to break all of your media requests (css, js, images) as well. You might want to add some conditions like:
RewriteCond %{REQUEST_FILENAME} !-d
RewriteCond %{REQUEST_FILENAME} !-f
To make sure you're not trying to rewrite actual files or directories that exist on your server. Otherwise they'll be unreachable unless index.php serves those too!
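Put together, the catch-all with these conditions might look like the sketch below. It assumes the .htaccess file is in the document root, and it replaces the commented-out rule from the question.

```apache
RewriteEngine On

RewriteRule ^blog/?$ index.php?page=blog [L]
RewriteRule ^about/?$ index.php?page=about [L]
RewriteRule ^portfolio/?$ index.php?page=portfolio [L]

# Catch-all: skip index.php itself and anything that exists on disk
# (REQUEST_URI starts with a slash, so the pattern does too)
RewriteCond %{REQUEST_URI} !^/index\.php
RewriteCond %{REQUEST_FILENAME} !-f
RewriteCond %{REQUEST_FILENAME} !-d
RewriteRule ^ index.php?page=blog [L]
```

After /blog/ is rewritten to index.php?page=blog, the next pass sees /index.php, the first condition fails, and the catch-all leaves the request alone.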
From Apache's mod_rewrite docs:
'last|L' (last rule)
Stop the rewriting process here and don't apply any more rewrite rules. This corresponds to the Perl last command or the break command in C. Use this flag to prevent the currently rewritten URL from being rewritten further by following rules. Remember, however, that if the RewriteRule generates an internal redirect (which frequently occurs when rewriting in a per-directory context), this will reinject the request and will cause processing to be repeated starting from the first RewriteRule.
You could use
ErrorDocument 404 /index.php?page=blog
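but you should be aware that the blog page is then served as the body of an error response: with a local path like this, Apache keeps the 404 status unless index.php sets a different one (it only sends a redirect when ErrorDocument points to a full URL), and I don't know if that is such a good practice.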
After you [L]eave processing for the request, the whole processing runs again for the new (rewritten) URL. You could get out of that loop by using this before your other rules:
RewriteRule ^index\.php$ - [L]
which means "for index.php, don't rewrite and leave processing."