htaccess rewrite drives me nuts - apache

I want to use a rather simple rewrite, something like this:
RewriteRule monitor.html index.php/\?first_category_id=B008 [NC,L]
But it doesn't work as expected: the request ends up at something like index.php/monitor.html (which kicks in symfony's routing and returns a 404 error, but that is a different story).
However, if I include the full URL like:
RewriteRule monitor.html http://example.com/index.php/\?first_category_id=B008 [NC,L]
it returns the correct content, but this looks like a full redirect: the rewritten URL is revealed in the browser. That's neither transparent nor easily deployable.
What am I missing here?
The rest of the htaccess file, if it matters:
RewriteCond %{REQUEST_URI} \..+$
RewriteRule .* - [L]
RewriteRule ^(.*)$ index.php [QSA,L]

Your rule outputs a relative path and you're in a per-directory context, so you need RewriteBase. In a per-directory context, rewriting is done on expanded filesystem paths, not on the original URLs. But the result of the rewrite is converted back to a URL! RewriteBase supplies the prefix needed to do that. Without it, the URL is naively built from the same filesystem prefix that was stripped before the substitution, and you end up with, for instance, http://example.com/var/www/docroot/blah... which is nonsense. Either set RewriteBase or output an absolute path, beginning with a slash.
Also, you should anchor the match:
RewriteRule ^monitor\.html$ ...
Otherwise the pattern can match anywhere in the path. You don't want a request like amonitor.htmly/foobar to match and be sent to the index.php stuff, right? (The dot should be escaped too; unescaped, it matches any character.)
You should not escape the question mark in the substitution. It's not a regexp! Just index.php/?etc, not index.php/\?etc. (Could that backslash be what is screwing things up, causing index.php/monitor.html?)
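Putting the three points together, a corrected rule might look like the sketch below. It assumes the .htaccess file sits in the document root (hence RewriteBase /) and that the rule is placed before the existing catch-all rules; it has not been tested against the original symfony setup.

```apache
RewriteEngine On
# Assumption: this .htaccess lives in the document root
RewriteBase /

# Anchored pattern, escaped dot, plain (unescaped) "?" in the substitution
RewriteRule ^monitor\.html$ index.php/?first_category_id=B008 [NC,L]
```

If the site is served from a subdirectory, the RewriteBase value would be that subdirectory's URL-path instead.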

Related

.htaccess mod_rewrite linking to wrong page

I have in my .htaccess the following code:
RewriteEngine On
RewriteRule ^/?([^/\.]+)/?$ $1.php [L]
RewriteRule ^/?([^/\.]+).php$ $1/ [R,L]
RewriteRule ^/?([^/\.]+)/?$ $1.php [L] is working fine. What this is doing is taking a url like http://www.example.com/whatever and making it read the page as http://www.example.com/whatever.php.
However, what I'd like to be able to do is take a url like http://www.example.com/whatever.php and automatically send it to http://www.example.com/whatever, hence the second line of the code. However, this isn't working. What it's doing now is this: as soon as it comes across a link ending in .php, the url becomes http://localhost/C:/Sites/page/whatever/, which returns a 403 Forbidden page.
All I want to know is what I can do so that http://www.example.com/whatever.php will be read as http://www.example.com/whatever, and so that if http://www.example.com/whatever.php is entered into the URL bar, it will automatically redirect to http://www.example.com/whatever.
Does that make any sense?
EDIT
Ok, so it appears I wasn't all too clear.. basically, I want /whatever/ to read as whatever.php while the URL still stays as /whatever/, right? However, if the URL was /whatever.php, I want it to actually redirect the users URL to /whatever/, and then once again read it as whatever.php. Is this possible?
If your rules are inside an .htaccess file, you can omit the leading slash when you match against a URI:
RewriteRule ^([^/\.]+)/?$ /$1.php [L]
Also note that a leading slash is included in the target (/$1.php); this makes sure /whatever/ gets rewritten to /whatever.php. When you redirect, if this leading slash is missing, Apache prepends the document root to the target. Thus /whatever.php gets redirected to the document root C:/Sites/page/whatever/. Even if you include the leading slash, this will never work, because you're going to cause a redirect loop:
1. Enter "http://www.example.com/whatever.php" in your address bar
2. Apache redirects you to "http://www.example.com/whatever/"
3. Apache gets the URI whatever/, applies the first rule, and rewrites the URI to /whatever.php
4. The URI gets put through the rewrite engine again
5. The URI /whatever.php matches the second rule and redirects the browser to "http://www.example.com/whatever/"
6. Repeat steps 3-5
You need to add a condition that the actual request is for /whatever.php:
RewriteCond %{THE_REQUEST} ^(GET|POST|HEAD)\ /([^/\.]+)\.php
RewriteRule ^ /%2/ [R,L]
So altogether, you'll have:
RewriteEngine On
RewriteRule ^([^/\.]+)/?$ /$1.php [L]
RewriteCond %{THE_REQUEST} ^(GET|POST|HEAD)\ /([^/\.]+)\.php
RewriteRule ^ /%2/ [R,L]
You're making a relative path substitution in a per-directory context (.htaccess is a per-directory context). This requires RewriteBase. Per-directory rewrites are done in a later stage of processing, when URLs have already been mapped to paths. But the rewrite must produce a URL, which is processed again. I think that without RewriteBase to supply the URL prefix, you end up with a filesystem prefix instead of the URL prefix. That may be why you're getting the C:/Sites thing. Try RewriteBase. But even after a correct RewriteBase supplies the URL prefix to be tacked in front of the relative rewritten part, I'm afraid you will still have the rewrite loop, because you're rewriting whatever.php to whatever, and whatever to whatever.php.
Reference: http://httpd.apache.org/docs/current/rewrite/tech.html
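As a minimal sketch of the RewriteBase suggestion, assuming the .htaccess file is in the document root so that the URL prefix is simply /, the internal rewrite could stay relative:

```apache
RewriteEngine On
# Assumption: .htaccess is in the document root, so the URL prefix is "/"
RewriteBase /

# Relative substitution; RewriteBase supplies the URL prefix
RewriteRule ^([^/.]+)/?$ $1.php [L]
```

This only addresses the filesystem-path symptom. The redirect from whatever.php back to whatever/ still needs the THE_REQUEST condition to avoid the loop.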

How to do a mod_rewrite redirection to relative URL

I am trying to achieve a basic URL redirection for pretty-URLs, and due to images, CSS etc. also residing in the same path I need to make sure that if the URL is accessed without a trailing slash, it is added automatically.
This works fine if I put the absolute URL like this:
RewriteRule ^myParentDir/([A-Z0-9_-]+)$ http://www.mydomain.com/myParentDir/$1/ [R,nc,L]
But if I change this to a relative URL, so that I don't have to change it each time I move things in folders, this simply doesn't work.
These are the variants I tried; none of them work, or they redirect me to the actual internal directory path of the server, like /public_html/... :
RewriteRule ^myParentDir/([A-Z0-9_-]+)$ ./myParentDir/$1/ [R,nc,L]
RewriteRule ^myParentDir/([A-Z0-9_-]+)$ myParentDir/$1/ [R,nc,L]
What is the right way to do a URL redirection so that if the user enters something like:
http://www.mydomain.com/somedir/myVirtualParentDir/myVirtualSubdir
he gets redirected to (via HTTP 301 or 302):
http://www.mydomain.com/somedir/myVirtualParentDir/myVirtualSubdir/
Thanks.
EDIT: Adding some more details because it does not seem to be clear.
Lets say I am implementing a gallery, and I want to have pretty URLs using mod_rewrite.
So, I would like to have URLs as follows:
http://www.mydomain.com/somedir/galleries/cats
which shows thumbnails of cats, while:
http://www.mydomain.com/somedir/galleries/cats/persian
shows the single image named persian from those thumbnails.
So in actual fact the physical directory structure and rewriting would be as follows:
http://www.domain.com/somedir/gallery.php?category=cats&image=persian
So what I want to do is put a .htaccess file in /somedir which catches all requests made to /galleries and depending on the virtual subdirectories following it, use them as placeholders in the rewriting, with 2 rewrite rules:
RewriteRule ^galleries/([A-Z0-9_-]+)/$ ./gallery.php?category=$1 [nc]
RewriteRule ^galleries/([A-Z0-9_-]+)/+([A-Z0-9_-]+)$ ./gallery.php?category=$1&image=$2 [nc]
Now the problem is that the gallery script in fact needs some CSS, Javascript and Images, located at http://www.domain.com/somedir/css, http://www.domain.com/somedir/js, and http://www.domain.com/somedir/images respectively.
I don't want to hardcode any absolute URLs, so the CSS, JS and Images will be referred to using relative URLs, (./css, ./js, ./images etc.). So I can do rewriting URLs as follows:
RewriteRule ^galleries/[A-Z0-9_-]+/css/(.*)$ ./css/$1 [nc]
The problem is that since http://www.domain.com/somedir/galleries/cats is a virtual directory, the above only works if the user types:
http://www.domain.com/somedir/galleries/cats/
If the user omits the trailing slash, mod_dir will not add it, because this directory does not actually exist.
If I put a redirect rewrite with the absolute URL it works:
RewriteRule ^galleries/([A-Z0-9_-]+)$ http://www.mydomain.com/subdir/galleries/$1/ [R,nc,L]
But I don't want to have the URL prefix hardcoded because I want to be able to put this on whatever domain I want in whatever subdir I want, so I tried this:
RewriteRule ^galleries/([A-Z0-9_-]+)$ galleries/$1/ [R,nc,L]
But instead it redirects to:
http://www.mydomain.com/home/myaccount/public_html/subdir/galleries/theRest
which obviously is not what I want.
EDIT: Further clarifications
The solution I am looking for is to avoid hardcoding the domain name or folder paths in .htaccess. I am looking for a solution where if I package the .htaccess with the rest of the scripts and resources, wherever the user unzips it on his web server it works out of the box. All works like that apart from this trailing slash issue.
So any solution which involves hardcoding the parent directory or the webserver's path in .htaccess in any way is not what I am looking for.
Here's a solution straight from the Apache Documentation (under "Trailing Slash Problem"):
RewriteCond %{REQUEST_FILENAME} -d
RewriteRule ^(.+[^/])$ $1/ [R]
Here's a solution that tests the REQUEST_URI for a trailing slash, then adds it:
RewriteCond %{REQUEST_URI} !(/$|\.)
RewriteRule (.+) http://www.example.com/$1/ [R=301,L]
Here's another solution that allows you to exempt certain REQUEST_URI patterns:
RewriteCond %{REQUEST_FILENAME} !-f
RewriteCond %{REQUEST_URI} !example.php
RewriteCond %{REQUEST_URI} !(.*)/$
RewriteRule ^(.*)$ http://domain.com/$1/ [L,R=301]
Hope these help. :)
This rule should add a trailing slash to any URL which is not a real file/directory (which is, I believe, what you need since Apache usually does the redirect automatically for existing directories).
RewriteCond %{REQUEST_FILENAME} !-f
RewriteCond %{REQUEST_FILENAME} !-d
RewriteRule ^(.+[^/])$ $1/ [L,R=301]
Edit:
In order to prevent Apache from appending the path relative to the document root, you have to use RewriteBase. So, for instance, in the folder meant to be your application's root, you add the following, which overrides the physical path:
RewriteBase /
This might work:
RewriteRule ^myParentDir/[A-Z0-9_-]+$ %{REQUEST_URI}/ [NS,L,R=301]
However, I'm not sure why you think you need this at all. Just make your CSS / JS / image file rewrite rule look something like this:
RewriteRule ^galleries/([A-Za-z0-9_-]+/)*(css|js|images)/(.*)$ ./$2/$3
and everything should work just fine regardless of whether the browser requests /somedir/galleries/css/whatever.css or /somedir/galleries/cats/css/whatever.css or even /somedir/galleries/cats/persian/calico/css/whatever.css.
P.S. One problem with this rule is that it prevents you from having any galleries named "css", "js" or "images". You might want to fix that by naming those virtual directories something like ".css", ".js" and ".images", or by using some other naming scheme that doesn't conflict with valid gallery names.
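If the redirect is still wanted, the %{REQUEST_URI} idea can be combined with the file and directory checks from the other answers. The sketch below is untested and assumes the .htaccess file sits in the directory that contains the virtual galleries path. Because %{REQUEST_URI} already holds the full URL-path of the request, no domain or parent directory is hardcoded.

```apache
RewriteEngine On

# Only act on requests that do not map to a real file or directory
RewriteCond %{REQUEST_FILENAME} !-f
RewriteCond %{REQUEST_FILENAME} !-d
# Redirect galleries/<name> to the same URL-path plus a trailing slash
RewriteRule ^galleries/[A-Za-z0-9_-]+$ %{REQUEST_URI}/ [R=301,L]
```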
I'm not sure I completely understand your problem.
The trailing slash redirection is done automatically on most Apache installations by the mod_dir module (there's a 99% chance you have mod_dir).
You may need to add:
DirectorySlash On
But it's the default value.
So, if you access foo/bar and bar is not a file in the foo directory but a subdirectory, mod_dir performs the redirection to foo/bar/.
The only thing I know of that could break this is the MultiViews option, which may be trying to find a bar.php or bar.a-mime-extension-known-by-apache in the directory. So you could try to add:
Options -MultiViews
And remove all RewriteRules. If you do not get this default Apache behavior, you may have to look at mod_rewrite, but it's like using a nuclear bomb to kill a spider. Nuclear bombs can be quite touchy to use well.
EDIT:
For the trailing slash problem with mod_rewrite, you can check the documentation howto, which states that this should work:
RewriteEngine on
RewriteBase /myParentDir/
RewriteCond %{REQUEST_FILENAME} -d
RewriteRule ^(.+[^/])$ $1/ [R]

How to prevent mod_rewrite from rewriting URLs more than once?

I want to use mod_rewrite to rewrite a few human-friendly URLs to arbitrary files in a folder called php (which is inside the web root, since mod_rewrite apparently won't let you rewrite to files outside the web root).
/ --> /php/home.php
/about --> /php/about_page.php
/contact --> /php/contact.php
Here are my rewrite rules:
Options +FollowSymlinks
RewriteEngine On
RewriteRule ^$ php/home.php [L]
RewriteRule ^about$ php/about_page.php [L]
RewriteRule ^contact$ php/contact.php [L]
However, I also want to prevent users from accessing files in this php directory directly. If a user enters any URL beginning with /php, I want them to get a 404 page.
I tried adding this extra rule at the end:
RewriteRule ^php php/404.php [L]
...(where 404.php is a file that outputs 404 headers and a "Not found" message.)
But when I access / or /about or /contact, I always get redirected to the 404. It seems the final RewriteRule is applied even to the internally rewritten URLs (as they now all start with /php).
I thought the [L] flag (on the first three RewriteRules) was supposed to prevent further rules from being applied? Am I doing something wrong? (Or is there a smarter way to do what I'm trying to do?)
The [L] flag should be used only on the last rule.
L - Last Rule - stops the rewriting process here and doesn't apply any more rewriting rules, and because of that you are facing issues.
I had a similar problem. I have a content management system written in PHP and based on the Model-View-Controller paradigm. Its most basic part is mod_rewrite. I've successfully prevented access to PHP files globally. The trick is called THE_REQUEST.
What's the problem?
The rewrite module rewrites the URI. If the URI matches a rule, it is rewritten and the remaining rules are applied to the new, rewritten URI. But! If the matched rule ends with [L], the engine doesn't actually terminate; it starts again with the new URI. The new URI then no longer matches the rule ending with [L], so processing continues and it matches the last rule. Result? The programmer starts swearing at the unexpected 404 error page. However, the computer does what you say, not what you want. I had this in my .htaccess file:
RewriteEngine On
RewriteBase /
RewriteRule ^plugins/.* pluginLoader.php [L]
RewriteCond %{REQUEST_URI} \.php$
RewriteRule .* index.php [L]
That's wrong. Even the URIs beginning with plugins/ are rewritten to index.php.
Solution
You need to apply the rule if and only if the original (not rewritten) URI matches the rule. Regrettably, mod_rewrite does not provide any variable containing the original URI, but it does provide the THE_REQUEST variable, which contains the first line of the HTTP request. This variable is invariant: it doesn't change while the rewrite engine is working.
...
RewriteCond %{THE_REQUEST} \s.*\.php\s
RewriteRule \.php$ index.php [L]
The regular expression is different. It is not applied on the URI only, but on entire first line of the header, that means on something like GET /script.php HTTP/1.1. But the critical rule is this time applied only if the user is explicitly requesting some PHP-script directly. The rewritten URI is not used.
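Applied to the rules from the question, the same idea might look like the sketch below. It is untested, and it answers direct requests with a plain 404 status through the R=404 flag rather than the 404.php script from the question, which is a swapped-in simplification.

```apache
Options +FollowSymlinks
RewriteEngine On

# Direct requests for /php or /php/...: THE_REQUEST holds the original
# request line (e.g. "GET /php/home.php HTTP/1.1") and is never rewritten
RewriteCond %{THE_REQUEST} ^[A-Z]+\s/php(/|\?|\s)
RewriteRule ^php(/|$) - [R=404,L]

# Internal rewrites; these do not change THE_REQUEST
RewriteRule ^$ php/home.php [L]
RewriteRule ^about$ php/about_page.php [L]
RewriteRule ^contact$ php/contact.php [L]
```

A request for /about is rewritten to php/about_page.php, but its request line still reads GET /about, so the 404 rule does not fire on the second pass.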

mod_rewrite with GET requests

I have mod_rewrite working on most of my site. Right now I have a search that normally would point to
search.php?keyword=KEYWORD
And I'm trying to rewrite that to
search/?keyword=KEYWORD
Just to make it a little bit cleaner. So here's my mod_rewrite. (There are other rules I'm just posting the one that isn't working.)
RewriteRule ^search/?keyword=([^/\.]+)/?$ search.php?search=$1
When I type a search in the address bar the way I want it to be, I get a page telling me it's a "broken link" (I'm guessing that's Chrome's equivalent of a 404 error). So what am I doing wrong? I think the problem is the '=' or the '?' sign in the rule (the first part), because when I take the ?keyword= part out, it works. Does that make sense?
EDIT: This is my full .htaccess code:
RewriteEngine on
RewriteRule ^$ index.php
RewriteRule ^thoughts$ archives.php
RewriteRule ^thoughts/$ archives.php
RewriteRule ^about$ about.php
RewriteRule ^about/$ about.php
RewriteRule ^search/\?keyword=([^/]+)$ search.php?search=$1
RewriteRule ^tags/([^/]+)$ tags.php?tag=$1
RewriteRule ^thoughts/([^/]+)$ post.php?title=$1 [L]
Still getting an error page...
If you just want to transform:
search.php?keyword=KEYWORD
into:
search/?keyword=KEYWORD
all you need to do is:
RewriteRule ^search/$ search.php [QSA]
The QSA flag means "query string append", and passes to search.php whatever you request via GET:
search/?keyword=KEYWORD
search/?name=value&name2=value2
You may also want to check out Apache MultiViews, which maps a /foo request to a matching foo.* file it finds in the / directory, although it is often considered bad practice.
RewriteRule ^search/\?keyword=([^/.]+)/?$ search.php?search=$1
The question mark character has special meaning in a regex. You need to escape it.
Additionally, the dot has no special meaning when inside a character class; you need not escape it (you're requiring that keyword contain no forward slashes and dots).
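One more thing worth checking: as the mod_rewrite documentation describes it, a RewriteRule pattern is matched against the URL-path only, so the ?keyword=... part never reaches the pattern, escaped or not. A hedged sketch that maps keyword to the search parameter through a RewriteCond on QUERY_STRING (untested against the asker's site):

```apache
RewriteEngine on

# The query string is not part of what RewriteRule matches; test it separately
RewriteCond %{QUERY_STRING} (?:^|&)keyword=([^&]+)
# %1 is the first capture group of the RewriteCond above
RewriteRule ^search/?$ search.php?search=%1 [L]
```

Because the substitution carries its own query string and QSA is not set, the original keyword=... query string is replaced by search=....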

What's wrong with my RewriteRule via .htaccess? (Why does it need RewriteBase?)

rewriteengine on
rewriterule ^/a/b$ ^/c$
is not working, but
rewriteengine on
rewritebase /
rewriterule ^a/b$ ^c$
works.
It's probably not the RewriteBase that makes the rule work so much as the leading slash. Also, the second argument to RewriteRule isn't a regular expression. Instead, try:
RewriteRule ^/?a/b$ c
When applying a RewriteRule from .htaccess, the leading slash will be stripped from the URL, which will cause the pattern to fail if you include it. By starting a pattern with "^/?", it will work in the main configuration files and in per-directory config files.
Read over the detailed mod_rewrite documentation for the details of how the rewrite engine works and the significance of RewriteBase.
Edit:
As mentioned in the mod_rewrite technical details and described in the documentation for RewriteRule and RewriteBase, the URL has been translated to a path by the time the per-directory rewrite rules are evaluated. The rewrite engine no longer has a URL to work with. Instead, it removes the local directory prefix (the directory holding the .htaccess file), which ends with a slash. For example, suppose a visitor's request maps to "/var/www/foo/bar/baz.html" and there is a rewrite rule set in "/var/www/foo/.htaccess". For each rule, the rewrite engine will strip "/var/www/foo/", leaving "bar/baz.html" to be matched against the rewrite rule. After processing a rule, the local directory prefix is prepended (unless the replacement begins with "http://"). After all the rewriting rules have been processed, the rewrite base, if set, replaces the local directory prefix; if not, the document root is stripped. The rewritten URL is then re-injected as a sub-request.
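The walk-through above can be annotated on a concrete rule. The paths, the DocumentRoot of /var/www, and the qux directory are hypothetical and only serve to illustrate the prefix handling:

```apache
# File: /var/www/foo/.htaccess  (DocumentRoot assumed to be /var/www)
RewriteEngine On
RewriteBase /foo/

# Request URL:         /foo/bar/baz.html
# Mapped to path:      /var/www/foo/bar/baz.html
# Prefix stripped:     /var/www/foo/
# Pattern matches on:  bar/baz.html
RewriteRule ^bar/(.+)\.html$ qux/$1.html [L]
# Substitution result: qux/baz.html
# With RewriteBase:    /foo/qux/baz.html is re-injected as the new URL
```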
What version of Apache are you using? RewriteBase should not be necessary when you are rewriting from the root. If you are not, you may need it. For instance, a part of my current configurations (Apache 2.2) for one of my blogs looks as follows, and works:
RewriteEngine On
RewriteRule ^/$ /blog/ [R]
RewriteRule ^/wordpress/(.*) /blog/$1 [R]