Apache .htaccess to redirect index.html to root, why FollowSymlinks and RewriteBase? - apache

In order to redirect all somefolder/index.html (and also somefolder/index.htm) to somefolder/ I use this simple rewrite rule in Apache .htaccess file:
RewriteEngine on
RewriteCond %{THE_REQUEST} ^.*\/index\.html?\ HTTP/
RewriteRule ^(.*)index\.html?$ "/$1" [R=301,L]
This works well!
But at Google groups they suggest to add also:
Options +FollowSymlinks
RewriteBase /
Could anyone explain why I would need to add these last lines, what they mean, and what they do?
Is there a potential security risk in not adding these lines?
Many thanks,

Why they're suggested:
It's suggested that you add Options +FollowSymlinks because symlink following must be enabled for mod_rewrite to work in a per-directory (.htaccess) context, and there's a chance that, while you may be allowed to turn it on, it's not enabled by the main server configuration. I suspect the reason that symlink following is necessary is because the module makes a number of calls to apr_stat(), which looks like it needs to follow symlinks in order to get file information in all cases.
As for RewriteBase, it's typically not necessary. The documentation goes on about it, but as most people's files do live under the DocumentRoot somewhere, it usually only serves a purpose if you're redirecting externally and you use directory-relative URLs. To illustrate what I mean, consider the following:
RewriteEngine On
RewriteRule ^redirect index.html [R,L]
A request for example.com/redirect will result in an external redirect to example.com/full/path/to/web/root/index.html. The reason for this is that before it handles the redirection, mod_rewrite re-appends the current directory path (which is the default value of RewriteBase). If you modified RewriteBase to be /, then the path information would be replaced with that string, so a request for index.html would now be a request for /index.html.
Note that you could just have done this explicitly on the replace too, regardless of the value of RewriteBase:
RewriteEngine On
RewriteRule ^redirect /index.html [R,L]
...works as intended, for example. However, if you had many rules that needed a common base and were being shifted around between directories, or your content wasn't under the root, it would be useful to appropriately set RewriteBase in that case.
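As a minimal sketch of the RewriteBase case described above, assuming the .htaccess file sits in the document root and that index.html exists there (both are assumptions made for illustration), the directory-relative substitution can be combined with an explicit base:

```apache
# Sketch only: assumes this file is /.htaccess in the document root.
Options +FollowSymLinks
RewriteEngine On

# Relative substitutions are prefixed with this URL path instead of the
# filesystem path of the current directory.
RewriteBase /

# A request for /redirect is sent to /index.html via an external redirect.
RewriteRule ^redirect$ index.html [R,L]
```

Note that the Options line only takes effect if the server configuration allows Options to be overridden in .htaccess files; where it does not, that line would have to be dropped or moved to the main configuration.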
The risk of not using them:
There's absolutely no security risk in not specifying Options +FollowSymlinks, because if you don't and it's not set by the main server configuration, mod_rewrite will always return 403 Forbidden. That's kind of problematic for people trying to view your content, but it definitely doesn't give them any extended opportunity to exploit your code.
Not setting RewriteBase could expose the path to your web content if you had an improperly configured rule set in one of your .htaccess files, but I'm not sure that there's any reason to consider that a security risk.

Related

htaccess doesn't work after moving from webhost to local Synology host

I have a hosted website where I use the following htaccess file for formatting of URLs. These all work fine. The host uses Apache, but unfortunately doesn't show a version number. I think it's 2.4.
Options +FollowSymLinks
RewriteEngine on
RewriteCond %{HTTPS} off
RewriteRule (.*) https://%{HTTP_HOST}%{REQUEST_URI} [R,L]
RewriteRule ^item/([0-9]+)/(.*) /item.php?item=$1&title=$2 [L]
RewriteRule ^category/(.*) /showitems.php?category=$1 [L]
RewriteRule ^search/(.*) /searching.php?options=$1 [L]
RewriteRule ^searching/(.*) /showitems.php?search=$1 [L]
RewriteRule ^update/(.*) /showitems.php?update=$1 [L]
AddType application/x-httpd-lsphp .html .htm .shtml
I copied the entire site to my local Synology Diskstation with Apache 2.4.
The rewrite URLs for category, search and update work fine. However, the URLs for 'searching' and 'item' return 404 errors. 'Searching' is a header redirect from within 'searching.php'.
Item is an oddity in the sense that it uses 2 get params in the result url. In trial and error mode I changed it to:
RewriteRule ^item/(.*) /item.php?item=$1 [L]
Which doesn't work either, however
RewriteRule ^itemitem/(.*) /item.php?item=$1 [L]
Works fine, which really puzzles me. This last rewrite also doesn't work when I add the second parameter again.
What am I missing? Or is there a better way to approach these rewrites in the first place that I could try?
What CBroe commented on your post is probably the correct answer.
To be more specific and explain why: if your Apache 2.x server has MultiViews enabled, it will try to match things such as directory names and file names on your behalf to make the user experience easier.
However, this can in many cases confuse your rewrite rules that are expecting very specific regular expressions.
In most cases you can get away with just disabling MultiViews.
You disable it by adding the following to the Options in your .htaccess file:
-MultiViews
Please note that if the AllowOverride directive does not allow this, then you need to change the Apache configuration file for that vhost/directory to include the -MultiViews option.
You can read more about MultiViews here:
http://httpd.apache.org/docs/current/content-negotiation.html
The other thing and perhaps more correct way is to build your rewrite rules better and take advantage of things like the "!-f" or "!-d" parameter to let your rules know that you don't want to include files or directories and so forth.
A side note on MultiViews, by the way: even though it's very fancy and neat, if you are running a production site, please know that MultiViews creates a whole lot of unwanted disk I/O and can slow down the Apache server's performance quite a bit! So it's always good practice to disable MultiViews.
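Putting the two suggestions above together, a hedged sketch for the 'item' rule from the question might look like this. It assumes that Options may be overridden in .htaccess and that item.php lives in the document root, as in the question. The likely mechanism is that MultiViews resolves /item/... to item.php before the rule is consulted, which would also explain why the 'itemitem' variant works.

```apache
# Sketch only: disable content negotiation so "item/..." is not
# silently resolved to item.php before the rewrite rule runs.
Options +FollowSymLinks -MultiViews
RewriteEngine On

# Only rewrite when the request does not map to a real file or directory.
RewriteCond %{REQUEST_FILENAME} !-f
RewriteCond %{REQUEST_FILENAME} !-d
RewriteRule ^item/([0-9]+)/(.*)$ /item.php?item=$1&title=$2 [L]
```

The two RewriteCond lines apply only to the RewriteRule that immediately follows them, so they would need to be repeated for any other rule that should be guarded in the same way.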

apache mod_rewrite user URLs to index.php

I've always had a fairly standard apache configuration. Right now we're introducing a new concept, user session specific URLs that's going to change things. Basically we have a DocumentRoot and anything such as:
http://example.com/ would hit index.html in the DocumentRoot directive.
But now I'd like to be able to do something like
http://example.com/uid/5/
http://example.com/uid/2
Those should still hit index.html in the DocumentRoot that has been set. The URL is mostly for server-side and client-side scripts to be able to carry out their own tasks.
What's the best way to handle this in Apache? Is mod_rewrite even necessary here?
I also need to be able to support existing paths such as say the following:
http://example.com/foo/bar/something.php will be rewritten to http://example.com/uid/3/foo/bar/something.php but will still hit the same place on the filesystem as before.
You could use mod_rewrite by putting this code in your htaccess
RewriteEngine On
RewriteRule ^uid/([1-9][0-9]*)/(.+)$ /$2?uid=$1 [L]
Example:
http://example.com/foo/bar/something.php -> unchanged
http://example.com/uid/3/foo/bar/something.php -> rewritten to /foo/bar/something.php?uid=3
EDIT: without uid appended
RewriteEngine On
RewriteRule ^uid/[1-9][0-9]*/(.+)$ /$1 [L]
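The rules above require something after the user id, so the bare /uid/5/ and /uid/2 URLs from the question would not match them. A possible extension, sketched under the assumption that index.html sits directly in the DocumentRoot, is to add a rule for that case first:

```apache
RewriteEngine On

# /uid/5 and /uid/5/ are served by index.html in the DocumentRoot.
RewriteRule ^uid/[1-9][0-9]*/?$ /index.html [L]

# /uid/3/foo/bar/something.php is served by /foo/bar/something.php.
RewriteRule ^uid/[1-9][0-9]*/(.+)$ /$1 [L]
```

Both rewrites are internal, so the browser keeps showing the /uid/... URL and client-side scripts can still read the id from it.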

Apache httpd.conf rewrite rules

I have some rewrite rules in my httpd.conf file. Is there a way to get apache to check the rewrite rules only if the url is not valid? My rewrite rules are preceded by checks for the REQUEST_FILENAME being a valid file, and a valid folder. But the documentation mentions that the rewrite conditions are checked only AFTER it finds a match for the rewrite rule.
So, whenever there is a request for a URL, apache checks each rewrite rule for that URL. Almost all the pages have images, .js and .css files and a few more files with them. Apache checks those too, against the rewrite rules in the httpd.conf (I see this in the RewriteLog generated for each URL). This significantly slows down the site.
I am aware of the FallbackResource directive. I don't want to use it as of now, because it returns a http status code of 200 by default. I want to return the correct status code (usually a 301) whenever there is a request for a page that was not found by Apache (usually, the incorrect URL has a correct counterpart, hence the need to send a 301). Sending the correct http status code also benefits our seo efforts. If there is a way to send the correct http status code using the FallbackResource directive, I would be open to using that option.
I have tried googling for these issues, and didn't find an answer. I have tried with different RewriteCond (s) but, like the documentation says, each rewriterule is checked anyways.
Any pointers on this would be of much help.
It does appear that there's some reading for you to do, but I always use this as my "bible" when it comes to rewrite rules, and it has never failed me. Perhaps it will do the same for you.
http://corz.org/server/tricks/htaccess2.php
Why not use the following:
RewriteEngine On
RewriteCond %{REQUEST_FILENAME} !-f
RewriteCond %{REQUEST_FILENAME} !-d
RewriteRule ^.*$ /not-found.php [L]
I would put this in a .htaccess file located in the root folder. That way you can easily customize it for each site.
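To address the performance concern from the question, one option is to let static assets leave the rule set early so that later rules are never tested against them. The following is only a sketch: the extension list and the old-page.html / new-page.html names are hypothetical and would need to be replaced with the site's own.

```apache
RewriteEngine On

# Early exit: requests for common static assets skip all later rules.
RewriteRule \.(?:css|js|png|jpe?g|gif|ico)$ - [L]

# Hypothetical mapping of a moved page to its new URL with a 301.
RewriteRule ^old-page\.html$ /new-page.html [R=301,L]

# Anything else that is neither a file nor a directory goes to the handler.
RewriteCond %{REQUEST_FILENAME} !-f
RewriteCond %{REQUEST_FILENAME} !-d
RewriteRule ^ /not-found.php [L]
```

Because the last rule is an internal rewrite, the response status is whatever not-found.php sends, so the script itself would have to emit the 301 or 404 that the question asks for.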

RewriteRule for mapping x.domain.com to y.domain.com

Is it possible to redirect all requests to x.domain.com/.* to y.domain.com/.* WITHOUT letting this redirection be visible in the url?
I have unsuccessfully tried several things in .htaccess. Just specifying the [L] flag still shows this redirection in the url (as it does when I use the [R] flag additionally).
EDIT: as somebody claimed there being no reason for this, let me give some more information :)
I have one nice url: x.domain.com , which is well known.
Then there are a number of other domains: spring.domain.com , summer.domain.com , autumn.domain.com, winter.domain.com .
Depending on the time of the year, a specific y.domain.com becomes the current one. The x.domain.com should always map to the current one.
EDIT2:
I'll write here, as the code isn't nicely rendered in the comments...
I tried what Arjan suggested:
RewriteCond %{HTTP_HOST} ^x.domain.com$
RewriteRule ^(.*)$ /path/to/y.domain.folder/$1
Unfortunately, though, this keeps redirecting forever. :(
Request exceeded the limit of 10 internal redirects due to probable configuration error. Use 'LimitInternalRecursion' to increase the limit if necessary. Use 'LogLevel debug' to get a backtrace.
Putting the [R] flag behind, I see in the url something like:
http://x.domain.com/path/to/y.domain.folder/path/to/y.domain.folder/path/to/y.domain.folder/ ...
Any suggestions?
Now that I can read the error logs, I can give a direct response as to what a possible 500 error refers to.
Assuming you have access to the Apache configuration, create the following virtual host for domain x.domain.com. Then simply update y to whatever you need each season.
<VirtualHost ...:80>
ServerName x.domain.com
UseCanonicalName Off
ProxyRequests Off
<Proxy *>
Order Allow,Deny
Allow from all
</Proxy>
ProxyPreserveHost Off
RewriteEngine On
RewriteRule ^$ http://y.domain.com/ [P,NC]
RewriteRule ^/(.*)$ http://y.domain.com/$1 [P,NC]
ProxyPassReverse / http://y.domain.com/
</VirtualHost>
Also to pick up the Alias suggestions, if you have multiple virtual hosts (one for each season) then you could put a server alias into the current domain. E.g.
<VirtualHost ...:80>
ServerName summer.domain.com
ServerAlias x.domain.com
...
</VirtualHost>
<VirtualHost ...:80>
ServerName spring.domain.com
...
</VirtualHost>
...
This would make Apache deliver the summer.domain.com pages if you go to x.domain.com. If your seasonal subdomains depend on the HOST header line to be set correctly (i.e. to season.domain.com) you would need to use the first suggestion above, though.
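For comparison, the proxying virtual host can also be written without mod_rewrite. This is a minimal sketch, assuming Apache 2.4 with mod_proxy and mod_proxy_http loaded; the *:80 address is an assumption standing in for the elided address in the examples above.

```apache
# Sketch only: reverse-proxy everything on x.domain.com to y.domain.com.
<VirtualHost *:80>
    ServerName x.domain.com

    # Act as a reverse proxy only, never as an open forward proxy.
    ProxyRequests Off

    # Send "Host: y.domain.com" to the backend instead of x.domain.com.
    ProxyPreserveHost Off

    ProxyPass        / http://y.domain.com/
    # Rewrite Location headers in backend redirects back to x.domain.com.
    ProxyPassReverse / http://y.domain.com/
</VirtualHost>
```

Switching seasons would then mean changing y.domain.com in the two Proxy lines and reloading the server.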
If these are not hosted on the same server, then you'd need the Proxy flag. This also requires the proxy module to be running. Not tested:
RewriteCond %{HTTP_HOST} ^x.domain.com$
RewriteRule ^(.*)$ http://y.domain.com/$1 [P]
EDIT: Given the edits to your question they're probably just on the same server. So then indeed, as jetru suggested an Alias might do. Or:
# No RewriteCond required; serve all content from other folder:
RewriteRule ^(.*)$ /path/to/y.domain.folder/$1
EDIT: The above would not change the HTTP_HOST header that was sent by the browser (maybe that can be done as well). This implies that it would only work if the subdomains are represented on the file system as separate directories. So, as the .htaccess would then be placed in the directory holding the website for x.domain.com, the RewriteCond wouldn't even be required. Also, the directory for this x.domain.com subdomain would in fact not need any HTML content then; in the end all content would be served from the directory of another subdomain.
EDIT: As the above does not seem to work either, and yields endless rewrite loops even when adding [NS], maybe simply adding [L] helps here:
RewriteRule ^(.*)$ /path/to/y.domain.folder/$1 [NS,L]
Or maybe one can set an environment variable to stop the loop:
RewriteCond %{ENV:MY_VAR} !=1
RewriteRule ^(.*)$ /path/to/y.domain.folder/$1 [E=MY_VAR:1]
But, for both [L] and [E]: I'm just guessing; I've never made mod_rewrite jump into the directory of another virtual host. I am not sure it can be done to start with.
Unfortunately, it's unclear how one would add a new subdomain. If one only needs to create a new directory with the name of the subdomain (without using some administrative tool), then the provider might be using system-wide rewriting as well. In fact, even without subdomains the provider might be doing some Mass Virtual Hosting as described in the URL Rewrite Guide.
I guess the best solution would be to change the value of HTTP_HOST on the fly, to solve issues with any system wide rewriting. Maybe the following is allowed to achieve that:
RewriteCond %{HTTP_HOST} ^x.domain.com$
RewriteRule ^(.*)$ /path/to/y.domain.folder/$1 [E=HTTP_HOST:y.domain.com]
Again, as the above would only be present in the .htaccess in the x.domain.folder, the RewriteCond is probably not needed at all.
Have you tried
Alias /dir/file.html /full/path/to/other/file.html
??
To my knowledge, and from testing with Firebug, a redirect via .htaccess is always announced to the client, and it's up to the client how to proceed. It is therefore not an alternative to some sort of SSI functionality. To prevent a "fake" address, modern browsers should always make the REAL address visible to the user; however, I think I have seen some misbehavior in programs like "feeddemon" where IE is embedded. If you, for whatever reason, really want to show content from one subdomain on another, you can try using JavaScript or (i)frames on the user side, or some include functionality on the server side (e.g. file_get_contents with PHP). However, I don't recommend this.

Mod-Rewrite Problems (Apache) with / slashes

I am betting on an obvious problem here I am not seeing.
Here's the important bits for those of you familiar with Mod-Rewrite
.htaccess file with mod-rewrite rules exists here:
http://www.thedomain.com/.htaccess
User goes to this URL:
http://www.thedomain.com/test/blog
Mod-Rewrite rules should actually tell the server to access this URL:
http://www.thedomain.com/index.php?page=blog
.htaccess:
Options FollowSymLinks
Options -MultiViews
RewriteEngine on
RewriteRule ^test/([^/.]+)$ /index.php?page=$1 [L]
This combination of code/request does not work. If you're wondering about the code snippet ^test not being ^/test instead, it is because apparently this is a problem on GoDaddy, the code fails with the / after the ^ - this seems like it may be related to my problem, which I'll explain further... If I change the .htaccess code line:
RewriteRule ^test/([^/.]+)$ /index.php?page=$1 [L]
to
RewriteRule ^test([^/.]+)$ /index.php?page=$1 [L]
(just removing the / here: ^test/([^/.]+) )
The code works when the requested URL is changed to accommodate it (removing the slash: http://www.thedomain.com/testblog), and the user sees the proper index.php?page=blog server response. It seems to me I cannot use any slashes within the darn match side of the RewriteRule. What gives?
Update: If at all relevant, this .htaccess file and the relevant files to the question all exist in a subdirectory off of the GoDaddy server that is hosting this, although the domain points to the subdirectory as the root. Not sure if this is relevant.
Update: This server (at the server root) is actually running wordpress with pretty URLs enabled and they work perfectly fine. I assume wordpress uses mod-rewrite to make crazy urls like thedomain.com/2008/11/15/the-article-title.html work...?
Thanks so much.
Is RewriteBase what you're looking for?
there is a nice test utility for windows here
http://www.helicontech.com/download-isapi_rewrite.htm
try changing your code to:
RewriteRule ^/test/([^/]+)$ /index.php?page=$1 [L]
or without slashes
RewriteRule ^test[^a-z]+([a-z]*)$ /index.php?page=$1 [L]
I was unable to find a solid method around this problem on GoDaddy; for whatever reason I could not have slashes within the URL that was attempting to be rewritten aside from the base (http://www.somedomain.com/testingthis would work but http://www.somedomain.com/testing/this died).
I ended up instead using the WordPress .htaccess to send all requests for non-existent files or directories back to my index.php. I then used the $_SERVER['REQUEST_URI'] var with pathinfo() to parse the URL and decide from the result what content to load. This works well, is fast, and is probably the same method WordPress uses.
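For reference, the rewrite block that WordPress typically writes into .htaccess looks roughly like the following. It is shown only as a sketch of the approach described above, assuming the site lives at the URL root and index.php is the front controller.

```apache
RewriteEngine On
RewriteBase /

# Requests for the front controller itself are left alone.
RewriteRule ^index\.php$ - [L]

# Everything that is not an existing file or directory goes to index.php,
# which can then inspect $_SERVER['REQUEST_URI'] to decide what to load.
RewriteCond %{REQUEST_FILENAME} !-f
RewriteCond %{REQUEST_FILENAME} !-d
RewriteRule . /index.php [L]
```

Since the pattern is a single dot, no slash ever has to be matched on the pattern side, which sidesteps the problem seen with ^test/ on this host.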
Thanks for the attempts!
If you're wondering about the code snippet ^test not being ^/test instead, it is because apparently this is a problem on GoDaddy, the code fails with the / after the ^ […]
That’s not odd but necessary:
Per-directory Rewrites
When using the rewrite engine in .htaccess files the per-directory prefix (which always is the same for a specific directory) is automatically removed for the pattern matching and automatically added after the substitution has been done.
For a .htaccess file in the document root (/.htaccess), that per-directory prefix is the URL path root (/). Thus patterns anchored with ^ must be written without that per-directory prefix /.
The substitution is handled in the same way: after a rule is applied, the per-directory prefix is added to the substitution. So try this rule:
RewriteRule ^test/([^/.]+)$ index.php?page=$1 [L]
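To make the difference between the two contexts concrete, here is a hedged side-by-side sketch. The two fragments belong in different files and are not meant to be used together; the first assumes access to the server or virtual host configuration, which shared hosting usually does not provide.

```apache
# Fragment 1: httpd.conf or <VirtualHost> (server context).
# The pattern is matched against the full URL path, leading slash included.
RewriteEngine On
RewriteRule ^/test/([^/.]+)$ /index.php?page=$1 [L]

# Fragment 2: /.htaccess in the document root (per-directory context).
# The prefix "/" is stripped before matching and re-added to a relative
# substitution afterwards.
RewriteEngine On
RewriteRule ^test/([^/.]+)$ index.php?page=$1 [L]
```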
OK, first off, I think that the GoDaddy Apache server simply has some of the options turned off. I think that if they don't have an AllowOverride FileInfo in their configuration, RewriteRule won't work so well, or at all.
Which means it's surprising that the URL http://www.thedomain.com/testblog works at all, and gets rewritten. So I guess I'm a little confused.
Here's an idea: Try creating a directory named test, and put the .htaccess file in there! It would look like this:
Options FollowSymLinks
RewriteEngine on
RewriteRule ^([^/]+)$ /index.php?page=$1 [L]
OK, another idea: Use RewriteCond. Maybe you can check the request URI directly, like this:
Options FollowSymLinks
RewriteEngine on
RewriteCond %{REQUEST_URI} ^/test/([^/]+)
RewriteRule . /index.php?page=%1 [L]
Last idea: maybe your browser sees the URL http://www.thedomain.com/test/blog and thinks it's a directory, and adds a slash? So the URL it sends is http://www.thedomain.com/test/blog/. In that case, the regex won't match unless you allow for a trailing slash:
RewriteRule ^test/([^/.]+)/?$ /index.php?page=$1 [L]
Whoops. Sorry for gushing. There are just so many things that can go wrong in an HTTP request that goes through rewriting, and as many ways to try and overcome the problems :-)