Where is the proper place for mod_rewrite entries? - apache

For the love of God, I can't seem to get this mod_rewrite working properly. Instead of doing brute force trial-and-error, let me ask here.
I want mod_rewrite rules to apply to ALL domains.
I want mod_rewrite entries in httpd.conf
I want to get rid of this WWW virus (for SEO purposes):
http://www.example.com > http://example.com
I want to get rid of index.html (for SEO, google indexes it instead of just domain):
http://www.example.com/index.html > http://example.com
http://www.example.com/some/index.html > http://example.com/some/index.html
Domains are inside <virtualhost> entries. I couldn't figure out where to put what, or which one should take priority. As I mentioned, I would like to apply these 2 rules to ALL DOMAINS on the server.
The situation is exacerbated by ssl.conf. Will all these need to be entered into ssl.conf too? What will happen when there are 2 redirects like:
http://www.example.com/index.html > http://example.com/index.html > http://example.com
Thank you so much. This has quickly become all so confusing.
Maria

This solves it for me. As I suspected, it makes a whole lotta difference where a RewriteRule is applied. Many people, including me, seem to be unaware of this.
http://wiki.apache.org/httpd/RewriteContext
The Apache HTTPD Server deals with requests in discrete phases. While this is usually transparent to the user and administrator it does have an effect on the behaviour of mod_rewrite when rulesets are placed in different contexts. To oversimplify a little, when rules are placed in VirtualHost blocks (or in the main server context) they get evaluated before the server has yet mapped the requested URI to a filesystem path. Conversely, when rules are placed in .htaccess files, or in Directory blocks in the main server config, they are evaluated after this phase has occurred.
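As a rough sketch of what that means for the two rules in the question: in VirtualHost (or main server) context the pattern is matched against the full URL-path, leading slash included. The block below assumes the rules are repeated inside each <VirtualHost>, since rewrite rules in the main server context are not inherited by virtual hosts unless a vhost has its own RewriteEngine On together with RewriteOptions Inherit. The THE_REQUEST condition is my own addition, so that only a literal client request for /index.html is redirected and not the internal lookup of the directory index.

```apache
# Inside each <VirtualHost>. For the SSL vhost, use https:// in the target.
RewriteEngine On

# www.example.com/anything -> example.com/anything
RewriteCond %{HTTP_HOST} ^www\.(.+)$ [NC]
RewriteRule ^ http://%1%{REQUEST_URI} [R=301,L]

# /index.html -> /  (root only; /some/index.html is left alone)
RewriteCond %{THE_REQUEST} ^[A-Z]+\s/index\.html[\s?]
RewriteRule ^/index\.html$ / [R=301,L]
```

With these rules, http://www.example.com/index.html is answered with two redirects in a row (first the host, then the path), which works but costs an extra round trip. In a .htaccess file or a <Directory> block the leading slash is stripped before matching, so the second pattern would have to be ^index\.html$ there.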

Related

.htaccess masked forwarding for certain folder/directory within domain

Let's say I have two domains, www.customer1.com and www.customer2.com. I want to run all the pages of these sites separately... but items found within certain paths, I'd like to reference from one domain to another in a masked forwarded manner for SEO purposes and to avoid having to place files in two different FTP accounts.
The target folders are
/images
/pdfs
For example, if a call is made to
www.customer2.com/images/[any image] then I want the masked forwarder to kick in to serve a file that is located at www.customer1.com/images/[filename requested]. Same goes for anything found after /pdf/ in the same example.
However all other pages should remain referencing to internal files within.
I have limited understanding of .htaccess and frankly lost as to how to approach anything beyond a very simple 30
Not sure you understand the concept here, as the "forwarding" would equate to an external redirect, there is no "masking" anywhere. The closest thing there is is reverse proxying:
RewriteEngine On
RewriteRule ^(images|pdf)/(.*)$ http://www.customer1.com/$1/$2 [L,P]
You need mod_proxy to do this and those rules need to be in the htaccess file in your customer2.com's document root.
You can also do this in customer2.com's server/vhost config:
ProxyPass /images/ http://www.customer1.com/images/
ProxyPass /pdf/ http://www.customer1.com/pdf/
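For the vhost variant, a slightly fuller sketch might look like the following. It assumes mod_proxy and mod_proxy_http are loaded; the ServerName and DocumentRoot values are placeholders, and the ProxyPassReverse lines are my addition so that redirects issued by customer1.com get rewritten back to customer2.com URLs.

```apache
<VirtualHost *:80>
    ServerName www.customer2.com
    # Placeholder path: wherever customer2.com's own files live
    DocumentRoot /var/www/customer2

    # Reverse proxy only, never an open forward proxy
    ProxyRequests Off

    ProxyPass        /images/ http://www.customer1.com/images/
    ProxyPassReverse /images/ http://www.customer1.com/images/
    ProxyPass        /pdf/    http://www.customer1.com/pdf/
    ProxyPassReverse /pdf/    http://www.customer1.com/pdf/
</VirtualHost>
```

Everything outside /images/ and /pdf/ is still served from customer2.com's own document root.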

Access root domain from subdomain on hosted apache server

I had a domain:
mydomain.com
pointing to a hosted apache server 'premium' account that can host multiple domains.
I bought another domain: anotherDomain.com which I set up as an 'add on' domain with my web host. I can access the anotherDomain in several different ways:
mydomain.com/anotherDomain.com
anotherDomain.mydomain.com
and
anotherDomain.com
However, only when using the first method can I access 'generic' files on mydomain.com from anotherDomain.com (using relative addressing).
I was told there is a script I can write so anotherDomain.com can access 'root' files at mydomain.com, using relative addressing, but they cannot tell me how to do it. I've looked around the net, but although there are lots of similar sounding questions, I cannot find how to do it.
Just to restate the problem: I want to be able to access files in mydomain.com, just like I can when anotherDomain.com is accessed like: mydomain.com/anotherDomain.com, when it is accessed like: anotherDomain.mydomain.com or anotherDomain.com
Example:
If I access anotherDomain.com using the URL mydomain.com/anotherDomain.com then, in the index.html for anotherDomain.com I can have:
<img src='../imgs/generic.jpg'/>
Which accesses the 'generic' image in the imgs folder for mydomain.com. Unfortunately, when I access this page using the URLs: anotherDomain.mydomain.com or anotherDomain.com, this doesn't work.
First of all, I assume you don't have access to your server's config files and thus have to deal with the restricted possibilities of .htaccess. If you have access there would be much better ways to handle this.
I would suggest to circumvent the problem. Have a look at the other answers and the comments. You went for a highly complex setup. You might get that working, but it will be a pain in the ass in the long run.
I will elaborate a bit and explain a few things in my answer.
As I understand you have the following file structure on your server:
public_html/
|
+-- index.html              // the index of mydomain.com
|
+-- imgs/
|   |
|   +-- generic.jpg
|
+-- anotherDomain.com/
    |
    +-- index.html          // index of anotherDomain.com
Suppose you browse http://anotherDomain.com/.
When the browser tries to load generic.jpg it will create this URL: http://anotherDomain.com/../imgs/generic.jpg. This will, however, by almost every browser, be rewritten as http://anotherDomain.com/imgs/generic.jpg.
So you have to tell the server how to serve this file.
You can create a rule as @anubhava suggests: if http://anotherDomain.com/imgs/* is requested, redirect it to the imgs directory on the virtual host http://mydomain.com/. This way the content will appear to belong to mydomain.com.
I would suggest creating a symlink instead, if you have the possibility.
|
+-- anotherDomain.com/
    |
    +-- index.html          // index of anotherDomain.com
    |
    +-- imgs --> ../imgs/
This way you can access all the images easily. However, they will appear as content of anotherDomain.com. This can be seen as an advantage or, as already mentioned, as a disadvantage (search engines). Creating a symlink can sometimes also be done with PHP (the symlink function) if your provider does not support it through its interface.
After wasting time trying to do this with .htaccess files, I finally worked out a way to do it in plain old html. In the header, before all other links, put:
<base href="http://mydomain.com/anotherDomain.com" />
And that's it. Remove this for development on your local machine. ;-)
Remember to be a bit careful when using the base tag.
Edit:
I've found this causes other problems. For example, relative links go to http://mydomain.com/anotherDomain.com, rather than http://anotherDomain.com.
Adding .htaccess code to the http://mydomain.com root directory like:
RedirectMatch /anotherDomain.com(/)?$ http://anotherDomain.com
Solves this, but introduces other problems. Still looking for a good answer to this question... Anybody?
Try this code in your public_html/anotherDomain.com/.htaccess:
Options +FollowSymLinks -MultiViews
# Turn mod_rewrite on
RewriteEngine On
RewriteBase /
# for images only
RewriteCond %{HTTP_HOST} ^(www\.)?anotherDomain\.com$ [NC]
RewriteRule ^(imgs/.+)$ http://mydomain.com/anotherDomain.com/$1 [L,NC,R]
# or for all files
RewriteCond %{HTTP_HOST} ^(www\.)?anotherDomain\.com$ [NC]
RewriteRule ^(.*)$ http://mydomain.com/anotherDomain.com/$1 [L,NC,R]
Remove (or comment out) your <base... tag.
Accessing the same content via several different domains is maybe not a great idea. It can make your content look like duplicate content to search engines. If you keep this setup, you should at least declare a canonical URL for all your content, with a canonical URL meta tag or attribute.
Now, I usually like relative URLs, as always prefixing URLs with the domain makes them harder to reuse via proxies (but when using Ajax, for example, relative URLs are not enough; you'll always end up with some absolute URLs in some places). But here you use a relative URL with '../imgs/generic.jpg'. This URL says more about how you have laid out the files on your server than about anything meaningful. It could be meaningful on a single domain, but not for sharing things between domains. What if some day you need to move the new domain's files and directory root to another place (or another server)? To users (and bots and search engines) your domains are not related. Any proxy that you do not own would request the asset several times if it is asked for via subdomain.mydomain.com/foo.jpg and subdomain.com/foo.jpg, so sharing these files between your domains means nothing to the rest of the web; it is only for your own management, on your side. So feel free to make it simple to manage.
You'd better manage your URLs in a way where the URL means something to you, like '/common/imgs/generic.jpg' and '/site/imgs/custom.jpg'. Then on the domain VirtualHosts, server-side, you can work on the mapping from URLs to directories and files. This mapping can be set via Alias, AliasMatch, and of course mod_rewrite directives.
For example a simple
Alias /common /path/to/mydomain/assets
Would allow you to map all the mydomain.com assets on the VirtualHost containing this instruction. The day you decide to move things around on your web server, you will simply have to rework the Apache rewrites and aliases, and not all the URL management in your code.
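To make the '/common' versus '/site' idea a bit more concrete, here is a minimal sketch of two virtual hosts sharing one assets directory. All filesystem paths are hypothetical, and server access to them (a matching <Directory> block) is assumed to be configured elsewhere.

```apache
<VirtualHost *:80>
    ServerName mydomain.com
    DocumentRoot /path/to/mydomain/public
    # Shared assets, reachable as http://mydomain.com/common/...
    Alias /common /path/to/mydomain/assets
</VirtualHost>

<VirtualHost *:80>
    ServerName anotherDomain.com
    DocumentRoot /path/to/anotherDomain/public
    # Same directory, reachable as http://anotherDomain.com/common/...
    Alias /common /path/to/mydomain/assets
</VirtualHost>
```

Pages on either domain can then use /common/imgs/generic.jpg without knowing where the files sit on disk.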

Virtual Hosts (Apache) with mod_rewrite issues

I have been trying to fix this all day without success, so I hope someone might be able to help me. I have an app at http://localhost/, and it uses Pylons for the app I am hosting. In addition to that, I need to host a PHP/MySQL site, so I had to use Apache too.
My current setup is that I use haproxy with this config for the Apache backend:
backend apache
    mode http
    timeout connect 4000
    timeout server 30000
    timeout queue 60000
    balance roundrobin
    server app02-8002 localhost:8002 maxconn 1000
This is triggered by this:
acl image url_sub images
use_backend apache if image
So, when I open my IP/images, it will trigger that and open Apache then, with port 8002.
For Apache, I created virtual hosts, and this is the "image" one:
<VirtualHost *:8002>
    ServerAdmin my@email.com
    ServerName image
    ServerAlias image
    DocumentRoot /srv/www/image/public_html/
    ErrorLog /srv/www/image/logs/error.log
    CustomLog /srv/www/image/logs/access.log combined
</VirtualHost>
So, that all works nicely: when I type IP/images it opens /srv/www/image/public_html. But then the issues come. As I am using an image uploading script that involves a lot of rewriting, I had to enable that mod. This is the .htaccess located in the public_html/images folder (I somehow had to create this subfolder too, to "match" the URL with the actual location in public_html):
SetEnv PHP_VER 5_3
RewriteEngine On
# You must define your installation directory and uncomment the line :
RewriteBase /images/
RewriteRule ^([a-zA-Z]+)\.(jpg|gif|png|wbmp)$ controller/Resizer.php?m=original&a=$1&e=$2 [L]
RewriteRule ^(icon|small|medium|square)\/([a-zA-Z]+)\.(jpg|gif|png|wbmp)$ controller/Resizer.php?m=$1&a=$2&e=$3 [L]
RewriteCond %{REQUEST_FILENAME} !-f
RewriteRule (.*) application.php?request=$1 [L,QSA]
So, basically, this is somehow not working. I suppose there is a conflict between this virtual host, the subdirectory, the rewriting or something, but I can't seem to isolate it.
It is a bit confusing: when I open IP/images/xxxx.jpg it opens the image, which is located in the public_html/images/upload/original folder, so that rewrite is working. The other rules seem not to be working. None of the thumbnails and smaller versions (icon, small, medium, square) render properly, which makes the site quite unusable.
Here is the link of the development server: http://localhost/images/
Thanks in advance for your time and help!
The first thing you should do is determine whether mod_rewrite is in fact part of the problem by accessing one of the failing URLs directly via its rewritten form and verifying that you get the expected result.
Indeed, the problem might simply be that the PHP script for the smaller resolutions "doesn't work" while it does for the original size ones. The first of the following URLs nicely served me an image; the second one is supposed to give me a smaller version of the same image, but served me an HTTP 500:
http://106.186.21.176/images/controller/Resizer.php?m=original&a=q&e=png
http://106.186.21.176/images/controller/Resizer.php?m=small&a=q&e=png
I got the same result (HTTP 500) for any of the smaller-size format names mentioned in your post, which matches your problem description.
Once you've verified that the script works as expected, it's likely that the problem is with mod_rewrite. If so, enable rewrite logging: use the RewriteLog directive to activate it, and RewriteLogLevel to control its verbosity. Especially at the higher log levels, it can give you very detailed information about exactly what it's doing. This should make the problem readily apparent from the logs.
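As a sketch of what enabling that logging could look like: RewriteLog and RewriteLogLevel exist in Apache 2.2 and only work in server or virtual host context, not in .htaccess. In Apache 2.4 both directives were removed, and rewrite tracing is switched on through the per-module LogLevel instead. The log file path below is an assumption modelled on the logs directory of the vhost in the question.

```apache
# Apache 2.2: dedicated rewrite log, placed inside the <VirtualHost *:8002> block
RewriteLog /srv/www/image/logs/rewrite.log
RewriteLogLevel 3

# Apache 2.4: use this line instead of the two above.
# The trace output goes to the ErrorLog.
# LogLevel warn rewrite:trace3
```

Remember to turn the logging back down afterwards; the higher levels are verbose and slow the server noticeably.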
Also, if possible, try to avoid configuring mod_rewrite rules in .htaccess files -- move them into your main server config file instead. The reason is explained on Apache mod_rewrite Technical Details, section "API phases":
Unbelievably mod_rewrite provides URL manipulations in per-directory context, i.e., within .htaccess files, although these are reached a very long time after the URLs have been translated to filenames. It has to be this way because .htaccess files live in the filesystem, so processing has already reached this stage. In other words: According to the API phases at this time it is too late for any URL manipulations. To overcome this chicken and egg problem mod_rewrite uses a trick: When you manipulate a URL/filename in per-directory context mod_rewrite first rewrites the filename back to its corresponding URL (which is usually impossible, but see the RewriteBase directive below for the trick to achieve this) and then initiates a new internal sub-request with the new URL. This restarts processing of the API phases.
Again mod_rewrite tries hard to make this complicated step totally transparent to the user, but you should remember here: While URL manipulations in per-server context are really fast and efficient, per-directory rewrites are slow and inefficient due to this chicken and egg problem. But on the other hand this is the only way mod_rewrite can provide (locally restricted) URL manipulations to the average user.
In general, not using .htaccess at all has the added advantage that you can tell Apache not to bother with it and disable the functionality altogether, which saves Apache from having to scan each directory level it serves from for .htaccess files.
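One way to do this without rewriting the rules is to move them into a <Directory> block in the vhost and switch off .htaccess processing. This is only a sketch reusing the paths and rules from the question; rules in a <Directory> block are still evaluated in per-directory context, so the patterns stay exactly as they were in the .htaccess file.

```apache
<VirtualHost *:8002>
    ServerName image
    DocumentRoot /srv/www/image/public_html/

    <Directory /srv/www/image/public_html/images>
        # Do not look for .htaccess files here at all
        AllowOverride None

        SetEnv PHP_VER 5_3
        RewriteEngine On
        RewriteBase /images/
        RewriteRule ^([a-zA-Z]+)\.(jpg|gif|png|wbmp)$ controller/Resizer.php?m=original&a=$1&e=$2 [L]
        RewriteRule ^(icon|small|medium|square)\/([a-zA-Z]+)\.(jpg|gif|png|wbmp)$ controller/Resizer.php?m=$1&a=$2&e=$3 [L]
        RewriteCond %{REQUEST_FILENAME} !-f
        RewriteRule (.*) application.php?request=$1 [L,QSA]
    </Directory>
</VirtualHost>
```

This removes the per-request .htaccess scan, but not the per-directory rewriting overhead described in the quote; for that the rules would have to be rewritten for VirtualHost context, with full URL-paths in the patterns.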

How do I configure apache for a custom directory?

Trying to configure apache2 to load example.com/forum/ from a different document root, relative to the site root. Forums are installed somewhere else on the server.
Is there a directory alias command? I've found the alias configuration entry for apache, but had no luck.
Basically, I want example.com to have the same directory it's always had, but example.com/forum/ to be hosted somewhere else, on the same server.
I tagged this question with mod_rewrite because I thought maybe it would be the key, here.
Cheers!
Alias is the right way, unless you have some subtlety that you didn't reveal in your question.
# httpd.conf
# target path: /usr/lib/bbs, or wherever the forum lives
Alias /forum /usr/lib/bbs
The job of Alias is to take the abstract URL coming into your system and map it to a concrete filesystem path. Once it has done that, the request is no longer a URL but a path. If there is no Alias or similar directive handling that URL, then it will get mapped to a concrete path via DocumentRoot.
If this isn't working, you have to debug it further. Are you getting errors when you access /forum? Look in the error log.
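One common cause of a 403 in the error log is that the aliased directory lies outside the DocumentRoot, where the server has not been granted access. A minimal sketch, assuming Apache 2.4 access-control syntax and the example path from above:

```apache
Alias /forum /usr/lib/bbs

<Directory /usr/lib/bbs>
    # Apache 2.4 syntax. On 2.2 the equivalent is:
    #   Order allow,deny
    #   Allow from all
    Require all granted
</Directory>
```

If the error log shows something else (a file-not-found path, for instance), it will at least tell you which filesystem path the URL was actually mapped to.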
It all depends on what you want. You can "hardlink" with the real path and it works (so you were right to think it could work with mod_rewrite).
Quick sample (that works on my production domains) to make an internal change (I add a subdirectory):
RewriteRule (.*) %{DOCUMENT_ROOT}/mysubfolder%{REQUEST_FILENAME} [QSA,L]
So you can easily do something like:
RewriteRule ^/forum/(.*) %{DOCUMENT_ROOT}/mysubfolder%{REQUEST_FILENAME} [QSA,L]
And my suggestion would be that if you plan to have more rewrite rules, keep everything homogeneous, i.e.: keep on using only rewrite rules, so use my suggestion above. This way you'll not get a bad mix of Alias, RewriteRules and so on. For nice and clean stuff: keep everything homogeneous.

Apache mod_rewrite not doing anything (?)

I'm having some trouble with Apache's mod_rewrite. One of the things I'm trying to get it to do is hide some of my implementation details, so that, for example, the user sees the URL http://www.mysite.com/login but Apache responds with the page at http://www.mysite.com/doc_root/login.php instead (preferably without showing the user that it's a PHP file or the directory structure). Here's what I have in my .htaccess file:
RewriteEngine on
RewriteCond %{HTTP_HOST} ^(www.)?mysite.com*
RewriteRule ^/(\w+) /doc_root/$1.php [L]
#Redirect http://www.mysite.com to the login page
RewriteRule ^/?$ https://www.mysite.com/doc_root/login.php
But when I go to http://www.mysite.com/login, I get a 404 error even though the page exists. I clearly don't have a great understanding of how the mod_rewrite conditionals and rules work, so can anyone please tell me what I'm doing wrong? Thanks.
Take doc_root out of all the stuff you have it in. That will give you the result you're asking for. However I'm not sure if it's desired or not. How are you going to force someone to login if they manually type http://www.mysite.com/index.php?
Also, if you're trying to force all traffic to SSL, it's better to use a second VirtualHost and Redirect instead of mod_rewrite. Those are all questions probably better suited for ServerFault.
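A minimal sketch of that second-VirtualHost approach, assuming the SSL site itself is defined in a separate *:443 vhost that is not shown here:

```apache
# Plain-HTTP vhost whose only job is to send everything to HTTPS
<VirtualHost *:80>
    ServerName www.mysite.com
    ServerAlias mysite.com
    Redirect permanent / https://www.mysite.com/
</VirtualHost>
```

Redirect works on path prefixes, so a request for http://www.mysite.com/login ends up at https://www.mysite.com/login.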
Unless your site has a bunch of different domain names, and you only want mysite.com to do the rewriting, you don't need the RewriteCond. (Potential problem. Apache likes to dick around with the domain name unless you set UseCanonicalName off. If the name isn't what it's expecting, the rewrite won't happen.)
In RewriteCond (and RewriteRule) patterns, . matches any character. Add a backslash before them. (Minor bug. Shouldn't cause rewrites to fail, but they would match stuff like "mysite-com" as well.)
mod_rewrite is actually a URL-to-filename filter. Though it is often used to rewrite URLs to other URLs, sometimes it will misbehave if what you're rewriting to is a URL and it can't tell. (Especially if what it's rewriting to would be an alias, or would otherwise not translate directly to a real filename.) If you add a [PT] flag onto your rule, though, it will consider the rewritten thing a URL and pass it along to the other filters (including the ones that turn URLs into filenames).
Do you really need "/doc_root"? The document root should already be set up in Apache using the DocumentRoot directive, and shouldn't need to be part of the URL unless you have multiple apps on the same domain (in which case it's the app root; the document root doesn't change).
UPDATE:
Another thing I just thought about: Rewrite rules work differently in .htaccess files. Apache likes to strip off the leading slash. So you will probably want to get rid of the first slash in your patterns, or at least make it optional (^/?login instead of ^/login).
^/?(\w+) will match /doc_root/login.php, and cause a rewrite to /doc_root/doc_root.php. You should probably have a $ at the end of your pattern.
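Putting these points together, the rules might end up looking roughly like the following. This is a sketch that assumes the .htaccess file sits in the site's document root and that doc_root really is a subdirectory of it. The RewriteCond on the host is dropped, the leading slashes are gone, the pattern is anchored with $, and the root URL redirects to /login instead of exposing the .php path (the !-d check is my addition, so a request for an existing directory is not rewritten).

```apache
RewriteEngine On

# http://www.mysite.com/ -> login page over HTTPS
RewriteRule ^$ https://www.mysite.com/login [R=302,L]

# /login -> /doc_root/login.php, internally; the browser keeps showing /login
RewriteCond %{REQUEST_FILENAME} !-d
RewriteRule ^(\w+)$ /doc_root/$1.php [L]
```

Because the rewritten path doc_root/login.php contains a slash and a dot, it no longer matches ^(\w+)$ when the rules are re-run, so there is no second rewrite to /doc_root/doc_root.php.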