use Apache Alias instead of RewriteRule to serve HTML page - apache

A simple Alias in my Apache configuration is not working:
Alias /url/path/some-deleted-page.html /url/path-modified/new-avatar-of-some-deleted-page.html
It gives "page not found".
RewriteRule works as expected, but it sends a redirect status to the browser. I do not want the browser or user to be aware of the redirect, so I want to use Alias instead of RewriteRule. Can mod_alias be used to map an individual URL?
I also use ProxyPassMatch, which executes all HTML pages as PHP scripts. Adding a ProxyPass exclusion makes no difference:
ProxyPass /url/path/some-deleted-page.html !
Please help me map individual URLs (a bunch of them) with Alias instead of RewriteRule.

The purpose of mod_alias is to map requested URLs to a directory on the system running your httpd instance. It does not return anything to the browser (i.e. no redirection code, nothing). It is all done internally, so your client does not even know it is there.
Request: http://www.example.com/someurl/index.html
Configuration
[...]
DocumentRoot "/opt/apache/htdocs"
Alias "/someurl/" "/opt/other_path/someurl_files/"
[...]
In this scenario, users asking for any URL besides /someurl/ would receive files from /opt/apache/htdocs.
If a user asks for /someurl/, files from /opt/other_path/someurl_files/ will be used.
Still missing in this example is a <Directory> definition for securing the Alias directory.
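A minimal sketch of what that missing <Directory> block might look like, assuming Apache 2.4 access-control syntax and the same paths as in the example above. Whether you want to grant access to everyone is an assumption; tighten it as needed.

```apache
DocumentRoot "/opt/apache/htdocs"
Alias "/someurl/" "/opt/other_path/someurl_files/"

# Allow httpd to serve files from the aliased directory (2.4 syntax).
# Without a matching <Directory> grant, a default 2.4 configuration
# typically denies access to paths outside the DocumentRoot.
<Directory "/opt/other_path/someurl_files">
    Require all granted
</Directory>
```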
You should read: https://httpd.apache.org/docs/2.4/mod/mod_alias.html
Alias will cover the case where you need to point a certain URL to a particular directory on the file system.
If you need to modify the filename (i.e. the client asks for file A, and you send back page B), you should use RewriteRule. To hide the fact that you changed the filename, use the [P] flag.
This directive lets you use a regex while still using a proxy mechanism. Your client does not know what went on, because the address in the address bar does not change.
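As a rough sketch for the URLs in the question: a RewriteRule without the R flag rewrites internally and sends no redirect status. This assumes the rule lives in the server or virtual host configuration (not .htaccess). The [PT] flag here is a swapped-in alternative to [P]: it hands the rewritten URL back to the URL mapping stage so that directives such as ProxyPassMatch still see it.

```apache
RewriteEngine On

# Internal rewrite only: there is no R flag, so the browser never sees
# a 3xx response and its address bar keeps the old URL.
RewriteRule "^/url/path/some-deleted-page\.html$" "/url/path-modified/new-avatar-of-some-deleted-page.html" [PT]
```

For a bunch of such URLs, one rule per page is the simplest form; a RewriteMap could replace a long list, but that is beyond this sketch.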

Related

Need to configure .htaccess, so multiple folders will act as if they are their own separate root folders - for the code running on them

For example:
mydomain.com/site1
mydomain.com/site2
I need to install an application on /site1 that will think that it is on the root folder. (In this case PHP, js, CodeIgniter, but could be anything)
So for example, links/references for files such as "/file.jpg" (in code that is in the site1 folder, such as at mydomain.com/site1/code.js) will really load from mydomain.com/site1/file.jpg
And also the code would not be able to see any folder below site1, so that is basically the root folder. And similar thing would be at site2, so the 2 are separate root folders.
I thought this would be some kind of simple .htaccess file installed at mydomain.com/site1 with a redirect, or some kind of a reverse proxy, but so far everything I tried did not work.
I can't seem to find any such example, even on Stack Overflow.
Any ideas?
The easiest way to do this would be to create an additional VirtualHost, for internal use, called internal1, whose DocumentRoot is, you guessed it, /var/www/mydomain.com/htdocs/site1, where the main site is in /var/www/mydomain.com/htdocs.
Then in mydomain.com you reverse proxy /site1 to internal1 (you'll have to add internal1 to /etc/hosts as an alias for localhost). The second request will have its DOCUMENT_ROOT point to site1, as requested (and its ServerName changed to internal1):
ProxyPass /site1/ http://internal1/
ProxyPassReverse /site1/ http://internal1/
(Not sure about the trailing slashes)
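Put together, the two virtual hosts might look roughly like this. It is a sketch under assumptions: port 80, mod_proxy and mod_proxy_http loaded, and an /etc/hosts line such as "127.0.0.1 internal1" already in place.

```apache
# Main site: proxies /site1/ to the internal virtual host.
<VirtualHost *:80>
    ServerName mydomain.com
    DocumentRoot "/var/www/mydomain.com/htdocs"

    ProxyPass        "/site1/" "http://internal1/"
    ProxyPassReverse "/site1/" "http://internal1/"
</VirtualHost>

# Internal site: its document root is the site1 folder, so the
# application sees itself as living at "/".
<VirtualHost *:80>
    ServerName internal1
    DocumentRoot "/var/www/mydomain.com/htdocs/site1"
</VirtualHost>
```

Keeping the trailing slash on both the path and the target URL avoids mismatched prefixes when the request is forwarded.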
Now, accessing mydomain.com/site1/joe.html will trigger a second internal connection to internal1/joe.html, which will contain, say, 'src="/joe.jpg"'; and here's where ProxyPassReverse will come into play, rewriting this into 'src="mydomain.com/site1/joe.jpg"' so that everything will work.
errata corrige
The above is not correct; thanks to @MrWhite for pointing this out. ProxyPassReverse is not enough, as it only rewrites headers. From the Apache documentation (emphasis mine):
Only the HTTP response headers specifically mentioned above will be
rewritten. Apache httpd will not rewrite other response headers, nor
will it by default rewrite URL references inside HTML pages. This
means that if the proxied content contains absolute URL references,
they will bypass the proxy. To rewrite HTML content to match the
proxy, you must load and enable mod_proxy_html.
(The method is dirty as all Hell: every HTTP call incurs one extra connection and two rewrites, one going in, a larger one going out).
Of course, if the link is built using e.g. Javascript, it might well be that the proxy code will not recognize it as a link, will leave it unchanged, maybe with the "internal1" name inside somewhere, and the app will break.
However, @arkascha has the right of it: you should cure the cause, not the symptom. You may be able to change the environment of the apps so that they run without trouble even in a subdirectory. Or you could try injecting <base href="https://example.com/site1/"> into the output HTML.
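If you do go the mod_proxy_html route that the documentation mentions, a configuration along these lines might be a starting point. This is an untested sketch: it assumes mod_proxy_html and mod_headers are loaded, and that the application only emits root-relative links such as "/joe.jpg".

```apache
ProxyPass        "/site1/" "http://internal1/"
ProxyPassReverse "/site1/" "http://internal1/"

<Location "/site1/">
    # Rewrite links inside proxied HTML responses.
    ProxyHTMLEnable On
    # Map root-relative links ("/joe.jpg") to "/site1/joe.jpg".
    ProxyHTMLURLMap "/" "/site1/"
    # Ask the backend for uncompressed output so the HTML can be parsed.
    RequestHeader unset Accept-Encoding
</Location>
```

Links assembled in JavaScript at runtime would still escape this rewriting, as noted above.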

Using RedirectMatch with HTTP_HOST in the destination

I keep reading that, where possible, I should not be using mod_rewrite. As such, I am trying to do an HTTP to HTTPS redirect with RedirectMatch.
Question: How can I use RedirectMatch and use Apache server variables (such as %{HTTP_HOST}) in the URL parameter?
This code fails to return a response to the client (Chrome):
RedirectMatch ^(.*) https://%{HTTP_HOST}/$1
I recently asked a similar question to this, but it may have been too wordy and lacks direction for an answer: Redirecting http traffic to https in Apache without using mod_rewrite
If you're using 2.4.19 or later, the Redirect directive has a somewhat obscure feature: putting it inside a Location or LocationMatch will enable expression syntax.
So your example can be written as
<LocationMatch ^(?<PATH>.*)>
Redirect "https://%{HTTP_HOST}%{env:MATCH_PATH}"
</LocationMatch>
(Here, the ?<PATH> notation means that the match capture will be saved to an environment variable with the name MATCH_PATH. That's how we can use it later in the Redirect.)
It's even easier if you always redirect using the entire request path, because you can replace the capture group entirely with the REQUEST_URI variable:
<Location "/">
Redirect "https://%{HTTP_HOST}%{REQUEST_URI}"
</Location>
Now, is this easier to maintain/understand than just using mod_rewrite for this one case? Maybe not. But it's an option.
No, you can't use variables of that type with Redirect/RedirectMatch. If you need variables such as %{HTTP_HOST}, use mod_rewrite.
Note: I commend you for not trying to use mod_rewrite right away, because most people will go for mod_rewrite even for the simplest of redirections, which is clearly overkill and most times it is just looking to complicate things unnecessarily.
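For reference, the mod_rewrite form of this redirect is commonly written as below. It is a sketch that assumes the rules sit in the server or virtual host configuration that handles plain HTTP.

```apache
RewriteEngine On

# Only act on requests that did not arrive over TLS.
RewriteCond %{HTTPS} off
# Redirect to the same host and path over https.
RewriteRule ^ https://%{HTTP_HOST}%{REQUEST_URI} [R=301,L]
```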
Writing for users who might face the same issue in the future.
I'm not sure how you are adding vhost entries.
I guess these vhost entries are added automatically by a script.
Do you use a VirtualHost directive with ServerName?
<VirtualHost *:8080>
ServerName example.domain.com
</VirtualHost>
If so, you can use the same domain value to populate the RedirectMatch target.
If you are adding vhost entries manually, just write the domain explicitly instead of HTTP_HOST.
Or let me know if it's a different scenario.
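With the host name written out explicitly, no variables are needed and a plain Redirect is enough. A minimal sketch, assuming the HTTP virtual host listens on port 80 (the example above uses 8080) and that example.domain.com is the name to redirect:

```apache
<VirtualHost *:80>
    ServerName example.domain.com
    # Redirect maps by prefix: /some/page becomes
    # https://example.domain.com/some/page
    Redirect permanent "/" "https://example.domain.com/"
</VirtualHost>
```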

mod_rewrite behaviour when no rewriteBase

Just want to confirm something. From what I gather of how mod_rewrite works, Apache receives an URL and immediately mod_rewrite applies (non-<directory>) rules in httpd.conf, then per-directory mod-rewriting goes to work, then restarts the process with a new URL if any changes are made.
@JonLin's great answer to this question first says that when your per-directory rule specifies an absolute replacement (i.e. starting with a slash), it's assumed to be relative to the DocumentRoot, which I get. But of relative replacements (no slash) Jon then says:
it's based on the directory that the rule is in. So if
RewriteRule ^foo$ bar.php [L]
is in the "root" and you go to http://example.com/foo, you get served http://example.com/bar.php. But if that rule is in the "subdir1" directory, and you go to http://example.com/subdir1/foo, you get served http://example.com/subdir1/bar.php. etc. This sometimes works and sometimes doesn't, as the documentation says, it's supposed to be required for relative paths, but most of the time it seems to work. Except when you are redirecting (using the R flag, or implicitly because you have http://host in your rule's target). That means this rule:
RewriteRule ^foo$ bar.php [L,R]
if it's in the "subdir2" directory, and you go to http://example.com/subdir2/foo, mod_rewrite will mistake the relative path as a file-path instead of a URL-path and because of the R flag, you'll end up getting redirected to something like: http://example.com/var/www/localhost/htdocs/subdir2/bar.php.
As Jon explains in the last bit, when a redirect will occur and when there's no rewriteBase, a string intended as filepath gets appended to the site's base address to create a phony URL. But just to confirm, even in the former case Jon mentions, ie. not an actual redirect, the substituted string does get sent back to Apache's URL-reception code, restarting the whole process, correct? The diagram on this page of the spec seems to imply that until no rules make a change, the process keeps restarting. These non-redirect cases would seem to be the time when it WOULD make sense to tack the filepath right from the file system root to the htaccess directory onto the beginning of the substitution. But how does that get turned into a proper URL as expected by the URL-reception code - does http://localhost get prepended? I think that would make everything relative to the documentroot, not the actual file system root.
Thanks!
Been doing some more reading and think I've got this explained, for anyone who's interested.
Regarding my question about how a file system absolute path gets turned into a valid url for the internal redirect, I was thinking that the URI in an HTTP request contained "http://hostname", but this has been cut off ie. the URI is like /this/is/a/path. The host name is in a separate "Host" header field, and is no longer a vital piece of information by the time mod_rewrite is running, as Apache's initial Post Read Request phase has already noticed the GET request on the port and, if Name-Based Virtual Hosting is in use, interpreted things like the DocumentRoot from the Host header field, and finally called the URI Translation Phase where mod_rewrite executes. So any time mod_rewrite is running, there could be only one host name that got us here.
So to summarize, what I had called the "URL-reception" part of Apache always deals with /paths/like/this/without/hostname, not just after internal redirects. The spec does say that rewriteCond/rewriteRule match against such paths, but I figured the host name was there initially and got removed. So then all that's left is to ensure our rules are prepared for cases where they are running in an internal redirect spawned by an earlier runthrough of themselves, and not do something inadvertent when they see a file system absolute path caused by a replacement that didn't start with a slash. What a mouthful.
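For the redirect case Jon describes, setting RewriteBase explicitly sidesteps the filesystem-path problem. A small sketch, assuming the file is the .htaccess inside the subdir2 directory from his example:

```apache
# .htaccess in the "subdir2" directory
RewriteEngine On

# URL-path prefix to use for relative substitutions in this directory.
RewriteBase /subdir2/

# With RewriteBase set, the relative target becomes /subdir2/bar.php,
# so the redirect goes to http://example.com/subdir2/bar.php.
RewriteRule ^foo$ bar.php [L,R]
```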

How do I configure apache for a custom directory?

Trying to configure apache2 to load example.com/forum/ from a different document root, relative to the site root. Forums are installed somewhere else on the server.
Is there a directory alias command? I've found the alias configuration entry for apache, but had no luck.
Basically, I want example.com to have the same directory its always had, but example.com/forum/ to be hosted somewhere else, on the same server.
I tagged this question with mod_rewrite because I thought maybe it would be the key, here.
Cheers!
Alias is the right way, unless you have some subtlety that you didn't reveal in your question.
# httpd.conf (the filesystem path is only an example)
Alias /forum /usr/lib/bbs
The job of Alias is to take the abstract URL coming into your system and map it to a concrete filesystem path. Once it has done that, the request is no longer a URL but a path. If there is no Alias or similar directive handling that URL, it gets mapped to a concrete path via DocumentRoot.
If this isn't working, you have to debug it further. Are you getting errors when you access /forum? Look in the error log.
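One common cause worth ruling out is that httpd has no permission to serve the aliased directory. A hedged sketch, assuming Apache 2.4 syntax and the example path above; the DirectoryIndex value is a guess at what forum software would use:

```apache
Alias "/forum" "/usr/lib/bbs"

<Directory "/usr/lib/bbs">
    # Permit requests for files under the aliased path.
    Require all granted
    # Assumed entry point of the forum software.
    DirectoryIndex index.php
</Directory>
```

If access is the problem, the error log will usually show a "client denied by server configuration" entry for the request.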
It all depends on what you want. You can "hardlink" with the real path and it works (so you were right to think it could work with mod_rewrite).
Quick sample (that works on my production domains) to make an internal change (I add a subdirectory):
RewriteRule (.*) %{DOCUMENT_ROOT}/mysubfolder%{REQUEST_FILENAME} [QSA,L]
So you can easily do something like:
RewriteRule ^/forum/(.*) %{DOCUMENT_ROOT}/mysubfolder%{REQUEST_FILENAME} [QSA,L]
And my suggestion would be that if you plan to have more rewrite rules, keep everything homogeneous, i.e.: keep on using only rewrite rules, so use my suggestion above. This way you'll not get a bad mix of Alias, RewriteRules and so on. For nice and clean stuff: keep everything homogeneous.

mod_rewrite to absolute path in .htaccess - turning up 404

I want to map a number of directories in a URL:
www.example.com/manual
www.example.com/login
to directories outside the web root.
My web root is
/www/htdocs/customername/site
the manual I want to redirect to is in
/www/customer/some_other_dir/manual
In mod_alias, this would be equal to
Alias /manual /www/customer/some_other_dir/manual
but as I have access only to .htaccess, I can't use Alias, so I have to use mod_rewrite.
What I have got right now after this question is the following:
RewriteRule ^manual(/(.*))?$ /www/htdocs/customername/manual/$2 [L]
this works in the sense that requests are recognized and redirected properly, but I get a 404 that looks like this (note the absolute path):
The requested URL /www/htdocs/customername/manual/resourcename.htm
was not found on this server.
However, I have checked with PHP: echo file_exists(...) and that file definitely exists.
Why would this be? According to the mod_rewrite docs, this is possible, even in a .htaccess file. I understand that when using mod_rewrite in .htaccess there is an automatic prefix, but surely not for absolute paths?
It shouldn't be a rights problem either: It's not in the web root, but within the FTP tree to which only one user, the main FTP account, has access.
I can change the web root in the control panel anytime, but I want this to work the way I described.
This is shared hosting, so I have no access to the error logs.
I just checked, this is not a wrongful 301 redirection, just an internal rewrite.
In .htaccess, you cannot rewrite to files outside the wwwroot.
You need to have a symbolic link within the webroot that points to the location of the manual.
Then in your .htaccess you need the line:
Options +SymLinksIfOwnerMatch
or maybe a little more blindly
Options +FollowSymlinks
Then you can
RewriteRule ^manual(/(.*))?$ /www/htdocs/customername/site/manual/$2 [L]
where manual under site is a link to /www/customer/some_other_dir/manual
You create the symlink on the command line with:
ln -s /www/customer/some_other_dir/manual /www/htdocs/customername/site/manual
But I imagine you're on shared hosting without shell access, so look into creating symbolic links within cPanel, Webmin, or whatever your admin interface is. There are PHP/CGI scripts that do it as well. Of course, you're still limited to the permissions that the host has given you. If they don't allow you to follow symlinks as a policy, you cannot override that within your .htaccess.
AFAIK mod_rewrite works at the 'protocol' level (meaning on the wire HTTP). So I suspect you are getting HTTP 302 with your directory path in the location.
So I'm afraid you might be stuck unless your hosting lets you follow symbolic links, so that you can link to that location (assuming you have shell access, or this is possible using FTP or your control panel) under your current document root.
Edit: It actually mentions URL-file phase hook in the docs so now I suspect the directory directives aren't allowing enough permissions.
This tells you what you need to know.
The requested URL /www/htdocs/customername/manual/resourcename.htm
was not found on this server.
It interprets RewriteRule ^manual(/(.*))?$ /www/htdocs/customername/manual/$2 [L] to mean rewrite example.com/manual/ as if it were example.com/www/htdocs/customername/manual/.
Try
RewriteRule ^manual(/(.*))?$ /customername/manual/$2 [L]
instead.