Apache reverse proxy: how to forward proxy server's HTTP_HOST

Our local development setup requires a box in the DMZ, and each developer has a line in its apache config for proxying. Looks something like:
ProxyPreserveHost on
ProxyPass /user1/ {user1's IP}
ProxyPassReverse /user1/ {user1's IP}
ProxyPass /user2/ {user2's IP}
ProxyPassReverse /user2/ {user2's IP}
#etc
Our public URLs become {DMZ server}/user1, {DMZ server}/user2, etc. The problem is that on the dev's boxes, the value of $_SERVER['HTTP_HOST'] is just {DMZ server}, without the user's subdirectory. The desired behavior is to have /user%/ as the real host name.
I've tried overriding the HOST var, and some rewrite rules, but nothing has worked.
Creating subdomains is not an option.
thank you for any help!

http://httpd.apache.org/docs/2.0/mod/mod_proxy.html#proxypreservehost seems to be the answer.

I'm going to take a stab and suggest this:
SetEnvIf Host (.*) custom_host=$1
RequestHeader set X-Custom-Host-Header "%{custom_host}e/%{REQUEST_URI}e/%{QUERY_STRING}e"
That should hopefully set a request header called X-Custom-Host-Header that you can then pick up in PHP. If you want, you can try to override the Host header, but I'm not sure of the implications of that. The Host header is a special HTTP header and generally contains only the host portion of an HTTP request, not the full request URL.
Untested, unfortunately, but it would help if you could clarify in a bit more detail what you are looking for.
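If the goal is only to let the PHP code learn which prefix it is published under, a fixed header per developer may be simpler than capturing variables. The following is a minimal, untested sketch, assuming mod_headers is loaded; the header name X-Forwarded-Prefix and the address 192.0.2.11 are placeholders that do not come from the question:

```apache
ProxyPreserveHost On

# One block per developer; 192.0.2.11 stands in for {user1's IP}.
<Location "/user1/">
    ProxyPass "http://192.0.2.11/"
    ProxyPassReverse "http://192.0.2.11/"
    # Tell the backend which public prefix it is served under (assumed header name).
    RequestHeader set X-Forwarded-Prefix "/user1"
</Location>
```

PHP would then see the value as $_SERVER['HTTP_X_FORWARDED_PREFIX'] and could prepend it when generating links, while $_SERVER['HTTP_HOST'] stays {DMZ server}.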

EDIT, THIRD ANSWER:
Looks like Apache has heard this complaint before and the solution is mod_substitute. You need to use it to rewrite all the URLs returned in the document to insert /user1/.
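A rough, untested sketch of what that could look like, assuming mod_substitute, mod_filter and mod_headers are loaded and that the pages use root-relative href and src attributes (the patterns are illustrative only and would need tuning for the real markup):

```apache
<Location "/user1/">
    # Ask the backend for uncompressed output so the filter can read it.
    RequestHeader unset Accept-Encoding
    AddOutputFilterByType SUBSTITUTE text/html
    # Prefix root-relative href/src attributes with the developer's path.
    Substitute "s|href=\"/|href=\"/user1/|i"
    Substitute "s|src=\"/|src=\"/user1/|i"
</Location>
```

Note that patterns this simple would also touch links that already start with /user1/ or with //, so they are a starting point rather than a finished rule set.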
EDIT, SECOND ANSWER:
Based on the additional information in your comments, I'd say your Apache config on your DMZ server is correct. What you are asking for is to have the developer machines generate URLs that include their context path (which is the J2EE term for something analogous to your /user1/ bit). I don't have any experience with PHP so I don't know if it has such a facility, but a quick search suggests it does not.
Otherwise, you'd have to roll your own function that converts a relative URL to an absolute URL, make that configurable so you can have it add something to the host name, and then force everyone to use that function exclusively for generating URLs. See, for some guidance, "Making your application location independent" in this old (outdated?) PHP best practices article for a solution to the related problem of finding local files.
PREVIOUS ANSWER: (doesn't work, causes redirect loop)
I'm still not clear what you are trying to do or what you mean by "Running on the dev apps are apache and PHP mainly, for hosting various applications", but as an educated guess, have you tried:
ProxyPass /user1/ {user1's IP}/user1/
ProxyPassReverse /user1/ {user1's IP}/user1/
If I were setting up the sort of environment you seem to be wanting to have, I'd want $_SERVER['HTTP_HOST'] to be {DMZ server} on every dev machine so that the dev machine's environment looks just like (or at least more like) production to the code running on it.

Related

Need to configure .htaccess, so multiple folders will act as if they are their own separate root folders - for the code running on them

For example:
mydomain.com/site1
mydomain.com/site2
I need to install an application on /site1 that will think that it is on the root folder. (In this case PHP, js, CodeIgniter, but could be anything)
So for example, links/references for files such as "/file.jpg" (in code that is in the site1 folder, such as at mydomain.com/site1/code.js) will really load from mydomain.com/site1/file.jpg
And also the code would not be able to see any folder below site1, so that is basically the root folder. And similar thing would be at site2, so the 2 are separate root folders.
I thought this would be some kind of simple .htaccess file installed at mydomain.com/site1 with a redirect, or some kind of a reverse proxy, but so far everything I tried did not work.
I can't seem to find any such example, even on Stack Overflow.
Any ideas?
The easiest way to do this would be to create an additional VirtualHost, for internal use, called internal1, whose DocumentRoot is, you guessed it, /var/www/mydomain.com/htdocs/site1, where the main site is in /var/www/mydomain.com/htdocs.
Then in mydomain.com you reverse proxy /site1 to internal1 (you'll have to put it into /etc/hosts as an alias for localhost). The second request will have its DOCUMENT_ROOT point to site1, as requested (and its ServerName changed to internal1):
ProxyPass /site1/ http://internal1/
ProxyPassReverse /site1/ http://internal1/
(Not sure about the trailing slashes)
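The internal virtual host described above might look roughly like this. It is a sketch under the assumption that name-based virtual hosts on port 80 are in use; only the name internal1 and the directory come from the answer:

```apache
# Internal-only virtual host; "internal1" must resolve to this machine,
# for example through an /etc/hosts entry pointing at 127.0.0.1.
<VirtualHost *:80>
    ServerName internal1
    DocumentRoot "/var/www/mydomain.com/htdocs/site1"
</VirtualHost>
```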
Now, accessing mydomain.com/site1/joe.html will trigger a second internal connection to internal1/joe.html, which will contain, say, 'src="/joe.jpg"'; and here's where ProxyPassReverse will come into play, rewriting this into 'src="mydomain.com/site1/joe.jpg"' so that everything will work.
errata corrige
The above is not correct, thanks @MrWhite for pointing this out. ProxyPassReverse is not enough, as it only rewrites headers. From the Apache documentation (emphasis mine):
Only the HTTP response headers specifically mentioned above will be
rewritten. Apache httpd will not rewrite other response headers, nor
will it by default rewrite URL references inside HTML pages. This
means that if the proxied content contains absolute URL references,
they will bypass the proxy. To rewrite HTML content to match the
proxy, you must load and enable mod_proxy_html.
(The method is dirty as all Hell: every HTTP call incurs one extra connection and two rewrites, one going in, a larger one going out).
Of course, if the link is built using e.g. Javascript, it might well be that the proxy code will not recognize it as a link, will leave it unchanged, maybe with the "internal1" name inside somewhere, and the app will break.
However, @arkascha has the right of it: you should cure the cause, not the symptom. You can maybe rewrite the environment of the apps so that they run without trouble even if they are in a subdirectory. Or you could try injecting <base href="https://example.com/site1/"> in the output HTML.
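For completeness, the mod_proxy_html route mentioned in the documentation quote could be sketched as follows. This is untested and assumes Apache 2.4 with mod_proxy_html, mod_xml2enc and mod_headers loaded; depending on the build, the link definitions shipped in the bundled proxy-html.conf may also need to be included:

```apache
ProxyPass "/site1/" "http://internal1/"
ProxyPassReverse "/site1/" "http://internal1/"

<Location "/site1/">
    # Parse proxied HTML and rewrite the URLs found in link attributes.
    ProxyHTMLEnable On
    ProxyHTMLURLMap "http://internal1/" "/site1/"
    ProxyHTMLURLMap "/" "/site1/"
    # Ask the backend for uncompressed HTML so the filter can parse it.
    RequestHeader unset Accept-Encoding
</Location>
```

As noted above, URLs assembled in JavaScript are unlikely to be caught by this kind of rewriting.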

Use Apache to load a page sitting on a different server with the same URL

We have a situation where ideally we would like a user to access a page on our site at a URL such as https://example.com/path/to/page. However, the HTML to render that page is sitting on an entirely different server (S3 to be exact) that we have control over, and we would like to render that page for that URL without redirecting (i.e. changing the URL itself).
I took a brief look at the Apache mod_proxy module, but it doesn't seem to do the job as we just get 500 or 404 errors. Here is an example entry from our .htaccess:
<IfModule mod_proxy.c>
RewriteRule "/path/to/page/(.*)$" "https://bucketname.s3-website-eu-west-1.amazonaws.com/path/to/page/$1" [P]
</IfModule>
Any help or a pointer in the right direction would be appreciated.
Most likely you are stumbling over the fact that you are using an absolute path inside a RewriteRule in a dynamic configuration file, where the pattern is matched against the path without its leading slash. Try this instead:
RewriteEngine on
RewriteRule "/?path/to/page/(.*)$" "https://bucketname.s3-website-eu-west-1.amazonaws.com/path/to/page/$1" [P]
That slightly modified rule will work in dynamic configuration files and in the real HTTP server's host configuration.
But as mentioned in the comment, I wonder why you should not be able to use the proxy module directly to simplify things. You'd have to do that in the HTTP server's host configuration though; this is not possible in dynamic configuration files:
ProxyRequests off
ProxyPass "/path/to/page/" "https://bucketname.s3-website-eu-west-1.amazonaws.com/path/to/page/"
ProxyPassReverse "/path/to/page/" "https://bucketname.s3-website-eu-west-1.amazonaws.com/path/to/page/"
And a general hint: you should always prefer to place such rules inside the HTTP server's host configuration instead of using dynamic configuration files (".htaccess"). Those files are notoriously error-prone, hard to debug, and they really slow down the server. They are only provided as a last option for situations where you do not have control over the host configuration (read: really cheap hosting service providers) or if you have an application that relies on writing its own rewrite rules (which is an obvious security nightmare).
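One detail the configurations above do not show: when the proxy target uses https://, mod_proxy needs mod_ssl's SSLProxyEngine switched on, and a missing setting is a common source of 500 errors for both ProxyPass and [P] rules. A hedged sketch, assuming mod_proxy, mod_proxy_http and mod_ssl are loaded; it is also worth verifying whether the S3 website endpoint accepts HTTPS at all, since those endpoints have typically been HTTP-only, in which case an http:// target would be needed instead:

```apache
# Needed whenever the ProxyPass or [P] target uses https://
SSLProxyEngine On
ProxyRequests Off
ProxyPass "/path/to/page/" "https://bucketname.s3-website-eu-west-1.amazonaws.com/path/to/page/"
ProxyPassReverse "/path/to/page/" "https://bucketname.s3-website-eu-west-1.amazonaws.com/path/to/page/"
```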

Forwarded Tomcat through Apache uses wrong Context path

Okay, let me explain my problem quickly. I have a JEE program running on my Tomcat server. The server has some users defined in tomcat-users.xml. When I test my program on my local machine, everything works fine.
However, if I deploy the .war on my server and try to access a REST endpoint, I get a 401 Unauthorized error. If I remove the user security check, the program works fine, so the URLs and server setup are correct.
I think the problem is somehow related to forwarding Tomcat through my Apache.
So let's assume I have an Apache running on http://myIp.de
and I forwarded Tomcat with the following Apache config:
ProxyRequests off
ProxyPass /tomcat http://localhost:8181/ nocanon
ProxyPassReverse /tomcat http://localhost:8181/
So now I can reach Tomcat through http://myIp.de/tomcat
and I can also "speak" to my app via tomcat/myApp
But somehow the authentication now fails, and I think the problem is related to a wrong context path, because tomcat/manager also fails to log in.
Make your life easier by deploying your app under /tomcat on tomcat too. This way there's no path-translation required. Keep in mind that you'll get all the session cookies tied to a specific path and this path is not necessarily translated once forwarded to the client.
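A sketch of what the same-path setup could look like on the Apache side, assuming the application is deployed on Tomcat with the context path /tomcat (port 8181 is taken from the question):

```apache
ProxyRequests Off
# Same path on both sides: nothing to translate in URLs or cookie paths.
ProxyPass "/tomcat/" "http://localhost:8181/tomcat/"
ProxyPassReverse "/tomcat/" "http://localhost:8181/tomcat/"

# If the paths must differ instead, cookie paths can be mapped explicitly:
# ProxyPassReverseCookiePath "/" "/tomcat/"
```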
Also, sooner or later you might need
ProxyPreserveHost On
(look it up) or utilize mod_jk to preserve this header (and more information) automatically.
Edit: Following your comment, Basic Auth headers seem not to be forwarded to Tomcat either. I haven't attempted this myself, but all the places that I've looked up seem to imply that there'd be some duplication (e.g. a second credentials file for Apache), and that doesn't look good. In this case I'd suggest trying out mod_jk rather than mod_proxy. You'll use the JkMount directive rather than ProxyPass and need a workers.properties, but mod_jk is a lot better at keeping the full context of the request when forwarding to Tomcat. I've had good experience with it so far and have heard only a few complaints about it, largely in situations that were pretty huge and complex/complicated anyway. At least you should try whether it solves your problems.
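A minimal mod_jk sketch, for orientation only. It assumes mod_jk is installed and loaded and that Tomcat's AJP connector is enabled in server.xml on port 8009 (newer Tomcat releases may also expect a shared secret); the worker name worker1 and the file path are placeholders:

```apache
# Assumes mod_jk is installed and loaded; file path and worker name are placeholders.
JkWorkersFile conf/workers.properties
# Forward everything under /myApp to the Tomcat worker over AJP.
JkMount /myApp/* worker1

# conf/workers.properties would then contain something like:
#   worker.list=worker1
#   worker.worker1.type=ajp13
#   worker.worker1.host=localhost
#   worker.worker1.port=8009
```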

Retain original request URL on mod_proxy redirect

I am running a web application on a servlet container (port 8080) in an environment that can be accessed from the internet (external) and from inside the company (internal), e.g.
http://external.foo.bar/MyApplication
http://internal.foo.bar/MyApplication
The incoming (external/internal) requests are forwarded to the servlet container using an Apache HTTP server with mod_proxy. The configuration looks like this:
ProxyPass /MyApplication http://localhost:8080/MyApplication retry=1 acquire=3000 timeout=600 Keepalive=On
ProxyPassReverse /MyApplication http://localhost:8080/MyApplication
I am now facing the problem that some MyApplication responses depend on the original request URL. Concretely: a WSDL document is served containing an element with a schemaLocation="<RequestUrl>?xsd=MyApplication.xsd" attribute.
With my current configuration it always looks like
<xs:import namespace="..." schemaLocation="http://localhost:8080/MyApplication?xsd=MyApplication.xsd"/>
but it should be
External Request: <xs:import namespace="..." schemaLocation="http://external.foo.bar/MyApplication?xsd=MyApplication.xsd"/>
Internal Request: <xs:import namespace="..." schemaLocation="http://internal.foo.bar/MyApplication?xsd=MyApplication.xsd"/>
I suppose this is a common requirement. But as I am no expert in configuration of the apache http server and its modules I would be glad if someone could give some (detailed) help.
Thanks in advance!
If you're running Apache >= 2.0.31, you might try setting the ProxyPreserveHost directive as described here.
This should pass the original Host header through mod_proxy into your application. Normally the request URL is rebuilt there (in your servlet container) using the Host header, so the schema location should be built using the host and path information from "before" the proxy.
(Posted here too for the sake of completeness)
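Applied to the configuration from the question, that suggestion would amount to one extra line. A sketch, otherwise unchanged from the original ProxyPass settings:

```apache
# Forward the client's Host header (external.foo.bar or internal.foo.bar) to the backend.
ProxyPreserveHost On
ProxyPass /MyApplication http://localhost:8080/MyApplication retry=1 acquire=3000 timeout=600 Keepalive=On
ProxyPassReverse /MyApplication http://localhost:8080/MyApplication
```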
Here is another alternative if you would like to retain both the original host name and the proxied host name.
If you are using mod_proxy, disable ProxyPreserveHost in the Apache configuration. For most proxy servers, including mod_proxy, you can read the X-Forwarded-Host header in your application. This identifies the original Host header provided by the HTTP request.
You can read about the headers that mod_proxy (and possibly other standard proxy servers) set here:
http://httpd.apache.org/docs/2.2/mod/mod_proxy.html
You should be able to do a mod_rewrite in apache to encode the full URL as a query parameter, or perhaps part of the fragment. How easy this might be depends on whether you might use one or the other as part of your incoming queries.
For example, http://external.foo.bar/MyApplication might get rewritten to http://external.foo.bar/MyApplication#rewritemagic=http://external.foo.bar/MyApplication which then gets passed into the ProxyPass and then stripped out.
A bit of a hack, yes, and perhaps a little tricky to get rewrite and proxy to work in the right order and not interfere with each other, but it seems like it should work.

Changing Cookie Domains

I use Apache as a proxy to my application web server and would like to change, on the fly, the domain name associated with a session ID cookie.
The cookie has a .company.com domain associated with it, and I would like to use Apache mod_rewrite (or some similar module) to transparently change the domain to app.company.com. Is this possible? And if so, how would one go about it?
You can only change the domain of a cookie on the client, or when it's being set on the server. Once a cookie has been set, the path and domain information for it only exists on the client. So existing cookies can't have their domain changed on the server, because that information isn't sent from the client to the server.
For example, if you have a cookie that looks like this on your local machine:
MYCOOKIE:123, domain:www.test.com, path:/
Your server will only receive:
MYCOOKIE:123
Why aren't the path and domain sent? Because the browser keeps that information on the client and doesn't bother sending it along, since it only sends this cookie to your server if the page is at www.test.com and at the path /.
Since it's your server, you should be able to change your code that creates new cookies. If you felt you needed to do it outside of your code for some reason, you could do so with something like the following, but you'd have to look exactly at how your cookie is being written in the header to match it exactly. The following is an untested guess at a workable solution for this, using Apache's mod_headers:
<IfModule mod_headers.c>
Header edit Set-Cookie "(.*)domain=\.company\.com;(.*)" "$1domain=app.company.com;$2"
</IfModule>
You can also use mod_headers to change the cookie received from the client, like so, if need be:
<IfModule mod_headers.c>
RequestHeader edit Cookie "OLD_COOKIE=([0-9a-zA-Z\-]*);" "NEW_COOKIE_NAME=$1;"
</IfModule>
This would only rename cookies you receive in the request.
ProxyPassReverseCookieDomain company.com app.company.com
or interchanging domains (as you are not clearly defining which is internal/external).
ref: https://httpd.apache.org/docs/2.4/en/mod/mod_proxy.html#ProxyPassReverseCookieDomain
I don’t know any module that provides such feature. So I guess you will need to write your own output filter using mod_ext_filter that does this for you.
But if you have control over the other server, it might suffice to just omit the cookie’s domain value so that the client will automatically choose the requested domain as the cookie’s domain.
I ended up just creating an intermediate page that via javascript changed the cookie domain to the proxy server (by omitting the domain value) and then re-directed user to the target page. That seemed to resolve the issue. Thanks for your answers.
If your web-app is capturing the Host: header and using that to determine the domain= portion of the cookie, you might also consider the Apache directive
ProxyPreserveHost On
which relays the Host: header from the client.
This only works if your app is designed to assume that its domain name is whatever the client suggests with the Host header. If your app is one of these, this will not only fix your cookies but also any absolute URLs that the application generates, which can save you the overhead of otherwise needing to enable mod_substitute.
ProxyPass "/" "http://company.com/"
ProxyPassReverse "/" "http://company.com/"
ProxyPassReverseCookieDomain "company.com" "app.company.com"
Note: If it comes second in ProxyPass it should be first in ProxyPassReverseCookieDomain... I spent half a day or more figuring this out :-/
Also see: https://httpd.apache.org/docs/2.4/mod/mod_proxy.html#proxypassreversecookiedomain