How to rewrite URLs in an HTML document in Apache

What I am trying to achieve is to map
http://www.mydomain.com/subject/
to
http://mylocalhost:8080/ (= tomcat).
Purely for forwarding I use in httpd:
ProxyPass /subject/ http://mylocalhost:8080/
ProxyPassReverse /subject/ http://mylocalhost:8080/
This works, except for the content of the HTML documents. In other words, all the links in the returned HTML still contain
http://mylocalhost:8080/...
My attempts with mod_rewrite haven't been very successful, so my question is: how do I rewrite the actual document contents?
The Tomcat app doesn't offer a way to change its base URL.

You may need to do an internal redirect, that is, one that is not visible to the client. Go here for a clean explanation.

For situations where you need to reverse-proxy an application that doesn't cooperate and cannot be updated, there is mod_proxy_html.
This will buffer any returned HTML documents and rewrite links inside them.
ProxyPass /subject/ http://mylocalhost:8080/
ProxyPassReverse /subject/ http://mylocalhost:8080/
ProxyHTMLEnable On
It has various other directives to control precisely what and how it changes.
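As a rough sketch of what those other directives might look like for the question's setup: ProxyHTMLEnable only switches the filter on, and the actual substitutions are normally declared with ProxyHTMLURLMap. The Location block and the Accept-Encoding line are assumptions added for illustration (the latter needs mod_headers), and on httpd 2.4 the ProxyHTMLLinks definitions from the bundled proxy-html.conf usually have to be included as well.

```apache
ProxyPass        /subject/ http://mylocalhost:8080/
ProxyPassReverse /subject/ http://mylocalhost:8080/

<Location /subject/>
    # Turn on the HTML rewriting filter for proxied responses
    ProxyHTMLEnable On
    # Rewrite absolute links that name the backend host
    ProxyHTMLURLMap http://mylocalhost:8080/ /subject/
    # Rewrite root-relative links such as href="/foo"
    ProxyHTMLURLMap / /subject/
    # Assumption: ask the backend for uncompressed HTML so the filter can parse it
    RequestHeader unset Accept-Encoding
</Location>
```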

Related

Tomcat mod_proxy AJP static resources directory

I am using httpd with tomcat using the following config:
ProxyPass / ajp://localhost:8009/MyProject
ProxyPassReverse / ajp://localhost:8009/MyProject
This works fine except my image links from tomcat do not work when the HTML renders:
<img src="/MyProject/img/image.jpg"/>
whereas I would expect:
<img src="/img/image.jpg"/>
Your image is stored in Tomcat at /img/image.jpg, which is a context-relative path. The absolute path inside Tomcat is /MyProject/img/image.jpg, even though it is /img/image.jpg when seen from outside through Apache. You proxy / -> /MyProject, so when the page adds the context name 'MyProject' to the path, the proxy prepends it a second time and the request fails, as you describe.
SOLUTION 1:
Use context-relative paths in Tomcat
img/image.jpg
In this case you have to be careful about the requested URI, e.g. a page at /MyProject/page1/action1/ needs the relative image path
../../img/image.jpg
SOLUTION 2:
Use document root paths with leading slash
/img/image.jpg
and define the base element with the document root in its 'href' attribute. Just be careful with links: relative ones resolve against this base too!
<head>
<base href="http://www.mydomain.com/">
</head>
see http://www.w3schools.com/tags/tag_base.asp
SOLUTION 3:
Map the project to the same URI in Apache as in Tomcat (personally I use this solution as well because it is very easy, and I use a common word as the project/context name, e.g. 'web', 'site', etc.).
ProxyPass /MyProject ajp://localhost:8009/MyProject
SOLUTION 4:
Use a content filter such as mod_proxy_html
http://httpd.apache.org/docs/current/mod/mod_proxy_html.html
NOTE: This solution is always a bit slow (it doesn't matter which filter you use)!
Beware of the PROXY CONFIGURATION!
This only affects redirects and the like, but your ProxyPassReverse is configured incorrectly!
ProxyPass / ajp://localhost:8009/MyProject
ProxyPassReverse /MyProject http://www.mydomain.com/
see the full explanation
http://www.humboldt.co.uk/the-mystery-of-proxypassreverse/#more-131
read configuration examples
http://www.apachetutor.org/admin/reverseproxies
You need to either:
Use mod_proxy_html to rewrite the links. This is slow, and an indication that you've done the wrong thing.
Issue a redirect from / to /MyProject, which you can do with a RewriteRule, or a <meta http-equiv="refresh" content="0; url=http://<host>/MyProject/"> in /index.html, and change the ProxyPass directives to
ProxyPass /MyProject ajp://localhost:8009/MyProject
so that proxying doesn't mess around with the URL paths. This is by far the better technique. You probably don't need the ProxyPassReverse directive at all, but if you do you should apply the same change.
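A minimal sketch of the redirect-plus-matching-paths approach, assuming the directives live in the server or virtual host configuration and that mod_rewrite is loaded. The 302 status is an illustrative choice, not something the answer specifies.

```apache
RewriteEngine On
# Redirect only the bare root to the context path; nothing else is rewritten
RewriteRule ^/$ /MyProject/ [R=302,L]

# Same path on both sides, so links generated by Tomcat stay valid
ProxyPass        /MyProject ajp://localhost:8009/MyProject
ProxyPassReverse /MyProject ajp://localhost:8009/MyProject
```

With this layout a link such as /MyProject/img/image.jpg is requested from Apache under the same path Tomcat uses, so no content rewriting is needed.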
I have not yet tested this for accuracy (and AJP tends to short circuit things like rewrites in Apache making extra testing and tweaking almost mandatory). So with that little AJP-disclaimer you might try something along the lines of:
ProxyPass /MyProject ajp://localhost:8009/MyProject
ProxyPassReverse /MyProject ajp://localhost:8009/MyProject
ProxyPass / ajp://localhost:8009/MyProject
ProxyPassReverse / ajp://localhost:8009/MyProject
Just to try catching those incorrect image paths on the inbound. If that fails try toying with a trailing slash.

Apache reverse proxy and load balancer - does not work as it should

I have 3 machines.
One (loadbalance.lan) is used as a load balancer, the other two (172.16.30.5 and 172.16.30.6) are Tomcat servers. The main Tomcat page is listening on port 8080.
I type loadbalance.lan/tomcat in the browser and I am able to see the content of one of the Tomcat servers (the default Tomcat page).
The problem is that the page isn't displayed correctly. There are no images, and when I click on any link it displays a 404 Not Found error.
Let's say I want to access one of the sub pages on the Tomcat website. Tomcat website address: 172.16.30.5:8080
Now I can choose, let's say, the "status" link, which takes me to 172.16.30.5:8080/manager/status (and works fine).
When I access the same page via the reverse proxy server (loadbalance.lan) and click that link, it takes me to loadbalance.lan/manager/status and I get a 404 error.
Of course, when I type loadbalance.lan/tomcat/manager/status in the browser it displays correctly.
The problem with the images is also weird. When I use the URL loadbalance.lan/tomcat I can't see the images (Tomcat logo).
When I use loadbalance.lan/tomcat/ (slash at the end) it's OK. At least the images are, because the links still point to the wrong place.
Here is my loadbalance.lan apache config:
<Proxy *>
    Order deny,allow
    Allow from all
</Proxy>
<VirtualHost *:80>
    ProxyRequests Off
    ProxyVia On
    ProxyPreserveHost On
    <Proxy balancer://cluster>
        Order Deny,Allow
        Allow from all
    </Proxy>
    <Proxy balancer://cluster>
        BalancerMember http://172.16.30.5:8080
        BalancerMember http://172.16.30.6:8080
    </Proxy>
    <Location /tomcat>
        ProxyPass balancer://cluster
        ProxyPassReverse balancer://cluster
    </Location>
</VirtualHost>
Could someone help me with this?
Obviously there is something wrong with that proxy but I have no idea how to fix that :(
From the ProxyPassReverse documentation (emphasis added):
This directive lets Apache adjust the URL in the Location, Content-Location and URI headers on HTTP redirect responses. This is essential when Apache is used as a reverse proxy (or gateway) to avoid by-passing the reverse proxy because of HTTP redirects on the backend servers which stay behind the reverse proxy.
Only the HTTP response headers specifically mentioned above will be rewritten. Apache will not rewrite other response headers, nor will it rewrite URL references inside HTML pages. This means that if the proxied content contains absolute URL references, they will by-pass the proxy. A third-party module that will look inside the HTML and rewrite URL references is Nick Kew's mod_proxy_html.
So it is not the proxy's job to rewrite the HTML content of the pages. If the proxied application does not know that the final URL should contain the /tomcat prefix, and the proxy does not alter the pages, you're stuck.
You usually do not notice this, because the 172.16.30.5:8080 part appears to be correctly rewritten to loadbalance.lan. But that rewrite is not done by the proxy; almost certainly the URLs are simply host-relative (<img src="/foo/bar.png">). Check the source of the page to see whether the domain name really appears in the URLs.
There are several ways of handling that:
- You could avoid altering relative URL paths in the proxy (so not using a tomcat/ prefix, but instead a dedicated virtual host with its own name, like tomcat.loadbalance.lan).
- You could also use a dedicated tool like mod_proxy_html to rewrite the content of the pages, but that's slow and complex.
- The third way is to manage the final full URL on the application side (here Tomcat) and detect the proxy chain elements in the X-Forwarded-For header to rebuild the right domain.
- Some applications provide tools for that, like the VirtualHostMonster in Zope.
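The first option might look roughly like the following. The host name tomcat.loadbalance.lan is hypothetical and would need a DNS or hosts entry; the balancer members are taken from the question. Because nothing is prepended to the path, host-relative links such as /manager/status keep working without any content rewriting.

```apache
<VirtualHost *:80>
    # Hypothetical dedicated name for the Tomcat cluster
    ServerName tomcat.loadbalance.lan
    ProxyRequests Off

    <Proxy balancer://cluster>
        BalancerMember http://172.16.30.5:8080
        BalancerMember http://172.16.30.6:8080
    </Proxy>

    # Map the whole site root, so backend paths and public paths are identical
    ProxyPass        / balancer://cluster/
    ProxyPassReverse / balancer://cluster/
</VirtualHost>
```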
For Tomcat the preferred tool is mod_proxy_ajp rather than plain mod_proxy. But for a load-balancing proxy I do not think you can use mod_proxy_ajp. It's been a long time since I did this, but as I recall mod_jk was the solution for that.
Read this full documentation on tomcat proxying for details. At least you should get some hints for the solution.

Apache mod_rewrite mod_proxy redirect

I am trying to redirect /foreman to https://someurl:4343
I am using:
SSLProxyEngine on
ProxyPass /foreman https://MyIP:4343/
ProxyPassReverse /foreman https://MyIP:4343/
Results so far are that:
I get the index page with no style and no images
none of the links work i.e. /foreman/hosts?somevariable=somevalue
I would like to get all requests to /foreman/* to go to https://MyIP:4343/* including variables, get requests, images, style sheet, etc
How should I proceed ?
This was a clueless noob moment.
I found out that my desired outcome can be resolved with mod_ruby
I used the below link for a guide to the resolution to my issue.
Foreman as Sub-URI
You need the trailing slash on /foreman
ProxyPass /foreman/ https://MyIP:4343/
ProxyPassReverse /foreman/ https://MyIP:4343/
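The reason the slash matters: ProxyPass appends whatever follows the matched prefix to the target URL, so with /foreman mapped to https://MyIP:4343/ a request for /foreman/hosts is forwarded as https://MyIP:4343//hosts. A sketch with matching slashes on both sides is below; the RedirectMatch line is an addition for illustration and assumes httpd 2.4, where the target may be a URL-path. This only addresses the path mapping, not links generated inside the application's pages.

```apache
SSLProxyEngine On

# Assumption: send the bare /foreman to the slash form so relative links resolve
RedirectMatch ^/foreman$ /foreman/

# Both arguments end in a slash, so /foreman/hosts?x=y maps to https://MyIP:4343/hosts?x=y
ProxyPass        /foreman/ https://MyIP:4343/
ProxyPassReverse /foreman/ https://MyIP:4343/
```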

Reverse Proxy in CakePHP?

I've got a CakePHP application, and the following directives in my httpd.conf
ProxyRequests off
ProxyPass /forum/ http://somesite.com/phpbb3
ProxyPass /gallery/ http://someothersite.com/gallery3
<Location /forum/>
ProxyPassReverse /
</Location>
<Location /gallery/>
ProxyPassReverse /
</Location>
Without CakePHP this works fine, but because CakePHP uses its own redirection logic from routes.php and other sources, it seems to override any proxy settings, so any call to "/community" on my server follows the default pathway of looking for a controller called CommunityController.
My issue here is that I want one server that serves multiple applications while keeping it seamless to the user, so that a complete phpBB application can, for instance, run within the "/forum" directory as if it were a controller in CakePHP.
Has anyone done this before, and can it be done? Why do mod_rewrite and/or the routes.php file override my mod_proxy directives?
Perhaps instead of using mod_proxy, you could use mod_rewrite to create a RewriteRule directive with the [P] (proxy) flag in conjunction with the [L] (last rule) flag.
'proxy|P' (force proxy):
This flag forces the substitution part to be internally sent as a proxy request and immediately (rewrite processing stops here) put through the proxy module. You must make sure that the substitution string is a valid URI (typically starting with http://hostname) which can be handled by the Apache proxy module. If not, you will get an error from the proxy module. Use this flag to achieve a more powerful implementation of the ProxyPass directive, to map remote content into the namespace of the local server.
Note: mod_proxy must be enabled in order to use this flag.
'last|L' (last rule):
Stop the rewriting process here and don't apply any more rewrite rules. This corresponds to the Perl last command or the break command in C. Use this flag to prevent the currently rewritten URL from being rewritten further by following rules. For example, use it to rewrite the root-path URL ('/') to a real one, e.g., '/e/www/'.
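Applied to the question's two applications, the suggestion might be sketched as follows. This assumes the rules are placed in the main server or virtual host configuration rather than in CakePHP's .htaccess, so that they are evaluated before the per-directory rules that route everything to CakePHP; the target URLs are the ones from the question.

```apache
RewriteEngine On

# Hand /forum/ and /gallery/ to the proxy before CakePHP's rules can claim them
RewriteRule ^/forum/(.*)$   http://somesite.com/phpbb3/$1        [P,L]
RewriteRule ^/gallery/(.*)$ http://someothersite.com/gallery3/$1 [P,L]

# Still needed so redirects issued by the backends point back at this server
ProxyPassReverse /forum/   http://somesite.com/phpbb3/
ProxyPassReverse /gallery/ http://someothersite.com/gallery3/
```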

Apache - Reverse Proxy and HTTP 302 status message

My team is trying to setup an Apache reverse proxy from a customer's site into one of our web applications.
http://www.example.com/app1/some-path maps to http://internal1.example.com/some-path
Inside our application we use struts and have redirect = true set on certain actions in order to provide certain functionality. The 302 status messages from these re-directs cause the user to break out of the proxy resulting in an error page for the end user.
HTTP/1.1 302 Found
Location: http://internal1.example.com/some-path/redirect
Is there any way to setup the reverse proxy in apache so that the redirects work correctly?
http://www.example.com/app1/some-path/redirect
There is an article titled Running a Reverse Proxy in Apache that seems to address your problem. It even uses the same example.com and /app1 that you have in your example. Go to the "Configuring the Proxy" section for examples on how to use ProxyPassReverse.
The AskApache article is quite helpful, but in practice I found a combination of Rewrite rules and ProxyPassReverse to be more flexible. So in your case I'd do something like this:
<VirtualHost example>
ServerName www.example.com
ProxyPassReverse /app1/some-path/ http://internal1.example.com/some-path/
RewriteEngine On
RewriteRule ^/app1/some-path/(.*)$ http://internal1.example.com/some-path/$1 [P]
...
</VirtualHost>
I like this better because it gives you finer-grained control over the paths you're proxying for the internal server. In our case we wanted to expose only part of a third-party application. Note that this doesn't address hard-coded links in HTML, which the AskApache article covers.
Also, note that you can have multiple ProxyPassReverse lines:
ProxyPassReverse / http://internal1.example.com/some-path
ProxyPassReverse / http://internal2.example.com/some-path
I mention this only because another third-party app we were proxying was sending out redirects that didn't include their internal host name, just a different port.
As a final note, keep in mind that Firebug is extremely useful when debugging the redirects.
Basically, ProxyPassReverse should take care of rewriting the Location header for you, as Kevin Hakanson pointed out.
One pitfall I have encountered is missing the trailing slash in the url argument. Make sure to use:
ProxyPassReverse / http://internal1.example.com/some-path/
(note the trailing slash!)
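For the mapping in the question (/app1/some-path to http://internal1.example.com/some-path), a minimal sketch with matching trailing slashes could look like this. The comments describe how the backend's redirect header would be translated under that mapping.

```apache
# Forward everything under /app1/ to the internal server
ProxyPass        /app1/ http://internal1.example.com/

# Translate redirect headers coming back from the internal server:
#   Location: http://internal1.example.com/some-path/redirect
# is sent to the client as
#   Location: http://www.example.com/app1/some-path/redirect
ProxyPassReverse /app1/ http://internal1.example.com/
```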
Try using the AJP connector instead of reverse proxy. Certainly not a trivial change, but I've found that a lot of the URL nightmares go away when using AJP instead of reverse proxy.