Apache Reverse Proxy ReWrite - apache

I have an Apache instance set up to reverse proxy an internal application. I have this working using mod_proxy, but the end result is missing images and other content because of hard-coded paths in the application itself. I think I have two options.
Mod_Rewrite
Mod_HTML
The basic problem is this.
External site: http://external.customer.com (Port 80)
Internal site: http://internal.supplier.com:8080/testcustomer
I need to get apache to proxy the connection, but it must use the full URL when talking to the internal server internal.supplier.com:8080/testcustomer and paths must be rewritten so that images etc will render on the end client.
Can anyone give me some guidance here? help would be much appreciated.
Thanks

That may be because you have used absolute paths such as src=/app/favicon.jpg and src=/app/icons/smiley.jpg instead of relative paths such as src="favicon.jpg".
This problem can be solved by adding the mod_proxy_html module, which parses the HTML and rewrites the links in it.
Load the module (proxy_html_module) with LoadModule in your httpd.conf and then add one of the following directives:
ProxyHTMLEnable On
OR
SetOutputFilter proxy-html
mod_proxy_html requires libxml2 and libxml2-devel as prerequisites. You can install them through yum.
If you could share your configuration file, then maybe we can help more.
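For the hosts named in the question, the combination of mod_proxy and mod_proxy_html might look roughly like the sketch below. It assumes a mod_proxy_html version that supports ProxyHTMLEnable (3.1 or later, or the one bundled with Apache 2.4), that mod_headers is loaded, and that the application's hard-coded links start with /testcustomer/. None of that is confirmed by the question, so treat the URL maps as a starting point rather than a known-good configuration.

```apache
<VirtualHost *:80>
    ServerName external.customer.com

    # Act as a reverse proxy only, never as an open forward proxy
    ProxyRequests Off

    # Map the external root onto the full internal URL
    ProxyPass        / http://internal.supplier.com:8080/testcustomer/
    ProxyPassReverse / http://internal.supplier.com:8080/testcustomer/

    # Rewrite links inside the returned HTML (assumed link prefixes)
    ProxyHTMLEnable On
    ProxyHTMLURLMap http://internal.supplier.com:8080/testcustomer/ /
    ProxyHTMLURLMap /testcustomer/ /

    # Ask the backend for uncompressed responses so the HTML can be parsed
    RequestHeader unset Accept-Encoding
</VirtualHost>
```

On Apache 2.4 the ProxyHTMLLinks definitions (usually shipped in a proxy-html.conf file) also need to be included, otherwise the filter does not know which attributes contain URLs.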

Use Apache to load a page sitting on a different server with the same URL

We have a situation where ideally we would like a user to access a page on our site at a URL such as https://example.com/path/to/page. However, the HTML to render that page is sitting on an entirely different server (S3 to be exact) that we have control over, and we would like to render that page for that URL without redirecting (i.e. changing the URL itself).
I took a brief look at the Apache mod_proxy module, but it doesn't seem to do the job as we just get 500 or 404 errors. Here is an example entry from our .htaccess:
<IfModule mod_proxy.c>
RewriteRule "/path/to/page/(.*)$" "https://bucketname.s3-website-eu-west-1.amazonaws.com/path/to/page/$1" [P]
</IfModule>
Any help or a pointer in the right direction would be appreciated.
Most likely you are stumbling over the fact that you are using an absolute path in a RewriteRule inside a dynamic configuration file. Try this instead:
RewriteEngine on
RewriteRule "/?path/to/page/(.*)$" "https://bucketname.s3-website-eu-west-1.amazonaws.com/path/to/page/$1" [P]
That slightly modified rule will work both in dynamic configuration files and in the real http server's host configuration.
But as mentioned in the comment, I wonder why you should not be able to use the proxy module directly to simplify things. You'd have to do that in the http server's host configuration though; this is not possible in dynamic configuration files:
ProxyRequests off
ProxyPass "/path/to/page/" "https://bucketname.s3-website-eu-west-1.amazonaws.com/path/to/page/"
ProxyPassReverse "/path/to/page/" "https://bucketname.s3-website-eu-west-1.amazonaws.com/path/to/page/"
And a general hint: you should always prefer to place such rules inside the http server's host configuration instead of using dynamic configuration files (".htaccess"). Those files are notoriously error prone, hard to debug, and they really slow down the server. They are only provided as a last option for situations where you do not have control over the host configuration (read: really cheap hosting service providers) or if you have an application that relies on writing its own rewrite rules (which is an obvious security nightmare).
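One detail the directives above leave implicit: when the backend URL uses https, mod_proxy needs mod_ssl loaded and SSLProxyEngine switched on, otherwise the proxied request typically ends in a 500. A minimal sketch of the host configuration, assuming mod_proxy, mod_proxy_http and mod_ssl are all loaded and that the bucket endpoint from the question really answers over https (worth checking, since that is not confirmed here):

```apache
# Inside the <VirtualHost> that serves example.com

# Allow mod_proxy to open TLS connections to the backend
SSLProxyEngine On

ProxyRequests Off
ProxyPass        "/path/to/page/" "https://bucketname.s3-website-eu-west-1.amazonaws.com/path/to/page/"
ProxyPassReverse "/path/to/page/" "https://bucketname.s3-website-eu-west-1.amazonaws.com/path/to/page/"
```

If the endpoint only speaks plain http, the same three proxy directives with an http:// backend URL and no SSLProxyEngine line would be the thing to try.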

Neo4j not functional through apache proxy

I am able to run neo4j fine through port 7474 on my server including cypher queries. Though when I access neo4j through the apache proxy it will load just fine but any requests done through cypher will only return an "Unknown error". I have other proxies such as rstudio running just fine.
I have tried the default proxy configuration values from the neo4j website with no success. I am at a loss for what to try. Please let me know what further information is needed, or how I can get additional information on the cypher error.
I tried the sample Query:
CREATE (n {name:"World"}) RETURN "hello", n.name
And this returns "Unknown error" when done through the proxy, but when done through port 7474 it works fine
This is a Linux Ubuntu LTS 12.04.4 machine.
Neo4j 2.1.1
Apache 2.2.22
Sorry if this is vague but I have not found any help for this issue nor do I know what additional information would be relevant.
Thank you.
Update:
It now works with the configuration provided by Stefan (thank you!). But I am unsure how to move it from the root of my domain to "/database/". In your example you say it can be changed to "/neo4j". How would I change the other parts of this config file for this to work?
As it looks now (non functional with change of proxy from "/"):
ProxyPass /database/ http://localhost:7474/
ProxyPassReverse /database/ http://localhost:7474/
RedirectMatch permanent ^/database /database/
<Location /db/manage>
AddOutputFilterByType SUBSTITUTE application/json
Substitute "s|http://localhost:7474|http://localhost:8080|n"
</Location>
I tried to change the substitute rule from "localhost:8080" to "localhost:8080/database" and to "/database" to no avail.
In closing what worked is to make it a subdomain and still have it on the root. Not sure why this has to be the case, but it is functional. Thank you again Stefan!
Some time ago I set up an example config for using mod_proxy and mod_substitute, see https://github.com/sarmbruster/vagrant_neo4j_modproxy. See especially the Apache config file.
Be aware that mod_substitute will not work with huge responses > 1M.
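On the size limit: newer Apache releases (2.4.11 and later) have a SubstituteMaxLineLength directive that raises the line-length limit mod_substitute works with. It is not available on the Apache 2.2.22 from the question, so the sketch below only applies after an upgrade, and the 10m value is an arbitrary illustration.

```apache
<Location /db/manage>
    AddOutputFilterByType SUBSTITUTE application/json
    # Raise the default 1m line limit (Apache 2.4.11+ only; value is an example)
    SubstituteMaxLineLength 10m
    Substitute "s|http://localhost:7474|http://localhost:8080|n"
</Location>
```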

how do I know that I have apache static file configuration correct with mod_wsgi

I have apache 2.2 with mod_wsgi handling /
WSGIScriptAlias / "...wsgihandler.py"
I have followed instructions to setup static file handling with AliasMatches and a matching directory configuration.
The website is working fine.
How can I determine that static content is being served by Apache and not via wsgihandler.py? The Apache access log file doesn't help me, even when I set it to debug.
I've tried to intercept and read traffic between Firefox and the server, but that didn't enlighten me either.
Work out what the URLs of the static files are and then comment out the WSGIScriptAlias. The URLs should still work.
Note that in general you would not use AliasMatch but just Alias. You might want to provide the appropriate parts of the Apache configuration so it can be reviewed to see whether you are doing it in the best way.
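For reference, a plain Alias setup next to WSGIScriptAlias usually looks something like the sketch below. The /static/ URL prefix and the /srv/app/static directory are hypothetical names picked for illustration, the WSGIScriptAlias line is copied from the question with its path left elided, and the access-control lines use Apache 2.2 syntax to match the version mentioned.

```apache
# More specific Alias first, so Apache serves these files itself
Alias /static/ /srv/app/static/

<Directory /srv/app/static>
    # Apache 2.2 access control syntax
    Order allow,deny
    Allow from all
</Directory>

# Everything else goes to the WSGI application
WSGIScriptAlias / "...wsgihandler.py"
```

With this in place, commenting out the WSGIScriptAlias line and requesting a URL under /static/ should still return the file if Apache is the one serving it.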
This is my attempt:
load the headers_module
and, in the Directory block for the static files, add
Header set MyHeader "Static content served"
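Spelled out, the marker-header idea could look like this. The module path and the /srv/app/static directory are assumptions for illustration; use whatever directory the static Alias actually points at.

```apache
# Path to the module file varies by distribution
LoadModule headers_module modules/mod_headers.so

<Directory /srv/app/static>
    # Only responses served from this directory by Apache get the marker
    Header set MyHeader "Static content served"
</Directory>
```

A response that carries MyHeader was served from that directory by Apache; a response without it went through some other handler, such as the WSGI application.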

How can I rewrite URLs in XML with Apache 2.4?

Apache 2.4 includes mod_proxy_html and that's great, it's catching all kinds of URLs inside the HTML coming back from the server and fixing them. But I've got a Seam app that sends back text/xml files to the client sometimes with fully qualified URLs that also need to be rewritten and mod_proxy_html doesn't fix them.
Apparently there was a mod_proxy_xml that used to exist separately from mod_proxy_html but Apache didn't include that. Is there a way to get mod_proxy_html configured to do the same thing? I need it to fix URLs in both the HTML and XML files coming back from a server.
Follow up:
I continue to fight with this and I've tried a few different solutions with no success including using mod_substitute (which somehow I'm configuring incorrectly because it never seems to substitute anything for anything) and using the force flag mod_proxy_html has to try and force it to do all files under a certain path.
This is an old question, but I just faced the same issue.
I tried with mod_proxy_html, compiled mod_proxy_xml, nothing worked.
@JonLin's suggestion is spot on, it works with mod_sed.
The only catch is that mod_sed is documented to work inside Directory sections.
If you declare a Location though, and use SetOutputFilter instead of AddOutputFilter (which requires a MIME type), it works beautifully.
The config that works is:
<Location "/">
SetOutputFilter Sed
OutputSed "s,http://internal:80,https://external.com,g"
</Location>
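For comparison, a mod_substitute version restricted to XML responses might look like the sketch below. This is not from the answer above: it assumes mod_substitute, mod_filter and mod_headers are loaded, and it guesses that a compressed backend response is one reason a substitution can appear to do nothing, which is why Accept-Encoding is stripped from the proxied request.

```apache
<Location "/">
    # Ask the backend for uncompressed output so the filter sees plain text
    RequestHeader unset Accept-Encoding

    # Only run the substitution on text/xml responses
    AddOutputFilterByType SUBSTITUTE text/xml
    Substitute "s|http://internal:80|https://external.com|n"
</Location>
```

The n flag treats the pattern as a fixed string rather than a regular expression, which avoids having to escape the dots and slashes in the URL.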

apache mod_proxy_html on Ubuntu ProxyHTMLEnable not working

I'm trying to use mod_proxy_html on Ubuntu which I installed from apt-get. The module is loading properly and all ProxyHTML* directives work except for the one that matters the most. When I do "ProxyHTMLEnable on" in my apache2.conf or vhost conf files, apache complains that it's an invalid directive and I must have misspelled it. Is anyone else having this issue on Ubuntu and what can be done to fix it?
Have you tried leaving out "ProxyHTMLEnable on" entirely? I think that directive is new and not in the version in Ubuntu.
Do put "SetOutputFilter proxy-html" in its place
While this isn't necessarily specific to the question, I figured I'd throw this out there for anyone else getting here from the Google super-highway.
I tried just removing the ProxyHTMLEnable On and adding SetOutputFilter proxy-html, but it still wasn't working for me. The "gotcha" in my case was that the content mod_proxy_html was trying to process was compressed.
Adding SetOutputFilter INFLATE;proxy-html;DEFLATE instead of SetOutputFilter proxy-html did it for me. (This will obviously lead to more processing being done.)
This site explains it much better than I can: http://wiki.uniformserver.com/index.php/Reverse_Proxy_Server_2:_mod_proxy_html_2#Cause_and_Solution_3
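Put together, the filter chain from this answer might sit in the config like the sketch below. It assumes mod_deflate is loaded (it provides INFLATE and DEFLATE), and the backend hostname internal.example is a placeholder, not something from the question.

```apache
<Location />
    # Decompress the backend response, rewrite links, then compress again
    SetOutputFilter INFLATE;proxy-html;DEFLATE

    # Placeholder mapping: backend URL prefix -> public path
    ProxyHTMLURLMap http://internal.example/ /
</Location>
```

A cheaper alternative, if mod_headers is available, is to drop the INFLATE/DEFLATE pair and add RequestHeader unset Accept-Encoding so the backend never compresses the response in the first place.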