How to rewrite Location response header in a proxy setup with Apache? - apache

I have a primary proxy which sends requests to a secondary proxy on which OpeenSSO is installed.
If the OpenSSO agent determines that the user is not logged in, it raises a 302 redirect to the authentication server and provides the original (encoded) URL that the user requested as a GET parameter in the redirect location header.
However, the URL in the GET variable is that of the internal (secondary) proxy server, not the original proxy server. Therefore, I would like to edit/rewrite the "Location" response header to give the correct URL.
E.g.
http://a.com/hello/ (Original requested URL)
http://a.com/hello2/ (Secondary proxy with OpenSSO agent)
http://auth.a.com/login/?orig_request=http%3A%2F%2Fa.com%2Fhello2%2F (302 redirect to auth server with requested URL of second proxy server encoded in GET variable)
http://auth.a.com/login/?orig_request=http%3A%2F%2Fa.com%2Fhello%2F (Encoded URL is rewritten to that of the original request)
I have tried pretty much all combinations of headers and rewrites without luck so I'm thinking it may not be possible. The closest I got was this, but the mod_headers edit function does not parse environment variables.
# On the primary proxy.
RewriteEngine On
RewriteRule ^/(.*)$ - [E=orig_request:$1,P]
Header edit Location ^(http://auth\.a\.com/login/\?orig_request=).*$ "$1http%3A%2F%2Fa.com%2F%{orig_request}e"

ProxyPassReverse
ProxyPassReverse should do this for you:
This directive lets Apache adjust the URL in the Location, Content-Location and URI headers on HTTP redirect responses.
I'm not sure why your reverse proxy isn't behaving this way already, assuming you're using a pair of ProxyPass and ProxyPassReverse directives to define it.
Editing the Location Header
If you want to be able to edit the Location header as you describe, you can do it as of Apache 2.4.7:
For edit there is both a value argument which is a regular expression, and an additional replacement string. As of version 2.4.7 the replacement string may also contain format specifiers.
The "format specifiers" mentioned in the docs include being able to use environment variables, e.g. %{VAR}e.
You might also want to consider modifying your application such that the orig_request URL parameter is relativized, thus potentially eliminating the need for Header edits with environment variables.
Relative Path Location Header
You can also try using a relative path in your Location header, which would eliminate the need to explicitly map one domain to the other. This is officially valid as of RFC 7231 (June 2014), but was was widely supported even before that. You can relativize your Location header using Apache Header edit directives (even prior to version 2.4.7, since it wouldn't require environment variable substitution). That would look something like this:
Header edit Location "(^http[s]?://)([a-zA-Z0-9\.\-]+)(:\d+)?/" "/"

Related

Apache 2.4 rewriting directory URLs without trailing slash to https://default_site/dir/ instead of preserving domain

This is a relatively recent behavioral change and appears to be related only to requests which include a "Upgrade-Insecure-Requests: 1" request header.
Apache has started rewriting such requests for sites which are HTTP-only to an HTTPS URL using the default site name instead of just adding the / at the end of the requested URL.
Example: URL submitted in browser: http://www.example.com/blah
Intended redirect: 301 to http://www.example.com/blah/
Instead redirects: 301 to https://default.site.configured/blah/
This happens whether it's a named virtual on the same address as the default server or a virtual using a separate address with separate Listen directives.
I understand all the arguments in favor of the idea that everything should always be encrypted and I don't want to get into a debate about that. This site doesn't consider the tradeoffs desirable at this time.
The default site does have SSL and is configured to redirect HTTP->HTTPS, but the www.foo.com site is not configured that way and does not wish to implement SSL at this time.
Is there any way to get Apache 2.4 to disregard that "Upgrade" header and simply rewrite the URL as desired rather than altering the domain name?
After banging on this some more, I finally found the source of my woes.
This happens when you have IP based virtual hosts and did not configure a name for them using the "ServerName" directive.
tl;dr: If you are having this problem, try adding a "ServerName www.example.com" directive within the VirtualHost definition for the site and that should resolve it.
Details:
It does not happen until you encounter a URL that requires a rewrite other than adding a trailing /. (i.e. if you get a request that doesn't contain the "Upgrade-Insecure-Requests: 1" header, it only gets the trailing / added, but if you get one with that header, it also tries to rewrite the protocol to https which triggers the full URL rewrite).
In my case, the default host name had an SSL configuration, so it didn't fall back to HTTP after the rewrite or reject the rewrite as invalid.
YMMV, I did not continue to do an exhaustive test of all permutations once I found the solution.

use Apache Alias instead of RewriteRule to serve HTML page

A simple Alias in Apache configuration not working -
Alias /url/path/some-deleted-page.html /url/path-modified/new-avatar-of-some-deleted-page.html
It gives "page not found".
However RewriteRule works as expected but it sends redirect status to browser. I want browser/user not to be aware of the redirect. Hence, I want to use Alias instead of RewriteRule. I want to confirm if mod_alias can be used to map individual URL.
I use ProxyPassMatch also which executes all html pages as PHP script. Also adding ProxyPass makes no diffrence.
ProxyPass /url/path/some-deleted-page.html !
Please help so that I can map individual URL (a bunch of them) with Alias instead of RewriteRule.
The purpose of mod_alias is to map requested URLs with a directory on the system running your httpd instance. It does not return anything to the browser (i.e. no redirection code, nothing). It is all done internally. Hence your client does not even know it is there.
Request: http://www.example.com/someurl/index.html
Configuration
[...]
DocumentRoot "/opt/apache/htdocs"
Alias "/someurl/" "/opt/other_path/someurl_files/"
[...]
In this scenario, users asking for any URL besides /someurl/ would receive files from /opt/apache/htdocs.
If a user asks for /someurl/, files from /opt/other_path/someurl_files/ will be used.
Still missing in this example is a <Directory> definition for securing the Alias directory.
You should read: https://httpd.apache.org/docs/2.4/mod/mod_alias.html
Alias will cover the case where you need to point a certain URL to a particular directory on the file system.
If you need to modify the filename (i.e. the client asks for file A, and you send back page B), you should use RewriteRule. And to hide the fact you changed the filename, use the [P] flag.
This directive allows you to use regex, yet still use a proxy mechanism. So your client does know what went on, as the address in his address bar does not change.

Level of obscurity of destination URLs via mod_rewrite

To achieve a single layer of content delivery security, I'm looking into the possibility of obscuring a resource URL via an .htaccess RewriteRule:
RewriteEngine on
RewriteBase /js/
RewriteRule obscure-alias\.js http://example.com/sensitive.js
It would of course be implemented as:
<script type="text/javascript" src="obscure-alias.js"></script>
Because this is not a 301 redirect, but rather a routing scenario similar to that of many of our frameworks we used today, would it be safe to say that this RewriteRule adequately obfuscates the actual URL where this resource is located, or:
Can the destination URL still be found out via some HTTP header sniffing utility
Might a web browser be able to reveal the "Download URL"
I'm going to pre-answer my own questions by saying no to both since the "internal proxy" is taking place on the server-side and not on the client side if I understand it correctly: http://httpd.apache.org/docs/current/mod/mod_rewrite.html. I just wanted to confirm that when Apache goes to serve the destination URL, that it also isn't passing along information to the user agent what the URL was that it rewrote the original request as.
It depends on how you specify the redirect target.
If your http://example.com/ is running on the same server, there will be an internal redirect that is invisible to the client. From the manual:
Absolute URL
If an absolute URL is specified, mod_rewrite checks to see whether the hostname matches the current host. If it does, the scheme and hostname are stripped out and the resulting path is treated as a URL-path. Otherwise, an external redirect is performed for the given URL. To force an external redirect back to the current host, see the [R] flag below.
if the absolute URL points to a remote domain, a header redirect will be performed. A header redirect is visible to the client and will reveal the sensitive location.
To make sure no external redirect takes place, specify a relative URL like
RewriteRule obscure-alias\.js sensitive.js
Note that the sensitive JS file's URL can still be guessed.
To find out whether a request results in a header redirect, log in onto a terminal (eg. on a Linux server) and do
wget --server-response http://www.example.com
If the first HTTP/.... line (there may be more than one) is something that begins with a 3xx, like
HTTP request sent, awaiting response...
HTTP/1.1 302 Moved Temporarily
you are looking at a header redirect.
Possible using proxy throughput.
See http://httpd.apache.org/docs/2.4/rewrite/proxy.html
Also alluded to here as well: mod_rewrite not working as internal proxy

Retain original request URL on mod_proxy redirect

I am running a WebApplication on a Servlet Container (port 8080) in an environment that can be accessed from the internet (external) and from company inside (intenal), e.g.
http://external.foo.bar/MyApplication
http://internal.foo.bar/MyApplication
The incomming (external/internal) requests are redirected to the servlet container using an apache http server with mod_proxy. The configuration looks like this:
ProxyPass /MyApplication http://localhost:8080/MyApplication retry=1 acquire=3000 timeout=600 Keepalive=On
ProxyPassReverse /MyApplication http://localhost:8080/MyApplication
I am now facing the problem that some MyApplication responses depend on the original request URL. Concrete: a WSDL document will be provided with a element that has a schemaLocation="<RequestUrl>?xsd=MyApplication.xsd" element.
With my current configuration it always looks like
<xs:import namespace="..." schemaLocation="http://localhost:8080/MyApplication?xsd=MyApplication.xsd"/>
but it should be
External Request: <xs:import namespace="..." schemaLocation="http://external.foo.bar/MyApplication?xsd=MyApplication.xsd"/>
Internal Request: <xs:import namespace="..." schemaLocation="http://internal.foo.bar/MyApplication?xsd=MyApplication.xsd"/>
I suppose this is a common requirement. But as I am no expert in configuration of the apache http server and its modules I would be glad if someone could give some (detailed) help.
Thanks in advance!
If you're running Apache >= 2.0.31 then you might try to set the ProxyPreserveHost directive as described here.
This should pass the original Host header trough mod_proxy into your application, and normally the request URL will be rebuild there (in your Servlet container) using the Host header, so the schema location should be build using the host and path infos from "before" the proxy.
(Posted here too for the sake of completeness)
Here is another alternative if you would like to retain both the original host name and the proxied host name.
If you are using mod_proxy disable ProxyPreserveHost in the Apache configuration. For most proxy servers, including mod_proxy, read the X-Forwarded-Host header in your application. This identifies the original Host header provided by the HTTP request.
You can read about the headers mod_proxy (and possible other standard proxy servers) set here:
http://httpd.apache.org/docs/2.2/mod/mod_proxy.html
You should be able to do a mod_rewrite in apache to encode the full URL as a query parameter, or perhaps part of the fragment. How easy this might be depends on whether you might use one or the other as part of your incoming queries.
For example, http://external.foo.bar/MyApplication might get rewritten to http://external.foo.bar/MyApplication#rewritemagic=http://external.foo.bar/MyApplication which then gets passed into the ProxyPass and then stripped out.
A bit of a hack, yes, and perhaps a little tricky to get rewrite and proxy to work in the right order and not interfere with each other, but it seems like it should work.

Changing Cookie Domains

I use apache as a proxy to my application web server and would like to on the fly, change the domain name associated with a sessionid cookie.
The cookie has a .company.com domain associated with it, and I would like using apache mod rewrite (or some similar module), transparently change the domain to app.company.com. Is this possible ? and if so, how would one go about it ?
You can only change the domain of a cookie on the client, or when it's being set on the server. Once a cookie has been set, the path and domain information for it only exists on the client. So existing cookies can't have their domain changed on the server, because that information isn't sent from the client to the server.
For example, if you have a cookie that looks like this on your local machine:
MYCOOKIE:123, domain:www.test.com, path:/
Your server will only receive:
MYCOOKIE:123
on the server. Why isn't the path and domain sent? Because the browser keeps that information on the client, and doesnt bother sending it along, since it only sends this cookie to your server if the page is at www.test.com and at the path /.
Since it's your server, you should be able to change your code that creates new cookies. If you felt you needed to do it outside of your code for some reason, you could do so with something like the following, but you'd have to look exactly at how your cookie is being written in the header to match it exactly. The following is an untested guess at a workable solution for this, using Apache's mod_headers:
<IfModule mod_headers.c>
Header edit Set-Cookie (.*)(domain=.company.com;)(.*) $1 domain=app.company.com; $2
</IfModule>
You can also use mod_headers to change the cookie received from the client, like so, if need be:
<IfModule mod_headers.c>
RequestHeader edit Cookie "OLD_COOKIE=([0-9a-zA-Z\-]*);" "NEW_COOKIE_NAME=$1;"
</IfModule>
This would only rename cookies you receive in the request.
ProxyPassReverseCookieDomain company.com app.company.com
or interchanging domains (as you are not clearly defining which is internal/external).
ref: https://httpd.apache.org/docs/2.4/en/mod/mod_proxy.html#ProxyPassReverseCookieDomain
I don’t know any module that provides such feature. So I guess you will need to write your own output filter using mod_ext_filter that does this for you.
But if you have control over the other server, it might suffice to just omit the cookie’s domain value so that the client will automatically choose the requested domain as the cookie’s domain.
I ended up just creating an intermediate page that via javascript changed the cookie domain to the proxy server (by omitting the domain value) and then re-directed user to the target page. That seemed to resolve the issue. Thanks for your answers.
If your web-app is capturing the Host: header and using that to determine the domain= portion of the cookie, you might also consider the Apache directive
ProxyPreserveHost On
which relays the Host: header from the client.
This only works if your app is designed to assume that its domain name is whatever the client suggests with the Host header. If your app is one of these, this will not only fix your cookies, but also any absolute URLs that the application generates, which can save you the overhead of otherwise needing to enable mod_substitute
ProxyPass "/" "http://company.com/"
ProxyPassReverse "/" "http://company.com/"
ProxyPassReverseCookieDomain "company.com" "app.company.com"
Note: If it comes second in ProxyPass it should be first in ProxyPassReverseCookieDomain... I spent half a day or more figuring this out :-/
Also see: https://httpd.apache.org/docs/2.4/mod/mod_proxy.html#proxypassreversecookiedomain