Apache 2.4 rewriting directory URLs without trailing slash to https://default_site/dir/ instead of preserving domain - apache

This is a relatively recent behavioral change and appears to be related only to requests which include an "Upgrade-Insecure-Requests: 1" request header.
Apache has started rewriting such requests for sites which are HTTP-only to an HTTPS URL using the default site name instead of just adding the / at the end of the requested URL.
Example: URL submitted in browser: http://www.example.com/blah
Intended redirect: 301 to http://www.example.com/blah/
Instead redirects: 301 to https://default.site.configured/blah/
This happens whether it's a named virtual on the same address as the default server or a virtual using a separate address with separate Listen directives.
I understand all the arguments in favor of the idea that everything should always be encrypted and I don't want to get into a debate about that. This site doesn't consider the tradeoffs desirable at this time.
The default site does have SSL and is configured to redirect HTTP->HTTPS, but the www.example.com site is not configured that way and does not wish to implement SSL at this time.
Is there any way to get Apache 2.4 to disregard that "Upgrade" header and simply rewrite the URL as desired rather than altering the domain name?

After banging on this some more, I finally found the source of my woes.
This happens when you have IP based virtual hosts and did not configure a name for them using the "ServerName" directive.
tl;dr: If you are having this problem, try adding a "ServerName www.example.com" directive within the VirtualHost definition for the site and that should resolve it.
Details:
It does not happen until you encounter a URL that requires a rewrite other than adding a trailing /. That is, a request that doesn't contain the "Upgrade-Insecure-Requests: 1" header only gets the trailing / added, but a request with that header also has its protocol rewritten to https, which triggers the full URL rewrite.
In my case, the default host name had an SSL configuration, so it didn't fall back to HTTP after the rewrite or reject the rewrite as invalid.
YMMV, I did not continue to do an exhaustive test of all permutations once I found the solution.
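For illustration, a minimal sketch of the fix for an IP-based virtual host. The address 192.0.2.10 and the DocumentRoot path are placeholders, not values from my setup; only the ServerName line is the actual fix:
<VirtualHost 192.0.2.10:80>
    # Without this line the virtual host inherits its name from the main
    # server configuration, and that inherited name ended up in the redirect.
    ServerName www.example.com
    # Hypothetical path, adjust to your layout.
    DocumentRoot /var/www/example
</VirtualHost>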

Related

htaccess rewrite. always replace the first part after domain

My main API URL entry point is: https://www.sample-domain.com/abc/ and then I can have URI_REQUESTS which will follow, for e.g.:
https://www.sample-domain.com/abc/
https://www.sample-domain.com/abc/?do=something
https://www.sample-domain.com/abc/john (with trailing slash or not..)
I want to always be able to rewrite any request whose first URL part is different back to /abc/. Examples:
https://www.sample-domain.com/def/ (with trailing slash or not..)
https://www.sample-domain.com/def/?do=something or
https://www.sample-domain.com/def/john
I don't care how many parts the URL has after the first part, or whether it has trailing slashes or query strings. I always want to change the first part following the domain back to /abc/
But certain first parts have to be ignored. For example, if a request comes in as sample-domain.com/help/ then it should not be rewritten
Looks pretty straight forward:
RewriteEngine on
RewriteCond %{REQUEST_URI} !^/abc/
RewriteCond %{REQUEST_URI} !^/(help|help2|help3)/
RewriteRule ^/?[^/]+/(.*)$ /abc/$1 [END,QSA]
In case you receive an internal server error (http status 500) using the rule above then chances are that you operate a very old version of the apache http server. You will see a definite hint to an unsupported [END] flag in your http server's error log file in that case. You can either try to upgrade or use the older [L] flag, it probably will work the same in this situation, though that depends a bit on your setup.
This implementation will work likewise in the http server's host configuration or inside a dynamic configuration file (".htaccess" file). Obviously the rewriting module needs to be loaded inside the http server and enabled in the http host. In case you use a dynamic configuration file you need to take care that its interpretation is enabled at all in the host configuration and that it is located in the host's DOCUMENT_ROOT folder.
And a general remark: you should always prefer to place such rules in the http server's host configuration instead of using dynamic configuration files (".htaccess"). Those dynamic configuration files add complexity, are often a cause of unexpected behavior, are hard to debug and they really slow down the http server. They are only provided as a last option for situations where you do not have access to the real http server's host configuration (read: really cheap service providers) or for applications insisting on writing their own rules (which is an obvious security nightmare).
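As a rough sketch, enabling the interpretation of ".htaccess" files for rewriting could look like this in the host configuration. The directory path is an assumption, use your actual DOCUMENT_ROOT:
<Directory "/var/www/html">
    # mod_rewrite directives in .htaccess files need the FileInfo override.
    AllowOverride FileInfo
    # Per-directory rewriting also requires FollowSymLinks (or SymLinksIfOwnerMatch).
    Options +FollowSymLinks
</Directory>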

Apache 2.2 Mod Proxy ProxyPass behavior

I have a server, server.example.com, which serves Tomcat on port 80 via a ProxyPass/ProxyPassReverse to port 8080, and a Drupal site on the same box at server.example.com:8001. If I enter port 8001 explicitly, the Drupal site behaves properly, but I need to make it accessible via server.example.com/blog, so I created a ProxyPass/ProxyPassReverse mapping /blog to http://server.example.com:8001. This serves the initial page of the Drupal site correctly, but once the form on the Drupal home page is filled out and submitted, which POSTs to /, the site changes to the Tomcat site, presumably because the / is not relative to the current host on port :8001. How can I get the ProxyPass for /blog to remain persistent so that all subsequent requests remain within the :8001 VirtualHost (Drupal site)?
One thing I tried was with mod_rewrite:
RewriteCond %{HTTP_REFERER} /^blog/.*$
RewriteRule (.*) %{HTTP_HOST}:8001/$1 [L,P,NC]
But that did nothing at all as far as I can tell. I was hoping that if the initial request was for /blog then the referrer would be as well and I could keep requests on the :8001 virtualhost. Perhaps someone can explain why that is flawed.
The problem you are very likely running into is that the documents returned by Drupal include generated links that all reference / instead of /blog. mod_rewrite and ProxyPass don't do anything to the contents of documents. They only act upon the request (or, in the case of ProxyPassReverse, on response headers such as Location:).
To make an application that normally expects to be installed as / operate on a different URL, you need to either:
(a) Configure the application to be aware of the proper base URL. Many applications include such a setting in order to support exactly the situation you have described.
(b) Install some sort of filtering proxy that can modify the content of returned documents. For Apache, mod_proxy_html is made to do exactly this. This is included natively in Apache 2.4 but may need to be installed separately for 2.2.
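A minimal sketch of option (b), assuming mod_proxy_html and mod_headers are loaded and that Drupal emits root-relative links. The mapping below is illustrative only and would likely need tuning for a real site:
ProxyPass /blog/ http://server.example.com:8001/
ProxyPassReverse /blog/ http://server.example.com:8001/
<Location /blog/>
    # Turn on the HTML rewriting filter. Older mod_proxy_html releases
    # (as used with Apache 2.2) use "SetOutputFilter proxy-html" instead.
    ProxyHTMLEnable On
    # Rewrite links that start with / so they start with /blog/ instead.
    ProxyHTMLURLMap / /blog/
    # Ask the backend for uncompressed content so the filter can parse it.
    RequestHeader unset Accept-Encoding
</Location>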

How to rewrite Location response header in a proxy setup with Apache?

I have a primary proxy which sends requests to a secondary proxy on which OpenSSO is installed.
If the OpenSSO agent determines that the user is not logged in, it raises a 302 redirect to the authentication server and provides the original (encoded) URL that the user requested as a GET parameter in the redirect location header.
However, the URL in the GET variable is that of the internal (secondary) proxy server, not the original proxy server. Therefore, I would like to edit/rewrite the "Location" response header to give the correct URL.
E.g.
http://a.com/hello/ (Original requested URL)
http://a.com/hello2/ (Secondary proxy with OpenSSO agent)
http://auth.a.com/login/?orig_request=http%3A%2F%2Fa.com%2Fhello2%2F (302 redirect to auth server with requested URL of second proxy server encoded in GET variable)
http://auth.a.com/login/?orig_request=http%3A%2F%2Fa.com%2Fhello%2F (Encoded URL is rewritten to that of the original request)
I have tried pretty much all combinations of headers and rewrites without luck so I'm thinking it may not be possible. The closest I got was this, but the mod_headers edit function does not parse environment variables.
# On the primary proxy.
RewriteEngine On
RewriteRule ^/(.*)$ - [E=orig_request:$1,P]
Header edit Location ^(http://auth\.a\.com/login/\?orig_request=).*$ "$1http%3A%2F%2Fa.com%2F%{orig_request}e"
ProxyPassReverse
ProxyPassReverse should do this for you:
This directive lets Apache adjust the URL in the Location, Content-Location and URI headers on HTTP redirect responses.
I'm not sure why your reverse proxy isn't behaving this way already, assuming you're using a pair of ProxyPass and ProxyPassReverse directives to define it.
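For reference, such a pair might look like the sketch below. The backend host name secondary.internal is hypothetical, since the question does not say where the secondary proxy lives:
# Forward requests for /hello/ to the secondary proxy.
ProxyPass /hello/ http://secondary.internal/hello2/
# Map backend URLs at the start of redirect headers back to the public path.
ProxyPassReverse /hello/ http://secondary.internal/hello2/
One caveat, as far as I understand the directive: ProxyPassReverse compares the beginning of the header value against the backend URL, so a backend URL that only appears encoded inside a query string parameter (as in your orig_request example) may not be touched by it.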
Editing the Location Header
If you want to be able to edit the Location header as you describe, you can do it as of Apache 2.4.7:
For edit there is both a value argument which is a regular expression, and an additional replacement string. As of version 2.4.7 the replacement string may also contain format specifiers.
The "format specifiers" mentioned in the docs include being able to use environment variables, e.g. %{VAR}e.
You might also want to consider modifying your application such that the orig_request URL parameter is relativized, thus potentially eliminating the need for Header edits with environment variables.
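An untested sketch of what that could look like on the primary proxy with 2.4.7 or later, reusing the variable name from your attempt. It assumes the proxying itself is handled by ProxyPass, and that literal percent signs in the replacement have to be doubled (%%) now that the string is parsed for format specifiers:
RewriteEngine On
# Capture the originally requested path in an environment variable.
RewriteRule ^/(.*)$ - [E=orig_request:$1]
# Rebuild the orig_request parameter from that variable (%%3A is a literal %3A).
Header edit Location ^(http://auth\.a\.com/login/\?orig_request=).*$ "$1http%%3A%%2F%%2Fa.com%%2F%{orig_request}e"
Note that slashes inside the captured path are inserted as-is rather than percent-encoded, which may or may not be acceptable to the authentication server.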
Relative Path Location Header
You can also try using a relative path in your Location header, which would eliminate the need to explicitly map one domain to the other. This is officially valid as of RFC 7231 (June 2014), but was widely supported even before that. You can relativize your Location header using Apache Header edit directives (even prior to version 2.4.7, since it wouldn't require environment variable substitution). That would look something like this:
Header edit Location "(^http[s]?://)([a-zA-Z0-9\.\-]+)(:\d+)?/" "/"

Level of obscurity of destination URLs via mod_rewrite

To achieve a single layer of content delivery security, I'm looking into the possibility of obscuring a resource URL via an .htaccess RewriteRule:
RewriteEngine on
RewriteBase /js/
RewriteRule obscure-alias\.js http://example.com/sensitive.js
It would of course be implemented as:
<script type="text/javascript" src="obscure-alias.js"></script>
Because this is not a 301 redirect, but rather a routing scenario similar to that of many of the frameworks we use today, would it be safe to say that this RewriteRule adequately obfuscates the actual URL where this resource is located, or:
Can the destination URL still be found out via some HTTP header sniffing utility
Might a web browser be able to reveal the "Download URL"
I'm going to pre-answer my own questions by saying no to both since the "internal proxy" is taking place on the server-side and not on the client side if I understand it correctly: http://httpd.apache.org/docs/current/mod/mod_rewrite.html. I just wanted to confirm that when Apache goes to serve the destination URL, that it also isn't passing along information to the user agent what the URL was that it rewrote the original request as.
It depends on how you specify the redirect target.
If your http://example.com/ is running on the same server, there will be an internal redirect that is invisible to the client. From the manual:
Absolute URL
If an absolute URL is specified, mod_rewrite checks to see whether the hostname matches the current host. If it does, the scheme and hostname are stripped out and the resulting path is treated as a URL-path. Otherwise, an external redirect is performed for the given URL. To force an external redirect back to the current host, see the [R] flag below.
If the absolute URL points to a remote domain, a header redirect will be performed. A header redirect is visible to the client and will reveal the sensitive location.
To make sure no external redirect takes place, specify a relative URL like
RewriteRule obscure-alias\.js sensitive.js
Note that the sensitive JS file's URL can still be guessed.
To find out whether a request results in a header redirect, log in to a terminal (e.g. on a Linux server) and run
wget --server-response http://www.example.com
If the first HTTP/.... line (there may be more than one) is something that begins with a 3xx, like
HTTP request sent, awaiting response...
HTTP/1.1 302 Moved Temporarily
you are looking at a header redirect.
Possible using proxy throughput.
See http://httpd.apache.org/docs/2.4/rewrite/proxy.html
Also alluded to here as well: mod_rewrite not working as internal proxy
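A minimal sketch of the proxy variant, assuming mod_proxy and mod_proxy_http are loaded in addition to mod_rewrite. With the [P] flag the server fetches the target itself, so the client should never be sent the real URL:
RewriteEngine on
RewriteBase /js/
# Serve the remote file under the alias via an internal proxy request.
RewriteRule ^obscure-alias\.js$ http://example.com/sensitive.js [P]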

How can I transparently rewrite an old host url to a new host url?

I have two apache virtual hosts within the same domain (and on same physical system):
old.example.com
new.example.com
I'd like to be able to transparently rewrite or map certain old URLs to new ones. Example:
A request for http://old.example.com/foo would actually result in a request for http://new.example.com/foo
I want the http client (browser) to be unaware of the rewrite. In other words, I'm not looking to redirect. And I only want to rewrite specific URLs.
What can I add to either the virtual host or htaccess file(s) to accomplish this?
I guess you could set up your virtual hosts via mod_rewrite and then simply add those rewriting steps to the configuration.
If, however, all you are trying to do is to re-use some things you have in the file system, without any magic in your config files, I would use symbolic links instead. (I have no idea if there are any equivalents for windows servers, though.)
I found the answer here: http://httpd.apache.org/docs/2.0/misc/rewriteguide.html in the section titled Dynamic Mirror. I added this to my htaccess on http://old.example.com :
RewriteEngine on
RewriteBase /
RewriteRule ^foo http://new.example.com/foo [P]
The P flag tells the rule to handle the request via Proxy Throughput, i.e. the server fetches the target itself instead of redirecting the client.
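If you can edit the virtual host configuration instead of the htaccess file, a roughly equivalent sketch without mod_rewrite might be the following. It assumes mod_proxy and mod_proxy_http are loaded, and the *:80 address is a guess at your setup:
<VirtualHost *:80>
    ServerName old.example.com
    # Fetch /foo from the new host on behalf of the client.
    ProxyPass /foo http://new.example.com/foo
    # Map redirects issued by the new host back to the old host name.
    ProxyPassReverse /foo http://new.example.com/foo
</VirtualHost>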