Changing requests and responses with an Apache proxy server

I want to use an Apache proxy server (mod_proxy) to intercept all requests and responses to a web server. However, I want to change the requests and responses before passing them on. Simply rewriting URLs is easy and well documented, but the changes I want to make are more sophisticated: they need to inspect the request for user credentials and conditionally issue redirects.
Is this possible in Apache's mod_rewrite, possibly in combination with other modules?
While the main goal is to implement this in Apache, I would also be happy with an alternative solution which doesn't necessarily use Apache.
Here is a more precise explanation of what I want to achieve, to give a little more context:
Check each incoming request for user credentials. If credentials are present, replace them with user information which the web server can use to identify the user (ideally in the Authorization header).
For example, let's assume a request contains a cookie which authenticates it as being sent by the user "John". This cookie is removed, and the Authorization header is changed to something like: Authorization: Authenticated_by_proxy {"id":12345,"name":"John"}
Check each response to see whether it is a 403 error. If it is and the user is not logged in, redirect the user to a login page instead of forwarding the error.
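For illustration, a rough sketch of how parts of this could be wired with stock modules (the backend address, cookie name and login URL below are made up; note that mod_headers can only set a static Authorization value, so the real per-user lookup would need something like mod_lua, mod_perl or an external authentication helper):

# Hypothetical sketch only -- mod_proxy + mod_headers
ProxyPass        / http://backend.example.com/
ProxyPassReverse / http://backend.example.com/

# Strip the client's authentication cookie before forwarding
RequestHeader edit Cookie "(^|;\s*)auth=[^;]*" ""

# Inject an Authorization header for the backend (static value only)
RequestHeader set Authorization "Authenticated_by_proxy {\"id\":12345,\"name\":\"John\"}"

# Turn backend 403 responses into a redirect to the login page
# (with a full URL, ErrorDocument sends the client a redirect; it cannot
# check whether the user is actually logged in, though)
ProxyErrorOverride On
ErrorDocument 403 https://login.example.com/login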

Related

Windows Authentication issue with .Net Reverse Proxy using IIS custom HTTP module

We use a custom HTTP module in IIS as a reverse proxy for web applications. Generally this works well and has done for some time, but we've come across an issue with Windows Authentication (WA). We're using IE 11, IIS 10 and Server 2016.
When accessing the target site directly, WA works fine - we get a browser login dialog when the initial HTML page is requested and the subsequent requests (CSS, JS, etc) go through fine.
When accessing via our proxy, the same (correct behaviour) happens for the initial html page, the first CSS/JS request authenticates ok too, but the subsequent ones cause a browser login to popup.
What seems to happen on the 'bad' requests (i.e. those that cause the login dialog) is:
1) Browser decides it needs to authenticate, so sends an Authorization header (Negotiate, with an NTLM token)
2) Server responds (401) with a WWW-Authenticate: Negotiate response with a full NTLM token
3) Browser re-requests with an Authorization header (Negotiate, with a full NTLM token)
4) Server responds (401) with a WWW-Authenticate: Negotiate (with no token), which causes the browser to show the login dialog
5) With login credentials entered, Browser sends the same request as in (1) - identical NTLM token, server responds as in (2), Browser re-requests as in (3), but this time it works!
We've set up a test web site with one html page, requesting 3 JS and 2 CSS files to replicate this. On our test server we've got two sites, one using our reverse proxy and one using ARR. The ARR site works fine. Also, since step (5) above works, we believe that the proxy pass-through is fundamentally working, i.e. NTLM tokens are not being messed up by dodgy encoding, etc.
One thing that does work: if we use Fiddler and put breakpoints on each request, we're able to hold back the 5 sub-requests (JS & CSS files), letting one go through at a time. If we let each sequence complete (i.e. the NTLM token exchange for each URL/file, through to the 200 response) before releasing the next, then it works. This made us think that there is some interleaving effect (e.g. shared memory corruption) in our proxy; this is still a possibility.
So, we put code at the start of BeginRequest and the end of EndRequest with a SyncLock and a shared variable storing the Path (AppRelativeCurrentExecutionFilePath), so that our code would 'single-thread' each of these request/response exchanges. This does what we expected, i.e. it only allows one auth exchange to happen and reach a 200 before allowing the next. However, we still have the same problem of the server rejecting the first exchange. So, does this indicate something happening in/before BeginRequest, where if we hold the requests back in Fiddler they work, but not if we do it in our HTTP module?
Or is there some sort of timing issue where the manual breakpoints in Fiddler also mean we’re doing it at ‘human’ speed and therefore allowing things to work better?
One difference we can see is the 'Connection: Keep-Alive' header. It is in the request from the browser to our proxy site, but it is not passed from our proxy to the base site, yet the ARR site does pass it through... It's all HTTP 1.1, and we can't find a way to set Keep-Alive on our outgoing request - could this be it?
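For reference, if the outgoing call is made with HttpWebRequest, these look like the relevant knobs (a sketch only, since NTLM authenticates the connection rather than the individual request; 'targetUrl' and 'clientKey' are placeholders):

// requires System.Net
var outgoing = (HttpWebRequest)WebRequest.Create(targetUrl);
outgoing.KeepAlive = true;                              // pass Connection: Keep-Alive downstream
outgoing.UnsafeAuthenticatedConnectionSharing = true;   // allow reuse of the NTLM-authenticated connection
outgoing.ConnectionGroupName = clientKey;               // pin that connection to a single client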
Regarding 'things to try', we think we've eliminated things like having the site in the Intranet Zone for IE by having the ARR site work ok, and having the same IE settings for that site. Clearly, something is not right, so we could have missed something here!
In short, we've been working on this for days, and have tried most of what we can find on SO and elsewhere, but can't figure out what the heck is going on.
Any suggestions - let me know if you want any further info. All help will be very gratefully received!

Can I use vb.net to interrogate a website to know if it uses SSL

I have a program that asks the user to type in a URL, and click download. Then the program downloads the webpage.
However, some websites use SSL, and in that case the user has to prefix his URL with https:// for this to work.
The problem is that the user may not know whether the website uses SSL, and may type http://... instead of https://....
Is there some way to send a preliminary message to the website (from vb.net) asking whether the URL should start with https or just http? If there is, I can correct the user URL before attempting to retrieve the web page.
(I should say that it is not enough to use something like request.RequestUri.Scheme - as far as I know, this looks at the URL the user submitted, not the URL coming back from the server.)
Websites that use SSL will usually force requests onto HTTPS. That is, when you send a plain HTTP request, for example to http://www.example.com, the website responds with a redirect (HTTP status code 301 or 302) together with the URL that the client which initiated the request should be redirected to.
So, you can try HTTP first and check the response to see if there is a redirect; you will need to handle that in your code.
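A minimal sketch of that check (C# here, but it translates directly to VB.NET; the URL is just an example):

// requires System and System.Net
var probe = (HttpWebRequest)WebRequest.Create("http://www.example.com/");
probe.Method = "HEAD";
probe.AllowAutoRedirect = false;   // keep the 301/302 visible instead of following it

using (var response = (HttpWebResponse)probe.GetResponse())
{
    int status = (int)response.StatusCode;
    string location = response.Headers["Location"];
    if (status >= 300 && status < 400 && location != null &&
        location.StartsWith("https://", StringComparison.OrdinalIgnoreCase))
    {
        // the site forces SSL -- retry the download using the https:// URL in 'location'
    }
}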

How does Apache match authentication/authorization information with subsequent HTTP requests from the same user?

When you protect an area of your document root using either the server configuration or .htaccess, the server prompts for a username and password when someone requests those files from a browser. If the password matches the one from the authentication provider for that user, the documentation at http://httpd.apache.org/docs/2.2/howto/auth.html says that Apache will set environment variables for that user. In my case I'm building a PHP app, and using phpinfo() I gather that the environment variables set are REDIRECT_AUTHENTICATE_SAMACCOUNTNAME, AUTHENTICATE_SAMACCOUNTNAME (using Active Directory as the authentication provider), and REMOTE_USER. I believe this is what prevents the user from being prompted again and again on each subsequent request.
What I don't understand is how Apache matches requests from a user with the environment variables set for that user, and also when and how it knows to clear those variables. It doesn't appear to use cookies, because I cleared all the cookies for the domain in question, and it still doesn't ask me to reauthenticate unless I actually close the browser.
Ultimately I'm going to be working with PHP to get the user ID and to maintain state, but since PHP gets this information from Apache, I'd like to understand that context, and I can't seem to find these details anywhere. Thanks in advance.
Look at the HTTP headers your browser is sending. With Basic authentication (the usual case when Apache pops up a browser login dialog), once you have supplied a username and password, your browser keeps sending an Authorization header with those credentials on every subsequent request to that site until your browser session ends, or longer if you tell the browser to remember the credentials. Apache doesn't track users between requests at all: it simply re-checks the credentials that arrive with each request and sets the environment variables again.
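For illustration, the header that gets re-sent carries nothing more than the base64-encoded credentials (made-up values below):

Authorization: Basic am9objpzM2NyZXQ=      (base64 of "john:s3cret")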

JSONP over HTTPS

Is it possible to send a JSONP request from the domain http://www.a.com (not under my control) to the domain www.b.com (under my control) over HTTPS? If so, are the parameter values in the GET request encrypted, or are they logged in access logs in plain text?
I'm looking for a secure way to do cross-domain requests. Unfortunately, POST requests via CORS over SSL don't work with Internet Explorer: it doesn't support setting cookies via Access-Control-Allow-Credentials. Is there another way to achieve this goal?
For the second part of the question: HTTPS only encrypts the channel the request uses to transfer the data. Once the request arrives at the web server, all the request parameters will be logged in your access log in plain text.
You would need to use a POST request to prevent the data being written to the access log. However, you can't use JSONP over a POST request (it's not possible to send a POST request using a script tag).

HttpWebRequest cookie with empty domain

I have an ASP.NET MVC action that sends a GET request to another server via HttpWebRequest. I'd like to include all cookies in the original action's request in the new request. Some of the System.Web.HttpCookies in the original request have empty domain values (i.e. ""), which apparently doesn't cause any issues. When I create a System.Net.Cookie using the name, value, path, and domain of each of these cookies and add it to the request's CookieContainer, I get this error:
"System.ArgumentException: The parameter '{0}' cannot be an empty string. Parameter name: cookie.Domain"
Here's some code that will throw the same error (when the cookie is added):
var request = (HttpWebRequest)WebRequest.Create("http://www.whatever.com");
request.Method = "GET";
request.CookieContainer = new CookieContainer();
// The Add call throws the ArgumentException above because the Cookie's Domain is ""
request.CookieContainer.Add(new Cookie("MyCookieName", "MyCookieValue", "/", ""));
EDIT
I sort of fixed this by using "localhost" for the domain, instead of the null or empty string value from the original HttpCookie. So, why does an empty domain not work for the CookieContainer? And does HttpCookie use an empty value to signify localhost, or do I need to find another fix for this problem?
Disclaimer:
As stated earlier by @feroze, setting your cookies' domain to localhost is not going to work out so well for you. I'm assuming you're writing a helper that allows you to tunnel HTTP requests out to foreign domains. Note that this is not best practice and in a lot of cases is not needed (e.g. jQuery has a lot of cool cross-domain support built in; also see the new CORS specification). But sometimes you may be stuck doing this (e.g. the external resource is XML only, and is on a server that doesn't support CORS).
Background Information on Cookie Domains and How They Work:
If you haven't already, take a look at HTTP Cookie: Domain and Path on Wikipedia -- pretty much everything you need to know is in there.
When evaluating a cookie, the Domain and Path are taken into account by both the client (the "local" requester) and the web server (the "foreign" responder). When a client requests a resource, the client should only send cookies where those cookies match the Domain (or a more generic parent domain) and Path (or a more generic parent path) of the URI being requested.
Web browsers handle this correctly. If a web browser has a cookie for the domain "localhost" and you're requesting "google.com", for example, those cookies for the "localhost" domain won't be sent in the request to "google.com". -- In fact, most modern browsers won't just not send them, they'll completely ignore them in Set-Cookie response headers that they receive (these are called third-party cookies -- enabling the acceptance of third party cookies in your web browser is a huge privacy/security concern -- don't do it!).
It works in the other direction as well: even though it's unlikely for a client to include a third-party cookie in a request, if it does, the foreign web server is supposed to ignore it (and even some cookies for correct domains/paths, so as to prevent the infamous super-cookie issue: the web server hosting "example.com" should ignore cookies belonging to its parent domain ".com", because ".com" is a "public suffix").
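To see the same matching logic in .NET terms (made-up cookie names and hosts), a CookieContainer behaves like the browser described above:

// requires System and System.Net
var jar = new CookieContainer();
jar.Add(new Cookie("session", "abc", "/", "localhost"));
jar.Add(new Cookie("pref", "xyz", "/", "www.google.com"));

// Only cookies whose Domain matches the request URI are returned (and would be sent)
CookieCollection forGoogle = jar.GetCookies(new Uri("http://www.google.com/")); // contains "pref" only
CookieCollection forLocal  = jar.GetCookies(new Uri("http://localhost/"));      // contains "session" only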
What You Should Do [if you have to]:
The course of action I recommend is this: when you read in your client's cookies (I'm not an MVC guy, but in regular ASP.NET this would be Request.Cookies), loop through them (make sure to filter out your own site's legitimate cookies, especially SessionId, etc. -- or use Path properly so they never get sent to this page in the first place), then add them one at a time to the outgoing request's cookie collection, rewriting the domain to be "www.whatever.com" (per your example; if you're doing this dynamically, load the URL into a new Uri() object and use the .Host property) and setting the Path to "/". This builds the "Cookie" header for the outgoing request to the foreign web server.
When that request returns to your server, you then need to check its response for new cookies. Those cookies can be repackaged and sent back down to your client in much the same kind of loop as in the previous paragraph, except you'll want to rewrite the domain to be Request.Url.Host, and you'll want to set the path back to "/" unless the path to your pass-through page is static (I'm guessing it isn't, since you're using MVC), in which case you'd set it to Request.Url.AbsolutePath, for instance.
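The outgoing half of that loop might look roughly like this (plain ASP.NET names assumed; the target URL is just your example, and the session-cookie filter is illustrative):

// requires System, System.Net and System.Web
var targetUri = new Uri("http://www.whatever.com/");
var outgoing = (HttpWebRequest)WebRequest.Create(targetUri);
outgoing.CookieContainer = new CookieContainer();

foreach (string name in Request.Cookies)
{
    // don't leak your own site's cookies (session ID, auth ticket, ...)
    if (name == "ASP.NET_SessionId") continue;

    HttpCookie original = Request.Cookies[name];
    // rewrite Domain/Path so the CookieContainer will actually send the cookie to the foreign host
    outgoing.CookieContainer.Add(new Cookie(original.Name, original.Value, "/", targetUri.Host));
}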
Happy Coding!
EDIT:
Also, you'll want to set the X-Forwarded-For header on the outgoing request, so that the website you're calling doesn't think your web server is one single client that's been spamming the crap out of them.
Not sure it solves your problem, but to add cookies without the Domain property you must add them to the request headers using HttpRequestHeader.Cookie, as follows.
// the header value is the raw cookie string, e.g. "name1=value1; name2=value2"
request.Headers.Add(HttpRequestHeader.Cookie, "Your cookies...");
Hope it helps!
Some background
This occurs because CookieContainer is a client-side container designed to be reused across multiple HttpWebRequests. Reusing it provides the expected cookie behavior: cookies set by the remote host are sent back with every subsequent HttpWebRequest targeted at the same host.
As a result of this reuse, a CookieContainer might actually contain cookies from multiple requests and/or hosts.
So, in order to determine which of the cookies in the container need to be sent with a particular HttpWebRequest to some host (domain), CookieContainer examines the Domain and the Path property.
That's why a Cookie in a CookieContainer needs to have a valid Domain.
Conversely, on the server side, cookies are delivered via a different type, CookieCollection, which is a simple list of cookies with no extra logic.
Specifically, in your case, while copying cookies from the CookieCollection to the CookieContainer you need to set the Domain property of every cookie to the domain you are going to forward the request to, so that the HttpWebRequest will know to include the cookies when sending the request.
You are trying to get cookies sent to localhost, right?
Why don't you do something like this where you give your own machine a real name:
Edit your hosts file and add a line "127.0.0.1 myname.com"
Test using myname.com - which is actually your localhost.
Your browser or app will not know the difference and will send cookies to myname.com if that is where the cookie belongs.
Detailed info:
The hosts file on Windows is located at C:\Windows\System32\drivers\etc\hosts