I am running HAProxy 1.6.3 and I have the X-Frame-Options header set on the frontend. I just came across a situation where the site is loaded in an iframe and the content is blocked because of that header. I tried setting up an ACL rule that looks like the following:
acl is_embeded path_beg /?embeded=1
http-response set-header x-frame-options "SAMEORIGIN" if !is_embeded
When I run haproxy -f /etc/haproxy/haproxy.conf -c I get the following warning:
[WARNING] 316/145915 (23701) : parsing [/etc/haproxy/haproxy.cfg:42] : acl 'is_embeded' will never match because it only involves keywords that are incompatible with 'frontend http-response header rule'
Is there a way to get this work?
That is because you are using a request ACL in the response stage.
You need to store the URL like this:
http-request set-var(txn.urlEmbeded) url
acl is_embeded var(txn.urlEmbeded) -m beg /?embeded=1
http-response set-header x-frame-options "SAMEORIGIN" if !is_embeded
Also, you are using path, which does not include the query string. You might need to use url, or url_param(embeded) with the found match method. You get the idea.
There are actually two problems with what you are doing.
First, the path fetch is only available during request processing -- not response processing. This is the reason for the warning. The path isn't allocated a buffer of its own -- the fetch just extracts it from the pending request buffer whenever it's evaluated, and that pending request buffer is released as soon as the request has been sent to the server.
Second, everything beginning with ? is not part of the path. That's the query string.
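To make the path/query distinction concrete, here is a small Python sketch using only the standard library. It is not HAProxy code; the helper name split_request_target is made up for illustration. It shows why a "begins with /?embeded=1" test can never succeed against the path alone.

```python
from urllib.parse import urlsplit


def split_request_target(target):
    """Return (path, query) for a request target such as '/?embeded=1'."""
    parts = urlsplit(target)
    return parts.path, parts.query


path, query = split_request_target("/?embeded=1")
print(path)    # the path is just "/"
print(query)   # the query string is "embeded=1"

# A prefix test against the path alone cannot match the "?..." part.
print(path.startswith("/?embeded=1"))
```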
The capture.req.uri fetch is the correct one to use, since it includes both the path and the query string, and since a memory buffer is allocated for it, it persists into response processing.
acl is_embeded capture.req.uri -m beg /?embeded=1
capture.req.uri
This extracts the request's URI, which starts at the first slash and ends before the first space in the request (without the host part). Unlike path and url, it can be used in both request and response because it's allocated.
http://cbonte.github.io/haproxy-dconv/1.6/configuration.html#7.3.6-capture.req.uri
Also note the correct spelling for the word embedded.
I came across an IP address / unknown bot that made four HTTP requests, managing to request four different domain names in the following fashion without the first character being a forward slash /:
"GET www.example.com
When I test the request http://localhost/www.example.com I see the following in Apache:
"GET /www.example.com
All other requests start with a forward slash. How did the bot manage to make such a request and how can I reproduce this to determine how to handle such requests?
Quoted Apache logs reduced to request method and URL to avoid off-topic comments.
Based on the way HTTP requests work, this can be achieved by sending a raw HTTP request to your IP address and writing both the GET request line and the Host header by hand. RFC 2616, section 5.1.2 describes the usual form:
The most common form of Request-URI is that used to identify a
resource on an origin server or gateway. In this case the absolute
path of the URI MUST be transmitted (see section 3.2.1, abs_path) as
the Request-URI, and the network location of the URI (authority) MUST
be transmitted in a Host header field. For example, a client wishing
to retrieve the resource above directly from the origin server would
create a TCP connection to port 80 of the host "www.w3.org" and send
the lines:
GET /pub/WWW/TheProject.html HTTP/1.1
Host: www.w3.org
followed by the remainder of the Request. Note that the absolute path cannot be empty; if none is present in the
original URI, it MUST be given as "/" (the server root).
This can be done on Windows using PuTTY, or on Linux/Mac using nc (see answer here for more details: https://stackoverflow.com/a/3620596/1038813)
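The same experiment can be scripted instead of typed into PuTTY or nc. The following is a minimal Python sketch under some assumptions: the function names build_raw_request and send_raw_request are hypothetical, and the commented-out call assumes a test server listening on 127.0.0.1 port 80. It writes the request line by hand, so the target does not have to start with a slash.

```python
import socket


def build_raw_request(target, host):
    """Build a hand-written HTTP/1.1 request; the target is sent exactly as given."""
    lines = [
        f"GET {target} HTTP/1.1",
        f"Host: {host}",
        "Connection: close",
        "",
        "",
    ]
    return "\r\n".join(lines).encode("ascii")


def send_raw_request(ip, port, payload, timeout=5.0):
    """Send the raw bytes and return everything the server answers."""
    with socket.create_connection((ip, port), timeout=timeout) as sock:
        sock.sendall(payload)
        chunks = []
        while True:
            data = sock.recv(4096)
            if not data:
                break
            chunks.append(data)
    return b"".join(chunks)


payload = build_raw_request("www.example.com", "localhost")
print(payload.decode("ascii"))

# Against a local test server (assumed to listen on 127.0.0.1:80):
# print(send_raw_request("127.0.0.1", 80, payload).decode("latin-1"))
```

How the server answers such a request depends on its version and configuration, so check the access log after sending it.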
I have a simple condition in my HAProxy config (I tried this for frontend and backend):
acl no_index_url path_end .pdf .doc .xls .docx .xlsx
rspadd X-Robots-Tag:\ noindex if no_index_url
It should add the no-robots header to content that should not be indexed. However it gives me this WARNING when parsing the config:
acl 'no_index_url' will never match because it only involves keywords
that are incompatible with 'backend http-response header rule'
and
acl 'no_index_url' will never match because it only involves keywords
that are incompatible with 'frontend http-response header rule'
According to documentation, rspadd can be used in both frontend and backend. The path_end is used in examples within frontend. Why am I getting this error and what does it mean?
Starting with HAProxy 1.6 you won't be able to just ignore the error message. To get this working, use the temporary variable feature:
frontend main
http-request set-var(txn.path) path
backend local
http-response set-header X-Robots-Tag noindex if { var(txn.path) -m end .pdf .doc }
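The idea behind this configuration can be illustrated with a toy Python model. This is not HAProxy code, and the Transaction class and its method names are invented for illustration: the path is copied into a transaction-scoped variable while the request is still available, and only that copy is consulted when the response is processed.

```python
class Transaction:
    """Toy model of one request/response exchange (not HAProxy internals)."""

    def __init__(self, request_path):
        self.request_buffer = request_path  # only usable during the request phase
        self.vars = {}                      # kept for the whole transaction

    def on_request(self):
        # Analogue of: http-request set-var(txn.path) path
        self.vars["txn.path"] = self.request_buffer

    def forward_to_server(self):
        # In this model the request buffer is gone once the request is sent.
        self.request_buffer = None

    def on_response(self, suffixes):
        # Analogue of: http-response set-header ... if { var(txn.path) -m end ... }
        headers = {}
        saved_path = self.vars.get("txn.path", "")
        if saved_path.endswith(tuple(suffixes)):
            headers["X-Robots-Tag"] = "noindex"
        return headers


def handle(path):
    txn = Transaction(path)
    txn.on_request()
    txn.forward_to_server()
    return txn.on_response([".pdf", ".doc"])


print(handle("/files/report.pdf"))  # {'X-Robots-Tag': 'noindex'}
print(handle("/index.html"))        # {}
```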
Apparently, even with the warning, having the acl within the frontend works perfectly fine. All the resources with .pdf, .doc, etc are getting the correct X-Robots-Tag added to them.
In other words, this WARNING is misleading and in reality the acl does match.
If you are using HAProxy below v1.6, create a new backend block (it could be a duplicate of the default backend) and add the special headers there. Then, in the frontend, use that backend conditionally, i.e.:
use_backend alt_backend if { some_condition }
Admittedly not an ideal solution, but it does the job.
I'm running apache2 on my Ubuntu 12.04 system and playing around with response headers. I want to change the behavior of HTTP response headers, especially the Content-Length header. I've tried adding the following lines to the IfModule mod_headers.c section of my apache2.conf:
Header set Static-Header "Static Content with nonsense"
Header set Content-Length "1338"
If I run curl -I localhost I get the expected header field Content-Length: 1338 (curl -I performs a HEAD request).
If I run curl -i the Content-Length is correctly calculated.
RFC 2616, section 9.4 says that the metainformation in the HTTP headers of a HEAD response SHOULD be identical to the information sent in response to a GET request.
Can someone explain this behavior to me?
Apache2 always calculates the content-length from scratch when it actually does deliver content. You'll experience that same behavior if you change that header using PHP. This is necessary to make sure the Content-Length matches the length of the content that is sent after the server applied, for example, compression (if mod_deflate is active).
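A quick Python sketch shows why a hand-set value cannot survive once the body is transformed. The sample body is made up for illustration, and 1338 is the value from the question; Python's gzip module stands in for mod_deflate here.

```python
import gzip

# Hypothetical response body, 14 bytes repeated 100 times.
body = b"Hello, world! " * 100
compressed = gzip.compress(body)

declared_length = 1338  # the value forced via "Header set Content-Length"

print(len(body))        # 1400
print(len(compressed))  # far smaller than the uncompressed body

# The declared value matches neither representation, so a server that
# actually sends a body has to recompute the header from what it sends.
print(declared_length == len(body), declared_length == len(compressed))
```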
Because of this, in any request that sends content, your change to that header is nullified. But as Apache doesn't even look at the content in a HEAD request (only its metadata), it does not calculate the content length. This is valid, as HEAD responses don't have any body, so content-length is always zero.
Therefore, you should:
a) not modify the content-length header in the first place
b) not send one for HEAD requests
I have a primary proxy which sends requests to a secondary proxy on which OpenSSO is installed.
If the OpenSSO agent determines that the user is not logged in, it raises a 302 redirect to the authentication server and provides the original (encoded) URL that the user requested as a GET parameter in the redirect location header.
However, the URL in the GET variable is that of the internal (secondary) proxy server, not the original proxy server. Therefore, I would like to edit/rewrite the "Location" response header to give the correct URL.
E.g.
http://a.com/hello/ (Original requested URL)
http://a.com/hello2/ (Secondary proxy with OpenSSO agent)
http://auth.a.com/login/?orig_request=http%3A%2F%2Fa.com%2Fhello2%2F (302 redirect to auth server with requested URL of second proxy server encoded in GET variable)
http://auth.a.com/login/?orig_request=http%3A%2F%2Fa.com%2Fhello%2F (Encoded URL is rewritten to that of the original request)
I have tried pretty much all combinations of headers and rewrites without luck so I'm thinking it may not be possible. The closest I got was this, but the mod_headers edit function does not parse environment variables.
# On the primary proxy.
RewriteEngine On
RewriteRule ^/(.*)$ - [E=orig_request:$1,P]
Header edit Location ^(http://auth\.a\.com/login/\?orig_request=).*$ "$1http%3A%2F%2Fa.com%2F%{orig_request}e"
ProxyPassReverse
ProxyPassReverse should do this for you:
This directive lets Apache adjust the URL in the Location, Content-Location and URI headers on HTTP redirect responses.
I'm not sure why your reverse proxy isn't behaving this way already, assuming you're using a pair of ProxyPass and ProxyPassReverse directives to define it.
Editing the Location Header
If you want to be able to edit the Location header as you describe, you can do it as of Apache 2.4.7:
For edit there is both a value argument which is a regular expression, and an additional replacement string. As of version 2.4.7 the replacement string may also contain format specifiers.
The "format specifiers" mentioned in the docs include being able to use environment variables, e.g. %{VAR}e.
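As a rough illustration of the rewrite being asked for, here is a Python sketch of the same substitution. It is not Apache configuration, and the function name rewrite_location is hypothetical. Unlike a plain %{orig_request}e substitution, this sketch also percent-encodes the original path, which is an assumption about the desired result based on the example URLs in the question.

```python
import re
from urllib.parse import quote

# Same pattern as in the question's Header edit attempt.
AUTH_PREFIX = re.compile(r"^(http://auth\.a\.com/login/\?orig_request=).*$")


def rewrite_location(location, orig_path):
    """Replace the encoded URL after orig_request= with the original one."""
    encoded = quote("http://a.com/" + orig_path, safe="")
    # A function replacement avoids any backslash handling in the new text.
    return AUTH_PREFIX.sub(lambda m: m.group(1) + encoded, location)


before = "http://auth.a.com/login/?orig_request=http%3A%2F%2Fa.com%2Fhello2%2F"
print(rewrite_location(before, "hello/"))
# http://auth.a.com/login/?orig_request=http%3A%2F%2Fa.com%2Fhello%2F
```

A Location value that does not match the auth server pattern is returned unchanged.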
You might also want to consider modifying your application such that the orig_request URL parameter is relativized, thus potentially eliminating the need for Header edits with environment variables.
Relative Path Location Header
You can also try using a relative path in your Location header, which would eliminate the need to explicitly map one domain to the other. This is officially valid as of RFC 7231 (June 2014), but was widely supported even before that. You can relativize your Location header using Apache Header edit directives (even prior to version 2.4.7, since it wouldn't require environment variable substitution). That would look something like this:
Header edit Location "(^http[s]?://)([a-zA-Z0-9\.\-]+)(:\d+)?/" "/"
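To check what that pattern does before putting it into a live configuration, the same regular expression can be tried in Python. This is only a sketch: the function name relativize is made up, and Python's re module is assumed to treat this particular pattern the same way Apache's regex engine does.

```python
import re

# Same pattern as in the Header edit directive above.
ABSOLUTE_PREFIX = re.compile(r"(^http[s]?://)([a-zA-Z0-9\.\-]+)(:\d+)?/")


def relativize(location):
    """Strip scheme, host and optional port, keeping the leading slash."""
    return ABSOLUTE_PREFIX.sub("/", location, count=1)


print(relativize("http://a.com/hello2/"))           # /hello2/
print(relativize("https://a.com:8443/hello2/?x=1")) # /hello2/?x=1
print(relativize("/already/relative"))              # /already/relative
```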
When using mod_deflate in Apache2, Apache will chunk gzipped content, setting the Transfer-encoding: chunked header. While this results in a faster download time, I cannot display a progress bar.
If I handle the compression myself in PHP, I can gzip it completely first and set the Content-length header, so that I can display a progress bar to the user.
Is there any setting that would change Apache's default behavior, and have Apache set a Content-length header instead of chunking the response, so that I don't have to handle the compression myself?
You could maybe play with SendBufferSize to get a value big enough to contain your response in one chunk.
Also, since chunked content is part of the HTTP/1.1 protocol, you could force an HTTP/1.0 response (and therefore an unchunked one: “A server MUST NOT send transfer-codings to an HTTP/1.0 client.”) by setting force-response-1.0 in your Apache configuration. But PHP breaks this setting; it's a long-known bug in PHP, and there is a workaround.
We could try to modify the request on the client side with a header that prevents chunked content, but the HTTP/1.1 specification says: "All HTTP/1.1 applications MUST be able to receive and decode the "chunked" transfer-coding", so I don't think there is any header like 'Accept' that can prevent the server from chunking content. You could, however, try to send your request as HTTP/1.0. That is not really a header of the request but its first line; it should certainly be possible with jQuery.
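To see why a chunked response gives a client nothing to base a progress bar on, here is a small Python sketch of the chunked transfer-coding. It is a simplified illustration (no trailers; chunk extensions are ignored), and the function names encode_chunked and decode_chunked are made up. Each chunk announces only its own size, and the total is known only once the terminating zero-size chunk arrives.

```python
def encode_chunked(pieces):
    """Encode byte strings as HTTP/1.1 chunks, ending with a zero-size chunk."""
    out = b""
    for piece in pieces:
        if piece:  # an empty piece would look like the terminating chunk
            size_line = format(len(piece), "x").encode("ascii")
            out += size_line + b"\r\n" + piece + b"\r\n"
    return out + b"0\r\n\r\n"


def decode_chunked(data):
    """Decode a complete chunked body back into the original bytes."""
    body = b""
    pos = 0
    while True:
        line_end = data.index(b"\r\n", pos)
        # Ignore any chunk extension after ';' and parse the hex size.
        size = int(data[pos:line_end].split(b";")[0], 16)
        if size == 0:
            return body
        start = line_end + 2
        body += data[start:start + size]
        pos = start + size + 2  # skip the CRLF that follows the chunk data


wire = encode_chunked([b"Hello, ", b"world!"])
print(wire)                  # b'7\r\nHello, \r\n6\r\nworld!\r\n0\r\n\r\n'
print(decode_chunked(wire))  # b'Hello, world!'
```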
One last thing: in HTTP/1.0 the 'Host' header is not mandatory, so verify that your HTTP/1.0 requests still send the 'Host' header if you work with name-based virtual hosts.
Edit: by using the technique cited in the workaround, you can see that you could tweak the Apache environment from PHP code. This can be used to force the 1.0 mode only for your special gzipped content, and you should use it to avoid running your complete application in HTTP/1.0 (or use the request mode to set HTTP/1.0 for your gzip requests).