How can I log the full request url within a traefik reverse proxy - traefik

When I set up Traefik as a reverse proxy, I have some options for access logs, as described here: https://doc.traefik.io/traefik/observability/access-logs/#limiting-the-fieldsincluding-headers.
All available fields.names should be visible by default, but I get only the URL path in the access log output, like
"GET /path/to/my/site HTTP/1.1"
Is there a way to display the requested domain, e.g.
"GET mydomain.com/path/to/my/site HTTP/1.1"?
I need to figure out which domain was used for the request.

In Traefik 2.x you can set this in your traefik.yaml file:
accessLog:
  filePath: "/logs/traefik.log"
  bufferingSize: 100
  format: json
The resulting JSON will have "RequestAddr":"mydomain.com" along with a lot of other fields.
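To get the combined form asked for in the question, the JSON entries can be post-processed. The following is a minimal Python sketch, assuming each log line is one JSON object carrying the RequestMethod, RequestAddr, RequestPath and RequestProtocol fields; the sample line and the function name format_request are made up for illustration.

```python
import json


def format_request(log_line: str) -> str:
    """Build 'METHOD host/path PROTOCOL' from one JSON access log line."""
    entry = json.loads(log_line)
    return "{} {}{} {}".format(
        entry["RequestMethod"],
        entry["RequestAddr"],
        entry["RequestPath"],
        entry["RequestProtocol"],
    )


# Hypothetical sample entry, reduced to the fields used above.
sample = (
    '{"RequestAddr":"mydomain.com","RequestMethod":"GET",'
    '"RequestPath":"/path/to/my/site","RequestProtocol":"HTTP/1.1"}'
)
print(format_request(sample))  # prints: GET mydomain.com/path/to/my/site HTTP/1.1
```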

Related

Apache Access Log request that does not start with a forward slash /

I came across an IP address / unknown bot that made four HTTP requests, managing to request four different domain names in the following fashion without the first character being a forward slash /:
"GET www.example.com
When I make the test request http://localhost/www.example.com I see the following in Apache:
"GET /www.example.com
All other requests start with a forward slash. How did the bot manage to make such a request and how can I reproduce this to determine how to handle such requests?
Quoted Apache logs reduced to request method and URL to avoid off-topic comments.
Based on the way HTTP requests work, this can be achieved by sending a raw HTTP request to your IP address and writing both the GET request line and the Host header by hand. The normal form of the request is described in RFC 2616, section 5.1.2:
The most common form of Request-URI is that used to identify a
resource on an origin server or gateway. In this case the absolute
path of the URI MUST be transmitted (see section 3.2.1, abs_path) as
the Request-URI, and the network location of the URI (authority) MUST
be transmitted in a Host header field. For example, a client wishing
to retrieve the resource above directly from the origin server would
create a TCP connection to port 80 of the host "www.w3.org" and send
the lines:
GET /pub/WWW/TheProject.html HTTP/1.1
Host: www.w3.org
followed by the remainder of the Request. Note that the absolute path cannot be empty; if none is present in the
original URI, it MUST be given as "/" (the server root).
This can be done on Windows using PuTTY, or on Linux/Mac using nc (see answer here for more details: https://stackoverflow.com/a/3620596/1038813)
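The same experiment can be scripted. Below is a minimal Python sketch, assuming a test server you control; the helper names build_raw_request and send_raw_request are hypothetical. It writes the request line by hand so that the request target has no leading slash, which ordinary HTTP clients will not let you do.

```python
import socket


def build_raw_request(target: str, host: str) -> bytes:
    """Assemble a raw HTTP/1.1 GET request with an arbitrary request target."""
    lines = [
        f"GET {target} HTTP/1.1",
        f"Host: {host}",
        "Connection: close",
        "",  # blank line terminates the header section
        "",
    ]
    return "\r\n".join(lines).encode("ascii")


def send_raw_request(server: str, port: int, payload: bytes) -> bytes:
    """Send the payload over a plain TCP socket and return the raw response."""
    with socket.create_connection((server, port), timeout=5) as conn:
        conn.sendall(payload)
        chunks = []
        while True:
            data = conn.recv(4096)
            if not data:
                break
            chunks.append(data)
    return b"".join(chunks)


# Request target without a leading slash, as seen in the bot's requests.
payload = build_raw_request("www.example.com", "localhost")
print(payload)
```

Passing the payload to send_raw_request("localhost", 80, payload) against your own test server lets you see how the request is logged and answered.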

Caddy as reverse proxy to rewrite a http redirect url from an upstream response

I have a backend that does not work when running behind a reverse proxy, because I cannot configure a custom base URL.
For the login process the backend makes heavy use of HTTP redirects, but because it is behind a reverse proxy it sends redirect URLs that are not reachable by the client.
So I was wondering if there is a way to rewrite the upstream HTTP Location header.
If the backend responds with
HTTP/1.1 301
Location: http://backend-hostname/auth/login
Caddy should rewrite the Location header to
HTTP/1.1 301
Location: http://www.my-super-site.com/service/a/auth/login
Is something like this possible?
I've seen that we can remove headers by declaring
header / {
- Location
}
but is it possible to replace the header and rewrite the URL?
I was also looking for an answer to this question, and unfortunately I've found these responses:
https://caddy.community/t/v2-reverse-proxy-but-upstream-server-redirects-to-nonexistent-path/8566
https://caddy.community/t/proxy-url-not-loading-site/5393/7
TL;DR:
You need to use subdomains rather than sub-paths for services that are not designed to run behind a proxy (or at least do not let you configure a base URL). :(
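For reference, the rewrite asked about in the question is a prefix substitution on the Location value. The Python sketch below only illustrates that mapping with the hostnames from the question. It is not Caddy configuration and says nothing about whether Caddy can do this; the function name rewrite_location is hypothetical.

```python
def rewrite_location(location: str, backend_origin: str, public_base: str) -> str:
    """Replace the backend origin at the start of a Location value.

    Values that do not start with the backend origin are returned unchanged.
    """
    if location == backend_origin or location.startswith(backend_origin + "/"):
        return public_base + location[len(backend_origin):]
    return location


print(rewrite_location(
    "http://backend-hostname/auth/login",
    "http://backend-hostname",
    "http://www.my-super-site.com/service/a",
))  # prints: http://www.my-super-site.com/service/a/auth/login
```

Note that this covers only the Location header. Links and asset paths inside response bodies would still point at the wrong place, which is part of why the linked threads recommend subdomains.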

Can the Host Header be different from the URL

We run a website which is hosted using WCF.
The website is hosted on https://foo.com and the SSL certificate is registered using the following command:
netsh http add sslcert hostnameport=foo.com:443
When we browse the website on the server, all is fine, and the certificate is valid.
There is a load balancer in front of the server which listens on bar.com and then forwards the request to our server.
The load balancer doesn't rewrite the GET URL, only the Host header.
The rewritten header looks like this:
GET https://foo.com/ HTTP/1.1
Host: bar.com
Connection: keep-alive
Now we have some issues which indicate that the SSL certificate is invalid in this case.
The load balancer itself has a certificate registered and listens on https://bar.com
Questions:
Is it OK/allowed for the GET URL and the Host in the HTTP header to be different?
If it is OK to have different values, under which URL should we run the site: the GET URL or the Host URL?
Well, referencing RFC 2616:
If Request-URI is an absolute URI, the host is part of the
Request-URI. Any Host header field value in the request MUST be
ignored.
So, back to your questions:
It is allowed, but it is a bad idea because it creates confusion. It is better to use a relative path, i.e.
GET /path HTTP/1.1
instead of
GET https://foo.com/path HTTP/1.1.
Modify the load balancer configuration to do so, or make both values the same.
If the Host header differs from the host in the request URI, the URI takes priority over the Host header.
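The precedence rule quoted above can be written as a short Python sketch. The function name effective_host is hypothetical, and this only illustrates the rule as stated in RFC 2616, not how any particular server implements it.

```python
from urllib.parse import urlsplit


def effective_host(request_target: str, host_header: str) -> str:
    """Return the host a server should use under the RFC 2616 rule.

    An absolute URI in the request line wins; otherwise the Host header is used.
    """
    parts = urlsplit(request_target)
    if parts.scheme and parts.netloc:
        return parts.netloc
    return host_header


print(effective_host("https://foo.com/", "bar.com"))  # prints: foo.com
print(effective_host("/path", "bar.com"))             # prints: bar.com
```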

Apache ProxyPass all requests

I have Proxy Pass somewhat working. I am using it like so
ProxyPass /chorus/ http://localhost:7070/
ProxyPassReverse /chorus/ http://localhost:7070/
This chorus folder does not exist, and I am accessing it through Apache on port 80 in the browser. Apache then passes the request to my application running on port 7070, which serves its web page. The functionality within the page does not work, though, because the JavaScript requests images and other resources from Apache as, for example, /images/image1.jpg or /jsonrpc, and those requests do not go through the proxy. On port 80 there is no /images, because it is part of the :7070 application. If I do it as below, it works too, but there are too many folders. I need a way to have everything requested by the 7070 application handled by Apache as http://localhost:7070/image/...
ProxyPass /jsonrpc http://localhost:7070/jsonrpc
ProxyPass /image http://localhost:7070/image
Basically the page for the app loads but the content does not, the app is requesting /jsonrpc which looks something like this (proxied version)
Remote Address:192.168.1.150:80
Request URL:http://192.168.1.150/jsonrpc?tm=1419196786193
Request Method:POST
Status Code:404 Not Found
When in the app directly without proxy it looks like this
Remote Address:192.168.1.150:7070
Request URL:http://192.168.1.150:7070/jsonrpc?tm=1419196894248
Request Method:POST
Status Code:200 OK
It's not really something you can fix within the proxy module, other than by spelling out all possible paths, which you want to avoid. Your alternatives are:
a. change the application and make it proxy aware so that
a1. it produces paths by prefixing them with a configured path
a2. it interprets something like an X-Forwarded-Path header
a3. it uses the HTML base tag: http://www.w3schools.com/tags/tag_base.asp
b. change the proxy so that your app lives on its own vhost, e.g. chorus.example.org
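Option a1 can be sketched in a few lines of Python. The helper name with_base_path and the idea of reading the prefix from configuration are assumptions for illustration; the actual application would need its own equivalent wherever it generates URLs.

```python
def with_base_path(base_path: str, path: str) -> str:
    """Prefix an application path with a configured base path.

    An empty base path leaves the path as it is (direct access on :7070).
    """
    trimmed = base_path.strip("/")
    base = "/" + trimmed if trimmed else ""
    return base + "/" + path.lstrip("/")


# Behind the proxy the app would be configured with "/chorus/".
print(with_base_path("/chorus/", "/jsonrpc"))        # prints: /chorus/jsonrpc
print(with_base_path("", "/images/image1.jpg"))      # prints: /images/image1.jpg
```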

Wkhtmltoimage: how to prevent to create pdf/image from localhost/127.0.0.1?

I have a list of URLs and I want to create screenshots with wkhtmltoimage. Some of the URLs are redirected to localhost/127.0.0.1, and then I get a screenshot of my localhost (a list of directories). How can I prevent this?
You can do any of the following:
Configure your web server (running on localhost) to show a pretty page with a message you like, so that you get that screenshot instead of a list of directories
Configure your web server (running on localhost) to return an HTTP 404 error code
Clean up your list so that it does not include any URL that resolves to 127.0.0.1 before feeding it to wkhtmltoimage
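The last option could look roughly like the Python sketch below. The function names are hypothetical. It checks only what each hostname resolves to via DNS or the hosts file; it does not follow HTTP redirects, so a URL whose server redirects to localhost would need an extra check.

```python
import ipaddress
import socket
from urllib.parse import urlsplit


def resolves_to_loopback(url: str) -> bool:
    """Return True if the URL's host resolves to a loopback address."""
    host = urlsplit(url).hostname
    if host is None:
        return False
    try:
        infos = socket.getaddrinfo(host, None)
    except socket.gaierror:
        # Unresolvable hosts are not loopback; handle them separately if needed.
        return False
    for info in infos:
        # info[4][0] is the address string; drop any IPv6 zone suffix.
        address = info[4][0].split("%", 1)[0]
        if ipaddress.ip_address(address).is_loopback:
            return True
    return False


def drop_loopback_urls(urls):
    """Keep only the URLs that do not resolve to a loopback address."""
    return [u for u in urls if not resolves_to_loopback(u)]
```

The filtered list returned by drop_loopback_urls can then be passed to wkhtmltoimage one URL at a time.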