Apache Access Log request that does not start with a forward slash / - apache

I came across an IP address / unknown bot that made four HTTP requests, requesting four different domain names in the following fashion, without the first character being a forward slash /:
"GET www.example.com
When I test the request http://localhost/www.example.com myself, I see the following in Apache:
"GET /www.example.com
All other requests start with a forward slash. How did the bot manage to make such a request and how can I reproduce this to determine how to handle such requests?
Quoted Apache logs reduced to request method and URL to avoid off-topic comments.

Based on the way HTTP requests work, this can be achieved by opening a raw connection to your server's IP address and writing the request line and the Host header by hand, instead of letting a client build them for you. RFC 2616 describes how a well-formed request looks:
The most common form of Request-URI is that used to identify a
resource on an origin server or gateway. In this case the absolute
path of the URI MUST be transmitted (see section 3.2.1, abs_path) as
the Request-URI, and the network location of the URI (authority) MUST
be transmitted in a Host header field. For example, a client wishing
to retrieve the resource above directly from the origin server would
create a TCP connection to port 80 of the host "www.w3.org" and send
the lines:
GET /pub/WWW/TheProject.html HTTP/1.1
Host: www.w3.org
followed by the remainder of the Request. Note that the absolute path cannot be empty; if none is present in the
original URI, it MUST be given as "/" (the server root).
This can be done on Windows using PuTTY, or on Linux/Mac using nc (see answer here for more details: https://stackoverflow.com/a/3620596/1038813)
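For example, a minimal way to reproduce this from a shell is to write the request line by hand with nc (a sketch, assuming Apache is listening on port 80 of localhost):
printf 'GET www.example.com HTTP/1.1\r\nHost: www.example.com\r\nConnection: close\r\n\r\n' | nc localhost 80
Apache will typically reject such a malformed request (often with a 400 Bad Request), but the request line is logged verbatim, so the access log entry shows the URL without a leading forward slash, just like the bot's requests.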

Related

X-Forwarded-For HTTP Header implementation - explanation needed

I was assigned a task by my direct manager to make sure that all the websites in the company have the "X-Forwarded-For" HTTP header set, in order to receive the original IP of the users for our Web Application Firewall logs.
I am not a developer, but I need to make sure our developers do that, and they do not seem to understand what needs to go in the value of the header.
Looking at some examples, it seemed that some people put specific IPs like this:
X-Forwarded-For: <client>, <proxy1>, <proxy2>
which doesn't make any sense to me, because how can you type the IP in the value when it is different for every client?
Basically, I need our logs to contain the real IP of each computer that surfs behind a proxy or a load balancer.
Would appreciate some help :)
Thanks!
If you have one reverse proxy (or load balancer) between the client and the application server, then the proxy should add the header:
X-Forwarded-For: <client>
before forwarding the request on to the application server. The application server receives the request from the proxy IP but can deduce the client's IP from the value of the header.
If you have two reverse proxies (or load balancers) between the client and the application server, the first proxy (the one nearest the client) acts the same as above.
The second proxy receives the request from proxy1's IP and also receives the X-Forwarded-For header from proxy1. It then appends the IP address from which the request was received (proxy1) and passes the updated header to the application server as:
X-Forwarded-For: <client>, <proxy1>
Each proxy or load balancer is responsible for creating the header if it does not already exist, and for appending the IP address from which the request was received (i.e. the previous step in the chain).
Only the first IP address is needed to identify the client; the remaining IP addresses are there so you can check that the header has not been faked along the way.
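To make the chain concrete, here is a rough sketch with curl of what each hop would send, using made-up addresses (203.0.113.7 for the client, 10.0.0.1 for proxy1) and placeholder internal hostnames:
# What proxy1 sends on towards proxy2: it adds the client's address.
curl -H 'X-Forwarded-For: 203.0.113.7' http://proxy2.internal.example/
# What proxy2 sends on towards the application server: it appends proxy1's address.
curl -H 'X-Forwarded-For: 203.0.113.7, 10.0.0.1' http://app.internal.example/
Your developers never hard-code these values; each proxy or load balancer fills them in at request time from the connection it received, so the application only needs to read (and log) the header.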

Apache configuration: effect of explicit :80 in http header field (host)

We have a server running Apache providing services via a simple API. We have now stumbled upon the problem that we cannot access the API using a third-party library, although the resulting HTTP requests are ALMOST the same. The only difference - as far as we can tell from Wireshark - is the presence or absence of the explicit port 80. For example:
curl -d "..." http://www.example.com/foo/bar/
curl -d "..." http://www.example.com:80/foo/bar/
Both work, and Wireshark shows Host: www.example.com, i.e., without the port 80. As far as I understand, cURL, browsers, and most other clients remove the default port 80. So far, all fine.
Now, a third-party library we use to make requests requires a port to be set, and we need to set it to 80. When the library makes a request, Wireshark shows Host: www.example.com:80 - note the additional port information. This request fails, and as far as we can see in Wireshark, this failing request only differs with respect to the Host field.
Can this be a configuration issue in Apache? We currently have no direct access to the server to check the conf files. Or are we missing something completely different here?
From RFC 2616:
Host = "Host" ":" host [ ":" port ] ; Section 3.2.2
So "Host: www.example.com:80" is perfectly legitimate. But I have never seen port 80 (or 443 in the case of HTTPS) in the host field of a HTTP request. It is obviously required where the request is routed via a proxy to a non-standard port.
This would give me some concerns as to the quality of the "third-party library". My first port of call in resolving this would be to speak to the providers of the component - they have presumably come across the problem before.
You did not mention what access you have to the library - did you check that this is not a configurable option? Do you have access to the source code, and the permission to modify it? (if not, that would imply it is commercial, paid-for software - which should give you the right to some support).
I don't know what the solution is, but some obvious things to try would be:
configure the URL on the default vhost of the webserver rather than explicitly for www.example.com
or use mod_headers to rewrite the Host field
or put a forward proxy in front of the webserver, e.g. Squid, and add a URL rewriter (if Squid does not automatically strip the port from the Host field)
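One way to narrow this down without the third-party library is to reproduce its request with curl by forcing the Host header yourself (a sketch using the hostname and path from your example):
curl -v -d "..." -H 'Host: www.example.com:80' http://www.example.com/foo/bar/
If this request also fails, the :80 in the Host field really is the trigger; if it succeeds, the library's request must differ in some other way.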
Apache performs string matching with the Host field. So when the :80 is attached, the string matching will fail and Apache will consider it a URL it does not handle and reject it. That is why curl stripped it.
You can read more about the ServerName directive here, which is the setting Apache matches against the Host header.
Update
So the :80 has no effect and the string matching still works.
On my production server, I did not change Apache's configuration. I wrote some quick PHP to send out the GET request on a socket, and Apache still responded correctly with the :80 attached to the Host: field.
I also checked on the server itself and saw the request come in with the errant :80 attached to it, and Apache answered with a status of 200 and served the HTML.
There is something else wrong with the third-party software's request.

Can the Host Header be different from the URL

We run a website which is hosted using WCF.
The website is hosted on https://foo.com and the SSL certificate is registered using the following command:
netsh http add sslcert hostnameport=foo.com:443
When we browse the website on the server, all is fine, and the certificate is valid.
There is a load balancer in front of the server which listens on bar.com and then forwards the request to our server.
The load balancer doesn't rewrite the GET URL, only the Host header.
The rewritten request looks like this:
GET https://foo.com/ HTTP/1.1
Host: bar.com
Connection: keep-alive
Now we have some issues which indicate that the SSL certificate is invalid in this case.
The load balancer itself has a certificate registered, listening on https://bar.com
Questions:
Is it OK/allowed for the GET URL and the Host in the HTTP header to be different?
If it is OK to have different values, under which URL should we run the site? The GET URL or the Host URL?
Well, referencing RFC 2616:
If Request-URI is an absolute URI, the host is part of the
Request-URI. Any Host header field value in the request MUST be
ignored.
So, back to your questions:
It is allowed but a bad idea as it will create confusion; it is better to use a relative path, i.e.
GET /path HTTP/1.1
instead of
GET https://foo.com/path HTTP/1.1.
Modify the load balancer configuration to do so, or make both values the same.
If the Host header has a value different from the request URI, then the URI takes priority over the Host header.
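If you want to see the difference yourself, you can send both forms by hand with nc (a sketch only, assuming the backend also answers plain HTTP on port 80; adjust the host, port and path as needed):
# origin-form: relative path in the request line, host carried in the Host header
printf 'GET /path HTTP/1.1\r\nHost: bar.com\r\nConnection: close\r\n\r\n' | nc foo.com 80
# absolute-form: full URL in the request line; the server must then ignore the Host header
printf 'GET https://foo.com/path HTTP/1.1\r\nHost: bar.com\r\nConnection: close\r\n\r\n' | nc foo.com 80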

Why is CORS needed for localhost?

There's probably an answer already on stackoverflow that I'm missing, sorry in advance for that, I just can't find it.
I have a small TCP server running on my localhost that, for security reasons, will not support CORS.
My question is, if CORS is for cross-domain protection, why is it being required when a page on http://localhost/ requests a connection to http://localhost:xxxx?
I know I can turn off the security in my browser, but I'm trying to understand why localhost-to-localhost connections are being treated as cross-origin.
XMLHttpRequest cannot load http://localhost:8000/. No 'Access-Control-Allow-Origin' header is present on the requested resource. Origin 'http://localhost:63342' is therefore not allowed access. The response had HTTP status code 500.
Because localhost (implicitly port 80) is a different origin from localhost:8000.
See RFC 6454, Section 5:
If the two origins are scheme/host/port triples, the two origins
are the same if, and only if, they have identical schemes, hosts,
and ports.
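You can check what your server at localhost:8000 actually answers by sending the same cross-origin request with curl, using the ports from the error message:
curl -v -H 'Origin: http://localhost:63342' http://localhost:8000/
If the response carries no matching CORS header, the browser blocks the call, exactly as in the error you quoted.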
Same-origin Policy
The same-origin policy only permits scripts running in a browser to make requests to pages on the same origin. This means that requests must have the same URI scheme, hostname, and port number. This post on the Mozilla Developer Network clearly explains what constitutes an origin and when requests result in failure. If you send a request from http://www.example.com/, the following types of requests result in failure.
https://www.example.com/ – Different protocol (or URI scheme).
http://www.example.com:8080/myUrl – Different port (since HTTP requests run on port 80 by default).
http://www.myotherexample.com/ – Different domain.
http://example.com/ – Treated as a different host, since an exact match is required (notice there is no www.).
For more information refer to this link
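In practice, to let the page served from http://localhost:63342 call the server on port 8000, that server would have to include a response header along these lines (a generic illustration, not specific to any particular server software):
Access-Control-Allow-Origin: http://localhost:63342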

What is yourinfo.allrequestsallowed.net?

In my Apache installation, I keep seeing the following line in my access logs:
"POST http://yourinfo.allrequestsallowed.net/ HTTP/1.1" 200
It's really freaking me out because this site is not being hosted on my server (I checked the IP just to be 100% sure). I added a "Deny all" line since the site is still in development, and now the HTTP 200 response has changed to 403, as if the domain were being hosted on my server.
I'm incredibly confused and scared. Does anybody know what's going on? Can I Deny all to this domain that's apparently pointing to my server?
You may want to check to make sure you don't have ProxyRequests On set anywhere it's not supposed to be. Typically a request like that is aimed at a forward proxy, and the troubling bit is that you returned a 200 response, which could indicate that the request was successfully proxied.
Take a look at this wiki page about Proxy abuse.
My server is properly configured not to proxy, so why is Apache returning a 200 (Success) status code?
That status code indicates that Apache successfully sent a response to the client, but not necessarily that the response was retrieved from the foreign website.
RFC 2616 section 5.1.2 mandates that Apache must accept requests with absolute URLs in the request-URI, even for non-proxy requests. This means that even when proxying is turned off, Apache will accept requests that look like proxy requests. But instead of retrieving the content from the foreign site, Apache will serve the content at the corresponding location on your website. Since the hostname probably doesn't match a name for your site, Apache will look for the content on your default host.
But it's probably worthwhile to check that you aren't proxying. Otherwise, it's not really that big of a deal.
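A quick way to verify that forward proxying really is off is to ask your own server to proxy a request for a foreign site, for example with curl (your.server.example is a placeholder for your own hostname or IP):
curl -v -x http://your.server.example:80 http://www.example.com/
If proxying is disabled you should get back your own site's content (or an error page) rather than the content of www.example.com.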
After Jon Lin pointed me in the right direction, I figured it out.
After disabling mod_proxy and enabling mod_security, I added the following to my virtual host configuration:
SecRuleEngine On
# Drop, in phase 1, any request whose request line contains "://" (i.e. an absolute URI, as used by proxy probes).
# Note: ModSecurity 2.7 and later also require a unique "id" action in every rule.
SecRule REQUEST_LINE "://" drop,phase:1
And then restarted Apache. It drops the connection without returning any data, which uses fewer resources and less bandwidth during brute-force and DDoS attacks.
Also, it shows as an HTTP 404 Response in the access logs.
EDIT: I updated the rule to drop all types of proxy requests (http, https, ftp). I don't know how many protocols can be used this way, but I'd rather be safe than sorry.