I have a legacy product that I'm trying to support on an Apache server. After a recent update, the server began rejecting request headers that use only LF for newlines, and rebuilding the product is a tall order because of how old the code base is. Is there a setting, or a mod_rewrite command that can be leveraged, to make Apache accept request headers that use LF instead of CRLF, or to rewrite LFs as CRLFs in request headers?
Example header from app:
Host: www.ourhostname.com:80\n
Accept-language: en\n
user_agent: Our Old Application\n
\n
If I hex-edit the file to change the \n to \r\n, it works, but hex-editing a file for release as an update isn't desirable, and I'm trying to find something server-side to get Apache to stop choking on bare LFs. Thanks in advance for any help with this problem!
We had the same problem and found the vulnerability Apache fixed:
important: Apache HTTP Request Parsing Whitespace Defects CVE-2016-8743
https://httpd.apache.org/security/vulnerabilities_24.html
These defects are addressed with the release of Apache HTTP Server 2.4.25 and coordinated by a new directive;
HttpProtocolOptions Strict
which is the default behavior of 2.4.25 and later. By toggling from 'Strict' behavior to 'Unsafe' behavior, some of the restrictions may be relaxed to allow some invalid HTTP/1.1 clients to communicate with the server, but this will reintroduce the possibility of the problems described in this assessment. Note that relaxing the behavior to 'Unsafe' will still not permit raw CTLs other than HTAB (where permitted), but will allow other RFC requirements to not be enforced, such as exactly two SP characters in the request line.
So the HttpProtocolOptions Unsafe directive may be your solution. We decided not to use it.
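If you do go that route, it may be worth confining the relaxed parsing to the legacy app's virtual host so the rest of the server stays on Strict. A minimal sketch, using the hostname from the question as a placeholder:

<VirtualHost *:80>
    ServerName www.ourhostname.com
    # Relax the strict request parsing introduced in 2.4.25, for this vhost only.
    # This reintroduces the risks described in CVE-2016-8743.
    HttpProtocolOptions Unsafe
</VirtualHost>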
You could put a reverse proxy of some kind in front of Apache and have it convert the request into something Apache-friendly for you. Varnish Cache, which can also function as a plain HTTP processor, might work, or NGINX. Another option would be a little Node.js app that accepts the squiffy input, converts it into something better, and pipes it to the back end.
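Along those lines, here is a minimal sketch of such a shim, written in Python purely for illustration. The addresses are assumptions, and it deliberately ignores keep-alive, TLS, and request bodies split across reads; it only rewrites a bare-LF header block to CRLF and forwards everything else untouched:

import socket
import threading

LISTEN_ADDR = ("0.0.0.0", 8080)   # where the legacy client connects (assumption)
BACKEND_ADDR = ("127.0.0.1", 80)  # the real Apache listener (assumption)

def handle(client):
    backend = socket.create_connection(BACKEND_ADDR)
    try:
        # Read until the end of the header block, CRLF or bare-LF style.
        buf = b""
        while b"\r\n\r\n" not in buf and b"\n\n" not in buf:
            chunk = client.recv(4096)
            if not chunk:
                break
            buf += chunk
        # Normalize a bare-LF header block to CRLF; leave CRLF requests alone.
        if b"\r\n\r\n" not in buf:
            head, sep, body = buf.partition(b"\n\n")
            if sep:
                # Strip any stray CRs first so mixed endings don't double up.
                head = head.replace(b"\r\n", b"\n").replace(b"\n", b"\r\n")
                buf = head + b"\r\n\r\n" + body
        backend.sendall(buf)
        # Stream the response straight back to the client.
        while True:
            data = backend.recv(4096)
            if not data:
                break
            client.sendall(data)
    finally:
        backend.close()
        client.close()

srv = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
srv.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
srv.bind(LISTEN_ADDR)
srv.listen(16)
while True:
    conn, _ = srv.accept()
    threading.Thread(target=handle, args=(conn,), daemon=True).start()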
Related
I have an OPNsense firewall set up with HAProxy sitting on my WAN interface to reverse-proxy my web server.
The problem with my application (which is outsourced) is that it has a lot of Unicode characters in its URL parameters. Before installing OPNsense, I was running ISA Server 2006 with no problems.
As I read in its documentation, HAProxy only supports ASCII characters. However, I have many non-ASCII characters that are, by design, written into the URL as parameters.
These include Arabic characters and accented French characters. HAProxy considers them illegal, making the HTTP request invalid and returning error code 400 (Bad Request). After days of debugging and checking logs, I figured out that this is HAProxy's normal behavior.
One of the things I tried was to make HAProxy accept these characters, but it was not successful.
One last resort, before trying another reverse-proxy engine, is to encode these characters in JavaScript. But once I encode them, how do I decode them in the HAProxy configuration?
As it is, the HTTP response I get is 404 Not Found, because the encoded URL parameters are not being decoded properly.
Any suggestions ?
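For what it's worth, the encoding idea is consistent with how URLs normally work: percent-encoded parameters are pure ASCII on the wire, and it is the backend application, not the proxy, that decodes them, so HAProxy should not need to decode anything. A quick illustration in Python (the parameter value is made up; JavaScript's encodeURIComponent produces the same percent-escapes):

from urllib.parse import quote, unquote

param = "prénom-مرحبا"            # Arabic and accented French characters
encoded = quote(param)            # pure ASCII percent-escapes
assert encoded.isascii()          # acceptable to proxies that enforce ASCII URLs
assert unquote(encoded) == param  # the backend recovers the original value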
Yesterday I received an email from GCP about Load Balancers and upper-/lowercase headers. Part of the message reads:
After September 30, HTTP(S) Load Balancers will convert HTTP/1.1
header names to lowercase in the request and response directions;
header values will not be affected.
As header names are case-insensitive, this change will not affect
clients and servers that follow the HTTP/1.1 specification (including
all popular web browsers and open source servers). Similarly, as
HTTP/2 and QUIC protocols already require lowercase header names,
traffic arriving at load balancers over these protocols will not be
affected. However, we recommend testing projects that use custom
clients or servers prior to the rollout to ensure minimal impact.
Google talks specifically about request and response header names (not values), but, for example, is the Google Load Balancer asking me to replace a classic PHP redirection header "Location" with a lowercase "location"?
header("location: http://www.example.com/error/403");
Of course, the plan is to do what the standard says, but in many cases that work can't be done before the GCP deadline (September 30, 2019).
Since it's a standard, are all modern browsers prepared to handle case-insensitive header names?
Should I be worried about file naming (camel case)?
If that's the case, is there some Apache module (for example) I could use in the meantime while I change my code?
https://cloud.google.com/load-balancing/docs/https/
The HTTP/1.1 specification says that HTTP headers are case-insensitive. This applies only to the header name ("content-type"), not to the header's value ("application/json").
In the event that this new policy causes problems for you, you can contact Google Support and opt out temporarily.
Code that is correctly written and performs case-insensitive comparisons will not have problems. In most cases, you can use curl with various HTTP headers to test your backend code. Of course, a full code walkthrough is a good idea too.
Example curl command:
curl --http1.1 -H "x-goog-downcase-all-headers: test" http://example.com/
Curl documentation for the --http1.1 command line option:
https://curl.haxx.se/docs/manpage.html
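If you want to double-check your own backend code, the thing to verify is that header lookups compare names case-insensitively. A minimal sketch in Python (the function and names are illustrative, not from any particular framework):

# Look up a header by name, ignoring case, as HTTP/1.1 requires.
def get_header(headers, name):
    wanted = name.lower()
    for key, value in headers.items():
        if key.lower() == wanted:
            return value
    return None

# Works whether the load balancer delivered "Location" or "location".
assert get_header({"location": "/error/403"}, "Location") == "/error/403"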
Since it's a standard, are all modern browsers prepared to handle case-insensitive header names?
Yes. This has been the norm for a long time.
Should I be worried about file naming (camel case)?
No. The new changes do not affect values of HTTP headers, only the header names.
If that's the case, is there some Apache module (for example) I could use in the meantime while I change my code?
None that I am aware of.
Let's say we have Varnish configured with Apache as a backend.
For some odd reason, some clients send custom HTTP headers that are badly formed because they have a space before the header's colon (e.g. "X-CUSTOM : value"), causing a 400 Bad Request from Apache.
Is it possible to deal with this on the Varnish side and sanitize the headers, removing the extra space before the colon?
If you know a tool other than Varnish that can easily do this job, that's fine with me too.
Varnish will work.
It will simply discard the "invalid" header, and the request will proceed as normal.
So simply putting Varnish in front of Apache will fix the requests that would otherwise result in a 400.
I've confirmed this with Varnish 4.1; I wouldn't be 100% confident that other versions behave the same way.
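For reference, "putting Varnish in front of Apache" needs nothing more than a backend definition pointing at Apache. A minimal VCL sketch for Varnish 4.x (host and port are assumptions):

vcl 4.0;

backend default {
    .host = "127.0.0.1";   # Apache's address (assumption)
    .port = "8080";        # Apache's listener port (assumption)
}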
So let's start with some background. I have a 3-tier system, with an API implemented in Django running under mod_wsgi on an Apache2 server.
Today I decided to upgrade the server, running at DigitalOcean, from Ubuntu 12.04 to Ubuntu 14.04. Nothing special, except that Apache2 also got updated, to version 2.4.7. After wasting a good part of the day figuring out that they had changed the default folder from /var/www to /var/www/html, breaking functionality, I decided to test my API. Without my touching a single line of code, some of my functions had stopped working.
I'll use one of the smaller functions as an example:
# Returns the location information for the specified animal, within the specified period.
# (Imports added for completeness; the model import path is an assumption.)
import json

from django.http import HttpResponse
from django.views.decorators.csrf import csrf_exempt

from .models import ur_animal_location_reports


@csrf_exempt  # Prevents Cross Site Request Forgery errors.
def get_animal_location_reports_in_time_frame(request):
    start_date = request.META.get('HTTP_START_DATE')
    end_date = request.META.get('HTTP_END_DATE')
    reports = ur_animal_location_reports.objects.select_related('species').filter(
        date__range=(start_date, end_date),
        species__localizable=True,
    ).order_by('-date')

    # Filter by animal if the parameter was sent.
    if request.META.get('HTTP_SPECIES') is not None:
        reports = reports.filter(species=request.META.get('HTTP_SPECIES'))

    # Add each report to the result object.
    response = []
    for rep in reports:
        response.append(dict(
            ID=rep.id,
            Species=rep.species.ai_species_species,
            Species_slug=rep.species.ai_species_species_slug,
            Date=str(rep.date),
            Lat=rep.latitude,
            Lon=rep.longitude,
            Verified=(rep.tracker is not None),
        ))

    # Return the object as a JSON string.
    return HttpResponse(json.dumps(response, indent=4))
After some debugging, I observed that request.META.get('HTTP_START_DATE') and request.META.get('HTTP_END_DATE') were returning None. I tried many clients, ranging from REST clients (such as the one in PyCharm, and RestConsole for Chrome) to the Android app that normally communicates with the API, but the result was the same: those two parameters were not being sent.
I then decided to test whether other parameters were being sent, and to my horror, they were. In the above function, request.META.get('HTTP_SPECIES') would have the correct value.
After a bit of fiddling around with the names, I observed that ALL the parameters with a _ character in their name never made it to the API.
So I thought, cool, I'll just use - instead of _; that ought to work, right? Wrong. The - arrives at the API as a _!
At this point I was completely puzzled, so I decided to find the culprit. I ran the API using the Django development server:
sudo python manage.py runserver 0.0.0.0:8000
When sending the same parameters, using the same clients, they are picked up fine by the API! Hence Django is not causing this, and Ubuntu 14.04 is not causing this; the only thing that could be causing it is Apache 2.4.7!
Now, moving the default folder from /var/www to /var/www/html, thus breaking functionality, all for (in my opinion) a very stupid reason, is bad enough, but this is just too much.
Does anyone have an idea of what is actually happening here and why?
This is a change in Apache 2.4.
This is from the Apache HTTP Server documentation, version 2.4:
mod_cgi, mod_include, mod_isapi, ...: Translation of headers to environment variables is more strict than before, to mitigate some possible cross-site-scripting attacks via header injection. Headers containing invalid characters (including underscores) are now silently dropped. Environment Variables in Apache (p. 81) has some pointers on how to work around broken legacy clients which require such headers. (This affects all modules which use these environment variables.)
– Page 11
For portability reasons, the names of environment variables may contain only letters, numbers, and the underscore character. In addition, the first character may not be a number. Characters which do not match this restriction will be replaced by an underscore when passed to CGI scripts and SSI pages.
– Page 86
A pretty significant change, in other words. So you need to rewrite your application to send dashes instead of underscores, which Apache in turn will substitute with underscores.
EDIT
There seems to be a way around this. If you look at this document over at apache.org, you can see that you can fix it in .htaccess by copying the value of your foo_bar header into a new header called foo-bar, whose name Apache's environment-variable translation will in turn convert back to foo_bar. See the example below:
SetEnvIfNoCase ^foo.bar$ ^(.*)$ fix_foo_bar=$1
RequestHeader set foo-bar %{fix_foo_bar}e env=fix_foo_bar
The only downside is that you have to make one rule pair per header, but you won't have to change any code on either the client or the server side. Applied to this question's headers, it would look like the sketch below.
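An untested sketch following the same recipe, for the two headers from the question:

SetEnvIfNoCase ^Start.Date$ ^(.*)$ fix_start_date=$1
RequestHeader set Start-Date %{fix_start_date}e env=fix_start_date
SetEnvIfNoCase ^End.Date$ ^(.*)$ fix_end_date=$1
RequestHeader set End-Date %{fix_end_date}e env=fix_end_date

Django would then see Start-Date and End-Date as HTTP_START_DATE and HTTP_END_DATE again.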
Are you sure Django didn't get upgraded as well?
https://docs.djangoproject.com/en/dev/ref/request-response/
With the exception of CONTENT_LENGTH and CONTENT_TYPE, as given above, any HTTP headers in the request are converted to META keys by converting all characters to uppercase, replacing any hyphens with underscores and adding an HTTP_ prefix to the name. So, for example, a header called X-Bender would be mapped to the META key HTTP_X_BENDER.
The key bits are: Django converts '-' to underscore and also prepends 'HTTP_' to the name. If you are already adding an HTTP_ prefix when you call the API, it might be getting doubled up, e.g. 'HTTP_HTTP_SPECIES'.
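To make the mapping from the docs concrete, here is a small sketch of the conversion they describe (the function is mine, for illustration; it is not a Django API):

def header_to_meta(name):
    # Uppercase, hyphens to underscores, then an HTTP_ prefix,
    # except for the two special-cased headers.
    key = name.upper().replace("-", "_")
    if key in ("CONTENT_LENGTH", "CONTENT_TYPE"):
        return key
    return "HTTP_" + key

assert header_to_meta("X-Bender") == "HTTP_X_BENDER"
assert header_to_meta("Start-Date") == "HTTP_START_DATE"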
We are running two instances of CouchDB on separate computers behind Apache reverse proxies. When attempting to replicate between the two instances:
curl -X POST http://user:pass@localhost/couchdb/_replicate -d '{ "source": "db1", "target": "http://user:pass@10.1.100.59/couchdb/db1" }' --header "Content-Type: application/json"
(we started using curl to debug the problem)
we receive an error similar to:
{"error":"case_clause","reason":"{error,\n {{bad_return_value,\n {invalid_json,\n <<\"<!DOCTYPE HTML PUBLIC \\\"-//IETF//DTD HTML 2.0//EN\\\">\\n<html><head>\\n<title>404 Not Found</title>\\n</head><body>\\n<h1>Not Found</h1>\\n<p>The requested URL /couchdb/db1/_local/01e935dcd2193b87af34c9b449ae2e20 was not found on this server.</p>\\n<hr>\\n<address>Apache/2.2.3 (Red Hat) Server at 10.1.100.59 Port 80</address>\\n</body></html>\\n\">>}},\n {child,undefined,\"01e935dcd2193b87af34c9b449ae2e20\",\n {gen_server,start_link,\n [couch_rep,\n [\"01e935dcd2193b87af34c9b449ae2e20\",\n {[{<<\"source\">>,<<\"db1\">>},\n {<<\"target\">>,\n <<\"http://user:pass#10.1.100.59/couchdb/db1\">>}]},\n {user_ctx,<<\"user\">>,\n [<<\"_admin\">>],\n <<\"{couch_httpd_auth, default_authentication_handler}\">>}],\n []]},\n temporary,1,worker,\n [couch_rep]}}}"}
So after further research, it appears that Apache returns this error without attempting to access CouchDB (according to the log files). To be clear, when fed the following URL
/couchdb/db1/_local/01e935dcd2193b87af34c9b449ae2e20
Apache passes the request to CouchDB and returns CouchDB's 404 error. On the other hand when replication occurs the URL actually being passed is
/couchdb/db1/_local%2F01e935dcd2193b87af34c9b449ae2e20
which Apache decides is a missing document, returning its own 404 error without ever passing the request to CouchDB. This at least gives me some new leads, but I could still use help if anyone has an answer offhand.
The source CouchDB (localhost) is telling you that the remote URL was invalid. Instead of a CouchDB response, the source is receiving the Apache httpd proxy's file-not-found response.
Unfortunately, you may have some reverse-proxy troubleshooting to do. My first guess is the Host header the source is sending to the target. Perhaps it's different from when you connect directly from a third location?
Finally, I think you probably know this, but the path
/couchdb/db1/_local%2F01e935dcd2193b87af34c9b449ae2e20
is not a standard CouchDB path. By the time CouchDB sees a request, it should have the /couchdb stripped, so the query is for a document called _local%2f... in the database called db1.
Incidentally, it is very important not to let the proxy modify the paths before they hit couch. In particular, if you send %2f then CouchDB had better receive %2f and if you send / then CouchDB had better receive /.
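On the Apache side, the usual way to guarantee that is the pair of directives below: AllowEncodedSlashes NoDecode accepts %2F without decoding it, and the nocanon keyword stops mod_proxy from re-canonicalizing the path. A hedged sketch (the CouchDB address is an assumption, and the NoDecode argument needs a newer httpd than the 2.2.3 shown in the error above; it arrived in 2.2.18):

AllowEncodedSlashes NoDecode
ProxyPass /couchdb http://127.0.0.1:5984 nocanon
ProxyPassReverse /couchdb http://127.0.0.1:5984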
From the official documentation:
Note that HTTPS proxies are in theory supported but do not work in 1.0.1. This is because 1.0.1 ships with ibrowse version 1.5.5. The CouchDB version in trunk (from where 1.1 will be based) ships with ibrowse version 1.6.2. This later ibrowse contains fixes for HTTPS proxies.
Can you see which version of ibrowse is involved? Maybe update that version?
Another thought I have is with regard to the SSL certs. If you don't have any, and I know you don't :), then technically you're doing SSL wrong. In Java we know there are ways around this, but maybe try putting in proper certs, since all SSL stuff basically revolves around certs.
And for my last contribution (today), I would ask: have you looked through this document, which seems highly relevant?
http://wiki.apache.org/couchdb/Apache_As_a_Reverse_Proxy