uwsgi breaks headers - http-headers

I'm using Nginx + uwsgi + python3
Sending a single header via start_response works fine, but as soon as I send more than one header, the output goes wrong.
For example, if I write:
start_response('200 OK', [('Last-Modified', 'Wed, 11 Jan 2012 00:00:00 GMT'), ('Content-Type', 'text/html; charset=windows-1251')])
The headers sent are:
HTTP/1.1 200 OK
Transfer-Encoding: chunked
Server: nginx/1.0.11
Connection: close
Date: Wed, 11 Jan 2012 04:17:22 GMT
Content-Type: text/html; charset=windows-1251
Content-Type: text/html; charset=windows-12
uwsgi sends the same header twice and, what's more, the second copy is truncated.

Which uWSGI and nginx versions are you using? I cannot reproduce your error in either 0.9.8.x or 1.0.x.
You can check the real headers sent by uWSGI by putting it in HTTP mode with --http/--http-socket.
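To rule out the application itself before looking at uWSGI or nginx, the WSGI callable can be invoked directly and the headers it passes to start_response inspected. The following is only a sketch: the names application and capture_headers are hypothetical, and the body returned is a placeholder, not the asker's real page.

```python
from wsgiref.util import setup_testing_defaults


def application(environ, start_response):
    # The same two headers as in the question.
    headers = [
        ("Last-Modified", "Wed, 11 Jan 2012 00:00:00 GMT"),
        ("Content-Type", "text/html; charset=windows-1251"),
    ]
    start_response("200 OK", headers)
    return [b"<html></html>"]  # placeholder body


def capture_headers(app):
    """Call a WSGI app without any server and return (status, headers, body)."""
    environ = {}
    setup_testing_defaults(environ)  # fills in a minimal valid WSGI environ
    captured = {}

    def start_response(status, headers, exc_info=None):
        captured["status"] = status
        captured["headers"] = list(headers)

    body = b"".join(app(environ, start_response))
    return captured["status"], captured["headers"], body


if __name__ == "__main__":
    status, headers, _ = capture_headers(application)
    print(status)
    for name, value in headers:
        print(f"{name}: {value}")
```

If each header shows up exactly once here, the duplication is happening further down the chain (uWSGI or nginx), which is what the --http/--http-socket check helps to narrow down.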

Use wget to download pdf with no direct link

Some websites provide pdf files for viewing but I can't download such pdf files with wget.
Opening the page in question in my browser displays the pdf:
https://www.lokalmatador.de/epaper/ausgabe/gemeinderundschau-muehlhausen-14-2021/
But using the following command I only get a pdf file with 0 length.
wget --content-disposition -nd https://www.lokalmatador.de/epaper/ausgabe/gemeinderundschau-muehlhausen-14-2021/
I tried some combinations with saving and loading cookies and referer but nothing works.
At this point I'm just curious what is happening and why wget is not fetching anything except maybe an empty index.html.
Looking at the server response, wget reported a content length of 0 (note the malformed Content-Length header in the response):
--2021-04-17 14:59:35-- https://www.lokalmatador.de/epaper/ausgabe/gemeinderundschau-muehlhausen-14-2021/
Resolving www.lokalmatador.de (www.lokalmatador.de)... 37.202.6.70
Connecting to www.lokalmatador.de (www.lokalmatador.de)|37.202.6.70|:443... connected.
HTTP request sent, awaiting response...
HTTP/1.1 200 OK
Date: Sat, 17 Apr 2021 13:59:36 GMT
Server: Apache
Set-Cookie: fe_typo_user=477e8a1d2b3dd74bc5b6b408a6d74edd; expires=Mon, 17-May-2021 13:59:36 GMT; Max-Age=2592000; path=/; domain=.lokalmatador.de; httponly; samesite=lax
Upgrade: h2,h2c
Connection: Upgrade, Keep-Alive
Content-Length: Array
Cache-Control: max-age=2592000
Expires: Mon, 17 May 2021 13:59:36 GMT
X-UA-Compatible: IE=edge
X-Content-Type-Options: nosniff
Keep-Alive: timeout=5, max=100
Content-Type: application/pdf
Length: 0 [application/pdf]
Remote file exists but does not contain any link -- not retrieving.
So I looked at the manual:
https://www.gnu.org/software/wget/manual/html_node/HTTP-Options.html
And there is an option exactly for this:
‘--ignore-length’
Unfortunately, some HTTP servers (CGI programs, to be more precise) send out bogus Content-Length headers, which makes Wget go wild, as it thinks not all the document was retrieved. You can spot this syndrome if Wget retries getting the same document again and again, each time claiming that the (otherwise normal) connection has closed on the very same byte.
With this option, Wget will ignore the Content-Length header—as if it never existed.
Then the wget command started working as expected:
wget --ignore-length -O epaper.pdf https://www.lokalmatador.de/epaper/ausgabe/gemeinderundschau-muehlhausen-14-2021
Here is the output I see with --ignore-length:
--2021-04-17 14:56:19-- https://www.lokalmatador.de/epaper/ausgabe/gemeinderundschau-muehlhausen-14-2021
Resolving www.lokalmatador.de (www.lokalmatador.de)... 37.202.6.70
Connecting to www.lokalmatador.de (www.lokalmatador.de)|37.202.6.70|:443... connected.
HTTP request sent, awaiting response... 200 OK
Length: ignored [application/pdf]
Saving to: ‘epaper.pdf’
epaper.pdf [ <=> ] 4.39M 1.23MB/s in 3.6s
2021-04-17 14:56:23 (1.21 MB/s) - ‘epaper.pdf’ saved [4601842]
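The root cause in the log above is the header "Content-Length: Array", which is not a number, and wget reported it as a length of 0. A client written by hand can guard against this by treating a non-numeric value as "no length declared" and reading the body to the end instead. The sketch below illustrates that idea only; declared_length is a hypothetical helper, not part of wget or any library.

```python
def declared_length(headers):
    """Return Content-Length as an int, or None if it is absent or not a number.

    `headers` is assumed to be a plain dict of header name -> value.
    """
    value = headers.get("Content-Length")
    if value is None:
        return None
    value = value.strip()
    # Only plain ASCII digits are acceptable; "Array" from the log above is not.
    if not (value.isascii() and value.isdigit()):
        return None
    return int(value)


if __name__ == "__main__":
    print(declared_length({"Content-Length": "Array"}))  # bogus value
    print(declared_length({"Content-Length": "62"}))     # valid value
```

When the function returns None, the caller would fall back to reading until the connection closes, which is in effect what --ignore-length makes wget do.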

GET Bucket op response + AWS S3 + Content-Length header

Just wanted to know if the GET Bucket op response ever skips the Content-Length header. I tested this and I saw that there was no Content-Length header in the response for the GET Bucket op.
How does an application reading the response know where the body ends if the response doesn't contain a Content-Length header?
Request-Response Snippet:
GET /?max-keys=1000&prefix&delimiter=%2F HTTP/1.1
Date: Sat, 09 Apr 2016 18:27:23 GMT
x-amz-request-payer: requester
Authorization: AWS AKIAIP3KAUILC4GG7A2A:UG3bGvIjayrxrkxEX1mfrvETy/M=
Connection: Keep-Alive
User-Agent: Cyberduck/4.9.19632 (Mac OS X/10.10.5) (x86_64)
HTTP/1.1 200 OK
x-amz-id-2: yg76HSq5j0mi0oR6dXF8ZfGq722kHBWiMQmNvXPqiLxr1S4nGj5GVn1RVrPQrOUfNynxxaMSYEY=
x-amz-request-id: B4468E68E10B6AEF
Date: Sat, 09 Apr 2016 18:27:25 GMT
x-amz-bucket-region: us-east-1
Content-Type: application/xml
Server: AmazonS3
Connection: close
<?xml version="1.0" encoding="UTF-8"?>
<ListBucketResult xmlns="http://s3.amazonaws.com/doc/2006-03-01/">......</ListBucketResult>
Thanks!
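The Content-Length header is optional in a response. Even when it is present, it may not match the size of the content you end up with: for a gzipped response it gives the compressed size, not the decompressed one. So to answer the question: when no Content-Length is received (and the response is not chunked), the client keeps reading until the server closes the connection. Your response carries Connection: close, which is exactly that case.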
In Java, keep calling InputStream.read() until it returns -1.
See also: Is the Content-Length header required for an HTTP/1.0 response?
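The same read-until-close loop described for Java above can be sketched in Python with a plain socket: recv() returns an empty bytes object once the peer has closed the connection, which plays the role of read() returning -1. The helper names read_until_close and split_response are hypothetical, and the sketch assumes a response that is neither chunked nor framed by Content-Length.

```python
import socket


def read_until_close(sock, bufsize=4096):
    """Read from a connected socket until the peer closes the connection."""
    chunks = []
    while True:
        data = sock.recv(bufsize)
        if not data:  # b"" means the server closed the connection
            break
        chunks.append(data)
    return b"".join(chunks)


def split_response(raw):
    """Split a raw HTTP response into (header block, body) at the blank line."""
    head, _, body = raw.partition(b"\r\n\r\n")
    return head, body


if __name__ == "__main__":
    # A local socket pair stands in for the server so the example runs offline.
    server, client = socket.socketpair()
    server.sendall(b"HTTP/1.1 200 OK\r\nConnection: close\r\n\r\n<xml/>")
    server.close()
    head, body = split_response(read_until_close(client))
    client.close()
    print(body)
```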

HTTP pipelining request text example

Below is an example HTTP/1.1 call with a single page requested:
GET /jq.js HTTP/1.1
Host: 127.0.0.1
Accept: */*
I understand that with HTTP pipelining, multiple requests can be sent without breaking the connection.
Can someone post a text example of how these requests are sent to the server? I want to be able to do it from the command line or with PHP sockets.
Does support for pipelining need to be enabled on the web server as well?
Is pipelining supported by the major web servers (Apache, nginx) by default, or does it need to be enabled?
From w3c protocol details:
8.1.2.2 Pipelining
A client that supports persistent connections MAY "pipeline" its requests (i.e., send multiple requests without waiting for each response). A server MUST send its responses to those requests in the same order that the requests were received.
Clients which assume persistent connections and pipeline immediately after connection establishment SHOULD be prepared to retry their connection if the first pipelined attempt fails. If a client does such a retry, it MUST NOT pipeline before it knows the connection is persistent. Clients MUST also be prepared to resend their requests if the server closes the connection before sending all of the corresponding responses.
Clients SHOULD NOT pipeline requests using non-idempotent methods or non-idempotent sequences of methods (see section 9.1.2). Otherwise, a premature termination of the transport connection could lead to indeterminate results. A client wishing to send a non-idempotent request SHOULD wait to send that request until it has received the response status for the previous request.
So the first point is that you need a persistent (keep-alive) connection. You can add a Connection: keep-alive header to your request, although some web servers accept pipelining without it. On the other hand, the server is free to refuse keep-alive mode. So at any time, whether or not the connection is kept alive, you may send 3 pipelined requests over one connection and get only one response.
From this gist we can find a nice way to test it with telnet.
Asking for keepalive with Connection: keep-alive header:
(echo -en "GET /index.html HTTP/1.1\nHost: foo.com\nConnection: keep-alive\n\nGET /index.html HTTP/1.1\nHost: foo.com\n\n"; sleep 10) | telnet localhost 80
Trying 127.0.0.1...
Connected to localhost.lan.
Escape character is '^]'.
HTTP/1.1 200 OK
Date: Sun, 27 Oct 2013 17:51:58 GMT
Server: Apache/2.2.22 (Debian)
Last-Modified: Sun, 04 Mar 2012 15:00:29 GMT
ETag: "56176e-3e-4ba6c121c4761"
Accept-Ranges: bytes
Content-Length: 62
Vary: Accept-Encoding
Keep-Alive: timeout=5, max=100 <======= Keepalive!
Connection: Keep-Alive
Content-Type: text/html; charset=utf-8
<html>
<body>
<h1>test</h1>
</body>
</html>
HTTP/1.1 200 OK
Date: Sun, 27 Oct 2013 17:51:58 GMT
Server: Apache/2.2.22 (Debian)
Last-Modified: Sun, 04 Mar 2012 15:00:29 GMT
ETag: "56176e-3e-4ba6c121c4761"
Accept-Ranges: bytes
Content-Length: 62
Vary: Accept-Encoding
Content-Type: text/html; charset=utf-8
<html>
<body>
<h1>test</h1>
</body>
</html>
It works.
Without asking for Keepalive:
(echo -en "GET /index.html HTTP/1.1\nHost: foo.com\n\nGET /index.html HTTP/1.1\nHost: foo.com\n\n"; sleep 10) | telnet localhost 80
Trying 127.0.0.1...
Connected to localhost.lan.
Escape character is '^]'.
HTTP/1.1 200 OK
Date: Sun, 27 Oct 2013 17:49:37 GMT
Server: Apache/2.2.22 (Debian)
Last-Modified: Sun, 04 Mar 2012 15:00:29 GMT
ETag: "56176e-3e-4ba6c121c4761"
Accept-Ranges: bytes
Content-Length: 62
Vary: Accept-Encoding
Content-Type: text/html; charset=utf-8
<html>
<body>
<h1>test</h1>
</body>
</html>
HTTP/1.1 200 OK
Date: Sun, 27 Oct 2013 17:49:37 GMT
Server: Apache/2.2.22 (Debian)
Last-Modified: Sun, 04 Mar 2012 15:00:29 GMT
ETag: "56176e-3e-4ba6c121c4761"
Accept-Ranges: bytes
Content-Length: 62
Vary: Accept-Encoding
Content-Type: text/html; charset=utf-8
<html>
<body>
<h1>test</h1>
</body>
</html>
Connection closed by foreign host.
Same result: I did not ask for it, but it looks like a keep-alive answer (the connection closes after 5s, which is the value set in Apache). It is also a pipelined answer: I get my two pages.
Now if I prevent usage of any Keepalive connection in Apache by setting:
Keepalive Off
And restarting it:
(echo -en "GET /index.html HTTP/1.1\nHost: foo.com\nConnection: keep-alive\n\nGET /index.html HTTP/1.1\nHost: foo.com\n\n"; sleep 10) | telnet localhost 80
Trying 127.0.0.1...
Connected to localhost.lan.
Escape character is '^]'.
HTTP/1.1 200 OK
Date: Sun, 27 Oct 2013 18:02:41 GMT
Server: Apache/2.2.22 (Debian)
Last-Modified: Sun, 04 Mar 2012 15:00:29 GMT
ETag: "56176e-3e-4ba6c121c4761"
Accept-Ranges: bytes
Content-Length: 62
Vary: Accept-Encoding
Connection: close
Content-Type: text/html; charset=utf-8
<html>
<body>
<h1>test</h1>
</body>
</html>
Connection closed by foreign host.
Only one answer... So the server can reject my request for pipelining.
Now, for support on servers and browsers, I think your wikipedia source tells enough :-)
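Since the question asks for a socket-level example, here is a rough Python equivalent of the telnet test above (the same idea carries over to PHP sockets). It is a sketch under assumptions: the function names are hypothetical, responses are assumed to carry a Content-Length header and not be chunked, and lines end with CRLF as HTTP specifies rather than the bare \n used in the echo command.

```python
import socket


def build_pipelined_request(host, paths):
    """Concatenate one GET request per path into a single byte string."""
    requests = []
    for path in paths:
        requests.append(
            f"GET {path} HTTP/1.1\r\nHost: {host}\r\n\r\n".encode("ascii")
        )
    return b"".join(requests)


def split_responses(raw):
    """Split back-to-back responses, using Content-Length to find each body."""
    responses = []
    while raw:
        head, sep, rest = raw.partition(b"\r\n\r\n")
        if not sep:
            break  # incomplete header block, stop here
        length = 0
        for line in head.split(b"\r\n")[1:]:  # skip the status line
            name, _, value = line.partition(b":")
            if name.strip().lower() == b"content-length":
                length = int(value.strip())
        responses.append((head, rest[:length]))
        raw = rest[length:]
    return responses


def fetch_pipelined(host, port, paths, timeout=5.0):
    """Send all requests at once, then read until close or timeout."""
    chunks = []
    with socket.create_connection((host, port), timeout=timeout) as sock:
        sock.sendall(build_pipelined_request(host, paths))
        try:
            while True:
                data = sock.recv(4096)
                if not data:
                    break
                chunks.append(data)
        except socket.timeout:
            pass  # server kept the connection open; use what arrived
    return split_responses(b"".join(chunks))
```

Calling fetch_pipelined("localhost", 80, ["/index.html", "/index.html"]) against a server that accepts pipelining should return two (headers, body) pairs; with KeepAlive Off, as in the last test above, only one would be expected.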

G-WAN 3.12.26 64-bit add duplicate http header

I use gwan for image generation, so I need to set the correct content type, but after some load G-WAN 3.12.26 adds its own header block with content type text/html and returns the page with 2 sets of HTTP headers.
How to reproduce this:
Use the setheaders.c servlet from the gwan package, start gwan and open this page, let's say http://localhost/?setheaders.c, and you will get this (correct response):
HTTP/1.1 200 OK
Date: Sat, 29 Dec 2012 20:37:52 GMT
Last-Modified: Sat, 29 Dec 2012 20:37:52 GMT
Content-type: text/html
Content-Length: 371
Connection: close
<!DOCTYPE HTML><html lang="en"><head><title>Setting response headers</title><meta http-equiv="Content-Type" content="text/html; charset=utf-8"><link href="imgs/style.css" rel="stylesheet" type="text/css"></head><body style="margin:16px;"><h1>Setting response headers</h1><br>This reply was made with custom HTTP headers, look at the servlet source code.<br></body></html>
now run apache bench: ab -n 1000 'http://localhost/?setheaders.c' (1000 requests were enough for my system).
DO NOT RESTART GWAN, open http://localhost/?setheaders.c again and this is what you should get (incorrect response, 2 http headers):
HTTP/1.1 200 OK
Server: G-WAN
Date: Sat, 29 Dec 2012 20:43:34 GMT
Last-Modified: Fri, 16 Jan 1970 16:53:33 GMT
ETag: "be86ada7-14b40d-16f"
Vary: Accept-Encoding
Accept-Ranges: bytes
Content-Type: text/html; charset=UTF-8
Content-Length: 367
Content-Encoding: gzip
Connection: close
HTTP/1.1 200 OK
Date: Sat, 29 Dec 2012 20:43:34 GMT
Last-Modified: Sat, 29 Dec 2012 20:43:34 GMT
Content-type: text/html
Content-Length: 371
Connection: close
<!DOCTYPE HTML><html lang="en"><head><title>Setting response headers</title><meta http-equiv="Content-Type" content="text/html; charset=utf-8"><link href="imgs/style.css" rel="stylesheet" type="text/css"></head><body style="margin:16px;"><h1>Setting response headers</h1><br>This reply was made with custom HTTP headers, look at the servlet source code.<br></body></html>
GWAN returns the correct response if gzip and x-gzip are not listed as acceptable encodings in the request header (Accept-Encoding: gzip, x-gzip).
Is it possible to solve this by modifying just the servlet? If yes, how?
Are you setting the MIME type as shown in fractal.c?
// -------------------------------------------------------------------------
// specify a MIME type so we don't have to build custom HTTP headers
// -------------------------------------------------------------------------
char *mime = (char*)get_env(argv, REPLY_MIME_TYPE);
// note that we setup the FILE EXTENTION, not the MIME type:
mime[0] = '.'; mime[1] = 'g'; mime[2] = 'i'; mime[3] = 'f'; mime[4] = 0;
If you do so then there's no way to confuse the automatic headers feature.
Other than that, v3.12 has had many instability problems (file time failures, pthread failures, signals failures, etc.) due to our direct syscalls and GLIBC wrappers - an effort initially intended to make the program run on all versions of Linux.
We have found (thanks to the many reports like yours) that rather than trying to fix those issues one by one (pointlessly fighting GLIBC, a moving target with many different releases each having its own bugs and specificities) a much safer path is to ditch GLIBC.
That's what G-WAN v4 will do, just a few days from now.

Setting outbound 'Expires:' in Squid server's HTTP header

I'm having a problem where items served by my Squid server are being cached by Limelight for too long, sometimes days. It happens when a piece of content has been static for a long time (weeks) and then undergoes numerous changes in a matter of hours.
Limelight gets its content from our Squid server and I'm told that if I can add 'Expires: 15m' in the HTTP header the Squid server sends, Limelight will not cache the image for more than 15 min.
Unfortunately, I can find no setting in Squid that will allow me to add this to the header.
Here's the HTTP header as presently being sent:
HTTP/1.0 200 OK
Date: Tue, 15 Dec 2009 23:57:33 GMT
Server: nginx/0.5.26
Content-Type: image/jpeg
Content-Length: 83843
Last-Modified: Tue, 15 Dec 2009 23:52:00 GMT
Accept-Ranges: bytes
Age: 450
X-Cache: HIT from squid01.prod.mydomain
X-Cache-Lookup: HIT from squid01.prod.mydomain:3128
Via: 1.0 squid01.prod.mydomain:3128 (squid/2.6.STABLE14)
Connection: close
You need to set the header on the origin server, not on your Squid box.
See:
http://www.mnot.net/cache_docs/#IMP-SERVER
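One detail worth noting when setting this on the origin: 'Expires: 15m' is not valid header syntax, because Expires takes an absolute HTTP-date, while a relative lifetime is expressed as Cache-Control: max-age in seconds. The sketch below shows what the two headers for a 15-minute lifetime would look like; caching_headers is a hypothetical helper, and whether Limelight honors one header, the other, or both is an assumption to verify with them.

```python
import time
from email.utils import formatdate


def caching_headers(max_age_seconds, now=None):
    """Build Cache-Control and Expires headers for a given lifetime in seconds."""
    if now is None:
        now = time.time()
    return [
        ("Cache-Control", f"max-age={max_age_seconds}"),
        # usegmt=True yields the HTTP-date form, e.g. "Tue, 15 Dec 2009 23:57:33 GMT"
        ("Expires", formatdate(now + max_age_seconds, usegmt=True)),
    ]


if __name__ == "__main__":
    for name, value in caching_headers(15 * 60):
        print(f"{name}: {value}")
```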