I have hundreds of hardware devices at customers which need to send HTTP data through a telnet interface.
The destination is an Apache 2 Webserver with a PHP script waiting for the data.
This already works, but we found that the hardware involved cannot handle hardware flow control. Once the buffer fills up (around 250 bytes) it can overflow, resulting in data corruption.
Fixing the hardware flow control is not an option: the "modem" firmware is closed and the vendor can no longer modify it, as it's quite old hardware.
Normally we'd use this:
POST / HTTP/1.1
Host: api.server
User-Agent: P8
Content-Type: application/x-www-form-urlencoded
Accept: */*
Content-Length: 767
VARIABLE=URLENCODED_DATA(total length 767 bytes)
This would work perfectly fine with flow-control, but in my case the 767 bytes are too much.
After around 200 bytes the buffers are overwritten and some bytes are lost.
The only way I have found to get it working so far is to add a delay when sending to the "modem" so it can empty its buffers in time. However, this will not work in the field because of unstable internet connections with unpredictable timings.
I am not an expert in HTTP; I just hope it is possible to split a request into fragments.
I thought about using "Connection: keep-alive" or something similar.
My main question:
Is there a way to send POST data ($VARIABLE) to an Apache 2 server in smaller chunks, in a way that makes the HTTP server combine them into one stream internally?
Pseudo code:
POST / HTTP/1.1
Host: api.server
User-Agent: P8
Content-Type: application/x-www-form-urlencoded
Accept: */*
Content-Length: 400
Connection: keep-alive
VARIABLE=URLENCODED_DATA(200 bytes)
END\n\n
Server responds in TCP stream once received with "OK".
Next chunk is sent:
VARIABLE=URLENCODED_DATA(200 bytes)
Connection is closed.
Once 400 bytes have been received the request is complete, and Apache passes VARIABLE to the PHP script as POST input.
So, something like HTTP-level flow control within an open TCP connection.
Maybe there is an HTTP feature built for that purpose, or something that can be "ab"used to act that way. keep-alive was just a guess.
If current HTTP protocols do not have such a feature, the only way I can think of to solve my issue is to implement flow control on the PHP side.
I hope for a better way than that though.
Update:
Meanwhile I found two interesting parameters:
Expect: 100-continue
Transfer-Encoding: chunked
What I would need is a mix of both:
a chunked transfer encoding that expects a 100-continue after each chunk!
This is a very interesting question, and it really has nothing to do with HTTP but with TCP.
The way to solve this is to use an intermediary proxy that takes care of spoon-feeding your devices. Ideally, this proxy will be able to set the window size in its TCP ACKs to the size of the device's buffer. That window will close to zero when the device cannot handle any more. If you do this, you will be using TCP's built-in flow control and solving the problem in a simple way.
Another thing you can do is keep this entirely in the application layer and have this intermediary proxy buffer all of the data from the response. For most normal HTTP responses this will be okay.
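If the data is kept in the application layer, the Transfer-Encoding: chunked header from the question's update at least lets a sender split a body into small pieces without changing what the script receives, since an HTTP/1.1 server is expected to decode the chunks back into one body. Below is a minimal Python sketch of the framing only. The function name `encode_chunked` and the 200-byte default are assumptions for illustration, and chunking by itself adds no per-chunk acknowledgement, so it does not replace real flow control.

```python
from typing import List


def encode_chunked(body: bytes, max_chunk: int = 200) -> List[bytes]:
    """Split body into HTTP/1.1 chunked-encoding frames.

    Each returned frame carries at most max_chunk bytes of payload
    (hypothetical limit chosen to stay under the device buffer).
    """
    if max_chunk <= 0:
        raise ValueError("max_chunk must be positive")
    frames = []
    for start in range(0, len(body), max_chunk):
        piece = body[start:start + max_chunk]
        # One frame: payload size in hex, CRLF, payload, CRLF.
        frames.append(b"%x\r\n" % len(piece) + piece + b"\r\n")
    # A zero-length chunk followed by an empty line terminates the body.
    frames.append(b"0\r\n\r\n")
    return frames


# Example: a 400-byte form body becomes two 200-byte chunks plus the terminator.
frames = encode_chunked(b"VARIABLE=" + b"x" * 391)
```

The request headers would then carry Transfer-Encoding: chunked instead of Content-Length, and the sender can pause between frames for as long as the link requires.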
The newest version of Safari (mobile and desktop) buffers videos about 4x slower than other browsers because it sends many small byte-range requests as opposed to a few large ones. An example request and response is below. The requests continue at 64 KB each until enough data is loaded for the video to play; in Chrome, Firefox and other browsers the byte-range requests are much larger, so the data is delivered much faster in one stream.
Is it possible to get around this issue by forcing my web server (Apache) to ignore Safari's small 64 KB byte-range requests and instead send a larger amount of data (about 5 MB)? The request is made directly to the video file.
Summary
URL: http://example.org/video.mp4?rand=942824
Status: 206 Partial Content
Source: Network
Request
GET /video.mp4 HTTP/1.1
Accept: */*
Connection: keep-alive
Range: bytes=0-65535
Accept-Encoding: identity
Response
HTTP/1.1 206 Partial Content
Content-Type: video/mp4
Content-Range: bytes 0-65535/467342440
Accept-Ranges: 0-467342440
Content-Length: 65536
Connection: keep-alive
Server: nginx/1.2.1
UPDATE: I managed to change the request's Range header using the code below. However, even though the 5 MB is downloaded quickly, Safari keeps sending these small 64 KB range requests and ignores the 5 MB that was downloaded, so this is not a solution.
SetEnvIf Range bytes=0-65535 HAVE_MyRequestHeader
RequestHeader unset Range env=HAVE_MyRequestHeader
RequestHeader set Range bytes=0-5000000 env=HAVE_MyRequestHeader
No. You cannot change it on the server side. The client makes a request and the server fulfills that request. Sending data the client didn't ask for will likely cause errors.
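To illustrate why the server has so little freedom here: a 206 response describes exactly the bytes the client asked for, through its Content-Range and Content-Length headers. The following is a minimal Python sketch of how a server might derive those headers from a single-range request. `partial_content_headers` is a hypothetical helper, not Apache's or nginx's implementation, and it only handles the simple `bytes=first-last` form.

```python
import re


def partial_content_headers(range_header: str, total: int) -> dict:
    """Compute 206 response headers for a single 'bytes=first-last' range."""
    match = re.fullmatch(r"bytes=(\d+)-(\d*)", range_header.strip())
    if match is None:
        raise ValueError("unsupported Range header")
    first = int(match.group(1))
    # An open-ended range ("bytes=100-") runs to the end of the file.
    last = int(match.group(2)) if match.group(2) else total - 1
    # The server may clamp the end to the file size, but never extends it.
    last = min(last, total - 1)
    if first > last:
        raise ValueError("range not satisfiable")
    return {
        "Content-Range": f"bytes {first}-{last}/{total}",
        "Content-Length": str(last - first + 1),
    }


# The request from the question: 64 KB out of a 467342440-byte file.
headers = partial_content_headers("bytes=0-65535", 467342440)
```

With the values from the question this reproduces the Content-Range and Content-Length shown in the captured response.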
My iOS app makes REST calls to my WCF web service.
The response is very slow: it takes over 3 minutes.
However, when I set up Fiddler as a proxy to monitor the iOS traffic, the call finished in 1 second.
What makes Fiddler magically accelerate the REST call from iOS?
p.s. Fiddler is set up on a Windows PC that is on the same network as the iOS app.
The rest call example (from Fiddler)
Request
GET https://xxxx.xxxx.com/Deals HTTP/1.1
Host: xxx.xxxx.com
Proxy-Connection: keep-alive
Accept-Encoding: gzip
Content-Type: application/json
Cookie: ASPXAUTH=xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx
Connection: keep-alive
User-Agent: Natural xxxx x.x.x (iPad; iPhone OS 7.0.2; en_US)
Response
HTTP/1.1 200 OK
Cache-Control: private
Content-Length: 891437
Content-Type: application/json; charset=utf-8
Server: Microsoft-IIS/7.5
LastFetchDateTimeUTC: 2014-02-14T16:52:43.5465273Z
X-AspNet-Version: 4.0.30319
X-Powered-By: ASP.NET
Date: Fri, 14 Feb 2014 16:52:45 GMT
Response body is a large json (2MB)
p.s.
Besides Fiddler, we also installed Wireshark and used it to capture traffic on the Mac while running the app in the simulator.
We see a lot of DUP ACKs; I guess that's causing TCP retransmission.
p.s.
We pinged from iOS too, there is no delay to the WCF web service.
Help!
UPDATE:
We found a pattern: it looks like the response time decreases with the length of the body. Does that mean anything?
The Wireshark logs should give you plenty of information about what happens in each case. When Fiddler "magically" makes things faster, it's typically due to:
Better connection reuse (e.g. Fiddler may reuse connections better than client)
Better buffer sizes (e.g. not using tiny buffers for read/write)
Non-broken proxy determination behavior
I wrote a bit about these in this blog post.
We solved this problem by proving that the server is a shitty one. We deployed the same service on another VM and it works. It must be a broken network card.
Please note: This is not a complaint about a shoddy CMS.
Just toying with Apache Bench and got terrible results with our custom CMS. More exactly, I got:
Requests per second: 0.37 [#/sec] (mean)
When I ran another test with a plain PHP file I got:
Requests per second: 4786.07 [#/sec] (mean)
Another test with a previous version of the CMS:
Requests per second: 6068.66 [#/sec] (mean)
The website(s) are working fine, no problems detected, and Google's Webmaster Tools reports our sites as faster than 80% of pages, which is fine, I think.
The test was:
ab -t 30 -c 10 http://example.com/
Maybe some kind of Apache problem? Bad .htaccess config, or similar?
Update:
Just ran a simple test with sockets and the results are similar. The page loads very, very slowly. If I run my script against another website everything is fine.
Also, there's a small hint about a chunk length problem. (Bad Apache Headers, or line endings?)
The site is gzipped, and with verbose logging turned on, I see these lines in the response:
LOG: Response code = 200
LOG: header received:
HTTP/1.1 200 OK
Date: Tue, 04 Oct 2011 13:10:49 GMT
Server: Apache
Set-Cookie: PHPSESSID=ibnfoqir9fee2koirfl5mhm633; path=/
Expires: Sat, 26 Jul 1997 05:00:00 GMT
Cache-Control: no-store, no-cache, must-revalidate
Pragma: no-cache
Cache-Control: post-check=0, pre-check=0
Vary: Accept-Encoding
Transfer-Encoding: chunked
Content-Type: text/html; charset=UTF-8
2ef6
Always at the same place, in the middle of the HTML-source, then <!DOCTYPE HTML> again.
Please, help.
Update #2:
Just checked my HTTP headers with Rex Swain's HTTP Viewer and got these results:
HTTP/1.1·200·OK(CR)(LF)
Date:·Wed,·05·Oct·2011·08:33:51·GMT(CR)(LF)
Server:·Apache(CR)(LF)
Set-Cookie:·PHPSESSID=n88g3qcvv9p6irm1fo0qfse8m2;·path=/(CR)(LF)
Expires:·Sat,·26·Jul·1997·05:00:00·GMT(CR)(LF)
Cache-Control:·no-store,·no-cache,·must-revalidate(CR)(LF)
Pragma:·no-cache(CR)(LF)
Cache-Control:·post-check=0,·pre-check=0(CR)(LF)
Vary:·Accept-Encoding(CR)(LF)
Connection:·close(CR)(LF)
Transfer-Encoding:·chunked(CR)(LF)
Content-Type:·text/html;·charset=UTF-8(CR)(LF)
(CR)(LF)
Do you notice anything unusual?
If it works well with ordinary web browsers (as you mentioned in the comments), the CMS must be handling requests from ApacheBench differently.
A quick checklist:
AFAIK ApacheBench just sends simple requests without any cookie handling, so try setting -C with a valid cookie (copy the values from a web browser).
Try to send exactly the same headers to the CMS as the web browser sends. Save a dump of a valid request with netcat, HttpFox or a packet sniffer and set the missing headers with -H.
Profile the CMS on the server while you're sending it a request with ApacheBench. You may find the bottleneck. Two poor man's error_log calls with a timestamp, in the first and the last line of index.php (or the tested script's entry point), would show how fast the PHP script is and help to estimate the overhead of the Apache HTTP Server and the network.
If you run the socket tests and the browser tests from different machines, it could be a DNS issue (turn off HostnameLookups in Apache). Try to run them from the same machine.
Try ab -k ... or ab -H "Connection: close" ....
My guess is that the CMS does some costly initialization when it creates the session, which happens when it processes the first request. Since ApacheBench does not send the cookies back, the CMS creates a new session for every request, and that causes the slow responses.
A second guess is that the CMS handles the incoming HTTP headers differently, and the headers sent by ApacheBench (or the lack of them) trigger some costly or slow processing. This seems more likely given the Google Webmaster Tools report.
ApacheBench sends an HTTP/1.0 request, for example:
GET / HTTP/1.0
Host: localhost:9100
User-Agent: ApacheBench/2.3
Accept: */*
It looks to me like your server does not send any HTTP header about its keep-alive settings, yet it assumes that the client uses keep-alive even when the client speaks HTTP/1.0. That is not RFC-compliant behaviour:
From RFC 2616, 19.6.2 Compatibility with HTTP/1.0 Persistent Connections:
Some clients and servers might wish to be compatible with some
previous implementations of persistent connections in HTTP/1.0
clients and servers. Persistent connections in HTTP/1.0 are
explicitly negotiated as they are not the default behavior.
By default ApacheBench doesn't use keep-alive, so after the response arrives it waits for the socket to be closed. The server closes it after 15 seconds of idle time. Downloading the main page with wget also takes 15 seconds; wget also uses HTTP/1.0 in its request.
I think it's a bug in the PHP code of the CMS, since ab works well on the same server with a plain PHP file. Anyway, you can work around it by using keep-alive connections (-k):
ab -k -t 30 -c 10 http://example.com/
or with explicitly disabling persistent connections:
ab -H "Connection: close" -t 30 -c 10 http://example.com/
but it's still a server-side issue, and your original ab command is right.
Please note that this bug probably affects only HTTP/1.0 clients (like ApacheBench and wget); users with regular browsers will not notice it.
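The persistence rule behind this diagnosis can be written down compactly. The sketch below, in Python, encodes the RFC 2616 defaults: an HTTP/1.0 connection stays open only if keep-alive was explicitly negotiated, while an HTTP/1.1 connection stays open unless one side says close. The function name `is_persistent` is an assumption for illustration; it is not part of Apache or ab.

```python
def is_persistent(http_version: str, headers: dict) -> bool:
    """Decide whether a connection should stay open after one exchange.

    http_version is e.g. "HTTP/1.0" or "HTTP/1.1"; headers maps header
    names to values for the message being examined.
    """
    # Header names are case-insensitive, so normalise before the lookup.
    lowered = {name.lower(): value for name, value in headers.items()}
    tokens = {t.strip().lower() for t in lowered.get("connection", "").split(",")}
    if "close" in tokens:
        return False
    if http_version == "HTTP/1.0":
        # HTTP/1.0: persistence must be negotiated explicitly.
        return "keep-alive" in tokens
    # HTTP/1.1: persistent by default.
    return True


# A plain ab request (HTTP/1.0, no Connection header) must not be kept open.
plain_ab = is_persistent("HTTP/1.0", {"Host": "localhost:9100"})
# The same request sent with ab -k carries "Connection: Keep-Alive".
ab_with_k = is_persistent("HTTP/1.0", {"Connection": "Keep-Alive"})
```

A server that follows this rule closes the socket right after answering a plain ab or wget request, instead of holding it open until the idle timeout.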
Response with info? is very quick:
i: info? http://cdimage.ubuntu.com/daily/current/natty-alternate-i386.iso
i/size
With an HTTP HEAD request it takes maybe 10 times longer. Why?
port: open tcp://cdimage.ubuntu.com:80
insert port "HEAD /daily/current/natty-alternate-i386.iso HTTP/1.1 ^/"
insert port "Host: cdimage.ubuntu.com ^/^/"
out: copy ""
while [data: copy port][append out data]
block: parse out rejoin [": " newline]
select block "Content-Length"
The port modes are responsible in this case. You were using buffered I/O with the wait mode (which is on by default).
In HTTP, the client is responsible for closing the port once it has read all the server's bytes.
Since you are basically using TCP directly, via insert port, you are also responsible for detecting the end of the response and closing the port when sufficient bytes have arrived. This can only be done in /lines or /no-wait mode when doing low-level TCP fun.
Something that read and info? do for you.
while [data: copy port][append out data]
doesn't terminate until a timeout occurs (which is 30 seconds by default in REBOL).
also, your request seems to be in error...
try this:
port: open/lines tcp://cdimage.ubuntu.com:80
insert port {HEAD /daily/current/natty-alternate-i386.iso HTTP/1.0
Accept: */*
Connection: close
User-Agent: REBOL View 2.7.7.3.1
Host: cdimage.ubuntu.com
}
out: form copy port
block: parse out none ;rejoin [": ^/"]
probe select block "Content-Length:"
Here it seems that adding /lines prevents the wait. It's probably related to how the http scheme handles the line mode on open.
Look around for REBOL port modes in the documentation and on the net; it's well explained all over the place.
If you had used trace/net on, you'd have realized that all the packets were received and that the interpreter was just still waiting. BTW, your code actually returned an error 400 in my tests.
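The general point, that a raw TCP client has to detect the end of the response itself rather than wait for the peer to close, can be sketched outside REBOL as well. The Python below reads only until the blank line that ends the header block, which is all a HEAD response contains. The names `read_header_block` and `content_length` are hypothetical helpers, and `recv` stands for anything with the signature of `socket.recv`.

```python
from typing import Callable, Optional


def read_header_block(recv: Callable[[int], bytes], limit: int = 65536) -> bytes:
    """Read from recv() until the CRLF CRLF that terminates the headers."""
    buf = b""
    while b"\r\n\r\n" not in buf:
        if len(buf) > limit:
            raise ValueError("header block too large")
        data = recv(4096)
        if not data:
            # Peer closed the connection before the blank line arrived.
            break
        buf += data
    head, _, _ = buf.partition(b"\r\n\r\n")
    return head


def content_length(head: bytes) -> Optional[int]:
    """Return the Content-Length value from a header block, if present."""
    for line in head.split(b"\r\n")[1:]:  # skip the status line
        name, _, value = line.partition(b":")
        if name.strip().lower() == b"content-length":
            return int(value.strip().decode("ascii"))
    return None


# Simulated network reads; with a real socket, pass sock.recv instead.
chunks = iter([b"HTTP/1.1 200 OK\r\nContent-Le", b"ngth: 42\r\n\r\n", b"not read"])
head = read_header_block(lambda n: next(chunks))
size = content_length(head)
```

Because the loop stops at the blank line, the caller can close the socket immediately instead of sitting in a read until the timeout expires.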
I would like to determine what the long url of a short url is. I have tried using http HEAD requests, but very few of the returned header fields actually contain any data pertaining to the destination/long url.
Is there:
1. Any way to determine the long url?
2. If so, can it be done without downloading the body of the destination?
Thank you
Issue an HTTP GET request, don't follow the redirect, analyse the Location header. That's where the target of redirection is.
Specifically in Cocoa, use an asynchronous request with a delegate, handle the didReceiveResponse in the delegate. The first response will be the redirection one. Once you extract the URL in the handler, call [cancel] on the connection.
EDIT: depending on the provider, HEAD instead of GET might or might not work. And if you don't follow the redirect, the response data won't be loaded anyway, so there's no transmission overhead to having a GET.
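The same approach (issue one request, do not follow the redirect, read the Location header) can be sketched outside Cocoa too. The Python below uses the standard `http.client` module, which never follows redirects on its own. The names `expand_short_url` and `redirect_target` are assumptions for illustration, and the `method` parameter exists only because, as noted above, some providers may answer GET but not HEAD.

```python
import http.client
from typing import Optional
from urllib.parse import urlsplit

REDIRECT_CODES = {301, 302, 303, 307, 308}


def redirect_target(status: int, location: Optional[str]) -> Optional[str]:
    """Return the Location value if the response is a redirect, else None."""
    if status in REDIRECT_CODES and location:
        return location
    return None


def expand_short_url(short_url: str, method: str = "HEAD",
                     timeout: float = 10.0) -> Optional[str]:
    """Ask the shortener for short_url and report where it redirects to."""
    parts = urlsplit(short_url)
    conn_cls = (http.client.HTTPSConnection if parts.scheme == "https"
                else http.client.HTTPConnection)
    conn = conn_cls(parts.netloc, timeout=timeout)
    try:
        target = parts.path or "/"
        if parts.query:
            target += "?" + parts.query
        conn.request(method, target)
        # Only the status line and headers are read here; the body of the
        # destination page is never requested.
        resp = conn.getresponse()
        return redirect_target(resp.status, resp.getheader("Location"))
    finally:
        conn.close()
```

Calling `expand_short_url` on a short link would return the long URL when the provider answers with a redirect, and `None` otherwise; chained redirects would need the call repeated on each result.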
Do a HEAD and look for the Location header.
% telnet bit.ly 80
Trying 168.143.173.13...
Connected to bit.ly.
Escape character is '^]'.
HEAD /cwz5Jd HTTP/1.1
Host: bit.ly
HTTP/1.1 301 Moved
Server: nginx/0.7.42
Date: Fri, 12 Mar 2010 18:37:46 GMT
Content-Type: text/html; charset=utf-8
Connection: keep-alive
Set-Cookie: _bit=4b9a89fa-002bd-030af-baa08fa8;domain=.bit.ly;expires=Wed Sep 8 14:37:46 2010;path=/; HttpOnly
Location: http://www.engadget.com/2010/03/12/motorola-milestone-with-android-2-1-hitting-bulgaria-by-march-20/?utm_source=twitterfeed&utm_medium=twitter
MIME-Version: 1.0
Content-Length: 404
LongUrlPlease offers an API which expands short urls.