BlackBerry reading mod_deflate compressed page - Apache

I am using Apache mod_deflate to return compressed HTML from a webpage. It has reduced the generated page size from about 3 KB down to 700 bytes.
How do I use HttpConnection on a BlackBerry to fetch the compressed page (i.e. transfer only 700 bytes instead of 3 KB)?
P.S. Trying to use GZIPInputStream(inputStream) keeps returning an "incorrect header check" error.

As I understand it, you have already tried downloading the page and received non-compressed HTML.
If so, I think you should add an "Accept-Encoding" header to your request (there is a forum question covering this). Try:
connection.setRequestProperty("Accept-Encoding", "gzip, deflate");
Don't forget that you will then receive compressed data, so you need to decompress it before using it.
Also, as mentioned elsewhere, gzip/deflate is not as effective when your traffic goes over BIS-B or BES, because the BlackBerry servers already encode/decode the data to analyze it and make it more efficient for transmission.
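Here is a minimal sketch of the request side, assuming the RIM net.rim.device.api.compress.GZIPInputStream class and leaving out the BlackBerry transport suffixes (;deviceside=true, ;interface=wifi and so on) that a real app would append to the URL:

    import java.io.IOException;
    import java.io.InputStream;
    import javax.microedition.io.Connector;
    import javax.microedition.io.HttpConnection;
    import net.rim.device.api.compress.GZIPInputStream;

    public class CompressedFetch {
        // Hypothetical helper: fetch a page, asking for gzip, and unwrap the
        // body only when the server confirms it actually compressed it.
        public static InputStream openCompressed(String url) throws IOException {
            HttpConnection conn = (HttpConnection) Connector.open(url);
            conn.setRequestMethod(HttpConnection.GET);
            conn.setRequestProperty("Accept-Encoding", "gzip, deflate");

            InputStream body = conn.openInputStream();
            String encoding = conn.getHeaderField("Content-Encoding");
            if (encoding != null && encoding.indexOf("gzip") != -1) {
                // Only wrap when the reply is really gzip; wrapping a body that
                // is already plain text fails with header errors.
                body = new GZIPInputStream(body);
            }
            return body;
        }
    }

If the Content-Encoding header is missing, the transport (or a BIS/MDS proxy in between) may already have decompressed the body for you, which would explain the "incorrect header check" you see when you wrap the stream unconditionally.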

Related

AWS API Gateway to S3 - PUT Content-Encoding .Z Files

I have run into an issue with using API Gateway as a proxy to S3 (for custom authentication), in that it does not handle binary data well (which is a known issue).
I'm usually uploading either .gz or .Z (Unix compress utility) files. As far as I understand it, the data is not maintained due to encoding issues. I can't seem to figure out a way to decode the data back to binary.
Original leading bytes: \x1f\x8b\x08\x08\xb99\xbeW\x00\x03
After passing through API GW: ��9�W�
... Followed by filename and the rest of the data.
One way of 'getting around this' is to specify Content-Encoding in the header of the PUT request to API GW as 'gzip'. This seems to force API GW to decompress the file before forwarding it to S3.
The same does not work for .Z files compressed with the Unix compress utility, where you would instead specify the Content-Encoding as 'compress'.
Does anyone have any insight into what is happening to the data, to help shed some light on my issue? Also, does anyone know of any possible workarounds to keep my data intact while passing through API GW (or to decode it once it's in S3)?
Obviously I could just access the S3 API directly (or have API GW return a pre-signed URL for accessing the S3 API), but there are a few reasons why I don't want to do that.
I should mention that I don't understand very much at all about encoding - sorry if there are some obvious answers to some of my questions.
It's not exactly an "encoding issue" -- it's the fact that API Gateway just doesn't support binary data ("yet")... so it's going to potentially corrupt binary data, depending on the specifics of the data in question.
Uploading as Content-Encoding: gzip probably triggers decoding in a front-end component that is capable of dealing with binary data (gzip, after all, is a standard encoding and is binary) before passing the request body to the core infrastructure... but you will almost certainly find that this is a workaround that does not consistently deliver correct results, depending on the specific payload. The fact that it works at all seems more like a bug than a feature.
For now, the only consistently viable option is base64-encoding your payload, which increases its size on the wire by 33% (base64 produces 4 bytes of output for every 3 bytes of input), so it's not much of a solution. Base64 + gzip with the appropriate Content-Encoding: gzip should also work. That seems a rather silly suggestion (converting a compressed file into base64 and then gzipping the result to try to reduce its size on the wire), but it should be consistent with what API Gateway can currently deliver.
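As a rough illustration of that workaround (the endpoint URL and content type are placeholders, and whatever later reads the object out of S3 must base64-decode it again), the client side could look something like this:

    import java.io.OutputStream;
    import java.net.HttpURLConnection;
    import java.net.URL;
    import java.nio.file.Files;
    import java.nio.file.Paths;
    import java.util.Base64;

    public class Base64Upload {
        // Sketch of the base64 workaround: encode the binary payload so it
        // survives API Gateway's text-oriented handling.
        public static int putBase64(String endpoint, String localPath) throws Exception {
            byte[] raw = Files.readAllBytes(Paths.get(localPath));   // e.g. a .gz or .Z file
            byte[] encoded = Base64.getEncoder().encode(raw);        // +33% on the wire

            HttpURLConnection conn = (HttpURLConnection) new URL(endpoint).openConnection();
            conn.setRequestMethod("PUT");
            conn.setDoOutput(true);
            conn.setRequestProperty("Content-Type", "text/plain");   // text, not binary
            conn.setFixedLengthStreamingMode(encoded.length);
            OutputStream out = conn.getOutputStream();
            out.write(encoded);
            out.close();
            return conn.getResponseCode();
        }
    }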

xbuf_frurl does not work properly without a Content-Length header from the server?

I am trying to get some info from other sites with xbuf_frurl.
Some sites come back OK, but some do not.
So far I cannot tell what is going wrong, but some of the failing sites are missing the Content-Length header.
Can anyone tell me whether xbuf_frurl() relies on the (potentially missing) Content-Length header, especially when growing the buffer?
xbuf_frurl() does indeed read the body only IF an HTTP Content-Length header is present. It will not try to decode chunked responses.
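For context, a chunked reply carries no Content-Length at all: the body is delimited by the chunks themselves, so a reader waiting for Content-Length bytes has nothing to go on. Stripped of the CRLF markers, such a reply looks roughly like this (each size line is a hexadecimal byte count, and a zero-length chunk ends the body):

    HTTP/1.1 200 OK
    Transfer-Encoding: chunked
    Content-Type: text/html

    1a
    <p>first chunk of data</p>
    10
    <p>more data</p>
    0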
To deal with servers that use chunked replies, use the curl.c example provided with the G-WAN distribution. With libcurl you even have the option of using SSL/TLS.
If that does not resolve your problem, the only way to troubleshoot this kind of issue is to provide a non-working example, with both the full request you sent and the full reply you received from the server.
That's why the xbuf_xcat("%v") format was added: to produce hexdumps, even of binary replies.
Edit your question and add this information to let people help you with a well-defined problem.

411 Length required error with App Proxies

I set up an app proxy for my app, deployed on a normal shared hosting server (not Heroku or anything like that). It works like a charm (as do my other apps) until I set the content type to application/liquid.
As soon as I do that, I get a 411 Length Required error from nginx, which I am guessing is generated by my server. I tried to resolve it by setting the content length to 0. That worked for a while, but then it stopped. I tried other values, and it works depending on its mood. Oddly, sometimes the output is truncated at the content length, and at other times I get the whole output (a simple page refresh can give different results). Also, sometimes it doesn't work AT ALL and Shopify throws a "we're having technical difficulties" error.
To summarize, the content length is not reliable at all.
Now I am not sure exactly what causes a 411 error or what I can do about it, or why it is thrown only when the content type is Liquid. Moreover, setting Content-Length doesn't produce consistent output (no output / expected output / truncated output / Shopify error).
Does anyone know what's up?
Perhaps your responses are using chunked transfer encoding. I don't think nginx supports this by default, so it would return a 411 error in this case, because chunked encoding doesn't use a Content-Length header.
If you do want to use chunked responses, there is the http://wiki.nginx.org/HttpChunkinModule module that should add support for this. Otherwise, disable the chunked encoding in your app, and make sure the Content-Length header is consistent with the length of the body of the response.
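If you go the module route, a sketch of the configuration, assuming the module's chunkin and chunkin_resume directives (check the page above for the exact syntax of the version you build), would be roughly:

    server {
        chunkin on;                           # let the module reassemble chunked request bodies

        # some nginx versions emit the 411 before the module runs,
        # so hand that error back to it
        error_page 411 = @handle_chunked;
        location @handle_chunked {
            chunkin_resume;
        }

        location / {
            proxy_pass http://127.0.0.1:8080; # placeholder for the app backend
        }
    }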

Suggestions for best way of implementing HTTP upload resume

I'm working on a project that will allow large files (GB+) to be uploaded via HTTP PUT, and I need to implement a method for resuming the upload. Once a file is uploaded and finalized, it is complete and cannot be modified. So far I have two options in mind, but neither fits perfectly:
Option 1
The client sends an initial HEAD request for the file, which returns either 404 if it does not exist, or the file details: its current size, an HTTP X-header along the lines of X-Can-Resume (or something like that) specifying whether the file can be resumed, and a Range header specifying which bytes the server already has. This seems OK, but I'm not keen on the X-header, as it departs from the HTTP standard.
Option 2
The client sends a PUT request with a Content-Length header of 0 bytes and no body; the server can then send back either a 308 Resume Incomplete (as proposed here: http://code.google.com/p/gears/wiki/ResumableHttpRequestsProposal) or a 202 Accepted response to indicate whether to resume or start from the beginning. This also seems acceptable, apart from the use of a non-standard status code.
Any other suggestions on the best way to implement this?
Thanks,
J
Neither solution has existing client and server implementations, so I'm guessing you'll be coding both. I think you should just find the right balance between the simplest approach and what is described in the Gears proposal (by the way, you probably know Gears is dead), and be prepared to change when a standard emerges.
If I were to implement this feature, I'd make it possible for the client to upload in chunks, and I would add a message digest over the whole content and over each chunk.
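For what it's worth, a minimal client-side sketch of Option 1 (the X-Can-Resume header, the use of Content-Range on the PUT, and the server behaviour are all assumptions taken from the scheme described above, not an existing API) might look like:

    import java.io.OutputStream;
    import java.io.RandomAccessFile;
    import java.net.HttpURLConnection;
    import java.net.URL;

    public class ResumableUpload {
        // Ask the server how much it already has, then PUT only the remaining bytes.
        public static void resume(String fileUrl, String localPath) throws Exception {
            HttpURLConnection head = (HttpURLConnection) new URL(fileUrl).openConnection();
            head.setRequestMethod("HEAD");
            long offset = 0;
            if (head.getResponseCode() == 200 && "true".equals(head.getHeaderField("X-Can-Resume"))) {
                // Assumption: the server reports the stored size via Content-Length.
                offset = head.getHeaderFieldLong("Content-Length", 0);
            }

            RandomAccessFile file = new RandomAccessFile(localPath, "r");
            long total = file.length();
            file.seek(offset);

            HttpURLConnection put = (HttpURLConnection) new URL(fileUrl).openConnection();
            put.setRequestMethod("PUT");
            put.setDoOutput(true);
            put.setRequestProperty("Content-Range", "bytes " + offset + "-" + (total - 1) + "/" + total);
            put.setFixedLengthStreamingMode(total - offset);

            OutputStream out = put.getOutputStream();
            byte[] buf = new byte[8192];
            int n;
            while ((n = file.read(buf)) != -1) {
                out.write(buf, 0, n);
            }
            out.close();
            file.close();
            System.out.println("Upload finished with status " + put.getResponseCode());
        }
    }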

How is file upload handled in HTTP?

I am curious to know how webservers handle file uploads.
Is the entire file sent as a single chunk? Or is it streamed to the webserver, which puts it together and saves it in a temp folder for PHP etc. to use?
It's just a matter of following the encoding rules so that the other side can easily decode (parse) it. Read the specification for multipart/form-data encoding (the encoding required for HTML-based file uploads using input type="file").
Generally the parsing is done by the server-side application itself. The webserver only takes care of streaming the bytes from one side to the other.
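To make those encoding rules concrete, here is a small sketch that builds a minimal multipart/form-data body by hand (the field name, file name, and boundary are illustrative; real clients and servers stream this rather than buffering it all in memory):

    import java.io.ByteArrayOutputStream;
    import java.nio.charset.StandardCharsets;
    import java.nio.file.Files;
    import java.nio.file.Paths;

    public class MultipartBody {
        // Each part starts with "--" + boundary and its own headers; the body
        // ends with the boundary followed by "--".
        public static byte[] build(String boundary, String filePath) throws Exception {
            ByteArrayOutputStream out = new ByteArrayOutputStream();
            String head = "--" + boundary + "\r\n"
                    + "Content-Disposition: form-data; name=\"upload\"; filename=\"data.bin\"\r\n"
                    + "Content-Type: application/octet-stream\r\n\r\n";
            out.write(head.getBytes(StandardCharsets.ISO_8859_1));
            out.write(Files.readAllBytes(Paths.get(filePath)));        // raw file bytes
            out.write(("\r\n--" + boundary + "--\r\n").getBytes(StandardCharsets.ISO_8859_1));
            return out.toByteArray();
        }
    }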
It's streamed, to answer that question; see RFC 1867 for more information.
RFC 1867 describes the mechanism.