Getting file contents with Range header returns Partial Content and subsequent request returns no data - onedrive

We are seeing a baffling issue when fetching file contents through the OneDrive API.
When we request file contents with a Range header:
GET /blahblah/foobar.docx HTTP/1.1
Host: qw122q-ch3301.files.1drv.com
Accept: */*
Accept-Encoding: deflate, gzip
Range: bytes=0-77270
OneDrive returns:
HTTP/1.1 206 Partial Content
Cache-Control: no-cache
Content-Length: 18325
We checked through the web interface that the file size on the OneDrive server is correct. OneDrive used to return the full requested range, but since last week it has been returning partial content. That would still be fine if we could fetch the remaining parts with additional API calls.
But when we send another request with a Range header for the remainder:
Range: bytes=18325-77270
OneDrive returns no data:
HTTP/1.1 206 Partial Content
Cache-Control: no-cache
Content-Length: 0
Has anyone experienced this issue? I can't find any clues in the OneDrive developer documentation. Please shed some light on this.

I have a theory, so I'm going to take a shot at an answer. There are actually two different issues combining to produce this confusing behavior, so I'll tackle each one separately.
Reported file size doesn't match content size
This is an unfortunate quirk of the system that is being tracked with this GitHub issue. Ryan explains in more detail here.
Range downloads of word docs do not correctly handle unsatisfiable ranges
When a range outside the actual file size is requested, we should fail with a 416 Requested Range Not Satisfiable, like we do for "normal" files. But that's obviously not working. You can see in the Content-Range of the result that something screwy is going on:
FileSize: 15 bytes
Range Requested: bytes=15-
Content-Range Response: bytes=15-14/15
The value of the Content-Range obviously makes no sense.
Together, these two issues account for the weird behavior you're seeing. We're close to resolving the first, while the second was unknown (at least to me), so I've opened a new GitHub issue to track it.
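Until both issues are fixed, a client-side workaround is to advance by the number of bytes actually received rather than the number requested, and to stop on an empty body or a 416. A minimal sketch under those assumptions (the function names are mine, and `last_byte` would come from the reported file size, which per the above may overstate the real content size):

```python
import urllib.error
import urllib.request


def next_range_header(offset, last_byte, bytes_received):
    """Range header for the next request after a short 206 response.

    Advance by the bytes actually received (the server may return fewer
    than requested, as seen above); return None once the range is exhausted.
    """
    new_offset = offset + bytes_received
    if new_offset > last_byte:
        return None
    return "bytes=%d-%d" % (new_offset, last_byte)


def ranged_download(url, last_byte):
    """Download url in ranges, tolerating short 206 responses.

    Stops when the range is exhausted, or on an empty body or a 416,
    either of which can mean the reported file size overstated the
    real content size.
    """
    data, offset = b"", 0
    range_header = "bytes=0-%d" % last_byte
    while range_header is not None:
        req = urllib.request.Request(url, headers={"Range": range_header})
        try:
            with urllib.request.urlopen(req) as resp:
                chunk = resp.read()
        except urllib.error.HTTPError as err:
            if err.code == 416:  # nothing left that the server can satisfy
                break
            raise
        if not chunk:            # empty 206 body, as in the question
            break
        data += chunk
        range_header = next_range_header(offset, last_byte, len(chunk))
        offset += len(chunk)
    return data
```

With the numbers from the question, the first request covers bytes 0-77270, receives 18325 bytes, and the follow-up range becomes bytes=18325-77270, after which the empty body ends the loop instead of looping forever.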

Related

Request to API getting truncated

I have an ASP.NET API application running on Windows Server 2019 Datacenter, and some of the requests to one API get truncated. As far as I know this is happening to only one of the APIs, because it's the only one that uploads big chunks of data; all the others receive tiny requests.
This API takes JSON in the body containing an issue ID and a serialized image. The largest requests are about 20 MB, so they're not too big, and most of the time they go through fine. But sometimes the request gets chopped off at the end, sometimes by a little and sometimes by a lot. There's also a pattern: when a request is truncated, the API returns a 500 code and the client device retries the call, and the retry will often be truncated too, always in a different place.
I have good visibility into this because I use a logger module that writes every request to a text file allowing me to see exactly what hits the server.
I know the device is sending a well-formed request, because it would throw an exception if it produced malformed JSON.
Lastly, this is using SSL.
This is what a typical request looks like:
HEADERS
Content-Length : 7154364
Content-Type : text/plain; charset=utf-8
Accept-Encoding : gzip
Host : cmtafr-dev.nwis.net
User-Agent : Dart/2.17 (dart:io)
deviceid : xxxxxxx
appversion : 2.21.10
cap2.0_tokenkey : xxxxx
osversion : 15.6.1
devicebrand : IOS
devicemodel : iPhone13,4
PATH
/api/IssueController/ExecuteIssue_UploadImageV2/7acd5643-c112-4f74-9dfa-1d558ee3ae69
BODY
{"Isu_Id":"9EC539F2-ABDD-49C5-934A-FAD6371B3E9C","ImageData":"/9j/4AAQSkZJRgABAQAAAQABAAD/2wCEAA... bla bla bla"}
Additional info:
The screenshot below shows 2 requests attempting to upload the same image. Both failed because they were truncated. Notice that the received sizes are different, meaning the requests left the device in good shape.
However, the headers on both requests show the same Content-Length:
16449758
HEADERS
Connection : keep-alive
Content-Length : **16449758**
Content-Type : application/json
Accept : */*
How can I possibly troubleshoot this? I have spent hours searching and haven't found anything similar. Most posts regarding truncated requests/responses are application-specific, with a vendor replying.
Thank you for any help you can offer.
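The core check behind the logger described above is stack-agnostic: compare the declared Content-Length with the number of body bytes the server actually read. A sketch of that check (Python purely for illustration; the function name is hypothetical, not part of the ASP.NET app):

```python
def body_truncation_report(content_length_header, body):
    """Compare the declared Content-Length with the bytes actually read.

    A mismatch means the body was cut off in transit (or the client
    declared the wrong length); logging both numbers per request shows
    exactly how much is missing and whether the cut-off point varies.
    """
    declared = int(content_length_header)
    received = len(body)
    if received == declared:
        return "ok: %d bytes" % received
    return "truncated: declared %d, received %d (missing %d)" % (
        declared, received, declared - received)
```

Run against the two retries in the screenshot, this would show the same declared length but different received lengths, confirming the loss happens somewhere after the device serializes the request.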

HTTP4s increase max upload size

I am toying with http4s multipart file upload, which I got working. However, the multipart parsing throws an exception for file uploads bigger than ~500 KB.
The error on the client side, thrown while parsing the multipart body, is HTTP 422: The request body was invalid.
The error on the server side is "Part not terminated properly".
Since this is obviously related to the size of the uploaded file, I suspect there must be a config setting in http4s that allows larger uploads?
Thanks in advance!
Can you try changing the Content-Length in the headers?
e.g. Content-Length: 3495 (header names are case-insensitive, so content-length works too), with the value matching the size of your content.
Ref:
https://github.com/http4s/http4s/blob/4b928e0dc0ba6edbdbe7461204663e13a7013f8c/blaze-server/src/main/scala/org/http4s/blaze/server/Http2NodeStage.scala#L129
As far as I can see, this method getBody is called with a len parameter:
https://github.com/http4s/http4s/blob/4b928e0dc0ba6edbdbe7461204663e13a7013f8c/blaze-server/src/main/scala/org/http4s/blaze/server/Http2NodeStage.scala#L108
What I uncovered is that you can pass this Content-Length header to declare that size:
https://github.com/http4s/blaze/blob/3d1b15eace96740507daac9c9e75f978bbd2e524/http/src/main/scala/org/http4s/blaze/http/HeaderNames.scala#L26
ref for content length header:
https://developer.mozilla.org/en-US/docs/Web/HTTP/Headers/Content-Length
Hope it works.
The header that declares the size of an upload is Content-Length.
This is the simple syntax:
Content-Length: <length>
The length parameter is just a number stating the size of the request body in bytes.
Some examples include:
Content-Length: 6553
Content-Length: 54138
So, this is where the upload size is declared.
To inspect this header in browsers, follow the steps below.
Click Inspect Element in your browser.
Click on the Network tab.
Check the request header.
Find the header Content-Length in there.
If you want browser compatibility statistics, here they are:
Google Chrome (and all Chromium-based browsers)
Firefox
Opera
Safari
Microsoft Internet Explorer
This is how you can declare the upload size with http4s.

Do any browsers support trailers sent in chunked encoding responses?

HTTP/1.1 specifies that a response sent with Transfer-Encoding: chunked can include optional trailers (i.e. what would normally be sent as headers but, for whatever reason, can't be calculated before the content, so they can be appended to the end), for example:
Request:
GET /trailers.html HTTP/1.1
TE: chunked, trailers
Response:
HTTP/1.1 200 OK
Transfer-Encoding: chunked
Trailer: My-Test-Trailer
D\r\n
All your base\r\n
B\r\n
are belong\r\n
6\r\n
to us\r\n
0\r\n
My-Test-Trailer: something\r\n
\r\n
This request specifies in the TE header that it's expecting a chunked response, and will be looking for trailers after the final chunk.
The response specifies in the Trailer header the list of trailers it will be sending (in this case, just one: My-Test-Trailer)
Each of the chunks are sent as:
size of chunk in hex (D = 13), followed by a CRLF
chunk data (All your base), followed by a CRLF
A zero size chunk (0\r\n) indicates the end of the body.
Then the trailer(s) are specified (My-Test-Trailer: something\r\n), followed by a final CRLF.
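The framing described above is easy to reproduce end-to-end; this sketch serializes and re-parses a chunked body with a trailer, using the same example payload (the function names are mine, not from any library):

```python
def encode_chunked(chunks, trailers):
    """Serialize chunks plus trailers using the framing described above."""
    out = b""
    for chunk in chunks:
        out += b"%X\r\n" % len(chunk) + chunk + b"\r\n"
    out += b"0\r\n"                      # zero-size chunk ends the body
    for name, value in trailers.items():
        out += name.encode() + b": " + value.encode() + b"\r\n"
    out += b"\r\n"                       # final CRLF
    return out


def decode_chunked(data):
    """Parse a chunked body; return (body, trailers)."""
    body, trailers, pos = b"", {}, 0
    while True:
        eol = data.index(b"\r\n", pos)
        size = int(data[pos:eol], 16)
        pos = eol + 2
        if size == 0:
            break
        body += data[pos:pos + size]
        pos += size + 2                  # skip chunk data and its CRLF
    # everything after the zero-size chunk, up to the blank line, is trailers
    for line in data[pos:].split(b"\r\n"):
        if line:
            name, _, value = line.partition(b": ")
            trailers[name.decode()] = value.decode()
    return body, trailers
```

Round-tripping the example (chunks of 13, 11, and 6 bytes, hence D, B, and 6 in hex) reassembles "All your base are belong to us" and recovers My-Test-Trailer: something.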
Now, from everything I've read so far, trailers are rarely (if ever) used. Most discussions here and elsewhere concerning trailers typically start with "but why do you want to use trailers anyway?".
Putting aside the question of why, out of curiosity I've been trying to simulate an HTTP request/response exchange that uses trailers; but so far I haven't been able to get it to work, and I'm not sure whether something is wrong with the response I'm generating, or whether (as some have suggested) there are simply no clients that look for trailing headers.
Clients I've tried include: curl, wfetch, Chrome + jQuery.
In all cases, the client receives and correctly reconstructs the chunked response (All your base are belong to us), and I can see in the response headers that Trailer: My-Test-Trailer is being sent; but I'm not seeing My-Test-Trailer: something returned in the response headers, or anywhere else.
It's also unclear whether a trailing header like this should appear in the client as a normal response header after the entire response has been received and the connection closed.
Interestingly, the curl change logs appear to suggest that curl does support optional trailers, and that curl will process any trailers it finds into the normal header stream.
So does anybody know:
of a valid URL that I could ping, which sends trailers in a chunked response? (so that I can confirm whether it's just my test response that's not working); and
which clients are known to support (and access/display) trailers sent by the server?
No common browsers support HTTP/1.1 trailers. Look at the column "Headers in trailer" in the "Network" tab of browserscope.
Chrome: No, and won't fix (bug). Supports H/2 trailers (bug).
Firefox: No, and I don't see a ticket in bugzilla for it. Appears to support H/2.
IE: No
Edge: No
Safari: No
Opera: Old versions only (v10 - 12, removed in 14)
As you've discovered, a number of non-browser clients support it.
Over 5 years since asking this question, I can now definitively answer it myself.
Mozilla just announced that they will be supporting the new Server-Timing field as a HTTP trailing header (their first ever support for trailers).
https://bugzilla.mozilla.org/show_bug.cgi?id=1413999
However, more importantly, they confirm that it will be whitelisted so that Server-Timing is the only supported value (emphasis mine):
Server-Timing is an HTTP trailer, not a header. :mcmanus tells me we currently parse trailers, but then silently throw them away. We don't want to change that behavior in general (we don't want to encourage trailers), so we'll want to whitelist the Server-Timing trailer, store it somewhere (probably even just a mServerTiming header will work for now, since it's the only trailer we support) and then make it available via some new channel.getTrailers() call.
So I guess that confirms it once and for all: trailing headers are not supported (and never likely to be in a general sense) by Moz, and presumably the same stance is taken by all other browser vendors.
Since this commit, the Jodd HTTP Java client supports trailer headers.
On the first question, I haven't yet found any live response that uses them ;)
As of May 2022, all browsers support the Trailer response header: https://caniuse.com/mdn-http_headers_trailer.
Library support:
Node.js
Jodd
EDIT ME, to add more. (This is a Community wiki answer.)
Just recently, according to caniuse.com, most major browser vendors claim to support the Trailer response header in their latest versions (e.g. Firefox 88, Safari 14.1, Chrome 88, etc.).
However, this seems to be due only to their support of Server-Timing (which can use trailers, as mentioned in another answer); there does not currently seem to be a general way to access a trailer header from browser JavaScript. It's currently an open issue for the Fetch API, and the bug request for trailers in Chrome is marked wontfix.

411 Length required error with App Proxies

I set up an app proxy for my app, deployed on an ordinary shared hosting server (not Heroku or anything like that). It works like a charm (as do my other apps) until I set the content type to application/liquid.
As soon as I do that, I get a 411 Length Required error from nginx, which I'm guessing is generated by my server. I tried to resolve it by setting the content length to 0. That worked for a while, but then it stopped. I tried other values, and it works depending on its mood. Funnily, sometimes the output is truncated at the content length, and at other times I get the whole output (a simple page refresh can give different results). Also, sometimes it doesn't work at all and Shopify throws a "we're having technical difficulties" error.
To summarize, the content length is not reliable at all.
Now I'm not sure exactly what causes a 411 error or what I can do about it, or why it's thrown only when the content type is liquid. Moreover, the content length doesn't produce consistent output (no output, unpredictable output, truncated output, or a Shopify error).
Anyone knows what's up?
Perhaps your responses are using chunked transfer encoding. I don't think nginx supports this by default, so it would return a 411 error in that case, because chunked encoding doesn't use a Content-Length header.
If you do want to use chunked responses, the http://wiki.nginx.org/HttpChunkinModule module should add support for it. Otherwise, disable chunked encoding in your app and make sure the Content-Length header is consistent with the length of the response body.
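"Consistent with the length of the body" means counting the encoded bytes, not the characters. A minimal sketch of building such a response (the helper is hypothetical, not specific to nginx or Shopify):

```python
def build_response(status_line, headers, body_text):
    """Serialize a non-chunked HTTP/1.1 response with a correct Content-Length.

    Content-Length must count the encoded bytes of the body, not its
    characters; mixing those up with UTF-8 text is a classic way to end
    up with a length that disagrees with the body it describes.
    """
    body = body_text.encode("utf-8")
    lines = [status_line]
    lines += ["%s: %s" % (k, v) for k, v in headers.items()]
    lines.append("Content-Length: %d" % len(body))
    return "\r\n".join(lines).encode("ascii") + b"\r\n\r\n" + body
```

For example, the 5-character body "héllo" is 6 bytes in UTF-8, so the correct header is Content-Length: 6; counting characters instead would truncate the final byte.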

Understanding caching strategies of a dynamically generated search page

While studying the caching strategies adopted by various search engine websites and Stackoverflow itself, I can't help but notice the subtle differences in the response headers:
Google Search
Cache-Control: private, max-age=0
Expires: -1
Yahoo Search
Cache-Control: private
Connection: Keep-Alive
Keep-Alive: timeout=60, max=100
Stackoverflow Search
Cache-Control: private
There must be some logical explanation behind the settings adopted. Can someone care to explain the differences so that everyone of us can learn and benefit?
From RFC2616 HTTP/1.1 Header Field Definitions, 14.9.1 What is Cacheable:
private
Indicates that all or part of the response message is intended for a single
user and MUST NOT be cached by a shared cache. This allows an origin server
to state that the specified parts of the response are intended for only one
user and are not a valid response for requests by other users. A private
(non-shared) cache MAY cache the response.
max-age=0 means that the response may be cached for up to 0 seconds, i.e. no caching should be performed.
Expires: -1 should be ignored when max-age is present; in any case, -1 is an invalid date and must be treated as a value in the past (meaning already expired).
From RFC2616 HTTP/1.1 Header Field Definitions, 14.21 Expires:
Note: if a response includes a Cache-Control field with the max-age directive
(see section 14.9.3), that directive overrides the Expires field
HTTP/1.1 clients and caches MUST treat other invalid date formats, especially
including the value "0", as in the past (i.e., "already expired").
The Connection: Keep-Alive and Keep-Alive: timeout=60, max=100 headers configure persistent connections. All HTTP/1.1 connections are persistent unless otherwise specified, but these headers set explicit timeout values instead of relying on the browser's defaults (which vary greatly).
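The rules quoted above (private excludes shared caches; max-age overrides Expires; an invalid Expires such as -1 counts as already expired) can be condensed into a small helper. A sketch with hypothetical function names, assuming headers arrive as a plain dict with lower-case keys:

```python
from email.utils import parsedate_to_datetime


def freshness_lifetime(headers, now):
    """Seconds the response stays fresh, per the rules quoted above.

    max-age (when present) overrides Expires; an unparseable Expires
    value such as "-1" or "0" counts as already expired.
    """
    cc = headers.get("cache-control", "")
    for directive in cc.split(","):
        name, _, value = directive.strip().partition("=")
        if name == "max-age":
            return int(value)            # overrides Expires entirely
    expires = headers.get("expires")
    if expires is not None:
        try:
            return (parsedate_to_datetime(expires) - now).total_seconds()
        except (TypeError, ValueError):
            return 0                     # invalid date: already expired
    return None                          # no explicit freshness info


def shared_cache_may_store(headers):
    """A shared cache MUST NOT store a response marked private."""
    return "private" not in headers.get("cache-control", "")
```

Applied to the three sites: Google's private, max-age=0 plus Expires: -1 yields a freshness of 0 either way, while Yahoo and Stack Overflow's bare private simply keeps shared caches out without stating a lifetime.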