How is an HTTP multipart "Content-length" header value calculated? - http-headers

I've read conflicting and somewhat ambiguous replies to the question "How is a multipart HTTP request content length calculated?". Specifically I wonder:
What is the precise content range for which the "Content-length" header is calculated?
Are CRLF ("\r\n") octet sequences counted as one or two octets?
Can someone provide a clear example to answer these questions?

How you calculate Content-Length doesn't depend on the status code or media type of the payload; it's the number of bytes on the wire. So, compose your multipart response, count the bytes (and CRLF counts as two), and use that for Content-Length.
See: http://httpwg.org/specs/rfc7230.html#message.body.length
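As a minimal sketch of "compose the body, then count the bytes" in Python: the boundary string, the part contents, and the helper name `build_multipart` below are made up for illustration and are not taken from any particular client.

```python
CRLF = b"\r\n"

def build_multipart(boundary: str, parts: list[tuple[str, bytes]]) -> bytes:
    """Serialize (content_type, payload) pairs into a multipart body."""
    delimiter = b"--" + boundary.encode("ascii")
    chunks = []
    for content_type, payload in parts:
        chunks.append(delimiter + CRLF)
        chunks.append(b"Content-Type: " + content_type.encode("ascii") + CRLF)
        chunks.append(CRLF)            # blank line ends the part headers
        chunks.append(payload + CRLF)  # CRLF that precedes the next delimiter
    chunks.append(delimiter + b"--")   # closing delimiter
    return b"".join(chunks)

body = build_multipart(
    "example-boundary",  # hypothetical boundary
    [
        ("application/json", b'{"title": "a.txt"}'),
        ("text/plain", b"hello"),
    ],
)

# Content-Length is simply the length of the serialized bytes;
# every CRLF in the body contributes two octets to it.
content_length = len(body)
crlf_count = body.count(CRLF)
print(content_length, crlf_count)
```

Because the count is taken on the final byte string, there is no separate rule for line endings to remember: whatever ends up in `body` is what gets counted.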

The following live example should hopefully answer the questions.
Perform multipart request with Google's OAuth 2.0 Playground
Google's OAuth 2.0 Playground web page is an excellent way to perform a multipart HTTP request against the Google Drive cloud. You don't have to understand anything about Google Drive to do this -- I'll do all the work for you. We're only interested in the HTTP request and response. Using the Playground, however, will allow you to experiment with multipart and answer other questions, should the need arise.
Create a test file for uploading
I created a local text file called "test-multipart.txt" and saved it on my file system. The file is 34 bytes long and contains:
We're testing multipart uploading!
Open Google's OAuth 2.0 Playground
We first open Google's OAuth 2.0 Playground in a browser, using the URL https://developers.google.com/oauthplayground/:
Fill in Step 1
Select the Drive API v2 and the "https://www.googleapis.com/auth/drive" scope, then press "Authorize APIs":
Fill in Step 2
Click the "Exchange authorization code for tokens" button:
Fill in Step 3
Here we give all relevant multipart request information:
Set the HTTP Method to "POST"
There's no need to add any headers, Google's Playground will add everything needed (e.g., headers, boundary sequence, content length)
Request URI: "https://www.googleapis.com/upload/drive/v2/files?uploadType=multipart"
Enter the request body: this is some meta-data JSON required by Google Drive to perform the multipart upload. I used the following:
{"title": "test-multipart.txt", "parents": [{"id":"0B09i2ZH5SsTHTjNtSS9QYUZqdTA"}], "properties": [{"kind": "drive#property", "key": "cloudwrapper", "value": "true"}]}
At the bottom of the "Request Body" screen, choose the test-multipart.txt file for uploading.
Press the "Send the request" button
The request and response
Google's OAuth 2.0 Playground miraculously inserts all required headers, computes the content length, generates a boundary sequence, inserts the boundary string wherever required, and shows us the server's response:
Analysis
The multipart HTTP request succeeded with a 200 status code, so the request and response are good ones we can depend upon. Google's Playground inserted everything we needed to perform the multipart HTTP upload. You can see the "Content-length" is set to 352. Let's look at each line after the blank line following the headers:
--===============0688100289==\r\n
Content-type: application/json\r\n
\r\n
{"title": "test-multipart.txt", "parents": [{"id":"0B09i2ZH5SsTHTjNtSS9QYUZqdTA"}], "properties": [{"kind": "drive#property", "key": "cloudwrapper", "value": "true"}]}\r\n
--===============0688100289==\r\n
Content-type: text/plain\r\n
\r\n
We're testing multipart uploading!\r\n
--===============0688100289==--
There are nine (9) lines, and I have manually added "\r\n" at the end of each of the first eight (8) lines (for readability). Here is the number of octets (characters) in each line:
29 + '\r\n'
30 + '\r\n'
'\r\n'
167 + '\r\n'
29 + '\r\n'
24 + '\r\n'
'\r\n'
34 + '\r\n' (although '\r\n' is not part of the text file, Google inserts it)
31
The sum of the octets is 344, and considering each '\r\n' as a single one-octet sequence gives us the coveted content length of 344 + 8 = 352.
Summary
To summarize the findings:
The multipart request's "Content-length" is computed from the first byte of the boundary sequence following the header section's blank line, and continues until, and includes, the last hyphen of the final boundary sequence.
The '\r\n' sequences should be counted as one (1) octet, not two, regardless of the operating system you're running on.
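The arithmetic above can be re-checked in a few lines of Python. The per-line counts are the ones listed in this answer. The interpretation in the comments is an assumption, not something the Playground output confirms: a total of 352 corresponds to one octet per line terminator (for example a bare LF on the wire), whereas two-octet CRLF terminators would give 360.

```python
# Octets per line, excluding the line terminator, as listed above.
line_lengths = [29, 30, 0, 167, 29, 24, 0, 34, 31]
terminated_lines = 8  # the final boundary line has no terminator

payload_octets = sum(line_lengths)

# Total if each terminator occupies one octet (e.g. a bare LF; assumption).
one_octet_total = payload_octets + terminated_lines * 1

# Total if each terminator is a real two-octet CRLF.
two_octet_total = payload_octets + terminated_lines * 2

print(payload_octets, one_octet_total, two_octet_total)
```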

If an HTTP message has a Content-Length header, that header gives the exact number of bytes that follow the HTTP headers. If any implementation were free to count \r\n as one byte, everything would fall apart: keep-alive HTTP connections would stop working, because the HTTP stack could not tell where the next HTTP message starts and would try to parse random data as if it were an HTTP message.
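A rough sketch of that framing problem in Python. The function name `split_messages` and the sample stream are hypothetical, and the parser deliberately ignores chunked encoding and everything else a real HTTP stack handles. It only shows how Content-Length decides where the next message begins on a shared connection.

```python
def split_messages(stream: bytes) -> list[tuple[bytes, bytes]]:
    """Split back-to-back HTTP messages into (head, body) pairs via Content-Length."""
    messages = []
    pos = 0
    while pos < len(stream):
        head_end = stream.index(b"\r\n\r\n", pos)
        head = stream[pos:head_end]
        length = 0
        for line in head.split(b"\r\n")[1:]:  # skip the start line
            name, _, value = line.partition(b":")
            if name.strip().lower() == b"content-length":
                length = int(value.strip())
        body_start = head_end + 4
        messages.append((head, stream[body_start:body_start + length]))
        pos = body_start + length  # the next message starts right here
    return messages

# Two responses on one connection; the first body is b"line\r\n!" (7 octets).
stream = (
    b"HTTP/1.1 200 OK\r\nContent-Length: 7\r\n\r\nline\r\n!"
    b"HTTP/1.1 200 OK\r\nContent-Length: 2\r\n\r\nok"
)
messages = split_messages(stream)

# Same bytes, but the sender counted the CRLF in the body as one octet.
miscounted = split_messages(stream.replace(b"Content-Length: 7", b"Content-Length: 6"))

print(messages[1][0].split(b"\r\n")[0])    # start line of the second message
print(miscounted[1][0].split(b"\r\n")[0])  # start line is now corrupted
```

With the correct length the second start line is intact; with the miscounted length the leftover octet of the first body is glued onto the front of the second message.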

\r\n is two bytes.
Moshe Rubin's answer is wrong. The implementation it relies on is buggy.
I sent a curl request to upload a file and used Wireshark to capture the exact data sent over my network. That is a methodology everybody should agree is more valid than "an online application somewhere gave me a number".
--------------------------de798c65c334bc76\r\n
Content-Disposition: form-data; name="file"; filename="requireoptions.txt"\r\n
Content-Type: text/plain\r\n
\r\n
Pillow
pyusb
wxPython
ezdxf
opencv-python-headless
\r\n--------------------------de798c65c334bc76--\r\n
Curl, which everybody will agree has likely implemented this correctly, sent:
Content-Length: 250
> len("2d2d2d2d2d2d2d2d2d2d2d2d2d2d2d2d2d2d2d2d2d2d2d2d2d2d646537393863363563333334626337360d0a436f6e74656e742d446973706f736974696f6e3a20666f726d2d646174613b206e616d653d2266696c65223b2066696c656e616d653d22726571756972656f7074696f6e732e747874220d0a436f6e74656e742d547970653a20746578742f706c61696e0d0a0d0a50696c6c6f770d0a70797573620d0a7778507974686f6e0d0a657a6478660d0a6f70656e63762d707974686f6e2d686561646c6573730d0a2d2d2d2d2d2d2d2d2d2d2d2d2d2d2d2d2d2d2d2d2d2d2d2d2d2d646537393863363563333334626337362d2d0d0a")
500
(2 x 250 = 500: two hex digits per octet. I copied the hex stream out of Wireshark.)
I took the actual binary there. Each '2d' is a hyphen ('-'); the run of them starts the boundary.
Please note that when I gave the server the wrong count, treating 0d0a as one octet rather than two (which makes no sense: they are octets and cannot be combined into one), it actively rejected the request as bad.
This also answers the first part of the question. The actual Content-Length covers everything shown here: from the first boundary to the last, including the trailing --\r\n. It is all the octets that remain on the wire after the headers.
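The capture can be cross-checked by rebuilding the same body in Python and measuring it. The delimiter and header lines are copied from the dump above; the assumption is that the uploaded file used CRLF line endings and had no trailing newline, which is what the hex stream shows.

```python
# Delimiter line exactly as it appears on the wire: 26 hyphens + hex suffix.
delimiter = b"-" * 26 + b"de798c65c334bc76"

# File content as seen in the hex stream: CRLF between lines, none at the end.
file_bytes = b"\r\n".join(
    [b"Pillow", b"pyusb", b"wxPython", b"ezdxf", b"opencv-python-headless"]
)

body = b"".join([
    delimiter, b"\r\n",
    b'Content-Disposition: form-data; name="file"; filename="requireoptions.txt"\r\n',
    b"Content-Type: text/plain\r\n",
    b"\r\n",
    file_bytes,
    b"\r\n", delimiter, b"--\r\n",
])

wire_length = len(body)                         # CRLF counted as two octets
hex_length = len(body.hex())                    # two hex digits per octet
miscount = wire_length - body.count(b"\r\n")    # if CRLF were one octet

print(wire_length, hex_length, miscount)
```

Counting every octet reproduces the 250 that curl sent and the 500 hex digits from Wireshark; the one-octet-per-CRLF count comes out ten short.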

Related

What does 'very robust through email gateways' means in Content-Type multipart header?

I'm studying the Content-Type header, and in the multipart section of the developer.mozilla.org page I found this sentence:
boundary
For multipart entities the boundary directive is required, which consists of 1 to 70 characters from a set of characters known to be very robust through email gateways, and not ending with white space. It is used to encapsulate...
As far as I know, an "email gateway" is a type of email server that protects another server. I cannot understand what "robust through email gateways" means. What does it mean?
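For reference, the character set the quoted passage refers to appears to be the `bchars` set from RFC 2046, section 5.1.1. A small Python check against that grammar might look like the sketch below; the function name `is_valid_boundary` is hypothetical, and the regular expression is my transcription of the RFC grammar rather than code from any library.

```python
import re

# bcharsnospace from RFC 2046: digits, letters and  ' ( ) + _ , - . / : = ?
_BCHARS_NOSPACE = r"0-9A-Za-z'()+_,\-./:=?"

# Up to 69 bchars (space allowed) followed by one non-space bchar: 1 to 70 total.
BOUNDARY_RE = re.compile(rf"[{_BCHARS_NOSPACE} ]{{0,69}}[{_BCHARS_NOSPACE}]")

def is_valid_boundary(boundary: str) -> bool:
    """Return True if the string fits the RFC 2046 boundary grammar."""
    return BOUNDARY_RE.fullmatch(boundary) is not None

print(is_valid_boundary("===============0688100289=="))
print(is_valid_boundary("ends with a space "))
```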

sending € character to web server via VBA

I am using MSXML2.XMLHTTP60 to send text messages via VBA using a web server. I cannot understand why the € symbol is not displayed when a text message is received. Other special characters, such as ò, à, è etc., are displayed after a conversion function I wrote (for example, à is encoded as "%E0"). I suppose that the web server is expecting charset ISO 8859-1, which does not support the € symbol. How can I solve this problem?
If your request is a POST request then you can specify header for Content-Type with encoding e.g. like this:
objHTTP.Open "POST", ...
objHTTP.setRequestHeader "Content-Type", "text/html; charset=utf-8"
But for a GET request, the URL, including any query string parameters, will be encoded as ASCII. Read e.g. this post.
Using UTF-8 as your character encoding should solve such problems. It may also remove the need for your conversion function. I'm not sure how to set the encoding in your web server, but that's usually well documented.
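To illustrate why € behaves differently from à, here is a short Python sketch (Python is used only for illustration; the question itself is about VBA). It shows the percent-encoded form of each character under different charsets. What the server in the question actually expects is unknown, so treat the choice of encodings here as an assumption.

```python
from urllib.parse import quote

# à exists in ISO 8859-1 as the single byte 0xE0.
a_grave_latin1 = quote("à", encoding="latin-1")

# € has no code point in ISO 8859-1, so encoding it fails outright.
try:
    "€".encode("latin-1")
    euro_in_latin1 = True
except UnicodeEncodeError:
    euro_in_latin1 = False

# In UTF-8 the euro sign is three bytes; in Windows-1252 it is one.
euro_utf8 = quote("€", encoding="utf-8")
euro_cp1252 = quote("€", encoding="cp1252")

print(a_grave_latin1, euro_in_latin1, euro_utf8, euro_cp1252)
```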

Send data in an HTTP header and get it back

I am coding some test software to simulate something like a router. It will send URL requests on behalf of multiple users.
Is there any HTTP GET header field which I can send which the receiving server will always send back to me unchanged in the response so that I can associate the response with a user?
This is test software for use on a local LAN only, so I don't mind misusing a field, just as long as I get it returned unchanged.
According to the HTTP/1.1 RFC (RFC 2616), a response is:
Response = Status-Line               ; Section 6.1
           *(( general-header        ; Section 4.5
            | response-header        ; Section 6.2
            | entity-header ) CRLF)  ; Section 7.1
           CRLF
           [ message-body ]          ; Section 7.2
and here is notation:
*rule
The character "*" preceding an element indicates repetition. The
full form is "<n>*<m>element" indicating at least <n> and at most
<m> occurrences of element. Default values are 0 and infinity so
that "*(element)" allows any number, including zero; "1*element"
requires at least one; and "1*2element" allows one or two.
[rule]
Square brackets enclose optional elements; "[foo bar]" is
equivalent to "*1(foo bar)".
So the only thing a server is required to send is the status line. All other components are optional, which effectively means a server is never required to send any header back.
Also, this contains a list of all possible headers; none of them meets your requirements.
I'm not sure about HTTP 2.0; maybe somebody could add information about it.

Onenote API (REST) - PATCH append - "must include a 'commands'" error when Commands is already supplied (?!)

Note: I'm pretty sure nothing's wrong with the PATCH query, I had it working before with 'Content-type':'application/json' and a constructed json file:
[
{
'target':'|TARGET_ID|',
'action':'append',
'content':'|HTML|'
}
]
For the purposes of this question, the header supplied is as follows (the authentication bearer is correct and is omitted):
'Content-type':'multipart/form-data; Boundary=sectionboundary'
(note: Boundary=sectionboundary is in the same line)
Attempting to pass the following body as a PATCH to
https://www.onenote.com/api/v1.0/pages/|GUID|/content
returns a
"code":"20124","message":"A multi-part PATCH request must include a 'commands' part containing the PATCH action JSON structure." :
--sectionboundary
Content-Disposition: form-data; name="Commands"
Content-Type: application/json
[
{
'target':'|TARGET_ID|',
'action':'append',
'content':'|HTML|'
}
]
--sectionboundary
Content-Disposition: form-data; name="image-part-name"
Content-Type: image/png
|BINARY_IMAGE_DATA|
--sectionboundary--
As you can see, there's a Commands section already. Using lowercase 'commands' doesn't help, and the correct syntax should be "Commands" as per the OneNote Dev Center documentation.
PS: |TARGET_ID| |HTML| |GUID| and |BINARY_DATA| are replaced with the correct content at runtime. Due to privacy constraints, the fact that you may use a different schema than I do, and how long |BINARY_IMAGE_DATA| actually is, I will not show the actual input unless required to solve the problem.
Would like to know if I missed anything - thanks in advance.
PPS: Yes, I realize I've omitted the img tag inside |HTML| somewhere. It shouldn't have anything to do with code 20124, and if I got it wrong, it should return a different error entirely.
Based on investigating the request information you shared, I can confirm that the PATCH request referenced as part of the correlation you provided does not match your posted header information.
The correlated PATCH request shows up as a multi-part request with only a single part that has Media Type "TEXT/HTML" and not "Application/JSON". Can you please check and confirm your request content?
Let us continue to discuss this on email. If you still face issues calling the API, please write to me at machandw#microsoft.com
Regards,
Manoj

Fiddler add binary file data to POST

I'm trying to add binary file data directly to the request body of a POST call so I can simulate a file upload. I tried setting a 'before request' breakpoint and using 'insert file', but I couldn't get that to work. I also tried to modify CustomRules.js to inject the file, but couldn't figure out how to load binary data via JScript. Is there an easy solution here?
I'm sure this is a new feature in the year since this question was answered, but thought I'd add it anyhow:
There's a blue "[Upload file]" link in Composer now on the right side under the URL textbox. This will create a full multipart/form-data request. If you use this, you'll notice in the body you now have something that looks like this:
<#INCLUDE C:\Some\Path\my-image.jpg#>
In my case, I just wanted to POST the binary file directly with no multipart junk, so I just put the <#INCLUDE ... #> magic in the request body, and that sends the binary file as the body.
To send multipart/form-data, this recipe should help.
In the upper panel (HTTP headers), set Content-Type as below. Other values are filled in automatically.
Content-Type: multipart/form-data; boundary=-------------------------acebdf13572468
Then enter the request body in the lower panel as follows.
---------------------------acebdf13572468
Content-Disposition: form-data; name="description"
the_text_is_here
---------------------------acebdf13572468
Content-Disposition: form-data; name="file"; filename="123.jpg"
Content-Type: image/jpeg
<#INCLUDE C:\Users\Me\Pictures\95111c18-e969-440c-81bf-2579f29b3564.jpg#>
---------------------------acebdf13572468--
The important rules are:
Each boundary line in the body must have two more - signs than the boundary value given in Content-Type.
The last boundary line of the body must additionally end with two - signs.
In Fiddler script: (in Fiddler: Rules... Customize Rules), find the OnBeforeRequest function, and add a line similar to:
if (oSession.uriContains("yourdomain"))
{
    oSession.LoadRequestBodyFromFile("c:\\temp\\binarycontent.dat");
}
Since version 2.0, the Request Body has an "Upload File..." link that allows you to post/upload binary data.