content-type request header - http-headers

I'm making an AJAX call to a REST API and have specified the following header on an HTTP POST request:
Content-Type: application/json; charset=UTF-8
My POST body contains some Japanese/Chinese characters.
My question is: do I need to encode the body of the POST request as UTF-8 myself, or does the browser take care of the encoding?

When your Content-Type header declares the UTF-8 charset, you must send the content encoded as UTF-8.
Although browsers sometimes "guess" or "fix" the encoding, you should never rely on this: that logic is fragile and often fails.
If your Chinese/Japanese content is in a different encoding (such as Shift-JIS), you will have to convert the text with a library such as iconv.
Alternatively, you could declare that other encoding in the header, but note that a single encoding applies to the whole body. For application/json in particular, RFC 8259 expects UTF-8, so converting everything to UTF-8 is usually the best solution.
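
As a minimal sketch of the advice above (in Python rather than browser JavaScript, purely for illustration), the body can be encoded explicitly so that the bytes on the wire match the declared charset. The helper names build_json_request and shift_jis_to_utf8 are hypothetical, and Python's built-in codecs stand in for iconv here.

```python
import json


def build_json_request(payload: dict) -> tuple:
    """Return (headers, body) with the body explicitly encoded as UTF-8."""
    # ensure_ascii=False keeps the Japanese/Chinese characters as-is in the
    # JSON text; .encode("utf-8") then produces the bytes that are sent.
    body = json.dumps(payload, ensure_ascii=False).encode("utf-8")
    headers = {
        "Content-Type": "application/json; charset=UTF-8",
        "Content-Length": str(len(body)),  # length in bytes, not characters
    }
    return headers, body


def shift_jis_to_utf8(raw: bytes) -> bytes:
    """Convert Shift-JIS encoded bytes to UTF-8 encoded bytes."""
    return raw.decode("shift_jis").encode("utf-8")


headers, body = build_json_request({"name": "日本語"})
legacy = "日本語".encode("shift_jis")
converted = shift_jis_to_utf8(legacy)
```

Note that Content-Length counts bytes, so it is larger than the character count whenever the body contains non-ASCII text.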

Related

Is it correct to specify HTTP response Content-Type as image/*?

I'm curious whether sending Content-Type: image/* in an HTTP response is correct. I know it's advisable to specify the exact MIME type, but I'd like to hear whether I can use such a header as a fallback when I know it's an image but I don't know its type.
I don't see any evidence that you can or should use wildcards in Content-Type.
RFC 7231 does allow wildcards like that in the Accept header, where you're indicating a range of acceptable content types. The definition of Content-Type does not give any special meaning to the * character, and image/* is not listed as a registered type.
If you cannot identify the media type accurately, you should just leave off the header entirely.
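
One way to act on this, sketched under assumptions: inspect the leading bytes of the payload for a few well-known image signatures, and omit Content-Type when nothing matches. The function names and the short signature table below are illustrative only and cover just PNG, JPEG, and GIF.

```python
from typing import Optional

# Leading "magic" bytes for a few common image formats (not exhaustive).
SIGNATURES = [
    (b"\x89PNG\r\n\x1a\n", "image/png"),
    (b"\xff\xd8\xff", "image/jpeg"),
    (b"GIF87a", "image/gif"),
    (b"GIF89a", "image/gif"),
]


def sniff_image_type(data: bytes) -> Optional[str]:
    """Return a concrete media type if the signature is recognized, else None."""
    for magic, media_type in SIGNATURES:
        if data.startswith(magic):
            return media_type
    return None


def response_headers(data: bytes) -> dict:
    """Build response headers, leaving Content-Type out when the type is unknown."""
    headers = {"Content-Length": str(len(data))}
    media_type = sniff_image_type(data)
    if media_type is not None:
        headers["Content-Type"] = media_type
    return headers
```

The point of the design is that the header is either exact or absent; image/* is never emitted.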

Is there a size limit on individual fields in an HTTP POST?

I have an API for a file upload that expects a multipart form submission. But I have a customer writing a client and his system can't properly generate a multipart/form-data request. He's asking that I modify my API to accept the file in an application/x-www-form-urlencoded request, with the filename in one key/value pair and the contents of the file, base64 encoded, in another key/value pair.
In principle I can easily do this (though I need a shower afterwards), but I'm worried about size limits. The files we expect in Production will be fairly large: 5-10MB, sometimes up to 20MB. I can't find anything that tells me about length limitations on individual key/value pair data inside a form POST, either in specs (I've looked at, among others, the HTTP spec and the Forms spec) or in a specific implementation (my API runs on a Java application server, Jetty, with an Apache HTTP server in front of it).
What is the technical and practical limit for an individual value in a key/value pair in a form POST?
There are artificial, configurable limits on the HttpConfiguration class, both for the maximum number of keys and for the maximum size of the request body content.
In practical terms, this is a really bad idea.
You'll have a String, which uses 2 bytes per character for the Base64 data.
And you have the typical 33% overhead of Base64 itself.
They'll also have to urlencode the Base64 string, because some of its characters are special in urlencoded form (for example "+" is a valid Base64 character but means a space " " in urlencoded form, so they'll need to encode that "+" as "%2B").
So for a 20MB file you'll have ...
20,971,520 bytes of raw data, represented as 27,962,028 characters in raw Base64, using (on average, assuming about 5% growth) roughly 29,360,000 characters when urlencoded, which will use roughly 58,720,000 bytes of memory in its String form.
The decoding process on Jetty will take the incoming raw urlencoded bytes and allocate 2x that size in a String before decoding the urlencoded form. So that's a java.lang.String of roughly 58,720,000 characters (which uses roughly 117,440,000 bytes of heap memory, and don't forget the 29MB of bytebuffer data being held too!) just to decode that Base64 binary file as a value in an x-www-form-urlencoded String form.
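
The first two steps of that arithmetic can be checked with a short sketch (Python, for illustration only; the function names are hypothetical). Padded Base64 output is exactly 4 characters per 3 input bytes, rounded up, and urlencoding turns each "+", "/", and "=" into a three-character escape. How much urlencoding adds depends on the file's contents, so no fixed percentage is assumed here.

```python
import base64
from urllib.parse import quote


def base64_length(raw_size: int) -> int:
    """Exact length of padded Base64 output for raw_size input bytes."""
    return 4 * ((raw_size + 2) // 3)


def urlencoded_base64(data: bytes) -> str:
    """Base64-encode data, then percent-encode every reserved character."""
    encoded = base64.b64encode(data).decode("ascii")
    return quote(encoded, safe="")  # safe="" so that "/" is escaped too


raw_size = 20 * 1024 * 1024            # 20,971,520 bytes
b64_chars = base64_length(raw_size)    # 27,962,028 characters

# A two-byte sample whose Base64 form contains "+", "/" and "=":
sample = b"\xfb\xff"
sample_b64 = base64.b64encode(sample).decode("ascii")  # "+/8="
sample_form = urlencoded_base64(sample)                # "%2B%2F8%3D"
```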
I would push back and force them to use multipart/form-data properly. There are tons of good libraries to generate that form-data properly.
If they are using Java, tell them to use the httpmime library from the Apache HttpComponents project (they don't have to have/use/install Apache Http Client to use httpmime; it's a standalone library).
Alternative Approach
There's nothing saying you have to use application/x-www-form-urlencoded or multipart/form-data.
Offer a raw upload option via application/octet-stream
They use POST, and MUST include the following valid request headers ...
Connection: close
Content-Type: application/octet-stream
Content-Length: <whatever_size_the_content_is>
Connection: close signals that the connection will be closed once the exchange is complete.
Content-Type: application/octet-stream means Jetty will not process that content as request parameters and will not apply charset translations to it.
Content-Length is required so that you can tell the entire file was sent/received.
Then they just stream the raw binary bytes to you.
This is just for the file contents. If you have other information that needs to be passed in (such as the filename), consider using either query parameters or a custom request header (e.g. X-Filename: secretsauce.doc).
In your servlet, you just use HttpServletRequest.getInputStream() to obtain those bytes, and you use the Content-Length header to verify that you received the entire file.
Optionally, you can make them provide a SHA-1 hash in the request headers, like X-Sha1Sum: bed0213d7b167aa9c1734a236f798659395e4e19, which you then use on your side to verify that the entire file was sent/received properly.
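
The header set and the two integrity checks described above could look roughly like this. It is a language-neutral sketch written in Python, not servlet code: build_upload_headers and verify_upload are hypothetical helpers, and X-Filename and X-Sha1Sum are the custom header names suggested in this answer, not standard headers.

```python
import hashlib


def build_upload_headers(body: bytes, filename: str) -> dict:
    """Client side: headers for a raw application/octet-stream upload."""
    return {
        "Connection": "close",
        "Content-Type": "application/octet-stream",
        "Content-Length": str(len(body)),
        "X-Filename": filename,                        # custom header
        "X-Sha1Sum": hashlib.sha1(body).hexdigest(),   # custom header
    }


def verify_upload(headers: dict, body: bytes) -> bool:
    """Server side: check the received bytes against the declared values."""
    # 1. The number of bytes read must match Content-Length.
    if int(headers["Content-Length"]) != len(body):
        return False
    # 2. If a SHA-1 digest was supplied, it must match the received bytes.
    expected = headers.get("X-Sha1Sum")
    if expected is not None:
        return hashlib.sha1(body).hexdigest() == expected.lower()
    return True
```

A truncated body fails the length check, and a corrupted body of the right length fails the digest check.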

Safari - Special Char in header

Due to language adaptation I need to place some "special" chars in a custom header (chars like é, á, í, ç, and others)...
On the server side I'm using ASP.NET MVC.
It all works fine in Chrome.
But in Safari... I can't figure out which encoding Safari uses...
I tried:
UTF-8,
UTF-16,
ASCII,
Url Encode,
a few ISO's
but alert(headerValue) always returns crazy chars...
Can anyone tell me which encoding to use?
There was a rule in the past regarding HTTP header encoding: RFC 2616 allowed characters outside ISO-8859-1 in header values only when encoded according to RFC 2047. It was rarely implemented, and RFC 7230 removed it.
Here are some related links:
What character encoding should I use for a HTTP header?
HTTP headers encoding/decoding in Java
https://bugzilla.mozilla.org/show_bug.cgi?id=601933
In your case, perhaps you could use a URL-encoded string for the value of this custom header.
Hope it helps you,
Thierry
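
The URL-encoding suggestion can be sketched as follows (Python for illustration; the function names are hypothetical). The value is percent-encoded as UTF-8 before it goes into the header, so only ASCII travels on the wire, and the receiving side decodes it explicitly instead of relying on how a given browser interprets raw header bytes.

```python
from urllib.parse import quote, unquote


def encode_header_value(text: str) -> str:
    """Percent-encode text (as UTF-8) so the header value is pure ASCII."""
    return quote(text, safe="")


def decode_header_value(value: str) -> str:
    """Reverse encode_header_value on the receiving side."""
    return unquote(value)


wire_value = encode_header_value("é á í ç")   # "%C3%A9%20%C3%A1%20%C3%AD%20%C3%A7"
restored = decode_header_value(wire_value)     # "é á í ç"
```

In a browser, the counterpart of the decoding step would be decodeURIComponent(headerValue) before calling alert.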

Fiddler: How to use oSession.utilFindInResponse in gzip encoded response

I'm trying to use Fiddler to break when the response contains a specific word and edit the response live.
However, it seems that the oSession.utilFindInResponse function does not match successfully because the response is using GZIP encoding.
Is it possible to get around this?
I'm new to Fiddler's rules, but is there perhaps a way to change HTTP compression on the fly?
Thanks
Calling oSession.utilDecodeResponse() on the session object will remove HTTP chunking and compression from the response.
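
To see why the search only works after decoding, here is a small illustration in Python. It is not FiddlerScript and does not use Fiddler's API; find_in_body is a hypothetical helper that mimics a text search over a response body, with and without decompressing first.

```python
import gzip
from typing import Optional


def find_in_body(raw: bytes, needle: str, content_encoding: Optional[str] = None) -> bool:
    """Search for needle in a response body, decoding gzip first when declared."""
    if content_encoding == "gzip":
        raw = gzip.decompress(raw)
    return needle.encode("utf-8") in raw


text = "some page content with a specific word inside. " * 20
plain_body = text.encode("utf-8")
wire_body = gzip.compress(plain_body)  # what actually travels on the wire

# Searching wire_body directly looks at compressed bytes, not the text.
# Decoding first restores the original bytes, so the search can match.
found = find_in_body(wire_body, "specific word", content_encoding="gzip")
```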

RFCs for content-type header?

I've looked at RFC 2231 and RFC 2183.
I'm dealing with a multipart/related MIME payload.
I'm trying to decipher whether the following is syntactically correct, specifically the "start" parameter of the first Content-Type, but I haven't been able to find the correct RFC.
Content-Type: multipart/related; boundary="=_34e1b39f5c290f66360ff510d4c38da4"; type="application/smil"; start="<cid:eaec2c30d892902b14044d57dbb6ff85>"
--=_34e1b39f5c290f66360ff510d4c38da4
Content-ID: <eaec2c30d892902b14044d57dbb6ff85>
Content-Type: application/vnd.oma.drm.message; boundary=ihvdxymhvdhobklkqbcn;
name="IrishJi2.dm";
Content-Disposition: attachment;
filename="IrishJi2.dm";
--ihvdxymhvdhobklkqbcn
Content-Type: audio/mpeg
Content-Transfer-Encoding: binary
Some background information for the curious: the application/vnd.oma.drm.* file types are just wrappers around a payload item (mp3, jpg, etc.) that tell cellular devices the wrapped file is to be considered protected payload and must not be forwarded or transferred off the phone in any way. If not for contractual obligations I'd just rip the wrapper off, send the payload on, and be happy, but that is too easy and probably illegal.
From RFC 2387 (The MIME Multipart/Related Content-type):
3.2. The Start Parameter
The start parameter, if given, is the content-ID of the compound object's "root". If not present the "root" is the first body part in the Multipart/Related entity. The "root" is the element the applications processes first.
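
A minimal sketch of that rule, using Python's standard email package on a made-up message (the boundary, Content-IDs, and bodies are invented for the example, and build_message and find_root are hypothetical helpers). It assumes the start value is compared against each part's Content-ID after stripping the angle brackets; falling back to the first part when nothing matches is a choice made here, not something the quoted text specifies.

```python
from email import message_from_bytes
from email.message import Message
from typing import Optional


def build_message(start: Optional[str]) -> Message:
    """Build a small two-part multipart/related message for the demo."""
    content_type = b'Content-Type: multipart/related; boundary="b1"; type="text/plain"'
    if start is not None:
        content_type += b'; start="' + start.encode("ascii") + b'"'
    lines = [
        content_type, b"",
        b"--b1", b"Content-ID: <first@example.invalid>",
        b"Content-Type: text/plain", b"", b"first part",
        b"--b1", b"Content-ID: <second@example.invalid>",
        b"Content-Type: text/plain", b"", b"second part",
        b"--b1--", b"",
    ]
    return message_from_bytes(b"\r\n".join(lines))


def normalize_cid(value: str) -> str:
    """Strip whitespace and surrounding angle brackets from a content-ID."""
    return value.strip().strip("<>")


def find_root(message: Message) -> Message:
    """Return the root part: the one named by 'start', else the first part."""
    parts = message.get_payload()
    start = message.get_param("start")
    if start is not None:
        wanted = normalize_cid(start)
        for part in parts:
            if normalize_cid(part.get("Content-ID", "")) == wanted:
                return part
    return parts[0]


root_with_start = find_root(build_message("<second@example.invalid>"))
root_without_start = find_root(build_message(None))
```

Under this reading, a start value only selects a part if it equals that part's Content-ID, so a value carrying a cid: prefix, as in the question, would not match a Content-ID written without it.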