I have an NSData object that consists of several HTTP responses or requests concatenated together. What is the most effective way to tokenise this stream of requests/responses into individual CFHTTPMessageRef objects?
My current approach is to read the data one line at a time until CFHTTPMessageIsHeaderComplete returns YES, at which point I then grab the value of the Content-Length header to determine the length of the body associated with this particular request.
This approach works reasonably well, but it fails in the case of chunked transfer encoding. I could add extra logic to deal with chunked transfers, but my parsing code would grow more than I would like. Similarly, I am currently only handling well-formed messages -- it will trip up should a message not be formatted perfectly.
Is there (ideally) a set of Objective-C classes that can parse a stream of data into discrete HTTP messages? Is this something that libcurl could perform?
No, libcurl cannot split this up for you. It only separates the actual HTTP responses that it receives over the network.
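For illustration, here is the splitting logic the question describes, extended to handle chunked bodies, sketched in Python rather than Objective-C (error handling for truncated or malformed input is omitted). The same state machine could feed each complete message to CFHTTPMessageAppendBytes:

```python
# Sketch (Python, not Objective-C): split concatenated HTTP messages,
# handling both Content-Length and chunked transfer encoding.

def split_http_messages(data: bytes):
    """Return a list of complete HTTP messages (headers + body)."""
    msgs = []
    pos = 0
    while pos < len(data):
        head_end = data.index(b"\r\n\r\n", pos) + 4
        headers = data[pos:head_end].decode("iso-8859-1").lower()
        if "transfer-encoding: chunked" in headers:
            # Walk chunk by chunk: "<hex size>\r\n<data>\r\n"; a size of 0
            # marks the terminating "0\r\n\r\n" (trailers not handled).
            body_end = head_end
            while True:
                line_end = data.index(b"\r\n", body_end)
                size = int(data[body_end:line_end], 16)
                body_end = line_end + 2 + size + 2
                if size == 0:
                    break
            next_pos = body_end
        else:
            length = 0
            for line in headers.split("\r\n"):
                if line.startswith("content-length:"):
                    length = int(line.split(":", 1)[1])
            next_pos = head_end + length
        msgs.append(data[pos:next_pos])
        pos = next_pos
    return msgs
```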
What is the difference between a frame and an envelope in the context of rabbitmq-c? The examples seem to use them interchangeably, and it's not clear what the difference is.
A frame (amqp_frame_t) is an AMQP protocol-level construct. It represents a unit of transmission over the network and can contain a number of different payloads.
An envelope (amqp_envelope_t) is a complete received message. An envelope is usually constructed from 3 or more frames (one method frame, one header frame, and one or more body frames).
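For context, the wire-level layout that amqp_frame_t models is the AMQP 0-9-1 general frame format: a 1-byte frame type, a 2-byte channel, a 4-byte payload size, the payload itself, and a 0xCE frame-end octet. Here is a small Python sketch of parsing that layout (illustrative only, not the rabbitmq-c API):

```python
import struct

# Illustrative parser for the AMQP 0-9-1 general frame format; this is not
# the rabbitmq-c API, just the wire layout that amqp_frame_t corresponds to.
FRAME_METHOD, FRAME_HEADER, FRAME_BODY, FRAME_HEARTBEAT = 1, 2, 3, 8

def read_frame(buf: bytes):
    """Parse one frame: type (1 byte), channel (2), size (4), payload, 0xCE."""
    ftype, channel, size = struct.unpack_from("!BHI", buf, 0)
    payload = buf[7:7 + size]
    if buf[7 + size] != 0xCE:
        raise ValueError("missing frame-end octet")
    return ftype, channel, payload, buf[8 + size:]  # leftover bytes returned
```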
I am using TensorFlow Serving as a deep learning model server; it is a gRPC service. In order to track the server's requests and responses, there is a proxy between the server and the client. The proxy records the whole HTTP-level requests and responses.
The (request, response) tuples need to be human-readable in some way, so I need to translate the gRPC requests and responses into JSON. Since I have the *.proto files, this did not look hard. But after some tests, I found that the gRPC request and response bodies have 5 extra bytes of data in front of the whole body.
// bytes in the grpc response:
\x00\x00\x00\x00c\nA\n\x07Softmax\x126\x08\x01\x12\x08\x12\x02\x08\x01\x12\x02\x08\n*(\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x80?\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x12\x1e\n\x07default\x12\x02\x08\x01\x1a\x0fserving_default
// bytes in the raw .pb format:
\nA\n\x07Softmax\x126\x08\x01\x12\x08\x12\x02\x08\x01\x12\x02\x08\n*(\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x80?\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x12\x1e\n\x07default\x12\x02\x08\x01\x1a\x0fserving_default
You can see the extra five bytes \x00\x00\x00\x00c there. So... what do they mean? Do all gRPC requests and responses have this extra prefix? Or is there some better way to parse gRPC contents and translate them into a human-readable structure?
gRPC prefixes each message with a 5-byte header: a 1-byte compressed flag followed by a 4-byte big-endian message length. Search for Length-Prefixed-Message in https://github.com/grpc/grpc/blob/master/doc/PROTOCOL-HTTP2.md.
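In other words, stripping those 5 bytes yields the raw protobuf message, which is consistent with the observed prefix \x00\x00\x00\x00c: compressed flag 0, length 0x63 = 99. A minimal Python sketch:

```python
import struct

# Strip the gRPC Length-Prefixed-Message framing; the remaining bytes can
# then be fed to the generated protobuf class's ParseFromString.
def unframe_grpc(body: bytes):
    """Return (compressed_flag, message_bytes) for one gRPC data frame."""
    compressed, length = struct.unpack_from("!BI", body, 0)
    return compressed, body[5:5 + length]
```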
What is the recommended way to deal with message versioning? The main schools of thought appear to be:
1. Always create a new message class as the message structure changes.
2. Never use (pure) serialized objects as a message. Always use some kind of version header field and a byte-stream body field. That way, the receiver can always accept the message and check the version number before attempting to read the message body.
3. Never use binary serialized objects as a message. Instead, use a textual form such as JSON. That way, the receiver can always accept the message, check the version number, and then (when possible) understand the message body.
As I want to keep my messages compact, I am considering Google Protocol Buffers, which would allow me to satisfy both options 2 and 3.
However, I am interested in real-world experience and advice on how to handle the versioning of messages as their structure changes.
In this case, the "version" is basically metadata about the message, and this metadata is an instruction/hint to the processing algorithm. So I would suggest adding such metadata in a header (outside the payload), so that the consumer can read the metadata first, before trying to read, understand, and process the message payload. For example, if you keep the version info in the payload and for some reason the payload is corrupted, the algorithm will fail to parse the message and can never even reach the metadata you put there.
You might also consider keeping both the version and the type info of the payload in one header.
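To make the advice concrete, here is a minimal Python sketch of option 2: a fixed header carrying version and type outside the payload (the 2-byte layout and field choices here are my own invention, not a standard):

```python
import struct

# Sketch of option 2: a fixed 2-byte header (version, type) kept outside
# the payload. The layout is illustrative, not a standard.
def pack_message(version: int, msg_type: int, payload: bytes) -> bytes:
    return struct.pack("!BB", version, msg_type) + payload

def unpack_message(data: bytes):
    version, msg_type = struct.unpack_from("!BB", data, 0)
    # The header stays readable even if the payload turns out to be garbage.
    return version, msg_type, data[2:]
```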
Is there an option in RabbitMQ to send multipart messages?
This fact allows for using multi-part messages for adding coarse-grained structure to your message. The example with two matrices illustrates the point. You send the two matrices as two message parts and thus avoid the copy. However, at the same time the matrices are cleanly separated, each residing in its own message part, and you are guaranteed that the separation will be preserved even on the receiving side. Consequently, you don't have to put the matrix size into the message or invent any kind of "matrix delimiters".
There is no multipart-publish feature in the AMQP protocol at all (see section 2.1.2, Message Flow).
Multipart sending and receiving can be implemented at the application level, but there are no known use cases for it.
I have limited knowledge of WCF and of sending binary data via WCF, so this question may be somewhat rudimentary.
I would like to know the difference between sending data with BinaryMessageEncodingBindingElement and MtomMessageEncodingBindingElement. It is still not clear to me when to use which approach after reading this page from MSDN on Large Data and Streaming.
Also, a small question: are a message with attachments and an MTOM message the same thing?
MTOM is a standard that uses multi-part MIME-encoded messages to send, as raw binary, portions of the message that are large and would be too expensive to base64-encode. The SOAP message itself is sent as the initial part of the message and contains references to the binary parts, which a web service software stack like WCF can then pull back together to create a single representation of the message.
Binary encoding is entirely proprietary to WCF and isn't only about large messages. It presents a binary representation of the XML Infoset, which is far more compact on the wire and faster to parse than text-based formats. If you happen to be sending large binary chunks of data, they simply fit right in with the other bytes being sent.
Streaming can be used with any message format. That's more about when the data is written across the network versus being buffered entirely in memory before being handed to the network transport. Smaller messages make more sense to buffer up before sending, while larger messages, especially those containing large binary chunks or streams, must be streamed or they will exhaust memory resources.
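For reference, here is roughly how the two encoders are selected in configuration (a trimmed sketch; the binding names are made up):

```xml
<!-- Illustrative WCF config: MTOM is a switch on the built-in HTTP
     bindings, while binary encoding is composed via a customBinding. -->
<bindings>
  <basicHttpBinding>
    <binding name="mtomOverHttp" messageEncoding="Mtom" />
  </basicHttpBinding>
  <customBinding>
    <binding name="binaryOverHttp">
      <binaryMessageEncoding />
      <httpTransport />
    </binding>
  </customBinding>
</bindings>
```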