MSN (message sequence number) in the response to a retransmitted RDMA READ

While running an ib_read_bw test with 64K message sizes from a Mellanox CX-4 (the request initiator) to another RNIC, retransmissions start from the Mellanox side at the 5th RDMA READ onwards, covering the remaining 50KB of data (the first 12KB was ACKed successfully). After that, it continuously retransmits the same request for the remaining 50KB, even though the target RNIC is responding.
One observation: the target RNIC responds with an MSN of 11 instead of 5 in the first RDMA READ response to the retransmitted (50KB) read request.
The InfiniBand spec says that for duplicate requests the RNIC should not increment the MSN. Does this mean the RNIC should respond with whatever MSN it currently holds (it may already have responded to all incoming requests, reached an MSN of 16, and only then seen the retransmission), or should it respond with the MSN matching the retransmitted RDMA READ?

The InfiniBand spec says that:
For RDMA READ requests, the responder may increment its MSN after it has completed validating the request and before it has begun transmitting any of the requested data, and may return the incremented MSN in the AETH of the first response packet.
and
The MSN shall not be incremented for duplicate requests.
(C9-148)
I believe this means the MSN should remain unchanged when retransmission occurs.

Yes, per my understanding the MSN should point to the original read request.
When responding to a duplicate SEND or WRITE, both the PSN and the MSN are those of the last ACK sent; this works as a coalesced ACK.
But when responding to a duplicate READ request, the PSN is that of the original read request, and hence the MSN should also be that of the original read request.
From the spec:
"to be considered a duplicate RDMA READ Request, the PSN of the duplicate
request must be within the responder's current duplicate PSN region. Furthermore,
to be considered a valid duplicate RDMA READ Request, the
PSN of the duplicate request must fall within the range of PSNs allocated
to the original RDMA READ Response, and the amount of data requested
in the duplicate request must be entirely contained within the extent of
data requested in the original RDMA READ Request. In other words, the
data requested in the duplicate RDMA READ Request must be a proper
subset of the data requested in the original RDMA READ Request.
If the starting PSN and length of a duplicate RDMA READ Request does
not fall within the range of PSNs allocated to the original RDMA READ Response,
the request is invalid and the responder may silently drop the duplicate
RDMA READ Request "
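To make this concrete, here is a hedged C sketch (names invented here, not vendor firmware) of how a responder might choose the MSN to place in the AETH of the first READ response packet, under the reading above:

```c
#include <stdint.h>

/* Illustrative responder state; names are invented for this sketch. */
struct resp_state {
    uint32_t cur_msn;        /* MSN after the newest completed request     */
    uint32_t orig_read_msn;  /* MSN returned with the original RDMA READ   */
};

/* MSN to place in the AETH of the first READ response packet. Per C9-148
 * the MSN must not advance for duplicates, and since the response PSN is
 * that of the original READ, the matching original MSN is echoed too. */
static uint32_t aeth_msn(const struct resp_state *s, int is_duplicate_read)
{
    return is_duplicate_read ? s->orig_read_msn : s->cur_msn;
}
```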

How to handle EAGAIN case for TLS over SCTP streams using memory BIO interface

I'm using the memory BIO interface to implement TLS over SCTP.
At the client side, while sending out application data:
1. SSL_write() encrypts the data and writes it to the associated write BIO.
2. The data is read from the write BIO into an output buffer using BIO_read().
3. The buffer is sent out on the socket using sctp_sendmsg().
Similarly, at the server side, while reading data from the socket:
1. sctp_recvmsg() reads encrypted message chunks from the socket.
2. BIO_write() writes them to the read BIO.
3. SSL_read() decrypts the data read from the BIO.
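For reference, this is roughly that client-side send path with OpenSSL memory BIOs; a minimal sketch, assuming "ssl" was set up with SSL_set_bio(ssl, rbio, wbio) and a connected one-to-one SCTP socket (tls_send is an invented name, and error handling is abbreviated):

```c
#include <openssl/ssl.h>
#include <netinet/sctp.h>

static int tls_send(SSL *ssl, BIO *wbio, int sd, const void *data, int len)
{
    unsigned char out[17 * 1024];                /* room for one TLS record */
    int n;

    if (SSL_write(ssl, data, len) <= 0)          /* step 1: encrypt         */
        return -1;

    while ((n = BIO_read(wbio, out, sizeof out)) > 0)  /* step 2: drain BIO */
        /* step 3: put the ciphertext on the wire; EAGAIN can strike here,
         * which is exactly the case discussed below. */
        if (sctp_sendmsg(sd, out, (size_t)n, NULL, 0, 0, 0, 0, 0, 0) < 0)
            return -1;

    return 0;
}
```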
The case I'm interested in is where, at the client side, steps 1 and 2 are done, and while doing step 3 I get an EAGAIN from the socket. So I clean up whatever data I've read from the BIO buffer and ask the application to resend the data again after some time.
Now when I do this, and steps 1, 2 and 3 later go through fine at the client side, OpenSSL at the server side finds that the record it received has a bad_record_mac and closes the connection.
From googling I learned that one way this can happen is if TLS records arrive out of sequence, since the MAC computation depends on the previous record and TLS needs records delivered in order. So by cleaning up the data on EAGAIN, am I dropping an SSL record and then sending the next record out of order (I'm missing clarity here)?
Just to test my hypothesis, whenever the socket returned EAGAIN I changed the code to wait indefinitely until the socket was writable; then everything goes fine and I don't see any bad_record_mac at the server side.
Can someone help me with this EAGAIN handling? I can't do an infinite wait to get around the issue; is there any other way out?
... I get an EAGAIN from the socket. So I clean up whatever data I've read from the BIO buffer and ask the application to resend the data again after some time.
If you get an EAGAIN on the socket, you should try to send the same encrypted data later.
What you do instead is throw the encrypted data away and ask the application to send the same plain data again. This means that data gets encrypted again. But encrypting plain data in SSL also incorporates a sequence number for the SSL frame, and this sequence number is not the same as for the last SSL frame you threw away.
Thus, if you have thrown away the full SSL frame, you are now sending a new SSL frame with the next sequence number, which does not fit the expected sequence number. If you succeeded in sending part of the previous SSL frame and threw away the rest, then the new data you send will be considered part of the previous frame, which means the HMAC of the frame will not match.
Thus, don't throw away the encrypted data; try to resend it instead of letting the upper layer resend the plain data.
1. Select for writability.
2. Repeat the send.
3. If the send was incomplete, remove the part of the buffer that got sent and go to (1).
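As a rough illustration of that loop, here is a minimal C sketch, with plain send() standing in for sctp_sendmsg() and send_encrypted() as an invented helper name. The key point: the same encrypted buffer is retried, never re-encrypted.

```c
#include <errno.h>
#include <sys/select.h>
#include <sys/socket.h>

/* Send all of buf (already-encrypted bytes pulled from the write BIO) on a
 * non-blocking socket. Error paths are abbreviated. */
static int send_encrypted(int fd, const unsigned char *buf, size_t len)
{
    size_t off = 0;
    while (off < len) {
        ssize_t n = send(fd, buf + off, len - off, 0);
        if (n < 0) {
            if (errno == EAGAIN || errno == EWOULDBLOCK) {
                fd_set wfds;
                FD_ZERO(&wfds);
                FD_SET(fd, &wfds);          /* (1) select for writability */
                if (select(fd + 1, NULL, &wfds, NULL, NULL) < 0)
                    return -1;
                continue;                   /* (2) repeat the send        */
            }
            return -1;                      /* real error                 */
        }
        off += (size_t)n;                   /* (3) drop the part sent     */
    }
    return 0;
}
```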
So I clean up whatever data I've read from the BIO buffer
I don't know what this means. You're sending, not receiving.
Just to test my hypothesis, whenever the socket returned EAGAIN I changed the code to wait indefinitely until the socket was writable; then everything goes fine and I don't see any bad_record_mac at the server side.
That's exactly what you should do. I can't imagine what else you could possibly have been doing instead, and your description of it doesn't make any sense.

Is there a way to add a header to an Apache response showing how long it took to retrieve a resource?

Is there a module or a built-in function in Apache which I can use/activate to send information about how long it took to retrieve/process a resource?
For example, the resource http://dom.net/resource is accessed. The response header would include the total time the server spent waiting for the resource to be ready before sending it back to the client.
Apache doesn't really 'wait' until the resource is ready before sending the response back to you - it streams data back to the client as and when it receives it.
Depending on what you're interested in measuring, you could record the time taken for the client to receive the first byte/last byte back from Apache, or measure the time taken for Apache to receive the first byte from the (remote?) resource. The time taken for Apache to receive the entire response from the remote resource is not something you can send in the headers, as the headers will already have been sent to the client before the remote response is fully received. This information can, however, trivially be written to the Apache logs.
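For the headers-sent timing, one option (assuming Apache 2.4 with mod_headers and mod_log_config enabled) is the %D format specifier, which mod_headers documents as the time from request receipt until the response headers go out on the wire; in the access log, %D is the full time taken to serve the request:

```apache
# Time from request receipt until the response headers hit the wire,
# in microseconds, exposed to the client:
Header set X-Request-Duration "%D"

# Full time taken to serve the request, in microseconds, written to the
# access log (only available after the response completes):
LogFormat "%h %l %u %t \"%r\" %>s %b %D" timed
CustomLog logs/access_log timed
```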

How to answer an invalid ISO8583 message

Our system receives ISO 8583 messages which we decode and handle appropriately. Now we are getting invalid ISO messages in between, which our system can't handle; in fact, it sends nothing in return. This causes a timeout on the other side. As a consequence, the (invalid) transaction is reverted, which then causes quite a mess, as there is nothing to be reverted.
Can anyone give me a clue on how to deal with/answer an invalid/undecodable ISO8583 message? Is there a standard answer (e.g. 'NAK' like)?
According to the ISO 8583 spec, 6XX-class messages (or 16XX, if you're using the '93 version) are appropriate for administrative notifications. Generally, a 644 or 1644 MTI is prescribed for notifying the sender of a problem processing a message, where
X6XX - Indicates an administrative message, often containing the details of a failure
XX4X - Indicates that the message is a notification; the sender is not to repeat the message
XXX4 - Indicates the source of the message (acquirer/issuer/other); here, it's Other
Putting it all together, your message should have at least the following fields:
MTI: 1644
DE-24 (Function code): 650 (Unable to parse message)
Of course, you should include the standard message identification fields: DE-7, 11, 12, 39. These fields are necessary for the message sender to match your response with the request.
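As a hedged sketch of what that notification carries (iso_msg and iso_set are invented stand-ins for whatever ISO 8583 library you use, e.g. jPOS; real bitmap packing is omitted):

```c
#include <string.h>

struct iso_msg { char field[129][64]; };   /* field[0] holds the MTI */

static void iso_set(struct iso_msg *m, int de, const char *val)
{
    strncpy(m->field[de], val, sizeof m->field[de] - 1);
    m->field[de][sizeof m->field[de] - 1] = '\0';
}

/* Build the administrative notification for a request we could not parse.
 * DE-7/11/12 are copied from whatever could still be recovered from the
 * request, so the sender can match this notification to it. */
static void build_parse_failure(struct iso_msg *rsp, const struct iso_msg *req)
{
    memset(rsp, 0, sizeof *rsp);
    iso_set(rsp, 0,  "1644");          /* MTI: administrative notification */
    iso_set(rsp, 24, "650");           /* DE-24: unable to parse message   */
    iso_set(rsp, 7,  req->field[7]);   /* transmission date and time       */
    iso_set(rsp, 11, req->field[11]);  /* system trace audit number (STAN) */
    iso_set(rsp, 12, req->field[12]);  /* local transaction time           */
}
```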
I don't think there is a standard way of handling invalid ISO 8583 request messages. You didn't say why you are receiving invalid request messages, and without knowing that it is difficult to suggest how you should handle them.
Depending on the situation it may be best not to answer invalid ISO 8583 requests. In fact I know of systems that not only don't answer invalid request messages but will also blacklist the device that sent the invalid message and refuse to answer all other messages from it.
If you do decide not to respond to invalid request messages, then, as you have found out, the client is likely to time out and then attempt to reverse the transaction. This is not usually a problem, because servers will usually respond with an approval message to a reversal request for a transaction that doesn't exist. Remember that when a client times out after sending a request, it doesn't know whether the request was processed or even received. So a server has to be prepared to handle both: 1. a request that was received and processed (by undoing the transaction and then responding with an approval), and 2. a request that was never received/processed (by responding with an approval). NOTE: in case 2 there is no need to undo the transaction, because the transaction never took place.
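A small sketch of that rule (hypothetical in-memory store and names; the shape matters, not the details):

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical in-memory transaction store, just enough for the sketch. */
struct txn { char stan[7]; int undone; };
static struct txn store[1000];
static size_t store_len;

static struct txn *txn_lookup(const char *stan)
{
    for (size_t i = 0; i < store_len; i++)
        if (strcmp(store[i].stan, stan) == 0)
            return &store[i];
    return NULL;
}

/* Handle a reversal request referencing an original transaction by STAN.
 * Case 1: the original was received and processed -> undo it, approve.
 * Case 2: the original was never received -> nothing to undo, approve. */
static int handle_reversal(const char *original_stan)
{
    struct txn *t = txn_lookup(original_stan);
    if (t != NULL)
        t->undone = 1;   /* case 1: undo the transaction            */
    return 1;            /* approve either way (1 = approved here)  */
}
```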
From my experience with integrating ISO links, invalid ISO messages are usually, by industry standard, handled by the acquiring host dropping the connection, followed by an angry mail from the acquirer's service provider accusing you of segfaulting their mainframe.
Other than that, different implementations, when implemented well, will handle invalid messages differently: from what #kolossus said when the parser fails completely, to a normal **10 response with a specific response code, such as RC 12 "Invalid transaction", when just some subfields don't make sense (such as problems with the packaging of complex subfields with tokens, track2 parsing, etc.).
The practical reason why #kolossus's solution doesn't really make sense, and why Stuard has a point, is that if the client has trouble forming ISO messages, it almost certainly has trouble parsing them too, so another ISO message doesn't really tell the client anything except provoking a parser exception on its side as well.
The end result will be the same - a technical reversal by the client, just not after a timeout. Basically, with ISO 8583, the best way to handle invalid messages is not to have them; there's no clean way.

Why does the TLS heartbeat extension allow user supplied data?

The heartbeat protocol requires the other end to reply with the same data that was sent to it, to know that the other end is alive. Wouldn't sending a certain fixed message be simpler? Is it to prevent some kind of attack?
At least the size of the packet seems to be relevant, because according to RFC 6520, 5.1, the heartbeat message will be used with DTLS (e.g. TLS over UDP) for PMTU discovery, in which case it needs messages of different sizes. Apart from that, it might simply be modelled after ICMP ping, where you can also specify the payload content for no particular reason.
Just like with ICMP Ping, the idea is to ensure you can match up a "pong" heartbeat response you received with whichever "ping" heartbeat request you made. Some packets may get lost or arrive out of order and if you send the requests fast enough and all the response contents are the same, there's no way to tell which of your requests were answered.
One might think, "WHO CARES? I just got a response; therefore, the other side is alive and well, ready to do my bidding :D!" But what if the response was actually for a heartbeat request 10 minutes ago (an extreme case, maybe due to the server being overloaded)? If you just sent another heartbeat request a few seconds ago and the expected responses are the same for all (a "fixed message"), then you would have no way to tell the difference.
A timely response is important in determining the health of the connection. From RFC6520 page 3:
... after a number of retransmissions without receiving a corresponding HeartbeatResponse message having the expected payload, the DTLS connection SHOULD be terminated.
By allowing the requester to specify the return payload (and assuming the requester always generates a unique payload), the requester can match up a heartbeat response to a particular heartbeat request made, and therefore be able to calculate the round-trip time, expiring the connection if appropriate.
This of course only really makes sense if you are using TLS over a non-reliable protocol like UDP instead of TCP.
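As an illustration of that matching (a sketch with invented names, not code from any TLS library): each outstanding request records a random nonce used as its payload and the time it was sent; a response is paired back by nonce to compute the round-trip time.

```c
#include <stdint.h>
#include <time.h>

#define PENDING 32

struct pending_hb {
    uint64_t nonce;        /* payload we sent (0 = slot free) */
    struct timespec sent;  /* when we sent it                 */
};

static struct pending_hb table[PENDING];

/* Record an outgoing heartbeat request carrying this unique nonce. */
static void hb_sent(uint64_t nonce)
{
    for (int i = 0; i < PENDING; i++)
        if (table[i].nonce == 0) {
            table[i].nonce = nonce;
            clock_gettime(CLOCK_MONOTONIC, &table[i].sent);
            return;
        }
}

/* Match a response payload to its request. Returns the RTT in milliseconds,
 * or -1 if the payload matches no outstanding request (late or duplicated
 * response); with a fixed payload this distinction would be impossible. */
static double hb_received(uint64_t nonce)
{
    struct timespec now;
    clock_gettime(CLOCK_MONOTONIC, &now);
    for (int i = 0; i < PENDING; i++)
        if (table[i].nonce == nonce) {
            double ms = (now.tv_sec - table[i].sent.tv_sec) * 1e3
                      + (now.tv_nsec - table[i].sent.tv_nsec) / 1e6;
            table[i].nonce = 0;
            return ms;
        }
    return -1.0;
}
```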
So why allow the requester to specify the length of the payload? Couldn't it be inferred?
See this excellent answer: https://security.stackexchange.com/a/55608/44094
... seems to be part of an attempt at genericity and coherence. In the SSL/TLS standard, all messages follow regular encoding rules, using a specific presentation language. No part of the protocol "infers" length from the record length.
One gain of not inferring length from the outer structure is that it makes it much easier to include optional extensions afterwards. This was done with ClientHello messages, for instance.
In short, YES, it could have been inferred, but for consistency with the existing format and for future-proofing, the size is spec'd out so that other data can follow in the same message.

How to wait for an entire buffer to arrive in an SSL connection

I am implementing a client-server program in which the client sends HTTP messages to the server. It can be either HTTP or HTTPS.
In the case of large messages, like a file transfer over HTTP, the client sends the whole message in one go, whereas it reaches the server in multiple fragments (the network does this). I wait for the entire message to arrive, merging the fragments so that I get the whole message. The content length is found using a parameter I send in the HTTP message.
But in the case of HTTPS there is no way to know whether the entire message has arrived.
If I decrypt a fragment, it returns junk values. I think that is because the whole encrypted message must be joined before decrypting it.
How is it possible to identify whether the entire message has arrived over HTTPS?
I am using the SSL library and Windows sockets.
SSL encrypts plain data into blocks and then those blocks are transmitted individually to the other party. The receiver needs to read the raw socket data and pump it into the SSL decryption engine as it arrives. When the engine has enough bytes for a given block, it decrypts that block and outputs the plain data for just that block.

So you simply keep reading socket data and pumping it into the decryption engine, buffering whatever plain data is output, until you encounter a decrypted <CRLF><CRLF> sequence denoting the end of the HTTP message headers. Then you process those headers to determine whether an HTTP message body is present and how it is encoded. If a message body is present, keep reading socket data, pumping it into the decryption engine, and buffering the output plain data, until you encounter the end of the message body.

RFC 2616 Section 4.4 - "Message Length" describes how to determine the encoding of the HTTP message body (after decryption is applied) and what condition terminates the message body.
In other words, you are not supposed to look for the end of an encrypted socket message. You are supposed to decrypt everything you receive until you detect the end of the decrypted HTTP message.
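A hedged sketch of that pumping loop with OpenSSL memory BIOs ("ssl" is assumed to be a connected SSL* whose read BIO is the memory BIO "rbio"; pump() is an invented name and error paths are abbreviated):

```c
#include <openssl/ssl.h>
#include <string.h>

/* Feed one chunk of ciphertext received from the socket into the engine and
 * append any plaintext produced to plain/plain_len (caller-owned buffer of
 * capacity cap). Returns 1 once "\r\n\r\n" (end of the HTTP headers) shows
 * up in the DECRYPTED stream, 0 if more data is needed, -1 on error. */
static int pump(SSL *ssl, BIO *rbio,
                const unsigned char *cipher, int cipher_len,
                char *plain, size_t cap, size_t *plain_len)
{
    BIO_write(rbio, cipher, cipher_len);       /* raw bytes -> engine        */

    for (;;) {                                 /* drain all complete records */
        int n = SSL_read(ssl, plain + *plain_len,
                         (int)(cap - *plain_len - 1));
        if (n <= 0) {
            /* WANT_READ: the current record is incomplete, wait for more  */
            if (SSL_get_error(ssl, n) == SSL_ERROR_WANT_READ)
                break;
            return -1;                         /* real error (abbreviated)  */
        }
        *plain_len += (size_t)n;
    }
    plain[*plain_len] = '\0';
    return strstr(plain, "\r\n\r\n") ? 1 : 0;  /* headers complete yet?     */
}
```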