Difference between TTL and Keep alive

Can anyone tell me the difference between TTL and keep-alive in sockets (C# networking), and also linger? Thanks in advance.

TTL tells a packet how many routers it can pass through before it is discarded, while keep-alive tells the connection to send periodic probes when there is no activity, so that the connection stays open and a dead peer is detected.
From what I read about linger, I don't see how it differs from keep-alive; I may be missing something here.
EDIT: The linger option concerns closing the socket: it tells the socket to wait for some amount of time to finish transmitting any data that is still in its outgoing buffer. From this page, we read that
There may still be data available in the outgoing network buffer after you close the Socket. If you want to specify the amount of time that the Socket will attempt to transmit unsent data after closing, create a LingerOption with the enabled parameter set to true, and the seconds parameter set to the desired amount of time. The seconds parameter is used to indicate how long you would like the Socket to remain connected before timing out. If you do not want the Socket to stay connected for any length of time after closing, create a LingerOption with the enabled parameter set to false. In this case, the Socket will close immediately and any unsent data will be lost. Once created, pass the LingerOption to the Socket.SetSocketOption method. If you are sending and receiving data with a TcpClient, then set the TcpClient.LingerState property.

Time to live (TTL) is the number of network devices (hops, such as routers) a packet may cross before it is discarded. Keep-alive time is how long the socket stays open when no data is being sent or received.
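
A C# sketch for these two options, TTL and keep-alive (illustrative only; the values are arbitrary):

    using System.Net.Sockets;

    class TtlKeepAliveDemo
    {
        static void Main()
        {
            // Placeholder endpoint, for illustration only.
            var client = new TcpClient("example.com", 80);
            Socket socket = client.Client;

            // TTL: each router that forwards a packet decrements this value;
            // when it reaches zero the packet is discarded.
            socket.Ttl = 64;

            // Keep-alive: let TCP send periodic probes on an idle connection
            // so a dead peer is eventually detected.
            socket.SetSocketOption(SocketOptionLevel.Socket,
                                   SocketOptionName.KeepAlive, true);

            client.Close();
        }
    }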

Related

BitMEX WebSocket connection drops/disconnects despite keep-alive, burning through connection pool

Every once in a while BitMEX disconnects our WebSocket connection, which forces us to reconnect. However, they only provide a pool of 40 connections per hour. In times of low volatility this is not a problem at all, but as soon as trading activity goes up we run through those 40 connections in no time, eventually leaving our connection dead.
We do have a keep-alive but it does not solve the problem at all.
We haven't seen any specifics in the API documentation regarding how to deal with this problem, or the specific reasons we get so many close opcodes whenever volatility rises.
Does anyone know if we are doing something wrong?
EDIT: heartbeat is also in place
I suggest implementing heartbeats as per https://www.bitmex.com/app/wsAPI#Heartbeats
In general, WebSocket connections can drop if the connection remains idle for too long without transmitting any data.
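A rough C# sketch of such a client-side heartbeat with ClientWebSocket (this assumes the raw "ping"/"pong" scheme described on the linked BitMEX page; the URL and interval are illustrative, and a real client would read the "pong" replies and data frames in a separate receive loop):

    using System;
    using System.Net.WebSockets;
    using System.Text;
    using System.Threading;
    using System.Threading.Tasks;

    class HeartbeatDemo
    {
        static async Task Main()
        {
            using var ws = new ClientWebSocket();
            await ws.ConnectAsync(new Uri("wss://www.bitmex.com/realtime"),
                                  CancellationToken.None);

            // Send "ping" periodically so the connection never sits idle long
            // enough for the server to consider it dead.
            var ping = Encoding.UTF8.GetBytes("ping");
            while (ws.State == WebSocketState.Open)
            {
                await ws.SendAsync(new ArraySegment<byte>(ping),
                                   WebSocketMessageType.Text,
                                   endOfMessage: true,
                                   CancellationToken.None);
                await Task.Delay(TimeSpan.FromSeconds(5));
            }
        }
    }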

Load test on a dummy server succeeds after setting Time to Live (TTL) in HttpConnPool, but what does it do?

What does the Time to Live (TTL) variable in HttpConnPool.class from the package org.apache.http.impl.conn do?
I was running some load tests against a dummy server. When I pushed close to 9 requests per second, I got random NoHttpResponseException errors: "target failed to respond" or "dummy server failed to respond".
Then I added a property called "TTL" or "TimeToLive" and gave it a value, and the NoHttpResponseException stopped occurring. I would like to know what this variable does that prevents the NoHttpResponseException from arising in the first place.
Actually I have figured out the answer myself.
In my load testing we initially got "NoHttpResponseException: target server #Somelink:PortNumber failed to respond" because HttpClient maintains persistent connections, i.e. one and the same connection is reused to send multiple requests, which is more efficient. There is an evictor thread, configured with a certain idle timeout, that removes a connection once it has been idle for that long. In production a connection can easily sit idle, since we do not have traffic all the time. During a load test, however, the connections are never idle because we keep sending requests to the server, so they are never evicted, and the TTL property was left at its default value of -1, which means infinite (that is the default for my application; it depends on the value the developer sets).
TTL is the property that defines how long a connection may stay alive, regardless of whether it is idle or not. If it is set to -1, the connection remains alive forever, or at least until the remote server closes it. The remote server usually closes connections after a certain time; no server keeps a connection open forever, so eventually a new connection has to be established.
When the remote server closes the connection on its side, our side still assumes the connection is established, sends a request on the stale connection, and gets nothing back; hence the NoHttpResponseException, i.e. "the target server failed to respond". Setting the TTL property ensures that every persistent connection is discarded after that lifetime, whether idle or not, so we always work with a reasonably fresh connection and the NoHttpResponseException is prevented.
I hope this helps.
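
For readers on the .NET side, the analogous knob (not the Apache HttpClient setting from the question) is SocketsHttpHandler.PooledConnectionLifetime, which likewise recycles pooled connections after a fixed lifetime whether they are idle or not; a minimal sketch:

    using System;
    using System.Net.Http;

    class PooledLifetimeDemo
    {
        static void Main()
        {
            var handler = new SocketsHttpHandler
            {
                // Recycle every pooled connection after 5 minutes, idle or not,
                // so we never keep sending on a connection the server has
                // silently closed on its side.
                PooledConnectionLifetime = TimeSpan.FromMinutes(5)
            };
            var httpClient = new HttpClient(handler);
            // ... use httpClient as usual ...
        }
    }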

Is it possible to lose events when using long-polling to retrieve real time notifications?

When subscribing to real-time notifications, I go through the normal handshake, subscribe, connect flow.
Once the connection returns with events, I reconnect and wait for the next response to return. My question is:
If events are generated between the first response and the next reconnect, could they be lost?
As an example: a synchronous application which processes the returned response data after it arrives, and only reconnects once that processing has finished, could cause a significant delay between the response and the next reconnect. Are the Cumulocity events generated during that delay buffered in the real-time queue for that particular client ID, or are they just lost?
Another possible example is when the client ID is no longer valid (this seems to happen every day at midnight): I have to resubscribe, causing a period of time during which no one is subscribed.
The client ID that you receive when handshaking is connected to a queue on the server side. That queue keeps all notifications that you are not able to receive until the next connect. It delivers them when you reconnect. (Try it out with Postman: After a connect returns, send a couple of events, then connect again. You will notice that you will get all events at once.)
However, as you noticed, the queue is not kept forever. If you are not able to reconnect within two hours (I believe), the queue is thrown away in order to not block server resources. This is what you noticed. In that case, you need to query the database to determine any missed events (e.g., poll any operations in pending state from devices).
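To keep that window small in practice, reconnect immediately and push the processing of each response off the polling loop. Below is a rough C# sketch of that pattern; the endpoint, payload shape and clientId handling are placeholders rather than the exact Cumulocity API:

    using System;
    using System.Net.Http;
    using System.Text;
    using System.Threading.Tasks;

    class LongPollLoop
    {
        static readonly HttpClient Http = new HttpClient();

        static async Task Main()
        {
            // Placeholder endpoint and connect message; the real API uses a
            // Bayeux-style handshake/subscribe/connect flow.
            var url = "https://example.cumulocity.com/notification/realtime";
            var clientId = "<client id obtained from the handshake>";

            while (true)
            {
                var connectMsg =
                    "[{\"channel\":\"/meta/connect\",\"clientId\":\"" + clientId + "\"}]";
                var response = await Http.PostAsync(url,
                    new StringContent(connectMsg, Encoding.UTF8, "application/json"));
                var body = await response.Content.ReadAsStringAsync();

                // Hand the returned events to a background task and reconnect
                // right away, so the server-side queue never has to buffer
                // events for long while we are busy processing.
                _ = Task.Run(() => ProcessEvents(body));
            }
        }

        static void ProcessEvents(string body)
        {
            // Placeholder for the application's event handling.
            Console.WriteLine(body);
        }
    }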

When do the events emitted by Port get emitted? And what do they mean?

As far as I can tell there are 7 events dispatched by a NoFlo port: attach, connect, begingroup, data, endgroup, disconnect, detach.
To me some of these events sound very similar such as attach + connect, and disconnect + detach. What is the difference?
What does begingroup and endgroup mean?
When do these events get emitted and when are they generally used?
I've seen the documentation at: http://noflojs.org/documentation/components/#portevents
Would I be correct to assume that attach and detach are for handling NoFlo UI cases, e.g. changing how a component's state is displayed?
Another assumption would be that connect gets fired every time before data is sent, then data gets fired, then disconnect? Seems a bit odd to me...
I'm completely in the dark when it comes to groups.
attach and detach happen when the NoFlo Network attaches (or removes) a socket to the port. So usually they happen at network start-up time, before IIPs get sent.
The exception to this is when you're live-editing the graph with a tool like Flowhub. In that situation attach/detach can happen whenever you connect or remove wires.
Most components don't need to care about the attachment events.
connect happens before the upstream connection sends data, and disconnect when the upstream connection says that it has sent everything it is intending to send. So in effect they're beginning of transmission and end of transmission events. An upstream component may choose to connect again after a disconnect if it has a new batch of data to send.
data is the event for actual payload-containing packets.
begingroup and endgroup are the "bracket IPs" containing metadata about the data being sent. They can be used for creating tree structures with packet data.
For example, filesystem/ReadFile will send the file contents as a data packet, but the filename is sent via a bracket IP, using begingroup/endgroup packets around the actual file contents.
The noflo-groups library provides lots of components for utilizing group information for synchronization, routing, etc.

OpenSSL SSL_ERROR_WANT_WRITE never recovers during SSL_write()

I have two applications talking to each other over SSL. The client runs on a Windows machine; the server is a Linux-based application. On startup the client sends a large amount of data to the server in ~4000-byte chunks, each containing 30 entries; about 50,000 entries have to be sent in total.
During that transmission the server sends a message of its own to the client, roughly 4000 bytes in size. After that happens, SSL_write() on the client side begins to return SSL_ERROR_WANT_WRITE. The client sleeps for 10 ms and retries the SSL_write with the exact same parameters, but it keeps failing indefinitely and the client eventually aborts. If it then tries to send a new message, I get an error indicating that I am not retrying the same aborted message from earlier:
error:1409F07F:SSL routines:SSL3_WRITE_PENDING: bad write retry
The server eventually kills the connection because it has not heard from the client for 60 s, and a new one is established. That is just an FYI; the real issue is how I can get SSL_write to resume.
If the server does not send a request while it is receiving, the problem goes away. If I shrink the size of the request from 16 KB to 100 bytes, the problem does not happen either.
The SSL CTX MODE is set to SSL_MODE_AUTO_RETRY and SSL_MODE_ACCEPT_MOVING_WRITE_BUFFER.
Does anyone have an idea why a simultaneous transmission of large payloads from both sides can cause this failure? If this is a limitation, what can I do to prevent it, other than capping the size of what goes out from the server to the client? My concern is that if the client is not sending anything, the throttling I applied to avoid this issue is a waste.
On the client side I tried to perform an SSL_read to see whether I need to read during the write, even though I never receive SSL_ERROR_WANT_READ; in any case the receive buffer is not that big, ~1000 bytes in size.
Any insight on this would be appreciated.
SSL_ERROR_WANT_WRITE - this error is returned by OpenSSL (I am assuming you are using OpenSSL) only when the socket send gives it an EWOULDBLOCK or EAGAIN error. The socket send gives EWOULDBLOCK when the send-side buffer is full, which in turn means that your server is not reading the messages sent by the client fast enough.
So, essentially, the problem lies with your server, which is not reading the messages sent to it. You need to check your server and fix that, which will automatically fix your client problem.
Also, why have you set the option SSL_MODE_ACCEPT_MOVING_WRITE_BUFFER? SSL always expects that the record it is trying to send is transmitted completely before the next record can be sent.
As it turns out, in both the client and the server application the reads and writes are processed on a single thread. In the perfect storm described above, the client is busy writing (non-blocking). The server then decides to write a large set of messages of its own in between processing its rx buffers. The server's tx is a blocking call, so the server gets stuck writing, starves its reads, the buffers on both sides fill up, and we have a deadlock.
The default Windows socket buffer is 8 KB, so it doesn't take much to fill it up.
The architecture should be such that there is a separate thread for rx and tx processing on both sides. As a short-term fix, one can increase the rx buffers and rate-limit the tx side to prevent the deadlock.
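
The same idea expressed in C# rather than OpenSSL terms (a sketch only, with a placeholder endpoint): keep the receive loop on its own task so that a large write can never starve the reads.

    using System;
    using System.Net.Security;
    using System.Net.Sockets;
    using System.Threading.Tasks;

    class SeparateRxTx
    {
        static async Task Main()
        {
            // Placeholder endpoint, for illustration only.
            var tcp = new TcpClient("example.com", 443);
            var ssl = new SslStream(tcp.GetStream());
            await ssl.AuthenticateAsClientAsync("example.com");

            // Dedicated receive loop: keeps draining the socket even while the
            // main task is busy sending, so the peer's writes are consumed and
            // neither side's buffers can fill up and deadlock the connection.
            var rx = Task.Run(async () =>
            {
                var buffer = new byte[16 * 1024];
                int read;
                while ((read = await ssl.ReadAsync(buffer, 0, buffer.Length)) > 0)
                {
                    // ... handle the received bytes ...
                }
            });

            // Transmit loop (stand-in payload; a real app would send its data here).
            var payload = new byte[4000];
            for (int i = 0; i < 10; i++)
            {
                await ssl.WriteAsync(payload, 0, payload.Length);
            }

            // In a real application you would coordinate shutdown here.
            await rx;
        }
    }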