Here's an interesting problem I've been facing since migrating from Heroku to Google Container Engine:
Since moving to GCE, a few hours after a server start/restart/deploy, out of nowhere, my Elixir application can no longer deliver push notifications to APNS. I'm using the apns4ex library. Here is roughly what I've found out so far:
Internally, on init, the library opens an :ssl (Erlang) socket to APNS and keeps recycling it inside a GenServer process:
def connect_socket(host, port, opts, timeout_seconds) do
  address = "#{host}:#{port}"

  case :ssl.connect(host, port, opts, timeout_seconds * 1000) do
    {:ok, socket} ->
      APNS.Logger.debug("successfully connected to #{address}")
      {:ok, socket}

    {:error, reason} ->
      APNS.Logger.error("failed to connect to push socket #{address}, reason given: #{inspect(reason)}")
      {:error, {:connection_failed, address}}
  end
end
Now, from some hour X onwards, after attempting to send a message, the library starts receiving the :ssl_closed message/callback, indicating that the SSL connection was closed:
def handle_info({:ssl_closed, socket}, %{socket_apple: socket} = state) do
  APNS.Logger.debug("ssl socket closed, returning :connect")
  {:connect, {:error, "ssl_closed"}, %{state | socket_apple: nil}}
end
The way it handles this is to simply let the connection close and return :connect, which will then re-connect to APNS (here).
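Judging by that {:connect, ...} return value, the worker appears to use the Connection behaviour, so the reconnect presumably boils down to a connect/2 callback along these lines (a rough sketch with assumed field names, not the library's actual code):

def connect(_info, %{config: config} = state) do
  case connect_socket(config.apple_host, config.apple_port, config.ssl_opts, config.timeout) do
    {:ok, socket} ->
      {:ok, %{state | socket_apple: socket}}

    {:error, _reason} ->
      # back off for a second and let Connection call connect/2 again
      {:backoff, 1_000, state}
  end
end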
Once push notifications stop working, the debug log always shows the following pattern for every message:
1. Attempt to send the message
2. Report "success sending" (nothing is actually delivered to the phones; this message is caused by :ssl.send returning :ok)
3. Receive an ssl socket close message
4. Reconnect to gateway.push.apple.com (:ssl.connect returns :ok)
5. Repeat
send_package code:
def send_package(socket, packet) do
  result = :ssl.send(socket, [packet])

  case result do
    :ok ->
      APNS.Logger.debug("success sending ssl package")

    {:error, reason} ->
      APNS.Logger.warn("error #{reason} sending ssl package")
  end

  result
end
In contrast, when sending actually works, the pattern stops at point 2.
Here is some raw log output from my app when sending a push (notice the last nine lines, which show the pattern I described):
01:41:14.820 request_id=fecds3h3s1so2825c44qfestvvvpv707 [debug] [APNS] #PID<0.20135.97> 23303051:1ad798 sending in poolboy transaction :myapp
01:41:14.821 request_id=fecds3h3s1so2825c44qfestvvvpv707 [debug] [APNS] #PID<0.20135.97> 23303051:1ad798 sending message
01:41:14.821 request_id=fecds3h3s1so2825c44qfestvvvpv707 [debug] [APNS] #PID<0.20135.97> 62064556:b12e98 sending in poolboy transaction :myapp
01:41:14.821 [debug] [APNS] #PID<0.349.0> 23303051:1ad798 handling cast :send
01:41:14.821 [debug] [APNS] #PID<0.349.0> 23303051:1ad798 message's payload looks good
01:41:14.821 request_id=fecds3h3s1so2825c44qfestvvvpv707 [debug] [APNS] #PID<0.20135.97> 62064556:b12e98 sending message
01:41:14.821 request_id=fecds3h3s1so2825c44qfestvvvpv707 [debug] [APNS] #PID<0.20135.97> 19048099:b3ed8e sending in poolboy transaction :myapp
01:41:14.822 [debug] [APNS] #PID<0.349.0> success sending ssl package
01:41:14.822 [debug] [APNS] #PID<0.349.0> 23303051:1ad798 success sending
01:41:14.822 [debug] [APNS] #PID<0.349.0> 23303051:1ad798 handle call :send received :ok
01:41:14.822 [debug] [APNS] #PID<0.348.0> 62064556:b12e98 handling cast :send
01:41:14.822 [debug] [APNS] #PID<0.348.0> 62064556:b12e98 message's payload looks good
01:41:14.823 request_id=fecds3h3s1so2825c44qfestvvvpv707 [debug] [APNS] #PID<0.20135.97> 19048099:b3ed8e sending message
01:41:14.823 request_id=fecds3h3s1so2825c44qfestvvvpv707 [info] Sent 200 in 22ms
01:41:14.823 [debug] [APNS] #PID<0.348.0> success sending ssl package
01:41:14.823 [debug] [APNS] #PID<0.348.0> 62064556:b12e98 success sending
01:41:14.823 [debug] [APNS] #PID<0.348.0> 62064556:b12e98 handle call :send received :ok
01:41:14.823 [debug] [APNS] #PID<0.347.0> 19048099:b3ed8e handling cast :send
01:41:14.824 [debug] [APNS] #PID<0.347.0> 19048099:b3ed8e message's payload looks good
01:41:14.824 [debug] [APNS] #PID<0.347.0> success sending ssl package
01:41:14.824 [debug] [APNS] #PID<0.347.0> 19048099:b3ed8e success sending
01:41:14.824 [debug] [APNS] #PID<0.347.0> 19048099:b3ed8e handle call :send received :ok
01:41:15.027 [debug] [APNS] #PID<0.348.0> ssl socket closed, returning :connect
01:41:15.029 [debug] [APNS] #PID<0.347.0> ssl socket closed, returning :connect
01:41:15.043 [debug] [APNS] #PID<0.349.0> ssl socket closed, returning :connect
01:41:15.207 [debug] [APNS] #PID<0.348.0> successfully connected to gateway.push.apple.com:2195
01:41:15.207 [debug] [APNS] #PID<0.348.0> successfully connected to socket
01:41:15.209 [debug] [APNS] #PID<0.347.0> successfully connected to gateway.push.apple.com:2195
01:41:15.209 [debug] [APNS] #PID<0.347.0> successfully connected to socket
01:41:15.214 [debug] [APNS] #PID<0.349.0> successfully connected to gateway.push.apple.com:2195
01:41:15.214 [debug] [APNS] #PID<0.349.0> successfully connected to socket
One theory is that GCE is closing the connection for being idle, but that doesn't explain why the very next message after a reconnect immediately results in the same pattern. Also, why does the socket only close after sending with :ssl.send?
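For what it's worth, in the legacy binary APNS protocol Apple writes a 6-byte error-response frame (command 8, a status byte and the 4-byte message identifier) just before it closes the socket, so logging that frame would show why the connection is being dropped. A rough diagnostic sketch, assuming the socket is opened in :binary, active mode (this is not apns4ex code, just an idea):

def handle_info({:ssl, socket, <<8, status, identifier::32>>}, %{socket_apple: socket} = state) do
  # In the legacy protocol, status 8 means "invalid device token" and 10 means "shutdown".
  APNS.Logger.error("APNS error response: status=#{status} message_id=#{identifier}")
  {:noreply, state}
end

If a frame like that shows up, the closes are Apple rejecting the notification (for example a token from the wrong environment) rather than GCE killing an idle connection.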
I had the same issue with apns4erl, where the socket closed after trying to send a message, but the problem was on my side. I don't remember exactly which it was, but it was either the wrong certificate file or malformed messages.
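If you want to rule out the certificate, one quick manual check is to open a connection by hand with the same :ssl options and push a single known-good notification; if the only reply is {:ssl_closed, _}, APNS rejected the frame. A rough sketch for iex (certificate paths, token and payload are placeholders):

opts = [:binary, active: true, certfile: 'apns_prod_cert.pem', keyfile: 'apns_prod_key.pem']
{:ok, socket} = :ssl.connect('gateway.push.apple.com', 2195, opts, 30_000)

# Replace with a real 64-character hex device token from your app.
token = Base.decode16!(String.duplicate("A", 64))
payload = ~s({"aps":{"alert":"test"}})

# Legacy "simple" notification frame: command 0, token length, token, payload length, payload.
frame = <<0, 32::16, token::binary, byte_size(payload)::16, payload::binary>>
:ok = :ssl.send(socket, frame)

# With active: true, any error frame or close arrives as a message.
receive do
  msg -> IO.inspect(msg, label: "apns reply")
after
  5_000 -> IO.puts("no reply within 5s, the frame was accepted")
end

:ssl.close(socket)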
Related
I have an ASP.NET Core application using a SignalR hub. When running via a console application (development mode), no keep-alive requests are sent by the server to the client. Consequently, the connection is re-established every 30 seconds or so.
However, when running the same application via Service Fabric, keep-alive requests are sent and everything works as expected.
Here are the server logs when running under the console app:
dbug: Microsoft.AspNetCore.Http.Connections.Internal.HttpConnectionManager[1]
New connection T2NKQg0jyrm7QAPz4p0ZWA created.
dbug: Microsoft.AspNetCore.Http.Connections.Internal.HttpConnectionDispatcher[4]
Establishing new connection.
dbug: Microsoft.AspNetCore.SignalR.HubConnectionHandler[5]
OnConnectedAsync started.
dbug: Microsoft.AspNetCore.Http.Connections.Internal.Transports.WebSocketsTransport[1]
Socket opened using Sub-Protocol: '(null)'.
trce: Microsoft.AspNetCore.Http.Connections.Internal.Transports.WebSocketsTransport[9]
Message received. Type: Text, size: 32, EndOfMessage: True.
dbug: Microsoft.AspNetCore.SignalR.Internal.DefaultHubProtocolResolver[2]
Found protocol implementation for requested protocol: json.
dbug: Microsoft.AspNetCore.SignalR.HubConnectionContext[1]
Completed connection handshake. Using HubProtocol 'json'.
trce: Microsoft.AspNetCore.Http.Connections.Internal.Transports.WebSocketsTransport[11]
Sending payload: 3 bytes.
trce: Microsoft.AspNetCore.Http.Connections.Internal.Transports.WebSocketsTransport[9]
Message received. Type: Text, size: 11, EndOfMessage: True.
trce: Microsoft.AspNetCore.Http.Connections.Internal.Transports.WebSocketsTransport[9]
Message received. Type: Text, size: 11, EndOfMessage: True.
dbug: Microsoft.AspNetCore.Http.Connections.Internal.Transports.WebSocketsTransport[4]
Waiting for the application to finish sending data.
dbug: Microsoft.AspNetCore.Http.Connections.Internal.Transports.WebSocketsTransport[2]
Socket closed.
trce: Microsoft.AspNetCore.Http.Connections.Internal.HttpConnectionContext[1]
Disposing connection T2NKQg0jyrm7QAPz4p0ZWA.
trce: Microsoft.AspNetCore.Http.Connections.Internal.HttpConnectionContext[2]
Waiting for application to complete.
dbug: Microsoft.AspNetCore.SignalR.HubConnectionHandler[6]
OnConnectedAsync ending.
trce: Microsoft.AspNetCore.Http.Connections.Internal.HttpConnectionContext[3]
Application complete.
dbug: Microsoft.AspNetCore.Http.Connections.Internal.HttpConnectionManager[2]
Removing connection T2NKQg0jyrm7QAPz4p0ZWA from the list of connections.
dbug: Microsoft.AspNetCore.Http.Connections.Internal.HttpConnectionManager[1]
New connection JW_1AnoGvhvNJ6MdWGb5RA created.
dbug: Microsoft.AspNetCore.Http.Connections.Internal.HttpConnectionDispatcher[4]
Establishing new connection.
dbug: Microsoft.AspNetCore.SignalR.HubConnectionHandler[5]
OnConnectedAsync started.
dbug: Microsoft.AspNetCore.Http.Connections.Internal.Transports.WebSocketsTransport[1]
Socket opened using Sub-Protocol: '(null)'.
trce: Microsoft.AspNetCore.Http.Connections.Internal.Transports.WebSocketsTransport[9]
Message received. Type: Text, size: 32, EndOfMessage: True.
dbug: Microsoft.AspNetCore.SignalR.Internal.DefaultHubProtocolResolver[2]
Found protocol implementation for requested protocol: json.
dbug: Microsoft.AspNetCore.SignalR.HubConnectionContext[1]
Completed connection handshake. Using HubProtocol 'json'.
trce: Microsoft.AspNetCore.Http.Connections.Internal.Transports.WebSocketsTransport[11]
Sending payload: 3 bytes.
trce: Microsoft.AspNetCore.Http.Connections.Internal.Transports.WebSocketsTransport[9]
Message received. Type: Text, size: 11, EndOfMessage: True.
And the client logs:
Microsoft.AspNetCore.Http.Connections.Client.HttpConnection: Debug: Transport 'WebSockets' started.
Microsoft.AspNetCore.Http.Connections.Client.HttpConnection: Information: HttpConnection Started.
Microsoft.AspNetCore.SignalR.Client.HubConnection: Information: Using HubProtocol 'json v1'.
Microsoft.AspNetCore.SignalR.Client.HubConnection: Debug: Sending Hub Handshake.
Microsoft.AspNetCore.Http.Connections.Client.Internal.WebSocketsTransport: Debug: Received message from application. Payload size: 32.
Microsoft.AspNetCore.Http.Connections.Client.Internal.WebSocketsTransport: Debug: Message received. Type: Text, size: 3, EndOfMessage: True.
Microsoft.AspNetCore.SignalR.Client.HubConnection: Debug: Handshake with server complete.
Microsoft.AspNetCore.SignalR.Client.HubConnection: Debug: Receive loop starting.
Microsoft.AspNetCore.SignalR.Client.HubConnection: Debug: Sending PingMessage message.
Microsoft.AspNetCore.Http.Connections.Client.Internal.WebSocketsTransport: Debug: Received message from application. Payload size: 11.
Microsoft.AspNetCore.SignalR.Client.HubConnection: Debug: Sending PingMessage message completed.
Microsoft.AspNetCore.SignalR.Client.HubConnection: Information: HubConnection started.
Microsoft.AspNetCore.SignalR.Client.HubConnection: Trace: The HubConnection is attempting to transition from the Connecting state to the Connected state.
Microsoft.AspNetCore.SignalR.Client.HubConnection: Trace: Releasing Connection Lock in StartAsyncInner (/_/src/SignalR/clients/csharp/Client.Core/src/HubConnection.cs:280).
The thread 0x8184 has exited with code 0 (0x0).
Microsoft.AspNetCore.SignalR.Client.HubConnection: Trace: Acquired the Connection Lock in order to ping the server.
Microsoft.AspNetCore.SignalR.Client.HubConnection: Debug: Sending PingMessage message.
Microsoft.AspNetCore.SignalR.Client.HubConnection: Debug: Sending PingMessage message completed.
Microsoft.AspNetCore.SignalR.Client.HubConnection: Trace: Releasing Connection Lock in RunTimerActions (/_/src/SignalR/clients/csharp/Client.Core/src/HubConnection.cs:1881).
Microsoft.AspNetCore.Http.Connections.Client.Internal.WebSocketsTransport: Debug: Received message from application. Payload size: 11.
Microsoft.AspNetCore.SignalR.Client.HubConnection: Trace: Waiting on Connection Lock in HandleConnectionClose (/_/src/SignalR/clients/csharp/Client.Core/src/HubConnection.cs:1279).
Microsoft.AspNetCore.Http.Connections.Client.HttpConnection: Debug: Disposing HttpConnection.
Microsoft.AspNetCore.Http.Connections.Client.Internal.WebSocketsTransport: Information: Transport is stopping.
Microsoft.AspNetCore.Http.Connections.Client.Internal.WebSocketsTransport: Debug: Send loop stopped.
Microsoft.AspNetCore.Http.Connections.Client.Internal.WebSocketsTransport: Debug: Transport stopped.
Microsoft.AspNetCore.Http.Connections.Client.HttpConnection: Information: HttpConnection Disposed.
Microsoft.AspNetCore.SignalR.Client.HubConnection: Debug: Canceling all outstanding invocations.
Microsoft.AspNetCore.Http.Connections.Client.Internal.WebSocketsTransport: Debug: Receive loop canceled.
Microsoft.AspNetCore.Http.Connections.Client.Internal.WebSocketsTransport: Debug: Receive loop stopped.
Microsoft.AspNetCore.SignalR.Client.HubConnection: Trace: The HubConnection is attempting to transition from the Connected state to the Reconnecting state.
Microsoft.AspNetCore.SignalR.Client.HubConnection: Error: HubConnection reconnecting due to an error.
I won't include them here, but the logs when running under Service Fabric show that the server is correctly sending keep-alives to the client ("Sent a ping message to the client").
It might seem obvious that there is some difference in configuration between my console and Service Fabric hosts, but I've gone through it carefully and cannot see anything that would explain this. In fact, the SignalR integration differed only in that the development host configured detailed errors to be enabled, but even if I remove that the behavior remains the same.
Short of running my own build of ASP.NET Core (something I'm perhaps lazily attempting to avoid only because it was looking far from trivial to build), is there anything I might be missing that would explain this situation?
I tried a TLS connection from <10.220.17.192> to the external server (10.220.224.126) via nginx reverse proxying, but at the external server the connection goes to TIME_WAIT instead of ESTABLISHED.
From the nginx debug logs I could see "upstream disconnected". Does this mean the external server closed the connection?
2020/12/10 15:09:51 [debug] 10166#0: *11 event timer del: 4: 82382883
2020/12/10 15:09:51 [info] 10166#0: *11 proxy 10.220.17.192:50125 connected to 10.220.224.126:6515
2020/12/10 15:09:51 [debug] 10166#0: *11 malloc: 08DA8A10:16384
2020/12/10 15:09:51 [debug] 10166#0: *11 post event 08D8EFD0
2020/12/10 15:09:51 [debug] 10166#0: *11 epoll add event: fd:3 op:1 ev:80002001
2020/12/10 15:09:51 [debug] 10166#0: *11 event timer add: 3: 14400000:96778145
2020/12/10 15:09:51 [debug] 10166#0: *11 event timer: 3, old: 96778145, new: 96778145
2020/12/10 15:09:51 [debug] 10166#0: *11 delete posted event 08D8EFD0
2020/12/10 15:09:51 [debug] 10166#0: *11 SSL_read: 0
2020/12/10 15:09:51 [debug] 10166#0: *11 SSL_get_error: 5
2020/12/10 15:09:51 [debug] 10166#0: *11 peer shutdown SSL cleanly
2020/12/10 15:09:51 [debug] 10166#0: *11 posix_memalign: 08E22BE0:256 #16
2020/12/10 15:09:51 [debug] 10166#0: *11 write new buf t:0 f:0 00000000, pos 08DA8A10, size: 0 file: 0, size: 0
2020/12/10 15:09:51 [debug] 10166#0: *11 stream write filter: l:1 f:1 s:0
2020/12/10 15:09:51 [info] 10166#0: *11 upstream disconnected, bytes from/to client:0/0, bytes from/to upstream:0/0
I got the answer to the question after analysing the logs against the nginx source code.
nginx posts an event when it receives a connection closure from the external server:
2020/12/10 15:09:51 [debug] 10166#0: *11 post event 08D8EFD0
The manual page for SSL_read says, for a return value of 0:
The read operation was not successful. The reason may either be a clean shutdown due to a “close notify” alert sent by the peer (in which case the SSL_RECEIVED_SHUTDOWN flag in the ssl shutdown state is set (see SSL_shutdown(3) and SSL_set_shutdown(3)). It is also possible that the peer simply shut down the underlying transport and the shutdown is incomplete. Call SSL_get_error() with the return value to find out whether an error occurred or the connection was shut down cleanly (SSL_ERROR_ZERO_RETURN)
2020/12/10 15:09:51 [debug] 10166#0: *11 SSL_read: 0
2020/12/10 15:09:51 [debug] 10166#0: *11 SSL_get_error: 5
2020/12/10 15:09:51 [debug] 10166#0: *11 peer shutdown SSL cleanly
For the nginx log line below, this is what the manual says:
2020/12/11 09:13:06 [debug] 11489#0: *1 SSL_shutdown: 1
If the peer already sent the “close notify” alert and it was already processed implicitly inside another function (SSL_read(3)), the SSL_RECEIVED_SHUTDOWN flag is set. SSL_shutdown() will send the “close notify” alert, set the SSL_SENT_SHUTDOWN flag and will immediately return with 1. Whether SSL_RECEIVED_SHUTDOWN is already set can be checked using the SSL_get_shutdown() (see also the SSL_set_shutdown(3) call).
I have been using Zabbix for a while now. I tried to configure the Telegram media type to receive notifications, but due to some error I'm not receiving any. While testing the media type, this is the error that appears in the log. Please help me resolve this.
Media type test log
00:00:00.000 [Debug] [Telegram Webhook] URL: https://api.telegram.org/bot/sendMessage
00:00:00.000 [Debug] [Telegram Webhook] params: {"chat_id":"-xyxyxyxyxy","text":"{ALERT.SUBJECT}\n{ALERT.MESSAGE}","disable_web_page_preview":true,"disable_notification":false}
00:00:05.183 [Debug] [Telegram Webhook] HTTP code: 200
00:00:05.184 [Debug] [Telegram Webhook] notification failed: TypeError: cannot read property 'ok' of null
Eclipse Console Error:
Exception in thread "main" java.lang.NullPointerException
at org.openqa.selenium.remote.RemoteWebElement.execute(RemoteWebElement.java:279)
at org.openqa.selenium.remote.RemoteWebElement.click(RemoteWebElement.java:83)
at amazon.StartApplication.main(StartApplication.java:58)
Appium log:
{"strategy":"id","selector":"in.amazon.mShop.android.shopping:id/sign_in_button","context":"","multiple":false}}
[debug] [AndroidBootstrap] [BOOTSTRAP LOG] [debug] Got data from client: {"cmd":"action","action":"find","params":{"strategy":"id","selector":"in.amazon.mShop.android.shopping:id/sign_in_button","context":"","multiple":false}}
[debug] [AndroidBootstrap] [BOOTSTRAP LOG] [debug] Got command of type ACTION
[debug] [AndroidBootstrap] [BOOTSTRAP LOG] [debug] Got command action: find
[debug] [AndroidBootstrap] [BOOTSTRAP LOG] [debug] Finding 'in.amazon.mShop.android.shopping:id/sign_in_button' using 'ID' with the contextId: '' multiple: false
[debug] [AndroidBootstrap] [BOOTSTRAP LOG] [debug] Using: UiSelector[INSTANCE=0, RESOURCE_ID=in.amazon.mShop.android.shopping:id/sign_in_button]
[debug] [AndroidBootstrap] [BOOTSTRAP LOG] [debug] Returning result: {"status":0,"value":{"ELEMENT":"2"}}
[debug] [AndroidBootstrap] Received command result from bootstrap
[debug] [MJSONWP] Responding to client with driver.findElement() result: {"ELEMENT":"2"}
[info] [HTTP] <-- POST /wd/hub/session/fb4c547d-3a81-4b48-b6ff-cb14eb629138/element 200 28 ms - 87
Waited 60 seconds:
waited 60 seconds for a command
[debug] [AndroidDriver] Shutting down Android driver
The Appium server is shutting down after 1 minute.
The default Appium newCommandTimeout value is 60 seconds, which is why the Appium server shuts down the driver after 60 seconds. You can change the default timeout by setting it in the capabilities as below:
capabilities.setCapability(MobileCapabilityType.NEW_COMMAND_TIMEOUT, 6000);
or
capabilities.setCapability("newCommandTimeout", 6000);
Firstly, a NullPointerException in Java occurs only when the reference you are operating on is null.
So it's possible that the variable you are using here is null and you are calling a method on it, e.g.:
element.click();
Here, element may be null.
So you need to handle this situation gracefully: if element is null, don't try to click it; instead, log the error and exit.
I am using Akka.Remote to communicate between a server-side service application and multiple desktop client applications. The clients send a request message to the server (using Akka.NET) and wait for the server to reply with a response message. The client applications are transient, meaning that they often connect to the server, stay connected for some time, disconnect and then reconnect again.
The problem I encountered is that sometimes when a client disconnects from the server actor (by shutting down its ActorSystem) and then reconnects back to the server, it does not receive any replies from the server for some time. After a few minutes the communication works without any problems. I found out that this issue occurs when the server sends a reply to a client that has disconnected during the request and is no longer reachable. The server cannot deliver the response message and it somehow marks the client endpoint as invalid.
In the log (on the server side) I am getting the following messages when the client is disconnected.
[DEBUG] 2016-01-21 13:04:58.6151 received AutoReceiveMessage <Terminated>: [akka.tcp://qb#client:8090/user/qb] - ExistenceConfirmed=True ServerActor
[DEBUG] 2016-01-21 13:04:58.6550 Stopped Akka.Remote.Transport.ProtocolStateActor
[ INFO] 2016-01-21 13:04:58.6550 Quarantined address [akka.tcp://qb#client:8090] is still unreachable or has not been restarted. Keeping it quarantined. Akka.Event.DummyClassForStringSources
[DEBUG] 2016-01-21 13:04:58.6725 Stopped Akka.Remote.ReliableDeliverySupervisor
[DEBUG] 2016-01-21 13:04:58.6725 no longer watched by [akka://myservice/system/endpointManager/reliableEndpointWriter-akka.tcp%3a%2f%2fqb%40client%3a8090-2] Akka.Remote.EndpointWriter
[DEBUG] 2016-01-21 13:04:58.6725 Disassociated [akka.tcp://myservice#server:8081] <- akka.tcp://qb#client:8090 Akka.Remote.EndpointWriter
[DEBUG] 2016-01-21 13:04:58.6725 Stopped Akka.Remote.EndpointWriter
And then when the client attempts to reconnect, I get:
[DEBUG] 2016-01-21 13:05:15.5883 ConnectResponse [akka.tcp://qb#client:8090/user/qb] ServerActor
[DEBUG] 2016-01-21 13:05:16.0467 Started (Akka.Remote.Transport.ProtocolStateActor) Akka.Remote.Transport.ProtocolStateActor
[DEBUG] 2016-01-21 13:05:16.0467 Stopped Akka.Remote.Transport.ProtocolStateActor
[ WARN] 2016-01-21 13:05:16.0467 AssociationError [akka.tcp://myservice#server:8081] -> akka.tcp://qb#client:8090: Error [Invalid address: akka.tcp://qb#client:8090] [] Akka.Remote.EndpointWriter
[ INFO] 2016-01-21 13:05:16.0467 Quarantined address [akka.tcp://qb#client:8090] is still unreachable or has not been restarted. Keeping it quarantined. Akka.Event.DummyClassForStringSources
[DEBUG] 2016-01-21 13:05:16.0643 Stopped Akka.Remote.ReliableDeliverySupervisor
[DEBUG] 2016-01-21 13:05:16.0711 no longer watched by [akka://myservice/system/endpointManager/reliableEndpointWriter-akka.tcp%3a%2f%2fqb%40client%3a8090-4] Akka.Remote.EndpointWriter
[DEBUG] 2016-01-21 13:05:16.0711 Disassociated [akka.tcp://myservice#server:8081] -> akka.tcp://qb#client:8090 Akka.Remote.EndpointWriter
[DEBUG] 2016-01-21 13:05:16.0711 Stopped Akka.Remote.EndpointWriter
[DEBUG] 2016-01-21 13:05:16.0867 received AutoReceiveMessage <Terminated>: [akka://myservice/system/endpointManager/reliableEndpointWriter-akka.tcp%3a%2f%2fqb%40client%3a8090-4] - ExistenceConfirmed=True Akka.Remote.EndpointManager
[DEBUG] 2016-01-21 13:05:16.0867 Terminated [akka.tcp://qb#client:8090/user/qb] ServerActor
I suspect that this behavior is a feature of Akka.NET; however, I need my system to allow clients to disconnect and then reconnect to the server without having to wait. Is there any way to disable the quarantine mechanism, or to gracefully close the client endpoint on the server so that it doesn't get quarantined?
[ INFO] 2016-01-21 13:04:58.6550 Quarantined address [akka.tcp://qb#client:8090] is still unreachable or has not been restarted. Keeping it quarantined. - that says it all. The node was quarantined, which requires a restart of the actor system.
However, IMHO, just upgrade to Akka.NET 1.0.6, which we released on Monday. We made the remoting policy manager much less brittle than it has been historically.