Websocket ping timeout freezes the Mattermost "bot" - api

I'm creating a Mattermost bot. It stops responding after the websocket connection receives a ping timeout (PingTimeoutChannel) after random periods of time (1 minute, 8 minutes, 2 hours etc.). Mattermost server is v.5.13, API v.4.
The bot connects to the Mattermost API by creating a new Client4. It then logs in as the user and creates a Websocket client with the authorization token it received. It listens on all channels, and when it receives an event that is a message directed at it (#botname), it responds automatically (creates a model.Post).
I chose simple username/password authentication for logging in, just as the Mattermost sample bot does. I also tried switching to personal access token authentication (as in here) because I thought it would solve the timeout problem. However, that approach no longer works: logging in that way fails with "Invalid or expired session error, please login again".
So I dropped that idea and started looking for where the timeout happens. The server pings are OK; the websocket's are not. I tried many things, up to simply reconnecting (creating the Mattermost API and Websocket clients again). The bot still does not respond. I've run out of ideas.
Websocket connection (skipped error handling):
if config.BotCfg.Port == "443" {
	protocol = "https"
	secure = true
}
config.ConnectionCfg.Client = model.NewAPIv4Client(fmt.Sprintf("%s://%s:%s", protocol, config.BotCfg.Server, config.BotCfg.Port))
user, resp := config.ConnectionCfg.Client.Login(config.BotCfg.BotName, config.BotCfg.Password)
setBotTeam()
if limit.Users == nil {
	limit.SetUsersList()
}
ws := "ws"
if secure {
	ws = "wss"
}
if Websocket != nil {
	Websocket.Close()
}
websocket, err := model.NewWebSocketClient4(fmt.Sprintf("%s://%s:%s", ws, config.BotCfg.Server, config.BotCfg.Port), config.ConnectionCfg.Client.AuthToken)
Listening function:
go func() {
	for {
		select {
		case <-connection.Websocket.PingTimeoutChannel:
			logs.WriteToFile("Websocket ping timeout. Connecting again.")
			log.Println("Websocket ping timeout. Connecting again.")
			mux.Lock()
			connection.Connect()
			mux.Unlock()
		case event := <-connection.Websocket.EventChannel:
			mux.Lock()
			if event != nil {
				if event.IsValid() && isMessage(event.Event) {
					handleEvent(event)
				}
			}
			mux.Unlock()
		}
	}
}()
// block forever so the goroutine above keeps running
select {}
I expect the bot to run continuously.
If you have any suggestions how to fix this issue, I'd really appreciate that!
Edit: As Cerise suggested, I added SIGQUIT to the exit function and ran the race detector. I fixed the data race by removing one if statement from the case event := [...] branch. The race detector no longer reports any issues, but the bot still stops responding after some time.
I found out that the first time a PingTimeout occurs, the peer stops responding until I restart the app. Reconnecting the Websocket doesn't help. I don't know how to solve this problem, or whether a solution even exists.
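For readers comparing notes, here is a minimal sketch of the reconnect-on-timeout loop described above. It is not a diagnosis of the Mattermost problem. The conn and bot types and the dial factory are hypothetical stand-ins, not the Mattermost client API. The only point it illustrates is that the fresh connection has to be stored where the loop will read it on the next iteration, rather than in a local variable.

```go
package main

import (
	"fmt"
	"sync"
)

// conn is a hypothetical stand-in for a websocket client. It exposes an
// event channel and a ping-timeout channel, similar in shape to the client
// used in the question.
type conn struct {
	events  chan string
	timeout chan bool
}

// bot owns the current connection. mu guards every read and swap of c.
type bot struct {
	mu   sync.Mutex
	c    *conn
	dial func() *conn // hypothetical factory that returns a fresh connection
}

// current returns the connection the loop should listen on right now.
func (b *bot) current() *conn {
	b.mu.Lock()
	defer b.mu.Unlock()
	return b.c
}

// run handles events until handle returns false and reports how many times
// it reconnected. The connection is re-read on every iteration, so the
// select always uses the channels of the newest connection.
func (b *bot) run(handle func(string) bool) int {
	reconnects := 0
	for {
		c := b.current()
		select {
		case <-c.timeout:
			fresh := b.dial()
			b.mu.Lock()
			b.c = fresh // store the new connection in the shared field
			b.mu.Unlock()
			reconnects++
		case ev := <-c.events:
			if !handle(ev) {
				return reconnects
			}
		}
	}
}

func newConn() *conn {
	return &conn{events: make(chan string, 1), timeout: make(chan bool, 1)}
}

func main() {
	first := newConn()
	first.timeout <- true // simulate a ping timeout on the first connection
	b := &bot{c: first, dial: func() *conn {
		c := newConn()
		c.events <- "hello" // the fresh connection delivers one event
		return c
	}}
	n := b.run(func(ev string) bool {
		fmt.Println("event:", ev)
		return false // stop after the first event
	})
	fmt.Println("reconnects:", n)
}
```

In a real bot the dial function would also have to log in again and start the client's listener, which this sketch leaves out.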

Related

Websphere application server LDAP connection pool

We are using WebSphere Application Server 8.5.0.0. We have a requirement to query an LDAP server for customer details. I tried to configure the connection pool as described here and here.
I passed the below JVM arguments
-Dcom.sun.jndi.ldap.connect.pool.maxsize=5
-Dcom.sun.jndi.ldap.connect.pool.timeout=60000
-Dcom.sun.jndi.ldap.connect.pool.debug=all
Below is a sample code snippet
Hashtable<String,String> env = new Hashtable<String,String>();
...
...
env.put("com.sun.jndi.ldap.connect.pool", "true");
env.put("com.sun.jndi.ldap.connect.timeout", "5000");
InitialDirContext c = new InitialDirContext(env);
...
...
c.close();
I have two issues here:
1. When I call the service for the 6th time, I get javax.naming.ConnectionException: Timeout exceeded while waiting for a connection: 5000ms. I checked the connection pool debug logs and noticed that the connections are not returned to the pool immediately, despite the context being closed safely in a finally block. The connections are released after some time and expire some time after the release. If I call the service again after that, it connects to the LDAP server, but new connections are created.
2. When I execute the code I can see the connection pool debug logs, but they are written to the System.Err log. Is this an issue? Can I ignore it?
By contrast, when I run the code as a standalone application (multithreaded, with a loop of 50 iterations), the connections are returned/released immediately.
Can anyone please let me know what I am doing wrong?

TcpListener actively refused connection

I am maintaining an application that uses a TcpListener to listen for incoming communication. The following code opens the connection:
Dim listener As TcpListener
_listenFailed = False
Try
    listener = New TcpListener(System.Net.IPAddress.Parse(Me.Host), Me.Port)
    listener.Start()
Catch ex As Exception
    ' an error here means the settings are likely bogus
    _listenFailed = True
    Return
End Try
While Not _stoplistening
    ' Accept connection
End While
The problem I am having is that when I send a file from a different computer, I get the error "No connection could be made because the target machine actively refused it."
I have checked for firewalls and antivirus, and there were no blocks. I used
netstat -a -n
to determine that the port is active and listening. Both applications are running from Visual Studio in Administrator mode, not that this should make a difference. I have a breakpoint set at the first line of the accept-connection code, but it is never hit.
I stopped the application and examined the TcpListener, and found a couple of errors when I dug deep down. At TcpListener.LocalEndpoint.IPEndPoint.IPAddress.Address.ErrorCode there was the error 10045, "OperationNotSupported". Also, at TcpListener.Socket.RemoteEndPoint.ErrorCode there was an error 1057, "A request to send or receive data was disallowed because the socket is not connected, and (when sending on a datagram socket using a sendto call) no address was supplied." I don't know whether either of these errors is relevant.
If anyone has any insights as to what might be causing this problem, or steps that can be taken to trace the root of the issue, I would be grateful.
Try changing this:
listener = New TcpListener(System.Net.IPAddress.Parse(Me.Host), Me.Port)
To this:
listener = New TcpListener(System.Net.IPAddress.Any, Me.Port)
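This will allow you to listen across all network interfaces. If Me.Host is a loopback address such as 127.0.0.1, or the address of an interface other than the one the remote machine reaches, the port still shows as listening in netstat, but connections arriving on any other interface are refused.
Additionally, you can use IPv6Any instead of Any if you want to target IPv6. This choice affects which address the client has to connect to.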
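The difference between binding to one address and binding to all interfaces is easy to see outside .NET as well. The following is a small illustrative sketch in Go (not the VB.NET API from the question). The listenAddr helper is a name made up for this example. It binds to an OS-chosen free port and reports which IP the listener actually ended up on.

```go
package main

import (
	"fmt"
	"net"
)

// listenAddr opens a TCP listener on host with an OS-chosen free port
// (port "0") and returns the IP address the listener is bound to.
// An empty host means "all interfaces".
func listenAddr(host string) (net.IP, error) {
	ln, err := net.Listen("tcp", net.JoinHostPort(host, "0"))
	if err != nil {
		return nil, err
	}
	defer ln.Close()
	return ln.Addr().(*net.TCPAddr).IP, nil
}

func main() {
	for _, host := range []string{"127.0.0.1", ""} {
		ip, err := listenAddr(host)
		if err != nil {
			fmt.Println("listen failed:", err)
			continue
		}
		// A loopback bind only accepts local connections. An unspecified
		// (wildcard) bind accepts connections on every interface.
		fmt.Printf("host=%q bound=%v loopback=%v wildcard=%v\n",
			host, ip, ip.IsLoopback(), ip.IsUnspecified())
	}
}
```

A listener bound to the loopback address appears as listening in netstat too, which may be why the netstat check in the question looked fine.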

WP7 error: "The remote server returned an error: NotFound."

I have this code that used to work, but at some point it stopped working and started returning the error "The remote server returned an error: NotFound."
WebClient deliciousWebClient = new WebClient();
deliciousWebClient.Credentials = Credentials;
deliciousWebClient.DownloadStringAsync(new Uri("https://api.del.icio.us/v1/tags/get"));
deliciousWebClient.DownloadStringCompleted += (s, ee) =>
{
    if (ee.Error == null)
    {
        …
Any suggestion on this error?
In this code the error points to the delicious endpoint, but the same error is happening with some other services...
The NotFound error is a classic 404 error, so it's possible that the API endpoint is down or that it changed on you.
I'd start by using Fiddler2 to make the requests manually. That'll help you figure out whether the issue is in your code somewhere or on the API side.
It's hard to get Fiddler working with the WP7 emulator, as you noted below. One trick I've used in the past when I got really desperate was to write a quick console app that used the same code my Windows Phone app was executing. Then I was able to intercept the traffic successfully. It turned out that my request wasn't being properly formatted.

Task Persistence C#

I'm having a hard time trying to get my task to stay persistent and run indefinitely from a WCF service. I may be doing this the wrong way and am willing to take suggestions.
I have a task that starts processing any incoming requests that are dropped into a BlockingCollection. From what I understand, the GetConsumingEnumerable() method is supposed to let me pull data continuously as it arrives. It works with no problem by itself: I was able to process dozens of requests without a single error using a Windows Forms app to fill out the requests and submit them. Once I was confident in this process, I wired it up to my site via an asmx web service and used jQuery ajax calls to submit requests.
The site submits a request based on a URL that is submitted; the web service downloads the HTML content from the URL and looks for other URLs within the content. It then creates a request for each URL it finds and submits it to the BlockingCollection. Within the WCF service, if the application is Online (i.e. the Task has started), it pulls the requests using GetConsumingEnumerable via a Parallel.ForEach and processes them.
This works for the first few submissions, but then the task just stops unexpectedly. Of course, this produces 10x more requests than I could simulate in testing, but I expected it to just throttle. I believe the issue is in my method that starts the task:
public void Start()
{
    Online = true;
    Task.Factory.StartNew(() =>
    {
        tokenSource = new CancellationTokenSource();
        CancellationToken token = tokenSource.Token;
        ParallelOptions options = new ParallelOptions();
        options.MaxDegreeOfParallelism = 20;
        options.CancellationToken = token;
        try
        {
            Parallel.ForEach(FixedWidthQueue.GetConsumingEnumerable(token), options, (request) =>
            {
                Process(request);
                options.CancellationToken.ThrowIfCancellationRequested();
            });
        }
        catch (OperationCanceledException e)
        {
            Console.WriteLine(e.Message);
            return;
        }
    }, TaskCreationOptions.LongRunning);
}
I've thought about moving this into a WF4 Service and just wire it up in a Workflow and use Workflow Persistence, but am not willing to learn WF4 unless necessary. Please let me know if more information is needed.
The code you have shown is correct by itself.
However there are a few things that can go wrong:
If an exception other than OperationCanceledException is thrown, your task stops (of course). Try adding a general try-catch and logging the exception.
If you start worker threads in a hosted environment (ASP.NET, WCF, SQL Server), the host can decide at any time, without warning, to shut down a worker process. For example, if your ASP.NET site is inactive for some time, the app is shut down. The hosts I just mentioned are not designed to have custom threads running. You will probably have more success using a dedicated application (.exe) or even a Windows Service.
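To make the first point concrete, here is a minimal sketch of a consumer loop that logs a failing request and keeps going instead of silently dying. It is written in Go rather than C#, with a channel standing in for the BlockingCollection and a panic standing in for an unhandled exception. The names safeProcess and consume are made up for this illustration.

```go
package main

import (
	"fmt"
	"log"
)

// safeProcess runs process(req) and turns a panic into an ordinary error,
// so one bad request cannot take the whole consumer down.
func safeProcess(process func(string), req string) (err error) {
	defer func() {
		if r := recover(); r != nil {
			err = fmt.Errorf("request %q failed: %v", req, r)
		}
	}()
	process(req)
	return nil
}

// consume drains queue until it is closed. Failures are logged and counted
// and the loop moves on to the next request.
func consume(queue <-chan string, process func(string)) (ok, failed int) {
	for req := range queue {
		if err := safeProcess(process, req); err != nil {
			log.Println(err)
			failed++
			continue
		}
		ok++
	}
	return ok, failed
}

func main() {
	queue := make(chan string, 3)
	queue <- "a"
	queue <- "bad"
	queue <- "c"
	close(queue)

	ok, failed := consume(queue, func(req string) {
		if req == "bad" {
			panic("boom") // simulate an unexpected failure in processing
		}
	})
	fmt.Println(ok, failed) // prints "2 1"
}
```

With logging like this in place, an unexpected stop at least leaves a trace of which request caused it.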
It turns out the cause of this issue was the WCF binding configuration. The task suddenly stopped because WCF killed the connection due to an open timeout. The open timeout setting is the time that a request will wait for the service to open a connection before timing out. In certain situations, the service reached the limit of 10 max connections, which caused the incoming connections to back up while waiting for a connection. I had made sure that I closed all connections to the host after the transactions were complete, so I gave in and raised both the max connections and the open timeout period. After this, it ran flawlessly.

Recovering from a CommunicationObjectFaultedException in WCF

I have a client app that tries every 10 seconds to send a message over a WCF web service. This client app will be on a computer on board a ship, which we know will have spotty internet connectivity. I would like for the app to try to send data via the service, and if it can't, to queue up the messages until it can send them through the service.
In order to test this setup, I start the client app and the web service (both on my local machine), and everything works fine. I try to simulate the bad internet connection by killing the web service and restarting it. As soon as I kill the service, I start getting CommunicationObjectFaultedExceptions--which is expected. But after I restart the service, I continue to get those exceptions.
I'm pretty sure that there's something I'm not understanding about the web service paradigm, but I don't know what that is. Can anyone offer advice on whether or not this setup is feasible, and if so, how to resolve this issue (i.e. re-establish the communications channel with the web service)?
Thanks!
Klay
Client service proxies cannot be reused once they have faulted. You must dispose of the old one and recreate a new one.
You must also make sure you close the client service proxy properly. A WCF service proxy can throw an exception on close, and if this happens the connection is not closed, so you must abort. Use the "try{Close}/catch{Abort}" pattern. Also bear in mind that the Dispose method calls Close (and hence can throw an exception from Dispose), so you cannot simply wrap the proxy in a using block as you would with ordinary disposable classes.
For example:
try
{
    if (yourServiceProxy != null)
    {
        if (yourServiceProxy.State != CommunicationState.Faulted)
        {
            yourServiceProxy.Close();
        }
        else
        {
            yourServiceProxy.Abort();
        }
    }
}
catch (CommunicationException)
{
    // Communication exceptions are normal when
    // closing the connection.
    yourServiceProxy.Abort();
}
catch (TimeoutException)
{
    // Timeout exceptions are normal when closing
    // the connection.
    yourServiceProxy.Abort();
}
catch (Exception)
{
    // Any other exception and you should
    // abort the connection and rethrow to
    // allow the exception to bubble upwards.
    yourServiceProxy.Abort();
    throw;
}
finally
{
    // This is just to stop you from trying to
    // close it again (with the null check at the start).
    // This may not be necessary depending on
    // your architecture.
    yourServiceProxy = null;
}
There was a blog article about this, but it now appears to be offline. An archived version is available on the Wayback Machine.
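The overall approach asked about in the question (queue messages, and replace the proxy after any failure rather than reusing it) can be sketched independently of WCF. The following is a rough illustration in Go, not WCF code. The client interface, the sender type, and the fakeClient used to simulate an outage are all hypothetical names invented for this example.

```go
package main

import (
	"errors"
	"fmt"
)

// client is a hypothetical stand-in for a service proxy.
type client interface {
	Send(msg string) error
	Abort()
}

// sender keeps unsent messages queued and never reuses a client that failed.
type sender struct {
	newClient func() (client, error) // hypothetical factory for a fresh proxy
	c         client
	queue     []string
}

// flush sends queued messages in order. On the first failure it aborts and
// discards the client, keeps the remaining messages, and returns the error
// so the caller can retry later with a freshly created client.
func (s *sender) flush() error {
	for len(s.queue) > 0 {
		if s.c == nil {
			c, err := s.newClient()
			if err != nil {
				return err
			}
			s.c = c
		}
		if err := s.c.Send(s.queue[0]); err != nil {
			s.c.Abort()
			s.c = nil // a failed client is thrown away, not reused
			return err
		}
		s.queue = s.queue[1:]
	}
	return nil
}

// fakeClient simulates a service that is only reachable while *up is true.
type fakeClient struct {
	up   *bool
	sent *[]string
}

func (f fakeClient) Send(msg string) error {
	if !*f.up {
		return errors.New("service unavailable")
	}
	*f.sent = append(*f.sent, msg)
	return nil
}

func (f fakeClient) Abort() {}

func main() {
	up := false
	var sent []string
	s := &sender{newClient: func() (client, error) {
		return fakeClient{&up, &sent}, nil
	}}
	s.queue = append(s.queue, "m1", "m2")

	fmt.Println(s.flush()) // service down: prints the error, messages stay queued
	up = true              // the service comes back
	fmt.Println(s.flush(), sent)
}
```

Calling flush on a timer (every 10 seconds in the question's setup) would then drain the backlog once the service is reachable again.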