StackExchange.Redis.StrongName.dll!StackExchange.Redis.SocketManager.WriteAllQueues() - redis

In our Web API code we use SignalR with a Redis backplane. I'm seeing a problem where our code hangs after some time.
StackExchange.Redis.StrongName.dll!StackExchange.Redis.SocketManager.WriteAllQueues() Line 288
Actually the code for caching the data works (I can see data getting populated in the Redis server), but after a couple of web requests/responses the code hangs. I've installed the latest package, 'StackExchange.Redis 1.1.608'. Unfortunately, in VS I don't see my code in the stack when I hit break-all.
Any ideas what might be wrong or where to look for the problem? I wish I could put more details here, but this is all I've got. Thanks!
Here is the snapshot of threads when I hit break-all in VS:
[Threads]:

Those are threads created by the StackExchange.Redis library to read and write to the socket connected to Redis.
It is normal for the writer thread to be blocked inside WriteAllQueues() because it is calling Monitor.Wait(). The thread will awaken when there is work to do.
The read thread probably blocks on reading from the socket, but I can't find in the source code where this happens.
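To illustrate why an idle writer thread shows up as blocked in a debugger, here is a minimal Java analogue of the pattern described above. It is not StackExchange.Redis code: the class, the queue, and the `written` list (a stand-in for the socket) are all hypothetical, and Java's `wait()`/`notifyAll()` play the role of `Monitor.Wait()`/`Monitor.PulseAll()`.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.Queue;

// Illustrative analogue only; not the StackExchange.Redis implementation.
final class IdleWriterSketch {
    private final Object lock = new Object();
    private final Queue<String> pending = new ArrayDeque<>();
    private final List<String> written = Collections.synchronizedList(new ArrayList<>());
    private boolean shutdown = false;

    // Writer loop: parks in wait() whenever there is nothing to send.
    void writeAllQueues() {
        while (true) {
            String next;
            synchronized (lock) {
                while (pending.isEmpty() && !shutdown) {
                    try {
                        lock.wait(); // a break-all would show the thread sitting here
                    } catch (InterruptedException e) {
                        Thread.currentThread().interrupt();
                        return;
                    }
                }
                if (pending.isEmpty()) {
                    return; // shutdown requested and nothing left to write
                }
                next = pending.remove();
            }
            written.add(next); // stand-in for writing to the socket
        }
    }

    // Producer side: queue a message and wake the writer.
    void enqueue(String message) {
        synchronized (lock) {
            pending.add(message);
            lock.notifyAll();
        }
    }

    void stop() {
        synchronized (lock) {
            shutdown = true;
            lock.notifyAll();
        }
    }

    // Starts the writer, waits until it is idle (WAITING), then feeds it work.
    static List<String> demo() {
        IdleWriterSketch sketch = new IdleWriterSketch();
        Thread writer = new Thread(sketch::writeAllQueues, "writer");
        writer.start();
        try {
            while (writer.getState() != Thread.State.WAITING) {
                Thread.sleep(1);
            }
            sketch.enqueue("PING");
            sketch.enqueue("GET key");
            sketch.stop();
            writer.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        return new ArrayList<>(sketch.written);
    }
}
```

A thread parked like this is healthy: it consumes no CPU and resumes as soon as something is queued, so it is unlikely to be the cause of the hang by itself.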

Related

Is there any internal timeout in Microsoft UIAutomation?

I am using the UI Automation COM-to-.NET Adapter to read the contents of a target Google Chrome browser that plays Flash content on Windows 7. It works.
I managed to get the content and elements. Everything works fine for some time, but after a few hours the elements become inaccessible.
The (AutomationElement).FindAll() returns 0 children.
Is there any internal, undocumented timeout used by UIAutomation?
According to the documentation for the IUIAutomation2 interface,
there are 2 timeouts, but they are not accessible from the IUIAutomation interface.
IUIAutomation2 is supported only on Windows 8 (desktop apps only).
So I believe there is some timeout.
I made a workaround that restarts the searching and monitoring of elements from the beginning of the desktop tree but the elements are still not available.
After some time (not sure how much) the elements are available again.
My requirement is to read the values all the time, as fast as possible, but this behavior undermines the whole architecture.
I read somewhere that there is a timeout of 3 minutes, but I'm not sure.
If there is a timeout, is it possible to change it?
Is it possible to restart something or release/dispose something?
I can't find anything on MSDN.
Does anybody have any idea what is happening and how to resolve it?
Thanks for this nicely put question. I have a similar issue with a much different setup. I'm on Win7, using UIAutomationCore.dll directly from C# to test our application-under-development. After running my sequence of actions & event subscriptions and all the other things, I intermittently observe that the UIA interface stops working (about 8-10min in my case, but I'm heavily using the UIA interface).
I tried many different things, including dispatching the COM interface and sleeping at different places; all of them failed. The funny revelation came when I ran AccEvent.exe (part of the SDK, like inspect.exe) during the test and saw that events stopped flowing to AccEvent, too. So it wasn't my client's interface that stopped; it was rather the COM server (or whatever UIAutomationCore does) that stopped responding.
As a solution (one that seems to work most of the time, or at least improves the situation a lot), I decided to give the application under test some breathing room, since using UIA puts additional load on it. This could be done with well-placed sleep points in your client, but instead of sleeping for a set time, I'm monitoring the processor load of the application and waiting until it settles down.
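The "wait until the load settles" idea can be sketched as follows. This is only an illustration in Java under stated assumptions: how the load of the application under test is measured is not described above, so it is passed in as a hypothetical `DoubleSupplier`, and the threshold, sample count, and polling interval are made-up parameters.

```java
import java.util.function.DoubleSupplier;

final class SettleWait {
    // Polls `load` until it stays below `threshold` for `stableSamples`
    // consecutive readings. Returns false if `timeoutMillis` elapses first.
    static boolean waitUntilSettled(DoubleSupplier load, double threshold,
                                    int stableSamples, long pollMillis,
                                    long timeoutMillis) {
        long deadline = System.nanoTime() + timeoutMillis * 1_000_000L;
        int calm = 0;
        while (System.nanoTime() < deadline) {
            // Any busy reading resets the streak of calm readings.
            calm = load.getAsDouble() < threshold ? calm + 1 : 0;
            if (calm >= stableSamples) {
                return true;
            }
            try {
                Thread.sleep(pollMillis);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                return false;
            }
        }
        return false;
    }
}
```

Requiring several consecutive calm readings, rather than a single one, avoids resuming in a brief dip between two bursts of activity; the timeout keeps the client from waiting forever if the application never calms down.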
One of the intermittent errors I receive when the problem manifests itself is "... was unable to call any of the subscribers..", and my search led to an MSDN page saying they have improved things in the CUIAutomation8 interface, but as this is Windows 8 specific, I haven't had the chance to try it yet.
I should add that I also reduced the number of calls to UIA by using more UI caching (FindAllBuildCache), since the fewer round trips there are, the better it is for UIA. Thanks to the answer from Guy in another question: UI Automation events stop being received after a while monitoring an application and then restart after some time

silverlight wcf callback error try catch fails

I am working on an application in Silverlight 5. We use WCF for all of our network communication, and it mostly works well. However, we have a couple of Virtual Machines that we use for testing where the app fails, tries to restart itself, fails, etc. in an endless loop. I have added a lot of tracing code, and a lot of try catches, and I have it isolated all the way down to the line of code that is failing, but I still can't get an actual error message from the failure, just the crash. Originally, it was failing on this line of WCF code:
return await Task<List<Instance>>.Factory.FromAsync(Channel.BeginGetInstance, Channel.EndGetInstance, null);
In case it had something to do with the use of async/await, I went back to our old code with callbacks. I still get the same failure, but now I can see that the call to the WCF function completes successfully while the log statement on the first line of the callback never runs, so it seems like it's dying before or outside of the callback.
One other note: it appears the code we have in Application_UnhandledException is not firing, but the code in Application_Exit does run; I see that as the last line in the log file.
I tried to set up remote debugging, but I am unable to connect to the app before it crashes and recycles, so that didn't help either.
I also used TCPView to watch the network traffic, and it looks like communication is happening in both directions.
If anyone has any suggestions of anything else to try, I would greatly appreciate it.
I spent 10 days chasing my tail on this before finally realizing my problem. There was a bug in the error logging code. It was generating an error; I was just not seeing it. Once I realized that and got the error message, the actual underlying bug was fixed in about 5 minutes. Good lesson, though: never assume the underlying code is working, no matter how simple it is.

How to detect alarm-based blocking RabbitMQ producer?

I have a producer sending durable messages to a RabbitMQ exchange. If the RabbitMQ memory or disk exceeds the watermark threshold, RabbitMQ will block my producer. The documentation says that it stops reading from the socket, and also pauses heartbeats.
What I would like is a way to know in my producer code that I have been blocked. Currently, even with a heartbeat enabled, everything just pauses forever. I'd like to receive some sort of exception so that I know I've been blocked and I can warn the user and/or take some other action, but I can't find any way to do this. I am using both the Java and C# clients and would need this functionality in both. Any advice? Thanks.
Sorry to tell you but with RabbitMQ (at least with 2.8.6) this isn't possible :-(
I had a similar problem, which centred around trying to establish a channel when the connection was blocked. The result was the same as what you're experiencing.
I did some investigation into the core of the RabbitMQ C# .NET library and discovered that the root cause of the problem is that it goes into an infinite blocking state.
You can see more details on the RabbitMQ mailing list here:
http://rabbitmq.1065348.n5.nabble.com/Net-Client-locks-trying-to-create-a-channel-on-a-blocked-connection-td21588.html
One suggestion (which we didn't implement) was to do the work inside of a thread and have some other component manage the timeout and kill the thread if it is exceeded. We just accepted the risk :-(
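A rough sketch of that suggestion, doing the work on a separate thread while another component enforces the timeout, might look like the following in Java. `WatchdogSketch` and `runWithWatchdog` are hypothetical names, and the `Runnable` stands in for whatever blocking client call you make. Note that interrupting a thread only helps if the blocked call actually responds to interruption, which is an assumption here and may not hold for a call stuck on a socket.

```java
final class WatchdogSketch {
    // Runs `work` on its own thread and waits up to `timeoutMillis` (must be > 0).
    // Returns true if the work finished in time; otherwise interrupts the
    // worker and returns false so the caller can warn the user.
    static boolean runWithWatchdog(Runnable work, long timeoutMillis) {
        Thread worker = new Thread(work, "blocking-call");
        worker.setDaemon(true); // a stuck worker must not keep the JVM alive
        worker.start();
        try {
            worker.join(timeoutMillis);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        if (worker.isAlive()) {
            worker.interrupt(); // best effort; the call may ignore it
            return false;
        }
        return true;
    }
}
```

For example, `runWithWatchdog(() -> publishMessage(), 5000)` (with `publishMessage` being your own method) would return `false` after five seconds if the publish is still blocked. The abandoned thread and its connection should be treated as unusable afterwards.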
The RabbitMQ client uses a blocking RPC call that waits for a reply indefinitely.
If you look at the Java client API, what it does is:
AMQChannel.BlockingRpcContinuation k = new AMQChannel.SimpleBlockingRpcContinuation();
k.getReply(-1);
Passing -1 as the argument makes the call block until a reply is received.
The good thing is you could pass in your timeout in order to make it return.
The bad thing is you will have to update the client jars.
If you are OK with doing that, you could pass in a timeout wherever a blocking call like above is made.
The code would look something like:
try {
    return k.getReply(200);
} catch (TimeoutException e) {
    throw new MyCustomRuntimeorTimeoutException("RabbitTimeout ex", e);
}
And in your code you could handle this exception and perform your logic in this event.
Some related classes that might require this fix would be:
com.rabbitmq.client.impl.AMQChannel
com.rabbitmq.client.impl.ChannelN
com.rabbitmq.client.impl.AMQConnection
FYI: I have tried this and it works.

Monitor and handle MSGW messages on a job on an IBM i-series (AS/400) from Java

Does anyone know how one can automatically reply to messages with status MSGW that block a job on an IBM i-series (AS/400)?
I'm using the jt400/jtopen library to access a program on an AS/400 from Java. I'm using the com.ibm.as400.access.ProgramCall class, which works fine, unless the program fails for some reason. As with almost any program, failures will happen sometimes, but unfortunately, in this case, it does not result in a status message or an exception. Instead, the calling thread just hangs. What's worse, any call to the AS/400 to get information on the Job (another class in jt400 that mostly does what you would expect) backing the queue will hang as well.
I could of course monitor the thread in which the call runs and simply kill it after waiting for a while, but that's a last resort. Getting an error message back from the system would be nice.
You could try executing this command with the com.ibm.as400.access.CommandCall.run() method before invoking your PCML:
CHGJOB INQMSGRPY(*DFT)
It makes the job answer inquiry messages automatically with each message's default reply (typically 'C', cancel).
However, you should make sure the messages are logged so that you can find the problem that generated them.
I don't believe Java can directly trap errors that occur on the other side of that API. What I've done is to 'harden' the RPG (IBM i side) program so that it monitors for errors rather than let the default error handler get them. When an error occurs, the RPG program gracefully terminates and passes back an error code or even the entire message back to the Java application.
I've found that you can use the timeout mechanism of ExecutorService to interrupt a ProgramCall in MSGW.
You must discard the AS400 object afterwards, and the server job is still in MSGW, but at least you can continue on the Java side.
(You need to use a separate AS400 object if you want to investigate on the hanging job.)
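A minimal sketch of the ExecutorService timeout approach is shown below. It deliberately avoids the jt400 classes so that it runs on its own: the `Callable` is a stand-in for the code that would invoke `ProgramCall.run()`, and `TimedCallSketch`, `callOrDefault`, and the fallback value are hypothetical names chosen for illustration.

```java
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;

final class TimedCallSketch {
    // Runs `call` on a worker thread and waits at most `timeoutMillis`.
    // On timeout the worker is interrupted and `fallback` is returned.
    static <T> T callOrDefault(Callable<T> call, long timeoutMillis, T fallback) {
        ExecutorService executor = Executors.newSingleThreadExecutor(r -> {
            Thread t = new Thread(r, "program-call");
            t.setDaemon(true); // do not keep the JVM alive for a hung call
            return t;
        });
        Future<T> future = executor.submit(call);
        try {
            return future.get(timeoutMillis, TimeUnit.MILLISECONDS);
        } catch (TimeoutException e) {
            future.cancel(true); // interrupts the worker thread
            return fallback;
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return fallback;
        } catch (ExecutionException e) {
            // The call itself failed; surface the original cause.
            throw new IllegalStateException("call failed", e.getCause());
        } finally {
            executor.shutdownNow();
        }
    }
}
```

On a timeout, the caller would then discard the connection object used inside the `Callable`, as described above, and open a new one for any follow-up investigation.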

WCF Service hangs on the 14th call

I'm having a problem where the WCF service hangs after 13-14 asynchronous process calls from the client. This occurs all the time. The client is a mobile JavaFX app. There is no specific error outputted in the server as well as in client. Someone suggested that it might be a throttling issue.
I've changed the service-side .config parameter maxConcurrentCalls from 10 to 500:
<serviceThrottling maxConcurrentCalls="500" maxConcurrentSessions="500" />
So this means, it should be able to accept more than 10 calls, right? However, it didn't resolve this issue. Still hangs on the 13-14th process call.
Only one client is connecting to this web service.
What do you think is wrong?
Do you close the client after doing your call?
When I encountered this problem, I did not close it, and the open requests blocked the service after a short time.
Edit: Ok, I know nothing about JavaFX =) The code below is C#, sorry. But you can surely do something similar.
Use either
WcfClient client = new WcfClient();
// ...
client.Close();
or
using (WcfClient client = new WcfClient()) {
    // ...
}
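Since the client in the question is written in JavaFX, a Java counterpart of the C# `using` pattern above is try-with-resources. The sketch below is only an illustration: `ServiceClient` is a hypothetical stand-in for whatever proxy class your client uses, and the counter exists purely to show that no client is left open after each call.

```java
import java.util.concurrent.atomic.AtomicInteger;

final class ClientCloseSketch {
    // Tracks how many hypothetical clients are currently open.
    static final AtomicInteger openClients = new AtomicInteger();

    // Hypothetical stand-in for a generated service proxy.
    static final class ServiceClient implements AutoCloseable {
        ServiceClient() {
            openClients.incrementAndGet();
        }

        String process(int request) {
            return "done " + request;
        }

        @Override
        public void close() {
            openClients.decrementAndGet();
        }
    }

    // Makes `calls` requests, closing each client as soon as its call ends.
    static int callMany(int calls) {
        for (int i = 0; i < calls; i++) {
            try (ServiceClient client = new ServiceClient()) {
                client.process(i);
            } // close() runs here, even if process() throws
        }
        return openClients.get();
    }
}
```

If the hang really is caused by sessions that are never released, the number of open clients on the service side should stay flat once each call is wrapped like this.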
Similar problem here. I have an app calling from one process to another, locally, over named pipes.
The calls are really light in code: basically each takes an array of serializable objects and queues them on the other side. Occasionally it hangs, then restarts after a timeout. No data is lost, but as the data is financial data and the receiving app is an automated trading system, that may result in very bad financial issues. I have not been able to reproduce it yet.
This could very easily be caused by a deadlock condition in your code. If your service locks up and starts eating up 100% of CPU, you have a deadlock. Create a dump file and see where your code was at.
I ran into the same issue in my first WCF app: it was a dictionary in the logging code that I wasn't making sure was synchronized.
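As an illustration of that last point, here is a small Java sketch of a shared map in logging code that is safe to update from many request threads. The class and method names are hypothetical; the equivalent fix in C# would be a `ConcurrentDictionary` or a lock around every access to the `Dictionary`.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

final class SharedLogCounts {
    // A plain HashMap here could be corrupted by concurrent writers.
    private final Map<String, Integer> countsByCategory = new ConcurrentHashMap<>();

    // Called from many service threads; merge() is atomic per key.
    void record(String category) {
        countsByCategory.merge(category, 1, Integer::sum);
    }

    int count(String category) {
        return countsByCategory.getOrDefault(category, 0);
    }

    // Hammers the map from several threads and returns the final count.
    static int demo(int threads, int perThread) {
        SharedLogCounts log = new SharedLogCounts();
        Thread[] workers = new Thread[threads];
        for (int t = 0; t < threads; t++) {
            workers[t] = new Thread(() -> {
                for (int i = 0; i < perThread; i++) {
                    log.record("request");
                }
            });
            workers[t].start();
        }
        for (Thread worker : workers) {
            try {
                worker.join();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        }
        return log.count("request");
    }
}
```

With the concurrent map, no updates are lost and no thread can get stuck inside a corrupted internal structure, which is one way an unsynchronized map can pin a CPU core.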
The SvcTraceViewer is super helpful in figuring out tough wcf