I wonder if process could stay in blocked state even if there is not any other process that want to run.
Theroticaly i guess the answer is yes ,but in practically it is not .. i'm right ?
Simple example: your process X is waiting for some IO to show on some socket indiscriminately.
In other words: you make some sort of read() call without any kind of timeout. That process will just sit there, and do nothing. No matter what other processes in that system do.
Related
I am facing an issue in our C based application where one of VxWorks TASK(say Task1) got crashed due to some unknown reasons. The crashed task had locked a mutual exclusion semaphore(say semA).
Now the next TASK2 is waiting on semA to get Unlocked. Since semA is locked by a crashed TASK, TASK2 will be waiting infinitely to grab semA. This has broken application functionality.
We can not provide a timeout to lock semA in TASK2 becuase semA is protecting a send routing that is sending data over sockets. Providing a timeout will result in failure in message communication.
After googling I have found ROBUST mutex for LINUX for such problem, but our platform is VxWorks(version 5.5.1).
So can somebody tell me the way by which we can handle this problem in VxWorks?
I have tried a below mentioned solution nut not sure how safe it is to do so.
1) TASK2 will wait on semA for a particular timout
2) if failed check the state of previous task that had locked the semA
3 if TASK1 state is SUSPENDED, TASK 2 will call semDelete on semA and than recreate it.
4) if TASK1 is not in SUSPENDED state, keep on waiting to grab semA.
I have test this code as prototype and is working fine. I am not sure about how good is to implement such solution where we recreate semaphore and what will be the possible risks imposed.
Please let me know your inputs.
Thanks
I think your prototyped solution is not anymore risky than having code (Task1) that crashes for unknown reasons.
If I were to work on your problem, I would first try really hard to find out why Task1 is crashing. If I were unable to figure out the root cause, I would then go to implement your proposed solution. That is, I would query the state of Task2 after a certain amount of time, and then recreate the semaphore.
I must say, that even if you implement your work around of recreating the semaphore, then you still have a crashed task which consumes resources. If this problem persists, then eventually the whole system will stop working.
In the end the correct and only way to fix this problem is to fix the crash in task1. You should be able to get a stack trace to where it crashed and fix it.
I second the previous answers: finding the cause why Task1 crashes is better than implementing a workaround.
Can you post the messages written by VxWorks of the crashed Task1?
One of the first things I try if a task crashes for no good reason is to increase its stack size (let's say double it). If the task runs fine your stack size is too small. Also try to increase the stack size of the task(s) you've modified lately!
If it is a stack problem it isn't neccessarily Task1 which is to blame...
in my Cocoa project, I communicate with a device connected to a serial port. Now, I am waiting for the serial device to send a particular message of some bytes. For the read operation (and the reaction for once the desired message has been received), I created a new thread. On user request, I want to be able to cancel the thread.
As Apple suggests in the docs, I added a flag to the thread dictionary, periodically check if the flag has been set and if so, call [NSThread exit]. This works fine.
Now, the thread may be stuck waiting for the serial device to finally send the 12 byte message. The read call looks like this:
numBytes = read(fileDescriptor, buffer, 12);
Once the thread starts reading from the device, but no data comes in, I can set the flag to tell the thread to finish, but the thread is not going to read the flag unless it finally received at least 12 bytes of data and continues processing.
Is there a way to kill a thread that currently performs a read operation on a serial device?
Edit for clarification:
I do not insist in creating a separate thread for the I/O operations with the serial device. If there is a way to encapsulate the operations such that I am able to "kill" them if the user presses a cancel button, I am perfectly happy.
I am developing a Cocoa application for desktop Mac OS X, so no restrictions regarding mobile devices and their capabilities apply.
A workaround would be to make the read function return immediately if there are no bytes to read. How can I do this?
Use select or poll with a timeout to detect when the descriptor is ready for reading.
Set the timeout to (say) half a second and call it in a loop while checking to see if your thread should exit.
Asynchronous thread cancellation is almost always a bad idea. Try to stick with event-driven interfaces (and, if necessary, timeouts).
This is exactly what the pthread_cancel interface was designed for. You'll want to wrap the block with read in pthread_cleanup_push and pthread_cleanup_pop in order that you can safely clean up if the thread is cancelled, and also disable cancellation (with pthread_setcancelstate) in other code that runs in this thread that you don't want to be cancellable. This can be a pain if proper cleanup would involve multiple call frames; it essentially forces you to use pthread_cleanup_push at every call level and structure your thread code like C++ or Java with try/catch style exception handling.
An alternative approach would be to install a signal handler for an otherwise-unused signal (like SIGUSR1 or one of the realtime signals) without the SA_RESTART flag, so that it interrupts syscalls with EINTR. The signal handler itself can be a complete no-op; the only purpose of it is to interrupt things. Then you can use pthread_kill to interrupt the read (or any other syscall) in a particular thread. This has the advantage that you don't have to switch your code to using C++/Java-type idioms. You can handle the EINTR error by checking a flag (indicating whether the thread was requested to abort) and resume the read if the flag is not set, or return an error code that causes the caller to clean up and eventually pthread_exit.
If you do use interrupting signal handlers, make sure all your syscalls that can return EINTR are wrapped in loops that retry (or check the abort flag and optionally retry) on EINTR. Otherwise things can break badly.
I want to write a program that can not close it
In other words, they can not kill process .
The solution of the process of closure for less than the other way is to do a good job
I have another solution?
Assuming Linux, this is not quite possible from userspace because there are signals like
SIGKILL that cannot be ignored or handled.
If you have access to kernel, than you could write a kernel moudule which puts the current process in state TASK_UNINTERRUPTIBLE and thus no one could kill it.
please specify your question in detail.
i think you have facing error on previous activity is onResume();
you have to on resume method for removing this error.
in onResume() method you can call finish(); method for stop activity.
I'm building a program that has a very basic premise.
For X amount of Objects
Open Conection
Perform Actions
Close Connection
Open Next
Each of these connections is made on a socks5 proxy and after about the 200th connection I get "The operation has timeout" errors. I have tested all the proxies and they work just fine and the really wierd thing is if I shut down the program and restart it again the problems go away. So I'm left to believe that when I'm closing my connection that its really not closing the connection and the computer is being overloaded. How cna i force all socks connections to close that are associated with a class?
socket.Shutdown(SocketShutdown.Both);
//socket.Close();
socket.Disconnect(true);
socket = null;
In reponse to a tip to use netstat I checked it out. I noticed connections where lingering but finally would go away. However, the problem still remains, after about the 100th connection, 5 second pause between connections. I get timeout errors. If I close the proram and restart it they go away. So for some reason I feel that the connections are leaving behind something. Netstat dosent show it. I've even tried adding the instances of the client to a list and each time one is finish remove it from the list and then set it to null. Is there a way to kill a port? Maybe that would work, if I killed the port the connection was being made on? Is it possible this is a Windows OS issue? Something thats used to prevent viruses? I'm making roughly a connection a minute and mainint that connection for about 1 minute before moving on to the next with atleast 20 concurent if not more connections at the same time. What dosent make sense to me is that shuting down the program seem sto clean up whatever resources I'm not cleaning up in my code. I'm using an class I found on the internet that allows socks5 proxies to be used with the socket class. So i'm really at a loss, any advice or direction to head woudl be great? It dosent have to be pretty. I'm have tempted to wite to a text file where I was in my connection list and shutdown the program and then have anohter program restart it to pick up where it left off at this point.
Sounds like your connections aren't really closed. Without seeing the code, it's hard to troubleshoot this; can you boil it down to a program that loops through an open-close sequence?
If the connection doesn't close as expected, you can probably see what state it is in with netstat. Do you have 200 established connections, or are they in some sort of closing state?
Sockets implement IDisposable. Only calling Dispose or Close will cause the socket to give give up the unmanaged resources in a deterministic manner. This is causing you to run out of the resources that the socket uses (probably a network handle of some sort), even though you may not any managed object useing them.
So you should probably just do
socket.Shutdown(SocketShutdown.Both);
socket.Close();
To be clear setting the socket to Null does not do this because setting the socket to null only causes the sockets to be placed on the freachable queue, to have its finalizer called when it gets around to processing the freachable queue.
You may want to review this article which gives a good model on how Unmanaged resources are dealt with in .NET
Update
I checked and Sockets do indeed contain a handle to a WSASocket. So unless you call close or dispose you'll have to wait until the Finalizers run (or exiting the appplication) for you to get them back.
Scenario:
I have a Distributed-objects-based IPC between a mac application and a launchd daemon (written with Foundation classes). Since I had issues before regarding asynchronous messaging (e.g. I have a registerClient: on the server's root object and whenever there's an event the server's root object notifies / calls a method in the client's proxy object), I did long-polling which meant that the client "harvests" lists of events / notifications from the daemon. This "harvest" is done through a server object method call, which then returns an NSArray instance.
It works pretty well, until for a few seconds, the server object's process (launched thru launchd) starts being labeled red with the "(Not responding)" tag beside it (inside Activity Monitor). Like I said, functionally, it works well, but we just want to get rid of this "Not responding" label.
How can I prevent this "Not responding" tag?
FYI, I already did launchd-based processes before and this is the first time I did long-polling. Also, I tried NSSocketPortNameServer-based connections and also NSSocketPort-based ones. They didn't have this problem. Locking wasn't also an issue 'coz the locks used were only NSCondition's and we logged and debugged the program and it seems like the only locking "issue" is on the harvesting part, which actually, functionally works. Also, client-process is written in PyObjC while server process was written using ObjC.
Thanks in advance.
Sample the process to find out what it's doing or waiting on.
Peter's correct in the approach, though you may be able to figure it out through simple inspection. "Not responding" means that you're not processing events on your event queue for at least 5 seconds (used to be 2 seconds, but they upped it in 10.4). For a UI process, this would create a spinning wait cursor, but for a non-UI process, you're not seeing the effects as easily.
If this is a runloop-based program, it means you're probably doing something with a blocking (synchronous) operation that should be done with the run loop and a callback (async). Alternately, you need a second thread to process your blocking operations so your mainthread can continue to respond to events.
My problem was actually the call for getting a process's PID using the signature FNDR... that part caused the "Not responding" error and it never was the locks or the long-polling part. Sorry about this guys. But thank God I already found the answer.