Wait for a thread inside a C++ static object - dll

I have a static object that needs to initialize an imaging API. The allocated resources of this imaging API need to be released by the same thread.
So I'm starting a thread in my static object that initializes everything and then waits for a counter to reach zero. When this happens the thread cleans everything up and exits.
This is an unmanaged class inside a managed library, so I can't use System::Threading::Thread (needs a managed static member function) or std::thread (compiler error, not supported with /clr).
So I have to start my thread like:
CreateThread(NULL, 0, (LPTHREAD_START_ROUTINE)&Initialize, this, 0, 0);
Everything works fine: the initialization is done and the API functions work. But when I close the application, the usage counter of my static object reaches zero and yet the thread never calls the cleanup function, as if the thread had been killed. Is there a way to make sure the thread continues to exist and runs to completion?

After turning this around in every possible way, adding events and so on, I guess this is not possible. I'll have to restructure my code: encapsulate the unmanaged class inside a managed class, and move the thread into the managed class.

I think you could proceed in one of two ways:
1. Wrap the resources in RAII-style classes, and refactor so that the objects live on the stack of the thread you create. Their destructors then run when the thread loop exits, without any additional cleanup call. If the thread returns correctly when your counter reaches 0, this should be the simplest and cleanest way of addressing this.
2. Intercept the WM_CLOSE message in a window procedure, do the necessary cleanup, and then pass the message on, effectively "stalling" it until you are ready to close. Even though you are in a DLL you can still set up a window procedure and a message pump; you don't need a GUI for that. I am, however, not 100% sure whether you'll receive the WM_CLOSE message intended for the application that "owns" your DLL; it's not something I've tried yet.
With the second approach the WindowProc is called on a different thread, so you will have to add some form of event-based messaging to your thread's loop so that it knows when to run the cleanup procedure.
I also am not very familiar with CLR, so there might be a simpler way of interacting with those APIs than with raw C++ calls and handles.
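To make the first option concrete, here is a minimal sketch of the RAII idea. It uses portable std::thread purely for illustration (the question notes that std::thread is unavailable under /clr, where the thread would be started with CreateThread instead). ImagingSession and ImagingWorker are hypothetical names, not part of any real imaging API.

```cpp
#include <cassert>
#include <condition_variable>
#include <mutex>
#include <thread>

// Hypothetical stand-in for the imaging API: records which thread
// acquired it and which thread released it.
class ImagingSession {
public:
    ImagingSession(std::thread::id& acquired_on, std::thread::id& released_on)
        : released_on_(released_on) {
        acquired_on = std::this_thread::get_id();
    }
    ~ImagingSession() { released_on_ = std::this_thread::get_id(); }
private:
    std::thread::id& released_on_;
};

class ImagingWorker {
public:
    ImagingWorker() : worker_([this] { run(); }) {}
    ~ImagingWorker() { if (worker_.joinable()) worker_.join(); }

    void add_user() {
        std::lock_guard<std::mutex> lock(mutex_);
        ++users_;
    }
    void remove_user() {
        {
            std::lock_guard<std::mutex> lock(mutex_);
            --users_;
        }
        changed_.notify_all();
    }
    // Blocks until the worker thread has released the session and exited.
    void join() { worker_.join(); }

    std::thread::id acquired_on, released_on;

private:
    void run() {
        // The session lives on the worker thread's stack.
        ImagingSession session(acquired_on, released_on);
        std::unique_lock<std::mutex> lock(mutex_);
        changed_.wait(lock, [this] { return users_ == 0; });
    }   // ~ImagingSession runs here, on the worker thread

    std::mutex mutex_;
    std::condition_variable changed_;
    int users_ = 1;        // the creator counts as the first user
    std::thread worker_;   // declared last so it starts after the other members exist
};

// Demo: the session is released on the same thread that acquired it.
bool released_on_acquiring_thread() {
    ImagingWorker worker;
    worker.add_user();      // two users
    worker.remove_user();   // one user
    worker.remove_user();   // zero users: the worker leaves run()
    worker.join();
    return worker.acquired_on == worker.released_on &&
           worker.released_on != std::this_thread::get_id();
}
```

Note that the sketch drops the last user and joins the worker while the program is still running normally; it does not show what happens once the process is already exiting.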

Related

VB.Net: appropriate synchronisation object for waiting, not protecting

I have BBController instances (my custom objects), some of which may need to wait for a few others to complete first (dependencies). I have decided to have each controller lock some synchronisation object at initialisation, let's call it a Padlock, and unlock it when it's done processing. Once it is unlocked, any controllers that depend on (or were waiting for) that controller can continue. So this is not about protecting a section of code by admitting one thread at a time, but about telling anything that depends on an output to wait until that output is available.
I have experience with semaphores in Objective-C, so I thought I could use those here: each controller would initialise its semaphore with a value of 0 and, when finished, signal it with a value of infinite or max. While that would work, I'm sure there is a better synchronisation object for this, because the semaphore's value is of no use here: any number of BBControllers may continue once it is signalled. I am new to VB.Net.

Thread safe hooking of DirectX Device

I successfully hooked the BeginScene/EndScene methods of DirectX 9's DeviceEx in order to override regions on the screen of a graphics application. I did it by overwriting the first 'line' of the function pointed to by the appropriate vtable entry (42 for EndScene) with an x86 jump instruction.
The problem is that when I want to call the original EndScene method, I have to write back the original code that was overwritten by the jump. This operation is not thread safe, and the application has two devices used by two threads.
I tried overwriting the vtable entry, and also copying the vtable and pointing the COM interface pointer at the copy; neither way worked. I guess the original function pointer is cached somewhere or was optimized in at compile time.
I thought about copying the whole original method body to another memory block, but I'm afraid of two problems: (1) (the easy one, I think) I don't know how to discover the length of the method, and (2) I don't know whether the function body stores offsets that are relative to the function's location in memory.
I'm trying to hook WPF's device, in case that helps.
Does anyone know a thread-safe way to do such hooking?
Answering my own question: it seems that for my purpose (running another method before or instead of the original one within my own process), a 'trampoline' is the answer. Generally, it means I need to create another code segment that does exactly what the overwritten assembly instructions did.
Because it is not an easy task, using an external library is recommended.
A discussion about this topic:
How to create a trampoline function for hook

Detect when block is added to Grand Central Dispatch?

I have an iOS application that uses NSThreads for concurrent tasks. I am trying to migrate it to Grand Central Dispatch (GCD) for handling concurrency.
The problem is that the app needs to know how many threads have been created since a given time, and how many of the threads spawned since that time are currently running.
At the moment this is done with a category that swizzles the -main method of NSThread. The swizzled method simply increments the total number of running threads and then decrements the same variable before it returns.
The problem is that when I use GCD's dispatch_async it does not create an NSThread, hence my category approach does not work. How can I achieve the same while using GCD to handle concurrency?
What I would like to detect is when a new block is added to GCD, and when that block has been executed.
Any suggestions on how to achieve this are very welcome.
EDIT
Many thanks to @ipmcc and @RyanR for helping me out on this. :) I believe I need to explain more about the background and what I am trying to accomplish.
What I am actually trying to do is extend the iOS testing framework Frank. Frank embeds a small web server within a given app, which makes it possible to send HTTP requests to the iOS application and thereby simulate events, such as a swipe or a tap gesture.
I would like to extend it so that it waits until all work triggered by a specific simulated event has ended before it returns a response to the request.
However, I found it hard to detect exactly what work was triggered by the received event. That's how I arrived at the solution of resetting a thread counter, incrementing it for every thread created after the event was simulated, decrementing it when a thread finishes, and then blocking until the thread count reaches zero again. I know this approach is not perfect either, and it won't work with GCD.
Is there any other way to achieve it? Another possible solution I have thought of is to specify that everything must run synchronously except the thread handling the HTTP request. However, I don't know if this is possible.
Any suggestions on how to achieve blocking after each simulated event until work triggered by that event has completed?
The problem is that the app needs to know how many threads have been
created since a given time, and how many of the threads spawned since
that time are currently running.
You will not be able to get this information from GCD. One of the points of GCD is that you do not manage the thread pool. It is opaque. You'll note that even pthreads, the underlying threading library on which NSThread and GCD are built, does not have a (public) means to enumerate all existing threads or get the number of running threads. This is not going to be doable without hard core low level hackery. If you need to control or know the number of threads, then you need to be the one to spawn and manage them, and GCD is the wrong abstraction for you.
At the moment this is done with a category that swizzles the -main
method of NSThread. The swizzled method simply increments the total
number of running threads and then decrements the same variable before
it returns.
Note that this only tells you the number of threads started using NSThread. As mentioned, NSThread is a fairly high level abstraction on top of pthreads. There is nothing to prevent library code from spawning its own threads using the pthreads API that will be invisible to your count.
The problem is that when I use GCD's dispatch_async it does not create
an NSThread, hence my category approach does not work. How can I
achieve the same while using GCD to handle concurrency?
In short, you can't. If you want to go forth and patch functions all over the various frameworks, then you should look up a library called mach_override. (But please don't.)
What I would like to detect is when a new block is added to GCD, and
when that block has been executed.
Since GCD uses thread pools, the act of adding a block does not imply a new thread. (And that's sorta the whole point.)
If you have some limited resource whose consumption you need to manage, the traditional way to do that would be with a limiting semaphore, but that is just one option.
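For what it's worth, a limiting semaphore can be sketched in a few lines. This is a generic illustration built from a mutex and a condition variable, not GCD's dispatch_semaphore; CountingSemaphore and peak_concurrency are names made up for the example.

```cpp
#include <algorithm>
#include <cassert>
#include <condition_variable>
#include <mutex>
#include <thread>
#include <vector>

// Minimal counting semaphore: at most `permits` holders at a time.
class CountingSemaphore {
public:
    explicit CountingSemaphore(int permits) : permits_(permits) {}

    void acquire() {
        std::unique_lock<std::mutex> lock(mutex_);
        available_.wait(lock, [this] { return permits_ > 0; });
        --permits_;
    }
    void release() {
        {
            std::lock_guard<std::mutex> lock(mutex_);
            ++permits_;
        }
        available_.notify_one();
    }

private:
    std::mutex mutex_;
    std::condition_variable available_;
    int permits_;
};

// Starts `tasks` threads that all use the limited resource and returns the
// highest number of threads observed inside the guarded region at once.
int peak_concurrency(int tasks, int limit) {
    CountingSemaphore slots(limit);
    std::mutex stats_mutex;
    int active = 0;
    int peak = 0;
    std::vector<std::thread> workers;
    for (int i = 0; i < tasks; ++i) {
        workers.emplace_back([&] {
            slots.acquire();                       // wait for a free slot
            {
                std::lock_guard<std::mutex> lock(stats_mutex);
                peak = std::max(peak, ++active);
            }
            std::this_thread::yield();             // placeholder for real work
            {
                std::lock_guard<std::mutex> lock(stats_mutex);
                --active;
            }
            slots.release();                       // hand the slot back
        });
    }
    for (auto& w : workers) w.join();
    return peak;
}
```

However many threads are started, the observed peak never exceeds the limit, which is the whole point: you bound consumption of the resource rather than counting threads.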
This whole question just reeks of a poor design. Like the number of pthreads, GCD's queue widths are opaque/non-public. Your previous solution was not particularly viable (as discussed), and further efforts are likely to yield similarly poor solutions. You should really rethink your architecture such that knowing how many threads are running isn't important.
EDIT: Thanks for the clarification. There's not really a generic way, from the outside, to tell when all the "work" is done. What if an action sets up a timer that won't call back for ten minutes? At the extreme, consider this: the main runloop continues to spin for the entire life of the app, and as long as the main runloop is spinning, "work" could be being done on it.
In order to detect "doneness" your app has to signal doneness. In order to signal doneness, the app has to have some way (internal to itself) to know it's done. Put differently, the app can't tell something else (i.e. Frank) something it doesn't know. One way to go about this would be to encapsulate all the work you do in your app in NSOperations. NSOperation/NSOperationQueue provide good ways of reporting "doneness." At the simplest level, you could wrap the code where you kick off work in an NSBlockOperation, then add a completion block to that operation that signals something else when it's done, and enqueue it to an NSOperationQueue for execution. (You could also do this with dispatch_group and dispatch_group_notify if you prefer working in the GCD style.)
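The group idea is easy to sketch outside of Cocoa as well. The following is a rough C++ analogue of the enter/leave/wait pattern, assuming every piece of work is explicitly registered with the group; WorkGroup and handle_event_and_wait are hypothetical names, and this is not how dispatch_group itself is implemented.

```cpp
#include <atomic>
#include <cassert>
#include <condition_variable>
#include <mutex>
#include <thread>
#include <vector>

// Tracks outstanding work: enter() before handing work off, leave() when done.
class WorkGroup {
public:
    void enter() {
        std::lock_guard<std::mutex> lock(mutex_);
        ++pending_;
    }
    void leave() {
        std::lock_guard<std::mutex> lock(mutex_);
        if (--pending_ == 0) idle_.notify_all();
    }
    // Blocks until every enter() has been matched by a leave().
    void wait() {
        std::unique_lock<std::mutex> lock(mutex_);
        idle_.wait(lock, [this] { return pending_ == 0; });
    }

private:
    std::mutex mutex_;
    std::condition_variable idle_;
    int pending_ = 0;
};

// Simulates an event that kicks off three pieces of work, then waits for all
// of them. Returns how many had finished by the time wait() returned.
int handle_event_and_wait() {
    WorkGroup group;
    std::atomic<int> finished{0};
    std::vector<std::thread> workers;
    for (int i = 0; i < 3; ++i) {
        group.enter();                 // register before the work is handed off
        workers.emplace_back([&] {
            ++finished;                // placeholder for the real work
            group.leave();             // the app signals its own doneness
        });
    }
    group.wait();
    int seen = finished.load();
    for (auto& w : workers) w.join();
    return seen;
}
```

The important property is the one described above: the waiter learns about completion only because the work itself reports it.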
If you have specific questions about how to package up your app's work into NSOperations, I would suggest starting a new question.
You can hook into the dispatch introspection functions (introspection.h, methods all start with dispatch_introspection), but you have to link with that library which is supposed to be only for debugging. I don't think you can include that in a release build. Your best bet would be to encapsulate GCD into your own object, so all your code submits blocks to execute through that object and it submits them to GCD after tracking whatever you're interested in. You won't be able to track thread consumption though, because GCD intentionally abstracts that and reuses threads.
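A wrapper of that kind could look roughly like this. It is a sketch under the assumption that all code submits work through the wrapper; TrackedSubmitter is a made-up name, and the Backend callable is only a stand-in for whatever actually runs the block (dispatch_async in the GCD case, plain threads in the demo).

```cpp
#include <atomic>
#include <cassert>
#include <functional>
#include <thread>
#include <utility>
#include <vector>

class TrackedSubmitter {
public:
    using Task = std::function<void()>;
    using Backend = std::function<void(Task)>;   // stand-in for dispatch_async

    explicit TrackedSubmitter(Backend backend) : backend_(std::move(backend)) {}

    // Counts the block, then hands a wrapped version to the backend.
    void submit(Task task) {
        ++submitted_;
        backend_([this, task] {
            ++running_;
            task();
            --running_;
            ++completed_;
        });
    }

    int submitted() const { return submitted_.load(); }
    int running() const { return running_.load(); }
    int completed() const { return completed_.load(); }

private:
    Backend backend_;
    std::atomic<int> submitted_{0};
    std::atomic<int> running_{0};
    std::atomic<int> completed_{0};
};

// Demo backend: one plain thread per block, joined before the counts are read.
bool all_blocks_accounted_for() {
    std::vector<std::thread> threads;
    TrackedSubmitter tracker([&threads](TrackedSubmitter::Task task) {
        threads.emplace_back(std::move(task));
    });
    for (int i = 0; i < 4; ++i) tracker.submit([] {});
    for (auto& t : threads) t.join();
    return tracker.submitted() == 4 && tracker.completed() == 4 &&
           tracker.running() == 0;
}
```

This counts blocks, not threads, which matches the caveat above: the wrapper cannot see how the backend maps blocks onto threads.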

Should my block based methods return on the main thread or not when creating an iOS cloud integration framework?

I am in the middle of creating a cloud integration framework for iOS. We let you save, query, count and remove, each with synchronous and asynchronous variants, the asynchronous ones with selector/callback and block implementations. What is the correct practice: running the completion blocks on the main thread or on a background thread?
For simple cases, I just parameterize it and do all the work I can on secondary threads:
By default, callbacks will be made on any thread (where it is most efficient and direct - typically once the operation has completed). This is the default because messaging via main can be quite costly.
The client may optionally specify that the message must be made on the main thread. This way, it requires one line or argument. If safety is more important than efficiency, then you may want to invert the default value.
You could also attempt to batch and coalesce some messages, or simply use a timer on the main run loop to vend.
Consider both joined and detached models for some of your work.
If you can reduce the task to a result (remove the capability for incremental updates, if not needed), then you can simply run the task, do the work, and provide the result (or error) when complete.
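A minimal sketch of the parameterized approach, written in C++ rather than Objective-C so it stays self-contained. MainQueue, CallbackThread and save_async are hypothetical names; MainQueue stands in for the main run loop, and the "work" is a placeholder.

```cpp
#include <cassert>
#include <functional>
#include <mutex>
#include <queue>
#include <thread>
#include <utility>

enum class CallbackThread { Any, Main };

// Stand-in for the main run loop: other threads post, the main thread drains.
class MainQueue {
public:
    void post(std::function<void()> task) {
        std::lock_guard<std::mutex> lock(mutex_);
        tasks_.push(std::move(task));
    }
    // Called on the main thread; runs everything posted so far.
    void drain() {
        std::queue<std::function<void()>> pending;
        {
            std::lock_guard<std::mutex> lock(mutex_);
            std::swap(pending, tasks_);
        }
        while (!pending.empty()) {
            pending.front()();
            pending.pop();
        }
    }

private:
    std::mutex mutex_;
    std::queue<std::function<void()>> tasks_;
};

// Does the work on a secondary thread; the caller picks where the callback runs.
std::thread save_async(int value, std::function<void(int)> done,
                       CallbackThread where, MainQueue& main_queue) {
    return std::thread([=, &main_queue] {
        int result = value * 2;   // placeholder for the real operation
        if (where == CallbackThread::Main) {
            main_queue.post([done, result] { done(result); });
        } else {
            done(result);         // default: call back directly from this thread
        }
    });
}

bool callbacks_land_where_requested() {
    MainQueue main_queue;
    const std::thread::id main_id = std::this_thread::get_id();
    std::thread::id any_id, main_callback_id;
    std::thread a = save_async(
        1, [&](int) { any_id = std::this_thread::get_id(); },
        CallbackThread::Any, main_queue);
    std::thread b = save_async(
        2, [&](int) { main_callback_id = std::this_thread::get_id(); },
        CallbackThread::Main, main_queue);
    a.join();
    b.join();
    main_queue.drain();           // the main thread runs its queued callback
    return any_id != main_id && main_callback_id == main_id;
}
```

Inverting the default, as suggested above for the safety-first case, would just mean making Main the value callers get when they pass nothing.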
Apple's NSURLConnection class calls back to its delegate methods on the thread from which it was initiated, while doing its work on a background thread. That seems like a sensible procedure. It's likely that a user of your framework will not enjoy having to worry about thread safety when writing a simple callback block, as they would if you created a new thread to run it on.
The two sides of the coin: If the callback touches the GUI, it has to be run on the main thread. On the other hand, if it doesn't, and is going to do a lot of work, running it on the main thread will block the GUI, causing frustration for the end user.
It's probably best to put the callback on a known, documented thread, and let the app programmer make the determination of the effect on the GUI.

Low-level details of the implementation of performSelectorOnMainThread:

Was wondering if anyone knows, or has pointers to good documentation that discusses, the low-level implementation details of Cocoa's 'performSelectorOnMainThread:' method.
My best guess, and one I think is probably pretty close, is that it uses Mach ports or an abstraction on top of them to provide inter-thread communication, passing selector information along as part of the Mach message.
Right? Wrong? Thanks!
Update 09:39 AM PST
Thank you Evan DiBiase and Mecki for the answers, but to clarify: I understand what happens in the run loop. What I'm looking for an answer to is: "where is the method getting queued? how is the selector information getting passed into the queue?" I'm looking for more than Apple's doc info; I've read 'em.
Update 14:21 PST
Chris Hanson brings up a good point in a comment: my objective here is not to learn the underlying mechanisms in order to take advantage of them in my own code. Rather, I'm just interested in a better conceptual understanding of the process of signaling another thread to execute code. As I said, my own research leads me to believe that it takes advantage of Mach messaging for IPC to pass selector information between threads, but I'm specifically looking for concrete information on what is happening, so I can be sure I'm understanding things correctly. Thanks!
Update 03/06/09
I've opened a bounty on this question because I'd really like to see it answered, but if you are trying to collect please make sure you read everything, including all currently posted answers, comments to both these answers and to my original question, and the update text I posted above. I'm looking for the lowest-level detail of the mechanism used by performSelectorOnMainThread: and the like, and as I mentioned earlier, I suspect it has something to do with Mach ports but I'd really like to know for sure. The bounty will not be awarded unless I can confirm the answer given is correct. Thanks everyone!
Yes, it does use Mach ports. What happens is this:
1. A block of data encapsulating the perform info (the target object, the selector, the optional object argument to the selector, etc.) is enqueued in the thread's run loop info. This is done using @synchronized, which ultimately uses pthread_mutex_lock.
2. CFRunLoopSourceSignal is called to signal that the source is ready to fire.
3. CFRunLoopWakeUp is called to let the main thread's run loop know it's time to wake up. This is done using mach_msg.
From the Apple docs:
Version 1 sources are managed by the run loop and kernel. These sources use Mach ports to signal when the sources are ready to fire. A source is automatically signaled by the kernel when a message arrives on the source’s Mach port. The contents of the message are given to the source to process when the source is fired. The run loop sources for CFMachPort and CFMessagePort are currently implemented as version 1 sources.
I'm looking at a stack trace right now, and this is what it shows:
0 mach_msg
1 CFRunLoopWakeUp
2 -[NSThread _nq:]
3 -[NSObject(NSThreadPerformAdditions) performSelector:onThread:withObject:waitUntilDone:modes:]
4 -[NSObject(NSThreadPerformAdditions) performSelectorOnMainThread:withObject:waitUntilDone:]
Set a breakpoint on mach_msg and you'll be able to confirm it.
One More Edit:
To answer the question of the comment:
what IPC mechanism is being used to
pass info between threads? Shared
memory? Sockets? Mach messaging?
NSThread internally stores a reference to the main thread, and via that reference you can get the NSRunLoop of that thread. An NSRunLoop is internally a linked list; adding an NSTimer object to the run loop creates a new list element and appends it to the list. So you could say it's shared memory: the linked list, which actually belongs to the main thread, is simply modified from within a different thread. Mutexes/locks (possibly even NSLock objects) make sure that editing the linked list is thread-safe.
Pseudo code:
// Main Thread
for (;;) {
    lock(runloop->runloopLock);
    task = NULL;
    do {
        task = getNextTask(runloop);
        if (!task) {
            // function below unlocks the lock and
            // atomically sends thread to sleep.
            // If thread is woken up again, it will
            // get the lock again before continuing
            // running. See "man pthread_cond_wait"
            // as an example function that works
            // this way
            wait_for_notification(runloop->newTasks, runloop->runloopLock);
        }
    } while (!task);
    unlock(runloop->runloopLock);
    processTask(task);
}

// Other thread, perform selector on main thread
// selector is char *, containing the selector
// object is void *, reference to object
timer = createTimerInPast(selector, object);
runloop = getRunloopOfMainThread();
lock(runloop->runloopLock);
addTask(runloop, timer);
wake_all_sleeping(runloop->newTasks);
unlock(runloop->runloopLock);
Of course this is oversimplified; most details are hidden inside the functions here. E.g. getNextTask will only return a timer if the timer should already have fired. If the fire date of every timer is still in the future and there is no other event to process (like a keyboard or mouse event from the UI, or a sent notification), it returns NULL.
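The same lock-and-wake pattern can be written as a small runnable C++ sketch. This only illustrates the queue-under-a-lock idea from the pseudo code; it says nothing about how NSRunLoop or CFRunLoop is really built, and RunLoop, perform and stop are made-up names. There are no timers or fire dates here, just plain tasks.

```cpp
#include <cassert>
#include <condition_variable>
#include <deque>
#include <functional>
#include <mutex>
#include <thread>
#include <utility>

class RunLoop {
public:
    using Task = std::function<void()>;

    // Callable from any thread: enqueue under the lock, then wake the loop.
    void perform(Task task) {
        {
            std::lock_guard<std::mutex> lock(mutex_);
            tasks_.push_back(std::move(task));
        }
        new_tasks_.notify_all();
    }

    // Asks run() to return once the queue is empty.
    void stop() {
        {
            std::lock_guard<std::mutex> lock(mutex_);
            stopping_ = true;
        }
        new_tasks_.notify_all();
    }

    // Runs on the owning ("main") thread.
    void run() {
        for (;;) {
            Task task;
            {
                std::unique_lock<std::mutex> lock(mutex_);
                // wait() releases the lock while sleeping and takes it
                // again before returning, like pthread_cond_wait.
                new_tasks_.wait(lock, [this] { return stopping_ || !tasks_.empty(); });
                if (tasks_.empty()) return;   // stopping and nothing left to do
                task = std::move(tasks_.front());
                tasks_.pop_front();
            }
            task();                           // processed outside the lock
        }
    }

private:
    std::mutex mutex_;
    std::condition_variable new_tasks_;
    std::deque<Task> tasks_;
    bool stopping_ = false;
};

// Demo: another thread posts a task; it executes on the thread running the loop.
bool performed_on_loop_thread() {
    RunLoop loop;
    std::thread::id ran_on;
    std::thread poster([&] {
        loop.perform([&] { ran_on = std::this_thread::get_id(); });
        loop.stop();
    });
    loop.run();   // the calling thread plays the role of the main thread
    poster.join();
    return ran_on == std::this_thread::get_id();
}
```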
I'm still not sure what the question is. A selector is nothing more than a C string containing the name of the method being called. Every method is a normal C function, and there is a string table that maps the method names, as strings, to function pointers. Those are the very basics of how Objective-C actually works.
As I wrote below, an NSTimer object is created that gets a pointer to the target object and a pointer to a C string containing the method name. When the timer fires, it finds the right C function to call by looking the name up in the string table (hence it needs the method's string name) of the target object (hence it needs a reference to it).
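As a toy model of the lookup described here (a name-to-function table consulted at call time), and explicitly not the real Objective-C runtime, which keeps method lists per class: Object, Method and perform below are illustrative names only.

```cpp
#include <cassert>
#include <string>
#include <unordered_map>

struct Object;
using Method = void (*)(Object& self);   // every "method" is a plain function

struct Object {
    std::unordered_map<std::string, Method> methods;   // name -> function table
    int value = 0;
};

// Looks the selector name up in the target's table and calls the function.
// Returns false if the target has no method with that name.
bool perform(Object& target, const std::string& selector) {
    auto it = target.methods.find(selector);
    if (it == target.methods.end()) return false;
    it->second(target);
    return true;
}

// Demo: register one method, then invoke it by name.
int value_after_increment() {
    Object counter;
    counter.methods["increment"] = [](Object& self) { ++self.value; };
    perform(counter, "increment");
    return counter.value;
}
```

This is why the deferred call only needs two things to be stored: a reference to the target and the method's name.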
Not exactly the implementation, but pretty close to it:
Every thread in Cocoa has an NSRunLoop (it's always there; you never need to create one for a thread). performSelectorOnMainThread: creates an NSTimer object, one that fires only once and whose fire time already lies in the past (so it needs to fire immediately), then gets the NSRunLoop of the main thread and adds the timer object to it. As soon as the main thread goes idle, it searches its run loop for the next event to process (or goes to sleep if there is nothing to process, and is woken up again as soon as an event is added) and performs it. Either the main thread is busy when you schedule the call, in which case it will process the timer event as soon as it has finished its current task, or it is sleeping at that moment, in which case adding the event wakes it up and it processes the event immediately.
A good source for looking up how Apple is most likely doing it (nobody can say for sure, since after all it's closed source) is GNUstep. GCC can handle Objective-C (it's not just an extension that only Apple ships; even the standard GCC can handle it), but Obj-C without all the basic classes Apple ships is rather useless. So the GNU community re-implemented the most common Obj-C classes you use on the Mac, and their implementation is open source.
Here you can download a recent source package.
Unpack it and have a look at the implementation of NSThread, NSObject and NSTimer for details. I guess Apple is not doing it much differently. I could probably prove it using gdb, but why would they do it much differently? It's a clever approach that works very well :)
The documentation for NSObject's performSelectorOnMainThread:withObject:waitUntilDone: method says:
This method queues the message on the run loop of the main thread using the default run loop modes—that is, the modes associated with the NSRunLoopCommonModes constant. As part of its normal run loop processing, the main thread dequeues the message (assuming it is running in one of the default run loop modes) and invokes the desired method.
As Mecki said, a more general mechanism that could be used to implement -performSelectorOn… is NSTimer.
NSTimer is toll-free bridged to CFRunLoopTimer. An implementation of CFRunLoopTimer – although not necessarily the one actually used for normal processes in OS X – can be found in CFLite (an open-source subset of CoreFoundation; package CF-476.14 in the Darwin 9.4 source code). (CF-476.15, corresponding to OS X 10.5.5, is not yet available.)