I have some model classes that I need to synchronize. There is a main object of type Library that contains several Album objects (imagine a music library, for example). Both the library and albums support serialization by implementing the NSCoding protocol. I need to synchronize modifications to the library with album modifications and the serialization of both classes, so that I know the updates don’t step on each other’s toes and that I don’t serialize objects in the middle of an update.
I thought I would just pass all the objects a shared dispatch queue, dispatch_async all the setter code and dispatch_sync the getters. This is simple and easy, but it does not work, as the program flow is recursive:
// In the Library class
- (void) encodeWithCoder: (NSCoder*) encoder
{
dispatch_sync(queue, ^{
[encoder encodeObject:albums forKey:…];
});
}
// In the Album class, same queue as above
- (void) encodeWithCoder: (NSCoder*) encoder
{
dispatch_sync(queue, ^{
[encoder encodeObject:items forKey:…];
});
}
Now serializing the library triggers the album serialization and since both method use dispatch_sync on the same queue, the code deadlocks. I have seen this pattern somewhere:
- (void) runOnSynchronizationQueue: (dispatch_block_t) block
{
if (dispatch_get_current_queue() == queue) {
block();
} else {
dispatch_sync(queue, block);
}
}
Does it make sense, will it work, is it safe? Is there an easier way to do the synchronization?
For an exposition on recursive locks in GCD, see the Recursive Locks section of the dispatch_async man page. To briefly summarize it, it's generally a good idea to rethink your object hierarchies when something like this occurs.
You can also use dispatch_set_target_queue() to control the hierarchy of execution (targeting subordinate queues at higher level queues) once you've refactored your code such that it's the operations rather than the objects that need to be controlled, but that's also assuming that you weren't able to simply use completion callbacks to accomplish the same synchronization effect (which, frankly, is more recommended since abstract queue hierarchies can be difficult to conceptualize and debug).
I know that's not really the answer you were looking for, but you're kind of in a "you can't get there from here" situation with GCD and a more fundamental rethink of how to do things "the GCD way" is almost certainly necessary in this case.
Related
I have 2 method calls exposed in my library as follows:
-(void) createFile {
dispatch_async(serialQueue, ^{
[fileObj createFile:fileInfo completion:^(void){
//completion block C1
}];
});
}
-(void) readFile:(NSUInteger)timeStamp {
dispatch_async(serialQueue, ^{
[fileObj readFile:timeStamp completion:^(void){
//completion block C2
}];
});
}
Now the createFile:fileInfo:completion and readFile:timeStamp:completion are in turn XPC calls that call into a process P1. Their implementation inside the process P1 looks like this:
#interface FileObject : NSObject
+ (instancetype) sharedInstance;
#property (nonatomic, strong) NSData *fileContents;
#end
#implementation FileObject
+ (instancetype)sharedInstance
{
static FileObject *sharedObj = nil;
static dispatch_once_t onceToken;
dispatch_once(&onceToken, ^{
sharedObj = [[self alloc] init];
});
return sharedObj;
}
- (void)createFile:(FileInfo *)fileInfo
completion:(void (^))completion {
FileObject *f = [FileObject sharedInstance];
//lock?
f.fileContents = //call some method;
//unlock
}
- (void)readFile:(NSUInteger)timeStamp
completion:(void (^))completion {
FileObject *f = [FileObject sharedInstance];
//do stuff with f.fileContents
}
#end
The point to be noted is that,a call to method createFile is capable of making modifications to the singleton obj and readFile:fileInfo is always called after createFile(serial queue used at invocation).
My question is given the serial queue usage for the two methods exposed in my library,
Do I still need to lock and unlock when I modify f.fileContents in readFileInfo:FileInfo:completion?
How about if multiple different processes call into my library? Will XPC to P1 handle that automatically or should I have to do something about it ?
You asked:
Do I need to use locks if the methods are already executed in a serial queue?
If you can interact with non thread-safe objects at the same time from multiple threads, yes, you can use locks (or a variety of other techniques) to synchronize your access. If you're doing all of your interaction from a serial queue, though, no locks are needed.
But you need to be careful because although you're using a serial queue, you've got completion handlers that could be running on different threads. But if you coordinate the interaction on those completion handlers so that none can ever run at the same time (more on that, below), then no locks (or other synchronization techniques) are needed.
A bunch of other thoughts based upon your code snippets (and conversations we've had elsewhere):
You're still using the GCD dispatch calls with asynchronous methods. There's a cognitive dissonance there. Either
lose the dispatch queues altogether and use completion handler patterns (this is the better solution IMHO, but you say you can't change your endpoints which apparently precludes this more logical approach); or
make createFile and readFile behave synchronously (with semaphores, for example) and then you can take advantage of the GCD queue behaviors (we generally eschew patterns like locks, semaphores, or anything that will "wait", but given that you're doing this on background queue, it's less problematic).
If you want these two fundamentally asynchronous tasks to behave synchronously, you can achieve this with locks, too, but I think dispatch semaphores are more the logical pattern if you're going to use dispatch queues.
The better pattern would be to make createFile and readFile create custom, asynchronous, NSOperation subclasses. This is a much stronger pattern than locks or semaphores in GCD because it minimizes the deadlocking risks.
Completely unrelated to the question here, I would not suggest making FileObject a singleton. Each FileObject is associated with a particular file, and it has no business being a singleton. This is especially important because we discovered offline that you're contemplating this being an XPC service with multiple requests coming in, and the singleton pattern is antithetical to that.
Plus, as an aside, FileObject doesn't pass the basic criteria for when you'd use singletons. For discussion on the considerations of singletons (which we don't need to repeat here, especially since it's unrelated to your main question), see What is so bad about singletons? Singleton's have their place, but this is a scenario where you likely will want separate instances, so the singleton pattern seems especially ill-suited.
I've been reading about KVC and Cocoa Scripting, and how properties can be used for this. I have a model class in mind, but the element/property data has to be obtained from the Internet. But the design of properties and KVC looks like it assumes fast & in-memory retrieval, while network calls can be slow and/or error-prone. How can these be reconciled?
For speed, do we just say "screw it" and post a waiting icon? (Of course, we should keep things multi-threaded so the UI doesn't stop while we wait.)
If your property is supposed to be always available, we could set it to nil if the resource call gets an error. But we would have no way to get the specifics. Worse would be a property that supports "missing values," then nil would represent that and we would have no spare state to use for errors.
Although Apple-events support error handling, I couldn't use it because between my potentially error-generating model calls and the Apple event, the KVC layer would drop the error to the floor (of oblivion). The Scripting Bridge API saw this problem, since its designers added a secret protocol to handle errors.
Am I wrong? Is there a way to handle errors with KVC-based designs?
Addendum
I forgot to mention exceptions. Objective-C now supports them, but the little I read about them implies that they're meant for catastrophic "instead of straight crashing" use, not for regular error handling like in C++. Except for that, they could've been useful here....
I think I understand what you're asking now. I would say using KVC (or property getters) is not a good way to accomplish what you're trying to do. If the code gets called on the main thread, you will then block that thread. which you don't want to do. As you have discovered you'll also have a hard time returning other state information such as errors.
Instead, you should use block syntax to create an asynchronous method that operates on a background queue. Here is a basic template for what this might look like:
// called from main thread
- (void) fetchDataInBackgroundWithCompletionHandler:(void (^)(id responseData, NSError *error))handler
{
// perform in background
dispatch_async(dispatch_get_global_queue(DISPATCH_QUEUE_PRIORITY_DEFAULT, 0), ^()
{
// perform your background operation here
// call completion block on main thread
dispatch_async(dispatch_get_main_queue(), ^{
if(// whatever your error case is)
{
handler(nil, error);
}
else // success
{
handler(responseData, nil);
}
});
});
}
This also gives you the benefit of being able to pass in as many other parameters are you want as well as return as many values as you want in the completion block.
Some very good examples of this pattern can be seen in AFNetworking, which is one of the more popular networking libraries written for iOS. All of the calls in the library can be made from the main queue and will return on the main queue asycnhronously while performing all networking in the background.
When subclassing NSOperation to get a little chunk of work done, I've found out it's pretty easy to deadlock. Below I have a toy example that's pretty easy to understand why it never completes.
I can only seem to think through solutions that prevent the deadlock from the caller perspective, never the callee. For example, the caller could continue to run the run loop, not wait for finish, etc. If the main thread needs to be message synchronously during the operation, I'm wondering if there is a canonical solution that an operation subclasser can implement to prevent this type of deadlocking. I'm only just starting to dip my toe in async programming...
#interface ToyOperation : NSOperation
#end
#implementation ToyOperation
- (void)main
{
// Lots of work
NSString *string = #"Important Message";
[self performSelector:#selector(sendMainThreadSensitiveMessage:) onThread:[NSThread mainThread] withObject:string waitUntilDone:YES];
// Lots more work
}
- (void)sendMainThreadSensitiveMessage:(NSString *)string
{
// Update the UI or something that requires the main thread...
}
#end
- (int)main
{
ToyOperation *op = [[ToyOperation alloc] init];
NSOperationQueue *opQ = [[NSOperationQueue alloc] init];
[opQ addOperations: #[ op ] waitUntilFinished:YES]; // Deadlock
return;
}
If the main thread needs to be message synchronously during the
operation, I'm wondering if there is a canonical solution that an
operation subclasser can implement to prevent this type of
deadlocking.
There is. Never make a synchronous call to the main queue. And a follow-on: Never make a synchronous call from the main queue. And, really, it can be summed up as Never make a synchronous call from any queue to any other queue.
By doing that, you guarantee that the main queue is not blocked. Sure, there may be an exceptional case that tempts you to violate this rule and, even, cases where it really, truly, is unavoidable. But that very much should be the exception because even a single dispatch_sync() (or NSOpQueue waitUntilDone) has the potential to deadlock.
Of course, data updates from queue to queue can be tricky. There are several options; a concurrency safe data layer (very hard), only passing immutable objects or copies of the data (typically to the main queue for display purposes -- fairly easy, but potentially expensive), or you can go down the UUID based faulting like model that Core Data uses. Regardless of how you solve this, the problem isn't anything new compared to any other concurrency model.
The one exception is when replacing locks with queues (For example, instead of using #synchronized() internally to a class, use a serial GCD queue and use dispatch_sync() to that queue anywhere that a synchronized operation must take place. Faster and straightforward.). But this isn't so much an exception as solving a completely different problem.
I'm debating whether or not to move to a GCD-based pattern for multi-threaded accessors. I've been using custom lock-based synchronization in accessors for years, but I've found some info (Intro to GCD) and there seems to be pros of a GCD-based approach. I'm hoping to start a dialog here to help myself and others weigh the decision.
The pattern looks like:
- (id)something
{
__block id localSomething;
dispatch_sync(queue, ^{
localSomething = [something retain];
});
return [localSomething autorelease];
}
- (void)setSomething:(id)newSomething
{
dispatch_async(queue, ^{
if(newSomething != something)
{
[something release];
something = [newSomething retain];
[self updateSomethingCaches];
}
});
}
On the pro side: you get the benefit of, possibly, non-blocking write access; lower overhead than locks (maybe?); safety from forgetting to unlock before returning from critical code sections; others?
Cons: Exception handling is non-existent so you have to code this into every block in which you might need it.
Is this pattern potentially the recommended method of writing multithreaded accessors?
Are there standard approaches for creating dispatch queues for this purpose? In other words, best practices for trading off granularity? With locks, for example, locking on each attribute is more fine grain than locking on the entire object. With dispatch queues, I could imagine that creation of a single queue for all objects would create performance bottlenecks, so are per-object queues appropriate? Obviously , the answer is highly dependent on the specific application, but are there known performance tradeoffs to help gauge the feasibility of the approach.
Any information / insight would be appreciated.
Is this pattern potentially the recommended method of writing
multithreaded accessors?
I guess you wrote that with a serial queue in mind, but there is no reason for it. Consider this:
dispatch_queue_t queue = dispatch_queue_create("com.example", DISPATCH_QUEUE_CONCURRENT);
// same thing as your example
- (NSString*)something {
__block NSString *localSomething;
dispatch_sync(queue, ^{
localSomething = _something;
});
return localSomething;
}
- (void)setSomething:(NSString*)something {
dispatch_barrier_async(queue, ^{
_something = something;
});
}
It reads concurrently but uses a dispatch barrier to disable concurrency while the write is happening. A big advantage of GCD is that allows concurrent reads instead locking the whole object like #property (atomic) does.
Both asyncs (dispatch_async, dispatch_barrier_async) are faster from the client point of view, but slower to execute than a sync because they have to copy the block, and having the block such a small task, the time it takes to copy becomes meaningful. I rather have the client returning fast, so I'm OK with it.
I have a question regarding thread safety in Objective-C. I've read a couple of other answers, some of the Apple documentation, and still have some doubts regarding this, so thought I'd ask my own question.
My question is three fold:
Suppose I have an array, NSMutableArray *myAwesomeArray;
Fold 1:
Now correct me if I'm mistaken, but from what I understand, using #synchronized(myAwesomeArray){...} will prevent two threads from accessing the same block of code. So, basically, if I have something like:
-(void)doSomething {
#synchronized(myAwesomeArray) {
//some read/write operation on myAwesomeArray
}
}
then, if two threads access the same method at the same time, that block of code will be thread safe. I'm guessing I've understood this part properly.
Fold 2:
What do I do if myAwesomeArray is being accessed by multiple threads from different methods?
If I have something like:
- (void)readFromArrayAccessedByThreadOne {
//thread 1 reads from myAwesomeArray
}
- (void)writeToArrayAccessedByThreadTwo {
//thread 2 writes to myAwesomeArray
}
Now, both the methods are accessed by two different threads at the same time. How do I ensure that myAwesomeArray won't have problems? Do I use something like NSLock or NSRecursiveLock?
Fold 3:
Now, in the above two cases, myAwesomeArray was an iVar in memory. What if I have a database file, that I don't always keep in memory. I create a databaseManagerInstance whenever I want to perform database operations, and release it once I'm done. Thus, basically, different classes can access the database. Each class creates its own instance of DatabaseManger, but basically, they are all using the same, single database file. How do I ensure that data is not corrupted due to race conditions in such a situation?
This will help me clear out some of my fundamentals.
Fold 1
Generally your understanding of what #synchronized does is correct. However, technically, it doesn't make any code "thread-safe". It prevents different threads from aquiring the same lock at the same time, however you need to ensure that you always use the same synchronization token when performing critical sections. If you don't do it, you can still find yourself in the situation where two threads perform critical sections at the same time. Check the docs.
Fold 2
Most people would probably advise you to use NSRecursiveLock. If I were you, I'd use GCD. Here is a great document showing how to migrate from thread programming to GCD programming, I think this approach to the problem is a lot better than the one based on NSLock. In a nutshell, you create a serial queue and dispatch your tasks into that queue. This way you ensure that your critical sections are handled serially, so there is only one critical section performed at any given time.
Fold 3
This is the same as Fold 2, only more specific. Data base is a resource, by many means it's the same as the array or any other thing. If you want to see the GCD based approach in database programming context, take a look at fmdb implementation. It does exactly what I described in Fold2.
As a side note to Fold 3, I don't think that instantiating DatabaseManager each time you want to use the database and then releasing it is the correct approach. I think you should create one single database connection and retain it through your application session. This way it's easier to manage it. Again, fmdb is a great example on how this can be achieved.
Edit
If don't want to use GCD then yes, you will need to use some kind of locking mechanism, and yes, NSRecursiveLock will prevent deadlocks if you use recursion in your methods, so it's a good choice (it is used by #synchronized). However, there may be one catch. If it's possible that many threads will wait for the same resource and the order in which they get access is relevant, then NSRecursiveLock is not enough. You may still manage this situation with NSCondition, but trust me, you will save a lot of time using GCD in this case. If the order of the threads is not relevant, you are safe with locks.
As in Swift 3 in WWDC 2016 Session Session 720 Concurrent Programming With GCD in Swift 3, you should use queue
class MyObject {
private let internalState: Int
private let internalQueue: DispatchQueue
var state: Int {
get {
return internalQueue.sync { internalState }
}
set (newValue) {
internalQueue.sync { internalState = newValue }
}
}
}
Subclass NSMutableArray to provide locking for the accessor (read and write) methods. Something like:
#interface MySafeMutableArray : NSMutableArray { NSRecursiveLock *lock; } #end
#implementation MySafeMutableArray
- (void)addObject:(id)obj {
[self.lock lock];
[super addObject: obj];
[self.lock unlock];
}
// ...
#end
This approach encapsulates the locking as part of the array. Users don't need to change their calls (but may need to be aware that they could block/wait for access if the access is time critical). A significant advantage to this approach is that if you decide that you prefer not to use locks you can re-implement MySafeMutableArray to use dispatch queues - or whatever is best for your specific problem. For example, you could implement addObject as:
- (void)addObject:(id)obj {
dispatch_sync (self.queue, ^{ [super addObject: obj] });
}
Note: if using locks, you'll surely need NSRecursiveLock, not NSLock, because you don't know of the Objective-C implementations of addObject, etc are themselves recursive.