Core Data and Concurrency using NSOperationQueues

Core Data and Concurrency using NSOperationQueues - objective-c

After using Instruments, I have found the a spot in my code that is very long running and blocking my UI: lots of Core Data fetches (it's part of a process of ingesting a large JSON packet and building up managed objects while making sure that objects have not be duplicated).
While my intentions are to break up this request into smaller pieces and process them serially, that only means I'll be spreading out those fetches - I anticipate the effect will be small bursts of jerkiness in the app instead of one long hiccup.
Everything I've read both in Apple's docs and online at various blog posts indicates that Core Data and concurrency is akin to poking a beehive. So, timidly I've sat down to give it the ol' college try. Below is what I've come up with, and I would appreciate someone wiser pointing out any errors I'm sure I've written.
The code posted below works. What I have read has me intimidated that I've surely done something wrong; I feel like if pulled the pin out of a grenade and am just waiting for it to go off unexpectedly!
NSBlockOperation *downloadAllObjectContainers = [NSBlockOperation blockOperationWithBlock:^{
NSArray *containers = [webServiceAPI findAllObjectContainers];
}];
[downloadAllObjectContainers setCompletionBlock:^{
NSManagedObjectContext *backgroundContext = [[NSManagedObjectContext alloc] initWithConcurrencyType:NSPrivateQueueConcurrencyType];
[backgroundContext setPersistentStoreCoordinator:[_managedObjectContext persistentStoreCoordinator]];
[[NSNotificationCenter defaultCenter] addObserverForName:NSManagedObjectContextDidSaveNotification
object:backgroundContext
queue:[NSOperationQueue mainQueue]
usingBlock:^(NSNotification *note) {
[_managedObjectContext mergeChangesFromContextDidSaveNotification:note];
}];
Builder *builder = [[Builder alloc] init];
[builder setManagedObjectContext:backgroundContext];
for (ObjectContainer *objCont in containers) { // This is the long running piece, it's roughly O(N^2) yuck!
[builder buildCoreDataObjectsFromContainer:objCont];
}
NSError *backgroundContextSaveError = nil;
if ([backgroundContext hasChanges]) {
[backgroundContext save:&backgroundContextSaveError];
}
}];
NSOperationQueue *background = [[NSOperationQueue alloc] init];
[background addOperation:downloadAllObjectContainers];

Since you are using NSPrivateQueueConcurrencyType you must be doing it for iOS5, you do not have to go through all the trouble of creating context in a background thread and merging it in the main thread.
All you need is to create a managed object context with concurrency type NSPrivateQueueConcurrencyType in the main thread and do all operations with managed objects inside a block passed in to managedObjectContext:performBlock method.
I recommend you take a look at WWDC2011 session 303 - What's New in Core Data on iOS.
Also, take a look at Core Data Release Notes for iOS5.
Here's a quote from the release notes:
NSManagedObjectContext now provides structured support for concurrent operations. When you create a managed object context using initWithConcurrencyType:, you have three options for its thread (queue) association
Confinement (NSConfinementConcurrencyType).
This is the default. You promise that context will not be used by any thread other than the one on which you created it. (This is exactly the same threading requirement that you've used in previous releases.)
Private queue (NSPrivateQueueConcurrencyType).
The context creates and manages a private queue. Instead of you creating and managing a thread or queue with which a context is associated, here the context owns the queue and manages all the details for you (provided that you use the block-based methods as described below).
Main queue (NSMainQueueConcurrencyType).
The context is associated with the main queue, and as such is tied into the application’s event loop, but it is otherwise similar to a private queue-based context. You use this queue type for contexts linked to controllers and UI objects that are required to be used only on the main thread.

Concurrency
Concurrency is the ability to work with the data on more than one queue at the same time. If you choose to use concurrency with Core Data, you also need to consider the application environment. For the most part, AppKit and UIKit are not thread-safe. In OS X in particular, Cocoa bindings and controllers are not threadsafe Core Data, Multithreading, and the Main Thread

Related

Asynchronous Cocoa - Preventing "simple" (obvious) deadlocks in NSOperation?

When subclassing NSOperation to get a little chunk of work done, I've found out it's pretty easy to deadlock. Below I have a toy example that's pretty easy to understand why it never completes.
I can only seem to think through solutions that prevent the deadlock from the caller perspective, never the callee. For example, the caller could continue to run the run loop, not wait for finish, etc. If the main thread needs to be message synchronously during the operation, I'm wondering if there is a canonical solution that an operation subclasser can implement to prevent this type of deadlocking. I'm only just starting to dip my toe in async programming...
#interface ToyOperation : NSOperation
#end
#implementation ToyOperation
- (void)main
{
// Lots of work
NSString *string = #"Important Message";
[self performSelector:#selector(sendMainThreadSensitiveMessage:) onThread:[NSThread mainThread] withObject:string waitUntilDone:YES];
// Lots more work
}
- (void)sendMainThreadSensitiveMessage:(NSString *)string
{
// Update the UI or something that requires the main thread...
}
#end
- (int)main
{
ToyOperation *op = [[ToyOperation alloc] init];
NSOperationQueue *opQ = [[NSOperationQueue alloc] init];
[opQ addOperations: #[ op ] waitUntilFinished:YES]; // Deadlock
return;
}

If the main thread needs to be message synchronously during the
operation, I'm wondering if there is a canonical solution that an
operation subclasser can implement to prevent this type of
deadlocking.
There is. Never make a synchronous call to the main queue. And a follow-on: Never make a synchronous call from the main queue. And, really, it can be summed up as Never make a synchronous call from any queue to any other queue.
By doing that, you guarantee that the main queue is not blocked. Sure, there may be an exceptional case that tempts you to violate this rule and, even, cases where it really, truly, is unavoidable. But that very much should be the exception because even a single dispatch_sync() (or NSOpQueue waitUntilDone) has the potential to deadlock.
Of course, data updates from queue to queue can be tricky. There are several options; a concurrency safe data layer (very hard), only passing immutable objects or copies of the data (typically to the main queue for display purposes -- fairly easy, but potentially expensive), or you can go down the UUID based faulting like model that Core Data uses. Regardless of how you solve this, the problem isn't anything new compared to any other concurrency model.
The one exception is when replacing locks with queues (For example, instead of using #synchronized() internally to a class, use a serial GCD queue and use dispatch_sync() to that queue anywhere that a synchronized operation must take place. Faster and straightforward.). But this isn't so much an exception as solving a completely different problem.

AFNetworking - Why does it spawn a network request thread?

I'm trying to understand Operations and Threads better, and looked to AFNetworking's AFURLConnectionOperation subclass for example, real-world, source code.
My current understanding is when instances of NSOperation are added to an operation queue, the queue, among other things, manages the thread responsible for executing the operation. In Apple's documentation of NSOperation it points out that even if subclasses return YES for -isConcurrent the operation will always be started on a separate thread (as of 10.6).
Based on Apple's strong language throughout the Thread Programming Guide and Concurrency Programming Guide, it seems like managing a thread is best left up to the internal implementation of NSOperationQueue.
However, AFNetworking's AFURLConnectionOperation subclass spawns a new NSThread, and the execution of the operation's -main method is pushed off onto this network request thread. Why? Why is this network request thread necessary? Is this a defensive programming technique because the library is designed to be used by a wide audience? Is it just less hassle for consumers of the library to debug? Is there a (subtle) performance benefit to having all networking activity on a dedicated thread?
(Added Jan 26th)
In a blog post by Dave Dribin, he illustrates how to move an operation back onto the main thread using the specific example of NSURLConnection.
My curiosity comes from the following section in Apple's Thread Programming Guide:
Keep Your Threads Reasonably Busy.
If you decide to create and
manage threads manually, remember that threads consume precious system
resources. You should do your best to make sure that any tasks you
assign to threads are reasonably long-lived and productive. At the
same time, you should not be afraid to terminate threads that are
spending most of their time idle. Threads use a nontrivial amount of
memory, some of it wired, so releasing an idle thread not only helps
reduce your application’s memory footprint, it also frees up more
physical memory for other system processes to use.
It seems to me that AFNetworking's network request thread isn't being "kept reasonably busy;" it's running an infinite while-loop for handling networking I/O. But, see, that's the point of this questions - I don't know and I am only guessing.
Any insight or deconstruction of AFURLConnectionOperation with specific regards to operations, threads (run loops?), and / or GCD would be highly beneficial to filling in the gaps of my understanding.

Its an interesting question and the answer is all about the semantics of how NSOperation and NSURLConnection interact and work together.
An NSURLConnection is itself an asynchronous task. It all happens in the background and calls its delegate periodically with the results. When you start an NSURLConnection it schedules the delegate callbacks using the runloop it is scheduled on, so a runloop must always be running on the thread you are executing an NSURLConnection on.
Therefore the method -start on our AFURLConnectionOperation will always have to return before the operation finishes so it can receive the callbacks. This requires that AFURLConnectionOperation be an asynchronous operation.
from: https://developer.apple.com/library/mac/documentation/Cocoa/Reference/NSOperation_class/index.html
The value of property is YES for operations that run asynchronously with respect to the current thread or NO for operations that run synchronously on the current thread. The default value of this property is NO.
But AFURLConnectionOperation overrides this method and returns YES as we would expect. Then we see from the class description:
When you call the start method of an asynchronous operation, that method may return before the corresponding task is completed. An asynchronous operation object is responsible for scheduling its task on a separate thread. The operation could do that by starting a new thread directly, by calling an asynchronous method, or by submitting a block to a dispatch queue for execution. It does not actually matter if the operation is ongoing when control returns to the caller, only that it could be ongoing.
AFNetworking creates a single network thread using a class method that it schedules all the NSURLConnection objects (and their resulting delegate callbacks) on. Here is that code from AFURLConnectionOperation
+ (void)networkRequestThreadEntryPoint:(id)__unused object {
#autoreleasepool {
[[NSThread currentThread] setName:#"AFNetworking"];
NSRunLoop *runLoop = [NSRunLoop currentRunLoop];
[runLoop addPort:[NSMachPort port] forMode:NSDefaultRunLoopMode];
[runLoop run];
}
}
+ (NSThread *)networkRequestThread {
static NSThread *_networkRequestThread = nil;
static dispatch_once_t oncePredicate;
dispatch_once(&oncePredicate, ^{
_networkRequestThread = [[NSThread alloc] initWithTarget:self selector:#selector(networkRequestThreadEntryPoint:) object:nil];
[_networkRequestThread start];
});
return _networkRequestThread;
}
Here is code from AFURLConnectionOperation showing them scheduling the NSURLConnection on the runloop of the AFNetwokring thread in all runloop modes
- (void)start {
[self.lock lock];
if ([self isCancelled]) {
[self performSelector:#selector(cancelConnection) onThread:[[self class] networkRequestThread] withObject:nil waitUntilDone:NO modes:[self.runLoopModes allObjects]];
} else if ([self isReady]) {
self.state = AFOperationExecutingState;
[self performSelector:#selector(operationDidStart) onThread:[[self class] networkRequestThread] withObject:nil waitUntilDone:NO modes:[self.runLoopModes allObjects]];
}
[self.lock unlock];
}
- (void)operationDidStart {
[self.lock lock];
if (![self isCancelled]) {
self.connection = [[NSURLConnection alloc] initWithRequest:self.request delegate:self startImmediately:NO];
NSRunLoop *runLoop = [NSRunLoop currentRunLoop];
for (NSString *runLoopMode in self.runLoopModes) {
[self.connection scheduleInRunLoop:runLoop forMode:runLoopMode];
[self.outputStream scheduleInRunLoop:runLoop forMode:runLoopMode];
}
//...
}
[self.lock unlock];
}
here [NSRunloop currentRunloop] retrieves the runloop on the AFNetworking thread instead of the mainRunloop as the -operationDidStart method is called from that thread. As a bonus we get to run the outputStream on the background thread's runloop as well.
Now AFURLConnectionOperation waits for the NSURLConnection callbacks and updates its own NSOperation state variables (cancelled, finished, executing) itself as the network request progresses. The AFNetworking thread spins its runloop repeatedly so that as NSURLConnections from potentially many AFURLConnectionOperations schedule their callbacks they are called and the AFURLConnectionOperation objects can react to them.
If you always plan to use queues to execute your operations, it is simpler to define them as synchronous. If you execute operations manually, though, you might want to define your operation objects as asynchronous. Defining an asynchronous operation requires more work, because you have to monitor the ongoing state of your task and report changes in that state using KVO notifications. But defining asynchronous operations is useful in cases where you want to ensure that a manually executed operation does not block the calling thread.
Also note that you can also use an NSOperation without a NSOperationQueue by calling -start and observing it until -isFinished returns YES. If AFURLConnectionOperation was implemented as a synchronous operation and blocked the current thread waiting for NSURLConnection to finish it would never actually finish as NSURLConnection would schedule its callbacks on the current runloop, which wouldn't be running as we would be blocking it. Therefore to support this valid scenario of using NSOperation we have to make the AFURLConnectionOperation asynchronous.
Answers to Questions
Yes, AFNetworking creates one thread which it uses to schedule all connections. Thread creation is expensive. (this is partly why GCD was created. GCD keeps a pool of threads running for you and dispatches blocks on the different threads as needed without having to create, destroy and manage threads yourself)
The processing is not done on the background AFNetworking thread. AFNetworking uses the completionBlock property of NSOperation to do its processing which is executed when finished is set to YES.
The exact execution context for your completion block is not guaranteed but is typically a secondary thread. Therefore, you should not use this block to do any work that requires a very specific execution context. Instead, you should shunt that work to your application’s main thread or to the specific thread that is capable of doing it. For example, if you have a custom thread for coordinating the completion of the operation, you could use the completion block to ping that thread.
the post processing of HTTP connections is handled in AFHTTPRequestOperation. This class creates a dispatch queue specifically for transforming response objects in the background and shunts the work off onto that queue. see here
- (void)setCompletionBlockWithSuccess:(void (^)(AFHTTPRequestOperation *operation, id responseObject))success
failure:(void (^)(AFHTTPRequestOperation *operation, NSError *error))failure
{
self.completionBlock = ^{
//...
dispatch_async(http_request_operation_processing_queue(), ^{
//...
I guess this begs the question could they have written AFURLConnectionOperation to not make a thread. I think the answer is yes as there is this API
- (void)setDelegateQueue:(NSOperationQueue*) queue NS_AVAILABLE(10_7, 5_0);
Which is meant to schedule your delegate callbacks on a specific operation queue as opposed to using the runloop. But as we're looking at the legacy part of AFNetworking and that API was only available in iOS 5 and OS X 10.7. Looking at the blame view on Github for AFURLRequestOperation we can see that mattt actually wrote the +networkRequestThread method coincidentally on the actual day the iPhone 4s and iOS 5 were announced back in 2011! As such we can reason that the thread exists because at the time it was written we can see that making a thread and scheduling your connections on it was the only way to receive callbacks from NSURLConnection in the background while running in an asynchronous NSOperation subclass.
the thread is created using the dispatch_once function. (see the extra code snipped i added as you suggested) This function ensures that the code enclosed in the block it runs will be run only once in the lifetime of the application. The AFNetworking thread is created when it is needed and then persists for the lifetime of the application
When i wrote NSURLConnectionOperation i meant AFURLConnectionOperation. I corrected that thanks for mentioning it :)

Leak reported by profiler when using NSThread. The project is compiled with ARC for Objective C and iOS

This is the way I am trying to manage my thread
-(void)ExecuteThread {
#autoreleasepool {
bInsideDifferentThread = YES;
//some code...
bInsideDifferentThread = NO;
}
[NSThread exit];
}
-(void)ThreadCallerEvent {
NSThread *myThread = [[NSThread alloc] initWithTarget:self selector:#selector(ExecuteThread) object:nil];
if (!bInsideThread)
[myThread start];
else
{
[myThread cancel];
}
}
I do it this way becuase I don't want the thread to be started until it has finished working. The problem is that this is generating leaks from a non released memory allocated at [NSThread init]
Any ideas of how to fix this problem?

I ran a fragment similar to yours and wasn't able to detect leaks; but the context is almost certainly different. I'm really not sure what ARC is doing with myThread in your example when it goes out of scope. A typical pattern for using NSThread would be:
[NSThread detachNewThreadSelector:#selector(executeThread)
toTarget:self
withObject:nil];
in which case you're not responsible for dealing directly with the thread you are detaching. (Note: I changed your method name to use camel case which is the preferred method and variable naming convention in Cocoa.)
All of the above said, managing threads are no longer the preferred way of achieving concurrent design. It's perfectly acceptable; but Apple is encouraging developers to migrate to GCD. Far better to think in terms of units of work that need to be performed concurrently.
Without understanding your needs more deeply in this case, it's hard to know what advantages, if any, working directly with threads might offer you; but I would consider looking at concurrent queues/GCD more closely. Perhaps you could simply use something like:
dispatch_async(dispatch_get_global_queue(DISPATCH_QUEUE_PRIORITY_DEFAULT, 0), ^{
// do your background work
});
and achieve both a clearer design for concurrency and avoid whatever memory management issues you are now seeing.

Is ThreadedCoreData example (from Apple) creating NSManagedObjectContext on main thread?

From http://goo.gl/MkV8V
You must create the managed context on the thread on which is will be used. If you use NSOperation, note that its init method is invoked on the same thread as the caller. You must not, therefore, create a managed object context for the queue in the queue’s init method, otherwise it is associated with the caller’s thread. Instead, you should create the context in main (for a serial queue) or start (for a concurrent queue).
From http://goo.gl/6CMO4
In ConnectionDidLoading method:
ParseOperation *parseOperation = [[ParseOperation alloc] initWithData:self.earthquakeData];
[self.parseQueue addOperation:parseOperation];
[parseOperation release]; // once added to the NSOperationQueue it's retained, we don't need it anymore
ConnectionDidiLoading is being called on the main thread. Now Inside the ParseOperation::initWIthData method we have something like this : (See ParseOperation.m file)
// setup our Core Data scratch pad and persistent store
managedObjectContext = [[NSManagedObjectContext alloc] init];
[self.managedObjectContext setUndoManager:nil];
SeismicXMLAppDelegate *appDelegate = (SeismicXMLAppDelegate *)[[UIApplication sharedApplication] delegate];
[self.managedObjectContext setPersistentStoreCoordinator:appDelegate.persistentStoreCoordinator];
#
My understanding is that this managedObjectContext is still created on the main thread.
Would appreciate if someone clarify or correct my understanding as it is quite unlikely that sample code from Apple is not correct.

Good point here. Seems the code is not correct (but maybe I'm wrong). Based on my experience you should create the context within the start method (for example) or something strange could happen.
The other rule is that you cannot pass NSManagedObject among treads. Pass NSManagedObjectID instead. Before passing them, you need to save to disk.
These rules habe been enforced in iOS 5. Starting from iOS 5 you could take advantage of Core Data new API for creating context in a private queue (there is also a main queue) and perform long running calculation there. In addition if you pass NSManagedObject between threads exceptions occur.

In my opinion you are correct. Apple seems to break their own rules in this sample code. However it seems in practice, and this is not much advertised, that as long as the managed object context is used only on one single thread, you are safe, regardless on which thread the context was created.

use NSOperationQueue as a LIFO stack?

i need to do a series of url calls (fetching WMS tiles). i want to use a LIFO stack so the newest url call is the most important. i want to display the tile on the screen now, not a tile that was on the screen 5 seconds ago after a pan.
i can create my own stack from a NSMutableArray, but i'm wondering if a NSOperationQueue can be used as a LIFO stack?

You can set the priority of operations in an operation queue using -[NSOperation setQueuePriority:]. You'll have to rejigger the priorities of existing operations each time you add an operation, but you can achieve something like what you're looking for. You'd essentially demote all of the old ones and give the newest one highest priority.

Sadly I think NSOperationQueues are, as the name suggests, only usable as queues — not as stacks. To avoid having to do a whole bunch of manual marshalling of tasks, probably the easiest thing is to treat your queues as though they were immutable and mutate by copying. E.g.
- (NSOperationQueue *)addOperation:(NSOperation *)operation toHeadOfQueue:(NSOperationQueue *)queue
{
// suspending a queue prevents it from issuing new operations; it doesn't
// pause any already ongoing operations. So we do this to prevent a race
// condition as we copy operations from the queue
queue.suspended = YES;
// create a new queue
NSOperationQueue *mutatedQueue = [[NSOperationQueue alloc] init];
// add the new operation at the head
[mutatedQueue addOperation:operation];
// copy in all the preexisting operations that haven't yet started
for(NSOperation *operation in [queue operations])
{
if(!operation.isExecuting)
[mutatedQueue addOperation:operation];
}
// the caller should now ensure the original queue is disposed of...
}
/* ... elsewhere ... */
NSOperationQueue *newQueue = [self addOperation:newOperation toHeadOfQueue:operationQueue];
[operationQueue release];
operationQueue = newQueue;
It seems at present that releasing a queue that is still working (as will happen to the old operation queue) doesn't cause it to cancel all operations, but that's not documented behaviour so probably not trustworthy. If you want to be really safe, key-value observe the operationCount property on the old queue and release it when it goes to zero.

I'm not sure if you're still looking a solution, but I've the same problem has been bugging me for a while, so I went ahead and implemented an operation stack here: https://github.com/cbrauchli/CBOperationStack. I've used it with a few hundred download operations and it has held up well.

Sadly, you cannot do that without running into some tricky issues, because:
Important You should always configure dependencies before running your operations or adding them to an operation queue. Dependencies added afterward may not prevent a given operation object from running. (From: Concurrency Programming Guide: Configuring Interoperation Dependencies)
Take a look at this related question: AFURLConnectionOperation 'start' method gets called before it is ready and never gets called again afterwards

Found a neat implementation of stack/LIFO features on top of NSOperationQueue. It can be used as a category that extends NSOperationQueue or an NSOperationQueue LIFO subclass.
https://github.com/nicklockwood/NSOperationStack

The easiest way would be to separate your operations and data you will be processing, so you can add operations to NSOperationQueue as usual and then take data from a stack or any other data structure you need.
var tasks: [MyTask]
...
func startOperation() {
myQueue.addOperation {
guard let task = tasks.last else {
return
}
tasks.removeLast()
task.perform()
}
}
Now, obviously, you might need to ensure that tasks collection can be used concurrently, but it's a much more common problem with lots of pre-made solutions than hacking your way around NSOperationQueue execution order.

We Keep Coding

sql objective-c vba vb.net react-native apache vue.js tensorflow api pandas