dispatch_sync vs. dispatch_async on main queue - objective-c

Bear with me, this is going to take some explaining. I have a function that looks like the one below.
Context: "aProject" is a Core Data entity named LPProject with an array named 'memberFiles' that contains instances of another Core Data entity called LPFile. Each LPFile represents a file on disk and what we want to do is open each of those files and parse its text, looking for #import statements that point to OTHER files. If we find #import statements, we want to locate the file they point to and then 'link' that file to this one by adding a relationship to the core data entity that represents the first file. Since all of that can take some time on large files, we'll do it off the main thread using GCD.
- (void) establishImportLinksForFilesInProject:(LPProject *)aProject {
    dispatch_queue_t taskQ = dispatch_get_global_queue(DISPATCH_QUEUE_PRIORITY_DEFAULT, 0);
    for (LPFile *fileToCheck in aProject.memberFiles) {
        if (/* some condition is met */) {
            dispatch_async(taskQ, ^{
                // Here, we do the scanning for #import statements.
                // When we find a valid one, we put the whole path to the imported
                // file into an array called 'verifiedImports'.

                // Go back to the main thread and update the model
                // (Core Data is not thread-safe.)
                dispatch_sync(dispatch_get_main_queue(), ^{
                    NSLog(@"Got to main thread.");
                    for (NSString *import in verifiedImports) {
                        // Add the relationship to the Core Data LPFile entity.
                    }
                });//end block
            });//end block
        }
    }
}
Now, here's where things get weird:
This code works, but I'm seeing an odd problem. If I run it on an LPProject that has a few files (about 20), it runs perfectly. However, if I run it on an LPProject that has more files (say, 60-70), it does NOT run correctly. We never get back to the main thread, the NSLog(@"Got to main thread.") never appears, and the app hangs. BUT (and this is where things get REALLY weird), if I run the code on the small project FIRST and THEN run it on the large project, everything works perfectly. It's ONLY when I run the code on the large project first that the trouble shows up.
And here's the kicker, if I change the second dispatch line to this:
dispatch_async(dispatch_get_main_queue(), ^{
(That is, use async instead of sync to dispatch the block to the main queue), everything works all the time. Perfectly. Regardless of the number of files in a project!
I'm at a loss to explain this behavior. Any help or tips on what to test next would be appreciated.

This is a common issue related to disk I/O and GCD. Basically, GCD is probably spawning one thread for each file, and at a certain point you've got too many threads for the system to service in a reasonable amount of time.
Every time you call dispatch_async() and in that block you attempt to do any I/O (for example, it looks like you're reading some files here), it's likely that the thread in which that block of code is executing will block (get paused by the OS) while it waits for the data to be read from the filesystem. The way GCD works is such that when it sees that one of its worker threads is blocked on I/O and you're still asking it to do more work concurrently, it'll just spawn a new worker thread. Thus if you try to open 50 files on a concurrent queue, it's likely that you'll end up causing GCD to spawn ~50 threads.
This is too many threads for the system to meaningfully service, and you end up starving your main thread for CPU.
The way to fix this is to use a serial queue instead of a concurrent queue to do your file-based operations. It's easy to do. You'll want to create a serial queue and store it as an ivar in your object so you don't end up creating multiple serial queues. So remove this call:
dispatch_queue_t taskQ = dispatch_get_global_queue(DISPATCH_QUEUE_PRIORITY_DEFAULT, 0);
Add this in your init method:
taskQ = dispatch_queue_create("com.yourcompany.yourMeaningfulLabel", DISPATCH_QUEUE_SERIAL);
Add this in your dealloc method:
dispatch_release(taskQ);
And add this as an ivar in your class declaration:
dispatch_queue_t taskQ;
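With those pieces in place, the loop from the question only needs to target the serial taskQ ivar instead of the global concurrent queue. A minimal sketch, with the scanning details elided as in the question:

// Sketch only: taskQ is the serial queue ivar created in -init above.
for (LPFile *fileToCheck in aProject.memberFiles) {
    if (/* some condition is met */) {
        dispatch_async(taskQ, ^{
            // Scan this file for #import statements. Files are now processed
            // one at a time, so at most one worker thread blocks on I/O.
            dispatch_sync(dispatch_get_main_queue(), ^{
                // Update the Core Data model on the main thread.
            });
        });
    }
}

Whether the hop back to the main queue uses dispatch_sync or dispatch_async is a separate question; with a serial queue either works, since only one worker ever waits on the main queue at a time.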

I believe Ryan is on the right path: there are simply too many threads being spawned when a project has 1,500 files (the number I decided to test with).
So, I refactored the code above to work like this:
- (void) establishImportLinksForFilesInProject:(LPProject *)aProject
{
    dispatch_queue_t taskQ = dispatch_get_global_queue(DISPATCH_QUEUE_PRIORITY_DEFAULT, 0);
    dispatch_async(taskQ, ^{
        // Create a new Core Data context on this thread using the same persistent
        // data store as the main thread. Pass the objectID of aProject to access
        // the managed object for that project in this thread's context:
        NSManagedObjectID *projectID = [aProject objectID];
        for (LPFile *fileToCheck in [[backgroundContext objectWithID:projectID] memberFiles])
        {
            if (/* some condition is met */)
            {
                // Here, we do the scanning for #import statements.
                // When we find a valid one, we put the whole path to the
                // imported file into an array called 'verifiedImports'.

                // Pass this ID to the main thread in the dispatch call below to
                // access the same file in the main thread's context:
                NSManagedObjectID *fileID = [fileToCheck objectID];

                // Go back to the main thread and update the model
                // (Core Data is not thread-safe.)
                dispatch_async(dispatch_get_main_queue(), ^{
                    for (NSString *import in verifiedImports)
                    {
                        LPFile *targetFile = [mainContext objectWithID:fileID];
                        // Add the relationship to targetFile.
                    }
                });//end block
            }
        }
        // Easy way to tell when we're done processing all files.
        // Could add a dispatch_async(main_queue) call here to do something like UI updates, etc.
    });//end block
}
So, basically, we're now spawning one thread that reads all the files instead of one thread per file. Also, it turns out that calling dispatch_async() on the main queue is the correct approach: the worker thread will dispatch that block to the main thread and NOT wait for it to return before proceeding to scan the next file.
This implementation essentially sets up a "serial" queue as Ryan suggested (the for loop is the serial part of it), but with one advantage: when the for loop ends, we're done processing all the files and we can just stick a dispatch_async(main_queue) block there to do whatever we want (see the sketch below). It's a very nice way to tell when the concurrent processing task is finished, and that didn't exist in my old version.
The disadvantage here is that it's a bit more complicated to work with Core Data on multiple threads. But this approach seems to be bulletproof for projects with 5,000 files (the most I've tested).
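For reference, the "done" hook mentioned above is just one more dispatch at the end of the outer block; a minimal sketch:

// Sketch: goes right after the for loop, still inside the outer dispatch_async block.
dispatch_async(dispatch_get_main_queue(), ^{
    // Every file has been scanned by this point: update the UI, hide a
    // progress indicator, post a notification, etc.
});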

I think it's easier to understand with a diagram.
For the situation the author described (dispatch_sync back to the main queue):
|taskQ|             ***********start|
|dispatch_1         ***********|---------
|dispatch_2         *************|---------
.
|dispatch_n         ***************************|----------
|main queue (sync)| **start to dispatch to main|
                    |--dispatch_1--|--dispatch_2--|--dispatch_3--| ... |--dispatch_n--|
Each worker block sits in dispatch_sync waiting for the main queue to reach its block, which keeps the main queue so busy that the task eventually fails.

Related

Thread Handling with NSURLSessionDataTask

In a nutshell, I am trying to display data from a publicly available JSON file on the web. The process is the following:
I initiate the download with an NSURLSessionDataTask, then I parse and display the JSON or handle errors if they occur. Here is my relevant code:
- (void) initiateDownload {
    NSURLSessionConfiguration *sessionConfig = [NSURLSessionConfiguration defaultSessionConfiguration];
    sessionConfig.timeoutIntervalForRequest = 5.0f;
    sessionConfig.timeoutIntervalForResource = 20.0f;
    NSURLSession *urlSession = [NSURLSession sessionWithConfiguration:sessionConfig];

    // urlRequest is built elsewhere in the class.
    NSURLSessionDataTask *downloadWeatherTask = [urlSession dataTaskWithRequest:urlRequest
                                                               completionHandler:^(NSData *data, NSURLResponse *response, NSError *downloadError) {
        if (downloadError) {
            dispatch_sync(dispatch_get_main_queue(), ^{
                [self errorReceived:downloadError];
            });
        } else if (data) {
            dispatch_sync(dispatch_get_main_queue(), ^{
                [self parseWeatherJSON:data];
            });
        }
    }];
    [downloadWeatherTask resume];
}
I have a couple of questions about this:
I am not all that familiar with thread handling. Although I added the
dispatch_sync(dispatch_get_main_queue(), ...)
to both completion blocks, and it seems to work, I am not sure this is the best way to be thread safe (before, I received all kinds of error messages and displaying the data took 10 seconds after the download had already finished). Is there a better way to handle the download and the threads, or is mine an acceptable solution?
I would like the user to be able to initiate the download process manually any time he/she wants to refresh the data displayed. At first, I initialized the NSURLSessionDataTask once and made it available anywhere within the class; so I could just call resume every time a refresh is called. However, I could not find a command to re-do the download process. Once I called [downloadWeatherTask resume], I was unable to start the task from the beginning.
So, I added the task to a separate function (the one you see above) and initialize it there every time the function is called. This works fine, but I am not sure it is the best way to go. For example, is it memory safe or am I creating a new task every time the user initiates a refresh call and will eventually run out of memory?
Thank you for the answers!
A little more info: I use the latest Xcode 11 and target iOS 9 and up.
NSURLSession will, by default, use a dedicated background serial queue for completion blocks (and delegate methods, if you use those). But you really want to make sure you trigger UI updates from the main queue (retrieved via dispatch_get_main_queue()). And you generally want to avoid updating properties and ivars from multiple threads (unless they have some thread safety built into them, which is unusual), so dispatching the updates to those properties/ivars back to the main queue is a nice, simple way to achieve thread safety.
So, bottom line, what you’re doing here is fine.
However, I could not find a command to re-do the download process.
You perform (resume) a given task only once. If you want to perform a similar request again, instantiate a new NSURLSessionDataTask.
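A minimal sketch of that per-refresh pattern; the session and request properties here are illustrative, not taken from the question:

// Sketch: create a fresh data task for every manual refresh. Completed tasks
// are released normally, so this does not accumulate memory over time.
- (void)refreshWeather {
    NSURLSessionDataTask *task =
        [self.urlSession dataTaskWithRequest:self.urlRequest
                           completionHandler:^(NSData *data, NSURLResponse *response, NSError *error) {
        dispatch_async(dispatch_get_main_queue(), ^{
            if (error) {
                [self errorReceived:error];
            } else if (data) {
                [self parseWeatherJSON:data];
            }
        });
    }];
    [task resume];
}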

How to wait for tasks to finish in Objective-C

I have a question about how to wait for several tasks performed inside a method to finish as a group, even when that method is called a few times from different threads.
For example, when I call the method twice (call 1 and call 2), I want to see the following:
call 1:
STEP 1, STEP 2;
call 2:
STEP 1, STEP 2;
but often I see the two calls interleaved:
call 1:
call 2:
STEP 1, STEP 1,
STEP 2, STEP 2,
See the code below; maybe it helps to understand the problem better.
// Called many times per second.
- (void)update:(UpdateObjectClass *)updateObject {
    // Step 1: update common data (for example, an array).
    // This is a long process (about 1-2 seconds).
    [self updateData:updateObject];

    // Step 2: update the table.
    [self updateTableView];
}
I have tried to use dispatch_barrier_async, but I don't understand how to use it properly.
Thank you for any help ;)
I'm borrowing from @remus's answer.
Assuming that -update: is being called on the same instance of an object (and not a whole bunch of objects), you can use @synchronized to enforce that your code is only performed one at a time.
dispatch_async(dispatch_get_global_queue(DISPATCH_QUEUE_PRIORITY_DEFAULT, 0), ^(void) {
    @synchronized(self) {
        // Run your updates.
        [self updateData:updateObject];
        dispatch_async(dispatch_get_main_queue(), ^(void) {
            // Back on the main thread: refresh the UI.
            [self updateTableView];
        });
    }
});
However
I am going to go out on a limb and guess that the reason you need this code to be performed synchronously is because your -updateData: method is doing something that is not thread safe, such as modifying an NSMutableDictionary or NSMutableArray. If this is the case, you should really use that @synchronized trick on the mutable thing itself.
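A minimal sketch of that idea, assuming the shared state is a mutable array property (the names are illustrative, not from the question):

// Sketch: synchronize on the mutable collection itself rather than on self.
- (void)updateData:(UpdateObjectClass *)updateObject {
    @synchronized(self.items) {
        [self.items addObject:updateObject]; // or whatever mutation updateData: performs
    }
}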
I highly recommend that you post the code to -[updateData:] if it is not too long.
You are trying to solve the problem at the wrong level and with the information in the question it is unlikely that any solution can be provided.
Given the output reported we know that updateData and updateTableView are asynchronous and use one or more tasks. We don't know anything about what queue(s) they use, how many tasks they spawn, whether they have an outer task which does not complete until sub tasks have, etc., etc.
If you look at the standard APIs you will see async methods often take a completion block. Internally such methods may use multiple tasks on multiple queues, but they are written such that all such tasks are completed before they call the completion block. Can you redesign updateData so it takes a completion block (which you will then use to invoke updateTableView)?
The completion block model doesn't by itself address all the ways you might need to schedule a task based on the completion of other task(s), there are other mechanisms including: serial queues, dispatch groups and dispatch barriers.
Serial queues enable a task to be scheduled after the completion of all other tasks previously added to the queue. Dispatch groups enable multiple tasks scheduled on multiple queues to be tagged as belonging to a group, and a further task to be scheduled to run after all tasks in the group have completed. Dispatch barriers enable a task to be scheduled to run only after all tasks previously scheduled on a concurrent queue have completed.
You need to study these mechanisms and then embed the appropriate ones for your needs into your design of updateData, updateTableView and ultimately update itself. You can use a bottom-up approach, essentially the opposite of what your question is attempting: start at the lowest level and ask whether one or more tasks should be a group, have a barrier, be sequential, or might need a completion block. Then move "upward".
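A minimal sketch of the completion-block shape described above; the private serial queue and method bodies are assumptions, not code from the question:

// Sketch: updateData: does its work on a private serial queue and invokes the
// completion block on the main queue once everything it started has finished.
- (void)updateData:(UpdateObjectClass *)updateObject completion:(void (^)(void))completion {
    dispatch_async(self.updateQueue, ^{ // self.updateQueue: a serial queue created once
        // ... the long-running (1-2 second) work goes here ...
        dispatch_async(dispatch_get_main_queue(), ^{
            if (completion) completion();
        });
    });
}

// update: then becomes:
- (void)update:(UpdateObjectClass *)updateObject {
    [self updateData:updateObject completion:^{
        [self updateTableView];
    }];
}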
Probably not the answer you were hoping for! HTH
Consider using dispatch_async to run the array updates and then update your tableView. You can do this inside of a single method:
dispatch_async(dispatch_get_global_queue(DISPATCH_QUEUE_PRIORITY_DEFAULT, 0), ^(void) {
    // Run your updates on a background queue.
    [self updateData:updateObject];
    dispatch_async(dispatch_get_main_queue(), ^(void) {
        // Back on the main thread: refresh the table.
        [self updateTableView];
    });
});
Edit
I'd consider modifying your updateData method to run inside the async queue; when it is done, call [self updateTableView]; directly from that method. If it's not too long, can you add the [self updateData:updateObject] code (or a portion of) to your question?

Track down a block/deadlock that is causing a freeze when doing executeFetchRequest

I have an app that is freezing frequently and permanently. When this happens, I click pause in Xcode and see that on the main thread it's always stopping at a line of code that executes a fetch request on the MOC. I also see the output __psynch_mutexwait + 17 in the thread list on the left. This is making me assume that the app is hitting deadlock or for some reason the MOC is blocked.
My first instinct was that I might be executing a fetch request on a non-main thread so I put in logs to check, but this isn't the case. All fetches are happening on the main thread.
How can I go about tracking down what might be blocking here? Is there something more I should be looking for in the stack traces?
Is it a problem that I am setting properties of objects fetched on the main thread on other threads? ie, fetch objectA on main but then pass it to another thread and do something like objectA.someNumber = [NSNumber numberWithInt:2] ?
Is it a problem that I am setting properties of objects fetched on the main thread on other threads? ie, fetch objectA on main but then pass it to another thread and do something like objectA.someNumber = [NSNumber numberWithInt:2] ?
Yes! I've tried this.
When you fetch ObjA on ThreadA and then pass it to ThreadB for some operations, it can deadlock.
It is precisely because your fetch requests are running on the main thread that your app is blocking. Remember that the main thread is a serial queue and that no other block (or event) will run until your fetch request is done (even if in theory it could, because your block is in a waiting state). This explains why, when you break, you always hit a __psynch_mutexwait.
You should run your fetch requests on another queue and if necessary use the result on the main queue. One way to achieve this is with the following pattern:
- (void) fetchRequest1
{
    // not_the_main_queue stands for any private/background queue,
    // e.g. one created with dispatch_queue_create().
    dispatch_async(not_the_main_queue, ^(void) {
        // Code the fetch request here.
        // Then use the result on the main queue if necessary.
        dispatch_async(dispatch_get_main_queue(), ^(void) {
            // Use the result on the main queue.
        });
    });
}
Also note that it's often not necessary to run anything on the main queue. In fact, your app will usually run more smoothly if you run as little as possible on that thread. Of course, there are some things that must be done there, and in those cases you could use the following pattern to ensure that they are:
- (void) runBlockOnMainThread:(void (^)(void))block
{
    dispatch_queue_t thisQ = dispatch_get_current_queue();
    dispatch_queue_t mainQ = dispatch_get_main_queue();
    if (thisQ != mainQ)
        dispatch_sync(mainQ, block);
    else
        block();
}
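For what it's worth, dispatch_get_current_queue() has since been deprecated; a sketch of the same helper without it (my variation, not part of the original answer):

// Sketch: same idea using NSThread instead of dispatch_get_current_queue().
- (void) runBlockOnMainThread:(void (^)(void))block
{
    if ([NSThread isMainThread]) {
        block();
    } else {
        dispatch_sync(dispatch_get_main_queue(), block);
    }
}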

What to do when users trigger the same action several times while waiting for a download?

I am designing an iPhone application. The user searches for something, we grab data from the net, then we update the table.
The pseudocode would be:
[DoThisAtbackground ^{
    LoadData();
    [DoThisAtForeground ^{
        UpdateTableAndView();
    }];
}];
What if, before the first search is done, the user searches for something else?
What's the industry-standard way to solve the issue?
Keep track of which thread is still running and only update the table when ALL threads have finished?
Update the view every time a thread finishes?
How exactly do we do this?
I suggest you take a look at the iOS Human Interface Guidelines. Apple thinks it's pretty important that all applications behave in about the same way, so they've written an extensive document about these kinds of issues.
In the guidelines there are two things that are relevant to your question:
Make Search Quick and Rewarding: "When possible, also filter remote data while users type. Although filtering users' typing can result in a better search experience, be sure to inform them and give them an opportunity to opt out if the response time is likely to delay the results by more than a second or two."
Feedback: "Feedback acknowledges people’s actions and assures them that processing is occurring. People expect immediate feedback when they operate a control, and they appreciate status updates during lengthy operations."
Although there is of course a lot of nonsense in these guidelines, I think the above points are actually a good idea to follow. As a user, I expect something to happen when searching, and when you update the view every time a thread is finished, the user will see the fastest response. Yes, it might be results the user doesn't want, but something is happening! For example, take the Safari web browser in iOS: Google autocomplete displays results even when you're typing, and not just when you've finished entering your search query.
So I think it's best to go with your second option.
If you're performing a REST request for data against your remote server, you can always cancel the in-flight request and start a new one without updating the table, which is a way to go. Requests that have time to finish will update the UI and the others won't. For example, using ASIHTTPRequest:
- (void)serverPerformDataRequestWithQuery:(NSString *)query andDelegate:(__weak id <ServerDelegate>)delegate {
    // Cancel whatever is currently in flight so it can no longer touch the UI.
    [currentRequest setFailedBlock:nil];
    [currentRequest cancel];
    // Start a fresh request (kHOST is the request URL; completion block setup elided).
    currentRequest = [[ASIHTTPRequest alloc] initWithURL:kHOST];
    [currentRequest startAsynchronous];
}
Let me know if you need an answer for the local SQLite databases too as it is much more complicated.
You could use NSOperationQueue to cancel all pending operations, but it still would not cancel the existing operation. You would still have to implement something to cancel the existing operation... which also works to early-abort the operations in the queue.
I usually prefer straight GCD, unless there are other benefits in my use cases that are a better fit for NSOperationQueue.
Also, if your loading has an external cancel mechanism, you want to cancel any pending I/O operations.
If the operations are independent, consider a concurrent queue, as it will allow the newer request to execute simultaneously as the other(s) are being canceled.
Also, if they are all I/O, consider if you can use dispatch_io instead of blocking a thread. As Monk would say, "You'll thank me later."
Consider something like this:
- (void)userRequestedNewSearch:(SearchInfo *)searchInfo {
    // Assign this operation a new token that uniquely identifies it.
    uint32_t token = [self nextOperationToken];

    // If your "loading" API has an external abort mechanism, you want to keep
    // track of the in-flight I/O so any existing I/O operations can be canceled
    // before dispatching new work.

    dispatch_async(myQueue, ^{
        // Try to load your data in small pieces, so you can exit as early as
        // possible. If you have to do a monolithic load, that's OK, but this
        // block will not exit until that stops.
        BOOL loadIsComplete = NO;
        while (!loadIsComplete) {
            if ([self currentToken] != token) return;
            // Load some data; set loadIsComplete when loading completes.
        }
        dispatch_async(dispatch_get_main_queue(), ^{
            // One last check before updating the UI...
            if ([self currentToken] != token) return;
            // Do your UI update operations.
        });
    });
}
It will early-abort any operation that is not the last one submitted. If you used NSOperationQueue you could call cancelAllOperations but you would still need a similar mechanism to early-abort the one that is currently executing.
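On the dispatch_io suggestion above, a minimal sketch of a non-blocking read (the path is a placeholder and the details are my own, not from the original answer):

// Sketch: read a file with dispatch_io instead of blocking a worker thread.
// Requires <fcntl.h> for O_RDONLY.
dispatch_queue_t q = dispatch_get_global_queue(DISPATCH_QUEUE_PRIORITY_DEFAULT, 0);
dispatch_io_t channel = dispatch_io_create_with_path(DISPATCH_IO_STREAM,
                                                     "/path/to/file", O_RDONLY, 0, q,
                                                     ^(int error) { /* channel closed */ });
dispatch_io_read(channel, 0, SIZE_MAX, q,
                 ^(bool done, dispatch_data_t data, int error) {
    // Process each chunk as it arrives; 'done' is true on the final call.
});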

In iOS does either NSURL or NSXML spawn a new thread?

I have a program that progresses as follows. I call a method called getCharacteristics. This method connects to a remote server via an NSURLConnection (all networking code is done in another file) and when it receives a response it makes a method call back to the original class. This original class then parses the data (XML) and stores its contents as a map.
The problem I'm having is that it appears that somewhere in this transaction another thread is being spawned off.
Here is sample code showing what I'm doing:
@property map

- (void) aMethod
{
    [[WebService getSingleton] callWebService: andReportBackTo:self];
    Print "Ready to Return"
    return map;
}

- (void) methodThatIsReportedBackToAfterWebServiceRecievesResponse
{
    // Parse the data and store it in map.
    Print "Done Parsing"
}
The problem that I am running into is that map is being returned before it can be fully created. Additionally, "Ready to Return" is being printed before "Done parsing" which suggests to me that there are multiple threads at work. Am I right? If so, would a simple lock be the best way to make it work?
NSURLConnection will execute in another thread if you tell it to execute asynchronously.
In my opinion the best way to deal with this would be to write your own delegate protocol, and use delegation to return your map when you have downloaded and parsed your data.
You could retrieve your data synchronously using NSURLConnection, but you may force the user to wait for an extended period of time especially if a connection timeout occurs. I would avoid this approach.
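A minimal sketch of that delegate approach, with illustrative names that are not from the question:

// Sketch: a delegate protocol so the caller is told when parsing has finished.
@protocol CharacteristicsDelegate <NSObject>
- (void)characteristicsDidLoad:(NSDictionary *)map;
- (void)characteristicsDidFailWithError:(NSError *)error;
@end

// The downloading/parsing class holds a weak delegate reference:
@property (nonatomic, weak) id<CharacteristicsDelegate> delegate;

// ...and at the end of methodThatIsReportedBackToAfterWebServiceRecievesResponse:
[self.delegate characteristicsDidLoad:self.map];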