NSStream: Is there any airtight defense against blocking? - objective-c

Under the Stream Programming Guide's Polling versus Run-Loop Scheduling section, the last paragraph says:
It should be pointed out that neither the polling nor run-loop
scheduling approaches are airtight defenses against blocking. If the
NSInputStream hasBytesAvailable method or the NSOutputStream
hasSpaceAvailable method returns NO, it means in both cases that the
stream definitely has no available bytes or space. However, if either
of these methods returns YES, it can mean that there is available
bytes or space or that the only way to find out is to attempt a read
or a write operation (which could lead to a momentary block). The
NSStreamEventHasBytesAvailable and NSStreamEventHasSpaceAvailable
stream events have identical semantics.
So, it seems neither hasBytesAvailable/hasSpaceAvailable nor stream events provide a guarantee against blocking. Is there any way to get guaranteed non-blocking behaviour with streams? I could create a background thread to get guaranteed non-blocking behaviour, but I want to avoid doing that.
Also, I fail to understand why NSStream can't provide guaranteed non-blocking behaviour given that the low-level APIs (select, kqueue, etc.) can do so. Can someone explain why this is the case?

You either run your reading or writing in a different thread or you can't use NSStream. There are no other ways to get guaranteed non-blocking behavior.
For regular files and sockets you will most likely get non-blocking behavior if you schedule the stream on a run loop. But there are other types of stream that are not implemented on top of a file descriptor. By documenting the base class as not always non-blocking, Apple keeps its options open to implement streams that can't guarantee the non-blocking property.
But since we can't check the source code we can only speculate on this. You might want to file a bug with Apple requesting them to update the docs with that information.
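For reference, a minimal sketch of the run-loop approach, which is as close to non-blocking as NSStream gets without a second thread (the MyStreamHandler class and the way its input stream is created are assumptions for illustration):

// Sketch only: schedule the stream on the current run loop and react to
// events instead of polling. As the documentation warns, the read itself
// can still momentarily block; the delegate callback just makes that unlikely.
@interface MyStreamHandler : NSObject <NSStreamDelegate>
@property (nonatomic, strong) NSInputStream *inputStream; // created elsewhere
@end

@implementation MyStreamHandler

- (void)start {
    self.inputStream.delegate = self;
    [self.inputStream scheduleInRunLoop:[NSRunLoop currentRunLoop]
                                forMode:NSDefaultRunLoopMode];
    [self.inputStream open];
}

- (void)stream:(NSStream *)stream handleEvent:(NSStreamEvent)eventCode {
    if (eventCode & NSStreamEventHasBytesAvailable) {
        uint8_t buffer[4096];
        // This is the call that could still block for a moment.
        NSInteger bytesRead = [(NSInputStream *)stream read:buffer maxLength:sizeof(buffer)];
        if (bytesRead > 0) {
            // process buffer[0 .. bytesRead)
        }
    } else if (eventCode & (NSStreamEventErrorOccurred | NSStreamEventEndEncountered)) {
        [stream close];
        [stream removeFromRunLoop:[NSRunLoop currentRunLoop] forMode:NSDefaultRunLoopMode];
    }
}

@end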

Related

Is it okay to call [NSProcessInfo beginActivityWithOptions] and [NSProcessInfo endActivity] on a per-thread basis?

I've got a MacOS/X app which in general is not averse to app-napping, but sometimes it will spawn one or more child threads to do timing-sensitive networking tasks, which do need to avoid being app-napped.
The elegant thing to do would be to have each of these threads call [[NSProcessInfo processInfo] beginActivityWithOptions: [...]] when it starts, and also call [[NSProcessInfo processInfo] endActivity: [...]] just before it exits, which would (hopefully) have the effect of avoiding app-nap on my process (or at least on those particular threads) only when one or more of these network-threads is running.
My question is, is this a legal/acceptable calling pattern, or is NSProcessInfo more of a per-process-only kind of API that doesn't implement the thread-safe reference-counting logic that would be necessary to reliably yield the expected behavior if I call it from multiple threads? (if it's the latter, I can implement that logic myself, but I'd rather not reinvent the wheel here)
This API is considered process-wide: it reports that your entire application is doing a specific kind of job which should (or should not) be affected by power-saving heuristics (and those heuristics are, again, applied per-process, not per-thread).
The best way to use it would be to begin one activity before your background threads start and end it after all of your important background threads have finished.
You can do that with a dispatch group or any other instrument you wish.
That is not the only way though.
beginActivityWithOptions: returns an activity token (an _NSActivityAssertion under the hood) which is not itself aware of threads, so you can bring your own thread-synchronization mechanism to this party.
Calling this API several times would create several _NSActivityAssertion objects, which is definitely redundant but should work, provided you properly end each of them.
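For what it's worth, a minimal sketch of the one-activity-for-all-threads approach described above, using a dispatch group (the block contents, queue choice and NSActivityLatencyCritical option are illustrative assumptions):

// Inside whatever method kicks off the timing-sensitive work:
dispatch_group_t group = dispatch_group_create();
id<NSObject> activity =
    [[NSProcessInfo processInfo] beginActivityWithOptions:NSActivityLatencyCritical | NSActivityUserInitiated
                                                    reason:@"Timing-sensitive networking"];

dispatch_queue_t queue = dispatch_get_global_queue(DISPATCH_QUEUE_PRIORITY_DEFAULT, 0);
for (int i = 0; i < 3; i++) {
    dispatch_group_async(group, queue, ^{
        // ... one timing-sensitive networking task (placeholder) ...
    });
}

// End the single activity only once every task in the group has completed.
dispatch_group_notify(group, dispatch_get_main_queue(), ^{
    [[NSProcessInfo processInfo] endActivity:activity];
});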

@synchronized vs lock/unlock

I'm new to Objective-C. When should I use @synchronized, and when should I use lock/unlock? My background is mainly in Java. I know that in Java, obtaining explicit locks allows you to do more complex, extensive, and flexible operations (vis-à-vis release order, etc.) whereas the synchronized keyword forces locks to be used in a block-structured way and they also have to be released in the reverse order of how they were acquired. Does the same rationale hold in Objective-C?
Many would consider locking/unlocking in arbitrary order to be a bug, not a feature. Namely, it quite easily leads to deadlocks.
In any case, there is little difference between @synchronized(), -lock/-unlock, or any other mutex, save for the details of the scope. They are expensive, fragile, error-prone, and, often, getting them absolutely correct leads to performance akin to a single-threaded solution anyway (but with the complexity of threads).
The new hotness is queues. Queues tend to be a lot lighter weight in that they don't generally require system calls for the "fast path" operations that account for most calls. They also tend to be much more expressive of intent.
Grand Central Dispatch or NSOperationQueue specifically. The latter is built upon the former in current OS releases. GCD's APIs tend to be lower level, but they are very powerful and surprisingly simple. NSOperationQueue is higher level and allows for directly expressing dependencies and the like.
I would suggest starting with the Cocoa concurrency guide.
You are generally correct. @synchronized takes a lock that's "attached" to a given object (the object referred to in the @synchronized directive does not have to be a lock). As in Java, it defines a new block, and the lock is taken at the beginning, and released at exit, whether by leaving the block normally or through an exception. So, as you guessed, the locks are released in reverse order from acquisition.
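A quick side-by-side of the two, assuming a hypothetical object with a count property and an NSLock stored in a lock property:

// @synchronized: the lock is tied to the object in parentheses and is
// released automatically at the end of the block, even if an exception is thrown.
- (void)incrementSynchronized {
    @synchronized (self) {
        self.count += 1;
    }
}

// NSLock: acquisition and release are explicit, so every exit path
// has to remember to call -unlock.
- (void)incrementWithLock {
    [self.lock lock];   // self.lock is an NSLock created once, e.g. in -init
    self.count += 1;
    [self.lock unlock];
}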

iOS, Objective C - NSURLConnection and asynchronous examples, default behavior and best practices?

I have been working with a few applications that deal with NSURLConnections. While researching best practices I have noticed a lot of examples online showing how to use NSOperation and NSOperationQueue to deal with this.
I have also noticed on stackoverflow a few examples that show initializing the connection as synchronous and asynchronous using the class methods of NSURLConnection: sendAsynchronousRequest and sendSynchronousRequest.
Currently I am doing my initialization as follows:
[[NSURLConnection alloc] initWithRequest:request delegate:self];
While doing this I have monitored the main thread and the calls to the delegate methods:
connection:didReceiveResponse:, connection:didReceiveData:, connectionDidFinishLoading: and connection:didFailWithError:
Everything I have read in Apple's documentation and my own tests prove to me that this is asynchronous behavior by default.
I would like to know from more experienced Objective-C programmers when the other options would be used, either as a best practice or because they are more correct than what I see as the simplest way to get asynchronous behavior?
This is my first question I have posted on here, if more information is needed please ask.
Synchronous is bad bad bad. Try to avoid it. That will block up your main thread if the data transfer is large, thus resulting in an unresponsive UI.
Yes, it is possible to dispatch a synchronous call onto a different thread, but then you have to access any UI elements back on the main thread and it is a mess.
Normally I just use the delegate methods you have described - it is straightforward, and NSURLConnection already handles the asynchronous call for you away from the main thread. All you need to do is implement the simple delegate methods! It's a little more code, but you always want to go asynchronous. Always. And when it is finished loading, use the information you get to update the UI from the connectionDidFinishLoading: delegate method.
You also have the option of using blocks now, but I can't speak for how well those work or even how to use them well. I'm sure there's a tutorial somewhere - the delegate methods are just so easy to implement.
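To make the delegate route concrete, here is a sketch of the four methods mentioned in the question (the receivedData property and the updateUIWithData: method are hypothetical names):

// Accumulate data as it arrives and touch the UI only once the connection finishes.
- (void)connection:(NSURLConnection *)connection didReceiveResponse:(NSURLResponse *)response {
    self.receivedData = [NSMutableData data]; // reset for this response
}

- (void)connection:(NSURLConnection *)connection didReceiveData:(NSData *)data {
    [self.receivedData appendData:data];
}

- (void)connectionDidFinishLoading:(NSURLConnection *)connection {
    // Callbacks arrive on the thread that started the connection
    // (typically the main thread), so UI updates are safe here.
    [self updateUIWithData:self.receivedData];
}

- (void)connection:(NSURLConnection *)connection didFailWithError:(NSError *)error {
    NSLog(@"Connection failed: %@", error);
}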
The methods you list are the traditional means of asynchronous transfer, and an app that uses them will be efficient in processor (and hence power) use.
The sendAsynchronousRequest method is a relatively new addition, arriving in iOS 5. In terms of best practice there's little other than style to differentiate it from the data-delegate approach, apart from the fact that a delegate-based connection can be cancelled and a block-based one can't. However, the tidiness (and hence the readability and reduced likelihood of bugs) of the block-based sendAsynchronousRequest arguably gives it the edge if you know you're not going to want to cancel your connections.
As a matter of best practice, sendSynchronousRequest should always be avoided. If you use it on the main thread then you'll block the user interface. If you use it on any other thread or queue that you've created for a more general purpose then you'll block that. If you create a special queue or thread for it, or post it to an NSOperationQueue, then you'll get no real advantage over a normal asynchronous post and your app will be less power-efficient, per Apple's standard WWDC comments.
References to sendSynchronousRequest are probably remnants of pre-iOS 5 patterns. Anywhere you see a sendSynchronousRequest, a sendAsynchronousRequest could be implemented just as easily and would perform more efficiently. I'd guess it was included originally because sometimes you're adapting code that needs to flow in a straight line, and because there were no blocks there was no 'essentially a straight line' way to implement an asynchronous call. I really can't think of any good reason to use it now.
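A sketch of the block-based call for comparison (the URL and the updateUIWithData: method are placeholders):

NSURLRequest *request = [NSURLRequest requestWithURL:[NSURL URLWithString:@"https://example.com/data"]];

// iOS 5+: no delegate needed, but the request can't be cancelled once sent.
[NSURLConnection sendAsynchronousRequest:request
                                   queue:[NSOperationQueue mainQueue]
                       completionHandler:^(NSURLResponse *response, NSData *data, NSError *error) {
    if (error) {
        NSLog(@"Request failed: %@", error);
        return;
    }
    // The completion handler runs on the main queue as requested above,
    // so UI updates are safe here.
    [self updateUIWithData:data];
}];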

Why should I choose GCD over NSOperation and blocks for high-level applications?

Apple's Grand Central Dispatch reference says:
"...if your application needs to operate at the Unix level of the
system—for example, if it needs to manipulate file descriptors, Mach
ports, signals, or timers. GCD is not restricted to system-level
applications, but before you use it for higher-level applications, you
should consider whether similar functionality provided in Cocoa (via
NSOperation and block objects) would be easier to use or more
appropriate for your needs.".
http://developer.apple.com/library/ios/#documentation/Performance/Reference/GCD_libdispatch_Ref/Reference/reference.html
I can't actually think of situations, for high-level applications, in which the use of GCD is mandatory and NSOperation could/should not be used.
Any thoughts?
The point being made here is the same one that Chris Hanson states in his article "When to use NSOperation vs. GCD":
The straightforward answer is a general guideline for all application
development:
Always use the highest-level abstraction available to you, and drop
down to lower-level abstractions when measurement shows that they are
needed.
In this particular case, it means that when writing Cocoa
applications, you should generally be using NSOperation rather than
using GCD directly. Not because of a difference in efficiency, but
because NSOperation provides a higher-level abstraction atop the
mechanisms of GCD.
In general, I agree with this. NSOperation and NSOperationQueue provide support for dependencies and one or two other things that GCD blocks and queues don't have, and they abstract away the lower-level details of how the concurrent operations are implemented. If you need that functionality, NSOperation is a very good way to go.
However, after working with both, I've found myself replacing all of my NSOperation-based code with GCD blocks and queues. I've done this for two reasons: there is significant overhead when using NSOperation for frequent actions, and I believe my code is cleaner and more descriptive when using GCD blocks.
The first reason comes from profiling in my applications, where I found that the NSOperation object allocation and deallocation process took a significant amount of CPU resources when dealing with small and frequent actions, like rendering an OpenGL ES frame to the screen. GCD blocks completely eliminated that overhead, leading to significant performance improvements.
The second reason is more subjective, but I believe that my code is cleaner when using blocks than NSOperations. The quick capture of scope allowed by a block and the inline nature of them make for less code, because you don't need to create custom NSOperation subclasses or bundle up parameters to be passed into the operation, and more descriptive code in my opinion, because you can place the code to be run in a queue at the point where it is fired off.
Again, it's a matter of preference, but I've found myself using GCD more, even in otherwise more abstracted Cocoa applications.
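As a sketch of what that looks like in practice for a small, frequent action (the queue label and the renderFrame/frameDidRender methods are hypothetical):

// A private serial queue created once and reused for small, frequent
// units of work, with none of NSOperation's per-object overhead.
dispatch_queue_t renderQueue = dispatch_queue_create("com.example.render", DISPATCH_QUEUE_SERIAL);

dispatch_async(renderQueue, ^{
    // The block captures the local scope directly; no NSOperation subclass
    // or parameter bundling is required.
    [self renderFrame];

    dispatch_async(dispatch_get_main_queue(), ^{
        // Hop back to the main queue for anything that touches the UI.
        [self frameDidRender];
    });
});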
Prefer GCD where the task is not very complex and optimum CPU performance is required.
Prefer NSOperationQueue where the task is more complex and requires cancelling or suspending a block and dependency management.
GCD is a lightweight way to represent units of work that are going to be executed concurrently. You don't schedule these units of work; the system takes care of scheduling for you. Adding dependencies among blocks can be a headache. Cancelling or suspending a block creates extra work for you as a developer!
NSOperation and NSOperationQueue add a little extra overhead compared to GCD, but you can add dependencies among various operations. You can re-use, cancel or suspend operations. NSOperation is compatible with Key-Value Observing (KVO); for example, you can have an NSOperation start running by listening to NSNotificationCenter.
For a detailed explanation, refer to this question: NSOperation vs Grand Central Dispatch
Well, NSOperation has no equivalents to dispatch_source_t, dispatch_io, dispatch_data_t, dispatch_semaphore_t, etc... It also has somewhat higher overhead.
On the flip side, libdispatch has no equivalents to operation dependencies, operation priorities (queue priorities are somewhat different), or KVO on operations.
There are two things that NSOperationQueue can do that GCD doesn't do: The minor one is dependencies (add an operation to a queue but tell it to only execute when certain other operations are finished), and the big one is that NSOperation gives you an object which can receive messages while the task is executing, unlike GCD, whose blocks cannot receive messages except in a very limited way. You either need these two features, or you don't. If you don't, GCD is just an awful lot easier to use.
That's why useful examples of NSOperation are always quite complex. If they were easy, you would use GCD instead. You usually create a subclass of NSOperation, which will be some significant amount of work, or use one that someone else has created.
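For what it's worth, both the dependency and the cancellation points can be illustrated without a custom subclass by using NSBlockOperation (the operation contents here are placeholders):

NSOperationQueue *queue = [[NSOperationQueue alloc] init];

NSBlockOperation *download = [NSBlockOperation blockOperationWithBlock:^{
    // ... fetch data (placeholder) ...
}];
NSBlockOperation *parse = [NSBlockOperation blockOperationWithBlock:^{
    // ... parse whatever was downloaded (placeholder) ...
}];

// 'parse' will not start until 'download' has finished.
[parse addDependency:download];

[queue addOperations:@[download, parse] waitUntilFinished:NO];

// Unlike a plain GCD block, a queued operation can later be cancelled:
[parse cancel];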
I've actually just been reading 'round about this, and, I'm sure it will come as no surprise, opinions differ.
I can't think of a case where you'd have to use GCD over NSOperation, but that doesn't mean such a case doesn't exist. I, however, agree with a general sentiment in terms of best-practice coding:
If you have a couple of tools that suit the job (and in this case, you've got NSOperation vs. a GCD block), use the class with the highest level of abstraction (i.e., the highest-level API). Not only is it typically easier to use and less code, you'll also gain from potential future enhancements introduced to the higher-level APIs.

Grand Central Dispatch: Queue vs Semaphore for controlling access to a data structure?

I'm doing this with MacRuby, but I don't think that should matter much here.
I've got a model which stores its state in a dictionary data structure. I want concurrent operations to be updating this data structure sporadically. It seems to me like GCD offers a few possible solutions to this, including these two:
wrap any code that accesses the data structure in a block sent to some serial queue
use a GCD semaphore, with client code sending wait/signal calls as necessary when accessing the structure
When the queue in the first solution is called synchronously, it seems pretty much equivalent to the semaphore solution. Do either of these solutions have clear advantages that I'm missing? Is there a better alternative I'm missing?
Also: would it be straightforward to implement a read-write (shared-exclusive) lock with GCD?
Serial Queue
Pros
there are no locks
Cons
tasks can't run concurrently on the serial queue
GCD Semaphore
Pros
tasks can run concurrently
Cons
it uses a lock, even though it is a lightweight one
Also, we can use atomic operations instead of a GCD semaphore. They would be lighter than a GCD semaphore in some situations.
Synchronization Tools - Atomic Operations
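Note that atomic operations only help when the shared state is a single scalar value (a counter, a flag), not a compound structure like the dictionary in the question. A sketch with a hypothetical counter:

#import <libkern/OSAtomic.h>

// Safe to call from any thread without a queue or semaphore,
// but it only works for simple scalar updates like this one.
static volatile int32_t sharedCounter = 0;

void recordEvent(void) {
    OSAtomicIncrement32Barrier(&sharedCounter);
}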
Guarding access to the data structure with dispatch_sync on serial queue is semantically equivalent to using a dispatch semaphore, and in the uncontended case, they should both be very fast. If performance is important, benchmark and see if there's any significant difference.
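For concreteness, the two options described in the question look roughly like this (self.model and the queue label are placeholders):

// Option 1: funnel all access to the dictionary through a private serial queue.
dispatch_queue_t modelQueue = dispatch_queue_create("com.example.model", DISPATCH_QUEUE_SERIAL);

dispatch_sync(modelQueue, ^{
    [self.model setObject:@"updated" forKey:@"status"];
});

// Option 2: guard the same access with a binary semaphore.
dispatch_semaphore_t modelLock = dispatch_semaphore_create(1);

dispatch_semaphore_wait(modelLock, DISPATCH_TIME_FOREVER);
[self.model setObject:@"updated" forKey:@"status"];
dispatch_semaphore_signal(modelLock);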
As for the readers-writer lock, you can indeed construct one on top of GCD—at least, I cobbled something together the other day here that seems to work. (Warning: there be dragons/not-well-tested code.) My solution funnels the read/write requests through an intermediary serial queue before submitting to a global concurrent queue. The serial queue is suspended/resumed at the appropriate times to ensure that write requests execute serially.
I wanted something that would simulate a private concurrent dispatch queue that allowed for synchronisation points—something that's not exposed in the public GCD api, but is strongly hinted at for the future.
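For reference, a rough sketch of the private-concurrent-queue-with-barriers pattern this hints at (the _rwQueue ivar and self.model are assumptions; note that barriers only take effect on a private concurrent queue, not on one of the global queues):

// Created once, e.g. in -init:
// _rwQueue = dispatch_queue_create("com.example.model.rw", DISPATCH_QUEUE_CONCURRENT);

// Readers run concurrently with one another.
- (id)valueForModelKey:(NSString *)key {
    __block id value = nil;
    dispatch_sync(_rwQueue, ^{
        value = [self.model objectForKey:key];
    });
    return value;
}

// A barrier write waits for in-flight readers to drain, runs alone,
// then lets readers resume.
- (void)setModelValue:(id)value forKey:(NSString *)key {
    dispatch_barrier_async(_rwQueue, ^{
        [self.model setObject:value forKey:key];
    });
}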
Adding a warning (which ends up being a con for dispatch queues) to the previous answers.
You need to be careful of how the dispatch queues are called as there are some hidden scenarios that were not immediately obvious to me until I ran into them.
I replaced NSLock and @synchronized on a number of critical sections with dispatch queues, with the goal of having lightweight synchronization. Unfortunately, I ran into a situation that results in a deadlock, and I have traced it back to using the dispatch_barrier_async / dispatch_sync pattern. It would seem that dispatch_sync may opportunistically call its block on the main queue (if already executing there) even when you create a concurrent queue. This is a problem since dispatch_sync on the current dispatch queue causes a deadlock.
I guess I'll be moving backwards and using another locking technique in these areas.
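The simplest illustration of the reentrancy trap described above is dispatch_sync targeting the queue you are already running on; with a serial queue (used here to keep the example obvious) this never returns:

dispatch_queue_t queue = dispatch_queue_create("com.example.serial", DISPATCH_QUEUE_SERIAL);

dispatch_async(queue, ^{
    // We are now running on 'queue'. The nested dispatch_sync below waits
    // for 'queue' to become free, but it can't become free until this
    // outer block returns: a guaranteed deadlock.
    dispatch_sync(queue, ^{
        NSLog(@"never reached");
    });
});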