What is the key difference between threads and coroutines in Kotlin? - kotlin

As someone new to Kotlin, I am having trouble understanding the difference between a thread and a coroutine. Can someone explain the fundamental difference between them in simple terms?
I started learning the core concepts of the Kotlin programming language recently, but I got stuck on differentiating threads and coroutines.

Threads are native mechanisms, whereas coroutines are user-level abstractions, which use threads and actually make better use of them.
Key differences I can point out:
A thread is tied to a native OS thread. This is why creating hundreds or thousands of threads is impractical: each thread reserves a significant amount of OS memory for its stack. A coroutine is a user-level abstraction of a worker that does not use an excessive amount of memory, since it is not tied to native resources and its state lives on the JVM heap.
A thread gets blocked, rather than suspending and handing the job off to another thread.
A blocked thread cannot be used for anything else until its work completes.
A coroutine is a user-friendly abstraction that lets you reuse a thread's resources to execute suspending functions. The mechanism is as follows: a suspending function is called from a coroutine and dispatched on a specific thread. While it is suspended, the thread can be used to execute another suspending function.
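The thread-reuse idea above can be approximated on the JVM without coroutines. The following is only a sketch in plain Java (not Kotlin, and not how kotlinx.coroutines is implemented): a large number of delayed tasks wait in a queue on the heap and are executed by a handful of threads. The class name, the task count and the 50 ms delay are all made up for illustration.

```java
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;

// Hypothetical demo: many waiting tasks share a tiny pool of threads.
class ThreadReuseSketch {

    // Schedules `tasks` delayed tasks on `poolSize` threads and returns how
    // many distinct threads ended up executing them.
    static int distinctThreadsUsed(int tasks, int poolSize) {
        ScheduledExecutorService pool = Executors.newScheduledThreadPool(poolSize);
        Set<String> threadNames = ConcurrentHashMap.newKeySet();
        CountDownLatch done = new CountDownLatch(tasks);
        try {
            for (int i = 0; i < tasks; i++) {
                // While the delay runs, the task is just an object in a queue.
                // No thread is blocked on its behalf, which is loosely what a
                // suspended coroutine looks like.
                pool.schedule(() -> {
                    threadNames.add(Thread.currentThread().getName());
                    done.countDown();
                }, 50, TimeUnit.MILLISECONDS);
            }
            done.await();
            return threadNames.size();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while waiting", e);
        } finally {
            pool.shutdown();
        }
    }

    public static void main(String[] args) {
        System.out.println("threads used: " + distinctThreadsUsed(10_000, 4));
    }
}
```

Starting one `Thread` per task and calling `Thread.sleep` in each would give the same result, but every sleeping task would then hold a whole native thread for the duration of its wait.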
To get better understanding of how coroutines work in combination with threads, you can read my answer to my own question, which is also a source for this answer.

Related

How the coroutine knows that it is time to resume/suspend?

Let's say we have a job A and a job B (not kotlin's Job, just some kind of work).
I am told that coroutines can suspend and thus the underlying thread used by A will not be blocked and can be used for B, while A suspends.
Let's say that A downloads some data from a server. How does A perform such work while being suspended (if it gets suspended)? How does it know that it is time to resume and take hold of the thread again? How does the thread deal with the coroutines' states and decide which one to run?
I guess it uses the good old wait/notify mechanism under the hood; however, it is unclear to me how the example download can happen while the thread is already being used for other work.
How does the coroutine perform work, while being suspended (if it gets suspended)?
After some research I found out that when the coroutine suspends, it actually gets dispatched to another thread (as was mentioned by bylazy), on which it continues execution.
How does it know that it is time to resume and hold the thread again?
Taking the example from the question, the download will be dispatched to a separate thread of the implicit thread pool (as mentioned by Tenfour04) and will use a continuation object to resume on the former thread.
At the same time, the former thread remains available for other work. Java's Thread, by contrast, differs in ways that explain why coroutines perform better:
A thread is tied to a native OS thread. This is why creating hundreds or thousands of threads is impractical: each thread reserves a significant amount of OS memory for its stack. A coroutine is a user-level abstraction of a worker that does not use an excessive amount of memory, since it is not tied to native resources and its state lives on the JVM heap.
A thread gets blocked, rather than suspending and handing the job off to another thread.
A blocked thread cannot be used for anything else until its work completes.
A thread is asynchronous, whereas coroutines are sequential. Following from the previous point, a thread performs some kind of work asynchronously and cannot be used for anything else in the meantime. A coroutine, on the other hand, being a user-friendly abstraction, is executed on the thread, and after it gets suspended, the next one gets executed on the same thread. (This point addresses how the thread deals with coroutine states and decides which one to run.)
So coroutines make better and more efficient use of threads, taking care of dispatching, reusing resources, managing the thread pool, and so on.
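To make the job A / job B scenario concrete, here is a rough Java analogue built from callbacks. It is not what the Kotlin compiler generates (that is a state machine around a Continuation object); it only mimics the observable behavior: A starts on a worker thread, hands its download to an I/O thread, B runs on the freed worker, and A's continuation is posted back to the worker once the data arrives. The thread names, the latch that makes the download "slow", and the class itself are assumptions made for this sketch.

```java
import java.util.List;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.CopyOnWriteArrayList;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

// Hypothetical demo of "suspend, free the thread, resume via a continuation".
class ContinuationSketch {

    static List<String> run() {
        ExecutorService worker = Executors.newSingleThreadExecutor(r -> new Thread(r, "worker"));
        ExecutorService io = Executors.newSingleThreadExecutor(r -> new Thread(r, "io"));
        List<String> log = new CopyOnWriteArrayList<>();
        CountDownLatch bFinished = new CountDownLatch(1);
        CompletableFuture<Void> aDone = new CompletableFuture<>();
        try {
            // Job A: start the download elsewhere and register what to do next.
            worker.execute(() -> {
                log.add("A starts on " + Thread.currentThread().getName());
                CompletableFuture
                    .supplyAsync(() -> {
                        awaitQuietly(bFinished); // pretend the download is slow
                        log.add("download finishes on " + Thread.currentThread().getName());
                        return "payload";
                    }, io)
                    // The continuation: the rest of A, posted back to the worker.
                    .thenAcceptAsync(data -> {
                        log.add("A resumes with " + data + " on " + Thread.currentThread().getName());
                        aDone.complete(null);
                    }, worker);
                // Returning here is the "suspension": the worker is free again.
            });
            // Job B runs on the same worker while A is waiting for its data.
            worker.execute(() -> {
                log.add("B runs on " + Thread.currentThread().getName());
                bFinished.countDown();
            });
            aDone.join();
            return log;
        } finally {
            worker.shutdown();
            io.shutdown();
        }
    }

    private static void awaitQuietly(CountDownLatch latch) {
        try {
            latch.await();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        run().forEach(System.out::println);
    }
}
```

No wait/notify is involved on the worker side in this sketch: the worker never waits for the download, it simply picks up the next queued piece of work, and A's continuation is one of those pieces.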
The sources I used:
Coroutines vs Threads (Educba)
Difference between a thread and a coroutine in Kotlin

Detect when block is added to Grand Central Dispatch?

I have an iOS application that uses NSThreads for concurrent tasks. I am trying to migrate it to Grand Central Dispatch (GCD) for handling concurrency.
The problem is that the app needs information about how many threads have been created since a given time, and how many of the threads spawned since that time are currently running.
At the moment this is done by creating a category that swizzles the -main method in NSThread. The new swizzled method simply increments the total number of running threads and then decrements the same variable before it returns.
The problem is that when I use GCD's dispatch_async, it does not create an NSThread, hence my category approach does not work. How can I achieve the same while using GCD to handle concurrency?
What I would like to detect is when a new block is added to GCD, and when that block has been executed.
Any suggestions on how to achieve the same are very welcome.
EDIT
Many thanks to @ipmcc and @RyanR for helping me out on this. :) I believe I need to say some more about the background and what I am trying to accomplish.
What I am actually trying to do is extend the iOS testing framework Frank. Frank embeds a small web server within a given app, which enables sending HTTP requests to the iOS application and thereby simulating events, such as a swipe or a tap gesture.
I would like to extend it in a way that enables it to wait until all work triggered by a specific simulated event has ended before responding to a request.
However, I found it hard to detect exactly what work was triggered by the received event. That's how I came to the solution of resetting a thread counter, incrementing this counter for every thread created after the event was simulated, and decrementing it when the threads finish, then blocking until the thread count becomes zero again. I know this approach is not perfect either, and it won't work with GCD.
Is there any other way to achieve this? Another possible solution I have thought of is to specify that everything must run synchronously except the thread handling the HTTP request. However, I don't know if this is possible.
Any suggestions on how to achieve blocking after each simulated event until work triggered by that event has completed?
The problem is that the app needs information about how many
threads have been created since a given time, and how many of the
threads spawned since that time are currently running.
You will not be able to get this information from GCD. One of the points of GCD is that you do not manage the thread pool. It is opaque. You'll note that even pthreads, the underlying threading library on which NSThread and GCD are built, does not have a (public) means to enumerate all existing threads or get the number of running threads. This is not going to be doable without hard core low level hackery. If you need to control or know the number of threads, then you need to be the one to spawn and manage them, and GCD is the wrong abstraction for you.
At the moment this is done by creating a category that swizzles
the -main method in NSThread. The new swizzled method simply
increments the total number of running threads and then
decrements the same variable before it returns.
Note that this only tells you the number of threads started using NSThread. As mentioned, NSThread is a fairly high level abstraction on top of pthreads. There is nothing to prevent library code from spawning its own threads using the pthreads API that will be invisible to your count.
The problem is that when I use GCD's dispatch_async, it does not create
an NSThread, hence my category approach does not work. How can I achieve
the same while using GCD to handle concurrency?
In short, you can't. If you want to go forth and patch functions all over the various frameworks, then you should look up a library called mach_override. (But please don't.)
What I would like to detect is when a new block is added to GCD, and
when that block has been executed.
Since GCD uses thread pools, the act of adding a block does not imply a new thread. (And that's sorta the whole point.)
If you have some limited resource whose consumption you need to manage, the traditional way to do that would be with a limiting semaphore, but that is just one option.
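As an illustration of the limiting-semaphore idea, here is a small sketch in Java rather than GCD (in GCD the corresponding calls would be dispatch_semaphore_wait and dispatch_semaphore_signal). The class name, the number of tasks and the 5 ms pause are assumptions for the example; the method reports the highest number of tasks that were ever inside the guarded section at once.

```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Semaphore;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.concurrent.locks.LockSupport;

// Hypothetical demo: a counting semaphore caps how many tasks use a resource.
class LimitingSemaphoreSketch {

    static int peakConcurrency(int tasks, int permits) {
        ExecutorService pool = Executors.newFixedThreadPool(tasks);
        Semaphore limit = new Semaphore(permits);
        AtomicInteger inside = new AtomicInteger();
        AtomicInteger peak = new AtomicInteger();
        try {
            CompletableFuture<?>[] jobs = new CompletableFuture<?>[tasks];
            for (int i = 0; i < tasks; i++) {
                jobs[i] = CompletableFuture.runAsync(() -> {
                    limit.acquireUninterruptibly(); // wait for a free permit
                    try {
                        peak.accumulateAndGet(inside.incrementAndGet(), Math::max);
                        LockSupport.parkNanos(5_000_000L); // pretend to use the resource
                    } finally {
                        inside.decrementAndGet();
                        limit.release(); // hand the permit to the next task
                    }
                }, pool);
            }
            CompletableFuture.allOf(jobs).join();
            return peak.get();
        } finally {
            pool.shutdown();
        }
    }

    public static void main(String[] args) {
        System.out.println("peak concurrency: " + peakConcurrency(8, 2));
    }
}
```

The pool deliberately has as many threads as tasks, so it is the semaphore, not the pool size, that keeps the peak at or below the number of permits.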
This whole question just reeks of a poor design. Like the number of pthreads, GCD's queue widths are opaque/non-public. Your previous solution was not particularly viable (as discussed), and further efforts are likely to yield similarly poor solutions. You should really rethink your architecture such that knowing how many threads are running isn't important.
EDIT: Thanks for the clarification. There's not really a generic way, from the outside, to tell when all the "work" is done. What if an action sets up a timer that won't call back for ten minutes? At the extreme, consider this: the main runloop continues to spin for the entire life of the app, and as long as the main runloop is spinning, "work" could be being done on it.
In order to detect "doneness" your app has to signal doneness. In order to signal doneness, the app has to have some way (internal to itself) to know it's done. Put differently, the app can't tell something else (i.e. Frank) something it doesn't know. One way to go about this would be to encapsulate all the work you do in your app in NSOperations. NSOperation/NSOperationQueue provide good ways of reporting "doneness." At the simplest level, you could wrap the code where you kick off work in an NSBlockOperation, then add a completion block to that operation that signals something else when it's done, and enqueue it to an NSOperationQueue for execution. (You could also do this with dispatch_group and dispatch_group_notify if you prefer working in the GCD style.)
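The "group plus notify" shape can be sketched outside of Cocoa as well. Below is a Java analogue, offered only as an illustration of the pattern and not as Frank or GCD code: each piece of work triggered by an event joins a group, and a single completion step runs after all of them have finished. The class name, the fixed count of three tasks and the response string are invented for the example.

```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical demo: work signals its own completion through a group.
class DonenessSketch {

    static String simulateEvent() {
        ExecutorService pool = Executors.newFixedThreadPool(3);
        AtomicInteger finished = new AtomicInteger();
        try {
            // Every piece of work started by the event becomes a group member.
            CompletableFuture<?>[] group = new CompletableFuture<?>[3];
            for (int i = 0; i < group.length; i++) {
                group[i] = CompletableFuture.runAsync(finished::incrementAndGet, pool);
            }
            // Roughly what dispatch_group_notify does: run once, after all members.
            CompletableFuture<String> response = CompletableFuture.allOf(group)
                .thenApply(ignored -> "done after " + finished.get() + " tasks");
            // A request handler would wait here before answering the caller.
            return response.join();
        } finally {
            pool.shutdown();
        }
    }

    public static void main(String[] args) {
        System.out.println(simulateEvent());
    }
}
```

This only works because the code that starts the work also registers it with the group, which is the point made above: the app itself has to know what counts as done.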
If you have specific questions about how to package up your app's work into NSOperations, I would suggest starting a new question.
You can hook into the dispatch introspection functions (introspection.h, methods all start with dispatch_introspection), but you have to link with that library which is supposed to be only for debugging. I don't think you can include that in a release build. Your best bet would be to encapsulate GCD into your own object, so all your code submits blocks to execute through that object and it submits them to GCD after tracking whatever you're interested in. You won't be able to track thread consumption though, because GCD intentionally abstracts that and reuses threads.
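The wrapper-object suggestion might look roughly like the following. This is a Java sketch of the idea, with a java.util.concurrent Executor standing in for a GCD queue; TrackingExecutor and its method names are hypothetical. It counts submitted and still-pending blocks, not threads, since the underlying pool stays opaque.

```java
import java.util.concurrent.Executor;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical wrapper: all work is submitted through it so it can be counted.
class TrackingExecutor implements Executor {
    private final Executor delegate;
    private int submitted; // guarded by this
    private int pending;   // guarded by this

    TrackingExecutor(Executor delegate) {
        this.delegate = delegate;
    }

    @Override
    public void execute(Runnable block) {
        synchronized (this) {
            submitted++;
            pending++;
        }
        delegate.execute(() -> {
            try {
                block.run();
            } finally {
                synchronized (this) {
                    pending--;
                    notifyAll(); // wake anyone blocked in awaitIdle()
                }
            }
        });
    }

    synchronized int submittedCount() { return submitted; }

    synchronized int pendingCount() { return pending; }

    // Blocks until every block submitted so far has finished running.
    synchronized void awaitIdle() {
        boolean interrupted = false;
        while (pending > 0) {
            try {
                wait();
            } catch (InterruptedException e) {
                interrupted = true;
            }
        }
        if (interrupted) {
            Thread.currentThread().interrupt();
        }
    }

    // Small usage example: returns {submitted, pending after idle, blocks run}.
    static int[] demo() {
        ExecutorService pool = Executors.newFixedThreadPool(2);
        try {
            TrackingExecutor tracker = new TrackingExecutor(pool);
            AtomicInteger ran = new AtomicInteger();
            for (int i = 0; i < 5; i++) {
                tracker.execute(ran::incrementAndGet);
            }
            tracker.awaitIdle();
            return new int[] { tracker.submittedCount(), tracker.pendingCount(), ran.get() };
        } finally {
            pool.shutdown();
        }
    }
}
```

Work that bypasses the wrapper (library code, timers, blocks that enqueue further blocks directly on the underlying queue) is invisible to it, so the counts are only as complete as the discipline of routing everything through the wrapper.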

Should my block based methods return on the main thread or not when creating an iOS cloud integration framework?

I am in the middle of creating a cloud integration framework for iOS. We allow you to save, query, count and remove, both synchronously and asynchronously, with selector/callback and block-based implementations. What is the correct practice: running the completion blocks on the main thread or on a background thread?
For simple cases, I just parameterize it and do all the work I can on secondary threads:
By default, callbacks will be made on any thread (where it is most efficient and direct - typically once the operation has completed). This is the default because messaging via main can be quite costly.
The client may optionally specify that the message must be made on the main thread. This way, it requires one line or argument. If safety is more important than efficiency, then you may want to invert the default value.
You could also attempt to batch and coalesce some messages, or simply use a timer on the main run loop to vend.
Consider both joined and detached models for some of your work.
If you can reduce the task to a result (remove the capability for incremental updates, if not needed), then you can simply run the task, do the work, and provide the result (or error) when complete.
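One possible shape for the parameterized approach described above, sketched in Java rather than Objective-C: the callback runs on whatever thread finished the work unless the caller passes an executor. CloudClientSketch, save and the "fake-main" thread are invented names; on iOS the optional argument would be a dispatch queue or a flag for the main thread instead of an Executor.

```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.Executor;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.function.Consumer;

// Hypothetical client: the caller decides where the completion callback runs.
class CloudClientSketch {
    private final ExecutorService background =
        Executors.newSingleThreadExecutor(r -> new Thread(r, "cloud-worker"));

    // Default: call back directly on the thread that completed the work.
    CompletableFuture<Void> save(String record, Consumer<String> callback) {
        return save(record, callback, Runnable::run);
    }

    // Opt-in: deliver the callback through the executor the caller names.
    CompletableFuture<Void> save(String record, Consumer<String> callback,
                                 Executor callbackExecutor) {
        return CompletableFuture.runAsync(() -> {
            String result = "saved:" + record; // stand-in for the real network call
            callbackExecutor.execute(() -> callback.accept(result));
        }, background);
    }

    void close() {
        background.shutdown();
    }

    // Returns the names of the threads the two callbacks ran on.
    static String[] demo() {
        CloudClientSketch client = new CloudClientSketch();
        ExecutorService fakeMain =
            Executors.newSingleThreadExecutor(r -> new Thread(r, "fake-main"));
        CompletableFuture<String> defaultThread = new CompletableFuture<>();
        CompletableFuture<String> chosenThread = new CompletableFuture<>();
        try {
            client.save("a", r -> defaultThread.complete(Thread.currentThread().getName()));
            client.save("b", r -> chosenThread.complete(Thread.currentThread().getName()),
                        fakeMain);
            return new String[] { defaultThread.join(), chosenThread.join() };
        } finally {
            client.close();
            fakeMain.shutdown();
        }
    }
}
```

Swapping the default in the two-argument overload is all it takes to invert the trade-off between efficiency and safety.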
Apple's NSURLConnection class calls back to its delegate methods on the thread from which it was initiated, while doing its work on a background thread. That seems like a sensible procedure. It's likely that a user of your framework will not enjoy having to worry about thread safety when writing a simple callback block, as they would if you created a new thread to run it on.
The two sides of the coin: If the callback touches the GUI, it has to be run on the main thread. On the other hand, if it doesn't, and is going to do a lot of work, running it on the main thread will block the GUI, causing frustration for the end user.
It's probably best to put the callback on a known, documented thread, and let the app programmer make the determination of the effect on the GUI.

Grand Central Dispatch: Queue vs Semaphore for controlling access to a data structure?

I'm doing this with MacRuby, but I don't think that should matter much here.
I've got a model which stores its state in a dictionary data structure. I want concurrent operations to be updating this data structure sporadically. It seems to me like GCD offers a few possible solutions to this, including these two:
wrap any code that accesses the data structure in a block sent to some serial queue
use a GCD semaphore, with client code sending wait/signal calls as necessary when accessing the structure
When the queues in the first solution are called synchronously, it seems pretty much equivalent to the semaphore solution. Does either of these solutions have clear advantages that I'm missing? Is there a better alternative I'm missing?
Also: would it be straightforward to implement a read-write (shared-exclusive) lock with GCD?
Serial Queue
Pros
no explicit locks are needed
Cons
tasks can't work concurrently in the Serial Queue
GCD Semaphore
Pros
tasks can work concurrently
Cons
it uses a lock, even though it is lightweight
We can also use atomic operations instead of a GCD semaphore. They would be lighter than a GCD semaphore in some situations.
Synchronization Tools - Atomic Operations
Guarding access to the data structure with dispatch_sync on a serial queue is semantically equivalent to using a dispatch semaphore, and in the uncontended case, they should both be very fast. If performance is important, benchmark and see if there's any significant difference.
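The equivalence can be seen side by side in a small sketch. It is written in Java, where a single-thread executor plays the role of the serial queue (submitting and then joining approximates dispatch_sync) and a one-permit Semaphore plays the role of the dispatch semaphore. The class name, the "hits" key and the four client threads are assumptions for the example.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Semaphore;

// Hypothetical demo: two ways to serialize updates to a plain HashMap.
class GuardedMapSketch {

    // Runs `update` `times` times from four client threads and waits for all.
    private static void runFromClients(int times, Runnable update) {
        ExecutorService clients = Executors.newFixedThreadPool(4);
        try {
            CompletableFuture<?>[] calls = new CompletableFuture<?>[times];
            for (int i = 0; i < times; i++) {
                calls[i] = CompletableFuture.runAsync(update, clients);
            }
            CompletableFuture.allOf(calls).join();
        } finally {
            clients.shutdown();
        }
    }

    // Option 1: only the serial executor's thread ever touches the map.
    static int viaSerialExecutor(int updates) {
        ExecutorService serial = Executors.newSingleThreadExecutor();
        Map<String, Integer> state = new HashMap<>();
        try {
            runFromClients(updates, () ->
                CompletableFuture.runAsync(() -> state.merge("hits", 1, Integer::sum), serial)
                    .join()); // wait for the update, like dispatch_sync
            return CompletableFuture.supplyAsync(() -> state.get("hits"), serial).join();
        } finally {
            serial.shutdown();
        }
    }

    // Option 2: any thread may touch the map, but only while holding the permit.
    static int viaSemaphore(int updates) {
        Semaphore guard = new Semaphore(1);
        Map<String, Integer> state = new HashMap<>();
        runFromClients(updates, () -> {
            guard.acquireUninterruptibly(); // "wait"
            try {
                state.merge("hits", 1, Integer::sum);
            } finally {
                guard.release(); // "signal"
            }
        });
        guard.acquireUninterruptibly();
        try {
            return state.get("hits");
        } finally {
            guard.release();
        }
    }
}
```

Both variants end with the same count; the practical difference is that the queue version can also take updates asynchronously (drop the join) without any change to the guarded code.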
As for the readers-writer lock, you can indeed construct one on top of GCD—at least, I cobbled something together the other day here that seems to work. (Warning: there be dragons/not-well-tested code.) My solution funnels the read/write requests through an intermediary serial queue before submitting to a global concurrent queue. The serial queue is suspended/resumed at the appropriate times to ensure that write requests execute serially.
I wanted something that would simulate a private concurrent dispatch queue that allowed for synchronisation points—something that's not exposed in the public GCD api, but is strongly hinted at for the future.
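For comparison only, this is what shared-exclusive access looks like when the platform already provides the lock. The sketch uses Java's ReentrantReadWriteLock, which is a different technique from the suspend/resume queue construction described above: readers may overlap each other, while a writer excludes everyone. The class and method names are hypothetical.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.locks.ReentrantReadWriteLock;

// Hypothetical demo: a map guarded by a readers-writer lock.
class ReadWriteGuardedMap {
    private final Map<String, Integer> state = new HashMap<>();
    private final ReentrantReadWriteLock lock = new ReentrantReadWriteLock();

    // Shared access: several readers may hold the read lock at once.
    int read(String key) {
        lock.readLock().lock();
        try {
            return state.getOrDefault(key, 0);
        } finally {
            lock.readLock().unlock();
        }
    }

    // Exclusive access: the write lock waits for readers and other writers.
    void increment(String key) {
        lock.writeLock().lock();
        try {
            state.merge(key, 1, Integer::sum);
        } finally {
            lock.writeLock().unlock();
        }
    }

    // Runs mixed reads and writes from several threads; returns the final count.
    static int demo(int writers, int incrementsPerWriter) {
        ReadWriteGuardedMap map = new ReadWriteGuardedMap();
        ExecutorService pool = Executors.newFixedThreadPool(writers);
        try {
            CompletableFuture<?>[] jobs = new CompletableFuture<?>[writers];
            for (int w = 0; w < writers; w++) {
                jobs[w] = CompletableFuture.runAsync(() -> {
                    for (int i = 0; i < incrementsPerWriter; i++) {
                        map.increment("hits");
                        map.read("hits");
                    }
                }, pool);
            }
            CompletableFuture.allOf(jobs).join();
            return map.read("hits");
        } finally {
            pool.shutdown();
        }
    }
}
```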
Adding a warning (which ends up being a con for dispatch queues) to the previous answers.
You need to be careful of how the dispatch queues are called as there are some hidden scenarios that were not immediately obvious to me until I ran into them.
I replaced NSLock and @synchronized on a number of critical sections with dispatch queues, with the goal of having lightweight synchronization. Unfortunately, I ran into a situation that results in a deadlock, and I have traced it back to using the dispatch_barrier_async / dispatch_sync pattern. It would seem that dispatch_sync may opportunistically run its block on the calling thread (the main thread, if already executing there) even when you create a concurrent queue. This is a problem since dispatch_sync on the current dispatch queue causes a deadlock.
I guess I'll be moving backwards and using another locking technique in these areas.
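The hazard in the last sentence of the warning, a synchronous dispatch onto the queue you are already running on, has a direct analogue on the JVM. The sketch below uses a single-thread executor as the serial queue: a task running on it submits a second task to the same executor and waits for the result. The class name and the 200 ms timeout are assumptions; the timeout exists only so the demonstration terminates instead of hanging.

```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;

// Hypothetical demo: waiting on your own serial queue can never succeed.
class SyncOnOwnQueueSketch {

    // Returns true if the inner task managed to run while the outer one waited.
    static boolean innerTaskRanInTime() {
        ExecutorService serial = Executors.newSingleThreadExecutor();
        try {
            return CompletableFuture.supplyAsync(() -> {
                // We are on the serial executor's only thread. The inner task
                // is queued behind us and cannot start until we return.
                Future<?> inner = serial.submit(() -> { });
                try {
                    inner.get(200, TimeUnit.MILLISECONDS);
                    return true;
                } catch (TimeoutException e) {
                    inner.cancel(false);
                    return false; // with an untimed get() this would wait forever
                } catch (InterruptedException | ExecutionException e) {
                    throw new IllegalStateException(e);
                }
            }, serial).join();
        } finally {
            serial.shutdownNow();
        }
    }

    public static void main(String[] args) {
        System.out.println("inner task ran: " + innerTaskRanInTime());
    }
}
```

The usual ways out are the same in both worlds: submit asynchronously instead of waiting, or make sure code that is already on the queue calls the guarded logic directly.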

blocks and threads

I want to know if blocks in C / Cocoa run on a separate thread from the main thread. Would they be useful for executing computationally expensive code while leaving the UI responsive?
Blocks are just snippets of code bundled up into a callable object. How they run is entirely up to the code that calls it.
Running blocks on a separate thread is not only possible, but is precisely the reason the blocks concept was introduced. It exists to support Grand Central Dispatch, which hides a lot of the complexity of concurrent programming behind a task-oriented model.
They don't have to run on another thread, but they can. You can schedule them on NSOperationQueues or GCD queues, and those queues can be drained by background threads.
And yes, this can be a useful construct to help you get time consuming work off the main thread. But that's not all that blocks are useful for, and conversely you can do background processing with or without blocks.
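The same distinction exists for lambdas in other languages, which may make it easier to see. In this Java sketch (an analogue, not Objective-C blocks; the class and thread names are made up), one and the same callable object reports the thread it runs on: called directly it runs on the caller's thread, handed to a background executor it runs there.

```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.function.Supplier;

// Hypothetical demo: the callable is the same, the caller picks the thread.
class BlockThreadSketch {

    // Returns {thread name when called directly, thread name when queued}.
    static String[] demo() {
        // The "block": a snippet of code as an object. It knows nothing about threads.
        Supplier<String> block = () -> Thread.currentThread().getName();
        ExecutorService background =
            Executors.newSingleThreadExecutor(r -> new Thread(r, "background"));
        try {
            String direct = block.get(); // runs right here, on the calling thread
            String queued = CompletableFuture.supplyAsync(block, background).join();
            return new String[] { direct, queued };
        } finally {
            background.shutdown();
        }
    }

    public static void main(String[] args) {
        String[] names = demo();
        System.out.println("direct: " + names[0] + ", queued: " + names[1]);
    }
}
```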
You can use GCD to schedule blocks for execution on other threads. The two were introduced together, so any discussion of the one usually mentions the other. However, blocks are not in themselves inherently a multithreading mechanism.