Coroutines, understanding suspend - kotlin

I'm trying to understand a passage in Hands-On Design Patterns with Kotlin, Chapter 8, Threads and Coroutines.
Why is it that when we rewrite the function as suspend, "we can serve 20 times more users, all thanks to the smart way Kotlin has rewritten our code"?
fun profile(id: String): Profile {
    val bio = fetchBioOverHttp(id) // takes 1s
    val picture = fetchPictureFromDb(id) // takes 100ms
    val friends = fetchFriendsFromDb(id) // takes 500ms
    return Profile(bio, picture, friends)
}
Basically, the book says: "if we have a thread pool of 10 threads, the first 10 requests will get into the pool and the 11th will get stuck until the first one finishes. This means we can serve three users simultaneously, and the fourth one will wait until the first one gets his/her results."
I think I understand this point. 3 threads execute the three methods in parallel, then another 3, then another 3, which gives us 9 threads actively executing code. The 10th thread executes the first fetchBioOverHttp method, and we're out of threads until thread #1 finishes its fetchBioOverHttp call.
However, how does rewriting these methods as suspend methods result in serving 20 times more users? I guess I'm not understanding the path of execution here.

To be honest, I don't like this example.
The author meant that after the rewrite, httpCall() doesn't wait for the result: it schedules the processing in the background, registers a callback, and returns immediately. The caller thread is freed and can start handling another request while the first one is being processed. Using this technique, we can process multiple requests even on a single thread.
I don't like this explanation because it ignores how coroutines really work internally. Instead, it compares them to something the reader may already know: asynchronous callback-based APIs. Normally that helps understanding. The problem here is that in most cases coroutines internally... create a thread pool and use it to schedule blocking IO operations. Therefore, both solutions are pretty much the same; the main difference is that we created a pool of 10 threads, whereas Dispatchers.IO by default allows up to 64 threads (or the number of CPU cores, if that is larger).
The Kotlin compiler does not cut the function into two. There is still a single function with a lot of additional code inside. I agree it can be interpreted as two functions calling each other, but this is not what the compiler does. If that wasn't explained in the book, I think it is misleading.
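To make the "even a single thread" point concrete, here is a minimal sketch. The fetch functions are hypothetical stand-ins for the book's, and delay() simulates truly non-blocking I/O (an assumption; a blocking call such as Thread.sleep() or plain JDBC would not release the thread this way). All 100 requests run on the one thread that runBlocking occupies:

```kotlin
import kotlinx.coroutines.*
import kotlin.system.measureTimeMillis

// Hypothetical stand-ins for the book's fetch functions.
// delay() suspends without blocking the thread, unlike Thread.sleep().
suspend fun fetchBio(id: String): String { delay(100); return "bio-$id" }
suspend fun fetchPicture(id: String): String { delay(10); return "pic-$id" }

// Sequential inside one request, exactly like the original profile().
suspend fun profile(id: String): String {
    val bio = fetchBio(id)
    val picture = fetchPicture(id)
    return "$bio/$picture"
}

// Serves all requests concurrently on the single thread used by runBlocking
// and returns the elapsed wall-clock time in milliseconds.
fun serveAll(requests: Int): Long = measureTimeMillis {
    runBlocking {
        (1..requests).map { id -> async { profile(id.toString()) } }.awaitAll()
    }
}

fun main() {
    println("Served 100 requests in ${serveAll(100)} ms on one thread")
}
```

Handled one after another, 100 such requests would need about 11 seconds; here the waits overlap, so the total should stay close to the duration of a single request.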

Related

How do you know when you need to yield()?

Take Kotlin channels for example
for (msg in channel) {
    // do stuff
    yield() // maybe?
}
How do you know if yield is required? I assume that Channels are built in a way that yielding happens automatically behind the scenes in the iterator but I'm not sure. In general, how do you know you need manual yields when using the rest of Kotlin's coroutine library that might do it for you automatically?
In most cases you should not need to use yield() at all or be concerned with it. Coroutines can switch automatically whenever they reach a suspension point, which usually happens pretty often.
yield() is needed only if our code does not suspend for a prolonged time. That usually means we are performing intensive CPU calculations. In your example, receiving from the channel is a suspending operation, so you don't need yield() here.
You only need to call yield if you want to artificially add a suspension point when you have none in a piece of code. Suspension points are calls to suspend functions.
If you don't know off the top of your head which functions are suspend, you can quickly identify them in IntelliJ IDEA, for instance, because every suspend function call is marked with an icon in the editor gutter.
In your case you would see that icon on the line that iterates over the channel.
You only really need to manually add a yield if you have loops or extended pieces of code that exclusively use regular functions, or more generally if you want to ensure other coroutines have a chance to run at a particular point in time (for instance in tests). This shouldn't happen often.
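As a small illustration of that last case, the sketch below runs two coroutines on the single thread used by runBlocking. The function name and the log entries are made up for the example, and each log line stands in for a chunk of CPU-bound work that contains no suspension point of its own:

```kotlin
import kotlinx.coroutines.*

// Runs two workers on the single-threaded runBlocking event loop and
// records the order in which their steps execute.
fun runTwoWorkers(useYield: Boolean): List<String> = runBlocking {
    val log = mutableListOf<String>()
    val jobs = listOf("A", "B").map { name ->
        launch {
            repeat(3) { step ->
                log += "$name$step"        // stands in for non-suspending work
                if (useYield) yield()      // artificial suspension point
            }
        }
    }
    jobs.joinAll()
    log
}

fun main() {
    println(runTwoWorkers(useYield = false))
    println(runTwoWorkers(useYield = true))
}
```

Without yield(), worker A finishes all its steps before B gets to run; with yield(), the two take turns.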

How to use blocking (I/O bound) APIs within Kotlin coroutines?

I'm writing a Kotlin server using Ktor - where my request handlers are written using Kotlin coroutines.
My understanding is each request handler is run on Ktor's thread pool, which contains far fewer threads than the traditional pool size of 1-thread-per-request server frameworks due to the lightweight/suspendable nature of coroutines. Great!
The issue I have is that my application still needs to interact with some blocking resources (JDBC database connection pool), but my understanding is that if I merely call these blocking APIs directly from the request coroutine I will end up with liveness issues - as I can end up blocking all the threads used to handle my requests! Not great.
Since I'm still relatively new to the world of Kotlin and coroutines, I'm wondering if anyone here can give me some tips on the best way to handle this situation.
I've seen Dispatchers.IO referenced a few times elsewhere. Is that considered the best way to manage these blocking calls? Are there any good examples of this?
The API I'm trying to use does allow for some asynchronicity by passing an Executor. Ideally, I could also wrap these calls in a convenient, idiomatic Kotlin API for suspending transactions.
You understand it all correctly. In most cases you should never block the thread inside a coroutine. The one exception is Dispatchers.IO, which you mentioned. It is the standard way of handling blocking code and it is very easy to use:
withContext(Dispatchers.IO) {
    // blocking code
}
withContext() is a suspend function, so you can think of the above as a way to convert blocking code into suspending code. However, Dispatchers.IO doesn't perform any magic: it just uses a bigger pool of threads designated for blocking work. By default the pool is limited to 64 threads, or to the number of CPU cores if that is larger.
If you need to perform several parallel blocking operations, it is usually better to create your own thread pool to not block other components of the application.
If the IO library provides an asynchronous API, it is generally better to use it instead of the blocking API. However, in many cases libraries provide their asynchronous API by managing their own internal thread pool for blocking calls. In that case, using the asynchronous API and using the blocking API with Dispatchers.IO are very similar. Dispatchers.IO could be even better, because it reuses the same IO threads across all IO operations and it can partially share threads with the thread pool designated for CPU computations (Dispatchers.Default).
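A minimal sketch of both options follows. blockingQuery() is a hypothetical stand-in for a JDBC call, and the pool size of 4 and the name dbDispatcher are arbitrary assumptions for illustration:

```kotlin
import kotlinx.coroutines.*
import java.util.concurrent.Executors

// Hypothetical blocking call standing in for a JDBC query.
fun blockingQuery(id: Int): String {
    Thread.sleep(50)
    return "row-$id"
}

// Option 1: a suspending wrapper that moves the blocking call to Dispatchers.IO.
suspend fun query(id: Int): String =
    withContext(Dispatchers.IO) { blockingQuery(id) }

// Option 2: a dedicated pool, so heavy database traffic cannot starve other IO work.
// Daemon threads are used here so the pool never keeps the JVM alive.
val dbDispatcher = Executors.newFixedThreadPool(4) { r ->
    Thread(r, "db-worker").apply { isDaemon = true }
}.asCoroutineDispatcher()

suspend fun queryOnOwnPool(id: Int): String =
    withContext(dbDispatcher) { blockingQuery(id) }

fun main(): Unit = runBlocking {
    println(query(1))
    println(queryOnOwnPool(2))
}
```

In a real server the dedicated dispatcher would normally be closed on shutdown; that is left out here to keep the sketch short.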
Yes, Dispatchers.IO would be the answer. I tested this with Quarkus: Vert.x no longer raised its 2-second blocked-thread warning after I moved the JDBC calls to Dispatchers.IO.
https://github.com/hmchangm/quarkus-reactive-kotlin/blob/mariadb/src/main/kotlin/tw/idv/brandy/arrow/repo/FruitRepo.kt

Do we need to lock the immutable list in kotlin?

var list = listOf("one", "two", "three")

fun One() {
    list.forEach { result ->
        // Does something here
    }
}

fun Two() {
    list = listOf("four", "five", "six")
}
Can function One() and Two() run simultaneously? Do they need to be protected by locks?
No, you don't need to lock the variable. Even if function One() is still running when you change the variable, forEach keeps iterating over the first list. What could happen is that the assignment in Two() happens before forEach is called, but forEach would loop over either one list or the other; it would not switch between them because of the assignment.
If you had a println(result) in your forEach, your program would output either
one
two
three
or
four
five
six
depending on whether the assignment happens first or the forEach call starts first.
What will NOT happen is something like
one
two
five
six
Can function One() and Two() run simultaneously?
There are two ways that that could happen:
One of those functions could call the other.  This could happen directly (where the code represented by // Does something here in One()⁽¹⁾ explicitly calls Two()), or indirectly (it could call something else which ends up calling Two() — or maybe the list property has a custom setter which does something that calls One()).
One thread could be running One() while a different thread is running Two().  This could happen if your program launches a new thread directly, or a library or framework could do so.  For example, GUI frameworks tend to have one thread for dispatching events, and others for doing work that could take time; and web server frameworks tend to use different threads for servicing different requests.
If neither of those could apply, then there would be no opportunity for the functions to run simultaneously.
Do they need to be protected by locks?
If there's any possibility of them being run on multiple threads, then yes, they need to be protected somehow.
99.999% of the time, the code would do exactly what you'd expect; you'd either see the old list or the new one.  However, there's a tiny but non-zero chance that it would behave strangely — anything from giving slightly wrong results to crashing.  (The risk depends on things like the OS, CPU/cache topology, and how heavily loaded the system is.)
Explaining exactly why is hard, though, because at a low level the Java Virtual Machine⁽²⁾ does an awful lot of stuff that you don't see.  In particular, to improve performance it can re-order operations within certain limits, as long as the end result is the same — as seen from that thread.  Things may look very different from other threads — which can make it really hard to reason about multi-threaded code!
Let me try to describe one possible scenario…
Suppose Thread A is running One() on one CPU core, and Thread B is running Two() on another core, and that each core has its own cache memory.⁽³⁾
Thread B will create a List instance (holding references to strings from the constant pool), and assign it to the list property; both the object and the property are likely to be written to its cache first.  Those cache lines will then get flushed back to main memory — but there's no guarantee about when, nor about the order in which that happens.  Suppose the list reference gets flushed first; at that point, main memory will have the new list reference pointing to a fresh area of memory where the new object will go — but since the new object itself hasn't been flushed yet, who knows what's there now?
So if Thread A starts running One() at that precise moment, it will get the new list reference⁽⁴⁾, but when it tries to iterate through the list, it won't see the new strings.  It might see the initial (empty) state of the list object before it was constructed, or part-way through construction⁽⁵⁾.  (I don't know whether it's possible for it to see any of the values that were in those memory locations before the list was created; if so, those might represent an entirely different type of object, or even not a valid object at all, which would be likely to cause an exception or error of some kind.)
In any case, if multiple threads are involved, it's possible for one to see list holding neither the original list nor the new one.
So, if you want your code to be robust and not fail occasionally⁽⁶⁾, then you have to protect against such concurrency issues.
Using @Synchronized and @Volatile is traditional, as is using explicit locks.  (In this particular case, I think that making list volatile would fix the problem.)
But those low-level constructs are fiddly and hard to use well; luckily, in many situations there are better options.  The example in this question has been simplified too much to judge what might work well (that's the down-side of minimal examples!), but work queues, actors, executors, latches, semaphores, and of course Kotlin's coroutines are all useful abstractions for handling concurrency more safely.
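As a rough sketch of the volatile option mentioned above, assuming Kotlin/JVM: the class name Words and the lower-case function names are made up for the example. Marking the property @Volatile makes the write in two() happen-before any later read of list in one(), so a reader sees either the old list or the fully constructed new one:

```kotlin
import kotlin.concurrent.thread

class Words {
    // @Volatile guarantees that a thread reading the new reference also
    // sees the fully constructed list it points to.
    @Volatile
    private var list: List<String> = listOf("one", "two", "three")

    fun one(): List<String> {
        val snapshot = list          // read the reference exactly once
        val seen = mutableListOf<String>()
        snapshot.forEach { seen += it }
        return seen
    }

    fun two() {
        list = listOf("four", "five", "six")
    }
}

fun main() {
    val words = Words()
    val writer = thread { words.two() }   // runs concurrently with one()
    println(words.one())                  // prints one complete list, old or new
    writer.join()
    println(words.one())
}
```

Reading the property into a local variable once also keeps the iteration on a single list, even if the property is reassigned midway.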
Ultimately, concurrency is a hard topic, with a lot of gotchas and things that don't behave as you'd expect.
There are many sources of further information, such as:
These other questions cover some of the issues.
Chapter 17: Threads And Locks from the Java Language Specification is the ultimate reference on how the JVM behaves.  In particular, it describes what's needed to ensure a happens-before relationship that will ensure full visibility.
Oracle has a tutorial on concurrency in Java; much of this applies to Kotlin too.
The java.util.concurrent package has many useful classes, and its summary discusses some of these issues.
Concurrent Programming In Java: Design Principles And Patterns by Doug Lea was at one time the best guide to handling concurrency, and these excerpts discuss the Java memory model.
Wikipedia also covers the Java memory model
(1) According to Kotlin coding conventions, function names should start with a lower-case letter; that makes them easier to distinguish from class/object names.
(2) In this answer I'm assuming Kotlin/JVM.  Similar risks are likely to apply to other platforms too, though the details differ.
(3) This is of course a simplification; there may be multiple levels of caching, some of which may be shared between cores/processors; and some systems have hardware which tries to ensure that the caches are consistent…
(4) References themselves are atomic, so a thread will either see the old reference or the new one — it can't see a bit-pattern comprising parts of the old and new ones, pointing somewhere completely random.  So that's one problem we don't have!
(5) Although the reference is immutable, the object gets mutated during construction, so it might be in an inconsistent state.
(6) And the more heavily loaded your system is, the more likely it is for concurrency issues to occur, which means that things will probably fail at the worst possible time!

How to understand coroutine cancellation is cooperative

In Kotlin, coroutine cancellation is cooperative. How should I understand it?
Link to Kotlin documentation.
If you have a Java background, you may be familiar with the thread interruption mechanism. Any thread can call thread.interrupt() and the receiving thread will get a signal in the form of a Boolean isInterrupted flag becoming true. The receiving thread may check the flag at any time with Thread.currentThread().isInterrupted(), or it may ignore it completely. That's why this mechanism is said to be cooperative.
Kotlin's coroutine cancellation mechanism is an exact replica of this: you have a coroutineContext.isActive flag that you (or a function you call) may check.
In both cases some well-known functions, for example Thread.sleep() in Java and delay() in Kotlin, check this flag and throw an InterruptedException and CancellationException, respectively. These methods/functions are said to be "interruptible" / "cancellable".
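A minimal sketch of what "checking the flag" looks like in practice; the function name and the 50 ms pause are arbitrary choices for the example. The loop never calls a suspend function, so the only way it can notice cancellation is by reading isActive itself:

```kotlin
import kotlinx.coroutines.*

// Runs a CPU-bound loop that cooperates with cancellation by checking isActive.
// Returns how many iterations ran before the loop noticed the cancellation.
suspend fun spinUntilCancelled(): Long = coroutineScope {
    var iterations = 0L
    val job = launch(Dispatchers.Default) {
        while (isActive) {        // with `while (true)` the loop would never stop
            iterations++
        }
    }
    delay(50)                     // let the loop run for a moment
    job.cancelAndJoin()           // request cancellation, then wait for the loop to exit
    iterations
}

fun main(): Unit = runBlocking {
    println("Loop stopped after ${spinUntilCancelled()} iterations")
}
```

If the loop ignored isActive, cancelAndJoin() would wait forever, which is exactly the sense in which cancellation is cooperative.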
I'm not 100% sure whether I understand your question, but maybe this helps:
Coroutines are usually executed within the same thread you start them with. You can use different dispatchers, but they are designed to work when being started from the same thread. There's no extra scheduling happening.
You can compare this with scheduling mechanisms in an OS: coroutines behave similarly to cooperative scheduling. You find similar concepts in many frameworks and languages for dealing with async operations. Ruby, for example, has fibers, which behave similarly.
Basically this means that if a coroutine is hogging your CPU in a busy loop, you cannot cancel it (unless you kill the whole process). Instead, your coroutines have to check for cancellation regularly and also add waits/delays/yields so that other coroutines can work.
This also determines when coroutines are most helpful: when running in a single-threaded context, it doesn't help to use coroutines for local-only calculations. I have used them mostly for processing async calls, such as interactions with databases or web servers.
This article also has some explanations on how coroutines work - maybe it helps you with any additional questions: https://antonioleiva.com/coroutines/

How many coroutines is too many?

I need to speed up a search over some collection with millions of elements.
Search predicate needs to be passed as argument.
I have been wondering whether the simplest solution (at least for now) wouldn't be just using coroutines for the task.
The question I am facing right now is how many coroutines I can actually create at once. :D As a side note, there might be more than one such search running concurrently.
Can I make millions of coroutines (one for every item) for every such search? Should I decide on some workload per coroutine (for example, 1000 items per coroutine)? Should I also decide on some cap on the number of coroutines?
I have a rough understanding of coroutines and how they actually work; however, I have no idea what the performance limitations of this feature are.
Thanks!
The memory weight of a coroutine scales with the depth of the call trace from the coroutine builder block to the suspension point. Each suspend fun call adds another Continuation object to a linked list and this is retained while the coroutine is suspended. A rough figure for one Continuation instance is 100 bytes.
So, if you have a call trace depth of, say, 5, that amounts to 500 bytes per item. A million items is 500 MB.
However, unless your search code involves blocking operations that would leave a thread idle, you aren't gaining anything from coroutines. Your task looks more like an instance of data parallelism and you can solve it very efficiently using the java.util.stream API (as noted by user marstran in the comment).
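A minimal sketch of the stream-based approach, assuming the collection is a List and the predicate is cheap and thread-safe; parallelSearch is a made-up name for the example:

```kotlin
import java.util.stream.Collectors

// Filters the list in parallel on the common fork-join pool.
// The predicate is passed as an argument, as the question requires.
fun <T> parallelSearch(items: List<T>, predicate: (T) -> Boolean): List<T> =
    items.parallelStream()
        .filter { predicate(it) }
        .collect(Collectors.toList())

fun main() {
    val numbers = (1..1_000_000).toList()
    println(parallelSearch(numbers) { it % 250_000 == 0 })
}
```

The stream splits the list across the available cores by itself, so there is no chunk size or coroutine count to tune.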
According to the Kotlin coroutines getting-started guide, whose example launches 100K coroutines, I believe what you intend to do is exactly what Kotlin coroutines are designed for.
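For comparison, the chunked variant that the question asks about (for example 1000 items per coroutine) could be sketched like this. chunkedSearch and its default chunk size are assumptions for illustration, not a recommendation from the guide:

```kotlin
import kotlinx.coroutines.*

// Splits the list into chunks and filters each chunk in its own coroutine
// on Dispatchers.Default, then concatenates the results in order.
suspend fun <T> chunkedSearch(
    items: List<T>,
    chunkSize: Int = 1000,
    predicate: (T) -> Boolean
): List<T> = coroutineScope {
    items.chunked(chunkSize)
        .map { chunk -> async(Dispatchers.Default) { chunk.filter(predicate) } }
        .awaitAll()
        .flatten()
}

fun main(): Unit = runBlocking {
    val numbers = (1..1_000_000).toList()
    println(chunkedSearch(numbers) { it % 250_000 == 0 })
}
```

With a million items this launches 1000 coroutines rather than a million, and Dispatchers.Default spreads them over the available cores.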
If you will not make many modifications to your collection, just store it in a HashMap; otherwise store it in a TreeMap. Then just search for items there. I believe the search methods implemented there are optimized enough to handle a million items in a blink. I would not use coroutines in this case.
Documentation (for Kotlin):
HashMap: https://developer.android.com/reference/kotlin/java/util/HashMap
TreeMap: https://developer.android.com/reference/kotlin/java/util/TreeMap