How do actors in Kotlin work when run on different threads? - kotlin

In the actor example from the official kotlinlang.org documentation, an actor is launched 100 000 times which simply increments a counter inside the actor. Then a get request is sent to the actor and the counter is sent in the response with the correct amount (100 000).
This is the code:
// The messages
sealed class CounterMsg
object IncCounter : CounterMsg() // one-way message to increment counter
class GetCounter(val response: CompletableDeferred<Int>) : CounterMsg() // a two-way message to get the counter
// The actor
fun CoroutineScope.counterActor() = actor<CounterMsg> {
var counter = 0 // actor state
for (msg in channel) { // iterate over incoming messages
when (msg) {
is IncCounter -> counter++
is GetCounter -> msg.response.complete(counter)
}
}
}
fun main() {
runBlocking {
val counterActor = counterActor()
GlobalScope.massiveRun {
counterActor.send(IncCounter) // run action 100000 times
}
val response = CompletableDeferred<Int>()
counterActor.send(GetCounter(response))
println("Counter = ${response.await()}")
counterActor.close()
}
}
I have trouble understanding what would happen if the counterActor coroutines executed on multiple threads. If the coroutines ran on different threads, the variable 'counter' in the actor would be susceptible to a race condition, would it not?
Example: a coroutine on one thread receives from the channel, a coroutine on another thread receives as well, and both try to update the counter variable at the same time, so the variable is updated incorrectly.
The text that follows the code example says:
It does not matter (for correctness) what context the actor itself is executed in. An actor is a coroutine and a coroutine is executed sequentially, so confinement of the state to the specific coroutine works as a solution to the problem of shared mutable state.
I'm having a hard time understanding this. Could someone elaborate on what exactly this means, and why a race condition does not occur? When I run the example I see all coroutines run on the same main thread, so I cannot prove my theory about the race condition.

"actor is launched 100 000 times"
No, the actor is launched exactly once, at the line
val counterActor = counterActor()
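It then receives 100000 messages from 100 coroutines working in parallel on different threads. Those coroutines do not increment the counter variable directly; they only add messages to the actor's input message queue, an operation that the kotlinx.coroutines library implements in a thread-safe way. The actor itself is a single coroutine that takes one message at a time from that queue, so only one piece of sequential code ever touches counter, whatever thread it happens to run on.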
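To observe this on several threads, here is a minimal sketch that runs both the actor and its senders on Dispatchers.Default. The helper name countWithActor and its parameters are hypothetical, and the message classes are repeated from the question so the snippet is self-contained.

```kotlin
import kotlinx.coroutines.*
import kotlinx.coroutines.channels.actor

sealed class CounterMsg
object IncCounter : CounterMsg()
class GetCounter(val response: CompletableDeferred<Int>) : CounterMsg()

// A single actor coroutine owns `counter`; nothing else can reach it.
@OptIn(ObsoleteCoroutinesApi::class)
fun CoroutineScope.counterActor() = actor<CounterMsg> {
    var counter = 0
    for (msg in channel) {
        when (msg) {
            is IncCounter -> counter++
            is GetCounter -> msg.response.complete(counter)
        }
    }
}

// Hypothetical helper: `senders` coroutines each send `perSender` increments.
// Everything, including the actor, runs on the multi-threaded Dispatchers.Default.
fun countWithActor(senders: Int, perSender: Int): Int = runBlocking(Dispatchers.Default) {
    val counter = counterActor()
    coroutineScope {
        repeat(senders) {
            launch { repeat(perSender) { counter.send(IncCounter) } }
        }
    } // returns only after every sender has finished
    val response = CompletableDeferred<Int>()
    counter.send(GetCounter(response))
    val result = response.await()
    counter.close() // lets the actor's for-loop end so runBlocking can return
    result
}

fun main() {
    println("Counter = ${countWithActor(100, 1000)}")
}
```

The senders may run on different threads, but each increment is applied by the one actor coroutine, so no update is lost.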

Related

launch long-running task then immediately send HTTP response

Using ktor HTTP server, I would like to launch a long-running task and immediately return a message to the calling client. The task is self-sufficient, it's capable of updating its status in a db, and a separate HTTP call returns its status (i.e. for a progress bar).
What I cannot seem to do is just launch the task in the background and respond. All my attempts at responding wait for the long-running task to complete. I have experimented with many configurations of runBlocking and coroutineScope but none are working for me.
// ktor route
get("/launchlongtask") {
val text: String = (myFunction(call.request.queryParameters["loops"]!!.toInt()))
println("myFunction returned")
call.respondText(text)
}
// in reality, this function is complex... the caller (route) is not able to
// determine the response string, it must be done here
suspend fun myFunction(loops : Int) : String {
runBlocking {
launch {
// long-running task, I want to launch it and move on
(1..loops).forEach {
println("this is loop $it")
delay(2000L)
// updates status in db here
}
}
println("returning")
// this string must be calculated in this function (or a sub-function)
return@runBlocking "we just launched $loops loops"
}
return "never get here" // actually we do get here in a coroutineScope
}
output:
returning
this is loop 1
this is loop 2
this is loop 3
this is loop 4
myFunction returned
expected:
returning
myFunction returned
(response sent)
this is loop 1
this is loop 2
this is loop 3
this is loop 4
Just to explain the issue with the code in your question, the problem is using runBlocking. This is meant as the bridge between the synchronous world and the async world of coroutines and
"the name of runBlocking means that the thread that runs it ... gets blocked for the duration of the call, until all the coroutines inside runBlocking { ... } complete their execution."
(from the Coroutine docs).
So in your first example, myFunction won't complete until the coroutine containing the loop completes.
The correct approach is what you do in your answer: using CoroutineScope to launch your long-running task. One thing to point out is that you are passing only a Job() as the CoroutineContext parameter to the CoroutineScope constructor. A CoroutineContext can contain several elements: Job, CoroutineDispatcher, CoroutineExceptionHandler... In this case, because you don't specify a CoroutineDispatcher, it will use Dispatchers.Default. This is intended for CPU-intensive tasks and is limited to "the number of CPU cores (with a minimum of 2)". This may or may not be what you want. An alternative is Dispatchers.IO, which by default is limited to 64 threads (or the number of cores, if that is larger).
Inspired by this answer by Lucas Milotich, I used CoroutineScope(Job()) and it seems to work:
suspend fun myFunction(loops : Int) : String {
CoroutineScope(Job()).launch {
// long-running task, I want to launch it and move on
(1..loops).forEach {
println("this is loop $it")
delay(2000L)
// updates status in db here
}
}
println("returning")
return "we just launched $loops loops"
}
Not sure if this is resource-efficient or the preferred way to go, but I don't see much other documentation on the topic.
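One possible variation, sketched under assumptions: instead of creating a new CoroutineScope on every call, keep a single long-lived scope with an explicit dispatcher. The names backgroundScope, completedLoops and startLongTask are hypothetical, the AtomicInteger stands in for the db status update, and the ktor route is left out so the snippet runs on its own.

```kotlin
import kotlinx.coroutines.*
import java.util.concurrent.atomic.AtomicInteger

// Hypothetical application-level scope. SupervisorJob keeps one failed task
// from cancelling the others; Dispatchers.IO is chosen explicitly.
val backgroundScope = CoroutineScope(SupervisorJob() + Dispatchers.IO)

// Stand-in for the status that the real task would write to the db.
val completedLoops = AtomicInteger(0)

// Launches the task and returns immediately with the response text.
// The Job is returned only so that the demo below can wait for it.
fun startLongTask(loops: Int): Pair<String, Job> {
    val job = backgroundScope.launch {
        repeat(loops) {
            delay(50L) // shortened from 2000L for the demo
            completedLoops.incrementAndGet()
        }
    }
    return "we just launched $loops loops" to job
}

fun main() = runBlocking {
    val (text, job) = startLongTask(3)
    println(text) // printed before the task has finished
    job.join()
    println("completed ${completedLoops.get()} loops")
}
```

In a route handler only the returned text would be used; the task keeps running in backgroundScope after the response is sent.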

What is the difference between limitedParallelism vs a fixed thread pool dispatcher?

I am trying to use Kotlin coroutines to perform multiple HTTP calls concurrently, rather than one at a time, but I would like to avoid making all of the calls concurrently, to avoid rate limiting by the external API.
If I simply launch a coroutine for each request, they are all sent near instantly. So I looked into the limitedParallelism function, which sounds very close to what I need and which some Stack Overflow answers suggest is the recommended solution. Older answers to the same question suggested using newFixedThreadPoolContext.
The documentation for that function mentioned limitedParallelism as a preferred alternative "if you do not need a separate thread pool":
If you do not need a separate thread-pool, but only have to limit effective parallelism of the dispatcher, it is recommended to use CoroutineDispatcher.limitedParallelism instead.
However, when I write my code to use limitedParallelism, it does not reduce the number of concurrent calls, compared to newFixedThreadPoolContext which does.
In the example below, I replace my network calls with Thread.sleep, which does not change the behavior.
// method 1
val fixedThreadPoolContext = newFixedThreadPoolContext(2, "Pool") // a thread name is required
// method 2
val limitedParallelismContext = Dispatchers.IO.limitedParallelism(2)
runBlocking {
val jobs = (1..1000).map {
// swap out the dispatcher here
launch(limitedParallelismContext) {
println("started $it")
Thread.sleep(1000)
println(" finished $it")
}
}
jobs.joinAll()
}
The behavior for fixedThreadPoolContext is as expected: no more than 2 of the coroutines run at a time, and the total time to finish is several minutes (1000 times one second each, divided by two at a time, roughly 500 seconds).
However, for limitedParallelismContext, all "started #" lines print immediately, and one second later, all "finished #" lines print and the program completes in just over 1 total second.
Why does limitedParallelism not have the same effect as using a separate thread pool? What does it accomplish?
I modified your code slightly so that every coroutine takes 200ms to complete and it prints the time when it is completed. Then I pasted it to play.kotlinlang.org to check:
import kotlinx.coroutines.*
fun main() {
// method 1
val fixedThreadPoolContext = newFixedThreadPoolContext(2, "Pool")
// method 2
val limitedParallelismContext = Dispatchers.IO.limitedParallelism(2)
runBlocking {
val jobs = (1..10).map {
// swap out the dispatcher here
launch(limitedParallelismContext) {
println("it at ${System.currentTimeMillis()}")
Thread.sleep(200)
}
}
jobs.joinAll()
}
}
There, using Kotlin 1.6.21, the result is as expected:
it at 1652887163155
it at 1652887163157
it at 1652887163358
it at 1652887163358
it at 1652887163559
it at 1652887163559
it at 1652887163759
it at 1652887163759
it at 1652887163959
it at 1652887163959
Only 2 coroutines are executed at a time.
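Rather than reading timestamps, the limit can also be checked by counting how many blocks are running at once. This is only a sketch: maxObservedParallelism is a hypothetical helper, and the @OptIn is there because limitedParallelism was marked experimental in the 1.6.x releases of kotlinx.coroutines.

```kotlin
import kotlinx.coroutines.*
import java.util.concurrent.atomic.AtomicInteger

// Hypothetical helper: runs `tasks` blocking jobs on a limited view of
// Dispatchers.IO and reports the highest number seen running at the same time.
@OptIn(ExperimentalCoroutinesApi::class)
fun maxObservedParallelism(limit: Int, tasks: Int): Int {
    val dispatcher = Dispatchers.IO.limitedParallelism(limit)
    val running = AtomicInteger(0)
    val maxSeen = AtomicInteger(0)
    runBlocking {
        (1..tasks).map {
            launch(dispatcher) {
                val now = running.incrementAndGet()
                maxSeen.updateAndGet { prev -> maxOf(prev, now) }
                Thread.sleep(50) // blocking work, as in the question
                running.decrementAndGet()
            }
        }.joinAll()
    }
    return maxSeen.get()
}

fun main() {
    println("max running at once: ${maxObservedParallelism(2, 10)}")
}
```

With a library version where limitedParallelism behaves as documented, the reported value should not exceed the limit passed in.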

Channel flow send suspend behavior

With the following code sample:
val scope = CoroutineScope(Dispatchers.IO)
val flow = channelFlow {
println("1")
send("1")
println("2")
send("2")
}.buffer(0)
scope.launch {
flow.collect {
println("collect $it")
delay(5000)
}
}
The following output:
1
2 // should be printed after collect 1
collect 1
collect 2 // after 5000ms
Expected:
print 1, then print collect 1, wait 5 seconds, then print 2
It seems that the send function does not suspend, even with the buffer set to 0 or RENDEZVOUS. A standard flow with emit suspends and works as expected. Is there another operator I should use, or is there a way to make a channel flow suspend on send (i.e. have a buffer with 0/1 capacity)?
As the documentation of Flow.buffer() states, this function creates two coroutines: one producer and one consumer. These coroutines work concurrently. That means that by the time the collect() block is launched to process an item, send() on the other side has already resumed. I believe there is a race between printing "2" and printing "collect 1", although in practice the order may be deterministic.
"Normal" flow with emit() works differently. It works sequentially, so the producer and the consumer don't run at the same time. emit() suspends until the consumer finishes working on the previous item and requests another one.
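The sequential behaviour of a plain flow can be seen by recording the order of events. A minimal sketch, with sequentialTrace as a hypothetical helper and the delay shortened for the demo:

```kotlin
import kotlinx.coroutines.*
import kotlinx.coroutines.flow.*

// Hypothetical helper: records the order in which the producer and the
// collector run when a plain flow { } builder is used.
fun sequentialTrace(): List<String> = runBlocking {
    val trace = mutableListOf<String>()
    flow {
        trace += "1"
        emit("1") // runs the collector body for "1" before returning
        trace += "2"
        emit("2")
    }.collect {
        trace += "collect $it"
        delay(100) // the producer does not continue during this delay
    }
    trace
}

fun main() {
    println(sequentialTrace()) // [1, collect 1, 2, collect 2]
}
```

Here the producer and the collector run in the same coroutine, so "2" is recorded only after the collector has finished with the first item.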

Kotlin actor coroutine and concurrency issue

When looking at the example https://github.com/kotlin/kotlinx.coroutines/blob/master/kotlinx-coroutines-core/jvm/test/guide/example-sync-07.kt , more specifically at the following fragment
// This function launches a new counter actor
fun CoroutineScope.counterActor() = actor<CounterMsg> {
var counter = 0 // actor state
for (msg in channel) { // iterate over incoming messages
when (msg) {
is IncCounter -> counter++
is GetCounter -> msg.response.complete(counter)
}
}
}
I cannot help but notice that if, for example, 5 other coroutines send a GetCounter message to the above actor, they will be "served" sequentially, one after another, i.e. no parallelism is achieved. Even though we have a CPU with 8 physical threads, each of the "client" coroutines reading the counter value will be constrained by the sequential execution of the actor coroutine.
How can we avoid this sequential processing when using actor coroutine?
Thanks

Are Kotlin channels used in coroutines thread safe / synchronized / keep happens-before relationship?

Are the functions available in Kotlin channels thread safe? e.g.
val channel = Channel<Boolean>()
val job1 = GlobalScope.launch {
channel.send(true)
}
val job2 = GlobalScope.launch {
val x = channel.poll()
}
If in the above code job1 was executed by the machine (in real time) before job2 is executed and on different threads, is it guaranteed that x is set with true? Or is it possible that it gets set with null (because cpu cache was not updated)?
The Channel class in the kotlinx.coroutines library is thread-safe. It is designed to support multiple threads.
Note that GlobalScope.launch does not necessarily mean a coroutine will be executed on a new thread.
"If in the above code job1 was executed by the machine (in real time) before job2 is executed and on different threads, is it guaranteed that x is set with true? Or is it possible that it gets set with null (because cpu cache was not updated)?"
The Java Memory Model has no notion of time and it doesn't guarantee anything just based on the fact that a line executed earlier than another one. You can't even ascertain when an action was executed on a CPU.
In the code you posted, there are two concurrently executing coroutines. If and only if channel.poll() gets a non-null value, there is a happens-before edge going from send() to poll(). If it gets a null-value, there is no happens-before edge.
Let's say you determine the wall-clock time in the two coroutines, something like the following:
var sendTime: Long = 0
var receiveTime: Long = 0
suspend fun main() {
val channel = Channel<Boolean>(UNLIMITED)
val job1 = GlobalScope.launch {
channel.send(true)
sendTime = System.nanoTime()
}
val job2 = GlobalScope.launch {
receiveTime = System.nanoTime()
val x = channel.poll()
println(x)
}
job1.join()
job2.join()
println("${receiveTime - sendTime}")
}
The fact that receiveTime is greater than sendTime does not induce a happens-before relationship and it doesn't force channel.poll() to observe the sent item. Calling nanoTime() is not a synchronization action.
Note that these facts have nothing to do with Kotlin or coroutines specifically; this is how the Java Memory Model works. If you study the C++ memory model, you'll find it works the same way.
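For contrast, here is a sketch of the case where the happens-before edge does exist: the receiver actually obtains the element from the channel. Box and receivedValue are hypothetical names, and the field is deliberately a plain, non-volatile one.

```kotlin
import kotlinx.coroutines.*
import kotlinx.coroutines.channels.Channel

class Box { var value = 0 } // plain field, deliberately not @Volatile

// Hypothetical helper: one coroutine writes the field and then sends,
// another receives and then reads the field.
fun receivedValue(): Int = runBlocking {
    val box = Box()
    val channel = Channel<Unit>()
    launch(Dispatchers.Default) {
        box.value = 42     // ordinary write before send()
        channel.send(Unit)
    }
    withContext(Dispatchers.Default) {
        channel.receive()  // suspends until the element is delivered
        box.value          // ordinary read after a successful receive
    }
}

fun main() {
    println(receivedValue())
}
```

The read is ordered after the write because the element was transferred through the channel, not because of when each line happened to execute in real time.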