Strange speedup in Kotlin coroutines?

I have these two kotlin code snippets for measuring the startup times of threads.
First:
import kotlinx.coroutines.*
fun main() {
    runBlocking {
        for (i in 0..99) {
            val start = System.nanoTime()
            launch {
                val time = System.nanoTime() - start
                println("Starttime: %,d".format(time))
            }
        }
    }
}
This code starts by printing a value of around 35.000.000 and, over the 100 iterations, the value grows until it ends at around 75.000.000.
Now if I run this code
import kotlinx.coroutines.*
fun main() {
    runBlocking {
        val list: MutableList<Long> = MutableList<Long>(100, { 0 })
        for (i in 0..99) {
            val start = System.nanoTime()
            launch {
                list[i] = System.nanoTime() - start
            }
        }
        delay(1000) // wait for all coroutines to have stored result
        list.forEach { println("Starttime: %,d".format(it)) }
    }
}
This code is obviously faster, since the print statement is not inside the launch block. But there is some other strange behaviour: it starts around 50.000.000, jumps directly to 17.083.000, and then slowly climbs back up to 29.507.300, where it ends.
So my question is: why does the first piece of code deteriorate so quickly, while the second is so much faster and doesn't become slower over time as I add more coroutines?

You're making a small mistake by not preloading classes before measuring time, so the results are skewed. You should also print each job's index so you can see in what order the jobs actually complete.
I'd suggest modifying your code samples:
First:
import kotlinx.coroutines.*
suspend fun main() {
    // dummy calls to preload classes before measuring
    val s = System.nanoTime()
    GlobalScope.launch { println("starting %,d".format(s)) }.join()
    repeat(100) { i ->
        val start = System.nanoTime()
        GlobalScope.launch {
            val time = System.nanoTime() - start
            println("Starttime[$i]: %,d".format(time))
        }
    }
    delay(1000) // wait for all coroutines to print result
}
Second:
import kotlinx.coroutines.*
suspend fun main() {
    // dummy calls to preload classes before measuring
    val s = System.nanoTime()
    GlobalScope.launch { println("starting %,d".format(s)) }.join()
    val list: MutableList<Long> = MutableList<Long>(100, { 0 })
    repeat(100) { i ->
        val start = System.nanoTime()
        GlobalScope.launch {
            list[i] = System.nanoTime() - start
        }
    }
    delay(1000) // wait for all coroutines to have stored result
    list.forEachIndexed { i, it -> println("Starttime[$i]: %,d".format(it)) }
}
The first sample obviously "deteriorates" because you pile up 100 jobs almost instantly, and each one takes a significant amount of time to access System.out and print. You will also most likely find that they do not print their indices in order.
The second sample is more or less stable because the jobs don't perform any heavy operations, so there's usually a free thread in the Dispatchers.Default pool to execute the launch block immediately.
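As a variation on the second sample, here is a minimal sketch that waits for the jobs with joinAll() instead of guessing with delay(1000). The function name measureLaunchLatencies is made up for illustration, and the empty warm-up launch is only an assumption about what is enough to preload the relevant classes on your JVM.

```kotlin
import kotlinx.coroutines.*

// Hypothetical helper: for each of `count` coroutines, records the time between
// calling launch and the first line of the coroutine body running.
fun measureLaunchLatencies(count: Int): LongArray = runBlocking {
    // Warm-up: load coroutine classes and start the Dispatchers.Default pool
    // before any measurement is taken.
    launch(Dispatchers.Default) { }.join()

    val latencies = LongArray(count)
    val jobs = List(count) { i ->
        val start = System.nanoTime()
        launch(Dispatchers.Default) {
            // Each coroutine writes only its own slot, so no locking is needed.
            latencies[i] = System.nanoTime() - start
        }
    }
    jobs.joinAll() // wait until every coroutine has stored its result
    latencies
}

fun main() {
    measureLaunchLatencies(100).forEachIndexed { i, t ->
        println("Starttime[$i]: %,d".format(t))
    }
}
```

Joining the jobs means the results are read only after every coroutine has finished, however long that takes on a given machine.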

Related

Why can I cancel a Flow without either invoking yield or checking isActive in Kotlin?

I have read the article.
There are two approaches to making computation code cancellable. The first one is to periodically invoke a suspending function that checks for cancellation. There is a yield function that is a good choice for that purpose. The other one is to explicitly check the cancellation status.
I know that a Flow is built from suspending functions.
I ran Code B and got Result B, as I expected.
I expected that the computation in Code A would not be cancellable, but in fact I can click the "Stop" button to cancel the Flow after clicking the "Start" button to begin emitting. Why?
Code A
class HandleMeter: ViewModel() {
    var currentInfo by mutableStateOf(2.0)
    private var myJob: Job? = null
    private fun soundDbFlow() = flow {
        while (true) {
            val data = (0..1000).random().toDouble()
            emit(data)
        }
    }
    fun calCurrentAsynNew() {
        myJob?.cancel()
        myJob = viewModelScope.launch(Dispatchers.IO) {
            soundDbFlow().collect { currentInfo = it }
        }
    }
    fun cancelJob() {
        myJob?.cancel()
    }
}
@Composable
fun Greeting(handleMeter: HandleMeter) {
    var currentInfo = handleMeter.currentInfo
    Column(
        modifier = Modifier.fillMaxSize(),
    ) {
        Text(text = "Current ${currentInfo}")
        Button(
            onClick = { handleMeter.calCurrentAsynNew() }
        ) {
            Text("Start")
        }
        Button(
            onClick = { handleMeter.cancelJob() }
        ) {
            Text("Stop")
        }
    }
}
Code B
import kotlinx.coroutines.*
fun main() = runBlocking {
    val job = launch(Dispatchers.IO) {
        cal()
    }
    delay(1300L) // delay a bit
    println("main: I'm tired of waiting!")
    job.cancelAndJoin()
    println("main: Now I can quit.")
}
suspend fun cal() {
    val startTime = System.currentTimeMillis()
    var nextPrintTime = startTime
    var i = 0
    while (i < 5) {
        if (System.currentTimeMillis() >= nextPrintTime) {
            println("job: I'm sleeping ${i++} ...")
            nextPrintTime += 500L
        }
    }
}
Result B
job: I'm sleeping 0 ...
job: I'm sleeping 1 ...
job: I'm sleeping 2 ...
main: I'm tired of waiting!
job: I'm sleeping 3 ...
job: I'm sleeping 4 ...
main: Now I can quit.
Add Content:
To Tenfour04: Thanks!
If what you said below is true, I think Code C can be cancelled as soon as doBigBlockingCalculation() finishes its current run, right? Why do I need Code D?
Since emit() is a suspend function, your Flow is able to interrupt and end the coroutine the next time the emit() function is called in that while loop.
Code C
private fun complicatedFlow() = flow {
    while (true) {
        val data = (0..1_000_000).doBigBlockingCalculation()
        emit(data)
    }
}.flowOn(Dispatchers.Default) // since the calculation is blocking
Code D
private fun complicatedFlow() = flow {
    while (true) {
        val data = (0..1_000_000)
            .chunked(100_000)
            .flatMap {
                it.doBigBlockingCalculation().also { yield() }
            }
        emit(data)
    }
}.flowOn(Dispatchers.Default) // since the calculation is blocking
A Flow on its own is cold. It's a wrapper around some suspend functions that will run when collect() or some other terminal suspending function is called on the Flow.
In your Code A, when the Job is cancelled, it is cancelling the coroutine that called collect on the Flow. collect is a suspend function, so that cancellation will propagate down to the function you defined inside soundDbFlow(). Since emit() is a suspend function, your Flow is able to interrupt and end the coroutine the next time the emit() function is called in that while loop.
Here's an example for how you could use this knowledge:
Suppose your function had to do a very long calculation like this:
private fun complicatedFlow() = flow {
    while (true) {
        val data = (0..1_000_000).doBigBlockingCalculation()
        emit(data)
    }
}.flowOn(Dispatchers.Default) // since the calculation is blocking
Now if you tried to cancel this flow, it would work, but since the data line is a very slow operation that is not suspending, the Flow will still complete this very long calculation for no reason, eating up resources for longer than necessary.
To resolve this problem, you could break your calculation up into smaller pieces with yield() calls in between. Then the Flow can be cancelled more promptly.
private fun complicatedFlow() = flow {
    while (true) {
        val data = (0..1_000_000)
            .chunked(100_000)
            .flatMap {
                it.doBigBlockingCalculation().also { yield() }
            }
        emit(data)
    }
}.flowOn(Dispatchers.Default) // since the calculation is blocking
Not a perfect example. It's kind of wasteful to chunk a big IntRange. An IntRange takes barely any memory, but chunked turns it into Lists containing every value in the range.
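Here is a runnable sketch of the same idea that avoids materialising the range with chunked. The names sumChunk and chunkedSums are hypothetical, and summing integers is only a stand-in for a genuinely slow, non-suspending calculation.

```kotlin
import kotlinx.coroutines.*
import kotlinx.coroutines.flow.*

// Hypothetical stand-in for a slow, non-suspending calculation.
fun sumChunk(range: IntRange): Long = range.sumOf { it.toLong() }

fun chunkedSums(): Flow<Long> = flow {
    while (true) {
        var total = 0L
        // Walk the range in sub-ranges of 100_000 without building lists.
        for (start in 0 until 1_000_000 step 100_000) {
            total += sumChunk(start until start + 100_000)
            yield() // suspension point: cancellation is noticed between chunks
        }
        emit(total)
    }
}.flowOn(Dispatchers.Default) // since the calculation is blocking

fun main() = runBlocking {
    val job = launch { chunkedSums().collect { } }
    delay(100)
    job.cancelAndJoin() // returns once the flow has stopped
    println("cancelled: ${job.isCancelled}")
}
```

With the yield() calls, a cancelled collector waits for at most one chunk to finish rather than the whole range.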
It has to do with CoroutineScopes and children of coroutines.
When a parent coroutine is canceled, all its children are canceled as well.
More here:
https://kotlinlang.org/docs/coroutine-context-and-dispatchers.html#children-of-a-coroutine
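A minimal sketch of that parent/child rule, assuming nothing beyond kotlinx.coroutines itself. The function name countCancelledChildren is made up for illustration.

```kotlin
import kotlinx.coroutines.*

// Hypothetical helper: starts `children` child coroutines under one parent,
// cancels the parent, and reports how many children saw the cancellation.
fun countCancelledChildren(children: Int): Int = runBlocking {
    var cancelled = 0 // only touched on runBlocking's single thread
    val parent = launch {
        repeat(children) {
            launch {
                try {
                    awaitCancellation() // suspend until this child is cancelled
                } finally {
                    cancelled++ // runs when the cancellation reaches the child
                }
            }
        }
    }
    delay(50)              // give the children time to start
    parent.cancelAndJoin() // cancelling the parent cancels every child
    cancelled
}

fun main() {
    println("children cancelled: ${countCancelledChildren(3)}")
}
```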

Why can't I parallel operation when I use either delay() or yield() in Kotlin?

Code A, Code B and Code C all produce the same output, Result All.
I think Code B or Code C should produce Result MyThink instead, because I have added either delay() or yield().
It seems that flow.collect {...} is a block function.
Code A
fun foo(): Flow<Int> = flow {
    println("Flow started")
    for (i in 1..3) {
        delay(500)
        emit(i)
    }
}
fun main() = runBlocking<Unit> {
    println("Calling foo...")
    val flow = foo()
    println("Calling collect...")
    flow.collect { value ->
        run {
            println(value)
        }
    }
    println("Done")
}
Code B
fun foo(): Flow<Int> = flow {
    println("Flow started")
    for (i in 1..3) {
        delay(500)
        emit(i)
    }
}
fun main() = runBlocking<Unit> {
    println("Calling foo...")
    val flow = foo()
    println("Calling collect...")
    flow.collect { value ->
        run {
            println(value)
            delay(200)
        }
    }
    println("Done")
}
Code C
fun foo(): Flow<Int> = flow {
    println("Flow started")
    for (i in 1..3) {
        delay(500)
        emit(i)
    }
}
fun main() = runBlocking<Unit> {
    println("Calling foo...")
    val flow = foo()
    println("Calling collect...")
    flow.collect { value ->
        run {
            println(value)
            yield()
        }
    }
    println("Done")
}
Result All
Calling foo...
Calling collect...
Flow started
1
2
3
Done
Result MyThink
Calling foo...
Calling collect...
Flow started
1
Done
2
3
It seems that flow.collect {...} is a block function.
That's not true in a literal sense, but there really is behaviour here that you might phrase as "blocking".
collect is a suspending function, which will return only after it has collected all of the items in the Flow that it was called on. Whenever the Flow suspends (with delay or yield, for example), the collection of the Flow is also suspended. This is all happening in the same coroutine (started by runBlocking in this case) that's suspended together. The Flow yielding values and collect processing them will continue after the suspension is over. Finally, when everything's collected, collect will return, and any code you have after it in that same coroutine will run.
This is consistent with the idea that coroutines are sequential by default, i.e. everything is executed top-to-bottom in your code, in order. If you want concurrent behaviour, you have to explicitly opt into it (for example, by launching new coroutines within the current one, with launch, or async). So what you call "blocking" is really just sequential. The collect function does not work like registering a listener would with many other APIs.
To understand the basic idea behind Flow, and how collecting it works within the same coroutine, I always recommend this talk.
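To make the "opt into concurrency" point concrete, here is a small sketch that collects the flow in a separately launched coroutine. The function name collectConcurrently and the shortened delays are my own choices for illustration.

```kotlin
import kotlinx.coroutines.*
import kotlinx.coroutines.flow.*

fun foo(): Flow<Int> = flow {
    for (i in 1..3) {
        delay(50)
        emit(i)
    }
}

// Hypothetical helper: records the order in which things happen when the
// collection runs in its own coroutine instead of the calling one.
fun collectConcurrently(): List<String> = runBlocking {
    val events = mutableListOf<String>()
    // collect now suspends only the launched coroutine, not this one.
    val job = launch { foo().collect { events += "value $it" } }
    events += "after launch" // reached before any value has been collected
    job.join()
    events
}

fun main() {
    collectConcurrently().forEach(::println)
}
```

Here the code after launch runs first, and the three values are collected while the outer coroutine waits in join().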
If you want behaviour similar to Rx, you can use onEach with launchIn(this) instead of collect:
flow.onEach {
    print(it)
}.launchIn(this)
https://proandroiddev.com/from-rxjava-2-to-kotlin-flow-threading-8618867e1955

Can I get return actual value if I use launch in Kotlin?

I'm learning Coroutines of Kotlin.
Code A uses async, and I can call .await() on the deferred value to get its eventual result, so one.await() returns an Int.
If I use launch instead, can I get the actual value in the same way as with one.await()?
Code A
val time = measureTimeMillis {
    val one = async { doSomethingUsefulOne() }
    val two = async { doSomethingUsefulTwo() }
    println("The answer is ${one.await() + two.await()}")
}
println("Completed in $time ms")
suspend fun doSomethingUsefulOne(): Int {
    delay(1000L) // pretend we are doing something useful here
    return 13
}
suspend fun doSomethingUsefulTwo(): Int {
    delay(1000L) // pretend we are doing something useful here, too
    return 29
}
There is no [standard non-hacky] way to get the result from launch. This is the whole point of having async.
The two functions launch and async have precisely one difference: launch is fire-and-forget, while async lets you wait for a result.
Therefore, there is no reason to use launch over async unless you don't need the result, so the question is quite surprising.
If you use launch, the "actual value" is Unit as you can see from the signature
fun CoroutineScope.launch(
    context: CoroutineContext = EmptyCoroutineContext,
    start: CoroutineStart = CoroutineStart.DEFAULT,
    block: suspend CoroutineScope.() -> Unit
): Job
so you don't even need to start it.
If you pass a lambda to launch as in
launch { doSomethingUsefulOne() }
it is really
launch { doSomethingUsefulOne(); Unit }
and the value of doSomethingUsefulOne() is thrown away.
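A short sketch of the difference in return types, reusing doSomethingUsefulOne from the question with a shorter delay so it runs quickly.

```kotlin
import kotlinx.coroutines.*

suspend fun doSomethingUsefulOne(): Int {
    delay(100L) // pretend we are doing something useful here
    return 13
}

fun main() = runBlocking {
    // launch returns a Job: you can wait for completion, but the Int is discarded.
    val job: Job = launch { doSomethingUsefulOne() }
    job.join()

    // async returns a Deferred<Int>: await() hands back the computed value.
    val deferred: Deferred<Int> = async { doSomethingUsefulOne() }
    println("async result: ${deferred.await()}")
}
```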
The short answer is NO.
As you pointed out, async and await will get you the result.
But launch is used for a different purpose: it acts as a bridge between the coroutine and non-coroutine worlds.
Consider an example of a ViewModel whose coroutine world is controlled by viewModelScope:
fun nonCoroutineWorldFunction() {
    ....
    ....
    viewModelScope.launch {
        // runs in coroutine world
    }
    ....
    ....
    viewModelScope.launch {
        // runs in coroutine world
    }
}
Launch can be considered something similar to FIRE AND FORGET. You just launch it to do its job rather than waiting for it to do its job

Async doesn't appear to run concurrently

The following code uses 2 async function calls:
import kotlinx.coroutines.*
import kotlin.system.*
fun main() = runBlocking<Unit> {
    val time = measureTimeMillis {
        val one = async { doSomethingUsefulOne() }
        val two = async { doSomethingUsefulTwo() }
        println("The answer is ${one.await() + two.await()}")
    }
    println("Completed in $time ms")
}
suspend fun doSomethingUsefulOne(): Int {
    delay(3000L) // pretend we are doing something useful here
    println("first")
    return 13
}
suspend fun doSomethingUsefulTwo(): Int {
    delay(1000L) // pretend we are doing something useful here, too
    println("second")
    return 29
}
The println results in "first" being printed first followed by "second". But according to the docs, these asyncs should be running concurrently. But since the first one has a delay of 3 seconds, why is its println running first? In fact, it doesn't even matter if I put the println before or after the delay. I get the same result.
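Both coroutines start as soon as async is called, so they do run concurrently. The await() calls in your print line do not start anything:
println("The answer is ${one.await() + two.await()}")
await() only suspends the caller until the corresponding Deferred has a result; Deferred has no launch() function to swap in for it.
With the delays shown, "second" should be printed after about one second and "first" after about three, and the whole block should complete in roughly 3000 ms rather than 4000 ms, so the order you describe is not what this code should produce.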
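One way to check the ordering for yourself is to record when each async block finishes. This is only a sketch: completionOrder is a made-up name, and the delays are shortened to 300 ms and 100 ms.

```kotlin
import kotlinx.coroutines.*

// Hypothetical helper: records the order in which the two async blocks finish.
fun completionOrder(): List<String> = runBlocking {
    val order = mutableListOf<String>()
    // Both coroutines are started here, before either await() is called.
    val slow = async { delay(300); order += "first"; 13 }
    val fast = async { delay(100); order += "second"; 29 }
    // await() only suspends until each result is ready.
    val sum = slow.await() + fast.await()
    order += "sum=$sum"
    order
}

fun main() {
    println(completionOrder()) // prints [second, first, sum=42]
}
```

The coroutine with the shorter delay finishes first even though it is awaited second, which is what running concurrently means here.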

Kotlin coroutines progress counter

I'm making thousands of HTTP requests using async/await and would like to have a progress indicator. I've added one in a naive way, but noticed that the counter value never reaches the total when all requests are done. So I've created a simple test and, sure enough, it doesn't work as expected:
fun main(args: Array<String>) {
    var i = 0
    val range = (1..100000)
    range.map {
        launch {
            ++i
        }
    }
    println("$i ${range.count()}")
}
The output is something like this, where the first number always changes:
98800 100000
I'm probably missing some important detail about concurrency/synchronization in JVM/Kotlin, but don't know where to start. Any tips?
UPDATE: I ended up using channels as Marko suggested:
/**
 * Asynchronously fetches stats for all symbols and sends a total number of requests
 * to the `counter` channel each time a request completes. For example:
 *
 *     val counterActor = actor<Int>(UI) {
 *         var counter = 0
 *         for (total in channel) {
 *             progressLabel.text = "${++counter} / $total"
 *         }
 *     }
 */
suspend fun getAssetStatsWithProgress(counter: SendChannel<Int>): Map<String, AssetStats> {
    val symbolMap = getSymbols()?.let { it.map { it.symbol to it }.toMap() } ?: emptyMap()
    val total = symbolMap.size
    return symbolMap.map { async { getAssetStats(it.key) } }
        .mapNotNull { it.await().also { counter.send(total) } }
        .map { it.symbol to it }
        .toMap()
}
Explaining exactly why your approach fails is secondary; the primary thing is fixing the approach.
Instead of async-await or launch, for this communication pattern you should instead have an actor to which all the HTTP jobs send their status. This will automatically handle all your concurrency issues.
Here's some sample code, taken from the link you provided in the comment and adapted to your use case. Instead of some third party asking it for the counter value and updating the GUI with it, the actor runs in the UI context and updates the GUI itself:
import kotlinx.coroutines.experimental.*
import kotlinx.coroutines.experimental.channels.*
import kotlin.system.*
import kotlin.coroutines.experimental.*
object IncCounter
fun counterActor() = actor<IncCounter>(UI) {
    var counter = 0
    for (msg in channel) {
        updateView(++counter)
    }
}
fun main(args: Array<String>) = runBlocking {
    val counter = counterActor()
    massiveRun(CommonPool) {
        counter.send(IncCounter)
    }
    counter.close()
    println("View state: $viewState")
}
// Everything below is mock code that supports the example
// code above:
val UI = newSingleThreadContext("UI")
fun updateView(newVal: Int) {
    viewState = newVal
}
var viewState = 0
suspend fun massiveRun(context: CoroutineContext, action: suspend () -> Unit) {
    val numCoroutines = 1000
    val repeatActionCount = 1000
    val time = measureTimeMillis {
        val jobs = List(numCoroutines) {
            launch(context) {
                repeat(repeatActionCount) { action() }
            }
        }
        jobs.forEach { it.join() }
    }
    println("Completed ${numCoroutines * repeatActionCount} actions in $time ms")
}
Running it prints
Completed 1000000 actions in 2189 ms
View state: 1000000
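The sample above uses the old kotlinx.coroutines.experimental package. As a rough sketch of the same single-consumer idea with the current package, a plain Channel can stand in for the actor. The function name countWithChannel and the use of Unit as the message type are assumptions made for illustration.

```kotlin
import kotlinx.coroutines.*
import kotlinx.coroutines.channels.Channel

// Hypothetical helper: `workers` coroutines each report `perWorker` completed
// actions over a channel; a single consumer owns the counter.
fun countWithChannel(workers: Int, perWorker: Int): Int = runBlocking {
    val ticks = Channel<Unit>(Channel.BUFFERED)
    // Only this coroutine touches `count`, so no synchronization is needed.
    val counter = async {
        var count = 0
        for (tick in ticks) count++ // loop ends once the channel is closed and drained
        count
    }
    val jobs = List(workers) {
        launch(Dispatchers.Default) {
            repeat(perWorker) { ticks.send(Unit) }
        }
    }
    jobs.joinAll()  // every worker has sent all of its ticks
    ticks.close()   // lets the consumer loop finish
    counter.await()
}

fun main() {
    println("Counted: ${countWithChannel(100, 1000)}")
}
```

In a real progress indicator the consumer would update the UI inside the loop instead of just incrementing a local variable.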
You're losing writes because ++i is not an atomic operation - the value has to be read, incremented, and then written back - and you have multiple threads reading and writing i at the same time. (If you don't provide launch with a context, it uses a thread pool by default.)
You lose 1 from your count every time two threads read the same value, as they will then both write that value plus one.
Synchronizing in some way, for example by using an AtomicInteger, solves this:
fun main(args: Array<String>) {
    val i = AtomicInteger(0)
    val range = (1..100000)
    range.map {
        launch {
            i.incrementAndGet()
        }
    }
    println("$i ${range.count()}") // 100000 100000
}
There's also no guarantee that these background threads will be done with their work by the time you print the result and your program ends - you can test it easily by adding just a very small delay inside launch, a couple milliseconds. With that, it's a good idea to wrap this all in a runBlocking call which will keep the main thread alive and then wait for the coroutines to all finish:
import kotlinx.coroutines.*
import java.util.concurrent.atomic.AtomicInteger
fun main(args: Array<String>) = runBlocking {
    val i = AtomicInteger(0)
    val range = (1..100000)
    val jobs: List<Job> = range.map {
        launch {
            i.incrementAndGet()
        }
    }
    jobs.forEach { it.join() }
    println("$i ${range.count()}") // 100000 100000
}
Have you read Coroutines basics? It covers exactly the same problem as yours:
val c = AtomicInteger()
for (i in 1..1_000_000)
    launch {
        c.addAndGet(i)
    }
println(c.get())
This example completes in less than a second for me, but it prints some arbitrary number, because some coroutines don't finish before main() prints the result.
Because launch is not blocking, there's no guarantee that all of the coroutines will finish before println. You need to keep the Job objects returned by launch and join them (or use async, keep the Deferred objects and await them) before printing.