Below is a piece of code I was trying out. There are two blocks: the first creates a million coroutines asynchronously in map and then sums up their results; the second creates a single coroutine that adds up the million numbers in a for loop and then prints the result.
To my mind, the first one should be faster because it creates the coroutines in parallel, while the second should be slower because it does all the work in one coroutine.
If my question makes sense at all, I would like to get some understanding of this.
import kotlinx.coroutines.*
import kotlin.system.measureTimeMillis

fun main() {
    // first block
    val elapsed1 = measureTimeMillis {
        val deferred = (1..1000000).map { n ->
            GlobalScope.async {
                n
            }
        }
        runBlocking {
            val sum = deferred.sumByDouble { it.await().toDouble() }
            println(sum)
        }
    }
    // second block
    val elapsed2 = measureTimeMillis {
        val def2 = GlobalScope.async {
            var j = 0.0
            for (i in 1..1000000) {
                j += i
            }
            j
        }
        runBlocking {
            val s = def2.await()
            println(s)
        }
    }
    println("elapsed1 $elapsed1")
    println("elapsed2 $elapsed2")
}
Please note that, for a small block of code like the one above, it is very difficult to determine which version is faster. The result of such an execution may also vary from system to system, so we cannot conclude which block is faster just by running it like this.
I would also suggest the following article for more clarity in this area:
What is microbenchmarking?
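To reduce that variability somewhat without a full benchmarking harness, one common option is to repeat each measurement and discard the warm-up runs so JIT compilation doesn't skew the result. A minimal sketch (the helper name and iteration counts are arbitrary choices, not from the original post):

```kotlin
import kotlin.system.measureTimeMillis

// Hypothetical helper: run the block a few times untimed so the JIT can
// warm up, then report the average of the timed runs only.
fun benchmark(warmup: Int = 5, runs: Int = 10, block: () -> Unit): Double {
    repeat(warmup) { block() }                                    // warm-up, not timed
    return (1..runs).map { measureTimeMillis { block() } }.average()
}

fun main() {
    val avgMs = benchmark { (1..1_000_000).sumOf { it.toLong() } }
    println("average: $avgMs ms")
}
```

Even this only reduces noise from JIT warm-up; GC pauses and OS scheduling still make single-machine timings approximate.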
I have written a function that scans files (pictures) from two lists and checks whether a file is in both lists.
The code below works as expected, but for large sets it takes some time, so I tried to do this in parallel with coroutines. However, with sets of 100 sample files the program was always slower than without coroutines.
The code:
private var i = 0 // total duplicates found, incremented below

private fun doJob() {
    val filesToCompare = File("C:\\Users\\Tobias\\Desktop\\Test").walk().filter { it.isFile }.toList()
    val allFiles = File("\\\\myserver\\Photos\\photo").walk().filter { it.isFile }.toList()
    println("Files to scan: ${filesToCompare.size}")
    filesToCompare.forEach { file ->
        var multipleDuplicate = 0
        var s = "This file is a duplicate"
        s += "\n${file.absolutePath}"
        allFiles.forEach { possibleDuplicate ->
            if (file != possibleDuplicate) { // only needed when both lists are the same
                // byte comparison only for files whose name contains this file's name
                if (possibleDuplicate.nameWithoutExtension.contains(file.nameWithoutExtension)) {
                    try {
                        if (Files.mismatch(file.toPath(), possibleDuplicate.toPath()) == -1L) {
                            s += "\n${possibleDuplicate.absolutePath}"
                            i++
                            multipleDuplicate++
                            println(s)
                        }
                    } catch (e: Exception) {
                        println(e.message)
                    }
                }
            }
        }
        if (multipleDuplicate > 1) {
            println("This file has $multipleDuplicate duplicate(s)")
        }
    }
    println("Files scanned: ${filesToCompare.size}")
    println("Total number of duplicates found: $i")
}
How have I tried to add the coroutines?
I wrapped the body of the first forEach in launch { ... }; the idea was that a coroutine starts for each file, so the inner loop runs concurrently. I expected the program to run faster, but in fact it took about the same time or longer.
How can I get this code to run faster in parallel?
Running each inner loop in a coroutine seems to be a decent approach. The problem might lie in the dispatcher you were using. If you used runBlocking and launch without context argument, you were using a single thread to run all your coroutines.
Since there is mostly blocking IO here, you could instead use Dispatchers.IO to launch your coroutines, so your coroutines are dispatched on multiple threads. The parallelism should be automatically limited to 64, but if your memory can't handle that, you can also use Dispatchers.IO.limitedParallelism(n) to reduce the number of threads.
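A minimal sketch of that structure, assuming kotlinx-coroutines is on the classpath; scanFile here is a hypothetical stand-in for your existing inner loop over allFiles, and the duplicate counter becomes an AtomicInteger because a plain i++ is no longer safe across threads:

```kotlin
import kotlinx.coroutines.*
import java.io.File
import java.util.concurrent.atomic.AtomicInteger

val duplicateCount = AtomicInteger(0) // thread-safe replacement for the plain i++

fun scanFile(file: File, allFiles: List<File>) {
    // stand-in for your existing byte-comparison loop over allFiles,
    // incrementing duplicateCount when a duplicate is found
}

fun doJobParallel(filesToCompare: List<File>, allFiles: List<File>) = runBlocking {
    filesToCompare.forEach { file ->
        // one coroutine per file; Dispatchers.IO backs them with a thread
        // pool, so the blocking Files.mismatch calls can overlap
        launch(Dispatchers.IO) { scanFile(file, allFiles) }
    }
    // runBlocking waits for all launched children before returning
}
```

If memory pressure is an issue, `Dispatchers.IO.limitedParallelism(n)` can be passed to launch instead, as described above.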
It seems that I can use Code A to run a task periodically.
soundDb() can be fired every 100 ms, so it effectively runs periodically.
Is this a good way to use Kotlin coroutines to run a task periodically?
Code A
fun calCurrentAsyn() {
    viewModelScope.launch {
        var b = 0.0
        for (i in 1..5) {
            b = b + soundDb()
            delay(100)
        }
        b = b / 5.0
        myInfo.value = b.toString() + " OK Asyn " + a++.toString()
    }
}

suspend fun soundDb(): Double {
    var k = 0.0
    for (i in 1..500000000) {
        k = k + i
    }
    return k
}
Added content:
To Joffrey: thanks!
1: I know that Code B is the better style, but will Code A and Code B behave the same when executed?
Code B
viewModelScope.launch {
    val b = computeCurrent()
    myInfo.value = "$b OK Asyn ${a++}"
}

suspend fun computeCurrent(): Double {
    var b = 0.0
    repeat(5) {
        b += soundDb()
        delay(100)
    }
    return b / 5.0
}

suspend fun soundDb(): Double {
    var k = 0.0
    for (i in 1..500000000) {
        k = k + i
    }
    return k
}
2: I hope to get information regularly from a long-running coroutine with a periodic task. How can I cancel the flow soundDbFlow().runningAverage()?
Code C
viewModelScope.launch {
    soundDbFlow().runningAverage().collect {
        println("Average = $it") // do something with it
    }
}
3: I can use Timer().scheduleAtFixedRate to get information regularly on a background thread, just like in Code D. Which is better between Timer().scheduleAtFixedRate and Flow?
Code D
private fun startTimer() {
    timer = Timer()
    timer.scheduleAtFixedRate(timerTask {
        recordingTime = recordingTime + 1
        val temp = fromCountToTimeByInterval(recordingTime, 10)
        _timeElapse.postValue(temp)
    }, 0, 10)
}

private fun stopTimer() {
    timer.cancel()
    recordingTime = 0
    _timeElapse.postValue("00:00.00")
}
The approach (launch + loop) to repeat a task is not bad in itself, but the question is rather about how you want this coroutine to affect the rest of the application.
It's hard to tell whether this is an example for the sake of the question, or your actual code. If it is your actual code, your use case is not a classic "periodic task run":
it has a fixed number of iterations
it only has side effects at the end of the execution
This is an indication that it may make more sense to write this code as a suspend function instead:
suspend fun computeCurrent(): Double {
    var b = 0.0
    repeat(5) {
        b += soundDb()
        delay(100)
    }
    return b / 5.0
}
And then use it like this, to make it clearer where the results are used:
viewModelScope.launch {
    val b = computeCurrent()
    myInfo.value = "$b OK Asyn ${a++}"
}
Maybe you won't actually need to launch that coroutine in an async way (it probably depends on how you make other similar calls).
If you needed to get information regularly (not just at the end) from a long running coroutine with a periodic task, you might want to consider building a Flow instead and collecting it to apply the side-effects:
import kotlinx.coroutines.delay
import kotlinx.coroutines.flow.*
import kotlin.time.Duration
import kotlin.time.Duration.Companion.milliseconds

fun soundDbFlow(period: Duration = 100.milliseconds) = flow {
    while (true) {
        emit(soundDb())
        delay(period)
    }
}
fun Flow<Double>.runningAverage(): Flow<Double> = flow {
    var valuesCount = 0
    var sum = 0.0
    collect { value ->
        sum += value
        valuesCount++
        emit(sum / valuesCount)
    }
}
And then the usage could be something like:
viewModelScope.launch {
    soundDbFlow().take(5).runningAverage().collect {
        println("Average = $it") // do something with it
    }
}
About the amended question:
I know that Code B is the better style, will the effect of execution be same between Code A and Code B ?
Code A and Code B behave the same. My point was indeed half about the style, because making it a simple suspend function makes it clear (to you and readers) that it only returns a single value. This seems to be a mistake that you make also in your newly added soundDb() function and I'm not sure it's clear to you that loops are not streams, and that you're only returning one value from those functions (not updating anything several times).
The other half of my point was that, since it's only a single value that you updated, it may not even need to be run in a long-running coroutine. You might integrate it with other pieces of suspending code where needed.
how can I cancel the flow soundDbFlow().runningAverage() ?
The flow is automatically canceled if the collecting coroutine is cancelled (either via the job you launched or by cancelling the whole viewModelScope - which happens automatically when the component is not needed). The flow is also cancelled if you use terminal operators that end the collection early, such as first(), takeWhile(), take(n).collect { .. }, etc.
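As a small self-contained illustration of cancelling via the launched job (using a simplified stand-in flow, not your exact soundDbFlow):

```kotlin
import kotlinx.coroutines.*
import kotlinx.coroutines.flow.*

// Simplified stand-in for soundDbFlow(): emits a fake reading every 100 ms.
fun readings() = flow {
    var x = 0.0
    while (true) {
        emit(x++)
        delay(100)
    }
}

fun main() = runBlocking {
    // Keep the Job returned by launch; cancelling it cancels collect,
    // which in turn cancels the infinite upstream flow.
    val job = launch {
        readings().collect { println("reading: $it") }
    }
    delay(350)
    job.cancel() // stops both the collection and the flow's loop
}
```

In a ViewModel you would keep the Job in a property, or simply rely on viewModelScope being cancelled when the component goes away.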
I can use Timer().scheduleAtFixedRate to get information regularly on a background thread, just like in Code D. Which is better between Timer().scheduleAtFixedRate and Flow?
It's up to you honestly. If you're using coroutines already, I'd personally favor the flow approach. scheduleAtFixedRate will not be integrated with structured concurrency and will require manual management of the cancellation.
I'm trying to write code that will let me test situations in which Java's @Synchronized is not enough to synchronize Kotlin coroutines. From my understanding, the code below:
import kotlinx.coroutines.*

var sharedCounter: Long = 0

@Synchronized
suspend fun updateCounter() {
    delay(2)
    sharedCounter++
    delay(2)
    yield()
}

fun main() = runBlocking {
    var regularCounter: Long = 0
    val scope = CoroutineScope(Dispatchers.IO + Job())
    val jobs = mutableListOf<Job>()
    repeat(1000) {
        val job = scope.launch {
            for (i in 1..1_000) {
                regularCounter++
                updateCounter()
            }
        }
        jobs.add(job)
    }
    jobs.forEach { it.join() }
    println("The number of shared counter is $sharedCounter")
    println("The number of regular counter is $regularCounter")
}
should result in both sharedCounter and regularCounter NOT being equal to 1000000.
This code was based on this article and this one.
For some reason, sharedCounter always equals 1000000 and I'm not sure why.
I've tried testing larger for loops, but it did not "break" the synchronization either.
The synchronization lock is released at each suspend function call, but re-acquired before continuing. In your code, delay() and yield() are the suspension points. But sharedCounter++ is all local code, so the lock is held while it performs its three steps of getting the value, incrementing it, and setting it. So, the first delay() releases the lock, the continuation resumes so it re-locks and performs sharedCounter++, and then the lock is released again at the second delay() call.
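To see this concretely, consider a hypothetical variant (not from the question) where the read and the write of sharedCounter straddle a suspension point. Since the lock is released at delay(), two coroutines can read the same value and one increment gets lost:

```kotlin
import kotlinx.coroutines.delay

var sharedCounter: Long = 0

// Hypothetical variant that CAN lose updates: the read and the write now
// straddle a suspension point, so the @Synchronized lock does not cover
// the full read-modify-write cycle.
@Synchronized
suspend fun updateCounterBroken() {
    val current = sharedCounter  // read while the lock is held...
    delay(2)                     // ...lock released here; another coroutine may read the same value
    sharedCounter = current + 1  // ...lock re-acquired; may overwrite a concurrent increment
}
```

In the original code, by contrast, sharedCounter++ sits entirely between two suspension points, so the lock covers all three of its steps.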
I'm asking this question solely to better understand how Kotlin sequences work. I thought I had a solid grasp, but I cannot square what I observed in a short test with what I know, so obviously I have a misconception somewhere.
My goal was to do a quick benchmark to compare the performance of lists vs. sequences when filtering for a criteria and then taking the maximum value of the result. This is an operation that occurs fairly often in some code I have, and I'm trying to decide whether or not it's worth rewriting it to use sequences instead of lists. It seems it would be, as sequence is consistently faster, but that is not the question here.
Rather, I would ask you to explain to me how the below described "artifact" can come about.
First of all, here's the complete test I ran:
fun `just checking the performance of sequences`() {
    val log = logger()
    var totaldif = 0L
    var seqFasterThanList = 0
    (1..1000).forEach {
        val testList = (1..6000000).toList().shuffled()
        val testSequence = testList.asSequence()
        // filter and find max
        val listDuration = measureTimeMillis {
            testList.filter { it % 2 == 0 }.max()
        }
        val sequenceDuration = measureTimeMillis {
            testSequence.filter { it % 2 == 0 }.max()
        }
        log.info("List: {} ms; Sequence: {} ms;", listDuration, sequenceDuration)
        if (sequenceDuration < listDuration) {
            seqFasterThanList++
            totaldif += (listDuration - sequenceDuration)
        }
    }
    log.info("sequence was faster {} out of 1000 runs. average difference: {} ms",
        seqFasterThanList, totaldif / seqFasterThanList)
}
The results mostly looked like this:
List: 299 ms; Sequence: 217 ms;
List: 289 ms; Sequence: 211 ms;
List: 288 ms; Sequence: 220 ms;
Except, every once in a while, about 1 in 20, the result looked more like this:
List: 380 ms; Sequence: 63 ms
As you can see, in these cases the operation was vastly faster. This is the kind of behaviour I would expect on operations like find or first, which can terminate early once they find a match. But by its very nature, max has to traverse the entire sequence to guarantee the result. So how is it possible that some times it can find a result more than 3 times as fast as it usually requires, with the same number of elements to traverse?
Further down is my original answer which, as @Slaw pointed out, wasn't actually answering what you asked (it explains why Sequence.filter is faster than Iterable.filter, not why Sequence.filter seems to be intermittently faster than it normally is). However, I'm leaving it below as it's related to what I think might be the answer to your actual question.
My guess is this might be related to garbage collection. As you can see from my original answer, when you call Iterable.filter you are causing lots of arrays to be copied, i.e. you're putting lots of stuff in memory, which has to be cleaned up at certain points. I wonder if it's this cleanup of stuff in memory created by the List tests, which is actually causing the anomalies. I think what might be happening is that every so often the garbage collector kicks in and does a full collection: this is causing the List test to slow down to slower than normal. And after this runs the memory is all cleaned up, which might be why the Sequence test is faster that time.
And the reason I suspect it's related to garbage collection is because I replicated your anomalies, then made one change: instead of calling testList.filter I call testList.filterTo, passing in an ArrayList of the same size as the list. That means that no array copying has to happen, and also the creation of the ArrayList is now outside of the timing:
val arrayList = ArrayList<Int>(testList.size)
val listDuration = measureTimeMillis {
    testList.filterTo(arrayList) { it % 2 == 0 }.max()
}
As soon as I did that, the anomalies disappeared. Maybe you can check on your system and see if this makes the anomalies disappear there too. It's intermittent so a bit difficult to know for sure.
This doesn't prove that it's garbage collection, of course, but I think it makes it a possible culprit. You could turn on GC logging to see if you wanted to know for sure. If you do, let us know what you find: it would be interesting to hear your results.
Original answer below (explaining why Iterable.filter is slower than Sequence.filter)
If you look at the source code for Iterable<T>.filter you'll see it does this:
public inline fun <T> Iterable<T>.filter(predicate: (T) -> Boolean): List<T> {
    return filterTo(ArrayList<T>(), predicate)
}
It creates a new ArrayList then loops round the items, checking the predicate against each one, and adding them to that array list if they match the predicate. This means that every X items (whatever the array list's default size is), the array list has to resize itself to allow more items in (i.e. create a new copy of the underlying array in which it's storing all its data).
In a sequence, however, the code is different:
public fun <T> Sequence<T>.filter(predicate: (T) -> Boolean): Sequence<T> {
    return FilteringSequence(this, true, predicate)
}
Here there isn't some underlying array storing all the items, so no copying of arrays has to take place. Instead, there's just an Iterator which will return the next item which matches the predicate whenever next is called.
You can see the details of how this is implemented in the FilteringSequence class:
internal class FilteringSequence<T>(
    private val sequence: Sequence<T>,
    private val sendWhen: Boolean = true,
    private val predicate: (T) -> Boolean
) : Sequence<T> {
    override fun iterator(): Iterator<T> = object : Iterator<T> {
        val iterator = sequence.iterator()
        var nextState: Int = -1 // -1 for unknown, 0 for done, 1 for continue
        var nextItem: T? = null

        private fun calcNext() {
            while (iterator.hasNext()) {
                val item = iterator.next()
                if (predicate(item) == sendWhen) {
                    nextItem = item
                    nextState = 1
                    return
                }
            }
            nextState = 0
        }

        override fun next(): T {
            if (nextState == -1)
                calcNext()
            if (nextState == 0)
                throw NoSuchElementException()
            val result = nextItem
            nextItem = null
            nextState = -1
            @Suppress("UNCHECKED_CAST")
            return result as T
        }

        override fun hasNext(): Boolean {
            if (nextState == -1)
                calcNext()
            return nextState == 1
        }
    }
}
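The practical consequence of this structure is that the predicate only runs when elements are actually pulled through the iterator. A quick self-contained check:

```kotlin
fun main() {
    var calls = 0
    // Building the filtered sequence does no work yet:
    val filtered = (1..100).asSequence().filter { calls++; it % 2 == 0 }
    println(calls)            // 0 - nothing evaluated so far
    println(filtered.first()) // 2 - pulls elements until one matches
    println(calls)            // 2 - only elements 1 and 2 were tested
}
```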
I have been reading the Kotlin docs, and if I understood correctly the two Kotlin functions work as follows:
withContext(context): switches the context of the current coroutine; when the given block completes, the coroutine switches back to the previous context.
async(context): starts a new coroutine in the given context, and if we call .await() on the returned Deferred task, it will suspend the calling coroutine and resume when the block executing inside the spawned coroutine returns.
Now for the following two versions of code:
Version 1:
launch {
    block1()
    val returned = async(context) {
        block2()
    }.await()
    block3()
}
Version 2:
launch {
    block1()
    val returned = withContext(context) {
        block2()
    }
    block3()
}
In both versions block1() and block3() execute in the default context (CommonPool?), whereas block2() executes in the given context.
The overall execution is synchronous, in block1() -> block2() -> block3() order.
The only difference I see is that Version 1 creates another coroutine, whereas Version 2 executes only one coroutine while switching context.
My questions are:
Isn't it always better to use withContext rather than async-await, as it is functionally similar but doesn't create another coroutine? Large numbers of coroutines, although lightweight, could still be a problem in demanding applications.
Is there a case where async-await is preferable to withContext?
Update:
Kotlin 1.2.50 now has a code inspection that can convert async(ctx) { }.await() to withContext(ctx) { }.
Large number of coroutines, though lightweight, could still be a problem in demanding applications
I'd like to dispel this myth of "too many coroutines" being a problem by quantifying their actual cost.
First, we should disentangle the coroutine itself from the coroutine context to which it is attached. This is how you create just a coroutine with minimum overhead:
GlobalScope.launch(Dispatchers.Unconfined) {
    suspendCoroutine<Unit> {
        continuations.add(it)
    }
}
The value of this expression is a Job holding a suspended coroutine. To retain the continuation, we added it to a list in the wider scope.
I benchmarked this code and concluded that it allocates 140 bytes and takes 100 nanoseconds to complete. So that's how lightweight a coroutine is.
For reproducibility, this is the code I used:
fun measureMemoryOfLaunch() {
    val continuations = ContinuationList()
    val jobs = (1..10_000).mapTo(JobList()) {
        GlobalScope.launch(Dispatchers.Unconfined) {
            suspendCoroutine<Unit> {
                continuations.add(it)
            }
        }
    }
    (1..500).forEach {
        Thread.sleep(1000)
        println(it)
    }
    println(jobs.onEach { it.cancel() }.filter { it.isActive })
}

class JobList : ArrayList<Job>()
class ContinuationList : ArrayList<Continuation<Unit>>()
This code starts a bunch of coroutines and then sleeps so you have time to analyze the heap with a monitoring tool like VisualVM. I created the specialized classes JobList and ContinuationList because this makes it easier to analyze the heap dump.
To get a more complete story, I used the code below to also measure the cost of withContext() and async-await:
import kotlinx.coroutines.*
import java.util.concurrent.Executors
import kotlin.coroutines.suspendCoroutine
import kotlin.system.measureTimeMillis

const val JOBS_PER_BATCH = 100_000

var blackHoleCount = 0
val threadPool = Executors.newSingleThreadExecutor()!!
val ThreadPool = threadPool.asCoroutineDispatcher()

fun main(args: Array<String>) {
    try {
        measure("just launch", justLaunch)
        measure("launch and withContext", launchAndWithContext)
        measure("launch and async", launchAndAsync)
        println("Black hole value: $blackHoleCount")
    } finally {
        threadPool.shutdown()
    }
}

fun measure(name: String, block: (Int) -> Job) {
    print("Measuring $name, warmup ")
    (1..1_000_000).forEach { block(it).cancel() }
    println("done.")
    System.gc()
    System.gc()
    val tookOnAverage = (1..20).map { _ ->
        System.gc()
        System.gc()
        var jobs: List<Job> = emptyList()
        measureTimeMillis {
            jobs = (1..JOBS_PER_BATCH).map(block)
        }.also { _ ->
            blackHoleCount += jobs.onEach { it.cancel() }.count()
        }
    }.average()
    println("$name took ${tookOnAverage * 1_000_000 / JOBS_PER_BATCH} nanoseconds")
}

fun measureMemory(name: String, block: (Int) -> Job) {
    println(name)
    val jobs = (1..JOBS_PER_BATCH).map(block)
    (1..500).forEach {
        Thread.sleep(1000)
        println(it)
    }
    println(jobs.onEach { it.cancel() }.filter { it.isActive })
}

val justLaunch: (i: Int) -> Job = {
    GlobalScope.launch(Dispatchers.Unconfined) {
        suspendCoroutine<Unit> {}
    }
}

val launchAndWithContext: (i: Int) -> Job = {
    GlobalScope.launch(Dispatchers.Unconfined) {
        withContext(ThreadPool) {
            suspendCoroutine<Unit> {}
        }
    }
}

val launchAndAsync: (i: Int) -> Job = {
    GlobalScope.launch(Dispatchers.Unconfined) {
        async(ThreadPool) {
            suspendCoroutine<Unit> {}
        }.await()
    }
}
This is the typical output I get from the above code:
Just launch: 140 nanoseconds
launch and withContext : 520 nanoseconds
launch and async-await: 1100 nanoseconds
Yes, async-await takes about twice as long as withContext, but it's still just a microsecond. You'd have to launch them in a tight loop, doing almost nothing besides, for that to become "a problem" in your app.
Using measureMemory() I found the following memory cost per call:
Just launch: 88 bytes
withContext(): 512 bytes
async-await: 652 bytes
The cost of async-await is exactly 140 bytes higher than withContext, the number we got as the memory weight of one coroutine. This is just a fraction of the complete cost of setting up the CommonPool context.
If performance/memory impact was the only criterion to decide between withContext and async-await, the conclusion would have to be that there's no relevant difference between them in 99% of real use cases.
The real reason to prefer withContext() is that it is a simpler and more direct API, especially in terms of exception handling:
An exception that isn't handled within async { ... } causes its parent job to get cancelled. This happens regardless of how you handle exceptions from the matching await(). If you haven't prepared a coroutineScope for it, it may bring down your entire application.
An exception not handled within withContext { ... } simply gets thrown by the withContext call, you handle it just like any other.
withContext also happens to be optimized, leveraging the fact that you're suspending the parent coroutine and awaiting on the child, but that's just an added bonus.
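For illustration, here is a minimal sketch of that difference on the withContext side: the failure is just an exception thrown at the call site, so an ordinary try/catch works (the function and its error are hypothetical):

```kotlin
import kotlinx.coroutines.*

// A failure inside withContext is rethrown to the caller, so it can be
// handled like any other exception, with no effect on the parent job.
suspend fun safeCall(): String = try {
    withContext(Dispatchers.Default) { error("boom") }
} catch (e: IllegalStateException) {
    "recovered: ${e.message}"
}

fun main() = runBlocking {
    println(safeCall()) // prints "recovered: boom"
}
```

With async { error("boom") }.await() in the same position, the failure would also cancel the surrounding scope unless it were isolated in a coroutineScope.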
async-await should be reserved for those cases where you actually want concurrency, so that you launch several coroutines in the background and only then await on them. In short:
async-await-async-await — don't do that, use withContext-withContext
async-async-await-await — that's the way to use it.
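A minimal sketch of the async-async-await-await shape, with hypothetical task bodies: both coroutines are started before either is awaited, so they overlap and the total time is about one second rather than two.

```kotlin
import kotlinx.coroutines.*

// Start both tasks first, then await: this is where async earns its keep.
suspend fun loadBoth(): Pair<String, String> = coroutineScope {
    val a = async { delay(1000); "first" }  // starts immediately
    val b = async { delay(1000); "second" } // runs concurrently with a
    a.await() to b.await()                  // suspend until both finish
}

fun main() = runBlocking {
    println(loadBoth()) // (first, second), after roughly one second
}
```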
Isn't it always better to use withContext rather than async-await, as it is functionally similar but doesn't create another coroutine? Large numbers of coroutines, though lightweight, could still be a problem in demanding applications.
Is there a case where async-await is preferable to withContext?
You should use async/await when you want to execute multiple tasks concurrently, for example:
runBlocking {
    val deferredResults = arrayListOf<Deferred<String>>()
    deferredResults += async {
        delay(1000) // 1 second
        "1"
    }
    deferredResults += async {
        delay(1000)
        "2"
    }
    deferredResults += async {
        delay(1000)
        "3"
    }
    // wait for all results (at this point the tasks are running)
    val results = deferredResults.map { it.await() }
    // or: val results = deferredResults.awaitAll()
    println(results)
}
If you don't need to run multiple tasks concurrently you can use withContext.
When in doubt, remember this like a rule of thumb:
If multiple tasks have to happen in parallel and the final result depends on completion of all of them, then use async.
For returning the result of a single task, use withContext.
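And a matching sketch of the single-task case with withContext (the function and its body are hypothetical): the work just moves to another dispatcher and its result comes straight back, with no Deferred involved.

```kotlin
import kotlinx.coroutines.*

// A single sequential task: withContext shifts the work to the IO
// dispatcher and returns its result directly.
suspend fun readConfig(): String = withContext(Dispatchers.IO) {
    // stand-in for blocking IO such as reading a file
    "config-contents"
}

fun main() = runBlocking {
    println(readConfig())
}
```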