Properly cancelling kotlin coroutine job - kotlin

I'm scratching my head around properly cancelling coroutine job.
Test case is simple I have a class with two methods:
class CancellationTest {
private var job: Job? = null
private var scope = MainScope()
fun run() {
job?.cancel()
job = scope.launch { doWork() }
}
fun doWork() {
// gets data from some source and send it to BE
}
}
Method doWork has an api call that is suspending and respects cancellation.
In the above example after counting objects that were successfully sent to backend I can see many duplicates, meaning that cancel did not really cancel previous invocation.
However if I use snippet found on the internet
internal class WorkingCancellation<T> {
private val activeTask = AtomicReference<Deferred<T>?>(null)
suspend fun cancelPreviousThenRun(block: suspend () -> T): T {
activeTask.get()?.cancelAndJoin()
return coroutineScope {
val newTask = async(start = CoroutineStart.LAZY) {
block()
}
newTask.invokeOnCompletion {
activeTask.compareAndSet(newTask, null)
}
val result: T
while (true) {
if (!activeTask.compareAndSet(null, newTask)) {
activeTask.get()?.cancelAndJoin()
yield()
} else {
result = newTask.await()
break
}
}
result
}
}
}
It works properly, objects are not duplicated and sent properly to BE.
One last thing is that I'm calling run method in a for loop - but anyways I'm not quire sure I understand why job?.cancel does not do its job properly and WorkingCancellation is actually working

Short answer: cancellation only works out-of-the box if you call suspending library functions. Non-suspending code needs manual checks to make it cancellable.
Cancellation in Kotlin coroutines is cooperative, and requires the job being cancelled to check for cancellation and terminate whatever work it's doing. If the job doesn't check for cancellation, it can quite happily carry on running forever and never find out it has been cancelled.
Coroutines automatically check for cancellation when you call built-in suspending functions. If you look at the docs for commonly-used suspending functions like await() and yield(), you'll see that they always say "This suspending function is cancellable".
Your doWork isn't a suspend function, so it can't call any other suspending functions and consequently will never hit one of those automatic checks for cancellation. If you do want to cancel it, you will need to have it periodically check whether the job is still active, or change its implementation to use suspending functions. You can manually check for cancellation by calling ensureActive on the Job.

In addition to Sam's answer, consider this example that mocks a continuous transaction, lets say location updates to a server.
var pingInterval = System.currentTimeMillis()
job = launch {
while (true) {
if (System.currentTimeMillis() > pingInterval) {
Log.e("LocationJob", "Executing location updates... ")
pingInterval += 1000L
}
}
}
Continuously it will "ping" the server with location udpates, or like any other common use-cases, say this will continuously fetch something from it.
Then I have a function here that's being called by a button that cancels this job operation.
fun cancel() {
job.cancel()
Log.e("LocationJob", "Location updates done.")
}
When this function is called, the job is cancelled, however the operation keeps on going because nothing ensures the coroutine scope to stop working, all actions above will print
Ping server my location...
Ping server my location...
Ping server my location...
Ping server my location...
Location updates done.
Ping server my location...
Ping server my location...
Now if we insert ensureActive() inside the infinite loop
while (true) {
ensureActive()
if (System.currentTimeMillis() > pingInterval) {
Log.e("LocationJob", "Ping server my location... ")
pingInterval += 1000L
}
}
Cancelling the job will guarantee that the operation will stop. I tested using delay though and it guaranteed total cancellation when the job it is being called in is cancelled. Emplacing ensureActive(), and cancelling after 2 seconds, prints
Ping server my location...
Ping server my location...
Location updates done.

Related

How to cancel kotlin coroutine with potentially "un-cancellable" method call inside it?

I have this piece of code:
// this method is used to evaluate the input string, and it returns evaluation result in string format
fun process(input: String): String {
val timeoutMillis = 5000L
val page = browser.newPage()
try {
val result = runBlocking {
withTimeout(timeoutMillis) {
val result = page.evaluate(input).toString()
return#withTimeout result
}
}
return result
} catch (playwrightException: PlaywrightException) {
return "Could not parse template! '${playwrightException.localizedMessage}'"
} catch (timeoutException: TimeoutCancellationException) {
return "Could not parse template! (timeout)"
} finally {
page.close()
}
}
It should throw exception after 5 seconds if the method is taking too long to execute (example: input potentially contains infinite loop) but it doesent (becomes deadlock I assume) coz coroutines should be cooperative. But the method I am calling is from another library and I have no control over its computation (for sticking yield() or smth like it).
So the question is: is it even possible to timeout such coroutine? if yes, then how?
Should I use java thread insted and just kill it after some time?
But the method I am calling is from another library and I have no control over its computation (for sticking yield() or smth like it).
If that is the case, I see mainly 2 situations here:
the library is aware that this is a long-running operation and supports thread interrupts to cancel it. This is the case for Thread.sleep and some I/O operations.
the library function really does block the calling thread for the whole time of the operation, and wasn't designed to handle thread interrupts
Situation 1: the library function is interruptible
If you are lucky enough to be in situation 1, then simply wrap the library's call into a runInterruptible block, and the coroutines library will translate cancellation into thread interruptions:
fun main() {
runBlocking {
val elapsed = measureTimeMillis {
withTimeoutOrNull(100.milliseconds) {
runInterruptible {
interruptibleBlockingCall()
}
}
}
println("Done in ${elapsed}ms")
}
}
private fun interruptibleBlockingCall() {
Thread.sleep(3000)
}
Situation 2: the library function is NOT interruptible
In the more likely situation 2, you're kind of out of luck.
Should I use java thread insted and just kill it after some time?
There is no such thing as "killing a thread" in Java. See Why is Thread.stop deprecated?, or How do you kill a Thread in Java?.
In short, in that case you do not have a choice but to block some thread.
I do not know a solution to this problem that doesn't leak resources. Using an ExecutorService would not help if the task doesn't support thread interrupts - the threads will not die even with shutdownNow() (which uses interrupts).
Of course, the blocked thread doesn't have to be your thread. You can technically launch a separate coroutine on another thread (using another dispatcher if yours is single-threaded), to wrap the libary function call, and then join() the job inside a withTimeout to avoid waiting for it forever. That is however probably bad, because you're basically deferring the problem to whichever scope you use to launch the uncancellable task (this is actually why we can't use a simple withContext here).
If you use GlobalScope or another long-running scope, you effectively leak the hanging coroutine (without knowing for how long).
If you use a more local parent scope, you defer the problem to that scope. This is for instance the case if you use the scope of an enclosing runBlocking (like in your example), which makes this solution pointless:
fun main() {
val elapsed = measureTimeMillis {
doStuff()
}
println("Completely done in ${elapsed}ms")
}
private fun doStuff() {
runBlocking {
val nonCancellableJob = launch(Dispatchers.IO) {
uncancellableBlockingCall()
}
val elapsed = measureTimeMillis {
withTimeoutOrNull(100.milliseconds) {
nonCancellableJob.join()
}
}
println("Done waiting in ${elapsed}ms")
} // /!\ runBlocking will still wait here for the uncancellable child coroutine
}
// Thread.sleep is in fact interruptible but let's assume it's not for the sake of the example
private fun uncancellableBlockingCall() {
Thread.sleep(3000)
}
Outputs something like:
Done waiting in 122ms
Completely done in 3055ms
So the bottom line is either live with this long thing potentially hanging, or ask the developers of that library to handle interruption or make the task cancellable.

launch long-running task then immediately send HTTP response

Using ktor HTTP server, I would like to launch a long-running task and immediately return a message to the calling client. The task is self-sufficient, it's capable of updating its status in a db, and a separate HTTP call returns its status (i.e. for a progress bar).
What I cannot seem to do is just launch the task in the background and respond. All my attempts at responding wait for the long-running task to complete. I have experimented with many configurations of runBlocking and coroutineScope but none are working for me.
// ktor route
get("/launchlongtask") {
val text: String = (myFunction(call.request.queryParameters["loops"]!!.toInt()))
println("myFunction returned")
call.respondText(text)
}
// in reality, this function is complex... the caller (route) is not able to
// determine the response string, it must be done here
suspend fun myFunction(loops : Int) : String {
runBlocking {
launch {
// long-running task, I want to launch it and move on
(1..loops).forEach {
println("this is loop $it")
delay(2000L)
// updates status in db here
}
}
println("returning")
// this string must be calculated in this function (or a sub-function)
return#runBlocking "we just launched $loops loops"
}
return "never get here" // actually we do get here in a coroutineScope
}
output:
returning
this is loop 1
this is loop 2
this is loop 3
this is loop 4
myFunction returned
expected:
returning
myFunction returned
(response sent)
this is loop 1
this is loop 2
this is loop 3
this is loop 4
Just to explain the issue with the code in your question, the problem is using runBlocking. This is meant as the bridge between the synchronous world and the async world of coroutines and
"the name of runBlocking means that the thread that runs it ... gets blocked for the duration of the call, until all the coroutines inside runBlocking { ... } complete their execution."
(from the Coroutine docs).
So in your first example, myFunction won't complete until your coroutine containing loop completes.
The correct approach is what you do in your answer, using CoroutineScope to launch your long-running task. One thing to point out is that you are just passing in a Job() as the CoroutineContext parameter to the CoroutineScope constructor. The CoroutineContext contains multiple things; Job, CoroutineDispatcher, CoroutineExceptionHandler... In this case, because you don't specifiy a CoroutineDispatcher it will use CoroutineDispatcher.Default. This is intended for CPU-intensive tasks and will be limited to "the number of CPU cores (with a minimum of 2)". This may or may not be want you want. An alternative is CoroutineDispatcher.IO - which has a default of 64 threads.
inspired by this answer by Lucas Milotich, I utilized CoroutineScope(Job()) and it seems to work:
suspend fun myFunction(loops : Int) : String {
CoroutineScope(Job()).launch {
// long-running task, I want to launch it and move on
(1..loops).forEach {
println("this is loop $it")
delay(2000L)
// updates status in db here
}
}
println("returning")
return "we just launched $loops loops"
}
not sure if this is resource-efficient, or the preferred way to go, but I don't see a whole lot of other documentation on the topic.

How delay function is working in Kotlin without blocking the current thread?

Past few days I am learning coroutines, most of thee concepts are clear but I don't understand the implementation of the delay function.
How delay function is resuming the coroutine after the delayed time? For a simple program, there is only one main thread, and to resume the coroutine after the delayed time I assume there should be another timer thread that handles all the delayed invocations and invokes them later. Is it true? Can someone explain the implementation detail of the delay function?
TL; DR;
When using runBlocking, delay is internally wrapped and runs on same thread and when using any other dispatcher it suspends and is resumed by resuming the continuation by event-loop thread. Check the long answer below to understand the internals.
Long answer:
#Francesc answer is pointing correctly but is somewhat abstract, and still does not explains how actually delay works internally.
So, as he pointed to the delay function:
public suspend fun delay(timeMillis: Long) {
if (timeMillis <= 0) return // don't delay
return suspendCancellableCoroutine sc# { cont: CancellableContinuation<Unit> ->
cont.context.delay.scheduleResumeAfterDelay(timeMillis, cont)
}
}
What it does is "Obtains the current continuation instance inside suspend functions and suspends the currently running coroutine after running the block inside the lambda"
So this line cont.context.delay.scheduleResumeAfterDelay(timeMillis, cont) is going to be executed and then the current coroutine gets suspended i.e. frees the current thread it was stick on.
cont.context.delay points to
internal val CoroutineContext.delay: Delay get() = get(ContinuationInterceptor) as? Delay ?: DefaultDelay
that says if ContinuationInterceptor is implementation of Delay then return that otherwise use DefaultDelay which is internal actual val DefaultDelay: Delay = DefaultExecutor a DefaultExecutor which is internal actual object DefaultExecutor : EventLoopImplBase(), Runnable {...} an implementation of EventLoop and has a thread of its own to run on.
Note: ContinuationInterceptor is an implementation of Delay when coroutine is in the runBlocking block in order to make sure the delay run on same thread otherwise it is not. Check this snippet to see the results.
Now I couldn't find implemenation of Delay created by runBlocking since internal expect fun createEventLoop(): EventLoop is an expect function which is implemented from outside, not by the source. But the DefaultDelay is implemented as follows
public override fun scheduleResumeAfterDelay(timeMillis: Long, continuation: CancellableContinuation<Unit>) {
val timeNanos = delayToNanos(timeMillis)
if (timeNanos < MAX_DELAY_NS) {
val now = nanoTime()
DelayedResumeTask(now + timeNanos, continuation).also { task ->
continuation.disposeOnCancellation(task)
schedule(now, task)
}
}
}
This is how scheduleResumeAfterDelay is implemented it creates a DelayedResumeTask with the continuation passed by delay, and then calls schedule(now, task) which calls scheduleImpl(now, delayedTask) which finally calls delayedTask.scheduleTask(now, delayedQueue, this) passing the delayedQueue in the object
#Synchronized
fun scheduleTask(now: Long, delayed: DelayedTaskQueue, eventLoop: EventLoopImplBase): Int {
if (_heap === kotlinx.coroutines.DISPOSED_TASK) return SCHEDULE_DISPOSED // don't add -- was already disposed
delayed.addLastIf(this) { firstTask ->
if (eventLoop.isCompleted) return SCHEDULE_COMPLETED // non-local return from scheduleTask
/**
* We are about to add new task and we have to make sure that [DelayedTaskQueue]
* invariant is maintained. The code in this lambda is additionally executed under
* the lock of [DelayedTaskQueue] and working with [DelayedTaskQueue.timeNow] here is thread-safe.
*/
if (firstTask == null) {
/**
* When adding the first delayed task we simply update queue's [DelayedTaskQueue.timeNow] to
* the current now time even if that means "going backwards in time". This makes the structure
* self-correcting in spite of wild jumps in `nanoTime()` measurements once all delayed tasks
* are removed from the delayed queue for execution.
*/
delayed.timeNow = now
} else {
/**
* Carefully update [DelayedTaskQueue.timeNow] so that it does not sweep past first's tasks time
* and only goes forward in time. We cannot let it go backwards in time or invariant can be
* violated for tasks that were already scheduled.
*/
val firstTime = firstTask.nanoTime
// compute min(now, firstTime) using a wrap-safe check
val minTime = if (firstTime - now >= 0) now else firstTime
// update timeNow only when going forward in time
if (minTime - delayed.timeNow > 0) delayed.timeNow = minTime
}
/**
* Here [DelayedTaskQueue.timeNow] was already modified and we have to double-check that newly added
* task does not violate [DelayedTaskQueue] invariant because of that. Note also that this scheduleTask
* function can be called to reschedule from one queue to another and this might be another reason
* where new task's time might now violate invariant.
* We correct invariant violation (if any) by simply changing this task's time to now.
*/
if (nanoTime - delayed.timeNow < 0) nanoTime = delayed.timeNow
true
}
return SCHEDULE_OK
}
It finally sets the task into the DelayedTaskQueue with the current time.
// Inside DefaultExecutor
override fun run() {
ThreadLocalEventLoop.setEventLoop(this)
registerTimeLoopThread()
try {
var shutdownNanos = Long.MAX_VALUE
if (!DefaultExecutor.notifyStartup()) return
while (true) {
Thread.interrupted() // just reset interruption flag
var parkNanos = DefaultExecutor.processNextEvent() /* Notice here, it calls the processNextEvent */
if (parkNanos == Long.MAX_VALUE) {
// nothing to do, initialize shutdown timeout
if (shutdownNanos == Long.MAX_VALUE) {
val now = nanoTime()
if (shutdownNanos == Long.MAX_VALUE) shutdownNanos = now + DefaultExecutor.KEEP_ALIVE_NANOS
val tillShutdown = shutdownNanos - now
if (tillShutdown <= 0) return // shut thread down
parkNanos = parkNanos.coerceAtMost(tillShutdown)
} else
parkNanos = parkNanos.coerceAtMost(DefaultExecutor.KEEP_ALIVE_NANOS) // limit wait time anyway
}
if (parkNanos > 0) {
// check if shutdown was requested and bail out in this case
if (DefaultExecutor.isShutdownRequested) return
parkNanos(this, parkNanos)
}
}
} finally {
DefaultExecutor._thread = null // this thread is dead
DefaultExecutor.acknowledgeShutdownIfNeeded()
unregisterTimeLoopThread()
// recheck if queues are empty after _thread reference was set to null (!!!)
if (!DefaultExecutor.isEmpty) DefaultExecutor.thread // recreate thread if it is needed
}
}
// Called by run inside the run of DefaultExecutor
override fun processNextEvent(): Long {
// unconfined events take priority
if (processUnconfinedEvent()) return nextTime
// queue all delayed tasks that are due to be executed
val delayed = _delayed.value
if (delayed != null && !delayed.isEmpty) {
val now = nanoTime()
while (true) {
// make sure that moving from delayed to queue removes from delayed only after it is added to queue
// to make sure that 'isEmpty' and `nextTime` that check both of them
// do not transiently report that both delayed and queue are empty during move
delayed.removeFirstIf {
if (it.timeToExecute(now)) {
enqueueImpl(it)
} else
false
} ?: break // quit loop when nothing more to remove or enqueueImpl returns false on "isComplete"
}
}
// then process one event from queue
dequeue()?.run()
return nextTime
}
And then the event loop (run function) of internal actual object DefaultExecutor : EventLoopImplBase(), Runnable {...} finally handles the tasks by dequeuing the tasks and resuming the actual Continuation which was suspended the function by calling delay if the delay time has reached.
All suspending functions work the same way, when compiled it gets converted into a state machine with callbacks.
When you call delay what happens is that a message is posted on a queue with a certain delay, similar to Handler().postDelayed(delay) and, when the delay has lapsed, it calls back to the suspension point and resumes execution.
You can check the source code for the delay function to see how it works:
public suspend fun delay(timeMillis: Long) {
if (timeMillis <= 0) return // don't delay
return suspendCancellableCoroutine sc# { cont: CancellableContinuation<Unit> ->
cont.context.delay.scheduleResumeAfterDelay(timeMillis, cont)
}
}
So if the delay is positive, it schedules the callback in the delay time.

What does a Coroutine Join do?

So for example I have the following code:
scope.launch {
val job = launch {
doSomethingHere()
}
job.join()
callOnlyWhenJobAboveIsDone()
}
Job.join() is state as such in the documentation:
Suspends coroutine until this job is complete. This invocation resumes normally (without exception) when the job is complete for any reason and the Job of the invoking coroutine is still active. This function also starts the corresponding coroutine if the Job was still in new state.
If I understand it correctly, since join() suspends the coroutine until its completed, then my code above will do exactly what it wants. That is, the method callOnlyWhenJobAboveIsDone() will only be called when doSomethingHere() is finished. Is that correct?
Can anyone explain further the use case for job.join()? Thanks in advance.
Explaining further my usecase:
val storeJobs = ArrayList<Job>()
fun callThisFunctionMultipleTimes() {
scope.launch {
val job = launch {
doSomethingHere()
}
storeJobs.add(job)
job.join()
callOnlyWhenJobAboveIsDone()
}
}
fun callOnlyWhenJobAboveIsDone() {
// Check if there is still an active job
// by iterating through the storedJobs
// and checking if any is active
// if no job is active do some other things
}
is this a valid usecase for job.join()?
That is, the method callOnlyWhenJobAboveIsDone() will only be called when doSomethingHere() is finished. Is that correct?
Yes.
Can anyone explain further the use case for job.join()?
In your case there is actually no need for another job, you could just write:
scope.launch {
doSomethingHere()
callOnlyWhenJobAboveIsDone()
}
That will do the exact same thing, so it is not really a usecase for a Job. Now there are other cases when .join() is really useful.
You want to run (launch) multiple asynchronous actions in parallel, and wait for all of them to finish:
someData
.map { Some.asyncAction(it) } // start in parallel
.forEach { it.join() } // wait for all of them
You have to keep track of an asynchronous state, for example an update:
var update = Job()
fun doUpdate() {
update.cancel() // don't update twice at the same time
update = launch {
someAsyncCode()
}
}
Now to make sure that the last update was done, for example if you want to use some updated data, you can just:
update.join()
anywhere, you can also
update.cancel()
if you want to.
Whats really useful about launch {} is that it not only returns a Job, but also attaches the Job to the CoroutineScope. Through that you can keep track of every async action happening inside your application. For example in your UI you could make every Element extend the CoroutineScope, then you can just cancel the scope if the Element leaves the rendered area, and all updates / animations in it will get stopped.
Kotlin's Job.join() is the non-blocking equivalent of Java's Thread.join().
So your assumption is correct: the point of job.join() is to wait for the completion of the receiver job before executing the rest of the current coroutine.
However, instead of blocking the thread that calls join() (like Java's Thread.join() would do) it simply suspends the coroutine calling join(), leaving the current thread free to do whatever it pleases (like executing another coroutine) in the meantime.
val queryProduct = GlobalScope.async {
}
val verification = GlobalScope.async {
}
GlobalScope.launch {
verification.join()
queryProduct.join()
}
This is how I use join(). When two asyncs are completed, another launch starts.

Project Reactor - subscribe on parallel scheduler doesn't work

I'm looking at examples and reading documentation and I've found some problems while trying to subscribe on Flux in a parallel manner.
I have a 3 functions, as below.
private val log = LoggerFactory.getLogger("main")
private val sequence = Flux.just(1, 2)
fun a() {
sequence.subscribeOn(Schedulers.parallel()).subscribe { log.info("*** {}", it) }
sequence.subscribe { log.info(">>> {}", it) }
}
fun b() {
sequence.subscribe { log.info(">>> {}", it) }
}
fun c() {
sequence.subscribeOn(Schedulers.parallel()).subscribe { log.info("*** {}", it) }
}
Now, when I run each method separately I have a proper output from functions a() and b(), but output from c() is empty. Is that to be expected, is it by design? If so, why is that happening?
Flux.just(...) captures value(s) and thus is optimized to execute immediately in the subscribing Thread.
When you use subscribeOn, you change that subscribing Thread from main to something else, making the just truly asynchronous.
In a(), without a subscribeOn that second just would block the main thread just enough that the test doesn't finish before the asynchronous alternative completes.
In c(), there is no such blocking of the main thread. As a consequence, the test terminates before the asynchronous just has had time to emit anything, and that is why you see no output.
To make that more visible, add a Thread.sleep(10) and you'll see some output.