How to convert Java blocking function into cancellable suspend function? - kotlin

Kotlin suspend functions should be nonblocking by convention (1). Often we have old Java code that relies on the Java thread interruption mechanism, which we cannot (or don't want to) modify (2):
public void doSomething(String arg) {
    for (int i = 0; i < 100_000; i++) {
        heavyCrunch(arg, i);
        if (Thread.interrupted()) {
            // We've been interrupted: no more crunching.
            return;
        }
    }
}
What is the best way to adapt this code for usage in coroutines?
Version A is unacceptable because it runs the blocking code on the caller's thread, which violates the "suspending functions do not block the caller thread" convention:
suspend fun doSomething(param: String) = delegate.performBlockingCode(param)
Version B is better because it runs the blocking function on a background thread, so it doesn't block the caller's thread (unless the caller happens to be running on a thread from the Dispatchers.Default pool). But cancelling the coroutine's job wouldn't interrupt performBlockingCode(), which relies on thread interruption.
suspend fun doSomething(param: String) = withContext(Dispatchers.Default) {
    delegate.performBlockingCode(param)
}
Version C is currently the only way I see to make it work. The idea is to convert the blocking function into a nonblocking one with Java mechanisms and then use suspendCancellableCoroutine (3) to convert the asynchronous method into a suspend function:
private ExecutorService executor = Executors.newSingleThreadExecutor();
public Future doSomethingAsync(String arg) {
    return executor.submit(() -> {
        doSomething(arg);
    });
}
suspend fun doSomething(param: String) = suspendCancellableCoroutine<Any> { cont ->
    try {
        val future = delegate.doSomethingAsync(param)
        // Registered inside the try block so that `future` is in scope.
        cont.invokeOnCancellation { future.cancel(true) }
    } catch (e: InterruptedException) {
        throw CancellationException()
    }
}
As pointed out in the comments, the code above won't work properly, because continuation.resumeWith() is never called.
Version D uses CompletableFuture, which provides a way to register a callback for when the future completes (thenAccept):
private ExecutorService executor = Executors.newSingleThreadExecutor();
public CompletableFuture doSomethingAsync(String arg) {
    return CompletableFuture.runAsync(() -> doSomething(arg), executor);
}
suspend fun doSomething(param: String) = suspendCancellableCoroutine<Any> { cont ->
    try {
        val completableFuture = delegate.doSomethingAsync(param)
        completableFuture.thenAccept { cont.resumeWith(Result.success(it)) }
        cont.invokeOnCancellation { completableFuture.cancel(true) }
    } catch (e: InterruptedException) {
        throw CancellationException()
    }
}
Do you know of a better way to do this?
https://docs.oracle.com/javase/tutorial/essential/concurrency/interrupt.html
https://medium.com/@elizarov/blocking-threads-suspending-coroutines-d33e11bf4761
https://medium.com/@elizarov/callbacks-and-kotlin-flows-2b53aa2525cf

You can wrap the blocking code with the suspend function kotlinx.coroutines.runInterruptible.
It suppresses the compile warning, and on cancellation the worker thread is interrupted, so the blocking code throws InterruptedException:
val job = launch {
    runInterruptible {
        Thread.sleep(500)
    }
}
job.cancelAndJoin() // Cause will be 'java.lang.InterruptedException'
Tested on org.jetbrains.kotlinx:kotlinx-coroutines-core-jvm:1.4.2
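Applied to the question, the same idea might look like the sketch below. It is not the asker's actual code: blockingCrunch() is a hypothetical stand-in for the legacy Java method, and Dispatchers.IO is an assumed choice of dispatcher for blocking work.

```kotlin
import kotlinx.coroutines.Dispatchers
import kotlinx.coroutines.cancelAndJoin
import kotlinx.coroutines.delay
import kotlinx.coroutines.launch
import kotlinx.coroutines.runBlocking
import kotlinx.coroutines.runInterruptible

// Hypothetical stand-in for the legacy blocking Java method.
fun blockingCrunch() {
    repeat(100_000) {
        Thread.sleep(10) // throws InterruptedException once the thread is interrupted
    }
}

// Cancellable suspending wrapper: cancelling the calling coroutine
// interrupts the thread that is running the blocking code.
suspend fun crunch() = runInterruptible(Dispatchers.IO) { blockingCrunch() }

// Starts the work, cancels it shortly afterwards and reports whether
// the job ended up cancelled instead of running to completion.
fun cancelsBlockingWork(): Boolean = runBlocking {
    val job = launch { crunch() }
    delay(100)
    job.cancelAndJoin()
    job.isCancelled
}
```

Without runInterruptible, cancelAndJoin() would have to wait for the whole loop to finish, because plain withContext does not interrupt the thread.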

Related

Heap issue when using kotlin coroutine in a Batch process

I want to call an API for each element in a list.
So I created below code which is an extension function:
suspend fun <T, V> Iterable<T>.customAsyncAll(method: suspend (T) -> V): Iterable<V> {
    val deferredList = mutableListOf<Deferred<V>>()
    val scope = CoroutineScope(Dispatchers.IO)
    forEach {
        val deferred = scope.async {
            try {
                method(it)
            } catch (e: Exception) {
                log.error { "customAsyncAll Exception in $method method " + e.stackTraceToString() }
                throw e
            }
        }
        deferredList.add(deferred)
    }
    return deferredList.awaitAll()
}
Call the code as:
val result = runBlocking{ list.customAsyncAll { apiCall(it) }.toList() }
I see error posting Resource Exhausted event: Java heap space. What is wrong with this code?
When an exception is thrown in one of the API calls, will the remaining async coroutines be released, or will they still occupy heap space?
I'm guessing you are passing a somewhat large list (50+ items). I believe that making so many calls at once is the problem, and realistically I don't think you will gain any performance by opening more than 10 connections to the API at a time. My suggestion would be to limit the number of concurrent calls to some number below 20.
There are many ways to implement this; using a Semaphore is my recommendation.
suspend fun <T, V> Iterable<T>.customAsyncAll(method: suspend (T) -> V): Iterable<V> {
    val deferredList = mutableListOf<Deferred<V>>()
    val scope = CoroutineScope(Dispatchers.IO)
    // kotlinx.coroutines.sync.Semaphore: at most 10 calls run at the same time
    val sema = Semaphore(10)
    forEach {
        val deferred = scope.async {
            sema.withPermit {
                try {
                    method(it)
                } catch (e: Exception) {
                    log.error {
                        "customAsyncAll Exception in $method method " +
                            e.stackTraceToString()
                    }
                    throw e
                }
            }
        }
        deferredList.add(deferred)
    }
    return deferredList.awaitAll()
}
 
Side note
Be sure to cancel any custom CoroutineScope you create once you are done with it; see Custom usage.
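One way to avoid having a custom scope to cancel at all is to build the helper on coroutineScope. The following is a minimal sketch under that assumption, not the original poster's code: mapConcurrently, maxConcurrency and peakConcurrencyDemo are names made up for illustration, and the lambda passed in the demo merely simulates an API call with delay.

```kotlin
import java.util.concurrent.atomic.AtomicInteger
import kotlinx.coroutines.Dispatchers
import kotlinx.coroutines.async
import kotlinx.coroutines.awaitAll
import kotlinx.coroutines.coroutineScope
import kotlinx.coroutines.delay
import kotlinx.coroutines.runBlocking
import kotlinx.coroutines.sync.Semaphore
import kotlinx.coroutines.sync.withPermit

// Hypothetical helper: runs transform for every element, with at most
// maxConcurrency invocations in flight. coroutineScope waits for all
// children and cancels the remaining ones if any of them fails.
suspend fun <T, V> Iterable<T>.mapConcurrently(
    maxConcurrency: Int,
    transform: suspend (T) -> V
): List<V> = coroutineScope {
    val semaphore = Semaphore(maxConcurrency)
    this@mapConcurrently.map { item ->
        async(Dispatchers.IO) {
            semaphore.withPermit { transform(item) }
        }
    }.awaitAll()
}

// Demo: returns the results together with the highest number of
// transform invocations that were observed running at the same time.
fun peakConcurrencyDemo(): Pair<List<Int>, Int> = runBlocking {
    val running = AtomicInteger(0)
    val peak = AtomicInteger(0)
    val results = (1..20).mapConcurrently(maxConcurrency = 3) { n ->
        val now = running.incrementAndGet()
        peak.updateAndGet { maxOf(it, now) }
        delay(20) // simulated API call
        running.decrementAndGet()
        n * 2
    }
    results to peak.get()
}
```

Note that this still creates one coroutine per element up front; only the number of simultaneous calls is bounded. For very large inputs, processing the list in chunks may be the safer option.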

How to get correct return value for suspend function when using GlobalScope.launch?

I have a suspend function
private suspend fun getResponse(record: String): HashMap<String, String> {}
When I call it in my main function I'm doing this, but the type of response is Job, not HashMap, how can I get the correct return type?
override fun handleRequest(event: SQSEvent?, context: Context?): Void? {
    event?.records?.forEach {
        try {
            val response: Job = GlobalScope.launch {
                getResponse(it.body)
            }
        } catch (ex: Exception) {
            logger.error("error message")
        }
    }
    return null
}
Given your answers in the comments, it looks like you're not looking for concurrency here. The best course of action would then be to just make getResponse() a regular function instead of a suspend one.
Assuming you can't change this, you need to call a suspend function from a regular one. To do so, you have several options depending on your use case:
1. block the current thread while you do your async stuff
2. make handleRequest a suspend function
3. make handleRequest take a CoroutineScope to start coroutines with some lifecycle controlled externally, but that means handleRequest will return immediately and the caller has to deal with the running coroutines (please don't use GlobalScope for this, it's a delicate API)
Options 2 and 3 are provided for completeness, but most likely they won't work in your context. So you have to block the current thread while handleRequest is running, and you can do that using runBlocking:
override fun handleRequest(event: SQSEvent?, context: Context?): Void? {
    runBlocking {
        // do your stuff
    }
    return null
}
Now what to do inside runBlocking depends on what you want to achieve.
If you want to process elements sequentially, simply call getResponse directly inside the loop:
override fun handleRequest(event: SQSEvent?, context: Context?): Void? {
    runBlocking {
        event?.records?.forEach {
            try {
                val response = getResponse(it.body)
                // do something with the response
            } catch (ex: Exception) {
                logger.error("error message")
            }
        }
    }
    return null
}
If you want to process elements concurrently, but independently, you can use launch and put both getResponse() and the code using the response inside the launch:
override fun handleRequest(event: SQSEvent?, context: Context?): Void? {
    runBlocking {
        event?.records?.forEach {
            launch { // coroutine scope provided by runBlocking
                try {
                    val response = getResponse(it.body)
                    // do something with the response
                } catch (ex: Exception) {
                    logger.error("error message")
                }
            }
        }
    }
    return null
}
If you want to get the responses concurrently, but process all responses only when they're all done, you can use map + async:
override fun handleRequest(event: SQSEvent?, context: Context?): Void? {
    runBlocking {
        val responses = event?.records?.map {
            async { // coroutine scope provided by runBlocking
                try {
                    getResponse(it.body)
                } catch (ex: Exception) {
                    logger.error("error message")
                    null // if you want to still handle other responses
                    // you could also throw an exception otherwise
                }
            }
        }?.mapNotNull { it.await() } // drop the calls that failed
        // do something with all responses
    }
    return null
}
You can use GlobalScope.async() instead of launch(): it returns a Deferred, which is a future/promise object. You can then call await() on it to get the result of getResponse().
Just make sure not to do something like async().await(): it wouldn't make any sense, because it would still run synchronously. If you need to run getResponse() on all event.records in parallel, first loop over them and collect all the Deferred objects, then await all of them.
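The difference between the two patterns can be sketched as follows. fakeResponse() is a hypothetical stand-in for getResponse() that simply suspends for about 100 ms, and the sketch uses coroutineScope instead of GlobalScope so that it stays self-contained.

```kotlin
import kotlin.system.measureTimeMillis
import kotlinx.coroutines.async
import kotlinx.coroutines.awaitAll
import kotlinx.coroutines.coroutineScope
import kotlinx.coroutines.delay
import kotlinx.coroutines.runBlocking

// Hypothetical stand-in for getResponse(): each call takes about 100 ms.
suspend fun fakeResponse(record: String): String {
    delay(100)
    return "response for $record"
}

// Anti-pattern: awaiting each Deferred right away runs the calls one after another.
suspend fun oneByOne(records: List<String>): List<String> = coroutineScope {
    records.map { async { fakeResponse(it) }.await() }
}

// Start every call first, then await all of them together.
suspend fun allAtOnce(records: List<String>): List<String> = coroutineScope {
    records.map { async { fakeResponse(it) } }.awaitAll()
}

// Runs the block to completion and returns its result with the elapsed time in ms.
fun <R> timed(block: suspend () -> R): Pair<R, Long> {
    var result: R? = null
    val millis = measureTimeMillis { result = runBlocking { block() } }
    @Suppress("UNCHECKED_CAST")
    return (result as R) to millis
}
```

Both functions return the same list; with five records the first should take roughly five times as long as the second.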

How to write an extension function / wrapper for Kotlin Coroutines Flow?

I have Coroutines code which is using a callbackFlow like this:
fun getUniqueEventAsFlow(receiverId: String): Flow<Any> = callbackFlow {
    RxEventBus().register(
        receiverId,
        FirstUniqueEvent::class.java,
        false
    ) { amEvent ->
        offer(amEvent)
    }
    // Suspend until either onCompleted or external cancellation are invoked
    awaitClose {
        unsubscribeFromEventBus(receiverId)
        cancel()
    }
}.flowOn(Dispatchers.Default)
    .catch { throwable ->
        reportError(throwable as Exception)
    }
What I'd like to do is wrap the following so that it can be called automatically, since I have many similar functions in the code:
    // Suspend until either onCompleted or external cancellation are invoked
    awaitClose {
        unsubscribeFromEventBus(receiverId)
        cancel()
    }
}.flowOn(Dispatchers.Default)
    .catch { throwable ->
        reportError(throwable as Exception)
    }
I would like to wrap the awaitClose & flowOn once, and not have to write it for every callbackFlow. Do you know which Kotlin higher order construct I can use to achieve this?
Thank you,
Igor
Here is the solution for wrapping awaitClose and handleErrors:
/**
 * Utility function which suspends a coroutine until it is completed or closed.
 * Unsubscribes from Rx Bus event and ensures that the scope is cancelled upon completion.
 */
suspend fun finalizeFlow(scope: ProducerScope<Any>, receiverId: String) {
    // awaitClose already installs the channel's single close handler internally,
    // so the logging goes into its block instead of a separate invokeOnClose call.
    scope.awaitClose {
        unsubscribeFromEventBus(receiverId)
        scope.cancel()
        Logger.debug(
            javaClass.canonicalName,
            "Closed Flow channel for receiverId $receiverId"
        )
    }
}
/**
 * Extension function which does error handling on [Flow].
 */
fun <T> Flow<T>.handleErrors(): Flow<T> = catch { throwable ->
    reportError(throwable as Exception)
}
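An alternative is to wrap the whole callbackFlow in a single higher-order function, so that each call site supplies only the subscribe and unsubscribe steps. The sketch below illustrates that idea under stated assumptions: listenerFlow, SimpleBus and eventsFor are hypothetical names, SimpleBus is a tiny stand-in for RxEventBus, and trySend is used in place of the older offer.

```kotlin
import java.util.concurrent.ConcurrentHashMap
import kotlinx.coroutines.Dispatchers
import kotlinx.coroutines.async
import kotlinx.coroutines.channels.awaitClose
import kotlinx.coroutines.delay
import kotlinx.coroutines.flow.Flow
import kotlinx.coroutines.flow.callbackFlow
import kotlinx.coroutines.flow.catch
import kotlinx.coroutines.flow.flowOn
import kotlinx.coroutines.flow.take
import kotlinx.coroutines.flow.toList
import kotlinx.coroutines.runBlocking

// Hypothetical wrapper: awaitClose, flowOn and catch are written once here.
fun <T> listenerFlow(
    subscribe: (emit: (T) -> Unit) -> Unit,
    unsubscribe: () -> Unit,
    onError: (Throwable) -> Unit = {}
): Flow<T> = callbackFlow<T> {
    subscribe { event -> trySend(event) }
    awaitClose { unsubscribe() } // runs when the collector stops or the flow is closed
}.flowOn(Dispatchers.Default)
    .catch { throwable -> onError(throwable) }

// Hypothetical, minimal event bus standing in for RxEventBus.
class SimpleBus {
    private val listeners = ConcurrentHashMap<String, (Any) -> Unit>()
    val listenerCount: Int get() = listeners.size
    fun register(receiverId: String, listener: (Any) -> Unit) { listeners[receiverId] = listener }
    fun unregister(receiverId: String) { listeners.remove(receiverId) }
    fun post(event: Any) = listeners.values.forEach { it(event) }
}

// Each event source now needs only two lambdas.
fun SimpleBus.eventsFor(receiverId: String): Flow<Any> = listenerFlow<Any>(
    subscribe = { emit -> register(receiverId, emit) },
    unsubscribe = { unregister(receiverId) }
)

// Usage sketch: collect the first two events posted to the bus.
fun collectTwoEvents(bus: SimpleBus): List<Any> = runBlocking {
    val pending = async { bus.eventsFor("receiver-1").take(2).toList() }
    while (bus.listenerCount == 0) delay(10) // wait until the flow has registered
    bus.post("first")
    bus.post("second")
    pending.await()
}
```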

Ensure the main JVM program will blow up, even when launched with Kotlin Coroutines

I have a very simple Kotlin program like
fun main() {
    val scope = CoroutineScope(Dispatchers.Default)
    val job = scope.launch() { // I only check if this Job isActive later
        withTimeout(2000) {
            terminate(task)
        }
    }
}
private suspend fun terminate(task: Task): Nothing = suspendCoroutine {
    throw IllegalAccessError("Task ${task.name} should honor timeouts!")
}
When terminate() is called I want my program to blow up. I don't want to recover. However, I can only see
Exception in thread "DefaultDispatcher-worker-2"
abc.xyz.mainKt$terminate$$inlined$suspendCoroutine$lambda$1: Task Robot should honor timeouts!
// More stacktrace ...
in the logs, since Coroutines is "swallowing" this exception.
Therefore, my question is: what would be a guaranteed way to blow up my program when a timeout happens, with a design driven by Kotlin Coroutines?
How about this?
// Explicit Unit return type, so that main is still recognized as the entry point
fun main(): Unit = runBlocking {
    withTimeout(2000) {
        terminate(task)
    }
}
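To see why this helps, here is a small sketch contrasting the two behaviours. failingTask() is a hypothetical stand-in for terminate(task), and the empty CoroutineExceptionHandler is there only to keep the demo quiet.

```kotlin
import kotlinx.coroutines.CoroutineExceptionHandler
import kotlinx.coroutines.CoroutineScope
import kotlinx.coroutines.Dispatchers
import kotlinx.coroutines.delay
import kotlinx.coroutines.launch
import kotlinx.coroutines.runBlocking
import kotlinx.coroutines.withTimeout

// Hypothetical stand-in for terminate(task).
suspend fun failingTask(): Int {
    delay(10)
    throw IllegalStateException("Task should honor timeouts!")
}

// runBlocking rethrows the failure on the calling thread; if nothing
// catches it in main(), the exception leaves main and the program ends.
fun failureFromRunBlocking(): Throwable? =
    try {
        runBlocking { withTimeout(2_000) { failingTask() } }
        null
    } catch (e: IllegalStateException) {
        e
    }

// A coroutine launched in an independent scope only fails its own Job:
// the launching thread carries on and has to inspect the Job itself.
fun failureFromLaunch(): Boolean = runBlocking {
    val handler = CoroutineExceptionHandler { _, _ -> /* keeps the demo quiet */ }
    val scope = CoroutineScope(Dispatchers.Default + handler)
    val job = scope.launch { withTimeout(2_000) { failingTask() } }
    job.join() // join() does not rethrow the failure
    job.isCancelled
}
```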

How can I guarantee to get latest data when I use Coroutine in Kotlin?

Code A is from the architecture-samples project; you can see it here.
updateTasksFromRemoteDataSource() is a suspend function, so it may run asynchronously.
When I call getTasks(forceUpdate: Boolean) with the parameter set to true, I'm afraid that return tasksLocalDataSource.getTasks() will be executed before updateTasksFromRemoteDataSource() completes.
I don't know whether Code B can guarantee that return tasksLocalDataSource.getTasks() will be executed after updateTasksFromRemoteDataSource().
Code A
class DefaultTasksRepository(
    private val tasksRemoteDataSource: TasksDataSource,
    private val tasksLocalDataSource: TasksDataSource,
    private val ioDispatcher: CoroutineDispatcher = Dispatchers.IO
) : TasksRepository {
    override suspend fun getTasks(forceUpdate: Boolean): Result<List<Task>> {
        // Set app as busy while this function executes.
        wrapEspressoIdlingResource {
            if (forceUpdate) {
                try {
                    updateTasksFromRemoteDataSource()
                } catch (ex: Exception) {
                    return Result.Error(ex)
                }
            }
            return tasksLocalDataSource.getTasks()
        }
    }
    private suspend fun updateTasksFromRemoteDataSource() {
        val remoteTasks = tasksRemoteDataSource.getTasks()
        if (remoteTasks is Success) {
            // Real apps might want to do a proper sync, deleting, modifying or adding each task.
            tasksLocalDataSource.deleteAllTasks()
            remoteTasks.data.forEach { task ->
                tasksLocalDataSource.saveTask(task)
            }
        } else if (remoteTasks is Result.Error) {
            throw remoteTasks.exception
        }
    }
    ...
}
Code B
class DefaultTasksRepository(
    private val tasksRemoteDataSource: TasksDataSource,
    private val tasksLocalDataSource: TasksDataSource,
    private val ioDispatcher: CoroutineDispatcher = Dispatchers.IO
) : TasksRepository {
    override suspend fun getTasks(forceUpdate: Boolean): Result<List<Task>> {
        // Set app as busy while this function executes.
        wrapEspressoIdlingResource {
            coroutineScope {
                if (forceUpdate) {
                    try {
                        updateTasksFromRemoteDataSource()
                    } catch (ex: Exception) {
                        return Result.Error(ex)
                    }
                }
            }
            return tasksLocalDataSource.getTasks()
        }
    }
    ...
}
Added Content
To Tenfour04: Thanks!
If somebody implements updateTasksFromRemoteDataSource() with launch, as in Code C, are you sure that return tasksLocalDataSource.getTasks() will be executed after updateTasksFromRemoteDataSource() when I call getTasks(forceUpdate: Boolean) with the parameter set to true?
Code C
class DefaultTasksRepository(
    private val tasksRemoteDataSource: TasksDataSource,
    private val tasksLocalDataSource: TasksDataSource,
    private val ioDispatcher: CoroutineDispatcher = Dispatchers.IO
) : TasksRepository {
    override suspend fun getTasks(forceUpdate: Boolean): Result<List<Task>> {
        // Set app as busy while this function executes.
        wrapEspressoIdlingResource {
            if (forceUpdate) {
                try {
                    updateTasksFromRemoteDataSource()
                } catch (ex: Exception) {
                    return Result.Error(ex)
                }
            }
            return tasksLocalDataSource.getTasks()
        }
    }
    private suspend fun updateTasksFromRemoteDataSource() {
        val remoteTasks = tasksRemoteDataSource.getTasks()
        if (remoteTasks is Success) {
            // Real apps might want to do a proper sync, deleting, modifying or adding each task.
            tasksLocalDataSource.deleteAllTasks()
            launch { //I suppose that launch can be fired
                remoteTasks.data.forEach { task ->
                    tasksLocalDataSource.saveTask(task)
                }
            }
        } else if (remoteTasks is Result.Error) {
            throw remoteTasks.exception
        }
    }
}
New Added Content
To Joffrey: Thanks!
I think that Code D can be compiled.
In this case, when forceUpdate is true, tasksLocalDataSource.getTasks() may run before updateTasksFromRemoteDataSource() is done.
Code D
class DefaultTasksRepository(
    private val tasksRemoteDataSource: TasksDataSource,
    private val tasksLocalDataSource: TasksDataSource,
    private val ioDispatcher: CoroutineDispatcher = Dispatchers.IO,
    private val myCoroutineScope: CoroutineScope
) : TasksRepository {
    override suspend fun getTasks(forceUpdate: Boolean): Result<List<Task>> {
        // Set app as busy while this function executes.
        wrapEspressoIdlingResource {
            if (forceUpdate) {
                try {
                    updateTasksFromRemoteDataSource(myCoroutineScope)
                } catch (ex: Exception) {
                    return Result.Error(ex)
                }
            }
            return tasksLocalDataSource.getTasks()
        }
    }
    private suspend fun updateTasksFromRemoteDataSource(myCoroutineScope: CoroutineScope) {
        val remoteTasks = tasksRemoteDataSource.getTasks()
        if (remoteTasks is Success) {
            // Real apps might want to do a proper sync, deleting, modifying or adding each task.
            tasksLocalDataSource.deleteAllTasks()
            myCoroutineScope.launch {
                remoteTasks.data.forEach { task ->
                    tasksLocalDataSource.saveTask(task)
                }
            }
        } else if (remoteTasks is Result.Error) {
            throw remoteTasks.exception
        }
    }
    ...
}
Suspend functions look like regular functions from the call site's point of view because they execute sequentially, just like regular synchronous functions.
What I mean by this is that the instructions following a plain call to a suspend function do not execute until the called function completes its execution.
This means that code A is fine (when forceUpdate is true, tasksLocalDataSource.getTasks() will never run before updateTasksFromRemoteDataSource() is done), and the coroutineScope in code B is unnecessary.
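A minimal sketch of this sequential behaviour, using a hypothetical in-memory FakeRepository in place of the real data sources:

```kotlin
import kotlinx.coroutines.delay
import kotlinx.coroutines.runBlocking

// Hypothetical in-memory stand-in for the repository in the question.
class FakeRepository {
    private val events = mutableListOf<String>()

    private suspend fun updateFromRemote() {
        delay(50) // suspends, but the caller does not move on until this returns
        events += "update finished"
    }

    suspend fun getTasks(forceUpdate: Boolean): List<String> {
        if (forceUpdate) updateFromRemote()
        events += "local read"
        return events.toList()
    }
}

// Returns the order in which the two steps were recorded.
fun observedOrder(): List<String> = runBlocking {
    FakeRepository().getTasks(forceUpdate = true)
}
```

The local read is always recorded after the update, even though updateFromRemote() suspends in the middle.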
Now regarding code C, structured concurrency is here to save you.
You simply cannot call launch without a CoroutineScope receiver.
Since DefaultTasksRepository doesn't implement CoroutineScope, code C as-is will not compile.
There are 2 ways to make this compile, though:
1. Using GlobalScope.launch {}: this will indeed cause the problem you expect. The body of such a launch will run asynchronously and independently of the caller. updateTasksFromRemoteDataSource can in this case return before the launch's body is done. The only way to control this is to use .join() on the Job returned by the call to launch (which waits until it's done). This is why it is usually not recommended to use the GlobalScope: it can "leak" coroutines.
2. Wrapping calls to launch in a coroutineScope {...} inside updateTasksFromRemoteDataSource. This will ensure that all coroutines launched within the coroutineScope block are actually finished before the coroutineScope call completes. Note that everything inside the coroutineScope block may very well run concurrently, though, depending on how launch/async are used, but this is the whole point of using launch in the first place, isn't it?
Now with Code D, my answer for code C sort of still holds. Whether you pass a scope or use the GlobalScope, you're effectively creating coroutines with a bigger lifecycle than the suspending function that starts them.
Therefore, it does create the problem you fear.
But why would you pass a CoroutineScope if you don't want implementers to launch long lived coroutines in the provided scope?
Assuming you don't do that, it's unlikely that a developer would use the GlobalScope (or any scope) to do this. It's generally bad style to create long-lived coroutines from a suspending function. If your function is suspending, callers usually expect that when it completes, it has actually done its work.
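The difference between launching inside coroutineScope and launching in an externally provided scope can be sketched like this. All function names below are hypothetical, and delay(50) stands in for the database writes.

```kotlin
import java.util.concurrent.CopyOnWriteArrayList
import kotlinx.coroutines.CoroutineScope
import kotlinx.coroutines.Dispatchers
import kotlinx.coroutines.Job
import kotlinx.coroutines.cancel
import kotlinx.coroutines.coroutineScope
import kotlinx.coroutines.delay
import kotlinx.coroutines.launch
import kotlinx.coroutines.runBlocking

// Structured: coroutineScope returns only after every coroutine launched inside it has finished.
suspend fun saveAllStructured(tasks: List<String>, store: MutableList<String>) {
    coroutineScope {
        tasks.forEach { task ->
            launch {
                delay(50) // pretend to write to a database
                store.add(task)
            }
        }
    }
}

// Leaky: the coroutine belongs to the external scope, so this function
// returns before the work is done.
fun saveAllInExternalScope(
    scope: CoroutineScope,
    tasks: List<String>,
    store: MutableList<String>
): Job = scope.launch {
    delay(50)
    store.addAll(tasks)
}

// Number of saved tasks visible right after the structured call returns.
fun structuredDemo(): Int = runBlocking {
    val store = CopyOnWriteArrayList<String>()
    saveAllStructured(listOf("a", "b", "c"), store)
    store.size
}

// Sizes seen right after the call returns and after explicitly joining the Job.
fun externalScopeDemo(): Pair<Int, Int> = runBlocking {
    val store = CopyOnWriteArrayList<String>()
    val scope = CoroutineScope(Dispatchers.Default)
    val job = saveAllInExternalScope(scope, listOf("a", "b", "c"), store)
    val sizeOnReturn = store.size // the write has not happened yet
    job.join()
    val sizeAfterJoin = store.size
    scope.cancel()
    sizeOnReturn to sizeAfterJoin
}
```

In the external-scope variant a caller that reads the store immediately sees stale data unless it joins the returned Job, which is the situation described for Code D.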