Safe to use (non-thread-safe) mutableMap in suspend function? [closed] - kotlin

I'm learning Coroutines in Kotlin and I have a piece of code that looks like this (see below).
My friend says that the mutableMapOf is LinkedHashMap, which is not thread safe. The argument is that the suspend function may be run by different threads, and thus LinkedHashMap is unsuitable.
1. Is it safe to use a simple mutable map here or is ConcurrentMap needed?
2. When a suspend function is suspended, can it be resumed and executed by another thread?
3. Even if (2) is possible, is there a "happens-before/happens-after" guarantee that ensures all the variables (and the underlying object contents) are deep synchronized from main memory before the new thread takes over?
Here's a simplified version of the code:
class CoroutineTest {
    private val scope = CoroutineScope(SupervisorJob() + Dispatchers.Default)

    suspend fun simpleFunction(): MutableMap<Int, String> {
        val myCallResults = mutableMapOf<Int, String>()
        val deferredCallResult1 = scope.async {
            // make rest call, get string back
        }
        val deferredCallResult2 = scope.async {
            // make rest call, get string back
        }
        ...
        myCallResults.put(1, deferredCallResult1.await())
        myCallResults.put(2, deferredCallResult2.await())
        ...
        return myCallResults
    }
}
Thanks in advance!
PS. I ran this code with much more async call results and had no problem; all call results are accounted for. But that can be inconclusive which is why I ask.

Since the map is local to the suspend function, it is safe to use a non-thread-safe implementation. It is possible that different threads will be working with the map between different suspend function calls (in this case the await() calls), but there is guaranteed happens-before/happens-after within the suspend function.
If your map were declared outside the suspend function and accessed via a property, then there could be simultaneous calls to this function and you would be concurrently modifying it, which would be a problem.

No, it is not safe to use a single mutableMapOf() from multiple coroutines.
You misunderstand suspension. It is not the function that is suspended; it is the coroutine running the function that may suspend. From this perspective, suspending functions aren't really different from normal functions: they can be executed by many coroutines at the same time, and all of them will run concurrently.
But... there is nothing wrong with your code, for another reason. This mutable map is a local variable, so it is only available to the coroutine/thread that created it. Therefore, it is not accessed concurrently at all. It would be different if the map were a property of CoroutineTest; then you might need to use ConcurrentMap.
Updated
After reading all the comments I believe I have a better understanding of your (or your friend's) concerns, so I can provide a more accurate answer.
Yes, after suspending, a coroutine can resume on another thread, so coroutines make it possible for one part of a function to be executed by one thread and another part by a different thread. In your example, it is possible that put(1, ...) and put(2, ...) will be invoked from two different threads.
However, saying that LinkedHashMap is not thread-safe doesn't mean that it always has to be accessed by the same thread. It can be accessed by multiple threads, just not at the same time. One thread needs to finish making its changes to the map, and only then can another thread make its modifications.
Now, in your code the async { } blocks can run in parallel with each other. They can also run in parallel with the outer coroutine. But the contents of each of them run sequentially. The put(2, ...) line can only be executed after put(1, ...) has fully finished, so the map is not accessed by multiple threads at the same time.
As stated earlier, it would be different if the map were stored e.g. as a property, simpleFunction() modified it, and this function were invoked multiple times in parallel; then each invocation would try to modify it at the same time. It would also be different if the async operations modified myCallResults directly. As I said, async blocks run in parallel with each other, so they could modify the map at the same time. But since you only return a result from the async blocks and then modify the map from a single coroutine (the outer one), the map is accessed sequentially, not concurrently.
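To make the two situations concrete, here is a minimal sketch. It assumes kotlinx.coroutines is on the classpath; the function names and the string values are made up for illustration, and it uses coroutineScope instead of the class-level scope from the question.

```kotlin
import kotlinx.coroutines.*
import java.util.concurrent.ConcurrentHashMap

// Safe with a plain map: the map is local and only the calling coroutine
// touches it, one put after another, even if the thread changes across await().
suspend fun collectSequentially(): Map<Int, String> = coroutineScope {
    val first = async(Dispatchers.Default) { "first" }   // stand-in for a REST call
    val second = async(Dispatchers.Default) { "second" } // stand-in for a REST call
    val results = mutableMapOf<Int, String>()
    results[1] = first.await()
    results[2] = second.await()
    results
}

// Different situation: the async blocks write to the map themselves,
// possibly at the same time, so a concurrent map is used here.
suspend fun collectConcurrently(): Map<Int, String> = coroutineScope {
    val results = ConcurrentHashMap<Int, String>()
    (1..100).map { i ->
        async(Dispatchers.Default) { results[i] = "value $i" }
    }.awaitAll()
    results
}

fun main(): Unit = runBlocking {
    println(collectSequentially())
    println(collectConcurrently().size)
}
```

In the first function a ConcurrentHashMap would also work, it just isn't needed. In the second, swapping in mutableMapOf() would be a data race.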

Related

What is the difference between using coroutineScope() and launching a child coroutine and calling join on it?

I am trying to understand the coroutineScope() suspend function in Kotlin and I'm having a hard time understanding the exact purpose of this function.
As per the kotlinlang docs,
This function is designed for parallel decomposition of work. When any
child coroutine in this scope fails, this scope fails and all the rest
of the children are cancelled (for a different behavior see
supervisorScope). This function returns as soon as the given block and
all its children coroutines are completed.
But I feel this behavior can be achieved by launching a child coroutine and calling join on it.
So for example
suspend fun other() {
    coroutineScope {
        launch { /* some task */ }
        async { /* some task */ }
    }
}
This can be written as (scope is a reference to the scope created by the parent coroutine)
suspend fun other(scope: CoroutineScope) {
    scope.launch {
        launch { /* some task */ }
        async { /* some task */ }
    }.join()
}
Is there any difference between these two approaches since it looks
like they will produce same result and also seem to work in the same fashion?
If not, is coroutineScope merely a way to reduce this
boilerplate code of passing scope from parent coroutine and
calling join on child coroutine?
TLDR
Using CoroutineScope as in the example adds boilerplate code, is more confusing, error-prone and may handle cases like errors and cancellations differently. coroutineScope() is generally preferred in such cases.
Full answer
These two patterns are conceptually different and are used in different cases. Coroutines are all about sequential code and structured concurrency. Sequential means we can write traditional code that waits in place, without callbacks and the like, and without taking a performance hit. Structured concurrency means concurrent tasks have owners, and tasks consist of smaller sub-tasks that are explicit to the framework.
By combining the two we get a very easy to use and error-proof concurrency model where in most cases we don't have to launch background jobs and then manage them manually, watch for errors, handle cancellations, etc. We simply fork into sub-tasks and then join them in place, and that's all.
In Kotlin this is represented by suspend functions. Suspend functions are always executed within some context; this context is passed everywhere implicitly, and the coroutines framework provides utilities for using it easily. One of the most common patterns is to fork and then join, and this is exactly what coroutineScope() does. It creates a scope for launching sub-tasks, and we can't leave this scope until all children have completed. We don't have to pass the scope manually, we don't have to join, we don't have to pass errors from children to their siblings and to the parent, and we don't have to pass cancellations from the parent to the children. This is all automatic.
Therefore, suspend functions and coroutineScope() should be the default way of writing concurrent code with coroutines. This approach is easy to write, easy to read and it is error-proof. We can't easily leak a background task, because coroutineScope() won't let us go anywhere. We can't mistakenly ignore errors from background tasks. Etc.
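As a rough illustration of fork-and-join with coroutineScope(), here is a small sketch. loadName() and loadAge() are hypothetical sub-tasks invented for this example, not anything from the question.

```kotlin
import kotlinx.coroutines.*

// Hypothetical sub-tasks standing in for real work.
suspend fun loadName(): String { delay(50); return "Ada" }
suspend fun loadAge(): Int { delay(50); return 36 }

// Fork into two children, then join them in place. The function cannot
// return before both children have completed, and a failure in either
// one cancels the other.
suspend fun loadProfile(): String = coroutineScope {
    val name = async { loadName() }
    val age = async { loadAge() }
    "${name.await()} (${age.await()})"
}

fun main(): Unit = runBlocking {
    println(loadProfile()) // Ada (36)
}
```

Note that the caller never sees a scope, a Job, or a Deferred; from the outside this is just a suspend function that returns a value.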
Of course, in some cases we can't use this pattern. Sometimes, we actually would like to only launch a long-running task and return immediately. Sometimes, we don't consider the caller to be the owner of the task. For example, we could have some kind of a service that manages its tasks and we only schedule these tasks, but the service itself owns them. For these cases we can use CoroutineScope.
By using the scope explicitly we can launch tasks in a different context than the current one, or from outside the coroutine world. We generally have more control, but at the same time we partially opt out of the code correctness guarantees I mentioned above. For example, if we forget to invoke join() we can easily leak background tasks or perform operations in an unexpected order. Also, in your case, if the coroutine invoking other() is cancelled, all launched operations will still be running in the background. For these reasons, we should use CoroutineScope explicitly only if needed.
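The cancellation difference can be observed with a small experiment. This is only a sketch: the delays are arbitrary, the names are made up, and the external scope here is a fresh CoroutineScope standing in for "some scope passed in from elsewhere".

```kotlin
import kotlinx.coroutines.*
import java.util.concurrent.atomic.AtomicBoolean

// Returns (structured task finished, detached task finished).
suspend fun cancelBothCallers(): Pair<Boolean, Boolean> = coroutineScope {
    val external = CoroutineScope(SupervisorJob() + Dispatchers.Default)
    val structuredDone = AtomicBoolean(false)
    val detachedDone = AtomicBoolean(false)

    // Caller 1: the task is a child of the caller (coroutineScope).
    val structuredCaller = launch {
        coroutineScope { launch { delay(200); structuredDone.set(true) } }
    }
    // Caller 2: the task is owned by `external`; the caller only joins it.
    val detachedCaller = launch {
        external.launch { delay(200); detachedDone.set(true) }.join()
    }

    delay(50)                       // let both tasks start
    structuredCaller.cancelAndJoin()
    detachedCaller.cancelAndJoin()
    delay(400)                      // give the surviving task time to finish
    external.cancel()
    structuredDone.get() to detachedDone.get()
}

fun main(): Unit = runBlocking {
    println(cancelBothCallers())
}
```

With these timings this should print (false, true): cancelling the first caller also cancels its child, while cancelling the second caller only interrupts the join() and the task keeps running in the external scope.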
Common patterns
As a result of all that was said above, when working with coroutines we usually use one of these patterns:
1. Suspend function - it runs within the caller context and it waits for all its subtasks, it doesn't launch anything in the background.
2. Function receiving CoroutineScope either as a param or receiver - usually, that means the function wants to do something with the context even after returning (because otherwise it could be simply a suspend function). It either launches some background tasks or stores the context somewhere for later use.
3. Regular function that uses its own CoroutineScope to launch tasks. Usually, this is some kind of a service that keeps its custom context.
At least to me, a function that is both suspend and receives a CoroutineScope is pretty confusing; it is not entirely clear what to expect from it. Will it execute the operation in the caller's context or in the provided one? Will it wait for the operation to finish, or only schedule it in the background and return immediately? Maybe it will do both: first do some initial processing in place (therefore suspend), but also schedule an additional task in the background (therefore scope: CoroutineScope)? We don't know, so we have to read the documentation or source code to understand its behavior. Your second example is an unnecessary complication over a simple suspend function.
To further make my point consider this example:
data class User(
val firstName: String,
val lastName: String,
) {
fun getFullName(user: User) = ...
}
This example is far from perfect, but the main point is that it is confusing why we have to pass user to getFullName() if we already call this function on a user. We don't know whether it returns the full name of the passed user, of the user we invoked the function on, or maybe some kind of mix. If it were a member function not receiving a User, or a static utility function receiving a User, everything would be clear. But a member function receiving a User is simply confusing. This is similar to your second example, where we pass the context both implicitly and explicitly and we don't know which one is used and how exactly.

What's the use of runBlocking in kotlin if it blocks the current thread? [duplicate]

I'm currently working on a codebase that is using runBlocking in a lot of places.
Here is an example
fun doSomeComputation() {
    val rows = runBlocking { /* suspend function which queries database */ }
    // rows is used for further computation
}
From what I understand, runBlocking blocks the current thread. So what exactly is the benefit of using it instead of a regular function? I read somewhere that we use async code so that the thread is not blocked and the UI doesn't become unresponsive, but how is runBlocking async code if it blocks the thread?
I had the same doubt about async/await in JavaScript: since the thread gets blocked, why even use await?
This is exactly why we should not use runBlocking() as you do. runBlocking() is useful for starting our application, for example we can put a single runBlocking() in the main() to bootstrap coroutines and then we don't use it anywhere else. It is also useful to bridge suspendable and non-suspendable code if blocking is our expected result. But otherwise, we should avoid using it.
If we need to bridge classic and suspendable code, but we don't want to block the thread, then we need to stick to classic asynchronous techniques like futures or callbacks.
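For the futures route, one possible sketch is below. It assumes the future builder from kotlinx.coroutines.future (the JDK 8 integration of kotlinx.coroutines) is available; RowService and queryRows() are hypothetical names, and the query is simulated with a delay.

```kotlin
import kotlinx.coroutines.*
import kotlinx.coroutines.future.future
import java.util.concurrent.CompletableFuture

// Hypothetical service that owns its coroutines and exposes a
// classic, non-suspending API to the rest of the application.
class RowService {
    private val scope = CoroutineScope(SupervisorJob() + Dispatchers.Default)

    // Stand-in for a suspend function that queries the database.
    private suspend fun queryRows(): List<String> {
        delay(10)
        return listOf("a", "b")
    }

    // Non-blocking bridge: returns immediately with a future.
    fun queryRowsAsync(): CompletableFuture<List<String>> = scope.future { queryRows() }

    fun close() = scope.cancel()
}

fun main() {
    val service = RowService()
    // join() is only here so the demo does not exit before the result arrives.
    service.queryRowsAsync().thenAccept { rows -> println(rows) }.join()
    service.close()
}
```

The calling thread is free between queryRowsAsync() and the callback; real code would chain further work on the future rather than call join().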

Should runBlocking only be used for tests and in main function?

I have this requirement for a function that gets called periodically:
1. Get some input
2. Do 8 independent computations based on the input
3. Merge the results from these 8 computations and output the merged result
Since I've got at least 8 processors, I can do the 8 independent computations in parallel. So I created the following function:
fun process(input: InputType): ResultType =
    runBlocking(Dispatchers.Default) {
        val jobs = input.splitToList().map { async { processItem(it) } }
        jobs.awaitAll() // merging the results into ResultType is omitted here
    }
However, I've read in the documentation of runBlocking that it is "to be used in main functions and in tests."
This function is not the main function but is called way down in the call hierarchy in an application that does not otherwise use coroutines anywhere else.
What should I use to achieve this requirement if I shouldn't use runBlocking?
There is nothing wrong with using runBlocking() like this. The main point is not to overuse runBlocking() as a cheap way to convert regular code into coroutine-based code. When converting to coroutines, it may be tempting to just put runBlocking() everywhere in our code and call it done. This would be wrong, because it ignores structured concurrency and we risk blocking threads that should not be blocked.
However, if our whole application is not based on coroutines, we just need to use them in one place, and we never need to cancel background tasks, then I think runBlocking() is just fine.
An alternative is to create a CoroutineScope and keep it in some service with a clearly defined lifecycle. Then we can easily manage background tasks, cancel them, etc.
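For reference, the runBlocking() approach from the question could look roughly like this in compilable form. The types are simplified to Int and processItem() is a made-up stand-in for one of the independent computations.

```kotlin
import kotlinx.coroutines.*

// Hypothetical stand-in for one of the independent computations.
fun processItem(item: Int): Int = item * item

// Blocks the calling thread until all computations are done,
// while the computations themselves run in parallel on Dispatchers.Default.
fun process(input: List<Int>): Int = runBlocking(Dispatchers.Default) {
    input.map { item -> async { processItem(item) } }
        .awaitAll()
        .sum() // merge step
}

fun main() {
    println(process(listOf(1, 2, 3, 4, 5, 6, 7, 8))) // 204
}
```

The result of runBlocking is the value of the last expression in the block, so no return statement is needed inside it.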

Is it OK to use redundant/nested withContext calls?

I have a personal project written in Kotlin, and I developed a habit of using withContext(...) very generously. I tend to use withContext(Dispatchers.IO) when calling anything that could possibly be related to I/O.
For example:
suspend fun getSomethingFromDatabase(db: AppDatabase) = withContext(Dispatchers.IO) {
    // ... (the last expression of the block is the result)
}

suspend fun doSomethingWithDatabaseItem(db: AppDatabase) {
    val item = withContext(Dispatchers.IO) {
        getSomethingFromDatabase(db)
    }
    // ...
}
You can see a redundant withContext(Dispatchers.IO) in the second function. I'm being extra cautious here, because I might not know/remember if getSomethingFromDatabase switches to an appropriate context or not. Does this impact performance? Is this bad? What's the idiomatic way of dealing with Dispatchers?
Note: I know that it's perfectly fine to switch between different contexts this way, but this question is specifically about using the same context.
You do not need withContext for anything besides calling code that demands a specific context. Therefore withContext(Dispatchers.Main) should only be used when you're working with UI functions that require the main thread. And you should only use withContext(Dispatchers.IO) when calling blocking IO related code.
A proper suspend function does not block (see Suspending convention section here), and therefore, you should never have to specify a dispatcher to call a suspend function. The exception would be if you're working with someone else's code or API and they are using suspend functions incorrectly!
I don't know what your AppDatabase class is, but if it is sensibly designed, it will expose suspend functions instead of blocking functions, so you should not need withContext to retrieve values from it. But if it does expose blocking functions for retrieving items, then the code of your first function is correct.
And your second function definitely doesn't need withContext because it's simply using it to call something that I can see is a suspend function.
As for whether it's OK to use redundant context switching...it doesn't hurt anything besides possibly wasting a tiny bit of time and memory context switching and allocating lambdas for no reason. And it makes your code less readable.
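A small sketch of that convention, assuming a blocking data source. BlockingDao is a hypothetical class invented for this example, with Thread.sleep() standing in for real blocking I/O.

```kotlin
import kotlinx.coroutines.*

// Hypothetical blocking data source.
class BlockingDao {
    fun load(id: Int): String {
        Thread.sleep(10) // stand-in for blocking I/O
        return "item-$id"
    }
}

// The one place that wraps the blocking call. After this,
// getItem() is a proper suspend function that does not block its caller.
suspend fun getItem(dao: BlockingDao, id: Int): String =
    withContext(Dispatchers.IO) { dao.load(id) }

// Callers just call it; no dispatcher is specified here.
suspend fun describeItem(dao: BlockingDao, id: Int): String {
    val item = getItem(dao, id)
    return "Loaded $item"
}

fun main(): Unit = runBlocking {
    println(describeItem(BlockingDao(), 7)) // Loaded item-7
}
```

The dispatcher switch lives next to the blocking call, so nobody further up the call chain has to know or remember whether it is there.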

Kotlin 1.3: how to execute a block on a separate thread?

I've been reading up about concurrency in Kotlin and thought I started to understand it... Then I discovered that async() has been deprecated in 1.3 and I'm back to the start.
Here's what I'd like to do: create a thread (and it does have to be a thread rather than a managed pool, unfortunately), and then be able to execute async blocks on that thread, and return Deferred instances that will let me use .await().
What is the recommended way to do this in Kotlin?
1. Single-threaded coroutine dispatcher
Here's what I'd like to do: create a thread (and it does have to be a thread rather than a managed pool, unfortunately)
Starting a raw thread to handle your coroutines is an option only if you're prepared to dive deep and implement your own coroutine dispatcher for that case. Kotlin offers support for your requirement via a single-threaded executor service wrapped into a dispatcher. Note that this still leaves you with almost complete control over how you start the thread, if you use the overload that takes a thread factory:
val singleThread = Executors.newSingleThreadExecutor { task ->
    Thread(task, "my-background-thread")
}.asCoroutineDispatcher()
2. async-await vs. withContext
and then be able to execute async blocks on that thread, and return Deferred instances that will let me use .await().
Make sure you actually need async-await, which means you need it for something else than
val result = async(singleThread) { blockingCall() }.await()
Use async-await only if you need to launch a background task, do some more stuff on the calling thread, and only then await() on it.
Most users new to coroutines latch onto this mechanism because it is familiar from other languages, and use it for plain sequential code like the above, just to avoid blocking the UI thread. Kotlin has a "sequential by default" philosophy, which means you should instead use
val result = withContext(singleThread) { blockingCall() }
This doesn't launch a new coroutine in the background thread, but transfers the execution of the current coroutine onto it and back when it's done.
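Putting sections 1 and 2 together, a runnable sketch might look like this. blockingCall() is a made-up placeholder that just reports the thread it ran on, and the thread is marked as a daemon only so the demo can always exit.

```kotlin
import kotlinx.coroutines.*
import java.util.concurrent.Executors

// Placeholder for real blocking work; reports which thread ran it.
fun blockingCall(): String = Thread.currentThread().name

suspend fun runOnSingleThread(): Pair<String, String> = coroutineScope {
    val singleThread = Executors.newSingleThreadExecutor { task ->
        Thread(task, "my-background-thread").apply { isDaemon = true }
    }.asCoroutineDispatcher()

    // use {} closes the dispatcher (and its thread) when we are done.
    singleThread.use { dispatcher ->
        // Sequential: move the current coroutine to the thread and back.
        val sequential = withContext(dispatcher) { blockingCall() }

        // Concurrent: start a task, do other work here, await later.
        val deferred = async(dispatcher) { blockingCall() }
        // ... other work on the calling thread ...
        val concurrent = deferred.await()

        sequential to concurrent
    }
}

fun main(): Unit = runBlocking {
    println(runOnSingleThread())
}
```

Both names should start with my-background-thread; in coroutine debug mode the library may append a coroutine label to the thread name, so the exact text can vary.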
3. Deprecated top-level async
Then I discovered that async() has been deprecated in 1.3
Spawning free-running background tasks is a generally unsound practice because it doesn't behave well in the case of errors or even just unusual patterns of execution. Your calling method may return or fail without awaiting on its result, but the background task will go on. If the application repeatedly re-enters the code that spawns the background task, your singleThread executor's queue will grow without bound. All these tasks will run without a purpose because their requestor is long gone.
This is why Kotlin has deprecated top-level coroutine builders and now you must explicitly qualify them with a coroutine scope whose lifetime you must define according to your use case. When the scope's lifetime runs out, it will automatically cancel all the coroutines spawned within it.
In the case of Android, this would amount to binding the coroutine scope to the lifetime of an Activity, as explained in the KDoc of CoroutineScope.
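Outside Android, the same idea can be sketched with a plain class that owns a scope. Screen, refresh() and destroy() are hypothetical names for illustration, not a framework API.

```kotlin
import kotlinx.coroutines.*

// Hypothetical component with a clearly defined lifetime.
class Screen {
    private val scope = CoroutineScope(SupervisorJob() + Dispatchers.Default)

    // async is qualified with the scope, so the task is owned by this Screen.
    fun refresh(): Deferred<String> = scope.async {
        delay(10_000) // stand-in for a slow request
        "fresh data"
    }

    // End of lifetime: cancels everything launched in the scope.
    fun destroy() = scope.cancel()
}

fun main(): Unit = runBlocking {
    val screen = Screen()
    val pending = screen.refresh()
    screen.destroy()
    pending.join()               // returns quickly, the task was cancelled
    println(pending.isCancelled) // true
}
```

Nothing keeps running after destroy(), which is the behavior the deprecated top-level builders could not give you.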
As the deprecation message states, it is deprecated in favor of calling async with an explicit scope, such as GlobalScope.async { }.
This is the actual implementation of the deprecated method as well.
By removing the top level async function, you'll not run into issues with implicit scopes or wrong imports.
Let me recommend this solution: Kotlin coroutines with returned value
It parallelizes tasks into 3 background threads (so called "triplets pool") but it's easy to change it to be single threaded as per your requirement by replacing tripletsPool with backgroundThread as below:
private val backgroundThread = ThreadPoolExecutor(1, 1, 5L, TimeUnit.SECONDS, LinkedBlockingQueue())