Kotlin Coroutines: Do we need to synchronize shared state? - kotlin

From the official guide and samples from web, I didn't see any mentions of locking or synchronization, or how safe is modifying a shared variable in multiple launch or async calls.

Coroutines bring a concurrent programming model that may result in simultaneously executed code. Just as you know it from thread-based libraries, you have to care about synchronization as noted in the docs:
Coroutines can be executed concurrently using a multi-threaded dispatcher like the Dispatchers.Default. It presents all the usual concurrency problems. The main problem being synchronization of access to shared mutable state. Some solutions to this problem in the land of coroutines are similar to the solutions in the multi-threaded world, but others are unique.
With Kotlin Coroutines you can make use of acquainted strategies like using thread-safe data structures, confining execution to a single thread or using locks (e.g. Mutex).
Besides the common patterns, Kotlin coroutines encourage us to use a "share by communication" style. Concretely, an "actor" can be shared between coroutines. They can be used by coroutines, which may send/take messages to/from it. Also have a look at Channels.

Related

How to use blocking (I/O bound) APIs within Kotlin coroutines?

I'm writing a Kotlin server using Ktor - where my request handlers are written using Kotlin coroutines.
My understanding is each request handler is run on Ktor's thread pool, which contains far fewer threads than the traditional pool size of 1-thread-per-request server frameworks due to the lightweight/suspendable nature of coroutines. Great!
The issue I have is that my application still needs to interact with some blocking resources (JDBC database connection pool), but my understanding is that if I merely call these blocking APIs directly from the request coroutine I will end up with liveness issues - as I can end up blocking all the threads used to handle my requests! Not great.
Since I'm still relatively new to the world of Kotlin and coroutines, I'm wondering if anyone here can give me some tips on the best way to handle this situation.
I've seen Dispatchers.IO referenced a few times elsewhere. Is that considered the best way to manage these blocking calls? Are there any good examples of this?
The API I'm trying to use does allow for some asyncronicity by passing an Executor. Ideally, I could also wrap these calls in a convenient, idiomatic Kotlin API for suspending transactions.
You understand it all correctly. In most cases you should never block the thread when inside a coroutine. One exception is Dispatchers.IO mentioned by you. It is the standard way of handling blocking code and it is very easy to use:
withContext(Dispatchers.IO) {
// blocking code
}
withContext() is a suspend function, so you can think of above as the way to convert blocking to suspend. However, Dispatchers.IO doesn't really perform any magic - it just uses a bigger pool of threads, designated for blocking. I believe by default it creates 64 threads at maximum.
If you need to perform several parallel blocking operations, it is usually better to create your own thread pool to not block other components of the application.
If the IO library provides asynchronous API then generally it is better to use it instead of the blocking API. However, in many cases libraries provide asynchronous API by managing their own internal thread pool for blocking. In that case using asynchronous API and using blocking API with Dispatchers.IO is very similar. Dispatchers.IO could be even better, because it re-uses same IO threads across all IO operations and it can partially share threads with a thread pool designated for CPU computations (Dispatchers.Default).
Yes. the Dispatchers.IO would be the answer. I had a test with quarkus. The vert.x had no 2-seconds-blocking-alarm after I switched JDBC connection to Dispatchers.IO
https://github.com/hmchangm/quarkus-reactive-kotlin/blob/mariadb/src/main/kotlin/tw/idv/brandy/arrow/repo/FruitRepo.kt

Unnecessarily mark functions as suspending in favor of common abstraction

I'm working on a project with an API running in the JVM and a JS client to access this API from the browser. The data classes of those objects which are converted to/from JSON are in a multiplatform module so that I can reuse the code on both platforms and don't accidentally end up with mismatched attributes. At this point it would be nice to also have the APIs interface in this mutliplatform module which then would be implemented and hosted in the JVM and implemented and presented in the browser. However, all methods of this interface need to be suspending in the browser since requests are (at least with Ktor's client, which I'm using) while they do not need to be suspending in the JVM.
Is there a good reason against having all those methods suspending even though I don't make use of it in the JVM? I know that methods usually should be suspending only if it's actually needed, but then I would be writing all the same interfaces (besides the suspend keyword) twice which seems like a lot of unnecessary boilerplate code to me. The methods which would unnecessarily be marked as suspending are called from suspending contexts (I'm using Ktor in the JVM too) so restricted usage wouldn't be a problem.
This seems like a matter of preference, really. Both using suspend and not using it have disadvantges, so you have to choose which weigh less.
From what you write, it seems that the advantages of using suspend (write code only once) outweigh the disavantage of polluting the interface with an unnecessary modifier. I am not aware of the possible runtime overheads here. Personally, I would opt to go with suspend.
The methods which would unnecessarily be marked as suspending are called from suspending contexts (I'm using Ktor in the JVM too) so restricted usage wouldn't be a problem.
This is the key point: the biggest hassle of the unnecessary suspend is having to launch a coroutine. If you're already inside a coroutine, the overhead of just one function along the call path being unnecessarily suspend is very low: a single extra object allocated per call.
While it's true that with having one interface you avoid boilerplate and you get the hassle of having to launch a coroutine on JVM, I'd consider another perspective:
When designing your abstraction IMO you shouldn't get much into implementation details, instead of thinking how jvm and/or js handles communication with the api I'd go with the question "Do I want to leave room for the platforms to handle this communication in an async/suspend way?". I believe this way you'll arrive to a more scalable solution, but true you'll lose out on some of the micro-optimizations

How to understand coroutine cancellation is cooperative

In Kotlin, coroutine cancellation is cooperative. How should I understand it?
Link to Kotlin documentation.
If you have a Java background, you may be familiar with the thread interruption mechanism. Any thread can call thread.interrupt() and the receiving thread will get a signal in the form of a Boolean isInterrupted flag becoming true. The receiving thread may check the flag at any time with currentThread.isInterrupted() — or it may ignore it completely. That's why this mechanism is said to be cooperative.
Kotlin's coroutine cancellation mechanism is an exact replica of this: you have a coroutineContext.isActive flag that you (or a function you call) may check.
In both cases some well-known functions, for example Thread.sleep() in Java and delay() in Kotlin, check this flag and throw an InterruptedException and CancellationException, respectively. These methods/functions are said to be "interruptible" / "cancellable".
I'm not 100% sure whether I understand your question, but maybe this helps:
Coroutines are usually executed within the same thread you start them with. You can use different dispatchers, but they are designed to work when being started from the same thread. There's no extra scheduling happening.
You can compare this with scheduling mechanisms in an OS. Coroutines behave similar like to cooperative scheduling. You find similar concepts in many frameworks and languages to deal with async operations. Ruby for example has fibers which behave similar.
Basically this means that if a coroutine is hogging on your CPU in a busy loop, you cannot cancel it (unless you kill the whole process). Instead, your coroutines has to regularly check for cancellation and also add waits/delays/yields so that other coroutines can work.
This also defines on when coroutines are helpful the most: when running in a single-threaded-context, it doesn't help to use co-routines for local-only calculations. I used them mostly for processing async calls like interactions with databases or web servers.
This article also has some explanations on how coroutines work - maybe it helps you with any additional questions: https://antonioleiva.com/coroutines/

Alternative for concurrency:: data structures in macOS

For my macOS application, I'd like to use concurrent map and queue data structure to be shared between multithread process and support parallel operations.
After some research I've found what what I need, but unfortunately those are only implemented in windows.
concurrency::concurrent_unordered_map<key,value> concurrency::concurrent_queue<key>
Perhaps there are synonyms internal implementations in macOS in CoreFoundation or other framework that comes with Xcode SDK (disregarding the language implementation) ?
thanks,
Perhaps there are synonyms internal implementations in macOS in CoreFoundation or other framework that comes with Xcode SDK (disregarding the language implementation) ?
Nope. You must roll-your-own or source elsewhere.
The Apple provided collections are not thread safe, however the common recommendation is to combine them with Grand Central Dispatch (GCD) to provide lightweight thread-safe wrappers, and this is quite easy to do.
Here is an outline of one way to do it for NSMutableDictionary, which you would use for your concurrent map:
Write a subclass, say ThreadSafeDictionary, of NSMutabableDictionary. This will allow your thread safe version to be passed anywhere an NSMutableDictionary is accepted.
The subclass should have a private instance of a standard NSMutableDictionary, say actualDictionary
To subclass NSMutableDicitonary you just need to override 2 methods from NSMutableDictionary and 4 methods from NSDictionary. Each of these methods should invoke the same method on actualDictionary after meeting any concurrency requirements.
To handle concurrency requirements the subclass should first create a private concurrent dispatch queue using dispatch_queue_create() and save this in an instance variable, say operationQueue.
For operations which read from the dictionary the override method uses a dispatch_sync() to schedule the read on actualDicitonary and return the value. As operationQueue is concurrent this allows multiple concurrent readers.
For operations which write to the dictionary the override method uses a dispatch_async_barrier() to schedule the write on actualDicitonary. The async means the writer is allowed to continue without waiting for any other writers, and the barrier ensures there are no other concurrent operations mutating the dictionary.
You repeat the above outline to implement the concurrent queue based on one of the other collection types.
If after studying the documentation you get stuck on the design or implementation ask a new question, show what you have, describe the issue, include a link back to this question so people can follow the thread, and someone will undoubtedly help you take the next step.
HTH

Can `SendChannel.offer`, `CompletableDeferred.complete`, and similar be called outside coroutines?

CompletableDeferred documentation says
All functions on this interface and on all interfaces derived from it are thread-safe and can be safely invoked from concurrent coroutines without external synchronization.
Is it safe to call these functions outside any coroutine?
For SendChannel<E>, offer and close are not suspend and so they can be called outside coroutines syntactically, but is it actually safe to do so?
If a coroutine is needed, what is the cheapest way to start one: launch(Unconfined)?
It is safe to call offer and close from anywhere. That is what documentation means to say with "are thread-safe" phrase.
One of the reasons these methods are included into channel APIs is to enable integration of coroutines with the regular non-coroutine world that is based on various callbacks and event handlers. You can see the actual example of such integration in this guide on UI programming with coroutines.