jOOQ's default executor is the ForkJoinPool common pool, or plain unmanaged threads when only one CPU is available.
Since I use a standard blocking JDBC driver (as opposed to an async driver like R2DBC), jOOQ threads will spend most of their time waiting for I/O. It therefore seems advisable to allocate more threads than the ForkJoinPool common pool provides, because its default size seems to be tuned for CPU-intensive work.
I'm using Kotlin coroutines. What would be the best way to integrate jOOQ's executor with the Kotlin Dispatchers.IO thread pool, which has a better default configuration for threads doing blocking I/O?
With Kotlin, the IO dispatcher can be obtained as an executor using the asExecutor() extension function. Therefore, configuring jOOQ to use it is simple:
DSL
    .using(
        DefaultConfiguration()
            .set(ExecutorProvider { Dispatchers.IO.asExecutor() })
    )
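To see what asExecutor() gives you without pulling in jOOQ, here is a minimal sketch that hands the resulting Executor to a plain CompletableFuture. The function blockingQuery is a hypothetical stand-in for a blocking JDBC call, not a jOOQ API; the assumption is that jOOQ's asynchronous methods would submit their work to the configured executor in a similar way.

```kotlin
import kotlinx.coroutines.Dispatchers
import kotlinx.coroutines.asExecutor
import java.util.concurrent.CompletableFuture
import java.util.concurrent.Executor

// Hypothetical stand-in for a blocking JDBC call; not part of jOOQ.
fun blockingQuery(): String {
    Thread.sleep(50) // simulate waiting on I/O
    return "row"
}

// Runs the blocking call on a thread owned by Dispatchers.IO.
fun runOnIoExecutor(): String {
    val ioExecutor: Executor = Dispatchers.IO.asExecutor()
    return CompletableFuture.supplyAsync({ blockingQuery() }, ioExecutor).join()
}

fun main() {
    println(runOnIoExecutor())
}
```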
When using Arrow's IO in an fx block, I can call continueOn with Arrow's dispatchers().io(), with kotlinx's Dispatchers.IO, or mix the two. Is there a preferred way? Is there any difference between the two?
Note: I am also using the coroutines integration to run the IO.
IO.fx {
    effect { _viewState.postValue(ViewState.Loading) }.bind()
    continueOn(dispatchers().io()) // dispatchers from Arrow
    val repositoryDto: RepositoryDto = effect { service.getRepository() }.bind()
    continueOn(Dispatchers.Default) // Dispatchers from Coroutines
    ViewState.Content(repositoryDto)
}
There is no preferred way, and the users can choose to use one, or both together.
Arrow Fx offers a couple of simple Executor-backed pools, similar to other functional effect libraries. These are very efficient with a functional approach.
On the other hand, the KotlinX dispatchers currently offer more features: an EventLoop, a Main dispatcher that is loaded through ServiceLoader, and testing support.
I am also using the coroutines integration to run the IO
The KotlinX integration module for Arrow Fx's IO is only meant for integration with structured concurrency CoroutineScope.
I often create classes that have functions that contain a coroutine. It isn't always clear whether the function is being used by some component that is bound to the UI or whether it's doing background work that is more IO oriented. Here's an example:
fun myFunction() {
    GlobalScope.launch {
        // Do something
    }
}
In this example, neither Dispatchers.Main nor Dispatchers.IO is specified. Is this the correct way to do this? Does the coroutine use the scope of whatever the calling client happens to be using? Should I only specify a dispatcher when I know definitively that I need a specific scope?
GlobalScope binds the lifecycle of the coroutine to the lifecycle of the application itself.
This means that a coroutine started from this scope continues to live until one of two things occurs:
The coroutine completes its job.
The application itself is killed.
Using async or launch on the instance of GlobalScope is highly discouraged.
Neither Dispatchers.Main nor Dispatchers.IO is specified. Is this the correct way to do this?
Yes, why not? If the work inside the coroutine is related to neither UI nor IO, go for it.
Should I only specify a dispatcher when I know definitively that I need a specific scope?
To answer this, let's first look at the definition of launch from the docs:
fun CoroutineScope.launch(
    context: CoroutineContext = EmptyCoroutineContext,
    start: CoroutineStart = CoroutineStart.DEFAULT,
    block: suspend CoroutineScope.() -> Unit
): Job
The dispatcher we are talking about is one kind of CoroutineContext element. As you can see in the definition, if no CoroutineContext is passed (which means no dispatcher is passed either), the parameter defaults to EmptyCoroutineContext. When neither the scope nor the passed context contains a dispatcher, the builder falls back to Dispatchers.Default, and this is what the docs say about it:
The default CoroutineDispatcher that is used by all standard builders
like launch, async, etc if neither a dispatcher nor any other
ContinuationInterceptor is specified in their context.
It is backed by a shared pool of threads on JVM. By default, the
maximum number of threads used by this dispatcher is equal to the
number of CPU cores, but is at least two.
So even if you forget to specify a dispatcher, the scheduler will pick an available thread from the pool and hand it the coroutine. But make sure not to start any UI-related work without specifying the appropriate dispatcher.
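The fallback described above can be checked with a small sketch. It assumes a scope built from EmptyCoroutineContext (so the scope itself contributes no dispatcher) and inspects the ContinuationInterceptor element that the coroutine ends up with; the function name is made up for illustration.

```kotlin
import kotlinx.coroutines.CoroutineScope
import kotlinx.coroutines.Dispatchers
import kotlinx.coroutines.async
import kotlinx.coroutines.runBlocking
import kotlin.coroutines.ContinuationInterceptor
import kotlin.coroutines.EmptyCoroutineContext

// Returns true when a coroutine started without any dispatcher
// ends up with Dispatchers.Default in its context.
fun usesDefaultDispatcher(): Boolean = runBlocking {
    // The scope carries no dispatcher of its own.
    val scope = CoroutineScope(EmptyCoroutineContext)
    scope.async {
        // The dispatcher is stored under the ContinuationInterceptor key.
        coroutineContext[ContinuationInterceptor] === Dispatchers.Default
    }.await()
}

fun main() {
    println(usesDefaultDispatcher()) // prints "true"
}
```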
First of all, you must differentiate the scope from the context and dispatcher.
Coroutine scope is primarily about the lifecycle of the coroutine and deals with the concept of structured concurrency. It may have a default dispatcher, which would be the one logically associated with the object to which you tie the coroutine's lifecycle. For example, if you scope a coroutine to an Android activity, the default dispatcher will be the UI (main-thread) dispatcher.
Coroutine context is where the dispatcher lives. The context may change during the coroutine's execution, as the logic inside requires it. Typically, you will use withContext to temporarily switch dispatchers in order to avoid blocking the UI thread. You will not typically launch the whole coroutine in the thread pool, unless all of it should run on a background thread (e.g., no UI interaction).
Second, the choice of dispatcher should be collocated with the code that requires a specific one. It should happen within the function that deals with a given concern, like making REST requests or DB operations. This once again reinforces the practice not to decide on dispatchers when launching the coroutine.
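A minimal sketch of that practice follows. The names fetchGreetingBlocking and fetchGreeting are hypothetical; the point is only that the function doing the blocking work chooses Dispatchers.IO itself, so callers can launch from any dispatcher without caring.

```kotlin
import kotlinx.coroutines.Dispatchers
import kotlinx.coroutines.runBlocking
import kotlinx.coroutines.withContext

// Hypothetical blocking call, e.g. a REST request or a DB query.
fun fetchGreetingBlocking(name: String): String {
    Thread.sleep(50) // simulate blocking I/O
    return "Hello, $name"
}

// The dispatcher choice lives next to the code that needs it.
suspend fun fetchGreeting(name: String): String =
    withContext(Dispatchers.IO) { fetchGreetingBlocking(name) }

fun main() = runBlocking {
    // The caller does not pick a dispatcher for the I/O part.
    println(fetchGreeting("world")) // prints "Hello, world"
}
```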
GlobalScope has an empty coroutine context, and all coroutines launched with this scope are like daemon threads. They cannot be cancelled through the scope and remain active until their completion. I suggest implementing a specific scope and not using GlobalScope, in order to control all the coroutines that are launched. GlobalScope uses Dispatchers.Default as the default dispatcher, so in your case the coroutines are always created on the default dispatcher.
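As a rough illustration of a dedicated scope, the sketch below creates one with its own Job and cancels everything launched in it with a single call. The SupervisorJob and the choice of Dispatchers.Default are assumptions for the example, not requirements.

```kotlin
import kotlinx.coroutines.CoroutineScope
import kotlinx.coroutines.Dispatchers
import kotlinx.coroutines.SupervisorJob
import kotlinx.coroutines.cancel
import kotlinx.coroutines.delay
import kotlinx.coroutines.launch
import kotlinx.coroutines.runBlocking

// Launches a long-running coroutine in a dedicated scope, then cancels the scope.
fun cancelViaScope(): Boolean = runBlocking {
    val scope = CoroutineScope(SupervisorJob() + Dispatchers.Default)
    val job = scope.launch { delay(60_000) } // would otherwise run for a minute
    scope.cancel()                           // cancels every coroutine in the scope
    job.join()                               // wait until cancellation has finished
    job.isCancelled
}

fun main() {
    println(cancelViaScope()) // prints "true"
}
```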
Astoundingly, Kotlin doesn't seem to provide suspending versions of InputStream and OutputStream.
It's not hard to roll your own, but that doesn't give you the kind of default compatibility with other code that these ubiquitous interfaces provide in Java.
What would I use for suspending stream interfaces in Kotlin if I wanted to maximize interoperability without adapters?
I think that the primary way for doing IO with purely Java APIs is just calling blocking methods in the context of Dispatchers.IO.
However, there is the ktor-io library by JetBrains, which implements IO in a purely suspending way.
Request scope enables us to track per-request variables throughout request processing, but I think it depends on thread-local variables. Will using Kotlin coroutines break the semantics of Guice's request scope?
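The first option (blocking Java calls wrapped in Dispatchers.IO) might look like the sketch below. The extension name readTextSuspending is made up for illustration, and an in-memory stream stands in for a real file or socket.

```kotlin
import kotlinx.coroutines.Dispatchers
import kotlinx.coroutines.runBlocking
import kotlinx.coroutines.withContext
import java.io.ByteArrayInputStream
import java.io.InputStream

// Hypothetical helper: reads the whole stream as UTF-8 text on Dispatchers.IO,
// so the blocking read does not tie up the caller's thread.
suspend fun InputStream.readTextSuspending(): String =
    withContext(Dispatchers.IO) {
        this@readTextSuspending.bufferedReader().use { it.readText() }
    }

fun main() = runBlocking {
    val stream = ByteArrayInputStream("hello".toByteArray())
    println(stream.readTextSuspending()) // prints "hello"
}
```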
Coroutines do not always run on the same thread and therefore you will have problems with thread local variables, e.g. Guice Request Scope.
But it is possible to transfer thread local variables between coroutines: https://github.com/Kotlin/kotlinx.coroutines/blob/master/docs/coroutine-context-and-dispatchers.md#thread-local-data
I don't know Guice and so I don't know if there is a way to integrate ThreadContextElement into this framework.
See also: How to use code that relies on ThreadLocal with Kotlin coroutines
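For plain ThreadLocal values, kotlinx.coroutines provides the asContextElement extension, which installs the value on whichever thread runs the coroutine. The sketch below uses a made-up requestId thread-local; whether Guice's request scope can be wired up the same way is not something this example shows.

```kotlin
import kotlinx.coroutines.Dispatchers
import kotlinx.coroutines.asContextElement
import kotlinx.coroutines.runBlocking
import kotlinx.coroutines.withContext

// Hypothetical per-request value kept in a thread-local.
val requestId = ThreadLocal<String?>()

// Reads the thread-local from a coroutine running on a pool thread.
fun readInsideCoroutine(): String? = runBlocking {
    // The context element sets the value on the worker thread for the
    // duration of the coroutine and restores the old value afterwards.
    withContext(Dispatchers.Default + requestId.asContextElement(value = "req-42")) {
        requestId.get()
    }
}

fun main() {
    println(readInsideCoroutine()) // prints "req-42"
}
```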
Is there a specific language implementation in Kotlin which differs from another language's implementation of coroutines?
What does it mean that a coroutine is like a lightweight thread?
What is the difference?
Are Kotlin coroutines actually running in parallel (concurrently)?
Even in a multi-core system, is there only one coroutine running at any given time?
Here I'm starting 100,000 coroutines. What happens behind this code?
for (i in 0..100000) {
    async(CommonPool) {
        // Run long-running operations
    }
}
What does it mean that a coroutine is like a lightweight thread?
A coroutine, like a thread, represents a sequence of actions that are executed concurrently with other coroutines (threads).
What is the difference?
A thread is directly linked to the native thread in the corresponding OS (operating system) and consumes a considerable amount of resources. In particular, it consumes a lot of memory for its stack. That is why you cannot just create 100k threads: you are likely to run out of memory. Switching between threads involves the OS kernel dispatcher and is a pretty expensive operation in terms of CPU cycles consumed.
A coroutine, on the other hand, is purely a user-level language abstraction. It does not tie up any native resources and, in the simplest case, uses just one relatively small object in the JVM heap. That is why it is easy to create 100k coroutines. Switching between coroutines does not involve the OS kernel at all. It can be as cheap as invoking a regular function.
Are Kotlin coroutines actually running in parallel (concurrently)? Even in a multi-core system, is there only one coroutine running at any given time?
A coroutine can be either running or suspended. A suspended coroutine is not associated with any particular thread, but a running coroutine runs on some thread (using a thread is the only way to execute anything inside an OS process). Whether different coroutines all run on the same thread (and thus may use only a single CPU in a multicore system) or on different threads (and thus may use multiple CPUs) is purely in the hands of the programmer who is using coroutines.
In Kotlin, dispatching of coroutines is controlled via the coroutine context. You can read more about it in the Guide to kotlinx.coroutines.
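The effect of the dispatcher on parallelism can be observed with a sketch like the following, which records the threads that a batch of coroutines runs on. The helper name threadsUsed is made up; the single-thread dispatcher is built from a standard Java executor.

```kotlin
import kotlinx.coroutines.CoroutineDispatcher
import kotlinx.coroutines.Dispatchers
import kotlinx.coroutines.asCoroutineDispatcher
import kotlinx.coroutines.coroutineScope
import kotlinx.coroutines.launch
import kotlinx.coroutines.runBlocking
import java.util.concurrent.ConcurrentHashMap
import java.util.concurrent.Executors

// Launches 1,000 coroutines on the given dispatcher and returns
// the set of distinct threads they actually ran on.
fun threadsUsed(dispatcher: CoroutineDispatcher): Set<Thread> = runBlocking {
    val threads = ConcurrentHashMap.newKeySet<Thread>()
    coroutineScope {
        repeat(1_000) {
            launch(dispatcher) { threads.add(Thread.currentThread()) }
        }
    } // coroutineScope returns only after all children have finished
    threads
}

fun main() {
    // One thread: all coroutines share it, so at most one runs at a time.
    Executors.newSingleThreadExecutor().asCoroutineDispatcher().use { single ->
        println(threadsUsed(single).size) // prints "1"
    }
    // Shared pool: coroutines may run in parallel on several threads.
    println(threadsUsed(Dispatchers.Default).size)
}
```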
Here I'm starting 100,000 coroutines. What happens behind this code?
Assuming that you are using launch function and CommonPool context from the kotlinx.coroutines project (which is open source) you can examine their source code here:
launch is defined here https://github.com/Kotlin/kotlinx.coroutines/blob/master/core/kotlinx-coroutines-core/src/main/kotlin/kotlinx/coroutines/experimental/Builders.kt
CommonPool is defined here https://github.com/Kotlin/kotlinx.coroutines/blob/master/core/kotlinx-coroutines-core/src/main/kotlin/kotlinx/coroutines/experimental/CommonPool.kt
launch just creates a new coroutine, while CommonPool dispatches coroutines to ForkJoinPool.commonPool(), which does use multiple threads and thus executes on multiple CPUs in this example.
The code that follows the launch invocation in {...} is called a suspending lambda. What it is, and how suspending lambdas and functions are implemented (compiled), as well as standard library functions and classes like startCoroutine, suspendCoroutine and CoroutineContext, is explained in the corresponding Kotlin coroutines design document.
Since I used coroutines only on JVM, I will talk about the JVM backend. There are also Kotlin Native and Kotlin JavaScript, but these backends for Kotlin are out of my scope.
So let's start by comparing Kotlin coroutines to other languages' coroutines. Basically, you should know that there are two types of coroutines: stackless and stackful. Kotlin implements stackless coroutines, which means that a coroutine doesn't have its own stack, and that limits a little what a coroutine can do. You can read a good explanation here.
Examples:
Stackless: C#, Scala, Kotlin
Stackful: Quasar, Javaflow
What does it mean that a coroutine is like a lightweight thread?
It means that a coroutine in Kotlin doesn't have its own stack, doesn't map to a native thread, and doesn't require context switching on a processor.
What is the difference?
Thread - preemptive multitasking (usually).
Coroutine - cooperative multitasking.
Thread - managed by the OS (usually).
Coroutine - managed by the user.
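Cooperative multitasking can be illustrated with a sketch in which two coroutines share the single thread of runBlocking and hand control to each other explicitly with yield(). The function name interleave is made up for the example.

```kotlin
import kotlinx.coroutines.launch
import kotlinx.coroutines.runBlocking
import kotlinx.coroutines.yield

// Two coroutines on one thread; each gives up the thread voluntarily via yield().
fun interleave(): List<String> {
    val log = mutableListOf<String>() // safe here: everything runs on one thread
    runBlocking {
        launch { repeat(2) { log += "A$it"; yield() } }
        launch { repeat(2) { log += "B$it"; yield() } }
    }
    return log
}

fun main() {
    println(interleave()) // prints "[A0, B0, A1, B1]"
}
```

Without the yield() calls, the first coroutine would run to completion before the second one started, because nothing preempts it.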
Are Kotlin coroutines actually running in parallel (concurrently)?
It depends. You can run each coroutine in its own thread, or you can run all coroutines in one thread or some fixed thread pool.
More about how coroutines execute is here.
Even in a multi-core system, is there only one coroutine running at any given time?
No, see the previous answer.
Here I'm starting 100,000 coroutines. What happens behind this code?
Actually, it depends. But assume that you write the following code:
fun main(args: Array<String>) {
    for (i in 0..100000) {
        async(CommonPool) {
            delay(1000)
        }
    }
}
This code finishes instantly, because nothing waits for the results of the async calls.
So let's fix this:
fun main(args: Array<String>) = runBlocking {
    for (i in 0..100000) {
        val job = async(CommonPool) {
            delay(1)
            println(i)
        }
        job.join()
    }
}
When you run this program, Kotlin will create about 2 * 100000 instances of Continuation, which will take a few dozen MB of RAM, and in the console you will see the numbers from 0 to 100000.
So let’s rewrite this code in this way:
fun main(args: Array<String>) = runBlocking {
    val job = async(CommonPool) {
        for (i in 0..100000) {
            delay(1)
            println(i)
        }
    }
    job.join()
}
What do we achieve now? Now we create only 100,001 instances of Continuation, and this is much better.
Each created Continuation will be dispatched and executed on CommonPool (which is a static instance of ForkJoinPool).
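The snippets above use the old experimental API, where CommonPool was the shared dispatcher. A rough equivalent for current kotlinx.coroutines releases, assuming Dispatchers.Default as the replacement for CommonPool, could look like this; the helper name runManyCoroutines is made up.

```kotlin
import kotlinx.coroutines.Dispatchers
import kotlinx.coroutines.coroutineScope
import kotlinx.coroutines.delay
import kotlinx.coroutines.launch
import kotlinx.coroutines.runBlocking
import java.util.concurrent.atomic.AtomicInteger

// Starts `count` coroutines on the shared pool and returns how many finished.
fun runManyCoroutines(count: Int): Int = runBlocking {
    val finished = AtomicInteger()
    coroutineScope {
        repeat(count) {
            launch(Dispatchers.Default) {
                delay(10) // suspends without blocking a pool thread
                finished.incrementAndGet()
            }
        }
    } // returns only after every launched coroutine has completed
    finished.get()
}

fun main() {
    println(runManyCoroutines(100_000)) // prints "100000"
}
```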