I'd like to organize a thread barrier: given a single lock object, any thread can acquire it and continue its chain, while every other thread stays dormant on the same lock object until the first thread finishes and releases the lock.
Let's express my intention in code (log() simply prints a string to the log):
val mutex = Semaphore(1) // number of permits is 1

source
    .subscribeOn(Schedulers.newThread()) // any unbounded scheduler (io, newThread)
    .flatMap {
        log("#1")
        mutex.acquireUninterruptibly()
        log("#2")
        innerSource
            .doOnSubscribe { log("#3") }
            .doFinally {
                mutex.release()
                log("#4")
            }
    }
    .subscribe()
It actually works well: I can see multiple threads log "#1" and only one of them propagate further, having acquired the lock object mutex; then it releases the lock, I see the other logs, and the next thread comes into play. OK.
But sometimes, when the pressure is quite high and the number of threads grows to, say, 4-5, I experience a DEADLOCK:
The thread that has acquired the lock prints "#1" and "#2" but then never prints "#3" (so doOnSubscribe() is not called); it just stops and does nothing, never subscribing to innerSource inside flatMap. So all the threads are blocked and the app is not responsive at all.
My question: is it safe to have a blocking operation inside flatMap? I dug into the flatMap source code and found the place where it subscribes internally:
if (!isDisposed()) {
    o.subscribe(new FlatMapSingleObserver<R>(this, downstream));
}
Is it possible that the subscription of the thread that acquired the lock was somehow disposed?
You can use flatMap's second parameter, maxConcurrency, and set it to 1, so it does what you want without manual locking.
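A minimal sketch of that approach, assuming the same source and innerSource as above:

source
    .subscribeOn(Schedulers.newThread())
    .flatMap(
        { innerSource },
        1 // maxConcurrency = 1: at most one inner subscription at a time
    )
    .subscribe()

With maxConcurrency = 1, flatMap keeps further inner sources queued internally until the active one terminates, so no explicit lock is needed.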
This question already has answers here: how to cap kotlin coroutines maximum concurrency (7 answers). Closed last month.
I have code, something like this:
entities.forEach {
    launch {
        doingSomethingWithDB(it)
    }
}

suspend fun doingSomethingWithDB(entity: Entity) {
    getDBConnectionFromPool()
    // doing something
    returnDBConnectionToPool()
}
And when the number of entities exceeds the size of the DB connection pool (I use HikariCP), I get the error Connection is not available.... Even if I use only a single thread (e.g. -Dkotlinx.coroutines.io.parallelism=1), I still get this error.
Are there best practices for limiting the number of parallel coroutines when dealing with external resources (like a fixed-size DB connection pool)?
Since your doingSomethingWithDB() acquires and releases resources manually at its beginning/end, limiting the parallelism is not sufficient in this case - we need to limit the concurrency. The easiest way to do this is with a Semaphore:
val semaphore = Semaphore(8) // kotlinx.coroutines.sync.Semaphore

suspend fun doingSomethingWithDB(entity: Entity) {
    semaphore.withPermit {
        getDBConnectionFromPool()
        // doing something
        returnDBConnectionToPool()
    }
}
A few words of explanation: because coroutines can suspend and switch from thread to thread, even if we limit the parallelism of the coroutines that invoke doingSomethingWithDB(), the function can still be invoked an arbitrary number of times concurrently. Parallelism only determines how many coroutines can be actively executing at a specific moment in time; whenever one of them suspends, additional coroutines can proceed.
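To make this concrete, here is a toy illustration (hypothetical names, not the asker's code; assumes kotlinx.coroutines 1.6+ for limitedParallelism): even with parallelism capped at a single thread, all 100 coroutines end up inside the function at once, because each one suspends and frees the thread for the next.

import kotlinx.coroutines.*

fun main() = runBlocking {
    val oneThread = Dispatchers.Default.limitedParallelism(1) // parallelism = 1
    var inside = 0 // only mutated from oneThread, so plain increments are safe
    val jobs = (1..100).map {
        launch(oneThread) {
            inside++    // this coroutine is now "inside"
            delay(100)  // suspension point: the single thread is released here
            println("coroutines currently inside: $inside") // climbs up to 100
            inside--
        }
    }
    jobs.joinAll()
}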
The view model is given below
class ClickRowViewModel @Inject constructor(
    private val clickRowRepository: ClickRowRepository
) : ViewModel() {

    private val _clickRowsFlow = MutableStateFlow<List<ClickRow>>(mutableListOf())
    val clickRowsFlow = _clickRowsFlow.asStateFlow()

    fun fetchAndInitialiseClickRows() {
        viewModelScope.launch {
            _clickRowsFlow.update {
                clickRowRepository.fetchClickRows()
            }
        }
    }
}
My test is as follows:
I am using InstantTaskExecutorRule as follows:
@get:Rule
val instantTaskExecutorRule = InstantTaskExecutorRule()
The actual value never resolves to the expected value: $result seems to have two elements, but actualValue is an empty list. I don't know what I am doing wrong.
Update
I tried to use the first terminal operator as well, but the output is still an empty list.
Update # 2
I tried async but I got the following error
kotlinx.coroutines.test.UncompletedCoroutinesError: After waiting for 60000 ms, the test coroutine is not completing, there were active child jobs: [DeferredCoroutine{Active}#a4a38f0]
at kotlinx.coroutines.test.TestBuildersKt__TestBuildersKt$runTestCoroutine$3$3.invokeSuspend(TestBuilders.kt:342)
Update # 3
This test passes in Android Studio, but fails using CLI
You can't call toList on a SharedFlow like that:
Shared flow never completes. A call to Flow.collect on a shared flow never completes normally, and neither does a coroutine started by the Flow.launchIn function.
So calling toList will hang forever, because the flow never reaches an end point where it says "OK, that's all the elements", and toList needs to return a final value. Since a StateFlow only contains one element at a time anyway, and you're not collecting over a period of time, you probably just want take(1).toList().
Or use first() if you don't want the wrapping list - which it seems you don't: each element in the StateFlow is a List<ClickRow>, which is what clickRowRepository.fetchClickRows() returns too. So expectedValue is a List<ClickRow>, whereas actualValue is a List<List<ClickRow>> - they wouldn't match anyway!
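For instance (a sketch against the ViewModel above; the val names are just for illustration):

// take(1).toList() completes with a one-element list: a List<List<ClickRow>>
val wrapped = clickRowViewModel.clickRowsFlow.take(1).toList()

// first() completes with the current value: a List<ClickRow>,
// directly comparable to what fetchClickRows() returns
val current = clickRowViewModel.clickRowsFlow.first()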
Edit: your update (using first()) has a couple of issues.
First of all, the clickRowsFlow StateFlow in your ViewModel only updates when you call fetchAndInitialiseClickRows(), because that's what fetches a value and sets it on the StateFlow. You're not calling that in your second example, so it won't update.
Second, that StateFlow is going to go through two state values, right? The first is the initial empty list, the second is the row contents you get back from the repo. So when you access that StateFlow, it either needs to be after the update has happened, or (better) you need to ignore the first state and only return the second one:
val actualValue = clickRowViewModel.clickRowsFlow
    .drop(1) // ignore the initial state
    .first() // then take the first result after that

// start the update -after- setting up the flow collection,
// so there's no race condition to worry about
clickRowViewModel.fetchAndInitialiseClickRows()
This way, you subscribe to the StateFlow and immediately get (and drop) the initial state. Then when the update happens, it should push another value to the subscriber, which takes that first new value as its final result.
But there's another complication - because fetchAndInitialiseClickRows() kicks off its own coroutine and returns immediately, that means the fetch-and-update task is running asynchronously. You need to give it time to finish, before you start asserting any results from it.
One option is to start the coroutine and then block waiting for the result to show up:
// start the update
clickRowViewModel.fetchAndInitialiseClickRows()

// run the collection as a blocking operation, which completes when you get
// that second result
val actualValue = clickRowViewModel.clickRowsFlow
    .drop(1)
    .first()
This works as long as fetchAndInitialiseClickRows doesn't complete immediately. The consumer chain above requires at least two items to be produced while it's subscribed - if it never gets to see the initial state, it'll hang waiting for a second (really a third) value that's never coming. That introduces a race condition, and even if it's "probably fine in practice", it still makes the test brittle.
Your other option is to subscribe first, using a coroutine so that execution can continue, and then start the update - that way the subscriber can see the initial state, and then the update that arrives later:
// async is like launch, but it returns a `Deferred` that produces a result later
val actualValue = async {
    clickRowViewModel.clickRowsFlow
        .drop(1)
        .first()
}

// now you can start the update
clickRowViewModel.fetchAndInitialiseClickRows()

// then use `await` to suspend until the result is available
assertEquals(expected, actualValue.await())
You always need to make sure you wait on your coroutines, otherwise the test could finish early (i.e. you do your asserting before the results are in). In your first example, you're launching a coroutine to populate your list, but not ensuring it has time to complete before you check the list's contents.
In that case you'd have to do something like advanceUntilIdle() - have a look at this section on testing coroutines; it shows you some ways to wait on them. This might also work for the coroutine you're launching with fetchAndInitialiseClickRows (since it says it waits for other coroutines on the scheduler, not just the same scope), but I'm not really familiar with it, so look into it if you like!
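For reference, a rough sketch of that shape (fakeRepository is a hypothetical test double, and Dispatchers.setMain is needed so viewModelScope runs on the test scheduler):

@Test
fun fetchAndInitialiseClickRows_updatesState() = runTest {
    // route viewModelScope onto the test scheduler
    Dispatchers.setMain(StandardTestDispatcher(testScheduler))
    try {
        val viewModel = ClickRowViewModel(fakeRepository)
        viewModel.fetchAndInitialiseClickRows()
        advanceUntilIdle() // let the launched coroutine finish before asserting
        assertEquals(fakeRepository.rows, viewModel.clickRowsFlow.value)
    } finally {
        Dispatchers.resetMain()
    }
}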
The following code only prints 10000, i.e. only the last element:
val channel = BroadcastChannel<Int>(Channel.CONFLATED)
val flowJob = channel.asFlow().buffer(Channel.UNLIMITED).onEach {
    println(it)
}.launchIn(GlobalScope)

for (i in 0..100) {
    channel.offer(i * i)
}
flowJob.join()
The code can be run in the playground.
But since the Flow is launched on a separate dispatcher thread, the values are sent to the Channel, and the Flow has an unlimited buffer, it should receive every element by the time onEach is invoked. Why is only the last element received?
Is this expected behavior or a bug? If it's expected behavior, how would one push only the newest elements to the flow while still letting every collector that has a buffer receive them?
Actually, this is about the "conflate" way of buffering. For buffering a flow you have a couple of options, such as the buffer() method, collectLatest(), or conflate(), and each has its own buffering strategy. conflate() works like this: when the flow emits values and the collector is too slow, the intermediate values are skipped for the sake of the flow - and this happens even though every value is emitted in a separate coroutine. In a channel, basically the same thing is happening.
Here is the official doc explanation:
When a flow represents partial results of the operation or operation status updates, it may not be necessary to process each value, but instead, only most recent ones. In this case, the conflate operator can be used to skip intermediate values when a collector is too slow to process them.
Check out this link. The explanation is for flows, but the feature you are using is the same one - conflation behaves the same for channels and flows.
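A small runnable illustration of that (a sketch, not the asker's code): with a slow collector, conflate() drops the intermediate values.

import kotlinx.coroutines.*
import kotlinx.coroutines.flow.*

fun main() = runBlocking {
    (1..5).asFlow()
        .onEach { delay(10) }  // fast emitter
        .conflate()            // keep only the latest value for the slow collector
        .collect {
            delay(100)         // slow collector
            println(it)        // intermediate values are skipped, e.g. prints 1 then 5
        }
}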
The problem here is the Channel.CONFLATED. Taken from the docs:
Channel that buffers at most one element and conflates all subsequent `send` and `offer` invocations, so that the receiver always gets the most recently sent element. Back-to-back sent elements are _conflated_ -- only the most recently sent element is received, while previously sent elements **are lost**. Sender to this channel never suspends and [offer] always returns `true`.
This channel is created by `Channel(Channel.CONFLATED)` factory function invocation.
This implementation is fully lock-free.
So this is why you only get the most recent (last) element. I'd use an UNLIMITED channel instead:
val channel = Channel<Int>(Channel.UNLIMITED)
val flowJob = channel.consumeAsFlow().onEach {
    println(it)
}.launchIn(GlobalScope)

for (i in 0..100) {
    channel.offer(i * i)
}
flowJob.join()
As some of the comments stated, using Channel.CONFLATED stores only the most recent value, and you are offering to the channel even though your flow has a buffer.
Also, join() suspends until the Job completes - in your case, forever; that's why you needed the timeout.
val channel = Channel<Int>(Channel.RENDEZVOUS)
val flowJob = channel.consumeAsFlow().onEach {
    println(it)
}.launchIn(GlobalScope)

GlobalScope.launch {
    for (i in 0..100) {
        channel.send(i * i)
    }
    channel.close()
}
flowJob.join()
Check out this solution (playground link): with Channel.RENDEZVOUS your channel will accept new elements only when the previous ones have already been consumed.
This is why we have to use send instead of offer: send suspends until it can deliver the element, while offer just returns a boolean indicating whether the send was successful.
Lastly, we have to close the channel so that join() doesn't suspend for eternity.
I'm using a Kotlin channel to migrate a database: I have one producer and multiple processors which write to the database. The producer just sends batches of documents to the channel:
fun CoroutineScope.produceDocumentBatches(mongoCollection: MongoCollection<Document>) = produce<List<Document>> {
    var batch = arrayListOf<Document>()
    for ((counter, document) in mongoCollection.find().withIndex()) {
        if ((counter + 1) % 100 == 0) {
            sendBlocking(batch)
            batch = arrayListOf()
        }
        batch.add(document)
    }
    if (batch.isNotEmpty()) sendBlocking(batch)
}
This is what my processors look like:
private fun CoroutineScope.processDocumentsAsync(
    documentDbCollection: MongoCollection<Document>,
    channel: ReceiveChannel<List<Document>>,
    numberOfProcessedDocuments: AtomicInteger
) = launch(Dispatchers.IO) {
    // do processing
}
And this is how I use them in the script:
fun run() = runBlocking {
    val producer = produceDocumentBatches(mongoCollection)
    (1..64).map { processDocumentsAsync(documentDbCollection, producer, count) }
}
So is it fine to use sendBlocking with regard to performance? If I use just send, I create many suspending functions inside one coroutine, because writes to the database are much slower than reads, and I get java.lang.OutOfMemoryError: Java heap space. Do I understand correctly that the producer blocks the main thread, but that's fine for performance because all the consumers run on IO threads?
Maybe my understanding is not correct, but I think you are not creating many suspending functions inside one coroutine - I'm not sure it even makes sense to put it that way. As you haven't defined a capacity when calling produce, the default is zero and it uses a rendezvous channel: send suspends until another coroutine invokes receive. I don't think you need sendBlocking.
My guess is that you are using too many consumers and have too many Document instances in memory, which could be the reason for the OutOfMemoryError. What is the size of each Document? What are your JVM heap settings?
My suggestions:
Use send instead of sendBlocking.
Decrease the number of consumers to a much smaller number and see if the OutOfMemoryError persists.
Use clear() to erase the ArrayList instead of creating a new instance (see the sketch after this list - if the list is reused, send a copy of it).
If it works fine after these suggestions, try to increase the number of consumers and check if everything still works fine.
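Putting these together, a sketch of the producer (same names as in the question; note that when the list is reused, a copy has to be sent, otherwise the consumer would observe the clear()):

fun CoroutineScope.produceDocumentBatches(mongoCollection: MongoCollection<Document>) =
    produce<List<Document>>(capacity = 16) { // optional small buffer; 16 is an arbitrary choice
        val batch = arrayListOf<Document>()
        for (document in mongoCollection.find()) {
            batch.add(document)
            if (batch.size == 100) {
                send(batch.toList()) // send an immutable copy...
                batch.clear()        // ...so the reused list can be cleared safely
            }
        }
        if (batch.isNotEmpty()) send(batch.toList())
    }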
I'm writing a native WebRTC application that forwards pre-encoded frames for a client right now. That part is all fine, but I'm having segfault issues every time I attempt to exit my application, specifically with regard to how I'm destroying my WebRtcPeerConnectionFactory.
I'm instantiating this object by first launching separate threads for networking, signaling, and working respectively. In my destructor I kill these threads by calling thread->Quit() before setting my webRtcPeerConnectionFactory to a nullptr (as I've seen examples in the source code do in their conductor.cc files), but I either segfault or hang indefinitely depending on the order in which those two actions are taken.
On a high level, is there a correct way to gracefully destroy the factory object, or is there some cleanup function I'm not calling? I can't find any other examples online that take advantage of the WebRTC threading model, so I'm not sure where to go from here. Thanks!
My instantiation of the object is performed like so:
rtc_network_thread_ = rtc::Thread::CreateWithSocketServer();
rtc_worker_thread_ = rtc::Thread::Create();
rtc_signaling_thread_ = rtc::Thread::Create();

if (!rtc_network_thread_->Start() || !rtc_worker_thread_->Start() || !rtc_signaling_thread_->Start()) {
    // error handling
}

peer_connection_factory_ = webrtc::CreatePeerConnectionFactory(
    rtc_network_thread_.get(), rtc_worker_thread_.get(), rtc_signaling_thread_.get(),
    nullptr, webrtc::CreateBuiltInAudioEncoderFactory(), webrtc::CreateBuiltInAudioDecoderFactory(),
    dummy_encoder_factory_.get(), nullptr);
And my subsequent cleanup looks like this:
rtc_worker_thread_->Quit();
rtc_network_thread_->Quit();
rtc_signaling_thread_->Quit();

if (peer_connection_factory_) {
    // errors occur here, either with setting to nullptr or with threads
    // quitting if I quit the threads after setting my factory to a nullptr
    peer_connection_factory_ = nullptr;
}