Why doesn't Flux.flatMap() wait for completion of the inner publisher? - kotlin

Could you please explain what exactly happens in the Flux/Mono returned by HttpClient.response()? I thought the value generated by the HTTP client would NOT be passed downstream until the Mono completes, but I see that tons of requests are generated, which ends up with a reactor.netty.internal.shaded.reactor.pool.PoolAcquirePendingLimitException (Pending acquire queue has reached its maximum size of 8). It works as expected (items being processed one by one) if I replace the call to testRequest() with Mono.fromCallable { }.
What am I missing?
Test code:
import org.asynchttpclient.netty.util.ByteBufUtils
import reactor.core.publisher.Flux
import reactor.core.publisher.Mono
import reactor.netty.http.client.HttpClient
import reactor.netty.resources.ConnectionProvider

class Test {
    private val client = HttpClient.create(ConnectionProvider.create("meh", 4))

    fun main() {
        Flux.fromIterable(0..99)
            .flatMap { obj ->
                println("Creating request for: $obj")
                testRequest()
                    .doOnError { ex ->
                        println("Failed request for: $obj")
                        ex.printStackTrace()
                    }
                    .map { res ->
                        obj to res
                    }
            }
            .doOnNext { (obj, res) ->
                println("Created request for: $obj ${res.length} characters")
            }
            .collectList().block()!!
    }

    fun testRequest(): Mono<String> {
        return client.get()
            .uri("https://projectreactor.io/docs/netty/release/reference/index.html#_connection_pool")
            .responseContent()
            .reduce(StringBuilder(), { sb, buf ->
                val str = ByteBufUtils.byteBuf2String(Charsets.UTF_8, buf)
                sb.append(str)
            })
            .map { it.toString() }
    }
}

When you create the ConnectionProvider like this, ConnectionProvider.create("meh", 4), you get a connection pool with max connections 4 and max pending requests 8 (the default pending-queue limit is twice the max connections). See the Reactor Netty reference documentation on the connection pool for more about this.
When you use flatMap, this means: "Transform the elements emitted by this Flux asynchronously into Publishers, then flatten these inner publishers into a single Flux through merging, which allow them to interleave." See the Flux.flatMap Javadoc for more about this.
So what happens is that you are trying to run all requests simultaneously.
You have two options:
If you want to use flatMap, then increase the number of pending requests.
If you want to keep the number of pending requests, you may consider using concatMap instead of flatMap, which means: "Transform the elements emitted by this Flux asynchronously into Publishers, then flatten these inner publishers into a single Flux, sequentially and preserving order using concatenation." See the Flux.concatMap Javadoc for more about this.
Both options are sketched below.
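A minimal sketch of both options, assuming Reactor Netty's ConnectionProvider.Builder API (the concrete limits are illustrative, not recommendations):

// Option 1: keep flatMap but allow a deeper pending-acquire queue on the pool.
private val client = HttpClient.create(
    ConnectionProvider.builder("meh")
        .maxConnections(4)
        .pendingAcquireMaxCount(96) // default is 2 * maxConnections
        .build()
)

// Option 2: bound the concurrency at the Flux level instead.
Flux.fromIterable(0..99)
    .concatMap { obj -> testRequest().map { res -> obj to res } } // strictly one at a time
    // or, for limited parallelism matching the pool size:
    // .flatMap({ obj -> testRequest().map { res -> obj to res } }, 4) // at most 4 in flight

Either way, requests are only subscribed as fast as the pool can serve them, so the pending queue can no longer overflow.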

Related

Send upstream exception in SharedFlow to collectors

I want to achieve the following flow logic in Kotlin (Android):
- Collectors listen to a List<Data> across several screens of my app.
- The source of truth is a database that exposes the data and all changes to it as a flow.
- On first initialization the data should be initialized or updated via a remote API.
- If any API exception occurs, the collectors must be made aware of it.
In my first attempt, the flow was of the type Flow<List<Data>>, with the following logic:
val dataFlow = combine(localDataSource.dataFlow, flow {
    emit(emptyList()) // do not wait for API on first combination
    emit(remoteDataSource.suspendGetDataMightThrow())
}) { (local, remote) ->
    remote.takeUnless { it.isEmpty() }?.let { localDataSource.updateIfChanged(it) }
    local
}.shareIn(externalScope, SharingStarted.Lazily, 1)
This worked fine, except when suspendGetDataMightThrow() throws an exception. Because shareIn stops propagating the exception through the flow, and instead breaks execution of the externalScope, my collectors are never notified about the exception.
My solution was to wrap the data in a Result<>, giving a flow of type Flow<Result<List<Data>>>:
val dataFlow = combine(localDataSource.dataFlow, flow {
    emit(Result.success(emptyList())) // do not wait for API on first combination
    emit(runCatching { remoteDataSource.suspendGetDataMightThrow() })
}) { (local, remote) ->
    remote.onSuccess { data ->
        data.takeUnless { it.isEmpty() }?.let { localDataSource.updateIfChanged(it) }
    }
    if (remote.isFailure) remote else local
}.shareIn(externalScope, SharingStarted.Lazily, 1)
I can now collect it as follows, and the exception is passed to the collectors:
dataRepository.dataFlow
    .map { it.getOrThrow() }
    .catch {
        // ...
    }
    .collect {
        // ...
    }
Is there a less verbose solution to obtain the exception than wrapping the whole thing in a Result?
I am aware that there are other issues with the code (one API failure is emitted forever). This is only a proof-of-concept to get the error handling working.
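For what it's worth, the unwrapping boilerplate at each collection site can be factored into a small extension function; this is just a sketch derived from the code above (throwOnFailure is a hypothetical name, not a library API):

import kotlinx.coroutines.flow.*

// Unwrap Result values, rethrowing failures into the regular Flow exception
// path so collectors can keep using the familiar catch {} operator.
fun <T> Flow<Result<T>>.throwOnFailure(): Flow<T> = map { it.getOrThrow() }

// The collection site then shrinks to:
dataRepository.dataFlow
    .throwOnFailure()
    .catch { /* notify about the API failure */ }
    .collect { /* render the data */ }

The Result wrapper itself appears hard to avoid: shareIn can only replay values, not exceptions, so the error has to travel downstream as a value.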

Why can a Flow emit both Int and String values in Kotlin?

As you know, an Array or a List can only store elements of a single type.
I run Code A and get Result A.
It seems that the Flow can emit both an Int value and a String value. Why?
Code A
import kotlinx.coroutines.*
import kotlinx.coroutines.flow.*

suspend fun performRequest(request: Int): Int {
    delay(1000) // imitate long-running asynchronous work
    return request
}

fun main() = runBlocking<Unit> {
    (1..3).asFlow() // a flow of requests
        .transform { request ->
            emit("Making request $request")
            if (request > 1) {
                emit(performRequest(request))
            }
        }
        .collect { response -> println(response) }
}
Result A
Making request 1
Making request 2
2
Making request 3
3
This is not a question about Flow but about Java/Kotlin generics and type safety.
The type this flow returns is Comparable<*>, the closest common supertype of Int and String:

val flow: Flow<Comparable<*>> = (1..3).asFlow() // a flow of requests
    .transform { request ->
        emit("Making request $request")
        if (request > 1) {
            emit(performRequest(request))
        }
    }

If you explicitly specify which type you want the Flow to return, you can restrict the emitted values.
About generics, you can check any document about generics in Java/Kotlin; for type safety you can refer to this question.
Also, when you are in doubt about what the inferred type is, press Alt+Enter in Android Studio to see the available options and select "Specify type explicitly".
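For example, pinning the flow's type parameter to String turns the Int emission into a compile-time error (a sketch; performRequest is the function from Code A):

val flow: Flow<String> = (1..3).asFlow()
    .transform { request ->
        emit("Making request $request")
        // emit(performRequest(request)) // would not compile: Int is not a String
    }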
Disregarding the nature of this request, you can have the functionality you want by making your flow emit instances of some algebraic data type that is basically a "sum" (from the type-theoretic POV) of your constituent types:
sealed interface Record
data class IntData(val get: Int) : Record
data class Metadata(val get: String) : Record

// somewhere later (the flow is of type Flow<Record>)
fun main() = runBlocking<Unit> {
    (1..3).asFlow() // a flow of requests
        .transform { request ->
            emit(Metadata("Making request $request"))
            if (request > 1) {
                emit(IntData(performRequest(request)))
            }
            // probably want to handle the `else` case too
        }
        .collect { response -> println(response) }
}
This would be a good solution since it's extensible (i.e. you can add the other cases later on if you need to).
In your specific case though, since you just want to debug the flow, you might not want to actually emit the "metadata" and instead just test your code directly.

How to make several synchronous calls of an RxJava Single

I have difficulties making sequential calls of an RxJava Single observable. What I mean is that I have a function that makes an HTTP request using Retrofit and returns a Single.
fun loadFriends(): Single<List<Friend>> {
    Log.d("msg", "make http request")
    return webService.getFriends()
}
and if I subscribe from several places at the same time:
loadFriends().subscribeOn(Schedulers.io()).subscribe()
loadFriends().subscribeOn(Schedulers.io()).subscribe()
I want loadFriends() to make only one HTTP request, but in this case I get two HTTP requests.
I know how to solve this problem in a blocking way:
The solution is to make loadFriends() blocking.

private val lock = Object()
private var inMemoryCache: List<Friend>? = null

fun loadFriends(): Single<List<Friend>> {
    return Single.fromCallable {
        if (inMemoryCache == null) {
            synchronized(lock) {
                if (inMemoryCache == null) {
                    inMemoryCache = webService.getFriends().blockingGet()
                }
            }
        }
        inMemoryCache!!
    }
}

But I want to solve this problem in a reactive way.
You can remedy this by creating one common source for all your consumers to subscribe to, and that source will have the cache() operator invoked against it. The effect of this operator is that the first subscriber's subscription will be delegated downstream (i.e. the network request will be invoked), and subsequent subscribers will see internally cached results produced as a result of that first subscription.
This might look something like this:
class Friends {
    private val friendsSource by lazy { webService.getFriends().cache() }

    fun someFunction() {
        // 1st subscription - friends will be fetched from the network
        friendsSource
            .subscribeOn(Schedulers.io())
            .subscribe()

        // 2nd subscription - friends will be fetched from the internal cache
        friendsSource
            .subscribeOn(Schedulers.io())
            .subscribe()
    }
}
Note that the cache is indefinite, so if periodically refreshing the list of friends is important you'll need to come up with a way to do so.
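For example (just a sketch: WebService is a stand-in name for whatever provides getFriends(), and the refresh policy is up to you), you could hold the cached Single in an AtomicReference and swap it out whenever fresh data is wanted:

import java.util.concurrent.atomic.AtomicReference

class Friends(private val webService: WebService) {
    // Holds the currently cached source; cache() replays the first result
    // to every later subscriber.
    private val friendsSource = AtomicReference(webService.getFriends().cache())

    fun friends(): Single<List<Friend>> = friendsSource.get()

    fun refresh() {
        // Subscribers from now on trigger (and then share) a fresh network call.
        friendsSource.set(webService.getFriends().cache())
    }
}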

Should Flux.then and Mono.then behave differently in the error case?

I encountered a case where I have a nested Flux. I don't care about the individual results of the inner Flux, as it returns Unit (in Kotlin; Void in Java), but I want to know whether the Flux aborted due to an error or not. I thought I could use the then function, as the doc states: "Error signal is replayed in the resulting Mono<V>".
My problem can be reduced to the minimum (Kotlin) unit test:
@Test
fun fluxTest() {
    val flux = Flux.just("willFail", "willSucceed")
        .flatMap { outer ->
            // In my real world example the inner flux is created via
            // Flux.fromIterable from a property of the `outer` object
            Flux.just(1)
                .flatMap { inner ->
                    // this simulates a Mono.fromSupplier that can throw exceptions
                    if (outer == "willFail") Mono.error<Unit>(RuntimeException("bam"))
                    else Mono.just(Unit)
                }
                // We don't care about the Flux as it returns Unit/Void.
                // All we want to know is whether there was an error or not.
                .then(Mono.just(outer))
        }
        .onErrorContinue { error, item -> println("$item => $error") }
        .collectList()

    StepVerifier.create(flux)
        .expectNextMatches { it.size == 1 }
        .verifyComplete()
}
So we have 2 elements. In the inner Flux, one of the elements will fail during processing and the other won't. I expect the error to propagate through the pipeline, where it is caught and discarded in the onErrorContinue.
Therefore I'd expect 1 element in the resulting list, but I get the original 2. I have no clue why.
Now comes the fun part: in this particular test case, I can replace Flux.just(1) with Mono.just(1) (in my real world case this doesn't work, of course, because the flux has more than 1 element) and suddenly my test passes:
@Test
fun fluxTest() {
    val flux = Flux.just("willFail", "willSucceed")
        .flatMap { outer ->
            // In my real world example the inner flux is created via
            // Flux.fromIterable from a property of the `outer` object
            Mono.just(1)
                .flatMap { inner ->
                    // this simulates a Mono.fromSupplier that can throw exceptions
                    if (outer == "willFail") Mono.error<Unit>(RuntimeException("bam"))
                    else Mono.just(Unit)
                }
                // We don't care about the Flux as it returns Unit/Void.
                // All we want to know is whether there was an error or not.
                .then(Mono.just(outer))
        }
        .onErrorContinue { error, item -> println("$item => $error") }
        .collectList()

    StepVerifier.create(flux)
        .expectNextMatches { it.size == 1 }
        .verifyComplete()
}
So obviously there is a difference between Mono.then(Mono<T>) and Flux.then(Mono<T>), but there shouldn't be, since the Javadoc is the same, right?
Side note: instead of Flux.then(Mono.just(outer)) I also tried Mono.defer, but that doesn't change anything.

Kotlin Coroutines - unlimited stream to fan out batches

I'm looking to implement a pipeline for processing an infinite stream of messages. I'm new to coroutines and trying to follow along with the docs, but I'm not confident I'm doing the right thing.
My infinite stream consists of batches of records, and I'd like to fan out the processing of each record to a coroutine, then wait for the batch to finish (to log stats and stuff) before continuing to the next batch.
-> process [record] \
source -> [records] -> process [record] -> [log batch stats]
-> process [record] /
|------------------- while(true) -------------------|
What I had planned is to have 2 Channels, one for the infinite stream, and one for the intermediate records that will fill up and empty on each batch.
runBlocking {
    val infinite: Channel<List<Record>> = produce { send(source.getBatch()) }
    val records = Channel<Record>(Channel.Factory.UNLIMITED)

    while (true) {
        infinite.receive().forEach { records.send(it) }
        while (!records.isEmpty()) {
            launch { process(records.receive()) }
        }
        // ??? Wait for jobs?
        logBatchStats()
    }
}
From googling, it seems that waiting for jobs is discouraged, plus I wasn't sure if calling .map on a channel will actually receive messages to convert them to jobs:
records.map { record -> launch { process(record) } }
yields a Channel<Job>. It seems I can call .toList() on it to collapse it, but then I need to join the jobs? Again, Google suggested doing that by having a parent job, but I'm not really sure how to do that with launch.
Anyway, very much a n00b to this.
Thanks for the help.
I don't see a reason to have two channels: you can iterate directly over the list of records. And you should use async instead of launch; then you can use await, or even better awaitAll, on the list of results.
val infinite: ReceiveChannel<List<Record>> = produce { ... }

while (true) {
    val resultsDeferred = infinite.receive().map {
        async {
            process(it)
        }
    }
    val results = resultsDeferred.awaitAll()
    logBatchStats()
}