Kotlin RX .zipWith function body executes once or once per observer? - kotlin

When data is transmitted to my app, it follows a sequence of:
1 ReadStart packet with ID information
1 or more DataPackets that are combined to form the payload
1 ReadDone packet to signal the transfer is done
I have a Kotlin RX function that creates an Observable:
val readStartPublishProcessor: PublishProcessor<ReadStartPacket>
val dataPacketPublishProcessor: PublishProcessor<DataPacket>
val readDonePublishProcessor: PublishProcessor<ReadDonePacket>
...
private fun setupReadListener(): Flowable<ReadEvent> {
val dataFlowable = dataPacketPublishProcessor.buffer(readDonePublishProcessor)
return readStartPublishProcessor
.zipWith(other = dataFlowable) { readStart, dataPackets ->
Log.d(tag, "ReadEvent with ${dataPackets.size} data packets")
ReadEvent(event = readStart, payload = combinePackets(dataPackets))
}
}
From reading the documentation of .zipWith, I expect the .zipWith function body to execute once for each pair of values emitted from the readStartPublishProcessor and dataFlowable, and then pass that computed result to each subscriber:
The Zip method returns an Observable that applies a function of your
choosing to the combination of items emitted, in sequence, by two (or
more) other Observables, with the results of this function becoming
the items emitted by the returned Observable. ... It will only emit as
many items as the number of items emitted by the source Observable
that emits the fewest items.
But if I have more than one observer, I see the .zipWith function body executed once per observer, each time with the same pair of emitted values. This is a problem because the function called from within the .zipWith function body has side effects. (Note: neither the .share nor the .replay operator is used in the observers.)
Why does it seem to run the .zipWith function body for each observer rather than just once, and is there a way to write this so it executes only once regardless of the number of observers?

A couple of points...
The function called from within zipWith should not contain side effects. It should be a pure function. If you absolutely need side effects there, then use one of the do operators.
The observable returned from zipWith is cold (Observables are cold by default), which means that every observer gets its own execution context. That is, the operator subscribes to its source observables each time it is subscribed to, and calls its function block for each subscription.
If you want subscriptions to share an execution context, then you must use the share operator (shorthand for publish().refCount()). Learn more about Hot and Cold Observables here.
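The cold-versus-shared behaviour described above can be checked with a small sketch. It assumes RxJava 3 (the `io.reactivex.rxjava3` package) is on the classpath; `countZipperCalls` and the two processors are hypothetical stand-ins for the question's setup, and the counter stands in for the side effect.

```kotlin
import io.reactivex.rxjava3.processors.PublishProcessor
import java.util.concurrent.atomic.AtomicInteger

// Hypothetical helper: returns how many times the zipper lambda ran
// for one pair of values observed by two subscribers.
fun countZipperCalls(shared: Boolean): Int {
    val starts = PublishProcessor.create<String>()
    val payloads = PublishProcessor.create<Int>()
    val zipperCalls = AtomicInteger()

    val zipped = starts.zipWith(payloads) { start, payload ->
        zipperCalls.incrementAndGet() // stands in for the side effect
        "$start:$payload"
    }
    // share() makes both subscribers reuse a single upstream subscription.
    val source = if (shared) zipped.share() else zipped

    source.subscribe()
    source.subscribe()

    starts.onNext("start")
    payloads.onNext(42)
    return zipperCalls.get()
}
```

Without `share()` each subscriber gets its own zip pipeline, so the lambda runs once per subscriber. With `share()` it runs once and the result is multicast.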

Related

Mono switchIfEmpty emitting twice

I have a scenario where I need a method that returns a string response from an external API. There are two ways to get a valid response (with parameter 1 or parameter 2); if neither response is valid, the method should return a final empty publisher to the chain.
Mono<String> checkResponse(String parameter)
Check if call checkResponse(parameter1) is acceptable, ignore second call (switchIfEmpty) and continue chain, or
Check if call checkResponse(parameter2) is acceptable and continue chain, or
return Mono.empty() and discard the chain
Currently I have
checkResponse(stringArg1)
.switchIfEmpty(checkResponse(stringArg2))
.flatMapMany ...
.flatMap ...
The method:
public Mono<String> checkResponse(String s) {
    return webClient.post()
        .uri(URI)
        .body(BodyInserters.fromValue(s))
        .retrieve()
        .bodyToMono(String.class);
}
But switchIfEmpty is always executing.
Are you sure that it's actually emitting twice?
There are two aspects of Project Reactor that are important to understand:
On Assembly
On Subscription
This code:
checkResponse(stringArg1)
.switchIfEmpty(checkResponse(stringArg2));
will assemble the Monos for both checkResponse calls.
In essence, the checkResponse-method is called twice - however only the Mono returned from the first checkResponse-call will be subscribed to as long as it emits an item.
You can verify this behaviour with this:
checkResponse(stringArg1)
.doOnSubscribe(s -> System.out.println("First checkResponse subscription"))
.switchIfEmpty(checkResponse(stringArg2)
.doOnSubscribe(s -> System.out.println("Second checkResponse subscription"))
);
Something that's very typical of reactive code is that top-level code within a method that returns a Mono/Flux usually executes at assembly time while all the lambdas passed to their various operators such as map/flatMap/concatMap/etc execute at subscription time.
To illustrate:
public Mono<String> getName(int id) {
// Assembly time
System.out.println("This executes at assembly time");
return userRepo.get(id)
.map(user -> {
// Subscription time
System.out.println("This executes at subscription time");
return user.name;
});
}
If assembling the Mono might be expensive while it may never be subscribed to like in your case here, you can defer assembly until subscription-time using Mono.defer:
checkResponse(stringArg1)
.switchIfEmpty(Mono.defer(() -> checkResponse(stringArg2)));
There is a difference between assembly time and subscription time.
Assembly time is when you create your pipeline by building the reactive chain.
Subscription time is when execution is triggered and data starts to flow. Prefer callbacks and lambdas, since they are evaluated lazily.
So your checkResponse() method is called twice at assembly time, because it is a regular method call rather than a lambda, and each call returns a Mono.
You can use Mono.defer(() -> checkResponse()) to delay assembling the inner Mono until it is subscribed to.
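To make the assembly-time versus subscription-time distinction concrete, here is a minimal Kotlin sketch using Reactor. The local `checkResponse` is a hypothetical stand-in for the web call, and the counter only records how often that method body runs.

```kotlin
import reactor.core.publisher.Mono
import java.util.concurrent.atomic.AtomicInteger

// Hypothetical helper: returns how many times checkResponse() was invoked
// when the primary Mono already emits a value.
fun fallbackAssemblies(deferred: Boolean): Int {
    val calls = AtomicInteger()

    // Stand-in for the real web call; the counter runs at assembly time.
    fun checkResponse(value: String): Mono<String> {
        calls.incrementAndGet()
        return Mono.just(value)
    }

    val primary = checkResponse("first")
    val fallback: Mono<String> =
        if (deferred) Mono.defer { checkResponse("second") } // assembled only if subscribed
        else checkResponse("second")                         // assembled immediately

    primary.switchIfEmpty(fallback).block()
    return calls.get()
}
```

In the eager variant the method runs twice even though the fallback is never subscribed to. With `Mono.defer` the second call does not happen at all, because the primary Mono is not empty.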

Unable to assert state flow value in view model

The view model is given below
class ClickRowViewModel @Inject constructor(
private val clickRowRepository: ClickRowRepository
): ViewModel() {
private val _clickRowsFlow = MutableStateFlow<List<ClickRow>>(mutableListOf())
val clickRowsFlow = _clickRowsFlow.asStateFlow()
fun fetchAndInitialiseClickRows() {
viewModelScope.launch {
_clickRowsFlow.update {
clickRowRepository.fetchClickRows()
}
}
}
}
My test is as follows:
I am using InstantTaskExecutorRule as follows
@get:Rule
val instantTaskExecutorRule = InstantTaskExecutorRule()
The actual value never resolves to the expected value even though $result seems to have two elements but the actualValue is an empty list. I don't know what I am doing wrong.
Update
I tried to use the first terminal operator as well but the returned output returns an empty list.
Update # 2
I tried async but I got the following error
kotlinx.coroutines.test.UncompletedCoroutinesError: After waiting for 60000 ms, the test coroutine is not completing, there were active child jobs: [DeferredCoroutine{Active}@a4a38f0]
at kotlinx.coroutines.test.TestBuildersKt__TestBuildersKt$runTestCoroutine$3$3.invokeSuspend(TestBuilders.kt:342)
Update # 3
This test passes in Android Studio, but fails using CLI
You can't call toList on a SharedFlow like that:
Shared flow never completes. A call to Flow.collect on a shared flow never completes normally, and neither does a coroutine started by the Flow.launchIn function.
So calling toList will hang forever, because the flow never hits an end point where it says "ok that's all the elements", and toList needs to return a final value. Since StateFlow only contains one element at a time anyway, and you're not collecting over a period of time, you probably just want take(1).toList().
Or use first() if you don't want the wrapping list, which it seems you don't - each element in the StateFlow is a List<ClickRow>, which is what clickRowRepository.fetchClickRows() returns too. So expectedValue is a List<ClickRow>, whereas actualValue is a List<List<ClickRow>> - so they wouldn't match anyway!
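The difference between the two suggestions can be sketched as follows. This assumes kotlinx.coroutines and uses a hypothetical `MutableStateFlow<List<String>>` in place of the view model's flow.

```kotlin
import kotlinx.coroutines.flow.MutableStateFlow
import kotlinx.coroutines.flow.first
import kotlinx.coroutines.flow.take
import kotlinx.coroutines.flow.toList
import kotlinx.coroutines.runBlocking

// first() returns the current state itself: a List<String>.
fun readCurrent(): List<String> = runBlocking {
    val rows = MutableStateFlow(listOf("a", "b"))
    rows.first()
}

// take(1).toList() completes as well, but wraps the state in another list.
fun readWrapped(): List<List<String>> = runBlocking {
    val rows = MutableStateFlow(listOf("a", "b"))
    rows.take(1).toList()
}
```

Both calls return because the flow is truncated after one element. A bare `toList()` on the same flow would suspend forever.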
Edit: your update (using first()) has a couple of issues.
First of all, the clickRowsFlow StateFlow in your ViewModel only updates when you call fetchAndInitialiseClickRows(), because that's what fetches a value and sets it on the StateFlow. You're not calling that in your second example, so it won't update.
Second, that StateFlow is going to go through two state values, right? The first is the initial empty list, the second is the row contents you get back from the repo. So when you access that StateFlow, it either needs to be after the update has happened, or (better) you need to ignore the first state and only return the second one:
val actualValue = clickRowViewModel.clickRowsFlow
.drop(1) // ignore the initial state
.first() // then take the first result after that
// start the update -after- setting up the flow collection,
// so there's no race condition to worry about
clickRowViewModel.fetchAndInitialiseClickRows()
This way, you subscribe to the StateFlow and immediately get (and drop) the initial state. Then when the update happens, it should push another value to the subscriber, which takes that first new value as its final result.
But there's another complication - because fetchAndInitialiseClickRows() kicks off its own coroutine and returns immediately, that means the fetch-and-update task is running asynchronously. You need to give it time to finish, before you start asserting any results from it.
One option is to start the coroutine and then block waiting for the result to show up:
// start the update
clickRowViewModel.fetchAndInitialiseClickRows()
// run the collection as a blocking operation, which completes when you get
// that second result
val actualValue = clickRowViewModel.clickRowsFlow
.drop(1)
.first()
This works so long as fetchAndInitialiseClickRows doesn't complete immediately. That consumer chain up there requires at least two items to be produced while it's subscribed - if it never gets to see the initial state, it'll hang waiting for that second (really a third) value that's never coming. This introduces a race condition and even if it's "probably fine in practice" it still makes the test brittle.
Your other option is to subscribe first, using a coroutine so that execution can continue, and then start the update - that way the subscriber can see the initial state, and then the update that arrives later:
// async is like launch, but it returns a `Deferred` that produces a result later
val actualValue = async {
clickRowViewModel.clickRowsFlow
.drop(1)
.first()
}
// now you can start the update
clickRowViewModel.fetchAndInitialiseClickRows()
// then use `await` to block until the result is available
assertEquals(expected, actualValue.await())
You always need to make sure you handle waiting on your coroutines, otherwise the test could finish early (i.e. you do your asserting before the results are in). Like in your first example, you're launching a coroutine to populate your list, but not ensuring that has time to complete before you check the list's contents.
In that case you'd have to do something like advanceUntilIdle() - have a look at this section on testing coroutines, it shows you some ways to wait on them. This might also work for the one you're launching with fetchAndInitialiseClickRows (since it says it waits for other coroutines on the scheduler, not the same scope) but I'm not really familiar with it, you could look into it if you like!
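The subscribe-first pattern can be tried outside of a view model with a minimal sketch. It assumes plain `runBlocking` instead of the test dispatcher, and a hypothetical `MutableStateFlow` updated from a separate `launch` in place of `fetchAndInitialiseClickRows()`.

```kotlin
import kotlinx.coroutines.CoroutineStart
import kotlinx.coroutines.async
import kotlinx.coroutines.flow.MutableStateFlow
import kotlinx.coroutines.flow.drop
import kotlinx.coroutines.flow.first
import kotlinx.coroutines.launch
import kotlinx.coroutines.runBlocking

fun awaitSecondState(): List<String> = runBlocking {
    val state = MutableStateFlow<List<String>>(emptyList())

    // Subscribe first. UNDISPATCHED runs the collector right away until it
    // suspends, so it has already seen (and dropped) the initial empty list.
    val result = async(start = CoroutineStart.UNDISPATCHED) {
        state.drop(1).first()
    }

    // Stand-in for the asynchronous update started by the view model.
    launch { state.value = listOf("row1", "row2") }

    // Suspend until the second state arrives.
    result.await()
}
```

Because the collector is subscribed before the update starts, it does not matter how quickly the update completes.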

Determining when a Flow returns no data

The Kotlin Flow documentation states the following:
A suspending function asynchronously returns a single value, but how
can we return multiple asynchronously computed values? This is where
Kotlin Flows come in.
However, if the source of my flow completes without returning any data, is there a way to determine that from the flow? For example, if the source of the flow calls a backend API but the API returns no data, can I detect that the flow has completed with no data?
There is an onEmpty method that will invoke an action if the flow completes without emitting any items.
It can also be used to emit a special value to the flow to indicate that it was empty. For example, if you have a flow of events you can emit an EmptyDataEvent.
Or you can just do whatever it is that you want to do inside this onEmpty lambda.
Closely related is the onCompletion method.
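As a rough illustration of the marker-value idea, the sketch below maps a flow of integers to events and emits a marker when the source completes empty. `ApiEvent`, `DataEvent`, `EmptyDataEvent` and `collectEvents` are hypothetical names chosen for this example.

```kotlin
import kotlinx.coroutines.flow.asFlow
import kotlinx.coroutines.flow.map
import kotlinx.coroutines.flow.onEmpty
import kotlinx.coroutines.flow.toList
import kotlinx.coroutines.runBlocking

sealed class ApiEvent
data class DataEvent(val value: Int) : ApiEvent()
object EmptyDataEvent : ApiEvent()

// Collects the events produced for the given source values; if the source
// completes without emitting anything, a single EmptyDataEvent is emitted.
fun collectEvents(values: List<Int>): List<ApiEvent> = runBlocking {
    values.asFlow()
        .map<Int, ApiEvent> { DataEvent(it) }
        .onEmpty { emit(EmptyDataEvent) }
        .toList()
}
```

Downstream collectors can then handle `EmptyDataEvent` like any other event instead of checking for emptiness separately.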
If the API returns a Flow, it is usually expected to return 0 or more elements. That flow will typically be collected by some piece of code in order to process the values. The collect() call is a suspending function, and will return when the flow completes:
val flow = yourApiCallReturningFlow()
flow.collect { element ->
// process element here
}
// if we're here, the flow has completed
Any other terminal operator like first(), toList(), etc. will handle the flow's completion in some way (and may even cancel the flow early).
I'm not sure about what you're looking for here, but for example there is a terminal operator count:
val flow = yourApiCallReturningFlow()
val hasAnyElement = flow.count() > 0
There is also onEmpty, which lets you emit specific values in place of the empty flow.
I guess it depends on what you want to do when there are items in the flow.
You can just call toList() on the flow and check whether the result is empty.
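If materializing the whole list is not desirable, the fact that collect() returns only once the flow has completed can be used directly. A small sketch, where `sawAnyElement` is a hypothetical helper name:

```kotlin
import kotlinx.coroutines.flow.*
import kotlinx.coroutines.runBlocking

// Processes every element and reports whether the flow emitted anything.
fun sawAnyElement(values: List<Int>): Boolean = runBlocking {
    var seen = false
    values.asFlow().collect { element ->
        seen = true
        // process element here
    }
    // collect() has returned, so the flow has completed.
    seen
}
```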

Why is Flow created on ConflatedBroadcastChannel only able to receive last element?

The following code only prints 10000 i.e. only the last element
val channel = BroadcastChannel<Int>(Channel.CONFLATED)
val flowJob = channel.asFlow().buffer(Channel.UNLIMITED).onEach {
println(it)
}.launchIn(GlobalScope)
for (i in 0..100) {
channel.offer(i*i)
}
flowJob.join()
The code can be run in the playground.
Since the Flow is launched on a separate dispatcher thread, values are sent to the channel, and the Flow has an unlimited buffer, I expected it to receive every element until onEach is invoked. Why is only the last element received?
Is this the expected behavior or a bug? If it is expected, how would one push only the newest elements to the flow, while any flow that has a buffer still receives every element?
Actually, this is about the conflated way of buffering. There are several ways to buffer a flow, such as buffer(), collectLatest(), or conflate(), and each behaves differently. With conflate(), the flow keeps emitting, and when the collector is too slow to keep up, the intermediate values are skipped. This happens even though emission runs in a separate coroutine. A conflated channel behaves in basically the same way.
Here is the official doc explanation:
When a flow represents partial results of the operation or operation
status updates, it may not be necessary to process each value, but
instead, only most recent ones. In this case, the conflate operator
can be used to skip intermediate values when a collector is too slow
to process them.
Check out this link.
The explanation is for flows, but focus on the feature you are using: conflation works the same way for channels and flows.
The problem here is the Channel.CONFLATED. Taken from the docs:
Channel that buffers at most one element and conflates all subsequent `send` and `offer` invocations,
so that the receiver always gets the most recently sent element.
Back-to-back sent elements are _conflated_ -- only the most recently sent element is received,
while previously sent elements **are lost**.
Sender to this channel never suspends and [offer] always returns `true`.
This channel is created by `Channel(Channel.CONFLATED)` factory function invocation.
This implementation is fully lock-free.
So this is why you only get the most recent (last) element. I'd use an UNLIMITED Channel instead:
val channel = Channel<Int>(Channel.UNLIMITED)
val flowJob = channel.consumeAsFlow().onEach {
println(it)
}.launchIn(GlobalScope)
for (i in 0..100) {
channel.offer(i*i)
}
flowJob.join()
As some of the comments stated, Channel.CONFLATED stores only the last value, and that is the channel you are offering to, so the buffer on your flow does not help.
Also, join() suspends until the Job is complete, which in your case never happens; that's why you needed the timeout.
val channel = Channel<Int>(Channel.RENDEZVOUS)
val flowJob = channel.consumeAsFlow().onEach {
println(it)
}.launchIn(GlobalScope)
GlobalScope.launch{
for (i in 0..100) {
channel.send(i * i)
}
channel.close()
}
flowJob.join()
Check out this solution (playground link): with Channel.RENDEZVOUS, your channel accepts a new element only once the previous one has been consumed.
This is why we have to use send instead of offer: send suspends until it can send the element, while offer returns a boolean indicating whether the send was successful.
Finally, we have to close the channel so that join() does not suspend forever.
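The effect of the channel capacity can be compared side by side with a short sketch. It assumes a kotlinx.coroutines version that has `trySend`, the non-suspending replacement for the deprecated `offer`; `receivedWith` is a hypothetical helper name.

```kotlin
import kotlinx.coroutines.channels.Channel
import kotlinx.coroutines.flow.consumeAsFlow
import kotlinx.coroutines.flow.toList
import kotlinx.coroutines.runBlocking

// Sends four squares without suspending, closes the channel, and then
// returns whatever a collector can still receive from it.
fun receivedWith(capacity: Int): List<Int> = runBlocking {
    val channel = Channel<Int>(capacity)
    for (i in 0..3) {
        channel.trySend(i * i)
    }
    channel.close() // lets the flow (and toList) complete
    channel.consumeAsFlow().toList()
}
```

With `Channel.CONFLATED` only the last value survives the burst of sends. With `Channel.UNLIMITED` every value is buffered until the collector reads it.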

How to end / close a MutableSharedFlow?

SharedFlow has just been introduced in coroutines 1.4.0-M1, and it is meant to replace all BroadcastChannel implementations (as stated in the design issue description).
I have a use case where I use a BroadcastChannel to represent incoming web socket frames, so that multiple listeners can "subscribe" to the frames.
The problem I have when I move to a SharedFlow is that I can't "end" the flow when I receive a close frame, or an upstream error (which I would like to do to inform all subscribers that the flow is over).
How can I make all subscriptions terminate when I want to effectively "close" the SharedFlow?
Is there a way to tell the difference between normal closure and closure with exception? (like channels)
If MutableSharedFlow doesn't allow to convey the end of the flow to subscribers, what is the alternative if BroadcastChannel gets deprecated/removed?
The SharedFlow documentation describes what you need:
Note that most terminal operators like Flow.toList would also not complete, when applied to a shared flow, but flow-truncating operators like Flow.take and Flow.takeWhile can be used on a shared flow to turn it into a completing one.
SharedFlow cannot be closed like BroadcastChannel and can never represent a failure. All errors and completion signals should be explicitly materialized if needed.
Basically, you need to introduce a special object that you emit from the shared flow to indicate that the flow has ended; using takeWhile at the consumer end makes consumers collect until that special object is received.
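A minimal sketch of that sentinel approach is shown below. The `Frame` hierarchy and `collectUntilClose` are hypothetical names for this example; a real close marker could also carry an optional cause to distinguish normal closure from failure.

```kotlin
import kotlinx.coroutines.CoroutineStart
import kotlinx.coroutines.async
import kotlinx.coroutines.flow.MutableSharedFlow
import kotlinx.coroutines.flow.filterIsInstance
import kotlinx.coroutines.flow.map
import kotlinx.coroutines.flow.takeWhile
import kotlinx.coroutines.flow.toList
import kotlinx.coroutines.runBlocking

sealed class Frame {
    data class Text(val body: String) : Frame()
    object Close : Frame() // sentinel that marks the end of the stream
}

fun collectUntilClose(): List<String> = runBlocking {
    val frames = MutableSharedFlow<Frame>()

    // Start the subscriber before emitting; UNDISPATCHED runs it up to its
    // first suspension point, so the subscription is already registered.
    val received = async(start = CoroutineStart.UNDISPATCHED) {
        frames
            .takeWhile { it !is Frame.Close } // completes on the sentinel
            .filterIsInstance<Frame.Text>()
            .map { it.body }
            .toList()
    }

    frames.emit(Frame.Text("hello"))
    frames.emit(Frame.Text("world"))
    frames.emit(Frame.Close)
    received.await()
}
```

Each subscriber applies `takeWhile` itself, so the shared flow stays open while individual collections terminate normally.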
I think a possible solution is creating a boolean flag isValid and publicly expose only flows with .takeWhile { isValid }. Then just call isValid = false and sFlow.emit() when you want to close all subscribers.
Possible implementation:
private var isValid = true // In real scenario use atomic boolean
private val _sharedFlow = MutableSharedFlow<Unit>()
val sharedFlow: Flow<Unit> get() = _sharedFlow.takeWhile { isValid }
suspend fun cancelSharedFlow() {
isValid = false
_sharedFlow.emit(Unit)
}
EDIT: In my case .emit() was always suspending, so I had to use BufferOverflow.DROP_LATEST (which is not suitable for many use cases). Not sure if the problem is in this example or elsewhere in my app. If you see a problem, please comment :)