Does collect return a list snapshot if run on a parallel stream?

I have a unit test that started to fail on Circle CI only. It fails on the last line in this (Kotlin) example:
generator.generateNames(50) // returns List<String>
    .parallelStream()
    .map { name ->
        val playerId = "${name.firstName.toLowerCase()}"
        Player(playerId = playerId)
    }.collect(Collectors.toList()).last()
throwing: Caused by: java.util.NoSuchElementException.
It always works on my local machine, and on Circle CI as long as I do not use a parallel stream. My theory is that the collect call returns a List snapshot (that is, it doesn't block until the List is completely filled) and that CI doesn't have enough CPU to collect a single element in the other threads.
However, my stream is ordered, and so is the Collector, right? Is this even collecting in parallel?

The exception you are getting probably carries a message, not just the exception's name, and that message likely tells you what went wrong. For example, the last part of your code calls the Kotlin extension function last(), which is implemented as:
public fun <T> List<T>.last(): T {
    if (isEmpty())
        throw NoSuchElementException("List is empty.")
    return this[lastIndex]
}
So if you see the message "List is empty." in the stack trace for java.util.NoSuchElementException, that is the cause.
Also, if you share the stack trace, we can see exactly what is throwing the exception. Looking at your code, though, this is the only likely candidate.
The question then is: why is the final list empty? Is generateNames(50) working differently in this environment? The problem is not with collect(Collectors.toList()), which is synchronous: it returns only after the whole stream has been processed.
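To check the claim about collect, here is a small Java sketch. The class and method names are made up for illustration, and the generated names are a hypothetical stand-in for generateNames. It suggests that collect(Collectors.toList()) on a parallel stream returns only once every element is in the list, in encounter order, and that an empty result can only come from an empty source:

```java
import java.util.List;
import java.util.stream.Collectors;
import java.util.stream.IntStream;

class ParallelCollectDemo {
    // Hypothetical stand-in for generator.generateNames(count).
    static List<String> generateNames(int count) {
        return IntStream.range(0, count)
                .mapToObj(i -> "Name" + i)
                .collect(Collectors.toList());
    }

    // Mirrors the pipeline from the question: map in parallel, then collect.
    static List<String> playerIds(int count) {
        return generateNames(count)
                .parallelStream()
                .map(String::toLowerCase)
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        List<String> ids = playerIds(50);
        // collect has already finished here, so the list is complete.
        System.out.println(ids.size() + " ids, last = " + ids.get(ids.size() - 1));
        // Only an empty source produces an empty list.
        System.out.println(playerIds(0).isEmpty());
    }
}
```

If a check like this passes on CI as well, the thing to inspect is what generateNames returns there, not the collector.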

Related

Mono switchIfEmpty emitting twice

I have a scenario where I need to call a method that returns a string response from an external API. There are two possible outcomes: one of the calls (with parameter 1 or parameter 2) gives me a valid response, or, if neither response is valid, I return an empty publisher to the chain.
Mono<String> checkResponse(String parameter)
If checkResponse(parameter1) returns an acceptable response, ignore the second call (switchIfEmpty) and continue the chain, or
if checkResponse(parameter2) returns an acceptable response, continue the chain, or
return Mono.empty() and discard the chain.
Currently I have
checkResponse(stringArg1)
.switchIfEmpty(checkResponse(stringArg2))
.flatMapMany ...
.flatMap ...
The method:
public Mono<String> checkResponse(String s) {
    return webClient.post()
        .uri(URI)
        .body(BodyInserters.fromValue(s))
        .retrieve()
        .bodyToMono(String.class);
}
But switchIfEmpty is always executing.
Regards,

Are you sure that it's actually emitting twice?
There are two aspects of Project Reactor that are important to understand:
On Assembly
On Subscription
This code:
checkResponse(stringArg1)
.switchIfEmpty(checkResponse(stringArg2));
will assemble the Monos for both checkResponse calls.
In essence, the checkResponse method is called twice. However, only the Mono returned from the first checkResponse call is subscribed to, as long as it emits an item.
You can verify this behaviour with this:
checkResponse(stringArg1)
.doOnSubscribe(s -> System.out.println("First checkResponse subscription"))
.switchIfEmpty(checkResponse(stringArg2)
.doOnSubscribe(s -> System.out.println("Second checkResponse subscription"))
);
Something that's very typical of reactive code is that top-level code within a method that returns a Mono/Flux usually executes at assembly time while all the lambdas passed to their various operators such as map/flatMap/concatMap/etc execute at subscription time.
To illustrate:
public Mono<String> getName(int id) {
// Assembly time
System.out.println("This executes at assembly time");
return userRepo.get(id)
.map(user -> {
// Subscription time
System.out.println("This executes at subscription time");
return user.name;
});
}
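The same split can be reproduced without Reactor. The sketch below is only an analogy in plain Java: a Supplier stands in for the Mono, calling get() stands in for subscribing, and the class, method, and field names are invented for the example.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Supplier;

class AssemblyVsSubscriptionDemo {
    // Records the order in which the two phases run.
    static final List<String> events = new ArrayList<>();

    // Top-level statements run as soon as the method is called ("assembly").
    // The lambda body runs only when the Supplier is invoked ("subscription").
    static Supplier<String> getName(int id) {
        events.add("assembled " + id);
        return () -> {
            events.add("subscribed " + id);
            return "user-" + id;
        };
    }

    public static void main(String[] args) {
        Supplier<String> pipeline = getName(7);
        System.out.println(events); // only the assembly event so far
        System.out.println(pipeline.get());
        System.out.println(events); // now both events are recorded
    }
}
```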
If assembling the Mono might be expensive while it may never be subscribed to like in your case here, you can defer assembly until subscription-time using Mono.defer:
checkResponse(stringArg1)
.switchIfEmpty(Mono.defer(() -> checkResponse(stringArg2)));
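Java's Optional has the same pitfall, which may make the fix easier to see. This is an analogy, not Reactor code: orElse plays the role of switchIfEmpty with an eagerly built argument, orElseGet plays the role of wrapping the fallback in Mono.defer, and checkResponse here is a hypothetical stand-in that just records its calls. One difference to keep in mind: in Reactor, the eagerly assembled second Mono is built but not subscribed to, so no HTTP request is sent for it.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Optional;

class EagerVsDeferredFallbackDemo {
    // Records every call made to the stand-in method.
    static final List<String> calls = new ArrayList<>();

    // Hypothetical stand-in for checkResponse: records the call and returns a value.
    static String checkResponse(String parameter) {
        calls.add(parameter);
        return "response for " + parameter;
    }

    // The fallback argument is evaluated before orElse runs, even though it is not needed.
    static String eager() {
        return Optional.of(checkResponse("first")).orElse(checkResponse("second"));
    }

    // The fallback is wrapped in a lambda and runs only if the Optional is empty.
    static String deferred() {
        return Optional.of(checkResponse("first")).orElseGet(() -> checkResponse("second"));
    }

    public static void main(String[] args) {
        eager();
        System.out.println("eager calls: " + calls);
        calls.clear();
        deferred();
        System.out.println("deferred calls: " + calls);
    }
}
```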

There is a difference between assembly time and subscription time.
Assembly time is when you create your pipeline by building the reactive chain.
Subscription time is when execution is triggered and the data starts to flow. Callbacks and lambdas are evaluated lazily, so the code inside them runs at subscription time.
Your checkResponse() method is called "twice" at assembly time because it is a regular method call, not a lambda, and each call returns a Mono.
You can use Mono.defer(() -> checkResponse()) to delay assembling the inner Mono until it is subscribed to.

Unable to assert state flow value in view model

The view model is given below
class ClickRowViewModel @Inject constructor(
    private val clickRowRepository: ClickRowRepository
) : ViewModel() {
    private val _clickRowsFlow = MutableStateFlow<List<ClickRow>>(mutableListOf())
    val clickRowsFlow = _clickRowsFlow.asStateFlow()
    fun fetchAndInitialiseClickRows() {
        viewModelScope.launch {
            _clickRowsFlow.update {
                clickRowRepository.fetchClickRows()
            }
        }
    }
}
My test is as follows:
I am using InstantTaskExecutorRule as follows
@get:Rule
val instantTaskExecutorRule = InstantTaskExecutorRule()
The actual value never matches the expected value: $result seems to have two elements, but actualValue is an empty list. I don't know what I am doing wrong.
Update
I tried the first() terminal operator as well, but it returns an empty list.
Update # 2
I tried async but I got the following error
kotlinx.coroutines.test.UncompletedCoroutinesError: After waiting for 60000 ms, the test coroutine is not completing, there were active child jobs: [DeferredCoroutine{Active}@a4a38f0]
at kotlinx.coroutines.test.TestBuildersKt__TestBuildersKt$runTestCoroutine$3$3.invokeSuspend(TestBuilders.kt:342)
Update # 3
This test passes in Android Studio, but fails using CLI
Test failing in CLI

You can't call toList on a SharedFlow like that:
Shared flow never completes. A call to Flow.collect on a shared flow never completes normally, and neither does a coroutine started by the Flow.launchIn function.
So calling toList will hang forever, because the flow never hits an end point where it says "ok that's all the elements", and toList needs to return a final value. Since StateFlow only contains one element at a time anyway, and you're not collecting over a period of time, you probably just want take(1).toList().
Or use first() if you don't want the wrapping list, which it seems you don't - each element in the StateFlow is a List<ClickRow>, which is what clickRowRepository.fetchClickRows() returns too. So expectedValue is a List<ClickRow>, whereas actualValue is a List<List<ClickRow>> - so they wouldn't match anyway!
Edit: your update (using first()) has a couple of issues.
First of all, the clickRowsFlow StateFlow in your ViewModel only updates when you call fetchAndInitialiseClickRows(), because that's what fetches a value and sets it on the StateFlow. You're not calling that in your second example, so it won't update.
Second, that StateFlow is going to go through two state values, right? The first is the initial empty list, the second is the row contents you get back from the repo. So when you access that StateFlow, it either needs to be after the update has happened, or (better) you need to ignore the first state and only return the second one:
val actualValue = clickRowViewModel.clickRowsFlow
    .drop(1) // ignore the initial state
    .first() // then take the first result after that
// start the update -after- setting up the flow collection,
// so there's no race condition to worry about
clickRowViewModel.fetchAndInitialiseClickRows()
This way, you subscribe to the StateFlow and immediately get (and drop) the initial state. Then when the update happens, it should push another value to the subscriber, which takes that first new value as its final result.
But there's another complication - because fetchAndInitialiseClickRows() kicks off its own coroutine and returns immediately, that means the fetch-and-update task is running asynchronously. You need to give it time to finish, before you start asserting any results from it.
One option is to start the coroutine and then block waiting for the result to show up:
// start the update
clickRowViewModel.fetchAndInitialiseClickRows()
// run the collection as a blocking operation, which completes when you get
// that second result
val actualValue = clickRowViewModel.clickRowsFlow
    .drop(1)
    .first()
This works so long as fetchAndInitialiseClickRows doesn't complete immediately. That consumer chain up there requires at least two items to be produced while it's subscribed - if it never gets to see the initial state, it'll hang waiting for that second (really a third) value that's never coming. This introduces a race condition and even if it's "probably fine in practice" it still makes the test brittle.
Your other option is to subscribe first, using a coroutine so that execution can continue, and then start the update - that way the subscriber can see the initial state, and then the update that arrives later:
// async is like launch, but it returns a `Deferred` that produces a result later
val actualValue = async {
    clickRowViewModel.clickRowsFlow
        .drop(1)
        .first()
}
// now you can start the update
clickRowViewModel.fetchAndInitialiseClickRows()
// then use `await` to block until the result is available
assertEquals(expected, actualValue.await())
You always need to make sure you handle waiting on your coroutines, otherwise the test could finish early (i.e. you do your asserting before the results are in). Like in your first example, you're launching a coroutine to populate your list, but not ensuring that has time to complete before you check the list's contents.
In that case you'd have to do something like advanceUntilIdle() - have a look at this section on testing coroutines, it shows you some ways to wait on them. This might also work for the one you're launching with fetchAndInitialiseClickRows (since it says it waits for other coroutines on the scheduler, not the same scope) but I'm not really familiar with it, you could look into it if you like!

Kotlin Path.useLines { } - How not to get IOException("Stream closed")?

Kotlin has nice wrappers and shortcuts, but sometimes I get caught not understanding them.
I have this simplified code:
class PipeSeparatedItemsReader(private val filePath: Path) : ItemsReader {
    override fun readItems(): Sequence<ItemEntry> {
        return filePath.useLines { lines ->
            lines.map { ItemEntry("A", "B", "C", "D") }
        }
    }
}
And then I have:
val itemsPath = Path(...).resolve()
val itemsReader = PipeSeparatedItemsReader(itemsPath)
for (itemEntry in itemsReader.readItems())
updateItem(itemEntry)
// I have also tried itemsReader.readItems().forEach { ... }
This is quite straightforward: I expect this code to give me a sequence that opens a file, reads the lines, parses them, yields ItemEntry objects, and closes the file once it is used up.
What I get, however, is IOException("Stream closed").
Somehow, even before the first item is read (I have debugged), somewhere within Kotlin's wrappers, the reader.in becomes null, so this exception is thrown in hasNext().
I have seen a similar question here: Kotlin to chain multiple sequences from different InputStream?
That one includes a lot of Java boilerplate which I would like to avoid.
How should I code this sequence using Path.useLines()?

Every Kotlin helper with "use" in the name closes the underlying resource at the end of the lambda you pass (at least that's a convention in the stdlib as far as I know). The most common example being AutoCloseable.use.
The Path.useLines extension is no exception:
Calls the block callback giving it a sequence of all the lines in this file and closes the reader once the processing is complete. [emphasis mine]
This means useLines closes the underlying reader once the block is done, so you cannot return a lazy sequence out of it: the sequence is unusable after the useLines function returns.
So, if you want to return a sequence of lines for later use, you cannot return a transformed sequence derived from the one useLines provides. Sequences cannot detect when someone is done using them, which is why useLines needs a lambda: it gives the sequence a "scope" or "lifetime" and tells useLines when to close the underlying reader.
If you want to wrap this, you have 2 major options: either split the sequence operation and the close operation (make your PipeSeparatedItemsReader closeable), or use a lambda to process things in-place in readItems() the same way useLines does.
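The second option (processing in place) can be sketched as follows. The example uses Java's BufferedReader.lines() as an analogue of useLines, since a stream read after its reader is closed fails in the same way. The class and method names are assumptions made for illustration, not part of any library.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.List;
import java.util.function.Function;
import java.util.stream.Collectors;
import java.util.stream.Stream;

class LazyLinesDemo {
    // Broken: the lazy stream leaves the try block after the reader is closed,
    // so consuming it later fails, much like returning a Sequence from useLines.
    static Stream<String> leakLazyLines(Path file) {
        try (BufferedReader reader = Files.newBufferedReader(file)) {
            return reader.lines().map(String::toUpperCase);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    // Working: the caller's block consumes the lines while the reader is still open.
    static <R> R withLines(Path file, Function<Stream<String>, R> block) {
        try (BufferedReader reader = Files.newBufferedReader(file)) {
            return block.apply(reader.lines());
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    // Helper for the demo: writes the given lines to a temporary file.
    static Path writeTempFile(List<String> lines) {
        try {
            Path file = Files.createTempFile("items", ".txt");
            Files.write(file, lines);
            return file;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        Path file = writeTempFile(List.of("a|b", "c|d"));
        List<String> items = withLines(file,
                lines -> lines.map(String::toUpperCase).collect(Collectors.toList()));
        System.out.println(items);
        try {
            leakLazyLines(file).forEach(System.out::println);
        } catch (UncheckedIOException e) {
            System.out.println("Reading after close failed: " + e.getMessage());
        }
    }
}
```

In Kotlin terms, withLines corresponds to having readItems() accept a lambda and call it inside useLines, so every item is handled before the file is closed.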

Retrieve all flux elements in StepVerifier

I am working on testing a Flux. I don't know exactly how many elements the Flux has. Initially I tried StepVerifier and ran into issues because I do not know the elements. Later I referred to this question and tried the same approach, but I am getting the error below:
java.lang.AssertionError: expectation "expectComplete" failed (expected: onComplete(); actual: onNext
My understanding is that my code is expecting a complete signal, but the Flux still has elements left (so it gives onNext() instead of onComplete()). Please help me understand what I am missing. Below is my code:
StepVerifier.create(flux)
.recordWith(ArrayList::new)
.consumeRecordedWith(elements-> {assertThat(elements.size()).isGreaterThan(0);})
.verifyComplete();

You're not actually consuming your Flux, you're just setting up what happens when it's consumed. Your verifyComplete(); call then fails, understandably, because the Flux hasn't been consumed at all, and it's thus not complete!
You need to add a thenConsumeWhile() call to actually consume it.
If you really need to use AssertJ as you do above, then you can do:
StepVerifier.create(flux)
.recordWith(ArrayList::new)
.thenConsumeWhile(x -> true)
.consumeRecordedWith(elements -> {
assertThat(elements.isEmpty()).isFalse();
})
.verifyComplete();
However, there's no need for AssertJ here - the reactor test package is enough, and adding additional testing frameworks makes the testing code much less clear IMHO. So if you're not wedded to AssertJ, just do:
StepVerifier.create(flux)
.recordWith(ArrayList::new)
.thenConsumeWhile(x -> true)
.expectRecordedMatches(elements -> !elements.isEmpty())
.verifyComplete();
Note that in real-world use, you'd probably want to adjust the predicate in thenConsumeWhile so that it runs a check against each element in turn, too. I've also adjusted the above code to use isEmpty() rather than checking if size()>0, as it's semantically clearer while achieving the same purpose.

Adding something new on the same issue: I had so many entries in my Flux that they couldn't fit into memory (yes, those test case fixtures were designed that way).
So buffering everything into a List wasn't an option.
I tried different API methods on StepVerifier and found the following to work:
StepVerifier.create( myFlux )
.thenConsumeWhile( Predicate<T>, Consumer<T> )
.verifyComplete();
I literally did
StepVerifier.create( myFlux )
.thenConsumeWhile( __ -> true, entry -> {
// assertions
} )
.verifyComplete();

In RxJava/RxKotlin, what are the differences between returning a Completable.error(Exception()) and throwing?

What are the differences in the following cases:
fun a(params: String) = Completable.fromAction {
if (params.isEmpty()) {
throw EmptyRequiredFieldException()
}
}
VS
fun b(params: String) = if(params.isEmpty())
Completable.error(EmptyRequiredFieldException())
else
Completable.complete()
Specifically in the context of android, if it matters (even though I don't think it does)
Thanks!

According to documentation,
If the Action throws an exception, the respective Throwable is delivered to the downstream via CompletableObserver.onError(Throwable), except when the downstream has disposed this Completable source. In this latter case, the Throwable is delivered to the global error handler via RxJavaPlugins.onError(Throwable) as an UndeliverableException.
So the two approaches you describe behave similarly (except when the downstream has disposed). Note that the first approach (throwing the exception manually) lets you change the behavior of the Completable at runtime, whereas the second is statically defined: you return a particular type of Completable and can't modify it.
What to choose depends on your needs.
