Combining Kotlin Flow results

I'm wondering whether there is a clean way to launch a series of flows in Kotlin and then, once they have all completed, perform further operations depending on whether each one succeeded or failed.
For example's sake, say I need to read all integers from a DB (which returns them in a flow), check whether each one is even or odd against an external API (which also returns a flow), and then remove the odd ones from the DB.
In code it would be something like this:
fun findEven() {
    db.readIntegers()
        .map { listOfInt ->
            listOfInt.asFlow()
                .flatMapMerge { singleInt ->
                    httpClient.apiCallToCheckForOddity(singleInt)
                        .catch {
                            // API failure when number is even
                        }
                        .map {
                            // API success when number is odd
                            db.remove(singleInt).collect()
                        }
                }.collect()
        }.collect()
}
The problem I see with this code is that the DB deletions run in parallel. I think a better solution would be to run all the API calls, collect somewhere all those that failed and all those that succeeded, and then do a single bulk operation on the DB instead of having multiple coroutines each touch it on their own.

In my opinion, it's kind of an anti-pattern to produce side effects in map, filter, etc. A side effect like removing items from a database should be a separate step (collect in the case of a Flow, and forEach in the case of a List) for clarity.
The nested flow is also kind of convoluted, since you can work on the list directly as a List.
I think you can do it like this, assuming the API can only check one item at a time.
suspend fun findEven() {
    db.readIntegers()
        .map { listOfInt ->
            listOfInt.filter { singleInt ->
                runCatching {
                    httpClient.apiCallToCheckForOddity(singleInt)
                }.isSuccess
            }
        }
        .collect { listOfOddInt ->
            db.removeAll(listOfOddInt)
        }
}
Here is a parallel version, assuming the API call returns the integer it was given. (By the way, Kotlin APIs should not throw exceptions for errors that are not programmer errors.)
suspend fun findEven() {
    db.readIntegers()
        .map { listOfInt ->
            coroutineScope {
                listOfInt.map { singleInt ->
                    async {
                        runCatching {
                            httpClient.apiCallToCheckForOddity(singleInt)
                        }
                    }
                }.awaitAll()
                    .mapNotNull(Result<Int>::getOrNull)
            }
        }
        .collect { listOfOddInt ->
            db.removeAll(listOfOddInt)
        }
}
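If both the succeeded and the failed numbers are needed, as the question asks, one option is to run the checks concurrently and partition the inputs afterwards. The following is only a minimal sketch: checkForOddity is a hypothetical stand-in for the real API call (it succeeds for odd numbers and throws for even ones), and partitionByCheck is a name made up for illustration.

```kotlin
import kotlinx.coroutines.CancellationException
import kotlinx.coroutines.async
import kotlinx.coroutines.awaitAll
import kotlinx.coroutines.coroutineScope
import kotlinx.coroutines.delay

// Hypothetical stand-in for the remote check: succeeds for odd numbers, throws for even ones.
suspend fun checkForOddity(n: Int): Int {
    delay(10) // simulate network latency
    if (n % 2 == 0) throw IllegalArgumentException("$n is even")
    return n
}

// Runs all checks concurrently and splits the inputs into (succeeded, failed).
suspend fun partitionByCheck(numbers: List<Int>): Pair<List<Int>, List<Int>> = coroutineScope {
    val checked = numbers.map { n ->
        async {
            val succeeded = try {
                checkForOddity(n)
                true
            } catch (e: CancellationException) {
                throw e // never swallow cancellation
            } catch (e: Exception) {
                false
            }
            n to succeeded
        }
    }.awaitAll() // results keep the order of the input list

    val (ok, failed) = checked.partition { it.second }
    ok.map { it.first } to failed.map { it.first }
}
```

The first list of the returned pair could then be passed to a single db.removeAll call, so the database is touched once per emitted list rather than once per number.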

Related

Combine a Flow and a non-Flow API response in Kotlin

I currently have a piece of logic as follows:
interface anotherRepository {
    fun getThings(): Flow<List<String>>
}
interface repository {
    suspend fun getSomeThings(): AsyncResult<SomeThings>
}
when (val result = repository.getSomeThings()) {
    is AsyncResult.Success -> {
        anotherRepository.getThings().collectLatest {
            // update the state
        }
    }
    else -> { }
}
The problem I am having is that, if repository.getSomeThings has been triggered multiple times before, anotherRepository.getThings gets triggered once for each of the pre-loaded values from repository.getSomeThings. I was wondering what the proper way is to use these two repositories together, one being a suspend function and the other a Flow. I am looking for the equivalent of combineLatest{} in Rx.
Thank you.
There are a couple of ways to solve your problem. One way is just to call repository.getSomeThings() in the collectLatest block and cache the last result:
var lastResult: AsyncResult<SomeThings>? = null
anotherRepository.getThings().collectLatest {
    if (lastResult == null) {
        lastResult = repository.getSomeThings()
    }
    // use lastResult and List<String>
}
Another approach is to create a Flow that calls the repository.getSomeThings() function, and combine the two Flows:
combine(
    anotherRepository.getThings(),
    flow { emit(repository.getSomeThings()) }
) { result1: List<String>, result2: AsyncResult<SomeThings> ->
    ...
}
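The combine approach can be tried out in isolation. This is only a sketch under assumptions: getThings and getSomeThings below are hypothetical stand-ins for the two repositories, and a plain String replaces AsyncResult<SomeThings>.

```kotlin
import kotlinx.coroutines.delay
import kotlinx.coroutines.flow.Flow
import kotlinx.coroutines.flow.combine
import kotlinx.coroutines.flow.flow

// Hypothetical stand-in for anotherRepository.getThings(): two list emissions over time.
fun getThings(): Flow<List<String>> = flow {
    emit(listOf("a"))
    delay(50)
    emit(listOf("a", "b"))
}

// Hypothetical stand-in for repository.getSomeThings(): a one-shot suspend call.
suspend fun getSomeThings(): String {
    delay(10)
    return "config"
}

// Wraps the suspend call in a single-emission flow, so it runs once per collection,
// and pairs its value with the latest list from the other flow.
fun combinedState(): Flow<String> =
    combine(getThings(), flow { emit(getSomeThings()) }) { things, some ->
        "$some:${things.joinToString(",")}"
    }
```

Because combine only emits once every source has produced a value, nothing is emitted until the suspend call has returned; after that, each new list is paired with the same cached result.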

Processing and aggregating data from multiple servers efficiently

Summary
My goal is to process and aggregate data from multiple servers efficiently while handling possible errors. For that, I have a sequential version that I want to speed up. As I am using Kotlin, coroutines seem the way to go for this asynchronous task. However, I'm quite new to this and can't figure out how to do it idiomatically. None of my attempts satisfied my requirements completely.
Here is the sequential version of the core function that I am currently using:
suspend fun readDataFromServers(): Set<String> = coroutineScope {
    listOfServers
        // step 1: read data from servers while logging errors
        .mapNotNull { url ->
            runCatching { makeRequestTo(url) }
                .onFailure { println("err while accessing $url: $it") }
                .getOrNull()
        }
        // step 2: do some element-wise post-processing
        .map { process(it) }
        // step 3: aggregate data
        .toSet()
}
Background
In my use case, there are numServers I want to read data from. Each of them usually answers within successDuration,
but the connection attempt may fail after timeoutDuration with probability failProb and throw an IOException. As
downtimes are a common thing in my system, I do not need to retry anything, but only log it for the record. Hence,
the makeRequestTo function can be modelled as follows:
suspend fun makeRequestTo(url: String) =
    if (random.nextFloat() > failProb) {
        delay(successDuration)
        "{Some response from $url}"
    } else {
        delay(timeoutDuration)
        throw IOException("Connection to $url timed out")
    }
Attempts
All these attempts can be tried out in the Kotlin playground. I don't know how long this link stays alive; maybe I'll need to upload this as a gist, but I liked that people can execute the code directly.
Async
I tried using async {makeRequestTo(it)} after listOfServers and awaiting the results in the following mapNotNull, similar to this post. While this collapses the communication time to timeoutDuration, all following processing steps have to wait that long before they can continue. Hence, some composition of Deferreds would be required here, which is discouraged in Kotlin (or at least should be avoided in favor of suspending functions).
suspend fun readDataFromServersAsync(): Set<String> = supervisorScope {
    listOfServers
        .map { async { makeRequestTo(it) } }
        .mapNotNull { kotlin.runCatching { it.await() }.onFailure { println("err: $it") }.getOrNull() }
        .map { process(it) }
        .toSet()
}
Loops
Using normal loops like below fulfills the functional requirements, but feels a bit more complex than it should be. In particular, the part where shared state must be synchronized makes me distrust this code and any future modifications to it.
val results = mutableSetOf<String>()
val mutex = Mutex()
val logger = CoroutineExceptionHandler { _, exception -> println("err: $exception") }
for (server in listOfServers) {
    launch(logger) {
        val response = makeRequestTo(server)
        val processed = process(response)
        mutex.withLock {
            results.add(processed)
        }
    }
}
return@supervisorScope results
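For comparison, the request and the post-processing can live in the same coroutine, so no Deferred has to be composed and no shared state has to be locked. This is only a sketch under assumptions: fetch and postProcess are hypothetical stand-ins for makeRequestTo and process, and failure is simulated for any URL containing "down".

```kotlin
import kotlinx.coroutines.async
import kotlinx.coroutines.awaitAll
import kotlinx.coroutines.coroutineScope
import kotlinx.coroutines.delay
import java.io.IOException

// Hypothetical stand-in for makeRequestTo: URLs containing "down" time out.
suspend fun fetch(url: String): String {
    delay(10)
    if ("down" in url) throw IOException("Connection to $url timed out")
    return "{Some response from $url}"
}

// Hypothetical stand-in for the element-wise post-processing step.
fun postProcess(response: String): String = response.uppercase()

// Each server gets one coroutine that both requests and processes, so a slow
// or failing server does not hold back the processing of the others.
suspend fun readAll(urls: List<String>): Set<String> = coroutineScope {
    urls.map { url ->
        async {
            try {
                postProcess(fetch(url))
            } catch (e: IOException) {
                println("err while accessing $url: $e") // log only, no retry
                null
            }
        }
    }.awaitAll().filterNotNull().toSet()
}
```

Only IOException is caught here, so cancellation and programming errors still propagate through coroutineScope as usual.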

Design pattern to best implement batch api requests that happen transparently to the calling layer

I have a batch processor that I want to refactor so that it is expressed in a 1-to-1 fashion based on input, to increase readability and to allow further optimization later on. The issue is that there is a service that should be called in batches to reduce HTTP overhead, so mixing the 1-to-1 code with the batch code is a bit tricky, and we may not want to call the service with every input. Results can be sent out eagerly one by one, but order must be maintained, so something like a flow doesn't seem to work.
So, ideally the batch processor would look something like this:
class Processor<A, B> {
    val service: Service<A, B>
    val scope: CoroutineScope
    fun processBatch(input: List<A>) {
        input.map {
            Pair(it, scope.async { service.call(it) })
        }.map { (a, b) ->
            runBlocking { b.await().let { /** handle result, do something with a if result is null, etc **/ } }
        }
    }
}
The desire is to perform all of the service logic in such a way that it is executing in the background, automatically splitting the inputs for the service into batches, executing them asynchronously, and somehow mapping the result of the batch call into the suspended call.
Here is a hacky implementation:
class Service<A, B> {
    val inputContainer: MutableList<A>
    val outputs: MutableList<B?>
    val runCalled = AtomicBoolean(false)
    val batchSize: Int
    suspend fun call(input: A): B? {
        // some prefiltering logic that returns a null early
        val index = inputContainer.size
        inputContainer.add(input) // add to overall list for later batching
        return suspend {
            run()
            outputs[index]
        }
    }
    fun run() {
        val batchOutputs = mutableListOf<Deferred<List<B?>>>()
        if (!runCalled.getAndSet(true)) {
            inputContainer.chunked(batchSize).forEach {
                batchOutputs.add(scope.async { batchCall(it) })
            }
            runBlocking {
                batchOutputs.forEach {
                    val res = it.await()
                    outputs.addAll(res)
                }
            }
        }
    }
    suspend fun batchCall(input: List<A>): List<B?> {
        // batch API call, etc
    }
}
Something like this could work, but there are several concerns:
1. All API calls go out at once. Ideally, batches would be formed and executed in the background while other inputs are still being scheduled, but that is not what happens here.
2. Processing of the service result for the first input cannot resume until all results have been returned. Ideally we could process a result as soon as its service call has returned, while other calls continue in the background.
3. Containers of intermediate results seem hacky and prone to bugs. Cleanup logic is also needed, which introduces more hacky bits into the rest of the code.
I can think of several optimizations to address 1 and 2, but I imagine the concerns related to 3 would get worse. This seems like a fairly common call pattern and I would expect there to be a library or a much simpler design pattern to accomplish this, but I haven't been able to find anything. Any guidance is appreciated.
You're on the right track by using Deferred. The solution I would use is:
1. When the caller makes a request, create a CompletableDeferred
2. Using a channel, pass this CompletableDeferred to the service for later completion
3. Have the caller suspend until the service completes the CompletableDeferred
It might look something like this:
val requestChannel = Channel<Pair<Request, CompletableDeferred<Result>>>()
suspend fun doRequest(request: Request): Result {
    val result = CompletableDeferred<Result>()
    requestChannel.send(Pair(request, result))
    return result.await()
}
fun run() = scope.launch {
    while (isActive) {
        val (requests, deferreds) = getBatch(batchSize).unzip()
        val results = batchCall(requests)
        (results zip deferreds).forEach { (result, deferred) ->
            deferred.complete(result)
        }
    }
}
// Suspends until a full batch of batchSize requests has arrived.
suspend fun getBatch(batchSize: Int) = buildList {
    repeat(batchSize) {
        add(requestChannel.receive())
    }
}
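The steps above can be sketched end to end as follows. This is a minimal sketch, not a definitive implementation: Batcher and demoBatching are names made up for illustration, Int and String stand in for the real request and result types, and the batch call is passed in by the caller. Unlike getBatch above, this variant takes whatever is already queued (up to batchSize) instead of waiting for a full batch, so a trailing partial batch is not left waiting.

```kotlin
import kotlinx.coroutines.CancellationException
import kotlinx.coroutines.CompletableDeferred
import kotlinx.coroutines.CoroutineScope
import kotlinx.coroutines.Job
import kotlinx.coroutines.async
import kotlinx.coroutines.awaitAll
import kotlinx.coroutines.channels.Channel
import kotlinx.coroutines.coroutineScope
import kotlinx.coroutines.launch

// Hypothetical batching front end: callers see a one-request suspend function.
class Batcher(
    private val batchSize: Int,
    private val batchCall: suspend (List<Int>) -> List<String>,
) {
    private val requests = Channel<Pair<Int, CompletableDeferred<String>>>(Channel.UNLIMITED)

    // Suspends the caller until the batch containing its request has completed.
    suspend fun call(input: Int): String {
        val reply = CompletableDeferred<String>()
        requests.send(input to reply)
        return reply.await()
    }

    // Worker loop: wait for one request, then drain what is already queued.
    fun start(scope: CoroutineScope): Job = scope.launch {
        for (first in requests) {
            val batch = mutableListOf(first)
            while (batch.size < batchSize) {
                batch += requests.tryReceive().getOrNull() ?: break
            }
            try {
                val results = batchCall(batch.map { it.first })
                batch.zip(results).forEach { (request, result) -> request.second.complete(result) }
            } catch (e: Exception) {
                // Fail every caller of this batch, then keep serving later batches.
                batch.forEach { it.second.completeExceptionally(e) }
                if (e is CancellationException) throw e
            }
        }
    }
}

// Usage sketch: five concurrent callers served in batches of at most two.
suspend fun demoBatching(): List<String> = coroutineScope {
    val batcher = Batcher(batchSize = 2) { inputs -> inputs.map { "result-$it" } }
    val worker = batcher.start(this)
    val results = (1..5).map { i -> async { batcher.call(i) } }.awaitAll()
    worker.cancel() // stop the worker so coroutineScope can return
    results
}
```

Each caller gets the result for its own input because the CompletableDeferred travels with the request through the channel, and awaitAll keeps the callers' order regardless of how the batches were formed.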

Kotlin add custom method to stream chaining

I need to add a custom method (which is a Consumer) to the dot chaining in the Stream API, but I am not sure how to do it. My code follows.
If that is not possible, is there any way to do it with another operation, such as .map or something else?
fun main(args: Array<String>) {
    var countries: List<String> = listOf("India", "Germany", "Japan")
    var firstCountry = countries.stream()
        .filter { it == "Germany" }
        .performOperation{} //not sure what to do here
        .findFirst()
    println(firstCountry)
}
fun performOperation(country: String) {
    if (country.length > 3) {
        throw InvalidLengthException("Error")
    }
    //do some operation, won't return any value
    doCustomOperation(country)
}
You may already be aware that when it comes to streams there are two types of operations: intermediate operations such as map and filter, and terminal operations such as forEach. You said your custom operation won't return any value, which makes it a terminal operation. Moreover, it seems to me that you want to perform the same operation on all the elements, which is basically a forEach. For this you can define an extension function on Stream as
fun <T> Stream<T>.someOperation(operation: (T) -> Unit) {
    this.forEach { operation(it) }
}
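There are two ways to do what you want.
var firstCountry = countries.stream()
    .filter { it == "Germany" }
    .peek(::performOperation)
    .findFirst()
The :: is a function reference and is basically the same as .peek { performOperation(it) }. Note that .also would not work here, because it passes the whole stream to the function rather than each element.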
The second one would be to make your own extension method on Stream. I wouldn't recommend it until you understand Kotlin lambdas and extension methods.
fun Stream<String>.performOperation(): Stream<String> =
    // peek keeps the stream usable; a for loop would consume it before findFirst()
    peek { country ->
        if (country.length > 3) {
            throw InvalidLengthException("Error")
        }
        doCustomOperation(country)
    }
You would just call that one like .performOperation() where you have the .performOperation{}
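A self-contained sketch of a per-element side effect in the middle of a stream chain, using Stream.peek from the Java Stream API. The names validated and firstMatch, the maxLength parameter, and the seen list are assumptions made for illustration; InvalidLengthException is defined here only so that the snippet compiles on its own.

```kotlin
import java.util.stream.Stream

class InvalidLengthException(message: String) : RuntimeException(message)

// Hypothetical validation step: rejects names longer than maxLength and
// records every element that passes through, without ending the stream.
fun Stream<String>.validated(maxLength: Int, seen: MutableList<String>): Stream<String> =
    peek { country ->
        if (country.length > maxLength) throw InvalidLengthException("$country is too long")
        seen.add(country)
    }

// Returns the first country equal to wanted, or null if there is none.
fun firstMatch(countries: List<String>, wanted: String, seen: MutableList<String>): String? =
    countries.stream()
        .filter { it == wanted }
        .validated(maxLength = 10, seen = seen)
        .findFirst()
        .orElse(null)
```

Because streams are lazy, the side effect only runs for elements that survive the filter and are actually pulled by findFirst, which is worth keeping in mind when the operation is expected to run for every element.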

Simplify the statement using rxkotlin

I've wanted to try RxJava with kotlin to make coding easier, so I've produced this:
fun postAnswers() {
    disposable = getToken.execute().subscribe({ token ->
        questions.forEach { form ->
            val answers = form.answers?.filter { it.isChecked }?.map { it.answer_id }
            disposable = postAnswer.execute(token?.token!!, SavedAnswer(form.form_id, answers)).subscribe({
                //Post live data about success
            }, {
                //Post live data failure
            })
        }
    }, {
        //Post live data failure
    })
}
I have the impression that it can be done better, but I do not know how. Basically, what I am trying to achieve is to get a Token object from the database (which returns a Flowable of Token?) and then use it to call postAnswer in a for loop, because I need to post each answer separately (that's how the API is designed). postAnswer only returns a Completable, but I need to let the Activity know (this is ViewModel code) how many answers were posted.
I've thought about using the .flatMap or .concat functions, but I am not sure whether they will be helpful in this case. Also, do I need to assign getToken.execute() to disposable?
Thank you for your answers
EDIT:
Here is my questions list:
private var questions: List<Form> = emptyList()
It gets filled by viewModel functions
Try to think with nesting :) This here will probably do: for each saved answer, post a request.
disposable = getToken.execute()
    .switchMapSingle { token -> // switchMap* because your old token is probably invalidated
        val savedAnswers = questions
            .map { form ->
                val formId = form.form_id
                form.answers
                    ?.filter { it.isChecked }
                    ?.map { it.answer_id }
                    ?.let { answerIds -> SavedAnswer(formId, answerIds) }
                    ?: SavedAnswer(formId, emptyList()) // if no checked answer, then return empty list of ids
            }
        Flowable.fromIterable(savedAnswers)
            .concatMapSingle { savedAnswer -> // concatMap* posts the answers one at a time, use flatMapSingle if you want it to be in parallel
                postAnswer.execute(token?.token!!, savedAnswer) // FYI: !! is bad practice in Kotlin, try to make it less ambiguous
                    .toSingleDefault(savedAnswer) // a Completable emits nothing, so emit the posted answer on success
            }
            .toList()
    }
    .subscribe({ postedAnswers: List<SavedAnswer> ->
        //Post live data about success; postedAnswers.size is the number of answers posted
    }, {
        //Post live data failure
    })