Kotlin \ Android - LiveData async transformation prevent previous result - kotlin

I have a LiveData that I transform with an async function that can take a while to execute (sometimes 2 seconds, sometimes 4).
Sometimes the call takes long, sometimes it's fast (depending on the results), and sometimes it's instant (empty result).
The problem is that if my LiveData emits twice in a row, the first call can take a while and the second can finish instantly. The second result is then shown before the first, and afterwards it gets overwritten by the result of the earlier calculation.
What I want is more of a sequential effect (kind of like RxJava's concatMap).
private val _state = query.mapAsync(viewModelScope) { searchString ->
    if (searchString.isEmpty()) {
        NoSearch
    } else {
        val results = repo.search(searchString)
        if (results.isNotEmpty()) {
            Results(results.map { mapToMainResult(it, searchString) })
        } else {
            NoResults
        }
    }
}
@MainThread
fun <X, Y> LiveData<X>.mapAsync(
    scope: CoroutineScope,
    mapFunction: androidx.arch.core.util.Function<X, Y>
): LiveData<Y> {
    val result = MediatorLiveData<Y>()
    result.addSource(this) { x ->
        // Every emission starts an independent job, so results can arrive out of order
        scope.launch(Dispatchers.IO) { result.postValue(mapFunction.apply(x)) }
    }
    return result
}
How do I prevent the result of the earlier call from overwriting the result of the later one?

@MainThread
fun <X, Y> LiveData<X>.mapAsync(
    scope: CoroutineScope,
    mapFunction: (X) -> Y,
): LiveData<Y> = switchMap { value ->
    // switchMap drops the previous inner LiveData when a new value arrives
    liveData(scope.coroutineContext) {
        withContext(Dispatchers.IO) {
            emit(mapFunction(value))
        }
    }
}
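If the pipeline can be expressed with Kotlin Flow instead of LiveData, the same "latest input wins" behaviour that switchMap gives above can be sketched with mapLatest. This is a minimal sketch, not the answer's code: mapLatestOn is a hypothetical helper name, and bridging the result back to LiveData (for example with asLiveData() from lifecycle-livedata-ktx) is an assumption about your setup.

```kotlin
import kotlinx.coroutines.*
import kotlinx.coroutines.flow.*

// Hypothetical helper: maps each value with a suspending transform and
// cancels a still-running transform as soon as a newer value arrives,
// so a slow, stale result can never overwrite a newer one.
@OptIn(ExperimentalCoroutinesApi::class)
fun <X, Y> Flow<X>.mapLatestOn(
    dispatcher: CoroutineDispatcher,
    transform: suspend (X) -> Y,
): Flow<Y> = mapLatest { value -> transform(value) }.flowOn(dispatcher)
```

Note that this drops superseded results rather than queueing them. A strictly sequential variant, closer to concatMap, would use plain map instead, at the cost of waiting for every slow call to finish.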

Related

Kotlin - Debounce Only One Specific Value When Emitting from Flow

I have two flows that are being combined into a single flow. One of the flows has a backing data set that emits much faster than the other.
Flow A - emits every 200 ms
Flow B - emits every ~1s
The problem I am trying to fix is this one:
combine(flowA, flowB) { flowAValue, flowBValue -> // just booleans
    flowAValue && flowBValue
}.collect {
    if (it) {
        doSomething()
    }
}
Because Flow A emits extremely quickly, the boolean that's emitted can get cleared rapidly, which means that when flowB emits true, flowA already emitted true and the state is now false.
I've attempted something like:
suspend fun main() {
flowA.debounce {
if (it) {
1250L
} else {
0L
}
}.collect {
println(it)
}
}
But this doesn't work because the true values are sometimes not emitted. Inverting the conditional (0L for true, 1250L for false) doesn't work either. What I'm looking for is this: if flowA emits true, hold that value for 1 second before switching to a new value. Is something like that possible?
I used conflate() on the second, much faster flow so that zip always takes the latest value from fastFlow once slowFlow is finally ready. Without conflate() on fastFlow, zip simply pairs the values in emission order (first with first, second with second, and so on).
fun forFlow() = runTest {
val slowString = listOf("first", "second", "third", "fourth")
val slowFlow = flow {
slowString.forEach {
delay(100)
emit(it)
}
}
val fastFlow = flow {
(1 until 1000).forEach { num ->
delay(5)
emit(num)
}
}.conflate()
suspend fun zip() {
slowFlow.zip(fastFlow) { first, second -> "$first: $second" }
.collect {
println(it)
}
}
runBlocking {
zip()
}
println("Done!")
}
With Conflated on fastFlow:
first: 1
second: 15
third: 32
fourth: 49
Done!
Without Conflated on fastFlow:
first: 1
second: 2
third: 3
fourth: 4
Done!

How to efficiently perform concurrent computation with coroutines

I'm trying to improve my knowledge of coroutines and currently working on following problem:
Given a generator of random non-empty 14-character strings, what would be the most efficient way to find a string that starts with a specific prefix (let's assume the prefix length is 5)?
Most of the solutions I encountered on the internet either a) manually launch async{} 2 or 3 times, or b) launch async{} in a loop and then await all of them, which won't work for this scenario.
One approach I tried was to launch new coroutines until I get a non-null response from the computation function and then cancel the scope. However, there's clearly a performance issue that I'm not seeing, since this approach can take more than 20s even for a prefix of length 1.
...
private val _flow = MutableSharedFlow<String>()
suspend fun invoke(prefix: String) = withContext(dispatcher) { // dispatcher is Dispatchers.Default
    _flow.onEach {
        println("String is=$it")
        this.cancel()
    }.launchIn(this)
    repeat(Int.MAX_VALUE) {
        launch {
            getString(prefix)?.let {
                _flow.emit(it)
            }
        }
    }
}
private fun getString(prefix: String): String? { // or any other CPU-intensive task
    val randomString = generateRandomStringAccordingToSpecs() // implemented elsewhere
    // Compare against the prefix parameter, not the literal "prefix"
    if (randomString.startsWith(prefix, ignoreCase = true)) {
        return randomString
    } else {
        return null
    }
}
I also tried an approach with a while loop and 4 parallel executions, which gives me better performance. However, awaiting after every X calculations doesn't seem like the most efficient solution to me:
suspend fun invoke(prefix: String) = withContext(dispatcher) {
    var resultString: String? = getString(prefix)
    while (resultString == null) {
        val tasks = listOf(
            async { getString(prefix) },
            async { getString(prefix) },
            async { getString(prefix) },
            async { getString(prefix) }
        )
        resultString = tasks.awaitAll().filterNotNull().firstOrNull()
    }
    println("String is=$resultString")
}
private fun getString(prefix: String): String? { // or any other CPU-intensive task
    val randomString = generateRandomStringAccordingToSpecs() // implemented elsewhere
    // Compare against the prefix parameter, not the literal "prefix"
    if (randomString.startsWith(prefix, ignoreCase = true)) {
        return randomString
    } else {
        return null
    }
}
In the example above I'm using a find-prefix problem, but in general, what is the most efficient way to concurrently perform CPU-intensive calculations with coroutines?
Especially for the calculations where we don't know how many times the task must be executed before we get an answer.
This seems like a job for the select function. Assuming your generateRandomStringAccordingToSpecs() is a computationally blocking function, you want to have all your CPU cores working on the problem simultaneously and you just want the first valid result, you could build an operator like this:
suspend fun <T> getFirstResult(block: suspend CoroutineScope.() -> T): T =
    withContext(Dispatchers.Default) {
        coroutineScope {
            select {
                repeat(Runtime.getRuntime().availableProcessors()) {
                    async { block() }.onAwait {
                        coroutineContext.cancelChildren()
                        it
                    }
                }
            }
        }
    }
It starts as many parallel coroutines as there are CPUs, and once any of them returns a result, it cancels the rest and returns that result.
So you can use this with a coroutine block that uses a while loop indefinitely until a result is returned:
suspend fun invoke(prefix: String) = getFirstResult {
    while (isActive) {
        return@getFirstResult getString(prefix) ?: continue
    }
}
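The same "first result wins" idea can be put together as a self-contained sketch. The names firstResultOf, randomString, searchForPrefix and findWithPrefix are hypothetical, and randomString is only an assumed stand-in for generateRandomStringAccordingToSpecs(); the search loop is written as a separate function so that its return type is unambiguous.

```kotlin
import kotlinx.coroutines.*
import kotlinx.coroutines.selects.select
import kotlin.random.Random

// Runs `workers` copies of `block` in parallel and returns the first result.
suspend fun <T> firstResultOf(workers: Int, block: suspend () -> T): T =
    withContext(Dispatchers.Default) {
        val attempts = List(workers) { async { block() } }
        try {
            // select resumes with whichever Deferred completes first
            select<T> { attempts.forEach { attempt -> attempt.onAwait { it } } }
        } finally {
            attempts.forEach { it.cancel() } // stop the workers that lost the race
        }
    }

// Assumed stand-in for generateRandomStringAccordingToSpecs().
fun randomString(length: Int = 14): String =
    buildString { repeat(length) { append('a' + Random.nextInt(26)) } }

// Loops until a match is found; ensureActive() keeps the loop cancellable.
suspend fun searchForPrefix(prefix: String): String {
    while (true) {
        currentCoroutineContext().ensureActive()
        val candidate = randomString()
        if (candidate.startsWith(prefix, ignoreCase = true)) return candidate
    }
}

suspend fun findWithPrefix(prefix: String): String =
    firstResultOf(Runtime.getRuntime().availableProcessors()) { searchForPrefix(prefix) }
```

Because each worker loops on its own, no worker ever waits for a batch of the others to finish, which is the inefficiency in the awaitAll approach from the question.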

Implement backoff strategy in flow

I'm trying to implement a backoff strategy just using kotlin flow.
I need to fetch data from timeA to timeB
result = dataBetween(timeA - timeB)
if the result is empty then I want to increase the end time window using exponential backoff
result = dataBetween(timeA - timeB + exponentialBackOffInDays)
I was following this article which is explaining how to approach this in rxjava2.
But got stuck at a point where flow does not have takeUntil operator yet.
You can see my implementation below.
fun main() {
runBlocking {
(0..8).asFlow()
.flatMapConcat { input ->
// To simulate a data source which fetches data based on a time-window start-date to end-date
// available with in that time frame.
flow {
println("Input: $input")
if (input < 5) {
emit(emptyList<String>())
} else { // After emitting this once the flow should complete
emit(listOf("Available"))
}
}.retryWhenThrow(DummyException(), predicate = {
it.isNotEmpty()
})
}.collect {
//println(it)
}
}
}
class DummyException : Exception("Collected size is empty")
private inline fun <T> Flow<T>.retryWhenThrow(
throwable: Throwable,
crossinline predicate: suspend (T) -> Boolean
): Flow<T> {
return flow {
collect { value ->
if (!predicate(value)) {
throw throwable // informing the upstream to keep emitting since the condition is met
}
println("Value: $value")
emit(value)
}
}.catch { e ->
if (e::class != throwable::class) throw e
}
}
It works, except that even after the flow has produced a successful value it continues to collect from the upstream flow all the way to 8, whereas ideally it should stop as soon as it reaches 5.
Any help on how I should approach this would be appreciated.
Maybe this does not match your exact setup but instead of calling collect, you might as well just use first{...} or firstOrNull{...}
This will automatically stop the upstream flows after an element has been found.
For example:
flowOf(0,0,3,10)
.flatMapConcat {
println("creating list with $it elements")
flow {
val listWithElementCount = MutableList(it){ "" } // just a list of n empty strings
emit(listWithElementCount)
}
}.first { it.isNotEmpty() }
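To make the early termination visible, here is a small sketch modelled on the question's simulated data source (empty below 5, "Available" from 5 on). The function name firstAvailable and the requested list, which only records what was pulled from upstream, are illustrative assumptions.

```kotlin
import kotlinx.coroutines.*
import kotlinx.coroutines.flow.*

// `requested` records which inputs were pulled from upstream (illustration only).
suspend fun firstAvailable(inputs: IntRange, requested: MutableList<Int>): List<String> =
    inputs.asFlow()
        .onEach { requested += it }
        .map { input -> if (input < 5) emptyList<String>() else listOf("Available") }
        .first { it.isNotEmpty() } // cancels the upstream once a match is found
```

Called with 0..8, upstream is consumed only up to input 5; inputs 6 to 8 are never requested.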
On a side note, your problem sounds like a regular suspend function would be a better fit.
Something like
suspend fun getFirstNonEmptyList(initialFrom: Long, initialTo: Long): List<Any> {
var from = initialFrom
var to = initialTo
while (coroutineContext.isActive) {
val elements = getElementsInRange(from, to) // your "dataBetween"
if (elements.isNotEmpty()) return elements
val (newFrom, newTo) = nextBackoff(from, to)
from = newFrom
to = newTo
}
throw CancellationException()
}
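A concrete variant of that loop, with an assumed backoff policy, might look like the sketch below. Here the window doubles on every empty result and the search gives up after maxAttempts. Both choices, as well as the names firstNonEmptyWindow and fetch, are assumptions for illustration; fetch plays the role of dataBetween.

```kotlin
import kotlinx.coroutines.*

// Widens the window [from, from + window] exponentially until `fetch`
// returns something, or gives up after `maxAttempts` tries.
suspend fun <T> firstNonEmptyWindow(
    from: Long,
    initialTo: Long,
    maxAttempts: Int = 10,
    fetch: suspend (from: Long, to: Long) -> List<T>,
): List<T> {
    var window = initialTo - from
    repeat(maxAttempts) {
        currentCoroutineContext().ensureActive() // stay cancellable between attempts
        val result = fetch(from, from + window)
        if (result.isNotEmpty()) return result
        window *= 2 // assumed policy: double the window each time
    }
    return emptyList()
}
```

Since this is a plain suspend function, stopping after the first success is just a return, and no takeUntil-style operator is needed.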

Equivalent of RxJava .toList() in Kotlin coroutines flow

I have a situation where I need to observe userIds then use those userIds to observe users. Either userIds or users could change at any time and I want to keep the emitted users up to date.
Here is an example of the sources of data I have:
data class User(val name: String)
fun observeBestUserIds(): Flow<List<String>> {
return flow {
emit(listOf("abc", "def"))
delay(500)
emit(listOf("123", "234"))
}
}
fun observeUserForId(userId: String): Flow<User> {
return flow {
emit(User("${userId}_name"))
delay(2000)
emit(User("${userId}_name_updated"))
}
}
In this scenario I want the emissions to be:
[User(abc_name), User(def_name)], then
[User(123_name), User(234_name)], then
[User(123_name_updated), User(234_name_updated)]
I think I can achieve this in RxJava like this:
observeBestUserIds.concatMapSingle { ids ->
Observable.fromIterable(ids)
.concatMap { id ->
observeUserForId(id)
}
.toList()
}
What function would I write to make a flow that emits that?
I believe you're looking for combine, which gives you an array that you can easily call toList() on:
observeBestUserIds().collectLatest { ids ->
combine(
ids.map { id -> observeUserForId(id) }
) {
it.toList()
}.collect {
println(it)
}
}
And here's the inner part with more explicit parameter names since you can't see the IDE's type hinting on Stack Overflow:
combine(
ids.map { id -> observeUserForId(id) }
) { arrayOfUsers: Array<User> ->
arrayOfUsers.toList()
}.collect { listOfUsers: List<User> ->
println(listOfUsers)
}
Output:
[User(name=abc_name), User(name=def_name)]
[User(name=123_name), User(name=234_name)]
[User(name=123_name_updated), User(name=234_name)]
[User(name=123_name_updated), User(name=234_name_updated)]
Live demo (note that in the demo, all the output appears at once, but this is a limitation of the demo site - the lines appear with the timing you'd expect when the code is run locally)
This avoids the (abc_name_updated, def_name_updated) discussed in the original question. However, there's still an intermediate emission with 123_name_updated and 234_name because the 123_name_updated is emitted first and it sends the combined version immediately because they're the latest from each flow.
However, this can be avoided by debouncing the emissions (on my machine, a timeout as small as 1ms works, but I did 20ms to be conservative):
observeBestUserIds().collectLatest { ids ->
combine(
ids.map { id -> observeUserForId(id) }
) {
it.toList()
}.debounce(timeoutMillis = 20).collect {
println(it)
}
}
which gets you the exact output you wanted:
[User(name=abc_name), User(name=def_name)]
[User(name=123_name), User(name=234_name)]
[User(name=123_name_updated), User(name=234_name_updated)]
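If a Flow-returning function is preferred over collecting inside collectLatest, the same combine idea can be wrapped with flatMapLatest. This is a sketch under assumptions: combineLatestPerKey is a hypothetical name, and the function is inline with a reified V because the list-based combine overload requires a reified element type.

```kotlin
import kotlinx.coroutines.*
import kotlinx.coroutines.flow.*

// For every new list of keys, drops the previous inner flows and emits the
// latest value of each key's flow as a list, in key order.
@OptIn(ExperimentalCoroutinesApi::class)
inline fun <K, reified V> Flow<List<K>>.combineLatestPerKey(
    crossinline observe: (K) -> Flow<V>,
): Flow<List<V>> = flatMapLatest { keys ->
    if (keys.isEmpty()) {
        flowOf(emptyList<V>()) // combine over no flows would emit nothing
    } else {
        combine(keys.map { key -> observe(key) }) { values -> values.toList() }
    }
}
```

With the functions from the question this would read observeBestUserIds().combineLatestPerKey { id -> observeUserForId(id) }, optionally followed by the debounce step shown above.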
This is unfortunately non-trivial with the current state of Kotlin Flow; some important operators seem to be missing. But please note that you are not looking for RxJava's toList(). If you tried to do it with toList and concatMap in RxJava, you would have to wait until all observables finish.
This is not what you want.
Unfortunately for you I think there is no way around a custom function.
It would have to aggregate all the results returned by observeUserForId for all the ids which you pass to it. It would also not be a simple windowing function, since in reality it is conceivable that one observeUserForId call has already returned twice while another call still hasn't finished. So checking whether you already have the same number of users as the ids you passed into your aggregating function isn't enough; you also have to group by user id.
I'll try to add code later today.
Edit: As promised, here is my solution. I took the liberty of augmenting the requirements slightly, so the flow will emit every time all userIds have values and an underlying user changes. I think this is more likely what you want, since users probably don't change properties in lockstep.
Nevertheless, if this is not what you want, leave a comment.
import kotlinx.coroutines.delay
import kotlinx.coroutines.flow.*
import kotlinx.coroutines.runBlocking
data class User(val name: String)
fun observeBestUserIds(): Flow<List<String>> {
return flow {
emit(listOf("abc", "def"))
delay(500)
emit(listOf("123", "234"))
}
}
fun observeUserForId(userId: String): Flow<User> {
return flow {
emit(User("${userId}_name"))
delay(2000)
emit(User("${userId}_name_updated"))
}
}
inline fun <reified K, V> buildMap(keys: Set<K>, crossinline valueFunc: (K) -> Flow<V>): Flow<Map<K, V>> = flow {
val keysSize = keys.size
val valuesMap = HashMap<K, V>(keys.size)
flowOf(*keys.toTypedArray())
.flatMapMerge { key -> valueFunc(key).map {v -> Pair(key, v)} }
.collect { (key, value) ->
valuesMap[key] = value
if (valuesMap.keys.size == keysSize) {
emit(valuesMap.toMap())
}
}
}
fun observeUsersForIds(): Flow<List<User>> {
return observeBestUserIds().flatMapLatest { ids -> buildMap(ids.toSet(), ::observeUserForId as (String) -> Flow<User>) }
.map { m -> m.values.toList() }
}
fun main() = runBlocking {
observeUsersForIds()
.collect { user ->
println(user)
}
}
This will return
[User(name=def_name), User(name=abc_name)]
[User(name=123_name), User(name=234_name)]
[User(name=123_name_updated), User(name=234_name)]
[User(name=123_name_updated), User(name=234_name_updated)]
You can use flatMapConcat
val users = observeBestUserIds()
.flatMapConcat { ids ->
flowOf(*ids.toTypedArray())
.map { id ->
observeUserForId(id)
}
}
.flattenConcat()
.toList()
or
observeBestUserIds()
.flatMapConcat { ids ->
flowOf(*ids.toTypedArray())
.map { id ->
observeUserForId(id)
}
}
.flattenConcat()
.collect { user ->
}

Kotlin coroutines progress counter

I'm making thousands of HTTP requests using async/await and would like to have a progress indicator. I've added one in a naive way, but noticed that the counter value never reaches the total when all requests are done. So I've created a simple test and, sure enough, it doesn't work as expected:
fun main(args: Array<String>) {
var i = 0
val range = (1..100000)
range.map {
launch {
++i
}
}
println("$i ${range.count()}")
}
The output is something like this, where the first number always changes:
98800 100000
I'm probably missing some important detail about concurrency/synchronization in JVM/Kotlin, but don't know where to start. Any tips?
UPDATE: I ended up using channels as Marko suggested:
/**
* Asynchronously fetches stats for all symbols and sends a total number of requests
* to the `counter` channel each time a request completes. For example:
*
* val counterActor = actor<Int>(UI) {
* var counter = 0
* for (total in channel) {
* progressLabel.text = "${++counter} / $total"
* }
* }
*/
suspend fun getAssetStatsWithProgress(counter: SendChannel<Int>): Map<String, AssetStats> {
val symbolMap = getSymbols()?.let { it.map { it.symbol to it }.toMap() } ?: emptyMap()
val total = symbolMap.size
return symbolMap.map { async { getAssetStats(it.key) } }
.mapNotNull { it.await().also { counter.send(total) } }
.map { it.symbol to it }
.toMap()
}
The explanation of what exactly makes your approach fail is secondary; the primary thing is fixing the approach.
Instead of async-await or launch, for this communication pattern you should instead have an actor to which all the HTTP jobs send their status. This will automatically handle all your concurrency issues.
Here's some sample code, taken from the link you provided in the comment and adapted to your use case. Instead of some third party asking it for the counter value and updating the GUI with it, the actor runs in the UI context and updates the GUI itself:
import kotlinx.coroutines.experimental.*
import kotlinx.coroutines.experimental.channels.*
import kotlin.system.*
import kotlin.coroutines.experimental.*
object IncCounter
fun counterActor() = actor<IncCounter>(UI) {
var counter = 0
for (msg in channel) {
updateView(++counter)
}
}
fun main(args: Array<String>) = runBlocking {
val counter = counterActor()
massiveRun(CommonPool) {
counter.send(IncCounter)
}
counter.close()
println("View state: $viewState")
}
// Everything below is mock code that supports the example
// code above:
val UI = newSingleThreadContext("UI")
fun updateView(newVal: Int) {
viewState = newVal
}
var viewState = 0
suspend fun massiveRun(context: CoroutineContext, action: suspend () -> Unit) {
val numCoroutines = 1000
val repeatActionCount = 1000
val time = measureTimeMillis {
val jobs = List(numCoroutines) {
launch(context) {
repeat(repeatActionCount) { action() }
}
}
jobs.forEach { it.join() }
}
println("Completed ${numCoroutines * repeatActionCount} actions in $time ms")
}
Running it prints
Completed 1000000 actions in 2189 ms
View state: 1000000
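The code above uses the old kotlinx.coroutines.experimental API. With a current kotlinx.coroutines release, the same single-consumer idea can be sketched with a plain Channel. The names runWithProgress, work and onProgress are hypothetical, and the sketch assumes onProgress may be called from the caller's coroutine context.

```kotlin
import kotlinx.coroutines.*
import kotlinx.coroutines.channels.Channel

// Runs `total` jobs in parallel; every finished job sends a tick, and a single
// reporter coroutine owns the counter, so no synchronization is needed.
suspend fun runWithProgress(
    total: Int,
    work: suspend (index: Int) -> Unit,
    onProgress: (done: Int, total: Int) -> Unit,
) = coroutineScope {
    val ticks = Channel<Unit>(Channel.UNLIMITED)
    // Single consumer: the only coroutine that touches `done`.
    val reporter = launch {
        var done = 0
        for (tick in ticks) onProgress(++done, total)
    }
    // The inner scope returns only after every worker has finished.
    coroutineScope {
        repeat(total) { index ->
            launch(Dispatchers.Default) {
                work(index)
                ticks.send(Unit)
            }
        }
    }
    ticks.close()   // lets the reporter's loop end once it has drained the channel
    reporter.join()
}
```

In a UI application the reporter would typically be launched on the main dispatcher so that onProgress can update the view directly.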
You're losing writes because ++i is not an atomic operation: the value has to be read, incremented, and then written back, and you have multiple threads reading and writing i at the same time. (If you don't provide launch with a context, it uses a thread pool by default.)
You're losing 1 from your count every time two threads read the same value as they will then both write that value plus one.
Synchronizing in some way, for example by using an AtomicInteger solves this:
fun main(args: Array<String>) {
val i = AtomicInteger(0)
val range = (1..100000)
range.map {
launch {
i.incrementAndGet()
}
}
println("$i ${range.count()}") // 100000 100000
}
There's also no guarantee that these background threads will be done with their work by the time you print the result and your program ends - you can test it easily by adding just a very small delay inside launch, a couple milliseconds. With that, it's a good idea to wrap this all in a runBlocking call which will keep the main thread alive and then wait for the coroutines to all finish:
fun main(args: Array<String>) = runBlocking {
val i = AtomicInteger(0)
val range = (1..100000)
val jobs: List<Job> = range.map {
launch {
i.incrementAndGet()
}
}
jobs.forEach { it.join() }
println("$i ${range.count()}") // 100000 100000
}
Have you read Coroutines basics? It covers exactly the same problem as yours:
val c = AtomicInteger()
for (i in 1..1_000_000)
launch {
c.addAndGet(i)
}
println(c.get())
This example completes in less than a second for me, but it prints an arbitrary number, because some coroutines don't finish before main() prints the result.
Because launch is non-blocking, there is no guarantee that all coroutines will finish before println. You need to wait for them, for example by keeping the Job objects returned by launch and calling join() on each, or by using async, storing the Deferred objects and awaiting them.
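With current kotlinx.coroutines, where launch needs a scope, the waiting can be left to structured concurrency: coroutineScope does not return until all coroutines launched inside it have completed. A minimal sketch, with countCompleted as a hypothetical name:

```kotlin
import kotlinx.coroutines.*
import java.util.concurrent.atomic.AtomicInteger

// Launches `tasks` coroutines on a thread pool and counts completions.
suspend fun countCompleted(tasks: Int): Int {
    val completed = AtomicInteger(0) // atomic: increments come from many threads
    coroutineScope {
        repeat(tasks) {
            launch(Dispatchers.Default) { completed.incrementAndGet() }
        }
    } // suspends here until every child coroutine has finished
    return completed.get()
}
```

This combines both fixes discussed above: the atomic counter prevents lost updates, and the scope guarantees that the count is read only after all increments have happened.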