How does concurrent modification work with coroutines? - kotlin

I am going through this coroutines hands-on tutorial Coroutines-Channels. So there is this task of concurrently fetching contributors and showing intermediate progress at the same time using channels, See Here
Below is snippet of the proposed solution
suspend fun loadContributorsChannels(
service: GitHubService,
req: RequestData,
updateResults: suspend (List<User>, completed: Boolean) -> Unit) = coroutineScope {
........
.........
val channel = Channel<List<User>>()
for (repo in repos) {
launch {
val users = service.getRepoContributors(req.org, repo.name) // suspend function
.also { logUsers(repo, it) }
.bodyList()
channel.send(users) // suspend function
}
}
var allUsers = emptyList<User>()
repeat(repos.size) {
val users = channel.receive() // suspend function
allUsers = (allUsers + users).aggregate()
updateResults(allUsers, it == repos.lastIndex) // suspend function
}
}
The function loadContributorsChannels() is called inside a coroutine which is using a Default dispatcher.See here. I have 2 questions.
In the snippet above is allUsers being modified concurrently since we are already inside a coroutine which is using a Default dispatcher?
If I change the code sequence like below why do I get incorrect results? How is the code above different from the snippet below?
val contributorsChannel = Channel<List<User>>()
var contributors = emptyList<User>()
for(repo in repos) {
launch {
val contributorsPerRepo = service
.getRepoContributors(req.org, repo.name) // suspend function
.also { logUsers(repo, it) }
.bodyList()
contributors = (contributors + contributorsPerRepo).aggregate()
contributorsChannel.send(contributors) // suspend function
}
}
repeat(repos.size) {
updateResults(contributorsChannel.receive(), it == repos.lastIndex) // suspend functions
}
Is this because of concurrent modification or am I missing something?

In the original code, the top-level coroutine is the only one using allUsers. It is its local state.
In your code, contributors is a variable shared by all the coroutines and concurrently updated.
The original code correctly applies the Channel as a synchronization mechanism to fan-in all the concurrent computation into a single coroutine that collects the results and uses them.

Related

How to implement timeout without interrupting a calculation in Kotlin coroutine?

Let's say a request started a long calculation, but ready to wait no longer than X seconds for the result. But instead of interrupting the calculation I would like it to continue in parallel until completion.
The first condition ("wait no longer than") is satisfied by withTimeoutOrNull function.
What is the standard (idiomatic Kotlin) way to continue the computation and execute some final action when the result is ready?
Example:
fun longCalculation(key: Int): String { /* 2..10 seconds */}
// ----------------------------
cache = Cache<Int, String>()
suspend fun getValue(key: Int) = coroutineScope {
val value: String? = softTimeoutOrNull(
timeout = 5.seconds(),
calculation = { longCalculation() },
finalAction = { v -> cache.put(key, v) }
)
// return calculated value or null
}
This is a niche enough case that I don't think there's a consensus on an idiomatic way to do it.
Since you want the work to continue in the background even if the current coroutine is resuming, you need a separate CoroutineScope to launch that background work, rather than using the coroutineScope builder that launches the coroutine as a child coroutine of the current one. That outer scope will determine the lifetime of the coroutines that it launches (cancelling them if it is cancelled). Typically, if you're using coroutines, you already have a scope on hand that's associated with the lifecycle of the current class.
I think this would do what you're describing. The current coroutine can wrap a Deferred.await() call in withTimeoutOrNull to see if it can wait for the other coroutine (that's not a child coroutine since it was launched directly from an outer CoroutineScope) without interfering with it.
suspend fun getValue(key: Int): String? {
val deferred = someScope.async {
longCalculation(key)
.also { cache.put(key, it) }
}
return withTimeoutOrNull(5000) { deferred.await() }
}
Here's a generalized version:
/**
* Launches a coroutine in the specified [scope] to perform the given [calculation]
* and [finalAction] with that calculation's result. Returns the result of the
* calculation if it is available within [timeout].
*/
suspend fun <T> softTimeoutOrNull(
scope: CoroutineScope,
timeout: Duration,
calculation: suspend () -> T,
finalAction: suspend (T) -> Unit = { }
): T? {
val deferred = scope.async {
calculation().also { finalAction(it) }
}
return withTimeoutOrNull(timeout) { deferred.await() }
}

Questions on recursive coroutines in Kotlin

Recently I've been trying to familiarize myself with Kotlin some more, so I decided to write a webscraper utilizing coroutines. What I want to accomplish is pull each page, harvest it for links and contents or posts, then feed the links back to the process, until there is nowhere left to go. As of now it has some obvious shortcomings, such no delay enforced between calls or saving addresses and only visiting new ones. But the questions I have are regarding coroutines, here.
Consider the following class,. I've added some toy classes to simulate how it is intended to work, which I won't detail, but you can imagine how they work.
class Scraper(
private val client: Client = ToyClient(delayMillis = 1000, alwaysFindBody = "Test body"),
private val extraction: Extraction = ToyExtraction(
alwaysFindLinks = listOf("https://google.com"),
alwaysFindPosts = listOf("Test post")
),
private val repository: Repository = ToyRepository()
) {
// I could manage my own coroutine scope's lifecycle, but how would I go about this?
// private val scope = CoroutineScope(Dispatchers.Default + SupervisorJob())
private val seed = "https://google.com"
private val log = KotlinLogging.logger {}
fun start() = runBlocking {
log.info { "Scraping started!" }
scrape(seed).join()
log.info { "Scraping finished!" }
}
private fun CoroutineScope.scrape(address: String): Job = launch(Dispatchers.Default) {
log.info { "A scraping coroutine has started" }
val page = request(address)
val contents = extract(page)
save(contents)
contents.links.forEach { scrape(it) }
// Job would not progress here after submitting new jobs, only after each children have been completed
// log.info { "A scraping coroutine has finished" }
}
private suspend fun request(address: String): Page {
log.info { "Getting page: $address" }
return client.get(address)
}
private suspend fun extract(page: Page): PageContents {
log.info { "Extracting page: ${page.address}" }
return extraction.extract(page)
}
private suspend fun save(contents: PageContents) {
log.info { "Processing contents of: $contents" }
repository.save(contents.posts)
}
}
The main recursive operation is CoroutineScope.scrape() which launches a job, which itself can launch children jobs as well and so on.
My main questions are:
If I were to manage the scope myself as a property, how could I do that and achieve the same behavior? That is, I would wait for all dynamically spawned jobs to complete as well, return when all are finished.
I wrote my webclient's function using a 3rd party library as such:
fun suspend get(address: String): Page { ... }
Am I fine just marking this method as suspend to get all benefits from this in terms of coroutines?
Thanks in advance!
You don't even need a scope for that, launch a top-level job and use job.join() to await until it and all its children are done. If you want to block while waiting for that to happen, then you are already doing it right by using runBlocking.
No, marking a function as suspend doesn't affect its blocking behavior. It only allows the function to suspend itself, which must be explicit either in your code or the code you're calling into.

Why does a normal function need to be wrapped with viewModelScope.launch?

The following code is from the project.
1: In my mind,a suspend fun should be launched in another suspend fun or viewModelScope.launch{ }, withContext{ } ... , filterItems() is only a normal function, I don't know why filterItems() need to be wrapped with viewModelScope.launch{ } in the function filterTasks(), could you tell me ?
2: In the function filterTasks(), viewModelScope.launch{ } will launch in coroutines, it's asynchronous, I think return result maybe be launched before I get the result from viewModelScope.launch{}, so the result maybe null, is the code correct?
Code
private fun filterTasks(tasksResult: Result<List<Task>>): LiveData<List<Task>> {
val result = MutableLiveData<List<Task>>()
if (tasksResult is Success) {
isDataLoadingError.value = false
viewModelScope.launch {
result.value = filterItems(tasksResult.data, getSavedFilterType())
//return filterItems(tasksResult.data, getSavedFilterType()) //It will cause error.
}
} else {
result.value = emptyList()
showSnackbarMessage(R.string.loading_tasks_error)
isDataLoadingError.value = true
}
return result //I think it maybe be launched before I get the result from viewModelScope.launch{}
}
private fun filterItems(tasks: List<Task>, filteringType: TasksFilterType): List<Task> {
val tasksToShow = ArrayList<Task>()
// We filter the tasks based on the requestType
for (task in tasks) {
when (filteringType) {
ALL_TASKS -> tasksToShow.add(task)
ACTIVE_TASKS -> if (task.isActive) {
tasksToShow.add(task)
}
COMPLETED_TASKS -> if (task.isCompleted) {
tasksToShow.add(task)
}
}
}
return tasksToShow
}
It doesn't, unless it performs some heavy work and you want to move it to a background thread, which is the case here. Here the author just wanted to disjoint the work so the live data can be updated with an empty list first, and the filtered list later(computationally intensive to get), but forgot to do it out of the main thread.
In this particular case the author may have forgotten to add a background dispatcher as a parameter
viewModelScope.launch(Dispatchers.Default)
hence, in this scenario the intended behavior was not achieved, so you see this "nonsensical" coroutine.
I think you can contribute to the project with a fix :)
yes, you are right. but if you looked up the implementation of the launch {} such in lifecycleScope.launch {} or viewModelScope.launch {} you would find out the "block" which is "the coroutine code which will be invoked in the context of the provided scope" is cast to be suspend, so any block of code between launch {} is suspend code block. so in your example filterItems is cast to suspend under the hood and it's wrapped with viewModelScope.launch{ } to do its heavy task not in main thread.
public fun CoroutineScope.launch(
context: CoroutineContext = EmptyCoroutineContext,
start: CoroutineStart = CoroutineStart.DEFAULT,
// the below line is doing the magic
block: suspend CoroutineScope.() -> Unit
): Job {
val newContext = newCoroutineContext(context)
val coroutine = if (start.isLazy)
LazyStandaloneCoroutine(newContext, block) else
StandaloneCoroutine(newContext, active = true)
coroutine.start(start, coroutine, block)
return coroutine
}
I agree that the code looks suspicious, main reason is that it launches the filterItems coroutine into the Main dispatcher, basically just postponing the moment when filterItems will run on the GUI thread. If filterItems takes long to complete, it will block the GUI; if it doesn't take long, then why would you launch a concurrent coroutine in the first place?
Furthermore, on an architectural level, I don't see a reason why you'd have a function returning LiveData<List<Task>> when you can just have a suspend fun returning List<Task>.

How to suspend kotlin coroutine until notified

I would like to suspend a kotlin coroutine until a method is called from outside, just like the old Java object.wait() and object.notify() methods. How do I do that?
Here: Correctly implementing wait and notify in Kotlin is an answer how to implement this with Kotlin threads (blocking). And here: Suspend coroutine until condition is true is an answer how to do this with CompleteableDeferreds but I do not want to have to create a new instance of CompleteableDeferred every time.
I am doing this currently:
var nextIndex = 0
fun handleNext(): Boolean {
if (nextIndex < apps.size) {
//Do the actual work on apps[nextIndex]
nextIndex++
}
//only execute again if nextIndex is a valid index
return nextIndex < apps.size
}
handleNext()
// The returned function will be called multiple times, which I would like to replace with something like notify()
return ::handleNext
From: https://gitlab.com/SuperFreezZ/SuperFreezZ/blob/master/src/superfreeze/tool/android/backend/Freezer.kt#L69
Channels can be used for this (though they are more general):
When capacity is 0 – it creates RendezvousChannel. This channel does not have any buffer at all. An element is transferred from sender to receiver only when send and receive invocations meet in time (rendezvous), so send suspends until another coroutine invokes receive and receive suspends until another coroutine invokes send.
So create
val channel = Channel<Unit>(0)
And use channel.receive() for object.wait(), and channel.offer(Unit) for object.notify() (or send if you want to wait until the other coroutine receives).
For notifyAll, you can use BroadcastChannel instead.
You can of course easily encapsulate it:
inline class Waiter(private val channel: Channel<Unit> = Channel<Unit>(0)) {
suspend fun doWait() { channel.receive() }
fun doNotify() { channel.offer(Unit) }
}
It is possible to use the basic suspendCoroutine{..} function for that, e.g.
class SuspendWait() {
private lateinit var myCont: Continuation<Unit>
suspend fun sleepAndWait() = suspendCoroutine<Unit>{ cont ->
myCont = cont
}
fun resume() {
val cont = myCont
myCont = null
cont.resume(Unit)
}
}
It is clear, the code have issues, e.g. myCont field is not synchonized, it is expected that sleepAndWait is called before the resume and so on, hope the idea is clear now.
There is another solution with the Mutex class from the kotlinx.coroutines library.
class SuspendWait2 {
private val mutex = Mutex(locaked = true)
suspend fun sleepAndWait() = mutex.withLock{}
fun resume() {
mutex.unlock()
}
}
I suggest using a CompletableJob for that.
My use case:
suspend fun onLoad() {
var job1: CompletableJob? = Job()
var job2: CompletableJob? = Job()
lifecycleScope.launch {
someList.collect {
doSomething(it)
job1?.complete()
}
}
lifecycleScope.launch {
otherList.collect {
doSomethingElse(it)
job2?.complete()
}
}
joinAll(job1!!, job2!!) // suspends until both jobs are done
job1 = null
job2 = null
// Do something one time
}

Kotlin Coroutines - How to block to await/join all jobs?

I am new to Kotlin/Coroutines, so hopefully I am just missing something/don't fully understand how to structure my code for the problem I am trying to solve.
Essentially, I am taking a list of strings, and for each item in the list I want to send it to another method to do work (make a network call and return data based on the response). (Edit:) I want all calls to launch concurrently, and block until all calls are done/the response is acted on, and then return a new list with the info of each response.
I probably don't yet fully understand when to use launch/async, but I've tried to following with both launch (with joinAll), and async (with await).
fun processData(lstInputs: List<String>): List<response> {
val lstOfReturnData = mutableListOf<response>()
runBlocking {
withContext(Dispatchers.IO) {
val jobs = List(lstInputs.size) {
launch {
lstOfReturnData.add(networkCallToGetData(lstInputs[it]))
}
}
jobs.joinAll()
}
}
return lstofReturnData
What I am expecting to happen, is if my lstInputs is a size of 120, when all jobs are joined, my lstOfReturnData should also have a size of 120.
What actually is happening is inconsitent results. I'll run it once, and I get 118 in my final list, run it again, it's 120, run it again, it's 117, etc. In the networkCallToGetData() method, I am handling any exceptions, to at least return something for every request, regardless if the network call fails.
Can anybody help explain why I am getting inconsistent results, and what I need to do to ensure I am blocking appropriately and all jobs are being joined before moving on?
mutableListOf() creates an ArrayList, which is not thread-safe.
Try using ConcurrentLinkedQueue instead.
Also, do you use the stable version of Kotlin/Kotlinx.coroutine (not the old experimental one)? In the stable version, with the introduction of structured concurrency, there is no need to write jobs.joinAll anymore. launch is an extesion function of runBlocking which will launch new coroutines in the scope of the runBlocking and the runBlocking scope will automatically wait for all the launched jobs to finsish. So the code above can be shorten to
val lstOfReturnData = ConcurrentLinkedQueue<response>()
runBlocking {
lstInputs.forEach {
launch(Dispatches.IO) {
lstOfReturnData.add(networkCallToGetData(it))
}
}
}
return lstOfReturnData
runBlocking blocks current thread interruptibly until its completion. I guess it's not what you want. If I think wrong and you want to block the current thread than you can get rid of coroutine and just make network call in the current thread:
val lstOfReturnData = mutableListOf<response>()
lstInputs.forEach {
lstOfReturnData.add(networkCallToGetData(it))
}
But if it is not your intent you can do the following:
class Presenter(private val uiContext: CoroutineContext = Dispatchers.Main)
: CoroutineScope {
// creating local scope for coroutines
private var job: Job = Job()
override val coroutineContext: CoroutineContext
get() = uiContext + job
// call this to cancel job when you don't need it anymore
fun detach() {
job.cancel()
}
fun processData(lstInputs: List<String>) {
launch {
val deferredList = lstInputs.map {
async(Dispatchers.IO) { networkCallToGetData(it) } // runs in parallel in background thread
}
val lstOfReturnData = deferredList.awaitAll() // waiting while all requests are finished without blocking the current thread
// use lstOfReturnData in Main Thread, e.g. update UI
}
}
}
Runblocking should mean you don't have to call join.
Launching a coroutine from inside a runblocking scope should do this for you.
Have you tried just:
fun processData(lstInputs: List<String>): List<response> {
val lstOfReturnData = mutableListOf<response>()
runBlocking {
lstInputs.forEach {
launch(Dispatchers.IO) {
lstOfReturnData.add(networkCallToGetData(it))
}
}
}
return lstofReturnData