Questions on recursive coroutines in Kotlin - kotlin

Recently I've been trying to familiarize myself with Kotlin some more, so I decided to write a web scraper utilizing coroutines. What I want to accomplish is to pull each page, harvest it for links and contents or posts, then feed the links back into the process until there is nowhere left to go. As of now it has some obvious shortcomings, such as no delay enforced between calls and no tracking of visited addresses so that only new ones are visited. But the questions I have here are about coroutines.
Consider the following class. I've added some toy classes to simulate how it is intended to work; I won't detail them, but you can imagine how they work.
class Scraper(
private val client: Client = ToyClient(delayMillis = 1000, alwaysFindBody = "Test body"),
private val extraction: Extraction = ToyExtraction(
alwaysFindLinks = listOf("https://google.com"),
alwaysFindPosts = listOf("Test post")
),
private val repository: Repository = ToyRepository()
) {
// I could manage my own coroutine scope's lifecycle, but how would I go about this?
// private val scope = CoroutineScope(Dispatchers.Default + SupervisorJob())
private val seed = "https://google.com"
private val log = KotlinLogging.logger {}
fun start() = runBlocking {
log.info { "Scraping started!" }
scrape(seed).join()
log.info { "Scraping finished!" }
}
private fun CoroutineScope.scrape(address: String): Job = launch(Dispatchers.Default) {
log.info { "A scraping coroutine has started" }
val page = request(address)
val contents = extract(page)
save(contents)
contents.links.forEach { scrape(it) }
// Job would not progress here after submitting new jobs, only after all children have completed
// log.info { "A scraping coroutine has finished" }
}
private suspend fun request(address: String): Page {
log.info { "Getting page: $address" }
return client.get(address)
}
private suspend fun extract(page: Page): PageContents {
log.info { "Extracting page: ${page.address}" }
return extraction.extract(page)
}
private suspend fun save(contents: PageContents) {
log.info { "Processing contents of: $contents" }
repository.save(contents.posts)
}
}
The main recursive operation is CoroutineScope.scrape(), which launches a job that can itself launch child jobs, and so on.
My main questions are:
If I were to manage the scope myself as a property, how could I do that and achieve the same behavior? That is, I want to wait for all dynamically spawned jobs to complete as well and return only when all of them have finished.
I wrote my webclient's function using a 3rd party library as such:
suspend fun get(address: String): Page { ... }
Am I fine just marking this method as suspend to get all benefits from this in terms of coroutines?
Thanks in advance!

You don't even need a scope for that: launch a top-level job and use job.join() to wait until it and all its children are done. If you want to block while waiting for that to happen, then you are already doing it right by using runBlocking.
No, marking a function as suspend doesn't affect its blocking behavior. It only allows the function to suspend itself, which must be explicit either in your code or the code you're calling into.
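Both points can be sketched together. The following is a minimal illustration, not the asker's actual classes: blockingFetch, fetchLinks and Crawler are hypothetical names, the "pages" are fake, and a visited set is added so the recursion terminates. It assumes kotlinx.coroutines is on the classpath. The scope is held as a property, join() on the root job waits for every dynamically spawned child, and the blocking call is moved off the caller's thread explicitly with withContext(Dispatchers.IO), since suspend alone would not do that.

```kotlin
import kotlinx.coroutines.*
import java.util.concurrent.ConcurrentHashMap

// Hypothetical stand-in for a blocking HTTP client (not part of the original code).
// Each fake "page" links to two longer addresses until the address has 3 characters.
fun blockingFetch(address: String): List<String> {
    Thread.sleep(20) // simulates blocking I/O
    return if (address.length < 3) listOf(address + "a", address + "b") else emptyList()
}

// The suspend modifier alone would still block the calling thread;
// withContext(Dispatchers.IO) moves the blocking call to a thread meant for it.
suspend fun fetchLinks(address: String): List<String> =
    withContext(Dispatchers.IO) { blockingFetch(address) }

class Crawler {
    // Scope owned by the class; cancelled in close().
    private val scope = CoroutineScope(Dispatchers.Default + SupervisorJob())
    private val visited = ConcurrentHashMap.newKeySet<String>()

    private fun CoroutineScope.scrape(address: String): Job = launch {
        if (!visited.add(address)) return@launch   // skip addresses already seen
        fetchLinks(address).forEach { scrape(it) } // launched as children of this job
    }

    // Suspends until the root job and every job spawned below it have finished.
    suspend fun crawl(seed: String): Set<String> {
        scope.scrape(seed).join()
        return visited.toSet()
    }

    fun close() = scope.cancel()
}

fun main(): Unit = runBlocking {
    val crawler = Crawler()
    println(crawler.crawl("x").size) // 1 + 2 + 4 fake pages, prints 7
    crawler.close()
}
```

A caller that needs to block can wrap crawl() in runBlocking as above; a caller that is already in a coroutine can simply call crawl() and be suspended instead.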

Related

Why compose ui testing's IdlingResource is blocking the main thread?

I've written a "minimal" AS project to replicate the problem I'm facing. Here's the gh link.
I'm trying to write an end-to-end ui test in my compose-only project. The test covers a simple sign-in -> sync data -> go to main view use case.
Here's the whole test:
@HiltAndroidTest
class ExampleInstrumentedTest {
@get:Rule(order = 1)
val hiltRule = HiltAndroidRule(this)
@get:Rule(order = 2)
val composeTestRule = createAndroidComposeRule<MainActivity>()
@Inject
lateinit var dao: DummyDao
val isSyncing = mutableStateOf(false)
@Before
fun setup() {
runBlocking {
hiltRule.inject()
dao.deleteAllData()
dao.deleteUser()
}
composeTestRule.activity.isSyncingCallback = {
synchronized(isSyncing) {
isSyncing.value = it
}
}
composeTestRule.registerIdlingResource(
object : IdlingResource {
override val isIdleNow: Boolean
get() {
synchronized(isSyncing) {
return !isSyncing.value
}
}
}
)
}
@Test
fun runsTheStuffAndItWorks() {
composeTestRule
.onNodeWithText("login", ignoreCase = true, useUnmergedTree = true)
.assertIsDisplayed()
.performClick()
composeTestRule
.onNodeWithTag("sync")
.assertExists()
composeTestRule.waitForIdle()
assertFalse(isSyncing.value)
composeTestRule.onRoot().printToLog("not in the list")
composeTestRule
.onNodeWithTag("the list", useUnmergedTree = true)
.assertIsDisplayed()
}
}
The test runs "alright" up to the point where it should be waiting for the sync worker to finish its job and finally navigate to the "main composable".
Unfortunately, the test seems to block the device's UI thread while the idling resource is not idle, and it finishes immediately once the idling resource becomes idle.
I've tried using Espresso's IdlingResource directly, which also didn't work and showed similar results. I've tried adding Compose's IdlingResource at different points as well, but that also didn't work (adding one between navigation calls also blocks the UI thread and the test fails even sooner).
What am I doing wrong here? Am I forgetting to set something up?

Why does a normal function need to be wrapped with viewModelScope.launch?

The following code is from the project.
1: As I understand it, a suspend fun should be called from another suspend fun or from viewModelScope.launch { }, withContext { }, and so on. filterItems() is only a normal function, so I don't know why filterItems() needs to be wrapped with viewModelScope.launch { } in the function filterTasks(). Could you tell me?
2: In the function filterTasks(), viewModelScope.launch { } starts a coroutine, which is asynchronous. I think return result may execute before the coroutine launched by viewModelScope.launch { } has set the value, so the result may be null. Is the code correct?
Code
private fun filterTasks(tasksResult: Result<List<Task>>): LiveData<List<Task>> {
val result = MutableLiveData<List<Task>>()
if (tasksResult is Success) {
isDataLoadingError.value = false
viewModelScope.launch {
result.value = filterItems(tasksResult.data, getSavedFilterType())
//return filterItems(tasksResult.data, getSavedFilterType()) //It will cause error.
}
} else {
result.value = emptyList()
showSnackbarMessage(R.string.loading_tasks_error)
isDataLoadingError.value = true
}
return result //I think this may execute before I get the result from viewModelScope.launch{}
}
private fun filterItems(tasks: List<Task>, filteringType: TasksFilterType): List<Task> {
val tasksToShow = ArrayList<Task>()
// We filter the tasks based on the requestType
for (task in tasks) {
when (filteringType) {
ALL_TASKS -> tasksToShow.add(task)
ACTIVE_TASKS -> if (task.isActive) {
tasksToShow.add(task)
}
COMPLETED_TASKS -> if (task.isCompleted) {
tasksToShow.add(task)
}
}
}
return tasksToShow
}
It doesn't, unless it performs some heavy work and you want to move it to a background thread, which is the case here. Here the author just wanted to split the work so that the live data can be updated with an empty list first and with the filtered list (which is computationally intensive to get) later, but forgot to do it off the main thread.
In this particular case the author may have forgotten to add a background dispatcher as a parameter
viewModelScope.launch(Dispatchers.Default)
hence, in this scenario the intended behavior was not achieved, so you see this "nonsensical" coroutine.
I think you can contribute to the project with a fix :)
Yes, you are right. But if you look up the implementation of launch {} (as in lifecycleScope.launch {} or viewModelScope.launch {}), you will find that the block, "the coroutine code which will be invoked in the context of the provided scope", is declared as a suspend lambda, so any code between the braces of launch {} runs inside a suspending block. In your example that lets filterItems be called from a coroutine, but it does not make filterItems itself a suspend function or move it off the main thread: viewModelScope dispatches on the main thread unless another dispatcher is passed to launch.
public fun CoroutineScope.launch(
context: CoroutineContext = EmptyCoroutineContext,
start: CoroutineStart = CoroutineStart.DEFAULT,
// the below line is doing the magic
block: suspend CoroutineScope.() -> Unit
): Job {
val newContext = newCoroutineContext(context)
val coroutine = if (start.isLazy)
LazyStandaloneCoroutine(newContext, block) else
StandaloneCoroutine(newContext, active = true)
coroutine.start(start, coroutine, block)
return coroutine
}
I agree that the code looks suspicious. The main reason is that it launches the filterItems coroutine on the Main dispatcher, basically just postponing the moment when filterItems will run on the GUI thread. If filterItems takes long to complete, it will block the GUI; if it doesn't take long, then why would you launch a concurrent coroutine in the first place?
Furthermore, on an architectural level, I don't see a reason why you'd have a function returning LiveData<List<Task>> when you can just have a suspend fun returning List<Task>.
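The suspend fun variant suggested above could look roughly like this. It is a sketch under assumptions: Task and TasksFilterType are simplified stand-ins for the project's types, and Dispatchers.Default is chosen here as the background dispatcher for the filtering. The caller is suspended until the list is ready, so there is no window in which a not-yet-populated LiveData is returned.

```kotlin
import kotlinx.coroutines.Dispatchers
import kotlinx.coroutines.runBlocking
import kotlinx.coroutines.withContext

// Simplified stand-ins for the project's types (assumptions for this sketch).
data class Task(val title: String, val isCompleted: Boolean) {
    val isActive: Boolean get() = !isCompleted
}

enum class TasksFilterType { ALL_TASKS, ACTIVE_TASKS, COMPLETED_TASKS }

// Filters on a background dispatcher and suspends the caller until the result is ready.
suspend fun filterTasks(tasks: List<Task>, type: TasksFilterType): List<Task> =
    withContext(Dispatchers.Default) {
        when (type) {
            TasksFilterType.ALL_TASKS -> tasks
            TasksFilterType.ACTIVE_TASKS -> tasks.filter { it.isActive }
            TasksFilterType.COMPLETED_TASKS -> tasks.filter { it.isCompleted }
        }
    }

fun main(): Unit = runBlocking {
    val tasks = listOf(Task("a", isCompleted = false), Task("b", isCompleted = true))
    println(filterTasks(tasks, TasksFilterType.ACTIVE_TASKS).map { it.title }) // prints [a]
}
```

In a ViewModel, such a function would typically be called from inside viewModelScope.launch { }, with the result assigned to the LiveData after it returns.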

Kotlin Flow: Testing hangs

I am trying to test Kotlin implementation using Flows. I use Kotest for testing. This code works:
ViewModel:
val detectedFlow = flow<String> {
emit("123")
delay(10L)
emit("123")
}
Test:
class ScanViewModelTest : StringSpec({
"when the flow contains values they are emitted" {
val detectedString = "123"
val vm = ScanViewModel()
launch {
vm.detectedFlow.collect {
it shouldBe detectedString
}
}
}
})
However, in the real ViewModel I need to add values to the flow, so I use ConflatedBroadcastChannel as follows:
private val _detectedValues = ConflatedBroadcastChannel<String>()
val detectedFlow = _detectedValues.asFlow()
suspend fun sendDetectedValue(detectedString: String) {
_detectedValues.send(detectedString)
}
Then in the test I try:
"when the flow contains values they are emitted" {
val detectedString = "123"
val vm = ScanViewModel()
runBlocking {
vm.sendDetectedValue(detectedString)
}
runBlocking {
vm.detectedFlow.collect { it shouldBe detectedString }
}
}
The test just hangs and never completes. I tried all kinds of things: launch or runBlockingTest instead of runBlocking, putting sending and collecting in the same or separate coroutines, offer instead of send... Nothing seems to fix it. What am I doing wrong?
Update: If I create flow manually it works:
private val _detectedValues = ConflatedBroadcastChannel<String>()
val detectedFlow = flow {
this.emit(_detectedValues.openSubscription().receive())
}
So, is it a bug in asFlow() method?
The problem is that the collect function you used in your test is a suspend function that will suspend the execution until the Flow is finished.
In the first example, your detectedFlow is finite. It will just emit two values and finish. In your question update, you are also creating a finite flow, that will emit a single value and finish. That is why your test works.
However, in the second (real-life) example the flow is created from a ConflatedBroadcastChannel that is never closed. Therefore the collect function suspends the execution forever. To make the test work without blocking the thread forever, you need to make the flow finite too. I usually use the first() operator for this. Another option is to close the ConflatedBroadcastChannel but this usually means modifications to your code just because of the test which is not a good practice.
This is how your test would work with the first() operator
"when the flow contains values they are emitted" {
val detectedString = "123"
val vm = ScanViewModel()
runBlocking {
vm.sendDetectedValue(detectedString)
}
runBlocking {
vm.detectedFlow.first() shouldBe detectedString
}
}
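The difference between collect and first() on a flow that never completes can be reproduced outside a test framework. This sketch swaps in MutableSharedFlow(replay = 1) for ConflatedBroadcastChannel, which is an assumption on my part (it behaves similarly for this purpose and is the non-deprecated option in recent kotlinx.coroutines versions); ScanViewModelSketch is a hypothetical stand-in for the real ViewModel.

```kotlin
import kotlinx.coroutines.flow.Flow
import kotlinx.coroutines.flow.MutableSharedFlow
import kotlinx.coroutines.flow.first
import kotlinx.coroutines.runBlocking
import kotlinx.coroutines.withTimeoutOrNull

// Hypothetical stand-in for the ViewModel; a shared flow replaces the broadcast channel.
class ScanViewModelSketch {
    private val _detectedValues = MutableSharedFlow<String>(replay = 1)
    val detectedFlow: Flow<String> = _detectedValues

    suspend fun sendDetectedValue(detectedString: String) {
        _detectedValues.emit(detectedString)
    }
}

suspend fun firstVersusCollect(): Pair<String, Boolean?> {
    val vm = ScanViewModelSketch()
    vm.sendDetectedValue("123")

    // first() takes one value and cancels the collection, so it returns.
    val first = vm.detectedFlow.first()

    // collect() on a flow that never completes only ends because of the timeout,
    // in which case withTimeoutOrNull yields null.
    val collectFinished = withTimeoutOrNull(200L) {
        vm.detectedFlow.collect { }
        true
    }
    return first to collectFinished
}

fun main(): Unit = runBlocking {
    println(firstVersusCollect()) // prints (123, null)
}
```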

How to suspend kotlin coroutine until notified

I would like to suspend a kotlin coroutine until a method is called from outside, just like the old Java object.wait() and object.notify() methods. How do I do that?
Here: Correctly implementing wait and notify in Kotlin is an answer on how to implement this with Kotlin threads (blocking). And here: Suspend coroutine until condition is true is an answer on how to do this with CompletableDeferred, but I do not want to have to create a new instance of CompletableDeferred every time.
I am doing this currently:
var nextIndex = 0
fun handleNext(): Boolean {
if (nextIndex < apps.size) {
//Do the actual work on apps[nextIndex]
nextIndex++
}
//only execute again if nextIndex is a valid index
return nextIndex < apps.size
}
handleNext()
// The returned function will be called multiple times, which I would like to replace with something like notify()
return ::handleNext
From: https://gitlab.com/SuperFreezZ/SuperFreezZ/blob/master/src/superfreeze/tool/android/backend/Freezer.kt#L69
Channels can be used for this (though they are more general):
When capacity is 0 – it creates RendezvousChannel. This channel does not have any buffer at all. An element is transferred from sender to receiver only when send and receive invocations meet in time (rendezvous), so send suspends until another coroutine invokes receive and receive suspends until another coroutine invokes send.
So create
val channel = Channel<Unit>(0)
And use channel.receive() for object.wait(), and channel.offer(Unit) for object.notify() (or send if you want to wait until the other coroutine receives).
For notifyAll, you can use BroadcastChannel instead.
You can of course easily encapsulate it:
inline class Waiter(private val channel: Channel<Unit> = Channel<Unit>(0)) {
suspend fun doWait() { channel.receive() }
fun doNotify() { channel.offer(Unit) }
}
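For newer kotlinx.coroutines versions, where offer is deprecated in favor of trySend (1.5 and later, as far as I know), the same idea might be sketched as follows. The class mirrors the Waiter above but is written as a regular class, and the demo function is only an illustration. Note that, like notify(), a notification sent while nobody is waiting is simply lost.

```kotlin
import kotlinx.coroutines.channels.Channel
import kotlinx.coroutines.launch
import kotlinx.coroutines.runBlocking
import kotlinx.coroutines.yield

class Waiter {
    private val channel = Channel<Unit>(Channel.RENDEZVOUS)

    // Suspends until doNotify() is called by someone else.
    suspend fun doWait() {
        channel.receive()
    }

    // Returns true if a waiting coroutine was woken, false if nobody was waiting.
    fun doNotify(): Boolean = channel.trySend(Unit).isSuccess
}

fun demo(): List<Boolean> = runBlocking {
    val waiter = Waiter()
    val lost = waiter.doNotify()      // nobody is waiting yet, so this returns false

    var woke = false
    val job = launch {
        waiter.doWait()
        woke = true
    }
    yield()                           // let the child run until it suspends in receive()

    val delivered = waiter.doNotify() // a receiver is waiting now, so this returns true
    job.join()
    listOf(lost, delivered, woke)
}

fun main() {
    println(demo()) // prints [false, true, true]
}
```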
It is possible to use the basic suspendCoroutine{..} function for that, e.g.
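class SuspendWait {
// Nullable instead of lateinit so it can be cleared after resuming
private var myCont: Continuation<Unit>? = null
suspend fun sleepAndWait() = suspendCoroutine<Unit> { cont ->
myCont = cont
}
fun resume() {
val cont = myCont
myCont = null
cont?.resume(Unit) // requires import kotlin.coroutines.resume
}
}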
Clearly, the code has issues: for example, the myCont field is not synchronized, and sleepAndWait is expected to be called before resume. Still, I hope the idea is clear.
There is another solution with the Mutex class from the kotlinx.coroutines library.
class SuspendWait2 {
private val mutex = Mutex(locked = true)
suspend fun sleepAndWait() = mutex.withLock{}
fun resume() {
mutex.unlock()
}
}
I suggest using a CompletableJob for that.
My use case:
suspend fun onLoad() {
var job1: CompletableJob? = Job()
var job2: CompletableJob? = Job()
lifecycleScope.launch {
someList.collect {
doSomething(it)
job1?.complete()
}
}
lifecycleScope.launch {
otherList.collect {
doSomethingElse(it)
job2?.complete()
}
}
joinAll(job1!!, job2!!) // suspends until both jobs are done
job1 = null
job2 = null
// Do something one time
}

Kotlin Coroutines - How to block to await/join all jobs?

I am new to Kotlin/Coroutines, so hopefully I am just missing something/don't fully understand how to structure my code for the problem I am trying to solve.
Essentially, I am taking a list of strings, and for each item in the list I want to send it to another method to do work (make a network call and return data based on the response). (Edit:) I want all calls to launch concurrently, and block until all calls are done/the response is acted on, and then return a new list with the info of each response.
I probably don't yet fully understand when to use launch/async, but I've tried the following with both launch (with joinAll) and async (with await).
fun processData(lstInputs: List<String>): List<response> {
val lstOfReturnData = mutableListOf<response>()
runBlocking {
withContext(Dispatchers.IO) {
val jobs = List(lstInputs.size) {
launch {
lstOfReturnData.add(networkCallToGetData(lstInputs[it]))
}
}
jobs.joinAll()
}
}
return lstOfReturnData
}
What I am expecting to happen, is if my lstInputs is a size of 120, when all jobs are joined, my lstOfReturnData should also have a size of 120.
What is actually happening is inconsistent results. I'll run it once and get 118 in my final list, run it again and it's 120, run it again and it's 117, etc. In the networkCallToGetData() method, I am handling any exceptions so as to return at least something for every request, regardless of whether the network call fails.
Can anybody help explain why I am getting inconsistent results, and what I need to do to ensure I am blocking appropriately and all jobs are being joined before moving on?
mutableListOf() creates an ArrayList, which is not thread-safe.
Try using ConcurrentLinkedQueue instead.
Also, are you using a stable version of Kotlin and kotlinx.coroutines (not the old experimental one)? In the stable version, with the introduction of structured concurrency, there is no need to write jobs.joinAll() anymore. launch is an extension function on CoroutineScope, and the block passed to runBlocking has a CoroutineScope as its receiver, so launch starts new coroutines in the scope of runBlocking, and runBlocking automatically waits for all the launched jobs to finish. So the code above can be shortened to
val lstOfReturnData = ConcurrentLinkedQueue<response>()
runBlocking {
lstInputs.forEach {
launch(Dispatchers.IO) {
lstOfReturnData.add(networkCallToGetData(it))
}
}
}
return lstOfReturnData.toList()
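As an alternative to a concurrent collection, the shared mutable list can be avoided altogether by having each coroutine return its result. The sketch below assumes that approach; networkCallToGetData here is a hypothetical stand-in that just uppercases its input, and String replaces the response type. async with awaitAll keeps the calls concurrent, blocks until all are done (because of runBlocking), and returns results in input order.

```kotlin
import kotlinx.coroutines.Dispatchers
import kotlinx.coroutines.async
import kotlinx.coroutines.awaitAll
import kotlinx.coroutines.delay
import kotlinx.coroutines.runBlocking

// Hypothetical stand-in for the real network call.
suspend fun networkCallToGetData(input: String): String {
    delay(10) // simulates network latency
    return input.uppercase()
}

// Blocks the calling thread until every call has finished.
// No shared mutable state: each coroutine returns its own result.
fun processData(lstInputs: List<String>): List<String> = runBlocking {
    lstInputs
        .map { input -> async(Dispatchers.IO) { networkCallToGetData(input) } }
        .awaitAll()
}

fun main() {
    val inputs = List(120) { "item$it" }
    println(processData(inputs).size) // prints 120
}
```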
runBlocking blocks the current thread interruptibly until its completion. I guess that's not what you want. If I'm wrong and you do want to block the current thread, then you can get rid of the coroutines and just make the network calls in the current thread:
val lstOfReturnData = mutableListOf<response>()
lstInputs.forEach {
lstOfReturnData.add(networkCallToGetData(it))
}
But if it is not your intent you can do the following:
class Presenter(private val uiContext: CoroutineContext = Dispatchers.Main)
: CoroutineScope {
// creating local scope for coroutines
private var job: Job = Job()
override val coroutineContext: CoroutineContext
get() = uiContext + job
// call this to cancel job when you don't need it anymore
fun detach() {
job.cancel()
}
fun processData(lstInputs: List<String>) {
launch {
val deferredList = lstInputs.map {
async(Dispatchers.IO) { networkCallToGetData(it) } // runs in parallel in background thread
}
val lstOfReturnData = deferredList.awaitAll() // waiting while all requests are finished without blocking the current thread
// use lstOfReturnData in Main Thread, e.g. update UI
}
}
}
runBlocking should mean you don't have to call join.
Launching a coroutine from inside a runBlocking scope should do this for you.
Have you tried just:
fun processData(lstInputs: List<String>): List<response> {
val lstOfReturnData = mutableListOf<response>()
runBlocking {
lstInputs.forEach {
launch(Dispatchers.IO) {
lstOfReturnData.add(networkCallToGetData(it))
}
}
}
return lstOfReturnData
}