Kotlin - How to read from file asynchronously?

Is there any idiomatic Kotlin way to read a file's contents asynchronously? I couldn't find anything in the documentation.

At least as of Java 7 (which is where Android is stuck), there isn't any API that would tap into the low-level async file IO support (like io_uring). There is a class called AsynchronousFileChannel, but, as its docs state:
An AsynchronousFileChannel is associated with a thread pool to which tasks are submitted to handle I/O events and dispatch to completion handlers that consume the results of I/O operations on the channel.
That makes it no better than the following, bog-standard Kotlin idiom:
launch {
    val contents = withContext(Dispatchers.IO) {
        FileInputStream("filename.txt").use { it.readBytes() }
    }
    processContents(contents)
}
go_on_with_other_stuff_while_file_is_loading()
This uses Kotlin's own dedicated IO thread pool and unblocks the UI thread. If you're on Android, that is your actual concern, anyway.
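A minimal, self-contained sketch of the idiom above, assuming kotlinx-coroutines-core is on the classpath. The helper name readFileBytes and the temporary file are illustrative and not part of the original answer.

```kotlin
import kotlinx.coroutines.Dispatchers
import kotlinx.coroutines.runBlocking
import kotlinx.coroutines.withContext
import java.io.File

// Hypothetical helper: the caller suspends while a Dispatchers.IO thread
// performs the blocking read.
suspend fun readFileBytes(path: String): ByteArray =
    withContext(Dispatchers.IO) {
        File(path).readBytes()
    }

fun main() = runBlocking {
    // Create a throwaway file so the example runs anywhere.
    val tmp = File.createTempFile("demo", ".txt").apply {
        writeText("hello")
        deleteOnExit()
    }
    val contents = readFileBytes(tmp.path)
    println(String(contents, Charsets.UTF_8))
}
```

The read itself is still a blocking call; it just happens on a thread that is allowed to block.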

Java NIO's AsynchronousFileChannel is the tool you want.
Check out this AsynchronousFileChannel.aRead extension function from the coroutine examples:
suspend fun AsynchronousFileChannel.aRead(buf: ByteBuffer): Int =
    suspendCoroutine { cont ->
        read(buf, 0L, Unit, object : CompletionHandler<Int, Unit> {
            override fun completed(bytesRead: Int, attachment: Unit) {
                cont.resume(bytesRead)
            }
            override fun failed(exception: Throwable, attachment: Unit) {
                cont.resumeWithException(exception)
            }
        })
    }
You just open an AsynchronousFileChannel and then call aRead() in a coroutine:
val channel = AsynchronousFileChannel.open(Paths.get(fileName))
try {
    val buf = ByteBuffer.allocate(4096)
    val bytesRead = channel.aRead(buf)
} finally {
    channel.close()
}
It's an essential function; I don't know why it is not part of the kotlinx-coroutines-core library.
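The aRead() above always reads from position 0, so it only fills one buffer. A sketch of how the same technique might be extended to read a whole file follows; aReadAt and readAll are hypothetical names introduced here, and runBlocking in main assumes kotlinx-coroutines-core is available.

```kotlin
import kotlinx.coroutines.runBlocking
import java.io.ByteArrayOutputStream
import java.nio.ByteBuffer
import java.nio.channels.AsynchronousFileChannel
import java.nio.channels.CompletionHandler
import java.nio.file.Files
import java.nio.file.Path
import java.nio.file.StandardOpenOption
import kotlin.coroutines.resume
import kotlin.coroutines.resumeWithException
import kotlin.coroutines.suspendCoroutine

// Hypothetical variant of aRead that takes the file position to read from.
suspend fun AsynchronousFileChannel.aReadAt(buf: ByteBuffer, position: Long): Int =
    suspendCoroutine { cont ->
        read(buf, position, Unit, object : CompletionHandler<Int, Unit> {
            override fun completed(bytesRead: Int, attachment: Unit) =
                cont.resume(bytesRead)
            override fun failed(exception: Throwable, attachment: Unit) =
                cont.resumeWithException(exception)
        })
    }

// Reads the file chunk by chunk; the channel reports -1 at end of file.
suspend fun readAll(path: Path, chunkSize: Int = 4096): ByteArray {
    val out = ByteArrayOutputStream()
    AsynchronousFileChannel.open(path, StandardOpenOption.READ).use { channel ->
        val buf = ByteBuffer.allocate(chunkSize)
        var position = 0L
        while (true) {
            buf.clear()
            val n = channel.aReadAt(buf, position)
            if (n < 0) break
            out.write(buf.array(), 0, n)
            position += n
        }
    }
    return out.toByteArray()
}

fun main() = runBlocking {
    val tmp = Files.createTempFile("demo", ".txt")
    Files.write(tmp, "hello async".toByteArray())
    println(String(readAll(tmp, chunkSize = 4)))
    Files.delete(tmp)
}
```

Because use is an inline function, the suspending call inside its lambda is allowed, and the channel is closed even if a read fails.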

javasync/RxIo uses Java NIO asynchronous channels to provide a non-blocking API for reading and writing a file's contents asynchronously, including in an idiomatic Kotlin way. Here are two examples: one reading and writing in bulk through coroutines, and another iterating over lines through an asynchronous Kotlin Flow:
suspend fun copyNio(from: String, to: String) {
    val data = Path(from).readText() // suspension point
    Path(to).writeText(data) // suspension point
}
suspend fun printLinesFrom(filename: String) {
    Path(filename)
        .lines() // Flow<String>
        .onEach(::println)
        .collect() // suspends until the flow completes
}
Disclaimer: I am the author and main contributor of javasync/RxIo.

Related

Difference between GlobalScope and runBlocking when waiting for multiple async

I have a Kotlin Backend/server API using Ktor, and inside a certain endpoint's service logic I need to concurrently get details for a list of ids and then return it all to the client with the 200 response.
The way I wanted to do it is by using async{} and awaitAll()
However, I can't understand whether I should use runBlocking or GlobalScope.
What is really the difference here?
fun getDetails(): List<Detail> {
    val fetched: MutableList<Detail> = mutableListOf()
    GlobalScope.launch {    // Option 1
    runBlocking {           // Option 2
    Dispatchers.IO          // Option 3 (or any other dispatcher ...)
        myIds.map { id ->
            async {
                val providerDetails = getDetails(id)
                fetched += providerDetails
            }
        }.awaitAll()
    }
    return fetched
}
launch starts a coroutine that runs in parallel with your current code, so fetched would still be empty by the time your getDetails() function returns. The coroutine will continue running and mutating the List that you have passed out of the function while the code that retrieved the list already has the reference back and will be using it, so there's a pretty good chance of triggering a ConcurrentModificationException. Basically, this is not a viable solution at all.
runBlocking runs a coroutine while blocking the thread that called it. The coroutine will be completely finished before the return fetched line, so this will work if you are OK with blocking the calling thread.
Specifying a Dispatcher isn't an alternative to launch or runBlocking. It is an argument that you can add to either to determine the thread pool used for the coroutine and its children. Since you are doing IO and parallel work, you should probably be using runBlocking(Dispatchers.IO).
Your code can be simplified to avoid the extra, unnecessary mutable list:
fun getDetails(): List<Detail> = runBlocking(Dispatchers.IO) {
    myIds.map { id ->
        async {
            getDetails(id)
        }
    }.awaitAll()
}
Note that this function will rethrow any exceptions thrown by getDetails().
If your project uses coroutines more generally, you probably have higher level coroutines running, in which case this should probably be a suspend function (non-blocking) instead:
suspend fun getDetails(): List<Detail> = withContext(Dispatchers.IO) {
    myIds.map { id ->
        async {
            getDetails(id)
        }
    }.awaitAll()
}
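The difference between launch and runBlocking described in this answer can be made visible with a small sketch. fetchDetail is a hypothetical stand-in for the real lookup, and a CopyOnWriteArrayList is used only so the fire-and-forget variant does not throw while it demonstrates the problem.

```kotlin
import kotlinx.coroutines.*
import java.util.concurrent.CopyOnWriteArrayList

// Hypothetical stand-in for the real per-id lookup.
suspend fun fetchDetail(id: Int): String {
    delay(100)
    return "detail-$id"
}

// Blocks the calling thread until every async has finished.
fun viaRunBlocking(ids: List<Int>): List<String> = runBlocking(Dispatchers.IO) {
    ids.map { id -> async { fetchDetail(id) } }.awaitAll()
}

// Fire-and-forget: the function returns before the coroutine has done its work.
@OptIn(DelicateCoroutinesApi::class)
fun viaLaunch(ids: List<Int>): List<String> {
    val fetched = CopyOnWriteArrayList<String>()
    GlobalScope.launch {
        ids.map { id -> async { fetched += fetchDetail(id) } }.awaitAll()
    }
    return fetched.toList() // snapshot taken right away, normally still empty
}

fun main() {
    println("runBlocking: ${viaRunBlocking(listOf(1, 2, 3))}")
    println("launch:      ${viaLaunch(listOf(1, 2, 3))}")
}
```

awaitAll() returns results in the order of the input list, which is why the runBlocking variant needs no shared mutable list at all.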

What's the proper way of returning a result out of a IO coroutine job?

The problem is very simple, but I can't really seem to wrap my head around it. I'm launching a non-blocking thread in the IO scope in order to read from a file. However, I can't get the result in time before I return from the method - it always returns the initial empty value "". What am I missing here?
private fun getFileContents(): String {
    var result = ""
    val fileName = getFilename()
    val job = CoroutineScope(Dispatchers.IO).launch {
        kotlin.runCatching {
            val file = getFile(fileName)
            file.openFileInput().use { inputStream ->
                result = String(inputStream.readBytes(), Charsets.UTF_8)
            }
        }
    }
    return result
}
Coroutines are launched asynchronously. Your non-suspending function cannot wait for the result without blocking. For more information about why asynchronous code results in your function returning with the default result, read the answers here.
getFileContents() has to be a suspend function to be able to return something without blocking, in which case you don't need to launch a coroutine either. But then whatever calls this function must be in a suspend function or coroutine.
private suspend fun getFileContents(): String = withContext(Dispatchers.IO) {
    val fileName = getFilename()
    kotlin.runCatching {
        val file = getFile(fileName)
        // the lambda's last expression becomes the value of use { } and of runCatching { }
        file.openFileInput().use { inputStream ->
            String(inputStream.readBytes(), Charsets.UTF_8)
        }
    }.getOrDefault("")
}
There are two "worlds" of code: either you are in a suspending/coroutine context or you are not. When you are in a function that is not a suspend function, you can only return results that can be computed immediately, or you can block until the result is ready.
Generally, if you're using coroutines, you launch a coroutine at some high level in your code, and then you are free to use suspend functions everywhere because almost all of your code is initially triggered by a coroutine. By "high level", I mean you launch the coroutine when a UI screen appears or a UI button is pressed, for example.
Basically, your coroutine launches are usually in UI listeners and UI event functions, not in lower-level code like the function in your question. The coroutine calls a suspend function, which can call other suspend functions, so you don't need to launch more coroutines to perform your various sequential tasks.
The alternate solution is to return a Deferred with the result, like this:
private fun getFileContents(): Deferred<String> {
    val fileName = getFilename()
    return CoroutineScope(Dispatchers.IO).async {
        kotlin.runCatching {
            val file = getFile(fileName)
            // the lambda's last expression becomes the result of the Deferred
            file.openFileInput().use { inputStream ->
                String(inputStream.readBytes(), Charsets.UTF_8)
            }
        }.getOrDefault("")
    }
}
But to unpack the result, you will need to call await() on the Deferred instance inside a coroutine somewhere.
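Both approaches from this answer can be tried outside Android with a plain file. In this sketch readText and readTextAsync are hypothetical stand-ins for the two getFileContents() variants, and runBlocking plays the role of the "high level" coroutine.

```kotlin
import kotlinx.coroutines.*
import java.io.File

// Suspend variant: the caller simply suspends until the text is available.
suspend fun readText(file: File): String = withContext(Dispatchers.IO) {
    runCatching { file.readText() }.getOrDefault("")
}

// Deferred variant: returns immediately; the caller supplies the scope.
fun CoroutineScope.readTextAsync(file: File): Deferred<String> =
    async(Dispatchers.IO) { runCatching { file.readText() }.getOrDefault("") }

fun main() = runBlocking {
    val tmp = File.createTempFile("demo", ".txt").apply {
        writeText("hi")
        deleteOnExit()
    }
    println(readText(tmp))        // plain suspend call, no extra coroutine needed
    val deferred = readTextAsync(tmp)
    println(deferred.await())     // await() has to run inside a coroutine
}
```

Making the Deferred variant an extension on CoroutineScope, rather than creating a new scope inside it, keeps the work tied to the caller's lifecycle; that is a design choice of this sketch, not something the original answer prescribes.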

How to emit Flow value from different function? Kotlin Coroutines

I have a flow :
val myflow = kotlinx.coroutines.flow.flow<Message>{}
and want to emit values with function:
override suspend fun sendMessage(chat: Chat, message: Message) {
myflow.emit(message)
}
But the compiler does not allow me to do this. Are there any workarounds for this problem?
You can use StateFlow for such a use case.
Here's some sample code.
import kotlinx.coroutines.*
import kotlinx.coroutines.flow.*
val chatFlow = MutableStateFlow<String>("")
fun main() = runBlocking {
    // Observe values
    val job = launch {
        chatFlow.collect {
            print("$it ")
        }
    }
    // Change values
    arrayOf("Hey", "Hi", "Hello").forEach {
        delay(100)
        sendMessage(it)
    }
    delay(1000)
    // Cancel running job
    job.cancel()
    job.join()
}
suspend fun sendMessage(message: String) {
    chatFlow.value = message
}
You can test this code in the Kotlin Playground: https://pl.kotl.in/DUBDfUnX3
The answer of Animesh Sahu is pretty much correct. You can also return a Channel as a flow (see consumeAsFlow or asFlow on a BroadcastChannel).
But there is also a thing called StateFlow, currently in development by the Kotlin team, which is, in part, meant to implement similar behavior, although it is unknown when it is going to be ready.
EDIT: StateFlow and SharedFlow have been released as part of a stable API (https://blog.jetbrains.com/kotlin/2020/10/kotlinx-coroutines-1-4-0-introducing-stateflow-and-sharedflow/). These tools can and should be used when state management is required in an async execution context.
Use a MutableSharedFlow; it has everything you need.
Initialization of your flow:
val myFlow = MutableSharedFlow<Message>()
and now it should just work as you were trying earlier with:
override suspend fun sendMessage(chat: Chat, message: Message) {
    myFlow.emit(message)
}
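A runnable sketch of the MutableSharedFlow approach, using String in place of the Message type from the question. The names messages and demo are illustrative. Since a MutableSharedFlow created with default arguments has no replay, the sketch waits for the collector to subscribe before emitting.

```kotlin
import kotlinx.coroutines.*
import kotlinx.coroutines.flow.*

// Default arguments: no replay, so collectors that subscribe late miss earlier values.
val messages = MutableSharedFlow<String>()

suspend fun sendMessage(message: String) {
    // With no buffer, emit suspends until current subscribers have received the value.
    messages.emit(message)
}

suspend fun demo(): List<String> = coroutineScope {
    // Collector: takes three values, then stops.
    val received = async { messages.take(3).toList() }
    // Wait until the collector has actually subscribed.
    messages.subscriptionCount.first { it > 0 }
    listOf("Hey", "Hi", "Hello").forEach { sendMessage(it) }
    received.await()
}

fun main() = runBlocking {
    println(demo()) // prints [Hey, Hi, Hello]
}
```

If emitters should never suspend, or late subscribers should see recent values, the replay and extraBufferCapacity parameters of MutableSharedFlow are the knobs to look at.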
A Flow is self-contained: once the block (lambda) inside the flow builder has executed, the flow is over. You have to do your operations inside that block and emit from there.
Here is a similar GitHub issue, which says:
Afaik Flow is designed to be a self contained, replayable, cold stream, so emission from outside of it's own scope wouldn't be part of the contract. I think what you're looking for is a Channel.
And IMHO you're probably looking at the Channels, or specifically a ConflatedBroadcastChannel for multiple receivers. The difference between a normal channel and a broadcast channel is that multiple receivers can listen to a broadcast channel using openSubscription function which returns a ReceiveChannel associated with the BroadcastChannel.

Run code in main thread when IO thread dispatch completes?

I'm working with LiveData. I want to run some arbitrary code in IO and then, once that has completed, run some arbitrary code in the Main thread.
In JavaScript, you can accomplish something like this by chaining promises together. I know Kotlin is different, but that's at least a framework I'm coming from that I understand.
I have a function that will sometimes be called from Main and sometimes from IO, but it requires no special IO features itself. From within class VM: ViewModel():
private var mState = MyState() // data class w/property `a`
val myLiveData: MutableLiveData<MyState> = MutableLiveData(mState)
fun setVal(a: MyVal) {
    mState = mState.copy(a=a)
    myLiveData.value = mState
}
fun buttonClickHandler(a: MyVal) {
    setVal(a) // Can execute in Main
}
fun getValFromDb() {
    viewModelScope.launch(Dispatchers.IO) {
        val a: MyVal = fetchFromDb()
        setVal(a) // Error! Cannot call setValue from background thread!
    }
}
Seems to me the obvious way would be to execute val a = fetchFromDb() from IO and then pull setVal(a) out of that block and into Main.
Is there a way to accomplish this? I don't see a conceptual reason why this feature could not exist. Is there some idea like
doAsyncThatReturnsValue(Dispatchers.IO) { fetchFromDb()}
.then(previousBlockReturnVal, Dispatchers.Main) { doInMain() }
that could be run in a ViewModel?
Please substitute "coroutine" for "thread" wherever appropriate above. :)
Launch is fine. You just have to switch around the dispatchers and use withContext:
fun getValFromDb() {
    // run this coroutine on main thread
    viewModelScope.launch(Dispatchers.Main) {
        // obtain result by running given block on IO thread
        // suspends coroutine until it's ready (without blocking the main thread)
        val a: MyVal = withContext(Dispatchers.IO) { fetchFromDb() }
        // executed on main thread
        setVal(a)
    }
}
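The thread hop performed by withContext can be observed on a plain JVM. Dispatchers.Main needs a UI framework, so this sketch substitutes a single-threaded dispatcher named "fake-main"; that substitution and the runDemo name are assumptions made for illustration only.

```kotlin
import kotlinx.coroutines.*
import java.util.concurrent.Executors

// Returns (thread that ran the IO block, thread the coroutine resumed on).
fun runDemo(): Pair<String, String> {
    // Stand-in for Dispatchers.Main: one dedicated, named thread.
    val executor = Executors.newSingleThreadExecutor { r -> Thread(r, "fake-main") }
    val mainLike = executor.asCoroutineDispatcher()
    try {
        return runBlocking(mainLike) {
            val ioThread = withContext(Dispatchers.IO) {
                // Runs on an IO worker thread.
                Thread.currentThread().name
            }
            // After withContext returns, the coroutine is back on the "main" thread.
            ioThread to Thread.currentThread().name
        }
    } finally {
        mainLike.close() // shuts the executor down so the JVM can exit
    }
}

fun main() {
    val (ioThread, resumedOn) = runDemo()
    println("fetched on $ioThread, resumed on $resumedOn")
}
```

This is the coroutine equivalent of the promise chain from the question: the code after withContext plays the role of the .then() callback.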

Concurrent S3 File Upload via Kotlin Coroutines

I need to upload many files to S3, and it would take hours to complete that job sequentially. That's exactly what Kotlin's new coroutines excel at, so I wanted to give them a first try instead of fiddling around again with some Thread-based execution service.
Here is my (simplified) code:
fun upload(superTiles: Map<Int, Map<Int, SuperTile>>) = runBlocking {
    val s3 = AmazonS3ClientBuilder.standard().withRegion("eu-west-1").build()
    for ((x, ys) in superTiles) {
        val jobs = mutableListOf<Deferred<Any>>()
        for ((y, superTile) in ys) {
            val job = async(CommonPool) {
                uploadTile(s3, x, y, superTile)
            }
            jobs.add(job)
        }
        jobs.map { it.await() }
    }
}
suspend fun uploadTile(s3: AmazonS3, x: Int, y: Int, superTile: SuperTile) {
    val json: String = "{}"
    val key = "$s3Prefix/x4/$z/$x/$y.json"
    // pass the computed key to the request
    s3.putObject(PutObjectRequest("my_bucket", key, ByteArrayInputStream(json.toByteArray()), metadata))
}
The problem: the code is still very slow and logging reveals that requests are still executed sequentially: a job is finished before the next one is created. Only in very few cases (1 out of 10) I see jobs running concurrently.
Why does the code not run much faster / concurrently? What can I do about it?
Kotlin coroutines excel when you work with an asynchronous API, while the AmazonS3.putObject API that you are using is an old-school blocking, synchronous API, so you get only as many concurrent uploads as there are threads in the CommonPool that you are using. There is no value in marking your uploadTile function with the suspend modifier, because it does not use any suspending functions in its body.
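The point about blocking calls being limited by the number of pool threads can be illustrated with a small timing sketch. It does not touch S3: Thread.sleep stands in for a blocking call, delay for a truly suspending one, and the two-thread pool and the timeTasks helper are assumptions chosen to make the effect obvious.

```kotlin
import kotlinx.coroutines.*
import java.util.concurrent.Executors
import kotlin.system.measureTimeMillis

// Runs `tasks` coroutines on a pool of `threads` threads; returns elapsed milliseconds.
fun timeTasks(threads: Int, tasks: Int, work: suspend () -> Unit): Long {
    val pool = Executors.newFixedThreadPool(threads).asCoroutineDispatcher()
    try {
        return measureTimeMillis {
            runBlocking(pool) {
                (1..tasks).map { async { work() } }.awaitAll()
            }
        }
    } finally {
        pool.close()
    }
}

fun main() {
    // Blocking work holds its thread, so 4 tasks on 2 threads need two rounds.
    val blocking = timeTasks(threads = 2, tasks = 4) { Thread.sleep(200) }
    // Suspending work releases its thread while waiting, so all 4 tasks overlap.
    val suspending = timeTasks(threads = 2, tasks = 4) { delay(200) }
    println("blocking: $blocking ms, suspending: $suspending ms")
}
```

Exact timings vary by machine, but the blocking run cannot finish in much less than two sleep periods, while the suspending run needs roughly one.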
The first step in getting more throughput in your upload task is to start using an asynchronous API for it. I'd suggest looking at the Amazon S3 TransferManager for that purpose. See if that gets your problem solved first.
Kotlin coroutines are designed to help you combine your async APIs into easy-to-use logical workflows. For example, it is straightforward to adapt the asynchronous API of TransferManager for use with coroutines by writing the following extension function:
suspend fun Upload.await(): UploadResult = suspendCancellableCoroutine { cont ->
    addProgressListener {
        if (isDone) {
            // we know it should not actually wait when done
            try { cont.resume(waitForUploadResult()) }
            catch (e: Throwable) { cont.resumeWithException(e) }
        }
    }
    // abort the upload if the coroutine is cancelled
    // (current kotlinx.coroutines name; older versions called this invokeOnCompletion)
    cont.invokeOnCancellation { abort() }
}
This extension enables you to write very fluent code that works with TransferManager and you can rewrite your uploadTile function to work with TransferManager instead of working with blocking AmazonS3 interface:
suspend fun uploadTile(tm: TransferManager, x: Int, y: Int, superTile: SuperTile) {
    val json: String = "{}"
    val key = "$s3Prefix/x4/$z/$x/$y.json"
    // pass the computed key to the request
    tm.upload(PutObjectRequest("my_bucket", key, ByteArrayInputStream(json.toByteArray()), metadata))
        .await()
}
Notice how this new version of uploadTile uses the suspending function await that was defined above.