Ktor server, a correct way to receive messages from websocket - kotlin

I'm new to Ktor server and don't fully understand how WebSockets receive messages. I found several solutions in different sources (the try/catch and webSocket blocks are omitted).
while(true) way
while (true) {
    val incoming = receiveDeserialized<IncomingDto>()
    MessageService.newMessage(incoming)
}
consumeEach way
incoming.consumeEach { frame ->
    // process frame
}
flow way
incoming.receiveAsFlow().filterIsInstance<Frame.Text>()
    .collect {
        // process frame
    }
for way
for (frame in incoming) {
    frame as? Frame.Text ?: continue
    // process frame
}
Which way is correct? Or do they do the same thing?
Second question: do I need to use async {} inside the receive loop so as not to block the receive channel? For example:
while (true) {
    val incoming = receiveDeserialized<IncomingDto>()
    async {
        println("starting heavy task")
        // heavy task
        delay(500)
        println("task complete")
    }
}

These are different ways of working with a ReceiveChannel, so use the one that best suits your needs. The difference between consumeEach and the for loop is described here. In the while(true) example, receiveDeserialized receives a frame and deserializes it using an installed content converter. In the flow example, receiveAsFlow represents the given receive channel as a hot flow.
Yes. You can use async or launch so that processing one frame does not block receiving the others.
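As a rough sketch of the second point, the following uses a plain Channel<String> as a stand-in for the session's incoming channel (an assumption, so that the example runs without a Ktor server). The names handleAll and the "done:" prefix are hypothetical. The for loop keeps receiving while each frame is processed in its own coroutine:

```kotlin
import kotlinx.coroutines.Deferred
import kotlinx.coroutines.async
import kotlinx.coroutines.awaitAll
import kotlinx.coroutines.channels.Channel
import kotlinx.coroutines.channels.ReceiveChannel
import kotlinx.coroutines.coroutineScope
import kotlinx.coroutines.delay
import kotlinx.coroutines.runBlocking

// Hypothetical helper: receives every frame and processes each one concurrently.
suspend fun handleAll(incoming: ReceiveChannel<String>): List<String> = coroutineScope {
    val pending = mutableListOf<Deferred<String>>()
    for (frame in incoming) {          // suspends until a frame arrives; ends when the channel closes
        pending += async {
            delay(100)                 // stands in for the heavy task
            "done:$frame"
        }
    }
    pending.awaitAll()                 // results come back in receive order
}

fun main() = runBlocking {
    val incoming = Channel<String>(Channel.UNLIMITED)
    listOf("a", "b", "c").forEach { incoming.send(it) }
    incoming.close()
    println(handleAll(incoming)) // prints [done:a, done:b, done:c]
}
```

If the result of the task is not needed, launch is the more natural choice than async. In a real Ktor handler the webSocket block should already provide a coroutine scope, so launch can be called directly inside the loop.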

Related

How to propagate closing to a chain of flows in kotlin

I am using kotlin and I wanted to stream over a possibly huge resultset using flows. I found some explanations around the web:
Callbacks and Kotlin Flows
Use Flow for asynchronous data streams
I implemented it and it works fine. I also needed to batch the results before sending them to an external services, so I implemented a chunked operation on flows. Something like that:
fun <T> Flow<T>.chunked(chunkSize: Int): Flow<List<T>> {
    return callbackFlow {
        val listOfResult = mutableListOf<T>()
        this@chunked.collect {
            listOfResult.add(it)
            if (listOfResult.size == chunkSize) {
                trySendBlocking(listOfResult.toList())
                listOfResult.clear()
            }
        }
        if (listOfResult.isNotEmpty()) {
            trySendBlocking(listOfResult)
        }
        close()
    }
}
To be sure that everything was working fine, I created some integration tests:
first flow + chunked to consume all rows: it passed.
first flow (the one created from the JDBC repository) + take operator, to consider just a few items: it passed.
first flow + chunked operator + take operator: it hangs forever.
So the last test showed that there was something wrong in the implementation.
I investigated a lot without finding anything useful but, dumping the threads, I found a coroutine thread blocked in the trySendBlocking call on the first flow, the one created in the JDBC repository.
I am wondering how the chunked operator is supposed to propagate the closing to the upstream flow, since this part seems to be missing.
In both cases I am propagating the end of data downstream with a close() call, but I took a look at the take operator and saw that it triggers the closing backwards with an emitAbort(...).
Should I do something similar in the callbackFlow{...}?
After a bit of investigation, I was able to avoid the locking by adding a timeout on the trySendBlocking inside the repository, but I didn't like that. In the end, I realized that I could cast the original flow (in the chunked operator) to a SendChannel and close it if the downstream flow is closed:
trySendBlocking(listOfResult.toList()).onSuccess {
    LOGGER.debug("Sent")
}.onFailure {
    LOGGER.warn("An error occurred sending data.", it)
}.onClosed {
    LOGGER.info("Channel has been closed")
    (originalFlow as SendChannel<*>).close(it)
}
Is this the correct way of closing flows backwards? Any hint to solve this issue?
Thanks!
Don't use trySendBlocking here; use send instead. You should never call a blocking function in a coroutine without wrapping it in withContext with a dispatcher that can handle blocking code (e.g. Dispatchers.IO for blocking I/O). When there is a suspending alternative, use that instead; in this case, send().
Also, callbackFlow is more convoluted than necessary for transforming a flow. Use the standard flow builder instead (so you'll call emit() instead of send()). With the flow builder, cancellation from a downstream operator such as take propagates to the upstream collect automatically, so nothing has to be closed by hand.
fun <T> Flow<T>.chunked(chunkSize: Int): Flow<List<T>> = flow {
    val listOfResult = mutableListOf<T>()
    collect {
        listOfResult.add(it)
        if (listOfResult.size == chunkSize) {
            emit(listOfResult.toList())
            listOfResult.clear()
        }
    }
    if (listOfResult.isNotEmpty()) {
        emit(listOfResult)
    }
}
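To check the failing scenario from the question (chunked followed by take), here is a minimal sketch that runs the flow-builder version against an infinite upstream. The operator is renamed to the hypothetical inChunks only to avoid clashing with any chunked operator the installed coroutines version may already provide; the infinite counter flow is likewise an assumption standing in for the JDBC-backed flow.

```kotlin
import kotlinx.coroutines.flow.Flow
import kotlinx.coroutines.flow.flow
import kotlinx.coroutines.flow.take
import kotlinx.coroutines.flow.toList
import kotlinx.coroutines.runBlocking

// Same logic as the answer's operator, under a hypothetical name.
fun <T> Flow<T>.inChunks(chunkSize: Int): Flow<List<T>> = flow {
    val buffer = mutableListOf<T>()
    collect {
        buffer.add(it)
        if (buffer.size == chunkSize) {
            emit(buffer.toList())   // emit a copy, then reuse the buffer
            buffer.clear()
        }
    }
    if (buffer.isNotEmpty()) emit(buffer.toList())
}

// Stand-in for a huge result set: never completes on its own.
fun endlessNumbers(): Flow<Int> = flow {
    var i = 0
    while (true) emit(i++)
}

fun main() = runBlocking {
    // take(2) stops the collection after two chunks; the upstream loop is aborted with it.
    val chunks = endlessNumbers().inChunks(3).take(2).toList()
    println(chunks) // prints [[0, 1, 2], [3, 4, 5]]
}
```

The program terminates even though the upstream never completes, which is the behavior the callbackFlow version was missing.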

Kotlin Coroutines - Asynchronously consume a sequence

I'm looking for a way to keep a Kotlin sequence that can produce values very quickly from outpacing slower async consumers of its values. In the following code, if the async handleValue(it) cannot keep up with the rate at which the sequence produces values, the imbalance leads to buffering of produced values and, eventually, out-of-memory errors.
getSequence().map { async {
handleValue(it)
}}
I believe this is a classic producer/consumer "back-pressure" situation, and I'm trying to understand how to use Kotlin coroutines to deal with it.
Thanks for any suggestions :)
Kotlin channels and flows both let the producer's data be buffered until the consumer/collector is ready to consume it.
But channels have some drawbacks that flows address; for instance, channels are considered hot streams:
The producer starts dispatching data whether or not a consumer is attached, which can lead to resource leaks.
As long as no consumer is attached to the producer, the producer stays stuck in a suspended state.
Flows, however, are cold streams; nothing is produced until there is something to consume it.
To handle your query with Flows:
GlobalScope.launch {
    flow {
        // Producer
        for (item in getSequence()) emit(item)
    }.map { handleValue(it) }
        .buffer(10) // Optionally specify the buffer size
        .collect { // Collector
        }
}
For my own reference, and to anyone else this may help, here's how I eventually solved this using Channels - https://kotlinlang.org/docs/channels.html#channel-basics
A producer coroutine:
// produce needs a CoroutineScope, so this is declared as a scope extension
fun CoroutineScope.itemChannel(): ReceiveChannel<MyItem> {
    return produce {
        while (moreItems()) {
            send(nextItem()) // <-- suspend until next 'receive()'
        }
    }
}
And a function to run multiple consumer coroutines, each reading off that channel:
fun itemConsumers() {
    runBlocking {
        val channel = itemChannel()
        repeat(numberOfConsumers) {
            launch {
                var more = true
                while (more) {
                    try {
                        val item = channel.receive()
                        // do stuff with item here...
                    } catch (ex: ClosedReceiveChannelException) {
                        more = false
                    }
                }
            }
        }
    }
}
The idea here is that the consumer receives off the channel within the coroutine, so the next receive() is not called until a consumer coroutine finishes handling the last item. This results in the desired back-pressure, as opposed to receiving from a sequence or flow in the main thread, and then passing the item into a coroutine to be consumed. In that scenario there is no back-pressure from the receiver, since the receive happens in a different coroutine than where the received item is consumed.
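The same fan-out idea can be sketched a little more compactly by iterating the channel with a for loop, which ends on its own when the channel is closed. This is only an illustration under assumptions: numbers, consumeWith, and the delay(10) standing in for real work are hypothetical, and Int items replace MyItem.

```kotlin
import kotlinx.coroutines.CoroutineScope
import kotlinx.coroutines.ExperimentalCoroutinesApi
import kotlinx.coroutines.async
import kotlinx.coroutines.awaitAll
import kotlinx.coroutines.channels.ReceiveChannel
import kotlinx.coroutines.channels.produce
import kotlinx.coroutines.coroutineScope
import kotlinx.coroutines.delay
import kotlinx.coroutines.runBlocking

// Producer: with the default rendezvous channel, send() suspends until a consumer receives.
@OptIn(ExperimentalCoroutinesApi::class)
fun CoroutineScope.numbers(count: Int): ReceiveChannel<Int> = produce {
    for (i in 1..count) send(i)
}

// Starts `workers` consumers that share one channel; each item is handled by exactly one of them.
suspend fun consumeWith(workers: Int, count: Int): List<Int> = coroutineScope {
    val channel = numbers(count)
    List(workers) {
        async {
            val handled = mutableListOf<Int>()
            for (item in channel) {   // loop ends when the producer finishes and the channel closes
                delay(10)             // stands in for slow processing
                handled += item
            }
            handled
        }
    }.awaitAll().flatten().sorted()
}

fun main() = runBlocking {
    println(consumeWith(workers = 3, count = 10)) // prints [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]
}
```

Because each consumer only asks for the next item after finishing the previous one, the producer can never run more than a few items ahead of the slowest path.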

How can I get a non-blocking infinite loop in a Kotlin Actor?

I would like to consume some stream-data using Kotlin actors
I was thinking to put my consumer inside an actor, while it polls in an infinite loop while(true). Then, when I decide, I send a message to stop the consumer.
Currently I have this:
while (true) {
    for (message in channel) { // <--- blocked in here, waiting
        when (message) {
            is MessageStop -> consumer.close()
            else -> {}
        }
    }
    consumer.poll()
}
The problem
The problem with this is that it only runs when I send a message to the actor, so my consumer is not polling the rest of the time because channel is blocking waiting to receive the next message
Is there any alternative? Has anyone had the same issue? Or is there something similar to actors in Kotlin that is not blocked by the channel?
Since the channel is just a Channel (https://kotlin.github.io/kotlinx.coroutines/kotlinx-coroutines-core/kotlinx.coroutines.channels/-channel/index.html) you can first check if the channel is empty and if so start your polling. Otherwise handle the messages.
E.g.
while (true) {
    // isEmpty is a property of ReceiveChannel; handle pending messages first
    while (!channel.isEmpty) {
        val message = channel.receive()
        when (message) {
            is MessageStop -> consumer.close()
            else -> {}
        }
    }
    consumer.poll()
}
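A variation on the same idea, sketched under assumptions: instead of checking for emptiness and then receiving, tryReceive() returns immediately whether or not a message is waiting. The Message types, pollUntilStopped, and the poll lambda are hypothetical stand-ins for the actor's messages and for consumer.poll().

```kotlin
import kotlinx.coroutines.channels.Channel
import kotlinx.coroutines.runBlocking
import kotlinx.coroutines.yield

sealed interface Message
object MessageStop : Message

// Polls repeatedly and checks the control channel between polls without suspending on it.
// Returns the number of polls performed before MessageStop was seen.
suspend fun pollUntilStopped(control: Channel<Message>, poll: suspend () -> Unit): Int {
    var polls = 0
    while (true) {
        // tryReceive never suspends; getOrNull() is null when no message is waiting
        if (control.tryReceive().getOrNull() is MessageStop) break
        poll()
        polls++
        yield() // give other coroutines (e.g. the sender) a chance to run
    }
    return polls
}

fun main() = runBlocking {
    val control = Channel<Message>(Channel.UNLIMITED)
    var polled = 0
    val total = pollUntilStopped(control) {
        polled++
        if (polled == 3) control.trySend(MessageStop) // simulate a stop request after three polls
    }
    println(total) // prints 3
}
```

If the real poll call blocks the thread, it would still need to run on a dispatcher suited to blocking work.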
In the end I used Akka with Kotlin; I'm finding it much easier this way.
You should use postDelayed(), for example:
final Runnable r = new Runnable() {
    public void run() {
        // your code here
        handler.postDelayed(this, 1000);
    }
};
You can replace 1000 with the delay you want, in milliseconds. I also highly recommend running your code on a separate thread (if you are not already doing so) to prevent an ANR (Application Not Responding).

Kotlin wrap sequential IO calls as a Sequence

I need to process all of the results from a paged API endpoint. I'd like to present all of the results as a sequence.
I've come up with the following (slightly pseudo-coded):
suspend fun getAllRowsFromAPI(client: Client): Sequence<Row> {
    var currentRequest: Request? = client.requestForNextPage()
    return withContext(Dispatchers.IO) {
        sequence {
            while (currentRequest != null) {
                var rowsInPage = runBlocking { client.makeRequest(currentRequest) }
                currentRequest = client.requestForNextPage()
                yieldAll(rowsInPage)
            }
        }
    }
}
This functions but I'm not sure about a couple of things:
Is the API request happening inside runBlocking still happening with the IO dispatcher?
Is there a way to refactor the code to launch the next request before yielding the current results, then awaiting on it later?
Question 1: The API-request will still run on the IO-dispatcher, but it will block the thread it's running on. This means that no other tasks can be scheduled on that thread while waiting for the request to finish. There's not really any reason to use runBlocking in production-code at all, because:
If makeRequest is already a blocking call, then runBlocking will do practically nothing.
If makeRequest was a suspending call, then runBlocking would make the code less efficient. It wouldn't yield the thread back to the pool while waiting for the request to finish.
Whether makeRequest is a blocking or non-blocking call depends on the client you're using. Here's a non-blocking http-client I can recommend: https://ktor.io/clients/
Question 2: I would use a Flow for this purpose. You can think of it as a suspendable variant of Sequence. Flows are cold, which means that a flow won't run before the consumer asks for its contents (in contrast to being hot, which means the producer pushes new values whether or not the consumer wants them). A Kotlin Flow has an operator called buffer, which you can use to make it request more pages before it has fully consumed the previous page.
The code could look quite similar to what you already have:
// The flow builder is cold, so the function itself does not need to be suspend
fun getAllRowsFromAPI(client: Client): Flow<Row> = flow {
    var currentRequest: Request? = client.requestForNextPage()
    while (currentRequest != null) {
        val rowsInPage = client.makeRequest(currentRequest)
        emitAll(rowsInPage.asFlow())
        currentRequest = client.requestForNextPage()
    }
}.flowOn(Dispatchers.IO)
    .buffer(capacity = 1)
The capacity of 1 means that it will only make 1 more request while processing an earlier page. You could increase the buffer size to make more concurrent requests.
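One detail worth keeping in mind is that buffer applies to whatever the flow emits, so if rows are emitted one by one, the buffer holds rows rather than pages. If the intent is to prefetch whole pages, one option is to emit pages, buffer those, and flatten afterwards. The following is only a sketch: FakeClient, fetchPage, and allRows are hypothetical stand-ins for the paged API, not part of any real client library.

```kotlin
import kotlinx.coroutines.delay
import kotlinx.coroutines.flow.Flow
import kotlinx.coroutines.flow.buffer
import kotlinx.coroutines.flow.flow
import kotlinx.coroutines.flow.toList
import kotlinx.coroutines.flow.transform
import kotlinx.coroutines.runBlocking

// Hypothetical paged client: returns null once there are no more pages.
class FakeClient(private val pages: List<List<String>>) {
    suspend fun fetchPage(index: Int): List<String>? {
        delay(10) // simulated network latency
        return pages.getOrNull(index)
    }
}

fun allRows(client: FakeClient): Flow<String> =
    flow<List<String>> {
        var index = 0
        while (true) {
            val page = client.fetchPage(index++) ?: break
            emit(page)                  // emit whole pages so that buffer() counts pages
        }
    }
        .buffer(1)                      // lets the producer fetch ahead while rows are consumed
        .transform { page -> page.forEach { emit(it) } } // flatten pages into rows

fun main() = runBlocking {
    val client = FakeClient(listOf(listOf("a", "b"), listOf("c")))
    println(allRows(client).toList()) // prints [a, b, c]
}
```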
You should check out this talk from KotlinConf 2019 to learn more about flows: https://www.youtube.com/watch?v=tYcqn48SMT8
Sequences are definitely not the thing you want to use in this case, because they are not designed to work in an asynchronous environment. Perhaps you should take a look at flows and channels, but for your case the best and simplest choice is just a collection of deferred values, because you want to process all requests at once (flows and channels process them one by one, maybe with a limited buffer size).
The following approach allows you to start all requests asynchronously (assuming that makeRequest is a suspending function and supports asynchronous requests). When you need your results, you only have to wait for the slowest request to finish.
fun getClientRequests(client: Client): List<Request> {
    val requests = ArrayList<Request>()
    var currentRequest: Request? = client.requestForNextPage()
    while (currentRequest != null) {
        requests += currentRequest
        currentRequest = client.requestForNextPage()
    }
    return requests
}

// This function is not even a suspending function, so it finishes almost immediately
fun getAllRowsFromAPI(client: Client): List<Deferred<Page>> =
    getClientRequests(client).map {
        /*
         * The better practice would be making getAllRowsFromAPI an extension function
         * on CoroutineScope and calling the receiver scope's async function.
         * GlobalScope is used here just for simplicity.
         */
        GlobalScope.async(Dispatchers.IO) { client.makeRequest(it) }
    }

fun main() {
    val client = Client()
    val deferredPages = getAllRowsFromAPI(client) // This line executes fast
    // Here you can do whatever you want, all requests are processed in background
    Thread.sleep(999L)
    // Then, when we need results....
    val pages = runBlocking {
        deferredPages.map { it.await() }
    }
    println(pages)
    // In your case you also want to "unpack" pages and get rows, you can do it here:
    val rows = pages.flatMap { it.getRows() }
    println(rows)
}
I happened across suspendingSequence in Kotlin's coroutines-examples:
https://github.com/Kotlin/coroutines-examples/blob/090469080a974b962f5debfab901954a58a6e46a/examples/suspendingSequence/suspendingSequence.kt
This is exactly what I was looking for.

How to wrap a Flux with a blocking operation in the subscribe?

In the documentation it is written that you should wrap blocking code into a Mono: http://projectreactor.io/docs/core/release/reference/#faq.wrap-blocking
But it is not written how to actually do it.
I have the following code:
@PostMapping(path = "some-path", consumes = MediaType.APPLICATION_STREAM_JSON_VALUE)
public Mono<Void> doSomething(@Valid @RequestBody Flux<Something> something) {
    something.subscribe(item -> {
        // some blocking operation
    });
    // how to return Mono<Void> here?
}
The first problem I have here is that I need to return something but I can't.
If I would return a Mono.empty for example the request would be closed before the work of the flux is done.
The second problem is: how do I actually wrap the blocking code like it is suggested in the documentation:
Mono blockingWrapper = Mono.fromCallable(() -> {
return /* make a remote synchronous call */
});
blockingWrapper = blockingWrapper.subscribeOn(Schedulers.elastic());
You should not call subscribe within a controller handler, but just build a reactive pipeline and return it. Ultimately, the HTTP client will request data (through the Spring WebFlux engine), and that is what subscribes to the pipeline and requests data from it.
Subscribing manually will decouple the request processing from that other operation, which will 1) remove any guarantee about the order of operations and 2) break the processing if that other operation is using HTTP resources (such as the request body).
In this case, the source is not blocking; only the transform operation is. So we'd better use publishOn to signal that the rest of the chain should be executed on a specific Scheduler. If the operation here is I/O-bound, then Schedulers.elastic() is the best choice (newer Reactor versions replace it with Schedulers.boundedElastic()); if it's CPU-bound, then Schedulers.parallel() is better. Here's an example:
@PostMapping(path = "/some-path", consumes = MediaType.APPLICATION_STREAM_JSON_VALUE)
public Mono<Void> doSomething(@Valid @RequestBody Flux<Something> something) {
    return something.collectList()
        .publishOn(Schedulers.elastic())
        .map(things -> {
            return processThings(things);
        })
        .then();
}

public ProcessingResult processThings(List<Something> things) {
    //...
}
For more information on that topic, check out the Scheduler section in the reactor docs. If your application tends to do a lot of things like this, you're losing a lot of the benefits of reactive streams and you might consider switching to a Servlet-based model where you can configure thread pools accordingly.