Difference between Kotlin arrow IO, IO.fx, IO !effect - kotlin

I am trying to use arrow in kotlin
Arrow has three functions
IO {}
IO.fx {}
IO.fx { !effect}
I want to know the difference between these. I know IO.fx and IO.fx { !effect } help us use side effects, but what's the difference between the two, and why would I use one over the other?

While this is going to change shortly, on version 0.11.X:
IO { } is a constructor that takes a suspend function, so you can call any suspend function inside. It's a shortcut for IO.effect { }
suspend fun bla(): Unit = ...
fun myIO(): IO<Unit> = IO { bla() }
fun otherIO(): IO<Unit> = IO.effect { bla() }
IO.fx { } is the same as IO except it adds a few DSL functions that are shortcuts for other APIs of IO. The most important one is ! or bind, which executes another IO inside.
fun myIO(): IO<Unit> = IO.fx { bla() }
fun nestIO(): IO<IO<Unit>> = IO.fx { myIO() }
fun unpackIO(): IO<Unit> = IO.fx { !myIO() }
Another function it enables is the constructor effect from the first point. So what you're effectively doing is adding an additional layer of wrapping that may not be necessary.
fun inefficientNestIO(): IO<IO<Unit>> = IO.fx { effect { bla() } }
fun inefficientUnpackedIO(): IO<Unit> = IO.fx { !effect { bla() } }
We frequently see that inefficientUnpackedIO from people who come to the support channels, and it's easily replaceable by just IO { bla() }.
Why have two ways of doing the same thing in effect and fx? It's something we're looking to improve in the next releases. We recommend using the least powerful abstraction wherever possible, so reserve fx for cases where you use other IO-based APIs such as scheduling or parallelization.
IO.fx {
val id = getUserIdSuspend()
val friends: List<User> =
!parMapN(
userFriends(id),
IO { userProfile(id) },
::toUsers
)
!friends.parTraverse(IO.applicative()) { user ->
IO { broadcastStatus(user) }
}
}

Related

Why is this extension function slower than non extension counterpart?

I was trying to write a parallel map extension function to do map operation over a List in parallel using coroutines.
However there is a significant overhead in my solution and I can't find out why.
This is my implementation of the pmap extension function:
fun <T, U> List<T>.pmap(scope: CoroutineScope = GlobalScope,
transform: suspend (T) -> U): List<U> {
return map { i -> scope.async { transform(i) } }.map { runBlocking { it.await() } }
}
However, compared with doing the exact same operation inline, calling the extension function takes up to 100 ms longer (which is a lot).
I tried using inline but it had no effect.
I'm leaving here the full test I've done to demonstrate this behavior:
import kotlinx.coroutines.*
import kotlin.system.measureTimeMillis
fun main() {
test()
}
fun <T, U> List<T>.pmap(scope: CoroutineScope = GlobalScope,
transform: suspend (T) -> U): List<U> {
return this.map { i -> scope.async { transform(i) } }.map { runBlocking { it.await() } }
}
fun test() {
val list = listOf<Long>(100,200,300)
val transform: suspend (Long) -> Long = { long: Long ->
delay(long)
long*2
}
val timeTakenPmap = measureTimeMillis {
list.pmap(GlobalScope) { transform(it) }
}
val manualpmap = measureTimeMillis {
list.map { GlobalScope.async { transform(it) } }
.map { runBlocking { it.await() } }
}
val timeTakenMap = measureTimeMillis {
list.map { runBlocking { transform(it) } }
}
println("pmapTime: $timeTakenPmap - mapTime: $timeTakenMap - manualpmap: $manualpmap")
}
It can be run in kotlin playground: https://pl.kotl.in/CIXVqezg3
In the playground it prints this result:
pmapTime: 411 - mapTime: 602 - manualpmap: 302
MapTime and manualPmap give reasonable results, only 2ms of time outside the delays. But pmapTime is way off. And the code between manualpmap and pmap looks exactly the same to me.
In my own machine it runs a little faster, pmap takes around 350ms.
Does anyone know why this happens?
First of all, manual benchmarks like this are usually of very little significance. There are many things that can be optimized away by the compiler or the JIT and any conclusion can be quite wrong. If you really want to compare things, you should instead use benchmarking libraries which take into account JVM warmup etc.
Now, the overhead you see (if you could confirm there was an actual overhead) might be caused by the fact that your higher-order extension is not marked inline, so instances of the lambda you pass need to be created. But as @Tenfour04 noted, there are many other possible reasons: thread pool lazy initialization, significance of the list size, etc.
That being said, this is really not an appropriate way to write parallel map, for several reasons:
GlobalScope is a pretty bad default in general, and should be used in very specific situations only. But don't worry about it because of the next point.
You don't need an externally provided CoroutineScope if the coroutines you launch do not outlive your method. Instead, use coroutineScope { ... } and make your function suspend, and the caller will choose the context if they need to.
map { it.await() } is inefficient in case of errors: if the last element's transformation immediately fails, map will wait for all previous elements to finish before failing. You should prefer awaitAll which takes care of this.
runBlocking should be avoided in coroutines (blocking threads in general, especially when you don't control which thread you're blocking), so using it in deep library-like functions like this is dangerous, because it will likely be used in coroutines at some point.
Applying those points gives:
suspend inline fun <T, U> List<T>.pmap(crossinline transform: suspend (T) -> U): List<U> {
return coroutineScope {
map { async { transform(it) } }.awaitAll()
}
}
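As a rough illustration of how such a function behaves, here is a minimal, self-contained sketch. It assumes kotlinx.coroutines is on the classpath and declares pmap without inline to keep the example simple; the input values and the timing printout are only illustrative.

```kotlin
import kotlinx.coroutines.async
import kotlinx.coroutines.awaitAll
import kotlinx.coroutines.coroutineScope
import kotlinx.coroutines.delay
import kotlinx.coroutines.runBlocking
import kotlin.system.measureTimeMillis

// Sketch of a parallel map: coroutineScope waits for all children,
// and awaitAll fails as soon as any of them fails.
suspend fun <T, U> List<T>.pmap(transform: suspend (T) -> U): List<U> = coroutineScope {
    map { async { transform(it) } }.awaitAll()
}

fun main() = runBlocking {
    val input = listOf(100L, 200L, 300L)
    val elapsed = measureTimeMillis {
        val doubled = input.pmap { ms ->
            delay(ms) // suspends without blocking the thread
            ms * 2
        }
        println(doubled) // [200, 400, 600]
    }
    // The delays overlap, so this should be close to the longest one rather than their sum.
    println("took $elapsed ms")
}
```

Because the function is suspend and uses coroutineScope, the caller decides the dispatcher (for example with withContext) instead of passing a scope in.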

Access ApplicationCall in object without propagation

Is there a thread-safe method in Ktor where it is possible to statically access the current ApplicationCall? I am trying to get the following simple example to work:
object Main {
fun start() {
val server = embeddedServer(Jetty, 8081) {
intercept(ApplicationCallPipeline.Call) {
// START: this will be more dynamic in the future, we don't want to pass ApplicationCall
Addon.processRequest()
// END: this will be more dynamic in the future, we don't want to pass ApplicationCall
call.respondText(output, ContentType.Text.Html, HttpStatusCode.OK)
return@intercept finish()
}
}
server.start(wait = true)
}
}
fun main(args: Array<String>) {
Main.start();
}
object Addon {
fun processRequest() {
val call = RequestUtils.getCurrentApplicationCall()
// processing of call.request.queryParameters
// ...
}
}
object RequestUtils {
fun getCurrentApplicationCall(): ApplicationCall {
// Here is where I am getting lost..
return null
}
}
I would like to be able to get the ApplicationCall for the current context to be available statically from the RequestUtils so that I can access information about the request anywhere. This of course needs to scale to be able to handle multiple requests at the same time.
I have done some experiments with dependency injection and ThreadLocal, but with no success.
Well, the application call is passed to a coroutine, so it's really dangerous to try and get it "statically", because all requests are treated in a concurrent context.
Kotlin official documentation talks about Thread-local in the context of coroutine executions. It uses the concept of CoroutineContext to restore Thread-Local values in specific/custom coroutine context.
However, if you are able to design a fully asynchronous API, you will be able to bypass thread-locals by directly creating a custom CoroutineContext, embedding the request call.
EDIT: I've updated my example code to test 2 flavors:
async endpoint: Solution fully based on Coroutine contexts and suspend functions
blocking endpoint: Uses a thread-local to store application call, as referred in kotlin doc.
import io.ktor.server.engine.embeddedServer
import io.ktor.server.jetty.Jetty
import io.ktor.application.*
import io.ktor.http.ContentType
import io.ktor.http.HttpStatusCode
import io.ktor.response.respondText
import io.ktor.routing.get
import io.ktor.routing.routing
import kotlinx.coroutines.asContextElement
import kotlinx.coroutines.launch
import kotlin.coroutines.AbstractCoroutineContextElement
import kotlin.coroutines.CoroutineContext
import kotlin.coroutines.coroutineContext
/**
* Thread local in which you'll inject application call.
*/
private val localCall : ThreadLocal<ApplicationCall> = ThreadLocal();
object Main {
fun start() {
val server = embeddedServer(Jetty, 8081) {
routing {
// Solution requiring full coroutine / suspendable execution.
get("/async") {
// Ktor will launch this block of code in a coroutine, so you can create a subroutine with
// an overloaded context providing needed information.
launch(coroutineContext + ApplicationCallContext(call)) {
PrintQuery.processAsync()
}
}
// Solution based on Thread-Local, not requiring suspending functions
get("/blocking") {
launch (coroutineContext + localCall.asContextElement(value = call)) {
PrintQuery.processBlocking()
}
}
}
intercept(ApplicationCallPipeline.ApplicationPhase.Call) {
call.respondText("Hé ho", ContentType.Text.Plain, HttpStatusCode.OK)
}
}
server.start(wait = true)
}
}
fun main() {
Main.start();
}
interface AsyncAddon {
/**
* Asynchronicity propagates in order to properly access coroutine execution information
*/
suspend fun processAsync();
}
interface BlockingAddon {
fun processBlocking();
}
object PrintQuery : AsyncAddon, BlockingAddon {
override suspend fun processAsync() = processRequest("async", fetchCurrentCallFromCoroutineContext())
override fun processBlocking() = processRequest("blocking", fetchCurrentCallFromThreadLocal())
private fun processRequest(prefix : String, call : ApplicationCall?) {
println("$prefix -> Query parameter: ${call?.parameters?.get("q") ?: "NONE"}")
}
}
/**
* Custom coroutine context allow to provide information about request execution.
*/
private class ApplicationCallContext(val call : ApplicationCall) : AbstractCoroutineContextElement(Key) {
companion object Key : CoroutineContext.Key<ApplicationCallContext>
}
/**
* This is your RequestUtils rewritten as a top-level function. It is declared as a suspend function;
* otherwise you would not be able to access coroutineContext.
*/
suspend fun fetchCurrentCallFromCoroutineContext(): ApplicationCall? {
// Look up the custom element in the current coroutine context.
return coroutineContext.get(ApplicationCallContext.Key)?.call
}
fun fetchCurrentCallFromThreadLocal() : ApplicationCall? {
return localCall.get()
}
You can test it in your browser:
http://localhost:8081/blocking?q=test1
http://localhost:8081/blocking?q=test2
http://localhost:8081/async?q=test3
server log output:
blocking -> Query parameter: test1
blocking -> Query parameter: test2
async -> Query parameter: test3
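To see the thread-local mechanism in isolation, without Ktor, the following minimal sketch may help. It assumes kotlinx.coroutines is available; currentRequest, describeRequest and lookupAll are hypothetical names, and a plain String stands in for the ApplicationCall.

```kotlin
import kotlinx.coroutines.Dispatchers
import kotlinx.coroutines.asContextElement
import kotlinx.coroutines.async
import kotlinx.coroutines.awaitAll
import kotlinx.coroutines.delay
import kotlinx.coroutines.runBlocking

// Hypothetical stand-in for the per-request ApplicationCall.
val currentRequest = ThreadLocal<String?>()

// Plain, non-suspending code can read the thread-local directly.
fun describeRequest(): String = currentRequest.get() ?: "NONE"

// Runs one coroutine per id; each coroutine sees its own thread-local value.
fun lookupAll(ids: List<String>): List<String> = runBlocking {
    ids.map { id ->
        async(Dispatchers.Default + currentRequest.asContextElement(value = id)) {
            delay(10) // after this, the coroutine may resume on another thread
            describeRequest()
        }
    }.awaitAll()
}

fun main() {
    println(lookupAll(listOf("test1", "test2"))) // [test1, test2]
    println(describeRequest()) // NONE: the value was never set outside those coroutines
}
```

asContextElement sets the value each time the coroutine resumes on a thread and restores the previous value when it suspends, which is why concurrent requests do not see each other's data.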
The key mechanism you want to use for this is the CoroutineContext. This is the place that you can set key value pairs to be used in any child coroutine or suspending function call.
I will try to lay out an example.
First, let us define a CoroutineContextElement that will let us add an ApplicationCall to the CoroutineContext.
class ApplicationCallElement(var call: ApplicationCall?) : AbstractCoroutineContextElement(ApplicationCallElement) {
companion object Key : CoroutineContext.Key<ApplicationCallElement>
}
Now we can define some helpers that will add the ApplicationCall on one of our routes. (This could be done as some sort of Ktor plugin that listens to the pipeline, but I don't want to add too much noise here.)
suspend fun PipelineContext<Unit, ApplicationCall>.withCall(
bodyOfCall: suspend PipelineContext<Unit, ApplicationCall>.() -> Unit
) {
val pipeline = this
val appCallContext = buildAppCallContext(this.call)
withContext(appCallContext) {
pipeline.bodyOfCall()
}
}
internal suspend fun buildAppCallContext(call: ApplicationCall): CoroutineContext {
var context = coroutineContext
val callElement = ApplicationCallElement(call)
context = context.plus(callElement)
return context
}
And then we can use it all together like in this test case below where we are able to get the call from a nested suspending function:
suspend fun getSomethingFromCall(): String {
val call = coroutineContext[ApplicationCallElement.Key]?.call ?: throw Exception("Element not set")
return call.parameters["key"] ?: throw Exception("Parameter not set")
}
fun Application.myApp() {
routing {
route("/foo") {
get {
withCall {
call.respondText(getSomethingFromCall())
}
}
}
}
}
class ApplicationCallTest {
@Test
fun `we can get the application call in a nested function`() {
withTestApplication({ myApp() }) {
with(handleRequest(HttpMethod.Get, "/foo?key=bar")) {
assertEquals(HttpStatusCode.OK, response.status())
assertEquals("bar", response.content)
}
}
}
}
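The same idea can be tried outside Ktor. The sketch below is only an illustration under stated assumptions: RequestInfo, currentQuery and handle are hypothetical names, a query string replaces the ApplicationCall, and kotlinx.coroutines is assumed to be on the classpath.

```kotlin
import kotlinx.coroutines.runBlocking
import kotlinx.coroutines.withContext
import kotlin.coroutines.AbstractCoroutineContextElement
import kotlin.coroutines.CoroutineContext
import kotlin.coroutines.coroutineContext

// Hypothetical context element carrying per-request data.
class RequestInfo(val query: String) : AbstractCoroutineContextElement(Key) {
    companion object Key : CoroutineContext.Key<RequestInfo>
}

// Any suspending function, however deeply nested, can read the element.
suspend fun currentQuery(): String? = coroutineContext[RequestInfo]?.query

suspend fun handle(): String = "q=${currentQuery() ?: "NONE"}"

fun main() = runBlocking {
    val inside = withContext(RequestInfo("test1")) { handle() }
    val outside = handle()
    println(inside)  // q=test1
    println(outside) // q=NONE
}
```

The element is only visible inside the withContext block and in coroutines started from it, so nothing has to be passed through function parameters and nothing leaks between requests.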

coroutine scope and async - right approach?

I love the concept of coroutines and I've been using them in my Android projects. Currently I'm working on a JVM module which I'll be including in a Ktor project, and I know Ktor has support for coroutines.
(find the attached code snippet)
Just wanted to know is this the right approach?
How do i use async with recursion?
Any resources that you can recommend which can help me grasp more in-depth knowledge of co-routines would be helpful.
Thanks in advance!
override suspend fun processInstruction(args.. ): List<Any> = coroutineScope {
val dataWithFields = async{
listOfFields.fold(mutableListOf<Any>()){ acc,field ->
val data = someProcess(field)
val nested = processInstruction(...nestedField) // nested call
acc.addAll(data)
acc.addAll(nested)
acc
}
}
return@coroutineScope postProcessData(dataWithFields.await())
}
If you want to process all nested calls in parallel, you should wrap each of them in async (async should be inside of the loop). And then, after the loop, you should await all the results. (In your code you run await right after single async, so there is no parallel execution).
For example, if you have Element:
interface Element {
val subElements: List<Element>
suspend fun calculateData(): SomeData
}
interface SomeData
And you want to calculateData of all subElements in parallel, you can do it like this:
suspend fun Element.calculateAllData(): List<SomeData> = coroutineScope {
val data = async { calculateData() }
val subData = subElements.map { sub -> async { sub.calculateAllData() } }
return@coroutineScope listOf(data.await()) + subData.awaitAll().flatten()
}
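As you said in the comments, you need the parent data to calculate the sub-data, so the first thing calculateAllData() should do is calculate the parent data. This assumes calculateData in the interface is changed to take the parent data as a parameter, i.e. suspend fun calculateData(parentData: SomeData): SomeData, and that defaultParentData() supplies the value for the root element: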
suspend fun Element.calculateAllData(
parentData: SomeData = defaultParentData()
): List<SomeData> = coroutineScope {
val data = calculateData(parentData)
val subData = subElements.map { sub -> async { sub.calculateAllData(data) } }
return@coroutineScope listOf(data) + subData.awaitAll().flatten()
}
Now you may wonder how fast it works. Consider the following Element implementation:
class ElementImpl(override val subElements: List<Element>) : Element {
override suspend fun calculateData(parentData: SomeData): SomeData {
delay(1000)
return object : SomeData {} // SomeData is an interface, so create an anonymous instance
}
}
fun elmOf(vararg elements: Element) = ElementImpl(listOf(*elements))
And the following test:
println(measureTime {
elmOf(
elmOf(),
elmOf(
elmOf(),
elmOf(
elmOf(),
elmOf(),
elmOf()
)
),
elmOf(
elmOf(),
elmOf()
),
elmOf()
).calculateAllData()
})
If parent-data isn't needed to calculate sub-data, it prints 1.06s, since in this case, all the data is calculated in parallel. Otherwise, it prints 4.15s, since elements tree height is 4.
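For reference, the fully parallel variant can be written as a small runnable program. This is a sketch under assumptions: Node and collectNames are hypothetical names used instead of Element and calculateAllData, the 100 ms delay stands in for real work, and kotlinx.coroutines is assumed to be available.

```kotlin
import kotlinx.coroutines.async
import kotlinx.coroutines.awaitAll
import kotlinx.coroutines.coroutineScope
import kotlinx.coroutines.delay
import kotlinx.coroutines.runBlocking

// Hypothetical tree node; `name` plays the role of the computed data.
class Node(val name: String, val children: List<Node> = emptyList())

// Start this node's work and every child's recursive work first, then await them all.
suspend fun Node.collectNames(): List<String> = coroutineScope {
    val own = async {
        delay(100) // simulated work for this node
        name
    }
    val sub = children.map { child -> async { child.collectNames() } }
    listOf(own.await()) + sub.awaitAll().flatten()
}

fun main() = runBlocking {
    val tree = Node(
        "root",
        listOf(
            Node("a", listOf(Node("a1"), Node("a2"))),
            Node("b")
        )
    )
    println(tree.collectNames()) // [root, a, a1, a2, b]
}
```

The results come back in tree order because awaitAll preserves the order of the deferred list, even though the work itself runs concurrently.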

How to execute a program with Kotlin and Arrow

I'm trying to learn a bit of functional programming using Kotlin and Arrow, and I've already read some blog posts such as https://jorgecastillo.dev/kotlin-fp-1-monad-stack. It is good and I understood the main idea, but when creating a program, I can't figure out how to run it.
Let me be more explicit:
I have the following piece of code:
typealias EitherIO<A, B> = EitherT<ForIO, A, B>
sealed class UserError(
val message: String,
val status: Int
) {
object AuthenticationError : UserError(HttpStatus.UNAUTHORIZED.reasonPhrase, HttpStatus.UNAUTHORIZED.value())
object UserNotFound : UserError(HttpStatus.NOT_FOUND.reasonPhrase, HttpStatus.NOT_FOUND.value())
object InternalServerError : UserError(HttpStatus.INTERNAL_SERVER_ERROR.reasonPhrase, HttpStatus.INTERNAL_SERVER_ERROR.value())
}
@Component
class UserAdapter(
private val myAccountClient: MyAccountClient
) {
@Lazy
@Inject
lateinit var subscriberRepository: SubscriberRepository
fun getDomainUser(ssoId: Long): EitherIO<UserError, User?> {
val io = IO.fx {
val userResource = getUserResourcesBySsoId(ssoId, myAccountClient).bind()
userResource.fold(
{ error -> Either.Left(error) },
{ success ->
Either.right(composeDomainUserWithSubscribers(success, getSubscribersForUserResource(success, subscriberRepository).bind()))
})
}
return EitherIO(io)
}
fun composeDomainUserWithSubscribers(userResource: UserResource, subscribers: Option<Subscribers>): User? {
return subscribers.map { userResource.toDomainUser(it) }.orNull()
}
}
private fun getSubscribersForUserResource(userResource: UserResource, subscriberRepository: SubscriberRepository): IO<Option<Subscribers>> {
return IO {
val msisdnList = userResource.getMsisdnList()
Option.invoke(subscriberRepository.findAllByMsisdnInAndDeletedIsFalse(msisdnList).associateBy(Subscriber::msisdn))
}
}
private fun getUserResourcesBySsoId(ssoId: Long, myAccountClient: MyAccountClient): IO<Either<UserError, UserResource>> {
return IO {
val response = myAccountClient.getUserBySsoId(ssoId)
if (response.isSuccessful) {
val userResource = JacksonUtils.fromJsonToObject(response.body()?.string()!!, UserResource::class.java)
Either.Right(userResource)
} else {
when (response.code()) {
401 -> Either.Left(UserError.AuthenticationError)
404 -> Either.Left(UserError.UserNotFound)
else -> Either.Left(UserError.InternalServerError)
}
}
}.handleError { Either.Left(UserError.InternalServerError) }
}
which, as you can see is accumulating some results into an IO monad. I should run this program using unsafeRunSync() from arrow, but on javadoc it's stated the following: **NOTE** this function is intended for testing, it should never appear in your mainline production code!.
I should mention that I know about unsafeRunAsync, but in my case I want to be synchronous.
Thanks!
Instead of running unsafeRunSync, you should favor unsafeRunAsync.
If you have myFun(): IO<A> and want to run this, then you call myFun().unsafeRunAsync(cb) where cb: (Either<Throwable, A>) -> Unit.
For instance, if your function returns IO<List<Int>> then you can call
myFun().unsafeRunAsync { /* it (Either<Throwable, List<Int>>) -> */
it.fold(
{ Log.e("Foo", "Error! $it") },
{ println(it) })
}
This will run the program contained in the IO asynchronously and pass the result safely to the callback, which will log an error if the IO threw, and otherwise it will print the list of integers.
You should avoid unsafeRunSync for a number of reasons, discussed here. It's blocking, it can cause crashes, it can cause deadlocks, and it can halt your application.
If you really want to run your IO as a blocking computation, then you can precede this with attempt() to have your IO<A> become an IO<Either<Throwable, A>> similar to the unsafeRunAsync callback parameter. At least then you won't crash.
But unsafeRunAsync is preferred. Also, make sure the callback passed to unsafeRunAsync won't throw any errors, as it's assumed it won't. Docs.

Kotlin Process Collection In Parallel?

I have a collection of objects, which I need to perform some transformation on. Currently I am using:
var myObjects: List<MyObject> = getMyObjects()
myObjects.forEach{ myObj ->
someMethod(myObj)
}
It works fine, but I was hoping to speed it up by running someMethod() in parallel, instead of waiting for each object to finish, before starting on the next one.
Is there any way to do this in Kotlin? Maybe with doAsyncTask or something?
I know when this was asked over a year ago it was not possible, but now that Kotlin has coroutines like doAsyncTask I am curious if any of the coroutines can help
Yes, this can be done using coroutines. The following function applies an operation in parallel on all elements of a collection:
fun <A>Collection<A>.forEachParallel(f: suspend (A) -> Unit): Unit = runBlocking {
map { async(CommonPool) { f(it) } }.forEach { it.await() }
}
While the definition itself is a little cryptic, you can then easily apply it as you would expect:
myObjects.forEachParallel { myObj ->
someMethod(myObj)
}
Parallel map can be implemented in a similar way, see https://stackoverflow.com/a/45794062/1104870.
Java Stream is simple to use in Kotlin:
tasks.stream().parallel().forEach { computeNotSuspend(it) }
If you are using Android however, you cannot use Java 8 if you want an app compatible with an API lower than 24.
You can also use coroutines as you suggested. But they are not really part of the language as of now (August 2017), and you need to install an external library. There is a very good guide with examples.
runBlocking<Unit> {
val deferreds = tasks.map { async(CommonPool) { compute(it) } }
deferreds.forEach { it.await() }
}
Note that coroutines are implemented with non-blocking multi-threading, which means they can be faster than traditional multi-threading. The code below benchmarks the parallel Stream against coroutines, and in that case the coroutine approach is 7 times faster on my machine. However, you have to do some work yourself to make sure your code is "suspending" (non-blocking), which can be quite tricky. In my example I'm just calling delay, which is a suspend function provided by the library. Non-blocking multi-threading is not always faster than traditional multi-threading. It can be faster if you have many threads doing nothing but waiting on IO, which is roughly what my benchmark does.
My benchmarking code:
import kotlinx.coroutines.experimental.CommonPool
import kotlinx.coroutines.experimental.async
import kotlinx.coroutines.experimental.delay
import kotlinx.coroutines.experimental.launch
import kotlinx.coroutines.experimental.runBlocking
import java.util.*
import kotlin.system.measureNanoTime
import kotlin.system.measureTimeMillis
class SomeTask() {
val durationMS = random.nextInt(1000).toLong()
companion object {
val random = Random()
}
}
suspend fun compute(task: SomeTask): Unit {
delay(task.durationMS)
//println("done ${task.durationMS}")
return
}
fun computeNotSuspend(task: SomeTask): Unit {
Thread.sleep(task.durationMS)
//println("done ${task.durationMS}")
return
}
fun main(args: Array<String>) {
val n = 100
val tasks = List(n) { SomeTask() }
val timeCoroutine = measureNanoTime {
runBlocking<Unit> {
val deferreds = tasks.map { async(CommonPool) { compute(it) } }
deferreds.forEach { it.await() }
}
}
println("Coroutine ${timeCoroutine / 1_000_000} ms")
val timePar = measureNanoTime {
tasks.stream().parallel().forEach { computeNotSuspend(it) }
}
println("Stream parallel ${timePar / 1_000_000} ms")
}
Output on my 4 cores computer:
Coroutine: 1037 ms
Stream parallel: 7150 ms
If you uncomment the println in the two compute functions, you will see that in the non-blocking coroutine code the tasks are processed in the right order, but not with Streams.
You can use RxJava to solve this.
val items: List<MyObjects> = getList()
Observable.from(items).flatMap(object : Func1<MyObjects, Observable<String>> {
override fun call(item: MyObjects): Observable<String> {
return someMethod(item)
}
}).subscribeOn(Schedulers.io()).observeOn(AndroidSchedulers.mainThread()).subscribe(object : Subscriber<String>() {
override fun onCompleted() {
}
override fun onError(e: Throwable) {
}
override fun onNext(s: String) {
// do on output of each string
}
})
By subscribing on Schedulers.io(), someMethod is scheduled on a background thread.
To process items of a collection in parallel you can use Kotlin Coroutines. For example the following extension function processes items in parallel and waits for them to be processed:
suspend fun <T, R> Iterable<T>.processInParallel(
dispatcher: CoroutineDispatcher = Dispatchers.IO,
processBlock: suspend (v: T) -> R,
): List<R> = coroutineScope { // or supervisorScope
map {
async(dispatcher) { processBlock(it) }
}.awaitAll()
}
This is a suspend extension function on Iterable<T> that processes the items in parallel and returns the result of processing each item. By default it uses the Dispatchers.IO dispatcher to offload blocking tasks to a shared pool of threads. It must be called from a coroutine (including one running on Dispatchers.Main) or from another suspend function.
Example of calling from a coroutine:
val myObjects: List<MyObject> = getMyObjects()
someCoroutineScope.launch {
val results = myObjects.processInParallel {
someMethod(it)
}
// use processing results
}
where someCoroutineScope is an instance of CoroutineScope.
Or if you want to just launch and forget you can use this function:
fun <T> CoroutineScope.processInParallelAndForget(
iterable: Iterable<T>,
dispatcher: CoroutineDispatcher = Dispatchers.IO,
processBlock: suspend (v: T) -> Unit
) = iterable.forEach {
launch(dispatcher) { processBlock(it) }
}
This is an extension function on CoroutineScope that doesn't return any result. It also uses the Dispatchers.IO dispatcher by default. It can be called on a CoroutineScope or from within another coroutine.
Calling example:
someCoroutineScope.processInParallelAndForget(myObjects) {
someMethod(it)
}
// OR from another coroutine:
someCoroutineScope.launch {
processInParallelAndForget(myObjects) {
someMethod(it)
}
}
where someCoroutineScope is an instance of CoroutineScope.