Should I use synchronous Mono::map or asynchronous Mono::flatMap? - spring-webflux

The Project Reactor documentation says that Mono::flatMap is asynchronous, as shown below.
So, I can write all my methods to return Mono publishers like this.
public Mono<String> myMethod(String name) {
    return Mono.just("hello " + name);
}
and use it with Mono::flatMap like this:
Mono.just("name").flatMap(this::myMethod);
Does this make the execution of my method asynchronous? Does this make my code more reactive, better and faster than just using Mono::map? Is the overhead prohibitive for doing this for all my methods?
public final <R> Mono<R> flatMap(Function<? super T,? extends Mono<? extends R>> transformer)
Transform the item emitted by this Mono asynchronously, returning the value emitted by another Mono (possibly changing the value type).

Does this make the execution of my method asynchronous?
Let's start with a definition of asynchronous:
Asynchronous programming is a means of parallel programming in which a unit of work runs separately from the main application thread and notifies the calling thread of its completion, failure or progress.
Here your unit of work runs on the same thread, unless you add a subscribeOn with a Scheduler. So this isn't asynchronous.
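For illustration, a minimal sketch of what does and doesn't move the work off the calling thread (the subscribe callbacks and the choice of Schedulers.boundedElastic() are just placeholders):

// Everything here runs on the thread that calls subscribe() - nothing asynchronous happens.
Mono.just("name")
    .flatMap(this::myMethod)
    .subscribe(System.out::println);

// Only an explicit Scheduler moves the work onto another thread.
Mono.just("name")
    .flatMap(this::myMethod)
    .subscribeOn(Schedulers.boundedElastic())
    .subscribe(System.out::println);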
Does this make my code more reactive, better and faster than just using Mono::map?
Not at all. In this case the publisher Mono.just("hello " + name) immediately signals completion to its subscriber, so the thread that was already doing the processing picks that event up from the event loop and carries on with the response straight away.
If anything, this involves a few more internal operations than a map, which simply transforms the element.
So, ideally, you should reach for flatMap when the transformation involves an I/O operation (such as a database or network call) that might take some time; that waiting time can then be used for other work if all the threads are busy. A rough sketch of the split follows below.
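As a rough sketch of that split (userRepository and its findByName method are hypothetical names for a reactive repository):

// map: a plain in-memory transformation, no extra publisher involved.
Mono<String> upper = Mono.just("name").map(String::toUpperCase);

// flatMap: the transformation itself returns a Mono, e.g. a non-blocking DB call.
Mono<User> user = Mono.just("name").flatMap(userRepository::findByName);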

Related

Kotlin coroutine executes sequentially, but only on the production machine

I have a bunch of network requests I want to conduct in parallel.
The following pseudo code should give a good idea of what I'm doing right now:
runBlocking {
    buildList {
        withContext(tracer.asContextElement()) {
            items.forEach { item ->
                add(
                    async {
                        // a few IO intensive operations (i.e. network requests)
                    }
                )
            }
        }
    }.awaitAll()
}
I have tracing tools set up and locally this seems to do the job. In my production infrastructure however the async tasks execute sequentially, i.e. the second one starts immediately after the first one finishes.
I have also tried using withContext(Dispatchers.IO.plus(tracer.asContextElement())) but I observe no difference.
The only thing I can say is that my development machine has multiple CPU cores, and my production machine will normally have 1. Regardless, due to the IO heavy nature of these processes I doubt this is the problem. I can't really explain what is causing this, but my gut feeling is that I'm fundamentally not understanding something about how Coroutines work in Kotlin.
As to the nature of the network request in question, I'm using a third party SDK that asynchronously executes the request, and seems to use ForkJoinPool.commonPool() under the hood as an executor.
If you don't switch dispatchers here, all those coroutines will run in the same thread - the one blocked by runBlocking. If the computation inside each coroutine is blocking, they will block the only thread one by one, without any way to parallelize. This would explain what you're seeing (although it's strange that you can't reproduce it locally).
I have also tried using withContext(Dispatchers.IO.plus(tracer.asContextElement())) but I observe no difference.
Your fix should work, unless the IO you're performing is actually managing threads itself and also confining the execution to a single thread no matter where it's called from. Maybe you should look into the actual IO then.
EDIT: you mentioned that you perform the IO operations via a third party SDK that uses the common ForkJoinPool - this one is backed by a single thread on a single-CPU machine, so this explains why the calls aren't parallelized in your single-CPU production machine. The only options to fix that would be:
check whether the SDK you're using allows to customize the backing pool of threads
customize the size of the ForkJoinPool using the JVM property java.util.concurrent.ForkJoinPool.common.parallelism
use another SDK :)
You still need to customize the dispatcher in addition to that if you're calling the library in a blocking way, but not if you're converting their async tasks into suspensions using Future.await() or similar.
Now, a few other things to note in this code:
you don't need buildList { .. }, you can just use map { thing } instead of forEach { add(thing) } and you'll get the resulting list as a return value (it also works across withContext, because it returns the lambda result)
withContext actually waits for all child coroutines to finish, so awaitAll() is misplaced here (it should rather be inside withContext)
actually, you probably don't need withContext at all, you can pass the custom context directly to runBlocking, unless you have other things in runBlocking that you don't want to run in this context
(optional) if the IO computations don't return results, you don't need awaitAll at all, and you could just use launch instead.
Assuming you do need the result, so ignoring the last point, your current code (with dispatcher fix) could be rewritten to:
val results = runBlocking(Dispatchers.IO + tracer.asContextElement()) {
    items.map { item ->
        async {
            performIO(item)
        }
    }.awaitAll()
}
Otherwise:
runBlocking(Dispatchers.IO + tracer.asContextElement()) {
    items.map { item ->
        launch {
            performIO(item)
        }
    }
}

Schedulers in Project Reactor with Spring Webflux

Project Reactor is awesome; I can easily switch part of the processing to another thread. But I've looked inside the Schedulers.fromExecutorService() method, and it seems that every call allocates a new ExecutorService, so schedulers are created and allocated again each time the method is called. I'm not sure, but I think this is a potential memory leak...
Mono<String> sometext() {
    return Mono
        .fromCallable(() -> "")
        .subscribeOn(Schedulers.newParallel("my-custom"));
}
I wonder about registering the Scheduler as a bean - it would be a singleton, so it is allocated only once instead of every time - or creating it in the constructor. Many blogs explain the threading model this way:
...
private final Scheduler scheduler = Schedulers.newParallel("my-custom");
..
Mono.fromCallable(() -> "").subscribeOn(scheduler)
Schedulers.newParallel() will indeed create a new scheduler with an associated backing thread pool every time you call it - so yes, you're correct: if you're using that method, you want to make sure you store a reference to the scheduler somewhere so you can reuse it. Simply providing the same name argument won't retrieve the existing scheduler; it will just create a different one with the same name.
How you do this is up to you - it can be via a spring bean (as long as it's a singleton and not a prototype bean!), a field, or whatever else fits best in with your use case.
However, before all of this I'd first consider whether you definitely need to create a separate parallel scheduler at all. The Schedulers.parallel() scheduler is a default parallel scheduler that can be used for parallel work out of the box (it doesn't create a new one on each invocation), and unless you need separately configured parallel schedulers for separate services for some reason, best practice is just to use that.
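If you do need your own scheduler, a minimal sketch of the bean approach (the configuration class and bean names are placeholders) could look like this:

@Configuration
class SchedulerConfig {

    // Created once, reused by every component that injects it,
    // and disposed when the application context shuts down.
    @Bean(destroyMethod = "dispose")
    Scheduler customScheduler() {
        return Schedulers.newParallel("my-custom");
    }
}

Components then inject this single Scheduler and pass it to subscribeOn(), rather than calling Schedulers.newParallel() on every subscription.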

Proper logging in reactive application - WebFlux

Lately I have been thinking about the proper use of loggers in our applications.
For example, I have a controller which returns a stream of users, but in the log I see that the "Fetch users" message is logged by a different thread than the one running the processing pipeline. Is that a good approach?
@Slf4j
class AwesomeController {

    @GetMapping(path = "/users")
    public Flux<User> getUsers() {
        log.info("Fetch users..");
        return Flux.just(...).subscribeOn(Schedulers.newParallel("my-custom"));
    }
}
In this case two threads are used, which from my perspective is not a good option, but I can't find good practices for loggers in reactive applications. I think the approach below is better, because the memory allocation - and the logger call, which could potentially block - happens on the processing thread rather than on the Spring WebFlux thread.
@GetMapping(path = "/users")
public Flux<User> getUsers() {
    return Flux.defer(() -> {
        return Mono.fromCallable(() -> {
            log.info("Fetch users..");
            .....
        });
    }).subscribeOn(Schedulers.newParallel("my-custom"));
}
The normal thing to do would be to configure the logger as asynchronous (this usually has to be explicit, as per the comments, but all modern logging frameworks support it) and then just include it "normally" - either as a separate line as you have there, or in a side-effect method such as doOnNext() if you want it halfway through the reactive chain.
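For example, a small sketch of the side-effect style (the repository call is hypothetical; doOnSubscribe and doOnNext are standard Reactor operators):

@GetMapping(path = "/users")
public Flux<User> getUsers() {
    return userRepository.findAll()
        .doOnSubscribe(subscription -> log.info("Fetch users.."))   // logged once, at subscription time
        .doOnNext(user -> log.debug("Fetched user {}", user));      // logged per emitted element
}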
If you want to be sure that the logger's call isn't blocking, then use BlockHound to make sure (this is never a bad idea anyway.) But in any case, I can't see a use case for your second example there - that makes the code rather difficult to follow with no real advantage.
One final thing to watch out for - remember that if you include the logging statement separately as you have above, rather than as part of the reactive chain, it will execute when the method is called rather than at subscription time. That may not matter in scenarios like this where the two happen almost simultaneously, but it would be rather confusing if (for example) you were returning a publisher that may be subscribed to multiple times - in that case you'd only ever see the "Fetch users.." statement once, which isn't obvious when glancing through the code.

Axon & CompletableFuture

I've run into problems when I try to use CompletableFuture with Axon.
For example:
CompletableFuture<Event> future = CompletableFuture.supplyAsync(() -> {
    log.info("Start processing target: {}", target.toString());
    return new Event();
}, threadPool);
future.thenAcceptAsync(event -> {
    log.info("Send Event");
    AggregateLifecycle.apply(event);
}, currentExecutor);
In thenAcceptAsync, AggregateLifecycle.apply(event) has unexpected behavior: some of my @EventSourcingHandler handlers start handling the event twice. Does anybody know how to fix it?
I have been reading the docs and all I have found is:
In most cases, the DefaultUnitOfWork will provide you with the functionality you need. It expects processing to happen within a single thread.
So it seems I should somehow use the CurrentUnitOfWork.get()/set() methods, but I still can't make sense of the Axon API.
You should not apply() events asynchronously. The apply() method will call the aggregate's internal @EventSourcingHandler methods and schedule the event for publication when the unit of work completes (successfully).
Because of the way Axon works with the Unit of Work (which coordinates the activity of a single message handler invocation), the apply() method must be invoked on the thread that manages that Unit of Work.
If you want asynchronous publication of Events, use an Event Bus that uses an Async Transport, and use Tracking Processors.
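If the expensive work really must run on a separate executor, one hedged workaround (the command type and buildEvent helper are hypothetical; threadPool is the executor from the question) is to wait for the result inside the command handler, so that apply() still runs on the thread that owns the Unit of Work:

@CommandHandler
public void handle(ProcessTargetCommand command) {
    // The heavy computation may happen on another thread pool...
    Event event = CompletableFuture
        .supplyAsync(() -> buildEvent(command), threadPool)
        .join();
    // ...but apply() is invoked on the command-handling thread,
    // so the Unit of Work publishes the event exactly once.
    AggregateLifecycle.apply(event);
}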

ViewComponent InvokeAsync method and non-async operation

In an ASP.NET Core ViewComponent we have to implement the logic in an InvokeAsync method that returns an IViewComponentResult. However, I do not have any async logic to perform inside the invoke method, so based on the SO post here I have removed the async modifier and just return Task.FromResult:
public Task<IViewComponentResult> InvokeAsync(MyBaseModel model)
{
    var name = MyFactory.GetViewComponent(model.DocumentTypeID);
    return Task.FromResult<IViewComponentResult>(View(name, model));
}
and then in the view (since I don't have async, I am not using await here):
@Component.InvokeAsync("MyViewComponent", new { model = Model })
However the view renders this:
System.Threading.Tasks.Task`1[Microsoft.AspNetCore.Html.IHtmlContent]
You must await the Component.InvokeAsync. The fact that your method doesn't do anything async doesn't matter. The method itself is async.
However, that's a bit of an oversimplification. Frankly, the ease of the async/await keywords belies how complicated all this actually is. To be accurate, instead of calling these types of methods "async", it's more appropriate to discuss them as "task-returning". A task is essentially a handle for some operation. That operation could be async or sync. It's most closely associated with async, simply because wrapping sync operations in a task would be pretty pointless in most scenarios. However, the point is that just because something must return a task does not also imply that it must be async.
All async does is allow the possibility of a thread switch. In scenarios where there's some operation, typically involving I/O, that would cause the working thread to be idle for some period of time, the thread becomes available to be used for other work, and the original work may complete on a different thread. Notice the use of the passive language here. Async operations can involve no thread switching; the task could complete on the same thread, as if it was sync. The task could even complete immediately, if the underlying operation has already completed.
In your scenario here, you're not doing any async work, which is fine. However, the method definition requires Task<T> as the return type, so you must use Task.FromResult to return your actual result. That's all pretty standard stuff, and seems to be understood already by you. What you're missing, I think, is the assumption that since you're not actually doing any asynchronous work, it would be wrong to utilize await. There's nothing magical about the await keyword; it basically just means "hold here until the task completes". If there's no async work to be done, as is the case here, the sync code will just run as normal and yield back to the calling code when done. However, as a convenience, await also performs one other vital function: it unwraps the task.
That is where your problem lies. Since you're not awaiting, the task itself is being returned into the Razor view processing pipeline. It doesn't know what to do with that, so it does what it does by default and just calls ToString on it, hence the text you're getting back. Unwrapped, you'd just have IViewComponentResult and Razor does know what to do with that.
If your logic performed inside the invoke method is synchronous, i.e., you don't have any await, you have 2 options:
You can define the invoke method without the async keyword and have it return Task.FromResult<T>
Use public IViewComponentResult Invoke() instead.
I think the async keyword enables the await keyword, and that's pretty much it - there is nothing special about the async keyword itself.
On the main thread where your view is being rendered, the tag helper call that invokes the view component, Component.InvokeAsync(), is awaitable, so you do need to put the await keyword there. await examines the view component's task to see whether it has already completed. If it has, the main thread just keeps going; otherwise it waits for the view component's rendering to finish and then unwraps the result.