Schedulers in Project Reactor with Spring WebFlux

Project Reactor is awesome: I can easily switch part of the processing to another thread. But I've looked inside the Schedulers.fromExecutorService() method, and it allocates a new ExecutorService every time. So whenever such a factory method is called, a scheduler is created and allocated again. I'm not sure, but I think it's a potential memory leak...
Mono<String> sometext() {
    return Mono
        .fromCallable(() -> "")
        .subscribeOn(Schedulers.newParallel("my-custom"));
}
I wonder about registering the Scheduler as a bean (it's a singleton, so it will be allocated only once rather than on every call) or creating it in the constructor. Many blogs explain the threading model this way.
...
private final Scheduler scheduler = Schedulers.newParallel("my-custom");
..
Mono.fromCallable(() -> "" ).subscribeOn(scheduler)

Schedulers.newParallel() will indeed create a new scheduler with an associated backing thread pool every time you call it. So yes, you're correct: if you're using that method, make sure you store a reference to the scheduler somewhere so you can reuse it. Simply passing the same name argument won't retrieve the existing scheduler; it'll create a different one with the same name.
How you do this is up to you - it can be via a spring bean (as long as it's a singleton and not a prototype bean!), a field, or whatever else fits best in with your use case.
However, before all of this I'd first consider whether you definitely need to create a separate parallel scheduler at all. Schedulers.parallel() returns a default, shared parallel scheduler that can be used for parallel work out of the box (it doesn't create a new one on each invocation). Unless you need separately configured parallel schedulers for separate services for some reason, best practice is just to use that.
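To see why holding a reference matters, here is a minimal sketch that uses plain java.util.concurrent rather than Reactor, purely so it runs without extra dependencies. That substitution is an assumption of the sketch: the hypothetical newPool() stands in for Schedulers.newParallel(name), and SHARED stands in for a singleton bean or a final field.

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

// Plain-JDK stand-in: newPool() plays the role of Schedulers.newParallel(name).
class PoolReuseSketch {

    // Allocates a fresh pool on every call, regardless of the name passed in.
    static ExecutorService newPool(String name) {
        return Executors.newFixedThreadPool(2, r -> {
            Thread t = new Thread(r, name);
            t.setDaemon(true); // do not keep the JVM alive for this demo
            return t;
        });
    }

    // Allocated once and shared, like a singleton bean or a final field.
    static final ExecutorService SHARED = newPool("my-custom");

    static ExecutorService shared() {
        return SHARED;
    }
}
```

Two calls to newPool("my-custom") return two distinct pools despite the shared name, while shared() always hands back the same instance.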

Related

Proper logging in reactive application - WebFlux

Lately I have been thinking about how to use a logger properly in our applications.
For example, I have a controller which returns a stream of users. In the log, I see that the "Fetch users" message is logged by a different thread than the one running the processing pipeline. Is this a good approach?
@Slf4j
class AwesomeController {
    @GetMapping(path = "/users")
    public Flux<User> getUsers() {
        log.info("Fetch users..");
        return Flux.just(...).subscribeOn(Schedulers.newParallel("my-custom"));
    }
}
In this case two threads are used, which from my perspective is not a good option, but I can't find good practices for loggers in reactive applications. I think the approach below is better, because the work (including memory allocation) happens on the processing thread rather than on the Spring WebFlux thread, which could potentially be blocked by the logger.
@GetMapping(path = "/users")
public Flux<User> getUsers() {
    return Flux.defer(() -> {
        return Mono.fromCallable(() -> {
            log.info("Fetch users..");
            .....
        });
    }).subscribeOn(Schedulers.newParallel("my-custom"));
}
The normal thing to do would be to configure the logger as asynchronous (this usually has to be explicit as per the comments, but all modern logging frameworks support it) and then just include it "normally" (either as a separate line as you have there, or in a side-effect method such as doOnNext() if you want it half way through the reactive chain.)
If you want to be sure that the logger's call isn't blocking, then use BlockHound to make sure (this is never a bad idea anyway.) But in any case, I can't see a use case for your second example there - that makes the code rather difficult to follow with no real advantage.
One final thing to watch out for: if you include the logging statement separately as you have above, rather than as part of the reactive chain, it'll execute when the method is called rather than at subscription time. That may not matter in scenarios like this where the two happen near simultaneously, but it would be rather confusing if (for example) you're returning a publisher which may be subscribed to multiple times. In that case you'd only ever see the "Fetch users.." statement once, which isn't obvious when glancing through the code.
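The call-time versus subscription-time difference can be sketched without Reactor. In the hypothetical example below, a java.util.function.Supplier stands in for a lazy publisher and get() stands in for subscribing; both are assumptions made to keep the sketch dependency-free, and the class and method names are invented for illustration.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Supplier;

// Plain-JDK stand-in: a Supplier plays the role of a lazy publisher,
// and calling get() plays the role of subscribing to it.
class LogTimingSketch {
    static final List<String> LOG = new ArrayList<>();

    // Log statement outside the chain: runs once, when the method is called.
    static Supplier<String> loggedAtCallTime() {
        LOG.add("call-time");
        return () -> "users";
    }

    // Log statement inside the chain: runs on every "subscription".
    static Supplier<String> loggedAtSubscriptionTime() {
        return () -> {
            LOG.add("subscription-time");
            return "users";
        };
    }
}
```

Calling get() twice on the first supplier records a single entry, while two calls on the second record two, which mirrors what happens when a returned publisher is subscribed to more than once.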

Should I use synchronous Mono::map or asynchronous Mono::flatMap?

The Project Reactor documentation says that Mono::flatMap is asynchronous, as shown below.
So, I can write all my methods to return Mono publishers like this.
public Mono<String> myMethod(String name) {
    return Mono.just("hello " + name);
}
and use it with Mono::flatMap like this:
Mono.just("name").flatMap(this::myMethod);
Does this make the execution of my method asynchronous? Does this make my code more reactive, better and faster than just using Mono::map? Is the overhead prohibitive for doing this for all my methods?
public final <R> Mono<R> flatMap(Function<? super T, ? extends Mono<? extends R>> transformer)
Transform the item emitted by this Mono asynchronously, returning the value emitted by another Mono (possibly changing the value type).
Does this make the execution of my method asynchronous?
Let's go to the definition of Asynchronous for this:
Asynchronous programming is a means of parallel programming in which a unit of work runs separately from the main application thread and notifies the calling thread of its completion, failure or progress.
Here your unit of work is happening in the same thread, unless you do a subscribeOn with a Scheduler. So this isn't async.
Does this make my code more reactive, better and faster than just using Mono::map?
No. In this case the publisher Mono.just("hello " + name) notifies the subscriber of completion immediately, so the same thread that was doing the processing picks up that signal right away and continues with the response.
If anything, flatMap performs a few more operations internally than map, which simply transforms the element.
Thus, ideally, you should use flatMap when you have an I/O operation (like a DB call) or a network call that might take some time, so that the thread can be used for other tasks in the meantime.
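The same-thread behaviour is easy to observe with the JDK's CompletableFuture, used here only as a dependency-free stand-in for Mono (an assumption of this sketch). thenCompose plays the role of flatMap, and the class, method and thread names are hypothetical.

```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

// Plain-JDK stand-in: thenCompose plays the role of Mono::flatMap.
class ComposeThreadSketch {

    // Returns the name of the thread that ran the composed function.
    static String composeOnCaller() {
        return CompletableFuture.completedFuture("name")
                // Both futures are already complete, so nothing is handed off.
                .thenCompose(n -> CompletableFuture.completedFuture(
                        Thread.currentThread().getName()))
                .join();
    }

    // Work only moves to another thread when an executor is named explicitly.
    static String composeOnPool() {
        ExecutorService pool =
                Executors.newSingleThreadExecutor(r -> new Thread(r, "worker"));
        try {
            return CompletableFuture.completedFuture("name")
                    .thenCompose(n -> CompletableFuture.supplyAsync(
                            () -> Thread.currentThread().getName(), pool))
                    .join();
        } finally {
            pool.shutdown();
        }
    }
}
```

composeOnCaller() reports the calling thread, while composeOnPool() reports the "worker" thread. Composition by itself does not introduce asynchrony; the explicit executor does, much like subscribeOn with a Scheduler.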

WebFlux and Kotlin coroutines without ReactiveCrudRepository

I'm working on a project which uses Kotlin, Spring Boot and Hibernate (all on the latest versions), and I would like to make it reactive with Spring's WebFlux framework.
The problem is that I can't use ReactiveCrudRepository because the web app has to use an Oracle database and therefore Hibernate. I couldn't figure out how to get non-blocking access to an Oracle SQL database (using only free frameworks).
My question is:
Is it possible to set it up like this:
A regular CrudRepository, which is blocking
A service which uses coroutines and returns everything as Mono
Service example code:
fun getAllLanguages(): Mono<Collection<ProgrammingLanguage>> = async { repository.findAll() }.asMono()
Afterwards there will be controller with:
fun listProgrammingLanguagesReactive() = mono(Unconfined) {
    service.also { logger.info { "requesting list of programming languages" } }
        .getAllLanguages()
        .also { logger.info { "responding with list of programming languages" } }
}
This approach works but I'm not sure whether it will work all the time and whether this is not terrible practice and so on.
The problem with a synchronous blocking API is that a thread is blocked for each API call. There is simply no way around it, coroutines or not.
Your approach is as good as any for providing an asynchronous adapter to a blocking API.
However, please consider the following:
You may want to confine async { repository.findAll() } and similar blocking calls to a dedicated fixed thread pool/dispatcher. While coroutines are cheap, remember that repository.findAll() blocks the actual underlying thread, and you don't want to exhaust all the threads in the CommonPool (which async uses by default).
This is a useful practice, as it limits the number of threads and simultaneous blocking calls. If your fixed pool gets exhausted at some point, incoming requests will be suspended, without blocking threads, until there are threads available in the pool to process them.
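The idea of a dedicated fixed pool can be sketched in plain Java. This is an illustration under assumptions rather than the Kotlin coroutine setup from the question: blockingFindAll() is a hypothetical stand-in for repository.findAll(), and CompletableFuture.supplyAsync with an explicit executor takes the place of async on a dedicated dispatcher. In this JDK version, calls beyond the pool size simply wait in the executor's queue.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.atomic.AtomicInteger;

class BlockingPoolSketch {
    static final AtomicInteger running = new AtomicInteger();
    static final AtomicInteger maxRunning = new AtomicInteger();

    // Hypothetical stand-in for a blocking call such as repository.findAll().
    static String blockingFindAll() {
        int now = running.incrementAndGet();
        maxRunning.accumulateAndGet(now, Math::max);
        try {
            Thread.sleep(50); // simulate blocking I/O
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        } finally {
            running.decrementAndGet();
        }
        return "languages";
    }

    // Runs `calls` blocking calls on a fixed pool and returns the peak
    // number of calls that were in flight at the same time.
    static int peakConcurrency(int poolSize, int calls) {
        running.set(0);
        maxRunning.set(0);
        ExecutorService blockingPool = Executors.newFixedThreadPool(poolSize);
        try {
            List<CompletableFuture<String>> futures = new ArrayList<>();
            for (int i = 0; i < calls; i++) {
                futures.add(CompletableFuture.supplyAsync(
                        BlockingPoolSketch::blockingFindAll, blockingPool));
            }
            futures.forEach(CompletableFuture::join);
            return maxRunning.get();
        } finally {
            blockingPool.shutdown();
        }
    }
}
```

With peakConcurrency(2, 6), six blocking calls are submitted but at most two run at once, so the number of threads tied up by blocking work never exceeds the pool size.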

Command Pattern clarification

I cannot see any of the command pattern classes e.g. invoker, receiver manifesting in the accepted answer of the following link Long list of if statements in Java. I have gone with the accepted answer to solve my 30+ if/else statements.
I have one repository that I am trying to pass DTOs to save to the database. I want the repository to invoke the correct save method for the DTO so I am checking the instance type at runtime.
Here is the implementation in Repository
private Map<Class<?>, Command> commandMap;
public void setCommandMap(Map<Class<?>, Command> commandMap) {
    this.commandMap = commandMap;
}
and a method that will populate the commandMap
commandMap.put(Address.class, new CommandAddress());
commandMap.put(Animal.class, new CommandAnimal());
commandMap.put(Client.class, new CommandClient());
and finally the method that saves
public void getValue() {
    commandMap.get(these.get(0).getClass()).save();
}
The service class that uses the Repo registers the commandMap.
Does the accepted answer represent a sort of (approximate) implementation of the Command pattern?
It seems like an enum that implements an exec interface will eliminate your if/else problem or turn it into a switch.
It does not look like you need the Command pattern.
The GoF book says:
Use the Command pattern when you want to
parameterize objects by an action to perform, as MenuItem objects did above. You can express such parameterization in a procedural language with a callback function, that is, a function that's registered somewhere to be called at a later point. Commands are an object-oriented replacement for callbacks.
specify, queue, and execute requests at different times. A Command object can have a lifetime independent of the original request. If the receiver of a request can be represented in an address space-independent way, then you can transfer a command object for the request to a different process and fulfill the request there.
support undo. The Command's Execute operation can store state for reversing its effects in the command itself. The Command interface must have an added Unexecute operation that reverses the effects of a previous call to Execute. Executed commands are stored in a history list. Unlimited-level undo and redo is achieved by traversing this list backwards and forwards calling Unexecute and Execute, respectively.
support logging changes so that they can be reapplied in case of a system crash. By augmenting the Command interface with load and store operations, you can keep a persistent log of changes. Recovering from a crash involves reloading logged commands from disk and reexecuting them with the Execute operation.
structure a system around high-level operations built on primitive operations. Such a structure is common in information systems that support transactions. A transaction encapsulates a set of changes to data. The Command pattern offers a way to model transactions. Commands have a common interface, letting you invoke all transactions the same way. The pattern also makes it easy to extend the system with new transactions.
Which of the above do you want to do?
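For reference, the enum suggestion at the start of this answer might look roughly like the sketch below. The DTO names come from the question; the Saver interface, the SaveAction enum, the forDto() lookup and the returned strings are all hypothetical, and the real save logic is left out.

```java
import java.util.HashMap;
import java.util.Map;

class SaveDispatchSketch {
    // DTO names come from the question; their contents are omitted here.
    static class Address {}
    static class Animal {}
    static class Client {}

    interface Saver {
        String save(Object dto);
    }

    // Each constant carries the DTO type it handles and its own save logic.
    enum SaveAction implements Saver {
        ADDRESS(Address.class) {
            @Override public String save(Object dto) { return "saved address"; }
        },
        ANIMAL(Animal.class) {
            @Override public String save(Object dto) { return "saved animal"; }
        },
        CLIENT(Client.class) {
            @Override public String save(Object dto) { return "saved client"; }
        };

        private static final Map<Class<?>, SaveAction> BY_TYPE = new HashMap<>();
        static {
            for (SaveAction action : values()) {
                BY_TYPE.put(action.type, action);
            }
        }

        private final Class<?> type;

        SaveAction(Class<?> type) {
            this.type = type;
        }

        // One lookup replaces the if/else chain over instance types.
        static SaveAction forDto(Object dto) {
            SaveAction action = BY_TYPE.get(dto.getClass());
            if (action == null) {
                throw new IllegalArgumentException("No save action for " + dto.getClass());
            }
            return action;
        }
    }
}
```

A call such as SaveAction.forDto(dto).save(dto) then picks the handler by runtime type. This is plain dispatch rather than the Command pattern: nothing is queued, logged or undone.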

Apache JENA TDB files locked after creation with web application

I am using JENA to create a triple store (TDB functionality) with the following code:
public void createTDBFromOWL() {
    Dataset dataset = TDBFactory.createDataset(newTripleStoreLocation);
    dataset.begin(ReadWrite.WRITE);
    try {
        // getting the model inside the transaction
        Model model = dataset.getDefaultModel();
        FileManager fileManager = FileManager.get();
        Model holder = fileManager.readModel(model, newOWLFileLocation);
        // committing dataset
        dataset.commit();
        model.close();
        holder.close();
    } finally {
        dataset.end();
        dataset.close();
    }
}
After I create the triple store, the files created are locked by my application server (Glassfish), and I can't delete them until I manually stop Glassfish and it releases its lock. As shown in the above code, I think I am closing everything, so I don't get why a lock is maintained on the files.
When you call Dataset#close(), the implementation delegates that call to an underlying DatasetGraphBase#close(), which then ultimately delegates to DatasetGraphTDB#_close().
This results in calls to TripleTable#close() and QuadTable#close(). Both of these call (several) NodeTupleTable#close(). Continuing with the indirection, this calls NodeTable#close() and TupleTable#close(). The former is an interface, so we'd need to make a proper guess as to which class is run in your implementation. The latter iterates through a collection of TupleIndex objects and calls close() on each of them. TupleIndex is, also, an interface.
There is only one meaningful hierarchy of descendants of TupleIndex that results in something which can lock a file, which leads us to TupleIndexRecord#close(). We can then follow a particular implementation of RangeIndex called BPlusTree all the way down until we see actual ownership of the MappedByteBuffer.
Ultimately, after reading the implementation of BlockAccessMapped#close(), it seems that the entire hierarchy closes things properly, down to the final classes, and that this longstanding bug may be the culprit. From the documentation:
once a file has been mapped a number of operations on that file will
fail until the mapping has been released (e.g. delete, truncating to a
size less than the mapped area). However the programmer can't control
accurately the time at which the unmapping takes place --- typically
it depends on the processing of finalization or a PhantomReference
queue.
So there you have it. Despite Jena's best efforts, one cannot yet control when that file will be unmapped in Java. This ends up being the tradeoff for memory-mapped file I/O in Java.
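The effect can be reproduced outside Jena with a small sketch that uses only the JDK. The class name, method name and temp-file prefix are hypothetical. It shows that a mapping stays valid after the channel that created it has been closed, which is why calling close() everywhere is not enough to release the file.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.MappedByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

class MappedFileSketch {

    // Maps a temp file, closes the channel, then reads through the mapping anyway.
    static String readAfterClose() {
        try {
            Path file = Files.createTempFile("mapped-sketch", ".bin");
            file.toFile().deleteOnExit();
            Files.write(file, "tdb".getBytes(StandardCharsets.US_ASCII));

            MappedByteBuffer buffer;
            try (FileChannel channel = FileChannel.open(file, StandardOpenOption.READ)) {
                buffer = channel.map(FileChannel.MapMode.READ_ONLY, 0, channel.size());
            } // the channel is closed here, but the mapping is not released

            byte[] bytes = new byte[buffer.remaining()];
            buffer.get(bytes);
            return new String(bytes, StandardCharsets.US_ASCII);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```

The read succeeds because the mapping lives until the buffer is garbage collected. On platforms that refuse to delete mapped files, the file stays locked for at least that long.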