Quarkus: execute parallel unis - kotlin

In a quarkus / kotlin application, I want to start multiple database requests concurrently. I am new at quarkys and I am not sure if I am doing things right:
val uni1 = Uni.createFrom().item(repo1).onItem().apply { it.request() }
val uni2 = Uni.createFrom().item(repo2).onItem().apply { it.request() }
return Uni.combine().all()
.unis(uni1, uni2)
.asTuple()
.onItem()
.apply { tuple ->
Result(tuple.item1, tuple.item2) }
.await()
.indefinitely()
Will the request() really be made in parallel? Is it the right way to do it in quarkus?

Yes, your code is right.
Uni.combine().all() runs all the passed Unis concurrently. You will get the tuple (containing the individual results) when all the Unis have completed (emitted a result).
From your code, you may remove the tuple step and use combineWith instead.
Finally, note that the await().indefinitely() blocks the caller thread, forever if one of the Uni does not complete (for whatever reason). I strongly recommend using await().atMost(...)

Related

Hwo to convert Flux<Item> to List<Item> by blocking

Background
I have a legacy application where I need to return a List<Item>
There are many different Service classes each belonging to an ItemType.
Each service class calls a few different backend APIs and collects the responses to create a SubType of the Item.
So we can say, each service class implementation returns an Item
All backend API access code is using WebClient which returns Mono of some type, and I can zip all Mono within the service to create an Item
The user should be able to look up many different types of items in one call. This requires many backend calls
So for performance sake, I wanted to make this all asynchronous using reactor, so I introduced Spring Reactive code.
Problem
If my endpoint had to return Flux<Item> then this code work fine,
But this is some service code which is used by other legacy code caller.
So eventually I want to return the List<Item> but When I try to convert my Flux into the List I get an error
"message": "block()/blockFirst()/blockLast() are blocking,
which is not supported in thread reactor-http-nio-3",
Here is the service, which is calling a few other service classes.
Flux<Item> itemFlux = Flux.fromIterable(searchRequestByItemType.entrySet())
.flatMap(e ->
getService(e.getKey()).searchItems(e.getValue()))
.subscribeOn(Schedulers.boundedElastic());
Mono<List<Item>> listMono = itemFlux
.collectList()
.block(); //This line throws error
Here is what the above service is calling
default Flux<Item> searchItems(List<SingleItemSearchRequest> requests) {
return Flux.fromIterable(requests)
.flatMap(this::searchItem)
.subscribeOn(Schedulers.boundedElastic());
}
Here is what a single-item search is which is used by above
public Mono<Item> searchItem(SingleItemSearchRequest sisr) {
return Mono.zip(backendApi.getItemANameApi(sisr.getItemIdentifiers().getItemId()),
sisr.isAddXXXDetails()
?backendApi.getItemAXXXApi(sisr.getItemIdentifiers().getItemId())
:Mono.empty(),
sisr.isAddYYYDetails()
?backendApi.getItemAYYYApi(sisr.getItemIdentifiers().getItemId())
:Mono.empty())
.map(tuple3 -> Item.builder()
.name(tuple3.getT1())
.xxxDetails(tuple3.getT2())
.yyyDetails(tuple3.getT3())
.build()
);
}
Sample project to replicate the problem..
https://github.com/mps-learning/spring-reactive-example
I’m new to spring reactor, feel free to pinpoint ALL errors in the code.
UPDATE
As per Patrick Hooijer Bonus suggestion, updating the Mono.zip entries to always contain some default.
#Override
public Mono<Item> searchItem(SingleItemSearchRequest sisr) {
System.out.println("\t\tInside " + supportedItem() + " searchItem with thread " + Thread.currentThread().toString());
//TODO: how to make these XXX YYY calls conditionals In clear way?
return Mono.zip(getNameDetails(sisr).defaultIfEmpty("Default Name"),
getXXXDetails(sisr).defaultIfEmpty("Default XXX Details"),
getYYYDetails(sisr).defaultIfEmpty("Default YYY Details"))
.map(tuple3 -> Item.builder()
.name(tuple3.getT1())
.xxxDetails(tuple3.getT2())
.yyyDetails(tuple3.getT3())
.build()
);
}
private Mono<String> getNameDetails(SingleItemSearchRequest sisr) {
return mockBackendApi.getItemCNameApi(sisr.getItemIdentifiers().getItemId());
}
private Mono<String> getYYYDetails(SingleItemSearchRequest sisr) {
return sisr.isAddYYYDetails()
? mockBackendApi.getItemCYYYApi(sisr.getItemIdentifiers().getItemId())
: Mono.empty();
}
private Mono<String> getXXXDetails(SingleItemSearchRequest sisr) {
return sisr.isAddXXXDetails()
? mockBackendApi.getItemCXXXApi(sisr.getItemIdentifiers().getItemId())
: Mono.empty();
}
Edit: Below answer does not solve the issue, but it contains useful information about Thread switching. It does not work because .block() is no problem for non-blocking Schedulers if it's used to switch to synchronous code.
This is because the block operator inherited the reactor-http-nio-3 Thread from backendApi.getItemANameApi (or one of the other calls in Mono.zip), which is non-blocking.
Most operators continue working on the Thread on which the previous operator executed, this is because the Thread is linked to the emitted item. There are two groups of operators where the Thread of the output item differs from the input:
flatMap, concatMap, zip, etc: Operators that emit items from other Publishers will keep the Thread link they received from this inner Publisher, not from the input.
Time based operators like delayElements, interval, buffer(Duration), etc. will schedule their tasks on the provided Scheduler, or Schedulers.parallel() if none provided. The emitted items will then be linked to the Thread the task was scheduled on.
In your case, Mono.zip emits items from backendApi.getItemANameApi linked to reactor-http-nio-3, which gets propagated downstream, goes outside both the flatMap in searchItems and in itemFlux, until it reaches your block operator.
You can solve this by placing a .publishOn(Schedulers.boundedElastic()), either in searchItem, searchItems or itemFlux. This will cause the item to switch to a Thread in the provided Scheduler.
Bonus: Since you requested to pinpoint errors: Your Mono.zip will not work if sisr.isAddXXXDetails() is false, as Mono.zip discards any element it could not zip. Since you return a Mono.empty() in that case, no items can be zipped and it will return an empty Mono.
If we have only spring-boot-starter-webflux defined as application dependency, then springbok spin up a `Netty server.
One is not expected to block() in a reactive application using a non-blocking server.
However, once we add spring-boot-starter-web dependency then even with the presence of spring-boot-starter-webflux, springboot spinup a tomcat server. Which is a thread-per-request model and is expected to have blocking calls
So to solve my problem, all I had to do above is, to add spring-boot-starter-web dependency in pom.xml. After that applications is started in Tomcat
with timcat .collectList().block() works in Controller class to return the List<Item>.
Whereas with the Netty server I could return only Flux<Item> not List<Item>, which is expected.

Neo4j 3.5's embeded database does not seem to persist data

I am trying to build a small command line tool that will store data in a neo4j graph. To do this I have started experimenting with Neo4j3.5's embedded databases. After putting together the following example I have found that either the nodes I am creating are not being saved to the database or the method of database creation is overwriting my previous run.
The Example:
fun main() {
//Spin up data base
val graphDBFactory = GraphDatabaseFactory()
val graphDB = graphDBFactory.newEmbeddedDatabase(File("src/main/resources/neo4j"))
registerShutdownHook(graphDB)
val tx = graphDB.beginTx()
graphDB.createNode(Label.label("firstNode"))
graphDB.createNode(Label.label("secondNode"))
val result = graphDB.execute("MATCH (a) RETURN COUNT(a)")
println(result.resultAsString())
tx.success()
}
private fun registerShutdownHook(graphDb: GraphDatabaseService) {
// Registers a shutdown hook for the Neo4j instance so that it
// shuts down nicely when the VM exits (even if you "Ctrl-C" the
// running application).
Runtime.getRuntime().addShutdownHook(object : Thread() {
override fun run() {
graphDb.shutdown()
}
})
}
I would expect that every time I run main the resulting query count will increase by 2.
That is currently not the case and I can find nothing in the docs that references a different method of opening an already created embedded database. Am I trying to use the embedded database incorrectly or am I missing something? Any help or info would be appreciated.
build Info:
Kotlin jvm 1.4.21
Neo4j-comunity-3.5.35
Transactions in neo4j 3.x have a 3 stage model
create
success / failure
close
you missed the third, which would then commit or rollback.
You can use Kotlin's use as Transaction is an AutoCloseable

How to use datastax driver 4 for performant concurrent cassandra database queries in kotlin?

I have a kotlin application which serves data via a RESTFul api. That data is stored in a cassandra database. To fulfill a request, the application needs to perform n queries to Cassandra. I want the API to respond quickly so I would like those n queries to execute in parallel. I also want to be able to handle multiple concurrent users without performance degrading.
Libraries:
implementation("com.datastax.oss:java-driver-core:4.13.0")
implementation("org.jetbrains.kotlinx:kotlinx-coroutines-jdk8:1.4.3")
In Datastax 3, I have code which uses the synchronous method execute. I am wrapping this in a coroutine dispatcher and awaiting all requests.
Here is a sample code which queries the same row n times in a loop,
val numbers: List<Int> = (1..NUMBER_OF_QUERIES).toList()
val query = "SELECT JSON * FROM keyspace.table WHERE partition_key=X AND clustering_key=Y"
val (result, elapsed1) = measureTimedValue {
numbers.map { num: Int ->
CoroutineScope(Dispatchers.IO).async {
session.execute(q).all().map{it ->
toJson(it.getString(0).toString())
)
}
}
}.awaitAll()
}
Datastax 3 offers executeAsync using guava's ListenableFuture, but I couldn't get that to work within a coroutine even with https://kotlin.github.io/kotlinx.coroutines/kotlinx-coroutines-guava/index.html
For Datastax 4, I am trying to use the asynchronous API to achieve a similar result. My hope is the asynchronous API can perform better using fewer threads as it is non-blocking. However when I run a similar test case, I observe that the above code runs slower than the sync api from V3. In addition, the code does not perform well as more concurrent users are added.
val numbers: List<Int> = (1..NUMBER_OF_QUERIES).toList()
val query = "SELECT JSON * FROM keyspace.table WHERE partition_key=X AND clustering_key=Y"
val (result, elapsed1) = measureTimedValue {
numbers.map { num: Int ->
CoroutineScope(Dispatchers.IO).async {
session.executeAsync(q).asDeferred()
}
}.awaitAll().awaitAll().map{rs -> toJson(rs).await()}
}
Is there a better way to handle parallel executions of tasks returning CompletionStage<T> in kotlin?

Spring data - webflux - Chaining requests

i use reactive Mongo Drivers and Web Flux dependancies
I have a code like below.
public Mono<Employee> editEmployee(EmployeeEditRequest employeeEditRequest) {
return employeeRepository.findById(employeeEditRequest.getId())
.map(employee -> {
BeanUtils.copyProperties(employeeEditRequest, employee);
return employeeRepository.save(employee)
})
}
Employee Repository has the following code
Mono<Employee> findById(String employeeId)
Does the thread actually block when findById is called? I understand the portion within map actually blocks the thread.
if it blocks, how can I make this code completely reactive?
Also, in this reactive paradigm of writing code, how do I handle that given employee is not found?
Yes, map is a blocking and synchronous operation for which time taken is always going to be deterministic.
Map should be used when you want to do the transformation of an object /data in fixed time. The operations which are done synchronously. eg your BeanUtils copy properties operation.
FlatMap should be used for non-blocking operations, or in short anything which returns back Mono,Flux.
"how do I handle that given employee is not found?" -
findById returns empty mono when not found. So we can use switchIfEmpty here.
Now let's come to what changes you can make to your code:
public Mono<Employee> editEmployee(EmployeeEditRequest employeeEditRequest) {
return employeeRepository.findById(employeeEditRequest.getId())
.switchIfEmpty(Mono.defer(() -> {
//do something
}))
.map(employee -> {
BeanUtils.copyProperties(employeeEditRequest, employee);
return employee;
})
.flatMap(employee -> employeeRepository.save(employee));
}

Spark could not filter correctly?

I encounter an wired problem that the result is not correct.
I have a class called A, and it has a value called keyword.
I want to filter the RDD[A] if it has some keyword.
Spark environment:
version: 1.3.1
execution env: yarn-client
Here is the code:
class A ...
case class C(words:Set[String] ) extends Serializable {
def run(data:RDD[A])(implicit sc:SparkContext) ={
data.collect{ case x:A=> x }.filter(y => words.contains(y.keyword)).foreach(println)
}
}
// in main function
val data:RDD[A] = ....
val c = C(Set("abc"))
c.run(data)
The code above prints nothing. However if I collect RDD[A] to local, then it print something! E.g.
data.take(1000).collect{ case x:A=> x }.filter(y => words.contains(y.keyword)).foreach(println)}
How could this happen?
Let me ask another related question: Should I make case class C extends Serializable? I don't think it is necessary.
The reason is quite easy. If you run the println function when you collect data locally, what happens is that your data are trasferred over the network to the machine you are using (let's call it the client of the Spark environment) and then it is printed on your console. SO far, everything behaves as expected. Instead, if you run the println function on a distributed RDD, the println function is executed locally on the worker machine on which there are your data. So the function is actually executed but you won't see any result on the console of your client, unless it is also a worker machine: in fact, everything is printed on the console of the respective worker node.
No, it's not necessary you make it Serializable, the only thing is serialized is your words:Set[String].