ReactiveRedisOperations not saving object in Redis - redis

I am using ReactiveRedisOperations to save data objects in Redis, and this call returns a Mono as per the API.
I notice that if I don't do anything with the returned Mono, then this code does not do anything.
Just trying to understand how this works.
I would like the code below to save every object to Redis in this loop; however, it does not do so. Please share what is missing here.
for (SomeObject obj : list) {
    reactiveRedisOperations.opsForHash().put(key, hashKey, obj).map(b -> obj);
}
On the other hand, if I return the Mono result from similar code via a REST service response, then it does seem to save to Redis correctly. Not sure why that is. Thanks

This is a quirk of reactive streams, not Lettuce.
Unlike a CompletableFuture, which begins executing when it's created, a reactive stream won't begin executing (the command isn't even sent) until a consumer has subscribed to it.
I believe this is to facilitate backpressure, so a slow consumer isn't flooded with data by the producer.
Some nice reading -> https://blog.knoldus.com/working-with-project-reactor-reactive-streams/
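To see this laziness concretely, here is a minimal Reactor sketch (not tied to Redis): the callable only runs once something subscribes.
Mono<String> mono = Mono.fromCallable(() -> {
    System.out.println("command executed"); // not printed when the Mono is merely created
    return "done";
});
mono.subscribe(System.out::println); // only now does "command executed" (and then "done") print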

If you return a Mono to the underlying web framework, it will generally handle subscribing to that Mono, which triggers the operation and its side effects, such as the data being created in your Redis datastore.
Should you wish to have your operations executed, you need to do the same: either subscribe to the publisher (Mono or Flux) yourself, or return it to a caller that you know will handle this for you, as in the aforementioned example:
Flux.fromIterable(list)
    .flatMap(obj -> reactiveRedisOperations.opsForHash().put(key, hashKey, obj))
    .subscribe();
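As an alternative to subscribing yourself, you can compose the puts into a single Mono<Void> and hand it back to something that will subscribe for you (for example a WebFlux handler). A sketch, assuming the reactiveRedisOperations, key, hashKey and list from the question:
public Mono<Void> saveAll(List<SomeObject> list) {
    return Flux.fromIterable(list)
        .flatMap(obj -> reactiveRedisOperations.opsForHash().put(key, hashKey, obj))
        .then(); // completes only after every put has completed
}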

Related

Proper logging in reactive application - WebFlux

Lately I have been thinking about the proper use of loggers in our applications.
For example, I have a controller which returns a stream of users, but in the log I see that "Fetch users" is logged by a different thread than the one running the processing pipeline. Is that a good approach?
@Slf4j
class AwesomeController {
    @GetMapping(path = "/users")
    public Flux<User> getUsers() {
        log.info("Fetch users..");
        return Flux.just(...).subscribeOn(Schedulers.newParallel("my-custom"));
    }
}
In this case two threads are used, which from my perspective is not a good option, but I can't find good practices for logging in reactive applications. I think the approach below is better, because the memory allocation happens on the processing thread rather than on the Spring WebFlux thread, which the logger could potentially block.
#GetMapping(path = "/users")
public Flux<User> getUsers() {
return Flux.defer(() -> {
return Mono.fromCallable(() -> {
log.info("Fetch users..");
.....
})
}).subscribeOn(Schedulers.newParallel("my-custom"))
}
The normal thing to do would be to configure the logger as asynchronous (this usually has to be explicit, as per the comments, but all modern logging frameworks support it) and then just include it "normally": either as a separate line as you have there, or in a side-effect method such as doOnNext() if you want it halfway through the reactive chain.
If you want to be sure that the logger's call isn't blocking, then use BlockHound to make sure (this is never a bad idea anyway). But in any case, I can't see a use case for your second example there; it makes the code rather difficult to follow with no real advantage.
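For reference, here is a minimal BlockHound sketch (assuming the blockhound dependency is on the classpath; the class name is illustrative): installing it once at startup makes any blocking call on a non-blocking Reactor thread fail loudly.
import java.time.Duration;
import reactor.blockhound.BlockHound;
import reactor.core.publisher.Mono;

public class BlockHoundDemo {
    public static void main(String[] args) {
        BlockHound.install(); // instrument blocking calls once, at startup

        Mono.delay(Duration.ofMillis(1))        // runs on a parallel (non-blocking) scheduler
            .doOnNext(tick -> {
                try {
                    Thread.sleep(10);           // blocking call -> BlockHound reports it here
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                }
            })
            .block();                           // blocking on the main thread is allowed
    }
}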
One final thing to watch out for: remember that if you include the logging statement separately as you have above, rather than as part of the reactive chain, then it'll execute when the method is called rather than at subscription time. That may not matter in scenarios like this where the two happen almost simultaneously, but it would be rather confusing if (for example) you're returning a publisher which may be subscribed to multiple times; in that case, you'd only ever see the "Fetch users.." statement once, which isn't obvious when glancing through the code.
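If you do want the statement tied to each subscription, one option is to attach it as a side effect on the chain. A sketch reusing the controller from the question (userSource() is a placeholder for wherever the users come from):
@GetMapping(path = "/users")
public Flux<User> getUsers() {
    return Flux.fromIterable(userSource())                // userSource() is a placeholder
        .doOnSubscribe(sub -> log.info("Fetch users..")) // logged once per subscription
        .subscribeOn(Schedulers.newParallel("my-custom"));
}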

RX way to get local data and send to remote

I'm starting with Rx and I have an offline app that needs to sync data with a remote API. What is the best way to get data from the database and send it to the remote API one by one, watching the response of each? Which operators should I use to sequence the tasks?
The simplest thing you can do is something like this (Kotlin):
getDataFromDb()
    .map { doNetworkRequest(it) }
    .doOnNext { saveToDb(it) }
    .subscribe()
but it really depends on your needs / environment.
You will probably need more mapping in the middle (e.g. to transform the network response to the data that you need to save to the db) and error handling.
Here I assume you don't need the result of saveToDb so I put it as a side effect instead of inside the main observable flow.
Another aspect is how you want to handle the network requests: is it OK to perform them in parallel or not? If parallel requests are fine, you could use flatMap instead; see the sketch below.
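Here is a slightly fuller sketch in Java RxJava-style syntax (getDataFromDb(), sendToApi() and markAsSynced() are placeholders, not real APIs): concatMap keeps the requests sequential and lets you react to each response in order, while swapping it for flatMap would allow the requests to run in parallel.
// Placeholders: getDataFromDb() returns Flowable<LocalRecord>,
// sendToApi(record) returns Single<RemoteResponse>, markAsSynced(...) writes back to the db.
getDataFromDb()
    .concatMap(record -> sendToApi(record)                  // sequential: one request at a time
        .doOnSuccess(resp -> markAsSynced(record, resp))    // side effect, as in the answer above
        .toFlowable())
    .subscribe(
        resp -> System.out.println("synced: " + resp),
        err  -> System.err.println("sync failed: " + err));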

Mono Slice with data transformation in spring reactive cassandra

After reading the new Spring Data Cassandra documentation (here), it says that there is now Mono<Slice<T>> support for reactive Cassandra.
Which is great, because in my team we wanted to implement some sort of pagination on our reactive response. So we are happy to move from Flux<T> to Mono<Slice<T>>, but there is this issue: we do transformations on the data of our Flux with Flux.map, and that doesn't seem possible with Slice without blocking.
For example, we have this functionality:
Flux<Location> resp = repository.searchLocations(searchFields).map(this::transformLocation);
Where transformLocation is a function that receives the database object and returns a more user friendly object with more user friendly data.
How would you achieve that with Mono<Slice<Location>>?
From what I have seen of Slice, you can get the data with getContent, but that returns a list; won't that be blocking?
You can use the list obtained from the getContent() method to create a Flux, as you were doing before.
Mono<Slice<Location>> sliceMono; // this is just another Mono on which you can operate
Flux<Location> intermediateResp = sliceMono.flatMapMany(slice -> Flux.fromIterable(slice.getContent()));
// you can now transform this intermediateResp Flux as you were doing before, can't you?
Won't this serve your purpose, or do you require something else?
(The code has been written without the help of an IDE, so use it to understand the approach.)
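Putting it together, a sketch (searchLocationsPaged is an assumed repository method returning Mono<Slice<Location>>; transformLocation is the mapper from the question):
Flux<Location> resp = repository.searchLocationsPaged(searchFields)  // Mono<Slice<Location>>
    .flatMapMany(slice -> Flux.fromIterable(slice.getContent()))     // unwrap to Flux<Location>
    .map(this::transformLocation);                                   // non-blocking, as before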

Difference between FireAndForget and Async behavior for publishing

Currently, we are using StackExchange.Redis and, as it does not provide "blocking pops", we are doing as suggested in the documentation:
db.ListLeftPush(key, newWork, flags: CommandFlags.FireAndForget);
sub.Publish(channel, "");
What is the difference from this to the following?
db.ListLeftPushAsync(key, newWork);
sub.Publish(channel, "");
We know the purpose of the commands; what we would like to know is whether there is any difference internally, or any risk of them behaving differently (execution order, etc.).
There's a main difference between fire-and-forget and calling an async operation without awaiting it.
Fire-and-forget means that not only are you not waiting for the result, you also don't care whether it works or not, whereas an async operation may throw an exception once it has finished if something goes wrong.
On the other hand, when you issue a fire-and-forget command, StackExchange.Redis doesn't try to retrieve the command result internally, which is better if you just want the so-called fire-and-forget behavior when issuing commands.
You can see this difference if you open the ConnectionMultiplexer source code and look at how the ExecuteAsyncImpl / ExecuteSyncImpl methods are implemented:
// For example, ExecuteAsyncImpl...
if (message.IsFireAndForget)
{
    TryPushMessageToBridge(message, processor, null, ref server);
    return CompletedTask<T>.Default(null); // F+F explicitly does not get async-state
}
else
{
    var tcs = TaskSource.CreateDenyExecSync<T>(state);
    var source = ResultBox<T>.Get(tcs);
    if (!TryPushMessageToBridge(message, processor, source, ref server))
    {
        ThrowFailed(tcs, ExceptionFactory.NoConnectionAvailable(IncludeDetailInExceptions, message.Command, message, server));
    }
    return tcs.Task;
}
Answer to the OP's comment:
Hi. Thanks for your answer. We know the purpose of the commands, what we would like to know is if it has any difference internally or any risk of behaving differently (execution order etc.)
Since the async operation won't have finished when you publish the message on the Redis channel, it can happen that you publish the message and the operation never gets executed. You lose a lot of control.
When you send a fire-and-forget command it might not be executed either, but you know that the attempt was made before you publish the channel's message. Therefore, you shouldn't use async operations to implement the fire-and-forget pattern when using StackExchange.Redis.
You may check this other related Q&A: Stackexchange.redis does fire and forget guarantees delivery?

Apache JENA TDB files locked after creation with web application

I am using JENA to create a triple store (TDB functionality) with the following code:
public void createTDBFromOWL() {
    Dataset dataset = TDBFactory.createDataset(newTripleStoreLocation);
    dataset.begin(ReadWrite.WRITE);
    try {
        // getting the model inside the transaction
        Model model = dataset.getDefaultModel();
        FileManager fileManager = FileManager.get();
        Model holder = fileManager.readModel(model, newOWLFileLocation);
        // committing dataset
        dataset.commit();
        model.close();
        holder.close();
    } finally {
        dataset.end();
        dataset.close();
    }
}
After I create the triple store, the files created are locked by my application server (Glassfish), and I can't delete them until I manually stop Glassfish and it releases its lock. As shown in the above code, I think I am closing everything, so I don't get why a lock is maintained on the files.
When you call Dataset#close(), the implementation delegates that call to an underlying DatasetGraphBase#close(), which then ultimately delegates to DatasetGraphTDB#_close().
This results in calls to TripleTable#close() and QuadTable#close(). Both of these call (several) NodeTupleTable#close(). Continuing with the indirection, this calls NodeTable#close() and TupleTable#close(). The former is an interface, so we'd need to make a proper guess as to which class is run in your implementation. The latter iterates through a collection of TupleIndex objects and calls close() on each of them. TupleIndex is, also, an interface.
There is only one meaningful hierarchy of descendants from TupleIndex that results in something which can lock a file, which leads us to TupleIndexRecord#close(). We can then follow a particular implementation of RangeIndex called BPlusTree all the way down until we see actual ownership of the MappedByteBuffer.
Ultimately, while reading the implementation of BlockAccessMapped#close(), it seems like the entire hierarchy is closing things properly, down to the final classes, but that this longstanding bug may be the culprit. From the documentation:
once a file has been mapped a number of operations on that file will fail until the mapping has been released (e.g. delete, truncating to a size less than the mapped area). However the programmer can't control accurately the time at which the unmapping takes place --- typically it depends on the processing of finalization or a PhantomReference queue.
So there you have it. Despite Jena's best efforts, one cannot yet control when that file will be unmapped in Java. This ends up being the trade-off for memory-mapped file I/O in Java.
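To make that limitation concrete, here is a small standalone sketch (not from Jena; the behaviour is platform-dependent, with the delete typically failing on Windows and usually succeeding on Linux) showing that a memory-mapped file may refuse to be deleted until the mapping has been garbage-collected:
import java.io.File;
import java.io.RandomAccessFile;
import java.nio.MappedByteBuffer;
import java.nio.channels.FileChannel;

public class MappedFileLockDemo {
    public static void main(String[] args) throws Exception {
        File f = File.createTempFile("mapped", ".dat");
        try (RandomAccessFile raf = new RandomAccessFile(f, "rw");
             FileChannel channel = raf.getChannel()) {
            MappedByteBuffer buffer = channel.map(FileChannel.MapMode.READ_WRITE, 0, 1024);
            buffer.put(0, (byte) 1); // touch the mapping
        }
        // The channel is closed, but the mapping may still be alive: on Windows this
        // delete typically fails until the buffer is garbage-collected and unmapped.
        System.out.println("deleted: " + f.delete());
    }
}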