Dynamic set of publishers all emitting through the same flux - spring-webflux

I am trying to build a kind of hub service that can emit through a hot Flux (output), but where you can also register/unregister Flux producers/publishers (input).
I know I can do something like:
class Hub<T> {
    /**
     * @return unregister function
     */
    Function<Void, Void> registerProducer(final Flux<T> flux) { ... }

    Disposable subscribe(Consumer<? super T> consumer) {
        if (out == null) {
            // obviously this will not work!
            out = Flux.merge(producer1, producer2, ...).share();
        }
        return out.subscribe(consumer);
    }
}
... but as these "producers" are registered and unregistered, how do I add a new flux source to the existing, already-subscribed flux? Or remove an unregistered source from it?
TIA!

Flux is immutable by design, so as you've implied in the question, there's no way to just "update" an existing Flux in situ.
Usually I'd recommend avoiding using a Processor directly. However, this is one of the (rare-ish) cases where a Processor is probably the only sane option, since you essentially want to be publishing elements dynamically based on the producers that you're registering. Something similar to:
class Hub<T> {

    private final FluxProcessor<T, T> processor;
    private final FluxSink<T> sink;

    public Hub() {
        this.processor = DirectProcessor.<T>create().serialize();
        this.sink = processor.sink();
    }

    public Disposable registerProducer(Flux<T> flux) {
        return flux.subscribe(sink::next);
    }

    public Flux<T> read() {
        return processor;
    }
}
If you want to remove a producer, then you can keep track of the Disposable returned from registerProducer() and call dispose() on it when done.
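On more recent Reactor versions, where FluxProcessor and DirectProcessor are deprecated, the same idea can be expressed with the Sinks API. A minimal sketch, assuming Reactor 3.4+:
import reactor.core.Disposable;
import reactor.core.publisher.Flux;
import reactor.core.publisher.Sinks;

class Hub<T> {

    // multicast sink: supports several subscribers, buffering signals until the first one arrives
    private final Sinks.Many<T> sink = Sinks.many().multicast().onBackpressureBuffer();

    // register a producer; dispose the returned Disposable to unregister it
    public Disposable registerProducer(Flux<T> flux) {
        // FAIL_FAST throws on concurrent emissions (FAIL_NON_SERIALIZED); if several producers
        // may emit at the same time, you would want a retrying EmitFailureHandler instead
        return flux.subscribe(value -> sink.emitNext(value, Sinks.EmitFailureHandler.FAIL_FAST));
    }

    public Flux<T> read() {
        return sink.asFlux();
    }
}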

Migration guide for Schedulers.enableMetrics() in project Reactor

I noticed that Schedulers.enableMetrics() got deprecated, but I don't know what I should do to get all my schedulers metered in a typical use case (a Spring Boot application).
The Javadoc suggests using timedScheduler, but how should that be achieved in Spring Boot?
First off, here are my thoughts on why the Schedulers.enableMetrics() approach was deprecated:
The previous approach was flawed in several ways:
intrinsic dependency on the MeterRegistry#globalRegistry() without any way of using a different registry.
wrong level of abstraction and limited instrumentation:
it was not the schedulers themselves that were instrumented, but individual ExecutorService instances assumed to back the schedulers.
schedulers NOT backed by any ExecutorService couldn't be instrumented.
schedulers backed by MULTIPLE ExecutorServices (e.g. a pool of workers) would produce multiple levels of metrics that are difficult to aggregate.
instrumentation was all-or-nothing, potentially polluting metrics backend with metrics from global or irrelevant schedulers.
A deliberate constraint of the new approach is that each Scheduler must be explicitly wrapped, which ensures that the correct MeterRegistry is used and that metrics are recognizable and aggregated for that particular Scheduler (thanks to the mandatory metricsPrefix).
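For reference, wrapping a single Scheduler explicitly looks roughly like this (a sketch; Micrometer here is reactor.core.observability.micrometer.Micrometer from the reactor-core-micrometer module, and the names and prefix are made up):
MeterRegistry registry = new SimpleMeterRegistry();
Scheduler timed = Micrometer.timedScheduler(
        Schedulers.newBoundedElastic(4, 100, "work"), // the Scheduler to instrument
        registry,                                      // the MeterRegistry to publish to
        "my.app.work");                                // mandatory metrics prefix
timed.schedule(() -> { /* some task */ });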
I'm not a Spring Boot expert, but if you really want to instrument all the schedulers including the global ones here is a naive approach that will aggregate data from all the schedulers of same category, demonstrated in a Spring Boot app:
@SpringBootApplication
public class DemoApplication {

    public static void main(String[] args) {
        SpringApplication.run(DemoApplication.class, args);
    }

    @Configuration
    static class SchedulersConfiguration {

        @Bean
        @Order(1)
        public Scheduler originalScheduler() {
            // For comparison, we can capture a new original Scheduler (which won't be disposed by setFactory, unlike the global ones)
            return Schedulers.newBoundedElastic(4, 100, "compare");
        }

        @Bean
        public SimpleMeterRegistry registry() {
            return new SimpleMeterRegistry();
        }

        @Bean
        public Schedulers.Factory instrumentedSchedulers(SimpleMeterRegistry registry) {
            // Let's create a Factory that does the same as the default Schedulers factory in Reactor-Core, but with instrumentation
            return new Schedulers.Factory() {
                @Override
                public Scheduler newBoundedElastic(int threadCap, int queuedTaskCap, ThreadFactory threadFactory, int ttlSeconds) {
                    // The default implementation maps to the vanilla Schedulers so we can delegate to that
                    Scheduler original = Schedulers.Factory.super.newBoundedElastic(threadCap, queuedTaskCap, threadFactory, ttlSeconds);
                    // IMPORTANT NOTE: in this example _all_ the schedulers of the same type will share the same prefix/name
                    // this would especially be problematic if gauges were involved as they replace old gauges of the same name.
                    // Fortunately, for now, TimedScheduler only uses counters, timers and longTaskTimers.
                    String prefix = "my.instrumented.boundedElastic"; // TimedScheduler will add `.scheduler.xxx` to that prefix
                    return Micrometer.timedScheduler(original, registry, prefix);
                }

                @Override
                public Scheduler newParallel(int parallelism, ThreadFactory threadFactory) {
                    Scheduler original = Schedulers.Factory.super.newParallel(parallelism, threadFactory);
                    String prefix = "my.instrumented.parallel"; // TimedScheduler will add `.scheduler.xxx` to that prefix
                    return Micrometer.timedScheduler(original, registry, prefix);
                }

                @Override
                public Scheduler newSingle(ThreadFactory threadFactory) {
                    Scheduler original = Schedulers.Factory.super.newSingle(threadFactory);
                    String prefix = "my.instrumented.single"; // TimedScheduler will add `.scheduler.xxx` to that prefix
                    return Micrometer.timedScheduler(original, registry, prefix);
                }
            };
        }

        @PreDestroy
        void resetFactories() {
            System.err.println("Resetting Schedulers Factory to default");
            // Later on if we want to disable instrumentation we can reset the Factory to defaults (closing all instrumented schedulers)
            Schedulers.resetFactory();
        }
    }

    @Service
    public static class Demo implements ApplicationRunner {

        final Scheduler forComparison;
        final SimpleMeterRegistry registry;
        final Schedulers.Factory factory;

        Demo(Scheduler forComparison, SimpleMeterRegistry registry, Schedulers.Factory factory) {
            this.forComparison = forComparison;
            this.registry = registry;
            this.factory = factory;
            Schedulers.setFactory(factory);
        }

        public void generateMetrics() {
            Schedulers.boundedElastic().schedule(() -> {});
            Schedulers.newBoundedElastic(4, 100, "bounded1").schedule(() -> {});
            Schedulers.newBoundedElastic(4, 100, "bounded2").schedule(() -> {});
            Micrometer.timedScheduler(
                    forComparison,
                    registry,
                    "my.custom.instrumented.bounded"
            ).schedule(() -> {});
            Schedulers.newBoundedElastic(4, 100, "bounded3").schedule(() -> {});
        }

        public String getCompletedSummary() {
            return Search.in(registry)
                    .name(n -> n.endsWith(".scheduler.tasks.completed"))
                    .timers()
                    .stream()
                    .map(c -> c.getId().getName() + "=" + c.count())
                    .collect(Collectors.joining("\n"));
        }

        @Override
        public void run(ApplicationArguments args) throws Exception {
            generateMetrics();
            System.err.println(getCompletedSummary());
        }
    }
}
Which prints:
my.instrumented.boundedElastic.scheduler.tasks.completed=4
my.custom.instrumented.bounded.scheduler.tasks.completed=1
Notice how the metrics for the four Schedulers produced by the instrumented factory are aggregated together.
There's a bit of a hacky workaround for this: by default, Schedulers uses ReactorThreadFactory, an internal private class which happens to be a Supplier<String>, supplying the "simplified name" (i.e. toString() but without the configuration options) of the Scheduler.
One could use the following method to tentatively extract that name:
static String inferSimpleSchedulerName(ThreadFactory threadFactory, String defaultName) {
    if (!(threadFactory instanceof Supplier)) {
        return defaultName;
    }
    Object supplied = ((Supplier<?>) threadFactory).get();
    if (!(supplied instanceof String)) {
        return defaultName;
    }
    return (String) supplied;
}
Which can be applied to, e.g., the newParallel method in the factory:
String simplifiedName = inferSimpleSchedulerName(threadFactory, "para???");
String prefix = "my.instrumented." + simplifiedName; // TimedScheduler will add `.scheduler.xxx` to that prefix
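Put together, the newParallel override in the instrumented factory above might then look like this (a sketch; "para???" is just an arbitrary fallback name used when the inference fails):
@Override
public Scheduler newParallel(int parallelism, ThreadFactory threadFactory) {
    Scheduler original = Schedulers.Factory.super.newParallel(parallelism, threadFactory);
    // try to recover the user-facing scheduler name from the ThreadFactory
    String simplifiedName = inferSimpleSchedulerName(threadFactory, "para???");
    String prefix = "my.instrumented." + simplifiedName; // TimedScheduler will add `.scheduler.xxx` to that prefix
    return Micrometer.timedScheduler(original, registry, prefix);
}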
This can then be demonstrated by submitting a few tasks to different parallel schedulers in the Demo#generateMetrics() part:
Schedulers.parallel().schedule(() -> {});
Schedulers.newParallel("paraOne").schedule(() -> {});
Schedulers.newParallel("paraTwo").schedule(() -> {});
And now it prints:
my.instrumented.paraOne.scheduler.tasks.completed=1
my.instrumented.paraTwo.scheduler.tasks.completed=1
my.instrumented.parallel.scheduler.tasks.completed=1
my.custom.instrumented.bounded.scheduler.tasks.completed=1
my.instrumented.boundedElastic.scheduler.tasks.completed=4

Which design pattern to use for using different subclasses based on input [closed]

There is an interface called Processor, which has two implementations SimpleProcessor and ComplexProcessor.
Now I have a process that consumes an input and, based on that input, decides whether it should use SimpleProcessor or ComplexProcessor.
Current solution: I was thinking of using Abstract Factory, which would generate the instance on the basis of the input.
But the issue is that I don't want new instances. I want to use already instantiated objects. That is, I want to re-use the instances.
That means Abstract Factory is absolutely the wrong pattern to use here, as it is for generating objects on the basis of type.
Another thing that our team normally does is create a map from input to the corresponding processor instance. At runtime, we can use that map to get the correct instance on the basis of the input.
This feels like an ad hoc solution.
I want this to be extensible: new input types can be mapped to new processor types.
Is there some standard way to solve this?
You can use a variation of the Chain of Responsibility pattern.
It will scale far better than using a Map (or hash table in general).
This variation will support dependency injection and is very easy to extend (without breaking any code or violating the Open-Closed principle).
As opposed to the classic version, handlers do not need to be explicitly chained (the classic version scales very poorly).
The pattern uses polymorphism to enable extensibility and is therefore targeting an object oriented language.
The pattern is as follows:
The client API is a container class that manages a collection of input handlers (for example SimpleProcessor and ComplexProcessor).
Each handler is only known to the container by a common interface and unknown to the client.
The collection of handlers is passed to the container via the constructor (to enable optional dependency injection).
The container accepts the predicate (input) and passes it on to the anonymous handlers by iterating over the handler collection.
Each handler now decides, based on the input, whether it can handle it (returns true) or not (returns false).
If a handler returns true (to signal that the input was successfully handled), the container stops further processing of the input by other handlers (alternatively, use a different criterion, e.g. to allow multiple handlers to handle the input).
In the following very basic example implementation, the order of handler execution is simply defined by their position in their container (collection).
If this isn't sufficient, you can simply implement a priority algorithm.
Implementation (C#)
Below is the container. It manages the individual handler implementations using polymorphism. Since handler implementations are only known by their common interface, the container scales extremely well: simply add/inject an additional handler implementation.
The container is used directly by the client (whereas the handlers are hidden from the client and anonymous to the container).
interface IInputProcessor
{
    void Process(object input);
}

class InputProcessor : IInputProcessor
{
    private IEnumerable<IInputHandler> InputHandlers { get; }

    // Constructor.
    // Optionally use an IoC container to inject the dependency (a collection of input handlers).
    public InputProcessor(IEnumerable<IInputHandler> inputHandlers)
    {
        this.InputHandlers = inputHandlers;
    }

    // Method to handle the input.
    // The input is then delegated to the input handlers.
    public void Process(object input)
    {
        foreach (IInputHandler inputHandler in this.InputHandlers)
        {
            if (inputHandler.TryHandle(input))
            {
                return;
            }
        }
    }
}
Below are the input handlers.
To add new handlers, i.e. to extend input handling, simply implement the IInputHandler interface and add the handler to the collection that is passed/injected into the container (IInputProcessor):
interface IInputHandler
{
    bool TryHandle(object input);
}

class SimpleProcessor : IInputHandler
{
    public bool TryHandle(object input)
    {
        // Pattern match the untyped input before comparing it
        if (input is int number && number == 1)
        {
            //TODO::Handle input
            return true;
        }
        return false;
    }
}

class ComplexProcessor : IInputHandler
{
    public bool TryHandle(object input)
    {
        if (input is int number && number == 3)
        {
            //TODO::Handle input
            return true;
        }
        return false;
    }
}
Usage Example
public class Program
{
    public static void Main()
    {
        /* Setup the Chain of Responsibility.
           Preferably configure an IoC container. */
        var inputHandlers = new List<IInputHandler>
        {
            new SimpleProcessor(),
            new ComplexProcessor()
        };
        IInputProcessor inputProcessor = new InputProcessor(inputHandlers);

        /* Use the handler chain */
        int input = 3;
        inputProcessor.Process(input); // Will execute the ComplexProcessor

        input = 1;
        inputProcessor.Process(input); // Will execute the SimpleProcessor
    }
}
It is possible to use the Strategy pattern in combination with the Factory pattern. Factory objects can be cached so that reusable objects are returned instead of being recreated every time they are needed.
As an alternative to caching, it is possible to use the Singleton pattern. In ASP.NET Core this is pretty simple: if you have a DI container, just make sure that you've configured the instance creation settings to singleton.
Let's start with the first example. We need some enum of ProcessorType:
public enum ProcessorType
{
    Simple, Complex
}
Then this is our abstraction of processors:
public interface IProcessor
{
    DateTime DateCreated { get; }
}
And its concrete implementations:
public class SimpleProcessor : IProcessor
{
    public DateTime DateCreated { get; } = DateTime.Now;
}

public class ComplexProcessor : IProcessor
{
    public DateTime DateCreated { get; } = DateTime.Now;
}
Then we need a factory with cached values:
public class ProcessorFactory
{
    private static readonly IDictionary<ProcessorType, IProcessor> _cache
        = new Dictionary<ProcessorType, IProcessor>()
        {
            { ProcessorType.Simple, new SimpleProcessor() },
            { ProcessorType.Complex, new ComplexProcessor() }
        };

    public IProcessor GetInstance(ProcessorType processorType)
    {
        return _cache[processorType];
    }
}
And code can be run like this:
ProcessorFactory processorFactory = new ProcessorFactory();
Thread.Sleep(3000);
var simpleProcessor = processorFactory.GetInstance(ProcessorType.Simple);
Console.WriteLine(simpleProcessor.DateCreated); // OUTPUT: 2022-07-07 8:00:01
ProcessorFactory processorFactory_1 = new ProcessorFactory();
Thread.Sleep(3000);
var complexProcessor = processorFactory_1.GetInstance(ProcessorType.Complex);
Console.WriteLine(complexProcessor.DateCreated); // OUTPUT: 2022-07-07 8:00:01
The second way
The second way is to use a DI container. So we need to modify our factory to get its instances from the dependency injection container:
public class ProcessorFactoryByDI
{
    private readonly IDictionary<ProcessorType, IProcessor> _cache;

    public ProcessorFactoryByDI(
        SimpleProcessor simpleProcessor,
        ComplexProcessor complexProcessor)
    {
        _cache = new Dictionary<ProcessorType, IProcessor>()
        {
            { ProcessorType.Simple, simpleProcessor },
            { ProcessorType.Complex, complexProcessor }
        };
    }

    public IProcessor GetInstance(ProcessorType processorType)
    {
        return _cache[processorType];
    }
}
And if you use ASP.NET Core, then you can declare your objects as singleton like this:
services.AddSingleton<SimpleProcessor>();
services.AddSingleton<ComplexProcessor>();
Read more about lifetime of an object

Spring Integration testing a Files.inboundAdapter flow

I have this flow that I am trying to test, but nothing works as expected. The flow itself works well, but testing it seems a bit tricky.
This is my flow:
@Configuration
@RequiredArgsConstructor
public class FileInboundFlow {

    private final ThreadPoolTaskExecutor threadPoolTaskExecutor;
    private String filePath;

    @Bean
    public IntegrationFlow fileReaderFlow() {
        return IntegrationFlows.from(Files.inboundAdapter(new File(this.filePath))
                        .filterFunction(...)
                        .preventDuplicates(false),
                endpointConfigurer -> endpointConfigurer.poller(
                        Pollers.fixedDelay(500)
                                .taskExecutor(this.threadPoolTaskExecutor)
                                .maxMessagesPerPoll(15)))
                .transform(new UnZipTransformer())
                .enrichHeaders(this::headersEnricher)
                .transform(Message.class, this::modifyMessagePayload)
                .route(Map.class, this::channelsRouter)
                .get();
    }

    private String channelsRouter(Map<String, File> payload) {
        boolean isZip = payload.values()
                .stream()
                .anyMatch(file -> isZipFile(file));
        return isZip ? ZIP_CHANNEL : XML_CHANNEL; // ZIP_CHANNEL and XML_CHANNEL are PublishSubscribeChannel
    }

    @Bean
    public SubscribableChannel xmlChannel() {
        var channel = new PublishSubscribeChannel(this.threadPoolTaskExecutor);
        channel.setBeanName(XML_CHANNEL);
        return channel;
    }

    @Bean
    public SubscribableChannel zipChannel() {
        var channel = new PublishSubscribeChannel(this.threadPoolTaskExecutor);
        channel.setBeanName(ZIP_CHANNEL);
        return channel;
    }

    // There is a @ServiceActivator on each channel
    @ServiceActivator(inputChannel = XML_CHANNEL)
    public void handleXml(Message<Map<String, File>> message) {
        ...
    }

    @ServiceActivator(inputChannel = ZIP_CHANNEL)
    public void handleZip(Message<Map<String, File>> message) {
        ...
    }

    // Plus a @Transformer on the XML_CHANNEL
    @Transformer(inputChannel = XML_CHANNEL, outputChannel = BUS_CHANNEL)
    private List<BusData> xmlFileToIngestionMessagePayload(Map<String, File> xmlFilesByName) {
        return xmlFilesByName.values()
                .stream()
                .map(...)
                .collect(Collectors.toList());
    }
}
I would like to test multiple cases; the first one is checking the message payload published on each channel at the end of fileReaderFlow.
So I defined this test class:
@SpringBootTest
@SpringIntegrationTest
@ExtendWith(SpringExtension.class)
class FileInboundFlowTest {

    @Autowired
    private MockIntegrationContext mockIntegrationContext;

    @TempDir
    static Path localWorkDir;

    @BeforeEach
    void setUp() {
        copyFileToTheFlowDir(); // here I copy a file to trigger the flow
    }

    @Test
    void checkXmlChannelPayloadTest() throws InterruptedException {
        Thread.sleep(1000); // waiting for the flow execution
        PublishSubscribeChannel xmlChannel = this.getBean(XML_CHANNEL, PublishSubscribeChannel.class); // I extract the channel to listen to the message sent to it.
        xmlChannel.subscribe(message -> {
            assertThat(message.getPayload()).isInstanceOf(Map.class); // This is never executed
        });
    }
}
As expected, that test does not work because assertThat(message.getPayload()).isInstanceOf(Map.class); is never executed.
After reading the documentation I didn't find any hint to help me solve that issue. Any help would be appreciated! Thanks a lot
First of all, that channel.setBeanName(XML_CHANNEL); does not affect the target bean. You do this during the bean creation phase, and the dependency injection container knows nothing about this setting: it just does not consult it. If you really want to dictate XML_CHANNEL as the bean name, you'd better look into the @Bean(name) attribute.
The problem in the test is that you are missing the async nature of the flow. That Files.inboundAdapter() works in a fully different thread and emits messages outside of your test method. So, even if you could subscribe to the channel in time, before any message is emitted to it, that doesn't mean your test would work correctly: the assertThat() would be performed on a different thread, so there is no real JUnit report for it in your test method context.
So, what I'd suggest to do is (a rough sketch of the resulting test follows this list):
Have Files.inboundAdapter() stopped in the beginning of the test before any setup you'd like to do in the test. Or at least don't place files into that filePath, so the channel adapter doesn't emit messages.
Take the channel from the application context and, if you wish, subscribe to it or add a ChannelInterceptor.
Have an async barrier, e.g. a CountDownLatch, to pass to that subscriber.
Start the channel adapter or put file into the dir for scanning.
Wait for the async barrier before verifying some value or state.
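Putting those steps together, a rough sketch of the test might look like this (assumptions: the xmlChannel bean is injected into the test, the copyFileToTheFlowDir() helper from the question is called here instead of in a @BeforeEach, and the channel adapter is not emitting anything when the test starts):
@Autowired
private PublishSubscribeChannel xmlChannel;

@Test
void checkXmlChannelPayloadTest() throws InterruptedException {
    CountDownLatch latch = new CountDownLatch(1);
    AtomicReference<Message<?>> received = new AtomicReference<>();

    // subscribe before any file is produced, so no message can be missed
    xmlChannel.subscribe(message -> {
        received.set(message);
        latch.countDown(); // release the async barrier from the consumer thread
    });

    copyFileToTheFlowDir(); // now trigger the flow

    // wait for the async barrier, then assert on the test thread
    assertThat(latch.await(5, TimeUnit.SECONDS)).isTrue();
    assertThat(received.get().getPayload()).isInstanceOf(Map.class);
}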

How to iterate an object inside a Flux and do an operation on it?

I'm using Project Reactor and I'd like to perform the following:
@Override
public void run(ApplicationArguments args) {
    Flux.from(KafkaReceiver.create(receiverOptions)
            .receive()
            .map(this::getObject)
            .flatMap(this::iterateElasticWrites)
            .flatMap(this::writeTheWholeObjectToS3)
    ).subscribe();
}

// What I'd like to do - but a non reactive code
private Publisher<MyObj> iterateElasticWrites(MyObj message) {
    for (MyDoc file : message.getDocs()) {
        writeElasticDoc(file.getText());
    }
    return Mono.just(message);
}
I'm struggling to find the equivalent of iterateElasticWrites in Project Reactor. I'd like to iterate over an object of mine (MyObj) and write each element of its documents list into Elasticsearch reactively.
In Reactor you always need to construct a reactive flow using different operators, and all reactive/async code should return a Mono or Flux.
Looking at your example it could look like
private Mono<MyObj> iterateElasticWrites(MyObj message) {
    return Flux.fromIterable(message.getDocs())
            .flatMap(doc -> writeElasticDoc(doc.getText()))
            .then(Mono.just(message));
}
where writeElasticDoc could be defined as
private Mono<Void> writeElasticDoc(String text) {
...
}

Spring WebFlux (Flux): how to publish dynamically

I am new to reactive programming and Spring WebFlux. I want my App 1 to publish Server-Sent Events through a Flux and my App 2 to listen to them continuously.
I want the Flux to publish on demand (e.g. when something happens). All the examples I found use Flux.interval to publish events periodically, and there seems to be no way to append/modify the content of a Flux once it is created.
How can I achieve my goal? Or am I totally wrong conceptually?
Publish "dynamically" using FluxProcessor and FluxSink
One of the techniques to supply data manually to the Flux is using the FluxProcessor#sink method, as in the following example:
@SpringBootApplication
@RestController
public class DemoApplication {

    final FluxProcessor processor;
    final FluxSink sink;
    final AtomicLong counter;

    public static void main(String[] args) {
        SpringApplication.run(DemoApplication.class, args);
    }

    public DemoApplication() {
        this.processor = DirectProcessor.create().serialize();
        this.sink = processor.sink();
        this.counter = new AtomicLong();
    }

    @GetMapping("/send")
    public void test() {
        sink.next("Hello World #" + counter.getAndIncrement());
    }

    @RequestMapping(produces = MediaType.TEXT_EVENT_STREAM_VALUE)
    public Flux<ServerSentEvent> sse() {
        return processor.map(e -> ServerSentEvent.builder(e).build());
    }
}
Here, I created a DirectProcessor in order to support multiple subscribers that will listen to the data stream. Also, I applied FluxProcessor#serialize, which provides safe support for multiple producers (invocation from different threads without violating the Reactive Streams spec rules, especially rule 1.3). Finally, by calling "http://localhost:8080/send" we will see the message Hello World #1 (of course, only if you connected to "http://localhost:8080" previously).
Update For Reactor 3.4
With Reactor 3.4 you have a new API called reactor.core.publisher.Sinks. Sinks API offers a fluent builder for manual data-sending which lets you specify things like the number of elements in the stream and backpressure behavior, number of supported subscribers, and replay capabilities:
@SpringBootApplication
@RestController
public class DemoApplication {

    final Sinks.Many sink;
    final AtomicLong counter;

    public static void main(String[] args) {
        SpringApplication.run(DemoApplication.class, args);
    }

    public DemoApplication() {
        this.sink = Sinks.many().multicast().onBackpressureBuffer();
        this.counter = new AtomicLong();
    }

    @GetMapping("/send")
    public void test() {
        EmitResult result = sink.tryEmitNext("Hello World #" + counter.getAndIncrement());
        if (result.isFailure()) {
            // do something here, since emission failed
        }
    }

    @RequestMapping(produces = MediaType.TEXT_EVENT_STREAM_VALUE)
    public Flux<ServerSentEvent> sse() {
        return sink.asFlux().map(e -> ServerSentEvent.builder(e).build());
    }
}
Note, message sending via the Sinks API introduces a new concept of emission and its result. The reason for such an API is the fact that Reactor extends Reactive Streams and has to follow backpressure control. That said, if you emit more signals than were requested and the underlying implementation does not support buffering, your message will not be delivered. Therefore, the result of tryEmitNext is an EmitResult which indicates whether the message was sent or not.
Also note that by default the Sinks API gives a serialized version of the Sink, which means you don't have to care about concurrency. However, if you know in advance that emission of messages is serial, you may build a Sinks.unsafe() version which does not serialize the given messages.
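For illustration, building such an unserialized sink might look like this (a sketch, only safe if all emissions happen from a single thread):
Sinks.Many<String> unsafeSink = Sinks.unsafe().many().multicast().onBackpressureBuffer();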
Just another idea, using EmitterProcessor as a gateway to flux
import reactor.core.publisher.EmitterProcessor;
import reactor.core.publisher.Flux;

public class MyEmitterProcessor {

    EmitterProcessor<String> emitterProcessor;

    public static void main(String[] args) {
        MyEmitterProcessor myEmitterProcessor = new MyEmitterProcessor();
        Flux<String> publisher = myEmitterProcessor.getPublisher();
        myEmitterProcessor.onNext("A");
        myEmitterProcessor.onNext("B");
        myEmitterProcessor.onNext("C");
        myEmitterProcessor.complete();
        publisher.subscribe(x -> System.out.println(x));
    }

    public Flux<String> getPublisher() {
        emitterProcessor = EmitterProcessor.create();
        return emitterProcessor.map(x -> "consume: " + x);
    }

    public void onNext(String nextString) {
        emitterProcessor.onNext(nextString);
    }

    public void complete() {
        emitterProcessor.onComplete();
    }
}
For more info, see the Reactor doc. There is a recommendation from the document itself that "Most of the time, you should try to avoid using a Processor. They are harder to use correctly and prone to some corner cases." But I don't know which kind of corner cases.