In Spring WebFlux I have a controller similar to this:
@RestController
@RequestMapping("/data")
public class DataController {

    @GetMapping(produces = MediaType.APPLICATION_JSON_VALUE)
    public Flux<Data> getData() {
        return <data from database using reactive driver>;
    }
}
What exactly is subscribing to the publisher?
What (if anything) is providing backpressure?
For context I'm trying to evaluate if there are advantages to using Spring WebFlux in this specific situation over Spring MVC.
Note: I am not a developer of the Spring framework, so any comments are welcome.
What exactly is subscribing to the publisher?
It is a long-living subscription to the port (the server initialisation itself). Accordingly, the ReactorHttpServer.class has the method:
@Override
protected void startInternal() {
    DisposableServer server = this.reactorServer.handle(this.reactorHandler).bind().block();
    setPort(((InetSocketAddress) server.address()).getPort());
    this.serverRef.set(server);
}
The subscriber is created by the bind method, which (as far as I can see) does request(Long.MAX_VALUE), so there is no backpressure management here.
The important part for request handling is the method handle(this.reactorHandler). The reactorHandler is an instance of ReactorHttpHandlerAdapter. Further up the stack (within the apply method of ReactorHttpHandlerAdapter) sits the DispatcherHandler.class. The Javadoc of this class starts with "Central dispatcher for HTTP request handlers/controllers. Dispatches to registered handlers for processing a request, providing convenient mapping facilities." It has the central method:
@Override
public Mono<Void> handle(ServerWebExchange exchange) {
    if (this.handlerMappings == null) {
        return createNotFoundError();
    }
    return Flux.fromIterable(this.handlerMappings)
            .concatMap(mapping -> mapping.getHandler(exchange))
            .next()
            .switchIfEmpty(createNotFoundError())
            .flatMap(handler -> invokeHandler(exchange, handler))
            .flatMap(result -> handleResult(exchange, result));
}
Here, the actual request processing happens. The response is written within handleResult. How the result is written then depends on the actual server implementation.

For the default server, i.e. Reactor Netty, it will be a ReactorServerHttpResponse.class. Here you can see the method writeWithInternal. It takes the publisher result of the handler method and writes it to the underlying HTTP connection:
@Override
protected Mono<Void> writeWithInternal(Publisher<? extends DataBuffer> publisher) {
    return this.response.send(toByteBufs(publisher)).then();
}
One implementation of NettyOutbound.send(...) is reactor.netty.channel.ChannelOperations. For your specific case of a Flux return, this implementation manages the NIO within MonoSendMany.class. That class does subscribe(...) with a SendManyInner.class, which does backpressure management by implementing Subscriber; its onSubscribe does request(128). I guess Netty internally uses the TCP ACK to signal successful transmission.
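If you want to see that demand from your own code, Reactor's doOnRequest hook reports every request(n) signal travelling upstream. A minimal sketch (the endpoint path and the numbers are made up for illustration):

import org.springframework.http.MediaType;
import org.springframework.web.bind.annotation.GetMapping;
import org.springframework.web.bind.annotation.RestController;
import reactor.core.publisher.Flux;

@RestController
public class DemandLoggingController {

    // Hypothetical endpoint: logs how many elements the server-side
    // subscriber requests from the controller's publisher.
    @GetMapping(value = "/numbers", produces = MediaType.APPLICATION_JSON_VALUE)
    public Flux<Integer> numbers() {
        return Flux.range(1, 10_000)
                // e.g. prints "requested: 128" when the writer requests a batch
                .doOnRequest(n -> System.out.println("requested: " + n));
    }
}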
So,
What (if anything) is providing backpressure?
... yes, backpressure is provided, e.g. by SendManyInner.class; however, other implementations also exist.
For context I'm trying to evaluate if there are advantages to using Spring WebFlux in this specific situation over Spring MVC.
I think it is definitely worth evaluating. For performance, however, I guess the result will depend on the number of concurrent requests and maybe also on the type of your Data class. Generally speaking, WebFlux is usually the preferred choice for high-throughput, low-latency situations, and we generally see better hardware utilization in our environments. Assuming you take your data from a database, you will probably get the best results with a database driver that supports reactive access too. Besides performance, the backpressure management is always a good reason to have a look at WebFlux. Since we adopted WebFlux, our data platform has never had stability problems again (not to claim there are no other ways to have a stable system, but here many issues are solved out of the box).
As a side note: I recommend having a closer look at Schedulers; we recently gained 30% CPU time by choosing the right one for slow DB accesses.
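For illustration, a common pattern is to move a slow, blocking repository call onto the boundedElastic scheduler so it cannot starve the event loop. A minimal sketch, assuming a classic blocking DAO (all names here are made up):

import java.util.List;

import reactor.core.publisher.Flux;
import reactor.core.scheduler.Schedulers;

public class DataService {

    // Hypothetical blocking repository, e.g. a JDBC-backed DAO.
    interface BlockingDataRepository {
        List<String> findAll();
    }

    private final BlockingDataRepository repository;

    DataService(BlockingDataRepository repository) {
        this.repository = repository;
    }

    // defer + subscribeOn shifts the blocking call off the Netty event loop
    // onto the scheduler intended for blocking work.
    public Flux<String> findAllData() {
        return Flux.defer(() -> Flux.fromIterable(repository.findAll()))
                .subscribeOn(Schedulers.boundedElastic());
    }
}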
EDIT:
In https://docs.spring.io/spring/docs/current/spring-framework-reference/web-reactive.html#webflux-fn-handler-functions the reference documentation explicitly says:
ServerRequest and ServerResponse are immutable interfaces that offer JDK 8-friendly access to the HTTP request and response. Both request and response provide Reactive Streams back pressure against the body streams.
What exactly is subscribing to the publisher?
The framework (so Spring, in this case).

In general, you shouldn't subscribe in your own application; the framework should subscribe to your publisher when necessary. In the context of Spring, that's whenever a relevant request hits that controller.
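You can make that subscription visible with a doOnSubscribe hook, which fires exactly when the framework subscribes, i.e. once per matching request. A sketch with made-up names:

import org.springframework.web.bind.annotation.GetMapping;
import org.springframework.web.bind.annotation.RestController;
import reactor.core.publisher.Flux;

@RestController
public class SubscriptionLoggingController {

    // Each request to /items triggers one framework-side subscription,
    // so this line is printed once per request.
    @GetMapping("/items")
    public Flux<String> items() {
        return Flux.just("a", "b", "c")
                .doOnSubscribe(sub -> System.out.println("framework subscribed"));
    }
}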
What (if anything) is providing backpressure?
In this case, it's only restricted by the speed of the connection (I believe WebFlux will look at the underlying TCP layer) and will then request data as required. Whether your upstream Flux listens to that backpressure, though, is another story; it may do so, or it may just flood the consumer with as much data as it can.
For context I'm trying to evaluate if there are advantages to using Spring WebFlux in this specific situation over Spring MVC.
The main advantage is being able to hold huge numbers of connections open with only a few threads - so no overhead of context switching. (That's not the sole advantage, but most of the advantages generally boil down to that point.) Usually, this is only an advantage worth considering if you need to hold in the region of thousands of connections open at once.
The main disadvantage is that reactive code looks very different from standard Java code, and is usually necessarily more complex as a result. Debugging is also harder; vanilla stack traces become all but useless, for instance (though there are tools & techniques to make this easier).
Related
I would like to ask a question about two technologies.

We first started with an application that has to call a third party's REST API, hence we used the WebFlux WebClient in our Spring Boot WebFlux project. So far so good; we had a successful app for a while.

Then the third-party app (not ours) started to become flaky and would sometimes fail on our requests. We had to implement some kind of retry logic. After implementing retry logic such as WebClient retries, the business flow is now working fine.

We mainly took the logic from the framework directly. For instance, a talk from @simon-baslé, Cancel, Retry and Timeouts, at the recent SpringOne gave many working examples:
.retryWhen(Retry.backoff(5, Duration.ofMillis(10))
        .maxBackoff(Duration.ofSeconds(1))
        .jitter(0.4))
.timeout(Duration.ofSeconds(5))
On the other hand, lately, more and more apps are moving towards the Circuit Breaker pattern. The Spring Cloud Circuit Breaker project, backed by Resilience4J, is a popular implementation for patterns such as Circuit Breaker, Bulkhead, and of course Retry.

Hence my question: is there a benefit to using/combining both in terms of retry?

Any gain in having the two together? Any drawbacks?

Or is only one of the two enough? In which case, which one, and why?
Thank you
We (the Resilience4j team) have implemented custom Project Reactor operators for CircuitBreaker, Retry and Timeout. Internally, Retry and Timeout use operators from Project Reactor, but Resilience4j adds functionality on top of them:
External configuration of Retry, Timeout and CircuitBreaker via config files
Spring Cloud Config support to dynamically adjust the configuration
Metrics, metrics, metrics ;)
Please see https://resilience4j.readme.io/docs/examples-1 and https://resilience4j.readme.io/docs/getting-started-3
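For illustration, here is roughly how the reactor operators compose onto a WebClient call; a sketch, not a definitive implementation (the backend name, URL and default configs are assumptions):

import io.github.resilience4j.circuitbreaker.CircuitBreaker;
import io.github.resilience4j.reactor.circuitbreaker.operator.CircuitBreakerOperator;
import io.github.resilience4j.reactor.retry.RetryOperator;
import io.github.resilience4j.retry.Retry;
import org.springframework.web.reactive.function.client.WebClient;
import reactor.core.publisher.Mono;

public class BackendClient {

    private final CircuitBreaker circuitBreaker = CircuitBreaker.ofDefaults("backend");
    private final Retry retry = Retry.ofDefaults("backend");
    private final WebClient webClient = WebClient.create("https://backend.example.com");

    // Retry wraps the call first (inner), so the circuit breaker (outer)
    // only records the final outcome after the retries are exhausted.
    public Mono<String> fetch() {
        return webClient.get().uri("/data")
                .retrieve()
                .bodyToMono(String.class)
                .transformDeferred(RetryOperator.of(retry))
                .transformDeferred(CircuitBreakerOperator.of(circuitBreaker));
    }
}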
You can even use annotations to make it simpler:
@CircuitBreaker(name = BACKEND)
@RateLimiter(name = BACKEND)
@Retry(name = BACKEND)
@TimeLimiter(name = BACKEND, fallbackMethod = "fallback")
public Mono<String> method(String param1) {
    return ...
}

private Mono<String> fallback(String param1, TimeoutException ex) {
    return ...;
}
Please be aware that we are providing our own Spring Boot starter. I'm NOT talking about the Spring Cloud CircuitBreaker project.
I have a service that collects data and has to survive the app's lifecycle changes while the app is in the background. This service resides in the same process as my app, i.e. it is registered in the manifest as well.
The service posts LiveData to the app, and the main app retrieves this LiveData by binding to the service and doing something like:
private void onServiceConnected(TicketValidatorService service) {
    ...
    service.getStatus().observe(this, new Observer<SomeStatus>() {
        @Override
        public void onChanged(SomeStatus status) {
            handleStatusChanged(status);
        }
    });
    ...
}
Is this considered bad practice? Or should I rather communicate via Messenger/Handler or LocalBroadcastManager stuff over the service/app boundary? It would be difficult to put the service in another process, but I don't think I have to do that for the sake of my task.
Communicating with a local service directly is not considered a bad practice and is in fact an official recommendation. There is no reason to complicate your code to support cross-process communication when you are not going to use it. Moreover, this kind of communication involves marshalling/unmarshalling, which adds restrictions on the data types you can pass through and has some performance hit.
Also, please note that starting from Android 8 there are limitations on background services. So if you are not running your service as a foreground service, it's not going to stay alive for long after your app goes to the background.
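For completeness, a minimal sketch of promoting such a service to the foreground, assuming minSdkVersion 26 (channel id, notification text and the id constant are made-up placeholders):

import android.app.Notification;
import android.app.NotificationChannel;
import android.app.NotificationManager;
import android.app.Service;
import android.content.Intent;
import android.os.IBinder;

public class TicketValidatorService extends Service {

    private static final int NOTIFICATION_ID = 42; // arbitrary placeholder

    @Override
    public int onStartCommand(Intent intent, int flags, int startId) {
        NotificationChannel channel = new NotificationChannel(
                "status", "Status updates", NotificationManager.IMPORTANCE_LOW);
        getSystemService(NotificationManager.class).createNotificationChannel(channel);

        Notification notification = new Notification.Builder(this, "status")
                .setContentTitle("Collecting data")
                .setSmallIcon(android.R.drawable.stat_notify_sync)
                .build();

        // Keeps the service alive past Android 8's background execution limits.
        startForeground(NOTIFICATION_ID, notification);
        return START_STICKY;
    }

    @Override
    public IBinder onBind(Intent intent) {
        return null; // binding omitted in this sketch
    }
}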
My understanding is that Mono<List<T>> is a synchronized Flux<T>,
and that a Flux could not be a REST API response.
Am I right?
If not, what's the difference between Mono<List<T>> and Flux<T>?
And could a Flux be a REST API response somewhere?
As a return type, Mono<List<T>> means that you'll asynchronously get a full list of T elements in one shot.
Flux<T> means that you'll get zero to many T elements, possibly one by one as they come.
If you're getting such return types from an HTTP client such as WebClient, Mono<List<T>> and Flux<T> might be more or less equivalent from a runtime perspective if the returned Content-Type is, for example, "application/json". In this case, the decoder will deserialize the response in one shot. The only difference is that Flux<T> provides more interesting operators, and you can always collectList and fall back to a Mono<List<T>>.
On the other hand, if the returned Content-Type is a streaming one, for example "application/stream+json" then this definitely has an impact as you'll get the elements one by one as they come. In fact, if the returned stream is infinite, choosing Flux<T> is very important as the other one will never complete.
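To make the difference concrete, a sketch of the two client-side variants (the base URL and the Item type are assumptions):

import java.util.List;

import org.springframework.core.ParameterizedTypeReference;
import org.springframework.web.reactive.function.client.WebClient;
import reactor.core.publisher.Flux;
import reactor.core.publisher.Mono;

public class ItemClient {

    // Hypothetical payload type; the field is public so Jackson can bind it.
    public static class Item {
        public String name;
    }

    private final WebClient webClient = WebClient.create("https://api.example.com");

    // One shot: the whole list is decoded before the Mono emits.
    public Mono<List<Item>> allItemsAtOnce() {
        return webClient.get().uri("/items")
                .retrieve()
                .bodyToMono(new ParameterizedTypeReference<List<Item>>() {});
    }

    // Element by element: items are emitted as they are decoded.
    public Flux<Item> itemsAsTheyCome() {
        return webClient.get().uri("/items")
                .retrieve()
                .bodyToFlux(Item.class);
    }
}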
Mono<List<T>> will emit zero or at most one list of items of type T.

Flux<T> will emit zero or many items of type T.

Mono is bounded and Flux is not.
Mono<List<T>> is a synchronized Flux
Mono and Flux are both Reactor implementations of the Publisher interface specified in the Reactive Streams Specification.
Reactor Mono class:
public abstract class Mono<T> implements Publisher<T> {...}
Reactor Flux class:
public abstract class Flux<T> implements Publisher<T> {...}
Flux could not be a REST API response.
Of course Flux can be used as the response type of a REST API. By using Flux as the return type, you can easily switch from asynchronous to synchronous processing. If you use Spring Boot, you can even stream data to your consumer just by changing the Content-Type of your API endpoint to application/stream+json, as mentioned by @Brian.
Note that Flux and Mono are non-blocking, which means that your working threads (computer resources) can be used more efficiently.
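As an illustration of the streaming case mentioned above, a minimal sketch of such an endpoint (the path and the payload type are made up):

import java.time.Duration;

import org.springframework.web.bind.annotation.GetMapping;
import org.springframework.web.bind.annotation.RestController;
import reactor.core.publisher.Flux;

@RestController
public class TickController {

    // Hypothetical payload; the field is public so Jackson can serialize it.
    public static class Tick {
        public long value;
        public Tick(long value) { this.value = value; }
    }

    // One JSON document per element instead of a single JSON array: even
    // though this stream never completes, the consumer still receives data.
    @GetMapping(value = "/ticks", produces = "application/stream+json")
    public Flux<Tick> ticks() {
        return Flux.interval(Duration.ofSeconds(1)).map(Tick::new);
    }
}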
The example at https://spring.io/guides/gs/caching-gemfire/ shows that if there is a cache miss, we have to fetch the data from a server and store it in the cache.

Is this an example of GemFire running as a GemFire server, or as a GemFire client? I thought a client would automatically fetch the data from a server on a cache miss. If that is the case, would there ever be a cache miss for the client?
Regards,
Yash
First, I think you are missing the point of the core Spring Framework's Cache Abstraction. I encourage you to read more about the Cache Abstraction's intended purpose here.
In a nutshell, if one of your application objects makes a call to some "external", "expensive" service to access a resource, then caching may be applicable, especially if the inputs passed result in the exact same output every single time.
So, for a moment, let's imagine your application makes a call to the Geocoding API in the Google Maps API to translate addresses and (inversely) latitude/longitude coordinates.

You might have an application Spring @Service component like so...
#Service("AddressService")
class MyApplicationAddressService {
#Autowired
private GoogleGeocodingApiDao googleGeocodingApiDao;
#Cacheable("Address")
public Address getAddressFor(Point location) {
return googleGeocodingApiDao.convert(location);
}
}
#Region("Address")
class Address {
private Point location;
private State state;
private String street;
private String city;
private String zipCode;
...
}
Clearly, given a latitude/longitude (input), it should produce the same Address (result) every time. Also, since making a (network) call to an external API like Google's Geocoding service can be very expensive, both to access the resource and to perform the conversion, this type of service call is a perfect candidate for caching in our application.

Among many other caching providers (e.g. EhCache, Hazelcast, Redis, etc.), you can, of course, use Pivotal GemFire, or the open source alternative, Apache Geode, to back Spring's Caching Abstraction.

In your Pivotal GemFire/Apache Geode setup, you can of course use either the peer-to-peer (P2P) or the client/server topology; it doesn't really matter, and GemFire/Geode will do the right thing once "called upon".

As the Spring Cache Abstraction documentation states, when you make a call to one of your application component's methods (e.g. getAddressFor(:Point)) that supports caching (with @Cacheable), the interceptor will first "consult" the cache before making the method call. If the value is present in the cache, then that value is returned and the "expensive" method call (e.g. getAddressFor(:Point)) will not be invoked.

However, if there is a cache miss, then Spring will proceed to invoke the method and, upon successful return from the method invocation, cache the result of the call in the backing cache provider (such as GemFire/Geode), so that the next time the method is invoked with the same input, the cached value will be returned.
Now, if your application is using the client/sever topology, then of course, the client cache will forward the request onto the server if...
The corresponding client Region is a PROXY, or...
The corresponding client Region is a CACHING_PROXY, and the client's local client-side Region does not contain the requested Point for the Address.
I encourage you to read more about different client Region data management policies here.
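For illustration, here is roughly how those two client Region policies are declared with the plain Apache Geode client API (the locator address, region names and key/value types are assumptions):

import org.apache.geode.cache.Region;
import org.apache.geode.cache.client.ClientCache;
import org.apache.geode.cache.client.ClientCacheFactory;
import org.apache.geode.cache.client.ClientRegionShortcut;

public class ClientRegionSetup {

    public static void main(String[] args) {
        ClientCache cache = new ClientCacheFactory()
                .addPoolLocator("localhost", 10334) // assumed locator address
                .create();

        // PROXY: no local storage; every get() goes to the server.
        Region<Long, String> proxy = cache
                .<Long, String>createClientRegionFactory(ClientRegionShortcut.PROXY)
                .create("Address");

        // CACHING_PROXY: local storage; only misses go to the server.
        Region<Long, String> cachingProxy = cache
                .<Long, String>createClientRegionFactory(ClientRegionShortcut.CACHING_PROXY)
                .create("AddressLocal");

        cache.close();
    }
}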
To see another working example of Spring's Caching Abstraction backed by Pivotal GemFire in action, have a look at...

caching-example

I used this example in my SpringOne 2015 talk to explain caching with GemFire/Geode as the caching provider. This particular example makes an external request to a REST API to get the "Quote of the Day".
Hope this helps!
Cheers,
John
I'm working with an application right now that uses a third-party API for handling some batch email-related tasks, and in order for that to work, we need to store some information in this service. Unfortunately, this information (first/last name, email address) is also something we want to use from our application. My normal inclination is to pick one canonical data source and stick with it, but round-tripping to a web service every time I want to look up these fields isn't really a viable option (we use some of them quite a bit), and the service's API requires the records to be stored there, so the duplication is sadly necessary.
But I have no interest in peppering every method throughout our business classes with code to synchronize data to the web service any time they might be updated, and I also don't think my entity should be aware of the service to update itself in a property setter (or whatever else is updating the "truth").
We use NHibernate for all of our DAL needs, and to my mind this data replication is really a persistence issue, so I've whipped up a PoC implementation using an EventListener (both PostInsert and PostUpdate) that checks: if the entity is of type X and any of the fields [Y..Z] have changed, update the web service with the new state.
I feel like this strikes a good balance between ensuring that our data is the canonical source, making sure that it gets replicated transparently, and minimizing the chances for changes to fall through the cracks and get us into a mismatch situation (not the end of the world if, e.g., the service is unreachable; we just do a manual batch update later, but for everybody's sanity in the general case, the goal is that we never have to think about it). Still, my colleagues and I have a degree of discomfort with this way forward.
Is this a horrid idea that will invite raptors into my database at inopportune times? Is it a totally reasonable thing to do with an EventListener? Is it a serviceable solution to a less-than-ideal situation that we can just make do with and move on forever tainted? If we soldier on down this road, are there any gotchas I should be wary of in the Events pipeline?
In the case of unreliable data stores (the web service, in your case), I would introduce a concept of transactions (operations) and store them in a local database, then periodically pull them from the DB and execute them against the web service (the other data store).
Something like this:
public enum Operation { Insert, Update, Delete } // whatever operations you need: CRUD, or something specific

public class OperationContainer
{
    public Operation Operation;
    public object Data; // your entity, business object or whatever
}

public class MyMailService
{
    public void SendMail(MailBusinessObject data)
    {
        DataAccessLair<MailBusinessObject>.Persist(data);
        var operation = new OperationContainer { Operation = Operation.Insert, Data = data };
        DataAccessLair<OperationContainer>.Persist(operation);
    }
}

public class Updater
{
    Timer everySec;

    public void OnEverySec()
    {
        var operation = DataAccessLair<OperationContainer>.GetFirstIn(); // FIFO
        var webServiceData = WebServiceData.Convert(operation); // prepare the data for the web service
        try
        {
            new WebService().DoSomething(webServiceData);
            DataAccessLair<OperationContainer>.Remove(operation);
        }
        catch
        {
            // leave the operation queued; it will be retried on the next tick
        }
    }
}
This is actually pretty close to the concept of a smart client, technically, not logically. Take a look at the book .NET Domain-Driven Design with C#: Problem-Design-Solution, chapter 10. Or take a look at the source code from the book; it's pretty close to your situation: http://dddpds.codeplex.com/