RSocket client-side load balancing for multilevel microservices

In our microservices-based application we have layers of microservices, i.e. a user/REST client calling MS1, MS1 calling MS2, and so on. For simplicity, and to present the actual problem, I will mention only the client, MS1 and MS2 here. We are trying to implement MS-to-MS calls using the RSocket communication protocol
(request-response interaction model).
We also need to implement client-side load balancing in RSocket, as we will be running multiple pods (instances) of each MS in a Kubernetes environment.
We are observing the following problem with client-side load balancing, even in local unit testing (mentioning this to rule out any issue with the deployment/Kubernetes environment, etc.):
1.) Client -> MS1(Instance1) -> MS2(Instance1): the RSocket load-balancing code works fine and each request is processed.
2.) Client -> MS1(Instance1, Instance2) -> MS2(Instance1): load balancing works fine.
3.) Client -> MS1(Instance1) -> MS2(Instance1, Instance2): only the 1st request passes, i.e. Client -> MS1(Instance1) -> MS2(Instance1); the 2nd request stops at Client -> MS1(Instance1) and the call to MS2(Instance2) is never initiated.
4.) Client -> MS1(Instance1, Instance2) -> MS2(Instance1, Instance2): only 2 requests get successfully processed, Client -> MS1(Instance1) -> MS2(Instance1) and Client -> MS1(Instance2) -> MS2(Instance2).
After that no further RSocket calls happen, and per the keepAliveInterval and keepAliveMaxLifeTime the client RSocket connection is disposed with the following error (the 30000 ms matches the keepAliveMaxLifeTime configured below):
Caused by: ConnectionErrorException (0x101): No keep-alive acks for 30000 ms
    at io.rsocket.core.RSocketRequester.lambda$tryTerminateOnKeepAlive$2(RSocketRequester.java:299)
Now let me show how I have implemented the client-side load-balancing code. I am relying on Flux<List<LoadbalanceTarget>>, and the three important pieces are:
private Mono<List<LoadbalanceTarget>> targets()
{
    Mono<List<LoadbalanceTarget>> mono = Mono.fromSupplier(() -> serviceRegistry.getServerInstances()
            .stream()
            .map(server -> LoadbalanceTarget.from(getLoadBalanceTargetKey(server),
                    TcpClientTransport.create(TcpClient
                            .create()
                            .option(ChannelOption.TCP_NODELAY, true)
                            .option(ChannelOption.ALLOW_HALF_CLOSURE, true)
                            .host(server.getHost())
                            .port(server.getPort()))))
            .collect(Collectors.toList()));
    return mono;
}
@Bean
public Flux<List<LoadbalanceTarget>> targetFluxForMathService2()
{
    return Flux.from(targets());
}
Note: for testing I am faking the serviceRegistry and returning a list of hard-coded RSocket server instances (host and port).
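For reference, the fake looks roughly like this (a minimal sketch; the ServiceRegistry interface name and the RSocketServerInstance constructor are my placeholders matching the shapes used above):

// Test-only stand-in for the real discovery client (hypothetical shape).
public class FakeServiceRegistry implements ServiceRegistry {

    @Override
    public List<RSocketServerInstance> getServerInstances() {
        // Two hard-coded MS2 instances on different local ports
        return List.of(
                new RSocketServerInstance("localhost", 7000),
                new RSocketServerInstance("localhost", 7001));
    }
}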
@Bean
public RSocketRequester rSocketRequester2(Flux<List<LoadbalanceTarget>> targetFluxForMathService2)
{
    RSocketRequester rSocketRequester = this.builder
            .rsocketConnector(configurer -> configurer
                    .keepAlive(Duration.ofSeconds(10), Duration.ofSeconds(30))
                    .reconnect(Retry.fixedDelay(3, Duration.ofSeconds(1))
                            .doBeforeRetry(s -> System.out.println("Disconnected, retrying to connect"))))
            .transports(targetFluxForMathService2, new RoundRobinLoadbalanceStrategy());
    return rSocketRequester;
}
private String getLoadBalanceTargetKey(RSocketServerInstance server)
{
    return server.getHost() + server.getPort();
}
Any help will be highly appreciated.

Related

Implementing a 202 ACCEPTED - Retry-After behaviour with RSocket and Project Reactor

I'm implementing a typical use case in which a client asks for a resource that will be asynchronously generated. Thus, a resourceID is generated and returned right away:
1. CLIENT ---(POST /request-resource)---> SERVER
2. SERVER (Generates resID, launches async process) ---(202 Accepted - resID)---> CLIENT
At this point there is a background task in the SERVER that will eventually produce a result and store it in a database, associated with the resID. The CLIENT would be asking for the resource periodically, retrying until it is available:
3. CLIENT ---(/resource/resID)---> SERVER (checks in Postgres using reactive driver)
4. SERVER ---(404 - Retry-After 5)---> CLIENT
5. CLIENT ---(/resource/resID)---> SERVER (checks in Postgres using reactive driver)
6. SERVER ---(200 - JSON Payload)---> CLIENT
I thought RSocket would be a perfect fit in order to avoid this endless CLIENT retry until the resource is available (steps 3 onwards).
Which interaction model would be more suitable for this problem and how could I implement it?
Consider a repository as follows: ResourceRepository with a method Mono<Result> getResult(String resID).
If I chose a request/response interaction model I'd be in the same case as before, unless there was a way to have a Mono that retried until there is a result. Is this possible?
With request/stream I could return results as a Flux<Response> with response.status=PROCESSING until the query to Postgres returned a result; then the Flux would emit an element with response.status=OK and complete. A maximum time would be needed to finish the Flux without a result in a configured period. In this case, how could I orchestrate this?
I would need to create a Flux that emits periodically (with a max period timeout), having an element with no result when the repository returns an empty Mono, or the actual value when the repository has it, completing the Flux.
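For illustration, the orchestration I have in mind would look roughly like this (a sketch in Java/Reactor against the getResult method above; the Response and Status types are placeholders):

// Sketch: poll the repository once per second, emitting a PROCESSING marker
// while the Mono is empty; emit the OK element and complete once a result
// exists, bounded by an overall maximum duration.
Flux<Response> resourceStream(String resID) {
    return Flux.interval(Duration.ofSeconds(1))
            .concatMap(tick -> resourceRepository.getResult(resID)
                    .map(Response::ok)                      // result available -> status OK
                    .defaultIfEmpty(Response.processing())) // still empty -> status PROCESSING
            .takeUntil(r -> r.getStatus() == Status.OK)     // complete after the OK element
            .take(Duration.ofSeconds(30));                  // give up without a result after 30s
}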
Here is a solution to this problem using RSocket with a request/response interaction model that waits until the resource is available in the DB. The key was to use the repeatWhenEmpty operator:
@MessageMapping("request-resource")
fun getResourceWebSocket(resourceRequest: ResourceRequest): Mono<Resource> {
    return resourceService.sendResourceRequestProcessing(resourceRequest)
}
override fun sendResourceRequestMessage(resourceRequest: ResourceRequest): Mono<Resource> {
    val resourceId = randomUUID().toString()
    return Mono.fromCallable {
        sendKafkaResourceProcessingRequestMessage(resourceId, resourceRequest)
    }.then(poolResourceResponse(resourceId))
}
private fun poolResourceResponse(resourceId: String): Mono<Resource> {
    return resourceRepository.findByResourceId(resourceId)
        .repeatWhenEmpty(30) { longFlux ->
            longFlux.delayElements(Duration.ofSeconds(1))
                .doOnNext { logger.info("Repeating {}", it) }
        }
}
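repeatWhenEmpty resubscribes to the source Mono each time it completes without a value: here at most 30 times, with each resubscription delayed by one second through the companion Flux, so the call effectively polls the database until the row for the resourceId appears.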

Can I allow multiple http clients to consume a Flowable stream of data with resteasy-rxjava2 / quarkus?

Currently I am able to see the streaming values exposed by the code below, but only one HTTP client will receive the continuous stream of values; the others will not.
The code, a modified version of the quarkus quickstart for kafka reactive streaming is:
#Path("/migrations")
public class StreamingResource {
private volatile Map<String, String> counterBySystemDate = new ConcurrentHashMap<>();
#Inject
#Channel("migrations")
Flowable<String> counters;
#GET
#Path("/stream")
#Produces(MediaType.SERVER_SENT_EVENTS) // denotes that server side events (SSE) will be produced
#SseElementType("text/plain") // denotes that the contained data, within this SSE, is just regular text/plain data
public Publisher<String> stream() {
Flowable<String> mainStream = counters.doOnNext(dateSystemToCount -> {
String key = dateSystemToCount.substring(0, dateSystemToCount.lastIndexOf("_"));
counterBySystemDate.put(key, dateSystemToCount);
});
return fromIterable(counterBySystemDate.values().stream().sorted().collect(Collectors.toList()))
.concatWith(mainStream)
.onBackpressureLatest();
}
}
Is it possible to make any modification that would allow multiple clients to consume the same data, in a broadcast fashion?
I guess this implies letting go of backpressure, because that would imply a state per consumer?
I saw that Observable is not accepted as a return type in resteasy-rxjava2 for the server-sent events media type.
Please let me know any ideas. Thank you.
Please find the full code in Why in multiple connections to PricesResource Publisher, only one gets the stream?
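For illustration, the kind of modification I have in mind would multicast the upstream, e.g. with RxJava's share() (a sketch only; I have not verified this against resteasy-rxjava2):

// Sketch: multicast a single upstream subscription to all SSE subscribers.
// share() subscribes to the channel once and fans values out; late
// subscribers only see values emitted after they connect.
Flowable<String> shared = counters
        .doOnNext(dateSystemToCount -> {
            String key = dateSystemToCount.substring(0, dateSystemToCount.lastIndexOf("_"));
            counterBySystemDate.put(key, dateSystemToCount);
        })
        .share();

@GET
@Path("/stream")
@Produces(MediaType.SERVER_SENT_EVENTS)
@SseElementType("text/plain")
public Publisher<String> stream() {
    // each subscriber still applies its own backpressure policy
    return shared.onBackpressureLatest();
}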

Akka HTTP Source Streaming vs regular request handling

What is the advantage of using Source Streaming vs the regular way of handling requests? My understanding is that in both cases:
The TCP connection will be reused
Back-pressure will be applied between the client and the server
The only advantage of Source Streaming I can see is if there is a very large response and the client prefers to consume it in smaller chunks.
My use case is that I have a very long list of users (millions), and I need to call a service that performs some filtering on the users, and returns a subset.
Currently, on the server side I expose a batch API, and on the client, I just split the users into chunks of 1000, and make X batch calls in parallel using Akka HTTP Host API.
I am considering switching to HTTP streaming, but cannot quite figure out what the value would be.
You are missing one other huge benefit: memory efficiency. By having a streamed pipeline, client/server/client, all parties safely process data without running the risk of blowing up the memory allocation. This is particularly useful on the server side, where you always have to assume the clients may do something malicious...
Client Request Creation
Suppose the ultimate source of your millions of users is a file. You can create a stream source from this file:
val userFilePath : java.nio.file.Path = ???
val userFileSource = akka.stream.scaladsl.FileIO.fromPath(userFilePath)
This source can then be used to create your HTTP request, which will stream the users to the service:
import akka.http.scaladsl.model.HttpEntity.{Chunked, ChunkStreamPart}
import akka.http.scaladsl.model.{RequestEntity, ContentTypes, HttpRequest}
val httpRequest : HttpRequest =
  HttpRequest(uri = "http://filterService.io",
              entity = Chunked.fromData(ContentTypes.`text/plain(UTF-8)`, userFileSource))
This request will now stream the users to the service without consuming the entire file into memory. Only chunks of data will be buffered at a time, therefore, you can send a request with potentially an infinite number of users and your client will be fine.
Server Request Processing
Similarly, your server can be designed to accept a request with an entity that can potentially be of infinite length.
Your question says the service will filter the users; assuming we have a filtering function:
val isValidUser : (String) => Boolean = ???
This can be used to filter the incoming request entity and create a response entity which will feed the response:
import akka.http.scaladsl.server.Directives._
import akka.http.scaladsl.model.HttpResponse
import akka.http.scaladsl.model.HttpEntity.Chunked
val route = extractDataBytes { userSource =>
  val responseSource : Source[ByteString, _] =
    userSource
      .map(_.utf8String)
      .filter(isValidUser)
      .map(ByteString.apply)

  complete(HttpResponse(entity = Chunked.fromData(ContentTypes.`text/plain(UTF-8)`, responseSource)))
}
Client Response Processing
The client can similarly process the filtered users without reading them all into memory. We can, for example, dispatch the request and send all of the valid users to the console:
import akka.http.scaladsl.Http

// assumes an implicit ActorSystem, Materializer and ExecutionContext in scope
Http()
  .singleRequest(httpRequest)
  .map { response =>
    response
      .entity
      .dataBytes
      .map(_.utf8String)
      .runForeach(println)
  }

Apache HTTP client only two connections are possible

I have the below code to invoke a REST API method using the Apache HTTP client. However, only two parallel requests can be sent using this client.
Is there any parameter to set max-connections?
HttpPost post = new HttpPost(resourcePath);
addPayloadJsonString(payload, post); // set a String entity
setAuthHeader(post); // set Authorization: Basic header
try {
    return httpClient.execute(post);
} catch (IOException e) {
    String errorMsg = "Error while executing POST statement";
    log.error(errorMsg, e);
    throw new RestClientException(errorMsg, e);
}
The jars I am using are:
org.apache.httpcomponents.httpclient_4.3.5.jar
org.apache.httpcomponents.httpcore_4.3.2.jar
You can configure the HttpClient with an HttpClientConnectionManager.
Take a look at the pooling connection manager:
ClientConnectionPoolManager maintains a pool of HttpClientConnections and is able to service connection requests from multiple execution threads. Connections are pooled on a per-route basis. A request for a route for which the manager already has a persistent connection available in the pool will be serviced by leasing a connection from the pool rather than creating a brand new connection.
PoolingHttpClientConnectionManager maintains a maximum limit of connections on a per-route basis and in total. Per default this implementation will create no more than 2 concurrent connections per given route and no more than 20 connections in total. For many real-world applications these limits may prove too constraining, especially if they use HTTP as a transport protocol for their services.
This example shows how the connection pool parameters can be adjusted:
PoolingHttpClientConnectionManager cm = new PoolingHttpClientConnectionManager();
// Increase max total connections to 200
cm.setMaxTotal(200);
// Increase default max connections per route to 20
cm.setDefaultMaxPerRoute(20);
// Increase max connections for localhost:80 to 50
HttpHost localhost = new HttpHost("localhost", 80);
cm.setMaxPerRoute(new HttpRoute(localhost), 50);

CloseableHttpClient httpClient = HttpClients.custom()
        .setConnectionManager(cm)
        .build();
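Note that the client built from the pooled connection manager has to be the one used for the execute call in the question; otherwise the default limit of 2 concurrent connections per route still applies.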

How does WCF in a duplex system differentiate between different channel instances?

Uhm, I'm utterly lost, so any help would be much appreciated.
The OperationContext.Current.InstanceContext is the context of the current service instance that the incoming channel is using.
In a duplex system, the service can call back to the client via a CallbackContract. This CallbackContract is much like a service on the client side that is listening for calls from the service on the channel that the client has opened. This “client callback service” can only be accessed via the same channel it used on the service, and therefore only that service has access to it.
a) So in duplex systems, the same channel instance with which the client sends messages to the service is also used by the client to receive messages from the service?
b) If in a request-reply system a client uses a particular channel instance clientChannel to send a message to the service, then I assume this same instance (thus clientChannel) needs to stay open until the service sends back a reply to this instance, while in a duplex system clientChannel needs to stay open until the session is closed?
c) I'm assuming such behaviour since, as far as I can tell, each channel instance has a unique address (or ID) which helps to differentiate it from other channel instances running on the same client? And when the service sends back a message, it also specifies the ID of this channel?
Thus when, in a duplex system, the client calls a service, WCF creates (on the client side) a channel instance clientChannel, which sends a message over the wire. On the server's side WCF creates a channel instance serverChannel, which delivers the message to the requested operation (method). When this method wants to call back to the client via the CallbackContract, it uses InstanceContext.GetCallbackChannel<> to create a channel, which among other things contains the ID of the channel that called the service (thus it contains the exact address or ID of clientChannel)?
d) Does in duplex systems client use the same channel instance to call any of endpoint’s operations?
Thank you
I am not sure, but here is how I understand this for duplex-mode communication.
I had a look at the InstanceContext class defined in the System.ServiceModel assembly using the dotPeek decompiler.
Internally there is a call:
this.channels = new ServiceChannelManager(this);
That means it is creating the channel using a ServiceChannelManager, passing in the instance of the same InstanceContext. This way it keeps track of the channel with the instance of InstanceContext.
Then it binds incoming channel (service-to-client) requests in a method that is implemented as:
internal void BindIncomingChannel(ServiceChannel channel)
{
    this.ThrowIfDisposed();
    channel.InstanceContext = this;
    IChannel channel1 = (IChannel) channel.Proxy;
    this.channels.AddIncomingChannel(channel1);
    if (channel1 == null)
        return;
    switch (channel.State)
    {
        case CommunicationState.Closing:
        case CommunicationState.Closed:
        case CommunicationState.Faulted:
            this.channels.RemoveChannel(channel1);
            break;
    }
}
So to answer your queries:
a. Yes, it internally maintains the service-to-InstanceContext (which creates a channel) relations for calls between client and service.
b. Yes, the channel needs to stay open until the service replies back to the context, at which point the InstanceContext will take care of closing the channel.
c. Each client has a unique session id, but the InstanceContext type at the service depends on the InstanceContextMode used at the service on the implementation of the contract.
d. It uses the same channel. InstanceContext maintains a count of incoming and outgoing channels. Incoming channels are the ones directed from service to client, and outgoing ones from client to service. You can see this count using the debugger in VS.
For the sake of further clarification, as far as the other behavior of a duplex service is concerned, here is how we can look at the behavior of InstanceContext and how the channel instance is created. I created a duplex service demo:
[ServiceContract(SessionMode = SessionMode.Required, CallbackContract = typeof(IServiceDuplexCallback))]
public interface IServiceClass
{
    [OperationContract(IsOneWay = true)]
    void Add(int num1);
}
This contract is implemented as:
[ServiceBehavior(InstanceContextMode = InstanceContextMode.PerCall)]
public class ServiceClass : IServiceClass
{
    int result = 0;

    public void Add(int num1)
    {
        result += num1;
        callBack.Calculate(result);
    }

    IServiceDuplexCallback callBack
    {
        get
        {
            return OperationContext.Current.GetCallbackChannel<IServiceDuplexCallback>();
        }
    }
}
In this implementation, notice the first line, where InstanceContextMode is set to PerCall (the default is PerSession). This enumeration has three options:
PerCall - a new instance of InstanceContext is used for every call, independent of the session
PerSession - a new instance is used for every session
Single - a single instance of InstanceContext is used for all the clients.
I created a client which uses NetTcpBinding to connect with the service:
InstanceContext ic = new InstanceContext(new TCPCallbackHandler(ref textBox1));
TCP.ServiceClassClient client = new TCP.ServiceClassClient(ic);
// First call to Add method of the Contract
client.Add(val1);
// Second call to Add method of the Contract
client.Add(val2);
TCPCallbackHandler is the class in the Client that implements the Callback contract as:
public class TCPCallbackHandler : TCP.IServiceClassCallback
{
    TextBox t;

    public TCPCallbackHandler(ref TextBox t1)
    {
        t = t1;
    }

    public void Calculate(int result)
    {
        t.Text += OperationContext.Current.SessionId + " " + result.ToString();
    }
}
To see the behavior of the InstanceContext, I started the service and then started two clients with each of the enumeration values discussed above. Here are the results:
1 - PerCall
Client 1 : urn:uuid:4c5f3d8b-9203-4f25-b09a-839089ecbe54 5 - urn:uuid:4c5f3d8b-9203-4f25-b09a-839089ecbe54 5
Client 2 : urn:uuid:e101d2a7-ae41-4929-9789-6d43abf97f01 5 - urn:uuid:e101d2a7-ae41-4929-9789-6d43abf97f01 5
Here urn:uuid:4c5f3d8b-9203-4f25-b09a-839089ecbe54 is the SessionId.
Since Add is called twice in each client, and with PerCall a new instance of InstanceContext is created for every call, a new instance of ServiceClass is created for both calls of every client. The point to note here is that a new instance is created even within the same session:
// First call to Add method of the Contract
client.Add(val1); // -> new instance of ServiceClass created; value is incremented to 5
// Second call to Add method of the Contract
client.Add(val2); // -> new instance of ServiceClass created; value is incremented to 5
2 - PerSession
Client 1 : urn:uuid:4c5f3d8b-9203-4f25-b09a-839089ecbe54 5 - urn:uuid:4c5f3d8b-9203-4f25-b09a-839089ecbe54 10
Client 2 : urn:uuid:e101d2a7-ae41-4929-9789-6d43abf97f01 5 - urn:uuid:e101d2a7-ae41-4929-9789-6d43abf97f01 10
Here the instance of ServiceClass is separate for each client, as they have different sessions running, so the increments are 0 -> 5 -> 10 (for each client separately).
3 - Single
Client 1 : urn:uuid:4c5f3d8b-9203-4f25-b09a-839089ecbe54 5 - urn:uuid:4c5f3d8b-9203-4f25-b09a-839089ecbe54 10
Client 2 : urn:uuid:e101d2a7-ae41-4929-9789-6d43abf97f01 15 - urn:uuid:e101d2a7-ae41-4929-9789-6d43abf97f01 20
Here the same instance of ServiceClass is shared by all clients, so we have 0 -> 5 -> 10 in the first client. The second client increments the same instance, so we get 10 -> 15 -> 20.
This may behave differently depending on call timing and can give results like the following when the clients invoke at the same time:
Client 1 : urn:uuid:4c5f3d8b-9203-4f25-b09a-839089ecbe54 5 - urn:uuid:4c5f3d8b-9203-4f25-b09a-839089ecbe54 15
Client 2 : urn:uuid:e101d2a7-ae41-4929-9789-6d43abf97f01 10 - urn:uuid:e101d2a7-ae41-4929-9789-6d43abf97f01 20
Hope this helps!