Implementing a 202 ACCEPTED - Retry-After behaviour with RSocket and Project Reactor - spring-webflux

I'm implementing a typical use case in which a client asks for a resource that will be generated asynchronously. A resourceID is therefore generated and returned right away:
1. CLIENT ---(POST /request-resource)---> SERVER
2. SERVER (Generates resID, launches async process) ---(202 Accepted - resID)---> CLIENT
At this point there is a background task in the SERVER that will eventually produce a result and store it in a database, associated with the resID. The CLIENT asks for the resource periodically, retrying until it is available:
3. CLIENT ---(/resource/resID)---> SERVER (checks in Postgres using reactive driver)
4. SERVER ---(404 - Retry-After 5)---> CLIENT
5. CLIENT ---(/resource/resID)---> SERVER (checks in Postgres using reactive driver)
6. SERVER ---(200 - JSON Payload)---> CLIENT
I thought RSocket would be a perfect fit to avoid this endless CLIENT retrying until the resource is available (step 3 onwards).
Which interaction model would be more suitable for this problem and how could I implement it?
Consider a repository, ResourceRepository, with the following method: Mono<Result> getResult(String resID)
If I chose a request/response interaction model I'd be in the same case as before. Unless there was a way to have a Mono that retried until there is a result. Is this possible?
With request/stream I could return results as a Flux<Response> with response.status=PROCESSING until the query to Postgres returned a result; then the Flux would emit an element with response.status=OK and complete. A maximum time would be needed so that the Flux finishes without a result after a configured period. In this case, how could I orchestrate this?
I would need to create a Flux that emits periodically (with a maximum overall timeout), emitting an element with no result while the repository returns an empty Mono, or the actual value once the repository has it, and then completing.

Here is a solution using RSocket with a request/response interaction model that waits until the resource is available in the DB. The key is the repeatWhenEmpty operator:
@MessageMapping("request-resource")
fun getResourceWebSocket(resourceRequest: ResourceRequest): Mono<Resource> {
    return resourceService.sendResourceRequestMessage(resourceRequest)
}
override fun sendResourceRequestMessage(resourceRequest: ResourceRequest): Mono<Resource> {
    val resourceId = randomUUID().toString()
    // send the processing request, then start polling the repository for the result
    return Mono.fromCallable {
        sendKafkaResourceProcessingRequestMessage(resourceId, resourceRequest)
    }.then(pollResourceResponse(resourceId))
}
private fun pollResourceResponse(resourceId: String): Mono<Resource> {
    // re-subscribe to the query while it completes empty: at most 30 times, one second apart
    return resourceRepository.findByResourceId(resourceId)
        .repeatWhenEmpty(30) { longFlux ->
            longFlux.delayElements(Duration.ofSeconds(1))
                .doOnNext { logger.info("Repeating {}", it) }
        }
}
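For comparison, the same "repeat while empty" idea can be sketched with only the JDK. This is not the Reactor API and not the author's code: `PollingSketch`, `pollUntilPresent`, and the in-memory lookup standing in for the repository are all hypothetical names, and the sketch assumes that running out of attempts should fail the returned future.

```java
import java.util.Optional;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.function.Supplier;

final class PollingSketch {

    /**
     * Calls lookup repeatedly, waiting delayMillis between attempts, until it
     * returns a value or maxAttempts lookups have come back empty.
     */
    static <T> CompletableFuture<T> pollUntilPresent(
            Supplier<Optional<T>> lookup, int maxAttempts, long delayMillis,
            ScheduledExecutorService scheduler) {
        CompletableFuture<T> result = new CompletableFuture<>();
        attempt(lookup, maxAttempts, delayMillis, scheduler, result);
        return result;
    }

    private static <T> void attempt(
            Supplier<Optional<T>> lookup, int remaining, long delayMillis,
            ScheduledExecutorService scheduler, CompletableFuture<T> result) {
        try {
            Optional<T> found = lookup.get();
            if (found.isPresent()) {
                result.complete(found.get());
            } else if (remaining <= 1) {
                // Assumption: exhausting the attempts is reported as a failure.
                result.completeExceptionally(
                        new IllegalStateException("resource not ready in time"));
            } else {
                // Empty result: schedule the next lookup instead of blocking a thread.
                scheduler.schedule(
                        () -> attempt(lookup, remaining - 1, delayMillis, scheduler, result),
                        delayMillis, TimeUnit.MILLISECONDS);
            }
        } catch (RuntimeException e) {
            result.completeExceptionally(e);
        }
    }

    public static void main(String[] args) {
        ScheduledExecutorService scheduler = Executors.newSingleThreadScheduledExecutor();
        AtomicInteger calls = new AtomicInteger();
        // Hypothetical stand-in for the repository: empty twice, then a value.
        Supplier<Optional<String>> lookup =
                () -> calls.incrementAndGet() < 3 ? Optional.empty() : Optional.of("resource");
        System.out.println(pollUntilPresent(lookup, 30, 10, scheduler).join());
        scheduler.shutdown();
    }
}
```

The caller gets a single future back, so from the client's point of view this is still one request and one response; the retry loop stays on the server side.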

Related

How do I control which upstream hot flow a downstream flow collects from?

Imagine there are two upstream hot flows and one downstream flow. I want to be able to collect from the downstream flow once, and for the downstream flow to collect from either of the two upstream flows. I want to be able to control, using some logic, which of the two upstream flows the downstream flow is collecting from, and I want to be able to switch back and forth arbitrarily until I stop collecting on the downstream flow.
For example, I call this once:
downstreamFlow.collect {
// This is the endpoint of the flow
}
Now imagine a method called switch(). I would start collecting on upstream flow 1 like this:
downstreamFlow.switch(upstreamFlow1) // Now the downstreamFlow is emitting upstreamFlow1
I would switch to upstream flow 2 like this:
downstreamFlow.switch(upstreamFlow2) // Now the downstreamFlow is emitting upstreamFlow2
The top collect would collect from upstream 1, then upstream 2. I could switch back and forth arbitrarily. There could be an upstream 3, ...etc. What would switch() look like?
Looking for that or any other ideas that work.
The way I interpret this is that you want a way to encapsulate this switching behavior such that collectors of the downstream flow don't need to know about the switching behavior or anything about the upstream flows.
The following is an idea I have about how it might be implemented. You end up with a Flow that is sort of hot, in the sense that all its downstream collectors will be switched to get a new upstream when switch() is called. But if you use it with cold upstreams (not what you asked for, but it isn't restrictive the way I wrote it), it's sort of cold in the sense that each downstream collector triggers its own upstream collection. You could of course change the signature of switch to prohibit cold upstreams, or you could change the switcherFlow function to return a hot wrapper around the implementation.
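import kotlinx.coroutines.ExperimentalCoroutinesApi
import kotlinx.coroutines.flow.Flow
import kotlinx.coroutines.flow.MutableStateFlow
import kotlinx.coroutines.flow.emptyFlow
import kotlinx.coroutines.flow.flatMapLatest
interface SwitcherFlow<T> : Flow<T> {
    fun switch(upstream: Flow<T>)
}
@OptIn(ExperimentalCoroutinesApi::class)
class SwitcherFlowImpl<T> private constructor(
    private val flowOfSources: MutableStateFlow<Flow<T>>
) : Flow<T> by flowOfSources.flatMapLatest({ it }), SwitcherFlow<T> {
    constructor(initialFlow: Flow<T>) : this(MutableStateFlow(initialFlow))
    // setting a new upstream cancels collection of the previous one (flatMapLatest)
    override fun switch(upstream: Flow<T>) {
        flowOfSources.value = upstream
    }
}
fun <T> switcherFlow(initialFlow: Flow<T> = emptyFlow()): SwitcherFlow<T> =
    SwitcherFlowImpl(initialFlow)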

Can I allow multiple http clients to consume a Flowable stream of data with resteasy-rxjava2 / quarkus?

Currently I am able to see the streaming values exposed by the code below, but only one http client will receive the continuous stream of values, the others will not be able to.
The code, a modified version of the quarkus quickstart for kafka reactive streaming is:
@Path("/migrations")
public class StreamingResource {
    private volatile Map<String, String> counterBySystemDate = new ConcurrentHashMap<>();
    @Inject
    @Channel("migrations")
    Flowable<String> counters;
    @GET
    @Path("/stream")
    @Produces(MediaType.SERVER_SENT_EVENTS) // denotes that server side events (SSE) will be produced
    @SseElementType("text/plain") // denotes that the contained data, within this SSE, is just regular text/plain data
    public Publisher<String> stream() {
        Flowable<String> mainStream = counters.doOnNext(dateSystemToCount -> {
            String key = dateSystemToCount.substring(0, dateSystemToCount.lastIndexOf("_"));
            counterBySystemDate.put(key, dateSystemToCount);
        });
        return fromIterable(counterBySystemDate.values().stream().sorted().collect(Collectors.toList()))
            .concatWith(mainStream)
            .onBackpressureLatest();
    }
}
Is it possible to make any modification that would allow multiple clients to consume the same data, in a broadcast fashion?
I guess this implies letting go of backpressure, because that would imply a state per consumer?
I saw that Observable is not accepted as a return type in resteasy-rxjava2 for the Server-Sent Events media type.
Please let me know any ideas,
Thank you
Please find the full code in Why in multiple connections to PricesResource Publisher, only one gets the stream?
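On the point about state per consumer: a minimal JDK-only sketch of a broadcast, assuming nothing about Quarkus or resteasy-rxjava2, is shown here. `java.util.concurrent.SubmissionPublisher` keeps a separate buffer for each subscriber, which is exactly the per-consumer state the question anticipates. `BroadcastSketch`, `CollectingSubscriber`, and `broadcastToTwo` are hypothetical names used only for illustration.

```java
import java.util.List;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.CopyOnWriteArrayList;
import java.util.concurrent.Flow;
import java.util.concurrent.SubmissionPublisher;

final class BroadcastSketch {

    /** Collects every item it receives and completes `done` when the stream ends. */
    static final class CollectingSubscriber implements Flow.Subscriber<String> {
        final List<String> received = new CopyOnWriteArrayList<>();
        final CompletableFuture<List<String>> done = new CompletableFuture<>();

        @Override public void onSubscribe(Flow.Subscription subscription) {
            // Unbounded demand: this client never pushes back on the publisher.
            subscription.request(Long.MAX_VALUE);
        }
        @Override public void onNext(String item) { received.add(item); }
        @Override public void onError(Throwable error) { done.completeExceptionally(error); }
        @Override public void onComplete() { done.complete(received); }
    }

    /** Publishes the items to two subscribers and returns what each of them saw. */
    static List<List<String>> broadcastToTwo(List<String> items) {
        CollectingSubscriber first = new CollectingSubscriber();
        CollectingSubscriber second = new CollectingSubscriber();
        try (SubmissionPublisher<String> publisher = new SubmissionPublisher<>()) {
            publisher.subscribe(first);
            publisher.subscribe(second);
            // Every submitted item is placed in each subscriber's own buffer.
            items.forEach(publisher::submit);
        } // close() signals onComplete to both subscribers
        return List.of(first.done.join(), second.done.join());
    }

    public static void main(String[] args) {
        System.out.println(broadcastToTwo(List.of("a_1", "b_2", "c_3")));
    }
}
```

Both subscribers see the same items in the same order. Whether the injected channel in the question can be shared this way is a separate matter that this sketch does not settle.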

Best way to build a "task queue" with RxJava

Currently I'm working on a lot of network-related features. At the moment, I'm dealing with a network channel that allows me to send 1 single piece of information at a time, and I have to wait for it to be acknowledged before I can send the next piece of information. I'm representing the server with 1..n connected clients.
Some of these messages, I have to send in chunks, which is fairly easy to do with RxJava. Currently my "writing" method looks sort of like this:
fun write(bytes: ByteArray, ignoreMtu: Boolean) =
    server.deviceList()
        .first(emptyList())
        .flatMapObservable { devices ->
            Single.fromCallable {
                if (ignoreMtu) {
                    bytes.size
                } else {
                    devices.minBy { device -> device.mtu }?.mtu ?: DEFAULT_MTU
                }
            }
                .flatMapObservable { minMtu ->
                    Observable.fromIterable(bytes.asIterable())
                        .buffer(minMtu)
                }
                .map { it.toByteArray() }
                .doOnNext { server.currentData = bytes }
                .map { devices }
                // part i've left out: waiting for each device acknowledging the message, timeouts, etc.
        }
What's missing in here is the part where I only allow one piece of information to be sent at the same time. Also, what I require is that if I'm adding a message into my queue, I have to be able to observe the status of only this message (completed, error).
I've thought about the most elegant way to achieve this. Solutions I've come up with include, for example, a PublishSubject<ByteArray> into which I push the messages (queue-like), then add a subscriber and observe it. But this would, for example, signal onError to me if the previous message failed.
Another way that crossed my mind was to give each message a number upon creating / queueing it, and have a global "message-counter" Observable which I'd hook into the chain's beginning with a filter for the currently sent message == MY_MESSAGE_ID. But this feels kind of fragile. I could increment the counter whenever the subscription terminates, but I'm sure there must be a better way to achieve my goal.
Thanks for your help.
For future reference: the most straightforward approach I've found is to add a scheduler that works on a single thread, so that the tasks are processed sequentially.
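The single-thread idea can be sketched without RxJava as follows. This is a JDK-only illustration under stated assumptions, not the code the author ended up with: `SerialTaskQueue` and `enqueue` are hypothetical names, and each task is assumed to block until its message has been acknowledged.

```java
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.CopyOnWriteArrayList;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

final class SerialTaskQueue implements AutoCloseable {
    // One worker thread: tasks run strictly one after another, in submission order.
    private final ExecutorService worker = Executors.newSingleThreadExecutor();

    /** Queues a task and returns a future that reflects only that task's outcome. */
    <T> CompletableFuture<T> enqueue(Callable<T> task) {
        CompletableFuture<T> status = new CompletableFuture<>();
        worker.execute(() -> {
            try {
                status.complete(task.call());
            } catch (Exception e) {
                // The failure is confined to this task's future; later tasks still run.
                status.completeExceptionally(e);
            }
        });
        return status;
    }

    @Override public void close() { worker.shutdown(); }

    public static void main(String[] args) {
        List<String> sent = new CopyOnWriteArrayList<>();
        try (SerialTaskQueue queue = new SerialTaskQueue()) {
            queue.enqueue(() -> sent.add("chunk-1"));
            CompletableFuture<Object> failing =
                    queue.enqueue(() -> { throw new IllegalStateException("no ack"); });
            queue.enqueue(() -> sent.add("chunk-3")).join();
            System.out.println(sent + " failed=" + failing.isCompletedExceptionally());
        }
    }
}
```

Each caller observes only its own future, which addresses the concern about a previous message's error leaking into the next one. Note that a single thread only serializes work that actually blocks until it is done; if the acknowledgement arrives asynchronously, the task has to wait for it before returning.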

Akka HTTP Source Streaming vs regular request handling

What is the advantage of using Source Streaming over the regular way of handling requests? My understanding is that in both cases:
The TCP connection will be reused
Back-pressure will be applied between the client and the server
The only advantage of Source Streaming I can see is if there is a very large response and the client prefers to consume it in smaller chunks.
My use case is that I have a very long list of users (millions), and I need to call a service that performs some filtering on the users, and returns a subset.
Currently, on the server side I expose a batch API, and on the client, I just split the users into chunks of 1000, and make X batch calls in parallel using Akka HTTP Host API.
I am considering switching to HTTP streaming, but I cannot quite figure out what the benefit would be.
You are missing one other huge benefit: memory efficiency. By having a streamed pipeline, client/server/client, all parties safely process data without running the risk of blowing up the memory allocation. This is particularly useful on the server side, where you always have to assume the clients may do something malicious...
Client Request Creation
Suppose the ultimate source of your millions of users is a file. You can create a stream source from this file:
val userFilePath : java.nio.file.Path = ???
val userFileSource = akka.stream.scaladsl.FileIO.fromPath(userFilePath)
You can use this source to create an HTTP request that streams the users to the service:
import akka.http.scaladsl.model.HttpEntity.{Chunked, ChunkStreamPart}
import akka.http.scaladsl.model.{RequestEntity, ContentTypes, HttpRequest}
val httpRequest : HttpRequest =
HttpRequest(uri = "http://filterService.io",
entity = Chunked.fromData(ContentTypes.`text/plain(UTF-8)`, userFileSource))
This request will now stream the users to the service without consuming the entire file into memory. Only chunks of data will be buffered at a time, therefore, you can send a request with potentially an infinite number of users and your client will be fine.
Server Request Processing
Similarly, your server can be designed to accept a request with an entity that can potentially be of infinite length.
Your question says the service will filter the users. Assuming we have a filtering function:
val isValidUser : (String) => Boolean = ???
This can be used to filter the incoming request entity and create a response entity which will feed the response:
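import akka.http.scaladsl.server.Directives._
import akka.http.scaladsl.model.{ContentTypes, HttpResponse}
import akka.http.scaladsl.model.HttpEntity.Chunked
import akka.stream.scaladsl.{Framing, Source}
import akka.util.ByteString
val route = extractDataBytes { userSource =>
  val responseSource : Source[ByteString, _] =
    userSource
      // raw chunks do not align with user boundaries, so split into one frame per line first
      // (one user per line and the 1024-byte maximum are assumptions)
      .via(Framing.delimiter(ByteString("\n"), maximumFrameLength = 1024, allowTruncation = true))
      .map(_.utf8String)
      .filter(isValidUser)
      .map(user => ByteString(user + "\n"))
  complete(HttpResponse(entity = Chunked.fromData(ContentTypes.`text/plain(UTF-8)`, responseSource)))
}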
Client Response Processing
The client can similarly process the filtered users without reading them all into memory. We can, for example, dispatch the request and send all of the valid users to the console:
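import akka.actor.ActorSystem
import akka.http.scaladsl.Http
// an implicit ActorSystem is needed to send the request and run the stream
// (Akka versions before 2.6 also need an implicit ActorMaterializer)
implicit val system: ActorSystem = ActorSystem()
import system.dispatcher
Http()
  .singleRequest(httpRequest)
  .flatMap { response =>
    response
      .entity
      .dataBytes
      .map(_.utf8String)
      .runForeach(println)
  }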
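The memory argument in this answer does not depend on Akka. As a rough JDK-only illustration (not Akka code), the sketch here filters a stream of users line by line, so memory use is bounded by a small read buffer and the current line rather than by the size of the input. `StreamingFilterSketch` and `filterUsers` are hypothetical names, and one user per line is an assumption.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.Reader;
import java.io.StringReader;
import java.io.StringWriter;
import java.io.UncheckedIOException;
import java.io.Writer;
import java.util.function.Predicate;

final class StreamingFilterSketch {

    /**
     * Copies the lines accepted by isValidUser from in to out and returns how
     * many were kept. The input is never loaded as a whole.
     */
    static long filterUsers(Reader in, Writer out, Predicate<String> isValidUser) {
        try {
            BufferedReader reader = new BufferedReader(in);
            long kept = 0;
            String line;
            while ((line = reader.readLine()) != null) {
                if (isValidUser.test(line)) {
                    out.write(line);
                    out.write('\n');
                    kept++;
                }
            }
            out.flush();
            return kept;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        StringWriter out = new StringWriter();
        long kept = filterUsers(
                new StringReader("alice\nbob\nanna\n"), out, user -> user.startsWith("a"));
        System.out.println(kept + " users kept");
    }
}
```

With a batch API, by contrast, each batch of 1000 users has to be fully materialized on both sides before any filtering starts.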

Designing an API for the client to a 3rd-party service

I am fairly new to Scala and I'm working on an application (library) which is a client to a 3rd-party service (I'm not able to modify the server side, and it uses a custom binary protocol). I use Netty for networking.
I want to design an API which should allow users to:
Send requests to the server
Send requests to the server and get the response asynchronously
Subscribe to events triggered by the server (having multiple asynchronous event handlers which should be able to send requests as well)
I am not sure how I should design it. Exploring Scala, I stumbled upon a lot of information about the Actor model, but I am not sure whether it can be applied here and, if it can, how.
I'd like to get some recommendations on the way I should take.
In general, the Scala-ish way to expose asynchronous functionality to user code is to return a scala.concurrent.Future[T].
If you're going the actor route, you might consider encapsulating the binary communication within the context of a single actor class. You can scale the instances of this proxy actor using Akka's router support, and you could produce response futures easily using the ask pattern. There are a few nice libraries (Spray, Play Framework) that make wrapping e.g. a RESTful or even WebSocket layer over Akka almost trivial.
A nice model for the pub-sub functionality might be to define a Publisher trait that you can mix in to some actor subclasses. This could define some state to keep track of subscribers, handle Subscribe and Unsubscribe messages, and provide some sort of convenient method for broadcasting messages:
/**
 * Sends a copy of the supplied event object to every subscriber of
 * the event object class and superclasses.
 */
protected[this] def publish[T](event: T): Unit = {
  for (subscriber <- subscribersFor(event)) subscriber ! event
}
These are just some ideas based on doing something similar in some recent projects. Feel free to elaborate on your use case if you need more specific direction. Also, the Akka user list is a great resource for general questions like this, if indeed you're interested in exploring actors in Scala.
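Outside of actors, the bookkeeping such a Publisher trait would do (tracking subscribers, handling subscribe and unsubscribe, broadcasting to subscribers of the event's class and superclasses) can be sketched in plain Java. This is an illustration only, not Akka code: `EventBusSketch` and its methods are hypothetical names, and handlers are called synchronously on the publishing thread.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.CopyOnWriteArrayList;
import java.util.function.Consumer;

final class EventBusSketch {
    // Handlers grouped by the event class they were registered for.
    private final Map<Class<?>, List<Consumer<Object>>> subscribers = new ConcurrentHashMap<>();

    /**
     * Registers a handler for events of the given class (and its subclasses).
     * The returned Runnable removes the handler again.
     */
    <T> Runnable subscribe(Class<T> eventType, Consumer<? super T> handler) {
        Consumer<Object> erased = event -> handler.accept(eventType.cast(event));
        subscribers.computeIfAbsent(eventType, key -> new CopyOnWriteArrayList<>()).add(erased);
        return () -> subscribers.get(eventType).remove(erased);
    }

    /** Delivers the event to subscribers of its class and of every superclass. */
    void publish(Object event) {
        for (Class<?> type = event.getClass(); type != null; type = type.getSuperclass()) {
            List<Consumer<Object>> handlers = subscribers.get(type);
            if (handlers != null) {
                handlers.forEach(handler -> handler.accept(event));
            }
        }
    }

    public static void main(String[] args) {
        EventBusSketch bus = new EventBusSketch();
        List<Object> seen = new ArrayList<>();
        Runnable cancel = bus.subscribe(Number.class, seen::add);
        bus.publish(42);   // an Integer reaches the Number subscriber
        cancel.run();
        bus.publish(7);    // no subscriber left for numbers
        System.out.println(seen);
    }
}
```

In the actor version the handler call would be a message send (`subscriber ! event`), which also gives each subscriber its own mailbox and thread-safety for free.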
Observables
This looks like a good use case for the Observable pattern. This pattern comes from the Reactive Extensions of .NET, but is also available for Java and Scala. The library is provided by Netflix and is of really good quality.
This pattern has a good theoretical foundation: it is the dual of the iterator in the category-theoretical sense. More importantly, it has a lot of practical ideas in it. In particular, it handles time very well, e.g. you can limit the rate of events you want to receive.
With an observable you can process events on a very high level. In .NET it looks a lot like an SQL query. You can register for certain events ("FROM"), filter them ("WHERE") and finally process them ("SELECT"). In Scala you can use the standard monadic API (map, filter, flatMap) and of course "for expressions".
An example can look like
stackoverflowQuestions.filter(_.tag == "Scala").map(_.subject).throttleLast(1 second).subscribe(println _)
Observables take away a lot of the problems you will have with event-based systems:
Handling subscriptions
Handling errors
Filtering and pre-processing events
Buffering events
Structuring the API
Your API should provide an observable for each event source you have. For procedure calls, you provide a function that maps the function call to an observable. This function calls the remote procedure and provides the result through the observable.
Implementation details
Add the following dependency to your build.sbt:
libraryDependencies += "com.netflix.rxjava" % "rxjava-scala" % "0.15.0"
You can then use the following pattern to convert a callback to an observable (given your remote API has some way to register and unregister a callback):
private val callbackFunc : (rx.lang.scala.Observer[String]) => rx.lang.scala.Subscription = { o =>
  // Value, Error and remote stand in for whatever your remote API provides
  val listener = {
    case Value(s) => o.onNext(s)
    case Error(e) => o.onError(e)
  }
  remote.subscribe(listener)
  // Return an interface to cancel the subscription
  new Subscription {
    val unsubscribed = new AtomicBoolean(false)
    def isUnsubscribed: Boolean = unsubscribed.get()
    val asJavaSubscription: rx.Subscription = new rx.Subscription {
      def unsubscribe() {
        remote.unsubscribe(listener)
        unsubscribed.set(true)
      }
    }
  }
}
If you have some specific questions, just ask and I can refine the answer
Additional resources
There is a very nice course from Martin Odersky et al. on Coursera, covering Observables and other reactive techniques.
Take a look at the spray-client library. This provides HTTP request functionality (I'm assuming the server you want to talk to is a web service?). It gives you a pretty nice DSL for building requests and is all about being asynchronous. It does use the Akka Actor model behind the scenes, but you do not have to build your own Actors to use it. Instead you can just use Scala's Future model for handling things asynchronously. A good introduction to the Future model is here.
The basic building block of spray-client is a "pipeline" which maps an HttpRequest to a Future containing an HttpResponse:
// this is from the spray-client docs
val pipeline: HttpRequest => Future[HttpResponse] = sendReceive
val response: Future[HttpResponse] = pipeline(Get("http://spray.io/"))
You can take this basic building block and build it up into a client API in a couple of steps. First, make a class that sets up a pipeline and defines some intermediate helpers demonstrating ResponseTransformation techniques:
import akka.actor.{ActorSystem, Props}
import scala.concurrent._
import spray.can.client.HttpClient
import spray.client.HttpConduit
import spray.client.HttpConduit._
import spray.http.{HttpRequest, HttpResponse, FormData}
import spray.httpx.unmarshalling.Unmarshaller
import spray.io.IOExtension
type Pipeline = (HttpRequest) => Future[HttpResponse]
// this is basically spray-client boilerplate
def createPipeline(system: ActorSystem, host: String, port: Int): Pipeline = {
  val httpClient = system.actorOf(Props(new HttpClient(IOExtension(system).ioBridge())))
  val conduit = system.actorOf(props = Props(new HttpConduit(httpClient, host, port)))
  sendReceive(conduit)
}
private var pipeline: Pipeline = _
// unmarshalls to a specific type, e.g. a case class representing a datamodel
private def unmarshallingPipeline[T](implicit ec:ExecutionContext, um:Unmarshaller[T]) = (pipeline ~> unmarshal[T])
// for requests that don't return any content. If you get a successful Future it worked; if there's an error you'll get a failed future from the errorFilter below.
private def unitPipeline(implicit ec:ExecutionContext) = (pipeline ~> { _:HttpResponse => () })
// similar to unitPipeline, but where you care about the specific response code.
private def statusPipeline(implicit ec:ExecutionContext) = (pipeline ~> {r:HttpResponse => r.status})
// if you want standard error handling create a filter like this
// RemoteServerError and RemoteClientError are custom exception classes
// not shown here.
val errorFilter = { response:HttpResponse =>
  if(response.status.isSuccess) response
  else if(response.status.value >= 500) throw RemoteServerError(response)
  else throw RemoteClientError(response)
}
pipeline = (createPipeline(system, "yourHost", 8080) ~> errorFilter)
Then you can wrap these up in methods tied to specific requests/responses, which become the public API. For example, suppose the service has a "ping" GET endpoint that returns a string ("pong") and a "form" POST endpoint where you post form data and receive a DataModel in return:
def ping()(implicit ec:ExecutionContext, um:Unmarshaller[String]): Future[String] =
  unmarshallingPipeline[String].apply(Get("/ping"))
def form(formData: Map[String, String])(implicit ec:ExecutionContext, um:Unmarshaller[DataModel]): Future[DataModel] =
  unmarshallingPipeline[DataModel].apply(Post("/form", FormData(formData)))
And then someone could use the API like this:
import scala.util.{Failure, Success}
API.ping() foreach(println) // will print out "pong" when response comes back
API.form(Map("a" -> "b")) onComplete {
  case Success(dataModel) => println("Form accepted. Server returned DataModel: " + dataModel)
  case Failure(e) => println("Oh noes, the form didn't go through! " + e)
}
I'm not sure if you will find direct support in spray-client for your third bullet point about subscribing to events. Are these events being generated by the server and somehow sent to your client outside the scope of a specific HTTP request? If so, then spray-client will probably not be able to help directly (though your event handlers could still use it to send requests). Are the events occurring on the client side, e.g. the completion of deferred processing initially triggered by a response from the server? If so, you could actually probably get pretty far just by using the functionality in Future, but depending on your use cases, using Actors might make sense.