Reactive pattern for combing operator where one is dependent on another - operators

I have an app that consumes a REST service. It makes requests every time someone e.g. clicks a button. However it needs to get a token first and then refresh every 20 minutes thereafter. Here is a marble diagram that represents the desired combining of the streams ...
TOKEN SOURCE ---------A--------------B--------------C----------------D-
REQUEST SOURCE ----1-----------2--3---------4-----------5-------6--------
RESULT SOURCE ---------A1-----A2-A3--------B4----------C5------C6-------
Typically the request source triggers the result which combines with the latest value of the token source, however there is an exception when no items have been emitted on the token stream - in this case the request is buffered until the first token arrives and then it is sent.
The combineLatest operator is almost there but it triggers when either stream emits. The marble diagram for sample operator also seems close but it throttles the output based on a time interval which is not what I want.
Which operator/chain of operators would work for this instance?
I'm prototyping with RxJS but I need to implement in RxSwift.

You were close to the solution. The important detail is that Source (2) projects all its elements, while Source (1) does not directly project, and having that extra bit of information can help us discard when only Source (1) changes.
s1.combineLatest(s2, (a, b) => [a, b])
.distinctUntilChanged(v => v[1])
Here's the test setup:
var s1 = Rx.Observable.interval(997).take(10).map(i => String.fromCharCode(65 + i))
var s2 = Rx.Observable.interval(251).take(10).delay(500).skip(1)
s1.combineLatest(s2, (a, b) => [a, b])
.distinctUntilChanged(v => v[1])
.subscribe(value => console.log(value))
Output:
["A",1]
["A",2]
["A",3]
["A",4]
["B",5]
["B",6]
["B",7]
["B",8]
["C",9]

Since you want your data-emission only to be triggered based on your REQUEST SOURCE, it'd probably be the best way to start your stream with that and then switch - or in this case flatMap - onto the TOKEN SOURCE and combine those two - no advanced operators required:
(Note: My assumption is, that your TOKEN SOURCE does replay the latest token:
var token$ = Rx.Observable.interval(1000).take(10).map(() => Math.random()).publishReplay(1); // simulating the token source: new token every second and replaying the latest token
var requests$ = Rx.Observable.interval(250).take(10); // simulating the request source: emitting a new request every 250ms
token$.connect();
requests$
.flatMap(request => token$
.take(1)
.map(token => ({request, token}))
)
.subscribe(function(requestWithToken) {
console.log("Matched: ", requestWithToken);
});
You can check out the live fiddle here: https://jsfiddle.net/rn8rzufg/

Related

Ingest data into warp10 - Performance tip

We're looking for the best way to ingest data in warp10. We are on a Microservices architecture that use Kafka mainly.
Two solutions:
Use Ingress endpoint as defined here: https://www.warp10.io/content/03_Documentation/03_Interacting_with_Warp_10/03_Ingesting_data/01_Ingress (This is the solution we use for now)
Use the warp10 Kafka plugin as defined here: https://blog.senx.io/introducing-the-warp-10-kafka-plugin/
As described here, we use Ingress solution as of now, based on an aggregation of data for x seconds, and call the Ingress API to send data per packet. (Instead of calling the API each time we need to insert something).
For few days, we are experimenting with the Kafka Plugin. We successfully set up the plugin and create an .mc2 responsible to consume data from a given topic and then insert them using UPDATE into warp10.
Questions:
Using the Kafka plugin, would it be better to apply the same buffer mechanism as the one applied when we use the Ingress endpoint? Or, is there any specific implementation in warp10 Kafka plugin that allows to consume message per message in the topic and call the UPDATE function for each ?
Today, as both solutions are working, we're trying to find differences to get the best performance results during ingestion of data. And if possible, without having to apply any buffer mechanism because we are trying to be in real-time as much as possible.
MC2 file:
{
'topics' [ 'our_topic_name' ] // List of Kafka topics to subscribe to
'parallelism' 1 // Number of threads to start for processing the incoming messages. Each thread will handle a certain number of partitions.
'config' { // Map of Kafka consumer parameters
'bootstrap.servers' 'kafka-headless:9092'
'group.id' 'senx-consumer'
'enable.auto.commit' 'true'
}
'macro' <%
// macro executed each time a kafka record is consumed
/*
// received record format :
{
'timestamp' 123 // The record timestamp
'timestampType' 'type' // The type of timestamp, can be one of 'NoTimestampType', 'CreateTime', 'LogAppendTime'
'topic' 'topic_name' // Name of the topic which received the message
'offset' 123 // Offset of the message in 'topic'
'partition' 123 // Id of the partition which received the message
'key' ... // Byte array of the message key
'value' ... // Byte array of the message value
'headers' { } // Map of message headers
}
*/
"recordArray" STORE
"preprod.write" "token" STORE
// macro can be called on timeout with an empty entry map
$recordArray SIZE 0 !=
<%
$recordArray 'value' GET // kafka record value is retrieved in bytes
'UTF-8' BYTES-> // convert bytes to string (WARP10 INGRESS format)
JSON->
"value" STORE
"Records received through Kafka" LOGMSG
$value LOGMSG
$value
<%
DROP
PARSE
// PARSE outputs a gtsList, including only one gts
0 GET
// GTS rename is required to use UPDATE function
"gts" STORE
$gts $gts NAME RENAME
%>
LMAP
// Store GTS in Warp10
$token
UPDATE
%>
IFT
%> // end macro
'timeout' 10000 // Polling timeout (in ms), if no message is received within this delay, the macro will be called with an empty map as input
}
If you want to cache something in Warp 10 to avoid lots of UPDATE per second, you can use SHM (SHared Memory). This is a built-in extension you need to activate.
Once activated, use it with SHMSTORE and SHMLOAD to keep objects in RAM between two WarpScript executions.
In you example, you can push all the incoming GTS in a list, or a list of list of GTS, using +! to append elements to an existing list.
The MERGE of all the GTS in the cache (by name + labels) and UPDATE in the database can then be done in a runner (don't forget to use a MUTEX)
Don't forget the total operation cost:
The ingress format can be optimized for ingestion speed, if you do not repeat classname and labels, and if you gather lines per gts. See here.
PARSE deserialize data from the Warp 10 ingress format.
UPDATE serialize data to the Warp 10 optimized ingress format (and push it to the update endpoint).
the update endpoint deserialize again.
It makes sense to do these deserialize/serialize/deserialize operation if your input data is far from the optimal ingress format. It also make sense if you want to RANGECOMPACT your data to save disk space, or do any preprocessing.

How to push Salesforce Order to an external REST API?

I have experience in Salesforce administration, but not in Salesforce development.
My task is to push a Order in Salesforce to an external REST API, if the order is in the custom status "Processing" and the Order Start Date (EffectiveDate) is in 10 days.
The order will be than processed in the down-stream system.
If the order was successfully pushed to the REST API the status should be changed to "Activated".
Can anybody give me some example code to get started?
There's very cool guide for picking right mechanism, I've been studying from this PDF for one of SF certifications: https://developer.salesforce.com/docs/atlas.en-us.integration_patterns_and_practices.meta/integration_patterns_and_practices/integ_pat_intro_overview.htm
A lot depends on whether the endpoint is accessible from Salesforce (if it isn't - you might have to pull data instead of pushing), what authentication it needs.
For push out of Salesforce you could use
Outbound Message - it'd be an XML document sent when (time-based in your case?) workflow fires, not REST but it's just clicks, no code. The downside is that it's just 1 object in message. So you can send Order header but no line items.
External Service would be code-free and you could build a flow with it.
You could always push data with Apex code (something like this). We'd split the solution into 2 bits.
The part that gets actual work done: At high level you'd write function that takes list of Order ids as parameter, queries them, calls req.setBody(JSON.serialize([SELECT Id, OrderNumber FROM Order WHERE Id IN :ids]));... If the API needs some special authentication - you'd look into "Named Credentials". Hard to say what you'll need without knowing more about your target.
And the part that would call this Apex when the time comes. Could be more code (a nightly scheduled job that makes these callouts 1 minute after midnight?) https://salesforce.stackexchange.com/questions/226403/how-to-schedule-an-apex-batch-with-callout
Could be a flow / process builder (again, you probably want time-based flows) that calls this piece of Apex. The "worker" code would have to "implement interface" (a fancy way of saying that the code promises there will be function "suchAndSuchName" that takes "suchAndSuch" parameters). Check Process.Plugin out.
For pulling data... well, target application could login to SF (SOAP, REST) and query the table of orders once a day. Lots of integration tools have Salesforce plugins, do you already use Azure Data Factory? Informatica? BizTalk? Mulesoft?
There's also something called "long polling" where client app subscribes to notifications and SF pushes info to them. You might have heard about CometD? In SF-speak read up about Platform Events, Streaming API, Change Data Capture (although that last one fires on change and sends only the changed fields, not great for pushing a complete order + line items). You can send platform events from flows too.
So... don't dive straight to coding the solution. Plan a bit, the maintenance will be easier. This is untested, written in Notepad, I don't have org with orders handy... But in theory you should be able to schedule it to run at 1 AM for example. Or from dev console you can trigger it with Database.executeBatch(new OrderSyncBatch(), 1);
public class OrderSyncBatch implements Database.Batchable, Database.AllowsCallouts {
public Database.QueryLocator start(Database.BatchableContext bc) {
Date cutoff = System.today().addDays(10);
return Database.getQueryLocator([SELECT Id, Name, Account.Name, GrandTotalAmount, OrderNumber, OrderReferenceNumber,
(SELECT Id, UnitPrice, Quantity, OrderId FROM OrderItems)
FROM Order
WHERE Status = 'Processing' AND EffectiveDate = :cutoff]);
}
public void execute(Database.BatchableContext bc, List<sObject> scope) {
Http h = new Http();
List<Order> toUpdate = new List<Order>();
// Assuming you want 1 order at a time, not a list of orders?
for (Order o : (List<Order>)scope) {
HttpRequest req = new HttpRequest();
HttpResponse res;
req.setEndpoint('https://example.com'); // your API endpoint here, or maybe something that starts with "callout:" if you'd be using Named Credentials
req.setMethod('POST');
req.setHeader('Content-Type', 'application/json');
req.setBody(JSON.serializePretty(o));
res = h.send(req);
if (res.getStatusCode() == 200) {
o.Status = 'Activated';
toUpdate.add(o);
}
else {
// Error handling? Maybe just debug it, maybe make a Task for the user or look into
// Database.RaisesPlatformEvents
System.debug(res);
}
}
update toUpdate;
}
public void finish(Database.BatchableContext bc) {}
public void execute(SchedulableContext sc){
Database.executeBatch(new OrderSyncBatch(), Limits.getLimitCallouts()); // there's limit of 10 callouts per single transaction
// and by default batches process 200 records at a time so we want smaller chunks
// https://developer.salesforce.com/docs/atlas.en-us.apexref.meta/apexref/apex_methods_system_limits.htm
// You might want to tweak the parameter even down to 1 order at a time if processing takes a while at the other end.
}
}

is there any plan for karate to support automate kafka message testing same way of API testing? [duplicate]

Our test automation needs to interact with kafka and we are looking at how we can achieve this with karate.
We have a java class which reads from kafka and puts records in an internal list. We then ask for these records from karate, filter out all messages from background traffic, and return the first message that matches our filter.
So our consumer looks like this (simplified):
// consume.js
function(bootstrapServers, topic, filter, timeout, interval) {
var KafkaLib = Java.type('kafka.KafkaLib')
var records = KafkaLib.getRecords(bootstrapServers, topic)
for (record_id in records) {
// TODO here we want to convert record to a json (and later xml for xml records) so that
// we can access them as 'native' karate data types and use notation like: cat.cat.scores.score[1]
var record = records[record_id]
if (filter(record)) {
karate.log("Record matched: " + record)
return record
}
}
throw "No records found matching the filter: " + filter
}
Records can be json, xml, or plain text, but looking in the json case now.
In this case given that in kafka there is a message like this:
{"correlationId":"b3e6bbc7-e5a6-4b2a-a8f9-a0ddf435de67","text":"Hello world"}
This is loaded as a string in the record variable above.
We want to convert this to json so that a filter like this would work:
* def uuid = java.util.UUID.randomUUID() + ''
# This is what we are publishing to kafka
* def payload = ({ correlationId: uuid, text: "Hello world" })
* def filter = function(m) { return m.correlationId == uuid }
Is there a way to convert a string to a native karate variable in javascript? Might have missed it looking at https://intuit.github.io/karate/#the-karate-object. By the way var jsonRecord = karate.toJson(record) did not work and jsonRecord.uuid was undefined.
Edit: I have made an example of what I am trying to achieve here:
https://github.com/KostasKgr/karate-issues/blob/java_json_interop/src/test/java/examples/consumption/consumption.feature
Many thanks
Sometime ago I had put together a something that could be used to test Kafka from within Karate. Pls see if https://github.com/Sdaas/karate-kafka helps. Happy to enhance / improve if it helps you.
Can you try,
* json payload = { correlationId: uuid, text: "Hello world" }
ref : Type Conversion
for type conversion within javascript ideally karate.toMap(object) or karate.toJson(object) should.
rather than wrapping up everything into one JS function, I would suggest keeping the record invoking part outside the JS and let karate cast it.
* json records = Java.type('kafka.KafkaLib').getRecords(bootstrapServers, topic)
* consume(records, filter, timeout, interval)
As mentioned in the comments of another answer, there is now an enhancement ticket on karate to achieve what was discussed in this thread, see https://github.com/intuit/karate/issues/1202
Until that is in place, I managed to get most of what I wanted concerning JSON by parsing string to json in Java and returning that to karate.
Map<String,Object> result = new ObjectMapper().readValue(record, HashMap.class);
Not sure if the same can be worked around for xml
You can see the workaround in action here:
https://github.com/KostasKgr/karate-issues/blob/java_json_interop_v2/src/test/java/examples/consumption/consumption.feature
Because of Karate's support for Java inter-op you can easily write some "glue" code to connect your existing Kafka systems to Karate test-suites, see the first link below.
Here are a few references:
how to use Java inter-op to listen and wait for events: https://twitter.com/KarateDSL/status/1417023536082812935
the Karate ActiveMQ example: https://github.com/intuit/karate/tree/master/karate-netty#consumer-provider-example
Walmart Labs blog post (Kafka specific): https://medium.com/walmartglobaltech/kafka-automation-using-karate-6a129cfdc210
Karate Kafka (3rd party project / example): https://github.com/Sdaas/karate-kafka

Designing an API for the client to a 3rd-party service

I am fairly new to Scala and I'm working on an application (library) which is a client to a 3rd-party service (I'm not able to modify the server side and it uses custom binary protocol). I use Netty for networking.
I want to design an API which should allow users to:
Send requests to the server
Send requests to the server and get the response asynchronously
Subscribe to events triggered by the server (having multiple asynchronous event handlers which should be able to send requests as well)
I am not sure how should I design it. Exploring Scala, I stumble upon a bunch of information about Actor model, but I am not sure if it can be applied there and if it can, how.
I'd like to get some recommendations on the way I should take.
In general, the Scala-ish way to expose asynchronous functionality to user code is to return a scala.concurrent.Future[T].
If you're going the actor route, you might consider encapsulating the binary communication within the context of a single actor class. You can scale the instances of this proxy actor using Akka's router support, and you could produce response futures easily using the ask pattern. There are a few nice libraries (Spray, Play Framework) that make wrapping e.g. a RESTful or even WebSocket layer over Akka almost trivial.
A nice model for the pub-sub functionality might be to define a Publisher trait that you can mix in to some actor subclasses. This could define some state to keep track of subscribers, handle Subscribe and Unsubscribe messages, and provide some sort of convenient method for broadcasting messages:
/**
* Sends a copy of the supplied event object to every subscriber of
* the event object class and superclasses.
*/
protected[this] def publish[T](event: T) {
for (subscriber <- subscribersFor(event)) subscriber ! event
}
These are just some ideas based on doing something similar in some recent projects. Feel free to elaborate on your use case if you need more specific direction. Also, the Akka user list is a great resource for general questions like this, if indeed you're interested in exploring actors in Scala.
Observables
This looks like a good example for the Obesrvable pattern. This pattern comes from the Reactive Extensions of .NET, but is also available for Java and Scala. The library is provided by Netflix and has a really good quality.
This pattern has a good theoretical foundation --- it is the dual to the iterator in the category theoretical sense. But more important, it has a lot of practical ideas in it. Especially it handles time very good, e.g. you can limit the event rate you want to get.
With an observable you can process events on avery high level. In .NET it looks a lot like an SQL query. You can register for certain events ("FROM"), filter them ("WHERE") and finally process them ("SELECT"). In Scala you can use standard monadic API (map, filter, flatMap) and of course "for expressions".
An example can look like
stackoverflowQuestions.filter(_.tag == "Scala").map(_.subject).throttleLast(1 second).subscribe(println _)
Obeservables take away a lot of problems you will have with event based systems
Handling subsrcriptions
Handling errors
Filtering and pre-processing events
Buffering events
Structuring the API
Your API should provide an obesrvable for each event source you have. For procedure calls you provide a function that will map the function call to an obesrvable. This function will call the remote procedure and provide the result through the obeservable.
Implementation details
Add the following dependency to your build.sbt:
libraryDependencies += "com.netflix.rxjava" % "rxjava-scala" % "0.15.0"
You can then use the following pattern to convert a callback to an obeservable (given your remote API has some way to register and unregister a callback):
private val callbackFunc : (rx.lang.scala.Observer[String]) => rx.lang.scala.Subscription = { o =>
val listener = {
case Value(s) => o.onNext(s)
case Error(e) => o.onError(o)
}
remote.subscribe(listener)
// Return an interface to cancel the subscription
new Subscription {
val unsubscribed = new AtomicBoolean(false)
def isUnsubscribed: Boolean = unsubscribed.get()
val asJavaSubscription: rx.Subscription = new rx.Subscription {
def unsubscribe() {
remote.unsubscribe(listener)
unsubscribed.set(true)
}
}
}
If you have some specific questions, just ask and I can refine the answer
Additional ressources
There is a very nice course from Martin Odersky et al. at coursera, covering Observables and other reactive techniques.
Take a look at the spray-client library. This provides HTTP request functionality (I'm assuming the server you want to talk to is a web service?). It gives you a pretty nice DSL for building requests and is all about being asynchronous. It does use the akka Actor model behind the scenes, but you do not have to build your own Actors to use it. Instead the you can just use scala's Future model for handling things asynchronously. A good introduction to the Future model is here.
The basic building block of spray-client is a "pipeline" which maps an HttpRequest to a Future containing an HttpResponse:
// this is from the spray-client docs
val pipeline: HttpRequest => Future[HttpResponse] = sendReceive
val response: Future[HttpResponse] = pipeline(Get("http://spray.io/"))
You can take this basic building block and build it up into a client API in a couple of steps. First, make a class that sets up a pipeline and defines some intermediate helpers demonstrating ResponseTransformation techniques:
import scala.concurrent._
import spray.can.client.HttpClient
import spray.client.HttpConduit
import spray.client.HttpConduit._
import spray.http.{HttpRequest, HttpResponse, FormData}
import spray.httpx.unmarshalling.Unmarshaller
import spray.io.IOExtension
type Pipeline = (HttpRequest) => Future[HttpResponse]
// this is basically spray-client boilerplate
def createPipeline(system: ActorSystem, host: String, port: Int): Pipeline = {
val httpClient = system.actorOf(Props(new HttpClient(IOExtension(system).ioBridge())))
val conduit = system.actorOf(props = Props(new HttpConduit(httpClient, host, port)))
sendReceive(conduit)
}
private var pipeline: Pipeline = _
// unmarshalls to a specific type, e.g. a case class representing a datamodel
private def unmarshallingPipeline[T](implicit ec:ExecutionContext, um:Unmarshaller[T]) = (pipeline ~> unmarshal[T])
// for requests that don't return any content. If you get a successful Future it worked; if there's an error you'll get a failed future from the errorFilter below.
private def unitPipeline(implicit ec:ExecutionContext) = (pipeline ~> { _:HttpResponse => () })
// similar to unitPipeline, but where you care about the specific response code.
private def statusPipeline(implicit ec:ExecutionContext) = (pipeline -> {r:HttpResponse => r.status})
// if you want standard error handling create a filter like this
// RemoteServerError and RemoteClientError are custom exception classes
// not shown here.
val errorFilter = { response:HttpResponse =>
if(response.status.isSuccess) response
else if(response.status.value >= 500) throw RemoteServerError(response)
else throw RemoteClientError(response)
}
pipeline = (createPipeline(system, "yourHost", 8080) ~> errorFilter)
Then you can use wrap these up in methods tied to specific requests/responses that becomes the public API. For example, suppose the service has a "ping" GET endpoint that returns a string ("pong") and a "form" POST endpoint where you post form data and receive a DataModel in return:
def ping()(implicit ec:ExecutionContext, um:Unmarshaller[String]): Future[String] =
unmarshallingPipeline(Get("/ping"))
def form(formData: Map[String, String])(implicit ec:ExecutionContext, um:Unmarshaller[DataModel]): Future[DataModel] =
unmarshallingPipeline(Post("/form"), FormData(formData))
And then someone could use the API like this:
import scala.util.{Failure, Success}
API.ping() foreach(println) // will print out "pong" when response comes back
API.form(Map("a" -> "b") onComplete {
case Success(dataModel) => println("Form accepted. Server returned DataModel: " + dataModel)
case Failure(e) => println("Oh noes, the form didn't go through! " + e)
}
I'm not sure if you will find direct support in spray-client for your third bullet point about subscribing to events. Are these events being generated by the server and somehow sent to your client outside the scope of a specific HTTP request? If so, then spray-client will probably not be able to help directly (though your event handlers could still use it to send requests). Are the events occurring on the client side, e.g. the completion of deferred processing initially triggered by a response from the server? If so, you could actually probably get pretty far just by using the functionality in Future, but depending on your use cases, using Actors might make sense.

ASP.NET Web API - Reading querystring/formdata before each request

For reasons outlined here I need to review a set values from they querystring or formdata before each request (so I can perform some authentication). The keys are the same each time and should be present in each request, however they will be located in the querystring for GET requests, and in the formdata for POST and others
As this is for authentication purposes, this needs to run before the request; At the moment I am using a MessageHandler.
I can work out whether I should be reading the querystring or formdata based on the method, and when it's a GET I can read the querystring OK using Request.GetQueryNameValuePairs(); however the problem is reading the formdata when it's a POST.
I can get the formdata using Request.Content.ReadAsFormDataAsync(), however formdata can only be read once, and when I read it here it is no longer available for the request (i.e. my controller actions get null models)
What is the most appropriate way to consistently and non-intrusively read querystring and/or formdata from a request before it gets to the request logic?
Regarding your question of which place would be better, in this case i believe the AuthorizationFilters to be better than a message handler, but either way i see that the problem is related to reading the body multiple times.
After doing "Request.Content.ReadAsFormDataAsync()" in your message handler, Can you try doing the following?
Stream requestBufferedStream = Request.Content.ReadAsStreamAsync().Result;
requestBufferedStream.Position = 0; //resetting to 0 as ReadAsFormDataAsync might have read the entire stream and position would be at the end of the stream causing no bytes to be read during parameter binding and you are seeing null values.
note: The ability of a request's content to be read single time only or multiple times depends on the host's buffer policy. By default, the host's buffer policy is set as always Buffered. In this case, you will be able to reset the position back to 0. However, if you explicitly make the policy to be Streamed, then you cannot reset back to 0.
What about using ActionFilterAtrributes?
this code worked well for me
public HttpResponseMessage AddEditCheck(Check check)
{
var request= ((System.Web.HttpContextWrapper)Request.Properties.ToList<KeyValuePair<string, object>>().First().Value).Request;
var i = request.Form["txtCheckDate"];
return Request.CreateResponse(HttpStatusCode.Ok);
}