ASP.NET Core SignalR Streaming Clarifications

I have been going through the recent SignalR documentation and stumbled across the new Streaming feature. I managed to get it running with a JS client. However, I am still not clear on when to use it.
1- Does ChannelReader stream data to a single client?
2- If yes, what is the difference from calling this.Clients.Caller.Invoke()?
3- Let's say I am listening to an external real-time feed, e.g. a stock exchange; is it recommended to use a SignalR stream?
4- According to this post, the writer lives within a Task.Run(). So how is this scalable if I need to push a real-time feed using streams to, let's say, 1000 clients? Are there any scalability concerns with using SignalR streams generally?

1- Does ChannelReader stream data to a single client?
Yes.
2- If yes, what is the difference from calling this.Clients.Caller.Invoke()?
You can only invoke a single hub method at a time per connection (invocations run sequentially): while one invocation is in progress, the rest are queued for that connection until it finishes. With streaming methods, you can start a stream and pump data to the client while still invoking other methods on the same hub.
3- Let's say I am listening to an external real-time feed, e.g. a stock exchange; is it recommended to use a SignalR stream?
Streams are for data streamed in response to a client action. You can still do unsolicited (not client-initiated) streaming by calling a method on the IHubContext.
4- According to this post, the writer lives within a Task.Run(). So how is this scalable if I need to push a real-time feed using streams to, let's say, 1000 clients? Are there any scalability concerns with using SignalR streams generally?
It scales fine. The Task.Run only kicks off the stream; the writer awaits between items, so you are never holding a thread hostage.
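
To make answers 1, 2, and 4 concrete, here is a minimal hub sketch following the pattern from the streaming docs (the hub name, method name, item type, and timing are illustrative):

```csharp
using System;
using System.Threading;
using System.Threading.Channels;
using System.Threading.Tasks;
using Microsoft.AspNetCore.SignalR;

public class StockHub : Hub
{
    // Only the client that invokes this method receives the stream (answer 1).
    public ChannelReader<int> Counter(int count, int delay, CancellationToken cancellationToken)
    {
        var channel = Channel.CreateUnbounded<int>();

        // Fire and forget: return the reader immediately so the client can start
        // consuming while the writer pumps in the background (answer 2).
        _ = WriteItemsAsync(channel.Writer, count, delay, cancellationToken);

        return channel.Reader;
    }

    private static async Task WriteItemsAsync(
        ChannelWriter<int> writer, int count, int delay, CancellationToken cancellationToken)
    {
        try
        {
            for (var i = 0; i < count; i++)
            {
                cancellationToken.ThrowIfCancellationRequested();
                await writer.WriteAsync(i, cancellationToken);

                // Every await yields the thread, so 1000 concurrent streams do not
                // mean 1000 blocked threads (answer 4).
                await Task.Delay(delay, cancellationToken);
            }
            writer.TryComplete();
        }
        catch (Exception ex)
        {
            writer.TryComplete(ex);
        }
    }
}
```

A JS client consumes this with connection.stream("Counter", 10, 500).subscribe({ next, complete, error }). For question 3, unsolicited pushes to many clients are usually better served by injecting IHubContext<StockHub> and calling Clients.All.SendAsync(...) whenever the external feed produces data.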

Related

Flutter: Realtime API data fetch

I came across one solution where we need to use a Timer() in order to fetch changes over time and add them to a stream controller. Just a simple question: will this method consume more data for the user?
Because even when the API data has no new value, the app will still fetch the API whenever the Timer() period elapses, which I consider a waste of processing and bandwidth.
You mentioned the use of a stream in your post; hence, if your API implements a stream too, you do not need to use a Timer() to make requests (also known as polling).
You just need to subscribe to your API stream to receive updated data.
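
As a rough illustration of the difference (shown in C# to keep all examples on this page in one language; Dart's http package offers the same pattern via streamed responses), consuming a streaming endpoint means one open request instead of repeated polls. The URL is hypothetical:

```csharp
using System;
using System.IO;
using System.Net.Http;
using System.Threading.Tasks;

class StreamConsumer
{
    static async Task Main()
    {
        using var client = new HttpClient();

        // One long-lived request; the (hypothetical) endpoint pushes a line per update.
        using var response = await client.GetAsync(
            "https://api.example.com/updates/stream",
            HttpCompletionOption.ResponseHeadersRead); // don't wait for the whole body

        using var body = await response.Content.ReadAsStreamAsync();
        using var reader = new StreamReader(body);

        string line;
        while ((line = await reader.ReadLineAsync()) != null)
        {
            // A line arrives only when the server actually has new data,
            // so no bandwidth is spent on empty polls.
            Console.WriteLine(line);
        }
    }
}
```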

NServiceBus Sagas and REST API Integration

What is the most sensible approach to integrating NServiceBus Sagas with REST APIs?
The scenario is as follows,
We have a load balanced REST API. Depending on the load we can add more nodes.
REST API is a wrapper around a DomainServices API. This means the API can be consumed directly.
We would like to use Sagas for workflow and implement NServiceBus Distributor to scale-out.
Question is, if we use the REST API from Sagas, the actual processing happens in the API farm. This in a way defeats the purpose of implementing distributor pattern.
On the other hand, using the DomainServices API directly from Sagas allows processing locally within worker nodes. With this approach we will have to maintain API assemblies in multiple locations, but the throughput could be higher.
I am trying to understand the best approach. Personally, I’d prefer to consume the API (if readily available), but this could introduce chattiness to the system and could take longer to complete compared to in-process calls.
A typical sequence could be similar to publishing an online advertisement:
Advertiser submits a new advertisement request via a web application.
Web application invokes the relevant API endpoint and sends a command message.
Command message initiates a new publish-advertisement Saga instance.
Saga sends a command to validate caller permissions (in-process/out-of-process API call).
Saga sends a command to validate the advertisement data (in-process/out-of-process API call).
Saga sends a command to the fraud service (third-party service).
Once the content and fraud verifications are successful, Saga sends a command to the billing system.
Saga invokes an API call to save ad details (in-process/out-of-process API call).
And this goes on until the advertisement expires; there are a number of retry and failure-condition paths.
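
As a sketch only, the first steps of that workflow as an NServiceBus saga (v6+ async API; every message, property, and class name here is hypothetical) might look like:

```csharp
using System;
using System.Threading.Tasks;
using NServiceBus;

// Hypothetical messages for the first steps of the workflow.
public class PublishAdvertisement : ICommand { public Guid AdvertisementId { get; set; } }
public class ValidateCallerPermissions : ICommand { public Guid AdvertisementId { get; set; } }
public class CallerPermissionsValidated : IMessage { public Guid AdvertisementId { get; set; } }

public class PublishAdvertisementSagaData : ContainSagaData
{
    public Guid AdvertisementId { get; set; }
}

public class PublishAdvertisementSaga :
    Saga<PublishAdvertisementSagaData>,
    IAmStartedByMessages<PublishAdvertisement>,
    IHandleMessages<CallerPermissionsValidated>
{
    protected override void ConfigureHowToFindSaga(
        SagaPropertyMapper<PublishAdvertisementSagaData> mapper)
    {
        // Correlate every message for one advertisement to one saga instance.
        mapper.ConfigureMapping<PublishAdvertisement>(m => m.AdvertisementId)
              .ToSaga(s => s.AdvertisementId);
        mapper.ConfigureMapping<CallerPermissionsValidated>(m => m.AdvertisementId)
              .ToSaga(s => s.AdvertisementId);
    }

    public Task Handle(PublishAdvertisement message, IMessageHandlerContext context)
    {
        Data.AdvertisementId = message.AdvertisementId;

        // Step: validate caller permissions (in-process or out-of-process handler).
        return context.Send(new ValidateCallerPermissions
        {
            AdvertisementId = message.AdvertisementId
        });
    }

    public Task Handle(CallerPermissionsValidated message, IMessageHandlerContext context)
    {
        // Next: content validation, fraud check, billing, and so on;
        // MarkAsComplete() runs on the final step or on a terminal failure path.
        return Task.CompletedTask;
    }
}
```

Whether each command is handled in-process or by another endpoint is exactly the in-process/out-of-process choice described in the steps above.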
After a number of design iterations we came up with the following guidelines,
Treat REST API layer as the integration platform.
Assume API endpoints are capable of abstracting fairly complex micro-workflows. Micro-workflows are operations that execute in a single burst (not interruptible) and complete within a short time span (<1 second).
Assume the API farm is capable of serving many concurrent requests and can be easily scaled out.
Favor synchronous invocations over asynchronous message-based invocations when the target operation is fairly straightforward.
When asynchronous processing is required, use a single message handler and invoke the API from the handler (see the sketch after this list). This delegates work to the API farm and also eliminates the need for a distributor and extra hardware resources.
Avoid Sagas unless the business workflow contains multiple transactions, compensation logic, and resumes. Tests reveal Sagas do not perform well under load.
Avoid consuming DomainServices directly from a message handler. This will do the work locally and also introduces a deployment hassle by distributing business logic.
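
For the single-message-handler guideline above, a hypothetical handler that delegates to the API farm (the endpoint URL and message type are invented for illustration) could look like:

```csharp
using System.Net.Http;
using System.Threading.Tasks;
using NServiceBus;

// Hypothetical command carrying the advertisement to validate.
public class ValidateAdvertisement : ICommand
{
    public string AdvertisementJson { get; set; }
}

public class ValidateAdvertisementHandler : IHandleMessages<ValidateAdvertisement>
{
    private static readonly HttpClient Http = new HttpClient();

    public async Task Handle(ValidateAdvertisement message, IMessageHandlerContext context)
    {
        // Delegate the actual work to the load-balanced API farm instead of
        // running DomainServices locally inside the worker.
        var response = await Http.PostAsync(
            "https://api.example.com/advertisements/validate",
            new StringContent(message.AdvertisementJson));

        response.EnsureSuccessStatusCode();
    }
}
```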
Happy to hear your thoughts.
You are right on with identifying that you will need Sagas to manage workflow. I'm willing to bet that your Domain hooks up to a common database. If that is true, then it will be faster to use your Domain directly and avoid the serialization/network overhead; calling through the REST API would also cost you the ability to easily manage the transactions at the database level.
Assuming you are directly calling your Domain, the performance becomes a question of how the Domain performs. You may take steps to optimize the database, drive down distributed-transaction costs, shard the data, etc. You may end up using the Distributor to have multiple Saga processing nodes, but it sounds like you have some more testing to do once a design is chosen.
Generically speaking, we use REST APIs to model commands as resources (via POST) to allow interaction with NSB from clients who don't have direct access to messaging. This is a potential solution to get things onto NSB from your web app.
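
To make "commands as resources via POST" concrete, a hypothetical controller (using today's ASP.NET Core and NServiceBus's IMessageSession for sending from outside handlers) might look like:

```csharp
using System;
using System.Threading.Tasks;
using Microsoft.AspNetCore.Mvc;
using NServiceBus;

[ApiController]
[Route("api/advertisements")]
public class AdvertisementsController : ControllerBase
{
    private readonly IMessageSession _messageSession;

    public AdvertisementsController(IMessageSession messageSession)
        => _messageSession = messageSession;

    // POSTing the resource translates directly into a command on the bus.
    [HttpPost]
    public async Task<IActionResult> Publish(PublishAdvertisementRequest request)
    {
        var advertisementId = Guid.NewGuid();

        await _messageSession.Send(new PublishAdvertisement
        {
            AdvertisementId = advertisementId
        });

        // 202 Accepted: the workflow was accepted but runs asynchronously.
        return Accepted(new { advertisementId });
    }
}

// Hypothetical request body for the POST.
public class PublishAdvertisementRequest
{
    public string Title { get; set; }
}
```

This reuses the hypothetical PublishAdvertisement command from the saga sketch above.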

Streaming API vs REST API?

The canonical example here is Twitter's API. I understand conceptually how the REST API works; essentially it's just a query to their server for your particular request, to which you then receive a response (JSON, XML, etc.). Great.
However, I'm not exactly sure how a streaming API works behind the scenes. I understand how to consume it. For example, with Twitter you listen for a response; from the response you listen for data, and the tweets come in chunks. Build up the chunks in a string buffer and wait for a line feed, which signifies the end of a tweet. But what are they doing to make this work?
Let's say I had a bunch of data and I wanted to set up a streaming API locally for other people on the net to consume (just like Twitter). How is this done, and with what technologies? Is this something Node.js could handle? I'm just trying to wrap my head around what they are doing to make this thing work.
Twitter's streaming API is essentially a long-running request that's left open; data is pushed into it as and when it becomes available.
The repercussion of that is that the server will have to be able to deal with lots of concurrent open HTTP connections (one per client). A lot of existing servers don't manage that well; for example, Java servlet engines assign one thread per request, which can (a) get quite expensive and (b) quickly hit the normal max-threads setting, preventing subsequent connections.
As you guessed the Node.js model fits the idea of a streaming connection much better than say a servlet model does. Both requests and responses are exposed as streams in Node.js, but don't occupy an entire thread or process, which means that you could continue pushing data into the stream for as long as it remained open without tying up excessive resources (although this is subjective). In theory you could have a lot of concurrent open responses connected to a single process and only write to each one when necessary.
If you haven't looked at it already the HTTP docs for Node.js might be useful.
I'd also take a look at technoweenie's Twitter client to see what the consumer end of that API looks like with Node.js, the stream() function in particular.
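
To keep all examples on this page in one language, here is the same long-lived-response idea sketched in C# with HttpListener. It is deliberately naive: it dedicates a pooled thread to each open connection, which is exactly the scaling cost the servlet comparison above warns about, and the reason an evented server such as Node.js fits better:

```csharp
using System;
using System.Net;
using System.Text;
using System.Threading;

class NaiveStreamingServer
{
    static void Main()
    {
        var listener = new HttpListener();
        listener.Prefixes.Add("http://localhost:8080/stream/");
        listener.Start();

        while (true)
        {
            var context = listener.GetContext(); // blocks until a client connects

            // One pooled thread per open connection: the bottleneck described above.
            ThreadPool.QueueUserWorkItem(_ =>
            {
                var response = context.Response;
                response.SendChunked = true; // keep the connection open indefinitely
                response.ContentType = "application/json";
                try
                {
                    for (var i = 0; ; i++)
                    {
                        // Push a chunk whenever data becomes available
                        // (simulated here with a timer).
                        var chunk = Encoding.UTF8.GetBytes($"{{\"tick\":{i}}}\r\n");
                        response.OutputStream.Write(chunk, 0, chunk.Length);
                        response.OutputStream.Flush();
                        Thread.Sleep(1000);
                    }
                }
                catch (HttpListenerException)
                {
                    // The client disconnected; stop streaming to it.
                }
                finally
                {
                    response.Close();
                }
            });
        }
    }
}
```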

Offline client and messages to azure

I'm playing around with Windows Azure and I would like to build a cloud-hosted server application that receives messages from many different clients, such as mobile and desktop.
I would like to build the clients so that they work in "offline mode", i.e. each client builds up a local queue of messages that are sent to the Azure server as soon as the client gets back online.
Can I accomplish this using WCF and/or Azure's queuing mechanism, so that I don't have to worry about whether the client is online or offline when I write the code?
You won't need queuing in the cloud to accomplish this. For the client app to be "offline enabled" you need to do the queuing on the client. For this there are many options: a local database, XML files, etc. Whenever the app senses network availability, you can upload your queue to Azure. And yes, you can use WCF for that.
For the client queue/sync stuff you could take a look at the Sync Framework.
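
A minimal sketch of that client-side outbox idea (the file format and the send delegate are placeholders for whatever WCF or HTTP call you use):

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.Net.NetworkInformation;

// Messages pile up in a local file and are flushed to the service
// whenever the network comes back.
public class OfflineOutbox
{
    private readonly string _queueFile;
    private readonly Action<string> _send; // e.g. a WCF client call

    public OfflineOutbox(string queueFile, Action<string> send)
    {
        _queueFile = queueFile;
        _send = send;

        // Fires when connectivity changes, so we can drain the queue.
        NetworkChange.NetworkAvailabilityChanged += (sender, e) =>
        {
            if (e.IsAvailable) Flush();
        };
    }

    public void Enqueue(string message)
    {
        File.AppendAllLines(_queueFile, new[] { message }); // survives app restarts

        if (NetworkInterface.GetIsNetworkAvailable()) Flush();
    }

    private void Flush()
    {
        if (!File.Exists(_queueFile)) return;

        var pending = new Queue<string>(File.ReadAllLines(_queueFile));
        while (pending.Count > 0)
        {
            try
            {
                _send(pending.Peek()); // only dequeue after a successful send
                pending.Dequeue();
            }
            catch (Exception)
            {
                break; // still offline or the service failed; retry next time
            }
        }
        File.WriteAllLines(_queueFile, pending);
    }
}
```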
I haven't found a great need for the cloud-side queue so far. Maybe it's just that I'm not seeing it in my app's design. It could also be that the data you can store in a queue message is minimal: you basically store short text strings (like record IDs), and then you have to do something with the ID when you pull it from the queue, such as look it up, delete it, whatever.
In my app, I didn't use the queue at all, just as Peter suggests. I wrote directly to table storage (accessed via its REST interface using StorageClient) from the client. If you want to look at a concrete example, take a look at http://www.netalerts.mobi/traffic. Like you, I wanted to learn Azure, so I built a small web site.
There's a worker role that wakes up every 60 seconds. Using one thread, it retrieves any new data from its source (screen-scraping a web page). New entries are stored directly in table storage (no need for a queue). Another thread deletes entries in table storage that are older than a specified threshold (there's no issue with running multiple threads against table storage). And then I'm working on a third thread, which is designed to send notifications to handheld devices.
The app itself is a web role, obviously.

REST, WCF and Queues

I created a RESTful service using WCF which calculates some value and then returns a response to the client.
I am expecting a lot of traffic, so I am not sure whether I need to manually implement queues or whether that is unnecessary in order to process all client requests.
Actually, I am receiving measurements from clients which have to be stored in the database; each client sends a measurement every 200 ms, so with multiple clients there could be a lot of requests.
There are also other operations performed on the received data. For example, a client could send an instruction "give me the average of the last 200 measurements", so it could take some time to calculate this value, and in the meantime the same request could come from another client.
I would be very thankful if anyone could give any advice on how to create a reliable service using WCF.
Thanks!
You could use the MsmqBinding and utilize the method implemented by eedsi9n. However, from what I gather from this post, you're looking for something along the lines of a pub/sub type of architecture.
This can be implemented with the WSDualHttpBinding, which allows clients to subscribe to events. The publisher then notifies them when the action is completed.
Therefore you could have MSMQ running behind the scenes. The client subscribes to certain events, then publishes a message that needs to be processed. The client sits there and does other work (because it's all async), and when the publisher is done working on the message, it can publish an event (the event your client subscribed to) letting you know that it's done. That way you don't have to implement a polling strategy.
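
A minimal sketch of that duplex shape (contract and operation names are hypothetical; the endpoint would use wsDualHttpBinding). The client implements the callback interface, and the service calls back when the work completes:

```csharp
using System.ServiceModel;

// The client implements this callback contract; the service calls it back
// over the duplex channel when the long-running work is done.
public interface ICalcCallback
{
    [OperationContract(IsOneWay = true)]
    void CalculationCompleted(string requestId, double result);
}

[ServiceContract(CallbackContract = typeof(ICalcCallback))]
public interface ICalcSubscription
{
    [OperationContract(IsOneWay = true)]
    void RequestAverage(string requestId, int lastN);
}

public class CalcSubscriptionService : ICalcSubscription
{
    public void RequestAverage(string requestId, int lastN)
    {
        var callback = OperationContext.Current.GetCallbackChannel<ICalcCallback>();

        // Hand the work off (e.g. to an MSMQ-backed worker); when it finishes,
        // notify the subscriber instead of making it poll.
        var result = ComputeAverage(lastN);
        callback.CalculationCompleted(requestId, result);
    }

    private static double ComputeAverage(int lastN) => 0.0; // placeholder
}
```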
There are pre-canned solutions for this as well, such as NServiceBus, MassTransit, and Rhino Service Bus.
If you are using a web service, TCP will act as the queue to a certain degree.
TCP provides reliable, ordered delivery of a stream of bytes from one program on one computer to another program on another computer.
This guarantees that if the client sends packets A, B, then C, the server will receive them in that order: A, B, then C. If you must reply to the client in the same order as the requests, then you might need a queue.
By default, the maximum number of ASP.NET worker threads is set to 12 per CPU core. So on a dual-core machine, you can run 24 connections at a time. Depending on how long the calculation takes and what you mean by "a lot of traffic", you could try different strategies.
The simplest one is to use serviceTimeouts and serviceThrottling: only handle what you can handle, and reject the requests you can't.
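
In code, that cap looks roughly like this (the numbers are illustrative; the same settings are also available through the serviceThrottling element in config):

```csharp
using System;
using System.ServiceModel;
using System.ServiceModel.Description;

class ThrottledHost
{
    static void Main()
    {
        var host = new ServiceHost(typeof(CalcService), // your service implementation
                                   new Uri("http://localhost:8000/calc"));

        // Only accept what the box can actually handle; the rest wait or fail fast.
        host.Description.Behaviors.Add(new ServiceThrottlingBehavior
        {
            MaxConcurrentCalls = 24,     // e.g. 12 threads per core on a dual core
            MaxConcurrentSessions = 100,
            MaxConcurrentInstances = 24
        });

        host.Open();
        Console.WriteLine("Listening; press Enter to stop.");
        Console.ReadLine();
        host.Close();
    }
}
```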
If that's not an option, increase hardware. That's the second option.
Finally, you could make the service completely asynchronous. Implement two methods:
string PostCalc(...) and double GetCalc(string id). PostCalc accepts the parameters, stuffs them into a queue (or a database) and returns a GUID immediately (I like using string instead of Guid). The client can use the returned GUID as a claim ticket and call GetCalc(string id) every few seconds; if the calculation has not finished yet, you can return 404 for REST. The calculation must now be done by a separate process that monitors the queue.
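
A sketch of that claim-ticket shape with the WCF web programming model (System.ServiceModel.Web); the in-memory collections stand in for the queue/database, and a background worker (not shown) would drain _pending and fill _results:

```csharp
using System;
using System.Collections.Concurrent;
using System.Net;
using System.ServiceModel;
using System.ServiceModel.Web;

[ServiceContract]
public interface ICalcService
{
    [OperationContract]
    [WebInvoke(Method = "POST", UriTemplate = "calc")]
    string PostCalc(int lastN);

    [OperationContract]
    [WebGet(UriTemplate = "calc/{id}")]
    double GetCalc(string id);
}

[ServiceBehavior(InstanceContextMode = InstanceContextMode.Single)]
public class CalcService : ICalcService
{
    // Stand-ins for the queue/database the answer suggests.
    private readonly ConcurrentQueue<(string Id, int LastN)> _pending =
        new ConcurrentQueue<(string Id, int LastN)>();
    private readonly ConcurrentDictionary<string, double> _results =
        new ConcurrentDictionary<string, double>();

    public string PostCalc(int lastN)
    {
        var id = Guid.NewGuid().ToString();  // the claim ticket
        _pending.Enqueue((id, lastN));       // a background worker computes these
        return id;
    }

    public double GetCalc(string id)
    {
        if (_results.TryGetValue(id, out var result))
            return result;

        // Not finished yet: tell the REST client to come back later.
        WebOperationContext.Current.OutgoingResponse.StatusCode = HttpStatusCode.NotFound;
        return 0;
    }
}
```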
The third option is the most complicated, but the outcome is similar to the first option's cap on incoming requests.
It will depend on what you mean by "calculates some value" and "a lot of traffic". You could do some load testing and see how the number of requests per second evolves with the traffic.
There's nothing WCF-specific here if you are RESTful:
The GET for an average would return a URI where the answer will be available once the server finishes calculating (if it is indeed a long operation).
Regarding getting the measurements: you didn't specify the freshness needed (i.e. when you get a request for an average, how fresh do the results need to be). You also did not specify the relative frequency of queries vs. new measurements.
In any event you can (and IMHO should) use a queue behind the endpoint (assuming measuring your performance proves you need it). If you change the WCF binding you might still be RESTful, but you will not benefit from the standards-based approach of REST over HTTP.