I came across a solution where we use Timer() to fetch changes over time and add them to a stream controller. Just a simple question: will this method consume more internet data for the user?
Because even when the API data has no new values, the app will still call the API every time the Timer() period elapses, which I consider a waste of processing and internet usage.
You mentioned the use of a stream in your post, so if your API implements a stream too, you do not need to use a Timer() to make requests (a technique known as polling).
You just need to subscribe to your API stream to receive updated data.
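For instance, if the API exposes a WebSocket (or SSE) endpoint, the client subscribes once and only receives traffic when something actually changes. A minimal sketch in TypeScript; the endpoint URL is hypothetical:

```typescript
// Instead of polling on a timer, open one long-lived connection
// and let the server push updates as they happen.
// "wss://example.com/updates" is a hypothetical endpoint.
const socket = new WebSocket("wss://example.com/updates");

socket.onmessage = (event) => {
  // The server only sends a message when the data actually
  // changes, so no bandwidth is spent on empty polls.
  const update = JSON.parse(event.data);
  console.log("new data:", update);
};

socket.onerror = (err) => console.error("stream error:", err);
```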
I have been going through the recent SignalR documentation and stumbled across the new feature called Streaming. I managed to get it running with a JS client. However, I am still not clear on when to use it.
1- Does ChannelReader stream data to a single client?
2- If yes, what is the difference compared to calling this.Clients.Caller.Invoke()?
3- Let's say I am listening to an external real-time feed, e.g. a stock exchange; is it recommended to use a SignalR stream?
4- According to this post, the writer lives within a Task.Run(). So how is this scalable if I need to push a real-time feed using streams to, let's say, 1000 clients? Are there any scalability concerns with using SignalR streams generally?
1- Does ChannelReader stream data to a single client?
Yes.
2- If yes, what is the difference compared to calling this.Clients.Caller.Invoke()?
You can only invoke a single method at a time (sequentially); while one invocation is in progress, further invocations are queued for that connection until it finishes. With streaming methods, you can start a stream and pump data to the client while still invoking other methods on the same hub.
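For example, from the client side (shown with the official @microsoft/signalr TypeScript client; the hub URL and method names are hypothetical), a stream subscription and a regular invocation can run side by side on one connection:

```typescript
import * as signalR from "@microsoft/signalr";

async function main() {
  const connection = new signalR.HubConnectionBuilder()
    .withUrl("/stocks") // hypothetical hub endpoint
    .build();

  await connection.start();

  // Subscribe to a server-to-client stream...
  connection.stream<number>("StreamPrices", "MSFT").subscribe({
    next: (price) => console.log("tick:", price),
    complete: () => console.log("stream finished"),
    error: (err) => console.error(err),
  });

  // ...while still invoking other hub methods on the same
  // connection; the open stream does not block this call.
  const snapshot = await connection.invoke("GetSnapshot", "MSFT");
  console.log("snapshot:", snapshot);
}

main().catch(console.error);
```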
3- Let's say I am listening to an external real-time feed, e.g. a stock exchange; is it recommended to use a SignalR stream?
Streams are for streaming data triggered by a client action. You can still do unsolicited (not client-initiated) streaming by simply calling a method on the IHubContext.
4- According to this post, the writer lives within a Task.Run(). So how is this scalable if I need to push a real-time feed using streams to, let's say, 1000 clients? Are there any scalability concerns with using SignalR streams generally?
It scales fine. The Task.Run kicks off the stream, but you're never holding a thread hostage.
What would be the best option for exposing 220k records to third-party applications?
SF style 'bulk API' - independent of the standard API to maintain availability
server-side pagination
a callback to an FTP-generated file?
webhooks?
This bulk transfer will have to happen once a day or so. Any other suggestions welcome!
How are the 220k records being used?
Must serve it all at once
Not ideal for human consumers of this endpoint without special GUI considerations and communication.
A. I think that using a 'bulk API' would be marginally better than reading a file of the same data. (Not 100% sure on this.) Opening and interpreting a file might take a little bit more time than directly accessing data provided in an endpoint's response body.
Can send it in pieces
B. If only a small amount of data is needed at once, then server-side pagination should be used; it allows the consumer to request new batches of data as desired. This reduces unnecessary server load by not sending data that wasn't specifically requested.
C. If all of it needs to be received during a user session, then find a way to send the consumer partial information along the way. Users can often be temporarily satisfied with partial data while the rest loads, so update the client periodically with information as it arrives. Consider AJAX long-polling, HTML5 Server-Sent Events (SSE), or HTML5 WebSockets, as described here: What are Long-Polling, Websockets, Server-Sent Events (SSE) and Comet? (see the SSE sketch after this list). Tech-stack details and third-party requirements will likely limit your options. Make sure to communicate to users that the application is still working on the request until it is finished.
Can send less data
D. If the third party applications only need to show updated records, could a different endpoint be created for exposing this more manageable (hopefully) subset of records?
E. If the end-result is displaying this data in a user-centric application, then maybe a manageable amount of summary data could be sent instead? Are there user-centric applications that show 220k records at once, instead of fetching individual ones (or small batches)?
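For option C, here is what a Server-Sent Events endpoint might look like in Node.js/TypeScript; the batch payloads are made up for illustration, and the client would consume it with new EventSource("/records"):

```typescript
import { createServer } from "node:http";

createServer((req, res) => {
  res.writeHead(200, {
    "Content-Type": "text/event-stream",
    "Cache-Control": "no-cache",
  });

  let batch = 0;
  const timer = setInterval(() => {
    // Each SSE message is a "data:" line followed by a blank
    // line; the payload here is a made-up batch of records.
    res.write(`data: ${JSON.stringify({ batch: batch++, rows: [] })}\n\n`);
    if (batch === 5) {
      clearInterval(timer);
      res.end(); // all partial results delivered
    }
  }, 1000);

  // Stop producing if the client disconnects early.
  req.on("close", () => clearInterval(timer));
}).listen(3000);
```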
I would use a streaming API. This is an API that does a "select * from table" and then streams the results to the consumer. You do this using a for loop to fetch and output the records. This way you never use much memory, and as long as you frequently flush the output, the web server will not close the connection, and you will support any size of result set.
I know this works as I (shameless plug) wrote the mysql-crud-api that actually does this.
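mysql-crud-api is PHP, but the pattern itself is stack-agnostic. A sketch of the same idea in Node.js/TypeScript, with an async generator standing in for a real streaming database cursor:

```typescript
import { createServer } from "node:http";

// Stand-in for a streaming "select * from table" cursor: yields
// one row at a time, so the full 220k rows are never in memory.
async function* fetchRows() {
  for (let i = 0; i < 220_000; i++) {
    yield { id: i, name: `record-${i}` };
  }
}

createServer(async (req, res) => {
  res.writeHead(200, { "Content-Type": "application/x-ndjson" });
  for await (const row of fetchRows()) {
    // One JSON document per line; flushing each row keeps the
    // connection alive and memory usage flat.
    if (!res.write(JSON.stringify(row) + "\n")) {
      // Respect backpressure: wait for the socket to drain
      // before producing more rows.
      await new Promise((resolve) => res.once("drain", resolve));
    }
  }
  res.end();
}).listen(3000);
```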
I'm subscribing to live data with the Bloomberg API. Occasionally, it hangs on the call to session.Cancel(correlationID).
Anyone know why?
Where can I find documentation on the API?
I assume that you are talking about the .NET or Java API. In either case, you should be able to find documentation (PDFs) by running WAPI on a Bloomberg terminal.
The Bloomberg API can be run in two modes: synchronous and asynchronous. So if you've taken a code example from WAPI that happens to be synchronous, you will face delays in your application.
The modes differ in how data is accessed. For example:
the COM API in asynchronous mode first sends out the request in one procedure, and a separate procedure is called back when the data has been fetched and is ready, enabling the user to continue interacting with the GUI.
Synchronous mode handles the data request and the fetching in the same function, on the same thread, which causes the app to hang. It won't make a big difference for single-value return types, but large data sets can cause delays depending on your leased-line or internet bandwidth.
Is your question referring to Bloomberg's Excel Add-In or its API library releases for accessing live data? In either case, unless the data is widely available to the public, or you have a special subscription arrangement with Bloomberg or other data feeds that can be sourced through the terminal, you are going to run into limits on the amount of live data that you can gather in any single interval.
To answer your second question, you can access documentation for Bloomberg's Developer API here. And you can find documentation and resources for Bloomberg's API libraries / releases here.
The canonical example here is Twitter's API. I understand conceptually how the REST API works; essentially it's just a query to their server for your particular request, to which you then receive a response (JSON, XML, etc.). Great.
However, I'm not exactly sure how a streaming API works behind the scenes. I understand how to consume it: for example, with Twitter you listen for a response, and from the response you listen for data, in which the tweets come in chunks. You build the chunks up in a string buffer and wait for a line feed, which signifies the end of a tweet. But what are they doing to make this work?
Let's say I had a bunch of data and I wanted to set up a streaming API locally for other people on the net to consume (just like Twitter). How is this done, and with what technologies? Is this something Node.js could handle? I'm just trying to wrap my head around what they are doing to make this thing work.
Twitter's stream API is essentially a long-running request that's left open; data is pushed into it as and when it becomes available.
The repercussion of that is that the server will have to be able to deal with lots of concurrent open HTTP connections (one per client). A lot of existing servers don't manage that well; for example, Java servlet engines assign one thread per request, which can (a) get quite expensive and (b) quickly hit the normal max-threads setting, preventing subsequent connections.
As you guessed, the Node.js model fits the idea of a streaming connection much better than, say, a servlet model does. Both requests and responses are exposed as streams in Node.js, but they don't occupy an entire thread or process, which means you can continue pushing data into the stream for as long as it remains open without tying up excessive resources (although this is subjective). In theory you could have a lot of concurrent open responses connected to a single process and only write to each one when necessary.
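A toy version of that model in Node.js/TypeScript: many clients each hold an open chunked response, and a single process writes to all of them only when a new item arrives (the feed below is simulated):

```typescript
import { createServer, ServerResponse } from "node:http";

// Every connected client holds one long-lived chunked response.
const clients = new Set<ServerResponse>();

createServer((req, res) => {
  res.writeHead(200, { "Content-Type": "application/json" });
  clients.add(res);
  req.on("close", () => clients.delete(res));
}).listen(8080);

// Simulated feed; a real server would write whenever a new
// tweet/event actually arrives.
setInterval(() => {
  const item = JSON.stringify({ text: "hello", at: Date.now() });
  for (const res of clients) {
    // Newline-delimited messages, as in Twitter's streaming API:
    // consumers buffer the chunks and split on "\n".
    res.write(item + "\n");
  }
}, 1000);
```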
If you haven't looked at it already the HTTP docs for Node.js might be useful.
I'd also take a look at technoweenie's Twitter client to see what the consumer end of that API looks like with Node.js, the stream() function in particular.
I'm writing an API which is used to receive some data from another application. Currently the function is designed to block until data is received. In my mind this forces developers using the API into multithreading or some sort of multi-process design. So is it better for a function to block, or to return null and have the caller sleep for a few milliseconds before trying again?
Note the other application may not have any data to send through the API for an unknown period of time.
The API is written in C++.
Why not use a callback?
You could define the API to allow the user to pass an optional timeout value. If the timeout is not specified, then the API function waits indefinitely, much like how select() works.
Consider another option: use an async transaction -> issue a request and provide a callback address with a ticket ID. When the response is available, the service endpoint calls back into your application with the ticket ID and the result ;-)
You should avoid blocking as much as you possibly can.
As you say:
Note the other application may not have any data to send through the API for an unknown period of time.
In this case, using a synchronous interface ties up resources unnecessarily.
You haven't said what language this is, but it sounds like your API is listening or checking for some event, and the users of the API are either blocking or polling your API to determine if the event happened?
Is it possible to use a callback? Users of the API would register for notifications of the event happening, and when your library detects the event it will use the callback to notify all listeners.
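Combining the timeout and callback suggestions, the API could expose both styles. A sketch of the shape, written in TypeScript for brevity (the same design maps to C++ with std::function and a condition variable; all names are hypothetical):

```typescript
type Listener = (data: string) => void;

class Receiver {
  private listeners = new Set<Listener>();
  private waiters: Array<(data: string | null) => void> = [];

  // Called by the producing side when data arrives.
  push(data: string): void {
    for (const fn of this.listeners) fn(data);
    for (const resolve of this.waiters.splice(0)) resolve(data);
  }

  // Blocking style: resolves with data, or null on timeout;
  // with no timeout it waits indefinitely, like select().
  receive(timeoutMs?: number): Promise<string | null> {
    return new Promise((resolve) => {
      this.waiters.push(resolve);
      if (timeoutMs !== undefined) {
        setTimeout(() => {
          const i = this.waiters.indexOf(resolve);
          if (i >= 0) {
            this.waiters.splice(i, 1);
            resolve(null); // timed out with no data
          }
        }, timeoutMs);
      }
    });
  }

  // Callback style: nothing blocks; every registered listener
  // is notified as data arrives. Returns an unsubscribe function.
  onData(fn: Listener): () => void {
    this.listeners.add(fn);
    return () => this.listeners.delete(fn);
  }
}

// Usage: the consumer picks whichever style fits.
const rx = new Receiver();
rx.onData((d) => console.log("callback got:", d));
rx.receive(1000).then((d) => console.log("receive got:", d));
setTimeout(() => rx.push("payload"), 100); // simulated producer
```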
When your application calls the OS API function read(), do you expect it to block? Of course you do—at least by default. In some circumstances, ioctls allow a programmer to change the behavior to be asynchronous, which is particularly common in network applications.
You've shed very little light on what your API is about, so consider:
Does it make sense that an API user would want to be blocked? That is, is there little to do until it returns?
If you were writing an application for the API, what would you expect it to do? You should definitely write a few sample applications for your own education, as well as to document the API.
Is there any reason why the API user would not multithread (or fork, etc.) requests to the API?
If you want a reusable solution, you could apply the Asynchronous Design 'Pattern', which is common in .NET but can also be implemented in C++, as demonstrated in this CodeProject project.
There's nothing wrong with providing both synchronous and asynchronous calls to the same feature in the interface.
Personally, I would only go to these lengths if I needed to service multiple requests (in which case you can queue 'BeginOperation' requests, for example), or if there were many potentially asynchronous operations in the interface (and I wanted a standardised, flexible pattern). If you can only handle one request at a time, a time-out is usually sufficient.