I'm having difficulty streaming data to a custom spaCy model. I have a real-time data stream from Twitter and would like to process the data as it arrives. I have read through the spaCy documentation, but nothing I found explains how to process real-time data. I already have code to store the processed data in JSON format, but I need help setting up a way to process the data as it comes in.
Related
I am trying to figure out how to record timestamped sensor data to an instance of VictoriaMetrics. I have an embedded controller with a sensor that is read once per second. I would like VictoriaMetrics to poll the controller once a minute, and log all 60 readings with their associated timestamps into the TSDB.
I have the server and client running, and measuring system metrics is easy, but I can't find an example of how to get a batch of sensor readings to be reported by the embedded client, nor have I been able to figure it out from the docs.
Any insights are welcome!
VictoriaMetrics supports data ingestion via various protocols. All these protocols support batching, i.e. multiple measurements can be sent in a single request. So you can choose the best suited protocol for inserting batches of collected measurements into VictoriaMetrics. For example, if Prometheus text exposition format is selected for data ingestion, then a batch of metrics could look like the following:
measurement_name{optional="labels"} value1 timestamp1
...
measurement_name{optional="labels"} valueN timestampN
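As a rough illustration of the batch layout above, here is a minimal Python sketch that renders 60 one-second readings as exposition lines. The metric name, the `device` label, the start time and the sample values are all made up for the example, and it assumes timestamps are written as milliseconds since the Unix epoch, as in the Prometheus text format.

```python
def format_batch(name, labels, readings):
    """Render (timestamp_seconds, value) pairs as text exposition lines."""
    label_str = ",".join(f'{k}="{v}"' for k, v in sorted(labels.items()))
    selector = f"{name}{{{label_str}}}" if label_str else name
    lines = []
    for ts_seconds, value in readings:
        # Assumption: timestamps are milliseconds since the Unix epoch.
        ts_ms = int(round(ts_seconds * 1000))
        lines.append(f"{selector} {value} {ts_ms}")
    return "\n".join(lines) + "\n"


# Hypothetical data: one reading per second for one minute.
start = 1_700_000_000
readings = [(start + i, 20.0 + i * 0.5) for i in range(60)]
batch = format_batch("sensor_temperature_celsius", {"device": "controller1"}, readings)
print(batch.splitlines()[0])
```

The resulting string is what would go into the body of a single ingestion request, so one request carries the whole minute of readings with their original timestamps.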
VictoriaMetrics can poll (scrape) metrics from a configured address via HTTP. It expects the application to return metric values in the text exposition format. This format is compatible with Prometheus, so Prometheus client libraries for different languages work with VictoriaMetrics as well.
There is also a how-to guide on instrumenting a Go application to expose metrics and having VictoriaMetrics scrape them. It describes the monitoring basics for any service or application.
I have a client/server audio synthesizer where the server (Java) dynamically generates an audio stream (Ogg/Vorbis) to be rendered by the client using an HTML5 audio element. Users can tweak various parameters and the server immediately alters the output accordingly. Unfortunately, the audio element buffers (prefetches) very aggressively, so changes made by the user won't be heard until minutes later, literally.
Trying to disable preload has no effect, and apparently this setting is only 'advisory', so there's no guarantee that its behavior would be consistent across browsers.
I've been reading everything that I can find on WebRTC and the evolving WebAudio API and it seems like all of the pieces I need are there but I don't know if it's possible to connect them up the way I'd like to.
I looked at RTCPeerConnection. It does provide low latency, but it brings in a lot of baggage that I don't want or need (STUN, ICE, offer/answer, etc.), and currently it seems to support only a limited set of codecs, mostly geared towards voice. Also, since the server side is in Java, I think I'd have to do a lot of work to teach it to 'speak' the various protocols and formats involved.
AudioContext.decodeAudioData works great for a static sample, but not for a stream since it doesn't process the incoming data until it's consumed the entire stream.
What I want is the exact functionality of the audio tag (i.e. HTMLAudioElement) without any buffering. If I could somehow create a MediaStream object that uses the server URL for its input then I could create a MediaStreamAudioSourceNode and send that output to context.destination. This is not very different than what AudioContext.decodeAudioData already does, except that method creates a static buffer, not a stream.
I would like to keep the Ogg/Vorbis compression and eventually use other codecs, but one thing that I may try next is to send raw PCM and build audio buffers on the fly, just as if they were being generated programmatically by JavaScript code. But again, I think all of the parts already exist, and if there's any way to leverage that I would be most thrilled to know about it!
Thanks in advance,
Joe
How are you getting on? Did you resolve this question? I am solving a similar challenge. On the browser side I'm using the Web Audio API, which has nice ways to render streaming input audio data, and Node.js on the server side, using WebSockets as the middleware to send streaming PCM buffers to the browser.
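To make the "streaming PCM buffers" idea a bit more concrete, here is a small sketch of the server-side half only, written in Python rather than Java or Node.js. It assumes mono, 16-bit signed little-endian samples and uses a sine wave as a stand-in for the synthesizer; the function name and all parameter values are hypothetical. The point is that each chunk covers only a few milliseconds, so a parameter change can take effect in the next chunk.

```python
import math
import struct


def pcm_chunks(freq_hz, sample_rate, chunk_frames, n_chunks):
    """Yield successive chunks of mono 16-bit little-endian PCM for a sine tone."""
    phase_step = 2 * math.pi * freq_hz / sample_rate
    frame = 0
    for _ in range(n_chunks):
        samples = [
            int(32767 * math.sin(phase_step * (frame + i)))
            for i in range(chunk_frames)
        ]
        frame += chunk_frames
        # "<h" = little-endian signed 16-bit integer, one per sample.
        yield struct.pack(f"<{chunk_frames}h", *samples)


# 480 frames at 48 kHz is 10 ms of audio per chunk.
chunks = list(pcm_chunks(freq_hz=440.0, sample_rate=48_000, chunk_frames=480, n_chunks=5))
print(len(chunks), len(chunks[0]))
```

Each yielded `bytes` object could be sent as one binary message; the receiving side would convert the samples to floats and schedule them for playback.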
What's the best way to store custom messages in the queue? I mean, if I have a queue that can store different types of messages, should I store them in binary format or as JSON?
What do you think?
Windows Azure Storage Client Library provides overloads for binary and string that handle encoding for you. As such you can make use of any serialization mechanism you like, given that the serialized form is less than 64 KB.
Hence, the answer to your question actually depends on your specific scenario. Handling JSON data would be much easier, but if you have a specific need to send the data in another format, please consider such alternatives. For larger scenarios some users augment queue messages to simply point to blob or table storage as a more flexible and verbose option while using the queue messages to provide for reliable message delivery.
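One way to handle several message types with JSON is to wrap each payload in a small envelope that names its type. The sketch below is only an illustration: the `type` and `payload` field names and the helper functions are invented here, not part of any Azure library, and the size check simply mirrors the "less than 64 KB" limit mentioned above.

```python
import json

MAX_QUEUE_MESSAGE_BYTES = 64 * 1024  # limit cited in the answer above


def encode_message(message_type, payload):
    """Serialize a typed message to UTF-8 JSON, rejecting oversized bodies."""
    envelope = {"type": message_type, "payload": payload}
    body = json.dumps(envelope, separators=(",", ":")).encode("utf-8")
    if len(body) >= MAX_QUEUE_MESSAGE_BYTES:
        # Too large for the queue: store the data elsewhere (e.g. a blob)
        # and enqueue a pointer to it instead.
        raise ValueError(f"message is {len(body)} bytes, too large for the queue")
    return body


def decode_message(body):
    """Return (message_type, payload) from a body produced by encode_message."""
    envelope = json.loads(body.decode("utf-8"))
    return envelope["type"], envelope["payload"]


body = encode_message("order_created", {"order_id": 42})
print(decode_message(body))
```

The consumer can then dispatch on the type string before interpreting the payload.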
Basically, I want to understand, from both a high-level and a technical point of view, what constitutes a streaming API. There is all sorts of information available, but I could not find a satisfactory explanation of what a streaming API is, or of how it differs from general APIs (REST, if applicable).
PS: I am not asking about multimedia streaming.
Kind of a vague question. I guess streaming usually means one of the following (or a combination):
downloading data for immediate consumption, rather than a whole file for storage, potentially with support for delivering partial data (lower quality, only relevant pieces etc), sometimes even without any storage at all in between producer and consumer
a persistent connection that continues to deliver new data as it becomes available, rather than having the client poll
A good example (for the first pattern) are streaming XML parsers (such as SAX). They allow you to handle XML data that is too big to fit into memory (which a DOM parser likes to do).
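For the SAX case, a minimal sketch using Python's standard `xml.sax` module looks like this. The `item` element name and the generated document are made up for the example. The handler sees one event at a time, so the parser never has to build a full in-memory tree.

```python
import io
import xml.sax


class ItemCounter(xml.sax.ContentHandler):
    """Counts <item> elements as the parser reports them."""

    def __init__(self):
        super().__init__()
        self.count = 0

    def startElement(self, name, attrs):
        if name == "item":
            self.count += 1


def count_items(stream):
    """Parse XML from a file-like object incrementally and count items."""
    handler = ItemCounter()
    xml.sax.parse(stream, handler)
    return handler.count


# Hypothetical input; in practice this would be a large file or socket stream.
xml_bytes = b"<items>" + b"<item/>" * 1000 + b"</items>"
print(count_items(io.BytesIO(xml_bytes)))
```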
I just found another good answer here:
https://www.quora.com/What-is-meant-by-streaming-API
A streaming API differs from a normal REST API in that it leaves the HTTP connection open for as long as possible (i.e. a "persistent connection"). It pushes data to the client as and when it becomes available, so the client does not need to poll the server for newer data. Maintaining a persistent connection reduces network latency significantly when a server produces a continuous stream of data, as today's social media channels do. These APIs are mostly used to read or subscribe to data.
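From the client's point of view, consuming such a connection usually amounts to reading records one by one as they arrive. The sketch below assumes the server sends one JSON object per line, with blank lines as keep-alives; that framing is an assumption for illustration, not something every streaming API uses. An in-memory buffer stands in for the open connection.

```python
import io
import json


def iter_events(stream):
    """Yield one decoded event per non-empty line read from the stream."""
    for raw_line in stream:
        line = raw_line.strip()
        if not line:
            continue  # skip blank keep-alive lines
        yield json.loads(line)


# Stand-in for a long-lived HTTP response body.
fake_connection = io.BytesIO(b'{"id": 1}\n\n{"id": 2}\n')
ids = [event["id"] for event in iter_events(fake_connection)]
print(ids)
```

With a real connection the loop would simply block until the server pushes the next line, instead of issuing a new request for each update.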
I have limited knowledge of WCF and of sending binary data via WCF, so this question may be somewhat rudimentary.
I would like to know the difference between sending data with BinaryMessageEncodingBindingElement and MtomMessageEncodingBindingElement. It is still not clear to me when to use which approach after reading this page from MSDN on Large Data and Streaming.
Also, a small question: are a message with attachments and an MTOM message the same thing?
MTOM is a standard that uses multipart MIME-encoded messages to send the large portions of a message as raw binary, since they would be too expensive to base64 encode. The SOAP message itself is sent as the initial part and contains references to the binary parts, which a web service stack such as WCF can then pull back together into a single representation of the message.
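The cost of base64 is easy to check: every 3 bytes of input become 4 characters of output, so the payload grows by about a third before any XML overhead. A quick Python check, with an arbitrary 3 MB payload chosen for the example:

```python
import base64

payload = bytes(3_000_000)           # 3 MB of binary data (all zeros here)
encoded = base64.b64encode(payload)  # what a text-encoded message would carry

overhead = len(encoded) / len(payload)
print(len(payload), len(encoded), round(overhead, 3))
```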
Binary encoding is entirely proprietary to WCF and really doesn't just have to do with large messages. It presents a binary representation of the XML Infoset which is far more compact across the wire and faster to parse than text based formats. If you happen to be sending large binary chunks of data then it just fits right in with the other bytes that are being sent.
Streaming can be used with any message format. It is more about when the data is written to the network versus being buffered entirely in memory before being handed to the network transport. Smaller messages make more sense to buffer before sending, while larger messages, especially those containing large binary chunks or streams, need to be streamed or they will exhaust memory resources.
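The buffered versus streamed distinction is not specific to WCF. As a language-neutral illustration (this is plain Python, not WCF code, and the 64 KB chunk size is an arbitrary choice), a streamed transfer reads and forwards fixed-size chunks so that only one chunk is held in memory at a time:

```python
import io


def iter_chunks(stream, chunk_size=64 * 1024):
    """Yield successive chunks from a file-like object until it is exhausted."""
    while True:
        chunk = stream.read(chunk_size)
        if not chunk:
            break
        yield chunk


# Stand-in for a large file or network source.
source = io.BytesIO(bytes(1_000_000))
sizes = [len(chunk) for chunk in iter_chunks(source)]
print(len(sizes), max(sizes), sum(sizes))
```

A buffered transfer would instead call `stream.read()` once and hold the entire payload in memory before sending anything.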