NAudio: Recording Audio-Card's Actual Output

NAudio: Recording Audio-Card's Actual Output - naudio

I successfully use WasapiLoopbackCapture() for recording audio played on system, but I'm looking for a way to record what the user would actually hear through the speakers.
I'll explain: If a certain application plays music, WASAPI Loopback shall intercept music samples, even if Windows main volume-control is set to 0, meaning: even if no sound is actually heard through audio-card's output-jack (speakers/headphone/etc).
I'd like to intercept the audio actually "reaching" the output-jack (after ALL mixers on the audio-path have "done their job").
Is this possible using NAudio (or other infrastructure)?
A code-sample or a link to a such could come in handy.
Thanks much.

No, this is not directly possible. The loopback capture provided by WASAPI is the stream of data being sent to the audio hardware. It is the hardware that controls the actual output sound, and this is where the volume level is applied to change the output signal strength. Apart from some hardware- and driver-specific options - or some interesting hardware solutions like loopback cables or external ADC - there is no direct method to get the true output data.
One option is to get the volume level from the mixer and apply it as a scaling factor on any data you receive from the loopback stream. This is not a perfect solution, but possibly the best you can do without specific hardware support.

Related

Reduce Freeswitch video conference latency

We're experimenting with a Freeswitch based multiparty video conferencing solution (Zoom like). The users are connecting via WebRTC (Verto clients) and the streams are all muxed and displayed on the canvas (mod_conference in mux mode). It works OK, but we notice high media latency for mixed output and this makes it very difficult to have a real-time dialogue. This is not load related, even with only 1 caller watching himself on the canvas (the mux conference output), it takes almost 1 second to see a local move being reflected on the screen (e.g. if I raise my hand I can see it happening on the screen after almost 1 second ). This is obviously the roundtrip delay, but after discarding the intrinsic network latency (measured to be about 100 ms roundtrip) there seem to be around 800-900 ms added latency. There's no TURN relaying involved. It seems this is being introduced along the buffering/ transcoding/ muxing pipeline. Any suggestions please what to try to reduce the latency? What sort of latency should we expect, what's your experience, has anyone deployed a Freeswitch video conferencing with acceptable latency for bidirectional, real time conversations? Ultimately I'm trying to understand if Freeswitch can be used for a multiparty real time video conversation or I should give up look for something else. Thanks!

Stream html5 camera output

does anyone know how to stream html5 camera output to other users.
If that's possible should I use sockets, images and stream them to the users or other technology.
Is there any video tutorial where I can take a look about it.
Many thanks.

The two most common approaches now are most likely:
stream from the source to a server, and allow users connect to the server to stream to their devices, typically using some form of Adaptive Bit Rate streaming protocol (ABR - basically creates multiple bit rate versions of your content and chunks them, so the client can choose the next chunk from the best bit rate for the device and current network conditions).
Stream peer to peer, or via a conferencing hub, using WebRTC
In general, the latter is more focused towards real time, e.g. any delay should be below the threshold which would interfere with audio and video conferences, usually less than 200ms for audio for example. To achieve this it may have to sacrifice quality sometimes, especially video quality.
There are some good WebRTC samples available online (here at the time of writing): https://webrtc.github.io/samples/

Play audio stream using WebAudio API

I have a client/server audio synthesizer where the server (java) dynamically generates an audio stream (Ogg/Vorbis) to be rendered by the client using an HTML5 audio element. Users can tweak various parameters and the server immediately alters the output accordingly. Unfortunately the audio element buffers (prefetches) very aggressively so changes made by the user won't be heard until minutes later, literally.
Trying to disable preload has no effect, and apparently this setting is only 'advisory' so there's no guarantee that it's behavior would be consistent across browsers.
I've been reading everything that I can find on WebRTC and the evolving WebAudio API and it seems like all of the pieces I need are there but I don't know if it's possible to connect them up the way I'd like to.
I looked at RTCPeerConnection, it does provide low latency but it brings in a lot of baggage that I don't want or need (STUN, ICE, offer/answer, etc) and currently it seems to only support a limited set of codecs, mostly geared towards voice. Also since the server side is in java I think I'd have to do a lot of work to teach it to 'speak' the various protocols and formats involved.
AudioContext.decodeAudioData works great for a static sample, but not for a stream since it doesn't process the incoming data until it's consumed the entire stream.
What I want is the exact functionality of the audio tag (i.e. HTMLAudioElement) without any buffering. If I could somehow create a MediaStream object that uses the server URL for its input then I could create a MediaStreamAudioSourceNode and send that output to context.destination. This is not very different than what AudioContext.decodeAudioData already does, except that method creates a static buffer, not a stream.
I would like to keep the Ogg/Vorbis compression and eventually use other codecs, but one thing that I may try next is to send raw PCM and build audio buffers on the fly, just as if they were being generated programatically by javascript code. But again, I think all of the parts already exist, and if there's any way to leverage that I would be most thrilled to know about it!
Thanks in advance,
Joe

How are you getting on ? Did you resolve this question ? I am solving a similar challenge. On the browser side I'm using web audio API which has nice ways to render streaming input audio data, and nodejs on the server side using web sockets as the middleware to send the browser streaming PCM buffers.

Is it possible to programmatically power on/off the 3V3?

I have a Netduino Plus with at transeiver attached via SPI. I would like to reset the transiever every time the Netduino restarts. Is it possible to programmatically power on/off the 3V3 pin?

I would recommend using a FET (controlled by one of the I/O) pins to enable/disable 3V3 power to your transceiver. When you say transceiver, I think "more than a few mA" :)
BTW, we took this feedback into account with the new Shield Base module for Netduino Go. It has an integrated FET on both 3V3 and 5V power headers, so you could enable/disable power to your shield in code. Once the new Ethernet go!bus module ships and the Shield Base comes out of beta (soon), your solution can be redeployed to Netduino Go + Shield Base with few/no code changes.
Chris
Secret Labs LLC

Looking at the circuit diagram ( http://www.netduino.com/netduinoplus/schematic.pdf ), I can see only the Micro SD Card Slot having its power controlled programmatically. You could rig up a relay to control it (via a transistor, of course) instead, or if the transceiver uses less than 130mA (the current limit of the device shown: http://www.datasheetarchive.com/BSS84W-7-F-datasheet.html) you could copy the circuit from the Netduino Plus. Buying a relay shield looks like overkill, but you might have other uses for it.
Have you looked into resetting the transceiver programmatically instead of the brute-force method of power-cycling it?

Just to provide another view. You could use a transistor powered off the netduino RESET line, this will reset the device every time the netduino reboots. Or you can just link the transistor to a spare digital pin and power it in code..

What specific SPI device are you using? You mention that it's a transceiver but we could probably provide better information if we know the exact part number. If your device requires less than 8mA the Netduino Plus specs seem to indicate that one option could be using a digital output pin as the power source.
Unfortunately Secret Labs don't use exactly the language I'd expect and call out the sink and source current maximums so I would contact them directly first to see if you risk blowing your chip. I'll see if I can get an answer from them and amend this post if/when I do.
Update: Sink and source current is the same on the Netduino. See my post on their forums about sink vs. source current for a more in depth explanation. So, if your device can run off of just a few milliamps you should be able to use a digital I/O pin to power it.
Also, a lot of devices have enable pins. You can usually reset them with that line instead of pulling the power if that helps. Sometimes with flaky hardware it is better to pull the power though.

Symbian/S60 audio playback rate

I would like to control the playback rate of a song while it is playing. Basically I want to make it play a little faster or slower, when I tell it to do so.
Also, is it possible to playback two different tracks at the same time. Imagine a recording with the instruments in one track and the vocal in a different track. One of these tracks should then be able to change the playback rate in "realtime".
Is this possible on Symbian/S60?

It's possible, but you would have to:
Convert the audio data into PCM, if it is not already in this format
Process this PCM stream in the application, in order to change its playback rate
Render the audio via CMdaAudioOutputStream or CMMFDevSound (or QAudioOutput, if you are using Qt)
In other words, the platform itself does not provide any APIs for changing the audio playback rate - your application would need to process the audio stream directly.
As for playing multiple tracks together, depending on the device, the audio subsystem may let you play two or more streams simultaneously using either of the above APIs. The problem you may have however is that they are unlikely to be synchronised. Your app would probably therefore have to mix all of the individual tracks into one stream before rendering.

We Keep Coding

sql objective-c vba vb.net react-native apache vue.js tensorflow api pandas