Reading speaker output and converting it to data - NAudio

I want to read the PC's music output and extract basic information (beat/tone/...) from the song being played (then flash the lights accordingly, etc.). Can NAudio be used for this purpose, and are there any samples? Sorry for the rather general question at this point.
TIA
-d

You can access your PC's output using WasapiLoopbackCapture. However, NAudio does not include a beat detection algorithm, so you'd need to find one yourself. There is an FFT class, though, which could be used to determine which frequencies are present.
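For illustration, here's a minimal C# sketch, assuming the NAudio NuGet package and that the device's shared mix format is 32-bit IEEE float (the WASAPI default). It captures whatever is playing on the default output device and runs NAudio's built-in FFT over it:

    // Minimal sketch: loopback-capture the default output and FFT it.
    using System;
    using NAudio.Dsp;  // Complex, FastFourierTransform
    using NAudio.Wave; // WasapiLoopbackCapture

    class LoopbackFft
    {
        const int FftLength = 1024; // must be a power of two
        const int FftPower = 10;    // log2(FftLength)
        static readonly Complex[] fftBuffer = new Complex[FftLength];
        static int fftPos;

        static void Main()
        {
            using var capture = new WasapiLoopbackCapture(); // default render device
            capture.DataAvailable += (s, e) =>
            {
                // Samples arrive as interleaved 32-bit floats; for a rough
                // spectrum it's fine to treat them as one stream.
                for (int i = 0; i < e.BytesRecorded; i += 4)
                {
                    float sample = BitConverter.ToSingle(e.Buffer, i);
                    fftBuffer[fftPos].X = (float)(sample *
                        FastFourierTransform.HammingWindow(fftPos, FftLength));
                    fftBuffer[fftPos].Y = 0;
                    if (++fftPos == FftLength)
                    {
                        fftPos = 0;
                        FastFourierTransform.FFT(true, FftPower, fftBuffer);
                        // fftBuffer[k] now holds the complex spectrum; its
                        // magnitude sqrt(X*X + Y*Y) is the energy at frequency
                        // k * SampleRate / FftLength. A beat detector would
                        // watch the low bins for periodic energy spikes.
                    }
                }
            };
            capture.StartRecording();
            Console.ReadLine(); // capture runs on a background thread
            capture.StopRecording();
        }
    }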

Related

Reading HDMI data as streamed data

TLDR: I want some way to read HDMI data directly in code, rather than output it to a monitor. Doesn't really matter what language.
Background:
A friend of mine was curious about trying to write an AI for a video game we play, so we were discussing how to do it. At first he suggested using a camera to read the screen and then feed that to the AI, but I was thinking there'd be less latency if you could just pull the visual data directly from the HDMI cable instead of adding another camera to the mix.
There seem to be some similar questions here, but all the ones I've found were unanswered.
After much googling, I'm still unclear on whether it's even possible. There are some USB data analyzers, so if you could convert HDMI to USB, that's a possibility. I'm just trying to figure out:
A) What I'm trying to do is possible.
B) What the best way to do it is.
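One route worth sketching: most inexpensive HDMI-to-USB capture cards enumerate as standard UVC webcams, so any camera API can read the frames. A hedged C# sketch, assuming such a card and the OpenCvSharp4 package (the device index is a guess):

    // Sketch: read HDMI frames via an HDMI-to-USB capture card that
    // presents itself as a UVC webcam. Assumes the OpenCvSharp4 package.
    using OpenCvSharp;

    class HdmiGrab
    {
        static void Main()
        {
            using var cap = new VideoCapture(0); // index of the capture card
            using var frame = new Mat();
            while (cap.Read(frame) && !frame.Empty())
            {
                // frame is now a BGR bitmap of the HDMI signal;
                // hand it to the game-playing AI here.
            }
        }
    }

This keeps a physical camera out of the loop, though the capture card itself still adds some latency.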

Problems with NAudio

I am currently using NAudio to do the audio output. Everything is fine except there is an annoying echo in the background. How can I eliminate such noise?
Thanks,
Adam
If you are not using a headset with a network chat program, then the received audio can get picked up by the microphone again, resulting in an annoying echo. There are some fairly complex echo suppression algorithms that programs like Skype use to detect and eliminate these echoes. Unfortunately, NAudio does not include such an algorithm, so you'd need to find a third-party one, or write your own.
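To show where such an algorithm would plug in, here is a hedged sketch of the capture side. EchoCanceller and its methods are hypothetical stand-ins for whatever third-party AEC you adopt; the NAudio pieces (WaveInEvent etc.) are real:

    // Sketch: where an acoustic echo canceller (AEC) slots into an NAudio
    // capture path. EchoCanceller is hypothetical; a real one would come
    // from a third-party library (e.g. a Speex or WebRTC binding).
    using System;
    using NAudio.Wave;

    class ChatCapture
    {
        static readonly EchoCanceller aec = new EchoCanceller();

        static void Main()
        {
            var micIn = new WaveInEvent { WaveFormat = new WaveFormat(16000, 16, 1) };
            micIn.DataAvailable += (s, e) =>
            {
                // The AEC compares what the mic heard against what we
                // recently played, and subtracts the estimated echo.
                byte[] clean = aec.ProcessMicFrame(e.Buffer, e.BytesRecorded);
                SendToNetwork(clean); // your existing send path
            };
            micIn.StartRecording();
            Console.ReadLine();
        }

        // Call this with every buffer you play out, so the AEC knows the far end.
        public static void OnAudioPlayed(byte[] played, int count)
            => aec.FeedFarEnd(played, count);

        static void SendToNetwork(byte[] frame) { /* ... */ }
    }

    // Hypothetical interface only; shown to make the data flow concrete.
    class EchoCanceller
    {
        public void FeedFarEnd(byte[] played, int count) { }
        public byte[] ProcessMicFrame(byte[] mic, int count)
        {
            var copy = new byte[count];
            Array.Copy(mic, copy, count);
            return copy; // a real AEC returns the echo-suppressed frame
        }
    }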

Core Audio CAPlayThrough example with interface and guitar quality

I'm playing around with the Core Audio CAPlayThrough example provided by Apple. I'm not doing anything fancy, just attempting to get my guitar to pass through an audio interface (M-Audio Fast Track Pro) into my computer and then back out of the interface into my headphones. I'm getting some audio to pass through, but the quality is terrible. I have the sample rate set to 48k on both the input and output. Is there something I'm missing? I suspect it may be an issue with the bit rate, but I'm not sure how to change that. Any guesses as to what may be causing this quality issue?

Objective-C play sound

I know how to play mp3 files and whatnot in Xcode iOS. But how do I play a specific frequency? For example, if I just wanted to emit a C# note for 25 seconds, how might I do that? (The synth isn't as important to me as just the pitch of the note.)
You need to generate the PCM audio waveform that corresponds to the note you want to play and store that into a sample buffer in memory. Then you send that buffer to the audio hardware.
Here is a tutorial on generating waveforms of several types. The article goes into some detail on the many aspects of a note you need to consider, including frequency, volume, waveform shape, sampling rate, etc. The article comes with Flash source code, but you should have no problem taking the concepts and adapting them to iOS.
If you also need a library that you can use to play the generated buffers on iOS, then I recommend the open source Finch.
I hope this helps!
You can synthesize waveforms of your desired frequency and feed them to the callbacks of either the Audio Queue or the RemoteIO Audio Unit API.
Here is a short tutorial on some of the code needed to create sine wave tones for iOS in C.
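The synthesis math itself is platform-independent; as a hedged illustration, here is the same idea in C# (C#5 is roughly 554.37 Hz), filling a 16-bit PCM buffer that you would then hand to whichever playback API you use:

    // Fill a buffer with 25 seconds of a C#5 sine tone as 16-bit PCM.
    using System;

    const int sampleRate = 44100;
    const double frequency = 554.37; // C#5 in Hz
    const double amplitude = 0.25;   // stay well below clipping

    var buffer = new short[sampleRate * 25]; // 25 seconds, mono
    for (int n = 0; n < buffer.Length; n++)
    {
        // sin(2*pi*f*t) sampled at t = n / sampleRate, scaled to 16-bit range
        double t = (double)n / sampleRate;
        buffer[n] = (short)(amplitude * short.MaxValue *
                            Math.Sin(2 * Math.PI * frequency * t));
    }
    // On iOS you'd copy slices of this buffer into the Audio Queue or
    // RemoteIO render callback; the loop looks the same in C.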

ZyXEL ADPCM codec

I have a ZyXEL USB Omni56K Duo modem and want to send and receive voice streams on it, but to reach adequate quality I probably need to implement the "ZyXEL ADPCM" encoding, because plain PCM provides too low a sampling rate to transmit even medium-quality voice, and it doesn't work over USB either (probably because even that bitrate is too high for the modem's USB-serial converter).
This mysterious codec appears in all the Microsoft WAV-related libraries as one of many theoretically supported codecs, but I have found no implementations.
Can someone offer an implementation in any language or maybe some documentation? Writing a custom mu-law decoding algorithm won't be a problem for me.
Thanks.
I'm not sure how ZyXEL ADPCM differs from other flavors of ADPCM, but various ADPCM implementations can be found with some Google searches.
However, the real reason for my post is to ask why the choice of ADPCM. ADPCM is adaptive differential pulse-code modulation. This means that the data being passed is the difference between samples, not the current value (which is also why you see such great compression). In a clean environment with no bit loss (i.e., a disk drive), this is fine. However, in a streaming environment, it's generally assumed that bits may be periodically mangled. With any bit damage to the data, you'll be hearing static or other audio artifacts very quickly, and usually fairly badly.
ADPCM's reset mechanism isn't frame-based, which means the audio problems can go on for an extended period of time, depending on the encoder. The reset code is usually a run of zeros (16 comes to mind, but it's been years since I wrote my own ports).
ADPCM in the telephony environment usually converts a 12-bit PCM sample to a 4-bit ADPCM sample (not bad). As for audio quality... not bad for phone conversations and the spoken word, but most people, in a blind test, can easily detect the quality drop.
In your last sentence, you throw a curveball into the question: you start mentioning mu-law. Mu-law is a PCM encoding that takes a 12-bit sample and transforms it, using a logarithmic scale, into an 8-bit sample. This is the typical compression mechanism for TDM (phone) networks in North America (most of the rest of the world uses a similar algorithm called A-law).
So, I'm confused about what you are actually trying to find.
You also mentioned Microsoft and WAV implementations. You probably know this, but just in case: WAV is just a wrapper around the audio data that provides format, sampling rate, channel count, size, and other useful information. Without WAV, AU, or another wrapper involved, mu-law and ADPCM are usually presented as raw data.
One other tip if you are implementing ADPCM. As I indicated, it uses 4 bits to represent a 12-bit sample. It gets away with this by both sides sharing a multiplier table. Your position in the table changes based on the 4-bit value (in other words, the value is both multiplied against a step size and used to figure out the new step size). I've seen a variety of algorithms use slightly different tables (no idea why, but you typically see the sent and received signals slowly stray off the bias). One of the older, popular sound packages was different from what I typically saw from the telephony hardware vendors.
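To make the step-table mechanism concrete, here is a C# sketch of the per-nibble decode from the widely documented IMA ADPCM flavor (the tables are from the published IMA spec; ZyXEL's variant may well use different ones, per the caveats above):

    // Sketch of an IMA ADPCM decoder's inner step: 4-bit code -> 16-bit sample.
    // Real streams also carry initial predictor/index values in block headers.
    using System;

    class ImaAdpcmDecoder
    {
        static readonly int[] IndexAdjust = { -1, -1, -1, -1, 2, 4, 6, 8 };
        static readonly int[] StepTable =
        {
            7, 8, 9, 10, 11, 12, 13, 14, 16, 17, 19, 21, 23, 25, 28, 31,
            34, 37, 41, 45, 50, 55, 60, 66, 73, 80, 88, 97, 107, 118, 130, 143,
            157, 173, 190, 209, 230, 253, 279, 307, 337, 371, 408, 449, 494, 544,
            598, 658, 724, 796, 876, 963, 1060, 1166, 1282, 1411, 1552, 1707,
            1878, 2066, 2272, 2499, 2749, 3024, 3327, 3660, 4026, 4428, 4871,
            5358, 5894, 6484, 7132, 7845, 8630, 9493, 10442, 11487, 12635,
            13899, 15289, 16818, 18500, 20350, 22385, 24623, 27086, 29794, 32767
        };

        int predictor;  // last decoded sample
        int stepIndex;  // current position in StepTable

        public short DecodeNibble(int nibble)
        {
            int step = StepTable[stepIndex];

            // Reconstruct the difference: each bit of the nibble adds a
            // fraction of the current step size (this is the "multiplier").
            int diff = step >> 3;
            if ((nibble & 4) != 0) diff += step;
            if ((nibble & 2) != 0) diff += step >> 1;
            if ((nibble & 1) != 0) diff += step >> 2;
            if ((nibble & 8) != 0) diff = -diff;

            predictor = Math.Clamp(predictor + diff, short.MinValue, short.MaxValue);

            // The same nibble also moves us through the step table, which is
            // how encoder and decoder stay in sync without absolute values.
            stepIndex = Math.Clamp(stepIndex + IndexAdjust[nibble & 7],
                                   0, StepTable.Length - 1);

            return (short)predictor;
        }
    }

Feed it each 4-bit code in order and it emits the reconstructed 16-bit samples; if the two sides' tables differ, you get exactly the slow drift described above.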
And, for more trivia: there are multiple flavors of ADPCM. The variances involve the table, the source sample size, and the destination sample size, but I've never had a need to work with them; they're just documented flavors I found when searching the internet for specifications for the various audio formats used in telephony.
Piping your PCM through ffmpeg will likely work:

    ffmpeg -f u16le -i - -f wav -acodec adpcm_ms -
http://ffmpeg.org/
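As a hedged sketch of driving that pipe from C# (ffmpeg is assumed to be on the PATH; the file names are placeholders):

    // Pipe raw PCM through ffmpeg and collect the ADPCM-in-WAV output.
    using System.Diagnostics;
    using System.IO;

    byte[] rawPcm = File.ReadAllBytes("input.raw"); // placeholder input file

    var psi = new ProcessStartInfo("ffmpeg",
        "-f u16le -i - -f wav -acodec adpcm_ms -")
    {
        RedirectStandardInput = true,
        RedirectStandardOutput = true,
        UseShellExecute = false,
    };
    using var ffmpeg = Process.Start(psi)!;

    // Read stdout concurrently while writing stdin so neither pipe stalls.
    var output = new MemoryStream();
    var reader = ffmpeg.StandardOutput.BaseStream.CopyToAsync(output);
    ffmpeg.StandardInput.BaseStream.Write(rawPcm, 0, rawPcm.Length);
    ffmpeg.StandardInput.Close();
    await reader;
    File.WriteAllBytes("output.wav", output.ToArray()); // ADPCM-in-WAV result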