Unexpected behavior with AudioQueueServices callback while recording audio - objective-c

I'm recording a continuous stream of data using AudioQueueServices. It is my understanding that the callback will only be called when the buffer fills with data. In practice, the first callback has a full buffer, the 2nd is 3/4 full, the 3rd is full, the 4th is 3/4 full, and so on. These buffers are 8000 packets (recording 8 kHz audio), so I should be getting back 1 s of audio in each callback. I've confirmed that my audio queue buffer size is correct (and the observed behavior somewhat confirms this). What am I doing wrong? Should I be doing something in AudioQueueNewInput with a different RunLoop? I tried, but it didn't seem to make a difference...
By the way, if I run under the debugger, each callback arrives full with 8000 samples, which makes me think this is a threading/timing issue.

Apparently, judging from discussions with others and from the lack of responses, this behavior is as designed (or broken, but not likely to be fixed), even if improperly documented. The workaround is to buffer your samples appropriately in the callback and not expect the buffer to be full. This isn't an issue at all if you are just writing the data to a file, but if you expect to operate on consistently sized blocks of audio data in the callback, you will have to ensure that consistency yourself.
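For illustration, here is a minimal sketch of that re-buffering, assuming 16-bit mono LPCM at 8 kHz; the Rebuffer struct and HandleFullBlock are hypothetical names standing in for whatever fixed-block processing you need:

#include <AudioToolbox/AudioToolbox.h>
#include <string.h>

#define kBlockSamples 8000 // one second of 8 kHz, 16-bit mono audio

typedef struct {
    SInt16 staging[kBlockSamples * 2]; // leftover partial block + one full callback
    UInt32 stagedSamples;
} Rebuffer;

// Hypothetical handler that always receives exactly kBlockSamples samples.
static void HandleFullBlock(const SInt16 *samples);

static void MyInputCallback(void *inUserData,
                            AudioQueueRef inAQ,
                            AudioQueueBufferRef inBuffer,
                            const AudioTimeStamp *inStartTime,
                            UInt32 inNumPackets,
                            const AudioStreamPacketDescription *inPacketDesc)
{
    Rebuffer *rb = (Rebuffer *)inUserData;
    UInt32 samples = inBuffer->mAudioDataByteSize / sizeof(SInt16);

    // Append however many samples actually arrived (may be less than a full buffer).
    memcpy(rb->staging + rb->stagedSamples, inBuffer->mAudioData,
           samples * sizeof(SInt16));
    rb->stagedSamples += samples;

    // Emit fixed-size blocks; keep any remainder for the next callback.
    while (rb->stagedSamples >= kBlockSamples) {
        HandleFullBlock(rb->staging);
        rb->stagedSamples -= kBlockSamples;
        memmove(rb->staging, rb->staging + kBlockSamples,
                rb->stagedSamples * sizeof(SInt16));
    }

    AudioQueueEnqueueBuffer(inAQ, inBuffer, 0, NULL); // return the buffer to the queue
}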

Related

Low latency isochronous out on OSX and Windows 10

I'm trying to output isochronous data (generated programmatically) over High Speed USB 2 with very low latency. Ideally around 1-2 ms. On Windows I'm using WinUsb, and on OSX I'm using IOKit.
There are two approaches I have thought of. I'm wondering which is best.
1-frame transfers
WinUsb is quite restrictive in what it allows, and requires each isochronous transfer to be a whole number of frames (1 frame = 1 ms). Therefore, to minimise latency, I use transfers of one frame each in a loop, something like this:
for (;;)
{
    // Submit a 1-frame transfer ASAP.
    WinUsb_WriteIsochPipeAsap(..., &overlapped[i]);

    // Wait for the transfer from 2 frames ago to complete, for timing purposes.
    // This keeps the loop in sync with the USB frames.
    WinUsb_GetOverlappedResult(..., &overlapped[i - 2], block=true);
}
This works fairly well and gives a latency of 2 ms. On OSX I can do a similar thing, though it is quite a bit more complicated. This is the gist of the code - the full code is too long to post here:
uint64_t frame = ...->GetBusFrameNumber(...) + 1;
for (;;)
{
    // Submit at the next available frame.
    for (a few attempts)
    {
        kr = ...->LowLatencyWriteIsochPipeAsync(...
                 frame,          // Start on this frame.
                 &transfer[i]);  // Callback
        if (kr == kIOReturnIsoTooOld)
            frame++; // Try the next frame.
        else if (kr == kIOReturnSuccess)
            break;
        else
            abort();
    }

    // Above, I pass a callback with a reference to a condition_variable. When
    // the transfer completes, the condition_variable is triggered and wakes this up:
    transfer[i - 5].waitForResult();
    // I have to wait for 5 frames ago on OSX, otherwise it skips frames.
}
Again this kind of works and gives a latency of around 3.5 ms. But it's not super-reliable.
Race the kernel
OSX's low latency isochronous functions allow you to submit long transfers (e.g. 64 frames), and then the kernel regularly (at most once per millisecond) updates the frame list, which says where the kernel has got to in reading the write buffer.
I think the idea is that you somehow wake up every N milliseconds (or microseconds), read the frame list, work out where you need to write to and do that. I haven't written code for this yet but I'm not entirely sure how to proceed, and there are no examples I can find.
It doesn't seem to provide a callback when the frame list is updated so I suppose you have to use your own timer - CFRunLoopTimerCreate() and read the frame list from that callback?
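To make the question concrete, here is an untested sketch of what I imagine (gFrameList and kNumFrames are placeholders; I'm assuming the frame list was allocated with LowLatencyCreateBuffer, and that the kernel leaves pending entries marked kUSBLowLatencyIsochTransferKey and fills in frStatus/frActCount as each frame completes):

#include <CoreFoundation/CoreFoundation.h>
#include <IOKit/usb/IOUSBLib.h>

#define kNumFrames 64 // length of the outstanding transfer (placeholder)

// Frame list passed to LowLatencyWriteIsochPipeAsync, plus a cursor over the
// frames already handled.
static IOUSBLowLatencyIsocFrame *gFrameList;
static UInt32 gNextFrame = 0;

static void PollFrameList(CFRunLoopTimerRef timer, void *info)
{
    // Scan forward over frames the kernel has completed since the last tick.
    while (gNextFrame < kNumFrames &&
           gFrameList[gNextFrame].frStatus != (IOReturn)kUSBLowLatencyIsochTransferKey)
    {
        // frActCount bytes of this frame have been consumed; it should now be
        // safe to refill the corresponding region of the shared write buffer.
        gNextFrame++;
    }
}

static void InstallPollTimer(void)
{
    // 0.5 ms period; whether a CFRunLoopTimer actually fires this often on a
    // busy run loop is exactly the part I'm unsure about.
    CFRunLoopTimerRef timer = CFRunLoopTimerCreate(
        kCFAllocatorDefault, CFAbsoluteTimeGetCurrent() + 0.0005, 0.0005,
        0, 0, PollFrameList, NULL);
    CFRunLoopAddTimer(CFRunLoopGetCurrent(), timer, kCFRunLoopCommonModes);
}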
Also I'm wondering if WinUsb allows a similar thing, because it also forces you to register a buffer so it can be simultaneously accessed by the kernel and user-space. I can't find any examples that explicitly say you can write to the buffer while the kernel is reading it though. Are you meant to use WinUsb_GetCurrentFrameNumber in a regular callback to work out where the kernel has got to in a transfer?
That would require getting a regular callback on Windows, which seems a bit tricky. The only way I've seen is to use multimedia timers, which have a minimum period of 1 millisecond (unless you use the undocumented NtSetTimerResolution?).
So my question is: can I improve the "1-frame transfers" approach, or should I switch to a 1 kHz callback that tries to race the kernel? Example code very much appreciated!
(Too long for a comment, so…)
I can only address the OS X side of things. This part of the question:
I think the idea is that you somehow wake up every N milliseconds (or microseconds), read the frame list, work out where you need to write to and do that. I haven't written code for this yet but I'm not entirely sure how to proceed, and there are no examples I can find.
It doesn't seem to provide a callback when the frame list is updated so I suppose you have to use your own timer - CFRunLoopTimerCreate() and read the frame list from that callback?
Has me scratching my head over what you're trying to do. Where is your data coming from, such that latency is critical but the data source does not already notify you when data is ready?
The idea is that your data is being streamed from some source, and as soon as any data becomes available, presumably when some completion for that data source gets called, you write all available data into the user/kernel shared data buffer at the appropriate location.
So maybe you could explain in a little more detail what you're trying to do and I might be able to help.

WasapiLoopbackCapture to WaveOut

I'm using WasapiLoopbackCapture to capture sound coming from my speakers, then using OnDataAvailable to send it to another device, and I'm attempting to play the sent data using the WaveOut class and a BufferedWaveProvider, adding samples every time data arrives from my client via OnDataAvailable. I'm having problems with the sound. The closest I've managed to get it working is:
Not syncing the wave format of the client and the server, just sending data and adding it to the provider. The problem is that this stutters badly, even though I checked the buffered size and it holds 51 seconds. I even have to increase the buffer size, which eventually overflows anyway.
When I tried syncing the wave format, I just got clicks, but had no problem with buffer size. I also tried making sure that at least a second was stored in the buffer, but that had no effect.
If anyone could point me in the right direction that would be great.
Uncompressed audio takes up a lot of space on a network. On my machine the WasapiLoopbackCapture object produces 32-bit (IeeeFloat) stereo samples at 44100 samples per second, for around 2.7 Mbit/sec of raw bandwidth. Once you factor in TCP packet overheads and so on, that's quite a lot of data to be transferring.
The first thing I would suggest, though, is that you plug in some profiling code at each step in the process to get an idea of where your bottlenecks are happening. How fast is data arriving from the capture device? How big are your packets? How long does it take to service each call to your OnDataAvailable event handler? How much data are you sending per second across the network? How fast is the data arriving at the client? Measure each of these and you'll have a much better idea of where the bottlenecks are.
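One cheap way to get those numbers is a set of per-stage byte counters dumped once a second. Sketched here in plain C for neutrality (the actual app would do the equivalent in C# inside its handlers; all names are made up):

#include <stdatomic.h>
#include <stdint.h>
#include <stdio.h>

// Illustrative throughput counters for each stage of the pipeline:
// capture -> send -> receive -> playback.
static _Atomic uint64_t gCapturedBytes, gSentBytes, gReceivedBytes, gPlayedBytes;

// Call the matching function from each stage's handler.
static void AddCaptured(uint64_t n) { atomic_fetch_add(&gCapturedBytes, n); }
static void AddSent(uint64_t n)     { atomic_fetch_add(&gSentBytes, n); }
static void AddReceived(uint64_t n) { atomic_fetch_add(&gReceivedBytes, n); }
static void AddPlayed(uint64_t n)   { atomic_fetch_add(&gPlayedBytes, n); }

// Dump once per second from a timer. If captured grows faster than sent, the
// network side is the bottleneck; if received grows faster than played, the
// playback side is falling behind.
static void DumpRates(void)
{
    printf("captured=%llu sent=%llu received=%llu played=%llu\n",
           (unsigned long long)atomic_load(&gCapturedBytes),
           (unsigned long long)atomic_load(&gSentBytes),
           (unsigned long long)atomic_load(&gReceivedBytes),
           (unsigned long long)atomic_load(&gPlayedBytes));
}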
Try building a simulated server that reads data from a wave file in various WaveFormats (channels, bits per sample and sample rate) and simulates sending that data across the network to the client. You might find that the problem goes away at lower bandwidth. And if bandwidth is the issue, compression might be the solution.
If you're using a single-threaded model, and servicing each OnDataAvailable event takes longer than the interval between events (i.e. one over the expected number of calls to OnDataAvailable per second), then there's going to be data loss. Multiple threads can help with this: one to get the data from the audio system, another to process and send the data. But you can end up in the same position, losing data because you're not dealing with it quickly enough. When that happens it's handy to know about it, because it indicates a problem in the program. Find out when and where it happens; overflows in the input, processing and output buffers all have different potential causes and need different attention.

Samples per second on the iPhone audio input

After reading so much about RemoteIO on the iPhone, and about the buffers, I wrote some code and I am getting the audio samples.
But I can't understand something about the buffer size.
I know that every time the buffer is full, the callback function is called.
The buffer size is 2 bytes, 16 bits.
What I don't know is the frequency at which the callback is called to get these 16 bits.
Somehow, when I log the buffer out, I get only 2500 samples per 7 seconds, which is about 400 samples a second. Which is too BAD!
What am I doing wrong? Or what don't I understand here?
My code is in another post of mine:
error in audio Unit code -remoteIO for iphone
The problem is that NSLog is way too slow compared to the audio sample rate, and thus blocks your audio callback from getting called often enough. So you are losing almost all of the samples. Take all of the NSLogs out of the audio callback, and just increment a counter to count your samples.
If you want to NSLog something, figure out how to do that outside the audio callback.
Your code sample seems to be requesting 44100 samples per second from the audio unit. Check the error return value to make sure.
Also, the number of samples in a buffer does not involve a strlen().
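As an illustrative sketch of that counting approach (assuming a RemoteIO input render callback; gSampleCount and LogSampleCount are made-up names, and the logging would be driven from e.g. an NSTimer on the main thread rather than from the callback):

#include <AudioToolbox/AudioToolbox.h>
#include <stdatomic.h>
#include <stdio.h>

static _Atomic uint64_t gSampleCount = 0; // incremented in the callback, read elsewhere

static OSStatus RecordingCallback(void *inRefCon,
                                  AudioUnitRenderActionFlags *ioActionFlags,
                                  const AudioTimeStamp *inTimeStamp,
                                  UInt32 inBusNumber,
                                  UInt32 inNumberFrames,
                                  AudioBufferList *ioData)
{
    // No NSLog, no allocation, no locks here: just count the frames delivered.
    atomic_fetch_add(&gSampleCount, inNumberFrames);
    // ... AudioUnitRender(...) and the real processing go here ...
    return noErr;
}

// Called from a timer outside the audio thread, e.g. once per second.
static void LogSampleCount(void)
{
    printf("samples so far: %llu\n",
           (unsigned long long)atomic_load(&gSampleCount));
}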
Maybe it's just the logging that is at 400 Hz, and the audio is fine?
If I understand correctly, you have no problem hearing the audio, which means the sample rate is sufficient. At 400 Hz, you'd have aliasing for all bands over 200 Hz, which is very low (we can hear up to 20 kHz), meaning your audio would be strongly distorted. See the Nyquist theorem.
Maybe what you get is not a single sample but an array of samples, i.e. an audio buffer, containing maybe ~110 samples (400 × 110 ≈ 44100), and maybe multiple channels (depending on your configuration).

Why is playing audio through AV Foundation blocking the UI on a slow connection?

I'm using AV Foundation to play an MP3 file loaded over the network, with code that is almost identical to the playback example here: Putting it all Together: Playing a Video File Using AVPlayerLayer, except without attaching a layer for video playback. I was trying to make my app respond to the playback buffer becoming empty on a slow network connection. To do this, I planned to use key-value observing on the AVPlayerItem's playbackBufferEmpty property, but the documentation did not say whether that was possible. I thought it might be possible because the status property can be observed (and is in the example above) even though the documentation doesn't say that.
So, in an attempt to create conditions where the buffer would empty, I added code on the server to sleep for two seconds after serving up each 8k chunk of the MP3 file. Much to my surprise, this caused my app's UI (updated using NSTimer) to freeze completely for long periods, despite the fact that it shows almost no CPU usage in the profiler. I tried loading the tracks on another queue with dispatch_async, but that didn't help at all.
Even without the sleep on the server, I've noticed that loading streams using AVPlayerItem keeps the UI from updating for the short time that the stream is being downloaded. I can't see why a slow file download should ever block the responsiveness of the UI. Any idea why this is happening or what I can do about it?
Okay, problem solved. It looks like passing AVURLAssetPreferPreciseDurationAndTimingKey in the options to URLAssetWithURL:options: causes the slowdown. This also only happens when the AVURLAsset's duration property or some other property relating to the stream's timing is accessed from the selector fired by the NSTimer. So if you can avoid polling for timing information, this problem may not affect you, but that's not an option for me. If precise timing is not requested, there's still a delay of around 0.75 seconds to 1 second, but that's all.
Looking back through it, the documentation does warn that precise timing might cause slower performance, but I never imagined 10+ second delays. Why the delay should scale with the loading time of the media is beyond me; it seems like it should only scale with the size of the file. Maybe iOS is doing some kind of heavy polling for new data and/or processing the same bytes over and over.
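In case it helps anyone, the standard way to read timing properties without blocking the main thread is AVAsynchronousKeyValueLoading. Here is a sketch (placeholder URL and names; I haven't verified that this avoids the long stalls described above when precise timing is requested):

#import <AVFoundation/AVFoundation.h>

static void LoadDurationWithoutBlocking(NSURL *url)
{
    // Keep the precise-timing option, but move the blocking work off the main
    // thread by loading the "duration" key asynchronously instead of polling
    // asset.duration from the NSTimer callback.
    NSDictionary *options = @{AVURLAssetPreferPreciseDurationAndTimingKey: @YES};
    AVURLAsset *asset = [AVURLAsset URLAssetWithURL:url options:options];

    [asset loadValuesAsynchronouslyForKeys:@[@"duration"] completionHandler:^{
        NSError *error = nil;
        if ([asset statusOfValueForKey:@"duration" error:&error] == AVKeyValueStatusLoaded) {
            CMTime duration = asset.duration; // safe to read here without blocking the UI
            dispatch_async(dispatch_get_main_queue(), ^{
                // Update the UI once, instead of polling from the timer.
                NSLog(@"duration: %f", CMTimeGetSeconds(duration));
            });
        }
    }];
}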
So now, without "precise timing and duration," the duration of the asset is permanently 0.0, even when it's fully loaded. I can also follow up on my original goal of doing KVO on AVPlayerItem.isPlaybackBufferEmpty. It seems KVO would be useless anyway, since the property starts out NO, changes to YES as soon as I start playback, and stays YES even while the media plays for minutes at a time. The documentation says this about the property:
Indicates whether playback has consumed all buffered media and that playback will stall or end.
So I guess that's not accurate, and, at least in this particular case, the property is not very useful.

Solving a producer-consumer problem with NSData (for audio streaming)

I am using AVAssetReader to copy PCM data from an iPod track to a buffer, which is then played with a RemoteIO audio unit. I am trying to create a separate thread for loading sound data, so that I can access and play data from the buffer while it is being loaded into.
I currently have a large NSMutableData object that eventually holds the entire song's data. Currently, I load audio data in a separate thread using NSOperation like so:
AVAssetReaderOutput copies, at most, 8192 bytes at a time to a CMBlockBuffer
Copy these bytes to a NSData object
Append this NSData object to a larger NSMutableData object (which eventually holds the entire song)
When finished, play the song by accessing each packet in the NSMutableData object
I'm trying to be able to play the song WHILE copying these bytes. I am unsure of a good way to write to and read from a buffer at the same time.
A short idea I had:
Create and fill 3 NSData objects, each 8192 bytes in length, as buffers.
Start playing. When I have finished playing the first buffer, load new data into the first buffer.
When I have finished playing the second buffer, load new data into the second; the same for the third.
Start playing from the first buffer again, filling the third. And so on.
Or, create one NSData object that holds 3 * 8192 PCM units, and somehow write to and read from it at the same time with two different threads.
I have my code working on two different threads right now. I append data to the array until I press play, at which point the loading stops (probably because the thread is blocked, but I don't know right now), and playback runs until it reaches the end of whatever was loaded, then causes an EXC_BAD_ACCESS exception.
In short, I want to find the right way to play PCM data while it is being copied, say, 8192 bytes at a time. I will probably have to do so with another thread (I am using NSOperation right now), but am unclear on how to write to and read from a buffer at the same time, preferably using some higher level Objective-C methods.
I'm doing this exact thing. You will definitely need to play your audio on a different thread (I am doing this with RemoteIO). You will also need to use a circular buffer. You probably want to look up this data structure if you aren't familiar with it as you will be using it a lot for this type of operation. My general setup is as follows:
LoadTrackThread starts up and starts loading data from AVAssetReader and storing it in a file as PCM.
LoadPCMThread starts up once enough data is loaded into my PCM file and essentially loads that file into local memory for my RemoteIO thread on demand. It does this by feeding this data into a circular buffer whenever my RemoteIO thread gets even remotely close to running out of samples.
RemoteIO playback callback thread consumes the circular buffer frames and feeds them to the RemoteIO interface. It also informs LoadPCMThread to wake up when it needs to start loading more samples.
This should be about all you need as far as threads. You will need to have some sort of mutex or semaphore between the two threads to ensure you aren't trying to read your file while you are writing into it at the same time (this is bad form and will cause you to crash). I just have both my threads set a boolean and sleep for a while until it is unset. There is probably a more sophisticated way of doing this but it works for my purposes.
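For what it's worth, here is a minimal single-producer/single-consumer circular buffer in plain C, with atomic indices in place of the boolean-and-sleep approach (the names and the 16-bit sample type are illustrative):

#include <stdatomic.h>
#include <stdint.h>

#define RING_CAPACITY 8192 // in samples; a power of two keeps the wrap math cheap

typedef struct {
    int16_t data[RING_CAPACITY];
    _Atomic uint32_t head; // next write position, advanced only by the producer
    _Atomic uint32_t tail; // next read position, advanced only by the consumer
} RingBuffer;

// Producer (loader thread): returns the number of samples actually written.
static uint32_t RingWrite(RingBuffer *rb, const int16_t *src, uint32_t n)
{
    uint32_t head = atomic_load_explicit(&rb->head, memory_order_relaxed);
    uint32_t tail = atomic_load_explicit(&rb->tail, memory_order_acquire);
    uint32_t space = RING_CAPACITY - (head - tail);
    if (n > space) n = space; // never overwrite unread samples
    for (uint32_t i = 0; i < n; i++)
        rb->data[(head + i) % RING_CAPACITY] = src[i];
    atomic_store_explicit(&rb->head, head + n, memory_order_release);
    return n;
}

// Consumer (RemoteIO callback): returns the number of samples actually read.
static uint32_t RingRead(RingBuffer *rb, int16_t *dst, uint32_t n)
{
    uint32_t tail = atomic_load_explicit(&rb->tail, memory_order_relaxed);
    uint32_t head = atomic_load_explicit(&rb->head, memory_order_acquire);
    uint32_t avail = head - tail;
    if (n > avail) n = avail; // underrun: the caller should zero-fill the rest
    for (uint32_t i = 0; i < n; i++)
        dst[i] = rb->data[(tail + i) % RING_CAPACITY];
    atomic_store_explicit(&rb->tail, tail + n, memory_order_release);
    return n;
}

Because each index is written by exactly one thread, neither side ever blocks, which matters in the audio callback: sleeping or taking a mutex there is a classic source of dropouts.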
Hope that helps!