How do I read samples from an AudioBufferList?

If I open an audio file with extended audio file services, using the following client data format...
AudioStreamBasicDescription audioFormat;
memset(&audioFormat, 0, sizeof(audioFormat));
audioFormat.mSampleRate = 44100.0;
audioFormat.mFormatID = kAudioFormatLinearPCM;
audioFormat.mFormatFlags = kAudioFormatFlagIsBigEndian |
                           kAudioFormatFlagIsSignedInteger |
                           kAudioFormatFlagIsPacked;
audioFormat.mBytesPerPacket = 4;
audioFormat.mFramesPerPacket = 1;
audioFormat.mChannelsPerFrame = 2;
audioFormat.mBytesPerFrame = 4;
audioFormat.mBitsPerChannel = 16;
And configure an AudioBufferList like so....
AudioBufferList bufferList;
bufferList.mNumberBuffers = 1;
bufferList.mBuffers[0].mDataByteSize = bufferSize;
bufferList.mBuffers[0].mNumberChannels = audioFormat.mChannelsPerFrame;
bufferList.mBuffers[0].mData = buffer; //malloc(sizeof(UInt8) * 1024 * audioFormat.mBytesPerPacket)
How, then, is the data arranged in mData? If I iterate through the data like so
for (int i = 0; i < frameCount; i++) {
    UInt8 somePieceOfAudioData = buffer[i];
}
then what is somePieceOfAudioData?
Is it a sample or a frame (left and right channels together)? If it's a sample then what channel is it a sample for? If for example it's a sample from the right channel, will buffer[i + 1] be a sample for the left channel?
Any ideas, links? Thank you!

Audio data is expected to be interleaved unless the kAudioFormatFlagIsNonInterleaved flag is set. I've found that for Core Audio questions the best source of documentation is usually the headers. CoreAudioTypes.h contains the following comment:
Typically, when an ASBD is being used, the fields describe the
complete layout of the sample data in the buffers that are represented
by this description - where typically those buffers are represented by
an AudioBuffer that is contained in an AudioBufferList.
However, when an ASBD has the kAudioFormatFlagIsNonInterleaved flag,
the AudioBufferList has a different structure and semantic. In this
case, the ASBD fields will describe the format of ONE of the
AudioBuffers that are contained in the list, AND each AudioBuffer in
the list is determined to have a single (mono) channel of audio data.
Then, the ASBD's mChannelsPerFrame will indicate the total number of
AudioBuffers that are contained within the AudioBufferList - where
each buffer contains one channel. This is used primarily with the
AudioUnit (and AudioConverter) representation of this list - and won't
be found in the AudioHardware usage of this structure.
In your particular case, the buffer will consist of interleaved shorts starting with the left channel.
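For illustration, here is a minimal sketch of walking that buffer, assuming exactly the client format shown in the question (two interleaved channels, 16-bit signed integer, big-endian, packed); CFSwapInt16BigToHost handles the byte order on little-endian hardware:
UInt16 *raw = (UInt16 *)bufferList.mBuffers[0].mData;
UInt32 frameCount = bufferList.mBuffers[0].mDataByteSize / audioFormat.mBytesPerFrame;
for (UInt32 frame = 0; frame < frameCount; frame++) {
    // Each frame holds one sample per channel, interleaved: [L, R, L, R, ...]
    SInt16 left  = (SInt16)CFSwapInt16BigToHost(raw[frame * 2]);
    SInt16 right = (SInt16)CFSwapInt16BigToHost(raw[frame * 2 + 1]);
    // ... process left / right ...
}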

Yeah, you're reading a frame and it's two 16-bit samples, Left and Right. (Actually, I'm not certain which is Left and which is Right. Hmmm.)
In addition to the header files, the class references built into Xcode are helpful. I find I'm using "option-click" and "command-click" in my code a lot when I'm sorting out these kinds of details. (for those new to Xcode.. these clicks get you the info and docs, and jump-to-source location, respectively.)
The upcoming book "Learning Core Audio: A Hands-on Guide to Audio Programming for Mac and iOS" by Kevin Avila and Chris Adamson does a nice job of explaining how all this works. It's available in "Rough Cut" form now at Safari Books Online:
http://my.safaribooksonline.com/book/audio/9780321636973

Related

How to write the avc1 atom with libavcodec for MP4 file using H264 codec

I am trying to create an MP4 file using libavcodec. I am using a Raspberry Pi, which has a built-in hardware H264 encoder. It outputs Annex B H264 frames, and I am trying to see the proper way to save these frames into an MP4 container.
My first attempt simply wrote the MP4 header without building the extradata. The Raspberry Pi transmits the SPS and PPS info as its first frame. This is followed by an IDR frame and then the remaining H264 frames. I started with avformat_write_header and then repackaged the succeeding frames in an AVPacket and used
av_write_frame(outputFormatCtx, &pkt);
This works fine, but mplayer tries to decode the first frame (the one containing the SPS and PPS info) and fails to decode it. However, the succeeding frames are decodable and the video plays fine from that point on.
I wanted to construct a proper MP4 file, so I wanted the SPS and PPS information to go in the MP4 header. I read that it should be in the avc1 atom and that I needed to build the extradata and somehow link it to the outputFormatCtx.
This is my effort so far, after parsing the SPS and PPS from the returned encoder buffers. (I removed the leading 0x00000001 NAL delimiters prior to memcpying to sps and pps.)
if ((sps) && (pps)) {
    // length of extradata is 6 bytes + 2 bytes for spslen + sps + 1 byte number of pps + 2 bytes for ppslen + pps
    uint32_t extradata_len = 8 + spslen + 1 + 2 + ppslen;
    outputStream->codecpar->extradata = (uint8_t*)av_mallocz(extradata_len);
    outputStream->codecpar->extradata_size = extradata_len;

    // start writing avcc extradata
    outputStream->codecpar->extradata[0] = 0x01;       // version
    outputStream->codecpar->extradata[1] = sps[1];     // profile
    outputStream->codecpar->extradata[2] = sps[2];     // compatibility
    outputStream->codecpar->extradata[3] = sps[3];     // level
    outputStream->codecpar->extradata[4] = 0xFC | 3;   // reserved (6 bits), NALU length size - 1 (2 bits) which is 3
    outputStream->codecpar->extradata[5] = 0xE0 | 1;   // reserved (3 bits), num of SPS (5 bits) which is 1 sps

    // write sps length
    memcpy(&outputStream->codecpar->extradata[6], &spslen, 2);

    // Check to see if written correctly
    uint16_t *cspslen = (uint16_t *)&outputStream->codecpar->extradata[6];
    fprintf(stderr, "SPS length Wrote %d and read %d \n", spslen, *cspslen);

    // Write the actual sps
    int i = 0;
    for (i = 0; i < spslen; i++) {
        outputStream->codecpar->extradata[8 + i] = sps[i];
    }
    for (size_t i = 0; i != outputStream->codecpar->extradata_size; ++i)
        fprintf(stderr, "\\%02x", (unsigned char)outputStream->codecpar->extradata[i]);
    fprintf(stderr, "\n");

    // Number of pps
    outputStream->codecpar->extradata[8 + spslen] = 0x01;

    // Size of pps
    memcpy(&outputStream->codecpar->extradata[8 + spslen + 1], &ppslen, 2);
    for (size_t i = 0; i != outputStream->codecpar->extradata_size; ++i)
        fprintf(stderr, "\\%02x", (unsigned char)outputStream->codecpar->extradata[i]);
    fprintf(stderr, "\n");

    // Check to see if written correctly
    uint16_t *cppslen = (uint16_t *)&outputStream->codecpar->extradata[8 + spslen + 1];
    fprintf(stderr, "PPS length Wrote %d and read %d \n", ppslen, *cppslen);

    // Write actual PPS
    for (i = 0; i < ppslen; i++) {
        outputStream->codecpar->extradata[8 + spslen + 1 + 2 + i] = pps[i];
    }

    // Output the extradata to check
    for (size_t i = 0; i != outputStream->codecpar->extradata_size; ++i)
        fprintf(stderr, "\\%02x", (unsigned char)outputStream->codecpar->extradata[i]);
    fprintf(stderr, "\n");

    // Access the outputFormatCtx internal AVCodecContext and copy the codecpar to it
    AVCodecContext *avctx = outputFormatCtx->streams[0]->codec;
    fprintf(stderr, "Extradata size output stream sps pps %d\n", outputStream->codecpar->extradata_size);
    if (avcodec_parameters_to_context(avctx, outputStream->codecpar) < 0) {
        fprintf(stderr, "Error avcodec_parameters_to_context");
    }

    // Check to see if extradata was actually transferred to the outputFormatCtx internal AVCodecContext
    fprintf(stderr, "Extradata size after sps pps %d\n", avctx->extradata_size);

    // Write the MP4 header
    if (avformat_write_header(outputFormatCtx, NULL) < 0) {
        fprintf(stderr, "Error avformat_write_header");
        ret = 1;
    } else {
        extradata_written = true;
        fprintf(stderr, "EXTRADATA written\n");
    }
}
The resulting video file does not play. The extradata is actually stored in the tail section of the MP4 file instead of in the avc1 location in the MP4 header. So it is being written by the library, but most likely by av_write_trailer rather than avformat_write_header.
I will post the PPS and SPS info here and the final extradata byte string just in case the error was in forming the extradata.
Here is the buffer from the hardware encoder with sps and pps preceded by the nal delimiter
\00\00\00\01\27\64\00\28\ac\2b\40\a0\cd\00\f1\22\6a\00\00\00\01\28\ee\04\f2\c0
Here is the 13 byte sps:
27640028ac2b40a0cd00f1226a
Here is the 5 byte pps:
28ee04f2c0
Here is the final extradata byte string which is 29 bytes long. I hope I wrote the PPS and SPS size correctly.
\01\64\00\28\ff\e1\0d\00\27\64\00\28\ac\2b\40\a0\cd\00\f1\22\6a\01\05\00\28\ee\04\f2\c0
I did the same conversion from the 0x00000001 NAL delimiter to a 4-byte NAL size for the succeeding frames from the encoder, saved them to the file sequentially, and then wrote the trailer.
Any idea where the mistake is? How can I write the extradata to its proper location in the MP4 header?
Thanks,
Chris
Well, I found the problem. The Raspberry Pi is little-endian, so I assumed that I had to write the SPS length, the PPS length, and each NALU size in little-endian order. They need to be written in big-endian order. After I made the change, the avcC atom showed up in mp4info and mplayer can now play back the video.
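For anyone hitting the same issue, a minimal sketch of the fix, assuming spslen and ppslen fit in 16 bits (AV_WB16 from libavutil/intreadwrite.h writes a 16-bit value in big-endian order):
#include <libavutil/intreadwrite.h>

// Write the 2-byte SPS/PPS lengths big-endian instead of memcpy'ing the
// host (little-endian) representation:
AV_WB16(&outputStream->codecpar->extradata[6], spslen);               // SPS length
AV_WB16(&outputStream->codecpar->extradata[8 + spslen + 1], ppslen);  // PPS length
The same applies to the 4-byte NALU sizes that replace the Annex B start codes in each packet; AV_WB32 writes those in big-endian order.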
It's not necessary to access the outputFormatCtx's internal AVCodecContext and modify it.
This post was very helpful:
Possible Locations for Sequence/Picture Parameter Set(s) for H.264 Stream
Thanks,
Chris

encoding wav file for sample

I saw a sample for encoding a WAV file; here is the sample:
sample for encoding
I have a doubt about this part of the code:
/* encode a single tone sound */
float t, tincr;
t = 0;
tincr = 2 * M_PI * 440.0 / c->sample_rate;
for (i = 0; i < 2000; i++) {
    for (j = 0; j < frame_size; j++) {
        samples[2*j] = (int)(sin(t) * 10000);
        samples[2*j+1] = samples[2*j];
        t += tincr;
    }
    /* encode the samples */
What is 2000 here? On what basis should we choose this value? Because of this I think my encoding is not correct. Any suggestion will be helpful.
It seems to be an arbitrary number of repeated 'frames' that make up the sample. In a different code path it constructs another type of waveform in a similar way and mentions 2000 => 52 sec.
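As a rough sanity check, assuming the sample's defaults (the MP2 encoder's frame_size of 1152 samples per frame and a 44100 Hz sample rate, which may differ in your build):
/* 2000 outer iterations encode 2000 frames of audio */
int total_samples = 2000 * frame_size;                    /* 2000 * 1152 = 2,304,000 samples */
double seconds = (double)total_samples / c->sample_rate;  /* 2,304,000 / 44100 ~= 52 sec */
So pick the iteration count from the duration you want: iterations = seconds * sample_rate / frame_size.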

Objective C variable value not being preserved

I'm doing some audio programming for a client and I've come across an issue which I just don't understand.
I have a render callback which is called repeatedly by CoreAudio. Inside this callback I have the following:
// Get the audio sample data
AudioSampleType *outA = (AudioSampleType *)ioData->mBuffers[0].mData;
Float32 data;

// Loop over the samples
for (UInt32 frame = 0; frame < inNumberFrames; frame++) {
    // Convert from SInt16 to Float32 just to prove it's possible
    data = (Float32) outA[frame] / (Float32) 32768;

    // Convert back to SInt16 to show that everything works as expected
    outA[frame] = (SInt16) round(data * 32768);
}
This works as expected which shows there aren't any unexpected rounding errors.
The next thing I want to do is add a small delay. I add a global variable to the class:
i.e. below the @implementation line
Float32 last = 0;
Then I use this variable to get a one frame delay:
// Get the audio sample data
AudioSampleType *outA = (AudioSampleType *)ioData->mBuffers[0].mData;
Float32 data;
Float32 next;

// Loop over the samples
for (UInt32 frame = 0; frame < inNumberFrames; frame++) {
    // Convert from SInt16 to Float32 just to prove it's possible
    data = (Float32) outA[frame] / (Float32) 32768;

    next = last;
    last = data;

    // Convert back to SInt16 to show that everything works as expected
    outA[frame] = (SInt16) round(next * 32768);
}
This time round there's a strange audio distortion on the signal.
I just can't see what I'm doing wrong! Any advice would be greatly appreciated.
It seems that what you've done is introduced an unintentional phaser effect on your audio.
This is because you're only delaying one channel of your audio, so the result is that you have the left channel being delayed one frame behind the right channel. This would result in some odd frequency cancellations / amplifications that would suit your description of "a strange audio distortion".
Try applying the effect to both channels:
AudioSampleType *outA = (AudioSampleType *)ioData->mBuffers[0].mData;
AudioSampleType *outB = (AudioSampleType *)ioData->mBuffers[1].mData;
// apply the same effect to outB as you did to outA
This assumes that you are working with stereo audio (i.e. ioData->mNumberBuffers == 2).
As a matter of style, it's (IMO) a bad idea to use a global like your last variable in a render callback. Use the inRefCon to pass in proper context (either as a single variable or as a struct if necessary). This likely isn't related to the problem you're having, though.
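For reference, a minimal sketch of the inRefCon approach (RenderState, MyRenderCallback, and audioUnit are illustrative names, not from the question; audioUnit is assumed to be your already-created output unit):
typedef struct {
    Float32 lastLeft;
    Float32 lastRight;
} RenderState;

// At setup time, hand the callback a pointer to your state:
static RenderState renderState = {0.0f, 0.0f};
AURenderCallbackStruct callbackStruct;
callbackStruct.inputProc = MyRenderCallback;
callbackStruct.inputProcRefCon = &renderState;
AudioUnitSetProperty(audioUnit, kAudioUnitProperty_SetRenderCallback,
                     kAudioUnitScope_Input, 0,
                     &callbackStruct, sizeof(callbackStruct));

// Inside the callback, recover the state from inRefCon:
static OSStatus MyRenderCallback(void *inRefCon,
                                 AudioUnitRenderActionFlags *ioActionFlags,
                                 const AudioTimeStamp *inTimeStamp,
                                 UInt32 inBusNumber,
                                 UInt32 inNumberFrames,
                                 AudioBufferList *ioData)
{
    RenderState *state = (RenderState *)inRefCon;
    // ... keep the one-frame delay in state->lastLeft / state->lastRight ...
    return noErr;
}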

taking AudioBuffer samples

I am trying to take the audio buffer samples in real time (at millisecond resolution).
I am using this function, but it gives me an error.
AudioBufferList *bufferList = NULL;
AudioBuffer audioBuffer = bufferList->mBuffers[0];
int bufferSize = audioBuffer.mDataByteSize / sizeof(SInt32);
SInt32 *frame = audioBuffer.mData;
SInt32 signalInput[22050];

for (int i = 0; i < bufferSize; i++)
{
    SInt32 currentSample = frame[i];
    *(signalInput + i) = currentSample;
    NSLog(@"Total power was: %ld ", currentSample);
}
What am I doing wrong here?
I only need to get the audio samples; I don't want two pages of code (such as in the docs).
Thanks.
What you want is inconsistent with what you are trying to do. A NULL buffer list can produce no samples.
You need the two-plus pages of code to properly configure the Audio Session and the RemoteIO Audio Unit (etc.) in order to get what you are trying to get. Otherwise there are no samples. The phone won't even turn on audio recording, or know how to set up the recording (there are a bunch of options), before turning it on. Study the docs and deal with it.
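Once that setup is done, the samples arrive in the input callback. A minimal sketch of just that part, assuming a 16-bit mono client format and that the configured RemoteIO unit is passed in through inRefCon:
static OSStatus inputCallback(void *inRefCon,
                              AudioUnitRenderActionFlags *ioActionFlags,
                              const AudioTimeStamp *inTimeStamp,
                              UInt32 inBusNumber,
                              UInt32 inNumberFrames,
                              AudioBufferList *ioData)
{
    AudioUnit remoteIOUnit = (AudioUnit)inRefCon;

    // Give AudioUnitRender a buffer list to fill with the captured samples.
    SInt16 samples[inNumberFrames];
    AudioBufferList bufferList;
    bufferList.mNumberBuffers = 1;
    bufferList.mBuffers[0].mNumberChannels = 1;
    bufferList.mBuffers[0].mDataByteSize = inNumberFrames * sizeof(SInt16);
    bufferList.mBuffers[0].mData = samples;

    OSStatus status = AudioUnitRender(remoteIOUnit, ioActionFlags, inTimeStamp,
                                      inBusNumber, inNumberFrames, &bufferList);
    if (status != noErr) return status;

    for (UInt32 i = 0; i < inNumberFrames; i++) {
        SInt16 currentSample = samples[i];
        // ... process currentSample ...
    }
    return noErr;
}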

Getting raw sample data of m4a file to draw waveform

I'm using AudioToolbox to access m4a audio files with following code:
UInt32 packetsToRead = 1; // Does it make a difference?
void *buffer = malloc(maxPacketSize * packetsToRead);

for (UInt64 packetIndex = 0; packetIndex < packetCount; packetIndex++)
{
    ioNumberOfPackets = packetsToRead;
    ioNumberOfBytes = maxPacketSize * ioNumberOfPackets;
    AudioFileReadPacketData(audioFile, NO, &ioNumberOfBytes, NULL, packetIndex, &ioNumberOfPackets, buffer);
    for (UInt32 batchPacketIndex = 0; batchPacketIndex < ioNumberOfPackets; batchPacketIndex++)
    {
        // What to do here to get amplitude value? How to get sample value?
    }
    packetIndex += ioNumberOfPackets;
}
My audio format is:
AppleM4A, 8000 Hz, 16 Bit, 4096 frames per packet
The solution was to use Extended Audio File Services. You just have to set a PCM client data format so the conversion from the compressed format happens for you. I found the right way over at Audio Processing: Playing with volume level.
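For anyone looking for the concrete steps, a rough sketch of the Extended Audio File Services route (assuming a mono 16-bit client format, 'url' being the CFURLRef of the m4a file, and with error checking omitted):
ExtAudioFileRef extFile;
ExtAudioFileOpenURL(url, &extFile);

// Ask for decoded PCM in a known layout (sample rate matches the 8000 Hz file).
AudioStreamBasicDescription clientFormat = {0};
clientFormat.mSampleRate = 8000.0;
clientFormat.mFormatID = kAudioFormatLinearPCM;
clientFormat.mFormatFlags = kAudioFormatFlagIsSignedInteger | kAudioFormatFlagIsPacked;
clientFormat.mBitsPerChannel = 16;
clientFormat.mChannelsPerFrame = 1;
clientFormat.mFramesPerPacket = 1;
clientFormat.mBytesPerFrame = 2;
clientFormat.mBytesPerPacket = 2;
ExtAudioFileSetProperty(extFile, kExtAudioFileProperty_ClientDataFormat,
                        sizeof(clientFormat), &clientFormat);

SInt16 pcm[4096];
AudioBufferList bufList;
bufList.mNumberBuffers = 1;
bufList.mBuffers[0].mNumberChannels = 1;
bufList.mBuffers[0].mDataByteSize = sizeof(pcm);
bufList.mBuffers[0].mData = pcm;

UInt32 frames = 4096;
while (ExtAudioFileRead(extFile, &frames, &bufList) == noErr && frames > 0) {
    for (UInt32 i = 0; i < frames; i++) {
        // pcm[i] is one amplitude value; track min/max (or RMS) per bucket for the waveform
    }
    frames = 4096;
    bufList.mBuffers[0].mDataByteSize = sizeof(pcm);
}
ExtAudioFileDispose(extFile);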
To get waveform data, you may first need to convert your compressed audio file into raw PCM samples, such as those found inside a WAV file or another non-compressed audio format. Try AVAssetReader, et al.