How to use VideoToolbox to decompress H.264 video stream - objective-c

I had a lot of trouble figuring out how to use Apple's Hardware accelerated video framework to decompress an H.264 video stream. After a few weeks I figured it out and wanted to share an extensive example since I couldn't find one.
My goal is to give a thorough, instructive example of Video Toolbox introduced in WWDC '14 session 513. My code will not compile or run since it needs to be integrated with an elementary H.264 stream (like a video read from a file or streamed from online etc) and needs to be tweaked depending on the specific case.
I should mention that I have very little experience with video en/decoding except what I learned while googling the subject. I don't know all the details about video formats, parameter structure etc. so I've only included what I think you need to know.
I am using Xcode 6.2 and have deployed to iOS devices running iOS 8.1 and 8.2.

Concepts:
NALUs: NALUs are simply chunks of data of varying length that have a NALU start code header 0x00 00 00 01 YY, where the first 5 bits of YY tell you what type of NALU this is and therefore what type of data follows the header. (Since you only need the first 5 bits, I use YY & 0x1F to just get the relevant bits.) I list what all these types are in the method NSString * const naluTypesStrings[], but you don't need to know what they all are.
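For example, with a 4-byte start code at offset 0, reading the type is just a bit mask on the byte right after the header. A tiny sketch (assuming frame points at the start code, as in the code later in this post):
// The byte right after the 0x00 00 00 01 start code carries the NALU type in its low 5 bits.
uint8_t naluHeaderByte = frame[4];
int naluType = naluHeaderByte & 0x1F;          // e.g. 7 = SPS, 8 = PPS, 5 = IDR, 1 = non-IDR slice
int nalRefIdc = (naluHeaderByte & 0x60) >> 5;  // how important this NALU is for decoding other frames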
Parameters: Your decoder needs parameters so it knows how the H.264 video data is stored. The 2 you need to set are Sequence Parameter Set (SPS) and Picture Parameter Set (PPS) and they each have their own NALU type number. You don't need to know what the parameters mean, the decoder knows what to do with them.
H.264 Stream Format: In most H.264 streams, you will receive an initial set of PPS and SPS parameters followed by an i frame (aka IDR frame or flush frame) NALU. Then you will receive several P frame NALUs (maybe a few dozen or so), then another set of parameters (which may be the same as the initial parameters) and an i frame, more P frames, etc. i frames are much bigger than P frames. Conceptually you can think of the i frame as an entire image of the video, and the P frames are just the changes that have been made to that i frame, until you receive the next i frame.
Procedure:
Generate individual NALUs from your H.264 stream. I cannot show code for this step since it depends a lot on what video source you're using. I made this graphic to show what I was working with ("data" in the graphic is "frame" in my following code), but your case may and probably will differ. My method receivedRawVideoFrame: is called every time I receive a frame (uint8_t *frame) which was one of 2 types. In the diagram, those 2 frame types are the 2 big purple boxes.
Create a CMVideoFormatDescriptionRef from your SPS and PPS NALUs with CMVideoFormatDescriptionCreateFromH264ParameterSets( ). You cannot display any frames without doing this first. The SPS and PPS may look like a jumble of numbers, but VTD knows what to do with them. All you need to know is that CMVideoFormatDescriptionRef is a description of video data, like width/height, format type (kCMPixelFormat_32BGRA, kCMVideoCodecType_H264, etc.), aspect ratio, color space, etc. Your decoder will hold onto the parameters until a new set arrives (sometimes parameters are resent regularly even when they haven't changed).
Re-package your IDR and non-IDR frame NALUs according to the "AVCC" format. This means removing the NALU start codes and replacing them with a 4-byte header that states the length of the NALU. You don't need to do this for the SPS and PPS NALUs. (Note that the 4-byte NALU length header is in big-endian, so if you have a UInt32 value it must be byte-swapped before copying to the CMBlockBuffer using CFSwapInt32. I do this in my code with the htonl function call.)
Package the IDR and non-IDR NALU frames into CMBlockBuffer. Do not do this with the SPS PPS parameter NALUs. All you need to know about CMBlockBuffers is that they are a method to wrap arbitrary blocks of data in core media. (Any compressed video data in a video pipeline is wrapped in this.)
Package the CMBlockBuffer into CMSampleBuffer. All you need to know about CMSampleBuffers is that they wrap up our CMBlockBuffers with other information (here it would be the CMVideoFormatDescription and CMTime, if CMTime is used).
Create a VTDecompressionSessionRef and feed the sample buffers into VTDecompressionSessionDecodeFrame( ). Alternatively, you can use AVSampleBufferDisplayLayer and its enqueueSampleBuffer: method and you won't need to use VTDecompSession. It's simpler to set up, but will not throw errors if something goes wrong like VTD will.
In the VTDecompSession callback, use the resultant CVImageBufferRef to display the video frame. If you need to convert your CVImageBuffer to a UIImage, see my StackOverflow answer here.
Other notes:
H.264 streams can vary a lot. From what I learned, NALU start code headers are sometimes 3 bytes (0x00 00 01) and sometimes 4 (0x00 00 00 01). My code works for 4 bytes; you will need to change a few things around if you're working with 3.
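If you need to handle both, here is a small hedged sketch of detecting the start-code length at a given offset (the helper name is mine, not from the code below):
// returns the start code length (4 or 3 bytes) at offset i, or 0 if there is no start code there
static int startCodeLengthAt(const uint8_t *buf, size_t size, size_t i)
{
    if (i + 4 <= size && buf[i] == 0x00 && buf[i+1] == 0x00 && buf[i+2] == 0x00 && buf[i+3] == 0x01)
        return 4;
    if (i + 3 <= size && buf[i] == 0x00 && buf[i+1] == 0x00 && buf[i+2] == 0x01)
        return 3;
    return 0;
}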
If you want to know more about NALUs, I found this answer to be very helpful. In my case, I found that I didn't need to ignore the "emulation prevention" bytes as described, so I personally skipped that step but you may need to know about that.
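For reference only: the decoder handles these bytes itself when you feed it AVCC data, which is why I could skip this step. But if you ever need to parse a NALU's payload yourself (e.g. to read SPS fields), removing the emulation prevention bytes looks roughly like this sketch (nalu, naluLength and rbsp are illustrative names, not from my code):
// Inside a NALU the encoder inserts 0x03 after every 0x00 0x00 pair so the payload
// can never contain a start code; parsing the raw payload means dropping that byte.
size_t outLength = 0;
int zeroCount = 0;
for (size_t i = 0; i < naluLength; i++)
{
    if (zeroCount == 2 && nalu[i] == 0x03)
    {
        zeroCount = 0;        // skip the emulation prevention byte
        continue;
    }
    zeroCount = (nalu[i] == 0x00) ? zeroCount + 1 : 0;
    rbsp[outLength++] = nalu[i];
}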
If your VTDecompressionSession outputs an error number (like -12909), look up the error code in your Xcode project. Find the VideoToolbox framework in your project navigator, open it and find the header VTErrors.h. If you can't find it, I've also included all the error codes below in another answer.
Code Example:
So let's start by declaring some global variables and including the VT framework (VT = Video Toolbox).
#import <VideoToolbox/VideoToolbox.h>
@property (nonatomic, assign) CMVideoFormatDescriptionRef formatDesc;
@property (nonatomic, assign) VTDecompressionSessionRef decompressionSession;
@property (nonatomic, retain) AVSampleBufferDisplayLayer *videoLayer;
@property (nonatomic, assign) int spsSize;
@property (nonatomic, assign) int ppsSize;
The following array is only used so that you can print out what type of NALU frame you are receiving. If you know what all these types mean, good for you, you know more about H.264 than me :) My code only handles types 1, 5, 7 and 8.
NSString * const naluTypesStrings[] =
{
#"0: Unspecified (non-VCL)",
#"1: Coded slice of a non-IDR picture (VCL)", // P frame
#"2: Coded slice data partition A (VCL)",
#"3: Coded slice data partition B (VCL)",
#"4: Coded slice data partition C (VCL)",
#"5: Coded slice of an IDR picture (VCL)", // I frame
#"6: Supplemental enhancement information (SEI) (non-VCL)",
#"7: Sequence parameter set (non-VCL)", // SPS parameter
#"8: Picture parameter set (non-VCL)", // PPS parameter
#"9: Access unit delimiter (non-VCL)",
#"10: End of sequence (non-VCL)",
#"11: End of stream (non-VCL)",
#"12: Filler data (non-VCL)",
#"13: Sequence parameter set extension (non-VCL)",
#"14: Prefix NAL unit (non-VCL)",
#"15: Subset sequence parameter set (non-VCL)",
#"16: Reserved (non-VCL)",
#"17: Reserved (non-VCL)",
#"18: Reserved (non-VCL)",
#"19: Coded slice of an auxiliary coded picture without partitioning (non-VCL)",
#"20: Coded slice extension (non-VCL)",
#"21: Coded slice extension for depth view components (non-VCL)",
#"22: Reserved (non-VCL)",
#"23: Reserved (non-VCL)",
#"24: STAP-A Single-time aggregation packet (non-VCL)",
#"25: STAP-B Single-time aggregation packet (non-VCL)",
#"26: MTAP16 Multi-time aggregation packet (non-VCL)",
#"27: MTAP24 Multi-time aggregation packet (non-VCL)",
#"28: FU-A Fragmentation unit (non-VCL)",
#"29: FU-B Fragmentation unit (non-VCL)",
#"30: Unspecified (non-VCL)",
#"31: Unspecified (non-VCL)",
};
Now this is where all the magic happens.
-(void) receivedRawVideoFrame:(uint8_t *)frame withSize:(uint32_t)frameSize isIFrame:(int)isIFrame
{
OSStatus status;
uint8_t *data = NULL;
uint8_t *pps = NULL;
uint8_t *sps = NULL;
// I know what my H.264 data source's NALUs look like so I know start code index is always 0.
// if you don't know where it starts, you can use a for loop similar to how i find the 2nd and 3rd start codes
int startCodeIndex = 0;
int secondStartCodeIndex = 0;
int thirdStartCodeIndex = 0;
long blockLength = 0;
CMSampleBufferRef sampleBuffer = NULL;
CMBlockBufferRef blockBuffer = NULL;
int nalu_type = (frame[startCodeIndex + 4] & 0x1F);
NSLog(#"~~~~~~~ Received NALU Type \"%#\" ~~~~~~~~", naluTypesStrings[nalu_type]);
// if we havent already set up our format description with our SPS PPS parameters, we
// can't process any frames except type 7 that has our parameters
if (nalu_type != 7 && _formatDesc == NULL)
{
NSLog(#"Video error: Frame is not an I Frame and format description is null");
return;
}
// NALU type 7 is the SPS parameter NALU
if (nalu_type == 7)
{
// find where the second PPS start code begins, (the 0x00 00 00 01 code)
// from which we also get the length of the first SPS code
for (int i = startCodeIndex + 4; i < startCodeIndex + 40; i++)
{
if (frame[i] == 0x00 && frame[i+1] == 0x00 && frame[i+2] == 0x00 && frame[i+3] == 0x01)
{
secondStartCodeIndex = i;
_spsSize = secondStartCodeIndex; // includes the header in the size
break;
}
}
// find what the second NALU type is
nalu_type = (frame[secondStartCodeIndex + 4] & 0x1F);
NSLog(#"~~~~~~~ Received NALU Type \"%#\" ~~~~~~~~", naluTypesStrings[nalu_type]);
}
// type 8 is the PPS parameter NALU
if(nalu_type == 8)
{
// find where the NALU after this one starts so we know how long the PPS parameter is
for (int i = _spsSize + 4; i < _spsSize + 30; i++)
{
if (frame[i] == 0x00 && frame[i+1] == 0x00 && frame[i+2] == 0x00 && frame[i+3] == 0x01)
{
thirdStartCodeIndex = i;
_ppsSize = thirdStartCodeIndex - _spsSize;
break;
}
}
// allocate enough data to fit the SPS and PPS parameters into our data objects.
// VTD doesn't want you to include the start code header (4 bytes long) so we subtract 4 here
sps = malloc(_spsSize - 4);
pps = malloc(_ppsSize - 4);
// copy in the actual sps and pps values, again ignoring the 4 byte header
memcpy (sps, &frame[4], _spsSize-4);
memcpy (pps, &frame[_spsSize+4], _ppsSize-4);
// now we set our H264 parameters
uint8_t* parameterSetPointers[2] = {sps, pps};
size_t parameterSetSizes[2] = {_spsSize-4, _ppsSize-4};
// suggestion from @Kris Dude's answer below
if (_formatDesc)
{
CFRelease(_formatDesc);
_formatDesc = NULL;
}
status = CMVideoFormatDescriptionCreateFromH264ParameterSets(kCFAllocatorDefault, 2,
(const uint8_t *const*)parameterSetPointers,
parameterSetSizes, 4,
&_formatDesc);
NSLog(#"\t\t Creation of CMVideoFormatDescription: %#", (status == noErr) ? #"successful!" : #"failed...");
if(status != noErr) NSLog(#"\t\t Format Description ERROR type: %d", (int)status);
// See if decomp session can convert from previous format description
// to the new one, if not we need to remake the decomp session.
// This snippet was not necessary for my applications but it could be for yours
/*BOOL needNewDecompSession = (VTDecompressionSessionCanAcceptFormatDescription(_decompressionSession, _formatDesc) == NO);
if(needNewDecompSession)
{
[self createDecompSession];
}*/
// now lets handle the IDR frame that (should) come after the parameter sets
// I say "should" because that's how I expect my H264 stream to work, YMMV
nalu_type = (frame[thirdStartCodeIndex + 4] & 0x1F);
NSLog(#"~~~~~~~ Received NALU Type \"%#\" ~~~~~~~~", naluTypesStrings[nalu_type]);
}
// create our VTDecompressionSession. This isn't necessary if you choose to use AVSampleBufferDisplayLayer
if((status == noErr) && (_decompressionSession == NULL))
{
[self createDecompSession];
}
// type 5 is an IDR frame NALU. The SPS and PPS NALUs should always be followed by an IDR (or IFrame) NALU, as far as I know
if(nalu_type == 5)
{
// find the offset, or where the SPS and PPS NALUs end and the IDR frame NALU begins
int offset = _spsSize + _ppsSize;
blockLength = frameSize - offset;
data = malloc(blockLength);
data = memcpy(data, &frame[offset], blockLength);
// replace the start code header on this NALU with its size.
// AVCC format requires that you do this.
// htonl converts the unsigned int from host to network byte order
uint32_t dataLength32 = htonl (blockLength - 4);
memcpy (data, &dataLength32, sizeof (uint32_t));
// create a block buffer from the IDR NALU
status = CMBlockBufferCreateWithMemoryBlock(NULL, data, // memoryBlock to hold buffered data
blockLength, // block length of the mem block in bytes.
kCFAllocatorNull, NULL,
0, // offsetToData
blockLength, // dataLength of relevant bytes, starting at offsetToData
0, &blockBuffer);
NSLog(#"\t\t BlockBufferCreation: \t %#", (status == kCMBlockBufferNoErr) ? #"successful!" : #"failed...");
}
// NALU type 1 is non-IDR (or PFrame) picture
if (nalu_type == 1)
{
// non-IDR frames do not have an offset due to SPS and PPS, so the approach
// is similar to the IDR frames just without the offset
blockLength = frameSize;
data = malloc(blockLength);
data = memcpy(data, &frame[0], blockLength);
// again, replace the start header with the size of the NALU
uint32_t dataLength32 = htonl (blockLength - 4);
memcpy (data, &dataLength32, sizeof (uint32_t));
status = CMBlockBufferCreateWithMemoryBlock(NULL, data, // memoryBlock to hold data. If NULL, block will be alloc when needed
blockLength, // overall length of the mem block in bytes
kCFAllocatorNull, NULL,
0, // offsetToData
blockLength, // dataLength of relevant data bytes, starting at offsetToData
0, &blockBuffer);
NSLog(#"\t\t BlockBufferCreation: \t %#", (status == kCMBlockBufferNoErr) ? #"successful!" : #"failed...");
}
// now create our sample buffer from the block buffer,
if(status == noErr)
{
// here I'm not bothering with any timing specifics since in my case we displayed all frames immediately
const size_t sampleSize = blockLength;
status = CMSampleBufferCreate(kCFAllocatorDefault,
blockBuffer, true, NULL, NULL,
_formatDesc, 1, 0, NULL, 1,
&sampleSize, &sampleBuffer);
NSLog(#"\t\t SampleBufferCreate: \t %#", (status == noErr) ? #"successful!" : #"failed...");
}
if(status == noErr)
{
// set some values of the sample buffer's attachments
CFArrayRef attachments = CMSampleBufferGetSampleAttachmentsArray(sampleBuffer, YES);
CFMutableDictionaryRef dict = (CFMutableDictionaryRef)CFArrayGetValueAtIndex(attachments, 0);
CFDictionarySetValue(dict, kCMSampleAttachmentKey_DisplayImmediately, kCFBooleanTrue);
// either send the samplebuffer to a VTDecompressionSession or to an AVSampleBufferDisplayLayer
[self render:sampleBuffer];
}
// free memory to avoid a memory leak, do the same for sps, pps and blockbuffer
if (NULL != data)
{
free (data);
data = NULL;
}
}
The following method creates your VTD session. Recreate it whenever you receive new parameters. (You almost certainly don't need to recreate it every time you receive parameters, only when they actually change.)
If you want to set attributes for the destination CVPixelBuffer, read up on CoreVideo PixelBufferAttributes values and put them in NSDictionary *destinationImageBufferAttributes.
-(void) createDecompSession
{
// make sure to destroy the old VTD session
_decompressionSession = NULL;
VTDecompressionOutputCallbackRecord callBackRecord;
callBackRecord.decompressionOutputCallback = decompressionSessionDecodeFrameCallback;
// this is necessary if you need to make calls to Objective-C "self" from within the callback method.
callBackRecord.decompressionOutputRefCon = (__bridge void *)self;
// you can set some desired attributes for the destination pixel buffer. I didn't use this but you may
// if you need to set some attributes, be sure to uncomment the dictionary in VTDecompressionSessionCreate
NSDictionary *destinationImageBufferAttributes = [NSDictionary dictionaryWithObjectsAndKeys:
[NSNumber numberWithBool:YES],
(id)kCVPixelBufferOpenGLESCompatibilityKey,
nil];
OSStatus status = VTDecompressionSessionCreate(NULL, _formatDesc, NULL,
NULL, // (__bridge CFDictionaryRef)(destinationImageBufferAttributes)
&callBackRecord, &_decompressionSession);
NSLog(#"Video Decompression Session Create: \t %#", (status == noErr) ? #"successful!" : #"failed...");
if(status != noErr) NSLog(#"\t\t VTD ERROR type: %d", (int)status);
}
Now this method gets called every time VTD is done decompressing any frame you sent to it. This method gets called even if there's an error or if the frame is dropped.
void decompressionSessionDecodeFrameCallback(void *decompressionOutputRefCon,
void *sourceFrameRefCon,
OSStatus status,
VTDecodeInfoFlags infoFlags,
CVImageBufferRef imageBuffer,
CMTime presentationTimeStamp,
CMTime presentationDuration)
{
THISCLASSNAME *streamManager = (__bridge THISCLASSNAME *)decompressionOutputRefCon;
if (status != noErr)
{
NSError *error = [NSError errorWithDomain:NSOSStatusErrorDomain code:status userInfo:nil];
NSLog(#"Decompressed error: %#", error);
}
else
{
NSLog(#"Decompressed sucessfully");
// do something with your resulting CVImageBufferRef that is your decompressed frame
[streamManager displayDecodedFrame:imageBuffer];
}
}
This is where we actually send the sampleBuffer off to the VTD to be decoded.
- (void) render:(CMSampleBufferRef)sampleBuffer
{
VTDecodeFrameFlags flags = kVTDecodeFrame_EnableAsynchronousDecompression;
VTDecodeInfoFlags flagOut;
NSDate* currentTime = [NSDate date];
VTDecompressionSessionDecodeFrame(_decompressionSession, sampleBuffer, flags,
(void*)CFBridgingRetain(currentTime), &flagOut);
CFRelease(sampleBuffer);
// if you're using AVSampleBufferDisplayLayer, you only need to use this line of code
// [videoLayer enqueueSampleBuffer:sampleBuffer];
}
If you're using AVSampleBufferDisplayLayer, be sure to init the layer like this, in viewDidLoad or inside some other init method.
-(void) viewDidLoad
{
// create our AVSampleBufferDisplayLayer and add it to the view
videoLayer = [[AVSampleBufferDisplayLayer alloc] init];
videoLayer.frame = self.view.frame;
videoLayer.bounds = self.view.bounds;
videoLayer.videoGravity = AVLayerVideoGravityResizeAspect;
// set Timebase, you may need this if you need to display frames at specific times
// I didn't need it so I haven't verified that the timebase is working
CMTimebaseRef controlTimebase;
CMTimebaseCreateWithMasterClock(CFAllocatorGetDefault(), CMClockGetHostTimeClock(), &controlTimebase);
//videoLayer.controlTimebase = controlTimebase;
CMTimebaseSetTime(self.videoLayer.controlTimebase, kCMTimeZero);
CMTimebaseSetRate(self.videoLayer.controlTimebase, 1.0);
[[self.view layer] addSublayer:videoLayer];
}

If you can't find the VTD error codes in the framework, I decided to just include them here. (Again, all these errors and more can be found inside the VideoToolbox.framework itself in the project navigator, in the file VTErrors.h.)
You will get one of these error codes either in the VTD decode frame callback or when you create your VTD session if you did something incorrectly.
kVTPropertyNotSupportedErr = -12900,
kVTPropertyReadOnlyErr = -12901,
kVTParameterErr = -12902,
kVTInvalidSessionErr = -12903,
kVTAllocationFailedErr = -12904,
kVTPixelTransferNotSupportedErr = -12905, // c.f. -8961
kVTCouldNotFindVideoDecoderErr = -12906,
kVTCouldNotCreateInstanceErr = -12907,
kVTCouldNotFindVideoEncoderErr = -12908,
kVTVideoDecoderBadDataErr = -12909, // c.f. -8969
kVTVideoDecoderUnsupportedDataFormatErr = -12910, // c.f. -8970
kVTVideoDecoderMalfunctionErr = -12911, // c.f. -8960
kVTVideoEncoderMalfunctionErr = -12912,
kVTVideoDecoderNotAvailableNowErr = -12913,
kVTImageRotationNotSupportedErr = -12914,
kVTVideoEncoderNotAvailableNowErr = -12915,
kVTFormatDescriptionChangeNotSupportedErr = -12916,
kVTInsufficientSourceColorDataErr = -12917,
kVTCouldNotCreateColorCorrectionDataErr = -12918,
kVTColorSyncTransformConvertFailedErr = -12919,
kVTVideoDecoderAuthorizationErr = -12210,
kVTVideoEncoderAuthorizationErr = -12211,
kVTColorCorrectionPixelTransferFailedErr = -12212,
kVTMultiPassStorageIdentifierMismatchErr = -12213,
kVTMultiPassStorageInvalidErr = -12214,
kVTFrameSiloInvalidTimeStampErr = -12215,
kVTFrameSiloInvalidTimeRangeErr = -12216,
kVTCouldNotFindTemporalFilterErr = -12217,
kVTPixelTransferNotPermittedErr = -12218,

A good Swift example of much of this can be found in Josh Baker's Avios library: https://github.com/tidwall/Avios
Note that Avios currently expects the user to handle chunking data at NAL start codes, but does handle decoding the data from that point forward.
Also worth a look is the Swift based RTMP library HaishinKit (formerly "LF"), which has its own decoding implementation, including more robust NALU parsing: https://github.com/shogo4405/lf.swift

In addition to VTErrors above, I thought it's worth adding CMFormatDescription, CMBlockBuffer, CMSampleBuffer errors that you may encounter while trying Livy's example.
kCMFormatDescriptionError_InvalidParameter = -12710,
kCMFormatDescriptionError_AllocationFailed = -12711,
kCMFormatDescriptionError_ValueNotAvailable = -12718,
kCMBlockBufferNoErr = 0,
kCMBlockBufferStructureAllocationFailedErr = -12700,
kCMBlockBufferBlockAllocationFailedErr = -12701,
kCMBlockBufferBadCustomBlockSourceErr = -12702,
kCMBlockBufferBadOffsetParameterErr = -12703,
kCMBlockBufferBadLengthParameterErr = -12704,
kCMBlockBufferBadPointerParameterErr = -12705,
kCMBlockBufferEmptyBBufErr = -12706,
kCMBlockBufferUnallocatedBlockErr = -12707,
kCMBlockBufferInsufficientSpaceErr = -12708,
kCMSampleBufferError_AllocationFailed = -12730,
kCMSampleBufferError_RequiredParameterMissing = -12731,
kCMSampleBufferError_AlreadyHasDataBuffer = -12732,
kCMSampleBufferError_BufferNotReady = -12733,
kCMSampleBufferError_SampleIndexOutOfRange = -12734,
kCMSampleBufferError_BufferHasNoSampleSizes = -12735,
kCMSampleBufferError_BufferHasNoSampleTimingInfo = -12736,
kCMSampleBufferError_ArrayTooSmall = -12737,
kCMSampleBufferError_InvalidEntryCount = -12738,
kCMSampleBufferError_CannotSubdivide = -12739,
kCMSampleBufferError_SampleTimingInfoInvalid = -12740,
kCMSampleBufferError_InvalidMediaTypeForOperation = -12741,
kCMSampleBufferError_InvalidSampleData = -12742,
kCMSampleBufferError_InvalidMediaFormat = -12743,
kCMSampleBufferError_Invalidated = -12744,
kCMSampleBufferError_DataFailed = -16750,
kCMSampleBufferError_DataCanceled = -16751,

Thanks to Olivia for this great and detailed post!
I recently started to program a streaming app on iPad Pro with Xamarin forms and this article helped a lot and I found many references to it throughout the web.
I suppose many people have re-written Olivia's example in Xamarin already, and I don't claim to be the best programmer in the world. But as nobody has posted a C#/Xamarin version here yet and I would like to give something back to the community for the great post above, here is my C# / Xamarin version. Maybe it helps someone to speed up progress in her or his project.
I kept close to Olivia's example, I even kept most of her comments.
First, since I prefer dealing with enums rather than numbers, I declared this NALU enum.
For the sake of completeness I also added some "exotic" NALU types I found on the internet:
public enum NALUnitType : byte
{
NALU_TYPE_UNKNOWN = 0,
NALU_TYPE_SLICE = 1,
NALU_TYPE_DPA = 2,
NALU_TYPE_DPB = 3,
NALU_TYPE_DPC = 4,
NALU_TYPE_IDR = 5,
NALU_TYPE_SEI = 6,
NALU_TYPE_SPS = 7,
NALU_TYPE_PPS = 8,
NALU_TYPE_AUD = 9,
NALU_TYPE_EOSEQ = 10,
NALU_TYPE_EOSTREAM = 11,
NALU_TYPE_FILL = 12,
NALU_TYPE_13 = 13,
NALU_TYPE_14 = 14,
NALU_TYPE_15 = 15,
NALU_TYPE_16 = 16,
NALU_TYPE_17 = 17,
NALU_TYPE_18 = 18,
NALU_TYPE_19 = 19,
NALU_TYPE_20 = 20,
NALU_TYPE_21 = 21,
NALU_TYPE_22 = 22,
NALU_TYPE_23 = 23,
NALU_TYPE_STAP_A = 24,
NALU_TYPE_STAP_B = 25,
NALU_TYPE_MTAP16 = 26,
NALU_TYPE_MTAP24 = 27,
NALU_TYPE_FU_A = 28,
NALU_TYPE_FU_B = 29,
}
More or less for convenience reasons I also defined an additional dictionary for the NALU descriptions:
public static Dictionary<NALUnitType, string> GetDescription { get; } =
new Dictionary<NALUnitType, string>()
{
{ NALUnitType.NALU_TYPE_UNKNOWN, "Unspecified (non-VCL)" },
{ NALUnitType.NALU_TYPE_SLICE, "Coded slice of a non-IDR picture (VCL) [P-frame]" },
{ NALUnitType.NALU_TYPE_DPA, "Coded slice data partition A (VCL)" },
{ NALUnitType.NALU_TYPE_DPB, "Coded slice data partition B (VCL)" },
{ NALUnitType.NALU_TYPE_DPC, "Coded slice data partition C (VCL)" },
{ NALUnitType.NALU_TYPE_IDR, "Coded slice of an IDR picture (VCL) [I-frame]" },
{ NALUnitType.NALU_TYPE_SEI, "Supplemental Enhancement Information [SEI] (non-VCL)" },
{ NALUnitType.NALU_TYPE_SPS, "Sequence Parameter Set [SPS] (non-VCL)" },
{ NALUnitType.NALU_TYPE_PPS, "Picture Parameter Set [PPS] (non-VCL)" },
{ NALUnitType.NALU_TYPE_AUD, "Access Unit Delimiter [AUD] (non-VCL)" },
{ NALUnitType.NALU_TYPE_EOSEQ, "End of Sequence (non-VCL)" },
{ NALUnitType.NALU_TYPE_EOSTREAM, "End of Stream (non-VCL)" },
{ NALUnitType.NALU_TYPE_FILL, "Filler data (non-VCL)" },
{ NALUnitType.NALU_TYPE_13, "Sequence Parameter Set Extension (non-VCL)" },
{ NALUnitType.NALU_TYPE_14, "Prefix NAL Unit (non-VCL)" },
{ NALUnitType.NALU_TYPE_15, "Subset Sequence Parameter Set (non-VCL)" },
{ NALUnitType.NALU_TYPE_16, "Reserved (non-VCL)" },
{ NALUnitType.NALU_TYPE_17, "Reserved (non-VCL)" },
{ NALUnitType.NALU_TYPE_18, "Reserved (non-VCL)" },
{ NALUnitType.NALU_TYPE_19, "Coded slice of an auxiliary coded picture without partitioning (non-VCL)" },
{ NALUnitType.NALU_TYPE_20, "Coded Slice Extension (non-VCL)" },
{ NALUnitType.NALU_TYPE_21, "Coded Slice Extension for Depth View Components (non-VCL)" },
{ NALUnitType.NALU_TYPE_22, "Reserved (non-VCL)" },
{ NALUnitType.NALU_TYPE_23, "Reserved (non-VCL)" },
{ NALUnitType.NALU_TYPE_STAP_A, "STAP-A Single-time Aggregation Packet (non-VCL)" },
{ NALUnitType.NALU_TYPE_STAP_B, "STAP-B Single-time Aggregation Packet (non-VCL)" },
{ NALUnitType.NALU_TYPE_MTAP16, "MTAP16 Multi-time Aggregation Packet (non-VCL)" },
{ NALUnitType.NALU_TYPE_MTAP24, "MTAP24 Multi-time Aggregation Packet (non-VCL)" },
{ NALUnitType.NALU_TYPE_FU_A, "FU-A Fragmentation Unit (non-VCL)" },
{ NALUnitType.NALU_TYPE_FU_B, "FU-B Fragmentation Unit (non-VCL)" }
};
Here comes my main decoding procedure. I assume the received frame is a raw byte array:
public void Decode(byte[] frame)
{
uint frameSize = (uint)frame.Length;
SendDebugMessage($"Received frame of {frameSize} bytes.");
// I know what my H.264 data source's NALUs look like so I know start code index is always 0.
// if you don't know where it starts, you can use a for loop similar to how I find the 2nd and 3rd start codes
uint firstStartCodeIndex = 0;
uint secondStartCodeIndex = 0;
uint thirdStartCodeIndex = 0;
// length of NALU start code in bytes.
// for H.264 the start code is 4 bytes and looks like this: 0x00 00 00 01
const uint naluHeaderLength = 4;
// check the first 8bits after the NALU start code, mask out bits 0-2, the NALU type ID is in bits 3-7
uint startNaluIndex = firstStartCodeIndex + naluHeaderLength;
byte startByte = frame[startNaluIndex];
int naluTypeId = startByte & 0x1F; // 0001 1111
NALUnitType naluType = (NALUnitType)naluTypeId;
SendDebugMessage($"1st Start Code Index: {firstStartCodeIndex}");
SendDebugMessage($"1st NALU Type: '{NALUnit.GetDescription[naluType]}' ({(int)naluType})");
// bits 1 and 2 are the NRI
int nalRefIdc = startByte & 0x60; // 0110 0000
SendDebugMessage($"1st NRI (NAL Ref Idc): {nalRefIdc}");
// IF the very first NALU type is an IDR -> handle it like a slice frame (-> re-cast it to type 1 [Slice])
if (naluType == NALUnitType.NALU_TYPE_IDR)
{
naluType = NALUnitType.NALU_TYPE_SLICE;
}
// if we haven't already set up our format description with our SPS PPS parameters,
// we can't process any frames except type 7 that has our parameters
if (naluType != NALUnitType.NALU_TYPE_SPS && this.FormatDescription == null)
{
SendDebugMessage("Video Error: Frame is not an I-Frame and format description is null.");
return;
}
// NALU type 7 is the SPS parameter NALU
if (naluType == NALUnitType.NALU_TYPE_SPS)
{
// find where the second PPS 4byte start code begins (0x00 00 00 01)
// from which we also get the length of the first SPS code
for (uint i = firstStartCodeIndex + naluHeaderLength; i < firstStartCodeIndex + 40; i++)
{
if (frame[i] == 0x00 && frame[i + 1] == 0x00 && frame[i + 2] == 0x00 && frame[i + 3] == 0x01)
{
secondStartCodeIndex = i;
this.SpsSize = secondStartCodeIndex; // includes the header in the size
SendDebugMessage($"2nd Start Code Index: {secondStartCodeIndex} -> SPS Size: {this.SpsSize}");
break;
}
}
// find what the second NALU type is
startByte = frame[secondStartCodeIndex + naluHeaderLength];
naluType = (NALUnitType)(startByte & 0x1F);
SendDebugMessage($"2nd NALU Type: '{NALUnit.GetDescription[naluType]}' ({(int)naluType})");
// bits 1 and 2 are the NRI
nalRefIdc = startByte & 0x60; // 0110 0000
SendDebugMessage($"2nd NRI (NAL Ref Idc): {nalRefIdc}");
}
// type 8 is the PPS parameter NALU
if (naluType == NALUnitType.NALU_TYPE_PPS)
{
// find where the NALU after this one starts so we know how long the PPS parameter is
for (uint i = this.SpsSize + naluHeaderLength; i < this.SpsSize + 30; i++)
{
if (frame[i] == 0x00 && frame[i + 1] == 0x00 && frame[i + 2] == 0x00 && frame[i + 3] == 0x01)
{
thirdStartCodeIndex = i;
this.PpsSize = thirdStartCodeIndex - this.SpsSize;
SendDebugMessage($"3rd Start Code Index: {thirdStartCodeIndex} -> PPS Size: {this.PpsSize}");
break;
}
}
// allocate enough data to fit the SPS and PPS parameters into our data objects.
// VTD doesn't want you to include the start code header (4 bytes long) so we subtract 4 here
byte[] sps = new byte[this.SpsSize - naluHeaderLength];
byte[] pps = new byte[this.PpsSize - naluHeaderLength];
// copy in the actual sps and pps values, again ignoring the 4 byte header
Array.Copy(frame, naluHeaderLength, sps, 0, sps.Length);
Array.Copy(frame, this.SpsSize + naluHeaderLength, pps,0, pps.Length);
// create video format description
List<byte[]> parameterSets = new List<byte[]> { sps, pps };
this.FormatDescription = CMVideoFormatDescription.FromH264ParameterSets(parameterSets, (int)naluHeaderLength, out CMFormatDescriptionError formatDescriptionError);
SendDebugMessage($"Creation of CMVideoFormatDescription: {((formatDescriptionError == CMFormatDescriptionError.None)? $"Successful! (Video Codec = {this.FormatDescription.VideoCodecType}, Dimension = {this.FormatDescription.Dimensions.Height} x {this.FormatDescription.Dimensions.Width}px, Type = {this.FormatDescription.MediaType})" : $"Failed ({formatDescriptionError})")}");
// re-create the decompression session whenever new PPS data was received
this.DecompressionSession = this.CreateDecompressionSession(this.FormatDescription);
// now lets handle the IDR frame that (should) come after the parameter sets
// I say "should" because that's how I expect my H264 stream to work, YMMV
startByte = frame[thirdStartCodeIndex + naluHeaderLength];
naluType = (NALUnitType)(startByte & 0x1F);
SendDebugMessage($"3rd NALU Type: '{NALUnit.GetDescription[naluType]}' ({(int)naluType})");
// bits 1 and 2 are the NRI
nalRefIdc = startByte & 0x60; // 0110 0000
SendDebugMessage($"3rd NRI (NAL Ref Idc): {nalRefIdc}");
}
// type 5 is an IDR frame NALU.
// The SPS and PPS NALUs should always be followed by an IDR (or IFrame) NALU, as far as I know.
if (naluType == NALUnitType.NALU_TYPE_IDR || naluType == NALUnitType.NALU_TYPE_SLICE)
{
// find the offset or where IDR frame NALU begins (after the SPS and PPS NALUs end)
uint offset = (naluType == NALUnitType.NALU_TYPE_SLICE)? 0 : this.SpsSize + this.PpsSize;
uint blockLength = frameSize - offset;
SendDebugMessage($"Block Length (NALU type '{naluType}'): {blockLength}");
var blockData = new byte[blockLength];
Array.Copy(frame, offset, blockData, 0, blockLength);
// write the size of the block length (IDR picture data) at the beginning of the IDR block.
// this means we replace the start code header (0 x 00 00 00 01) of the IDR NALU with the block size.
// AVCC format requires that you do this.
// This next block is very specific to my application and wasn't in Olivia's example:
// Since my stream is encoded by NVIDIA NVENC, I had to deal with additional 3-byte start codes within my IDR/SLICE frame.
// These start codes must be replaced by 4 byte start codes adding the block length as big endian.
// ======================================================================================================================================================
// find all 3 byte start code indices (0x00 00 01) within the block data (including the first 4 bytes of NALU header)
uint startCodeLength = 3;
List<uint> foundStartCodeIndices = new List<uint>();
for (uint i = 0; i < blockData.Length - 2; i++)
{
if (blockData[i] == 0x00 && blockData[i + 1] == 0x00 && blockData[i + 2] == 0x01)
{
foundStartCodeIndices.Add(i);
byte naluByte = blockData[i + startCodeLength];
var tmpNaluType = (NALUnitType)(naluByte & 0x1F);
SendDebugMessage($"3-Byte Start Code (0x000001) found at index: {i} (NALU type {(int)tmpNaluType} '{NALUnit.GetDescription[tmpNaluType]}'");
}
}
// determine the byte length of each slice
uint totalLength = 0;
List<uint> sliceLengths = new List<uint>();
for (int i = 0; i < foundStartCodeIndices.Count; i++)
{
// for convenience only
bool isLastValue = (i == foundStartCodeIndices.Count-1);
// set start-index to the byte right after the start code
uint startIndex = foundStartCodeIndices[i] + startCodeLength;
// set end-index to the beginning of the next start code, or to the end of the frame
uint endIndex = isLastValue ? (uint) blockData.Length : foundStartCodeIndices[i + 1];
// now determine slice length including NALU header
uint sliceLength = (endIndex - startIndex) + naluHeaderLength;
// add length to list
sliceLengths.Add(sliceLength);
// sum up total length of all slices (including NALU header)
totalLength += sliceLength;
}
// Arrange slices like this:
// [4byte slice1 size][slice1 data][4byte slice2 size][slice2 data]...[4byte slice4 size][slice4 data]
// Replace 3-Byte Start Code with 4-Byte start code, then replace the 4-Byte start codes with the length of the following data block (big endian).
// https://stackoverflow.com/questions/65576349/nvidia-nvenc-media-foundation-encoded-h-264-frames-not-decoded-properly-using
byte[] finalBuffer = new byte[totalLength];
uint destinationIndex = 0;
// create a buffer for each slice and append it to the final block buffer
for (int i = 0; i < sliceLengths.Count; i++)
{
// create byte vector of size of current slice, add additional bytes for NALU start code length
byte[] sliceData = new byte[sliceLengths[i]];
// now copy the data of current slice into the byte vector,
// start reading data after the 3-byte start code
// start writing data after NALU start code,
uint sourceIndex = foundStartCodeIndices[i] + startCodeLength;
long dataLength = sliceLengths[i] - naluHeaderLength;
Array.Copy(blockData, sourceIndex, sliceData, naluHeaderLength, dataLength);
// replace the NALU start code with data length as big endian
byte[] sliceLengthInBytes = BitConverter.GetBytes(sliceLengths[i] - naluHeaderLength);
Array.Reverse(sliceLengthInBytes);
Array.Copy(sliceLengthInBytes, 0, sliceData, 0, naluHeaderLength);
// add the slice data to final buffer
Array.Copy(sliceData, 0, finalBuffer, destinationIndex, sliceData.Length);
destinationIndex += sliceLengths[i];
}
// ======================================================================================================================================================
// from here we are back on track with Olivia's code:
// now create block buffer from final byte[] buffer
CMBlockBufferFlags flags = CMBlockBufferFlags.AssureMemoryNow | CMBlockBufferFlags.AlwaysCopyData;
var finalBlockBuffer = CMBlockBuffer.FromMemoryBlock(finalBuffer, 0, flags, out CMBlockBufferError blockBufferError);
SendDebugMessage($"Creation of Final Block Buffer: {(blockBufferError == CMBlockBufferError.None ? "Successful!" : $"Failed ({blockBufferError})")}");
if (blockBufferError != CMBlockBufferError.None) return;
// now create the sample buffer
nuint[] sampleSizeArray = new nuint[] { totalLength };
CMSampleBuffer sampleBuffer = CMSampleBuffer.CreateReady(finalBlockBuffer, this.FormatDescription, 1, null, sampleSizeArray, out CMSampleBufferError sampleBufferError);
SendDebugMessage($"Creation of Final Sample Buffer: {(sampleBufferError == CMSampleBufferError.None ? "Successful!" : $"Failed ({sampleBufferError})")}");
if (sampleBufferError != CMSampleBufferError.None) return;
// if sample buffer was successfully created -> pass sample to decoder
// set sample attachments
CMSampleBufferAttachmentSettings[] attachments = sampleBuffer.GetSampleAttachments(true);
var attachmentSetting = attachments[0];
attachmentSetting.DisplayImmediately = true;
// enable async decoding
VTDecodeFrameFlags decodeFrameFlags = VTDecodeFrameFlags.EnableAsynchronousDecompression;
// add time stamp
var currentTime = DateTime.Now;
var currentTimePtr = new IntPtr(currentTime.Ticks);
// send the sample buffer to a VTDecompressionSession
var result = DecompressionSession.DecodeFrame(sampleBuffer, decodeFrameFlags, currentTimePtr, out VTDecodeInfoFlags decodeInfoFlags);
if (result == VTStatus.Ok)
{
SendDebugMessage($"Executing DecodeFrame(..): Successful! (Info: {decodeInfoFlags})");
}
else
{
NSError error = new NSError(CFErrorDomain.OSStatus, (int)result);
SendDebugMessage($"Executing DecodeFrame(..): Failed ({(VtStatusEx)result} [0x{(int)result:X8}] - {error}) - Info: {decodeInfoFlags}");
}
}
}
My function to create the decompression session looks like this:
private VTDecompressionSession CreateDecompressionSession(CMVideoFormatDescription formatDescription)
{
VTDecompressionSession.VTDecompressionOutputCallback callBackRecord = this.DecompressionSessionDecodeFrameCallback;
VTVideoDecoderSpecification decoderSpecification = new VTVideoDecoderSpecification
{
EnableHardwareAcceleratedVideoDecoder = true
};
CVPixelBufferAttributes destinationImageBufferAttributes = new CVPixelBufferAttributes();
try
{
var decompressionSession = VTDecompressionSession.Create(callBackRecord, formatDescription, decoderSpecification, destinationImageBufferAttributes);
SendDebugMessage("Video Decompression Session Creation: Successful!");
return decompressionSession;
}
catch (Exception e)
{
SendDebugMessage($"Video Decompression Session Creation: Failed ({e.Message})");
return null;
}
}
The decompression session callback routine:
private void DecompressionSessionDecodeFrameCallback(
IntPtr sourceFrame,
VTStatus status,
VTDecodeInfoFlags infoFlags,
CVImageBuffer imageBuffer,
CMTime presentationTimeStamp,
CMTime presentationDuration)
{
if (status != VTStatus.Ok)
{
NSError error = new NSError(CFErrorDomain.OSStatus, (int)status);
SendDebugMessage($"Decompression: Failed ({(VtStatusEx)status} [0x{(int)status:X8}] - {error})");
}
else
{
SendDebugMessage("Decompression: Successful!");
try
{
var image = GetImageFromImageBuffer(imageBuffer);
// In my application I do not use a display layer but send the decoded image directly by an event:
ImageSource imgSource = ImageSource.FromStream(() => image.AsPNG().AsStream());
OnImageFrameReady?.Invoke(imgSource);
}
catch (Exception e)
{
SendDebugMessage(e.ToString());
}
}
}
I use this function to convert the CVImageBuffer to an UIImage. It also refers to one of Olivia's posts mentioned above (how to convert a CVImageBufferRef to UIImage):
private UIImage GetImageFromImageBuffer(CVImageBuffer imageBuffer)
{
if (!(imageBuffer is CVPixelBuffer pixelBuffer)) return null;
var ciImage = CIImage.FromImageBuffer(pixelBuffer);
var temporaryContext = new CIContext();
var rect = CGRect.FromLTRB(0, 0, pixelBuffer.Width, pixelBuffer.Height);
CGImage cgImage = temporaryContext.CreateCGImage(ciImage, rect);
if (cgImage == null) return null;
var uiImage = UIImage.FromImage(cgImage);
cgImage.Dispose();
return uiImage;
}
Last but not least my tiny little function for debug output, feel free to pimp it as needed for your purpose ;-)
private void SendDebugMessage(string msg)
{
Debug.WriteLine($"VideoDecoder (iOS) - {msg}");
}
Finally, let's have a look at the namespaces used for the code above:
using System;
using System.Collections.Generic;
using System.Diagnostics;
using System.IO;
using System.Net;
using AvcLibrary;
using CoreFoundation;
using CoreGraphics;
using CoreImage;
using CoreMedia;
using CoreVideo;
using Foundation;
using UIKit;
using VideoToolbox;
using Xamarin.Forms;

@Livy, to remove memory leaks, you should add the following before CMVideoFormatDescriptionCreateFromH264ParameterSets:
if (_formatDesc) {
CFRelease(_formatDesc);
_formatDesc = NULL;
}

This post helped me a lot with sending H264 video from one device to another, but switching between devices caused the function receivedRawVideoFrame to not work correctly due to some changes in the frame data.
Here is my final function that decodes NAL units from the data directly, without relying on their order within the frame data:
- (void)receivedRawVideoFrame:(NSData*)frameData {
NSUInteger frameSize = [frameData length];
const uint8_t * frame = [frameData bytes];
NSMutableDictionary* nalUnitsStart = [NSMutableDictionary dictionary];
NSMutableDictionary* nalUnitsEnd = [NSMutableDictionary dictionary];
uint8_t previousNalUnitType = 0;
for ( NSUInteger offset = 0; offset < frameSize - 4; offset++ ) {
// Find the start of a NAL unit
if (frame[offset] == 0x00 && frame[offset+1] == 0x00 && frame[offset+2] == 0x00 && frame[offset+3] == 0x01) {
uint8_t nalType = frame[offset + 4] & 0x1F;
// Record the end of previous NAL unit
nalUnitsEnd[@(previousNalUnitType)] = @(offset);
previousNalUnitType = nalType;
nalUnitsStart[@(nalType)] = @(offset + 4);
}
}
// Record the end of the last NAL unit
nalUnitsEnd[@(previousNalUnitType)] = @(frameSize);
// Let's check if our data contains SPS && PPS NAL Units
NSNumber* spsOffset = nalUnitsStart[@(NAL_TYPE_SPS)];
NSNumber* ppsOffset = nalUnitsStart[@(NAL_TYPE_PPS)];
if ( spsOffset && ppsOffset ) {
NSNumber* spsEnd = nalUnitsEnd[@(NAL_TYPE_SPS)];
NSNumber* ppsEnd = nalUnitsEnd[@(NAL_TYPE_PPS)];
NSAssert(spsEnd && ppsEnd, @" [DECODE]: Missing the end of NAL unit(s)");
uint8_t *pps = NULL;
uint8_t *sps = NULL;
int spsSize = (int)(spsEnd.unsignedIntegerValue - spsOffset.unsignedIntegerValue);
int ppsSize = (int)(ppsEnd.unsignedIntegerValue - ppsOffset.unsignedIntegerValue);
// allocate enough data to fit the SPS and PPS parameters into our data objects.
// the offsets recorded above already point past the 4-byte start code, so these sizes exclude the header
sps = malloc(spsSize);
pps = malloc(ppsSize);
// copy in the actual sps and pps values, again ignoring the 4 byte header
memcpy(sps, &frame[spsOffset.unsignedIntegerValue], spsSize);
memcpy(pps, &frame[ppsOffset.unsignedIntegerValue], ppsSize);
// now we set our H264 parameters
uint8_t* parameterSetPointers[2] = {sps, pps};
size_t parameterSetSizes[2] = {spsSize, ppsSize};
OSStatus status = CMVideoFormatDescriptionCreateFromH264ParameterSets(kCFAllocatorDefault,
2,
(const uint8_t *const*)parameterSetPointers,
parameterSetSizes,
4,
&_formatDesc);
if (sps != NULL) free(sps);
if (pps != NULL) free(pps);
DebugAssert(status == noErr, @" [DECODE]: Failed to create CMVideoFormatDescription for H264");
if ( status != noErr ) {
NSLog(@" [DECODE]: Failed to create CMVideoFormatDescription for H264");
} else {
// Good place to re-create our decompression session
[self destroySession];
}
}
// Loop over all NAL units we have, ignoring everything with type > 5 (we only feed slice/IDR NALUs to the decoder)
for ( NSNumber* nalType in nalUnitsStart.allKeys ) {
if ( nalType.intValue > 5 ) {
continue;
}
// Get the header too (0x00000001), that will be replaced with the NAL unit size
NSNumber* nalStart = nalUnitsStart[nalType];
NSNumber* nalEnd = nalUnitsEnd[nalType];
size_t blockLength = nalEnd.unsignedIntegerValue - (nalStart.unsignedIntegerValue - sizeof(uint32_t));
uint8_t *data = malloc(blockLength);
memcpy(data, &frame[nalStart.unsignedIntegerValue - sizeof(uint32_t)], blockLength);
// replace the start code header on this NALU with its size.
// AVCC format requires that you do this.
// htonl converts the unsigned int from host to network byte order
uint32_t dataLength32 = htonl(blockLength - 4);
memcpy(data, &dataLength32, sizeof(uint32_t));
CMBlockBufferRef blockBuffer;
OSStatus status = CMBlockBufferCreateWithMemoryBlock(NULL,
data,
blockLength,
kCFAllocatorNull,
NULL,
0,
blockLength,
0,
&blockBuffer);
DebugAssert(status == noErr, @" [DECODE]: Failed to create CMBlockBufferRef for %@", nalType);
if ( status != noErr ) {
NSLog(@" [DECODE]: Failed to create CMBlockBufferRef for H264 for %@", nalType);
} else {
const size_t sampleSize = blockLength;
/* NOTE:
We are not responsible for releasing sample buffer,
it will be released by the decompress frame function
after it has been decoded!
*/
CMSampleBufferRef sampleBuffer;
status = CMSampleBufferCreate(kCFAllocatorDefault,
blockBuffer,
true,
NULL,
NULL,
_formatDesc,
1,
0,
NULL,
1,
&sampleSize,
&sampleBuffer);
DebugAssert(status == noErr, @" [DECODE]: Failed to create CMSampleBufferRef for %@", nalType);
if ( status != noErr ) {
NSLog(@" [DECODE]: Failed to create CMSampleBufferRef for H264 for %@", nalType);
if ( sampleBuffer ) {
CFRelease(sampleBuffer);
sampleBuffer = NULL;
}
} else {
// set some values of the sample buffer's attachments
CFArrayRef attachments = CMSampleBufferGetSampleAttachmentsArray(sampleBuffer, YES);
CFMutableDictionaryRef dict = (CFMutableDictionaryRef)CFArrayGetValueAtIndex(attachments, 0);
CFDictionarySetValue(dict, kCMSampleAttachmentKey_DisplayImmediately, kCFBooleanTrue);
[self decompressFrame:sampleBuffer];
}
}
if ( blockBuffer ) {
CFRelease(blockBuffer);
blockBuffer = NULL;
}
if ( data != NULL ) {
free(data);
data = NULL;
}
}
}
The decompressFrame function is responsible for creating a new decompression session when it needs to, based on the latest CMVideoFormatDescriptionRef data we got from our stream.
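decompressFrame itself isn't shown here; the following is only a minimal sketch of that idea, reusing the helpers and ivars from the answers above (so an assumption about the shape of the method, not the exact code):
- (void)decompressFrame:(CMSampleBufferRef)sampleBuffer
{
    // if the session exists but can no longer accept the latest format description, rebuild it
    if (_decompressionSession != NULL &&
        !VTDecompressionSessionCanAcceptFormatDescription(_decompressionSession, _formatDesc))
    {
        [self destroySession];          // invalidate and release the old session
    }
    if (_decompressionSession == NULL)
    {
        [self createDecompSession];     // same helper as in Livy's answer
    }
    VTDecodeInfoFlags flagsOut;
    VTDecompressionSessionDecodeFrame(_decompressionSession, sampleBuffer,
                                      kVTDecodeFrame_EnableAsynchronousDecompression,
                                      NULL, &flagsOut);
    // per the note above, this method is responsible for releasing the sample buffer
    CFRelease(sampleBuffer);
}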

Related

STM32 Crash on Flash Sector Erase

I'm trying to write 4 uint32's of data into the flash memory of my STM32F767ZI, so I've looked at some examples and at the reference manual, but I still cannot do it. My goal is to write 4 uint32's into the flash, read them back and compare with the original data, and light different LEDs depending on the success of the comparison.
My code is as follows:
void flash_write(uint32_t offset, uint32_t *data, uint32_t size) {
FLASH_EraseInitTypeDef EraseInitStruct = {0};
uint32_t SectorError = 0;
HAL_FLASH_Unlock();
EraseInitStruct.TypeErase = FLASH_TYPEERASE_SECTORS;
EraseInitStruct.VoltageRange = FLASH_VOLTAGE_RANGE_3;
EraseInitStruct.Sector = FLASH_SECTOR_11;
EraseInitStruct.NbSectors = 1;
//EraseInitStruct.Banks = FLASH_BANK_1; // or FLASH_BANK_2 or FLASH_BANK_BOTH
st = HAL_FLASHEx_Erase(&EraseInitStruct, &SectorError);
if (st == HAL_OK) {
for (int i = 0; i < size; i += 4) {
st = HAL_FLASH_Program(FLASH_TYPEPROGRAM_WORD, FLASH_USER_START_ADDR + offset + i, *(data + i)); //This is what's giving me trouble
if (st != HAL_OK) {
// handle the error
break;
}
}
}else {
// handle the error
}
HAL_FLASH_Lock();
}
void flash_read(uint32_t offset, uint32_t *data, uint32_t size) {
for (int i = 0; i < size; i += 4) {
*(data + i) = *(__IO uint32_t*)(FLASH_USER_START_ADDR + offset + i);
}
}
int main(void) {
uint32_t data[] = {'a', 'b', 'c', 'd'};
uint32_t read_data[] = {0, 0, 0, 0};
HAL_Init();
SystemClock_Config();
MX_GPIO_Init();
flash_write(0, data, sizeof(data));
flash_read(0, read_data, sizeof(read_data));
if (compareArrays(data,read_data,4))
{
HAL_GPIO_WritePin(GPIOB, GPIO_PIN_7,SET);
}
else
{
HAL_GPIO_WritePin(GPIOB, GPIO_PIN_14,SET);
}
return 0;
}
The problem is that before writing data I must erase a sector, and when I do it with the HAL_FLASHEx_Erase(&EraseInitStruct, &SectorError), function, the program always crashes, and sometimes even corrupts my codespace forcing me to update firmware.
I've selected the sector farthest from the code space, but it still crashes when I try to erase it.
I've read in the reference manual that
Any attempt to read the Flash memory while it is being written or erased, causes the bus to stall. Read operations are processed correctly once the program operation has completed. This means that code or data fetches cannot be performed while a write/erase operation is ongoing.
which I believe means the code should ideally be run from RAM while we operate on the flash, but I've seen other people online not have this issue so I'm wondering if that's the only problem I have. With that in mind I wanted to confirm if this is my only issue, or if I'm doing something wrong?
In your loop, you are adding multiples of 4 to i, but then you are adding i to data. When you add an integer to a pointer, the offset is automatically scaled by the size of the pointed-to type, so you are actually stepping in multiples of 16 bytes and reading past the end of your input buffer.
Also, make sure you initialize all members of EraseInitStruct. Uncomment that line and set the correct value!
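A minimal sketch of the corrected programming loop, keeping everything else in flash_write the same; the point is only that the source element must be data[i / 4] when i is a byte offset:
// 'i' is a byte offset into flash, so the matching source element is data[i / 4], not data[i]
for (uint32_t i = 0; i < size; i += 4) {
    st = HAL_FLASH_Program(FLASH_TYPEPROGRAM_WORD,
                           FLASH_USER_START_ADDR + offset + i,
                           data[i / 4]);
    if (st != HAL_OK) {
        // handle the error
        break;
    }
}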

unable to copy from buffer to image

I have an image of dimensions 4096*4096 (so 67108864 bytes, since there are 4 channels) that I want to copy from a staging buffer to a device local image. The buffer already has the data stored and I have set up the image barriers properly, so now I want to perform the copy operation... Except it doesn't work. The validation layers give me this error message when I call vkCmdCopyBufferToImage() -
IMAGE(ERROR): object: 0x0 type: 6 location: 3903 msgCode: 417333590: vkCmdCopyBufferToImage(): pRegion[0] exceeds buffer size of 67108864 bytes. The spec valid usage text states 'The buffer region specified by each element of pRegions mustbe a region that is contained within srcBuffer' (https://www.khronos.org/registry/vulkan/specs/1.0/html/vkspec.html#VUID-vkCmdCopyBufferToImage-pRegions-00171).
I can't find anything wrong with the values that I gave it though. The VkBufferImageCopy struct I passed to it looks like this-
VkBufferImageCopy bufImgCopy;
bufImgCopy.bufferOffset = 0;
bufImgCopy.bufferImageHeight = 0;
bufImgCopy.bufferRowLength = 0;
bufImgCopy.imageExtent = modelTexture.imgExtents; // 4096 * 4096 * 1
bufImgCopy.imageOffset = {0, 0, 0};
bufImgCopy.imageSubresource.aspectMask = modelTexture.subResource.aspectMask; // Colour attachment
bufImgCopy.imageSubresource.baseArrayLayer = modelTexture.subResource.baseArrayLayer; // 0
bufImgCopy.imageSubresource.layerCount = VK_REMAINING_ARRAY_LAYERS;
bufImgCopy.imageSubresource.mipLevel = 0;
I can't figure out why the api thinks the struct is specifying a size greater than the buffer size. The format of the image is VK_FORMAT_B8G8R8A8_UNORM.
EDIT
Here's the code that sets up the staging buffer-
stageBuf.usage = VK_BUFFER_USAGE_TRANSFER_SRC_BIT;
stageBuf.shareMode = VK_SHARING_MODE_EXCLUSIVE;
stageBuf.bufSize = static_cast<VkDeviceSize>(verts.size() * sizeof(vert) + indices.size() * sizeof(u32)) > modelImage.size ? static_cast<VkDeviceSize>(verts.size() * sizeof(vert) + indices.size() * sizeof(u32)) : modelImage.size;
// filled from the previous struct.
VkBufferCreateInfo info;
info.sType = VK_STRUCTURE_TYPE_BUFFER_CREATE_INFO;
info.pNext = nullptr;
info.flags = 0;
info.queueFamilyIndexCount = bufInfo.qFCount;
info.pQueueFamilyIndices = bufInfo.qFIndices;
info.usage = bufInfo.usage;
info.sharingMode = bufInfo.shareMode;
info.size = bufInfo.bufSize;
if (vkCreateBuffer(device, &info, nullptr, &(bufInfo.buf)) != VK_SUCCESS)
{ //...
VkMemoryRequirements memReqs;
vkGetBufferMemoryRequirements(device, buf, &memReqs);
for (u32 type = 0; type < memProps.memoryTypeCount; ++type)
if ((memReqs.memoryTypeBits & (1 << type)) &&
((memProps.memoryTypes[type].propertyFlags & memFlags) == memFlags)) // The usual things to set buffers up.
{
VkMemoryAllocateInfo info;
info.sType = VK_STRUCTURE_TYPE_MEMORY_ALLOCATE_INFO;
info.pNext = nullptr;
info.allocationSize = memReqs.size;
info.memoryTypeIndex = type;
if (vkAllocateMemory(device, &info, nullptr, &mem.memory) == VK_SUCCESS)
{ //....
// All this works perfectly except for the texture copy.
if (vkBindBufferMemory(device, buf, mem.memory, mem.offset) != VK_SUCCESS)
{ //...
I'm using this staging buffer for both the vertex and index buffers (which I have taken as a single buffer with offsets) as well as the image which I'm trying to copy to. The memory allocated is according to the size of the largest data structure.
As noted in the comments, using VK_REMAINING_ARRAY_LAYERS is invalid for the layerCount of VkImageSubresourceLayers (the imageSubresource member of VkBufferImageCopy), so you have to explicitly set layerCount to the actual number of layers to copy.
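A hedged sketch of the fix, reusing the field names from the question; the right count depends on how the image was created, 1 being the usual single-layer case:
// layerCount of VkImageSubresourceLayers must be an explicit count here;
// VK_REMAINING_ARRAY_LAYERS is only valid for VkImageSubresourceRange
bufImgCopy.imageSubresource.baseArrayLayer = 0;
bufImgCopy.imageSubresource.layerCount = 1; // or the actual arrayLayers of the image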

Vulkan depth image binding error

Hi, I am trying to bind a depth memory buffer but I get an error as shown below. I have no idea why this error is popping up.
The depth format is VK_FORMAT_D16_UNORM and the usage is VK_IMAGE_USAGE_DEPTH_STENCIL_ATTACHMENT_BIT. I have read online that the tiling shouldn't be linear, but then I get a different error. Thanks!!!
The code for creating and binding the image is as below.
VkImageCreateInfo imageInfo = {};
// If the depth format is undefined, fall back to the 16-bit depth format
if (Depth.format == VK_FORMAT_UNDEFINED) {
Depth.format = VK_FORMAT_D16_UNORM;
}
const VkFormat depthFormat = Depth.format;
VkFormatProperties props;
vkGetPhysicalDeviceFormatProperties(*deviceObj->gpu, depthFormat, &props);
if (props.linearTilingFeatures & VK_FORMAT_FEATURE_DEPTH_STENCIL_ATTACHMENT_BIT) {
imageInfo.tiling = VK_IMAGE_TILING_LINEAR;
}
else if (props.optimalTilingFeatures & VK_FORMAT_FEATURE_DEPTH_STENCIL_ATTACHMENT_BIT) {
imageInfo.tiling = VK_IMAGE_TILING_OPTIMAL;
}
else {
std::cout << "Unsupported Depth Format, try other Depth formats.\n";
exit(-1);
}
imageInfo.sType = VK_STRUCTURE_TYPE_IMAGE_CREATE_INFO;
imageInfo.pNext = NULL;
imageInfo.imageType = VK_IMAGE_TYPE_2D;
imageInfo.format = depthFormat;
imageInfo.extent.width = width;
imageInfo.extent.height = height;
imageInfo.extent.depth = 1;
imageInfo.mipLevels = 1;
imageInfo.arrayLayers = 1;
imageInfo.samples = NUM_SAMPLES;
imageInfo.queueFamilyIndexCount = 0;
imageInfo.pQueueFamilyIndices = NULL;
imageInfo.sharingMode = VK_SHARING_MODE_EXCLUSIVE;
imageInfo.usage = VK_IMAGE_USAGE_DEPTH_STENCIL_ATTACHMENT_BIT;
imageInfo.flags = 0;
// User create image info and create the image objects
result = vkCreateImage(deviceObj->device, &imageInfo, NULL, &Depth.image);
assert(result == VK_SUCCESS);
// Get the image memory requirements
VkMemoryRequirements memRqrmnt;
vkGetImageMemoryRequirements(deviceObj->device, Depth.image, &memRqrmnt);
VkMemoryAllocateInfo memAlloc = {};
memAlloc.sType = VK_STRUCTURE_TYPE_MEMORY_ALLOCATE_INFO;
memAlloc.pNext = NULL;
memAlloc.allocationSize = 0;
memAlloc.memoryTypeIndex = 0;
memAlloc.allocationSize = memRqrmnt.size;
// Determine the type of memory required with the help of memory properties
pass = deviceObj->memoryTypeFromProperties(memRqrmnt.memoryTypeBits, 0, /* No requirements */ &memAlloc.memoryTypeIndex);
assert(pass);
// Allocate the memory for image objects
result = vkAllocateMemory(deviceObj->device, &memAlloc, NULL, &Depth.mem);
assert(result == VK_SUCCESS);
// Bind the allocated memory
result = vkBindImageMemory(deviceObj->device, Depth.image, Depth.mem, 0);
assert(result == VK_SUCCESS);
Yes, linear tiling may not be supported for depth usage Images.
Consult the specification and Valid Usage section of VkImageCreateInfo. The capability is queried by vkGetPhysicalDeviceFormatProperties and vkGetPhysicalDeviceImageFormatProperties commands. Though depth formats are "opaque", so there is not much reason to use linear tiling.
You already seem to be doing this in your code.
But the error informs you that you are trying to use a memory type that is not allowed for the given Image. Use vkGetImageMemoryRequirements command to query which memory types are allowed.
Possibly you have some error there (you are using 0x1 which is obviously not part of 0x84 per the message). You may want to reuse the example code in the Device Memory chapter of the specification. Provide your memoryTypeFromProperties implementation for more specific answer.
I accidentally set the typeIndex to 1 instead of i and it works now. In my defense, I have been Vulkan coding the whole day and my eyes are bleeding :). Thanks for the help.
bool VulkanDevice::memoryTypeFromProperties(uint32_t typeBits, VkFlags
requirementsMask, uint32_t *typeIndex)
{
// Search memtypes to find first index with those properties
for (uint32_t i = 0; i < 32; i++) {
if ((typeBits & 1) == 1) {
// Type is available, does it match user properties?
if ((memoryProperties.memoryTypes[i].propertyFlags & requirementsMask) == requirementsMask) {
*typeIndex = i;// was set to 1 :(
return true;
}
}
typeBits >>= 1;
}
// No memory types matched, return failure
return false;
}

CoreAudio AudioQueue callback function never called, no errors reported

I am trying to do a simple playback from a file functionality and it appears that my callback function is never called. It doesn't really make sense because all of the OSStatuses come back 0 and other numbers all appear correct as well (like the output packets read pointer from AudioFileReadPackets).
Here is the setup:
OSStatus stat;
stat = AudioFileOpenURL(
(CFURLRef)urlpath, kAudioFileReadPermission, 0, &aStreamData->aFile
);
UInt32 dsze = 0;
stat = AudioFileGetPropertyInfo(
aStreamData->aFile, kAudioFilePropertyDataFormat, &dsze, 0
);
stat = AudioFileGetProperty(
aStreamData->aFile, kAudioFilePropertyDataFormat, &dsze, &aStreamData->aDescription
);
stat = AudioQueueNewOutput(
&aStreamData->aDescription, bufferCallback, aStreamData, NULL, NULL, 0, &aStreamData->aQueue
);
aStreamData->pOffset = 0;
for(int i = 0; i < NUM_BUFFERS; i++) {
stat = AudioQueueAllocateBuffer(
aStreamData->aQueue, aStreamData->aDescription.mBytesPerPacket, &aStreamData->aBuffer[i]
);
bufferCallback(aStreamData, aStreamData->aQueue, aStreamData->aBuffer[i]);
}
stat = AudioQueuePrime(aStreamData->aQueue, 0, NULL);
stat = AudioQueueStart(aStreamData->aQueue, NULL);
(Not shown is where I'm checking the value of stat in between the functions, it just comes back normal.)
And the callback function:
void bufferCallback(void *uData, AudioQueueRef queue, AudioQueueBufferRef buffer) {
UInt32 bread = 0;
UInt32 pread = buffer->mAudioDataBytesCapacity / player->aStreamData->aDescription.mBytesPerPacket;
OSStatus stat;
stat = AudioFileReadPackets(
player->aStreamData->aFile, false, &bread, NULL, player->aStreamData->pOffset, &pread, buffer->mAudioData
);
buffer->mAudioDataByteSize = bread;
stat = AudioQueueEnqueueBuffer(queue, buffer, 0, NULL);
player->aStreamData->pOffset += pread;
}
Here aStreamData is my user-data struct (typedef'd so I can use it as a class property) and player is a static instance of the controlling Objective-C class. If any other code is wanted, please let me know. I am a bit at my wit's end. Printing any of the numbers involved here yields the correct results, including inside bufferCallback when I call it myself in the allocation loop; the callback just never gets called again after that. The start-up method returns and nothing happens.
Also, anecdotally: I am using a peripheral device (an MBox Pro 3) to play the sound, which Core Audio only powers up when it is about to output. For example, if I start iTunes, the speakers pop faintly and an LED goes from blinking to solid. The device powers up as usual here, so Core Audio is definitely doing something. (I have of course also tried it with the onboard MacBook sound, without the device.)
I've read other solutions to problems that sound similar and they don't work, such as using multiple buffers, which I am doing now and which doesn't appear to make any difference.
I assume I am doing something obviously wrong, but I'm not sure what it could be. I've read the relevant documentation, looked at the available code examples, and scoured the net for answers, and it appears that this is all I need to do and it should just work.
At the very least, is there anything else I can do to investigate?
My first answer was not good enough, so I put together a minimal example that plays a 2-channel, 16-bit WAV file.
The main difference from your code is that I made a property listener listening for play start and stop events.
As for your code, it seems legit at first glance. Two things I will point out, though:
1. It seems you are allocating buffers that are TOO SMALL. I have noticed that an AudioQueue won't play if the buffers are too small, which seems to fit your problem.
2. Have you verified the properties returned from the file? (A quick check is sketched below.)
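For instance, something like this (just an illustration using the variable names from your question, not required code):
// Dump the ASBD that AudioFileGetProperty filled in and make sure it looks sane
printf("formatID = %u, sampleRate = %.0f, bytesPerPacket = %u, framesPerPacket = %u, channels = %u\n",
    (unsigned)aStreamData->aDescription.mFormatID,
    aStreamData->aDescription.mSampleRate,
    (unsigned)aStreamData->aDescription.mBytesPerPacket,
    (unsigned)aStreamData->aDescription.mFramesPerPacket,
    (unsigned)aStreamData->aDescription.mChannelsPerFrame);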
Back to my code example:
Everything is hard coded, so it is not exactly good coding practice, but it shows how you can do it.
AudioStreamTest.h
#import <Foundation/Foundation.h>
#import <AudioToolbox/AudioToolbox.h>
uint32_t bufferSizeInSamples;
AudioFileID file;
UInt32 currentPacket;
AudioQueueRef audioQueue;
AudioQueueBufferRef buffer[3];
AudioStreamBasicDescription audioStreamBasicDescription;
@interface AudioStreamTest : NSObject
- (void)start;
- (void)stop;
@end
AudioStreamTest.m
#import "AudioStreamTest.h"
@implementation AudioStreamTest
- (id)init
{
self = [super init];
if (self) {
bufferSizeInSamples = 441;
file = NULL;
currentPacket = 0;
audioStreamBasicDescription.mBitsPerChannel = 16;
audioStreamBasicDescription.mBytesPerFrame = 4;
audioStreamBasicDescription.mBytesPerPacket = 4;
audioStreamBasicDescription.mChannelsPerFrame = 2;
audioStreamBasicDescription.mFormatFlags = kAudioFormatFlagIsSignedInteger | kAudioFormatFlagIsPacked;
audioStreamBasicDescription.mFormatID = kAudioFormatLinearPCM;
audioStreamBasicDescription.mFramesPerPacket = 1;
audioStreamBasicDescription.mReserved = 0;
audioStreamBasicDescription.mSampleRate = 44100;
}
return self;
}
- (void)start {
AudioQueueNewOutput(&audioStreamBasicDescription, AudioEngineOutputBufferCallback, (__bridge void *)(self), NULL, NULL, 0, &audioQueue);
AudioQueueAddPropertyListener(audioQueue, kAudioQueueProperty_IsRunning, AudioEnginePropertyListenerProc, NULL);
AudioQueueStart(audioQueue, NULL);
}
- (void)stop {
AudioQueueStop(audioQueue, YES);
AudioQueueRemovePropertyListener(audioQueue, kAudioQueueProperty_IsRunning, AudioEnginePropertyListenerProc, NULL);
}
void AudioEngineOutputBufferCallback(void *inUserData, AudioQueueRef inAQ, AudioQueueBufferRef inBuffer) {
if (file == NULL) return;
UInt32 bytesRead = bufferSizeInSamples * 4;
UInt32 packetsRead = bufferSizeInSamples;
AudioFileReadPacketData(file, false, &bytesRead, NULL, currentPacket, &packetsRead, inBuffer->mAudioData);
inBuffer->mAudioDataByteSize = bytesRead;
currentPacket += packetsRead;
if (bytesRead == 0) {
AudioQueueStop(inAQ, false);
}
else {
AudioQueueEnqueueBuffer(inAQ, inBuffer, 0, NULL);
}
}
void AudioEnginePropertyListenerProc (void *inUserData, AudioQueueRef inAQ, AudioQueuePropertyID inID) {
//We are only interested in the property kAudioQueueProperty_IsRunning
if (inID != kAudioQueueProperty_IsRunning) return;
//Get the status of the property
UInt32 isRunning = false;
UInt32 size = sizeof(isRunning);
AudioQueueGetProperty(inAQ, kAudioQueueProperty_IsRunning, &isRunning, &size);
if (isRunning) {
currentPacket = 0;
NSString *fileName = @"/Users/roy/Documents/XCodeProjectsData/FUZZ/03.wav";
NSURL *fileURL = [[NSURL alloc] initFileURLWithPath: fileName];
AudioFileOpenURL((__bridge CFURLRef) fileURL, kAudioFileReadPermission, 0, &file);
for (int i = 0; i < 3; i++){
AudioQueueAllocateBuffer(audioQueue, bufferSizeInSamples * 4, &buffer[i]);
UInt32 bytesRead = bufferSizeInSamples * 4;
UInt32 packetsRead = bufferSizeInSamples;
AudioFileReadPacketData(file, false, &bytesRead, NULL, currentPacket, &packetsRead, buffer[i]->mAudioData);
buffer[i]->mAudioDataByteSize = bytesRead;
currentPacket += packetsRead;
AudioQueueEnqueueBuffer(audioQueue, buffer[i], 0, NULL);
}
}
else {
if (file != NULL) {
AudioFileClose(file);
file = NULL;
for (int i = 0; i < 3; i++) {
AudioQueueFreeBuffer(audioQueue, buffer[i]);
buffer[i] = NULL;
}
}
}
}
-(void)dealloc {
AudioQueueDispose(audioQueue, true);
audioQueue = NULL;
[super dealloc]; // under MRC, call super last
}
@end
Lastly, I want to include some research I have done today to test the robustness of AudioQueues.
I have noticed that if you make the AudioQueue buffers too small, it won't play at all. That made me experiment a bit to see why it is not playing.
If I try a buffer size that can hold only 150 samples, I get no sound at all.
If I try a buffer size that can hold 175 samples, it plays the whole song through, but with a lot of distortion. 175 samples amounts to a tad less than 4 ms of audio at 44.1 kHz.
The AudioQueue keeps asking for new buffers as long as you keep supplying them, regardless of whether it is actually playing your buffers or not.
If you supply a buffer with size 0, the buffer is lost and kAudioQueueErr_BufferEmpty is returned for that enqueue request. You will never see the AudioQueue ask you to fill that buffer again. If this happens for the last buffer you have enqueued, the AudioQueue will stop asking you to fill any more buffers, and you will not hear any more audio for that session.
To see why the AudioQueue was not playing anything with smaller buffer sizes, I made a test to check whether my callback was called at all even when there was no sound. The answer is that the callback gets called all the time, as long as the AudioQueue is running and needs data.
So if you keep feeding buffers to the queue, no buffer is ever lost. It doesn't happen, unless there is an error of course.
So why is no sound playing?
I tested whether AudioQueueEnqueueBuffer() returned any errors. It did not, and there were no other errors within my play routine either. The data read from the file is also good.
Everything is normal, buffers are good, data re-enqueued is good, there is just no sound.
So my last test was to slowly increase buffer size till I could hear anything. I finally heard faint and sporadic distortion.
Then it came to me...
It seems the problem is that the system tries to keep the stream in sync with time: if you enqueue audio and the time at which that audio should have played has already passed, the queue simply skips that part of the buffer. If the buffer size is too small, more and more data is dropped or skipped until the audio system is back in sync, which never happens if the buffers stay too small. (You can hear this as distortion if you choose a buffer size that is barely large enough to support continuous play.)
If you think about it, it is the only way the audio queue can work, but it is a good realisation when you are clueless like me and "discover" how it really works.
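A practical takeaway (my own rule of thumb, not something taken from the Core Audio documentation): size the buffers by duration rather than by packet count. For the linear PCM format used above, that could look like this:
// Sketch, assuming constant-bitrate linear PCM (mBytesPerFrame != 0):
// allocate each buffer to hold a fixed duration of audio, e.g. 0.5 seconds
Float64 seconds = 0.5;
UInt32 frames = (UInt32)(seconds * audioStreamBasicDescription.mSampleRate);
UInt32 bufferBytes = frames * audioStreamBasicDescription.mBytesPerFrame;
AudioQueueAllocateBuffer(audioQueue, bufferBytes, &buffer[i]); // inside the allocation loop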
I decided to take a look at this again and was able to solve it by making the buffers larger. I've accepted the answer by @RoyGal since it was their suggestion, but I wanted to provide the actual working code, since I guess others are having the same problem (the question has a few favorites that aren't me at the moment).
One thing I tried was making the packet size larger:
aData->aDescription.mFramesPerPacket = 512; // or some other number
aData->aDescription.mBytesPerPacket = (
aData->aDescription.mFramesPerPacket * aData->aDescription.mBytesPerFrame
);
This does NOT work: it causes AudioQueuePrime to fail with an "AudioConverterNew returned -50" message. I guess it wants mFramesPerPacket to be 1 for PCM.
(I also tried setting the kAudioQueueProperty_DecodeBufferSizeFrames property which didn't seem to do anything. Not sure what it's for.)
The solution seems to be to only allocate the buffer(s) with the specified size:
AudioQueueAllocateBuffer(
aData->aQueue,
aData->aDescription.mBytesPerPacket * N_BUFFER_PACKETS / N_BUFFERS,
&aData->aBuffer[i]
);
And the size has to be sufficiently large. I found the magic number is:
mBytesPerPacket * 1024 / N_BUFFERS
(Where N_BUFFERS is the number of buffers and should be > 1 or playback is choppy.)
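For example, with 16-bit stereo PCM at 44.1 kHz (mBytesPerPacket = 4, as in the answer above) and N_BUFFERS = 2, an illustrative calculation:
// 4 * 1024 / 2 = 2048 bytes per buffer
// 2048 bytes / 4 bytes per frame = 512 frames, i.e. roughly 11.6 ms of audio per buffer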
Here is an MCVE demonstrating the issue and solution:
#import <Foundation/Foundation.h>
#import <AudioToolbox/AudioToolbox.h>
#import <AudioToolbox/AudioQueue.h>
#import <AudioToolbox/AudioFile.h>
#define N_BUFFERS 2
#define N_BUFFER_PACKETS 1024
typedef struct AStreamData {
AudioFileID aFile;
AudioQueueRef aQueue;
AudioQueueBufferRef aBuffer[N_BUFFERS];
AudioStreamBasicDescription aDescription;
SInt64 pOffset;
volatile BOOL isRunning;
} AStreamData;
void printASBD(AudioStreamBasicDescription* desc) {
printf("mSampleRate = %d\n", (int)desc->mSampleRate);
printf("mBytesPerPacket = %d\n", desc->mBytesPerPacket);
printf("mFramesPerPacket = %d\n", desc->mFramesPerPacket);
printf("mBytesPerFrame = %d\n", desc->mBytesPerFrame);
printf("mChannelsPerFrame = %d\n", desc->mChannelsPerFrame);
printf("mBitsPerChannel = %d\n", desc->mBitsPerChannel);
}
void bufferCallback(
void *vData, AudioQueueRef aQueue, AudioQueueBufferRef aBuffer
) {
AStreamData* aData = (AStreamData*)vData;
UInt32 bRead = 0;
UInt32 pRead = (
aBuffer->mAudioDataBytesCapacity / aData->aDescription.mBytesPerPacket
);
OSStatus stat;
stat = AudioFileReadPackets(
aData->aFile, false, &bRead, NULL, aData->pOffset, &pRead, aBuffer->mAudioData
);
if(stat != 0) {
printf("AudioFileReadPackets returned %d\n", stat);
}
if(pRead == 0) {
aData->isRunning = NO;
return;
}
aBuffer->mAudioDataByteSize = bRead;
stat = AudioQueueEnqueueBuffer(aQueue, aBuffer, 0, NULL);
if(stat != 0) {
printf("AudioQueueEnqueueBuffer returned %d\n", stat);
}
aData->pOffset += pRead;
}
AStreamData* beginPlayback(NSURL* path) {
static AStreamData* aData;
aData = malloc(sizeof(AStreamData));
OSStatus stat;
stat = AudioFileOpenURL(
(CFURLRef)path, kAudioFileReadPermission, 0, &aData->aFile
);
printf("AudioFileOpenURL returned %d\n", stat);
UInt32 dSize = 0;
stat = AudioFileGetPropertyInfo(
aData->aFile, kAudioFilePropertyDataFormat, &dSize, 0
);
printf("AudioFileGetPropertyInfo returned %d\n", stat);
stat = AudioFileGetProperty(
aData->aFile, kAudioFilePropertyDataFormat, &dSize, &aData->aDescription
);
printf("AudioFileGetProperty returned %d\n", stat);
printASBD(&aData->aDescription);
stat = AudioQueueNewOutput(
&aData->aDescription, bufferCallback, aData, NULL, NULL, 0, &aData->aQueue
);
printf("AudioQueueNewOutput returned %d\n", stat);
aData->pOffset = 0;
for(int i = 0; i < N_BUFFERS; i++) {
// change YES to NO to reproduce the failing (too-small buffer) case
if(YES) {
stat = AudioQueueAllocateBuffer(
aData->aQueue,
aData->aDescription.mBytesPerPacket * N_BUFFER_PACKETS / N_BUFFERS,
&aData->aBuffer[i]
);
} else {
stat = AudioQueueAllocateBuffer(
aData->aQueue,
aData->aDescription.mBytesPerPacket,
&aData->aBuffer[i]
);
}
printf(
"AudioQueueAllocateBuffer returned %d for aBuffer[%d] with capacity %d\n",
stat, i, aData->aBuffer[i]->mAudioDataBytesCapacity
);
bufferCallback(aData, aData->aQueue, aData->aBuffer[i]);
}
UInt32 numFramesPrepared = 0;
stat = AudioQueuePrime(aData->aQueue, 0, &numFramesPrepared);
printf("AudioQueuePrime returned %d with %d frames prepared\n", stat, numFramesPrepared);
stat = AudioQueueStart(aData->aQueue, NULL);
printf("AudioQueueStart returned %d\n", stat);
UInt32 pSize = sizeof(UInt32);
UInt32 isRunning;
stat = AudioQueueGetProperty(
aData->aQueue, kAudioQueueProperty_IsRunning, &isRunning, &pSize
);
printf("AudioQueueGetProperty returned %d\n", stat);
aData->isRunning = !!isRunning;
return aData;
}
void endPlayback(AStreamData* aData) {
OSStatus stat = AudioQueueStop(aData->aQueue, NO);
printf("AudioQueueStop returned %d\n", stat);
}
NSString* getPath() {
// change NO to YES and enter path to hard code
if(NO) {
return @"";
}
char input[512];
printf("Enter file path: ");
scanf("%511[^\n]", input); // bounded read into the 512-byte buffer
return [[NSString alloc] initWithCString:input encoding:NSASCIIStringEncoding];
}
int main(int argc, const char* argv[]) {
NSAutoreleasePool* pool = [[NSAutoreleasePool alloc] init];
NSURL* path = [NSURL fileURLWithPath:getPath()];
AStreamData* aData = beginPlayback(path);
if(aData->isRunning) {
do {
printf("Queue is running...\n");
[NSThread sleepForTimeInterval:1.0];
} while(aData->isRunning);
endPlayback(aData);
} else {
printf("Playback did not start\n");
}
[pool drain];
return 0;
}

how to extract byte data from bluetooth heart rate monitor in objective c

I'm having trouble understanding bytes and uint8_t values.
I am using the sample project created by Apple that reads data from a Bluetooth 4.0 heart rate monitor via the Heart Rate Service. The sample project extracts the heart rate data as below:
- (void) updateWithHRMData:(NSData *)data
{
const uint8_t *reportData = [data bytes];
uint16_t bpm = 0;
if ((reportData[0] & 0x01) == 0)
{
/* uint8 bpm */
bpm = reportData[1];
}
else
{
/* uint16 bpm */
bpm = CFSwapInt16LittleToHost(*(uint16_t *)(&reportData[1]));
}
I am assuming that (reportData[0] & 0x01) returns the first bit of the first byte in the reportData array, but I don't know how to access the second; (reportData[0] & 0x02) doesn't work the way I thought it would.
Ideally I would like to check all the data in reportData[0] and then, based on that, grab the RR-interval data in reportData[4] or [5], dependent on where it is stored, and iterate through it to get each value, as I believe there can be multiple values stored there.
A newbie question, I know, but I'm having trouble finding the answer, or indeed the search terms to find it.
When you do reportData[0] you are getting the first byte (at index 0). When you combine that value with reportData[0] & 0x02, you are masking out all but the 2nd bit. This result will either be 0 (if bit 2 is not set) or it will be 2 (if the 2nd bit is set).
if ((reportData[0] & 0x02) == 0) {
// bit 2 of first byte is not set
} else {
// bit 2 of first byte is set
}
If you want to check all 8 bits then you could do:
uint8_t byte = reportData[0];
for (int i = 0; i < 8; i++) {
int mask = 1 << i;
if ((byte & mask) == 0) {
// bit i is not set
} else {
// bit i is set
}
}
Update: To extract a value that spans two bits you do something like this:
uint8_t mask = 0x01 | 0x02; // Good for value stored in the first two bits
uint8_t value = byte & mask; // value now has just value from first two bits
If the value to extract is in higher bits then there is an extra step:
uint8_t mask = 0x02 | 0x04; // Good for value in 2nd and 3rd bits
uint8_t value = (byte & mask) >> 1; // need to shift value to convert to regular integer
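Applied to the heart rate monitor's flags byte, for instance, the sensor contact status occupies bits 1 and 2 (per the Heart Rate Measurement characteristic; treat this as a sketch rather than something verified against your device):
uint8_t flags = reportData[0];
uint8_t contactMask = 0x02 | 0x04;                  // bits 1 and 2
uint8_t contactStatus = (flags & contactMask) >> 1; // 0..3 per the spec's enumeration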
Check this post for a discussion of the sample code. The post also links to the Bluetooth spec, which should help you understand why the endianness check is being performed (Apple is ensuring maximum portability). The first byte is a bit field describing the format of the heart rate value and the presence or absence of Energy Expended (EE) and RR-interval data. So:
reportData[0] & 0x08
tells you whether EE data is present (non-zero = yes, 0 = no), and
reportData[0] & 0x10
tells you whether RR-interval data is present (non-zero = yes, 0 = no).
You can then get RR interval data with
uint16_t rrinterval;
rrinterval = CFSwapInt16LittleToHost(*(uint16_t *)(&reportData[idx]));
where idx is determined by the presence/absence tests you performed. The offsets are not fixed, as you indicate; they depend on which optional fields precede the RR values (with a 16-bit heart rate value and EE data both present, the RR data starts at offset 5).
Part of the original question has not been answered yet. Rory also wants to know how to parse all the RR-interval data, as there can be multiple values within one message (I have seen up to three). The RR-interval data is not always located at the same byte offset. It depends on several things:
whether the BPM is written as a single byte or two;
whether EE data is present.
From the remaining length of the message you can then calculate the number of RR-interval values.
Here is the actual spec of the Heart_rate_measurement characteristic
// Instance method to get the heart rate BPM information
- (void) getHeartBPMData:(CBCharacteristic *)characteristic error:(NSError *)error
{
// Get the BPM //
// https://developer.bluetooth.org/gatt/characteristics/Pages/CharacteristicViewer.aspx?u=org.bluetooth.characteristic.heart_rate_measurement.xml //
// Convert the contents of the characteristic value to a data-object //
NSData *data = [characteristic value];
// Get the byte sequence of the data-object //
const uint8_t *reportData = [data bytes];
// Initialise the offset variable //
NSUInteger offset = 1;
// Initialise the bpm variable //
uint16_t bpm = 0;
// Next, obtain the flags byte at index 0 in the array (reportData[0]) and mask out all but the first bit (bit 0) //
// The result will be 0 if the heart rate value is a uint8, or 1 if it is a uint16 //
// If the bit is not set, retrieve the BPM value from the single byte at index 1 in the array //
if ((reportData[0] & 0x01) == 0) {
// Retrieve the BPM value for the Heart Rate Monitor
bpm = reportData[1];
offset = offset + 1; // Plus 1 byte //
}
else {
// If the bit is set, the BPM value occupies the two bytes starting at index 1; //
// convert it to a 16-bit value based on the host's native byte order //
bpm = CFSwapInt16LittleToHost(*(uint16_t *)(&reportData[1]));
offset = offset + 2; // Plus 2 bytes //
}
NSLog(@"bpm: %i", bpm);
// Determine if EE (Energy Expended) data is present //
// If bit 3 (0x08) of the flags byte is set, there is EE data //
// If so, increase the offset by 2 bytes //
if (reportData[0] & 0x08) {
offset = offset + 2; // Plus 2 bytes //
}
// Determine if RR-interval data is present //
// If bit 4 (0x10) of the flags byte is set, there is RR data //
if ((reportData[0] & 0x10) == 0)
{
NSLog(@"%@", @"RR-interval data is not present");
}
else
{
// The number of RR-interval values is total bytes left / 2 (size of uint16) //
NSUInteger length = [data length];
NSUInteger count = (length - offset)/2;
NSLog(@"RR count: %lu", (unsigned long)count);
for (int i = 0; i < count; i++) {
// The unit for RR interval is 1/1024 seconds //
uint16_t value = CFSwapInt16LittleToHost(*(uint16_t *)(&reportData[offset]));
value = ((double)value / 1024.0) * 1000.0; // Convert from 1/1024-second units to milliseconds //
offset = offset + 2; // Plus 2 bytes //
NSLog(@"RR value %lu: %u", (unsigned long)i, value);
}
}
}