Example for length-prefix framing for protocol buffers - serialization

I'm currently working on a protocol buffer system for transporting large messages of up to 6 Mb. My concern is that I misinterpreted the following post (https://eli.thegreenplace.net/2011/08/02/length-prefix-framing-for-protocol-buffers). My understanding of that post is:
message GeometryInTime
{
    uint32 vecLength = 1;
    message Vector3d
    {
        optional double x = 1;
        optional double y = 2;
        optional double z = 3;
    }
    // Field numbers must be unique within a message, so this cannot reuse 1
    uint32 timeStampLength = 2;
    message Timestamp
    {
        optional int64 seconds = 1;
        optional uint32 nanos = 2;
    }
}
Is that a valid implementation of the length-prefixed system? Does it work for repeated fields? Does the length hold the serialized or the unserialized length (I'm confusing myself with that)? Does this work for partial message deserialization?
Edit:
message Vector3d
{
    optional double x = 1;
    optional double y = 2;
    optional double z = 3;
}
message Timestamp
{
    optional int64 seconds = 1;
    optional uint32 nanos = 2;
}
message GeometryInTime
{
    uint32 vecLength = 1;
    optional Vector3d vector = 2;
    uint32 timeStampLength = 3;
    optional Timestamp timestamp = 4;
}

An embedded message is just a definition, not a usage. Right now GeometryInTime contains only the lengths.
In terms of embedding sub-messages: there are two formats: length-prefixed and grouped (start/end token, this option is basically deprecated now). When using length-prefixed, the library deals with everything - the length will always be "varint" encoded.
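To make the "varint" length concrete, here is a minimal sketch of base-128 varint encoding and decoding, the scheme protocol buffers use for the lengths of embedded messages. The function names append_varint and read_varint are made up for this illustration; in real code the protobuf library does this for you.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Append `value` as a base-128 varint: 7 payload bits per byte, least
// significant group first, high bit set on every byte except the last.
void append_varint(std::uint64_t value, std::vector<std::uint8_t>& out) {
    while (value >= 0x80) {
        out.push_back(static_cast<std::uint8_t>((value & 0x7F) | 0x80));
        value >>= 7;
    }
    out.push_back(static_cast<std::uint8_t>(value));
}

// Read one varint starting at `pos` and advance `pos` past it.
// Returns false if the buffer ends mid-varint or the varint is too long.
bool read_varint(const std::vector<std::uint8_t>& in, std::size_t& pos,
                 std::uint64_t& value) {
    value = 0;
    for (int shift = 0; shift < 64 && pos < in.size(); shift += 7) {
        const std::uint8_t byte = in[pos++];
        value |= static_cast<std::uint64_t>(byte & 0x7F) << shift;
        if ((byte & 0x80) == 0) {
            return true;  // high bit clear: this was the last byte
        }
    }
    return false;
}
```

For instance, a length of 300 is written as the two bytes 0xAC 0x02, while any length below 128 takes a single byte.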
The only time a custom length-prefix approach is relevant is for the root message, as part of a framing protocol. In that scenario the library has nothing to do with it, so no change to the message definition will make any difference: you need to handle the frame data (length prefix etc., in whatever format) outside of the serializer.
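As a rough sketch of what "outside of the serializer" can look like, the code below frames an already-serialized message with a length prefix and recovers it from a receive buffer. The 4-byte big-endian prefix and the names frame_message and try_unframe are assumptions made for this example; any prefix format works as long as both ends agree on it. The payload is treated as opaque bytes, so the prefix always counts serialized bytes.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <string>

// Prepend a 4-byte big-endian length to an already-serialized message.
std::string frame_message(const std::string& payload) {
    assert(payload.size() <= 0xFFFFFFFFu);  // length must fit in 32 bits
    const std::uint32_t n = static_cast<std::uint32_t>(payload.size());
    std::string framed;
    framed.reserve(4 + payload.size());
    for (int shift = 24; shift >= 0; shift -= 8) {
        framed.push_back(static_cast<char>((n >> shift) & 0xFF));
    }
    framed += payload;
    return framed;
}

// Try to take one complete frame off the front of `buffer`.
// Returns true and fills `payload` if a whole frame was available;
// returns false (leaving `buffer` untouched) if more bytes are needed.
bool try_unframe(std::string& buffer, std::string& payload) {
    if (buffer.size() < 4) {
        return false;
    }
    std::uint32_t n = 0;
    for (int i = 0; i < 4; ++i) {
        n = (n << 8) | static_cast<unsigned char>(buffer[i]);
    }
    if (buffer.size() - 4 < n) {
        return false;
    }
    payload = buffer.substr(4, n);
    buffer.erase(0, 4 + static_cast<std::size_t>(n));
    return true;
}
```

With the protobuf C++ library, the payload would typically come from SerializeToString on the sending side and go to ParseFromString on the receiving side. Because each frame is self-delimiting, a large stream can be consumed one message at a time.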

Related

Lexing/tokenization delimited strings

I'm writing a hand-written lexer for a small language but have one weird requirement that I'm not sure how to handle.
I need to be able to support the notion of delimited strings where the delimiter could be any char. eg. strings are most likely to be delimited using double quotes (eg. "hello") but it could just as easily be /hello/ or ,hello,
eg. some sample input lines might be:
x = /abc/
y = "abc" + ,def,
z = zabcz
The last case is a bit pathological, but technically possible.
I'm trying to work out whether there's any way I can do this in the tokenization phase in the general case. Any thoughts or suggestions would be grand.
Here are solutions in c++ and js.
c++
#include <algorithm>
#include <string>
#include <utility>

// Lexically analyze one line of the form `name = <d>value<d>`, where <d> can
// be any single delimiter character. Returns {name, value}.
std::pair<std::string, std::string> lex_argument(const std::string& code) {
    const std::size_t equal_location = code.find('=');
    if (equal_location == std::string::npos || equal_location == 0) {
        return {code, ""};  // not an assignment
    }
    // Assumes exactly one space on each side of '=', as in the sample input
    std::string variable_name = code.substr(0, equal_location - 1);
    std::string variable_value = code.substr(std::min(equal_location + 2, code.length()));
    // The first and last characters of the value are the string delimiters
    if (variable_value.length() >= 2 && variable_value.front() == variable_value.back()) {
        variable_value.pop_back();
        variable_value.erase(0, 1);
    }
    return {variable_name, variable_value};
}
and in js:
function lex_argument(code) {
    var equalLocation = code.indexOf("=");
    // Everything before '=' is the name, everything after it is the value
    var name = code.slice(0, equalLocation).trim();
    var value = code.slice(equalLocation + 1).trim();
    // The first and last characters of the value are the string delimiters
    var stringDelimiters = [value.charAt(0), value.charAt(value.length - 1)];
    if (value.length >= 2 && stringDelimiters[0] === stringDelimiters[1]) {
        value = value.slice(1, -1);
    }
    return [name, value];
}
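The functions above handle a single delimited string per line. A line such as y = "abc" + ,def, needs the lexer to know when it is expecting an operand, because only then can the next character be treated as a delimiter. The sketch below illustrates that idea under an assumed grammar in which the right-hand side is one or more delimited strings joined by '+'; the name lex_strings is hypothetical.

```cpp
#include <cassert>
#include <cctype>
#include <cstddef>
#include <string>
#include <vector>

// Scan the right-hand side of an assignment and collect its string operands.
// Whenever an operand is expected, the next non-space character is taken as
// the delimiter and the string runs to the next occurrence of that character.
// Returns false on an unterminated string, a missing '+', or a missing operand.
bool lex_strings(const std::string& rhs, std::vector<std::string>& out) {
    std::size_t pos = 0;
    bool expect_operand = true;
    while (true) {
        while (pos < rhs.size() &&
               std::isspace(static_cast<unsigned char>(rhs[pos]))) {
            ++pos;
        }
        if (pos == rhs.size()) {
            return !expect_operand;  // input must not end right after '+'
        }
        if (expect_operand) {
            const char delim = rhs[pos];
            const std::size_t close = rhs.find(delim, pos + 1);
            if (close == std::string::npos) {
                return false;  // no closing delimiter
            }
            out.push_back(rhs.substr(pos + 1, close - pos - 1));
            pos = close + 1;
            expect_operand = false;
        } else {
            if (rhs[pos] != '+') {
                return false;
            }
            ++pos;
            expect_operand = true;
        }
    }
}
```

The design point is that the delimiter is decided by position in the grammar rather than by the character itself, which is what makes the pathological zabcz case work. A string cannot contain its own delimiter in this sketch.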

How can I read individual pixels from a CVPixelBuffer

AVDepthData gives me a CVPixelBuffer of depth data. But I can't find a way to easily access the depth information in this CVPixelBuffer. Is there a simple recipe in Objective-C to do so?
You have to use the CVPixelBuffer APIs to get the right format to access the data via unsafe pointer manipulations. Here is the basic way:
CVPixelBufferRef pixelBuffer = _lastDepthData.depthDataMap;
CVPixelBufferLockBaseAddress(pixelBuffer, 0);
size_t cols = CVPixelBufferGetWidth(pixelBuffer);
size_t rows = CVPixelBufferGetHeight(pixelBuffer);
// Rows can be padded, so step between rows by bytes-per-row rather than by width
size_t bytesPerRow = CVPixelBufferGetBytesPerRow(pixelBuffer);
const uint8_t *baseAddress = CVPixelBufferGetBaseAddress(pixelBuffer);
// This next step is not necessary, but I include it here for illustration:
// you can get the pixel format, which is one of the kCVPixelFormatType constants
// and tells you what type of data it is, e.g. in this case Float32
OSType type = CVPixelBufferGetPixelFormatType(pixelBuffer);
if (type != kCVPixelFormatType_DepthFloat32) {
    NSLog(@"Wrong type");
}
// Arbitrary values of x and y to sample
int x = 20; // must be lower than cols
int y = 30; // must be lower than rows
// Get the pixel. You could iterate here of course to get multiple pixels!
const Float32 *row = (const Float32 *)(baseAddress + y * bytesPerRow);
const Float32 pixel = row[x];
CVPixelBufferUnlockBaseAddress(pixelBuffer, 0);
Note that the first thing you need to determine is what type of data is in the CVPixelBuffer - if you don't know this then you can use CVPixelBufferGetPixelFormatType() to find out. In this case I am getting depth data at Float32, if you were using another type e.g. Float16, then you would need to replace all occurrences of Float32 with that type.
Note that it's important to lock and unlock the base address using CVPixelBufferLockBaseAddress and CVPixelBufferUnlockBaseAddress.
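One detail worth illustrating separately is row padding: the distance in bytes between two rows of a pixel buffer (what CVPixelBufferGetBytesPerRow reports) can be larger than width times the pixel size. The following is a plain C++ sketch of the indexing arithmetic only, using an ordinary byte array rather than a real CVPixelBuffer; sample_at is a made-up helper name.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>

// Read the float at column x, row y from a buffer whose rows start
// `bytes_per_row` bytes apart (which may exceed width * sizeof(float)).
float sample_at(const std::uint8_t* base, std::size_t bytes_per_row,
                std::size_t x, std::size_t y) {
    float value;
    // memcpy sidesteps alignment and aliasing concerns of a pointer cast.
    std::memcpy(&value, base + y * bytes_per_row + x * sizeof(float),
                sizeof(float));
    return value;
}
```

If the index were computed as y * width + x on a padded buffer, every row after the first would be read from the wrong place.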

Return same double only if the double is an int? (no decimals) Obj-C

I'm using a for-loop to determine whether the long double is an int. I have it set up that the for loop loops another long double that is between 2 and final^1/2. Final is a loop I have set up that is basically 2 to the power of 2-10 minus 1. I am then checking if final is an integer. My question is how can I get only the final values that are integers?
My explanation may have been a bit confusing so here is my entire loop code. BTW I am using long doubles because I plan on increasing these numbers very largely.
for (long double ld = 1; ld < 10; ld++) {
    long double final = powl(2, ld) - 1;
    // Would return e.g. 1, 3, 7, 15, 31, 63... etc.
    for (long double pD = 2; pD <= powl(final, 0.5); pD++) {
        // Create new long double
        long double newFinal = final / pD;
        // Check if new long double is int
        long int intPart = (long int)newFinal;
        long double newLong = newFinal - intPart;
        if (newLong == 0) {
            NSLog(@"Integer");
            // Return only the final ints?
        }
    }
}
Just cast it to an int and subtract it from itself?
long double d;
//assign a value to d
int i = (int)d;
if((double)(d - i) == 0) {
//d has no fractional part
}
As a note... because of the way floating point math works in programming, this == check isn't necessarily the best thing to do. Better would be to decide on a certain level of tolerance, and check whether d was within that tolerance.
For example:
if(fabs((double)(d - i)) < 0.000001) {
//d's fractional part is close enough to 0 for your purposes
}
You can also use long long int and long double to accomplish the same thing. Just be sure you're using the right absolute value function for whatever type you're using:
fabsf(float)
fabs(double)
fabsl(long double)
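Since the question mentions growing these numbers a lot, here is an alternative sketch that never casts to an integer type, so it does not depend on the value fitting in an int or long. It uses std::trunc and std::round from the standard library; the helper names is_whole and is_nearly_whole are invented for this example.

```cpp
#include <cassert>
#include <cmath>

// True when `value` has no fractional part (exact comparison).
bool is_whole(long double value) {
    return std::trunc(value) == value;
}

// True when `value` lies within `tolerance` of the nearest whole number.
bool is_nearly_whole(long double value, long double tolerance) {
    return std::fabs(value - std::round(value)) <= tolerance;
}
```

The exact version suits values that come from exact arithmetic, such as dividing one whole number by another; the tolerant version is the safer choice after a chain of inexact operations.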
EDIT... Based on clarification of the actual problem... it seems you're just trying to figure out how to return a collection from a method.
-(NSMutableArray*)yourMethodName {
NSMutableArray *retnArr = [NSMutableArray array];
for(/*some loop logic*/) {
// logic to determine if the number is an int
if(/*number is an int*/) {
[retnArr addObject:[NSNumber numberWithInt:/*current number*/]];
}
}
return retnArr;
}
Stick your logic into this method. Once you've found a number you want to return, stick it into the array using the [retnArr addObject:[NSNumber numberWithInt:]]; method I put up there.
Once you've returned the array, access the numbers like this:
[[arrReturnedFromMethod objectAtIndex:someIndex] intValue];
Optionally, you might want to throw them into the NSNumber object as different types.
You can also use:
[NSNumber numberWithDouble:]
[NSNumber numberWithLongLong:]
And there are matching getters (doubleValue,longLongValue) to extract the number. There are lots of other methods for NSNumber, but these seem the most likely you'd want to be using.

how to extract byte data from bluetooth heart rate monitor in objective c

I'm having trouble understanding bytes and uint8_t values.
I am using the sample project created by Apple that reads data from a Bluetooth 4.0 heart rate monitor via the heart rate service protocol. The sample project gives out heart rate data as below:
- (void) updateWithHRMData:(NSData *)data
{
const uint8_t *reportData = [data bytes];
uint16_t bpm = 0;
if ((reportData[0] & 0x01) == 0)
{
/* uint8 bpm */
bpm = reportData[1];
}
else
{
/* uint16 bpm */
bpm = CFSwapInt16LittleToHost(*(uint16_t *)(&reportData[1]));
}
I am assuming that (reportData[0] & 0x01) returns the first data bit in the data array reportData, but I don't know how to access the second; (reportData[0] & 0x02) doesn't work like I thought it would.
Ideally I would like to check all the data in reportData[0] and then, based on that, grab the RR interval data in reportData[4] or [5], depending on where it is stored, and iterate through it to get each value, as I believe there can be multiple values stored there.
A newbie question, I know, but I'm having trouble finding the answer, or indeed the search terms to establish the answer.
When you do reportData[0] you are getting the first byte (at index 0). When you combine that value with a mask, as in reportData[0] & 0x02, you are masking out all but the 2nd bit. The result will be either 0 (if the 2nd bit is not set) or 2 (if the 2nd bit is set).
if ((reportData[0] & 0x02) == 0) {
// bit 2 of first byte is not set
} else {
// bit 2 of first byte is set
}
If you want to check all 8 bits then you could do:
uint8_t byte = reportData[0];
for (int i = 0; i < 8; i++) {
    int mask = 1 << i;
    if ((byte & mask) == 0) {
        // bit i is not set
    } else {
        // bit i is set
    }
}
Update: To extract a value that spans two bits you do something like this:
uint8_t mask = 0x01 | 0x02; // Good for value stored in the first two bits
uint8_t value = byte & mask; // value now has just value from first two bits
If the value to extract is in higher bits then there is an extra step:
uint8_t mask = 0x02 | 0x04; // Good for value in 2nd and 3rd bits
uint8_t value = (byte & mask) >> 1; // need to shift value to convert to regular integer
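The mask-then-shift pattern generalizes to a field of any width at any position. Below is a small sketch of that generalization; extract_bits is a made-up helper, and bit positions are counted from 0 at the least significant end.

```cpp
#include <cassert>
#include <cstdint>

// Return `width` bits of `byte`, starting at bit `shift` (bit 0 = least
// significant), as an ordinary small integer.
std::uint8_t extract_bits(std::uint8_t byte, unsigned shift, unsigned width) {
    assert(width >= 1 && shift + width <= 8);
    const unsigned mask = (1u << width) - 1u;  // e.g. width 2 gives 0b11
    return static_cast<std::uint8_t>((byte >> shift) & mask);
}
```

For example, with byte 0x16 (binary 00010110), extract_bits(byte, 1, 2) yields 3 and extract_bits(byte, 4, 1) yields 1. Shifting first and masking second gives the same result as masking first and shifting second.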
Check this post for a discussion of the sample code. The post also links to the Bluetooth spec, which should help you understand why the endianness check is being performed (basically it's Apple ensuring maximum portability). The first byte is a bit field describing the format of the heart rate value and the presence/absence of EE and RR interval data. So:
reportData[0] & 0x08
tells you if EE data are present (non-zero = yes, 0 = no), and
reportData[0] & 0x10
tells you if RR interval data are present (non-zero = yes, 0 = no)
You can then get RR interval data with
uint16_t rrinterval;
rrinterval = CFSwapInt16LittleToHost(*(uint16_t *)(&reportData[idx]));
where idx is determined by the presence/absence tests you performed. I am assuming that the offsets are not fixed BTW as that's what you are indicating (i.e., dynamic offsets based on presence/absence) -- I'm not familiar with the BT spec. If the format is fixed, in this case the RR data would be at offset 7.
Part of the original question has not been answered yet. Rory also wants to know how to parse all the RR-interval data, as there can be multiple values within one message (I have seen up to three). The RR-interval data is not always located within the same bytes. It depends on several things:
is the BPM written into a single byte or two?
is there EE data present?
calculate the number of RR-interval values
Here is the actual spec of the Heart_rate_measurement characteristic
// Instance method to get the heart rate BPM information
- (void) getHeartBPMData:(CBCharacteristic *)characteristic error:(NSError *)error
{
// Get the BPM //
// https://developer.bluetooth.org/gatt/characteristics/Pages/CharacteristicViewer.aspx?u=org.bluetooth.characteristic.heart_rate_measurement.xml //
// Convert the contents of the characteristic value to a data-object //
NSData *data = [characteristic value];
// Get the byte sequence of the data-object //
const uint8_t *reportData = [data bytes];
// Initialise the offset variable //
NSUInteger offset = 1;
// Initialise the bpm variable //
uint16_t bpm = 0;
// Next, take the first byte (reportData[0]) and mask out all but the lowest bit (0x01) //
// The result is 0 if that bit is not set, or 1 if it is set //
// If the bit is not set, the BPM is a single byte at index 1 in the array //
if ((reportData[0] & 0x01) == 0) {
// Retrieve the BPM value for the Heart Rate Monitor
bpm = reportData[1];
offset = offset + 1; // Plus 1 byte //
}
else {
// If the lowest bit is set, the BPM is a 16-bit little-endian value starting at index 1 in the array; //
// convert it to the host's native byte order //
bpm = CFSwapInt16LittleToHost(*(uint16_t *)(&reportData[1]));
offset = offset + 2; // Plus 2 bytes //
}
NSLog(@"bpm: %i", bpm);
// Determine if EE data is present //
// If the energy-expended flag (0x08) of the first byte is set, there is EE data //
// If so, increase offset by 2 bytes //
if ((reportData[0] & 0x08) != 0) {
offset = offset + 2; // Plus 2 bytes //
}
// Determine if RR-interval data is present //
// If the RR-interval flag (0x10) of the first byte is set, there is RR data //
if ((reportData[0] & 0x10) == 0)
{
NSLog(@"%@", @"Data are not present");
}
else
{
// The number of RR-interval values is total bytes left / 2 (size of uint16) //
NSUInteger length = [data length];
NSUInteger count = (length - offset)/2;
NSLog(@"RR count: %lu", (unsigned long)count);
for (int i = 0; i < count; i++) {
// The unit for RR interval is 1/1024 seconds //
uint16_t value = CFSwapInt16LittleToHost(*(uint16_t *)(&reportData[offset]));
value = ((double)value / 1024.0 ) * 1000.0;
offset = offset + 2; // Plus 2 bytes //
NSLog(@"RR value %lu: %u", (unsigned long)i, value);
}
}
}
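The same offset bookkeeping can be sketched in portable C++ without pointer casts, which also makes the bounds checks explicit. This is an illustration, not Apple's sample code: the struct and function names are invented, and the flag layout assumed here is the one from the Bluetooth Heart Rate Measurement characteristic (0x01 = 16-bit heart rate, 0x08 = energy expended present, 0x10 = RR intervals present).

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

struct HeartRateMeasurement {
    std::uint16_t bpm = 0;
    bool has_energy_expended = false;
    std::uint16_t energy_expended = 0;
    std::vector<std::uint16_t> rr_intervals;  // raw units of 1/1024 s
};

// Read a little-endian uint16 byte by byte (no alignment requirements).
static std::uint16_t read_u16_le(const std::vector<std::uint8_t>& d,
                                 std::size_t offset) {
    return static_cast<std::uint16_t>(d[offset] | (d[offset + 1] << 8));
}

// Parse one characteristic value. Returns false if the payload is too short.
bool parse_heart_rate(const std::vector<std::uint8_t>& data,
                      HeartRateMeasurement& out) {
    if (data.empty()) return false;
    const std::uint8_t flags = data[0];
    std::size_t offset = 1;

    if (flags & 0x01) {  // 16-bit heart rate value
        if (data.size() < offset + 2) return false;
        out.bpm = read_u16_le(data, offset);
        offset += 2;
    } else {             // 8-bit heart rate value
        if (data.size() < offset + 1) return false;
        out.bpm = data[offset];
        offset += 1;
    }

    out.has_energy_expended = (flags & 0x08) != 0;
    if (out.has_energy_expended) {
        if (data.size() < offset + 2) return false;
        out.energy_expended = read_u16_le(data, offset);
        offset += 2;
    }

    out.rr_intervals.clear();
    if (flags & 0x10) {  // every remaining pair of bytes is one RR interval
        for (; offset + 2 <= data.size(); offset += 2) {
            out.rr_intervals.push_back(read_u16_le(data, offset));
        }
    }
    return true;
}
```

An RR value of 1024 corresponds to one second, so multiplying by 1000.0 / 1024.0 converts a raw value to milliseconds.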

Functions to compress and uncompress array of integers

I was recently asked to complete a task for a C++ role. As it was decided not to progress the application any further, I thought I would post it here for some feedback / advice / improvements / reminders of concepts I've forgotten.
The task was:
The following data is a time series of integer values
int timeseries[32] = {67497, 67376, 67173, 67235, 67057, 67031, 66951,
66974, 67042, 67025, 66897, 67077, 67082, 67033, 67019, 67149, 67044,
67012, 67220, 67239, 66893, 66984, 66866, 66693, 66770, 66722, 66620,
66579, 66596, 66713, 66852, 66715};
The series might be, for example, the closing price of a stock each day
over a 32 day period.
As stored above, the data will occupy 32 x sizeof(int) bytes = 128 bytes
assuming 4 byte ints.
Using delta encoding , write a function to compress, and a function to
uncompress data like the above.
OK, so before this point I had never looked at compression, so my solution is far from perfect. I approached the problem by compressing the array of integers into an array of bytes. When representing an integer as bytes I calculate the most significant byte (msb) and keep everything up to that point, throwing the rest away. This is then added to the byte array. For negative values I increment the msb by 1 so that we can differentiate between positive and negative values when decoding, by keeping the leading 1 bits.
When decoding I parse this jagged byte array and simply reverse the actions performed when compressing. As mentioned, I had never looked at compression prior to this task, so I came up with my own method to compress the data. I had been looking at C++/CLI recently and had not really used it before, so I decided to write it in that language, for no particular reason. Below is the class, with a unit test at the very bottom. Any advice / improvements / enhancements will be much appreciated.
Thanks.
array<array<Byte>^>^ CDeltaEncoding::CompressArray(array<int>^ data)
{
int temp = 0;
int original;
int size = 0;
array<int>^ tempData = gcnew array<int>(data->Length);
data->CopyTo(tempData, 0);
array<array<Byte>^>^ byteArray = gcnew array<array<Byte>^>(tempData->Length);
for (int i = 0; i < tempData->Length; ++i)
{
original = tempData[i];
tempData[i] -= temp;
temp = original;
int msb = GetMostSignificantByte(tempData[i]);
byteArray[i] = gcnew array<Byte>(msb);
System::Buffer::BlockCopy(BitConverter::GetBytes(tempData[i]), 0, byteArray[i], 0, msb );
size += byteArray[i]->Length;
}
return byteArray;
}
array<int>^ CDeltaEncoding::DecompressArray(array<array<Byte>^>^ buffer)
{
System::Collections::Generic::List<int>^ decodedArray = gcnew System::Collections::Generic::List<int>();
int temp = 0;
for (int i = 0; i < buffer->Length; ++i)
{
int retrievedVal = GetValueAsInteger(buffer[i]);
decodedArray->Add(retrievedVal);
decodedArray[i] += temp;
temp = decodedArray[i];
}
return decodedArray->ToArray();
}
int CDeltaEncoding::GetMostSignificantByte(int value)
{
array<Byte>^ tempBuf = BitConverter::GetBytes(Math::Abs(value));
int msb = tempBuf->Length;
for (int i = tempBuf->Length -1; i >= 0; --i)
{
if (tempBuf[i] != 0)
{
msb = i + 1;
break;
}
}
if (!IsPositiveInteger(value))
{
//We need an extra byte to differentiate the negative integers
msb++;
}
return msb;
}
bool CDeltaEncoding::IsPositiveInteger(int value)
{
return value >= 0; // a delta of 0 must not divide by zero
}
int CDeltaEncoding::GetValueAsInteger(array<Byte>^ buffer)
{
array<Byte>^ tempBuf;
if(buffer->Length % 2 == 0)
{
//With even integers there is no need to allocate a new byte array
tempBuf = buffer;
}
else
{
tempBuf = gcnew array<Byte>(4);
System::Buffer::BlockCopy(buffer, 0, tempBuf, 0, buffer->Length );
unsigned int val = buffer[buffer->Length-1] &= 0xFF;
if ( val == 0xFF )
{
//We have negative integer compressed into 3 bytes
//Copy over the this last byte as well so we keep the negative pattern
System::Buffer::BlockCopy(buffer, buffer->Length-1, tempBuf, buffer->Length, 1 );
}
}
switch(tempBuf->Length)
{
case sizeof(short):
return BitConverter::ToInt16(tempBuf,0);
case sizeof(int):
default:
return BitConverter::ToInt32(tempBuf,0);
}
}
And then in a test class I had:
void CTestDeltaEncoding::TestCompression()
{
array<array<Byte>^>^ byteArray = CDeltaEncoding::CompressArray(m_testdata);
array<int>^ decompressedArray = CDeltaEncoding::DecompressArray(byteArray);
int totalBytes = 0;
for (int i = 0; i<byteArray->Length; i++)
{
totalBytes += byteArray[i]->Length;
}
Assert::IsTrue(m_testdata->Length * sizeof(int) > totalBytes, "Expected the total bytes to be less than the original array!!");
//Expected totalBytes = 53
}
This smells a lot like homework to me. The crucial phrase is: "Using delta encoding."
Delta encoding means you encode the delta (difference) between each number and the next:
67497, 67376, 67173, 67235, 67057, 67031, 66951, 66974, 67042, 67025, 66897, 67077, 67082, 67033, 67019, 67149, 67044, 67012, 67220, 67239, 66893, 66984, 66866, 66693, 66770, 66722, 66620, 66579, 66596, 66713, 66852, 66715
would turn into:
[Base: 67497]: -121, -203, +62
and so on. Assuming 8-bit bytes, the original numbers require 3 bytes apiece (and given the number of compilers with 3-byte integer types, you're normally going to end up with 4 bytes apiece). From the looks of things, the differences will fit quite easily in 2 bytes apiece, and if you can ignore one (or possibly two) of the least significant bits, you can fit them in one byte apiece.
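A minimal lossless version of this idea might look like the sketch below: the first sample is kept in full and every later sample is stored as a 16-bit difference from its predecessor. The struct and function names are invented for the example, and the choice of int16_t deltas is an assumption that happens to fit this series.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

struct DeltaEncoded {
    std::int32_t base = 0;             // first sample, stored in full
    std::vector<std::int16_t> deltas;  // sample[i] - sample[i - 1]
};

// Returns false if the input is empty or a difference does not fit in 16 bits.
bool delta_compress(const std::vector<std::int32_t>& samples,
                    DeltaEncoded& out) {
    if (samples.empty()) return false;
    out.base = samples[0];
    out.deltas.clear();
    for (std::size_t i = 1; i < samples.size(); ++i) {
        const std::int64_t d =
            static_cast<std::int64_t>(samples[i]) - samples[i - 1];
        if (d < INT16_MIN || d > INT16_MAX) return false;
        out.deltas.push_back(static_cast<std::int16_t>(d));
    }
    return true;
}

// Rebuild the original samples by accumulating the deltas onto the base.
std::vector<std::int32_t> delta_decompress(const DeltaEncoded& in) {
    std::vector<std::int32_t> samples;
    samples.reserve(in.deltas.size() + 1);
    samples.push_back(in.base);
    for (std::int16_t d : in.deltas) {
        samples.push_back(samples.back() + d);
    }
    return samples;
}
```

For the 32-value series this comes to 4 bytes for the base plus 31 x 2 bytes of deltas, i.e. 66 bytes instead of 128, and the round trip is exact.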
Delta encoding is most often used for things like sound encoding where you can "fudge" the accuracy at times without major problems. For example, if you have a change from one sample to the next that's larger than you've left space to encode, you can encode a maximum change in the current difference, and add the difference to the next delta (and if you don't mind some back-tracking, you can distribute some to the previous delta as well). This will act as a low-pass filter, limiting the gradient between samples.
For example, in the series you gave, a simple delta encoding requires ten bits to represent all the differences. By dropping the LSB, however, nearly all the samples (all but one, in fact) can be encoded in 8 bits. That one has a difference (right shifted one bit) of -173, so if we represent it as -128, we have 45 left. We can distribute that error evenly between the preceding and following sample. In that case, the output won't be an exact match for the input, but if we're talking about something like sound, the difference probably won't be particularly obvious.
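The lossy variant can be sketched as follows. Each stored step is a halved difference clamped to one signed byte, and the difference is always measured against the value the decoder will have reconstructed so far, so any error left over from clamping or from the dropped bit is carried into the next step. Note that this simplified sketch only pushes error forward; it does not redistribute any of it to the preceding sample as described above. All names are hypothetical.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

struct LossyDelta {
    std::int32_t base = 0;
    std::vector<std::int8_t> steps;  // each step stands for a change of 2 * step
};

LossyDelta lossy_compress(const std::vector<std::int32_t>& samples) {
    assert(!samples.empty());
    LossyDelta out;
    out.base = samples[0];
    std::int32_t reconstructed = samples[0];  // what the decoder has so far
    for (std::size_t i = 1; i < samples.size(); ++i) {
        // Measuring against the decoder's value folds earlier error into
        // this step automatically.
        std::int32_t step = (samples[i] - reconstructed) / 2;  // drop the LSB
        if (step > 127) step = 127;
        if (step < -128) step = -128;
        out.steps.push_back(static_cast<std::int8_t>(step));
        reconstructed += step * 2;
    }
    return out;
}

std::vector<std::int32_t> lossy_decompress(const LossyDelta& in) {
    std::vector<std::int32_t> samples{in.base};
    for (std::int8_t step : in.steps) {
        samples.push_back(samples.back() + step * 2);
    }
    return samples;
}
```

On the series from the question this needs 4 + 31 = 35 bytes. Only the sample after the drop of 346 is clamped (to a step of -128), and the decoder catches up again on the following sample; everywhere else the reconstruction is within 1 of the input.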
I did mention that it was an exercise that I had to complete and the solution that I received was deemed not good enough, so I wanted some constructive feedback seeing as actual companies never decide to tell you what you did wrong.
When the array is compressed I store the differences and not the original values, except for the first, as this was my understanding of delta encoding. If you look at my code, you'll see I have provided a full solution; my question was: how bad is it?