QR Code encode mode for short URLs - optimization

Typical URL-shortening techniques use only a few characters of the usual URL charset, because they do not need more. A typical short URL is http://domain/code, where code is an integer. Suppose that I can use any base (base10, base16, base36, base62, etc.) to represent the number.
QR Code has several encoding modes, and we can optimize the QR Code (the minimal version, to obtain the lowest density), so we can test pairs of baseX-modeY...
What is the best base-mode pair?
NOTES
A guess...
Two modes fit with the "URL shortening profile",
0010 - Alphanumeric encoding (11 bits per 2 characters)
0100 - Byte encoding (8 bits per character)
My choice was "upper-case base36" with Alphanumeric (which also encodes "/", ":", etc.), but I have not seen any demonstration that it is always (for any URL length) the best. Is there a good guide or a mathematical demonstration for this kind of optimization?
The ideal (perhaps impracticable)
There is another variation: "encoding modes can be mixed as needed within a QR symbol" (Wikipedia)... So we can also use
HTTP://DOMAIN/ with Alphanumeric + change_mode + Numeric encoding (10 bits per 3 digits)
For long URLs (long integers), of course, this is the best solution (!), because it uses the whole charset with no loss... Is it?
The problem is that this kind of optimization (mixed mode) is not available in the usual QR Code image generators... Is it practicable? Is there a generator that uses it correctly?
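The "Is it?" above can be checked with a little arithmetic. The following is a minimal sketch, assuming QR versions 1 to 9 (where the character-count field is 10 bits for Numeric, 9 for Alphanumeric and 8 for Byte) and ignoring error correction and padding. The helper names `segment_bits` and `mixed_bits` are hypothetical, introduced only for this illustration.

```python
# Bit cost of one QR segment, for QR versions 1-9 only (larger versions use
# wider character-count fields). Error correction and padding are ignored.
MODE_INDICATOR_BITS = 4
COUNT_FIELD_BITS = {"numeric": 10, "alphanumeric": 9, "byte": 8}


def segment_bits(mode: str, n_chars: int) -> int:
    """Return the number of bits one segment of n_chars characters occupies."""
    if mode == "numeric":
        # 10 bits per 3 digits; a trailing group of 1 or 2 digits takes 4 or 7 bits.
        data = 10 * (n_chars // 3) + (0, 4, 7)[n_chars % 3]
    elif mode == "alphanumeric":
        # 11 bits per 2 characters; a trailing single character takes 6 bits.
        data = 11 * (n_chars // 2) + 6 * (n_chars % 2)
    elif mode == "byte":
        data = 8 * n_chars
    else:
        raise ValueError(f"unknown mode: {mode}")
    return MODE_INDICATOR_BITS + COUNT_FIELD_BITS[mode] + data


def mixed_bits(prefix_chars: int, digit_chars: int) -> int:
    """Alphanumeric prefix followed by a separate Numeric segment."""
    return segment_bits("alphanumeric", prefix_chars) + segment_bits("numeric", digit_chars)


prefix = len("HTTP://BIT.LY/")  # 14 characters
print("alphanumeric only:", segment_bits("alphanumeric", prefix + 3))
print("byte only:        ", segment_bits("byte", prefix + 3))
print("mixed alpha+num:  ", mixed_bits(prefix, 3))
```

Under these assumptions the mode switch costs a second 4-bit mode indicator plus a 10-bit count field, so mixed mode only pays off when the numeric part is long enough to recover that overhead.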
An alternative answer format
The (practicable) question is about the best combination of base and mode, so we can express it as a function (e.g. in JavaScript),
function bestBaseMode(domain, number_range) {
    var dom_len = domain.length;
    var urlBase_len = dom_len + 8; // 8 = "http://".length + "/".length
    var num_min = number_range[0];
    var num_max = number_range[1];
    var base, mode;
    // ... check optimal base and mode
    return [base, mode];
}
Example-1: the domain is "bit.ly" and the code is an ISO 3166-1 numeric country code,
ranging from 4 to 894. So urlBase_len=14, num_min=4 and num_max=894.
Example-2: the domain is "postcode-resolver.org" and the number_range parameter is the range of the integer representations of the most frequent postal codes, for instance a statistically inferred range from ~999 to ~9999999. So urlBase_len=29, num_min=999 and num_max=9999999.
Example-3: the domain is "my-example3.net" and number_range is a double SHA-1 code, so a fixed-length code of 40 bytes (two concatenated 40-digit hexadecimal numbers). So num_max=num_min=Math.pow(256,40).

Nobody wanted my bounty... I lost it, and now I also need to do the work myself ;-)
about the ideal
The goQR.me support replied to the particular question about mixed encoding, noting that, unfortunately, it can't be used:
sorry, our api does not support mixed qr code encoding.
Even the standard may defined it. Real world QR code scanner apps
on mobile phone have tons of bugs, we would not recommend to rely
on this feature.
functional answer
This function shows the answers in the console... It is a simplified, "brute force" solution.
/**
 * Find the best base-mode pair for a short URL template as QR-Code.
 * @param msg for debug or report.
 * @param domain the string of the internet domain
 * @param digits10 the max. number of digits in a decimal representation
 * @return array of objects with equivalent valid answers.
 */
function bestBaseMode(msg, domain, digits10) {
    var commonBases = [2,8,10,16,36,60,62,64,124,248]; // your config
    var dom_len = domain.length;
    var urlBase_len = dom_len + 8; // 8 = "http://".length + "/".length
    var numb = parseFloat("9".repeat(digits10));
    var scores = [];
    var best = 99999;
    for (var i = 0; i < commonBases.length; i++) {
        var b = commonBases[i];
        // formula at http://math.stackexchange.com/a/335063
        var digits = Math.floor(Math.log(numb) / Math.log(b)) + 1;
        var mode = 'alpha';
        var len = dom_len + digits;
        var lost = 0;
        if (b > 36) {
            mode = 'byte';
            lost = parseInt(urlBase_len * 0.25); // only 6 of 8 bits used at URL
        }
        var score = len + lost; // penalty
        scores.push({BASE: b, MODE: mode, digits: digits, score: score});
        if (score < best) best = score;
    }
    var r = [];
    for (var j = 0; j < scores.length; j++) {
        if (scores[j].score == best) r.push(scores[j]);
    }
    return r;
}
Running the question examples:
var x = bestBaseMode("Example-1", "bit.ly",3);
console.log(JSON.stringify(x)) // "BASE":36,"MODE":"alpha","digits":2,"score":8
var x = bestBaseMode("Example-2", "postcode-resolver.org",7);
console.log(JSON.stringify(x)) // "BASE":36,"MODE":"alpha","digits":5,"score":26
var x = bestBaseMode("Example-3", "my-example3.net",97);
console.log(JSON.stringify(x)) // "BASE":248,"MODE":"byte","digits":41,"score":61
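The digit count in the function above relies on floating-point logarithms of a value that a JavaScript number can only approximate for 97 digits, and a ratio of logarithms can round the wrong way near exact powers of the base. As a cross-check, here is a minimal sketch that counts digits with exact integer arithmetic; `digits_in_base` is a hypothetical helper name, not part of the original answer.

```python
def digits_in_base(n: int, base: int) -> int:
    """Number of digits needed to write the non-negative integer n in the given base."""
    if n < 0:
        raise ValueError("n must be non-negative")
    if base < 2:
        raise ValueError("base must be at least 2")
    count = 1
    while n >= base:
        n //= base  # exact integer division, no rounding error
        count += 1
    return count


# The three examples from the question, using the largest value of each range.
print(digits_in_base(999, 36))
print(digits_in_base(9999999, 36))
print(digits_in_base(10**97 - 1, 248))
```

For these three inputs the exact count agrees with the `digits` values reported in the comments above.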


How to represent ObjC enum AVAudioSessionPortOverride which has declaration of int and string using Dart ffi?

I'm working on a cross platform sound API for Flutter.
We're trying to stop using Objective-C/Swift for the iOS portion of the API and are using Dart ffi as a replacement.
ffi (foreign function interface) allows Dart to call into an Objective-C API.
This means we need to create a Dart library which wraps the Objective-C audio library.
Whilst doing this we encountered the AVAudioSessionPortOverride enum, which has two declarations: AVAudioSessionPortOverrideSpeaker = 'spkr' and AVAudioSessionPortOverrideNone = 0.
I'm confused as to what's going on here, as one of these declarations is an int whilst the other is a string.
I note that AVAudioSessionPortOverride extends NSUInteger, so how is the string being handled? Is it somehow being converted to an int? If so, any ideas on how I would do this in Dart?
Here's what we have so far:
class AVAudioSessionPortOverride extends NSUInteger {
  const AVAudioSessionPortOverride(int value) : super(value);
  static AVAudioSessionPortOverride None = AVAudioSessionPortOverride(0);
  static const AVAudioSessionPortOverride Speaker =
      AVAudioSessionPortOverride('spkr');
}
'spkr' is in fact an int. See e.g. "How to convert multi-character constant to integer in C?" for an explanation of how this obscure feature in C works.
That said, if you look at the Swift representation of the PortOverride enum, you'll see this:
/// For use with overrideOutputAudioPort:error:
public enum PortOverride : UInt {
    /// No override. Return audio routing to the default state for the current audio category.
    case none = 0
    /// Route audio output to speaker. Use this override with AVAudioSessionCategoryPlayAndRecord,
    /// which by default routes the output to the receiver.
    case speaker = 1936747378
}
Also, see https://developer.apple.com/documentation/avfoundation/avaudiosession/portoverride/speaker
Accordingly, 0 and 1936747378 are the values you should use.
Look at this
NSLog(@"spkr = %x s = %x p = %x k = %x r = %x", 'spkr', 's', 'p', 'k', 'r' );
Apple is doing everything your lecturer warned you against. You can get away with this since the string is 4 chars (bytes) long. If you make it longer you'll get a warning. The string gets converted to an int as illustrated in the code snippet above. You could reverse it by accessing the four bytes one by one and printing them as a character.
Spoiler - it will print
spkr = 73706b72 s = 73 p = 70 k = 6b r = 72
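The same conversion can be reproduced outside C. The sketch below assumes the four ASCII characters are packed with the first character in the most significant byte, which is what the hex output above and the Swift value 1936747378 show; the value of a multi-character constant is implementation-defined in C, so this mirrors the observed result rather than a guarantee. The function names are hypothetical.

```python
def fourcc_to_int(code: str) -> int:
    """Pack a four-character ASCII code into an integer, first character most significant."""
    raw = code.encode("ascii")
    if len(raw) != 4:
        raise ValueError("expected exactly four ASCII characters")
    return int.from_bytes(raw, "big")


def int_to_fourcc(value: int) -> str:
    """Reverse the packing: read the four bytes back as characters."""
    return value.to_bytes(4, "big").decode("ascii")


speaker = fourcc_to_int("spkr")
print(speaker, hex(speaker))
print(int_to_fourcc(speaker))
```

In Dart the resulting integer can simply be written as a literal (for example in hexadecimal), since the enum's raw value is just that number.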

Add a VSA (Vendor Specific Attribute) to Access-Accept reply programmatically in FreeRADIUS C module

I have a FreeRADIUS C language module that implements MOD_AUTHENTICATE and MOD_AUTHORIZE methods for custom auth purpose. I need the ability to programmatically add VSAs to the Access-Accept reply.
I have toyed a bit with radius_pair_create() and fr_pair_add() methods (see snippet below) but that didn’t yield any change to the reply content, possibly because I specified ad-hoc values that don’t exist in a vendor-specific dictionary. Or because I didn’t use them correctly.
My FreeRADIUS version is 3_0_19
Any information, pointers and, especially, syntax samples will be highly appreciated.
void test_vsa(REQUEST *request)
{
    VALUE_PAIR *vp = NULL;
    vp = radius_pair_create(request->reply, NULL, 18, 0);
    if (vp)
    {
        log("Created VALUE_PAIR");
        vp->vp_integer = 96;
        fr_pair_add(&request->reply->vps, vp);
    }
    else
    {
        log("Failed to create VALUE_PAIR");
    }
}
So first off you're writing an integer value to a string attribute, which is wrong. The only reason why the server isn't SEGVing is because the length of the VP has been left at zero, so the RADIUS encoder doesn't bother dereferencing the char * inside the pair that's meant to contain the pair's value.
fr_pair_make is the easier function to use here, as it takes both the attribute name and value as strings, so you don't need to worry about the C types.
The code snippet below should do what you want.
void test_avp(REQUEST *request)
{
    VALUE_PAIR *vp = NULL;
    vp = fr_pair_make(request->reply, &request->reply->vps, "Reply-Message", "Hello from FreeRADIUS", T_OP_SET);
    if (vp)
    {
        log("Created VALUE_PAIR");
    }
    else
    {
        log("Failed to create VALUE_PAIR");
    }
}
For a bit more of an explanation, let's look at the doxygen header:
/** Create a VALUE_PAIR from ASCII strings
 *
 * Converts an attribute string identifier (with an optional tag qualifier)
 * and value string into a VALUE_PAIR.
 *
 * The string value is parsed according to the type of VALUE_PAIR being created.
 *
 * @param[in] ctx for talloc
 * @param[in] vps list where the attribute will be added (optional)
 * @param[in] attribute name.
 * @param[in] value attribute value (may be NULL if value will be set later).
 * @param[in] op to assign to new VALUE_PAIR.
 * @return a new VALUE_PAIR.
 */
VALUE_PAIR *fr_pair_make(TALLOC_CTX *ctx, VALUE_PAIR **vps,
                         char const *attribute, char const *value, FR_TOKEN op)
ctx - This is the packet or request that the vps will belong to. If you're adding attributes to the request it should be request->packet, reply would be request->reply, control would be request.
vps - If specified, this will be which list to insert the new VP into. If this is NULL fr_pair_make will just return the pair and let you insert it into a list.
attribute - The name of the attribute as a string.
value - The value of the attribute as a string. For non-string types, fr_pair_make will attempt to perform a conversion. So, for example, passing "12345" for an integer type, will result in the integer value 12345 being written to an int field in the attribute.
op - You'll usually want to use T_OP_SET, which means overwrite existing instances of the same attribute. See the T_OP_* values of FR_TOKEN and the code that uses them if you want to understand the different operators and what they do.

Ambiguous process calcChecksum

CONTEXT
I'm using code written to work with a GPS module that connects to the Arduino over a serial link. The module starts each packet with a header (0xb5, 0x62), continues with the information you requested, and ends with two checksum bytes, CK_A and CK_B. I don't understand the code that calculates that checksum. More info about the checksum algorithm (8-Bit Fletcher Algorithm) is in the module protocol (https://www.u-blox.com/sites/default/files/products/documents/u-blox7-V14_ReceiverDescriptionProtocolSpec_%28GPS.G7-SW-12001%29_Public.pdf), page 74 (87 with index).
MORE INFO
I just want to understand the code; it works fine. The UBX protocol document I mentioned also contains a piece of code that explains how it works (it isn't written in C++).
struct NAV_POSLLH {
    //Here goes the struct
};
NAV_POSLLH posllh;
void calcChecksum(unsigned char* CK) {
    memset(CK, 0, 2);
    for (int i = 0; i < (int)sizeof(NAV_POSLLH); i++) {
        CK[0] += ((unsigned char*)(&posllh))[i];
        CK[1] += CK[0];
    }
}
In the link you provide, you can find a link to RFC 1145, which also contains that Fletcher 8-bit algorithm and explains:
It can be shown that at the end of the loop A will contain the 8-bit
1's complement sum of all octets in the datagram, and that B will
contain (n)*D[0] + (n-1)*D[1] + ... + D[n-1].
n = sizeof byte D[];
Quote adjusted to C syntax
Try it with a couple of bytes, pen and paper, and you'll see :)
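For readers who prefer to let a machine do the pen-and-paper step, here is a minimal sketch of the same loop. It assumes plain 8-bit wrap-around (what the `unsigned char` additions in calcChecksum do), imitated with a `& 0xFF` mask; `fletcher8` and `weighted_sum` are hypothetical names used only here.

```python
def fletcher8(data: bytes) -> tuple:
    """Run the two running sums of calcChecksum over data, wrapping at 8 bits."""
    ck_a = 0
    ck_b = 0
    for byte in data:
        ck_a = (ck_a + byte) & 0xFF   # plain sum of all bytes so far
        ck_b = (ck_b + ck_a) & 0xFF   # sum of all the intermediate ck_a values
    return ck_a, ck_b


def weighted_sum(data: bytes) -> int:
    """The closed form from the quote: n*D[0] + (n-1)*D[1] + ... + D[n-1], at 8 bits."""
    n = len(data)
    return sum((n - i) * byte for i, byte in enumerate(data)) & 0xFF


sample = bytes([1, 2, 3])
print(fletcher8(sample))      # ck_a steps through 1, 3, 6 and ck_b through 1, 4, 10
print(weighted_sum(sample))   # 3*1 + 2*2 + 1*3
```

Because the first byte is added into the second sum on every iteration, it ends up weighted by n, the next byte by n-1, and so on, which is why the second checksum byte is sensitive to byte order while the first is not.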

Unwanted click when using SoXR Library to do variable rate resampling

I am using the SoXR library's variable rate feature to dynamically change the sampling rate of an audio stream in real time. Unfortunately, I have noticed that an unwanted clicking noise is present when changing the rate from 1.0 to a larger value (e.g. 1.01) when testing with a sine wave. I have not noticed any unwanted artifacts when changing from a value larger than 1.0 to 1.0. I looked at the waveform it was producing, and it appeared as if a few samples right at the rate change are transposed incorrectly.
Here's a picture of an example of a stereo 440Hz sine wave stored using signed 16-bit interleaved samples:
I was also unable to find any documentation covering the variable rate feature beyond the fifth code example. Here is my initialization code:
bool DynamicRateAudioFrameQueue::intialize(uint32_t sampleRate, uint32_t numChannels)
{
    mSampleRate = sampleRate;
    mNumChannels = numChannels;
    mRate = 1.0;
    mGlideTimeInMs = 0;
    // Initialize buffer
    size_t intialBufferSize = 100 * sampleRate * numChannels / 1000; // 100 ms
    pFifoSampleBuffer = new FiFoBuffer<int16_t>(intialBufferSize);
    soxr_error_t error;
    // Use signed int16 with interleaved channels
    soxr_io_spec_t ioSpec = soxr_io_spec(SOXR_INT16_I, SOXR_INT16_I);
    // "When creating a var-rate resampler, q_spec must be set as follows:" - example code
    // Using SOXR_VR makes sense, but I'm not sure if the quality can be altered when using var-rate
    soxr_quality_spec_t qualitySpec = soxr_quality_spec(SOXR_HQ, SOXR_VR);
    // Using the var-rate io-spec is undocumented beyond a single code example which states
    // "The ratio of the given input rate and output rates must equate to the
    // maximum I/O ratio that will be used: "
    // My tests show this is not true
    double inRate = 1.0;
    double outRate = 1.0;
    mSoxrHandle = soxr_create(inRate, outRate, mNumChannels, &error, &ioSpec, &qualitySpec, NULL);
    if (error == 0) // soxr_error_t == 0; no error
    {
        mIntialized = true;
        return true;
    }
    else
    {
        return false;
    }
}
Any idea what may be causing this to happen? Or does anyone have a suggestion for an alternative library that is capable of variable rate audio resampling in real time?
After speaking with the developer of the SoXR library I was able to resolve this issue by adjusting the maximum ratio parameters in the soxr_create method call. The developer's response can be found here.

Functions to compress and uncompress array of integers

I was recently asked to complete a task for a C++ role. However, as it was decided not to progress the application any further, I thought I would post it here for feedback, advice, improvements, and reminders of concepts I've forgotten.
The task was:
The following data is a time series of integer values
int timeseries[32] = {67497, 67376, 67173, 67235, 67057, 67031, 66951,
66974, 67042, 67025, 66897, 67077, 67082, 67033, 67019, 67149, 67044,
67012, 67220, 67239, 66893, 66984, 66866, 66693, 66770, 66722, 66620,
66579, 66596, 66713, 66852, 66715};
The series might be, for example, the closing price of a stock each day
over a 32 day period.
As stored above, the data will occupy 32 x sizeof(int) bytes = 128 bytes
assuming 4 byte ints.
Using delta encoding, write a function to compress, and a function to
uncompress data like the above.
Before this task I had never looked at compression, so my solution is far from perfect, and I came up with my own method. I approached the problem by compressing the array of integers into an array of byte arrays. When representing an integer as bytes, I calculate its most significant byte (msb) and keep everything up to that point, throwing the rest away. The result is added to the byte array. For negative values I increment the msb by 1 so that positive and negative values can be told apart when decoding, by keeping the leading 1 bits.
When decoding, I parse this jagged byte array and simply reverse the actions performed when compressing. I had been looking at C++/CLI recently and had not really used it before, so I decided to write it in that language for no particular reason. Below is the class, with a unit test at the very bottom. Any advice, improvements or enhancements will be much appreciated.
Thanks.
array<array<Byte>^>^ CDeltaEncoding::CompressArray(array<int>^ data)
{
    int temp = 0;
    int original;
    int size = 0;
    array<int>^ tempData = gcnew array<int>(data->Length);
    data->CopyTo(tempData, 0);
    array<array<Byte>^>^ byteArray = gcnew array<array<Byte>^>(tempData->Length);
    for (int i = 0; i < tempData->Length; ++i)
    {
        original = tempData[i];
        tempData[i] -= temp;
        temp = original;
        int msb = GetMostSignificantByte(tempData[i]);
        byteArray[i] = gcnew array<Byte>(msb);
        System::Buffer::BlockCopy(BitConverter::GetBytes(tempData[i]), 0, byteArray[i], 0, msb );
        size += byteArray[i]->Length;
    }
    return byteArray;
}
array<int>^ CDeltaEncoding::DecompressArray(array<array<Byte>^>^ buffer)
{
    System::Collections::Generic::List<int>^ decodedArray = gcnew System::Collections::Generic::List<int>();
    int temp = 0;
    for (int i = 0; i < buffer->Length; ++i)
    {
        int retrievedVal = GetValueAsInteger(buffer[i]);
        decodedArray->Add(retrievedVal);
        decodedArray[i] += temp;
        temp = decodedArray[i];
    }
    return decodedArray->ToArray();
}
int CDeltaEncoding::GetMostSignificantByte(int value)
{
    array<Byte>^ tempBuf = BitConverter::GetBytes(Math::Abs(value));
    int msb = tempBuf->Length;
    for (int i = tempBuf->Length -1; i >= 0; --i)
    {
        if (tempBuf[i] != 0)
        {
            msb = i + 1;
            break;
        }
    }
    if (!IsPositiveInteger(value))
    {
        //We need an extra byte to differentiate the negative integers
        msb++;
    }
    return msb;
}
bool CDeltaEncoding::IsPositiveInteger(int value)
{
    return value / Math::Abs(value) == 1;
}
int CDeltaEncoding::GetValueAsInteger(array<Byte>^ buffer)
{
    array<Byte>^ tempBuf;
    if(buffer->Length % 2 == 0)
    {
        //With even integers there is no need to allocate a new byte array
        tempBuf = buffer;
    }
    else
    {
        tempBuf = gcnew array<Byte>(4);
        System::Buffer::BlockCopy(buffer, 0, tempBuf, 0, buffer->Length );
        unsigned int val = buffer[buffer->Length-1] &= 0xFF;
        if ( val == 0xFF )
        {
            //We have negative integer compressed into 3 bytes
            //Copy over the this last byte as well so we keep the negative pattern
            System::Buffer::BlockCopy(buffer, buffer->Length-1, tempBuf, buffer->Length, 1 );
        }
    }
    switch(tempBuf->Length)
    {
    case sizeof(short):
        return BitConverter::ToInt16(tempBuf,0);
    case sizeof(int):
    default:
        return BitConverter::ToInt32(tempBuf,0);
    }
}
And then in a test class I had:
void CTestDeltaEncoding::TestCompression()
{
    array<array<Byte>^>^ byteArray = CDeltaEncoding::CompressArray(m_testdata);
    array<int>^ decompressedArray = CDeltaEncoding::DecompressArray(byteArray);
    int totalBytes = 0;
    for (int i = 0; i<byteArray->Length; i++)
    {
        totalBytes += byteArray[i]->Length;
    }
    Assert::IsTrue(m_testdata->Length * sizeof(m_testdata) > totalBytes, "Expected the total bytes to be less than the original array!!");
    //Expected totalBytes = 53
}
This smells a lot like homework to me. The crucial phrase is: "Using delta encoding."
Delta encoding means you encode the delta (difference) between each number and the next:
67497, 67376, 67173, 67235, 67057, 67031, 66951, 66974, 67042, 67025, 66897, 67077, 67082, 67033, 67019, 67149, 67044, 67012, 67220, 67239, 66893, 66984, 66866, 66693, 66770, 66722, 66620, 66579, 66596, 66713, 66852, 66715
would turn into:
[Base: 67497]: -121, -203, +62
and so on. Assuming 8-bit bytes, the original numbers require 3 bytes apiece (and since compilers with 3-byte integer types are rare, you're normally going to end up with 4 bytes apiece). From the looks of things, the differences will fit quite easily in 2 bytes apiece, and if you can ignore one (or possibly two) of the least significant bits, you can fit them in one byte apiece.
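As an illustration of the lossless variant described so far, here is a minimal sketch that stores the first value as a 4-byte base and every later value as a signed 2-byte difference. The layout (little-endian, base first) and the function names are assumptions made for this example, not something the task specifies, and the series is assumed to be non-empty with every difference fitting in 16 bits.

```python
import struct

TIMESERIES = [67497, 67376, 67173, 67235, 67057, 67031, 66951, 66974,
              67042, 67025, 66897, 67077, 67082, 67033, 67019, 67149,
              67044, 67012, 67220, 67239, 66893, 66984, 66866, 66693,
              66770, 66722, 66620, 66579, 66596, 66713, 66852, 66715]


def compress(values):
    """Pack the first value as int32, then each successive difference as int16."""
    deltas = [b - a for a, b in zip(values, values[1:])]
    return struct.pack(f"<i{len(deltas)}h", values[0], *deltas)


def uncompress(blob):
    """Rebuild the series by adding each stored difference to a running total."""
    count = (len(blob) - 4) // 2
    base, *deltas = struct.unpack(f"<i{count}h", blob)
    values = [base]
    for delta in deltas:
        values.append(values[-1] + delta)
    return values


blob = compress(TIMESERIES)
print(len(blob), "bytes instead of", 4 * len(TIMESERIES))
print(uncompress(blob) == TIMESERIES)
```

`struct.pack` raises an error if a difference does not fit in 16 bits, so out-of-range data fails loudly instead of being silently corrupted.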
Delta encoding is most often used for things like sound encoding where you can "fudge" the accuracy at times without major problems. For example, if you have a change from one sample to the next that's larger than you've left space to encode, you can encode a maximum change in the current difference, and add the difference to the next delta (and if you don't mind some back-tracking, you can distribute some to the previous delta as well). This will act as a low-pass filter, limiting the gradient between samples.
For example, in the series you gave, a simple delta encoding requires ten bits to represent all the differences. By dropping the LSB, however, nearly all the samples (all but one, in fact) can be encoded in 8 bits. That one has a difference (right shifted one bit) of -173, so if we represent it as -128, we have 45 left. We can distribute that error evenly between the preceding and following sample. In that case, the output won't be an exact match for the input, but if we're talking about something like sound, the difference probably won't be particularly obvious.
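The figures in that paragraph can be checked directly. This is a small sketch under the same reading of "dropping the LSB" as an arithmetic right shift by one bit; `signed_bits` is a hypothetical helper, and the redistribution of the leftover error to neighbouring samples is not implemented here.

```python
TIMESERIES = [67497, 67376, 67173, 67235, 67057, 67031, 66951, 66974,
              67042, 67025, 66897, 67077, 67082, 67033, 67019, 67149,
              67044, 67012, 67220, 67239, 66893, 66984, 66866, 66693,
              66770, 66722, 66620, 66579, 66596, 66713, 66852, 66715]


def signed_bits(value: int) -> int:
    """Smallest two's-complement width (in bits) that can hold value."""
    bits = 1
    while not (-(1 << (bits - 1)) <= value < (1 << (bits - 1))):
        bits += 1
    return bits


deltas = [b - a for a, b in zip(TIMESERIES, TIMESERIES[1:])]
widest = max(signed_bits(d) for d in deltas)

# Drop the least significant bit, then see which halved deltas miss the int8 range.
halved = [d >> 1 for d in deltas]
too_big = [h for h in halved if not -128 <= h <= 127]

# Clamp the outliers to int8 and report how much error is left to spread around.
leftover = [abs(h - max(-128, min(127, h))) for h in too_big]

print("bits for exact deltas:", widest)
print("halved deltas outside int8:", too_big)
print("error left after clamping:", leftover)
```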
I did mention that it was an exercise I had to complete; my solution was deemed not good enough, so I wanted some constructive feedback, seeing as actual companies never tell you what you did wrong.
When the array is compressed I store the differences and not the original values, except for the first, as this was my understanding of the task. If you had looked at my code, you would have seen that I provided a full solution; my question was how bad it was.