arc4random modulo bias - Objective-C

According to this documentation,
arc4random_uniform() is recommended over constructions like arc4random() % upper_bound as it avoids "modulo bias" when the upper bound is not a power of two.
How bad is the bias? For example if I generate random numbers with an upper bound of 6, what's the difference between using arc4random with % and arc4random_uniform()?

arc4random() returns an unsigned 32-bit integer, meaning the values are between
0 and 2^32-1 = 4 294 967 295.
Now, the bias results from the fact that the subintervals created by the
modulo operation do not fit exactly into the random output range.
Let's imagine, for clarity, a random generator that creates numbers from 0 to 198
inclusive. You want numbers from 0 to 99, so you calculate random() % 100,
yielding 0 to 99:
0 % 100 = 0
99 % 100 = 99
100 % 100 = 0
198 % 100 = 98
You can see that 99 is the only number that occurs just once, while all
the others occur twice in a full cycle. That means the probability of 99
is exactly halved, which is also the worst case for a bias when at least
2 subintervals are involved.
Since every power of two smaller than the output range divides the 2^32
interval evenly, the bias disappears when the upper bound is a power of two.
The implication is that the smaller the modulo result set and the larger
the random output range, the smaller the bias. In your example, 6 is your upper
bound (I assume 0 is the lower bound), so you use % 7, with the result that 0-3
each occur 613 566 757 times while 4-6 each occur 613 566 756 times.
So 0-3 is 613 566 757 / 613 566 756 = 1.0000000016298 times more probable
than 4-6.
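If you want to check the arithmetic, here is a minimal C sketch (my own, not part of the original answer) that computes how many arc4random() outputs land on each residue of % 7:

#include <stdio.h>
#include <stdint.h>

int main(void) {
    uint64_t range = 1ULL << 32;        /* number of distinct arc4random() outputs */
    uint64_t bound = 7;                 /* we reduce with % 7 */
    uint64_t per   = range / bound;     /* base count per residue: 613566756 */
    uint64_t extra = range % bound;     /* leftover outputs: 4, landing on residues 0..3 */
    printf("each of 0..6 gets %llu hits; residues 0..%llu get one extra\n",
           (unsigned long long)per, (unsigned long long)(extra - 1));
    return 0;
}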
While this seems easy to dismiss, some experiments (especially Monte Carlo
experiments) were flawed precisely because these seemingly incredibly small
differences turned out to matter.
The bias is even worse if the desired output range is bigger than
the range of the random generator. Please read the Fisher-Yates shuffle
entry, because many poker sites learned the hard way that ordinary linear
congruential generators combined with bad shuffling algorithms produced
impossible or overly probable decks or, worse, predictable decks.
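For reference, the unbiased shuffle itself is short. A sketch in C, assuming arc4random_uniform() is available (it is declared in stdlib.h on BSD and macOS):

#include <stdint.h>
#include <stdlib.h>   /* arc4random_uniform() on BSD/macOS */

/* Fisher-Yates shuffle: unbiased because arc4random_uniform(i + 1)
   is uniform over 0..i, with no modulo bias. */
void shuffle(int *deck, uint32_t n) {
    if (n < 2)
        return;
    for (uint32_t i = n - 1; i > 0; i--) {
        uint32_t j = arc4random_uniform(i + 1);
        int tmp = deck[i];
        deck[i] = deck[j];
        deck[j] = tmp;
    }
}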


Integer part bit growth for fixed point numbers of the 0.xyz kind

First of all we should agree on the definition of the QM.N format. I will follow this resource and its conventions.
For the purposes of this paper the notion of a Q-point for a fixed-point number is introduced.
This labeling convention is as follows:
Q[QI].[QF]
Where QI = # of integer bits & QF = # of fractional bits
For signed integer variable types we will include the sign bit in QI as it does have integer
weight albeit negative in sign.
Based on this convention, if I had to represent the number -0.123 in the format Q1.7 I would write it as: 1.1110001
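To make that concrete, here is a tiny C sketch of my own (not from the resource) that reproduces the Q1.7 pattern above by scaling and truncation:

#include <stdio.h>
#include <stdint.h>

int main(void) {
    double x = -0.123;
    int8_t q = (int8_t)(x * 128.0);   /* scale by 2^7 and truncate toward zero: -15 */
    printf("pattern 0x%02X, value %+.7f\n",
           (unsigned)(uint8_t)q, q / 128.0);  /* 0xF1 = 1.1110001 -> -0.1171875 */
    return 0;
}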
The theory says that:
When performing an integer multiplication the product is 2xWL if both the multiplier and
multiplicand are WL long. If the integer multiplication is on fixed-point variables, the number of
integer and fractional bits in the product is the sum of the corresponding multiplier and
multiplicand Q-points as described by the following equations
Knowing this is useful because after multiplication we have double precision, and we need to rescale the output to our input precision. Knowing where the integer part sits allows us to prevent overflow and to pick the relevant bits of the double-width result, as in the tests below, where the long bit string is the result of the multiplication.
However, when performing the multiplication between two Q1.7 numbers of the form 0.xyz, I have noticed that the integer part never grows, allowing me to pick only one bit for the integer part. I have written a piece of code that keeps only the fractional part after multiplication, and here are the results.
Test 0
Testing +0.5158*+0.0596
A:real_val:+0.5156 fixed: 66 int: 0 frac: 1000010
B:real_val:+0.0547 fixed: 7 int: 0 frac: 0000111
C: real_val:+0.0282 fixed: 462 int: 00 frac: 00000111001110
Floating multiplication: +0.0307
Test 1
Testing +0.4842*-0.9558
A:real_val:+0.4766 fixed: 61 int: 0 frac: 0111101
B:real_val:-0.9531 fixed: -122 int: 1 frac: 0000110
C: real_val:-0.4542 fixed: -7442 int: 11 frac: 10001011101110
Floating multiplication: -0.4628
Test 2
Testing +0.2812*+0.2433
A:real_val:+0.2734 fixed: 35 int: 0 frac: 0100011
B:real_val:+0.2422 fixed: 31 int: 0 frac: 0011111
C: real_val:+0.0662 fixed: 1085 int: 00 frac: 00010000111101
Floating multiplication: +0.0684
Test 3
Testing -0.7235*-0.9037
A:real_val:-0.7188 fixed: -92 int: 1 frac: 0100100
B:real_val:-0.8984 fixed: -115 int: 1 frac: 0001101
C: real_val:+0.6458 fixed: 10580 int: 00 frac: 10100101010100
Floating multiplication: +0.6538
My question to you is whether I am overlooking anything here, or whether this is normal and expected behaviour of fixed-point numbers. If so, I will be happy with my numbers never overflowing during multiplication.
Basically, what I mean is that after multiplying two Q1.X numbers of the form 0.xyz, the integer part will always be 0 (if the result is positive) or 111... (if the result is negative).
So my accumulator register will be filled with only 2*X meaningful bits, and I can take just those, plus the sign.
No, the number of bits in the result is still the sum of the bits in the inputs.
Summary:
Signed Q1.31 times signed Q1.31 equals signed Q2.62.
Unsigned Q1.31 times unsigned Q1.31 equals unsigned Q2.62.
Explanation:
Unsigned Q1.n numbers can represent from zero (inclusive) to two (exclusive). If you multiply two such numbers together the range of results is from zero (inclusive) to 4 (exclusive). Just less than four is three point something, and three fits in the two bits above the point.
Signed Q1.n numbers can represent from negative one (inclusive) to one (exclusive). If you multiply two such numbers together the range of results is negative one (exclusive) to one (inclusive). Signed Q1.31 times signed Q1.31 would fit in Q1.62 except for the single case -1.0 times -1.0 equals +1.0, which requires the extra bit above the point.
The equations in your question apply equally in both these cases.
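A short C sketch of my own illustrating both cases for Q1.7: the full product needs Q2.14, and -1.0 * -1.0 is the single input pair that actually uses the extra integer bit:

#include <stdio.h>
#include <stdint.h>

int main(void) {
    int8_t a = -128, b = -128;               /* -1.0 in Q1.7 */
    int16_t p = (int16_t)a * b;              /* full product in Q2.14 */
    printf("%d -> %+.4f\n", p, p / 16384.0); /* 16384 -> +1.0000, needs the second integer bit */

    int8_t c = -92, d = -115;                /* Test 3 above: -0.7188 * -0.8984 */
    int16_t q = (int16_t)c * d;              /* 10580 in Q2.14 */
    printf("%d -> %+.4f\n", q, q / 16384.0); /* +0.6458, top two bits are 00 */
    return 0;
}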

How does numpy manage to divide float32 by 2**63?

Here Daniel mentions
... you pick any integer in [0, 2²⁴), and divide it by 2²⁴, then you can recover your original integer by multiplying the result again by 2²⁴. This works with 2²⁴ but not with 2²⁵ or any other larger number.
But when I tried
>>> b = np.divide(1, 2**63, dtype=np.float32)
>>> b*2**63
1.0
It isn't working for 2⁶⁴, but I'm left wondering why it works for all the exponents from 24 to 63, and moreover whether this behaviour is unique to numpy.
In the context that passage is in, it is not saying that an integer value cannot be divided by 2²⁵ or 2⁶³ and then multiplied to restore the original value. It is saying that this will not work to create an unbiased distribution of numbers.
The text leaves some things not explicitly stated, but I suspect it is discussing taking a value of integer type, converting it to IEEE-754 single-precision, and then dividing it. This will not work for factors larger than 2²⁴ because the conversion from integer type to IEEE-754 single-precision will have to round the number.
For example, for 2³², all numbers from 0 to 16,777,215 will convert to themselves with no error, and then dividing by 2³² will produce a unique floating-point number for each. But both 16,777,216 and 16,777,217 will convert to 16,777,216, and then dividing by 2³² will produce the same number for them (1/256). All numbers from 2,147,483,520 to 2,147,483,776 will map to 2,147,483,648, which then produces ½, so that is 257 numbers mapping to one floating-point number. But all the numbers from 2,147,483,777 to 2,147,484,031 map to 2,147,483,904. So this one has 255 numbers mapping to it. (The difference is due to the round-to-nearest-ties-to-even rule.) At the high end, the 129 numbers from 4,294,967,168 to 4,294,967,296 map to 4,294,967,296, for which dividing produces 1, which is out of the desired half-open interval, [0, 1).
On the other hand, if we use integers from 0 to 16,777,215 (2²⁴−1), there is no rounding, and each result maps from exactly one starting number and stays within the interval.
Note that “significand” is the preferred term for the fraction portion of a floating-point representation. “Mantissa” is an old word for the fraction portion of a logarithm. Significands are linear. Mantissas are logarithmic. And the significand of the IEEE-754 single-precision format has 24 bits, not 23. The primary field used to encode the significand has 23 bits, but the exponent field provides another bit.
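The cutoff is easy to reproduce outside numpy; a quick C check of my own (single precision, default round-to-nearest):

#include <stdio.h>

int main(void) {
    float a = 16777216.0f;                      /* 2^24, exactly representable */
    float b = 16777217.0f;                      /* 2^24 + 1 rounds to 2^24 */
    printf("a == b: %d\n", a == b);             /* prints 1 */
    printf("%.10g %.10g\n",                     /* both print 0.00390625 (1/256) */
           a / 4294967296.0f, b / 4294967296.0f);
    return 0;
}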

Big O notation and measuring time according to it

Suppose we have an algorithm that is of order O(2^n). Furthermore, suppose we multiplied the input size n by 2, so now we have an input of size 2n. How is the time affected? Do we look at the problem as if the original time was 2^n and now it became 2^(2n), so the answer would be that the new time is the square of the previous time?
Big O is not for telling you the actual running time, just how the running time is affected by the size of the input. If you double the size of the input the complexity is still O(2^n); n is just bigger.
number of elements (n)    units of work (2^n)
1                         2
2                         4
3                         8
4                         16
5                         32
...                       ...
10                        1024
20                        1048576
There's a misunderstanding here about how Big-O relates to execution time.
Consider the following formulas which define execution time:
f1(n) = 2^n + 5000n^2 + 12300
f2(n) = (500 * 2^n) + 6
f3(n) = 500n^2 + 25000n + 456000
f4(n) = 400000000
Each of these functions is O(2^n); that is, each can be shown to be less than M * 2^n for some constant M and starting value n0. But obviously, the change in execution time you notice when doubling the size from n1 to 2 * n1 will vary wildly between them (not at all in the case of f4(n)). You cannot use Big-O analysis to determine effects on execution time. It only defines an upper bound on the execution time (and not even necessarily the tightest one).
Some related academia below:
There are three notable bounding functions in this category:
O(f(n)): Big-O - This defines an upper bound.
Ω(f(n)): Big-Omega - This defines a lower bound.
Θ(f(n)): Big-Theta - This defines a tight bound.
A given time function f(n) is Θ(g(n)) only if it is also Ω(g(n)) and O(g(n)) (that is, both upper and lower bounded).
You are dealing with Big-O, which is the usual "entry point" to the discussion; we will neglect the other two entirely.
Consider the definition from Wikipedia:
Let f and g be two functions defined on some subset of the real numbers. One writes:
f(x)=O(g(x)) as x tends to infinity
if and only if there is a positive constant M such that for all sufficiently large values of x, the absolute value of f(x) is at most M multiplied by the absolute value of g(x). That is, f(x) = O(g(x)) if and only if there exists a positive real number M and a real number x0 such that
|f(x)| <= M|g(x)| for all x > x0
Going from here, assume we have f1(n) = 2^n. If we were to compare that to f2(n) = 2^(2n) = 4^n, how would f1(n) and f2(n) relate to each other in Big-O terms?
Is 2^n <= M * 4^n for some constant M and n0 value? Of course! Using M = 1 and n0 = 1, it is true. Thus, 2^n is upper-bounded by O(4^n).
Is 4^n <= M * 2^n for some constant M and n0 value? This is where you run into problems: no constant M can make M * 2^n keep up with 4^n as n gets arbitrarily large, since 4^n / 2^n = 2^n itself grows without bound. Thus, 4^n is not upper-bounded by O(2^n).
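To tie this back to the original question: only under the extra (hypothetical) assumption that the running time is exactly 2^n steps does doubling n square the step count, since 2^(2n) = (2^n)^2. A quick C illustration:

#include <stdio.h>
#include <stdint.h>

int main(void) {
    for (unsigned n = 1; n <= 5; n++) {
        uint64_t t  = 1ULL << n;         /* 2^n steps */
        uint64_t t2 = 1ULL << (2 * n);   /* 2^(2n) = (2^n)^2 steps */
        printf("n=%u: 2^n=%llu, 2^(2n)=%llu\n",
               n, (unsigned long long)t, (unsigned long long)t2);
    }
    return 0;
}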
See the comments for further explanation, but indeed, this is just an example I came up with to help you grasp the Big-O concept; it is not the precise algorithmic meaning.
Suppose you have an array, arr = [1, 2, 3, 4, 5].
An example of an O(1) operation would be directly accessing an index, such as arr[0] or arr[2].
An example of an O(n) operation would be a loop that iterates through your whole array, such as for elem in arr:.
n would be the size of your array. If your array is twice as big as the original array, n would also be twice as big. That's how variables work.
See the Big-O Cheat Sheet for complementary information.

Which one is the correct way of using arc4random()?

I am new to objective C and trying to understand arc4random().
There are so many conflicting explanations on the web. Please clear up my confusion: which of the following is correct?
// 1.
arc4random() % (toNumber - fromNumber) + fromNumber;
OR
//2.
arc4random() % ((toNumber - fromNumber) + 1) + fromNumber;
// fromNumber and toNumber delimit the desired range, e.g. a random number between 7 and 90.
This code will get you a random number from 7 up to, but not including, 90 (use 90 - 7 + 1 as the argument if 90 should be included).
NSUInteger random = 7 + arc4random_uniform(90 - 7);
Use arc4random_uniform to avoid modulo bias.
Adam's answer is correct. However, just to clarify the difference between the two: the second one raises the possible range by one to make the upper bound inclusive. The important thing to remember is that modulo is remainder division, so while there are toNumber possible outcomes, one of them is zero (when the result of arc4random() is a multiple of toNumber) and toNumber itself can never be the remainder.
// 1.
arc4random() % (10 - 5) + 5;
This results in a range of 0 + 5 to 4 + 5, which is 5 to 9.
//2.
arc4random() % ((10 - 5) + 1) + 5;
This results in a range of 0 + 5 to (4 + 1) + 5, which is 5 to 10.
Neither is correct or incorrect if you wish to use modulo; one is exclusive of the upper bound while the other is inclusive of it. However, if you think about how remainder division works and think of the pool of numbers returned by any PRNG in terms of cycles of the length of your desired range, then you'll realize that if that range does not divide evenly into the maximum range of the pool, you'll get biased results. For instance, if arc4random() returned a result from 1 to 5 (it doesn't, obviously) and you wanted a number from 0 to 2, and you used arc4random() % 3, these are the possible results:
1 % 3 = 1
2 % 3 = 2
3 % 3 = 0
4 % 3 = 1
5 % 3 = 2
Note that there are two ones and two twos, but only one zero. This is because our range of 3 does not divide evenly into the PRNG's range of 5. The result is that (humorously enough) the last (PRNG range % desired range) numbers of the cycle need to be culled because they are "biased"; the numbers themselves aren't really biased, but it's easiest to cull from the end. Failing to do this makes the lower numbers of the range more likely to appear.
We can cull the numbers by taking the PRNG's upper range, computing its remainder modulo the desired range, and then pulling that many numbers off the end. By "pulling those numbers off the end" I really mean "looping until we get a number that isn't one of those end numbers".
Some would say that's bad practice: you could theoretically loop forever. In practice, however, the expected number of retries is always less than one, since the culled portion is never more than half the PRNG's pool (and usually much less than that). I once wrote a wrapper for rand using this technique.
You can see an example of this in the source for OpenBSD, where arc4random_uniform calls arc4random in a loop until a number is determined to be clean.
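For illustration, here is a simplified sketch in C of that rejection loop, modeled on the OpenBSD arc4random_uniform approach (the actual source differs in details):

#include <stdint.h>
#include <stdlib.h>   /* arc4random() on BSD/macOS */

/* Returns a uniform value in [0, upper_bound) by rejecting the leftover
   low end of arc4random()'s range, so the values that remain divide
   evenly by upper_bound. */
uint32_t uniform_sketch(uint32_t upper_bound) {
    if (upper_bound < 2)
        return 0;
    /* 2^32 % upper_bound without 64-bit math: -upper_bound
       wraps to 2^32 - upper_bound as a uint32_t. */
    uint32_t min = -upper_bound % upper_bound;
    uint32_t r;
    do {
        r = arc4random();
    } while (r < min);      /* expected retries are below one */
    return r % upper_bound;
}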

32-bit fractional multiplication with cross-multiplication method (no 64-bit intermediate result)

I am programming a fixed-point speech enhancement algorithm on a 16-bit processor. At some point I need to do 32-bit fractional multiplication. I have read other posts about doing 32-bit multiplication byte by byte and I see why this works for Q0.31 formats. But I use different Q formats with varying number of fractional bits.
So I have found out that for fractional bits less than 16, this works:
(low*low >> N) + low*high + high*low + (high*high << N)
where N is the number of fractional bits. I have read that the low*low result should be treated as unsigned, as should the low halves themselves. In general this gives exactly the result I want in any Q format with fewer than 16 fractional bits.
Now it gets tricky when there are more than 16 fractional bits. I have tried several shift amounts, including different shifts for low*low and high*high, and I have tried to work it out on paper, but I can't figure it out.
I know it may be very simple but the whole idea eludes me and I would be grateful for some comments or guidelines!
It's the same formula. For N > 16, the shifts just mean you throw away a whole 16-bit word that would have over- or underflowed. low*low >> N means: take the high word of the 32-bit product, shift it right by N - 16 bits, and add it to the low word of the result. high*high << N means: take the low word of the product, shift it left by N - 16 bits, and add it to the high word of the result.
There are a few ideas at play.
First, multiplication of 2 shorter integers to produce a longer product. Consider unsigned multiplication of 2 32-bit integers via multiplications of their 16-bit "halves", each of which produces a 32-bit product and the total product is 64-bit:
a * b = (a_hi * 2^16 + a_lo) * (b_hi * 2^16 + b_lo) =
a_hi * b_hi * 2^32 + (a_hi * b_lo + a_lo * b_hi) * 2^16 + a_lo * b_lo.
Now, if you need a signed multiplication, you can construct it from unsigned multiplication (e.g. from the above).
Supposing a < 0 and b >= 0, a *signed b must be equal
2^64 - ((-a) *unsigned b), where
-a = 2^32 - a (because this is 2's complement)
IOW,
a *signed b =
2^64 - ((2^32 - a) *unsigned b) =
2^64 + (a *unsigned b) - (b * 2^32), where 2^64 can be discarded since we're using 64 bits only.
In exactly the same way you can calculate a *signed b for a >= 0 and b < 0 and must get a symmetric result:
(a *unsigned b) - (a * 2^32)
You can similarly show that for a < 0 and b < 0 the signed multiplication can be built on top of the unsigned multiplication this way:
(a *unsigned b) - ((a + b) * 2^32)
So, you multiply a and b as unsigned first, then if a < 0, you subtract b from the top 32 bits of the product and if b < 0, you subtract a from the top 32 bits of the product, done.
Now that we can multiply 32-bit signed integers and get 64-bit signed products, we can finally turn to the fractional stuff.
Suppose now that out of those 32 bits in a and b, N bits are used for the fractional part. That means that if you look at a and b as plain integers, they are going to be 2^N times greater than what they really represent; e.g. 1.0 is going to look like 2^N (or 1 << N).
So, if you multiply two such integers, the product is going to be 2^N * 2^N = 2^(2*N) times greater than what it should represent; e.g. 1.0 * 1.0 is going to look like 2^(2*N) (or 1 << (2*N)). IOW, plain integer multiplication doubles the number of fractional bits. If you want the product to
have the same number of fractional bits as the multiplicands, what do you do? You divide the product by 2^N (or shift it arithmetically N positions to the right). Simple.
A few words of caution, just in case...
In C (and C++) you cannot legally shift a variable left or right by a number of bits equal to or greater than the variable's width. The code will compile, but it will not necessarily work as you expect. So, if you want to shift a 32-bit variable, you can shift it by 0 through 31 positions left or right (31 is the maximum, not 32).
If you shift signed integers left, you cannot legally overflow the result. All signed overflows result in undefined behavior. So, you may want to stick to unsigned types.
Right shifts of negative signed integers are implementation-defined: they can be either arithmetic or logical shifts, depending on the compiler. So, if you need one of the two, you must either ensure that your compiler supports it directly or implement it in some other way.
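Putting the answer together, here is an illustrative C sketch of my own (the function name and layout are hypothetical): a signed 32x32 fractional multiply with N fractional bits, built from 16-bit partial products and held in two 32-bit words, i.e. without a 64-bit intermediate:

#include <stdint.h>

/* Multiply two signed fixed-point numbers with N fractional bits and
   return the product rescaled to N fractional bits (truncated; the
   caller must ensure the result fits in 32 bits). Requires 0 < N < 32,
   in line with the shift caveats above. */
int32_t frac_mul(int32_t a, int32_t b, unsigned N) {
    uint32_t ua = (uint32_t)a, ub = (uint32_t)b;
    uint32_t a_lo = ua & 0xFFFFu, a_hi = ua >> 16;
    uint32_t b_lo = ub & 0xFFFFu, b_hi = ub >> 16;

    /* Unsigned 64-bit product kept as two 32-bit words, hi:lo. */
    uint32_t ll = a_lo * b_lo;
    uint32_t lh = a_lo * b_hi;
    uint32_t hl = a_hi * b_lo;
    uint32_t hh = a_hi * b_hi;

    uint32_t mid   = lh + hl;                /* may wrap around... */
    uint32_t mid_c = (mid < lh) ? 1u : 0u;   /* ...so keep the carry */
    uint32_t lo    = ll + (mid << 16);
    uint32_t lo_c  = (lo < ll) ? 1u : 0u;
    uint32_t hi    = hh + (mid >> 16) + (mid_c << 16) + lo_c;

    /* Signed correction described above: for each negative input,
       subtract the other operand from the top 32 bits. */
    if (a < 0) hi -= ub;
    if (b < 0) hi -= ua;

    /* Rescale: take bits N..N+31 of the 64-bit product, which is the
       (truncated) arithmetic right shift by N. */
    return (int32_t)((hi << (32u - N)) | (lo >> N));
}

As a quick sanity check: with N = 31 (Q1.31), a = 0xC0000000 (-0.5) and b = 0x40000000 (+0.5) yield 0xE0000000, which is -0.25 in Q1.31, as expected.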