How to get random numbers with the wrong generator - objective-c

Question: Suppose you have a random number generator randn() that returns a uniformly distributed random number between 0 and n-1. Given any number m, write a random number generator that returns a uniformly distributed random number between 0 and m-1.
My answer:
-(int)randm() {
int k=1;
while (k*n < m) {
++k;
}
int x = 0;
for (int i=0; i<k; ++i) {
x += randn();
}
if (x < m) {
return x;
} else {
return randm();
}
}
Is this correct?

You're close, but the problem with your answer is that there is more than one way to write a number as a sum of two other numbers.
If m<n, then this works because the numbers 0,1,...,m-1 appear each with equal probability, and the algorithm terminates almost surely.
This answer does not work in general because there is more than one way to write a number as a sum of two other numbers. For instance, there is only one way to get 0 but there are many many ways to get m/2, so the probabilities will not be equal.
Example: n = 2 and m=3
0 = 0+0
1 = 1+0 or 0+1
2 = 1+1
so the probability distribution from your method is
P(0)=1/4
P(1)=1/2
P(2)=1/4
which is not uniform.
To fix this, you can use unique factorization. Write m in base n, keeping track of the largest needed exponent, say e. Then, find the biggest multiple of m that is smaller than n^e, call it k. Finally, generate e numbers with randn(), take them as the base n expansion of some number x, if x < k*m, return x, otherwise try again.
Assuming that m < n^2, then
int randm() {
// find largest power of n needed to write m in base n
int e=0;
while (m > n^e) {
++e;
}
// find largest multiple of m less than n^e
int k=1;
while (k*m < n^2) {
++k
}
--k; // we went one too far
while (1) {
// generate a random number in base n
int x = 0;
for (int i=0; i<e; ++i) {
x = x*n + randn();
}
// if x isn't too large, return it x modulo m
if (x < m*k)
return (x % m);
}
}

It is not correct.
You are adding uniform random numbers, which does not produce a uniformly random result. Say n=2 and m = 3, then the possible values for x are 0+0, 0+1, 1+0, 1+1. So you're twice as likely to get 1 than you are to get 0 or 2.
What you need to do is write m in base n, and then generate 'digits' of the base-n representation of the random number. When you have the complete number, you have to check if it is less than m. If it is, then you're done. If it is not, then you need to start over.

The sum of two uniform random number generators is not uniformly generated. For instance, the sum of two dice is more likely to be 7 than 12, because to get 12 you need to throw two sixes, whereas you can get 7 as 1 + 6 or 6 + 1 or 2 + 5 or 5 + 2 or ...
Assuming that randn() returns an integer between 0 and n - 1, n * randn() + randn() is uniformly distributed between 0 and n * n - 1, so you can increase its range. If randn() returns an integer between 0 and k * m + j - 1, then call it repeatedly until you get a number <= k * m - 1, and then divide the result by k to get a number uniformly distributed between 0 and m -1.

Assuming both n and m are positive integers, wouldn't the standard algorithm of scaling work?
return (int)((float)randn() * m / n);

Related

How to Understand Time Complexity of Happy Number Problem Solution from Leetcode

I have some difficulties in understanding the time complexity analysis for one solution for the Happy Number Question from Leet code, for my doubts on complexity analysis, I marked them in bold and really appreciate your advice
Here is the question:
Link: https://leetcode.com/problems/happy-number/
Question:
Write an algorithm to determine if a number is "happy".
A happy number is a number defined by the following process: Starting with any positive integer, replace the number by the sum of the squares of its digits, and repeat the process until the number equals 1 (where it will stay), or it loops endlessly in a cycle which does not include 1. Those numbers for which this process ends in 1 are happy numbers.
Example:
Input: 19
Output: true
Explanation:
1^2(square of 1) + 9^2 = 82
8^2 + 2^2 = 68
6^2 + 8^2 = 100
1^2 + 0^2 + 0^2 = 1
Here is the code:
class Solution(object):
def isHappy(self, n):
#getnext function will compute the sum of square of each digit of n
def getnext(n):
totalsum = 0
while n>0:
n,v = divmod(n,10)
totalsum+=v**2
return totalsum
#we declare seen as a set to track the number we already visited
seen = set()
#we stop checking if: either the number reaches one or the number was visited #already(ex.a cycle)
while n!=1 and (n not in seen):
seen.add(n)
n = getnext(n)
return n==1
Note: feel free to let me know if I need to explain how the code works
Time Complexity Analysis:
Time complexity : O(243 * 3 + logN + loglogN + log loglog N)...=O(logN).
Finding the next value for a given number has a cost of O(log n)because we are processing each digit in the number, and the number of digits in a number is given by logN.
My doubt: why the number of digits in a number is given by logN? what is N here? the value of a specific number or something else?
To work out the total time complexity, we'll need to think carefully about how many numbers are in the chain, and how big they are.
We determined above that once a number is below 243, it is impossible for it to go back up above 243.Therefore, based on our very shallow analysis we know for sure that once a number is below 243, it is impossible for it to take more than another 243 steps to terminate.
Each of these numbers has at most 3 digits. With a little more analysis, we could replace the 243 with the length of the longest number chain below 243, however because the constant doesn't matter anyway, we won't worry about it.
My doubt: I think the above paragraph is related to the time complexity component of 243*3, but I cannot understand why we multiply 243 by 3
For an n above 243, we need to consider the cost of each number in the chain that is above 243. With a little math, we can show that in the worst case, these costs will be O(log n) + O(log log n) + O(log log log N)... Luckily for us, the O(logN) is the dominating part, and the others are all tiny in comparison (collectively, they add up to less than logN), so we can ignore them. My doubt: what is the reasoning behind O(log log n) O(log log log N) for an n above 243?
Well, my guess for the first doubt is that the number of digits of a base 10 number is given by it's value (N) taken to the logarithm at base 10, rounded down. So for example, 1023 would have floor(log10(1023)) digits, which is 3. So yes, the N is the value of the number. the log in time complexity indicates a logarithm, not specifically that of base 2 or base e.
As for the second doubt, it probably has to do with the work required to reduce a number to below 243, but I am not sure. I'll edit this answer once I work that bit out.
Let's say N has M digits. Than getnext(N) <= 81*M. The equality happens when N only has 9's.
When N < 1000, i.e. at most 3 digits, getnext(N) <= 3*81 = 243. Now, you will have to call getnext(.) at most O(243) times to figure out if N is indeed happy.
If M > 3, number of digits of getnext(N) must be less than M. Try getnext(9999), getnext(99999), and so on [1].
Notes:
[1] Adding a digit to N can make it at most 10*N + 9, i.e. adding a 9 at the end. But the number of digits increases to M+1 only. It's a logarithmic relationship between N and M. Hence, the same relationship holds between N and 81*M.
Using the Leetcode solution
class Solution {
private int getNext(int n) {
int totalSum = 0;
while (n > 0) {
int d = n % 10;
n = n / 10;
totalSum += d * d;
}
return totalSum;
}
public boolean isHappy(int n) {
Set<Integer> seen = new HashSet<>();
while (n != 1 && !seen.contains(n)) {
seen.add(n);
n = getNext(n);
}
return n == 1;
}
}
}
O(243*3) for n < 243
3 is the max number of digits in n
e.g. For n = 243
getNext() will take a maximum of 3 iterations because there are 3 digits for us to loop over.
isHappy() can take a maximum of 243 iterations to find a cycle or terminate, because we can store a max of 243 numbers in our hash set.
O(log n) + O(log log n) + O(log log log N)... for n > 243
1st iteration + 2nd iteration + 3rd iteration ...
getNext() will be called a maximum of O(log n) times. Because log10 n is the number of digits.
isHappy() will be called a maximum of 9^2 per digit. This is the max we can store in the hash set before we find a cycle or terminate.
First Iteration
9^2 * number of digits
O(81*(log n)) drop the constant
O(log n)
+
Second Iteration
O(log (81*(log n))) drop the constant
O(log log n)
+
Third Iteration
O(log log log N)
+
ect ...

How is the complexity of below O(log n) and not O(n)

Could someone explain how this algorithm is O(log(n)) and not O(n)?
the loop runs for all the digits in a given number. So isn't the complexity O(n)?
while (x != 0) {
int pop = x % 10;
x /= 10;
if (rev > Integer.MAX_VALUE/10 || (rev == Integer.MAX_VALUE / 10 && pop > 7))
return 0;
if (rev < Integer.MIN_VALUE/10 || (rev == Integer.MIN_VALUE / 10 && pop < -8))
return 0;
rev = rev * 10 + pop;
}
It depends on what n is. If n is x itself, a numeric value, then the complexity is O(log(n)). If you multiply x by 10, the while loop will only be one iteration longer, not ten times as long. Likewise, multiplying x by 100 will only add two iterations.
On the other hand, if there was a variable s which was the string representation of x, and n was the length of string s, then the complexity would be O(n). Note that in this case, the length of s is proportional to log(x), so the logarithm is implicit from the viewpoint of the numeric value.
Here's a thought experiment:
The algorithm depends on the number of digits in n, not the value of n. n = 10 takes as many iterations as n = 99 because they both have 2 digits.
The number digits in n grows at log(n) rate, since adding a single digit requires n to be at least 10 times bigger.
Hence the algorithm has a complexity of O(log(n))

Calculating the time complexity

Can somebody help with the time complexity of the following code:
for(i = 0; i <= n; i++)
{
for(j = 0; j <= i; j++)
{
for(k = 2; k <= n; k = k^2)
print("")
}
a/c to me the first loop will run n times,2nd will run for(1+2+3...n) times and third for loglogn times..
but i m not sure about the answer.
We start from the inside and work out. Consider the innermost loop:
for(k = 2; k <= n; k = k^2)
print("")
How many iterations of print("") are executed? First note that n is constant. What sequence of values does k assume?
iter | k
--------
1 | 2
2 | 4
3 | 16
4 | 256
We might find a formula for this in several ways. I used guess and prove to get iter = log(log(k)) + 1. Since the loop won't execute the next iteration if the value is already bigger than n, the total number of iterations executed for n is floor(log(log(n)) + 1). We can check this with a couple of values to make sure we got this right. For n = 2, we get one iteration which is correct. For n = 5, we get two. And so on.
The next level does i + 1 iterations, where i varies from 0 to n. We must therefore compute the sum 1, 2, ..., n + 1 and that will give us the total number of iterations of the outermost and middle loop: this sum is (n + 1)(n + 2) / 2 We must multiply this by the cost of the inner loop to get the answer, (n + 1)(n + 2)(log(log(n)) + 1) / 2 to get the total cost of the snippet. The fastest-growing term in the expansion is n^2 log(log(n)) and so that is what would typically be given as asymptotic complexity.

How to calculate the complexity of the Algorithm?

I have to calculate the complexity of this algorithm,I tried to solve it and found the answer to be O(nlogn). Is it correct ? If not please explain.
for (i=5; i<n/2; i+=5)
{
for (j=1; j<n; j*=4)
op;
x = 3*n;
while(x > 6)
{op; x--;}
}
Katrina, In this example we've got an O (n*log(n))
`
for (int i= 0; i < N; i++) {
c= i;
while (c > 0) {
c= c/2
}
}
How ever you have another for that includes this both bucles.
Im not quite sure to understand the way it works the algorithm but an standard way considering an another for bucle, should be O(nnlog n
for (int j= 0; i < N); j++) { --> O(n)
for (int i= 0; i < N; i++) { O(n)
c= i;
while (c > 0) { O(logn)
c= c/2 O(1)
}
}
} ?
Aftermath in this standard algorithm would be O(n) * O(n) * O(logn) * O(1)
So, I think you forgot to include another O(n)
)
Hope it helps
Let's count the number of iterations in each loop.
The outermost loop for (i=5; i<n/2; i+=5) steps through all values between 5 and n / 2 in steps of 5. It will, thus, require approximately n / 10 - 1 iterations.
There are two inner loops. Let's consider the first one: for (j=1; j<n; j*=4). This steps through all values of the form 4^x between 1 and n for integers of x. The lowest value of x for this to be true is 0, and the highest value is the x that fulfills 4^x < n -- i.e., the integer closest to log_4(n). Thus, labelling the iterations by x, we have iterations 0, 1, ..., log_4(n). In other words, we've approximately log_4(n) + 1 iterations for this loop.
Now consider the second inner loop. It steps through all values from 3 * n down to 7. Thus the number of iterations are approximately 3 n - 6.
All other operations have constant run time and can therefore be ignored.
How do we put this together? The two inner loops are run sequentially (i.e., they are not nested) so the run time for both of them together is simply the sum:
(log_4(n) + 1) + (3 n - 6) = 3 n + log_4(n) - 5.
The outer loop and the two inner loops are, however, nested. For every iteration of the outer loop, both the inner ones are run. Therefore we multiply the number of iterations of the outer with the total of the inner:
(n / 10 - 1) * (3 n + log_4(n) - 5) =
= 3 n^2 / 10 + n log_4(n) / 10 - 7 n / 2 - log_4(n) + 5.
Finally, complexity is often expressed in Big-O notation -- that is, we're only interested in the order of the run time. This means two things. First, we can ignore all constant factors in all terms. For example, O(3 n^2 / 10) becomes just O(n^2). Thereby we have:
O(3 n^2 / 10 + n log_4(n) / 10 - 7 n / 2 - log_4(n) + 5) =
= O(n^2 + n log_4(n) - n - log_4(n) + 1).
Second, we can ignore all terms that have a lower order than the term with the highest order. For example, n is of an higher order than 1 so we have O(n + 1) = O(n). Thereby we have:
O(n^2 + n log_4(n) - n - log_4(n) + 1) = O(n^2).
Finally, we have the answer. The complexity of the algorithm your code describes is O(n^2).
(In practice, one would never calculate the (approximate) number of iterations as we did here. The simplification we did in the last step can be done earlier which makes the calculations much easier.)
As Fredrik mentioned, time complexity of first and third loops are O(n). an time for second loop is O(log(n)).
So complexity of following algorithm is O(n^2).
for (i=5; i<n/2; i+=5)
{
for (j=1; j<n; j*=4)
op;
x = 3*n;
while(x > 6)
{op; x--;}
}
Note that complexity of following algorithm is O(n^2*log(n)) which is not same with above algorithm.
for (i=5; i<n/2; i+=5)
{
for (j=1; j<n; j*=4)
{
op;
x = 3*n;
while(x > 6)
{op; x--;}
}
}

Recognizing when to use the modulus operator

I know the modulus (%) operator calculates the remainder of a division. How can I identify a situation where I would need to use the modulus operator?
I know I can use the modulus operator to see whether a number is even or odd and prime or composite, but that's about it. I don't often think in terms of remainders. I'm sure the modulus operator is useful, and I would like to learn to take advantage of it.
I just have problems identifying where the modulus operator is applicable. In various programming situations, it is difficult for me to see a problem and realize "Hey! The remainder of division would work here!".
Imagine that you have an elapsed time in seconds and you want to convert this to hours, minutes, and seconds:
h = s / 3600;
m = (s / 60) % 60;
s = s % 60;
0 % 3 = 0;
1 % 3 = 1;
2 % 3 = 2;
3 % 3 = 0;
Did you see what it did? At the last step it went back to zero. This could be used in situations like:
To check if N is divisible by M (for example, odd or even)
or
N is a multiple of M.
To put a cap of a particular value. In this case 3.
To get the last M digits of a number -> N % (10^M).
I use it for progress bars and the like that mark progress through a big loop. The progress is only reported every nth time through the loop, or when count%n == 0.
I've used it when restricting a number to a certain multiple:
temp = x - (x % 10); //Restrict x to being a multiple of 10
Wrapping values (like a clock).
Provide finite fields to symmetric key algorithms.
Bitwise operations.
And so on.
One use case I saw recently was when you need to reverse a number. So that 123456 becomes 654321 for example.
int number = 123456;
int reversed = 0;
while ( number > 0 ) {
# The modulus here retrieves the last digit in the specified number
# In the first iteration of this loop it's going to be 6, then 5, ...
# We are multiplying reversed by 10 first, to move the number one decimal place to the left.
# For example, if we are at the second iteration of this loop,
# reversed gonna be 6, so 6 * 10 + 12345 % 10 => 60 + 5
reversed = reversed * 10 + number % 10;
number = number / 10;
}
Example. You have message of X bytes, but in your protocol maximum size is Y and Y < X. Try to write small app that splits message into packets and you will run into mod :)
There are many instances where it is useful.
If you need to restrict a number to be within a certain range you can use mod. For example, to generate a random number between 0 and 99 you might say:
num = MyRandFunction() % 100;
Any time you have division and want to express the remainder other than in decimal, the mod operator is appropriate. Things that come to mind are generally when you want to do something human-readable with the remainder. Listing how many items you could put into buckets and saying "5 left over" is good.
Also, if you're ever in a situation where you may be accruing rounding errors, modulo division is good. If you're dividing by 3 quite often, for example, you don't want to be passing .33333 around as the remainder. Passing the remainder and divisor (i.e. the fraction) is appropriate.
As #jweyrich says, wrapping values. I've found mod very handy when I have a finite list and I want to iterate over it in a loop - like a fixed list of colors for some UI elements, like chart series, where I want all the series to be different, to the extent possible, but when I've run out of colors, just to start over at the beginning. This can also be used with, say, patterns, so that the second time red comes around, it's dashed; the third time, dotted, etc. - but mod is just used to get red, green, blue, red, green, blue, forever.
Calculation of prime numbers
The modulo can be useful to convert and split total minutes to "hours and minutes":
hours = minutes / 60
minutes_left = minutes % 60
In the hours bit we need to strip the decimal portion and that will depend on the language you are using.
We can then rearrange the output accordingly.
Converting linear data structure to matrix structure:
where a is index of linear data, and b is number of items per row:
row = a/b
column = a mod b
Note above is simplified logic: a must be offset -1 before dividing & the result must be normalized +1.
Example: (3 rows of 4)
1 2 3 4
5 6 7 8
9 10 11 12
(7 - 1)/4 + 1 = 2
7 is in row 2
(7 - 1) mod 4 + 1 = 3
7 is in column 3
Another common use of modulus: hashing a number by place. Suppose you wanted to store year & month in a six digit number 195810. month = 195810 mod 100 all digits 3rd from right are divisible by 100 so the remainder is the 2 rightmost digits in this case the month is 10. To extract the year 195810 / 100 yields 1958.
Modulus is also very useful if for some crazy reason you need to do integer division and get a decimal out, and you can't convert the integer into a number that supports decimal division, or if you need to return a fraction instead of a decimal.
I'll be using % as the modulus operator
For example
2/4 = 0
where doing this
2/4 = 0 and 2 % 4 = 2
So you can be really crazy and let's say that you want to allow the user to input a numerator and a divisor, and then show them the result as a whole number, and then a fractional number.
whole Number = numerator/divisor
fractionNumerator = numerator % divisor
fractionDenominator = divisor
Another case where modulus division is useful is if you are increasing or decreasing a number and you want to contain the number to a certain range of number, but when you get to the top or bottom you don't want to just stop. You want to loop up to the bottom or top of the list respectively.
Imagine a function where you are looping through an array.
Function increase Or Decrease(variable As Integer) As Void
n = (n + variable) % (listString.maxIndex + 1)
Print listString[n]
End Function
The reason that it is n = (n + variable) % (listString.maxIndex + 1) is to allow for the max index to be accounted.
Those are just a few of the things that I have had to use modulus for in my programming of not just desktop applications, but in robotics and simulation environments.
Computing the greatest common divisor
Determining if a number is a palindrome
Determining if a number consists of only ...
Determining how many ... a number consists of...
My favorite use is for iteration.
Say you have a counter you are incrementing and want to then grab from a known list a corresponding items, but you only have n items to choose from and you want to repeat a cycle.
var indexFromB = (counter-1)%n+1;
Results (counter=indexFromB) given n=3:
`1=1`
`2=2`
`3=3`
`4=1`
`5=2`
`6=3`
...
Best use of modulus operator I have seen so for is to check if the Array we have is a rotated version of original array.
A = [1,2,3,4,5,6]
B = [5,6,1,2,3,4]
Now how to check if B is rotated version of A ?
Step 1: If A's length is not same as B's length then for sure its not a rotated version.
Step 2: Check the index of first element of A in B. Here first element of A is 1. And its index in B is 2(assuming your programming language has zero based index).
lets store that index in variable "Key"
Step 3: Now how to check that if B is rotated version of A how ??
This is where modulus function rocks :
for (int i = 0; i< A.length; i++)
{
// here modulus function would check the proper order. Key here is 2 which we recieved from Step 2
int j = [Key+i]%A.length;
if (A[i] != B[j])
{
return false;
}
}
return true;
It's an easy way to tell if a number is even or odd. Just do # mod 2, if it is 0 it is even, 1 it is odd.
Often, in a loop, you want to do something every k'th iteration, where k is 0 < k < n, assuming 0 is the start index and n is the length of the loop.
So, you'd do something like:
int k = 5;
int n = 50;
for(int i = 0;i < n;++i)
{
if(i % k == 0) // true at 0, 5, 10, 15..
{
// do something
}
}
Or, you want to keep something whitin a certain bound. Remember, when you take an arbitrary number mod something, it must produce a value between 0 and that number - 1.