Time Complexity of Euclid Algorithm by Subtraction - time-complexity

int gcd(int a, int b)
{
while(a!=b)
{
if(a > b)
a = a - b;
else
b = b - a;
}
return a;
}
What is the time complexity of this algorithm? Can someone provide a detailed explanation?

For Euclid Algorithm by Subtraction, a and b are positive integers.
The worst case scenario is if a = n and b = 1. Then, it will take n - 1 steps to calculate the GCD. Hence, the time complexity is O(max(a,b)) or O(n) (if it's calculated in regards to the number of iterations).
By the way, you should also fix your function so that it validates whether a and b are really positive integer numbers. Or even better, you could change your return and parameter types to unsigned long or unsigned int or similar and, as suggested by #templatetypedef, handle cases for a = 0 or b = 0 separately.

Related

What is the time complexity (Big-O) of this while loop (Pseudocode)?

This is written in pseudocode.
We have an Array A of length n(n>=2)
int i = 1;
while (i < n) {
if (A[i] == 0) {
terminates the while-loop;
}
doubles i
}
I am new to this whole subject and coding, so I am having a hard time grasping it and need an "Explain like im 5".
I know the code doesnt make a lot of sense but it is just an exercise, I have to determine best case and worst case.
So in the Best case Big O would be O(1) if the value in [1] is 0.
For the worst-case scenario I thought the time complexity of this loop would be O(log(n)) as i doubles.
Is that correct?
Thanks in advance!
For Big O notation you take the worse case scenario. For the case where A[i] never evaluates to zero then your loop is like this:
int i = 1;
while(i < n) {
i *= 2;
}
i is doubled on each iteration, ie exponential growth.
Given an example of n=16
the values of i would be:
1
2
4
8
wouldn't get to 16
4 iterations
and 2^4 = 16
to work out the power, you would take log to base 2 of n, ie log(16) = 4
So the worst case would be log(n)
So the complexity would be stated as O(log(n))

Evaluating time complexity for the binomial coefficient

I'm new to Theoretical Computer Science, and I would like to calculate the time complexity of the following algorithm that evaluates the binomial coefficient defined as
nf = 1;
for i = 2 to n do nf = nf * i;
kf = 1;
for i = 2 to k do kf = kf * i;
nkf = 1;
for i = 2 to n-k do nkf = nkf * i;
c = nf / (kf * nkf);
My textbook suggests to use Stirling's approximation
However, I can get the same result by considering that for i = 2 to n do nf = nf * i; have complexity O(n-2)=O(n), that is predominant.
Stirling's approximation seems a little bit overkill. Is my approach wrong?
In your first approach you calculate n!, k! and (n-k)! separately and then calculate the binomial coefficient. Therefore since all of those terms can be calculated with at most operations you have O(n) time complexity.
However, you are wrong about the time complexity of calculating the Stirling's formula. You only need log(n) in base 2 operations to calculate it. This is because when trying to calculate p'th power of some real number, instead of multiplicating it p times, you can instead keep squaring the number to calculate it quickly. For example:
If you want to calculate 2^17, instead of doing 17 operations like this:
return 2*2*2*2*2*2*2*2*2*2*2*2*2*2*2*2*2
you can do this:
a = 2*2
b = a*a
c = b*b
d = c*c
return d * 2
which is only 5 operations.
Note: However keep in mind that the Stirling's formula is not equal to the factorial. It is only an approximation but a good one.
Edit: Also you can consider a^n as e^(log(a)*n) and then calculate it by the quickly converging series expansion
1 + (log(a)n) + ((log(a)n)^2)/2! + ((log(a)n)^3)/3! + ...
Since the series converges very quickly you can get really close approximations in no time.

Determine time-complexity of recursive function

Can anybody help me find time conplexity of this recursive function?
int test(int m, int n) {
if(n == 0)
return m;
else
return (3 + test(m + n, n - 1));
The test(m+n, n-1) is called n-1 times before base case which is if (n==0), so complexity is O(n)
Also, this is a duplicate of Determining complexity for recursive functions (Big O notation)
It is really important to understand recursion and the time-complexity of recursive functions.
The first step to understand easy recursive functions like that is to be able to write the same function in an iterative way. This is not always easy and not always reasonable but in case of an easy function like yours this shouldn't be a problem.
So what happens in your function in every recursive call?
is n > 0?
If yes:
m = m + n + 3
n = n - 1
If no:
return m
Now it should be pretty easy to come up with the following (iterative) alternative:
int testIterative(int m, int n) {
while(n != 0) {
m = m + n + 3;
n = n - 1;
}
return m;
}
Please note: You should pay attention to negative n. Do you understand what's the problem here?
Time complexity
After looking at the iterative version, it is easy to see that the time-complexity is depending on n: The time-complexity therefore is O(n).

Calculate function time complexity

I am trying to calculate the time complexity of this function
Code
int Almacen::poner_items(id_sala s, id_producto p, int cantidad){
it_prod r = productos.find(p);
if(r != productos.end()) {
int n = salas[s - 1].size();
int m = salas[s - 1][0].size();
for(int i = n - 1; i >= 0 && cantidad > 0; --i) {
for(int j = 0; j < m && cantidad > 0; ++j) {
if(salas[s - 1][i][j] == "NULL") {
salas[s - 1][i][j] = p;
r->second += 1;
--cantidad;
}
}
}
}
else {
displayError();
return -1;
}
return cantidad;
}
the variable productos is a std::map and its find method has a time complexity of Olog(n) and other variable salas is a std::vector.
I calculated the time and I found that it was log(n) + nm but am not sure if it is the correct expression or I should leave it as nm because it is the worst or if I whould use n² only.
Thanks
The overall function is O(nm). Big-O notation is all about "in the limit of large values" (and ignores constant factors). "Small" overheads (like an O(log n) lookup, or even an O(n log n) sort) are ignored.
Actually, the O(n log n) sort case is a bit more complex. If you expect m to be typically the same sort of size as n, then O(nm + nlogn) == O(nm), if you expect n ≫ m, then O(nm + nlogn) == O(nlogn).
Incidentally, this is not a question about C++.
In general when using big O notation, you only leave the most dominant term when taking all variables to infinity.
n by itself is much larger than log n at infinity, so even without m you can (and generally should) drop the log n term, so O(nm) looks fine to me.
In non-theoretical use cases, it is sometimes important to understand the actual complexity (for non-infinite inputs), since sometimes algorithms that are slow at infinity can produce better results for shorter inputs (there are some examples where O(1) algorithms have such a terrible constant that an exponential algorithm does better in real life). quick sort is considered a practical example of an O(n^2) algorithm that often does better than it's O(n log n) counterparts.
Read about "Big O Notation" for more info.
let
k = productos.size()
n = salas[s - 1].size()
m = salas[s - 1][0].size()
your algorithm is O(log(k) + nm). You need to use a distinct name for each independent variable
Now it might be the case that there is a relation between k, n, m and you can re-label with a reduced set of variables, but that is not discernible from your code, you need to know about the data.
It may also be the case that some of these terms won't grow large, in which case they are actually constants, i.e. O(1).
E.g. you may know k << n, k << m and n ~= m , which allows you describe it as O(n^2)

Get the most occuring number amongst several integers without using arrays

DISCLAIMER: Rather theoretical question here, not looking for a correct answere, just asking for some inspiration!
Consider this:
A function is called repetitively and returns integers based on seeds (the same seed returns the same integer). Your task is to find out which integer is returned most often. Easy enough, right?
But: You are not allowed to use arrays or fields to store return values of said function!
Example:
int mostFrequentNumber = 0;
int occurencesOfMostFrequentNumber = 0;
int iterations = 10000000;
for(int i = 0; i < iterations; i++)
{
int result = getNumberFromSeed(i);
int occurencesOfResult = magic();
if(occurencesOfResult > occurencesOfMostFrequentNumber)
{
mostFrequentNumber = result;
occurencesOfMostFrequentNumber = occurencesOfResult;
}
}
If getNumberFromSeed() returns 2,1,5,18,5,6 and 5 then mostFrequentNumber should be 5 and occurencesOfMostFrequentNumber should be 3 because 5 is returned 3 times.
I know this could easily be solved using a two-dimentional list to store results and occurences. But imagine for a minute that you can not use any kind of arrays, lists, dictionaries etc. (Maybe because the system that is running the code has such a limited memory, that you cannot store enough integers at once or because your prehistoric programming language has no concept of collections).
How would you find mostFrequentNumber and occurencesOfMostFrequentNumber? What does magic() do?? (Of cause you do not have to stick to the example code. Any ideas are welcome!)
EDIT: I should add that the integers returned by getNumber() should be calculated using a seed, so the same seed returns the same integer (i.e. int result = getNumber(5); this would always assign the same value to result)
Make an hypothesis: Assume that the distribution of integers is, e.g., Normal.
Start simple. Have two variables
. N the number of elements read so far
. M1 the average of said elements.
Initialize both variables to 0.
Every time you read a new value x update N to be N + 1 and M1 to be M1 + (x - M1)/N.
At the end M1 will equal the average of all values. If the distribution was Normal this value will have a high frequency.
Now improve the above. Add a third variable:
M2 the average of all (x - M1)^2 for all values of xread so far.
Initialize M2 to 0. Now get a small memory of say 10 elements or so. For every new value x that you read update N and M1 as above and M2 as:
M2 := M2 + (x - M1)^2 * (N - 1) / N
At every step M2 is the variance of the distribution and sqrt(M2) its standard deviation.
As you proceed remember the frequencies of only the values read so far whose distances to M1 are less than sqrt(M2). This requires the use of some additional array, however, the array will be very short compared to the high number of iterations you will run. This modification will allow you to guess better the most frequent value instead of simply answering the mean (or average) as above.
UPDATE
Given that this is about insights for inspiration there is plenty of room for considering and adapting the approach I've proposed to any particular situation. Here are some thoughts
When I say assume that the distribution is Normal you should think of it as: Given that the problem has no solution, let's see if there is some qualitative information I can use to decide what kind of distribution would the data have. Given that the algorithm is intended to find the most frequent number, it should be fine to assume that the distribution is not uniform. Let's try with Normal, LogNormal, etc. to see what can be found out (more on this below.)
If the game completely disallows the use of any array, then fine, keep track of only, say 10 numbers. This would allow you to count the occurrences of the 10 best candidates, which will give more confidence to your answer. In doing this choose your candidates around the theoretical most likely value according to the distribution of your hypothesis.
You cannot use arrays but perhaps you can read the sequence of numbers two or three times, not just once. In that case you can read it once to check whether you hypothesis about its distribution is good nor bad. For instance, if you compute not just the variance but the skewness and the kurtosis you will have more elements to check your hypothesis. For instance, if the first reading indicates that there is some bias, you could use a LogNormal distribution instead, etc.
Finally, in addition to providing the approximate answer you would be able to use the information collected during the reading to estimate an interval of confidence around your answer.
Alright, I found a decent solution myself:
int mostFrequentNumber = 0;
int occurencesOfMostFrequentNumber = 0;
int iterations = 10000000;
int maxNumber = -2147483647;
int minNumber = 2147483647;
//Step 1: Find the largest and smallest number that _can_ occur
for(int i = 0; i < iterations; i++)
{
int result = getNumberFromSeed(i);
if(result > maxNumber)
{
maxNumber = result;
}
if(result < minNumber)
{
minNumber = result;
}
}
//Step 2: for each possible number between minNumber and maxNumber, count occurences
for(int thisNumber = minNumber; thisNumber <= maxNumber; thisNumber++)
{
int occurenceOfThisNumber = 0;
for(int i = 0; i < iterations; i++)
{
int result = getNumberFromSeed(i);
if(result == thisNumber)
{
occurenceOfThisNumber++;
}
}
if(occurenceOfThisNumber > occurencesOfMostFrequentNumber)
{
occurencesOfMostFrequentNumber = occurenceOfThisNumber;
mostFrequentNumber = thisNumber;
}
}
I must admit, this may take a long time, depending on the smallest and largest possible. But it will work without using arrays.