Is the growth of the binomial coefficient function factorial or polynomial? (time complexity)

I have written an algorithm that, given a list of words, must check each unique combination of four words in that list (regardless of order).
The number of combinations to be checked, x, can be calculated using the binomial coefficient, i.e. x = n!/(r!(n-r)!), where n is the total number of words in the list and r is the number of words in each combination, which in my case is always 4, so the function becomes x = n!/(4!(n-4)!) = n!/(24(n-4)!). Therefore, as the total number of words, n, increases, the number of combinations to be checked, x, increases factorially, right?
What has thrown me is that WolframAlpha was able to rewrite this function as x = (n^4)/24 − (n^3)/4 + (11n^2)/24 − n/4, so now it would appear to grow polynomially as n grows? So which is it?!
Here is a graph to visualise the growth of the function (the letter x is switched to an l)

For a fixed value of r, this function is O(n^r). In your case, with r = 4, it is O(n^4). This is because most of the terms in the numerator are canceled out by the denominator:
n!/(4!(n-4)!)
= [n(n-1)(n-2)(n-3) * (n-4)(n-5)(n-6)...(3)(2)(1)] / [4! * (n-4)(n-5)(n-6)...(3)(2)(1)]
= n(n-1)(n-2)(n-3) / 4!
This is a 4th degree polynomial in n.
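As a quick numeric check (the word list here is made up), a small Python sketch confirming that the number of 4-word combinations matches the cancelled form n(n-1)(n-2)(n-3)/4!:

from itertools import combinations
from math import comb

words = [f"word{i}" for i in range(40)]   # hypothetical word list
n = len(words)

# Count how many 4-word combinations the algorithm would have to check.
checked = sum(1 for _ in combinations(words, 4))

assert checked == comb(n, 4)                    # n!/(4!(n-4)!)
assert checked == n*(n-1)*(n-2)*(n-3) // 24     # the cancelled form, ~ n^4/24

Doubling n multiplies the count by roughly 2^4 = 16, which is polynomial growth, not factorial growth.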

Related

Unnormalizing in Knuth's Algorithm D

I'm trying to implement Algorithm D from Knuth's "The Art of Computer Programming, Vol. 2" in Rust, although I'm having trouble understanding how to implement the very last step of unnormalizing. My natural numbers are a class where each number is a vector of u64, in base u64::MAX. Addition, subtraction, and multiplication have been implemented.
Knuth's Algorithm D is a Euclidean division algorithm which takes two natural numbers x and y and returns (q, r), where q = x / y (integer division) and r = x % y, the remainder. The algorithm depends on an approximation method which only works if the first digit of y is greater than b/2, where b is the base you're representing the numbers in. Since not all numbers are of this form, it uses a "normalizing trick": for example (if we were in base 10), instead of doing 200 / 23, we calculate a normalizer d and do (200 * d) / (23 * d) so that 23 * d has a first digit greater than b/2.
So when we use the approximation method, we end up with the desired q, but the remainder is multiplied by a factor of d. So the last step is to divide r by d so that we can get the q and r we want. My problem is, I'm a bit confused about how we're supposed to do this last step, as it requires division and the method it's part of is trying to implement division.
(Maybe helpful?):
The way that d is calculated is just by taking the integer floor of b-1 divided by the first digit of y. However, Knuth suggests that it's possible to make d a power of 2, as long as d times the first digit of y is greater than b/2. I think he makes this suggestion so that instead of dividing, we can just do a binary shift for this last step, although I don't think I can do that given that my numbers are represented as vectors of u64 values, instead of binary.
Any suggestions?
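For what it's worth, when d is a power of two (d = 2^s), the final division of r by d really is just a right shift carried across the limbs, and that works even though the limbs are u64 words rather than individual bits. Here is a rough Python sketch of the idea (the function name and little-endian limb layout, least significant limb first, are my own assumptions, not Knuth's notation):

def shift_right(limbs, s):
    """Divide a little-endian vector of 64-bit limbs by 2**s (0 < s < 64).

    Bits shifted out of one limb are carried into the limb below it.
    In Algorithm D the bits shifted out of the lowest limb are zero by
    construction, because r is an exact multiple of d = 2**s.
    """
    out = list(limbs)
    carry = 0
    for i in reversed(range(len(out))):       # most significant limb first
        cur = out[i]
        out[i] = (cur >> s) | ((carry << (64 - s)) & 0xFFFFFFFFFFFFFFFF)
        carry = cur & ((1 << s) - 1)          # low s bits drop down a limb
    return out

# Example: 5 * 2**70 stored as two limbs, divided by 2**3, is 5 * 2**67.
assert shift_right([0, 5 * 2**6], 3) == [0, 5 * 2**3]

The same loop translates directly to a Vec<u64> by walking the limbs from the top down and carrying the low s bits into the next limb.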

Why is the maximum number of corner points (m+n choose n)?

In linear programming we have:
The maximum number of corner points for a problem with m constraints and n variables is (n+m choose n), i.e. we take a combination of (the number of equations plus the number of variables), choosing the number of variables.
Why is this the case? I have no idea why this is true.
Define:
m = number of rows = number of logical variables (slacks)
n = number of columns = number of structural variables
so the total number of variables is n+m
Further, we have:
number of basic variables = m (solved by linear algebra)
number of non-basic variables = n (temporarily fixed, usually at 0)
The total number of corner points is equal to the number of ways we can choose m basic variables out of n+m total variables.
But we have:
n+m choose m = n+m choose n
Note that in general many of these bases are infeasible.
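A tiny Python sketch (m and n are made-up values) that just counts these candidate bases:

from itertools import combinations
from math import comb

m, n = 3, 2                                   # e.g. 3 constraints, 2 structural variables
variables = range(n + m)                      # all structural + slack variables

bases = list(combinations(variables, m))      # each choice of m basic variables

assert len(bases) == comb(n + m, m) == comb(n + m, n)   # 10 candidate corner points

Each such choice is only a candidate corner point; as noted above, many of the resulting bases can be infeasible.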

Is this O(N) algorithm actually O(logN)?

I have an integer, N.
I denote f[i] = number of appearances of the digit i in N.
Now, I have the following algorithm.
FOR i = 0 TO 9
    FOR j = 1 TO f[i]
        k = k*10 + i;
My teacher said this is O(N). It seems to me more like an O(log N) algorithm.
Am I missing something?
I think that you and your teacher are saying the same thing, but it gets confusing because the integer you are using is named N, while it is also common to refer to an algorithm that is linear in the size of its input as O(N). N is getting overloaded as both the specific name and the generic figure of speech.
Suppose we say instead that your number is Z and its digits are counted in the array d and then their frequencies are in f. For example, we could have:
Z = 12321
d = [1,2,3,2,1]
f = [0,2,2,1,0,0,0,0,0,0]
Then the cost of going through all the digits in d and computing the count for each will be O(size(d)) = O(log Z). This is basically what your second loop is doing in reverse: it executes one time for each occurrence of each digit. So you are right that there is something logarithmic going on here -- the number of digits of Z is logarithmic in the size of Z. But your teacher is also right that there is something linear going on here -- counting those digits is linear in the number of digits.
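To make the bookkeeping concrete, a small Python sketch of the counting described above:

Z = 12321
d = [int(ch) for ch in str(Z)]         # digits of Z: [1, 2, 3, 2, 1]

f = [0] * 10
for digit in d:                        # one step per digit: O(len(d)) = O(log Z)
    f[digit] += 1

assert f == [0, 2, 2, 1, 0, 0, 0, 0, 0, 0]
assert sum(f) == len(d)                # the inner loop in the question runs sum(f) times in total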
The time complexity of an algorithm is generally measured as a function of the input size. Your algorithm doesn't take N as an input; the input seems to be the array f. There is another variable named k which your code doesn't declare, but I assume that's an oversight and you meant to initialise e.g. k = 0 before the first loop, so that k is not an input to the algorithm.
The outer loop runs 10 times, and the inner loop runs f[i] times for each i. Therefore the total number of iterations of the inner loop equals the sum of the numbers in the array f. So the complexity could be written as O(sum(f)) or O(Σf) where Σ is the mathematical symbol for summation.
Since you defined N as the integer whose digits f counts, it is in fact possible to prove that O(Σf) is the same thing as O(log N), so long as N is a positive integer. This is because Σf equals how many digits the number N has, which is approximately (log N) / (log 10). So by your definition of N, you are correct.
My guess is that your teacher disagrees with you because they think N means something else. If your teacher defines N = Σf then the complexity would be O(N). Or perhaps your teacher made a genuine mistake; that is not impossible. But the first thing to do is make sure you agree on the meaning of N.
I find your explanation a bit confusing, but let's assume N = 9075936782959 is an integer. Then O(N) doesn't really make sense; O(length of N) makes more sense. I'll use n for the length of N.
Then f(i) means: iterate over each digit of N and count how many times i appears, which makes computing f(i) O(n) (it's linear). I'm assuming f(i) is a function, not an array.
Your algorithm loops at most:
10 times (first loop)
0 to n times, but the total is n (the sum of f(i) for all digits must be n)
It's tempting to say the algorithm is then O(algo) = 10 + n*f(i) = n^2 (removing the constant), but f(i) is only calculated 10 times, once each time the second loop is entered, so O(algo) = 10 + n + 10*f(i) = 10 + 11n = n. If f(i) is an array lookup, it's constant time.
I'm sure I didn't see the problem the same way as you. I'm still a little confused about the definition in your question. How did you come up with log(n)?

Big O notation and measuring time according to it

Suppose we have an algorithm that is of order O(2^n). Furthermore, suppose we multiplied the input size n by 2 so now we have an input of size 2n. How is the time affected? Do we look at the problem as if the original time was 2^n and now it became 2^(2n) so the answer would be that the new time is the power of 2 of the previous time?
Big O is not for telling you the actual running time, just how the running time is affected by the size of the input. If you double the size of the input, the complexity is still O(2^n); n is just bigger.
number of elements (n)    units of work (2^n)
         1                          2
         2                          4
         3                          8
         4                         16
         5                         32
       ...                        ...
        10                       1024
        20                    1048576
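For an algorithm whose work is exactly 2^n, the table also shows what doubling n does to the count itself: going from n = 10 to n = 20 squares it, since 2^(2n) = (2^n)^2. In Python:

n = 10
assert 2 ** (2 * n) == (2 ** n) ** 2    # 1048576 == 1024 ** 2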
There's a misunderstanding here about how Big-O relates to execution time.
Consider the following formulas which define execution time:
f1(n) = 2^n + 5000n^2 + 12300
f2(n) = (500 * 2^n) + 6
f3(n) = 500n^2 + 25000n + 456000
f4(n) = 400000000
Each of these functions is O(2^n); that is, each can be shown to be less than M * 2^n for some constant M and starting value n0. But obviously, the change in execution time you notice when doubling the size from n1 to 2 * n1 will vary wildly between them (not at all in the case of f4(n)). You cannot use Big-O analysis to determine effects on execution time. It only defines an upper bound on the execution time (which is not even guaranteed to be the tightest upper bound).
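For instance, here is a quick numeric check of how differently these four formulas respond to doubling n from 10 to 20, even though all four are O(2^n):

# The running-time formulas from above; all are O(2^n).
f1 = lambda n: 2**n + 5000*n**2 + 12300
f2 = lambda n: 500 * 2**n + 6
f3 = lambda n: 500*n**2 + 25000*n + 456000
f4 = lambda n: 400000000

for f in (f1, f2, f3, f4):
    # The slowdown factor from n=10 to n=20 differs wildly for each.
    print(round(f(20) / f(10), 2))       # ~5.96, ~1023.99, ~1.53, 1.0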
Some related background:
There are three notable bounding functions in this category:
O(f(n)): Big-O - This defines an upper bound.
Ω(f(n)): Big-Omega - This defines a lower bound.
Θ(f(n)): Big-Theta - This defines a tight bound.
A given time function f(n) is Θ(g(n)) only if it is also Ω(g(n)) and O(g(n)) (that is, both upper and lower bounded).
You are dealing with Big-O, which is the usual "entry point" to the discussion; we will neglect the other two entirely.
Consider the definition from Wikipedia:
Let f and g be two functions defined on some subset of the real numbers. One writes:
f(x)=O(g(x)) as x tends to infinity
if and only if there is a positive constant M such that for all sufficiently large values of x, the absolute value of f(x) is at most M multiplied by the absolute value of g(x). That is, f(x) = O(g(x)) if and only if there exists a positive real number M and a real number x0 such that
|f(x)| <= M|g(x)| for all x > x0
Going from here, assume we have f1(n) = 2^n. If we were to compare that to f2(n) = 2^(2n) = 4^n, how would f1(n) and f2(n) relate to each other in Big-O terms?
Is 2^n <= M * 4^n for some constant M and n0 value? Of course! Using M = 1 and n0 = 1, it is true. Thus, 2^n is upper-bounded by O(4^n).
Is 4^n <= M * 2^n for some constant M and n0 value? This is where you run into problems: since 4^n = 2^n * 2^n, the inequality would require 2^n <= M for all sufficiently large n, which fails as soon as n > log2(M). No constant M keeps 2^n ahead of 4^n as n gets arbitrarily large, so 4^n is not upper-bounded by O(2^n).
See the comments for further explanation, but indeed, this is just an example I came up with to help you grasp the Big-O concept; it is not the precise algorithmic meaning.
Suppose you have an array, arr = [1, 2, 3, 4, 5].
An example of an O(1) operation would be directly accessing an index, such as arr[0] or arr[2].
An example of a O(n) operation would be a loop that could iterate through all your array, such as for elem in arr:.
n would be the size of your array. If your array is twice as big as the original array, n would also be twice as big. That's how variables work.
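Putting those two examples into a runnable snippet:

arr = [1, 2, 3, 4, 5]

x = arr[2]            # O(1): indexing does not depend on the size of arr

for elem in arr:      # O(n): the loop body runs once per element of arr
    print(elem)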
See the Big-O Cheat Sheet for complementary information.

Karatsuba and Toom-3 algorithms for 3-digit number multiplications

I was wondering about this problem concerning Karatsuba's algorithm.
When you apply Karatsuba you basically have to do 3 multiplications per run of the loop.
Those are (let's say ab and cd are 2-digit numbers with digits respectively a, b, c and d):
X = bd
Y = ac
Z = (a+b)(c+d)
and then the sums we were looking for are:
bd = X
ac = Y
(bc + ad) = Z - X - Y
My question is: let's say we have two 3-digit numbers: abc, def. I found out that we will have to perform only 5 multiplications to do so. I also found the Toom-3 algorithm, but it uses polynomials I can't quite get. Could someone write down those multiplications and show how to calculate the interesting sums bd + ae, ce + bf, cd + be + af?
The basic idea is this: the number 237 is the polynomial p(x) = 2x^2 + 3x + 7 evaluated at the point x = 10. So, we can think of each integer as corresponding to a polynomial whose coefficients are the digits of the number. When we evaluate the polynomial at x = 10, we get our number back.
What is interesting is that to fully specify a polynomial of degree 2, we need its value at just 3 distinct points. We need 5 values to fully specify a polynomial of degree 4.
So, if we want to multiply two 3 digit numbers, we can do so by:
Evaluating the corresponding polynomials at 5 distinct points.
Multiplying the 5 values. We now have 5 function values of the polynomial of the product.
Finding the coefficients of this polynomial from the five values we computed in step 2.
Karatsuba multiplication works the same way, except that we only need 3 distinct points. Instead of at 10, we evaluate the polynomials at 0, 1, and "infinity", which gives us b, a+b, a and d, c+d, c, which multiplied together give you your X, Z, Y.
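As a sanity check of the three-point version, here is a small Python sketch of the 2-digit Karatsuba case described above (the function name is mine):

def karatsuba_2digit(ab, cd):
    a, b = ab // 10, ab % 10          # digits of the first number
    c, d = cd // 10, cd % 10          # digits of the second number

    X = b * d                         # evaluation at x = 0
    Z = (a + b) * (c + d)             # evaluation at x = 1
    Y = a * c                         # evaluation at x = "infinity"

    # (10a+b)(10c+d) = 100*ac + 10*(ad+bc) + bd, with ad+bc = Z - X - Y
    return 100 * Y + 10 * (Z - X - Y) + X

assert karatsuba_2digit(47, 83) == 47 * 83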
Now, to write this all out in terms of abc and def is quite involved. In the Wikipedia article, it's actually done quite nicely:
In the Evaluation section, the polynomials are evaluated to give, for example, c, a+b+c, a-b+c, 4a+2b+c, a for the first number.
In Pointwise products, the corresponding values for each number are multiplied, which gives:
X = cf
Y = (a+b+c)(d+e+f)
Z = (a-b+c)(d-e+f)
U = (4a+2b+c)(4d+2e+f)
V = ad
In the Interpolation section, these values are combined to give you the digits in the product. This involves solving a 5x5 system of linear equations, so again it's a bit more complicated than the Karatsuba case.
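To make the interpolation concrete, here is a rough Python sketch of the whole 3-digit scheme; the variable names and the particular way the 5x5 system is solved are my own working, not taken from the article:

def toom3_3digit(abc, def_):          # "def" is a Python keyword, hence the underscore
    # Split each 3-digit number into digits: abc = 100*a + 10*b + c, etc.
    a, b, c = abc // 100, (abc // 10) % 10, abc % 10
    d, e, f = def_ // 100, (def_ // 10) % 10, def_ % 10

    # Five pointwise products: the digit polynomials evaluated at 0, 1, -1, 2, "infinity".
    X = c * f
    Y = (a + b + c) * (d + e + f)
    Z = (a - b + c) * (d - e + f)
    U = (4*a + 2*b + c) * (4*d + 2*e + f)
    V = a * d

    # Interpolate the degree-4 product polynomial r4*x^4 + r3*x^3 + r2*x^2 + r1*x + r0.
    r0 = X
    r4 = V
    r2 = (Y + Z) // 2 - r0 - r4
    r3 = (U - r0 - 4*r2 - 16*r4 - (Y - Z)) // 6
    r1 = (Y - Z) // 2 - r3

    # The coefficients are exactly the sums asked about:
    # r3 = ae + bd, r2 = af + be + cd, r1 = bf + ce.
    # Recombining at x = 10 gives the product.
    return r4*10**4 + r3*10**3 + r2*10**2 + r1*10 + r0

assert toom3_3digit(237, 514) == 237 * 514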