Karatsuba and Toom-3 algorithms for 3-digit number multiplications - multiplication

I was wondering about this problem concerning Katatsuba's algorithm.
When you apply Karatsuba you basically have to do 3 multiplications per one run of the loop
Those are (let's say ab and cd are 2-digit numbers with digits respectively a, b, c and d):
X = bd
Y = ac
Z = (a+c)(c+d)
and then the sums we were looking for are:
bd = X
ac = Y
(bc + ad) = Z - X - Y
My question is: let's say we have two 3-digit numbers: abc, def. I found out that we will have to perfom only 5 multiplications to do so. I also found this Toom-3 algorithm, but it uses polynomials I can;t quite get. Could someone write down those multiplications and how to calculate the interesting sums bd + ae, ce+ bf, cd + be + af

The basic idea is this: The number 237 is the polynomial p(x)=2x2+3x+7 evaluated at the point x=10. So, we can think of each integer corresponding to a polynomial whose coefficients are the digits of the number. When we evaluate the polynomial at x=10, we get our number back.
What is interesting is that to fully specify a polynomial of degree 2, we need its value at just 3 distinct points. We need 5 values to fully specify a polynomial of degree 4.
So, if we want to multiply two 3 digit numbers, we can do so by:
Evaluating the corresponding polynomials at 5 distinct points.
Multiplying the 5 values. We now have 5 function values of the polynomial of the product.
Finding the coefficients of this polynomial from the five values we computed in step 2.
Karatsuba multiplication works the same way, except that we only need 3 distinct points. Instead of at 10, we evaluate the polynomial at 0, 1, and "infinity", which gives us b,a+b,a and d,d+c,c which multiplied together give you your X,Z,Y.
Now, to write this all out in terms of abc and def is quite involved. In the Wikipedia article, it's actually done quite nicely:
In the Evaluation section, the polynomials are evaluated to give, for example, c,a+b+c,a-b+c,4a+2b+c,a for the first number.
In Pointwise products, the corresponding values for each number are multiplied, which gives:
X = cf
Y = (a+b+c)(d+e+f)
Z = (a+b-c)(d-e+f)
U = (4a+2b+c)(4d+2e+f)
V = ad
In the Interpolation section, these values are combined to give you the digits in the product. This involves solving a 5x5 system of linear equations, so again it's a bit more complicated than the Karatsuba case.

Related

Unnormalizing in Knuth's Algorithm D

I'm trying to implement Algorithm D from Knuth's "The Art of Computer Programming, Vol 2" in Rust although I'm having trouble understating how to implement the very last step of unnormalizing. My natural numbers are a class where each number is a vector of u64, in base u64::MAX. Addition, subtraction, and multiplication have been implemented.
Knuth's Algorithm D is a euclidean division algorithm which takes two natural numbers x and y and returns (q,r) where q = x / y (integer division) and r = x % y, the remainder. The algorithm depends on an approximation method which only works if the first digit of y is greater than b/2, where b is the base you're representing the numbers in. Since not all numbers are of this form, it uses a "normalizing trick", for example (if we were in base 10) instead of doing 200 / 23, we calculate a normalizer d and do (200 * d) / (23 * d) so that 23 * d has a first digit greater than b/2.
So when we use the approximation method, we end up with the desired q but the remainder is multiplied by a factor of d. So the last step is to divide r by d so that we can get the q and r we want. My problem is, I'm a bit confused at how we're suppose to do this last step as it requires division and the method it's part of is trying to implement division.
(Maybe helpful?):
The way that d is calculated is just by taking the integer floor of b-1 divided by the first digit of y. However, Knuth suggests that it's possible to make d a power of 2, as long as d * the first digit of y is greater than b / 2. I think he makes this suggestion so that instead of dividing, we can just do a binary shift for this last step. Although I don't think I can do that given that my numbers are represented as vectors of u64 values, instead of binary.
Any suggestions?

How to find most similar numerical arrays to one array, using Numpy/Scipy?

Let's say I have a list of 5 words:
[this, is, a, short, list]
Furthermore, I can classify some text by counting the occurrences of the words from the list above and representing these counts as a vector:
N = [1,0,2,5,10] # 1x this, 0x is, 2x a, 5x short, 10x list found in the given text
In the same way, I classify many other texts (count the 5 words per text, and represent them as counts - each row represents a different text which we will be comparing to N):
M = [[1,0,2,0,5],
[0,0,0,0,0],
[2,0,0,0,20],
[4,0,8,20,40],
...]
Now, I want to find the top 1 (2, 3 etc) rows from M that are most similar to N. Or on simple words, the most similar texts to my initial text.
The challenge is, just checking the distances between N and each row from M is not enough, since for example row M4 [4,0,8,20,40] is very different by distance from N, but still proportional (by a factor of 4) and therefore very similar. For example, the text in row M4 can be just 4x as long as the text represented by N, so naturally all counts will be 4x as high.
What is the best approach to solve this problem (of finding the most 1,2,3 etc similar texts from M to the text in N)?
Generally speaking, the most widely standard technique of bag of words (i.e. you arrays) for similarity is to check cosine similarity measure. This maps your bag of n (here 5) words to a n-dimensional space and each array is a point (which is essentially also a point vector) in that space. The most similar vectors(/points) would be ones that have the least angle to your text N in that space (this automatically takes care of proportional ones as they would be close in angle). Therefore, here is a code for it (assuming M and N are numpy arrays of the similar shape introduced in the question):
import numpy as np
cos_sim = M[np.argmax(np.dot(N, M.T)/(np.linalg.norm(M)*np.linalg.norm(N)))]
which gives output [ 4 0 8 20 40] for your inputs.
You can normalise your row counts to remove the length effect as you discussed. Row normalisation of M can be done as M / M.sum(axis=1)[:, np.newaxis]. The residual values can then be calculated as the sum of the square difference between N and M per row. The minimum difference (ignoring NaN or inf values obtained if the row sum is 0) is then the most similar.
Here is an example:
import numpy as np
N = np.array([1,0,2,5,10])
M = np.array([[1,0,2,0,5],
[0,0,0,0,0],
[2,0,0,0,20],
[4,0,8,20,40]])
# sqrt of sum of normalised square differences
similarity = np.sqrt(np.sum((M / M.sum(axis=1)[:, np.newaxis] - N / np.sum(N))**2, axis=1))
# remove any Nan values obtained by dividing by 0 by making them larger than one element
similarity[np.isnan(similarity)] = similarity[0]+1
result = M[similarity.argmin()]
result
>>> array([ 4, 0, 8, 20, 40])
You could then use np.argsort(similarity)[:n] to get the n most similar rows.

Flop count for variable initialization

Consider the following pseudo code:
a <- [0,0,0] (initializing a 3d vector to zeros)
b <- [0,0,0] (initializing a 3d vector to zeros)
c <- a . b (Dot product of two vectors)
In the above pseudo code, what is the flop count (i.e. number floating point operations)?
More generally, what I want to know is whether initialization of variables counts towards the total floating point operations or not, when looking at an algorithm's complexity.
In your case, both a and b vectors are zeros and I don't think that it is a good idea to use zeros to describe or explain the flops operation.
I would say that given vector a with entries a1,a2 and a3, and also given vector b with entries b1, b2, b3. The dot product of the two vectors is equal to aTb that gives
aTb = a1*b1+a2*b2+a3*b3
Here we have 3 multiplication operations
(i.e: a1*b1, a2*b2, a3*b3) and 2 addition operations. In total we have 5 operations or 5 flops.
If we want to generalize this example for n dimensional vectors a_n and b_n, we would have n times multiplication operations and n-1 times addition operations. In total we would end up with n+n-1 = 2n-1 operations or flops.
I hope the example I used above gives you the intuition.

Is the growth of the binomial coefficient function factorial or polynomial

I have written an algorithm that given a list of words, must check each unique combination of four words in that list of words (regardless of order).
The number of combinations to be checked, x, can be calculated using the binomial coefficient i.e. x = n!/(r!(n-r)!) where n is the total number of words in the list and r is the number of words in each combination, which in my case is always 4, therefore the function is x = n!/(4!(n-4)!) = n!/(24(n-4)!). Therefore as the number of total words, n, increases the number of combinations to be checked, x, therefore increases factorially right?
What has thrown me is that WolframAlpha was able to rewrite this function as x = (n^4)/24 − (n^3)/4 + (11.n^2)/24 − n/4, so now it would appear to grow polynomially as n grows? So which is it?!
Here is a graph to visualise the growth of the function (the letter x is switched to an l)
For a fixed value of r, this function is O(n^r). In your case, r = 4, it is O(n^4). This is because most of the terms in the numerator are canceled out by the denominator:
n!/(4!(n-4)!)
= n(n-1)(n-2)(n-3)(n-4)(n-5)(n-6)...(3)(2)(1)
-------------------------------------------
4!(n-4)(n-5)(n-6)...(3)(2)(1)
= n(n-1)(n-2)(n-3)
----------------
4!
This is a 4th degree polynomial in n.

Division in double precision

I have two double variables:
a > 0
b >= 0
which could be tiny numbers. 'a' represents singular values of a matrix and 'b' represents the Tikhonov regularization constant. As part of the Tikhonov least squares solution, it is necessary to compute the quantity:
c = a*a / (a*a + b)
However if a is really small (ie small singular values of the matrix), a*a may not be representable in double precision. How can I compute this quotient c in a numerically stable way for the given ranges of a,b?
The best I can come up with is:
c = 1 / (1 + b / a / a)
To derive this equivalency, note that 1/c is (a^2 + b)/c and then decompose the fraction. This form might be more numerically stable since it doesn't require a^2 to be calculated at any point. It'll still lose precision if both b and a are very small. If that case must be handled too, you might look at a Taylor series expansion (may or may not work for this case).