SAGE implementation of discrete logarithm in subgroup of group of units - cryptography

This is a question related to this one. Briefly, in an ElGamal cryptosystem whose underlying group is the group of units modulo a prime number p, I'm told to find a subgroup of index 2 and solve the discrete logarithm problem there in order to break the system.
Clearly, since the group of units modulo a prime is cyclic, if x is a generator then x^2 generates a subgroup of index 2. Now, what is a good way of solving the discrete logarithm problem in Sage? And how would I use the result of solving the discrete logarithm problem in this subgroup to solve it in the whole group?

Sage knows how to compute discrete logarithms in finite fields:
sage: K = GF(19)
sage: z = K.primitive_element()
sage: a = K.random_element()
sage: b = a.log(z)
sage: z^b == a
True
You can use this functionality to solve the discrete logarithm in the subgroup of index 2:
sage: x = z^2
sage: a = K.random_element()^2
sage: a.log(x)
6
This is only a toy example, but note that this is not more efficient than solving the discrete logarithm in the full group 𝔽₁₉*.
It is true that the efficiency of generic algorithms (e.g., baby-step giant-step, Pollard's rho, ...) is directly related to the size of the subgroup; however, the algorithms used to solve discrete logarithms in finite fields (number field sieve, function field sieve) are mostly insensitive to the size of the multiplicative subgroup, and they are in general much more efficient than the generic algorithms.
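As for using the subgroup computation to recover the logarithm in the whole group: since x = z^2 has order (p-1)/2, the value of log_x(a^2) determines log_z(a) modulo (p-1)/2, which leaves only two candidates to test. A rough Sage sketch of this idea (my own illustration, not part of the answer above):
p = 19
K = GF(p)
z = K.primitive_element()
x = z**2                       # generates the index-2 subgroup, of order (p-1)/2
a = K.random_element()
while a == 0:                  # 0 has no discrete logarithm
    a = K.random_element()
k = (a**2).log(x)              # a^2 is a square, so it lies in <x>
# log_z(a) is congruent to k modulo (p-1)/2, hence equals k or k + (p-1)/2
b = k if z**k == a else k + (p - 1)//2
assert z**b == a
For a prime of cryptographic size this does not make the problem easier, as explained above; it only shows how a subgroup solution is lifted back to the full group.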

Related

Given a set of points or vectors, find the set of N points that are closest to each other

I have, for example, 100 vectors, each of dimension 12. I would like to find, for example, the 8 vectors that are closest to each other. In other words, the top 8 matching vectors. I may use Euclidean or Manhattan distance as the metric to quantify the similarity between the vectors. Some initial thinking suggests that I could formulate this problem as a 0-1 nonlinear program, which becomes NP-hard to solve as the number of vectors increases. I also went through the k-means clustering algorithm, but it does not use the Euclidean distance as a measure. Any idea which algorithm can target this problem? The reason I am asking is that I am sure this problem has been addressed in the literature, but I could not find such an algorithm.
This can actually be formulated as a quadratic or linear integer program:
The quadratic model can look like:
min sum((i,j), x(i)*x(j)*dist(i,j))
sum(i, x(i)) = 8
x(i) ∈ {0,1}
The linear MIP model is a variant of the quadratic model:
min sum((i,j), y(i,j)*dist(i,j))
sum(i, x(i)) = 8
y(i,j) >= x(i)+x(j)-1
x(i) ∈ {0,1}
y(i,j) ∈ [0,1]
We can refine things by only considering distances with i < j (essentially no double counting).
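Here is a minimal sketch of the linear sum model in Python with PuLP (any MIP solver would do), on random data just to show the shape of the model:
import itertools
import math
import random
import pulp
random.seed(0)
n, k, d = 30, 8, 12                               # points, selection size, dimension
pts = [[random.random() for _ in range(d)] for _ in range(n)]
dist = {(i, j): math.dist(pts[i], pts[j])
        for i, j in itertools.combinations(range(n), 2)}   # only i < j
prob = pulp.LpProblem("closest_k_points", pulp.LpMinimize)
x = pulp.LpVariable.dicts("x", range(n), cat="Binary")
y = pulp.LpVariable.dicts("y", list(dist), lowBound=0, upBound=1)
prob += pulp.lpSum(dist[ij] * y[ij] for ij in dist)          # sum of selected pairwise distances
prob += pulp.lpSum(x[i] for i in range(n)) == k              # pick exactly k points
for i, j in dist:
    prob += y[(i, j)] >= x[i] + x[j] - 1                     # y is forced to 1 when both endpoints are picked
prob.solve()
print(sorted(i for i in range(n) if x[i].value() > 0.5))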
Instead of summing over all distances, we can also minimize the maximum distance in our selected points:
min z
z >= y(i,j)*dist(i,j) for all i<j
sum(i, x(i)) = 8
y(i,j) >= x(i)+x(j)-1
x(i) ∈ {0,1}
y(i,j) ∈ [0,1]
These models are independent of what metric or dimensionality you use. Whether using Euclidean or Manhattan distances or whether you normalize or use weights, the models stay the same. The same thing for whether you have low- or high-dimensional data. These models just need a distance matrix.
The MIP models solve quite fast with Gurobi. With random data using your sizes (select 8 points from 100 using 12-dimensional coordinates), these models take 50 and 9 seconds for the linear sum and max model to find proven optimal solutions. Some more details are here.
For a 2d data set we can plot the results.
I agree with Jérôme Richard that the distance measure needs to be clarified. Is it the sum of the Euclidean lengths of the 28 line segments connecting pairs of the eight selected points, or the maximum distance between any pair, or what?
That said, if "similarity" is the goal, we might ask which eight points are contained in the smallest possible ball or box. Finding the smallest ball is a convex quadratic problem and finding the smallest hyperrectangle has a nonconvex objective (I think), but finding the vectors that fit in the smallest hypercube is easy to formulate as a mixed integer linear program.
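For what it's worth, that hypercube version can be written as a MILP along the following lines (my own sketch with PuLP, using a big-M linearization and assuming the coordinates lie in [0, 1] so that M = 1 is a valid bound):
import random
import pulp
random.seed(0)
n, k, d = 30, 8, 12
v = [[random.random() for _ in range(d)] for _ in range(n)]
M = 1.0                                            # valid because coordinates are in [0, 1]
prob = pulp.LpProblem("smallest_hypercube", pulp.LpMinimize)
x = pulp.LpVariable.dicts("x", range(n), cat="Binary")      # 1 if point i is selected
lo = pulp.LpVariable.dicts("lo", range(d))                  # lower corner of the cube
s = pulp.LpVariable("side", lowBound=0)                     # side length of the cube
prob += s                                                   # minimize the side length
prob += pulp.lpSum(x[i] for i in range(n)) == k
for i in range(n):
    for j in range(d):
        # if x[i] = 1, force lo[j] <= v[i][j] <= lo[j] + s
        prob += v[i][j] >= lo[j] - M * (1 - x[i])
        prob += v[i][j] <= lo[j] + s + M * (1 - x[i])
prob.solve()
print(sorted(i for i in range(n) if x[i].value() > 0.5), s.value())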
With OPL CPLEX you can solve this with both Mathematical Programming and Constraint Programming.
Math prog:
int scale=1000000;
int nbdim=2;
int m=8;
int n=50;
range dim=1..nbdim;
range points=1..n;
float x[p in points][d in dim]=rand(scale)/scale;
float distance[p1 in points][p2 in points]=sqrt(sum(d in dim)((x[p1][d]-x[p2][d])^2));
// Should we keep that point ?
dvar boolean which[points];
dexpr float maxDist=max(ordered i,j in points) distance[i][j]*((which[i]==1) && (which[j]==1));
minimize maxDist;
subject to
{
sum (p in points) which[p]==m;
}
{int} chosenPoints={k | k in points:which[k]==1};
Constraint Programming:
using CP;
int scale=1000000;
int nbdim=2;
int m=8;
int n=50;
range dim=1..nbdim;
range points=1..n;
float x[p in points][d in dim]=rand(scale)/scale;
float distance[p1 in points][p2 in points]=sqrt(sum(d in dim)((x[p1][d]-x[p2][d])^2));
// which point is chosen as the m-th point?
dvar int which[1..m] in points;
dexpr float maxDist=max(ordered i,j in 1..m) distance[which[i]][which[j]];
minimize maxDist;
subject to
{
allDifferent(which);
}
{int} chosenPoints={which[k] | k in 1..m};

Fast algorithm for computing cofactor matrix

I wonder if there is a fast algorithm, say O(n^3), for computing the cofactor matrix (or adjugate matrix) of an N*N square matrix. And yes, one could first compute its determinant and inverse separately and then multiply them together. But what if the square matrix is non-invertible?
I am curious about the accepted answer here: Speed up python code for computing matrix cofactors
What would it mean by "This probably means that also for non-invertible matrixes, there is some clever way to calculate the cofactor (i.e., not use the mathematical formula that you use above, but some other equivalent definition)."?
Factorize M = L x D x U, where L is lower triangular with ones on the main diagonal, U is upper triangular with ones on the main diagonal, and D is diagonal.
You can use back-substitution as with Cholesky factorization, which is similar. Then,
M^{-1} = U^{-1} x D^{-1} x L^{-1}
and then the transpose of the cofactor matrix is:
Cof(M)^T = Det(U) x Det(D) x Det(L) x M^{-1}.
If M is singular or nearly so, one element (or more) of D will be zero or nearly zero. Replace those elements with zero in the matrix product and 1 in the determinant, and use the above equation for the transpose cofactor matrix.
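As a concrete companion to this, here is a small NumPy sketch (an illustration of the underlying identity, not of the LDU route above): it uses Cof(M) = Det(M) * (M^{-1})^T when M is comfortably invertible, and falls back to computing each cofactor from its minor, which is slow but also works for singular matrices:
import numpy as np
def cofactor_matrix(M, tol=1e-12):
    """Cofactor matrix of a square matrix M (its transpose is the adjugate)."""
    M = np.asarray(M, dtype=float)
    n = M.shape[0]
    det = np.linalg.det(M)
    if abs(det) > tol:
        # Cof(M) = det(M) * (M^{-1})^T for invertible M
        return det * np.linalg.inv(M).T
    # Singular (or nearly singular): fall back to explicit minors, O(n^5)
    C = np.empty_like(M)
    for i in range(n):
        for j in range(n):
            minor = np.delete(np.delete(M, i, axis=0), j, axis=1)
            C[i, j] = (-1) ** (i + j) * np.linalg.det(minor)
    return C
print(cofactor_matrix([[1.0, 2.0], [2.0, 4.0]]))   # singular example: [[4, -2], [-2, 1]]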

Big O notation and measuring time according to it

Suppose we have an algorithm that is of order O(2^n). Furthermore, suppose we multiply the input size n by 2, so now we have an input of size 2n. How is the time affected? Do we look at the problem as if the original time was 2^n and now it became 2^(2n), so the answer would be that the new time is the square of the previous time?
Big O does not tell you the actual running time, just how the running time is affected by the size of the input. If you double the size of the input, the complexity is still O(2^n); n is just bigger.
number of elements (n)    units of work (2^n)
 1                            2
 2                            4
 3                            8
 4                           16
 5                           32
 ...                        ...
10                         1024
20                      1048576
There's a misunderstanding here about how Big-O relates to execution time.
Consider the following formulas which define execution time:
f1(n) = 2^n + 5000n^2 + 12300
f2(n) = (500 * 2^n) + 6
f3(n) = 500n^2 + 25000n + 456000
f4(n) = 400000000
Each of these functions is O(2^n); that is, each can be shown to be at most M * 2^n for some constant M and starting value n0. But obviously, the change in execution time you notice when doubling the size from n1 to 2 * n1 will vary wildly between them (not at all in the case of f4(n)). You cannot use Big-O analysis to determine the effect on execution time. It only defines an upper bound on the execution time (and that bound is not even guaranteed to be tight).
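To see this numerically, here is a quick Python sketch (just an illustration) that evaluates the four formulas above at n and 2n and prints how much the "execution time" grows:
# how much does each running-time formula grow when n is doubled?
funcs = {
    "f1": lambda n: 2**n + 5000*n**2 + 12300,
    "f2": lambda n: 500 * 2**n + 6,
    "f3": lambda n: 500*n**2 + 25000*n + 456000,
    "f4": lambda n: 400000000,
}
n = 20
for name, f in funcs.items():
    print(name, f(2 * n) / f(n))
f2 blows up by a factor of about 2^20, f3 roughly doubles, and f4 does not change at all, even though every one of them is O(2^n).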
Some related background:
There are three notable bounding functions in this category:
O(f(n)): Big-O - This defines an upper bound.
Ω(f(n)): Big-Omega - This defines a lower bound.
Θ(f(n)): Big-Theta - This defines a tight bound.
A given time function f(n) is Θ(g(n)) if and only if it is both Ω(g(n)) and O(g(n)) (that is, both lower and upper bounded).
You are dealing with Big-O, which is the usual "entry point" to the discussion; we will neglect the other two entirely.
Consider the definition from Wikipedia:
Let f and g be two functions defined on some subset of the real numbers. One writes:
f(x)=O(g(x)) as x tends to infinity
if and only if there is a positive constant M such that for all sufficiently large values of x, the absolute value of f(x) is at most M multiplied by the absolute value of g(x). That is, f(x) = O(g(x)) if and only if there exists a positive real number M and a real number x0 such that
|f(x)| <= M|g(x)| for all x > x0
Going from here, assume we have g1(n) = 2^n. If we were to compare that to g2(n) = 2^(2n) = 4^n, how would g1(n) and g2(n) relate to each other in Big-O terms?
Is 2^n <= M * 4^n for some constant M and starting value n0? Of course! Using M = 1 and n0 = 1, it is true. Thus, 2^n is upper-bounded by O(4^n).
Is 4^n <= M * 2^n for some constant M and starting value n0? This is where you run into problems... for no constant value of M can you make M * 2^n keep up with 4^n as n gets arbitrarily large. Thus, 4^n is not upper-bounded by O(2^n).
See the comments for further explanation, but indeed, this is just an example I came up with to help you grasp the Big-O concept; it is not the actual formal meaning.
Suppose you have an array, arr = [1, 2, 3, 4, 5].
An example of an O(1) operation would be directly accessing an index, such as arr[0] or arr[2].
An example of an O(n) operation would be a loop that iterates through your whole array, such as for elem in arr:.
n would be the size of your array. If your array is twice as big as the original array, n would also be twice as big. That's how variables work.
See the Big-O Cheat Sheet for complementary information.

Flop count for variable initialization

Consider the following pseudo code:
a <- [0,0,0] (initializing a 3d vector to zeros)
b <- [0,0,0] (initializing a 3d vector to zeros)
c <- a . b (Dot product of two vectors)
In the above pseudo code, what is the flop count (i.e. the number of floating point operations)?
More generally, what I want to know is whether initialization of variables counts towards the total floating point operations or not, when looking at an algorithm's complexity.
In your case, both vectors a and b are all zeros, and I don't think it is a good idea to use zeros to explain flop counting.
Instead, take a vector a with entries a1, a2, a3 and a vector b with entries b1, b2, b3. The dot product of the two vectors is a^T b, which gives
a^T b = a1*b1 + a2*b2 + a3*b3
Here we have 3 multiplication operations (a1*b1, a2*b2, a3*b3) and 2 addition operations, so in total we have 5 operations, or 5 flops.
If we generalize this example to n-dimensional vectors a and b, we have n multiplications and n-1 additions, for a total of n + n - 1 = 2n - 1 operations, or flops.
I hope the example I used above gives you the intuition.
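For a concrete check, here is a small Python sketch (a toy helper, just for illustration) that counts the floating point operations in a dot product explicitly; note that, by the usual convention, the zero-initializations in the pseudo code are assignments, not arithmetic, and contribute no flops:
def dot_with_flop_count(a, b):
    """Return (dot product, flop count) for two equal-length vectors."""
    assert len(a) == len(b) and len(a) > 0
    flops = 0
    total = a[0] * b[0]              # 1 multiplication
    flops += 1
    for ai, bi in zip(a[1:], b[1:]):
        total += ai * bi             # 1 multiplication + 1 addition
        flops += 2
    return total, flops
print(dot_with_flop_count([0.0, 0.0, 0.0], [0.0, 0.0, 0.0]))   # (0.0, 5): still 2*3 - 1 flops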

Karatsuba and Toom-3 algorithms for 3-digit number multiplications

I was wondering about this problem concerning Karatsuba's algorithm.
When you apply Karatsuba you basically have to do 3 multiplications per recursion step.
Those are (let's say ab and cd are 2-digit numbers with digits a, b and c, d respectively):
X = bd
Y = ac
Z = (a+b)(c+d)
and then the sums we were looking for are:
bd = X
ac = Y
(bc + ad) = Z - X - Y
My question is: let's say we have two 3-digit numbers, abc and def. I found out that we only have to perform 5 multiplications to multiply them. I also found the Toom-3 algorithm, but it uses polynomials I can't quite get. Could someone write down those multiplications and how to calculate the interesting sums bd + ae, ce + bf, cd + be + af?
The basic idea is this: The number 237 is the polynomial p(x) = 2x^2 + 3x + 7 evaluated at the point x = 10. So, we can think of each integer corresponding to a polynomial whose coefficients are the digits of the number. When we evaluate the polynomial at x = 10, we get our number back.
What is interesting is that to fully specify a polynomial of degree 2, we need its value at just 3 distinct points. We need 5 values to fully specify a polynomial of degree 4.
So, if we want to multiply two 3 digit numbers, we can do so by:
Evaluating the corresponding polynomials at 5 distinct points.
Multiplying the corresponding pairs of values. We now have the value of the product polynomial at those 5 points.
Finding the coefficients of this polynomial from the five values we computed in step 2.
Karatsuba multiplication works the same way, except that we only need 3 distinct points. Instead of at 10, we evaluate the polynomial at 0, 1, and "infinity", which gives us b, a+b, a and d, c+d, c, which multiplied together pointwise give you your X, Z, Y.
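To make the 2-digit case concrete, here is a tiny Python sketch using the digit names from the question:
def karatsuba_digits(a, b, c, d):
    # multiply the 2-digit numbers 10a+b and 10c+d with 3 multiplications
    X = b * d                    # p(0)   * q(0)
    Y = a * c                    # p(inf) * q(inf)
    Z = (a + b) * (c + d)        # p(1)   * q(1)
    return Y, Z - X - Y, X       # coefficients of 100, 10 and 1
Y, mid, X = karatsuba_digits(4, 7, 3, 8)      # 47 * 38
assert 100 * Y + 10 * mid + X == 47 * 38      # mid = a*d + b*c = 53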
Now, to write this all out in terms of abc and def is quite involved. In the Wikipedia article, it's actually done quite nicely:
In the Evaluation section, the polynomials are evaluated to give, for example, c, a+b+c, a-b+c, 4a+2b+c, a for the first number.
In Pointwise products, the corresponding values for each number are multiplied, which gives:
X = cf
Y = (a+b+c)(d+e+f)
Z = (a-b+c)(d-e+f)
U = (4a+2b+c)(4d+2e+f)
V = ad
In the Interpolation section, these values are combined to give you the digits in the product. This involves solving a 5x5 system of linear equations, so again it's a bit more complicated than the Karatsuba case.
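Putting it all together for two 3-digit numbers abc and def, here is a Python sketch of the whole 5-multiplication scheme; the interpolation step is my own working-out for the evaluation points 0, 1, -1, 2 and infinity that the values above correspond to, so take it as an illustration of the idea rather than the exact derivation in the article:
def toom3_digits(a, b, c, d, e, f):
    # Evaluation: p(x) = a*x^2 + b*x + c and q(x) = d*x^2 + e*x + f
    X = c * f                                  # p(0)   * q(0)
    Y = (a + b + c) * (d + e + f)              # p(1)   * q(1)
    Z = (a - b + c) * (d - e + f)              # p(-1)  * q(-1)
    U = (4*a + 2*b + c) * (4*d + 2*e + f)      # p(2)   * q(2)
    V = a * d                                  # p(inf) * q(inf)
    # Interpolation: recover r(x) = r4*x^4 + r3*x^3 + r2*x^2 + r1*x + r0
    r0 = X                                     # = c*f
    r4 = V                                     # = a*d
    r2 = (Y + Z) // 2 - r0 - r4                # = a*f + b*e + c*d
    t = (Y - Z) // 2                           # = r1 + r3
    r3 = ((U - r0 - 4*r2 - 16*r4) // 2 - t) // 3   # = a*e + b*d
    r1 = t - r3                                # = b*f + c*e
    return r4, r3, r2, r1, r0
# check against ordinary multiplication, e.g. 237 * 415
r4, r3, r2, r1, r0 = toom3_digits(2, 3, 7, 4, 1, 5)
assert r4*10**4 + r3*10**3 + r2*10**2 + r1*10 + r0 == 237 * 415
Only the five products X, Y, Z, U, V multiply quantities that depend on both numbers; everything else is additions, subtractions and exact divisions by the small constants 2 and 3.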