Why does the security of RSA depend on the non-factorability of the modulus n? - cryptography

Just wondering why does the security of RSA depend on the non-factorability of the modulus n?
Cheers!

Well, the non-factorability of the modulus n is not the whole story.
As vlad already pointed out, you can easily calculate the private exponent if you know the factors of n.
For n = p*q you need (p-1)(q-1). More generally, if n is a product of distinct primes P[i], the product of all (P[i] - 1) is Euler's PHI function: the number of elements mod n that have a multiplicative inverse.
If you can factorize n, that calculation is trivial. If n is the product of only 2 large primes and that factorization is hard, it is not.
However, if you come up with another way of calculating PHI(n), factorization would probably no longer be your problem.
Currently there is no other publicly known way of calculating PHI(n) than Euler's way: prod(P[i] - 1).
So either finding a way to factorize n, or calculating PHI(n) a different way, would probably lead to breaking RSA.

The public data in RSA is n - the public modulus, and e - the public exponent. The secret is d - the private exponent.
When creating the parameters you first generate two random primes p and q and then compute the public modulus n = p*q. So p and q are the factorization of n. Actually you could use more primes, but most use just two.
Then you choose the public exponent e, which is usually a small prime such as 65537 or 17 or even 3.
Your secret exponent d would then be d = 1/e mod (p-1)(q-1), that is, the modular inverse of e modulo (p-1)(q-1).
So clearly anyone could compute d if they knew p and q, which is the factorization.
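The steps above can be sketched in a few lines of Python. This is only a toy illustration: the primes 61 and 53 and the exponent 17 are assumptions chosen to keep the numbers readable, far too small for real use, and `pow(e, -1, phi)` needs Python 3.8 or later.

```python
# Toy RSA parameters (assumed for illustration; real keys use primes of 1024+ bits).
p, q = 61, 53
e = 17

def derive_private_exponent(p, q, e):
    """Compute d = 1/e mod (p-1)(q-1) from the factorization of n."""
    phi = (p - 1) * (q - 1)
    return pow(e, -1, phi)  # modular inverse, Python 3.8+

n = p * q
d = derive_private_exponent(p, q, e)

def encrypt(m):
    return pow(m, e, n)

def decrypt(c):
    return pow(c, d, n)
```

Anyone who knows p and q can call `derive_private_exponent` exactly as the key owner does, which is the point of the answer above.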

Related

Ciphertext is not the same as the original one after verification step

I'm learning how to decrypt a message using the RSA algorithm by doing this exercise to calculate the message.
The value of C is 2826893841, and public key (n= 5399937593 and e=3203).
I've computed d (equal to 2311305263), two prime numbers p (equal to 63419) and q (equal to 85147), The result of M could be C^(d) mod n = 2104674266.
The problem is that when I tried to verify the value of C = M^(e) mod n = (2104674266)^(3203) mod 5399937593, it's equal to 91392319 instead of 2826893841 as given above.
I use this website to calculate the mod https://www.mtholyoke.edu/courses/quenell/s2003/ma139/js/powermod.html
Maybe I did something wrong to solve this problem, please tell me how to fix it.
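One way to cross-check the arithmetic independently of the website is Python's built-in three-argument `pow`, which does exact modular exponentiation on arbitrarily large integers. The sketch below only uses the values quoted in the question; the function names are made up for illustration. If the parameters are consistent, the re-encrypted value should come back equal to C.

```python
# Values quoted in the question.
n, e = 5399937593, 3203
p, q = 63419, 85147
d = 2311305263
C = 2826893841

def check_parameters():
    """True if p*q equals n and d is the inverse of e mod (p-1)(q-1)."""
    phi = (p - 1) * (q - 1)
    return p * q == n and (e * d) % phi == 1

def decrypt_then_reencrypt():
    """Return (M, C') where M = C^d mod n and C' = M^e mod n."""
    M = pow(C, d, n)
    return M, pow(M, e, n)

if __name__ == "__main__":
    print(check_parameters())
    print(decrypt_then_reencrypt())
```

A calculator that works in floating point or fixed-width integers can silently lose precision on intermediate products of this size, so a mismatch there does not necessarily mean d is wrong.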

Problem determining the bit length of a key from the modulus in the RSA algorithm

Here are two 64-bit (signed) integers
p = 13776308150928489016
q = 16488138731131959619
and their product
n = 112488352363349635896748360565917156710
The bit-length of the product is floor ((log2 n) + 1) or 127.
Now here are another two 64-bit integers
p = 13275629912622491628
q = 16290498985329101221
and their product
n = 179030914337714357408535416678431567970
but this time the bit length is floor ((log2 n) + 1) or 128.
The reason is that there's a leading zero in the first integer, which makes the space needed to represent the integer in memory one bit smaller.
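The two bit lengths quoted above can be reproduced with Python's `int.bit_length()`, which returns the same value as floor(log2 n) + 1 for positive integers without going through floating point. This is just a check of the two moduli as written in the question.

```python
# The two products quoted in the question.
n1 = 112488352363349635896748360565917156710
n2 = 179030914337714357408535416678431567970

def bit_length(n):
    """Number of bits needed to represent a positive integer n."""
    return n.bit_length()

if __name__ == "__main__":
    print(bit_length(n1), bit_length(n2))
```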
The problem this causes is that I can't determine the bit length of the keys accurately. For example, here is a very short RSA key pair:
Public key : 7, 8371846783263706079
Private key : 2989945277626202443, 8371846783263706079
The modulus (8371846783263706079) is 63 bits, while the number I'm after is 64. To overcome this issue I have considered the following solutions:
Round up to the nearest 2^n
Store the key size in bits along with the key
Add some kind of padding to ensure all integers take up the same space (not sure how this would work in practice)
Which one is the correct solution?
As @r3mainer notes, the math needed here -- inequalities -- is not exotic. As to what tutorials say, well, they're just tutorials, they're trying to simplify as much as possible so they leave out some details.
What you are observing is the following:
you want two primes, p and q, to have the same bit length k and their product N to have a bit length of 2k.
By the definition of what it means to have a bit length of k, we have the following inequality:
1) 2^(k-1) <= p, q < 2^k.
However, when we multiply p and q we discover a problem:
2) 2^(2k-2) <= N < 2^(2k)
This means that N=p*q may end up having bit length of 2k-1 or 2k, but we don't want 2k-1.
In your example k=64.
To fix it, we need to tighten up the lower bound on p and q to the following:
3) sqrt(2^(2k-1)) <= p, q < 2^k.
Bearing in mind that all results are integers, we apply the ceiling function and get finally
4) ceiling(sqrt(2^(2k-1))) <= p, q < 2^k.
For k=64 this works out to:
13043817825332782213 <= p, q < 2^64
An even simpler formulation is to make the bounds dynamic, as in the following:
First find p, of any size. Then we want
2^(2k-1) <= p*q < 2^(2k), so
5) (2^(2k-1))/p <= q < (2^(2k))/p will do the trick.
For RSA, we actually do want both primes to be sufficiently large and entropic, and yet not be too close to each other. We can do that by choosing p to have length k-1 or k-2 and applying 5).
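Bounds 4) and 5) can be computed exactly with integer arithmetic. The sketch below is one possible way to do it in Python; the function names are invented for illustration, and it only computes the bounds (it does not generate or test primes).

```python
from math import isqrt

def min_prime_bound(k):
    """Bound 4): smallest integer x with x*x >= 2^(2k-1)."""
    target = 1 << (2 * k - 1)
    root = isqrt(target)  # floor of the exact square root
    return root if root * root == target else root + 1

def q_range(p, k):
    """Bound 5): inclusive (lo, hi) such that p*q has exactly 2k bits."""
    lo = -(-(1 << (2 * k - 1)) // p)  # ceiling division
    hi = ((1 << (2 * k)) - 1) // p    # largest q with p*q < 2^(2k)
    return lo, hi

if __name__ == "__main__":
    print(min_prime_bound(64))
    print(q_range((1 << 62) + 1, 64))
```

With these helpers a key generator would pick p first, then draw candidate primes q from `q_range(p, k)`, so the modulus always has 2k bits.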

Is this O(N) algorithm actually O(logN)?

I have an integer, N.
I denote f[i] = number of appearances of the digit i in N.
Now, I have the following algorithm.
FOR i = 0 TO 9
    FOR j = 1 TO f[i]
        k = k*10 + i;
My teacher said this is O(N). It seems to me more like a O(logN) algorithm.
Am I missing something?
I think that you and your teacher are saying the same thing but it gets confused because the integer you are using is named N but it is also common to refer to an algorithm that is linear in the size of its input as O(N). N is getting overloaded as the specific name and the generic figure of speech.
Suppose we say instead that your number is Z and its digits are counted in the array d and then their frequencies are in f. For example, we could have:
Z = 12321
d = [1,2,3,2,1]
f = [0,2,2,1,0,0,0,0,0,0]
Then the cost of going through all the digits in d and computing the count for each will be O( size(d) ) = O( log (Z) ). This is basically what your second loop is doing in reverse: it executes one time for each occurrence of each digit. So you are right that there is something logarithmic going on here -- the number of digits of Z is logarithmic in the size of Z. But your teacher is also right that there is something linear going on here -- counting those digits is linear in the number of digits.
The time complexity of an algorithm is generally measured as a function of the input size. Your algorithm doesn't take N as an input; the input seems to be the array f. There is another variable named k which your code doesn't declare, but I assume that's an oversight and you meant to initialise e.g. k = 0 before the first loop, so that k is not an input to the algorithm.
The outer loop runs 10 times, and the inner loop runs f[i] times for each i. Therefore the total number of iterations of the inner loop equals the sum of the numbers in the array f. So the complexity could be written as O(sum(f)) or O(Σf) where Σ is the mathematical symbol for summation.
Since you defined that N is an integer which f counts the digits of, it is in fact possible to prove that O(Σf) is the same thing as O(log N), so long as N must be a positive integer. This is because Σf equals how many digits the number N has, which is approximately (log N) / (log 10). So by your definition of N, you are correct.
My guess is that your teacher disagrees with you because they think N means something else. If your teacher defines N = Σf then the complexity would be O(N). Or perhaps your teacher made a genuine mistake; that is not impossible. But the first thing to do is make sure you agree on the meaning of N.
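To make the count concrete, here is a small Python sketch of the loop from the question with a step counter added. It assumes k starts at 0 and that f is built from the decimal digits of N; the function names are made up for illustration.

```python
def digit_counts(n):
    """f[i] = number of appearances of digit i in the positive integer n."""
    f = [0] * 10
    for ch in str(n):
        f[int(ch)] += 1
    return f

def rebuild_sorted(f):
    """Run the loop from the question; return (k, inner-loop iterations)."""
    k = 0
    steps = 0
    for i in range(10):
        for _ in range(f[i]):
            k = k * 10 + i
            steps += 1
    return k, steps

if __name__ == "__main__":
    print(rebuild_sorted(digit_counts(12321)))
```

The step count always equals the number of digits of N, which is the sum of f and grows like log N.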
I find your explanation a bit confusing, but let's assume N = 9075936782959 is an integer. Then O(N) doesn't really make sense; O(length of N) makes more sense. I'll use n for the length of N.
Then f(i) iterates over each digit in N and counts how many times i appears in N, which makes f(i) O(n) (it's linear). I'm assuming f(i) is a function, not an array.
Your algorithm loops at most:
10 times (first loop)
0 to n times, but the total is n (the sum of f(i) for all digits must be n)
It's tempting to say the algorithm is then O(10 + n*f(i)) = O(n^2) (removing the constant), but f(i) is only calculated 10 times, once each time the second loop is entered, so the cost is 10 + n + 10*f(i) = 10 + 11n, which is O(n). If f(i) is an array, each lookup is constant time.
I'm sure I didn't see the problem the same way as you. I'm still a little confused about the definition in your question. How did you come up with log(n)?

Diffie Hellman generating a result in a range

I'm using the Diffie-Hellman key exchange method to securely generate a key for use with the AES cipher (the result will be hashed to make it the ideal length). Assuming the exponent is a prime of length 2^2048 bits, how can I calculate the size of the base and the modulus if I want the decimal result to be of a length in between (2^6)^32 and (2^6)^40 (i.e. a base64 string of length equal to or greater than 32 and less than or equal to 40 characters). The base I want to use is within the range 3
I'm new to Diffie-Hellman exchanges, are there any restrictions on the modulus, the base or the exponents that i should be aware of?
Is there an equation i can use to derive the ideal pair lengths, or do i have to pre calculate it and store it in an array.
Thanks,
I'm not sure what you are asking about.
For Diffie-Hellman you choose a safe or strong prime p between 2^2047 and 2^2048-1 in your case, then choose an element 0 < g < p-1 such that g^(p-1) mod p = 1 but g^x mod p ≠ 1 for all 0 < x < p-1. p and g are constant parameters for your implementation. The size of g does not matter for the scheme. Now for a key exchange you sample 0 < a,b < p-1 uniformly at random, exchange g^a mod p and g^b mod p, and calculate g^(ab) mod p. Because of the random choice of a and b, the result g^(ab) mod p is also random with 0 < (g^(ab) mod p) < p.
As you have already noticed you can then hash g^ab mod p to generate a short key (256 bit with sha256 for example).
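The exchange described above can be sketched in Python as follows. The parameters p = 23 and g = 5 are toy values assumed purely for illustration (a real deployment would use a 2048-bit prime as described), and the helper names are hypothetical.

```python
import hashlib
import secrets

# Toy parameters, assumed for illustration only; 5 is a primitive root mod 23.
P = 23
G = 5

def random_private():
    """Sample a private exponent uniformly from 1 .. P-2."""
    return secrets.randbelow(P - 2) + 1

def public_value(private):
    """g^a mod p, the value sent to the other party."""
    return pow(G, private, P)

def shared_key(their_public, my_private):
    """Hash g^(ab) mod p with SHA-256 to get a 256-bit key."""
    secret = pow(their_public, my_private, P)
    size = (P.bit_length() + 7) // 8
    return hashlib.sha256(secret.to_bytes(size, "big")).digest()

if __name__ == "__main__":
    a, b = random_private(), random_private()
    print(shared_key(public_value(b), a) == shared_key(public_value(a), b))
```

Whatever the size of p, the hash output has a fixed length, so the size of the base and modulus does not need to be tuned to reach a particular key length.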

Modular arithmetic

I'm new to cryptography and modular arithmetic. So, I'm sure it's a silly question, but I can't help it.
How do I calculate a from
pow(a,q) = 1 (mod p),
where p and q are known? I don't get the "1 (mod p)" part, it equals 1, doesn't it? If so, then what is "mod p" about?
Is this the same as
pow(a,-q) (mod p) = 1?
The (mod p) part refers not to the right hand side, but to the equality sign: it says that modulo p, pow(a,q) and 1 are equal. For instance, "modulo 10, 246126 and 7868726 are equal" (and they are also both equal to 6 modulo 10): two numbers x and y are equal modulo p if they have the same remainder on dividing by p, or equivalently, if p divides x-y.
Since you seem to be coming from a programming perspective, another way of saying it is that pow(a,q)%p=1, where "%" is the "remainder" operator as implemented in several languages (assuming that p>1).
You should read the Wikipedia article on Modular arithmetic, or any elementary number theory book (or even a cryptography book, since it is likely to introduce modular arithmetic).
To answer your other question: to the best of my knowledge there is no general formula for finding such an a. Assuming that p is prime, and using Fermat's little theorem to reduce q modulo p-1, and assuming that q divides p-1 (otherwise no a of order exactly q exists, although a = 1 always works trivially), you can produce such an a by taking a primitive root of p and raising it to the power (p-1)/q. [And more generally, when p is not prime, you can reduce q modulo φ(p), then assuming it divides φ(p) and you know a primitive root (say r) mod p, you can take r to the power of φ(p)/q, where φ is the totient function -- this comes from Euler's theorem.]
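As a small worked instance of the prime case, here is a Python sketch. The values p = 23, q = 11 and the primitive root 5 are assumptions picked so the numbers stay small; the function name is made up.

```python
def element_with_power_one(p, q, primitive_root):
    """Return a = r^((p-1)/q) mod p, which satisfies a^q = 1 (mod p).

    Assumes p is prime, q divides p-1, and primitive_root generates
    the multiplicative group mod p.
    """
    if (p - 1) % q != 0:
        raise ValueError("q must divide p-1")
    return pow(primitive_root, (p - 1) // q, p)

if __name__ == "__main__":
    a = element_with_power_one(23, 11, 5)
    print(a, pow(a, 11, 23))
```

Here 5^2 mod 23 = 2, and 2^11 = 2048 = 89*23 + 1, so the result satisfies pow(a,q) % p == 1 as required.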
Not silly at all, as this is the basis for public-key encryption. You can find an excellent discussion on this at http://home.scarlet.be/~ping1339/congr.htm#The-equation-a%3Csup%3Ex.
PKI works by choosing p and q that are large and relatively prime. One (say p) becomes your private key and the other (q) is your public key. The encryption is "broken" if an attacker guesses p, given a^q (the encrypted message) and q (your public key).
So, to answer your question:
a^q = 1 mod p
This means a^q is a number that leaves a remainder of 1 when divided by p. We don't care about the integer portion of the quotient, so we can write:
a^q / p = n + 1/p
for any integer value of n. If we multiply both sides of the equation by p, we have:
a^q = np + 1
Solving for a we have:
a = (np+1)^(1/q)
The final step is to find a value of n that generates the original value of a. I don't know of any way to do this other than trial and error -- which equates to a "brute force" attempt to break the encryption.