Problem determining the bit length of a key from the modulus in the RSA algorithm - cryptography

Here are two 64-bit (unsigned) integers
p = 13776308150928489016
q = 16488138731131959619
and their product
n = 112488352363349635896748360565917156710
The bit-length of the product is floor ((log2 n) + 1) or 127.
Now here are another two 64-bit integers
p = 13275629912622491628
q = 16290498985329101221
and their product
n = 179030914337714357408535416678431567970
but this time the bit length is floor ((log2 n) + 1) or 128.
The reason is that the first product has a leading zero bit, which makes the space needed to represent it in memory one bit smaller.
The problem this causes is that I can't determine the bit length of the keys accurately. For example, here is a very short RSA key pair:
Public key : 7, 8371846783263706079
Private key : 2989945277626202443, 8371846783263706079
The modulus (8371846783263706079) is 63 bits, but the number I'm after is 64. To overcome this issue I have considered the following solutions:
Round up to the nearest 2^n
Store the key size in bits along with the key
Add some kind of padding to ensure all integers take up the same space (not sure how this would work in practice)
Which one is the correct solution?

As @r3mainer notes, the math needed here -- inequalities -- is not exotic. As for what tutorials say, well, they're just tutorials: they're trying to simplify as much as possible, so they leave out some details.
What you are observing is the following:
you want two primes, p and q, to have the same bit length k and their product N to have a bit length of 2k.
By the definition of what it means to have a bit length of k, we have the following inequality:
1) 2^(k-1) <= p, q < 2^k.
However, when we multiply p and q we discover a problem:
2) 2^(2k-2) <= N < 2^(2k)
This means that N=p*q may end up having bit length of 2k-1 or 2k, but we don't want 2k-1.
In your example k=64.
To fix it, we need to tighten up the lower bound on p and q to the following:
3) sqrt(2^(2k-1)) <= p, q < 2^k.
Bearing in mind that all results are integers, we apply the ceiling function and get finally
4) ceiling(sqrt(2^(2k-1))) <= p, q < 2^k.
For k=64 this works out to:
13043817825332782213 <= p, q < 2^64
An even simpler formulation is to make the bounds dynamic, as in the following:
first find p, of any size. Then we want
2^(2k-1) <= p*q < 2^(2k), so
5) 2^(2k-1)/p <= q < 2^(2k)/p will do the trick.
For RSA, we actually do want both primes to be sufficiently large and entropic, and yet not be too close to each other. We can do that by choosing p to have length k-1 or k-2 and applying 5).
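As a sanity check, here is a minimal C sketch of inequality 4) at a smaller scale, k = 32, so that everything fits in ordinary 64-bit arithmetic. For k = 32 the lower bound ceiling(sqrt(2^63)) works out to 3037000500; the two factors below are just sample values inside the bounds (their primality is not checked here), and the bounds alone guarantee the product is exactly 64 bits:

#include <stdio.h>
#include <stdint.h>

/* Bit length of a 64-bit integer: floor(log2 n) + 1. */
static int bit_length(uint64_t n)
{
    int bits = 0;
    while (n) { bits++; n >>= 1; }
    return bits;
}

int main(void)
{
    /* Inequality 4) for k = 32: ceiling(sqrt(2^63)) <= p, q < 2^32. */
    const uint64_t lo = 3037000500ULL;

    uint64_t p = 3037000507ULL;   /* sample value in [lo, 2^32) */
    uint64_t q = 4294967291ULL;   /* sample value in [lo, 2^32) */

    uint64_t n = p * q;           /* fits in 64 bits since p, q < 2^32 */
    printf("p in range: %d, q in range: %d\n", p >= lo, q >= lo);
    printf("bit length of n = %d\n", bit_length(n));   /* prints 64 */
    return 0;
}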


Is this O(N) algorithm actually O(logN)?

I have an integer, N.
I denote f[i] = number of appearances of the digit i in N.
Now, I have the following algorithm.
FOR i = 0 TO 9
FOR j = 1 TO f[i]
k = k*10 + i;
My teacher said this is O(N). It seems to me more like an O(log N) algorithm.
Am I missing something?
I think that you and your teacher are saying the same thing but it gets confused because the integer you are using is named N but it is also common to refer to an algorithm that is linear in the size of its input as O(N). N is getting overloaded as the specific name and the generic figure of speech.
Suppose we say instead that your number is Z, its digits are in the array d, and their frequencies are in f. For example, we could have:
Z = 12321
d = [1,2,3,2,1]
f = [0,2,2,1,0,0,0,0,0,0]
Then the cost of going through all the digits in d and computing the count for each will be O(size(d)) = O(log Z). This is basically what your second loop is doing in reverse: it executes one time for each occurrence of each digit. So you are right that there is something logarithmic going on here -- the number of digits of Z is logarithmic in the size of Z. But your teacher is also right that there is something linear going on here -- counting those digits is linear in the number of digits.
The time complexity of an algorithm is generally measured as a function of the input size. Your algorithm doesn't take N as an input; the input seems to be the array f. There is another variable named k which your code doesn't declare, but I assume that's an oversight and you meant to initialise e.g. k = 0 before the first loop, so that k is not an input to the algorithm.
The outer loop runs 10 times, and the inner loop runs f[i] times for each i. Therefore the total number of iterations of the inner loop equals the sum of the numbers in the array f. So the complexity could be written as O(sum(f)) or O(Σf) where Σ is the mathematical symbol for summation.
Since you defined N as an integer whose digits f counts, it is in fact possible to prove that O(Σf) is the same thing as O(log N), so long as N is a positive integer. This is because Σf equals how many digits the number N has, which is approximately (log N) / (log 10). So by your definition of N, you are correct.
My guess is that your teacher disagrees with you because they think N means something else. If your teacher defines N = Σf then the complexity would be O(N). Or perhaps your teacher made a genuine mistake; that is not impossible. But the first thing to do is make sure you agree on the meaning of N.
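To make the relationship concrete, here is a small C sketch (illustrative values only) that builds f from Z = 12321 and confirms that the counts in f sum to the digit count of Z, which grows like log10(Z):

#include <stdio.h>

int main(void)
{
    /* Build the digit-frequency array f for Z = 12321 and confirm
       that sum(f) equals the digit count, which grows like log10(Z). */
    long Z = 12321;
    int f[10] = {0};
    int digits = 0;

    for (long t = Z; t > 0; t /= 10) {   /* one iteration per digit */
        f[t % 10]++;
        digits++;
    }

    int sum = 0;
    for (int i = 0; i < 10; i++)
        sum += f[i];

    printf("digits = %d, sum(f) = %d\n", digits, sum);   /* both print 5 */
    return 0;
}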
I find your explanation a bit confusing, but let's assume N = 9075936782959 is an integer. Then O(N) doesn't really make sense; O(length of N) makes more sense. I'll use n for the length of N.
Then computing f(i) means iterating over each digit of N and counting how many times i appears in N, which makes O(f(i)) = n (it's linear). I'm assuming f(i) is a function, not an array.
Your algorithm loops at most:
10 times (first loop)
0 to n times, but the total is n (the sum of f(i) for all digits must be n)
It's tempting to say the algorithm is then O(algo) = 10 + n*f(i) = n^2 (removing the constant), but f(i) is only calculated 10 times, each time the second loop is entered, so O(algo) = 10 + n + 10*f(i) = 10 + 11n = n. If f(i) is an array, it's constant time.
I'm sure I didn't see the problem the same way as you. I'm still a little confused about the definition in your question. How did you come up with log(n)?

Diffie Hellman generating a result in a range

I'm using the Diffie-Hellman key exchange method to securely generate a key for use with the AES cipher (the result will be hashed to make it the ideal length). Assuming the exponent is a 2048-bit prime, how can I calculate the size of the base and the modulus if I want the decimal result to be of a length between (2^6)^32 and (2^6)^40 (i.e. a base64 string of length equal to or greater than 32 and less than or equal to 40 characters)? The base I want to use is within the range 3
I'm new to Diffie-Hellman exchanges; are there any restrictions on the modulus, the base or the exponents that I should be aware of?
Is there an equation I can use to derive the ideal pair lengths, or do I have to precalculate it and store it in an array?
Thanks,
I'm not sure what you are asking about.
For Diffie-Hellman you choose a safe or strong prime p between 2^2047 and 2^2048-1 in your case, then choose an element 0 < g < p-1 such that g^(p-1) mod p = 1 but g^x mod p ≠ 1 for all 0 < x < p-1. p and g are constant parameters for your implementation. The size of g does not matter for the scheme. Now for a key exchange you sample 0 < a, b < p-1 uniformly at random, exchange g^a mod p and g^b mod p, and calculate g^(ab) mod p. Because of the random choice of a and b, the result g^(ab) mod p is also random with 0 < (g^(ab) mod p) < p.
As you have already noticed you can then hash g^ab mod p to generate a short key (256 bit with sha256 for example).
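For the mechanics of the exchange itself, here is a toy C sketch. The parameters are far too small for real use (a real p would be 2048 bits and need a bignum library), but 2^31 - 1 is prime and 7 is a primitive root of it, which keeps all the arithmetic inside 64 bits:

#include <stdio.h>
#include <stdint.h>

/* Square-and-multiply: computes (base^exp) mod m. m stays below
   2^32 here, so the intermediate products fit in 64 bits. */
static uint64_t modpow(uint64_t base, uint64_t exp, uint64_t m)
{
    uint64_t result = 1;
    base %= m;
    while (exp > 0) {
        if (exp & 1)
            result = (result * base) % m;
        base = (base * base) % m;
        exp >>= 1;
    }
    return result;
}

int main(void)
{
    const uint64_t p = 2147483647ULL;   /* toy prime 2^31 - 1 */
    const uint64_t g = 7;               /* primitive root mod p */

    uint64_t a = 123456789, b = 987654321;   /* secret exponents */
    uint64_t A = modpow(g, a, p);            /* Alice sends A */
    uint64_t B = modpow(g, b, p);            /* Bob sends B */

    /* Both sides arrive at the same shared secret g^(ab) mod p. */
    printf("%llu\n", (unsigned long long)modpow(B, a, p));
    printf("%llu\n", (unsigned long long)modpow(A, b, p));
    return 0;
}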

32-bit fractional multiplication with cross-multiplication method (no 64-bit intermediate result)

I am programming a fixed-point speech enhancement algorithm on a 16-bit processor. At some point I need to do 32-bit fractional multiplication. I have read other posts about doing 32-bit multiplication byte by byte and I see why this works for Q0.31 formats. But I use different Q formats with varying number of fractional bits.
So I have found out that for fractional bits less than 16, this works:
(low*low >> N) + low*high + high*low + (high*high << N)
where N is the number of fractional bits. I have read that the low*low result should be unsigned as well as the low bytes themselves. In general this gives exactly the result I want in any Q format with less than 16 fractional bits.
Now it gets tricky when the fractional bits are more than 16. I have tried out several numbers of shifts, and different shifts for low*low and high*high. I have tried to put it on paper, but I can't figure it out.
I know it may be very simple but the whole idea eludes me and I would be grateful for some comments or guidelines!
It's the same formula. For N > 16, the shifts just mean you throw out a whole 16-bit word which would have over- or underflowed. low*low >> N means: take the high word of the 32-bit result of the multiply, shift it right by N-16 bits, and add it to the low word of the result. high*high << N means: take the low word of the multiply result, shift it left by N-16 bits, and add it to the high word of the result.
There are a few ideas at play.
First, multiplication of 2 shorter integers to produce a longer product. Consider unsigned multiplication of 2 32-bit integers via multiplications of their 16-bit "halves", each of which produces a 32-bit product and the total product is 64-bit:
a * b = (a_hi * 2^16 + a_lo) * (b_hi * 2^16 + b_lo) =
a_hi * b_hi * 2^32 + (a_hi * b_lo + a_lo * b_hi) * 2^16 + a_lo * b_lo.
Now, if you need a signed multiplication, you can construct it from unsigned multiplication (e.g. from the above).
Supposing a < 0 and b >= 0, a *signed b must equal
2^64 - ((-a) *unsigned b), where
-a = 2^32 - a (because this is 2's complement)
IOW,
a *signed b =
2^64 - ((2^32 - a) *unsigned b) =
2^64 + (a *unsigned b) - (b * 2^32), where 2^64 can be discarded since we're using 64 bits only.
In exactly the same way you can calculate a *signed b for a >= 0 and b < 0 and must get a symmetric result:
(a *unsigned b) - (a * 2^32)
You can similarly show that for a < 0 and b < 0 the signed multiplication can be built on top of the unsigned multiplication this way:
(a *unsigned b) - ((a + b) * 2^32)
So, you multiply a and b as unsigned first, then if a < 0, you subtract b from the top 32 bits of the product and if b < 0, you subtract a from the top 32 bits of the product, done.
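Here is a minimal C sketch of those sign corrections. For brevity it uses the compiler's own 64-bit multiply as a stand-in for the unsigned 32x32 -> 64 routine (the 16-bit-halves decomposition itself is sketched further down):

#include <stdio.h>
#include <stdint.h>

/* Signed 32x32 -> 64-bit multiply built on an unsigned multiply,
   using the corrections derived above: subtract b (resp. a) from
   the top 32 bits when a (resp. b) is negative. */
static int64_t smul32x32(int32_t a, int32_t b)
{
    uint64_t p = (uint64_t)(uint32_t)a * (uint32_t)b;  /* unsigned product */
    if (a < 0) p -= (uint64_t)(uint32_t)b << 32;
    if (b < 0) p -= (uint64_t)(uint32_t)a << 32;
    return (int64_t)p;
}

int main(void)
{
    printf("%lld\n", (long long)smul32x32(-3, 5));          /* -15 */
    printf("%lld\n", (long long)smul32x32(-70000, -90000)); /* 6300000000 */
    return 0;
}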
Now that we can multiply 32-bit signed integers and get 64-bit signed products, we can finally turn to the fractional stuff.
Suppose now that out of those 32 bits in a and b, N bits are used for the fractional part. That means that if you look at a and b as plain integers, they are going to be 2^N times greater than what they really represent, e.g. 1.0 is going to look like 2^N (or 1 << N).
So, if you multiply two such integers, the product is going to be 2^N * 2^N = 2^(2N) times greater than what it should represent, e.g. 1.0 * 1.0 is going to look like 2^(2N) (or 1 << (2*N)). IOW, plain integer multiplication is going to double the number of fractional bits. If you want the product to have the same number of fractional bits as in the multiplicands, what do you do? You divide the product by 2^N (or shift it arithmetically N positions right). Simple.
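Putting the pieces together, here is a C sketch of the 16-bit-halves decomposition followed by the fractional shift, with Q16.16 (N = 16) as the example format. For clarity the partial products are carried in a 64-bit type; on a 16-bit processor each partial product would live in its own pair of registers:

#include <stdio.h>
#include <stdint.h>

/* Multiply two unsigned 32-bit values via their 16-bit halves,
   producing the full 64-bit product -- the decomposition from the
   text: a*b = hh*2^32 + (hl + lh)*2^16 + ll. */
static uint64_t umul32x32(uint32_t a, uint32_t b)
{
    uint32_t a_hi = a >> 16, a_lo = a & 0xFFFF;
    uint32_t b_hi = b >> 16, b_lo = b & 0xFFFF;

    uint64_t hh = (uint64_t)a_hi * b_hi;
    uint64_t hl = (uint64_t)a_hi * b_lo;
    uint64_t lh = (uint64_t)a_lo * b_hi;
    uint64_t ll = (uint64_t)a_lo * b_lo;

    return (hh << 32) + (hl << 16) + (lh << 16) + ll;
}

int main(void)
{
    /* Q16.16 example: 1.5 * 2.25 = 3.375. N = 16 fractional bits,
       so the doubled-fraction product is shifted right by N. */
    const int N = 16;
    uint32_t a = 3 << (N - 1);    /* 1.5  in Q16.16 */
    uint32_t b = (9 << N) / 4;    /* 2.25 in Q16.16 */

    uint64_t product = umul32x32(a, b) >> N;
    printf("%f\n", (double)product / (1 << N));   /* prints 3.375000 */
    return 0;
}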
A few words of caution, just in case...
In C (and C++) you cannot legally shift a variable left or right by the same or greater number of bits contained in the variable. The code will compile, but not work as you may expect it to. So, if you want to shift a 32-bit variable, you can shift it by 0 through 31 positions left or right (31 is the max, not 32).
If you shift signed integers left, you cannot overflow the result legally. All signed overflows result in undefined behavior. So, you may want to stick to unsigned.
Right shifts of negative signed integers are implementation-defined. They can do either an arithmetic shift or a logical shift, depending on the compiler. So, if you need one of the two, you need to either ensure that your compiler supports it directly or implement it in some other way.

Recognizing when to use the modulus operator

I know the modulus (%) operator calculates the remainder of a division. How can I identify a situation where I would need to use the modulus operator?
I know I can use the modulus operator to see whether a number is even or odd and prime or composite, but that's about it. I don't often think in terms of remainders. I'm sure the modulus operator is useful, and I would like to learn to take advantage of it.
I just have problems identifying where the modulus operator is applicable. In various programming situations, it is difficult for me to see a problem and realize "Hey! The remainder of division would work here!".
Imagine that you have an elapsed time in seconds and you want to convert this to hours, minutes, and seconds:
h = s / 3600;
m = (s / 60) % 60;
s = s % 60;
0 % 3 = 0;
1 % 3 = 1;
2 % 3 = 2;
3 % 3 = 0;
Did you see what it did? At the last step it went back to zero. This could be used in situations like:
To check if N is divisible by M (for example, odd or even)
or
N is a multiple of M.
To put a cap on a particular value; in this case 3.
To get the last M digits of a number -> N % (10^M).
I use it for progress bars and the like that mark progress through a big loop. The progress is only reported every nth time through the loop, or when count%n == 0.
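A minimal C sketch of that pattern, with an arbitrary loop bound and reporting interval:

#include <stdio.h>

int main(void)
{
    const int total = 1000000;
    const int n = 100000;   /* report every nth iteration */

    for (int count = 1; count <= total; count++) {
        /* ... do one unit of work here ... */
        if (count % n == 0)
            printf("progress: %d%%\n", 100 * count / total);
    }
    return 0;
}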
I've used it when restricting a number to a certain multiple:
temp = x - (x % 10); //Restrict x to being a multiple of 10
Wrapping values (like a clock).
Provide finite fields to symmetric key algorithms.
Bitwise operations.
And so on.
One use case I saw recently was when you need to reverse a number. So that 123456 becomes 654321 for example.
int number = 123456;
int reversed = 0;
while (number > 0) {
    // The modulus here retrieves the last digit of the number.
    // In the first iteration of this loop it's going to be 6, then 5, ...
    // We multiply reversed by 10 first, to move its digits one decimal place to the left.
    // For example, at the second iteration of this loop,
    // reversed is 6 and number is 12345, so 6 * 10 + 12345 % 10 => 60 + 5
    reversed = reversed * 10 + number % 10;
    number = number / 10;
}
Example: you have a message of X bytes, but in your protocol the maximum size is Y and Y < X. Try to write a small app that splits the message into packets and you will run into mod :)
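A quick C sketch of that, with made-up sizes X and Y; the mod gives the size of the final, partial packet when it is nonzero:

#include <stdio.h>

int main(void)
{
    const int X = 1000;   /* message size in bytes */
    const int Y = 300;    /* maximum packet size   */

    int full_packets = X / Y;   /* 3 packets of 300 bytes */
    int last_packet  = X % Y;   /* 100 bytes left over    */
    int packets = full_packets + (last_packet > 0 ? 1 : 0);

    printf("%d packets, last one carries %d bytes\n",
           packets, last_packet > 0 ? last_packet : Y);
    return 0;
}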
There are many instances where it is useful.
If you need to restrict a number to be within a certain range you can use mod. For example, to generate a random number between 0 and 99 you might say:
num = MyRandFunction() % 100;
Any time you have division and want to express the remainder other than in decimal, the mod operator is appropriate. Things that come to mind are generally when you want to do something human-readable with the remainder. Listing how many items you could put into buckets and saying "5 left over" is good.
Also, if you're ever in a situation where you may be accruing rounding errors, modulo division is good. If you're dividing by 3 quite often, for example, you don't want to be passing .33333 around as the remainder. Passing the remainder and divisor (i.e. the fraction) is appropriate.
As @jweyrich says, wrapping values. I've found mod very handy when I have a finite list and I want to iterate over it in a loop - like a fixed list of colors for some UI elements, like chart series, where I want all the series to be different, to the extent possible, but when I've run out of colors, just to start over at the beginning. This can also be used with, say, patterns, so that the second time red comes around, it's dashed; the third time, dotted, etc. - but mod is just used to get red, green, blue, red, green, blue, forever.
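For instance, a minimal C sketch of cycling through a fixed palette (the color names are made up for the example):

#include <stdio.h>

int main(void)
{
    const char *colors[] = { "red", "green", "blue" };
    const int ncolors = 3;

    /* Series 0..7 wrap around the palette: red, green, blue, red, ... */
    for (int series = 0; series < 8; series++)
        printf("series %d -> %s\n", series, colors[series % ncolors]);
    return 0;
}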
Calculation of prime numbers
The modulo can be useful to convert and split total minutes to "hours and minutes":
hours = minutes / 60
minutes_left = minutes % 60
In the hours bit we need to strip the decimal portion and that will depend on the language you are using.
We can then rearrange the output accordingly.
Converting a linear data structure to a matrix structure, where a is the index into the linear data and b is the number of items per row:
row = a/b
column = a mod b
Note the above is simplified logic: a must be offset by -1 before dividing, and the result must be normalized by +1.
Example: (3 rows of 4)
1 2 3 4
5 6 7 8
9 10 11 12
(7 - 1)/4 + 1 = 2
7 is in row 2
(7 - 1) mod 4 + 1 = 3
7 is in column 3
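The worked example above, written out as a small C program (1-based index a, b items per row):

#include <stdio.h>

int main(void)
{
    /* 1-based linear index -> row/column in a grid with b items
       per row, matching the 3-rows-of-4 example above. */
    int a = 7, b = 4;
    int row    = (a - 1) / b + 1;   /* 2 */
    int column = (a - 1) % b + 1;   /* 3 */
    printf("%d is in row %d, column %d\n", a, row, column);
    return 0;
}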
Another common use of modulus: hashing a number by place. Suppose you wanted to store year and month in a six-digit number like 195810. Then month = 195810 mod 100: everything from the 3rd digit from the right upward is divisible by 100, so the remainder is the two rightmost digits; in this case the month is 10. To extract the year, 195810 / 100 yields 1958.
Modulus is also very useful if for some crazy reason you need to do integer division and get a decimal out, and you can't convert the integer into a number that supports decimal division, or if you need to return a fraction instead of a decimal.
I'll be using % as the modulus operator
For example, plain integer division gives
2/4 = 0
and the remainder is lost, whereas doing this
2/4 = 0 and 2 % 4 = 2
keeps the remainder around.
So you can be really crazy and let's say that you want to allow the user to input a numerator and a divisor, and then show them the result as a whole number, and then a fractional number.
wholeNumber = numerator / divisor
fractionNumerator = numerator % divisor
fractionDenominator = divisor
Another case where modulus division is useful is if you are increasing or decreasing a number and you want to contain the number to a certain range of numbers, but when you get to the top or bottom you don't want to just stop. You want to wrap around to the bottom or top of the list, respectively.
Imagine a function where you are looping through an array.
Function IncreaseOrDecrease(variable As Integer) As Void
    n = (n + variable) % (listString.maxIndex + 1)
    Print listString[n]
End Function
The reason it is n = (n + variable) % (listString.maxIndex + 1) is so that the maximum index itself is accounted for.
Those are just a few of the things that I have had to use modulus for in my programming of not just desktop applications, but in robotics and simulation environments.
Computing the greatest common divisor
Determining if a number is a palindrome
Determining if a number consists of only ...
Determining how many ... a number consists of...
My favorite use is for iteration.
Say you have a counter you are incrementing and want to then grab a corresponding item from a known list, but you only have n items to choose from and you want to repeat the cycle.
var indexFromB = (counter-1)%n+1;
Results (counter=indexFromB) given n=3:
`1=1`
`2=2`
`3=3`
`4=1`
`5=2`
`6=3`
...
The best use of the modulus operator I have seen so far is to check whether the array we have is a rotated version of the original array.
A = [1,2,3,4,5,6]
B = [5,6,1,2,3,4]
Now how to check if B is a rotated version of A?
Step 1: If A's length is not the same as B's length, then for sure it's not a rotated version.
Step 2: Check the index of the first element of A in B. Here the first element of A is 1, and its index in B is 2 (assuming your programming language has zero-based indexing).
Let's store that index in a variable "Key".
Step 3: Now, how to check that B is a rotated version of A?
This is where modulus function rocks :
for (int i = 0; i < A.length; i++)
{
    // here the modulus function checks the proper order. Key here is 2, which we received from Step 2
    int j = (Key + i) % A.length;
    if (A[i] != B[j])
    {
        return false;
    }
}
return true;
It's an easy way to tell if a number is even or odd. Just do # mod 2, if it is 0 it is even, 1 it is odd.
Often, in a loop, you want to do something every k'th iteration, where 0 < k < n, assuming 0 is the start index and n is the length of the loop.
So, you'd do something like:
int k = 5;
int n = 50;
for(int i = 0;i < n;++i)
{
if(i % k == 0) // true at 0, 5, 10, 15..
{
// do something
}
}
Or, you want to keep something within a certain bound. Remember, when you take a non-negative number mod n, it must produce a value between 0 and n - 1.

Modular arithmetic

I'm new to cryptography and modular arithmetic. So, I'm sure it's a silly question, but I can't help it.
How do I calculate a from
pow(a,q) = 1 (mod p),
where p and q are known? I don't get the "1 (mod p)" part; it equals 1, doesn't it? If so, then what is "mod p" about?
Is this the same as
pow(a,-q) (mod p) = 1?
The (mod p) part refers not to the right hand side, but to the equality sign: it says that modulo p, pow(a,q) and 1 are equal. For instance, "modulo 10, 246126 and 7868726 are equal" (and they are also both equal to 6 modulo 10): two numbers x and y are equal modulo p if they have the same remainder on dividing by p, or equivalently, if p divides x-y.
Since you seem to be coming from a programming perspective, another way of saying it is that pow(a,q)%p=1, where "%" is the "remainder" operator as implemented in several languages (assuming that p>1).
You should read the Wikipedia article on Modular arithmetic, or any elementary number theory book (or even a cryptography book, since it is likely to introduce modular arithmetic).
To answer your other question: there is no general formula for finding such an a (to the best of my knowledge) in general. Assuming that p is prime, and using Fermat's little theorem to reduce q modulo p-1, and assuming that q divides p-1 (or else no such a exists), you can produce such an a by taking a primitive root of p and raising it to the power (p-1)/q. [And more generally, when p is not prime, you can reduce q modulo φ(p), then assuming it divides φ(p) and you know a primitive root (say r) mod p, you can take r to the power of φ(p)/q, where φ is the totient function -- this comes from Euler's theorem.]
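Here is a small C sketch of that construction with toy numbers: p = 13 is prime, q = 3 divides p - 1 = 12, and 2 is a primitive root mod 13, so a = 2^((p-1)/q) mod p:

#include <stdio.h>
#include <stdint.h>

/* Square-and-multiply modular exponentiation. */
static uint64_t modpow(uint64_t base, uint64_t exp, uint64_t m)
{
    uint64_t r = 1;
    base %= m;
    while (exp > 0) {
        if (exp & 1)
            r = (r * base) % m;
        base = (base * base) % m;
        exp >>= 1;
    }
    return r;
}

int main(void)
{
    /* p = 13 (prime), q = 3 divides p - 1 = 12, and 2 is a
       primitive root mod 13, so a = 2^((p-1)/q) mod p. */
    uint64_t p = 13, q = 3, r = 2;
    uint64_t a = modpow(r, (p - 1) / q, p);   /* 2^4 mod 13 = 3 */

    printf("a = %llu\n", (unsigned long long)a);
    printf("a^q mod p = %llu\n", (unsigned long long)modpow(a, q, p));   /* prints 1 */
    return 0;
}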
Not silly at all, as this is the basis for public-key encryption. You can find an excellent discussion on this at http://home.scarlet.be/~ping1339/congr.htm#The-equation-a%3Csup%3Ex.
PKI works by choosing p and q that are large and relatively prime. One (say p) becomes your private key and the other (q) is your public key. The encryption is "broken" if an attacker guesses p, given a^q (the encrypted message) and q (your public key).
So, to answer your question:
a^q = 1 mod p
This means a^q is a number that leaves a remainder of 1 when divided by p. We don't care about the integer portion of the quotient, so we can write:
a^q / p = n + 1/p
for any integer value of n. If we multiply both sides of the equation by p, we have:
a^q = np + 1
Solving for a we have:
a = (np + 1)^(1/q)
The final step is to find a value of n that generates the original value of a. I don't know of any way to do this other than trial and error -- which equates to a "brute force" attempt to break the encryption.