Generator G's requirement to be a primitive root modulo p in the Diffie Hellman algorithm - cryptography

Having searched, I've found myself confused by the use of P and G in the Diffie Hellman algorithm. There is requirementy that P is prime, and G is a primitive root of P.
I understand the security is based on the difficulty of factoring the result of two very large prime numbers, so I have no issue with that. However, there appears to be little available information on the purpose of G being a primitive root of P. Can anyone answer why this requirement exists (with references if possible)? Does it just increase the security? Given that shared keys can be created with apparently any combination of p and g, even ones that are not prime, I find this intriguing. It can surely only be for security? If so, how does it increase it?
Thanks in advance
Daniel

If g is not a primitive root of p, g will only generate a subgroup of GFp. This has consequences for the security properties of the system: the security of the system will only be proportional to the order of g in GFp instead of proportional to the full order of GFp.
To take a small example: select p=13 and g=3.
The order of 3 in GF_13 is 3 (3^1=3, 3^2=9, 3^3=1).
Following the usual steps of Diffie-Hellman, Alice and Bob should each select integers a, b between 1 and p-1 and calculate resp. A = ga and B = gb. To brute force this, an attacker should expect to try all possible values of a (or b) between 1 and p-1 until he finds a value that yields A (or B). But since g was not a primitive root modulo p, he only need to try the values 1, 2 and 3 in order to find a solution a' so that A = ga'. And the secret is s = gab = (ga)b =(ga')b = ga'b = (gb)a' = Ba', which the attacker can now calculate.

There is no requirement that the generator g used for Diffie-Hellman is a primitive root nor is this even a common choice. Much more popular is to choose g such that it generates a prime order subgroup. I.e. the order of g is a prime q, which is a large prime factor of p-1.
For example the Diffie-Hellman groups proposed for IKE have been chosen such that p is a safe prime and g generates the subgroup of order (p-1)/2.
One motivation for choosing g as a generator of a big prime order subgroup is that this allows to make
the decisional Diffie-Hellman assumption. This assumption does not hold if g is a primitive root, and that makes the analysis of the implemented protocols somewhat harder.

The security of Diffie-Hellman is not based on the difficulty of factoring. It is based on the (assumed) difficulty of calculating general discrete logarithms.
g must be a primitive root of p for the algorithm to be correct and useable. It ensures that for every number 0 <= x < p, there is a distinct value of gx mod p. That is, it ensures that g can "generate" every value in the finite field.

Related

Prove the language is not context-free?

How can you prove the language L given below is not context-free, I would like to know does my proof given below makes any sense, if not, what would be the correct method to prove?
L = {a^n b^n c^i|n ≤ i ≤ 2n}
I am trying to solve this language by contradiction. Suppose L is regular and with pumping length p such that S = a^p b^p c^p. Observe that S ∉ L. Since there must be a pumping cycle xy with length less than p, this can duplicate y which consists of some number of b to cause x(y^2)z to enter the language because the number of b exceeds the number of c by no longer bound by the given condition of i which is n ≤ i ≥ 2n, therefore, we have contradiction and hence language L is not context-free.
The proof is by contradiction. Assume the language is context-free. Then, by the pumping lemma for context-free languages, any string in L can be written as uvxyz where |vxy| < p, |vy| > 0 and for all natural numbers k, u(v^k)x(y^k)z is in the language as well. Choose a^p b^p c^(p+1). Then we must be able to write this string as uvxyz so that |vy| > 0. There are several possibilities to consider:
v and y consist only of a's. In this case, pumping in either direction causes the numbers of a's and b's to differ, producing a string not in the language; so this cannot be the case.
v and y consist only of a's and b's. In this case, pumping might keep the numbers of a's and b's the same, but pumping up will eventually cause the number of c's to be less than the number of a's and b's; so this cannot be the case.
v and y consist only of b's. This case is similar to (1) above and so cannot be a valid choice.
v and y consist only of b's and c's. Also similar to (1) and (3) in that pumping will cause the numbers of a's and b's to differ.
v and y consist only of c's. Pumping up will eventually cause there to be more c's than twice the number of a's; so this cannot be the case either.
No matter how we choose v and y, pumping will produce strings not in the language. This is a contradiction; this means our assumption that the language is context-free must have been incorrect.

Return highest or lowest value Z notation , formal method

I am new to Z notation,
Lets say I have a function f defined as X |--> Y ,
where X is string and Y is number.
How can I get highest Y value in this function? Does 'loop' exist in formal method so I can solve it using loop?
I know there is recursion in Z notation, but based on the material provided, I only found it apply in multiset or bag, can it apply in function?
Any extra reference application of 'loop' or recursion application will be appreciated. Sorry for my English.
You can just use the predefined function max that takes a set of integers as input and returns the maximum number. The input values here are the range (the set of all values) of the function:
max(ran(f))
Please note that the maximum is not defined for empty sets.
Regarding your question about recursion or loops: You can actually define a function recursively but I think your question aims more at a way to compute something. This is not easily expressed in Z and this is IMO a good thing because it is used for specifications and it is not a programming language. Even if there wouldn't be a max or ran function, you could still specify the number m you are looking for by:
\exists s:String # (s,m):f /\
\forall s2:String, i2:Z # (s2,i2):f ==> i2 <= m
("m is a value of f, belonging to an s and all other values i2 of f are smaller or equal")
After getting used to the style it is usually far better to understand than any programming language (except your are trying to describe an algorithm itself and not its expected outcome).#
Just for reference: An example of a recursive definition (let's call it rmax) for the maximum would consist of a base case:
\forall e:Z # rmax({e}) = e
and a recursive case:
\forall e:Z; S:\pow(Z) #
S \noteq {} \land
rmax({e} \cup S) = \IF e > rmax(S) \THEN e \ELSE rmax(S)
But note that this is still not a "computation rule" of rmax because e in the second rule can be an arbitrary element of S. In more complex scenarios it might even be not obvious that the defined relation is a function at all because depending on the chosen elements different results could be computed.

RSA algorthum calculations

I have been working though a network book and hit the RSA section.
Consider the RSA algorithm with p=5 and q=11.
so I get N = p*q = 55 right?
and z = (p-1) * (q -1) = 40
I think I got this right but the book is not very clear on how to calculate this.
The example in the book says that e = 3 but does not give a reason why. Because the author likes it or is there another reason?
and how do i go about finding d so that de= 1(mod z) and d < 160
Thanks for any help with this its a bit above me right now.
Your calculations of n and z are correct.
An RSA cryptosystem consists of three variables n, d and e. Variable e is the least important of the three, and is usually chosen arbitrarily to make computations simple; 3 and 65537 are the most common choices for e. The only requirements are that e is odd and co-prime to the totient (z in your implementation); thus e is frequently chosen prime so that it will be co-prime to the totient no matter what totient is chosen. The reason that 3 and 65537 are frequently used for e is because it makes the computation easy; both numbers have only two 1-bits in their binary representation, so only two iterations of a complicated loop are needed.
You can see an implementation of an RSA cryptosystem at my blog. If you poke around there, you will also find some other crypto-related stuff that may interest you.
what you are looking for is the extended euclidean algorithm
for an example see wikipedia or here

What is the language of this deterministic finite automata?

Given:
I have no idea what the accepted language is.
From looking at it you can get several end results:
1.) bb
2.) ab(a,b)
3.) bbab(a, b)
4.) bbaaa
How to write regular expression for a DFA
In any automata, the purpose of state is like memory element. A state stores some information in automate like ON-OFF fan switch.
A Deterministic-Finite-Automata(DFA) called finite automata because finite amount of memory present in the form of states. For any Regular Language(RL) a DFA is always possible.
Let's see what information stored in the DFA (refer my colorful figure).
(note: In my explanation any number means zero or more times and Λ is null symbol)
State-1: is START state and information stored in it is even number of a has been come. And ZERO b.
Regular Expression(RE) for this state is = (aa)*.
State-4: Odd number of a has been come. And ZERO b.
Regular Expression for this state is = (aa)*a.
Figure: a BLUE states = EVEN number of a, and RED states = ODD number of a has been come.
NOTICE: Once first b has been come, move can't back to state-1 and state-4.
State-5: comes after Yellow b. Yellow b means b after odd numbers of a.
Once you gets b after odd numbers of a(at state-5) every thing is acceptable because there is self a loop for (b,a) at state-5.
You can write for state-5 : Yellow-b followed-by any string of a, b that is = Yellow-b (a + b)*
State-6: Just to differentiate whether odd a or even.
State-2: comes after even a then b then any number of b. = (aa)* bb*
State-3: comes after state-2 then first a then there is a loop via state-6.
We can write for state-3 comes = state-2 a (aa)* = (aa)*bb* a (aa)*
Because in our DFA, we have three final states so language accepted by DFA is union (+ in RE) of three RL (or three RE).
So the language accepted by the DFA is corresponding to three accepting states-2,3,5, And we can write like:
State-2 + state-3 + state-5
(aa)*bb* + (aa)*bb* a (aa)* + Yellow-b (a + b)*
I forgot to explain how Yellow-b comes?
ANSWER: Yellow-b is a b after state-4 or state-3. And we can write like:
Yellow-b = ( state-4 + state-3 ) b = ( (aa)*a + (aa)*bb* a (aa)* ) b
[ANSWER]
(aa)*bb* + (aa)*bb* a (aa)* + ( (aa)*a + (aa)*bb* a (aa)* ) b (a + b)*
English Description of Language: DFA accepts union of three languages
EVEN NUMBERs OF a's, FOLLOWED BY ONE OR MORE b's,
EVEN NUMBERs OF a's, FOLLOWED BY ONE OR MORE b's, FOLLOWED BY ODD NUMBERs OF a's.
A PREFIX STRING OF a AND b WITH ODD NUMBER OF a's, FOLLOWED BY b, FOLLOWED BY ANY STRING OF a AND b AND Λ.
English Description is complex but this the only way to describe the language. You can improve it by first convert given DFA into minimized DFA then write RE and description.
Also, there is a Derivative Method to find RE from a given Transition Graph using Arden's Theorem. I have explained here how to write a regular expression for a DFA using Arden's theorem. The transition graph must first be converted into a standard form without the null-move and single start state. But I prefer to learn Theory of computation by analysis instead of using the Mathematical derivation approach.
I guess this question isn't relevant anymore :) and it's probably better to guide you through it then just stating the answer, but I think I got a basic expression that covers it (it's probably minimizable), so i'll just write it down for future searchers
(aa)*b(b)* // for stoping at 2
U
(aa)*b(b)*a(aa)* // for stoping at 3
U
(aa)*b(b)*a(aa)*b((a)*(b)*)* // for stoping at 5 via 3
U
a(aa)*b((a)*(b)*)* // for stoping at 5 via 4
The examples (1 - 4) that you give there are not the language accepted by the DFA. They are merely strings that belong to the language that the DFA accepts. Therefore, they all fall in the same language.
If you want to figure out the regular expression that defines that DFA, you will need to do something called k-path induction, and you can read up on it here.

RSA private exponent determination

My question is about RSA signing.
In case of RSA signing:
encryption -> y = x^d mod n,
decryption -> x = y^e mod n
x -> original message
y -> encrypted message
n -> modulus (1024 bit)
e -> public exponent
d -> private exponent
I know x, y, n and e. Knowing these can I determine d?
If you can factor n = p*q, then d*e ≡ 1 (mod m) where m = φ(n) = (p-1)*(q-1), (φ(m) is Euler's totient function) in which case you can use the extended Euclidean algorithm to determine d from e. (d*e - k*m = 1 for some k)
All these are very easy to compute, except for the factoring, which is designed to be intractably difficult so that public-key encryption is a useful technique that cannot be decrypted unless you know the private key.
So, to answer your question in a practical sense, no, you can't derive the private key from the public key unless you can wait the hundreds or thousands of CPU-years to factor n.
Public-key encryption and decryption are inverse operations:
x = ye mod n = (xd)e mod n = xde mod n = xkφ(n)+1 mod n = x * (xφ(n))k mod n = x mod n
where (xφ(n))k = 1 mod n because of Euler's theorem.
The answer is yes under two conditions. One, somebody factors n. Two, someone slips the algorithm a mickey and convinces the signer to use one of several possible special values for x.
Applied Cryptography pages 472 and 473 describe two such schemes. I don't fully understand exactly how they would work in practice. But the solution is to use an x that cannot be fully controlled by someone who wants to determine d (aka the attacker).
There are several ways to do this, and they all involve hashing x, fiddling the value of the hash in predictable ways to remove some undesirable properties, and then signing that value. The recommended techniques for doing this are called 'padding', though there is one very excellent technique that does not count as a padding method that can be found in Practical Cryptography.
No. Otherwise a private key would be of no use.