Why are prime numbers used in Diffie-Hellman key exchange? - cryptography

The Diffie-Hellman key exchange algorithm uses operations like 2^8 mod n, where n is a prime number.
What is the reason for using prime numbers instead of other numbers?

Prime numbers don't break down into smaller factors, which makes cracking the code or hash much harder than using, say, 12, which breaks down with /2, /3, /4, or /6. The prime 7 is smaller than 12, but it has no factors other than 1 and itself, so there are fewer attack vectors. This is a drastic oversimplification, but hopefully it helps a little.
Here's a specific example:
2^x mod 12
This has only 2 possible values for any x above 1: 4 or 8. Since this is used to generate the shared key in a similar way, you end up with the same two possibilities. In other words, once you know that the base and modulus are 2 and 12 (which any computer listening in on the conversation would be able to pick up), you automatically know that the shared secret encryption key can only be one of two possibilities, and it takes just two simple operations to determine which one decrypts the message. Now let's look at a prime modulus:
2^x mod 13
This has 12 different possibilities for x > 1, because 2 is a primitive root mod 13, and therefore 12 different possible shared keys. It thus requires 6x more computing power to decrypt a message based on this prime modulus than it would on the mod 12 example.
2^x mod 14 has 3 possibilities: 2, 4 and 8.
2^x mod 15 has 4.
2^x mod 16 collapses completely into 1 possibility (0) after x = 3 (which is why choosing a base that fits the DH requirements is important).
2^x mod 17 has 8 possibilities, since 2 happens not to be a primitive root mod 17; a base that is a generator, such as 3, gives all 16. Aren't primes cool? :)
Thus, yes, the factorability of the modulus has everything to do with the crackability of the encrypted message.
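These counts are easy to verify by brute force. Here is a quick Python sketch (the distinct_powers helper is just for illustration) that enumerates the distinct values of 2^x mod n for x > 1:

def distinct_powers(base, n, max_x=200):
    # set of all values base^x mod n takes for x = 2 .. max_x - 1
    return len({pow(base, x, n) for x in range(2, max_x)})

for n in (12, 13, 14, 15, 16, 17):
    print(n, distinct_powers(2, n))
# 12 -> 2, 13 -> 12, 14 -> 3, 15 -> 4, 16 -> 3 (4, 8, then 0 forever), 17 -> 8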

DEFLATE: how to handle "no distance codes" case?

I mostly get RFC 1951; however, I'm not too clear on how to handle the case where (when using dynamic Huffman tables) no distance codes are needed or present. For example, let's take the input:
abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ01234567890987654321ZYXWVUTSR
where no backreference is possible, since there are no repetitions of length >= 3.
According to RFC 1951, at least one distance code must be present regardless, otherwise it wouldn't be possible to encode HDIST - 1. I understand, according to the reference, that such a code should be of zero bits to signal "no distance codes":
One distance code of zero bits means that there are no distance codes
used at all (the data is all literals).
In infgen symbols, I'd expect to see a dist 0 0.
Analyzing what gzip does with infgen, however, I see that TWO distance codes are emitted (each 1 bit long) for the above input, even though neither is actually used:
! infgen 2.4 output
!
gzip
!
last
dynamic
litlen 48 6
litlen 49 6
litlen 50 6
...cut...
litlen 121 6
litlen 122 6
litlen 256 6
dist 0 1
dist 1 1
literal 'abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ01234567890987654321Z
literal 'YXWVUTSR
end
!
crc
length
So what's the correct behavior in these cases?
If there are no matches in the deflate block, there will be no lengths from the length/literal code, and so the decoder will never look for a distance code. In that case, what would make the most sense is to provide no information at all about a distance code.
However, the format does not permit that, since the 5-bit HDIST value in the header is interpreted as 1 to 32 distance codes, for which lengths must be provided in the header. You must provide at least one distance code length in the header, even though it will never be used.
There are several valid things you can do in that case. RFC 1951 notes you can provide a single distance code (HDIST == 0, meaning one length), with length zero, which would be just one zero in the list of lengths.
It is also permitted to provide a single code of length one, or you could do as zlib is doing, which is to provide two codes of length one. You can actually put any valid distance code description you like there, and it will still be accepted.
As to why zlib's deflate is choosing to define two codes there, I can only guess that Jean-loup was being conservative, writing something he knew that even an over-simplified inflator would have to accept. Both gzip and zopfli do the same thing. They all do the same thing when there is only one distance code used. They could emit just the single one-bit distance code, per the RFC, but they emit two single-bit distance codes, one of which is never used.
Really the right thing to do would be to write a single zero length as noted in the RFC, which would take the fewest bits in the header. I will consider updating zlib to do that, to eke out a few more bits of compression.
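For anyone who wants to poke at this without infgen, here's a minimal Python sketch (standard zlib module only; the bits helper is mine) that raw-deflates the all-literal input and reads the HLIT/HDIST/HCLEN fields from the dynamic block header per RFC 1951. zlib could in principle choose a stored or fixed block for some inputs, so the sketch checks BTYPE first:

import zlib

data = (b"abcdefghijklmnopqrstuvwxyz"
        b"ABCDEFGHIJKLMNOPQRSTUVWXYZ"
        b"01234567890987654321ZYXWVUTSR")
co = zlib.compressobj(9, zlib.DEFLATED, -15)  # -15 = raw deflate, no zlib wrapper
blob = co.compress(data) + co.flush()

pos = 0  # bit cursor into blob
def bits(n):
    # header fields are packed starting with the least-significant bit (RFC 1951, 3.1.1)
    global pos
    v = 0
    for i in range(n):
        v |= ((blob[pos >> 3] >> (pos & 7)) & 1) << i
        pos += 1
    return v

bfinal, btype = bits(1), bits(2)
if btype == 2:               # 2 = compressed with dynamic Huffman codes
    hlit = bits(5) + 257     # number of literal/length code lengths
    hdist = bits(5) + 1      # number of distance code lengths, always at least 1
    hclen = bits(4) + 4      # number of code-length code lengths
    print("HLIT =", hlit, "HDIST =", hdist, "HCLEN =", hclen)
    # expect HDIST = 2 here: zlib defines two distance codes, neither ever used
else:
    print("zlib chose a stored or fixed block for this input")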

RSA algorithm calculations

I have been working through a networking book and hit the RSA section.
Consider the RSA algorithm with p=5 and q=11.
so I get N = p*q = 55, right?
and z = (p-1) * (q -1) = 40
I think I got this right, but the book is not very clear on how to calculate this.
The example in the book says that e = 3 but does not give a reason why. Is it because the author likes it, or is there another reason?
And how do I go about finding d so that d*e = 1 (mod z) and d < 160?
Thanks for any help with this; it's a bit above me right now.
Your calculations of n and z are correct.
An RSA cryptosystem consists of three variables n, d and e. Variable e is the least important of the three, and is usually chosen arbitrarily to make computations simple; 3 and 65537 are the most common choices for e. The only requirements are that e is odd and co-prime to the totient (z in your notation); thus e is frequently chosen prime, since a prime e is co-prime to the totient unless the totient happens to be a multiple of it. The reason that 3 and 65537 are frequently used for e is that they make the computation easy: both numbers have only two 1-bits in their binary representation, so the square-and-multiply exponentiation loop needs only two multiplication steps beyond the squarings.
You can see an implementation of an RSA cryptosystem at my blog. If you poke around there, you will also find some other crypto-related stuff that may interest you.
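To see concretely why the count of 1-bits in e matters, here is a textbook square-and-multiply sketch in Python (my own illustration, not the blog's implementation):

def powmod(base, exp, m):
    # left-to-right square-and-multiply: one squaring per bit of exp,
    # plus one extra multiplication per 1-bit; e = 65537 has just two
    # 1-bits in binary, so just two extra multiplications
    result = 1
    for bit in bin(exp)[2:]:
        result = (result * result) % m
        if bit == '1':
            result = (result * base) % m
    return result

assert powmod(42, 65537, 3233) == pow(42, 65537, 3233)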
What you are looking for is the extended Euclidean algorithm.
For an example, see Wikipedia or here.
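To make that concrete for the numbers in the question (e = 3, z = 40), here is a minimal Python sketch of the standard recursive extended Euclidean algorithm (the egcd helper is just for illustration):

def egcd(a, b):
    # returns (g, x, y) with a*x + b*y == g == gcd(a, b)
    if b == 0:
        return a, 1, 0
    g, x, y = egcd(b, a % b)
    return g, y, x - (a // b) * y

e, z = 3, 40
g, x, _ = egcd(e, z)
assert g == 1     # e must be co-prime to z for d to exist
d = x % z
print(d)          # 27, since 3 * 27 = 81 = 2*40 + 1, and 27 < 160 as required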

Why are the outputs of this pseudo random number generator (LFSR) so predictable?

Recently I asked here how to generate random numbers in hardware and was told to use an LFSR. It is random-looking, but starts repeating after a certain number of cycles.
The problem is that the random numbers generated are so predictable that the next value can be easily guessed. For example check the simulation below:
The next "random" number can be guessed by adding the previous number with a +1 of itself. Can someone please verify if this is normal and to be expected.
Here is the code I used for the LFSR:
module LFSR(
    input clock,
    input reset,
    output [12:0] rnd
);
wire feedback = rnd[12] ^ rnd[3] ^ rnd[2] ^ rnd[0];
reg [12:0] random;
always @ (posedge clock or posedge reset)
begin
    if (reset)
        random <= 13'hF; // an LFSR cannot have an all-0 state, so reset to a nonzero value
    else
        random <= {random[11:0], feedback}; // shift left, inserting the XOR'd feedback bit, each posedge clock
end
assign rnd = random;
endmodule
The locations of the bits to XOR are taken from here: the table on page 5.
An LFSR only generates one new random bit per clock. It doesn't generate a new (in your case, 13-bit) number each cycle. The other 12 bits in rnd are just the old random values shifted over, so it will not appear very random.
If you need a 13-bit random number, then you must either sample the LFSR every 13 cycles, or put 13 LFSRs in parallel with different seeds and take bit zero of each as your 13-bit random number.
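Here's a small software model of the LFSR above (Python, purely for illustration) that reproduces the behavior you're seeing: each clock shifts in a single new bit, so each output is roughly double the previous one:

def lfsr_step(state):
    # taps at bits 12, 3, 2 and 0, matching the Verilog feedback line
    fb = ((state >> 12) ^ (state >> 3) ^ (state >> 2) ^ state) & 1
    return ((state << 1) | fb) & 0x1FFF  # keep 13 bits

state = 0xF
for _ in range(8):
    state = lfsr_step(state)
    print(state)  # 31, 63, 127, ... each value is 2 * previous + feedback bit

# sampling only every 13th step instead gives 13 fresh bits per sample:
# for _ in range(13): state = lfsr_step(state)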
An LFSR is most certainly not 'random' in any real sense whatsoever. To quote von Neumann: "Any one who considers arithmetical methods of producing random digits is, of course, in a state of sin." I haven't looked up whether the feedback terms you've chosen are maximal, meaning that they'd give a sequence of length 2^13 - 1 for your 13-bit LFSR before repeating, but that's the best you can do.
So yes, the next value in your LFSR is extremely predictable. If you need something more securely 'random', you need to look into cryptographic methods. These depend on a secret key, of course, and are also much more computationally intensive than an LFSR. You 'get what you pay for', though.
Incidentally, a system where you get predictable 'random' numbers is highly useful in its own right, usually for simulation purposes.

How can I ensure that when I shuffle my puzzle I still end up with an even permutation?

I'm interested in making an implementation of the 14-15 puzzle:
I'm creating an array with the values 0 - 15 in increasing order:
S = { 0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15 }
Now, what I want to do is shuffle them to create a new instance of the puzzle. However, I know that if I create a board with an "odd permutation" then it is unsolvable.
Wikipedia says I need to create the puzzle with an even permutation. I believe this means that I simply have to ensure I do an even number of swaps?
How would I modify Fisher-Yates so I ensure I end up with an even permutation at the end? If I do a swap for every element in the array, that would be 16 swaps, which I believe would be an even permutation. However, do I need to be concerned about an element swapping with itself? Is there any other way to ensure I have a valid puzzle?
You should be able to use Fisher-Yates.
Generate a random permutation using Fisher-Yates.
Check if it is even.
If it is not even, swap the first two elements of the permutation.
Consider an even permutation P = x1 x2 ... xn.
Fisher-Yates generates P with probability 1/n!.
It also generates the odd permutation x2 x1 ... xn, which the swap in step 3 turns into P, with probability 1/n!.
Thus the probability that the above process generates the permutation P is 2/n! = 1/(n!/2).
n!/2 is the number of even permutations.
Thus the above process generates each even permutation with the same probability.
To check whether a permutation is even, compute the parity of the number of inversions in the permutation.
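Here's a minimal Python sketch of this process. Note it only fixes the permutation's parity; for the actual 15-puzzle the position of the blank matters too, as the answers below explain:

import random

def parity(perm):
    # parity of the number of inversions; O(n^2) is fine for 16 elements
    inv = sum(perm[i] > perm[j]
              for i in range(len(perm)) for j in range(i + 1, len(perm)))
    return inv % 2

def even_shuffle(items):
    p = list(items)
    for i in range(len(p) - 1, 0, -1):  # standard Fisher-Yates
        j = random.randint(0, i)
        p[i], p[j] = p[j], p[i]
    if parity(p) == 1:                  # odd: swap the first two elements
        p[0], p[1] = p[1], p[0]
    return p

print(even_shuffle(range(16)))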
Here's what I found already answered here:
"This problem basically boils down to doing a standard shuffle algorithm with a small twist.
The key observation is that for the 15-puzzle to be solvable the parity of the permutation and the parity of the blank square must be the same.
First create a random permutation using a standard algorithm for that purpose. For example the Knuth shuffle algorithm: Random Permutations
The advantage of using Knuth's shuffle (or Fisher-Yates shuffle) is that it involves swapping numbers, so you can easily keep track of the parity of the permutation: each swap of two distinct elements flips the parity, while a swap of an element with itself leaves it unchanged.
Place the blank square on the same parity as the parity of the permutation, and you are done. If the permutation has odd parity, place the blank on an odd square (1, 3, 5, ..., chosen at random). If the permutation has even parity, place the blank on an even square."
Also, "In practice, roughly every 4 consecutively generated permutations will consist of two even and two odd permutations, so even the per-iteration cost is negligible."
You can also check this site out: http://eusebeia.dyndns.org/epermute
I wouldn't really try altering the algorithm itself; it's probably moot for this application anyway. From what I see, there are two options:
Just re-shuffle until you get an even permutation. Since half of all permutations are odd, this throws away one shuffle on average, and the extra work is very likely negligible.
Shuffle the board by using the game's own moves. That is, just make a few hundred random moves. Since you're not taking all the pieces out and re-assembling them, you can't generate a state that's impossible to solve.
Fisher-Yates depends on the ability to swap any element with any other element. Since this violates the physics of the puzzle, I don't think you can use it here.
The naive solution is to do what you would do manually: randomly select one of the tiles adjacent to the empty one and swap with it (see the sketch below). I don't know how many swaps you'd need to do to get a good shuffle.
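A sketch of that random-legal-moves idea (Python, for illustration; the move count of 500 is an arbitrary guess, per the open question above):

import random

def shuffle_by_moves(n=4, moves=500):
    board = list(range(n * n))  # 0 stands for the blank tile
    blank = 0                   # index of the blank in the flat list
    for _ in range(moves):
        r, c = divmod(blank, n)
        neighbors = [(r + dr, c + dc)
                     for dr, dc in ((1, 0), (-1, 0), (0, 1), (0, -1))
                     if 0 <= r + dr < n and 0 <= c + dc < n]
        nr, nc = random.choice(neighbors)
        target = nr * n + nc
        board[blank], board[target] = board[target], board[blank]
        blank = target
    return board            # solvable by construction

print(shuffle_by_moves())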
UPDATED ANSWER:
Before I introduce this algorithm, I need to define two terms: inversion and polarity.
Inversion: a pair of objects that are in the reverse order from where they ought to be. For more information on inversions, refer to Counting inversions in an array.
Polarity of a puzzle is whether the total number of inversions among all tiles is even or odd. A puzzle with 10 inversions has even polarity; a puzzle with 7 inversions has odd polarity.
Consider a 3x3 puzzle like this:
| 6 | 3 | 2 |
| .. | 4 | 7 |
| 5 | 1 | 0 |
Counting all inversions here, we get:
(i) 6 is inverted with 0, 1, 2, 3, 4 and 5.
(ii) 3 is inverted with 0, 1 and 2.
(iii) 2 is inverted with 0 and 1.
(iv) 4 is inverted with 0 and 1.
(v) 7 is inverted with 0, 1 and 5.
(vi) 5 is inverted with 0 and 1.
(vii) 1 is inverted with 0.
In total we have 19 inversions.
If the width of the puzzle is an even number, then moving a tile up or down reverses the polarity, so what matters is that the puzzle has even polarity when the empty tile is in the last row. To account for this, we add the row distance of the empty tile from the bottom row to our total inversion count.
Now we know that a puzzle is solvable if it has even polarity. So if our polarity is even, our problem is solved, but for odd polarity we have to do this:
If the empty tile is not in the first row, swap the first two adjacent tiles in the first row. This flips the polarity, and we get a solvable puzzle with even polarity.
But if the empty tile is in the first row, then swap two adjacent tiles in the last row instead. This makes the puzzle solvable, so at the end you always end up with a solvable puzzle. A sketch of the resulting test follows.
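Here's a compact Python version of the polarity test described above, using the common convention of tiles 1-15 with 0 standing for the blank (note the 3x3 example above numbers its tiles 0-7 instead):

def is_solvable(board, width=4):
    tiles = [t for t in board if t != 0]  # inversions among real tiles only
    inversions = sum(tiles[i] > tiles[j]
                     for i in range(len(tiles)) for j in range(i + 1, len(tiles)))
    blank_rows_from_bottom = (len(board) - 1 - board.index(0)) // width
    if width % 2 == 1:                    # odd width: inversions alone decide
        return inversions % 2 == 0
    return (inversions + blank_rows_from_bottom) % 2 == 0

print(is_solvable([1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 0]))  # True: solved board
print(is_solvable([1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 15, 14, 0]))  # False: Sam Loyd's 14-15 swap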
I hope I satisfy the answering requirements of stackoverflow for this question.

How do I process enormous numbers? [duplicate]

Possible Duplicate:
Most efficient implementation of a large number class
Suppose I needed to calculate 2^150000. Obviously that number is going to exceed the size of an int, float, or double. How can I make a data type that allows normal math functions but exceeds the basic number types?
If this is a "depends which language you use" kind of deal. I will say C#.
See
Most efficient implementation of a large number class
for some leads.
If C# is not cast in stone, and you want something that just works out of the box, then there are several options. The one I know best is Python, but I think that languages like Scheme and Ruby support large numbers, too.
Python: 2**150000. Prints the result after about 1 second.
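For instance, Python's integers are arbitrary precision by default, and the digit count is easy to sanity-check, since 150000 * log10(2) is about 45154.5:

n = 2 ** 150000
print(len(str(n)))  # 45155 decimal digits, matching the Mathematica result mentioned below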
If you want free mathematics software, look at Maxima or Sage.
You might also consider using Frink, which is a language with the native capability of dealing with measurement units.
It computes 2^150000 without difficulty, deals with fractions (e.g. 1/3 + 2/5 --> 11/15), computes 3 meters + 2 inches --> 3.0508 m, and is a full programming language.
Frink - Copyright 2000-2008 Alan Eliasen, eliasen@mindspring.com
http://futureboy.us/frinkdocs/
Several languages have built-in support for arbitrarily large numbers. You could use Mathematica, for example. I tried your example in Mathematica, and the result has 45,155 digits. I tried the same example with bc on a Unix machine. bc supports extended precision, but not that extended; it bombed on the example.
Lisp is your friend: it has bignum integers by default.
I find it very frustrating to use a language without arbitrarily large numbers: it seems nonsensical to be able to use ordinary operators like addition on most numbers, but to have to switch to method calls on a BigInt instance simply because of its size.
A whole bunch of languages have more complete numeric towers, and seamlessly coerce when needed; e.g., Allegro Common Lisp evaluates and prints all 45,155 digits of (expt 2 150000) in 1ms.
cl-user(2): (time (expt 2 150000))
; cpu time (non-gc) 0 msec user, 0 msec system
; cpu time (gc) 0 msec user, 0 msec system
; cpu time (total) 0 msec user, 0 msec system
; real time 1 msec
; space allocation:
; 2 cons cells, 18,784 other bytes, 0 static bytes
There is a program in C called calc, which is an arbitrary precision calculator. I used it once when working as a researcher and found it fairly straightforward to use...
http://sourceforge.net/projects/calc/
It can be programmed for difficult or long calculations and can accept arguments from the command line. In interactive mode, it accepts one command at a time, and displays the answer.
Ordinarily the commands are simply expressions such as:
3 * (4 + 1)
and calc will print:
15
Calc does the arithmetic operators +, -, /, * as well as ^ (exponentiation), % (modulus) and // (integer divide).
For example:
3 * 19 ^ 43 - 1
will produce:
29075426613099201338473141505176993450849249622191102976
Calc values can be VERY large. For example:
2 ^ 23209 - 1
will print:
402874115778988778181873329071 ... loads of digits ... 3779264511
Hope this helps...
I don't know C#, but I do know the Ruby programming language has the BigDecimal class, which seems to allow numbers of unlimited size.
Python has bignums built into the language. If you need to implement a bignum library in another language, you can at least use Python as a reference for validating your work; a toy example of that idea follows. Note that bignums have a few implementation gotchas that aren't immediately obvious if you don't know what you're looking for.
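For example, here's a deliberately naive sketch of schoolbook base-10 multiplication on digit lists, checked against Python's built-in integers on random inputs (carry propagation is exactly the kind of gotcha that's easy to get wrong):

import random

def to_digits(n):
    # little-endian list of decimal digits
    return [int(c) for c in str(n)[::-1]]

def mul_digits(a, b):
    # schoolbook multiplication of two little-endian digit lists
    out = [0] * (len(a) + len(b))
    for i, da in enumerate(a):
        carry = 0
        for j, db in enumerate(b):
            t = out[i + j] + da * db + carry
            out[i + j] = t % 10
            carry = t // 10
        out[i + len(b)] += carry          # final carry of this row
    while len(out) > 1 and out[-1] == 0:  # strip leading zeros
        out.pop()
    return out

for _ in range(200):
    x, y = random.getrandbits(256), random.getrandbits(256)
    assert mul_digits(to_digits(x), to_digits(y)) == to_digits(x * y)
print("ok")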