RSA Cryptography in C#

Suppose the same RSA key-generation algorithm is run on two machines, each creating a private key. Is there any possibility that both keys are the same?

Short answer: No. There is a theoretical possibility, but even if you create a key every second you aren't likely to get the same one twice before the sun explodes.

Yes. Have you heard of the pigeon-hole principle?

Normally, you create RSA keys by randomly selecting extremely large numbers and checking whether they're prime.
Given the sizes of the numbers involved (100+ digits), the only reasonable possibility of a collision is if there's a problem in the random number generator, so that (at least under some circumstances) the numbers it picks aren't very random.
This was exactly the sort of problem that led to a break in Netscape's SSL implementation (in an early version, around 1995). In that particular case, the problem was in generating a session key, but the basic idea was the same: a fair amount of the "random" bits that were used were actually pretty predictable, so an attacker who knew the sources of the bits could fairly quickly generate the same "random" number, and therefore the same session key.

Yes, but the probability is very low.

In the RSA cryptosystem with public key (n, e), the private key (n, d) is generated such that n = p * q, where p and q are large N-bit primes, and ed − 1 is evenly divisible by the totient (p − 1)(q − 1).
To generate the same private key, you would essentially need to generate the same p, q, and e, which has an abysmally small probability.
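The relationships above can be demonstrated with a toy example in Python. These primes are deliberately tiny and the values illustrative, not secure; real p and q are hundreds of digits long.

```python
from math import gcd

p, q = 61, 53            # tiny primes standing in for large random ones
n = p * q                # modulus, part of both keys
phi = (p - 1) * (q - 1)  # totient (p - 1)(q - 1)
e = 17                   # public exponent; must be coprime to phi
assert gcd(e, phi) == 1

d = pow(e, -1, phi)      # private exponent: e*d ≡ 1 (mod phi)
assert (e * d - 1) % phi == 0

# Round trip: m^(e*d) ≡ m (mod n), so decryption undoes encryption.
m = 42
c = pow(m, e, n)
assert pow(c, d, n) == m
```

Because d is determined by p, q, and e, two machines can only produce the same private key by picking the same enormous random primes, which is where the abysmally small probability comes from.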

Related

Likelihood of Collision

I want to hash an internal account number and use the result as a unique public identifier for an account record. The identifier is limited to 40 characters. I have approximately 250 records with unique account numbers.
Which is less likely to result in a collision?
1. Taking the SHA-1 of the SHA-256 hash of the account number.
2. Taking the SHA-256 of the account number and picking out 40 characters.
These approaches are identical (*), so you should use the second one. There is no reason to inject SHA-1 into the system. Any selection of bits out of SHA-256 is independent and "effectively random."
An alternate solution that may be convenient is to turn these into v5 UUIDs. If you keep your namespace secret (which is allowed), this may be a very nice way to do what you're describing.
(*) There are some subtleties around the fact that you're using "characters" rather than bytes here, and you could get a larger space in 40 "characters" by using a better encoding than you're likely using. It's possible the spaces are a little different depending on how you're actually encoding. But it deeply doesn't matter: these spaces are enormous, and the two approaches will be the same in practice, so use the one that needs only one algorithm.
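A minimal sketch of the second approach using Python's hashlib; the hex encoding and the function name here are assumptions for illustration:

```python
import hashlib

def public_id(account_number: str) -> str:
    """Derive a 40-character public identifier from an account number.

    SHA-256 yields 64 hex characters; any 40 of them form an
    effectively random 160-bit identifier, so a prefix suffices.
    """
    digest = hashlib.sha256(account_number.encode("utf-8")).hexdigest()
    return digest[:40]

assert len(public_id("ACCT-000123")) == 40
# Deterministic: the same account number always maps to the same ID.
assert public_id("ACCT-000123") == public_id("ACCT-000123")
```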
Another approach that may meet your needs better is to stretch the identifiers. If the space is sufficiently sparse (i.e., if the number of possible identifiers is dramatically larger than the number of actually used identifiers), then stretching algorithms like PBKDF2 are designed to handle exactly that. They are expensive to compute, but you can tune their cost to match your security requirements.
The general problem with just hashing is that hashes are very fast, and if your space of possible identifiers is very small, then it's easy to brute force. Stretching algorithms make the cost of guessing arbitrarily expensive, so large spaces are impractical to brute force. They do this without requiring any secrets, which is nice. The general approach is:
1. Select a "salt" value. This can be publicly known; it does not matter. For this particular use case, because every account number is different, you can select a single global salt. (If the protected data could be the same, then it is important to have different salts for each record.)
2. Compute PBKDF2(salt, iterations, length, payload).
The number of iterations tunes how slow this operation is. The output is "effectively random" (just like a hash) and can be used in the same ways.
A common target for iterations is a value that makes a single computation take around 80-100 ms. This is fairly fast on the server, but is extremely slow for brute-forcing large spaces, even if the attacker has better hardware than yours. Ideally your space should take at least millions of years to brute force (seriously; this is the kind of headroom we typically like in security; I personally target trillions of years). If it's smaller than a few years, then it probably can be brute forced quickly by throwing more hardware at it.
(Of course, all of these choices can be tuned based on your attack model. It depends on how dedicated and well-funded you expect attackers to be.)
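The steps above can be sketched with Python's hashlib; the salt size, iteration count, and output length are illustrative choices, not recommendations from the original answer:

```python
import hashlib
import os

# The salt can be public; one global salt is fine here because every
# account number is unique.
salt = os.urandom(16)
iterations = 600_000   # tune so one call takes ~80-100 ms on your hardware
length = 20            # 20 bytes -> 40 hex characters

def stretched_id(account_number: str) -> str:
    """Stretch an account number into a 40-character identifier."""
    key = hashlib.pbkdf2_hmac("sha256",
                              account_number.encode("utf-8"),
                              salt, iterations, dklen=length)
    return key.hex()

assert len(stretched_id("ACCT-000123")) == 40
```

The output is "effectively random", just like a plain hash, but each guess now costs an attacker hundreds of thousands of hash evaluations.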
A 40-character ID is 320 bits, which gives you plenty of space. With only 250 records, you can easily fit a unique counter into that: three digit characters occupy only 24 of those bits, and the range 000 to 999 gives you far more room than you need. Fill up the rest of the ID with, say, the hex expansion of part of the SHA-256 hash. With a 3-digit counter, that leaves 37 places for hex, which covers 37 * 4 = 148 bits of the SHA-256 output.
You may want to put the counter in the middle of the hex string in a fixed position instead of at the beginning or end to make it less obvious.
<11 hex chars><3 digit ID><26 hex chars>
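A sketch of that layout, assuming a zero-padded decimal counter spliced into the hex digest at position 11; the function name and inputs are illustrative:

```python
import hashlib

def record_id(counter: int, account_number: str) -> str:
    """Build <11 hex chars><3-digit counter><26 hex chars> = 40 chars."""
    assert 0 <= counter <= 999
    h = hashlib.sha256(account_number.encode("utf-8")).hexdigest()
    return h[:11] + f"{counter:03d}" + h[11:37]

rid = record_id(7, "ACCT-000123")
assert len(rid) == 40
assert rid[11:14] == "007"   # counter sits at a fixed middle offset
```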

Generating public/private key pair based on input

OpenSSL provides tools to generate random public/private key pairs. Is there any mechanism to deterministically generate a pair based on some initial value?
For example, given the string 'abcd', generate a public/private key pair, such that the same public/private key pair can be generated again using the same string.
For sure: just use your password in a PBKDF to generate a key-like array of bytes (a random salt and a high iteration count are required, and the salt must be stored so the derivation is repeatable). Then use this array of bytes as the seed for a PRNG. Make sure that you always use the same PRNG! Then use that PRNG as the input for RSA_generate_key. Make sure the key-generation implementation never changes.
Please read the answers to Initialize a PRNG with a password on crypto.stackexchange.com. Note that usually the private key is encrypted instead, e.g. using a PKCS#12 container. Note that both PKCS#12 containers and the method above are vulnerable to brute-force attacks: most passwords deliver only a very limited amount of entropy, making such attacks more feasible. The advantage of the PKCS#12 container is that you do not have to store it with the ciphertext; it is only required during signature generation or decryption. Using a 128-bit hex value as the password would alleviate the brute-forcing issue, but you likely won't be able to remember it.
Note that RSA key pair generation takes a lot of time (and finding a large prime has a nondeterministic running time, so it may take very long for specific key pairs). EC F(p) keys would be much less cumbersome.
Feasible? Certainly. Useful? Possibly. Fraught with danger? Certainly.
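The derivation step can be sketched in Python. Here SHAKE-256 stands in for the fixed PRNG (an assumption for illustration; the answer's point is only that the seed, the PRNG, and the key-generation routine must all stay fixed), and the call into actual RSA key generation is omitted:

```python
import hashlib

def deterministic_stream(password: str, salt: bytes, nbytes: int) -> bytes:
    """Derive a reproducible byte stream from a password.

    Step 1: stretch the password with PBKDF2 (high iteration count).
    Step 2: expand the resulting seed with SHAKE-256, acting as the
    fixed deterministic PRNG that would feed the key generator.
    """
    seed = hashlib.pbkdf2_hmac("sha256", password.encode("utf-8"),
                               salt, 600_000, dklen=32)
    return hashlib.shake_256(seed).digest(nbytes)

# Same password + same (stored) salt -> identical "random" bytes every time.
a = deterministic_stream("abcd", b"fixed-salt", 64)
b = deterministic_stream("abcd", b"fixed-salt", 64)
assert a == b and len(a) == 64
```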

CSPRNG: Any time guarantees?

Does a cryptographically secure pseudorandom number generator also guarantee that entropy is gathered in such a way that the same value cannot occur twice when generated at different times?
I know it's highly unlikely already, but are there specific guarantees?
I need to generate a series of unique IDs from a CSPRNG that must not have conflicts.
An ideal (CS)PRNG assures you that the probability of extracting a certain value is constant and does not change over time, no matter whether that value was already output in the past.
For instance, let's assume your ID is 32 bits long and today you extract 0x12345678. What just happened had a probability of 1/(2^32).
Tomorrow (and at any point in the future), you will still have the same probability 1/(2^32) of extracting the value 0x12345678.
However, the birthday paradox tells us that if you generate 65,536 (= 2^(32/2)) values, there is already a roughly 40% probability that two IDs are the same (and the odds pass 50% at around 77,000 values).
In other words, there are no hard guarantees the output of the CSPRNG will not be the same. Whether the chances are sufficiently small strongly depends on how long your ID is and how many IDs you expect to have in total over the whole lifetime of your system (special attention should be paid to security concerns when the attacker can generate IDs at will).
For completeness, all of that applies to any good PRNG, including a simple coin flip. Cryptographically strong PRNGs have additional properties concerning the difficulty of predicting future or past outputs from any given output (it should be hard), the ability to recover from compromise of the internal state, and the ability to be fed fresh entropy.
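The birthday arithmetic can be checked directly with the standard approximation p ≈ 1 − e^(−n(n−1)/2N):

```python
import math

def collision_probability(n_draws: int, space_bits: int) -> float:
    """Approximate chance that n uniform draws from 2^space_bits collide."""
    N = 2 ** space_bits
    return 1.0 - math.exp(-n_draws * (n_draws - 1) / (2.0 * N))

# 2^16 draws from a 32-bit space: collision odds are already ~39%.
p = collision_probability(2 ** 16, 32)
assert 0.35 < p < 0.45
```

In practice this is why unique-ID schemes size the output (e.g. 128 bits or more) so that the expected number of IDs stays far below the birthday bound.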

What is the difference in decryption time between AES-128/192/256, and is AES-192/256 too paranoid?

Given this supercomputer: http://en.wikipedia.org/wiki/Tianhe-1A (no. 1 in the TOP500, operating at 2.5 petaFLOPS), how long would it take on average to decrypt a properly encrypted (that is, with a random password) string under each of these three ciphers?
A brute-force attack on the key space of even AES-128 isn't currently feasible. But as security is only as strong as its weakest part, you usually attack the password, which almost always has far less entropy than the key size.
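As a rough sanity check on that claim, here is the question's own scenario, generously assuming one AES trial per floating-point operation (a single trial actually costs far more than one FLOP):

```python
machine_ops_per_sec = 2.5e15          # Tianhe-1A, 2.5 petaFLOPS
seconds_per_year = 3600 * 24 * 365

def expected_years(key_bits: int) -> float:
    """Average brute-force time: half the key space at one trial per op."""
    trials = 2 ** (key_bits - 1)
    return trials / machine_ops_per_sec / seconds_per_year

# Even AES-128 needs on the order of 10^15 years on average,
# about 100,000 times the age of the universe.
assert expected_years(128) > 1e15
```

AES-192 and AES-256 multiply that figure by 2^64 and 2^128 respectively, which is why the password, not the key, is the practical target.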
You can't encrypt with raw AES based on a password directly. AES uses a key.
You first need to derive a key from the password, and this step is crucial to the security. Typically you use a password-based key-derivation function such as PBKDF2 to derive the key from the password. You need to use a random salt and an appropriate number of iterations.
And of course the password entropy is very important. An attacker will first try dictionary words and their variations and then continue on to brute-forcing short passwords. How fast this is depends on the number of iterations in your key derivation.
There are attacks (related-key attacks against the AES-256 key schedule) that reduce its security margin. For that reason, Bruce Schneier recommends AES-128.

Rainbow tables as a solution to large prime factoring

In explanations I've read of public-key cryptography, it is said that some large number is produced by multiplying together two extremely large primes. Since factoring the product of large primes is almost impossibly time-consuming, you have security.
This seems like a problem that could be trivially solved with rainbow tables. If you know the approximate size of the primes used and know there are two of them, you could quickly construct a rainbow table. It'd be a mighty large table, but it could be done, and the task could be parallelized across hardware.
Why are rainbow tables not an effective way to beat public key crypto based on multiplying large primes?
Disclaimer: obviously tens of thousands of crazy-smart security conscious people didn't just happen to miss for decades what I thought up in an afternoon. I assume I'm misunderstanding this because I was reading simplified layman explanations (eg: if more than 2 numbers are used) but I don't know enough yet to know where my knowledge gap is.
Edit: I know "rainbow table" relates to using pre-calculated hashes in a lookup table but the above sounds like a rainbow table attack so I'm using the term here.
Edit 2: As noted in the answers, there's no way to store just all of the primes, much less all of their products.
This site says there are about this many 512-bit primes: (2^511) / (512 ln 2) ≈ 4.35 × 10^151
The mass of the sun is 2 × 10^30 kg, or 2 × 10^33 g.
That's about 2.17 × 10^118 primes per gram of the sun.
Quantity of 512-bit numbers that can fit in a kilobyte: 1 kB = 1024 bytes = 8192 bits; 8192 / 512 = 16
In a terabyte: 16 × 1024^3 ≈ 1.72 × 10^10
In a petabyte: 16 × 1024^4 ≈ 1.76 × 10^13
In an exabyte: 16 × 1024^5 ≈ 1.80 × 10^16
Even if an exabyte weighed one gram, we're nowhere close to the 2.17 × 10^118 primes per gram needed to fit all of these numbers onto a drive with the mass of the sun.
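The storage arithmetic can be checked in a few lines of Python; the prime-count estimate is the one quoted from the cited site:

```python
n_primes = 4.35e151                  # estimated number of 512-bit primes
sun_mass_g = 2e33                    # mass of the sun in grams

primes_per_gram = n_primes / sun_mass_g
assert 2.1e118 < primes_per_gram < 2.2e118   # ~2.17e118 primes per gram

# 512-bit numbers per unit of storage (1 kB = 8192 bits, and so on).
per_kilobyte = 8192 // 512                   # 16
per_terabyte = per_kilobyte * 1024 ** 3      # ~1.72e10
per_exabyte = per_kilobyte * 1024 ** 5       # ~1.80e16
assert per_kilobyte == 16
assert per_exabyte < 1e17 < primes_per_gram  # nowhere near enough room
```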
From one of my favorite books ever, Applied Cryptography by Bruce Schneier
"If someone created a database of all primes, won't he be able to use that database to break public-key algorithms? Yes, but he can't do it. If you could store one gigabyte of information on a drive weighing one gram, then a list of just the 512-bit primes would weigh so much that it would exceed the Chandrasekhar limit and collapse into a black hole... so you couldn't retrieve the data anyway."
In other words, it's impossible, infeasible, or both.
The primes used in RSA and Diffie-Hellman are typically on the order of 2^512. In comparison, there are only about 2^256 atoms in the known universe. That means 2^512 is large enough to assign 2^256 unique numbers to every atom in the universe.
There is simply no way to store/calculate that much data.
As an aside, I assume you mean "a large table of primes": rainbow tables are specifically tailored to hashes and have no real meaning here.
I think the main problem is that rainbow tables pregenerated for certain algorithms cover a rather "small" range (usually something in the range of 128 bits). This doesn't usually cover the whole range, but it speeds the brute-force process up. They usually consume some terabytes of space.
In prime factorization, primes are much larger (for secure RSA, 2048-bit keys are recommended). So the rainbow tables wouldn't just be "mighty large" but impossible to store anywhere (using up something like millions of terabytes of space).
Also, rainbow tables use hash chains to further speed up the process (Wikipedia has a good explanation), which can't be used for primes.