As we know DH algorithm based upon 2 large prime numbers, which are used to be called as prime and base
I am writing app which implements DH key exchange algorithm. From security point of view should I take care for securing prime and base numbers? And what are implications if I wouldn't do that?
the prime number and base are public values and you don't need take care for the security of that values
That values are the equivalent to a public key in RSA
Related
I would like to ask about best practices regarding a usage of an initialization vector (IV) and a key for symmetric cryptography algorithms.
I want to accept messages from a client, encrypt them and store in a backend. This will be done over a time, and there will be requests coming at a later time for pooling out the messages and return them in a readable form.
According what I know, the key can be the same during the encryption of multiple separate messages. The IV should change with every new encryption. This however, will cause problems, because every message will need a different IV for de-cryption at a later time.
I’d like to know if this is the best way of doing it. Is there any way to avoid storing IV with every message, which would simplify entire process of working with encryption/decryption?
IV selection is a bit complicated because the exact requirements depend on the mode of operation. There are some general rules, however:
You can't go wrong¹ with a random IV, except when using shorter IVs in modes that allow this.
Never use the same IV with the same key.
If you only ever encrypt a single message with a given key, the choice of IV doesn't matter².
Choose the IV independently of the data to encrypt.
Never use ECB.
Of the most common specific modes of operation:
CBC requires the IV to be generated uniformly at random. Do not use a counter as IV for CBC. Furthermore, if you're encrypting some data that contains parts that you receive from a third party, don't reveal the IV until you've fully received the data, .
CTR uses the IV as the initial value of a counter which is incremented for every block, not for every message, and the counter value needs to be unique for every block. A block is 16 bytes for all modern symmetric ciphers (including AES, regardless of the key size). So for CTR, if you encrypt a 3-block message (33 to 48 bytes) with 0 as the IV, the next message must start with IV=3 (or larger), not IV=1.
Modern modes such as Chacha20, GCM, CCM, SIV, etc. use a nonce as their IV. When a mode is described as using a nonce rather than an IV, this means that the only requirement is that the IV is never reused with the same key. It doesn't have to be random.
When encrypting data in a database, it is in general not safe to use the row ID (or a value derived from it) as IV. Using the row ID is safe only if the row is never updated or removed, because otherwise the second time data is stored using the same ID, it would repeat the IV. An adversary who sees two different messages encrypted with the same key and IV may well be able to decrypt both messages (the details depend on the mode and on how much the attacker can guess about the message content; note that even weak guesses such as “it's printable UTF-8” may suffice).
Unless you have a very good reason to do otherwise (just saving a few bytes per row does not count as a very good reason) and a cryptographer has reviewed the specific way in which you are storing and retrieving the data:
Use an authenticated encryption mode such as GCM, CCM, SIV or Chacha20+Poly1305.
If you can store a counter somewhere and make sure that it's never reset as long as you keep using the same encryption key, then each time you encrypt a message:
Increment the counter.
Use the new value of the counter as the nonce for the authenticated encryption.
The reason to increment the counter first is that if the process is interrupted, it will lead to a skipped counter value, which is not a problem. If step 2 was done without step 1, it would lead to repeating a nonce, which is bad. With this scheme, you can shave a few bytes off the nonce length if the mode allows it, as long as the length is large enough for the number of messages that you'll ever encrypt.
If you don't have such a counter, then use the maximum nonce length and generate a random counter. The reason to use the maximum nonce length is that due to the birthday paradox, a random n-bit nonce is expected to repeat when the number of messages approaches 2n/2.
In either case, you need to store the nonce in the row.
¹ Assuming that everything is implemented correctly, e.g. random values need to be generated with a random generator that is appropriate for cryptography.
² As long as it isn't chosen in a way that depends on the key.
Bitcoin mnemonic private key consist of 12 words. Theese words represent private key.
I was thinking if it is possible to make it shorter than 12 by increasing set of words from which to choose. It would be easier to remember. Lets say we have set of 10000 words. How many words would be enough to represent one private key then? Does somebody know the exact calculation? Or any suggestion why this is not a good idea?
Thank you.
The seed phrase can contain any number of words that you want it to. Their function is simply to allow recreation of the private key for the Bitcoin wallet.
Regarding the length of 12 words, taken from here:
The English-language wordlist for the BIP39 standard has 2048 words,
so if the phrase contained only 12 random words, the number of
possible combinations would be 2048^12 = 2^132 and the phrase would
have 132 bits of security. However, some of the data in a BIP39 phrase
is not random, so the actual security of a 12-word BIP39 seed
phrase is only 128 bits. This is approximately the same strength as
all Bitcoin private keys, so most experts consider it to be
sufficiently secure.
So although 12 would seem the correct amount, the important thing about them is that they must be securely stored, preferably in memory, but at least written and locked somewhere safe, and never stored in plaintext online or on your PC\devices.
One afterthought - by increasing the number of words that are used in the phrase, you could actually make it less secure because the longer it is, the more certain it would be that it must be recored somewhere physical rather than in your own memory.
For example a credit card expiry month can be only of only twelve values. So a hacker would have a one in twelve chance of guessing the correct encrypted value of a month. If they knew this, would they be able to crack the encryption more quickly?
If this is the case, how many variations of a value are required to avoid this? How about a bank card number security code which is commonly only three digits?
If you use a proper cipher like AES in a proper way, then encrypting such values is completely safe.
This is because modes of operation that are considered secure (such as CBC and CTR) take an additional parameter called the initialization vector, which effectively randomizes the ciphertext even if the same plain text is encrypted multiple times.
Note that it's extremely important that the IV is used correctly. Every call of the encryption function must use a different IV. For CBC mode, the IV has to be unpredictable and preferably random, while CTR requires a unique IV (a random IV is usually not a bad choice for CTR either).
Good encryption means that if the user knows for example as you mentioned that the expiration month of a credit card is one of twelve values then it will limit the number of options by just that, and not more.
i.e.
If a hacker needs to guess three numbers, a, b, c, each of them can have values from 1 to 3.
The number of options will be 3*3*3 = 27.
Now the hacker finds out that the first number, a, is always the fixed value 2.
So the number of options is 1*3*3 = 9.
If revealing the value of the number a will result in limiting the number of options to a value less then 9 than you have been cracked, but in a strong model, if one of the numbers will be revealed then the number of options to be limited will be exactly to 9.
Now you are obviously not using only the exp. date for encryption, i guess.
I hope i was clear enough.
Am I mistaken in thinking that the security of RSA encryption, in general, is limited by the amount of known prime numbers?
To crack (or create) a private key, one has to combine the right pair of prime numbers.
Is it impossible to publish a list of all the prime numbers in the range used by RSA? Or is that list sufficiently large to make this brute force attack unlikely? Wouldn't there be "commonly used" prime numbers?
RSA doesn't pick from a list of known primes: it generates a new very large number, then applies an algorithm to find a nearby number that is almost certainly prime. See this useful description of large prime generation):
The standard way to generate big prime numbers is to take a preselected random number of the desired length, apply a Fermat test (best with the base 2 as it can be optimized for speed) and then to apply a certain number of Miller-Rabin tests (depending on the length and the allowed error rate like 2−100) to get a number which is very probably a prime number.
(You might ask why, in that case, we're not using this approach when we try and find larger and larger primes. The answer is that the largest known prime has over 17 million digits- far beyond even the very large numbers typically used in cryptography).
As for whether collisions are possible- modern key sizes (depending on your desired security) range from 1024 to 4096, which means the prime numbers range from 512 to 2048 bits. That means that your prime numbers are on the order of 2^512: over 150 digits long.
We can very roughly estimate the density of primes using 1 / ln(n) (see here). That means that among these 10^150 numbers, there are approximately 10^150/ln(10^150) primes, which works out to 2.8x10^147 primes to choose from- certainly more than you could fit into any list!!
So yes- the number of primes in that range is staggeringly enormous, and collisions are effectively impossible. (Even if you generated a trillion possible prime numbers, forming a septillion combinations, the chance of any two of them being the same prime number would be 10^-123).
As new research comes out the answer to your question becomes more interesting. In a recent paper "Imperfect Forward Secrecy:How Diffie-Hellman Fails in Practice" by David Adrian et all found # https://weakdh.org/imperfect-forward-secrecy-ccs15.pdf accessed on 10/16/2015 the researchers show that although there probably are a sufficient number of prime numbers available to RSA's 1024 bit key set there are groups of keys inside the whole set that are more likely to be used because of implementation.
We estimate that even in the 1024-bit case, the computations are
plausible given nation-state resources. A small number of fixed or
standardized groups are used by millions of servers; performing
precomputation for a single 1024-bit group would allow passive
eavesdropping on 18% of popular HTTPS sites, and a second group would
allow decryption of traffic to 66% of IPsec VPNs and 26% of SSH
servers. A close reading of published NSA leaks shows that the
agency’s attacks on VPNs are consistent with having achieved such a
break. We conclude that moving to stronger key exchange methods should
be a priority for the Internet community.
The research also shows a flaw in TLS that could allow a man-in-middle attacker to downgrade the encryption to 512 bit.
So in answer to your question there are probably a sufficient quantity of prime numbers in RSA encryption on paper but in practice there is a security issue if your hiding from a nation state.
Closed. This question is off-topic. It is not currently accepting answers.
Want to improve this question? Update the question so it's on-topic for Stack Overflow.
Closed 10 years ago.
Improve this question
Since breaking password hashes has become a new passtime for scriptkiddies, I thought of the problem and came up with a novel(?) idea.
store the pass as offset+number instead of hash
the number is a product of two large primes
the password is converted into a number , offset is added and that prime is used to divide the number. If it divides AND the divisor is the larger of the two primes the password is correct.
by definition , each hash is unique and each password can be hashed in many different ways depending on the offset. Breaking one hash means you have to factor the number(which is hard), then find a word which corresponds to a number that is largerprime-offset (which is trivial).
To generate use function f() to turn password into a password-number (not important) , generate two random primes larger than 2^4096 or however much is enough. Take the larger prime and calculate prime-passwordnumber=offset. Multiply the primes to get "number". store number and offset.
To check. use function f() to turn password into a password-number, add offset to find prime. divide number with prime to get the other prime. Check that the first prime was the bigger of the two. If so, password was correct.
f() might be for example utf-8 encoding of the password understood as a large binary integer.
Your procedure doesn't really gain you anything over using a hash function. Reversing your function is difficult, yes, since it requires factoring large numbers, but reversing regular hash functions is also difficult. An attacker can still employ the same procedure they would against a regular hash algorithm: employ a brute force attack by testing every possible password.
This, of course, is inevitable with any scheme that stores sufficient data to validate the password. The only solution is to make it computationally expensive for the attacker to do so, by making the hash function expensive to compute, and by adding a salt to make sure they can't precompute.
In general, trying to invent your own crypto system is very hard to do correctly. There are many little things that you have to consider, and it's easy to miss something that an attack can exploit. You'd still be much better off and safer if you used an established cryptography or hashing library. Bcrypt for hashing will probably be much more secure than the solution you posted.
To formalize your scheme:
To create the hash:
User enters password pw
Convert pw to a byte array ba with an encoding function e
Convert ba to a large integer bn
Find prime numbers p and q, p > q > max(bn, 2^2048)
Store n = pq and o = p - bn
To verify the hash:
User enters password pw
Convert pw to a byte array ba with an encoding function e
Convert ba to a large integer bn
Verify that bn + o divides n
This being a secure hash requires that given n and o, it's not feasible to deduce pw, i.e. there is no algorithm that gives an advantage over guessing and checking. I believe it.
As I see it, the main benefit of your scheme is the randomness injected into the hashing process by selecting the random numbers. That they are primes and factoring should be hard is more of an implementation detail (it's your one-way function). Presumably it should also slow down checks, though I really don't know how slow division is on numbers that large.
It is interesting that the hash creation and password verification processes are so different. As you point out, this makes the technique of rainbow table hash chaining inapplicable. This may be something of an advantage, but per-user salting gets you similar protection from rainbow tables.