Title says it all.
Since Quantum Computers are said to be the next big thing, I figured the speed at which these systems operate on should be enough to decrypt files/applications in a 'Brute Force' manner.
Is it possible? When will it be possible?
Quantum computers operate differently from classical computers, rather than faster or slower. For some problems they're much faster than the best known algorithms, while for others they'd be slower if they would work at all.
For decrypting, there are quantum algorithms for attacking some specific ciphers. Probably the best known is Shor's Algorithm, which on a large enough quantum computer would allow you to factor large numbers efficiently, thus breaking RSA. Breaking RSA would require many thousands of high-quality qubits, and so is not something that's going to be available in the next few years. Longer term, I myself wouldn't try to guess when such a quantum computer will be available, although others may have more confidence.
There are quantum attacks on other ciphers as well, including elliptic curve cryptography. The good news is that post-quantum cryptography is an active field of research, and there are some promising developments already. Also, most symmetric ciphers in use today are quantum-resistant; while brute force search time on a quantum computer would in theory scale with the square root of the number of possible keys, doubling the key size addresses this neatly.
There are good resources for this on Wikipedia: https://en.wikipedia.org/wiki/Shor%27s_algorithm and https://en.wikipedia.org/wiki/Post-quantum_cryptography. The Microsoft Quantum samples repository includes a Q# implementation of Shor's algorithm.
Threat of Quantum Computers to today's encrypted data is real. Please refer to "Harvest now Decrypt later attack".
You can implement Shor's algorithm in Q#. Shor's algorithm is the threat to
Asymmetric key cryptography.
You can also implement Grover's Algorithm in Q#. Grover's algorithm uses brute-force / unsorted search to search symmetric keys.
Much progress has been made with Post Quantum Cryptography (PQC) since this question was originally answered in 2018. NIST is driving a standardization process to identify PQC algorithms. These PQC Algorithms will replace RSA / ECC based DHE/DSA/LEM algorithms.
So more than "when is it possible?" - we have to act now because the threat is real and the encrypted data as we know today is perhaps being passively snapped ( we wont know about it). Data elements such as social security id (in USA) and similar information have a shelf life that exceeds 7 to 10 years.
Related
I am currently doing my final project, and I need to conduct some performance tests on at least 3 lightweight encryption algorithms (symmetric block ciphers). Ideally encrypt/decrypt of a text file and measure/compare at least 3 metrics such as execution time, memory, code size, throughput. I'm struggling to work out how to achieve this so if anyone has any pointers I would be extremely grateful. I've no experience with code although I'm trying to work this out (looking at C#, Java Cryptographic Extension, FELICS).
Thanks.
Consider constructing speed and memory tests of one-time pad, as an example of perfect secrecy or a substitution cipher perhaps, DES as an outdated form of symmetric encryption, and it’s modern replacement AES. Pick a language, and duckduckgo for library implementations of the latter two.
In order to integrity protect a byte stream one can conceptually either use symmetric cryptography (e.g. an HMAC with SHA-1) or asymmetric cryptography (e.g. digital signature with RSA).
It is common sense that asymmetric cryptography is much more expensive than using symmetric cryptography. However, I would like to have hard numbers and would like to know whether there exist benchmark suites for existing crypto libraries (e.g. openssl) in order to gain some measurement results for symmetric and asymmetric cryptography algorithms.
The numbers I get from the built-in "openssl speed" app can, unfortunately, not be compared to each other.
Perhaps somebody already implemented a small benchmarking suite for this purpose?
Thanks,
Martin
I don't think a benchmark is useful here, because the two things you're comparing are built for different use-cases. An HMAC is designed for situations in which you have a shared secret you can use to authenticate the message, whilst signatures are designed for situations in which you don't have a shared secret, but rather want anyone with your public key be able to verify your signature. There are very few situations in which either primitive would be equally appropriate, and when there is, there's likely to be a clear favorite on security, rather than performance grounds.
It's fairly trivial to demonstrate that an HMAC is going to be faster, however: Signing a message requires first hashing it, then computing the signature over the hash, whilst computing an HMAC requries first hashing it, then computing the HMAC (which is merely two additional one-block hash computations). For the same reason, though, for any reasonable assumption as to message length and speed of your cryptographic primitives, the speed difference is going to be negligible, since the largest part of the cost is shared between both operations.
In short, you shouldn't choose the structure of your cryptosystem based on insignificant differences in performance.
All digital signature algorithms (RSA, DSA, ECDSA...) begin by hashing the source stream with a hash function; only the hash output is used afterwards. So the asymptotic cost of signing a long stream of data is the same as the asymptotic cost of hashing the same stream. HMAC is similar in that respect: first you input in the hash function a small fixed-size header, then the data stream; and you have an extra hash operation at the end which operates on a small fixed-size input. So the asymptotic cost of HMACing a long stream of data is the same as the asymptotic cost of hashing the same stream.
To sum up, for a suitably long data stream, a digital signature and HMAC will have the same CPU cost. Speed difference will not be noticeable (the complex part at the end of a digital signature is more expensive than what HMAC does, but a simple PC will still be able to do it in less than a millisecond).
The hash function itself can make a difference, though, at least if you can obtain the data with a high bandwidth. On a typical PC, you can hope hashing data at up to about 300 MB/s with SHA-1, but "only" 150 MB/s with SHA-256. On the other hand, a good mechanical harddisk or gigabit ethernet will hardly go beyond 100 MB/s read speed, so SHA-256 would not be the bottleneck here.
I was wondering if I could reasons or links to resources explaining why SHA512 is a superior hashing algorithm to MD5.
It depends on your use case. You can't broadly claim "superiority". (I mean, yes you can, in some cases, but to be strict about it, you can't really).
But there are areas where MD5 has been broken:
For starters: MD5 is old, and common. There are tons of rainbow tables against it, and they're easy to find. So if you're hashing passwords (without a salt - shame on you!) - using md5 - you might as well not be hashing them, they're so easy to find. Even if you're hashing with simple salts really.
Second off, MD5 is no longer secure as a cryptographic hash function (indeed it is not even considered a cryptographic hash function anymore as the Forked One points out). You can generate different messages that hash to the same value. So if you've got a SSL Certificate with a MD5 hash on it, I can generate a duplicate Certificate that says what I want, that produces the same hash. This is generally what people mean when they say MD5 is 'broken' - things like this.
Thirdly, similar to messages, you can also generate different files that hash to the same value so using MD5 as a file checksum is 'broken'.
Now, SHA-512 is a SHA-2 Family hash algorithm. SHA-1 is kind of considered 'eh' these days, I'll ignore it. SHA-2 however, has relatively few attacks against it. The major one wikipedia talks about is a reduced-round preimage attack which means if you use SHA-512 in a horribly wrong way, I can break it. Obivously you're not likely to be using it that way, but attacks only get better, and it's a good springboard into more research to break SHA-512 in the same way MD5 is broken.
However, out of all the Hash functions available, the SHA-2 family is currently amoung the strongest, and the best choice considering commonness, analysis, and security. (But not necessarily speed. If you're in embedded systems, you need to perform a whole other analysis.)
MD5 has been cryptographically broken for quite some time now. This basically means that some of the properties usually guaranteed by hash algorithms, do not hold anymore. For example it is possible to find hash collisions in much less time than potentially necessary for the output length.
SHA-512 (one of the SHA-2 family of hash functions) is, for now, secure enough but possibly not much longer for the foreseeable future. That's why the NIST started a contest for SHA-3.
Generally, you want hash algorithms to be one-way functions. They map some input to some output. Usually the output is of a fixed length, thereby providing a "digest" of the original input. Common properties are for example that small changes in input yield large changes in the output (which helps detecting tampering) and that the function is not easily reversible. For the latter property the length of the output greatly helps because it provides a theoretical upper bound for the complexity of a collision attack. However, flaws in design or implementation often result in reduced complexity for attacks. Once those are known it's time to evaluate whether still using a hash function. If the attack complexity drops far enough practical attacks easily get in the range of people without specialized computing equipment.
Note: I've been talking only about one kind of attack here. The reality if much more nuanced but also much harder to grasp. Since hash functions are very commonly used for verifying file/message integrity the collision thing is probably the easiest one to understand and follow.
There are a couple of points not being addressed here, and I feel it is from a lack of understanding about what a hash is, how it works, and how long it takes to successfully attack them, using rainbow or any other method currently known to man...
Mathematically speaking, MD5 is not "broken" if you salt the hash and throttle attempts (even by 1 second), your security would be just as "broken" as it would by an attacker slowly pelting away at your 1ft solid steel wall with a wooden spoon:
It will take thousands of years, and by then everyone involved will be dead; there are more important things to worry about.
If you lock their account by the 20th attempt... problem solved. 20 hits on your wall = 0.0000000001% chance they got through. There is literally a better statistical chance you are in fact Jesus.
It's also important to note that absolutely any hash function is going to be vulnerable to collisions by virtue of what a hash is: "a (small) unique id of something else".
When you increase the bit space you decrease collision rates, but you also increase the size of the id and the time it takes to compute it.
Let's do a tiny thought experiment...
SHA-2, if it existed, would have 4 total possible unique IDs for something else... 00, 01, 10 & 11. It will produce collisions, obviously. Do you see the issue here? A hash is just a generated ID of what you're trying to identify.
MD5 is actually really, really good at randomly choosing a number based on an input. SHA is actually not that much better at it; SHA just has massive more space for IDs.
The method used is about 0.1% of the reason the collisions are less likely. The real reason is the larger bit space.
This is literally the only reason SHA-256 and SHA-512 are less vulnerable to collisions; because they use a larger space for a unique id.
The actual methods SHA-256 and SHA-512 use to generate the hash are in fact better, but not by much; the same rainbow attacks would work on them if they had fewer bits in their IDs, and files and even passwords can have identical IDs using SHA-256 and SHA-512, it's just a lot less likely because it uses more bits.
The REAL ISSUE is how you implement your security
If you allow automated attacks to hit your authentication endpoint 1,000 times per second, you're going to get broken into. If you throttle to 1 attempt per 3 seconds and lock the account for 24 hours after the 10th attempt, you're not.
If you store the passwords without salt (a salt is just an added secret to the generator, making it harder to identify bad passwords like "31337" or "password") and have a lot of users, you're going to get hacked. If you salt them, even if you use MD5, you're not.
Considering MD5 uses 128 bits (32 bytes in HEX, 16 bytes in binary), and SHA 512 is only 4x the space but virtually eliminates the collision ratio by giving you 2^384 more possible IDs... Go with SHA-512, every time.
But if you're worried about what is really going to happen if you use MD5, and you don't understand the real, actual differences, you're still probably going to get hacked, make sense?
reading this
However, it has been shown that MD5 is not collision resistant
more information about collision here
MD5 has a chance of collision (http://www.mscs.dal.ca/~selinger/md5collision/) and there are numerous MD5 rainbow tables for reverse password look-up on the web and available for download.
It needs a much larger dictionary to map backwards, and has a lower chance of collision.
It is simple, MD5 is broken ;) (see Wikipedia)
Bruce Schneier wrote of the attack that "[w]e already knew that MD5 is a broken hash function" and that "no one should be using MD5 anymore."
Already understanding that AES is the encryption method of choice, should existing code that uses DES be re-written if the likely threat is on the level of script kiddies? (e.g. pkzip passwords can be cracked with free utilities by non-computer professionals, so is DES like that?) A quick google search seems to imply that even deprecated DES still requires a super computer and large quantity of time--or have times changed?
In particular, this CAPTCHA library uses DES to encrypt the challenge string which is sent to the user in viewstate.
DES is broken so far as storing sensitive data, and so I would certainly not use it in anything new, and would replace it in anything used for long term storage of any information of interest (data that someone would have a profit for national security interest in stealing).
At the moment a DES message can be broken by brute force in a couple of days (or less) using under $100,000 worth of custom hardware.
But there are some key factors in that:
The hardware is custom - the chips used to quickly brute a DES key are not the general purpose processor you'd find in a PC. That being said there is probably room today for using a cluster of Playstation 3s or current generation graphics cards with a GPGPU to crack a DES message in a reasonable amount of time, perhaps bringing down the cost to maybe $15,000.
The other factor is time - a DES message can be cracked in a day, but if your CAPTCHA library has a timestamp that specifies a 30 minute timeout for any given CAPTCHA response, it would still be effective (you could scale up your hardware, but then you're talking millions).
Overall I'd say that for non-long term storage, DES is still secure against "script kiddies".
no, DES cracking is not suitable for scriptkiddies and won't probaly be in the near forseeable future.
it requires such enormous processing power, we're talking about a load of FPGA processors.
for example the COPACOBANA in the CHES 2006 secret key challenge took 21 hours, 26 mins, 29 secs using 108 of it's 128 processors, at a troughput of 43.1852 billion keys per second, and found the key after searching trough 4.73507% of the keyspace
now, if we look at moores law we see, that if we currently build a similar machine, it'll currently take 1/4th of the time for the same amount of money, or 1/4th of the money for the same amount of time.
DES is broken by the standards of the crypto community; but the time required to break it is generally large enough that it would be 'safe' to use for this kind of application. On one assumption: the DES key changes from session to session. If the key doesn't change, then it is open to attack by a very very dedicated individual. Now, the question is, is your website subjected to people that will spend 10+ days cracking DES, rather then applying lessons learned by the rest of the Spam Industry in the way of Image Recognition.
DES is probably still good enough for most use cases. But the point is, there is normally reason to use an algorithm (or in this case rather: a key strength) which is known to be rather weak.
Wikipedia points out that even with special hardware around 9 Days are needed for an exhaustive key search. I don't think the script kiddies are likely to spend that many CPU time (even if they have a botnet) only to crack a captcha. (Actually, cracking captchas is normally A LOT easier with sufficient intelligent picture recognition...)
Closed. This question is off-topic. It is not currently accepting answers.
Want to improve this question? Update the question so it's on-topic for Stack Overflow.
Closed 10 years ago.
Improve this question
Should practical quantum computing become a reality, I am wondering if there are any public key cryptographic algorithms that are based on NP-complete problems, rather than integer factorization or discrete logarithms.
Edit:
Please check out the "Quantum computing in computational complexity theory" section of
the wiki article on quantum computers. It points out that the class of problems quantum computers can answer (BQP) is believed to be strictly easier than NP-complete.
Edit 2:
'Based on NP-complete' is a bad way of expressing what I'm interested in.
What I intended to ask is for a Public Key encryption algorithm with the property that any method for breaking the encryption can also be used to break the underlying NP-complete problem. This means breaking the encryption proves P=NP.
I am responding to this old thread because it is a very common and important question, and all of the answers here are inaccurate.
The short answer to the original question is an unequivocal "NO". There are no known encryption schemes (let alone public-key ones) that are based on an NP-complete problem (and hence all of them, under polynomial-time reductions). Some are "closer" that others, though, so let me elaborate.
There is a lot to clarify here, so let's start with the meaning of "based on an NP-complete problem." The generally agreed upon interpretation of this is: "can be proven secure in a particular formal model, assuming that no polynomial-time algorithms exist for NP-complete problems". To be even more precise, we assume that no algorithm exists that always solves an NP-complete problem. This is a very safe assumption, because that's a really hard thing for an algorithm to do - it's seemingly a lot easier to come up with an algorithm that solves random instances of the problem with good probability.
No encryption schemes have such a proof, though. If you look at the literature, with very few exceptions (see below), the security theorems read like the following:
Theorem: This encryption scheme is provably secure, assuming that no
polynomial-time algorithm exists for
solving random instances of some problem X.
Note the "random instances" part. For a concrete example, we might assume that no polynomial-time algorithm exists for factoring the product of two random n-bit primes with some good probability. This is very different (less safe) from assuming that no polynomial-time algorithm exists for always factoring all products of two random n-bit primes.
The "random instances" versus "worst case instances" issue is what is tripped up several responders above. The McEliece-type encryption schemes are based on a very special random version of decoding linear codes - and not on the actual worst-case version which is NP-complete.
Pushing beyond this "random instances" issue has required some deep and beautiful research in theoretical computer science. Starting with the work of Miklós Ajtai, we have found cryptographic algorithms where the security assumption is a "worst case" (safer) assumption instead of a random case one. Unfortunately, the worst case assumptions are for problems that are not known to be NP complete, and some theoretical evidence suggests that we can't adapt them to use NP-complete problems. For the interested, look up "lattice based cryptography".
Some cryptosystems based on NP-hard problems have been proposed (such as the Merkle-Hellman cryptosystem based on the subset-sum problem, and the Naccache-Stern knapsack cryptosystem based on the knapsack problem), but they have all been broken. Why is this? Lecture 16 of Scott Aaronson's Great Ideas in Theoretical Computer Science says something about this, which I think you should take as definitive. What it says is the following:
Ideally, we would like to construct a [Cryptographic Pseudorandom Generator] or cryptosystem whose security was based on an NP-complete problem. Unfortunately, NP-complete problems are always about the worst case. In cryptography, this would translate to a statement like “there exists a message that’s hard to decode”, which is not a good guarantee for a cryptographic system! A message should be hard to decrypt with overwhelming probability. Despite decades of effort, no way has yet been discovered to relate worst case to average case for NP-complete problems. And this is why, if we want computationally-secure cryptosystems, we need to make stronger assumptions than P≠NP.
This was an open question in 1998:
On the possibility of basing Cryptography on the assumption that P != NP
by Oded Goldreich, Rehovot Israel, Shafi Goldwasser
From the abstract: "Our conclusion is that the question remains open".
--I wonder if that's changed in the last decade?
Edit:
As far as I can tell the question is still open, with recent progress toward an answer of no such algorithm exists.
Adi Akavia, Oded Goldreich, Shafi Goldwasser, and Dana Moshkovitz published this paper in the ACM in 2006: On basing one-way functions on NP-hardness "Our main findings are the following two negative results"
The stanford site Complexity Zoo is helpful in decripting what those two negative results mean.
While many forms have been broken, check out Merkle-Hellman, based on a form of the NP-complete 'Knapsack Problem'.
Lattice cryptography offers the (over)generalized take-home message that indeed one can design cryptosystems where breaking the average case is as hard as solving a particular NP-hard problem (typically the Shortest Vector Problem or the Closest Vector Problem).
I can recommend reading the introduction section of http://eprint.iacr.org/2008/521 and then chasing references to the cryptosystems.
Also, see the lecture notes at http://www.cs.ucsd.edu/~daniele/CSE207C/, and chase links for a book if you want.
Googling for NP-complete and Public key encryption finds False positives ... that are actually insecure. This cartoonish pdf appears to show a public key encyption algorithm based on the minimium dominating set problem. Reading further it then admits to lying that the algorithm is secure ... the underlying problem is NP-Complete but it's use in the PK algorithm does not preserve the difficulty.
Another False positive Google find: Cryptanalysis of the Goldreich-Goldwasser-Halevi cryptosystem from Crypto '97. From the abstract:
At Crypto '97, Goldreich, Goldwasser and Halevi proposed a public-key cryptosystem based on the closest vector problem in a lattice, which is known to be NP-hard. We show that there is a major flaw in the design of the scheme which has two implications: any ciphertext leaks information on the plaintext, and the problem of decrypting ciphertexts can be reduced to a special closest vector problem which is much easier than the general problem.
There is a web site that may be relevant to your interests: Post-Quantum Cryptography.
Here is my reasoning. Correct me if I'm wrong.
(i) ``Breaking'' a cryptosystem is necessarily a problem in NP and co-NP. (Breaking a cryptosystem involves inverting the encryption function, which is one-to-one and computable in polynomial-time. So, given the ciphertext, the plaintext is a certificate that can be verified in polynomial time. Thus querying the plaintext based on the ciphertext is in NP and in co-NP.)
(ii) If there is an NP-hard problem in NP and co-NP, then NP = co-NP. (This problem would be NP-complete and in co-NP. Since any NP language is reducible to this co-NP language, NP is a subset of co-NP. Now use symmetry: any language L in co-NP has -L (its compliment) in NP, whence -L is in co-NP---that is L = --L is in NP.)
(iii) I think that it is generally believed that NP != co-NP, as otherwise there are polynomial-sized proofs that boolean formulas are not satisfiable.
Conclusion: Complexity-theoretic conjectures imply that NP-hard cryptosystems don't exist.
(Otherwise, you have an NP-hard problem in NP and co-NP, whence NP = co-NP---which is believed to be false.)
While RSA and other widely-used cryptographic algorithms are based on the difficulty of integer factorization (which is not known to be NP-complete), there are some public key cryptography algorithms based on NP-complete problems too. A google search for "public key" and "np-complete" will reveal some of them.
(I incorrectly said before that quantum computers would speed up NP-complete problems, but this is not true. I stand corrected.)
As pointed out by many other posters, it is possible to base cryptography on NP-hard or NP-complete problems.
However, the common methods for cryptography are going to be based on difficult mathematics (difficult to crack, that is). The truth is that it is easier to serialize numbers as a traditional key than to create a standardized string that solves an NP-hard problem. Therefore, practical crypto is based on mathematical problems that are not yet proven to be NP-hard or NP-complete (so it is conceivable that some of these problems are in P).
In ElGamal or RSA encryption, breaking it requires the cracking the discrete logarithm, so look at this wikipedia article.
No efficient algorithm for computing general discrete logarithms logbg is known. The naive algorithm is to raise b to higher and higher powers k until the desired g is found; this is sometimes called trial multiplication. This algorithm requires running time linear in the size of the group G and thus exponential in the number of digits in the size of the group. There exists an efficient quantum algorithm due to Peter Shor however (http://arxiv.org/abs/quant-ph/9508027).
Computing discrete logarithms is apparently difficult. Not only is no efficient algorithm known for the worst case, but the average-case complexity can be shown to be at least as hard as the worst case using random self-reducibility.
At the same time, the inverse problem of discrete exponentiation is not (it can be computed efficiently using exponentiation by squaring, for example). This asymmetry is analogous to the one between integer factorization and integer multiplication. Both asymmetries have been exploited in the construction of cryptographic systems.
The widespread belief is that these are NP-complete, but maybe can't be proven so. Note that quantum computers may break crypto efficiently!
Since nobody really answered the question I have to give you the hint: "McEliece". Do some searches on it. Its a proven NP-Hard encryption algorithm. It needs O(n^2) encryption and decryption time. It has a public key of size O(n^2) too, which is bad. But there are improvements which lower all these bounds.