How does password-based encryption technically work? - passwords

Say I have some data and a password, and I want to encrypt the data in such a way that it can only be recovered with the right password.
How does this technically work (i.e. how to implement this)? I often hear people use bitshifting for encryption, but how do you base that on a password? How does password-based encryption work?
An example is Mac OS X FileVault
Thanks.
If you give sample code, preferably in C, Objective-C or pseudocode.

For (symmetric) encryption you need a secret key for encryption and decryption.
Usually, the password you supply is used as the source of this key. For various security reasons, the password is not (and often cannot, due to requirements of the cipher used) directly used as the key. Instead, a key derivation function is used to generate the key from the password.
This is why passwords for encryption must be long and fairly random: Otherwise the resulting key will only come from a very small subset of possible keys, and these can then simply all be tried, thus brute-forcing the encryption.
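As a rough illustration of the key derivation step, here is a minimal sketch using PBKDF2 from Python's standard library. The function name derive_key, the iteration count and the 16-byte salt are assumptions chosen for this example, not something FileVault or any particular product is known to use.

```python
import hashlib
import os


def derive_key(password: str, salt: bytes,
               iterations: int = 200_000, length: int = 32) -> bytes:
    """Stretch a password into a fixed-length key with PBKDF2-HMAC-SHA256.

    The iteration count and key length are illustrative assumptions.
    """
    return hashlib.pbkdf2_hmac(
        "sha256", password.encode("utf-8"), salt, iterations, dklen=length
    )


# The salt is random but not secret; it is stored next to the ciphertext
# so the same key can be derived again for decryption.
salt = os.urandom(16)
key = derive_key("correct horse battery staple", salt)
print(len(key))  # 32 bytes, suitable as e.g. an AES-256 key
```

The derived key would then be handed to a real cipher from a crypto library; the slow, salted derivation is what makes guessing passwords expensive.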
As to code examples, there are several possibilities:
look at the source code of a crypto library, such as OpenSSL
look at the source code of a program that implements encryption, such as GnuPG
google some sample source code for a simple encryption algorithm, or a key derivation function, and try to understand it
This depends on what you want to learn.

You'll need to look to other resources for a deep explanation, as this question is extremely broad.
Speaking generally: you use a password as a "seed" for an encryption key, as sleske pointed out. Then you use this key to apply a two-way encryption algorithm (i.e. one that can be applied once to encrypt and again to decrypt). When you apply the algorithm to a piece of data, it becomes encrypted in such a way that you could never get the data back out again without using the same key, and you can't practically produce the same key without having the same password as a seed.

If you're interested in crypto, read Applied Cryptography by Bruce Schneier. Excellent read, lots of examples. It goes through many different cryptography types.

An easy way, but not exactly secure, is to rotate each byte by a number determined by the password. You can use a hash code from a string, or count the number of characters, or whatever for the number.
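A minimal sketch of that idea, reading "rotate" as adding a fixed offset to every byte modulo 256. The helper names and the choice of summing the password bytes to get the offset are assumptions for illustration only, and the result is trivially breakable (there are only 256 possible offsets).

```python
def shift_from_password(password: str) -> int:
    """Turn a password into a number in 0..255 (toy: sum of its bytes)."""
    return sum(password.encode("utf-8")) % 256


def rotate(data: bytes, password: str, decrypt: bool = False) -> bytes:
    """Add (or, when decrypting, subtract) the offset to every byte."""
    shift = shift_from_password(password)
    if decrypt:
        shift = -shift
    return bytes((b + shift) % 256 for b in data)


secret = rotate(b"hello world", "hunter2")
print(rotate(secret, "hunter2", decrypt=True))  # b'hello world'
```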
What you are probably thinking of, though, is public key encryption. Here is a link to a document that will tell you the math for it - you'll have to work out the implementation details yourself, but it's not that hard once you understand the math.
http://mathaware.org/mam/06/Kaliski.pdf

The basic building block of most block ciphers is a construction called a Feistel Network. It's reasonably easy to understand.
Stream ciphers are even simpler - they're essentially just pseudo-random number generators, albeit with some important security properties, where the initial internal state is derived from the key.
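To make the stream cipher idea concrete, here is a toy sketch: a keystream is generated from the key and XORed with the data. Using SHA-256 over key plus counter as the generator is an assumption made for this example, not a standard cipher, and there is no nonce, so reusing a key for two messages would leak information.

```python
import hashlib


def keystream(key: bytes, length: int) -> bytes:
    """Generate `length` pseudo-random bytes from the key (toy construction)."""
    out = bytearray()
    counter = 0
    while len(out) < length:
        # Each block is SHA-256(key || counter); the counter is the state.
        out += hashlib.sha256(key + counter.to_bytes(8, "big")).digest()
        counter += 1
    return bytes(out[:length])


def xor_stream(key: bytes, data: bytes) -> bytes:
    """Encrypt or decrypt: XOR with the keystream is its own inverse."""
    ks = keystream(key, len(data))
    return bytes(d ^ k for d, k in zip(data, ks))


ciphertext = xor_stream(b"my key", b"attack at dawn")
print(xor_stream(b"my key", ciphertext))  # b'attack at dawn'
```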

Password based encryption IS symmetric. The input usually consists of a salt in addition to the password. FooBabel has a cool app where you can play around with this... currently they hard code the Salt to an array of eight bytes (zero to seven) for simplicity. I put in a request to see that they let users input the salt. Anyway, here it is - PBECrypto

Related

Authentication tips using NTAG 424 DNA TT

I need to implement an authentication procedure between a reader and an NFC tag, but since my knowledge in this area is limited, I would appreciate some help understanding a few concepts.
Pardon me in advance for the length of this post, but I could not summarize it any further.
There are many tag families (ICODE, MIFARE, NTAG...), but after doing some research I think the NTAG 424 DNA matches my requirements (I mainly need authentication features).
It comes with AES encryption, CMAC and a 3-pass authentication system, and this is where I started to need assistance.
AES -> As far as I understand, this is a block cipher that encrypts plaintext via permutations and mapping. It is a symmetric standard, and it does not use the master key directly; instead, session keys derived from the master key are used. (Q01: What I do not know is where these keys are stored in the tag. Keys must be stored on specialized HW, but no tag "specs" mention this, apart from MIFARE SAM labels.)
CMAC -> It is a variation of CBC-MAC that makes authentication secure for variable-length messages. If the data is not confidential, a MAC can be applied to the plaintext to verify it, but to get both confidentiality and authentication, "encrypt-then-MAC" must be used. Session keys are used here as well, but not the same keys as in the encryption step. (Q02: My overall view is that CMAC may be a protocol for implementing verification along with confidentiality; this is my opinion and could be wrong.)
3-pass protocol -> The ISO/IEC 9798-2 standard, in which tag and reader verify each other. It may also use a MAC along with session keys to achieve this. (Q03: I think this is the top layer of the whole system for verifying tags and readers. The "3-pass protocol" relies on a MAC to function and, if confidentiality is also needed, CMAC might be used instead of a plain MAC. CMAC needs AES to function, applying session keys at each step. Please correct me if I am making serious mistakes.)
/*********/
P.S.: I am aware that this is a coding-related forum, but surely I can find someone here who knows more about cryptography than I do to answer these questions.
P.P.S.: I have no idea where the master and session keys are kept on the tag side. Do they need to be provided by separate HW alongside the main NFC circuit?
(Target)
The goal is to implement a mutual verification process between tag and reader, using the NTAG 424 DNA TagTamper label. (The aim is to prevent third-party copies, so authentication matters more than message confidentiality.)
I lack knowledge of cryptography and am trying to understand how AES, CMAC and mutual authentication are used on this NTAG.
(Extra Info)
NTAG 424 DNA TT: https://www.nxp.com/products/identification-security/rfid/nfc-hf/ntag/ntag-for-tags-labels/ntag-424-dna-424-dna-tagtamper-advanced-security-and-privacy-for-trusted-iot-applications:NTAG424DNA
ISO 9798-2: http://bcc.portal.gov.bd/sites/default/files/files/bcc.portal.gov.bd/page/adeaf3e5_cc55_4222_8767_f26bcaec3f70/ISO_IEC_9798-2.pdf
3-pass-authentication:https://prezi.com/p/rk6rhd03jjo5/3-pass-mutual-authentication/
Keys storage HW:https://www.microchip.com/design-centers/security-ics/cryptoauthentication
The NTAG424 chips are not particularly easy to use, but they offer some nice features which can be used for different security applications. One important thing to note, however, is that although the chip relies heavily on encryption, that is not the main challenge on the implementation side, because AES encryption, CMAC computation and so on are already available as a package or library in most programming languages. Some examples are even given by NXP in their application note. For example, in Python you can use the AES module (from Crypto.Cipher import AES), as shown in one of the examples in the application note.
My advice is to simply retrace their personalization example, beginning at the initial authentication, and then work your way up to whatever you are trying to achieve. You can also use these examples to test the encryption and the building of APDU commands. Most of the work is not hard, but the NXP documents can sometimes be a bit confusing.
One small note: if you are working with Python, there is some code available on GitHub which you might be able to reuse.
For iOS, I'm working on a library for DNA communication, NfcDnaKit:
https://github.com/johnnyb/nfc-dna-kit

Is RSA-encoded data exchangeable

Up to now, I thought that RSA-encrypted data would be easily exchangeable between most platforms (.NET, Java, PC, Unix...), because the algorithm is so commonly used.
While investigating another question I had, I became confused. I have found differences even between MS implementations (some providers reverse the resulting byte array). Moreover, the padding does not seem to follow a standard.
Can someone with experience in cross-platform cryptography say whether RSA-encoded data is relatively simple to exchange (with some obvious pitfalls) or whether this is a headache?
Note that RSA encryption is normally not used by itself, but in combination with a symmetric encryption algorithm.
So, to make sure to be interoperable, you need to make sure that:
Both sides use the same padding scheme for RSA (e.g. the one originally defined in PKCS#1 v1.5, or OAEP). (That does not mean that the padding has to be deterministic, just that the decrypter knows which bits of the decrypted text were padding and which were the original message.)
Both sides use the same format for their messages (e.g. the one in PKCS#7 or its successors).
Both sides use the same symmetric algorithm (e.g. AES-128), mode of operation (e.g. CBC) and block cipher padding scheme (e.g. PKCS#5-padding).
The encrypting party must use the public key corresponding to the private key used by the decrypting party.
The simple answer to your question is no: the cryptographic algorithm itself does not specify how to store or transmit bytes between implementations to ensure interoperability. For that you must use a standard format or protocol that gives these instructions down to the bit level. For example, in his answer Paulo talks about PKCS#7 and PKCS#1. These in turn rely on the DER encoding rules of ASN.1, which specify exactly how to convert the big integer pieces of RSA into an unambiguous sequence of bytes and back again.

How to determine the encryption scheme used when given a cipher text and the key

For a homework assignment, I am asked to determine the algorithm used to generate a given cipher text. The key is also given. Currently, I am working down a list of simple encryption algorithms and semi-blindly testing different decryption arrangements in an attempt at retrieving the given plain text.
Is there a better way to go about this process? I've read pages of Google results on the topic and haven't come across anything that explained a better process than what I'm already doing. Thus far I've run multiple levels of linguistic analysis on the cipher text and am trying to plug logical values into the encrypted message to decrypt it.
This is built around basic cryptographic systems, nothing at the level of public key encryption or DES.
Even if I can get the original message, how will that show the encryption scheme that was used?
My answer would be there is nothing wrong with trying various different algorithms out and seeing what works.
Cryptanalysis is like solving a puzzle, not a step by step process. You try things, you see what works, what you think gets you closer. It is absolutely trial and error based on knowledge of the potential algorithms, patterns and techniques and the reasons for them. Differential cryptanalysis, a modern technique, basically amounts to trying various combinations of keys and plaintexts within an algorithm and looking at the differences to see if you can find patterns.
From your comments, I think you're facing a Vigenère cipher or some variant of it. In this case, the key is important because a Vigenère cipher is essentially a set of Caesar ciphers, and the length of the key determines the number of these ciphers. The rules of the scheme in question will tell you exactly what cipher it is, but that's the basis of it.
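To show how the key length maps to a set of Caesar shifts, here is a small sketch of the classic Vigenère cipher over A-Z. The function name and the choice to pass non-letters through unchanged are assumptions; your assignment's variant may differ in such details.

```python
def vigenere(text: str, key: str, decrypt: bool = False) -> str:
    """Apply one Caesar shift per key letter, cycling through the key."""
    shifts = [ord(c) - ord("A") for c in key.upper()
              if c.isascii() and c.isalpha()]
    if not shifts:
        raise ValueError("key must contain at least one ASCII letter")
    out = []
    i = 0  # counts letters only, so punctuation does not consume key letters
    for ch in text:
        if ch.isascii() and ch.isalpha():
            base = ord("A") if ch.isupper() else ord("a")
            shift = shifts[i % len(shifts)]
            if decrypt:
                shift = -shift
            out.append(chr((ord(ch) - base + shift) % 26 + base))
            i += 1
        else:
            out.append(ch)
    return "".join(out)


print(vigenere("ATTACKATDAWN", "LEMON"))                # LXFOPVEFRNHR
print(vigenere("LXFOPVEFRNHR", "LEMON", decrypt=True))  # ATTACKATDAWN
```

With the key given, decrypting with a candidate scheme and checking whether readable text comes out is exactly the kind of test that identifies the scheme.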

Symmetric key authentication protocol

Does anybody know a simple authentication and data transfer protocol based on symmetric keys only? Due to memory constraints (kilobytes of RAM and ROM) we can't afford asymmetric cryptography, and because the environment is closed, asymmetric cryptography would not increase security in any way.
I am looking for a simple symmetric cryptography protocol that can be kept in one's head and written on one sheet of paper. I looked at EAP-PSK https://www.rfc-editor.org/rfc/rfc4764#page-4 but still think that 2^6 pages is way too much for something simple and secure.
Does anybody know some useful url, paper or idea?
For secrecy, use AES-CBC. For message authentication, use HMAC-SHA256. Use a different key for each.
In both cases, use an existing, validated, timing-attack-free implementation of the cryptographic primitives.
I think you're looking for the Diffie-Hellman key exchange: only requires bignum integer arithmetic (powers, multiplication, and modulus only, at that): http://en.wikipedia.org/wiki/Diffie–Hellman_key_exchange
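To show how little arithmetic is involved, here is a toy sketch of the exchange using only modular exponentiation. The prime 2**127 - 1 and base 3 are assumptions picked to keep the example short; they are far too small for real use, and plain Diffie-Hellman without authentication is open to man-in-the-middle attacks.

```python
import secrets

# Toy parameters (assumptions): a 127-bit Mersenne prime and a small base.
# Real deployments use standardized groups of 2048 bits or more.
P = 2**127 - 1
G = 3


def make_keypair() -> tuple[int, int]:
    """Return (private, public) where public = G^private mod P."""
    private = secrets.randbelow(P - 3) + 2  # uniform in [2, P - 2]
    public = pow(G, private, P)
    return private, public


def shared_secret(own_private: int, peer_public: int) -> int:
    """Both sides arrive at G^(a*b) mod P without ever sending a or b."""
    return pow(peer_public, own_private, P)


a_priv, a_pub = make_keypair()
b_priv, b_pub = make_keypair()
print(shared_secret(a_priv, b_pub) == shared_secret(b_priv, a_pub))  # True
```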

Does partial known plaintext weaken a hash?

This is a question about an authentication scheme.
Say I have a shared secret string S, and two computers, C1 and C2
Computer one (C1) sends a random string (R) to computer two (C2)
C2 hashes (say SHA256) the concatenation of S and R (SR)
C2 sends the hash of SR to C1, along with some instructions
C1 compares the received hash of SR with its own hash of SR and executes the instructions if they match
Wash, rinse, repeat with different values of R
Now, what I want to know is if someone intercepts a whole bunch of R values, and a whole bunch of SR hashes, can they use that as a "crib" to work out what S is, thus allowing them to forge instructions?
I'm already aware of the potential for a MITM attack here (attacker intercepts response, changes the instructions and forwards it on).
I honestly don't know what I'm dealing with here, I only have a bit of historical knowledge about encryption but that included the use of cribs to break them. I'm not a theorist, so anything you can definitively tell me about specific strong hashes would be great.
Alternate authentication schemes are also welcome, assuming the constraints of an existing shared secret string like in this example. Would I be better off just using S as a key for AES? If I do that, can I still use this in the encrypted message to prevent replay attacks?
Any and all advice welcome, I sort of deviated from my question at the end, so feel free to deviate in your answers!
What you're talking about is called a message authentication code - a MAC. If the secret is sufficiently large (such that it cannot be brute forced in reasonable time) and the MAC is properly implemented, then no, knowing the plaintext doesn't help the attacker.
The key, however, is that it has to be properly implemented. The problem is that crypto is hard. Really hard. Unless you're an expert or have an expert to review your work in context, it's extremely easy to make a mistake. Even worse, it's very easy for people to write crypto that they don't know how to break, but which can be broken quite easily by someone in the know.
The advice you got in the comments is the correct advice: use a proven scheme like SSL or TLS instead of creating your own.
Answering your question:
No, the only way to break a hash is brute force, as small differences in the input mean big differences in the output of the hashing algorithm (given that the algorithm has not been shown to be broken). You need to know S to perform a MITM here.
But Byron Whitlock is correct:
Using a homemade encryption scheme when there are sooo many better schemes available is crazy. Leave encryption to the experts. – Byron Whitlock 4 mins ago
I'm with Byron. Just use something off-the-shelf and tested by people with a clue. How about SSL? – Steven Sudit 57 secs ago
Many cryptographic hash functions are vulnerable to a length extension attack. That means if an attacker knows hash(S) but not S, then he may still be able to compute hash(S || M) for some messages M. For example, the attacker might try to get hash(S) by sending the challenge string "" to one of the parties. Your scheme does not have a detailed description, so it is not clear whether such a length extension attack is possible. To avoid this kind of attack, you might consider using, for example, HMAC instead of the simpler hashing scheme that you propose.
This scheme is weak because the instructions themselves aren't authenticated. You want to send the MAC of R + instructions - and ensure that R is fixed length so that an attacker can't shuffle about between R and instructions.
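A minimal sketch of that fix, using HMAC-SHA256 from Python's standard library over a fixed-length R followed by the instructions. The function names and the 16-byte challenge length are assumptions for this example; the roles of S and R follow the question.

```python
import hashlib
import hmac
import os

NONCE_LEN = 16  # fixed length (assumed), so R and instructions cannot be re-split


def new_challenge() -> bytes:
    """C1: generate a fresh random R for each request."""
    return os.urandom(NONCE_LEN)


def sign_instructions(secret: bytes, challenge: bytes,
                      instructions: bytes) -> bytes:
    """C2: compute the MAC over R || instructions with the shared secret S."""
    if len(challenge) != NONCE_LEN:
        raise ValueError("challenge has the wrong length")
    return hmac.new(secret, challenge + instructions, hashlib.sha256).digest()


def verify_instructions(secret: bytes, challenge: bytes,
                        instructions: bytes, tag: bytes) -> bool:
    """C1: recompute the MAC and compare in constant time."""
    expected = sign_instructions(secret, challenge, instructions)
    return hmac.compare_digest(expected, tag)


S = b"shared secret string"
R = new_challenge()
tag = sign_instructions(S, R, b"open the door")
print(verify_instructions(S, R, b"open the door", tag))     # True
print(verify_instructions(S, R, b"open every door", tag))   # False
```

Because the instructions are covered by the MAC, an attacker who swaps them in transit causes verification to fail, and a tag computed for an old R does not verify against a new one.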
I take it the purpose of the random value is to ensure the "freshness" of the instructions sent?
You could also look into using gpg, if SSL doesn't meet your needs. That's likely to be a lot better than homegrown crypto.