I'm currently using AES (256) with CBC mode to encrypt data. I store the initialization vector with the encrypted data. Right now I'm just adding the IV to the beginning of the encrypted data, then on decrypt, reading it back in as a hard-coded number of bytes.
If the initialization vector length changes in the future, this method will break.
So my question is:
Will longer AES key sizes in the future mean longer IVs? Or, in other words, will the block size of AES change in the future?
If so, what would be the best way of dealing with this? Using the first byte as an indicator of how long the IV is, then reading in that many bytes?
Rijndael does support larger block sizes, but AES is currently fixed at a 128-bit block. It seems relatively unlikely that the larger Rijndael block sizes will be standardized by NIST, since this would effectively be a completely new algorithm, one that hasn't been implemented by anyone. If NIST feels the need for a block cipher with a larger block size, it's likely they would simply run a new contest.
However, what I would recommend is that, rather than the IV length, you include near the start of your message some kind of algorithm identifier (a single byte is all you'll need), which will give you not just the flexibility to handle larger IVs, but also the ability to extend your format in other ways in the future, for instance with a new algorithm. E.g. 0 == AES-256/CBC, 1 == AES-256/GCM, 2 == AES-2.0/CBC, 3 == AES-256/CBC with a special extra header somewhere, etc., etc.
PS - don't forget to also use a message authentication code, since otherwise you expose yourself to a variety of easy message modification attacks.
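To make that concrete, here is a hedged sketch of the identifier-byte framing in Python (using the pyca/cryptography package purely for illustration; the identifier values and helper names are made up, not part of any standard):

```python
import os
from cryptography.hazmat.primitives.ciphers.aead import AESGCM

ALG_AES256_GCM = 1  # hypothetical registry; 0 could stay reserved for the current AES-256/CBC format

def seal(key: bytes, plaintext: bytes) -> bytes:
    nonce = os.urandom(12)                       # 96-bit GCM nonce
    ct = AESGCM(key).encrypt(nonce, plaintext, None)
    # Layout: alg_id || nonce || ciphertext+tag. A future format only needs a new leading byte.
    return bytes([ALG_AES256_GCM]) + nonce + ct

def open_sealed(key: bytes, blob: bytes) -> bytes:
    alg_id, rest = blob[0], blob[1:]
    if alg_id == ALG_AES256_GCM:
        nonce, ct = rest[:12], rest[12:]
        return AESGCM(key).decrypt(nonce, ct, None)
    raise ValueError("unknown algorithm identifier: %d" % alg_id)
```

The dispatch on the leading byte is also the natural place to handle a different IV length or an extra header later on.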
The purpose of the initialization vector is to randomize the first block, so that the same data encrypted twice with the same key will not produce the same output.
From an information-theoretic point of view, there are "only" 2^128 distinct IVs for AES, because those are all the possible random values you might XOR with your first block of actual data. So there is never any reason to have an IV larger than the cipher's block size.
Larger block sizes could justify larger IVs. Larger key sizes do not.
A larger block size would mean a different algorithm by definition. So however you tag your data to indicate what algorithm you are using, that is how you will tell what block size (and therefore IV size) to use.
As an alternative solution you could switch to AES-CTR mode. Counter mode requires a Nonce, but the Nonce does not have to be tied to the AES block size. If the AES block size were increased (unlikely, as Jack says) then you could retain the same size Nonce.
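For example, here is a hedged sketch of CTR with a 12-byte nonce (Python with the pyca/cryptography package; the 12+4 split is just one common convention, not something the mode requires):

```python
import os
from cryptography.hazmat.primitives.ciphers import Cipher, algorithms, modes

def encrypt_ctr(key: bytes, plaintext: bytes) -> bytes:
    nonce = os.urandom(12)                          # nonce size chosen by us, independent of the block size
    counter_block = nonce + (0).to_bytes(4, "big")  # expand to the 16-byte initial counter block CTR wants
    enc = Cipher(algorithms.AES(key), modes.CTR(counter_block)).encryptor()
    return nonce + enc.update(plaintext) + enc.finalize()   # no padding needed in CTR mode

def decrypt_ctr(key: bytes, blob: bytes) -> bytes:
    nonce, ct = blob[:12], blob[12:]
    counter_block = nonce + (0).to_bytes(4, "big")
    dec = Cipher(algorithms.AES(key), modes.CTR(counter_block)).decryptor()
    return dec.update(ct) + dec.finalize()
```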
I am collecting sensitive form user input which, when the input has finished, I want to encrypt using asymmetric encryption.
I do not know the length of the data beforehand. I don't want the data to be swapped out (because it is sensitive).
So I would think that something like a stringstream with an allocator based on sodium_malloc/sodium_free would be the right choice. Now, in the libsodium documentation on secure memory, it says:
The returned address will not be aligned if the allocation size is not a multiple of the required alignment.
For this reason, sodium_malloc() should not be used with packed structure or variable-length structures, unless the size given to sodium_malloc() is rounded up in order to ensure proper alignment.
I am not really sure what this means and if it applies to me. Why would I care for proper alignment?
Is my approach the right way to do it at all?
So, I'm using the RijndaelManaged class (.NET 2.0) to do AES-128 CBC encryption on small strings (around a dozen characters or less) in a config file. I've got everything working properly except that when I decrypt the data, the padding bytes are not removed. I understand I can choose to not do any padding, but that is VERY insecure, and that the padding bytes need to be added because that's how AES works (in discrete block sizes). Right now I'm using PaddingMode.ISO10126 to let the CryptoStream automatically append crypto-random bytes.
What is the industry-standard way of handling this? What's the right way of getting rid of these "extra bytes" on decryption?
The best way of getting rid of padding is of course to use PKCS#7 padding instead, and let the cipher instance get rid of the padding, as GregS suggested.
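For illustration, this is what letting the library own the padding looks like in Python with pyca/cryptography (the .NET counterpart is simply setting the Padding property to PaddingMode.PKCS7):

```python
from cryptography.hazmat.primitives import padding

block_bits = 128                                        # AES block size in bits
padder = padding.PKCS7(block_bits).padder()
padded = padder.update(b"a dozen chars") + padder.finalize()

unpadder = padding.PKCS7(block_bits).unpadder()
plain = unpadder.update(padded) + unpadder.finalize()   # padding is stripped for you
assert plain == b"a dozen chars"
```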
The best way of performing encryption nowadays is to use CTR mode encryption instead, or preferably a cipher mode that includes authentication/integrity protection, such as GCM. Note that with small strings you need to take care not to reveal information through the size of the ciphertext, though (CTR mode encryption of "yes" will produce three bytes of ciphertext, while "no" will produce two).
Just working on an algorithm, and so far I can encrypt and decrypt a number, which works fine. My question now is: how do I go about encrypting an image? How does the UIdata look, and should I convert the image to that before I start? I've never done anything on this level in terms of encryption, and any input would be great! Thanks!
You'll probably want to encrypt in small chunks - perhaps a byte or word/int (4 bytes), maybe even a long (8 bytes) at a time depending on how your algorithm is implemented.
I don't know the signature of your algorithm (i.e. what types of input it takes and what types of output it gives), but the most common ciphers are block ciphers, i.e. algorithms which take an input of some block size (nowadays 128 bits = 16 bytes is a common size) and produce a same-sized output, in addition to a key input (which should also have at least 128 bits).
To encrypt longer pieces of data (and actually, also for short pieces if you send multiple such pieces with the same key), you use a mode of operation (and probably additionally a padding scheme). This gives you an algorithm (or a pair of such) with an arbitrary-length plaintext input and a slightly bigger ciphertext output (which the decryption algorithm then undoes).
Some hints:
Don't use ECB mode (i.e. simply encrypting each block independently of the others).
You probably should also apply a MAC, to protect your data against malicious modifications (and also against breaking of the encryption scheme by chosen-ciphertext attacks). Some modes of operation already include a MAC.
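Putting those pieces together, here is a hedged sketch of encrypting an image file's bytes with a mode of operation, padding, and an encrypt-then-MAC tag (Python with pyca/cryptography; file handling and key management are simplified for illustration):

```python
import os
from cryptography.hazmat.primitives import hashes, hmac, padding
from cryptography.hazmat.primitives.ciphers import Cipher, algorithms, modes

def encrypt_image(enc_key: bytes, mac_key: bytes, path: str) -> bytes:
    data = open(path, "rb").read()                 # an image is just bytes as far as the cipher cares
    padder = padding.PKCS7(128).padder()
    padded = padder.update(data) + padder.finalize()

    iv = os.urandom(16)                            # fresh random IV per encryption, never reused
    enc = Cipher(algorithms.AES(enc_key), modes.CBC(iv)).encryptor()
    ct = enc.update(padded) + enc.finalize()

    tag = hmac.HMAC(mac_key, hashes.SHA256())      # MAC computed over IV || ciphertext
    tag.update(iv + ct)
    return iv + ct + tag.finalize()
```

An AEAD mode such as GCM would fold the MAC step into the cipher itself.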
I have lots of small secrets that I want to store encrypted in a database. The database client will have the keys, and the database server will not deal with encryption and decryption. All of my secrets are 16 bytes or less, which means just one block when using AES. I'm using a constant IV (and key) to make the encryption deterministic, and my reason for doing deterministic encryption is to be able to easily query the database using ciphertext and to make sure the same secret is not stored twice (by making the column UNIQUE). As far as I can see there should be no problem doing this, as long as the key is secret. But I want to be sure: Am I right or wrong? In case I'm wrong, what attacks could be done?
BTW: Hashes are quite useless here, because of a relatively small number of possible plaintexts. With a hash it would be trivial to obtain the original plaintext.
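(A quick sketch of that brute-force problem, with made-up candidate values, in Python:)

```python
import hashlib

candidates = [b"555-0100", b"555-0101", b"555-0102"]      # hypothetical small space of possible secrets
stored = hashlib.sha256(b"555-0101").hexdigest()           # hash found in the database column

recovered = next(c for c in candidates if hashlib.sha256(c).hexdigest() == stored)
print(recovered)   # b'555-0101' -- the secret is recovered by simple enumeration
```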
An ideal cipher, for messages of length n bits, is a permutation of the 2^n sequences of n bits, chosen at random among the (2^n)! such permutations. The "key" is the description of which permutation was chosen.
A secure block cipher is supposed to be indistinguishable from an ideal cipher, with n being the block size. For AES, n=128 (i.e. 16 bytes). AES is supposed to be a secure block cipher.
If all your secrets have length exactly 16 bytes (or less than 16 bytes, with some padding convention to unambiguously extend them to 16 bytes), then an ideal cipher is what you want, and AES "as itself" should be fine. With common AES implementations, which want to apply padding and process arbitrarily long streams, you can get a single-block encryption by asking for ECB mode, or CBC mode with an all-zero IV.
All the issues about IV, and why chaining modes such as CBC were needed in the first place, come from multi-block messages. AES encrypts 16-byte messages (no more, no less): chaining modes are about emulating an ideal cipher for longer messages. If, in your application, all messages have length exactly 16 bytes (or are shorter, but you add padding), then you just need the "raw" AES; and a fixed IV is a close enough emulation of raw AES.
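A minimal sketch of that "raw AES" usage (Python with pyca/cryptography; ECB is used here only because exactly one 16-byte block is processed, which is what makes it acceptable):

```python
from cryptography.hazmat.primitives.ciphers import Cipher, algorithms, modes

def encrypt_secret(key: bytes, secret16: bytes) -> bytes:
    assert len(secret16) == 16                     # exactly one AES block (pad shorter secrets first)
    enc = Cipher(algorithms.AES(key), modes.ECB()).encryptor()
    return enc.update(secret16) + enc.finalize()   # deterministic: same key and block give the same output
```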
Note, though, the following:
If you are storing encrypted elements in a database, and require uniqueness for the whole lifetime of your application, then your secret key is long-lived. Keeping a secret key secret for a long time can be a hard problem. For instance, long-lived secret keys need some kind of storage (which survives reboots). How do you manage dead hard disks? Do you destroy them in an acid-filled cauldron?
Encryption ensures confidentiality, not integrity. In most security models, attackers can be active (i.e., if the attacker can read the database, he can probably write into it too). Active attacks open up a whole host of issues: for instance, what could happen if the attacker swaps some of your secrets within the database? Or alters some randomly? Encryption is, as always, the easy part (not that it is really "easy", but it is much easier than the rest of the job).
If the assembly is publicly available, or can become so, your key and IV can be discovered by using Reflector to expose the source code that uses it. That would be the main problem with this, if the data really were secret. It is possible to obfuscate MSIL, but that just makes it harder to trace through; it still has to be computer-consumable, so you can't truly encrypt it.
The static IV would make your implementation vulnerable to frequency attacks. See "For AES CBC encryption, whats the importance of the IV?"
When using AES encryption, plaintext must be padded to the cipher block size. Most libraries and standards use padding where the padding bytes can be determined from the unpadded plaintext length. Is there a benefit to using random padding bytes when possible?
I'm implementing a scheme for storing sensitive per-user and per-session data. The data will usually be JSON-encoded key-value pairs, and can be potentially short and repetitive. I'm looking to PKCS#5 for guidance, but I planned on using AES for the encryption algorithm rather than DES3. I was planning on a random IV per data item, and a key determined by the user ID and password or a session ID, as appropriate.
One thing that surprised me is the PKCS#5 padding scheme for the plaintext. To pad the plaintext to 8-byte blocks, 1 to 8 bytes are added at the end, with the padding byte content reflecting the number of padding bytes (i.e. 01, 0202, 030303, up to 0808080808080808). My own padding scheme was to use random bytes at the front of the plaintext, with the last character of the plaintext being the number of padding bytes added.
My reasoning was that in AES-CBC mode, each block is a function of the ciphertext of the preceding block. This way, each plaintext would have an element of randomness, giving me another layer of protection from known plaintext attacks, as well as IV and key issues. Since my plaintext is expected to be short, I don't mind holding the whole decrypted string in memory, and slicing padding off the front and back.
One drawback would be that the same unpadded plaintext, IV, and key would result in different ciphertext, making unit testing difficult (but not impossible - I can use a pseudo-random padding generator for testing, and a cryptographically strong one for production).
Another would be that, to enforce random padding, I'd have to add a minimum of two bytes - a count and one random byte. For deterministic padding, the minimum is one byte, either stored with the plaintext or in the ciphertext wrapper.
Since a well-regarded standard like PKCS#5 decided to use deterministic padding, I'm wondering if there is something else I missed, or I'm judging the benefits too high.
Both, I suspect. The benefit is fairly minimal.
You have forgotten about the runtime cost of acquiring or generating cryptographic-quality random numbers. At one extreme, when only a finite supply of randomness is available (/dev/random on some systems, for instance), your code may have to wait a long time for more random bytes.
At the other extreme, when you are getting your random bytes from a PRNG, you could expose yourself to problems if you're using the same random source to generate your keys. If you're sending encrypted data to multiple recipients one after another, you have given the previous recipient a whole bunch of information about the state of the PRNG which will be used to pick the key for your next comms session. If your PRNG algorithm is ever broken, which is IMO more likely than a good plaintext attack on full AES, you're much worse off than if you had used deliberately-deterministic padding.
In either case, however you get the padding, it's more computationally intensive than PKCS#5 padding.
As an aside, it is fairly standard to compress potentially-repetitive data with e.g. deflate before encrypting it; this reduces the redundancy in the data, which can make certain attacks more difficult to perform.
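For instance (standard-library zlib; the encrypt step is just a placeholder for whatever cipher you end up using):

```python
import zlib

plaintext = b'{"session": "abc", "role": "user"}'   # made-up repetitive JSON
compressed = zlib.compress(plaintext)                # deflate before encryption to strip redundancy
# ciphertext = encrypt(key, compressed)              # hypothetical encryption step
```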
One last recommendation: deriving the key with a mechanism in which only the username and password vary is very dangerous. If you are going to use it, make sure you use a hash algorithm with no known flaws (not SHA-1, not MD5). Cf. this Slashdot story.
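If you do go down that road, here is a hedged sketch of deriving the key with PBKDF2-HMAC-SHA256 and a per-user salt (pyca/cryptography; the iteration count and salt handling are illustrative only):

```python
import os
from cryptography.hazmat.primitives import hashes
from cryptography.hazmat.primitives.kdf.pbkdf2 import PBKDF2HMAC

salt = os.urandom(16)                                # store alongside the user record
kdf = PBKDF2HMAC(algorithm=hashes.SHA256(), length=32, salt=salt, iterations=200_000)
key = kdf.derive(b"user password here")              # 32-byte AES key
```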
Hope this helps.