Are AES keys just random bytes of a specific length or is there some sort of extra checks? - cryptography

I want to scale up a simple website, and I only need simple encryption with the key supplied through an environment variable rather than setting up Redis to hold the key.
I'm looking at "Converting Secret Key into a String and Vice Versa" for how to do the retrieval.
I know I can export the string but I was wondering if any arbitrary bytes can be used so long as it meets the length requirement.

An AES key is a sequence of 16, 24, or 32 bytes chosen by a cryptographically secure random number generator. There are no checks other than the length.
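As a minimal sketch of the environment-variable approach from the question: the key is generated with a cryptographically secure generator, exported as Base64 text, and checked only for length when it is read back. The variable name `APP_AES_KEY` and both function names are hypothetical, chosen for illustration.

```python
import base64
import os
import secrets

VALID_AES_KEY_LENGTHS = (16, 24, 32)  # AES-128, AES-192, AES-256


def generate_key_string(num_bytes: int = 32) -> str:
    """Create a random AES key and return it as Base64 text for an env var."""
    if num_bytes not in VALID_AES_KEY_LENGTHS:
        raise ValueError("AES keys must be 16, 24, or 32 bytes long")
    return base64.b64encode(secrets.token_bytes(num_bytes)).decode("ascii")


def load_key(env_name: str = "APP_AES_KEY") -> bytes:
    """Read the key back from the environment; length is the only check."""
    key = base64.b64decode(os.environ[env_name], validate=True)
    if len(key) not in VALID_AES_KEY_LENGTHS:
        raise ValueError("decoded key has an invalid length for AES")
    return key
```

Base64 is used here only because environment variables hold text; hexadecimal would work equally well.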

Related

VB.net Hash Algorithm

I am working on a desktop application in VB.NET with an existing database that includes each user's username and password. I want to build the login window against the existing passwords, but they are stored hashed. Which hash algorithm was used to produce this value: X8NUoMVWb/w6D4QdmumxoQ==?
You can make an educated guess simply by looking at the length of the hash, as generally there's only a handful of popular hashing algorithms used for passwords, all with their own distinct output lengths:
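| Hash            | Output length (bytes) | Output length (bits) |
|-----------------|-----------------------|----------------------|
| MD5             | 16                    | 128                  |
| SHA-1           | 20                    | 160                  |
| SHA-2 (SHA-256) | 32                    | 256                  |
| SHA-2 (SHA-512) | 64                    | 512                  |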
You can never know for sure because while different hashing algorithms have different output sizes, the output can always be truncated (or padded with random bytes).
That said, X8NUoMVWb/w6D4QdmumxoQ== is a Base64-encoded binary value which decodes to a 16-byte value. 16 bytes is 128 bits - it's very likely this is an MD5 hash value.
Converted to Base16 (hexadecimal), the 16 bytes are 5FC354A0C5566FFC3A0F841D9AE9B1A1.
This MD5 hash doesn't appear in any freely available leaked password databases or hash-reverse services I tried.
Note that systems like bcrypt generate an output string which is not just a hash value, but a data structure containing the hash and other data. In bcrypt's case the string always starts with $2, and the $ character never appears in a Base16 or Base64-encoded string.
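The length-based guess described in this answer can be sketched as follows. The lookup table and the function name `guess_hash` are illustrative, and a matching length is only a hint, since any digest can be truncated or padded.

```python
import base64

# Digest sizes in bytes for the common algorithms discussed above.
DIGEST_LENGTHS = {16: "MD5", 20: "SHA-1", 32: "SHA-256", 64: "SHA-512"}


def guess_hash(encoded: str) -> tuple[str, str]:
    """Decode a Base64 value and guess the hash algorithm from its length."""
    raw = base64.b64decode(encoded, validate=True)
    return raw.hex().upper(), DIGEST_LENGTHS.get(len(raw), "unknown")


hex_value, guess = guess_hash("X8NUoMVWb/w6D4QdmumxoQ==")
print(hex_value, guess)  # 5FC354A0C5566FFC3A0F841D9AE9B1A1 MD5
```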

Initialization vector - best practices (symmetric cryptography)

I would like to ask about best practices regarding a usage of an initialization vector (IV) and a key for symmetric cryptography algorithms.
I want to accept messages from a client, encrypt them and store them in a backend. This will happen over time, and requests will arrive later to pull the messages out and return them in a readable form.
According to what I know, the key can be the same for the encryption of multiple separate messages, but the IV should change with every new encryption. This, however, will cause problems, because every message will need a different IV for decryption at a later time.
I’d like to know if this is the best way of doing it. Is there any way to avoid storing IV with every message, which would simplify entire process of working with encryption/decryption?
IV selection is a bit complicated because the exact requirements depend on the mode of operation. There are some general rules, however:
You can't go wrong¹ with a random IV, except when using shorter IVs in modes that allow this.
Never use the same IV with the same key.
If you only ever encrypt a single message with a given key, the choice of IV doesn't matter².
Choose the IV independently of the data to encrypt.
Never use ECB.
Of the most common specific modes of operation:
CBC requires the IV to be generated uniformly at random. Do not use a counter as IV for CBC. Furthermore, if you're encrypting some data that contains parts that you receive from a third party, don't reveal the IV until you've fully received the data.
CTR uses the IV as the initial value of a counter which is incremented for every block, not for every message, and the counter value needs to be unique for every block. A block is 16 bytes for all modern symmetric ciphers (including AES, regardless of the key size). So for CTR, if you encrypt a 3-block message (33 to 48 bytes) with 0 as the IV, the next message must start with IV=3 (or larger), not IV=1.
Modern modes such as ChaCha20, GCM, CCM, SIV, etc. use a nonce as their IV. When a mode is described as using a nonce rather than an IV, this means that the only requirement is that the IV is never reused with the same key. It doesn't have to be random.
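The CTR rule above (the counter advances once per 16-byte block, not once per message) comes down to a little arithmetic. This is a sketch with hypothetical function names, showing only the bookkeeping and not the encryption itself.

```python
BLOCK_SIZE = 16  # bytes per block for AES, regardless of key size


def blocks_used(message_length: int) -> int:
    """Number of counter values consumed by a message of this many bytes."""
    return (message_length + BLOCK_SIZE - 1) // BLOCK_SIZE


def next_initial_counter(initial_counter: int, message_length: int) -> int:
    """Smallest safe starting counter for the message that follows."""
    return initial_counter + blocks_used(message_length)


# A 3-block message (33 to 48 bytes) starting at 0 uses counters 0, 1, 2,
# so the next message must start at 3 or later.
print(next_initial_counter(0, 33))  # 3
print(next_initial_counter(0, 48))  # 3
```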
When encrypting data in a database, it is in general not safe to use the row ID (or a value derived from it) as IV. Using the row ID is safe only if the row is never updated or removed, because otherwise the second time data is stored using the same ID, it would repeat the IV. An adversary who sees two different messages encrypted with the same key and IV may well be able to decrypt both messages (the details depend on the mode and on how much the attacker can guess about the message content; note that even weak guesses such as “it's printable UTF-8” may suffice).
Unless you have a very good reason to do otherwise (just saving a few bytes per row does not count as a very good reason) and a cryptographer has reviewed the specific way in which you are storing and retrieving the data:
Use an authenticated encryption mode such as GCM, CCM, SIV or ChaCha20+Poly1305.
If you can store a counter somewhere and make sure that it's never reset as long as you keep using the same encryption key, then each time you encrypt a message:
Increment the counter.
Use the new value of the counter as the nonce for the authenticated encryption.
The reason to increment the counter first is that if the process is interrupted, it will lead to a skipped counter value, which is not a problem. If step 2 was done without step 1, it would lead to repeating a nonce, which is bad. With this scheme, you can shave a few bytes off the nonce length if the mode allows it, as long as the length is large enough for the number of messages that you'll ever encrypt.
If you don't have such a counter, then use the maximum nonce length and generate a random nonce. The reason to use the maximum nonce length is that, due to the birthday paradox, a random n-bit nonce is expected to repeat when the number of messages approaches 2^(n/2).
In either case, you need to store the nonce in the row.
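Both nonce strategies can be sketched without committing to a particular cipher library. In this sketch `store` is a plain dict standing in for durable storage that is never reset while the key is in use, and the 12-byte nonce length is an assumption; substitute whatever length your chosen mode allows.

```python
import secrets

NONCE_BYTES = 12  # assumed nonce length; depends on the mode you use


def next_counter_nonce(store: dict) -> bytes:
    """Increment the stored counter first, then derive the nonce from it."""
    store["counter"] = store.get("counter", 0) + 1  # persist before use
    # to_bytes raises OverflowError rather than wrapping around, so a nonce
    # can never repeat silently once the counter space is exhausted.
    return store["counter"].to_bytes(NONCE_BYTES, "big")


def random_nonce() -> bytes:
    """Fallback when no reliable counter exists: a full-length random nonce."""
    return secrets.token_bytes(NONCE_BYTES)


def birthday_bound(nonce_bits: int) -> int:
    """Rough message count at which a random nonce is expected to repeat."""
    return 2 ** (nonce_bits // 2)
```

Whichever function produces the nonce, the returned bytes are what gets stored in the row next to the ciphertext.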
¹ Assuming that everything is implemented correctly, e.g. random values need to be generated with a random generator that is appropriate for cryptography.
² As long as it isn't chosen in a way that depends on the key.

Is it safe to store extremely complicated Password in SHA1?

Is it safe to hash an extremely complicated password (longer than 25 characters, any ASCII characters, even binary) with SHA1?
The password actually represents a token ID, but I don't want to store it as-is in the database; I prefer to hash it for more security.
The password (token) is valid for only 14 days and I need to hash it as fast as possible (so something like bcrypt is not an option).
What must be the ideal length of the Password (token) ?
In the general case, no. "Complicated" it may be, but cryptographically random it probably is not.
A bare minimum would be applying an RFC2104 HMAC with a secret key (pepper); however, a more appropriate alternative that can, if you absolutely insist, still be quite fast would be to use PBKDF2-HMAC-SHA-256 and ignore all rules of security regarding a sufficiently high iteration count, i.e. choose an iteration count of 10, instead of 10,000.
For password/token hashing, of course, never request more bytes of PBKDF2 output than the native hash function provides - 20 for SHA-1, 32 for SHA-256, 64 for SHA-512.
I have several example implementations of PBKDF2 at my Github repository that may help, and there are others in other languages, of course.
Use a cryptographically random per-password (per-token) salt.
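A minimal sketch of the PBKDF2-HMAC-SHA-256 suggestion, using Python's standard `hashlib.pbkdf2_hmac`. The function names, the 16-byte salt, and the default iteration count of 10 (the deliberately low figure mentioned above) are illustrative choices, not recommendations.

```python
import hashlib
import hmac
import secrets
from typing import Optional, Tuple


def hash_token(token: bytes, salt: Optional[bytes] = None,
               iterations: int = 10) -> Tuple[bytes, bytes]:
    """Return (salt, digest) for a token using PBKDF2-HMAC-SHA-256."""
    if salt is None:
        salt = secrets.token_bytes(16)  # fresh random salt per token
    # Request exactly 32 bytes: the native output size of SHA-256.
    digest = hashlib.pbkdf2_hmac("sha256", token, salt, iterations, dklen=32)
    return salt, digest


def verify_token(token: bytes, salt: bytes, expected: bytes,
                 iterations: int = 10) -> bool:
    """Recompute the digest and compare it in constant time."""
    _, digest = hash_token(token, salt, iterations)
    return hmac.compare_digest(digest, expected)
```

The salt and the digest are stored together; the token itself is never written to the database.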

Parallelizable hashing algorithm where size and order of sub-strings is irrelevant

EDIT
Here is the problem I am trying to solve:
I have a string broken up into multiple parts. These parts are not of equal, or predictable length. Each part will have a hash value. When I concatenate parts I want to be able to use the hash values from each part to quickly get the hash value for the parts together. In addition the hash generated by putting the parts together must match the hash generated if the string were hashed as a whole.
Basically I want a hashing algorithm where the parts of the data being hashed can be hashed in parallel, and I do not want the order or length of the pieces to matter. I am not breaking up the string, but rather receiving it in unpredictable chunks in an unpredictable order.
I am willing to accept an elevated collision rate, so long as it is not too elevated. I am also OK with a slightly slower algorithm, as it is hardly noticeable on small strings and is done in parallel for large strings.
I am familiar with a few hashing algorithms, however I currently have a use-case for a hash algorithm with the property that the sum of two hashes is equal to a hash of the sum of the two items.
Requirements/givens
This algorithm will be hashing byte-strings with length of at least 1 byte
hash("ab") = hash('a') + hash('b')
Collisions between strings with the same characters in different order is ok
Generated hash should be an integer of native size (usually 32/64 bits)
String may contain any byte value from 0-255 (length is known, not \0 terminated)
The ascii alpha-numeric characters will be by far the most used
A disproportionate number of strings will be 1-8 ASCII characters
A very tiny percentage of the strings will actually contain bytes with values at or above 127
If this is a type of algorithm that has terminology associated with it, I would love to know that terminology. If I knew what a proper term/name for this type of hashing algorithm was it would be much easier to google.
I am thinking the simplest way to achieve this is:
Any byte's hash should be its value, normalized to <128 (if >=128 subtract 128)
To get the hash of a string you normalize each byte to <128 and add it to the key
Depending on key size I may need to limit how many characters are used to hash to avoid overflow
I don't see anything wrong with just adding each (unsigned) byte value to create a hash which is just the sum of all the characters. There is nothing wrong with having an overflow: even if you reach the 32/64 bit limit (and it would have to be a VERY/EXTREMELY long string to do this) the overflow into a negative number won't matter in 2's complement arithmetic. As this is a linear process it doesn't matter how you split your string.
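The additive scheme in this answer can be sketched as below. Python integers do not overflow, so the 64-bit wrap-around that a native unsigned integer would give is emulated with a mask; the names `additive_hash` and `combine` are illustrative.

```python
MASK_64 = (1 << 64) - 1  # emulate wrap-around of a 64-bit unsigned integer


def additive_hash(data: bytes) -> int:
    """Hash a byte string as the sum of its byte values, modulo 2**64."""
    return sum(data) & MASK_64


def combine(hash_a: int, hash_b: int) -> int:
    """Combine the hashes of two chunks; order and chunk sizes don't matter."""
    return (hash_a + hash_b) & MASK_64


# hash("ab") == hash("a") + hash("b"), and "ba" collides with "ab".
print(additive_hash(b"ab"))                                # 195
print(combine(additive_hash(b"a"), additive_hash(b"b")))   # 195
```

Because addition modulo 2^64 is commutative and associative, chunks can be hashed in parallel and combined in any order.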

Hash Function for 2D Barcode Data

I am writing a string of about 120 characters to a 2D barcode. Along with other text, the string contains a unique ticket number. I want to ensure that someone doesn't generate counterfeit tickets by reading the 2D barcode and generating their own barcoded tickets.
I would like to hash the string and append the hash value to what gets embedded in the barcode. That way I can compare the two on reading and see if the data has been tampered with. I have seen several hash functions that return 64 bytes and up, but the more characters you embed in a 2D barcode, the bigger the barcode image becomes. I would like an algorithm that returns a fairly small value. It would also be nice if I could provide the function with my own key. Collisions are not that big of a deal. This isn't any kind of national security application.
Any suggestions?
Use any standard hash function. Take the 120-character string; append your own secret value; feed it into SHA-1 or MD5 or whatever hash function you have handy or feel like implementing; then just take the first however-many bits you want and use that as your value. (If you need ASCII characters, then I suggest that you take groups of 6 bits and use a base-64 encoding.)
If the hash you're using is any good (as, e.g., MD5 and SHA-1 are; MD5 shouldn't be used for serious cryptographic algorithms these days but it sounds like it's good enough for your needs) then any set of bits from it will be "good enough" in the sense that no other function producing that many bits will be much better.
(Warning: For serious cryptographic use, you should be a little more careful. Look at, e.g., http://en.wikipedia.org/wiki/HMAC for more information. From your description, I do not believe you need to worry about such things.)
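A sketch of a short keyed tag for the barcode. Note that it swaps in the HMAC construction mentioned in the warning above, with SHA-256, instead of hashing the string with the secret appended; the truncation and Base64 steps follow the answer. The 6-byte tag length and the function names are assumptions made for illustration.

```python
import base64
import hashlib
import hmac


def ticket_tag(payload: str, secret: bytes, tag_bytes: int = 6) -> str:
    """Return a short Base64 tag: a truncated HMAC-SHA-256 over the payload."""
    mac = hmac.new(secret, payload.encode("utf-8"), hashlib.sha256).digest()
    # 6 bytes encode to exactly 8 Base64 characters with no padding.
    return base64.urlsafe_b64encode(mac[:tag_bytes]).decode("ascii")


def verify_ticket(payload: str, tag: str, secret: bytes) -> bool:
    """Recompute the tag from the scanned payload and compare in constant time."""
    expected = ticket_tag(payload, secret, tag_bytes=6)
    return hmac.compare_digest(expected, tag)
```

A shorter tag keeps the barcode smaller but makes a lucky guess more likely, so the length is a trade-off to choose deliberately.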