Why is an s-box input longer than its output? - cryptography

I don't understand where the extra bits are coming from in this article about s-boxes. Why doesn't the s-box take in the same number of bits for input as output?

It is the way s-boxes work. They can be m * n ==> m bit input , n bit output.
For example, in the AES S-box the number of bits in input is equal to the number of bits in output.
In DES, m=6 and n=4.
The input is expanded from 32 to 48 bits in the first stages of DES. So it is be reduced to 32 bits again by applying one round of S-box substitution. Thus no information is lost here.
The Wikipedia article on itself can be a bit confusing. It will make people think that information is lost. You should read the article in conjuncture with implementation details of some encryption algorithm using s-boxes.

What extra bits? They are going from 6 to 4.
EDIT: Whoops! I'm an idiot. This is kinda like a 2nd grade multiplication table. They strip the outer bits off of the 6-bit block to be encypted, and leave the middle 4. Just like a table for an arithmatic operation, they go down one side, and find the outer bit sequence, then across the top and find the middle ones. To answer your question, it could input and output the same number of bits, but this s-box is just set up to do it the way it does. Its arbitrary.

Related

Why do we only take the 1st and last bits together for SBox substitution?

Today, in class we were discussing about how s-box is used in feistal ciphers. Our sir went on to say that for s-box substitution, we take the 1st and last bits as one part and the remaining bits as another, following which we find the matching 4 bits.
So if we were to have the 6bits as: "100100", it would map to 1110 (left column: 10 , right column: 0010)
My doubt now is if it possible for us to take different positions of bits together, like the 1st and 3rd bit? If not, is there a reason why we can only take the 1st and last bits?

Could you explain how to convert from lz77 to huffman?

Could you explain how to convert from lz77 to huffman on the example in the below picture?
Easy:
In the first step your output is essentially 3 numbers:
prev index
number of characters to repeat
next character (be it ascii or unicode)
The algorithm demands that you specify a sliding window up front. That means you know how big (1) and (2) can be at most.
In other words, you know how many bits (1) and (2) will take up.
Since (3) is essentially also a character from a fixed length alphabet, you also know the bit-length of (3)
That means it's safe to simply concatenate them.
So, the output of the first algorithm can be thought of as outputting a bit-sequence, where every item in the sequence has a fixed length.
That's ideal for applying huffman.
Of course the specifics are not mentioned, and you can choose from a lot of options.
normalized huffman table
1 on left-branch vs 0 on left-branch
priorities when merging items of similar count
etc
So I can not readily explain the exact output values you are showing.
But I hope I can at least explain how to get from A to B.
You can't. The coding shown is, well, figurative. Not literal. The symbols A, B, and C are all coded to the single bit 0. Obviously that's not going to be very helpful on the decoding end.

Cryptographic Hash Function

I have an exam tomorrow on cryptography and came across an old exam question on hash functions and finding out the probability of collision of two hash values being the same, but I don't know how to calculate it. The question is:
If the hash value is a 20 bit output and allowable inputs must not exceed 2^64 bits, what is the probability of two randomly chosen values yielding a collision?
Was hoping someone could provide a solution.
Thanks.
Should be 1 / (2 ^ 20). (It should be independent of the length of the Input if you consider 2 randomly choosen inputs (and not ALL possible inputs), given the hash function is proper.) So I guess the additional Information about the length of the Input is just to make you crazy.

is SHA-512 collision resistant?

According to the books that i have read, it says that S.H.A(Secure Hash Algorithm) is collision resistant.But if the input space is a 1024 bit number and the output space is a 512 bit message digest then shouldn't it be colliding for
(2^1024)/(2^512) times? As the range is lesser than the domain being mapped there should have been collisions. please explain where i am going wrong.
The chance for a collision does not depend on the input size. The chance to a 512-bit hash collision is 1.4×10^77, see Probability table
Maybe your book has also mentioned the definition of collision resistance? It does not mean that no collisions are created (which is clearly not the case), but that given a hash you are not able to create a message easily that produces this hash.
a hash function H is collision resistant if it is hard to find two
inputs that hash to the same output; that is, two inputs a and b such
that H(a) = H(b), and a ≠ b
From Wikipedia
As you describe: Since the input space (arbitrary size) is larger than the output space (e.g. 512bit for sha512), there always exist collisions.
"Collision resistant" means, it is adequately unlikely for a collision to be found.
Your confusion is answered when considering how large the output space "512 bits" really is:
2^512 (the number of possible configurations of a 512 bit array) is of the order 10^154.
For comparison: The number of atoms in the visible universe is somewhere in the range of 10^80.
A million is 10^6.
So a million of our 'visible universes' has 10^86 atoms.
A million times a million universes has 10^92 atoms.
If you could store a single 512 bit value on a single atom, how many universes would you need to have all possible 512 bit has values stored?
Starting with a specific 512bit number (and assuming the has function is not broken), the probability p to obtain a collision is assuming you can produce new hashes with a rate R and have the total time of t to do this is:
p = R*t/(2^(512/2))
(The exponent is halved, see "birthday attach". The expected search space for a success is to find a collision in n bits is n/2.)
Let's plugin in some example numbers:
The has rate of the bitcoin network is currently about R = 200*10^15 / s (200 million terrahashes per second).
Consider the situation that since the beginning of the universe the bitcoin network's current hashing capacity would have been available for the sole purpose of finding a collision for a specific hash value, i.e. for an available time of t=13.787*10^9 years,
then the probability that a collision would have been found by now is about 7 × 10^-41 %
Again, it is hard to appreciate how small this number is.
Edit: A similar question with a good answer is found here: https://crypto.stackexchange.com/questions/89558/are-sha-256-and-sha-512-collision-resistant

what must the minimum number of bits in each word of the Little Man computer be?

I came across the follow question while reading a CS book, can someone please explain it to me? >"The Little Man computer can have ten operation codes (0-9) and address 100 words of storage (0-99). If binary numbers are to replace decimal numbers, what must the minimum number of bits in each word of the LMC be?"
Since you need to be able to distinguish 10 codes for operation, the minimum word size would have to be 4 bits. Using 4 bits, you can represent up to 2^4 = 16 possible codes (since each bit can be 0 or 1). Anything less (2^3 = 8) will not allow a separate binary number for each code.
The Little Man Computer is an architecture where one instruction is held in one word, therefore a word has to contain both the op code and the address. That means you have to hold 000 to 999 so my answer would be 10 bits. You could assume that the question implies the op code and address in separate fields - in that case you need 4 bits for the op code and 7 bits for the address making 11 in total.
Note that the LMC has a "jump if greater than or equal to zero" instruction and for this to mean anything you must be able to represent negative numbers - so that implies that memory has a sign bit. My own simulation allows -999 to +999 as numbers in memory.