RSA-OAEP : How do Cryptographic hash functions expand a number of bits? - cryptography

First off, this question is not really code related, but I am trying to understand what happens behind the code. I hope someone knows the answer to this one, because it has been troubling me for some time.
I am writing a program in C# that uses the RSA crypto service provider.
From what I can understand, the class uses SHA1 by default in its padding.
I have been trying to understand what actually happens during the padding, but I can't seem to get my head around a single step in the process.
The algorithm for OAEP that I am currently looking at is simply the Wikipedia one.
http://en.wikipedia.org/wiki/OAEP
The step that is troubling me is 3). I thought hash functions always returned a fixed number of bits (SHA1: 160 bits), so how can it simply expand the number of bits to n-k0, which with a standard 1024-bit key would be 864 bits?

I've never done anything with OAEP, but crypto hash functions (as described in step 3) can be stretched with a procedure like the one spelled out in http://en.wikipedia.org/wiki/PBKDF. Basically, to expand the number of output bits, you first repeat the hash with an incremented counter concatenated to the argument being hashed, then concatenate those results until you have enough bits. This technique doesn't add entropy to the result, but it does allow you to create a longer output bitstream.
From wikipedia:
If you want a key that's dklen long, and your crypto hash function U only outputs hlen bits:
DK = T_1 || T_2 || ... || T_(dklen/hlen)
T_i = F(Password, Salt, Iterations, i)
F(Password, Salt, Iterations, i) = U_1 ^ U_2 ^ ... ^ U_c
U_1 = PRF(Password, Salt || INT_msb(i))
U_2 = PRF(Password, U_1)
...
U_c = PRF(Password, U_(c-1))
(If you only need one iteration of the cryptographic hash function, c = 1, so you don't need the XOR operator ^, and for each i you only need to calculate U_1.)
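As a rough illustration of that stretching, Python's standard `hashlib.pbkdf2_hmac` can be asked for more output than a single SHA-1 digest provides. The password, salt, and iteration count below are made-up placeholder values, not anything taken from the question.

```python
import hashlib

# Placeholder inputs, chosen only for illustration.
password = b"correct horse"
salt = b"example-salt"
iterations = 1  # c = 1, so each block T_i is just U_1

# SHA-1 yields 20 bytes per block. Asking for 108 bytes (864 bits)
# makes PBKDF2 concatenate six blocks and truncate the last one.
short_key = hashlib.pbkdf2_hmac("sha1", password, salt, iterations, dklen=20)
long_key = hashlib.pbkdf2_hmac("sha1", password, salt, iterations, dklen=108)

print(len(long_key) * 8)            # number of output bits
print(long_key[:20] == short_key)   # the first block T_1 is shared
```

The longer key starts with the shorter one because each block depends only on the password, the salt, and its own block index.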

Specifically for OAEP, the recommendation is to use an algorithm called MGF1, which operates by repeatedly hashing a seed and a counter and concatenating the results together. The specification comes from RFC 2437.
From the RFC text, where Z is the seed and l is the length of the output:
3. For counter from 0 to ceil(l / hLen) - 1, do the following:
   a. Convert counter to an octet string C of length 4 with the primitive I2OSP: C = I2OSP (counter, 4)
   b. Concatenate the hash of the seed Z and C to the octet string T: T = T || Hash (Z || C)
4. Output the leading l octets of T as the octet string mask.
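The RFC steps above can be sketched in a few lines of Python. This is a minimal illustration using `hashlib.sha1` as the hash; the function name `mgf1` and the example seed are my own choices, and the real RSA crypto service provider may differ in details.

```python
import hashlib

def mgf1(seed: bytes, length: int, hash_func=hashlib.sha1) -> bytes:
    """Mask generation: concatenate Hash(seed || counter) until long enough."""
    output = b""
    counter = 0
    while len(output) < length:
        c = counter.to_bytes(4, "big")          # C = I2OSP(counter, 4)
        output += hash_func(seed + c).digest()  # T = T || Hash(Z || C)
        counter += 1
    return output[:length]                      # leading l octets of T

mask = mgf1(b"example seed", 108)  # 108 octets = 864 bits
print(len(mask))
```

So the 160-bit hash is never asked to return more than 160 bits; it is simply called six times with different counters and the outputs are joined.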

Related

Password entropy logarithm

I have a password of length 10 drawn from 78 unique characters. I know that the first two characters of the password must be digits (0-9). My calculation is:
E = log2(10^2) + log2(78^8) = 56.93
Is that right?
Yep, this is the correct calculation for information-theoretic entropy, provided each character is chosen uniformly at random.
Remember though, entropy is a measure of the uncertainty generated by a source, not a property of the generated bits themselves.
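The arithmetic is easy to check. A quick sketch, assuming every position is chosen uniformly and independently (the variable names are just for illustration):

```python
import math

digit_positions = 2   # first two characters: 10 choices each
other_positions = 8   # remaining characters: 78 choices each

entropy_bits = (digit_positions * math.log2(10)
                + other_positions * math.log2(78))
print(round(entropy_bits, 2))
```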

Reversing XOR Encryption not working?

So I have the following function to XOR a string, and it works, but when I pass the encrypted string back into it the output is not decrypted. My understanding is that to un-XOR something you simply XOR it again, but that seems to fail. Any ideas on why this is? Also, I'm storing the XOR result as hex.
Public Shared Function StringToHash(ByVal str As String) As Integer
    Dim hash As Integer = (AscW(Char.ToLower(str(0))) Xor &H4B9ACE2F) * &H1000193
    For i As Integer = 1 To str.Length - 1
        hash = (AscW(Char.ToLower(str(i))) Xor hash) * &H1000193
    Next
    hash = hash * &H1000193
    Return hash
End Function
As already indicated, this is about hashing, not encryption - two related but separate notions.
Although secure hashes are one-way, the above function is certainly not a secure hash. So you may be able to reverse it for very small input sizes.
Besides that, although hashes are usually not reversible, it may be possible to brute force the hash value.
This would only work if the input alphabet is limited (a few characters, a limited set of words, etc.), and it may return too many candidate inputs if the hash output size is small (which in this case it certainly is). But otherwise you just toss in some input and test whether it results in the same hash value.
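To make the brute-force idea concrete, here is a rough Python port of the function together with an exhaustive search over short lowercase inputs. It assumes the VB code runs with integer overflow checks disabled, so the multiplication wraps at 32 bits, and it shows values as unsigned bit patterns. The names `string_to_hash` and `brute_force` are mine.

```python
from itertools import product
import string

MASK32 = 0xFFFFFFFF  # assumption: 32-bit wraparound, as with overflow checks off

def string_to_hash(s: str) -> int:
    """Port of the question's StringToHash, kept as an unsigned 32-bit value."""
    h = ((ord(s[0].lower()) ^ 0x4B9ACE2F) * 0x1000193) & MASK32
    for ch in s[1:]:
        h = ((ord(ch.lower()) ^ h) * 0x1000193) & MASK32
    return (h * 0x1000193) & MASK32

def brute_force(target: int, max_len: int = 3,
                alphabet: str = string.ascii_lowercase) -> list:
    """Try every string up to max_len and keep those that hash to target."""
    matches = []
    for length in range(1, max_len + 1):
        for candidate in product(alphabet, repeat=length):
            text = "".join(candidate)
            if string_to_hash(text) == target:
                matches.append(text)
    return matches

print(brute_force(string_to_hash("abc")))
```

Feeding the hash output back into the function does not undo anything: every step mixes in a multiplication, so the function as a whole is not its own inverse the way a plain XOR with a key is.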

Efficient random permutation of n-set-bits

For the problem of producing a bit-pattern with exactly n set bits, I know of two practical methods, but they both have limitations I'm not happy with.
First, you can enumerate all of the possible word values which have that many bits set in a pre-computed table, and then generate a random index into that table to pick out a possible result. This has the problem that as the output size grows the list of candidate outputs eventually becomes impractically large.
Alternatively, you can pick n non-overlapping bit positions at random (for example, by using a partial Fisher-Yates shuffle) and set those bits only. This approach, however, computes a random state in a much larger space than the number of possible results. For example, it may choose the first and second bits out of three, or it might, separately, choose the second and first bits.
This second approach must consume more bits from the random number source than are strictly required. Since it is choosing n bits in a specific order when their order is unimportant, this means that it is making an arbitrary distinction between n! different ways of producing the same result, and consuming at least floor(log_2(n!)) more bits than are necessary.
Can this be avoided?
There is obviously a third approach of iteratively computing and counting off the legal permutations until a random index is reached, but that's simply a space-for-time trade-off on the first approach, and isn't directly helpful unless there is an efficient way to count off those n permutations.
clarification
The first approach requires picking a single random number between zero and w! / ((w-n)! * n!) (where w is the output size), as this is the number of possible solutions.
The second approach requires picking n random values between zero and w-1, zero and w-2, etc., and these have a product of w! / (w-n)!, which is n! times larger than the first approach.
This means that the random number source has been forced to produce bits to distinguish n! different results which are all equivalent. I'd like to know if there's an efficient method to avoid relying on this superfluous randomness. Perhaps by using an algorithm which produces an un-ordered list of bit positions, or by directly computing the nth unique permutation of bits.
Seems like you want a variant of Floyd's algorithm:
Algorithm to select a single, random combination of values?
Should be especially useful in your case, because the containment test is a simple bitmask operation. This will require only k calls to the RNG. In the code below, I assume you have randint(limit) which produces a uniform random from 0 to limit-1, and that you want k bits set in a 32-bit int:
mask = 0;
for (j = 32 - k; j < 32; ++j) {
    r = randint(j+1);
    b = 1u << r;                      /* unsigned, so bit 31 is safe */
    if (mask & b) mask |= (1u << j);  /* r already taken: use j instead */
    else mask |= b;
}
How many bits of entropy you need here depends on how randint() is implemented. If k > 16, run the loop with 32 - k instead and take the bitwise complement (~mask) of the result.
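For comparison, the same loop can be sketched in Python. Here `random.randrange` stands in for the assumed `randint(limit)`, and the function name `random_mask` and its `width` parameter are illustrative additions.

```python
import random

def random_mask(k: int, width: int = 32, rng=random) -> int:
    """Return a width-bit integer with exactly k bits set (Floyd's sampling)."""
    mask = 0
    for j in range(width - k, width):
        r = rng.randrange(j + 1)   # uniform in 0..j
        bit = 1 << r
        if mask & bit:
            mask |= 1 << j         # r already taken, so take j instead
        else:
            mask |= bit
    return mask

print(format(random_mask(5), "032b"))
```

Bit j can never be set before iteration j, which is why falling back to it on a collision always adds a new bit.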
Your alternative suggestion of generating a single random number representing one combination among the set (mathematicians would call this a rank of the combination) is simpler if you use colex order rather than lexicographic rank. This code, for example:
for (i = k; i >= 1; --i) {
    while ((b = binomial(n, i)) > r) --n;
    buf[i-1] = n;
    r -= b;
}
will fill the array buf[] with indices from 0 to n-1 for the k-combination at colex rank r. In your case, you'd replace buf[i-1] = n with mask |= (1 << n). The binomial() function is binomial coefficient, which I do with a lookup table (see this). That would make the most efficient use of entropy, but I still think Floyd's algorithm would be a better compromise.
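A Python sketch of the same unranking, building the bitmask directly. It uses `math.comb` (Python 3.8+) in place of the lookup table, and the name `unrank_colex` is my own.

```python
from math import comb

def unrank_colex(rank: int, width: int, k: int) -> int:
    """Return the bitmask of the k-combination at the given colex rank."""
    mask = 0
    n = width
    for i in range(k, 0, -1):
        # Find the largest n with C(n, i) <= rank.
        while comb(n, i) > rank:
            n -= 1
        mask |= 1 << n
        rank -= comb(n, i)
    return mask

masks = [unrank_colex(r, 5, 3) for r in range(comb(5, 3))]
print([format(m, "05b") for m in masks])
```

A convenient property of colex order is that the masks come out in increasing numeric order, so a uniform random rank maps to a uniform random k-bit pattern.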
[Expanding my comment:] If you only have a little raw entropy available, then use a PRNG to stretch it further. You only need enough raw entropy to seed a PRNG. Use the PRNG to do the actual shuffle, not the raw entropy. For the next shuffle reseed the PRNG with some more raw entropy. That spreads out the raw entropy and makes less of a demand on your entropy source.
If you know exactly the range of numbers you need out of the PRNG, then you can, carefully, set up your own LCG PRNG to cover the appropriate range while needing the minimum entropy to seed it.
ETA: In C++ there is a next_permutation() function. Try using that. See std::next_permutation Implementation Explanation for more.
Is this a theory problem or a practical problem?
You could still do the partial shuffle, but keep track of the order of the ones and forget the zeroes. There are log(k!) bits of unused entropy in their final order for your future consumption.
You could also just use the recurrence (n choose k) = (n-1 choose k-1) + (n-1 choose k) directly. Generate a random number between 0 and (n choose k)-1. Call it r. Iterate over all of the bits from the nth to the first. If we have to set j of the i remaining bits, set the ith if r < (i-1 choose j-1) and clear it, subtracting (i-1 choose j-1), otherwise.
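The recurrence-based selection could look like this in Python. It is a sketch of my reading of the description above; the name `unrank_by_recurrence` is hypothetical.

```python
from math import comb

def unrank_by_recurrence(r: int, n: int, k: int) -> int:
    """Map r in [0, C(n, k)) to an n-bit mask with k bits set."""
    mask = 0
    j = k                               # bits still to be set
    for i in range(n, 0, -1):           # i bits remain, highest first
        if j == 0:
            break
        with_bit = comb(i - 1, j - 1)   # patterns that set the i-th bit
        if r < with_bit:
            mask |= 1 << (i - 1)
            j -= 1
        else:
            r -= with_bit
    return mask

print([format(unrank_by_recurrence(r, 3, 2), "03b") for r in range(3)])
```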
Practically, I wouldn't worry about the couple of words of wasted entropy from the partial shuffle; generating a random 32-bit word with 16 bits set costs somewhere between 64 and 80 bits of entropy, and that's entirely acceptable. The growth rate of the required entropy is asymptotically worse than the theoretical bound, so I'd do something different for really big words.
For really big words, you might generate n independent bits that are 1 with probability k/n. This immediately blows your entropy budget (and then some), but it only uses linearly many bits. The number of set bits is tightly concentrated around k, though. For a further expected linear entropy cost, I can fix it up. This approach has much better memory locality than the partial shuffle approach, so I'd probably prefer it in practice.
I would use solution number 3, generate the i-th permutation.
But do you need to generate the first i-1 ones?
You can do it a bit faster than that with kind of divide and conquer method proposed here: Returning i-th combination of a bit array and maybe you can improve the solution a bit
Background
From the formula you have given - w! / ((w-n)! * n!) it looks like your problem set has to do with the binomial coefficient which deals with calculating the number of unique combinations and not permutations which deals with duplicates in different positions.
You said:
"There is obviously a third approach of iteratively computing and counting off the legal permutations until a random index is reached, but that's simply a space-for-time trade-off on the first approach, and isn't directly helpful unless there is an efficient way to count off those n permutations.
...
This means that the random number source has been forced to produce bits to distinguish n! different results which are all equivalent. I'd like to know if there's an efficient method to avoid relying on this superfluous randomness. Perhaps by using an algorithm which produces an un-ordered list of bit positions, or by directly computing the nth unique permutation of bits."
So, there is a way to efficiently compute the nth unique combination, or rank, from the k-indexes. A set of k-indexes identifies a unique combination. For example, let's say that the n choose k case of 4 choose 3 is taken. This means that there are a total of 4 numbers that can be selected (0, 1, 2, 3), which is represented by n, and they are taken in groups of 3, which is represented by k. The total number of unique combinations can be calculated as n! / (k! * (n-k)!). The rank of zero corresponds to the k-index of (2, 1, 0). Rank one is represented by the k-index group of (3, 1, 0), and so forth.
Solution
There is a formula that can be used to very efficiently translate between a k-index group and the corresponding rank without iteration. Likewise, there is a formula for translating between the rank and corresponding k-index group.
I have written a paper on this formula and how it can be seen from Pascal's Triangle. The paper is called Tablizing The Binomial Coefficient.
I have written a C# class which is in the public domain that implements the formula described in the paper. It uses very little memory and can be downloaded from the site. It performs the following tasks:
Outputs all the k-indexes in a nice format for any N choose K to a file. The K-indexes can be substituted with more descriptive strings or letters.
Converts the k-index to the proper lexicographic index or rank of an entry in the sorted binomial coefficient table. This technique is much faster than older published techniques that rely on iteration. It does this by using a mathematical property inherent in Pascal's Triangle and is very efficient compared to iterating over the entire set.
Converts the index in a sorted binomial coefficient table to the corresponding k-index. The technique used is also much faster than older iterative solutions.
Uses Mark Dominus's method to calculate the binomial coefficient, which is much less likely to overflow and works with larger numbers. This version returns a long value. There is at least one other method that returns an int. Make sure that you use the method that returns a long value.
The class is written in .NET C# and provides a way to manage the objects related to the problem (if any) by using a generic list. The constructor of this class takes a bool value called InitTable that when true will create a generic list to hold the objects to be managed. If this value is false, then it will not create the table. The table does not need to be created in order to use the 4 above methods. Accessor methods are provided to access the table.
There is an associated test class which shows how to use the class and its methods. It has been extensively tested with at least 2 cases and there are no known bugs.
The following tested example code demonstrates how to use the class and will iterate through each unique combination:
public void Test10Choose5()
{
    String S;
    int Loop;
    int N = 10;  // Total number of elements in the set.
    int K = 5;   // Total number of elements in each group.
    // Create the bin coeff object required to get all
    // the combos for this N choose K combination.
    BinCoeff<int> BC = new BinCoeff<int>(N, K, false);
    int NumCombos = BinCoeff<int>.GetBinCoeff(N, K);
    // The KIndexes array specifies the indexes for a lexicographic element.
    int[] KIndexes = new int[K];
    StringBuilder SB = new StringBuilder();
    // Loop through all the combinations for this N choose K case.
    for (int Combo = 0; Combo < NumCombos; Combo++)
    {
        // Get the k-indexes for this combination.
        BC.GetKIndexes(Combo, KIndexes);
        // Verify that the KIndexes returned can be used to retrieve the
        // rank or lexicographic order of the KIndexes in the table.
        int Val = BC.GetIndex(true, KIndexes);
        if (Val != Combo)
        {
            S = "Val of " + Val.ToString() + " != Combo Value of " + Combo.ToString();
            Console.WriteLine(S);
        }
        SB.Remove(0, SB.Length);
        for (Loop = 0; Loop < K; Loop++)
        {
            SB.Append(KIndexes[Loop].ToString());
            if (Loop < K - 1)
                SB.Append(" ");
        }
        S = "KIndexes = " + SB.ToString();
        Console.WriteLine(S);
    }
}
So, the way to apply the class to your problem is to treat the number of bits in the word as the total number of items. This would be n in the n! / (k! * (n - k)!) formula. To obtain k, or the group size, simply count the number of bits set to 1. You would have to create a list or array of the class objects for each possible k, which in this case would be 32. Note that the class does not handle N choose N, N choose 0, or N choose 1, so the code would have to check for those cases and return 1 for both the 32 choose 0 case and the 32 choose 32 case. For 32 choose 1, it would need to return 32.
If you need to use values not much larger than 32 choose 16 (the worst case for 32 items, which yields 601,080,390 unique combinations), then you can use 32-bit integers, which is how the class is currently implemented. If you need to use 64-bit integers, then you will have to convert the class to use 64-bit longs. The largest value that a signed long can hold is 9,223,372,036,854,775,807, which is 2^63 - 1. The worst case for n choose k when n is 64 is 64 choose 32. 64 choose 32 is 1,832,624,140,942,590,534, so a long value will work for all 64 choose k cases. If you need numbers bigger than that, then you will probably want to look into using some sort of big integer class. In C#, the .NET framework has a BigInteger class. If you are working in a different language, it should not be hard to port.
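These worst-case counts are easy to verify with `math.comb` (Python 3.8+). The limits below are those of signed 32-bit and 64-bit integers, which is what C#'s int and long are.

```python
from math import comb

INT32_MAX = 2**31 - 1   # largest C# int
INT64_MAX = 2**63 - 1   # largest C# long

worst_32 = comb(32, 16)
worst_64 = comb(64, 32)

print(worst_32, worst_32 <= INT32_MAX)
print(worst_64, worst_64 <= INT64_MAX)
```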
If you are looking for a very good PRNG, one of the fastest, most lightweight, and highest-quality options is the Tiny Mersenne Twister, or TinyMT for short. I ported the code over to C++ and C#. It can be found here, along with a link to the original author's C code.
Rather than using a shuffling algorithm like Fisher-Yates, you might consider doing something like the following example instead:
// Get 7 random cards.
ulong Card;
ulong SevenCardHand = 0;
for (int CardLoop = 0; CardLoop < 7; CardLoop++)
{
    do
    {
        // The card has a value of between 0 and 51. So, get a random value and
        // left shift it into the proper bit position.
        Card = (1UL << RandObj.Next(CardsInDeck));
    } while ((SevenCardHand & Card) != 0);
    SevenCardHand |= Card;
}
The above code is faster than any shuffling algorithm (at least for obtaining a subset of random cards) since it only works on 7 cards instead of 52. It also packs the cards into individual bits within a single 64 bit word. It makes evaluating poker hands much more efficient as well.
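The same retry-on-collision idea, sketched in Python for anyone who wants to experiment with it. The function name `random_hand` and its parameters are illustrative, with `random.randrange` standing in for `RandObj.Next`.

```python
import random

def random_hand(cards: int = 7, deck_size: int = 52, rng=random) -> int:
    """Pick distinct cards by retrying on collisions; one bit per card."""
    hand = 0
    for _ in range(cards):
        while True:
            card = 1 << rng.randrange(deck_size)  # card index 0..deck_size-1
            if (hand & card) == 0:                # not dealt yet
                break
        hand |= card
    return hand

print(format(random_hand(), "052b"))
```

Retries are rare when few cards are drawn from a large deck, but the expected number of retries grows as the number of cards approaches the deck size.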
As a side note, the best binomial coefficient calculator I have found that works with very large numbers (it accurately calculated a case that yielded over 15,000 digits in the result) can be found here.

vb xor checksum

This question may already have been asked but nothing on SO actually gave me the answer I need.
I am trying to reverse engineer someone else's vb.NET code and I am stuck with what a Xor is doing here. Here is 1 line of the body of a soap request that gets parsed (some values have been obscured so the checksum may not work in this case):
<HD>CHANGEDTHIS01,W-A,0,7753.2018E,1122.6674N, 0.00,1,CID_V_01*3B</HD>
and this is the snippet of vb code that checks it
LastStar = strValues(CheckLoop).IndexOf("*")
StrLen = strValues(CheckLoop).Length
TransCheckSum = Val("&h" + strValues(CheckLoop).Substring(LastStar + 1, (StrLen - (LastStar + 1))))
CheckSum = 0
For CheckString = 0 To LastStar - 1
    CheckSum = CheckSum Xor Asc(strValues(CheckLoop)(CheckString))
Next
If CheckSum <> TransCheckSum Then
    'error with the checksum
...
OK, I get it up to the For loop. I just need an explanation of what the Xor is doing and how that is used for the checksum.
Thanks.
PS: As a bonus, if anyone can provide a c# translation I would be most grateful.
Using Xor is a simple way to calculate a checksum. The idea is the same as when calculating a parity bit, except that eight parity bits are calculated, one per bit position, across all the bytes. More advanced algorithms like CRC and MD5 are often used to calculate checksums for more demanding applications.
The C# code would look like this:
string value = strValues[checkLoop];
int lastStar = value.IndexOf("*");
int transCheckSum = Convert.ToByte(value.Substring(lastStar + 1, 2), 16);
int checkSum = 0;
for (int checkString = 4; checkString < lastStar; checkString++) {
    checkSum ^= (int)value[checkString];
}
if (checkSum != transCheckSum) {
    // error with the checksum
}
I made some adjustments to the code to accommodate the translation to C#, along with a few changes that make sense. I declared the variables used, and used camel case rather than Pascal case for local variables. I use a local variable for the string, instead of getting it from the collection each time.
The VB Val method stops parsing when it finds a character that it doesn't recognise, so to use the framework methods I assumed that the length of the checksum is two characters, so that it can parse the string "3B" rather than "3B</HD>".
The loop starts at index 4 (the fifth character) to skip the leading "<HD>", which should logically not be part of the data that the checksum is calculated over. Note that the original VB loop starts at index 0, so this is an assumption about what the string contains.
In C# you don't need the Asc function to get the character code, you can just cast the char to an int.
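To see the Xor checksum in isolation, here is a small Python sketch. It assumes the record has already been stripped of its surrounding tags and that the checksum is two hex digits after the `*`; the function names and the sample record `AB*03` are made up for the example.

```python
def xor_checksum(payload: str) -> int:
    """Xor together the character codes of every character in payload."""
    checksum = 0
    for ch in payload:
        checksum ^= ord(ch)
    return checksum

def verify_record(record: str) -> bool:
    """Compare the computed checksum with the two hex digits after '*'."""
    star = record.index("*")
    transmitted = int(record[star + 1:star + 3], 16)
    return xor_checksum(record[:star]) == transmitted

# 'A' is 0x41 and 'B' is 0x42, so their Xor is 0x03.
print(verify_record("AB*03"))
print(verify_record("AB*04"))
```

Any single flipped bit in the payload flips the matching bit of the checksum, which is why this catches simple transmission errors; two errors in the same bit position cancel out and go unnoticed.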
The code is basically getting the character values and doing a Xor in order to check the integrity, you have a very nice explanation of the operation in this page, in the Parity Check section : http://www.cs.umd.edu/class/sum2003/cmsc311/Notes/BitOp/xor.html

RSA private exponent determination

My question is about RSA signing.
In case of RSA signing:
encryption -> y = x^d mod n,
decryption -> x = y^e mod n
x -> original message
y -> encrypted message
n -> modulus (1024 bit)
e -> public exponent
d -> private exponent
I know x, y, n and e. Knowing these can I determine d?
If you can factor n = p*q, then d*e ≡ 1 (mod m) where m = φ(n) = (p-1)*(q-1) (φ is Euler's totient function), in which case you can use the extended Euclidean algorithm to determine d from e. (d*e - k*m = 1 for some k)
All these are very easy to compute, except for the factoring, which is designed to be intractably difficult so that public-key encryption is a useful technique that cannot be decrypted unless you know the private key.
So, to answer your question in a practical sense, no, you can't derive the private key from the public key unless you can wait the hundreds or thousands of CPU-years to factor n.
Public-key encryption and decryption are inverse operations:
x = y^e mod n = (x^d)^e mod n = x^(d*e) mod n = x^(k*φ(n)+1) mod n = x * (x^φ(n))^k mod n = x mod n
where (x^φ(n))^k = 1 mod n because of Euler's theorem.
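A toy example with deliberately tiny primes shows the relationship. These numbers are chosen by me purely for illustration and offer no security; real keys use primes hundreds of digits long, which is exactly what makes the factoring step infeasible.

```python
# Toy parameters, far too small for real use.
p, q = 61, 53
n = p * q                  # modulus, public
phi = (p - 1) * (q - 1)    # only computable if you know p and q
e = 17                     # public exponent

d = pow(e, -1, phi)        # private exponent: d*e = 1 (mod phi), Python 3.8+

x = 65                     # message
y = pow(x, d, n)           # signing:   y = x^d mod n
recovered = pow(y, e, n)   # verifying: x = y^e mod n

print(d, y, recovered)
```

Without p and q there is no known shortcut to phi, and therefore none to d, even when x, y, n and e are all known.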
The answer is yes under two conditions. One, somebody factors n. Two, someone slips the algorithm a mickey and convinces the signer to use one of several possible special values for x.
Applied Cryptography pages 472 and 473 describe two such schemes. I don't fully understand exactly how they would work in practice. But the solution is to use an x that cannot be fully controlled by someone who wants to determine d (aka the attacker).
There are several ways to do this, and they all involve hashing x, fiddling the value of the hash in predictable ways to remove some undesirable properties, and then signing that value. The recommended techniques for doing this are called 'padding', though there is one very excellent technique that does not count as a padding method that can be found in Practical Cryptography.
No. Otherwise a private key would be of no use.