RSA algorithm calculations

I have been working through a network book and hit the RSA section.
Consider the RSA algorithm with p = 5 and q = 11.
So I get n = p*q = 55, right?
And z = (p-1) * (q-1) = 40.
I think I got this right, but the book is not very clear on how to calculate this.
The example in the book says that e = 3 but does not give a reason why. Is it because the author likes it, or is there another reason?
And how do I go about finding d so that de ≡ 1 (mod z) and d < 160?
Thanks for any help with this; it's a bit above me right now.

Your calculations of n and z are correct.
An RSA cryptosystem consists of three variables n, d and e. Variable e is the least important of the three, and is usually chosen arbitrarily to make computations simple; 3 and 65537 are the most common choices for e. The only requirement is that e is co-prime to the totient (z in your notation), which also forces e to be odd because the totient is even. Choosing a prime e makes this easy to check: a prime is co-prime to the totient unless it divides the totient, which is why 3 works here (3 does not divide 40). The reason that 3 and 65537 are frequently used for e is that they make the computation cheap; both numbers have only two 1-bits in their binary representation, so square-and-multiply exponentiation needs only one multiplication on top of the squarings (1 squaring for 3, 16 for 65537).
You can see an implementation of an RSA cryptosystem at my blog. If you poke around there, you will also find some other crypto-related stuff that may interest you.

What you are looking for is the extended Euclidean algorithm.
For an example, see Wikipedia or here.
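As a minimal sketch of how the extended Euclidean algorithm yields d for the numbers in the question (p = 5, q = 11, e = 3), something like the following should work. The function names extended_gcd and private_exponent are illustrative choices, not taken from the book or from any library.

```python
def extended_gcd(a, b):
    """Return (g, x, y) such that a*x + b*y == g == gcd(a, b)."""
    old_r, r = a, b
    old_x, x = 1, 0
    old_y, y = 0, 1
    while r != 0:
        quotient = old_r // r
        old_r, r = r, old_r - quotient * r
        old_x, x = x, old_x - quotient * x
        old_y, y = y, old_y - quotient * y
    return old_r, old_x, old_y


def private_exponent(e, z):
    """Return d in the range 0..z-1 with d*e congruent to 1 modulo z."""
    g, x, _ = extended_gcd(e, z)
    if g != 1:
        raise ValueError("e must be coprime to z")
    return x % z  # x may be negative, so reduce it into 0..z-1


p, q, e = 5, 11, 3
z = (p - 1) * (q - 1)        # 40
d = private_exponent(e, z)
print(d, (d * e) % z)        # 27 1
```

Here 3 * 27 = 81 = 2 * 40 + 1, so d = 27 satisfies both de ≡ 1 (mod z) and d < 160.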

Efficient random permutation of n-set-bits

For the problem of producing a bit-pattern with exactly n set bits, I know of two practical methods, but they both have limitations I'm not happy with.
First, you can enumerate all of the possible word values which have that many bits set in a pre-computed table, and then generate a random index into that table to pick out a possible result. This has the problem that as the output size grows the list of candidate outputs eventually becomes impractically large.
Alternatively, you can pick n non-overlapping bit positions at random (for example, by using a partial Fisher-Yates shuffle) and set those bits only. This approach, however, computes a random state in a much larger space than the number of possible results. For example, it may choose the first and second bits out of three, or it might, separately, choose the second and first bits.
This second approach must consume more bits from the random number source than are strictly required. Since it is choosing n bits in a specific order when their order is unimportant, this means that it is making an arbitrary distinction between n! different ways of producing the same result, and consuming at least floor(log_2(n!)) more bits than are necessary.
Can this be avoided?
There is obviously a third approach of iteratively computing and counting off the legal permutations until a random index is reached, but that's simply a space-for-time trade-off on the first approach, and isn't directly helpful unless there is an efficient way to count off those n permutations.
clarification
The first approach requires picking a single random number between zero and w! / ((w-n)! * n!) (where w is the output size), as this is the number of possible solutions.
The second approach requires picking n random values between zero and w-1, zero and w-2, etc., and these have a product of w! / (w-n)!, which is n! times larger than the first approach.
This means that the random number source has been forced to produce bits to distinguish n! different results which are all equivalent. I'd like to know if there's an efficient method to avoid relying on this superfluous randomness. Perhaps by using an algorithm which produces an un-ordered list of bit positions, or by directly computing the nth unique permutation of bits.

Seems like you want a variant of Floyd's algorithm:
Algorithm to select a single, random combination of values?
Should be especially useful in your case, because the containment test is a simple bitmask operation. This will require only k calls to the RNG. In the code below, I assume you have randint(limit) which produces a uniform random from 0 to limit-1, and that you want k bits set in a 32-bit int:
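#include <stdint.h>

uint32_t mask = 0;
for (int j = 32 - k; j < 32; ++j) {
    uint32_t r = randint(j + 1);              /* uniform in 0..j */
    uint32_t b = UINT32_C(1) << r;            /* unsigned shift: bit 31 is safe */
    if (mask & b) mask |= UINT32_C(1) << j;   /* r already taken, so take j */
    else mask |= b;
}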
How many bits of entropy you need here depends on how randint() is implemented. If k > 16, run the loop with 32 - k instead and take the bitwise complement (~mask) of the result.
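For readers who want to experiment with the loop above, here is a rough Python sketch of the same idea. It assumes random.randrange as a stand-in for the randint(limit) described in this answer, and the name random_mask and its width parameter are made up for illustration.

```python
import random


def random_mask(k, width=32, rng=random):
    """Return a width-bit integer with exactly k set bits (Floyd-style sampling)."""
    if not 0 <= k <= width:
        raise ValueError("k must be between 0 and width")
    mask = 0
    for j in range(width - k, width):
        r = rng.randrange(j + 1)      # uniform in 0..j, one RNG call per set bit
        bit = 1 << r
        if mask & bit:
            mask |= 1 << j            # position r is taken; j cannot be taken yet
        else:
            mask |= bit
    return mask


m = random_mask(5)
print(f"{m:032b}", bin(m).count("1"))
```

Each iteration sets exactly one new bit, so the result always has k bits set after k calls to the RNG.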
Your alternative suggestion of generating a single random number representing one combination among the set (mathematicians would call this a rank of the combination) is simpler if you use colex order rather than lexicographic rank. This code, for example:
for (i = k; i >= 1; --i) {
    while ((b = binomial(n, i)) > r) --n;
    buf[i-1] = n;
    r -= b;
}
will fill the array buf[] with indices from 0 to n-1 for the k-combination at colex rank r. In your case, you'd replace buf[i-1] = n with mask |= (1 << n). The binomial() function is binomial coefficient, which I do with a lookup table (see this). That would make the most efficient use of entropy, but I still think Floyd's algorithm would be a better compromise.
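The colex unranking loop can be tried out with a short Python sketch like the one below. It assumes math.comb (Python 3.8+) in place of the lookup-table binomial() mentioned above, and the function name colex_unrank_mask is hypothetical.

```python
from math import comb


def colex_unrank_mask(r, n, k):
    """Return the n-bit mask of the k-combination at colex rank r."""
    if not 0 <= r < comb(n, k):
        raise ValueError("rank out of range")
    mask = 0
    c = n
    for i in range(k, 0, -1):
        # Find the largest c with comb(c, i) <= r; comb(c, i) is 0 when c < i.
        while comb(c, i) > r:
            c -= 1
        mask |= 1 << c
        r -= comb(c, i)
    return mask


# All 4-choose-3 masks in colex order: 0111, 1011, 1101, 1110
print([format(colex_unrank_mask(r, 4, 3), "04b") for r in range(4)])
```

A single uniform draw of r in the range 0 to comb(n, k) - 1 then selects one mask, which is the entropy-optimal variant this answer describes.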

[Expanding my comment:] If you only have a little raw entropy available, then use a PRNG to stretch it further. You only need enough raw entropy to seed a PRNG. Use the PRNG to do the actual shuffle, not the raw entropy. For the next shuffle reseed the PRNG with some more raw entropy. That spreads out the raw entropy and makes less of a demand on your entropy source.
If you know exactly the range of numbers you need out of the PRNG, then you can, carefully, set up your own LCG PRNG to cover the appropriate range while needing the minimum entropy to seed it.
ETA: In C++ there is a std::next_permutation() function. Try using that. See "std::next_permutation Implementation Explanation" for more.

Is this a theory problem or a practical problem?
You could still do the partial shuffle, but keep track of the order of the ones and forget the zeroes. There are log(k!) bits of unused entropy in their final order for your future consumption.
You could also just use the recurrence (n choose k) = (n-1 choose k-1) + (n-1 choose k) directly. Generate a random number between 0 and (n choose k)-1. Call it r. Iterate over all of the bits from the nth to the first. If we have to set j of the i remaining bits, set the ith if r < (i-1 choose j-1) and clear it, subtracting (i-1 choose j-1), otherwise.
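The recurrence-based walk described in the previous paragraph might look roughly like this in Python. This is only a sketch: math.comb (Python 3.8+) supplies the binomial coefficients, and unrank_bits is an invented name.

```python
from math import comb


def unrank_bits(r, n, k):
    """Map r in 0..comb(n, k)-1 to an n-bit mask with exactly k set bits."""
    if not 0 <= r < comb(n, k):
        raise ValueError("r out of range")
    mask = 0
    j = k                                  # bits still to be set
    for i in range(n, 0, -1):              # i-th bit has index i - 1
        if j == 0:
            break                          # nothing left to set
        with_bit = comb(i - 1, j - 1)      # patterns that set the i-th bit
        if r < with_bit:
            mask |= 1 << (i - 1)
            j -= 1
        else:
            r -= with_bit                  # skip those patterns, leave bit clear
    return mask


print(sorted(unrank_bits(r, 4, 2) for r in range(comb(4, 2))))
```

Every value of r maps to a different mask, so one uniform r gives one uniformly chosen bit pattern.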
Practically, I wouldn't worry about the couple of words of wasted entropy from the partial shuffle; generating a random 32-bit word with 16 bits set costs somewhere between 64 and 80 bits of entropy, and that's entirely acceptable. The growth rate of the required entropy is asymptotically worse than the theoretical bound, so I'd do something different for really big words.
For really big words, you might generate n independent bits that are 1 with probability k/n. This immediately blows your entropy budget (and then some), but it only uses linearly many bits. The number of set bits is tightly concentrated around k, though. For a further expected linear entropy cost, I can fix it up. This approach has much better memory locality than the partial shuffle approach, so I'd probably prefer it in practice.

I would use solution number 3, generate the i-th permutation.
But do you need to generate the first i-1 ones?
You can do it a bit faster than that with the kind of divide-and-conquer method proposed here: "Returning i-th combination of a bit array". Maybe you can improve that solution a bit.

Background
From the formula you have given, w! / ((w-n)! * n!), it looks like your problem has to do with the binomial coefficient, which counts unique combinations (where order does not matter), not permutations (where the same elements in a different order count separately).
You said:
"There is obviously a third approach of iteratively computing and counting off the legal permutations until a random index is reached, but that's simply a space-for-time trade-off on the first approach, and isn't directly helpful unless there is an efficient way to count off those n permutations.
...
This means that the random number source has been forced to produce bits to distinguish n! different results which are all equivalent. I'd like to know if there's an efficient method to avoid relying on this superfluous randomness. Perhaps by using an algorithm which produces an un-ordered list of bit positions, or by directly computing the nth unique permutation of bits."
So, there is a way to efficiently compute the nth unique combination, or rank, from the k-indexes. The k-indexes refer to a unique combination. For example, let's say that the n choose k case of 4 choose 3 is taken. This means that there are a total of 4 numbers that can be selected (0, 1, 2, 3), which is represented by n, and they are taken in groups of 3, which is represented by k. The total number of unique combinations can be calculated as n! / (k! * (n-k)!). The rank of zero corresponds to the k-index of (2, 1, 0). Rank one is represented by the k-index group of (3, 1, 0), and so forth.
Solution
There is a formula that can be used to very efficiently translate between a k-index group and the corresponding rank without iteration. Likewise, there is a formula for translating between the rank and corresponding k-index group.
I have written a paper on this formula and how it can be seen from Pascal's Triangle. The paper is called Tablizing The Binomial Coeffieicent.
I have written a C# class which is in the public domain that implements the formula described in the paper. It uses very little memory and can be downloaded from the site. It performs the following tasks:
Outputs all the k-indexes in a nice format for any N choose K to a file. The K-indexes can be substituted with more descriptive strings or letters.
Converts the k-index to the proper lexicographic index or rank of an entry in the sorted binomial coefficient table. This technique is much faster than older published techniques that rely on iteration. It does this by using a mathematical property inherent in Pascal's Triangle and is very efficient compared to iterating over the entire set.
Converts the index in a sorted binomial coefficient table to the corresponding k-index. The technique used is also much faster than older iterative solutions.
Uses Mark Dominus method to calculate the binomial coefficient, which is much less likely to overflow and works with larger numbers. This version returns a long value. There is at least one other method that returns an int. Make sure that you use the method that returns a long value.
The class is written in .NET C# and provides a way to manage the objects related to the problem (if any) by using a generic list. The constructor of this class takes a bool value called InitTable that when true will create a generic list to hold the objects to be managed. If this value is false, then it will not create the table. The table does not need to be created in order to use the 4 above methods. Accessor methods are provided to access the table.
There is an associated test class which shows how to use the class and its methods. It has been extensively tested with at least 2 cases and there are no known bugs.
The following tested example code demonstrates how to use the class and will iterate through each unique combination:
// Requires: using System; using System.Text;
public void Test10Choose5()
{
    String S;
    int Loop;
    int N = 10;  // Total number of elements in the set.
    int K = 5;   // Total number of elements in each group.
    // Create the bin coeff object required to get all
    // the combos for this N choose K combination.
    BinCoeff<int> BC = new BinCoeff<int>(N, K, false);
    int NumCombos = BinCoeff<int>.GetBinCoeff(N, K);
    // The KIndexes array specifies the indexes for a lexicographic element.
    int[] KIndexes = new int[K];
    StringBuilder SB = new StringBuilder();
    // Loop through all the combinations for this N choose K case.
    for (int Combo = 0; Combo < NumCombos; Combo++)
    {
        // Get the k-indexes for this combination.
        BC.GetKIndexes(Combo, KIndexes);
        // Verify that the KIndexes returned can be used to retrieve the
        // rank or lexicographic order of the KIndexes in the table.
        int Val = BC.GetIndex(true, KIndexes);
        if (Val != Combo)
        {
            S = "Val of " + Val.ToString() + " != Combo Value of " + Combo.ToString();
            Console.WriteLine(S);
        }
        SB.Remove(0, SB.Length);
        for (Loop = 0; Loop < K; Loop++)
        {
            SB.Append(KIndexes[Loop].ToString());
            if (Loop < K - 1)
                SB.Append(" ");
        }
        S = "KIndexes = " + SB.ToString();
        Console.WriteLine(S);
    }
}
So, the way to apply the class to your problem is by treating the word size in bits as the total number of items. This would be n in the n! / (k! * (n-k)!) formula. To obtain k, or the group size, simply count the number of bits set to 1. You would have to create a list or array of the class objects for each possible k, which in this case would be 32. Note that the class does not handle N choose N, N choose 0, or N choose 1, so the code would have to check for those cases and return 1 for both the 32 choose 0 case and the 32 choose 32 case. For 32 choose 1, it would need to return 32.
If you need to use values not much larger than 32 choose 16 (the worst case for 32 items, which yields 601,080,390 unique combinations), then you can use 32 bit integers, which is how the class is currently implemented. If you need to use 64 bit integers, then you will have to convert the class to use 64 bit longs. The largest value that a C# long can hold is 9,223,372,036,854,775,807, which is 2^63 - 1 (a ulong goes up to 2^64 - 1 = 18,446,744,073,709,551,615). The worst case for n choose k when n is 64 is 64 choose 32, which is 1,832,624,140,942,590,534, so a long value will work for all 64 choose k cases. If you need numbers bigger than that, then you will probably want to look into using some sort of big integer class. In C#, the .NET framework has a BigInteger class. If you are working in a different language, it should not be hard to port.
If you are looking for a very good PRNG, one of the fastest and most lightweight with high quality output is the Tiny Mersenne Twister, or TinyMT for short. I ported the code over to C++ and C#. It can be found here, along with a link to the original author's C code.
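The sizes quoted above are easy to double-check with exact integer arithmetic. The snippet below is just a sanity check using Python's math.comb (Python 3.8+), and the limit constants are written out by hand as the signed 32-bit and 64-bit maxima.

```python
from math import comb

INT32_MAX = 2**31 - 1    # largest signed 32-bit value
INT64_MAX = 2**63 - 1    # largest signed 64-bit value

c32 = comb(32, 16)       # worst case for a 32-bit word
c64 = comb(64, 32)       # worst case for a 64-bit word

print(c32, c32 <= INT32_MAX)   # 601080390 True
print(c64, c64 <= INT64_MAX)   # 1832624140942590534 True
```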
Rather than using a shuffling algorithm like Fisher-Yates, you might consider doing something like the following example instead:
// Get 7 random cards.
ulong Card;
ulong SevenCardHand = 0;
for (int CardLoop = 0; CardLoop < 7; CardLoop++)
{
    do
    {
        // The card has a value of between 0 and 51. So, get a random value and
        // left shift it into the proper bit position.
        Card = (1UL << RandObj.Next(CardsInDeck));
    } while ((SevenCardHand & Card) != 0);
    SevenCardHand |= Card;
}
The above code is faster than any shuffling algorithm (at least for obtaining a subset of random cards) since it only works on 7 cards instead of 52. It also packs the cards into individual bits within a single 64 bit word. It makes evaluating poker hands much more efficient as well.
As a side note, the best binomial coefficient calculator I have found that works with very large numbers (it accurately calculated a case that yielded over 15,000 digits in the result) can be found here.

RSA private exponent determination

My question is about RSA signing.
In case of RSA signing:
encryption -> y = x^d mod n,
decryption -> x = y^e mod n
x -> original message
y -> encrypted message
n -> modulus (1024 bit)
e -> public exponent
d -> private exponent
I know x, y, n and e. Knowing these can I determine d?

If you can factor n = p*q, then d*e ≡ 1 (mod m) where m = φ(n) = (p-1)*(q-1) (φ is Euler's totient function), in which case you can use the extended Euclidean algorithm to determine d from e (d*e - k*m = 1 for some k).
All these are very easy to compute, except for the factoring, which is designed to be intractably difficult so that public-key encryption is a useful technique that cannot be decrypted unless you know the private key.
So, to answer your question in a practical sense, no, you can't derive the private key from the public key unless you can wait the hundreds or thousands of CPU-years to factor n.
Public-key encryption and decryption are inverse operations:
x = y^e mod n = (x^d)^e mod n = x^(de) mod n = x^(kφ(n)+1) mod n = x * (x^φ(n))^k mod n = x mod n
where (x^φ(n))^k ≡ 1 (mod n) because of Euler's theorem.
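The identity above can be checked numerically on a toy key. The sketch below assumes illustrative values p = 5, q = 11 and e = 3 (far too small for real use) and relies on Python 3.8+ for pow(e, -1, phi), which computes the modular inverse.

```python
from math import gcd

p, q, e = 5, 11, 3              # toy values, for illustration only
n = p * q                       # 55
phi = (p - 1) * (q - 1)         # 40
d = pow(e, -1, phi)             # modular inverse of e, here 27
k = (d * e - 1) // phi          # d*e = k*phi + 1, here k = 2

# Euler's theorem: x^phi is 1 modulo n whenever x is coprime to n.
euler_ok = all(pow(x, phi, n) == 1 for x in range(1, n) if gcd(x, n) == 1)

# Signing with d and verifying with e returns the original x.
round_trip_ok = all(pow(pow(x, d, n), e, n) == x for x in range(n))

print(d, k, euler_ok, round_trip_ok)   # 27 2 True True
```

Note that d was only obtainable here because the factors p and q were known, which is exactly the part an attacker does not have for a 1024-bit n.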

The answer is yes under two conditions. One, somebody factors n. Two, someone slips the algorithm a mickey and convinces the signer to use one of several possible special values for x.
Applied Cryptography pages 472 and 473 describe two such schemes. I don't fully understand exactly how they would work in practice. But the solution is to use an x that cannot be fully controlled by someone who wants to determine d (aka the attacker).
There are several ways to do this, and they all involve hashing x, fiddling the value of the hash in predictable ways to remove some undesirable properties, and then signing that value. The recommended techniques for doing this are called 'padding', though there is one very excellent technique that does not count as a padding method that can be found in Practical Cryptography.

No. Otherwise a private key would be of no use.

Generator G's requirement to be a primitive root modulo p in the Diffie Hellman algorithm

Having searched, I've found myself confused by the use of P and G in the Diffie Hellman algorithm. There is a requirement that P is prime, and G is a primitive root of P.
I understand the security is based on the difficulty of factoring the result of two very large prime numbers, so I have no issue with that. However, there appears to be little available information on the purpose of G being a primitive root of P. Can anyone answer why this requirement exists (with references if possible)? Does it just increase the security? Given that shared keys can be created with apparently any combination of p and g, even ones that are not prime, I find this intriguing. It can surely only be for security? If so, how does it increase it?
Thanks in advance
Daniel

If g is not a primitive root of p, g will only generate a proper subgroup of the multiplicative group of GF(p). This has consequences for the security properties of the system: the security of the system will only be proportional to the order of g in GF(p) instead of proportional to the full order of the group.
To take a small example: select p=13 and g=3.
The order of 3 in GF(13) is 3 (3^1=3, 3^2=9, 3^3=1).
Following the usual steps of Diffie-Hellman, Alice and Bob should each select integers a, b between 1 and p-1 and calculate A = g^a and B = g^b respectively. To brute force this, an attacker should expect to try all possible values of a (or b) between 1 and p-1 until he finds a value that yields A (or B). But since g is not a primitive root modulo p, he only needs to try the values 1, 2 and 3 in order to find a solution a' so that A = g^a'. And the secret is s = g^(ab) = (g^a)^b = (g^a')^b = g^(a'b) = (g^b)^a' = B^a', which the attacker can now calculate.
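The small example can be replayed in a few lines of Python. The secret exponents a = 7 and b = 11 below are arbitrary values picked for illustration; only p = 13 and g = 3 come from the example above.

```python
p, g = 13, 3

# Everything g can generate modulo p: only three values instead of twelve.
subgroup = sorted({pow(g, x, p) for x in range(1, p)})

a, b = 7, 11                     # hypothetical secrets of Alice and Bob
A = pow(g, a, p)                 # Alice's public value
B = pow(g, b, p)                 # Bob's public value
shared = pow(B, a, p)            # what Alice and Bob both compute

# The attacker sees A and B and searches for the smallest a' with g^a' == A.
a_guess = next(x for x in range(1, p) if pow(g, x, p) == A)
recovered = pow(B, a_guess, p)

print(subgroup, a_guess, recovered == shared)   # [1, 3, 9] 1 True
```

The attacker never learns a itself, only an equivalent exponent a' within the order of g, and that is enough to compute the shared secret.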

There is no requirement that the generator g used for Diffie-Hellman is a primitive root nor is this even a common choice. Much more popular is to choose g such that it generates a prime order subgroup. I.e. the order of g is a prime q, which is a large prime factor of p-1.
For example the Diffie-Hellman groups proposed for IKE have been chosen such that p is a safe prime and g generates the subgroup of order (p-1)/2.
One motivation for choosing g as a generator of a big prime order subgroup is that this allows one to make the decisional Diffie-Hellman assumption. This assumption does not hold if g is a primitive root, and that makes the analysis of the implemented protocols somewhat harder.

The security of Diffie-Hellman is not based on the difficulty of factoring. It is based on the (assumed) difficulty of calculating general discrete logarithms.
g must be a primitive root of p for the algorithm to be correct and usable. It ensures that for every exponent 0 <= x < p-1, there is a distinct value of g^x mod p. That is, it ensures that g can "generate" every nonzero value in the finite field.

Why RSA Decryption process takes longer time than the Encryption process?

I have some idea that it is due to some complex calculation, but I want to know what exactly happens that takes longer than the corresponding encryption process. Any link to a webpage or paper would be of great help.
Thanks
Thanks for the answers. One more doubt: what about signing and verification? Will this time difference be there for signing and verification also? E.g. does signing require more time than verification?

Let's call n, d and e the RSA modulus, private exponent and public exponent, respectively. The RSA decryption time is proportional to (log d)(log n)^2 (i.e. quadratic in the length of the modulus, and linear in the length of the private exponent). Similarly, the RSA encryption time is proportional to (log e)(log n)^2. The private key holder also knows the factorization of n, which can be used to speed up private key operations by a factor of about 4 (with the Chinese Remainder Theorem). For details on the involved algorithms, see the Handbook of Applied Cryptography, especially chapter 14 ("Efficient Implementation").
For proper security, the private exponent (d) must be big; it has been shown that if its length is smaller than 29% of the length of the modulus (n), then the private key can be reconstructed. We do not know the minimum length that avoids such weaknesses, so in practice d will have about the same length as n. This means that decryption time will be about cubic in the length of n.
The same provisions do not apply to the public exponent (e), which can be as small as wished for, as long as it complies with the RSA rules (e must be relatively prime to r-1 for all prime factors r of n). So it is customary that a very small e is chosen. It is so customary that there are widely deployed implementations that cannot handle big public exponents. For instance, the RSA implementation in Windows' CryptoAPI (the one used e.g. by Internet Explorer when connected to an HTTPS site with an RSA server certificate) cannot process an RSA public key if e does not fit in 32 bits. e=3 is the best possible, but e=65537 is traditional (it is a kind of historical blunder, because a very small exponent can induce a perceived weakness if RSA is used without its proper and standard padding, something which should never be done anyway). 65537 is a 17-bit long integer, whereas a typical length for n and d will be 1024 bits or more. This makes public-key operations (message encryption, signature verification) much faster than private-key operations (message decryption, signature generation).
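To make the cost difference concrete, here is a small sketch that counts the modular squarings and multiplications done by plain left-to-right square-and-multiply. The function modexp_counted is a made-up helper, and the modulus and the 1024-bit exponent below are arbitrary stand-ins rather than a real key.

```python
def modexp_counted(base, exponent, modulus):
    """Left-to-right square-and-multiply.

    Returns (result, squarings, multiplications).
    """
    if exponent < 1:
        raise ValueError("exponent must be positive")
    result = base % modulus              # accounts for the leading 1 bit
    squarings = multiplications = 0
    for bit in bin(exponent)[3:]:        # skip '0b' and the leading 1 bit
        result = result * result % modulus
        squarings += 1
        if bit == "1":
            result = result * base % modulus
            multiplications += 1
    return result, squarings, multiplications


modulus = 2**127 - 1                     # arbitrary modulus for the demo
print(modexp_counted(5, 65537, modulus)[1:])          # (16, 1)
big_d = (1 << 1023) | 0x5555555555555555              # stand-in 1024-bit exponent
print(modexp_counted(5, big_d, modulus)[1:])          # (1023, 32)
```

A public exponent of 65537 costs 17 modular operations in total, while an exponent as long as the modulus costs over a thousand, which is the gap the answer describes.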

In theory, it doesn't have to be. The encryption and decryption algorithms are essentially identical. Given:
d = decryption key
e = encryption key
n = modulus (product of primes)
c = encrypted code group
m = plaintext code group
Then:
Encryption c_i = m_i^e (mod n)
Decryption m_i = c_i^d (mod n)
The normal algorithm for raising to a power is iterative, so the time taken depends on the size of the exponent. In most cases, the pair works out with the decryption key being (usually considerably) larger than the encryption key.
It is possible to reverse that though. Just for a toy example, consider:
p=17
q=23
n=391
Here's a list of some valid encryption/decryption key pairs for this particular pair of primes:
e = 17, d = 145
e = 19, d = 315
e = 21, d = 285
e = 23, d = 199
e = 25, d = 169
e = 27, d = 339
e = 29, d = 85
e = 31, d = 159
e = 35, d = 171
e = 37, d = 333
e = 39, d = 343
e = 41, d = 249
e = 43, d = 131
e = 45, d = 133
e = 47, d = 15
e = 49, d = 273
e = 51, d = 283
e = 53, d = 93
e = 57, d = 105
e = 59, d = 179
Out of those 20 key pairs, only one has a decryption key smaller than the encryption key. In the other cases, the decryption key ranges from just under twice as big to almost 17 times as large. Of course, when the modulus is tiny like this, it's quick and easy to generate a lot of key pairs, so finding a small decryption key would be fairly easy -- with a real RSA key, however, it's not quite so trivial, and we generally just accept the first pair we find. As you can see from the list above, in that case, you're quite likely to end up with a decryption key that's considerably larger than your encryption key, and therefore decryption will end up slower than encryption. When working with ~100 digit numbers, we'd have to be quite patient to find a pair for which decryption was going to be (even close to) as fast as encryption.
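The table above can be regenerated with a few lines of Python. This sketch assumes Python 3.8+ (for the modular inverse via pow) and simply walks the odd candidates for e from 17 to 59.

```python
from math import gcd

p, q = 17, 23
n = p * q                        # 391
phi = (p - 1) * (q - 1)          # 352

# Every odd e from 17 to 59 that is coprime to phi, paired with its inverse d.
pairs = [(e, pow(e, -1, phi)) for e in range(17, 60, 2) if gcd(e, phi) == 1]

for e, d in pairs:
    print(f"e = {e}, d = {d}")

smaller_d = [(e, d) for e, d in pairs if d < e]
print(len(pairs), smaller_d)     # 20 [(47, 15)]
```

The candidates 33 and 55 are skipped because they share the factor 11 with 352.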

The encryption exponent is usually chosen to be a prime of the form 2^n+1 (17, 65537), which requires relatively few multiplication operations. The decryption exponent will be a much larger number as a consequence, and thus take more work to compute.

There are two factors involved in this:
On the one hand, the public exponent can be chosen to be a small number with only two 1-bits (usually 3, 17 or 65537). This means the RSA encryption operation can be done with a few modular squarings and a single modular multiplication. This cannot be reversed: if you force the private exponent to be a small number, the security of the system is broken.
On the other hand, the holder of the private key can store some precalculated values derived from the original primes. With those he can use the CRT algorithm to replace the single exponentiation modulo an n-bit number with two exponentiations modulo n/2-bit numbers. This is approximately four times faster than the naive way.
So for RSA key pairs with random public exponents, private key operations can actually be faster. But the effect of choosing a small public exponent is much greater than the effect of the faster algorithm, so encryption is faster in practice.
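A minimal sketch of the CRT shortcut, using Garner's recombination, might look like this. The toy key (p = 17, q = 23, e = 17, d = 145) is borrowed from another answer in this thread purely for illustration, the function name rsa_private_crt is invented, and pow(q, -1, p) needs Python 3.8+.

```python
def rsa_private_crt(c, p, q, d):
    """Compute c^d mod (p*q) using two half-size exponentiations."""
    dp = d % (p - 1)                 # exponent reduced for the prime p
    dq = d % (q - 1)                 # exponent reduced for the prime q
    q_inv = pow(q, -1, p)            # precomputable: inverse of q modulo p
    m1 = pow(c, dp, p)
    m2 = pow(c, dq, q)
    h = (q_inv * (m1 - m2)) % p      # Garner's recombination step
    return m2 + h * q


p, q, e, d = 17, 23, 17, 145         # toy key, for illustration only
n = p * q
message = 123
ciphertext = pow(message, e, n)
print(rsa_private_crt(ciphertext, p, q, d), pow(ciphertext, d, n))   # 123 123
```

In a real implementation dp, dq and q_inv are stored with the private key, so only the two small exponentiations and the recombination are done per operation.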

RSA Laboratories describes why pretty well
In practical applications, it is common to choose a small public exponent for the public key.
...
With the typical modular exponentiation algorithms used to implement the RSA algorithm, public key operations take O(k^2) steps, private key operations take O(k^3) steps

How much longer? Do you have any exact details?
Anyway, it makes sense that decryption is more complicated than encryption, since the encryption is not symmetric, like 123 => abc and abc => 123.
For more details I suggest starting here.
To read about how the calculation works, this article seems to be a very good one: http://www.di-mgt.com.au/rsa_alg.html

In short "multiply = easy, factor = hard".
Take a look at (http://en.wikipedia.org/wiki/RSA#Encryption) which references optimizations in exponentiation (http://en.wikipedia.org/wiki/Exponentiation_by_squaring#Further_applications)
The best resource I found was the following lecture on cryptography from Princeton (http://www.cs.princeton.edu/courses/archive/spr05/cos126/lectures/22.pdf)

d and e are multiplicative inverses modulo phi(n). That means that it doesn't matter which of the two you choose for encryption, and which one for decryption. You just choose once before encryption. If you want fast decryption then you choose the bigger number for encryption. It's that simple.

Why are we using i as a counter in loops?

I know this might seem like an absolutely silly question to ask, yet I am too curious not to ask...
Why did "i" and "j" become THE variables to use as counters in most control structures?
Although common sense tells me they are just like X, which is used for representing unknown values, I can't help but think that there must be a reason why everyone gets taught the same way over and over again.
Is it because it is actually recommended for best practices, or a convention, or does it have some obscure reason behind it?
Just in case, I know I can give them whatever name I want and that variables names are not relevant.

It comes ultimately from mathematics: the summation notation traditionally uses i for the first index, j for the second, and so on. Example (from http://en.wikipedia.org/wiki/Summation):
It's also used that way for collections of things, like if you have a bunch of variables x_1, x_2, ..., x_n, then an arbitrary one will be known as x_i.
As for why it's that way, I imagine SLaks is correct and it's because I is the first letter in Index.

I believe it dates back to Fortran. Variables starting with I through N were integer by default, the others were real. This meant that I was the first integer variable, and J the second, etc., so they fell towards use in loops.

Mathematicians were using i,j,k to designate integers in algebra (subscripts, series, summations etc) long before (e.g 1836 or 1816) computers were around (this is the origin of the FORTRAN variable type defaults). The habit of using letters from the end of the alphabet (...,x,y,z) for unknown variables and from the beginning (a,b,c...) for constants is generally attributed to Rene Descartes, (see also here) so I assume i,j,k...n (in the middle of the alphabet) for integers is likely due to him too.

i = integer
Comes from Fortran where integer variables had to start with the letters I through N and real variables started with the other letters. Thus I was the first and shortest integer variable name. Fortran was one of the earliest programming languages in widespread use and the habits developed by programmers using it carried over to other languages.
EDIT: I have no problem with the answer that it derives from mathematics. Undoubtedly that is where the Fortran designers got their inspiration. The fact is, for me anyway, when I started to program in Fortran we used I, J, K, ... for loop counters because they were short and the first legally allowed variable names for integers. As a sophomore in H.S. I had probably heard of Descartes (and a very few others), but made very little connection to mathematics when programming. In fact, the first course I took was called "Fortran for Business" and was taught not by the math faculty, but the business/econ faculty.
For me, at least, the naming of variables had little to do with mathematics, but everything due to the habits I picked up writing Fortran code that I carried into other languages.

i stands for Index.
j comes after i.

These symbols were used as matrix indexes in mathematics long before electronic computers were invented.

I think it's most likely derived from index (in the mathematical sense) - it's used commonly as an index in sums or other set-based operations, and most likely has been used that way since before there were programming languages.

There's a preference in maths for using consecutive letters in the alphabet for "anonymous" variables used in a similar way. Hence, not just "i, j, k", but also "f, g, h", "p, q, r", "x, y, z" (rarely with "u, v, w" prepended), and "α, β, γ".
Now "f, g, h" and "x, y, z" are not used freely: the former is for functions, the latter for dimensions. "p, q, r" are also often used for functions.
Then there are other constraints on available sequences: "l" and "o" are avoided, because they look too much like "1" and "0" in many fonts. "t" is often used for time, "d & δ" for differentials, and "a, s, m, v" for the physical measures of acceleration, displacement, mass, and velocity. That leaves not so many gaps of three consecutive letters without unwanted associations in mathematics for indices.
Then, as several others have noticed, conventions from mathematics had a strong influence on early programming conventions, and "α, β, γ" weren't available in many early character sets.

I found another possible answer that could be that i, j, and k come from Hamilton's Quaternions.
Euler picked i for the imaginary unit.
Hamilton needed two more square roots of -1:
i^2 = j^2 = k^2 = ijk = -1
Hamilton was really influential, and quaternions were the standard way to do 3D analysis before 1900. By then, mathematicians were used to thinking of (ijk) as a matched set.
Vector calculus replaced quaternionic analysis in the 1890s because it was a better way to write Maxwell's equations. But people tended to write vector quantities like this: (3i-2j+k) instead of (3,-2,1). So (ijk) became the standard basis vectors in R^3.
Finally, physicists started using group theory to describe symmetries in systems of differential equations. So (ijk) started to connote "vectors that get swapped around by permutation groups," then drifted towards "index-like things that take on all possible values in some specified set," which is basically what they mean in a for loop.
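Hamilton's relations quoted above can be checked mechanically. The following is only a minimal sketch: it represents a quaternion a + bi + cj + dk as the tuple (a, b, c, d) and multiplies with the standard Hamilton product. The names qmul, I, J, K and MINUS_ONE are made up for this illustration.

```python
def qmul(p, q):
    """Hamilton product of two quaternions given as (a, b, c, d) tuples."""
    a1, b1, c1, d1 = p
    a2, b2, c2, d2 = q
    return (
        a1 * a2 - b1 * b2 - c1 * c2 - d1 * d2,  # real part
        a1 * b2 + b1 * a2 + c1 * d2 - d1 * c2,  # i component
        a1 * c2 - b1 * d2 + c1 * a2 + d1 * b2,  # j component
        a1 * d2 + b1 * c2 - c1 * b2 + d1 * a2,  # k component
    )

# The three imaginary units and the real number -1 (illustrative names).
I = (0, 1, 0, 0)
J = (0, 0, 1, 0)
K = (0, 0, 0, 1)
MINUS_ONE = (-1, 0, 0, 0)

# ii = jj = kk = ijk = -1
assert qmul(I, I) == qmul(J, J) == qmul(K, K) == MINUS_ONE
assert qmul(qmul(I, J), K) == MINUS_ONE

# The product is not commutative: ij = k but ji = -k.
print(qmul(I, J), qmul(J, I))
```

The last line shows why (ijk) behave as a matched set: multiplying any two of the units in order gives the third, up to sign.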

By elimination (a little biased):
a seems an array
b seems another array
c seems a language name
d seems another language name
e seems exception
f looks bad in combination with "for" (for f, a pickup?)
g seems g force
h seems height
i seems an index
j seems i (another index)
k seems a constant k
l seems a number one (1)
m seems a matrix
n seems a node
o seems an output
p sounds like a pointer
q seems a queue
r seems a return value
s seems a string
t looks like time
u reserved for UVW mapping or electric phase
v reserved for UVW mapping or electric phase or a vector
w reserved for UVW mapping or electric phase or a weight
x seems an axis (or an unknown variable)
y seems an axis
z seems a third axis

One sunny afternoon, Archimedes was pondering (as was usual for sunny afternoons) and ran into his buddy Eratosthenes.
Archimedes said, "Archimedes to Eratosthenes greeting! I'm trying to come up with a solution to the ratio of several spherical rigid bodies in equilibrium. I wish to iterate over these bodies multiple times, but I'm having a frightful time keeping track of how many iterations I've done!"
Eratosthenes said, "Why Archimedes, you ripe plum of a kidder, you could merely mark successive rows of lines in the sand, each keeping track of the number of iterations you've done within each iteration!"
Archimedes cried out to the world that his great friend was undeniably a shining beacon of intelligence for coming up with such a simple solution. But Archimedes remarked that he likes to walk in circles around his sand pit while he ponders. Thus, there was risk of losing track of which row was on top, and which was on bottom.
"Perhaps I should mark these rows with a letter of the alphabet just off to the side so that I will always know which row is which! What think you of that?" he asked, then added, "But Eratosthenes... whatever letters shall I use?"
Eratosthenes was sure he didn't know which letters would be best, and said as much to Archimedes. But Archimedes was unsatisfied and continued to prod the poor librarian to choose, at least, the two letters that he would require for his current sphere equilibrium solution.
Eratosthenes, finally tired of the incessant request for two letters, yelled, "I JUST DON'T KNOW!!!"
So Archimedes chose the first two letters in Eratosthenes' exclamatory sentence, and thanked his friend for the contribution.
These symbols were quickly adopted by ancient Greek Java developers, and the rest is, well... history.

I think it's because a lot of loops use an int variable to do the counting, like
for (int i = 0; etc
and when you type, you actually speak it out in your head (like when you read), so in your mind you say 'int....'
and when you have to make up a letter right after that 'int....', you say and type 'i' because that is the first letter you think of when you've just said 'int'.
It is like spelling a word for kids who are learning to read, where you attach a name to each letter:
WORD spells William W, Ok O, Ruby R, Done D
So you say Int i, Double d, Float f, String s, etc., based on the first letter.
And j is used because once you have used i for an int, j follows right after it.

I think it's a combination of the other mentioned reasons:
For starters, 'i' was commonly used by mathematicians in their notation, and in the early days of computing with languages that weren't binary (i.e. had to be parsed and lexed in some fashion), the vast majority of users of computers were also mathematicians (... and scientists and engineers), so the notation fell into use in computer languages for programming loops, and has kind of just stuck around ever since.
Combine this with the fact that screen space in those very early days was very limited, as was memory, and it made sense to keep variable names short.

Possibly historical?
FORTRAN, arguably the first high-level language, treated variables beginning with i, j, k, l, m, n as integers by default, and loops could only be controlled by an integer variable. So the convention continues?
eg:
do 100 i= j,100,5
....
100 continue
....
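For readers who don't know Fortran, the loop above can be sketched in Python. This is only an illustrative translation: the starting value j = 10 and the list name visited are made up for the example, and the range stops at 101 because the upper bound of a Fortran DO loop is inclusive.

```python
# Rough Python equivalent of the Fortran loop:  do 100 i = j,100,5
# j = 10 is a hypothetical starting value chosen for illustration.
j = 10

visited = []
# Fortran's upper bound (100) is inclusive, so the Python range stops at 101.
for i in range(j, 100 + 1, 5):
    visited.append(i)  # stands in for the elided loop body

print(visited[:3], "...", visited[-1])
```

Here i takes the values j, j+5, j+10 and so on, for as long as it does not exceed 100.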

i = iterator, i = index, i = integer
Whichever you figure "i" stands for, it still "fits the bill".
Also, unless you have only a single line of code within that loop, you should probably give the iterator/index/integer variable a more meaningful name, like employeeIndex.
BTW, I usually use "i" in my simple iterator loops, unless of course the loop contains multiple lines of code.

i = iota, j = jot; both small changes.
Iota is the smallest letter in the Greek alphabet; in English its meaning is linked to small changes, as in "not one iota" (from a phrase in the New Testament: "until heaven and earth pass away, not an iota, not a dot, will pass from the Law" (Mt 5:18)).
A counter represents a small change in a value.
And from iota comes jot (iot), which is also a synonym for a small change.
cf. http://en.wikipedia.org/wiki/Iota

Well, from mathematics (for Latin letters):
a,b: used as constants or as integers for a rational number
c: a constant
d: derivative
e: Euler's number
f,g,h: functions
i,j,k: are indexes (also unit vectors and the quaternions)
l: generally not used. looks like 1
m,n: are rows and columns of matrices or as integers for rational numbers
o: also not used (unless you're in little o notation)
p,q: often used as primes
r: sometimes a spatial change of variable other times related to prime numbers
s,t: spatial and temporal variables or s is used as a change of variable for t
u,v,w: change of variable
x,y,z: variables

Several possible reasons, I guess:
mathematicians use i and j for natural numbers in formulas (at least the ones that rarely use complex numbers), so this carried over to programming
from C, i hints at int. And if you need another int then i2 is just way too long, so you decide to use j.
there are languages where the first letter decides the type, and i is then an integer.

It comes from Fortran, where i,j,k,l,m,n are implicitly integers.

It definitely comes from mathematics, which long preceded computer programming.
So, where did it come from in math? My completely uneducated guess is that it's as one fellow said, mathematicians like to use alphabetic clusters for similar things -- f, g, h for functions; x, y, z for numeric variables; p, q, r for logical variables; u, v, w for other sets of variables, especially in calculus; a, b, c for a lot of things. i, j, k come in handy for iterative variables, and that about exhausts the possibilities. Why not m, n? Well, they are used for integers, but more often as the end points of iterations rather than the iterative variables themselves.
Someone should ask a historian of mathematics.

Counters are so common in programs, and in the early days of computing, everything was at a premium...
Programmers naturally tried to conserve pixels, and the 'i' required fewer pixels than any other letter to represent. (Mathematicians, being lazy, picked it for the same reason - as the smallest glyph).
As stated previously, 'j' just naturally followed...
:)

I use it for a number of reasons.
Usually my loops are int based, so you make a complete triangle on the keyboard typing "int i" with the exception of the space I handle with my thumb. This is a very fast sequence to type.
The "i" could stand for iterator, integer, increment, or index, each of which makes logical sense.
With my personal uses set aside, the theory of it being derived from FORTRAN is correct, where integer vars used letters I - N.

I learned FORTRAN on a Control Data Corp. 3100 in 1965. Variables starting with 'I' through 'N' were implied to be integers. Ex: 'IGGY' and 'NORB' were integers, 'XMAX' and 'ALPHA' were floating-point. However, you could override this through explicit declaration.
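The default rule described above can be sketched as a small Python function. This is only an illustration, not how any Fortran compiler is implemented: the name implicit_type is made up, and it models only the default rule for undeclared names, ignoring explicit declarations that override it.

```python
def implicit_type(name: str) -> str:
    """Return the type that Fortran's default implicit rule assigns to an
    undeclared variable name: INTEGER if it starts with I through N,
    REAL (floating-point) otherwise."""
    first = name[0].upper()
    return "INTEGER" if "I" <= first <= "N" else "REAL"

# The example names from the answer above, plus the usual loop counters.
for name in ("IGGY", "NORB", "XMAX", "ALPHA", "i", "j", "k"):
    print(name, implicit_type(name))
```

Under this rule i, j and k are the first three single-letter names that are integers without any declaration, which is the habit the Fortran answers describe.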
