Why does the 'i' need to be divided by 2 in caculating positional encoding? - tensorflow

In this transformer tutorial:
https://www.tensorflow.org/text/tutorials/transformer
def get_angles(pos, i, d_model):
angle_rates = 1 / np.power(10000, (2 * (i//2)) / np.float32(d_model))
return pos * angle_rates
I don't understand why 'i//2' is used, since in the original formula there is no specification of the integer division.
So what's the purpose of i//2?

according to the formula
PE(pos, 2i) = sin(1/100000^(2i /D) , PE(pos, 2i+1) = cos(1/100000^(2i
/D)
As you can see, for odd row, 2i -> 2i, for even row 2i+1 -> 2i, for example, a word embedding with [0,1,2,3,4,5,6,7] should transfer to [0,0,1,1,2,2,3,3], there is where i//2 comes from.

Related

Let Σ={a,b,c}. How many languages over Σ are there such that each string in the language has length 2 or less?

First of all I see the number of strings as the following:
1 (epsilon 0 length string) + 3 (pick one letter) + 9 (3 options for first letter, 3 options for second)
For a total of 13 strings. Now as far as I know a language can pick any combination of this for example l1 = {ab,a,ac} l2 = {c}
I'm not sure how to calculate the total number of languages there could be here. Any Help?
So you have a set with 13 elements. A particular language could be any subset of this set. How many subsets does this set have?
This is called the power set of that set, and it has 213 elements.
Cardinality of character set, say d = 3.
Total words possible of length (<= k), say w = (d^(k+1) - 1)/(d-1) = 13.
Total languages possible = Power Set {Each word can be included or not} = 2^w = 8192.

torch logical indexing of tensor

I looking for an elegant way to select a subset of a torch tensor which satisfies some constrains.
For example, say I have:
A = torch.rand(10,2)-1
and S is a 10x1 tensor,
sel = torch.ge(S,5) -- this is a ByteTensor
I would like to be able to do logical indexing, as follows:
A1 = A[sel]
But that doesn't work.
So there's the index function which accepts a LongTensor but I could not find a simple way to convert S to a LongTensor, except the following:
sel = torch.nonzero(sel)
which returns a K x 2 tensor (K being the number of values of S >= 5). So then I have to convert it to a 1 dimensional array, which finally allows me to index A:
A:index(1,torch.squeeze(sel:select(2,1)))
This is very cumbersome; in e.g. Matlab all I'd have to do is
A(S>=5,:)
Can anyone suggest a better way?
One possible alternative is:
sel = S:ge(5):expandAs(A) -- now you can use this mask with the [] operator
A1 = A[sel]:unfold(1, 2, 2) -- unfold to get back a 2D tensor
Example:
> A = torch.rand(3,2)-1
-0.0047 -0.7976
-0.2653 -0.4582
-0.9713 -0.9660
[torch.DoubleTensor of size 3x2]
> S = torch.Tensor{{6}, {1}, {5}}
6
1
5
[torch.DoubleTensor of size 3x1]
> sel = S:ge(5):expandAs(A)
1 1
0 0
1 1
[torch.ByteTensor of size 3x2]
> A[sel]
-0.0047
-0.7976
-0.9713
-0.9660
[torch.DoubleTensor of size 4]
> A[sel]:unfold(1, 2, 2)
-0.0047 -0.7976
-0.9713 -0.9660
[torch.DoubleTensor of size 2x2]
There are two simpler alternatives:
Use maskedSelect:
result=A:maskedSelect(your_byte_tensor)
Use a simple element-wise multiplication, for example
result=torch.cmul(A,S:gt(0))
The second one is very useful if you need to keep the shape of the original matrix (i.e A), for example to select neurons in a layer at backprop. However, since it puts zeros in the resulting matrix whenever the condition dictated by the ByteTensor doesn't apply, you can't use it to compute the product (or median, etc.). The first one only returns the elements that satisfy the condittion, so this is what I'd use to compute products or medians or any other thing where I don't want zeros.

Smooth Coloring Mandelbrot Set Without Complex Number Library

I've coded a basic Mandelbrot explorer in C#, but I have those horrible bands of color, and it's all greyscale.
I have the equation for smooth coloring:
mu = N + 1 - log (log |Z(N)|) / log 2
Where N is the escape count, and |Z(N)| is the modulus of the complex number after the value has escaped, it's this value which I'm unsure of.
My code is based off the pseudo code given on the wikipedia page: http://en.wikipedia.org/wiki/Mandelbrot_set#For_programmers
The complex number is represented by the real values x and y, using this method, how would I calculate the value of |Z(N)| ?
|Z(N)| means the distance to the origin, so you can calculate it via sqrt(x*x + y*y).
If you run into an error with the logarithm: Check the iterations before. If it's part of the Mandelbrot set (iteration = max_iteration), the first logarithm will result 0 and the second will raise an error.
So just add this snippet instead of your old return code. .
if (i < iterations)
{
return i + 1 - Math.Log(Math.Log(Math.Sqrt(x * x + y * y))) / Math.Log(2);
}
return i;
Later, you should divide i by the max_iterations and multiply it with 255. This will give you a nice rgb-value.

Random Number Generation to Memory from a Distribution using VBA

I want to generate random numbers from a selected distribution in VBA (Excel 2007).
I'm currently using the Analysis Toolpak with the following code:
Application.Run "ATPVBAEN.XLAM!Random", "", A, B, C, D, E, F
Where
A = how many variables that are to be randomly generated
B = number of random numbers generated per variable
C = number corresponding to a distribution
1= Uniform
2= Normal
3= Bernoulli
4= Binomial
5= Poisson
6= Patterned
7= Discrete
D = random number seed
E = parameter of distribution (mu, lambda, etc.) depends on choice for C
(F) = additional parameter of distribution (sigma, etc.) depends on choice for C
But I want to have the random numbers be generated into an array, and NOT onto a sheet.
I understand that where the "" is designates where the random numbers should be printed to, but I don't know the syntax for assigning the random numbers to an array, or some other form of memory storage instead of to a sheet.
I've tried following the syntax discussed at this Analysis Toolpak site, but have had no success.
I realize that VBA is not the ideal place to generate random numbers, but I need to do this in VBA. Any help is much appreciated! Thanks!
Using the inbuilt functions is the key. There is a corresponding version for each of these functions but Poisson. In my presented solution I am using an algorithm presented by Knuth to generate a random number from the Poisson Distribution.
For Discrete or Patterned you obviously have to write your custom algorithm.
Regarding the seed you can place a Randomize [seed] before filling your array.
Function RandomNumber(distribution As Integer, Optional param1 = 0, Optional param2 = 0)
Select Case distribution
Case 1 'Uniform
RandomNumber = Rnd()
Case 2 'Normal
RandomNumber = Application.WorksheetFunction.NormInv(Rnd(), param1, param2)
Case 3 'Bernoulli
RandomNumber = IIf(Rnd() > param1, 1, 0)
Case 4 'Binomial
RandomNumber = Application.WorksheetFunction.Binom_Inv(param1, param2, Rnd())
Case 5 'Poisson
RandomNumber = RandomPoisson(param1)
Case 6 'Patterned
RandomNumber = 0
Case 7 'Discrete
RandomNumber = 0
End Select
End Function
Function RandomPoisson(ByVal lambda As Integer) 'Algorithm by Knuth
l = Exp(-lambda)
k = 0
p = 1
Do
k = k + 1
p = p * Rnd()
Loop While p > l
RandomPoisson = k - 1
End Function
Why not use the inbuilt functions?
Uniform = rnd
Normal = WorksheetFunction.NormInv
Bernoulli = iif(rnd()<p,0,1)
Binomial = WorksheetFunction.Binomdist
Poisson = WorksheetFunction.poisson
Patterned = for ... next
Discrete =
-
select case rnd()
case <0.1
'choice 1
case 0.1 to 0.4
'choice 2
case >0.4
'choice 3
end select

Recognizing when to use the modulus operator

I know the modulus (%) operator calculates the remainder of a division. How can I identify a situation where I would need to use the modulus operator?
I know I can use the modulus operator to see whether a number is even or odd and prime or composite, but that's about it. I don't often think in terms of remainders. I'm sure the modulus operator is useful, and I would like to learn to take advantage of it.
I just have problems identifying where the modulus operator is applicable. In various programming situations, it is difficult for me to see a problem and realize "Hey! The remainder of division would work here!".
Imagine that you have an elapsed time in seconds and you want to convert this to hours, minutes, and seconds:
h = s / 3600;
m = (s / 60) % 60;
s = s % 60;
0 % 3 = 0;
1 % 3 = 1;
2 % 3 = 2;
3 % 3 = 0;
Did you see what it did? At the last step it went back to zero. This could be used in situations like:
To check if N is divisible by M (for example, odd or even)
or
N is a multiple of M.
To put a cap of a particular value. In this case 3.
To get the last M digits of a number -> N % (10^M).
I use it for progress bars and the like that mark progress through a big loop. The progress is only reported every nth time through the loop, or when count%n == 0.
I've used it when restricting a number to a certain multiple:
temp = x - (x % 10); //Restrict x to being a multiple of 10
Wrapping values (like a clock).
Provide finite fields to symmetric key algorithms.
Bitwise operations.
And so on.
One use case I saw recently was when you need to reverse a number. So that 123456 becomes 654321 for example.
int number = 123456;
int reversed = 0;
while ( number > 0 ) {
# The modulus here retrieves the last digit in the specified number
# In the first iteration of this loop it's going to be 6, then 5, ...
# We are multiplying reversed by 10 first, to move the number one decimal place to the left.
# For example, if we are at the second iteration of this loop,
# reversed gonna be 6, so 6 * 10 + 12345 % 10 => 60 + 5
reversed = reversed * 10 + number % 10;
number = number / 10;
}
Example. You have message of X bytes, but in your protocol maximum size is Y and Y < X. Try to write small app that splits message into packets and you will run into mod :)
There are many instances where it is useful.
If you need to restrict a number to be within a certain range you can use mod. For example, to generate a random number between 0 and 99 you might say:
num = MyRandFunction() % 100;
Any time you have division and want to express the remainder other than in decimal, the mod operator is appropriate. Things that come to mind are generally when you want to do something human-readable with the remainder. Listing how many items you could put into buckets and saying "5 left over" is good.
Also, if you're ever in a situation where you may be accruing rounding errors, modulo division is good. If you're dividing by 3 quite often, for example, you don't want to be passing .33333 around as the remainder. Passing the remainder and divisor (i.e. the fraction) is appropriate.
As #jweyrich says, wrapping values. I've found mod very handy when I have a finite list and I want to iterate over it in a loop - like a fixed list of colors for some UI elements, like chart series, where I want all the series to be different, to the extent possible, but when I've run out of colors, just to start over at the beginning. This can also be used with, say, patterns, so that the second time red comes around, it's dashed; the third time, dotted, etc. - but mod is just used to get red, green, blue, red, green, blue, forever.
Calculation of prime numbers
The modulo can be useful to convert and split total minutes to "hours and minutes":
hours = minutes / 60
minutes_left = minutes % 60
In the hours bit we need to strip the decimal portion and that will depend on the language you are using.
We can then rearrange the output accordingly.
Converting linear data structure to matrix structure:
where a is index of linear data, and b is number of items per row:
row = a/b
column = a mod b
Note above is simplified logic: a must be offset -1 before dividing & the result must be normalized +1.
Example: (3 rows of 4)
1 2 3 4
5 6 7 8
9 10 11 12
(7 - 1)/4 + 1 = 2
7 is in row 2
(7 - 1) mod 4 + 1 = 3
7 is in column 3
Another common use of modulus: hashing a number by place. Suppose you wanted to store year & month in a six digit number 195810. month = 195810 mod 100 all digits 3rd from right are divisible by 100 so the remainder is the 2 rightmost digits in this case the month is 10. To extract the year 195810 / 100 yields 1958.
Modulus is also very useful if for some crazy reason you need to do integer division and get a decimal out, and you can't convert the integer into a number that supports decimal division, or if you need to return a fraction instead of a decimal.
I'll be using % as the modulus operator
For example
2/4 = 0
where doing this
2/4 = 0 and 2 % 4 = 2
So you can be really crazy and let's say that you want to allow the user to input a numerator and a divisor, and then show them the result as a whole number, and then a fractional number.
whole Number = numerator/divisor
fractionNumerator = numerator % divisor
fractionDenominator = divisor
Another case where modulus division is useful is if you are increasing or decreasing a number and you want to contain the number to a certain range of number, but when you get to the top or bottom you don't want to just stop. You want to loop up to the bottom or top of the list respectively.
Imagine a function where you are looping through an array.
Function increase Or Decrease(variable As Integer) As Void
n = (n + variable) % (listString.maxIndex + 1)
Print listString[n]
End Function
The reason that it is n = (n + variable) % (listString.maxIndex + 1) is to allow for the max index to be accounted.
Those are just a few of the things that I have had to use modulus for in my programming of not just desktop applications, but in robotics and simulation environments.
Computing the greatest common divisor
Determining if a number is a palindrome
Determining if a number consists of only ...
Determining how many ... a number consists of...
My favorite use is for iteration.
Say you have a counter you are incrementing and want to then grab from a known list a corresponding items, but you only have n items to choose from and you want to repeat a cycle.
var indexFromB = (counter-1)%n+1;
Results (counter=indexFromB) given n=3:
`1=1`
`2=2`
`3=3`
`4=1`
`5=2`
`6=3`
...
Best use of modulus operator I have seen so for is to check if the Array we have is a rotated version of original array.
A = [1,2,3,4,5,6]
B = [5,6,1,2,3,4]
Now how to check if B is rotated version of A ?
Step 1: If A's length is not same as B's length then for sure its not a rotated version.
Step 2: Check the index of first element of A in B. Here first element of A is 1. And its index in B is 2(assuming your programming language has zero based index).
lets store that index in variable "Key"
Step 3: Now how to check that if B is rotated version of A how ??
This is where modulus function rocks :
for (int i = 0; i< A.length; i++)
{
// here modulus function would check the proper order. Key here is 2 which we recieved from Step 2
int j = [Key+i]%A.length;
if (A[i] != B[j])
{
return false;
}
}
return true;
It's an easy way to tell if a number is even or odd. Just do # mod 2, if it is 0 it is even, 1 it is odd.
Often, in a loop, you want to do something every k'th iteration, where k is 0 < k < n, assuming 0 is the start index and n is the length of the loop.
So, you'd do something like:
int k = 5;
int n = 50;
for(int i = 0;i < n;++i)
{
if(i % k == 0) // true at 0, 5, 10, 15..
{
// do something
}
}
Or, you want to keep something whitin a certain bound. Remember, when you take an arbitrary number mod something, it must produce a value between 0 and that number - 1.