Logical addressing numpy mess up with other matrices - numpy

I have just found a problem and I don't know whether it is meant to work this way or I am doing something wrong. When I use logical addressing on a numpy matrix to change all the values that are, say, equal to 1, all other matrices that are somehow related to this matrix are modified as well.
In [1]: import numpy as np
In [2]: from numpy import matrix as mtx
In [3]: A=mtx(np.eye(6))
In [4]: A
matrix([[ 1., 0., 0., 0., 0., 0.],
[ 0., 1., 0., 0., 0., 0.],
[ 0., 0., 1., 0., 0., 0.],
[ 0., 0., 0., 1., 0., 0.],
[ 0., 0., 0., 0., 1., 0.],
[ 0., 0., 0., 0., 0., 1.]])
In [5]: B=A
In [6]: C=B
In [7]: D=C
In [8]: A[A==1]=5
In [9]: A
matrix([[ 5., 0., 0., 0., 0., 0.],
[ 0., 5., 0., 0., 0., 0.],
[ 0., 0., 5., 0., 0., 0.],
[ 0., 0., 0., 5., 0., 0.],
[ 0., 0., 0., 0., 5., 0.],
[ 0., 0., 0., 0., 0., 5.]])
In [10]: B
matrix([[ 5., 0., 0., 0., 0., 0.],
[ 0., 5., 0., 0., 0., 0.],
[ 0., 0., 5., 0., 0., 0.],
[ 0., 0., 0., 5., 0., 0.],
[ 0., 0., 0., 0., 5., 0.],
[ 0., 0., 0., 0., 0., 5.]])
In [11]: C
matrix([[ 5., 0., 0., 0., 0., 0.],
[ 0., 5., 0., 0., 0., 0.],
[ 0., 0., 5., 0., 0., 0.],
[ 0., 0., 0., 5., 0., 0.],
[ 0., 0., 0., 0., 5., 0.],
[ 0., 0., 0., 0., 0., 5.]])
In [12]: D
matrix([[ 5., 0., 0., 0., 0., 0.],
[ 0., 5., 0., 0., 0., 0.],
[ 0., 0., 5., 0., 0., 0.],
[ 0., 0., 0., 5., 0., 0.],
[ 0., 0., 0., 0., 5., 0.],
[ 0., 0., 0., 0., 0., 5.]])
Can anyone tell me what I am doing wrong? Is this a bug?

This is not a bug. Writing B=A in Python does not copy anything; it means that both B and A refer to the same object. You need to copy the matrix explicitly.
>>> import numpy as np
>>> from numpy import matrix as mtx
>>> A = mtx(np.eye(6))
>>> B = A.copy()
>>> C = A
#Check memory locations.
>>> id(A)
19608352
>>> id(C)
19608352 #Same object as A
>>> id(B)
19607992 #Different object than A
>>> A[A==1] = 5
>>> B #B is a different object than A
matrix([[ 1., 0., 0., 0., 0., 0.],
[ 0., 1., 0., 0., 0., 0.],
[ 0., 0., 1., 0., 0., 0.],
[ 0., 0., 0., 1., 0., 0.],
[ 0., 0., 0., 0., 1., 0.],
[ 0., 0., 0., 0., 0., 1.]])
>>> C #C is the same object as A
matrix([[ 5., 0., 0., 0., 0., 0.],
[ 0., 5., 0., 0., 0., 0.],
[ 0., 0., 5., 0., 0., 0.],
[ 0., 0., 0., 5., 0., 0.],
[ 0., 0., 0., 0., 5., 0.],
[ 0., 0., 0., 0., 0., 5.]])
The same issue can be seen with a Python list:
>>> A = [5,3]
>>> B = A
>>> B[0] = 10
>>> A
[10, 3]
Note that this is different from returning a numpy view, as in this case:
>>> A = mtx(np.eye(6))
>>> B = A[0] #B is a view and now points to the first row of A
>>> id(A)
>>> id(B) #Different objects!
#B still points to the memory location of A's first row, but through numpy trickery
>>> B
matrix([[ 1., 0., 0., 0., 0., 0.]])
>>> B *= 5 #In place multiplication, updates B which is the same as A's first row
>>> A
matrix([[ 5., 0., 0., 0., 0., 0.],
[ 0., 1., 0., 0., 0., 0.],
[ 0., 0., 1., 0., 0., 0.],
[ 0., 0., 0., 1., 0., 0.],
[ 0., 0., 0., 0., 1., 0.],
[ 0., 0., 0., 0., 0., 1.]])
As the view B points to the first row of A, A is changed. Now let's force a copy.
>>> B = B*10 #Assigns B*10 to a different chunk of memory
>>> A
matrix([[ 5., 0., 0., 0., 0., 0.],
[ 0., 1., 0., 0., 0., 0.],
[ 0., 0., 1., 0., 0., 0.],
[ 0., 0., 0., 1., 0., 0.],
[ 0., 0., 0., 0., 1., 0.],
[ 0., 0., 0., 0., 0., 1.]])
>>> B
matrix([[ 50., 0., 0., 0., 0., 0.]])
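The three cases above (plain assignment, a view, and an explicit copy) can also be told apart programmatically. The following is a minimal sketch, assuming plain np.ndarray objects rather than np.matrix; the names alias, view, and independent are illustrative only. It uses the is operator to test object identity and np.shares_memory to test whether two arrays share a buffer.

```python
import numpy as np

A = np.eye(3)
alias = A               # plain assignment: a second name for the same object
view = A[0]             # basic indexing: a new object that shares A's memory
independent = A.copy()  # explicit copy: a new object with its own memory

print(alias is A)                        # True
print(view is A)                         # False
print(np.shares_memory(A, view))         # True
print(np.shares_memory(A, independent))  # False

# Modify A in place through a boolean mask, as in the question.
A[A == 1] = 5

print(alias[0, 0], view[0], independent[0, 0])  # 5.0 5.0 1.0
```

Only the explicit copy keeps the original values; the alias and the view both reflect the in-place change.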


One-hot encode labels in keras

I have a set of integers from a label column in a CSV file - [1,2,4,3,5,2,..]. There are 5 classes, i.e. the labels run from 1 to 5. I want to one-hot encode them using the code below.
y = df.iloc[:,10].values
y = tf.keras.utils.to_categorical(y, num_classes = 5)
But this code gives me an error
IndexError: index 5 is out of bounds for axis 1 with size 5
How can I fix this?
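If you use tf.keras.utils.to_categorical to one-hot encode the label vector, the integers must lie in the range 0 to num_classes - 1, according to the documentation. Your labels start at 1, so the label 5 indexes past the last column. In your case, shift the labels down by one as follows: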
import tensorflow as tf
import numpy as np
a = np.array([1,2,4,3,5,2,4,2,1])
y_tf = tf.keras.utils.to_categorical(a-1, num_classes = 5)
array([[1., 0., 0., 0., 0.],
[0., 1., 0., 0., 0.],
[0., 0., 0., 1., 0.],
[0., 0., 1., 0., 0.],
[0., 0., 0., 0., 1.],
[0., 1., 0., 0., 0.],
[0., 0., 0., 1., 0.],
[0., 1., 0., 0., 0.],
[1., 0., 0., 0., 0.]], dtype=float32)
Alternatively, you can use pd.get_dummies, which creates one column per distinct value and therefore needs no shift:
import pandas as pd
import numpy as np
a = np.array([1,2,4,3,5,2,4,2,1])
a_pd = pd.get_dummies(a).astype('float32').values
array([[1., 0., 0., 0., 0.],
[0., 1., 0., 0., 0.],
[0., 0., 0., 1., 0.],
[0., 0., 1., 0., 0.],
[0., 0., 0., 0., 1.],
[0., 1., 0., 0., 0.],
[0., 0., 0., 1., 0.],
[0., 1., 0., 0., 0.],
[1., 0., 0., 0., 0.]], dtype=float32)
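For comparison, the same shift-then-encode idea can be written with NumPy alone. This is a minimal sketch, not part of either library's API: it indexes the rows of an identity matrix with the zero-based labels, and the names labels, num_classes, and one_hot are illustrative.

```python
import numpy as np

labels = np.array([1, 2, 4, 3, 5, 2, 4, 2, 1])  # labels run from 1 to 5
num_classes = 5

# Row k of the identity matrix is the one-hot vector for class k,
# so shift the labels to start at 0 and use them as row indices.
one_hot = np.eye(num_classes, dtype=np.float32)[labels - 1]

print(one_hot.shape)  # (9, 5)
print(one_hot[:3])
```

Unlike pd.get_dummies, this always produces num_classes columns, even if some class does not appear in the data.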

Implementing BandRNN with pytorch and tensorflow

So I am trying to figure out how to train my matrix in a way that I will get a BandRNN.
BandRNN is a diagonal RNN model with a different number of connections per neuron.
For example:
C is the number of connections per neuron.
I found out that there is a way to turn off some of the gradients in a for loop, in a way that prevents them from being trained as follows:
for p in model.input.parameters():
    p.requires_grad = False
But I can't find a proper way to use this so that my matrix becomes a BandRNN.
Hopefully, someone will be able to help me with this issue.
As far as I know, you can only activate or deactivate requires_grad on a whole tensor, not on individual components of that tensor. What you could do instead is zero out the values outside the band.
First, create a mask for the band. You could use torch.ones with torch.diagflat:
>>> torch.diagflat(torch.ones(5), offset=1)
By setting the right dimension for torch.ones as well as the right offset you can generate offset diagonal matrices with consistent shapes.
>>> N = 5; i = -1
>>> torch.diagflat(torch.ones(N-abs(i)), offset=i)
tensor([[0., 0., 0., 0., 0.],
[1., 0., 0., 0., 0.],
[0., 1., 0., 0., 0.],
[0., 0., 1., 0., 0.],
[0., 0., 0., 1., 0.]])
>>> N = 5; i = 0
>>> torch.diagflat(torch.ones(N-abs(i)), offset=i)
tensor([[1., 0., 0., 0., 0.],
[0., 1., 0., 0., 0.],
[0., 0., 1., 0., 0.],
[0., 0., 0., 1., 0.],
[0., 0., 0., 0., 1.]])
>>> N = 5; i = 1
>>> torch.diagflat(torch.ones(N-abs(i)), offset=i)
tensor([[0., 1., 0., 0., 0.],
[0., 0., 1., 0., 0.],
[0., 0., 0., 1., 0.],
[0., 0., 0., 0., 1.],
[0., 0., 0., 0., 0.]])
You get the point. Summing these matrices element-wise gives a mask (note that -b//2 evaluates to -2 for b = 3, so the offsets below run from -2 to 1):
>>> N = 5; b = 3
>>> mask = sum(torch.diagflat(torch.ones(N-abs(i)), i) for i in range(-b//2,b//2+1))
>>> mask
tensor([[1., 1., 0., 0., 0.],
[1., 1., 1., 0., 0.],
[1., 1., 1., 1., 0.],
[0., 1., 1., 1., 1.],
[0., 0., 1., 1., 1.]])
Then you can zero out the values outside the band on your nn.Linear:
>>> m = nn.Linear(N, N)
>>> m.weight.data = m.weight * mask
>>> m.weight
Parameter containing:
tensor([[-0.3321, -0.3377, -0.0000, -0.0000, -0.0000],
[-0.4197, 0.1729, 0.2101, 0.0000, 0.0000],
[ 0.3467, 0.2857, -0.3919, -0.0659, 0.0000],
[ 0.0000, -0.4060, 0.0908, 0.0729, -0.1318],
[ 0.0000, -0.0000, -0.4449, -0.0029, -0.1498]], requires_grad=True)
Note that you might need to do this on each forward pass, since the parameters outside the band can be updated to non-zero values during training. Of course, you can create the mask once and keep it in memory.
It would be more convenient to wrap everything into a custom nn.Module.
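One way such a wrapper might look is sketched below. The class name BandLinear and its half_width parameter are hypothetical, not part of PyTorch, and the sketch assumes a band that is symmetric around the diagonal (entries with |row - col| <= half_width), which may differ from the band shape you need. The mask is applied inside forward, so nothing has to be re-zeroed after optimizer steps.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F


class BandLinear(nn.Module):
    """Hypothetical square linear layer whose weight is masked to a diagonal band."""

    def __init__(self, size: int, half_width: int):
        super().__init__()
        self.linear = nn.Linear(size, size)
        # Keep the entries with |row - col| <= half_width (assumed band shape).
        idx = torch.arange(size)
        mask = ((idx[:, None] - idx[None, :]).abs() <= half_width).float()
        # A buffer follows the module across devices but is not a trainable parameter.
        self.register_buffer("mask", mask)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # Multiplying by the mask removes out-of-band weights from the output
        # and also zeroes their gradients.
        return F.linear(x, self.linear.weight * self.mask, self.linear.bias)


layer = BandLinear(size=5, half_width=1)
out = layer(torch.randn(2, 5))
out.sum().backward()
print(layer.linear.weight.grad)  # zero everywhere outside the band
```

The stored weights outside the band keep their initial values, but they never influence the output and receive zero gradient, so the layer behaves as a band matrix.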

Numpy arange a diagonal array

I would like to create a square numpy array such that it starts counting from the diagonal.
Do you know a one-liner for that?
Example with 5x5:
array([[ 1., 2., 3., 4., 5.],
[ 0., 1., 2., 3., 4.],
[ 0., 0., 1., 2., 3.],
[ 0., 0., 0., 1., 2.],
[ 0., 0., 0., 0., 1.]])
In [49]: np.identity(5).cumsum(axis=1).cumsum(axis=1)
array([[ 1., 2., 3., 4., 5.],
[ 0., 1., 2., 3., 4.],
[ 0., 0., 1., 2., 3.],
[ 0., 0., 0., 1., 2.],
[ 0., 0., 0., 0., 1.]])
>>> mat = np.vstack([np.concatenate((np.zeros(i), np.arange(1, 5-i+1))) for i in range(5)])
>>> mat
array([[1., 2., 3., 4., 5.],
[0., 1., 2., 3., 4.],
[0., 0., 1., 2., 3.],
[0., 0., 0., 1., 2.],
[0., 0., 0., 0., 1.]])
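Another possible one-liner relies on broadcasting. This is a sketch rather than a canonical recipe: the entry at (row, col) is col - row + 1 on and above the diagonal, and np.triu zeroes everything below it. The names n, rows, and cols are illustrative, and the result has an integer dtype instead of float.

```python
import numpy as np

n = 5
rows = np.arange(n)[:, None]  # column vector of row indices
cols = np.arange(n)[None, :]  # row vector of column indices

# cols - rows + 1 counts up from 1 starting at the diagonal;
# np.triu sets the entries below the diagonal to zero.
mat = np.triu(cols - rows + 1)
print(mat)
```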

Error computing KL divergence in Scipy

I am trying to calculate KL divergence using the entropy function of scipy.
My p is:
array([[ 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 1.],
[ 0., 1., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0.],
[ 1., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0.],
[ 1., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0.],
[ 1., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0.]])
and q is:
array([[ 0.05242718, 0.04436347, 0.04130855, 0.04878344, 0.04310538,
0.02856853, 0.03303122, 0.02517992, 0.08525434, 0.03450324,
0.14580068, 0.1286993 , 0.28897473],
[ 0.65421444, 0.11592199, 0.0642645 , 0.02989768, 0.01385762,
0.01756484, 0.01024294, 0.00891479, 0.01140301, 0.00718939,
0.00938009, 0.01070139, 0.04644726],
[ 0.65984136, 0.13251236, 0.06345234, 0.02891162, 0.02429709,
0.02025307, 0.01073064, 0.01170066, 0.00678652, 0.00703361,
0.00560414, 0.00651137, 0.02236522],
[ 0.32315928, 0.23900077, 0.05460232, 0.03953635, 0.02901102,
0.01294443, 0.02372061, 0.02092882, 0.01188251, 0.01377188,
0.02976672, 0.05854314, 0.14313218],
[ 0.7717858 , 0.09692616, 0.03415596, 0.01713088, 0.01108141,
0.0128005 , 0.00847301, 0.01049734, 0.0052889 , 0.00514799,
0.00442508, 0.00485477, 0.01743218]], dtype=float32)
When I call entropy(p[0], q[0]), I get the following error:
ValueError Traceback (most recent call last)
<ipython-input-201-563ea7d4decf> in <module>()
4 print('p0:',p[0])
5 print('q0:',q[0])
----> 6 entropy(p[0],q[0])
/Users/freelancer/anaconda/envs/py35/lib/python3.5/site-packages/matplotlib/mlab.py in entropy(y, bins)
1570 y = np.zeros((len(x)+2,), x.dtype)
1571 y[1:-1] = x
-> 1572 dif = np.diff(y)
1573 up = (dif == 1).nonzero()[0]
1574 dn = (dif == -1).nonzero()[0]
/Users/freelancer/anaconda/envs/py35/lib/python3.5/site-packages/numpy/lib/function_base.py in histogram(a, bins, range, normed, weights, density)
781 if (np.diff(bins) < 0).any():
782 raise ValueError(
--> 783 'bins must increase monotonically.')
785 # Initialize empty histogram
ValueError: bins must increase monotonically.
Why is that?
This works with the example arrays:
from scipy import stats
stats.entropy(p[0], q[0])
Looking at the stack trace in the error message, it becomes apparent that you did not call scipy's entropy function but matplotlib's entropy, which works differently.
Here is the relevant part:
/Users/freelancer/anaconda/envs/py35/lib/python3.5/site-packages/matplotlib/mlab.py in entropy(y, bins)
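To rule out the name clash, import the function explicitly and check the result by hand. The sketch below uses the first rows of p and q from the question; the names p0, q0, kl, and manual are illustrative. Because p0 is one-hot, the KL divergence collapses to minus the log of q0 at the hot index (after normalizing q0, which scipy.stats.entropy also does).

```python
import numpy as np
from scipy.stats import entropy  # explicit import, so the name is scipy's function

# First row of p (one-hot at the last position) and first row of q from the question.
p0 = np.zeros(13)
p0[12] = 1.0
q0 = np.array([0.05242718, 0.04436347, 0.04130855, 0.04878344, 0.04310538,
               0.02856853, 0.03303122, 0.02517992, 0.08525434, 0.03450324,
               0.14580068, 0.1286993, 0.28897473])

kl = entropy(p0, q0)  # relative entropy KL(p0 || q0), natural logarithm

# Terms with p0 == 0 contribute nothing, so only the hot index remains.
manual = -np.log(q0[12] / q0.sum())

print(kl, manual)
print(entropy.__module__)  # shows which package the name actually comes from
```

Printing entropy.__module__ in your own session is a quick way to see whether the name was shadowed by another import.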

How to create a diagonal multi-dimensional (i.e. greater than 2) array in numpy

Is there a higher (than two) dimensional equivalent of diag?
L = [...] # some arbitrary list.
A = np.diag(L)
will create a diagonal 2-d matrix of shape (len(L), len(L)) with the elements of L on the diagonal.
I'd like to do the equivalent of:
length = len(L)
A = np.zeros((length, length, length))
for i in range(length):
    A[i][i][i] = L[i]
Is there a slick way to do this?
You can use diag_indices to get the indices to be set. For example,
x = np.zeros((3,3,3))
L = np.arange(6,9)
x[np.diag_indices(3,ndim=3)] = L
array([[[ 6., 0., 0.],
[ 0., 0., 0.],
[ 0., 0., 0.]],
[[ 0., 0., 0.],
[ 0., 7., 0.],
[ 0., 0., 0.]],
[[ 0., 0., 0.],
[ 0., 0., 0.],
[ 0., 0., 8.]]])
Under the hood diag_indices is just the code Jaime posted, so which to use depends on whether you want it spelled out in a numpy function, or DIY.
You can use fancy indexing:
In [2]: a = np.zeros((3,3,3))
In [3]: idx = np.arange(3)
In [4]: a[(idx,)*3] = 1  # index with a tuple; a list of index arrays is rejected by recent numpy
In [5]: a
array([[[ 1., 0., 0.],
[ 0., 0., 0.],
[ 0., 0., 0.]],
[[ 0., 0., 0.],
[ 0., 1., 0.],
[ 0., 0., 0.]],
[[ 0., 0., 0.],
[ 0., 0., 0.],
[ 0., 0., 1.]]])
For a more general approach, you could set the diagonal of an arbitrarily sized array doing something like:
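def set_diag(arr, values):
    # One index array per dimension, limited by the shortest axis.
    idx = np.arange(np.min(arr.shape))
    # Index with a tuple so that element (i, i, ..., i) is selected.
    arr[(idx,) * arr.ndim] = values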
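A short usage sketch of this helper follows, using tuple indexing so that it runs on current numpy versions. The arrays a and b are illustrative; the second call shows the assumed behavior on a non-cubic array, where the diagonal is limited by the shortest axis.

```python
import numpy as np


def set_diag(arr, values):
    """Write values onto the main diagonal of arr in place, for any number of dimensions."""
    idx = np.arange(min(arr.shape))
    # A tuple with one index array per axis selects arr[i, i, ..., i].
    arr[(idx,) * arr.ndim] = values


a = np.zeros((3, 3, 3))
set_diag(a, [6, 7, 8])
print(a[0, 0, 0], a[1, 1, 1], a[2, 2, 2])  # 6.0 7.0 8.0

b = np.zeros((2, 4, 3))
set_diag(b, 1)  # only b[0, 0, 0] and b[1, 1, 1] are set
print(b.sum())  # 2.0
```

For cubic arrays this matches np.diag_indices(n, ndim), which returns the same tuple of index arrays.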