assign certain entries of Tensor, like set_subtensor of Theano - tensorflow

Can I just assign values to certain entries in a tensor? I got this problems when I compute the cross correlation matrix of a NxP feature matrix feats, where N is observations and P is dimension. Some columns are constant so the standard deviation is zero, and I don't want to devide by std for those constant column. Here is what I did:
fmean, fvar = tf.nn.moments(feats, axes = [0], keep_dims = False)
fstd = tf.sqrt(fvar)
feats = feats - fmean
sel = (fstd != 0)
feats[:, sel] = feats[:, sel]/ fstd[sel]
corr = tf.matmul(tf.transpose(feats), feats)
However, I got this error: TypeError: 'Tensor' object does not support item assignment. Is there any workaround for such issue?

You can make your feats a tf.Variable and use tf.scatter_update to update locations selectively.
It's a bit awkward in that scatter_update needs a list of linear indices to update, so you'd need to convert your [:, sel] implicit 2D specification into explicit list of 1D indices. There's example of constructing 1D indices from 2D here
There's some work in simplifying this kind of use-case in issue #206

Related

Writing SKLearn Regresion Coefficients To Pandas Series

I have a regression model that I fit in SKlearn's LinearRegression module:
To extract the coefficients, I used the code;
coefficients = model.coef_
It produced the following array with a shape of (1, 10):
[-4.72307152e-05 1.29731143e-04 8.75483702e-05 -6.28749019e-04
1.75096740e-04 -3.30209379e-06 1.35937650e-03 3.89048429e-11
8.48406857e-03 -1.36499030e-05]
Now, I would like to save the array to a pd.Series. I am taking the following approach:
features = ["f1", "f2", "f3", "f4", "f5", "f6", "f7", "f8", "f9", "f10"]
model_coefs = pd.Series(coefficients, index=features)
And, the system gives me the following error:
ValueError: Length of passed values is 1, index implies 10.
What I have tried:
Transposing the underlying array, coefficients, to give it a length of 10.
Reshaping the array to give it a shape of (10,1).
But nothing seems to work. I am not sure where I am going wrong.
For your case you want to flatten the array so .ravel should do the trick for example:
pd.Series(np.zeros((1, 10)).ravel(), index=features)
It's strange the coeffs output are of shape (1, 10), when I run the base sklearn example here (with multiple features) my coeffs are of 1-d:
In [27]: regr.coef_
Out[27]:
array([ 3.03499549e-01, -2.37639315e+02, 5.10530605e+02, 3.27736980e+02,
-8.14131709e+02, 4.92814588e+02, 1.02848452e+02, 1.84606489e+02,
7.43519617e+02, 7.60951722e+01])
In [28]: regr.coef_.shape
Out[28]: (10,)

Logical AND/OR in Keras Backend

Tensorflow has tf.logical_and() and tf.logical_or() for comparison of two boolean tensors, i.e. tf.logical_and(x,y)==TRUE if x==TRUE and y==TRUE (doc). I can't find anything like this in the Keras backend though. They have keras.backend.any() and .all(), but this is for aggregation within a tensor, not between. I've been having to use workarounds with nested K.switch() functions, but it is painfully inelegant.
Let x and y be boolean keras tensors of the same shape.
To take elementwise or, do the following:
keras.backend.any(keras.backend.stack([x, y], axis=0), axis=0)
To take elementwise and, do the following:
keras.backend.all(keras.backend.stack([x, y], axis=0), axis=0)
Here keras.backend.stack([x, y], axis=0) stacks x and y into a new tensor with an additional dimension at number 0. After that keras.backend.any takes a logical or along the new dimension, and keras.backend.any takes the logical and.
My solution (perhaps not the best, because I haven't found others either), is:
A = K.cast(someBooleanTensor, K.floatx())
B = K.cast(anotherBooleanTensor, K.floatx())
A_and_B = A * B #this is also something I use a lot for gathering elements
A_or_B = 1 -((1-A)*(1-B))
But thinking about it now... I never tested python operators... perhaps they work?

Efficient axis-wise cartesian product of multiple 2D matrices with Numpy or TensorFlow

So first off, I think what I'm trying to achieve is some sort of Cartesian product but elementwise, across the columns only.
What I'm trying to do is, if you have multiple 2D arrays of size [ (N,D1), (N,D2), (N,D3)...(N,Dn) ]
The result is thus to be a combinatorial product across axis=1 such that the final result will then be of shape (N, D) where D=D1*D2*D3*...Dn
e.g.
A = np.array([[1,2],
[3,4]])
B = np.array([[10,20,30],
[5,6,7]])
cartesian_product( [A,B], axis=1 )
>> np.array([[ 1*10, 1*20, 1*30, 2*10, 2*20, 2*30 ]
[ 3*5, 3*6, 3*7, 4*5, 4*6, 4*7 ]])
and extendable to cartesian_product([A,B,C,D...], axis=1)
e.g.
A = np.array([[1,2],
[3,4]])
B = np.array([[10,20],
[5,6]])
C = np.array([[50, 0],
[60, 8]])
cartesian_product( [A,B,C], axis=1 )
>> np.array([[ 1*10*50, 1*10*0, 1*20*50, 1*20*0, 2*10*50, 2*10*0, 2*20*50, 2*20*0]
[ 3*5*60, 3*5*8, 3*6*60, 3*6*8, 4*5*60, 4*5*8, 4*6*60, 4*6*8]])
I have a working solution that essentially creates an empty (N,D) matrix and then broadcasting a vector columnwise product for each column within nested for loops for each matrix in the provided list. Clearly is horrible once the arrays get larger!
Is there an existing solution within numpy or tensorflow for this? Potentially one that is efficiently paralleizable (A tensorflow solution would be wonderful but a numpy is ok and as long as the vector logic is clear then it shouldn't be hard to make a tf equivalent)
I'm not sure if I need to use einsum, tensordot, meshgrid or some combination thereof to achieve this. I have a solution but only for single-dimension vectors from https://stackoverflow.com/a/11146645/2123721 even though that solution says to work for arbitrary dimensions array (which appears to mean vectors). With that one i can do a .prod(axis=1), but again this is only valid for vectors.
thanks!
Here's one approach to do this iteratively in an accumulating manner making use of broadcasting after extending dimensions for each pair from the list of arrays for elmentwise multiplications -
L = [A,B,C] # list of arrays
n = L[0].shape[0]
out = (L[1][:,None]*L[0][:,:,None]).reshape(n,-1)
for i in L[2:]:
out = (i[:,None]*out[:,:,None]).reshape(n,-1)

Numpy Array Shape Issue

I have initialized this empty 2d np.array
inputs = np.empty((300, 2), int)
And I am attempting to append a 2d row to it as such
inputs = np.append(inputs, np.array([1,2]), axis=0)
But Im getting
ValueError: all the input arrays must have same number of dimensions
And Numpy thinks it's a 2 row 0 dimensional object (transpose of 2d)
np.array([1, 2]).shape
(2,)
Where have I gone wrong?
To add a row to a (300,2) shape array, you need a (1,2) shape array. Note the matching 2nd dimension.
np.array([[1,2]]) works. So does np.array([1,2])[None, :] and np.atleast_2d([1,2]).
I encourage the use of np.concatenate. It forces you to think more carefully about the dimensions.
Do you really want to start with np.empty? Look at its values. They are random, and probably large.
#Divakar suggests np.row_stack. That puzzled me a bit, until I checked and found that it is just another name for np.vstack. That function passes all inputs through np.atleast_2d before doing np.concatenate. So ultimately the same solution - turn the (2,) array into a (1,2)
Numpy requires double brackets to declare an array literal, so
np.array([1,2])
needs to be
np.array([[1,2]])
If you intend to append that as the last row into inputs, you can just simply use np.row_stack -
np.row_stack((inputs,np.array([1,2])))
Please note this np.array([1,2]) is a 1D array.
You can even pass it a 2D row version for the same result -
np.row_stack((inputs,np.array([[1,2]])))

How to expand a Tensorflow Variable

Is there any way to make a Tensorflow Variable larger? Like, let's say I wanted to add a neuron to a layer of a neural network in the middle of training. How would I go about doing that? An answer in This question told me how to change the shape of the variable, to expand it to fit another row of weights, but I don't know how to initialize those new weights.
I figure another way of going about this might involve combining variables, as in initializing the weights first in a second variable and then adding that in as a new row or column of the first variable, but I can't find anything that lets me do that either.
There are various ways you could accomplish this.
1) The second answer in that post (https://stackoverflow.com/a/33662680/5548115) explains how you can change the shape of a variable by calling 'assign' with validate_shape=False. For example, you could do something like
# Assume var is [m, n]
# Add the new 'data' of shape [1, n] with new values
new_neuron = tf.constant(...)
# If concatenating to add a row, concat on the first dimension.
# If new_neuron was [m, 1], you would concat on the second dimension.
new_variable_data = tf.concat(0, [var, new_neuron]) # [m+1, n]
resize_var = tf.assign(var, new_variable_data, validate_shape=False)
Then when you run resize_var, the data pointed to by 'var' will now have the updated data.
2) You could also create a large initial variable, and call tf.slice on different regions of the variable as training progresses, since you can dynamically change the 'begin' and 'size' attributes of slice.
Simply using tf.concat for expand a Tensorflow Variable,you can see the api_docs
for detail.
v1 = tf.Variable(tf.zeros([5,3]),dtype=tf.float32)
v2 = tf.Variable(tf.zeros([1,3]),dtype=tf.float32)
v3 = tf.concat(0,[v1, v2])
Figured it out. It's kind of a roundabout process, but it's the only one I can tell that actually functions. You need to first unpack the variables, then append the new variable to the end, then pack them back together.
If you're expanding along the first dimension, it's rather short: only 7 lines of actual code.
#the first variable is 5x3
v1 = tf.Variable(tf.zeros([5, 3], dtype=tf.float32), "1")
#the second variable is 1x3
v2 = tf.Variable(tf.zeros([1, 3], dtype=tf.float32), "2")
#unpack the first variable into a list of size 3 tensors
#there should be 5 tensors in the list
change_shape = tf.unpack(v1)
#unpack the second variable into a list of size 3 tensors
#there should be 1 tensor in this list
change_shape_2 = tf.unpack(v2)
#for each tensor in the second list, append it to the first list
for i in range(len(change_shape_2)):
change_shape.append(change_shape_2[i])
#repack the list of tensors into a single tensor
#the shape of this resultant tensor should be [6, 3]
final = tf.pack(change_shape)
If you want to expand along the second dimension, it gets somewhat longer.
#First variable, 5x3
v3 = tf.Variable(tf.zeros([5, 3], dtype=tf.float32))
#second variable, 5x1
v4 = tf.Variable(tf.zeros([5, 1], dtype=tf.float32))
#unpack tensors into lists of size 3 tensors and size 1 tensors, respectively
#both lists will hold 5 tensors
change = tf.unpack(v3)
change2 = tf.unpack(v4)
#for each tensor in the first list, unpack it into its own list
#this should make a 2d array of size 1 tensors, array will be 5x3
changestep2 = []
for i in range(len(change)):
changestep2.append(tf.unpack(change[i]))
#do the same thing for the second tensor
#2d array of size 1 tensors, array will be 5x1
change2step2 = []
for i in range(len(change2)):
change2step2.append(tf.unpack(change2[i]))
#for each tensor in the array, append it onto the corresponding array in the first list
for j in range(len(change2step2[i])):
changestep2[i].append(change2step2[i][j])
#pack the lists in the array back into tensors
changestep2[i] = tf.pack(changestep2[i])
#pack the list of tensors into a single tensor
#the shape of this resultant tensor should be [5, 4]
final2 = tf.pack(changestep2)
I don't know if there's a more efficient way of doing this, but this works, as far as it goes. Changing further dimensions would require more layers of lists, as necessary.