Slicing on scan output with TensorFlow

I want to slice the output of a scan operation in TensorFlow, but I get strange results:
import tensorflow as tf

k = 10
x = 2

out = tf.scan(lambda previous_output, current_input: previous_output * current_input,
              tf.fill([k], x), initializer=tf.constant(1))
result = out[-1]  # slice with TensorFlow - doesn't work

with tf.Session() as sess:
    sess.run(tf.initialize_all_variables())
    print(sess.run(out)[-1])  # works, but all values are computed and stored in a NumPy array
    print(sess.run(result))   # doesn't work???
I get as output:
1024
3
The second value is obviously wrong and random (sometimes 0, sometimes other values).
So my question is: why? The analogous code in Theano works, and Theano can even optimize the computation when only the last element of the output tensor is queried.

Related

Don't know how to use tensorflow gradient

I ran into a problem when using tf.gradients to compute a gradient.
My x is a tf.constant() holding a vector v of shape (4, 1),
and my y is the sigmoid of v, also of shape (4, 1), so the gradient of y with respect to x should be a diagonal matrix of shape (4, 4).
My code:
c = tf.constant(sigmoid(x_0 @ w_0))
d = tf.constant(x_0 @ w_0)
Omega = tf.gradients(c, d)
_Omega = sess.run(Omega)
The error is:
Fetch argument None has invalid type .
In addition, I think using tf.gradients might be the wrong approach; there may be some other function that can compute this.
My question: where am I wrong, and how can I fix it, either with tf.gradients or with another function?
Edit:
I want to compute the derivative as described in the vector-by-vector section of https://en.wikipedia.org/wiki/Matrix_calculus#Vector-by-vector,
and the result Omega would look like the following:
[[s1(1-s1) 0 0 0 ]
[0 s2(1-s2) 0 0 ]
[0 0 s3(1-s3) 0 ]
[0 0 0 s4(1-s4)]]
where si = sigmoid(x_0i @ w_0) and x_0i is the ith row of x_0.
In general, the derivative of one vector with respect to another vector should be a matrix.
First of all, you can't calculate gradients with respect to constants; tf.gradients returns None for them, which is the reason for your error. One way to calculate the gradients is with a TF graph (see the code below); another way is to use tf.GradientTape in eager execution mode:
import tensorflow as tf
import numpy as np

arr = np.random.rand(4, 1)
ip = tf.Variable(initial_value=arr)  # use a Variable instead of a constant

sess = tf.Session()
c_var = tf.math.sigmoid(ip)
Omega = tf.gradients(c_var, ip)      # gradient of sigmoid(ip) with respect to ip

sess.run(tf.global_variables_initializer())
_Omega = sess.run(Omega)
print(_Omega)
Now you can pass a vector of any size. Still, note that tf.gradients returns a vector of the same shape as ip, not the full (4, 4) diagonal Jacobian matrix.
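For completeness, here is a minimal sketch (assuming TF 2.x eager execution) of the tf.GradientTape route mentioned above; tape.jacobian recovers the full (4, 4) Jacobian, whose only nonzero entries are the diagonal terms s_i * (1 - s_i):

import tensorflow as tf
import numpy as np

v = tf.Variable(np.random.rand(4, 1))   # stands in for x_0 @ w_0
with tf.GradientTape() as tape:
    s = tf.math.sigmoid(v)

jac = tape.jacobian(s, v)               # shape (4, 1, 4, 1)
print(tf.reshape(jac, (4, 4)).numpy())  # diagonal matrix with entries s_i * (1 - s_i)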

Tensorflow - Is there a simple way to zero out the losses of the samples with the highest losses in a mini-batch?

I am training a neural network for classification. In the context of my research, I would like to zero out the k highest losses in each mini-batch. I couldn't figure out a simple way to do this without relying on numpy at some level.
I have tried the following procedure:
1. Compute the argsort indices of the losses array -- this returns a tf Tensor
2. Slice the losses tensor with the indices tensor
The issue is that slicing with a tf Tensor as the index is not supported.
# losses is a tf.Tensor
ind_sorted = tf.argsort(losses)
losses_sorted = losses[ind_sorted]  # Error mentioned above
# The issue is that ind_sorted depends on the output of the neural network.
# I couldn't find an equivalent of the detach method in PyTorch.
k_smallest_losses = losses_sorted[:k]    # keeping only the k smallest losses
loss = tf.reduce_sum(k_smallest_losses)  # summing the k smallest losses
You probably want to use tf.nn.top_k, which returns both the values and the indices of the top k items. (Note: to get the smallest losses, I negate the losses and convert them back when done.)
batch = 2
max_len = 6
losses = tf.random.uniform(shape=[batch, max_len], minval=0, maxval=2, dtype=tf.float32)

bottom_losses_values, bottom_losses_indices = tf.nn.top_k(-losses, k=3)
total = tf.reduce_sum(-bottom_losses_values, axis=-1)

with tf.Session() as sess:
    losses, bottom_losses_values, bottom_losses_indices, total = sess.run(
        [losses, bottom_losses_values, bottom_losses_indices, total])
    print('original losses\n', losses)
    print('bottom 3 loss values\n', -bottom_losses_values)
    print('bottom 3 loss indices\n', bottom_losses_indices)
    print('total\n', total)
Results:
original losses
[[ 1.45301318 1.65069246 1.31003475 1.71488905 1.71400714 0.0543921 ]
[ 0.09954047 0.12081003 0.24793792 1.51561213 1.73758292 1.43859148]]
bottom 3 loss values
[[ 0.0543921 1.31003475 1.45301318]
[ 0.09954047 0.12081003 0.24793792]]
bottom 3 loss indices
[[5 2 0]
[0 1 2]]
total
[ 2.81744003 0.46828842]
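As a side note, here is a small sketch (assuming the same TF 1.x graph-mode setup as above) showing that the asker's argsort-then-slice idea also works if the unsupported fancy indexing is replaced by tf.gather, which accepts a tensor of indices:

import tensorflow as tf

k = 3
losses = tf.random.uniform(shape=[6], minval=0, maxval=2, dtype=tf.float32)

ind_sorted = tf.argsort(losses)                    # indices in ascending order
losses_sorted = tf.gather(losses, ind_sorted)      # instead of losses[ind_sorted]
k_smallest_sum = tf.reduce_sum(losses_sorted[:k])  # sum of the k smallest losses

with tf.Session() as sess:
    print(sess.run([losses, k_smallest_sum]))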

Accessing elements of a placeholder in tensorflow [duplicate]

I have a neural network with an MSE loss function implemented something like this:
# input x_ph is of size Nx1 and output should also be of size Nx1
def train_neural_network_batch(x_ph, predict=False):
    prediction = neural_network_model(x_ph)
    # MSE loss function
    cost = tf.reduce_mean(tf.square(prediction - y_ph))
    optimizer = tf.train.AdamOptimizer(learn_rate).minimize(cost)
    # mini-batch optimization here
I'm fairly new to neural networks and Python, but I understand that on each iteration, a sample of training points is fed into the neural network and the loss function is evaluated at the points in this sample. However, I would like to modify the loss function so that it weights certain data more heavily. Here is some pseudocode of what I mean:
# manually compute the MSE of the data without the first sampled element
cost = 0.0
for ii in range(1, len(y_ph)):
    cost += tf.square(prediction[ii] - y_ph[ii])
cost = cost / (len(y_ph) - 1.0)

# weight the first sampled data point more heavily according to some parameter W
cost += W * (prediction[0] - y_ph[0])
I might have more points I wish to weight differently as well, but for now, I'm just wondering how I can implement something like this in tensorflow. I know len(y_ph) is invalid as y_ph is just a placeholder, and I can't just do something like y_ph[i] or prediction[i].
You can do this in multiple ways:
1) If some of your data instances should simply be weighted 2 or 3 times more than a normal instance, you can just copy those instances multiple times in your data set. They will then carry more weight in the loss, which satisfies your intention. This is the simplest way.
2) If your weighting is more complex, say a float weight per instance, you can define a placeholder for the weights, multiply it into the loss, and use feed_dict to feed the weights in the session together with the x batch and y batch. Just make sure instance_weight has the same size as batch_size.
E.g.
import tensorflow as tf
import numpy as np

with tf.variable_scope("test", reuse=tf.AUTO_REUSE):
    x = tf.placeholder(tf.float32, [None, 1])
    y = tf.placeholder(tf.float32, [None, 1])
    instance_weight = tf.placeholder(tf.float32, [None, 1])

    w1 = tf.get_variable("w1", shape=[1, 1])
    prediction = tf.matmul(x, w1)

    cost = tf.square(prediction - y)
    loss = tf.reduce_mean(instance_weight * cost)
    opt = tf.train.AdamOptimizer(0.5).minimize(loss)

with tf.Session() as sess:
    x1 = [[1.], [2.], [3.]]
    y1 = [[2.], [4.], [3.]]
    instance_weight1 = [[10.0], [10.0], [0.1]]

    sess.run(tf.global_variables_initializer())
    print(x1)
    print(y1)
    print(instance_weight1)

    for i in range(1000):
        _, loss1, prediction1 = sess.run([opt, loss, prediction],
                                         feed_dict={instance_weight: instance_weight1, x: x1, y: y1})
        if (i % 100) == 0:
            print(loss1)
            print(prediction1)
NOTE: you can change instance_weight1 to see the difference (here batch_size is set to 3).
The first two data points follow the rule y = 2*x, whereas the third follows the rule y = x.
But with the weights set to [10, 10, 0.1], prediction1 converges to the y = 2*x rule and almost ignores the third point; the output is:
[[1.9823183]
[3.9646366]
[5.9469547]]
PS: in a TensorFlow graph, it is highly recommended not to use Python for loops over tensor elements, but to use matrix operations instead so the computation can be parallelized.
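Applied to the question's pseudocode, here is a minimal sketch (placeholder names taken from the question; the weight values are hypothetical, e.g. [[W], [1.], [1.], ...]) of the same idea without a Python loop:

import tensorflow as tf

prediction = tf.placeholder(tf.float32, [None, 1])       # stand-in for the network output
y_ph = tf.placeholder(tf.float32, [None, 1])
instance_weight = tf.placeholder(tf.float32, [None, 1])  # first entry W, remaining entries 1

# weighted MSE: the first sample is weighted by W, the rest by 1
cost = tf.reduce_mean(instance_weight * tf.square(prediction - y_ph))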

Tensorflow: Choosing a range of columns in each row from a Tensor

I would like to choose only particular columns in each row of a tensor, to use it for an RNN:
seq_len = [11, 12, 20, 30]  # the sequence lengths, assume 4 sequences
array = tf.ones([4, 30])    # the array I want to index from
function(array, seq_len)    # apply the required function
Output = (first 11 elements from row 0, first 12 from row 1, first 20 from row 2, first 30 from row 3), perhaps obtained as a flat tensor.
You can use tf.sequence_mask and tf.boolean_mask to get them flattened:
mask = tf.sequence_mask(seq_len, MAX_LENGTH)  # replace MAX_LENGTH with the size of array's last dimension, 30 in your case
output = tf.boolean_mask(array, mask=mask)
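For example, a self-contained sketch (assuming TF 1.x with sessions, as in the question) of those two lines applied to the asker's example:

import tensorflow as tf

seq_len = [11, 12, 20, 30]
array = tf.ones([4, 30])

mask = tf.sequence_mask(seq_len, 30)        # boolean mask of shape (4, 30)
output = tf.boolean_mask(array, mask=mask)  # flat tensor of length sum(seq_len) = 73

with tf.Session() as sess:
    print(sess.run(output).shape)           # (73,)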
A tensor in TensorFlow can be sliced just like a NumPy array and then concatenated into one tensor, assuming you measure the sequence length from the first element.
Use [row_idx, column_idx] to slice the tensor: slice = array[0, :] assigns the first row to slice.
flat_slices = tf.concat([slice, slice], axis=0) will then flatten them into one tensor.
import tensorflow as tf

seq_len = [11, 12, 20, 30]
array = tf.ones([4, 30])

init = tf.global_variables_initializer()
with tf.Session() as sess:
    init.run()
    flatten = array[0, :seq_len[0]]
    for i in range(1, len(seq_len)):
        row = array[i, :seq_len[i]]
        flatten = tf.concat([flatten, row], axis=0)
    print(sess.run(flatten))

How can I compare columns for equality in a matrix-multiplication manner?

I am using Keras (with TensorFlow as the backend). What I want to do is write a Lambda layer that takes 2 tensor inputs, compares every combination of rows of the two tensors using an indicator function, and produces a new tensor with 0-1 values. Here is an example.
Input:
x = K.variable(np.array([[1,2,3],[2,3,4]]))
y = K.variable(np.array([[1,2,3],[2,3,4]]))
Output:
z = K.variable(np.array([[1,0],[0,1]]))
As far as I know, tensorflow provides tf.equal() to compare tensors element-wise. But if I apply it here, I get:
>>> z=tf.equal(x,y)
>>> K.eval(z)
array([[True, True, True],
[True, True, True]], dtype=bool)
It only compares elements in the same position.
So my questions are:
1. Is there a tensorflow API that produces my desired output, or do I need to write my own function?
2. If the latter, there is another problem: in Keras the input is a mini-batch, so the input shape looks like (None, m, n). When writing my own method, how can I handle the first dimension, which is None?
Any reply would be appreciated!
You could use broadcasting.
import numpy as np
import tensorflow as tf

x = tf.constant(np.array([[1,2,3],[2,3,4]]))
y = tf.constant(np.array([[1,2,3],[2,3,4]]))

x_ = tf.expand_dims(x, 0)  # shape (1, 2, 3)
y_ = tf.expand_dims(y, 1)  # shape (2, 1, 3)
res = tf.reduce_all(tf.equal(x_, y_), axis=-1)  # shape (2, 2), True where rows match

sess = tf.Session()
print(sess.run(res))  # [[ True False] [False  True]]; use tf.cast for 0-1 values
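Regarding the second part of the question (the None batch dimension), here is a hedged sketch of the same broadcasting trick for batched inputs of shape (None, m, n): the leading batch dimension is left untouched and expand_dims is applied to the row axes instead. The placeholder shapes (m = 2, n = 3) are only illustrative.

import numpy as np
import tensorflow as tf

x_b = tf.placeholder(tf.float64, [None, 2, 3])
y_b = tf.placeholder(tf.float64, [None, 2, 3])

x_e = tf.expand_dims(x_b, 1)  # shape (None, 1, m, n)
y_e = tf.expand_dims(y_b, 2)  # shape (None, m, 1, n)
res_b = tf.reduce_all(tf.equal(x_e, y_e), axis=-1)  # shape (None, m, m)

batch = np.array([[[1, 2, 3], [2, 3, 4]]], dtype=np.float64)  # batch of size 1
with tf.Session() as sess:
    print(sess.run(res_b, feed_dict={x_b: batch, y_b: batch}))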