How to drop the last row and last column of a tensor using Keras/TensorFlow

Let's say I have a tensor of shape (None, 2, 56, 56, 256). Now I want a tensor of shape (None, 2, 55, 55, 256), obtained by dropping the last column and last row. How can I achieve this using Keras/TensorFlow?

In TensorFlow we can slice tensors using Python slice notation. So, given a tensor X with shape (20, 2, 56, 56, 256), say (as you have described, but with a batch size of 20), we can easily slice it to take all but the last 'row' along axes 2 and 3 as follows:
X[:,:,:-1,:-1,:]
Note the use of :-1 to denote "everything before the last 'row'".
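For example, a quick check of the shapes (a minimal sketch, assuming TensorFlow 2.x eager mode):

import tensorflow as tf

X = tf.zeros((20, 2, 56, 56, 256))
Y = X[:, :, :-1, :-1, :]
print(Y.shape)  # (20, 2, 55, 55, 256)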
Given this know-how about slicing tensors in TensorFlow, we just need to adapt it for Keras. We could, of course, write a full-blown custom layer implementing this (or possibly even find one out there that someone else has written - I've not looked, but slicing is common enough that I suspect someone has written something somewhere!).
However, for something as simple as this, I'd advocate just using a Lambda layer which we can define as follows:
from tensorflow.keras.layers import Lambda

my_slicing_layer = Lambda(lambda x: x[:, :, :-1, :-1, :], name='slice')
And we can use it in our Keras models as normal:
from tensorflow.keras.layers import Activation
from tensorflow.keras.models import Sequential

my_model = Sequential([
    Activation('relu', input_shape=(2, 56, 56, 256)),
    my_slicing_layer
])
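For a quick sanity check, here is a minimal sketch (assuming TensorFlow 2.x) confirming the sliced output shape:

import tensorflow as tf

x = tf.random.normal((4, 2, 56, 56, 256))  # batch of 4
y = my_model(x)
print(y.shape)  # expected: (4, 2, 55, 55, 256)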

Related

Using tf.where() and tf.gather_nd() with None dimension

I am tackling a machine learning problem in which I feed my network with data of shape (batch_size, n_objects, n_features). So, each training instance comes with a given number of objects, each of them having a given number of features. Among these features I have electric charge, and while writing a custom loss function I would like to use only the neutral objects to compute it. Thus, starting from a tensor of shape (batch_size, n_objects, n_features) I would like to get a tensor of shape (batch_size, n_neutral_objects, n_features). In doing this, I'm facing a couple of problems.
First of all, I made a try by creating a tensor by hand. I have 3 training instances, each one having 2 objects, each one having 3 features. I try to get the neutral objects using the tf.where() and tf.gather_nd() methods in the following way (suppose that electric charge is the 2nd feature):
a = tf.constant([[[3.5, 0, 6], [2.1, 1, 2.9]], [[1.5, 1, 4.5], [2.0, 0, 4.2]], [[6.2, 0, 6.1], [4.8, 1, 3.4]]]) #toy input tensor
b = tf.where(a[:,:,1] == 0) #find neutral objects (charge is 2nd feature)
c = tf.gather_nd(a,b) #gather them
print(c)
This kind of works, as I get
tf.Tensor(
[[3.5 0.  6. ]
 [2.  0.  4.2]
 [6.2 0.  6.1]], shape=(3, 3), dtype=float32)
as an output, which are the desired objects. But I've somehow lost the first dimension, as I don't want a tensor of shape (3, 3), but rather one of shape (3, 1, 3), namely still 3 input instances, each one having only one neutral object, each of them having 3 features.
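For reference, a minimal sketch of why the batch dimension disappears: tf.where applied to a rank-2 boolean mask returns one (row, column) index pair per match, so tf.gather_nd collapses all matches into a single axis:

import tensorflow as tf

mask = tf.constant([[True, False], [False, True], [True, False]])
idx = tf.where(mask)
print(idx.shape)  # (3, 2): one (row, col) pair per True entry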
Things get worse if I plug my approach into my TF model. In this real-life case, my batch size is None and I am thus dealing with tensors of shape (None, 4000, 14) (so 4000 objects for each training instance, 14 features each). This is the code I tried
def get_neutrals(tensor):
    print("tensor.get_shape()", tensor.get_shape())
    charges = tensor[:, :, 4]  # charge is the 5th feature in this case
    print("charges.get_shape()", charges.get_shape())
    where_neutrals = tf.where(charges == 0)  # get the neutrals only
    print("where_neutrals.get_shape()", where_neutrals.get_shape())
    print("tf.gather_nd(tensor, where_neutrals).get_shape()", tf.gather_nd(tensor, where_neutrals).get_shape())
    return tf.gather_nd(tensor, where_neutrals)
and this is what I get printed if I call my method:
tensor.get_shape() (None, 4000, 14)
charges.get_shape() (None, 4000)
where_neutrals.get_shape() (None, 2)
tf.gather_nd(tensor, where_neutrals).get_shape() (None, 14)
The last two shapes are completely unexpected and I don't know why they look like this. Can anyone here help with this?
Thanks a lot, cheers,
F.

Looking for a pytorch function to repeat a vector

I am looking for a pytorch function that is similar to tf's tile function.
I saw that PyTorch used to have a tile function, but apparently it was removed.
An example of the functionality I am looking for:
Let's say I have a tensor of dimensions (1,1,1,1000), I want to repeat it several times so I get a (1,40,40,1000) tensor.
Torch tensors have a repeat() method, therefore:
import torch

a = torch.rand((1, 1, 1, 1000))
b = a.repeat(1, 40, 40, 1)
b.shape  # gives torch.Size([1, 40, 40, 1000])
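Since the repeated axes here all have size 1, a memory-friendlier alternative worth noting is torch.Tensor.expand, which returns a broadcast view rather than a copy (a sketch, valid as long as the result is only read, not written in place):

import torch

a = torch.rand((1, 1, 1, 1000))
c = a.expand(1, 40, 40, 1000)  # broadcast view, no data copied
print(c.shape)  # torch.Size([1, 40, 40, 1000])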

When the input shape is incompatible, what will tensorflow actually do?

Thanks for reading.
I train an LSTM predictor with a fixed input shape (None, 5, 2), then test the predictor with a smaller shape (None, 1, 2), and I get the warning:
WARNING:tensorflow:Model was constructed with shape (None, 5, 2) for input Tensor("input_1_1:0", shape=(None, 5, 2), dtype=float32), but it was called on an input with incompatible shape (None, 1, 2).
However, the results are fine.
I just wonder what TensorFlow actually does when this happens? Does it automatically pad with zeros to match the dimensions?
Again, thanks for reading, and I'm looking forward to an answer.
Tensor computations are executed as a TensorFlow graph - see https://www.tensorflow.org/guide/intro_to_graphs. Normally graph execution is faster.
The second dimension of the LSTM input (the time axis) is dynamic. In such cases Keras has to rebuild the graph every time the input shape changes, which is slow. If your input shape changes frequently, graph execution could end up slower than eager execution. That is why Keras issues the warning.
Keras does not pad your data.
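A minimal sketch (assuming TensorFlow 2.x) that reproduces the behaviour: the model is built for 5 time steps but still runs on 1, because the LSTM loops over however many steps it is given:

import numpy as np
import tensorflow as tf

inp = tf.keras.Input(shape=(5, 2))
out = tf.keras.layers.LSTM(8)(inp)
model = tf.keras.Model(inp, out)

# Triggers the incompatible-shape warning, but still produces output.
y = model.predict(np.zeros((3, 1, 2), dtype=np.float32))
print(y.shape)  # (3, 8)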

How to use a metric with three inputs (GAP metric) in Keras while training?

This is the GAP metric code from Kaggle:
import numpy as np
import pandas as pd

def GAP(pred, conf, true):
    x = pd.DataFrame({'pred': pred, 'conf': conf, 'true': true})
    x.sort_values('conf', ascending=False, inplace=True, na_position='last')
    x['correct'] = (x.true == x.pred).astype(int)
    x['prec_k'] = x.correct.cumsum() / (np.arange(len(x)) + 1)
    x['term'] = x.prec_k * x.correct
    gap = x.term.sum() / x.true.count()
    return gap
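For reference, a quick standalone call with made-up toy values:

import numpy as np

pred = np.array([1, 0, 2])
conf = np.array([0.9, 0.8, 0.3])
true = np.array([1, 1, 2])
print(GAP(pred, conf, true))  # (1/1 + 0 + 2/3) / 3 ≈ 0.556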
I want to use it while training, but it takes a conf argument - a vector of probability/confidence scores for the predictions - while Keras metrics must take only two arguments. Is there any way to use it like this:
model.compile(loss='my_loss', metrics=[GAP])
Yes, there is a way you can do this with a small tweak. Note that frameworks like Keras support loss functions and metrics of the form fun(true, pred); the function definition must have exactly this signature.
The second limitation is that the shapes of true and pred must be the same.
Tweaking the first limitation: concatenate the two output tensors into one. Suppose you have x output classes; then conf and pred each have shape (None, x). You can stack these two tensors into one, producing final_output with shape (None, 2, x).
Doing this is only the first step. It won't work unless we tweak the second limitation.
Now let us tweak the second limitation. It can be loosened to: "the ranks of both these tensors must be the same." Note that I am trying to reduce the requirement from exact shapes to numbers of dimensions. This can be done by having dynamic shapes; for example, shape(true) = (None, 1, x) and shape(pred) = (None, None, x) will not throw errors, as None can take any value at runtime. In short, add a layer at the end of the model that combines the outputs, and make sure that layer has a dynamic output shape.
But in your case, true will also have shape (None, x). You can just expand the dimensions of this tensor at axis=1 to get (None, 1, x), and the newly generated true can then be provided as the target when fitting the model.
Note that, as you are combining two tensors, final_output will always have shape (None, 2, x), which is not equal to (None, 1, x). But as we have configured the last layer to return a dynamic shape, i.e. (None, None, x), this will not be a problem at compile time. And Keras never checks for shape mismatch at runtime unless an operation on the tensors itself raises such an error.
Now that final_output has the same rank as true, you just need to slice final_output to get back the original two tensors, pred and conf, in your custom loss function and metrics.
The above is purely the logic. To see an example implementation, check out the layers and loss function here.
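As a rough illustration of the stack-and-slice idea (a sketch, not the linked implementation; the actual GAP computation is elided and all names here are made up):

import tensorflow as tf

# Stack the two (None, x) heads into a single (None, 2, x) output.
def combine_heads(tensors):
    pred, conf = tensors
    return tf.stack([pred, conf], axis=1)

# y_true arrives expanded to (None, 1, x); y_pred is (None, 2, x).
def gap_like_metric(y_true, y_pred):
    true = y_true[:, 0, :]
    pred = y_pred[:, 0, :]
    conf = y_pred[:, 1, :]
    # ... compute the GAP metric from true / pred / conf here ...
    matches = tf.argmax(pred, axis=-1) == tf.argmax(true, axis=-1)
    return tf.reduce_mean(tf.cast(matches, tf.float32))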

How to freeze/lock weights of one TensorFlow variable (e.g., one CNN kernel of one layer)

I have a TensorFlow CNN model that is performing well and we would like to implement this model in hardware; i.e., an FPGA. It's a relatively small network but it would be ideal if it were smaller. With that goal, I've examined the kernels and find that there are some where the weights are quite strong and there are others that aren't doing much at all (the kernel values are all close to zero). This occurs specifically in layer 2, corresponding to the tf.Variable() named, "W_conv2". W_conv2 has shape [3, 3, 32, 32]. I would like to freeze/lock the values of W_conv2[:, :, 29, 13] and set them to zero so that the rest of the network can be trained to compensate. Setting the values of this kernel to zero effectively removes/prunes the kernel from the hardware implementation thus achieving the goal stated above.
I have found similar questions with suggestions that generally revolve around one of two approaches:
Suggestion #1:
tf.Variable(some_initial_value, trainable = False)
Implementing this suggestion freezes the entire variable. I want to freeze just a slice, specifically W_conv2[:, :, 29, 13].
Suggestion #2:
Optimizer = tf.train.RMSPropOptimizer(0.001).minimize(loss, var_list)
Again, implementing this suggestion does not allow the use of slices. For instance, if I try the inverse of my stated goal (optimize only a single kernel of a single variable) as follows:
Optimizer = tf.train.RMSPropOptimizer(0.001).minimize(loss, var_list = W_conv2[:,:,0,0])
I get the following error:
NotImplementedError: ('Trying to optimize unsupported type ', <tf.Tensor 'strided_slice_2228:0' shape=(3, 3) dtype=float32>)
Slicing tf.Variables() isn't possible in the way that I've tried it here. The only thing I've tried which comes close to doing what I want is using .assign(), but this is extremely inefficient, cumbersome, and caveman-like, as I've implemented it as follows (after the model is trained):
for _ in range(10000):
    # get a new batch of data
    # reset the values of W_conv2[:, :, 29, 13] to 0 each time through
    for m in range(3):
        for n in range(3):
            assign_op = W_conv2[m, n, 29, 13].assign(0)
            sess.run(assign_op)
    # re-train the rest of the network
    _, loss_val = sess.run([optimizer, loss], feed_dict={
        dict_stuff_here
    })
    print(loss_val)
The model was started in Keras then moved to TensorFlow since Keras didn't seem to have a mechanism to achieve the desired results. I'm starting to think that TensorFlow doesn't allow for pruning but find this hard to believe; it just needs the correct implementation.
A possible approach is to initialize these specific weights with zeros, and modify the minimization process such that gradients won't be applied to them. It can be done by replacing the call to minimize() with something like:
W_conv2_weights = np.ones((3, 3, 32, 32), dtype=np.float32)
W_conv2_weights[:, :, 29, 13] = 0
W_conv2_weights_const = tf.constant(W_conv2_weights)

optimizer = tf.train.RMSPropOptimizer(0.001)

# tf.gradients returns a list; take the single gradient for W_conv2
W_conv2_orig_grads = tf.gradients(loss, [W_conv2])[0]
# zero the gradient entries of the pruned kernel, element-wise
W_conv2_grads = tf.multiply(W_conv2_weights_const, W_conv2_orig_grads)
W_conv2_train_op = optimizer.apply_gradients([(W_conv2_grads, W_conv2)])

rest_grads = tf.gradients(loss, rest_of_vars)
rest_train_op = optimizer.apply_gradients(zip(rest_grads, rest_of_vars))

train_op = tf.group(rest_train_op, W_conv2_train_op)
That is:
- Prepare a constant tensor for cancelling the appropriate gradients.
- Compute gradients only for W_conv2, multiply them element-wise with the constant W_conv2_weights_const to zero the appropriate entries, and only then apply the gradients.
- Compute and apply gradients "normally" to the rest of the variables.
- Group the two train ops into a single training op.
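For completeness, the same gradient-masking idea in TF2 style (a rough sketch under assumed names: model, loss_fn, x, y, and w_conv2 are placeholders for your own objects):

import numpy as np
import tensorflow as tf

# mask with zeros at the kernel to prune, ones elsewhere (same idea as above)
mask_np = np.ones((3, 3, 32, 32), dtype=np.float32)
mask_np[:, :, 29, 13] = 0
mask = tf.constant(mask_np)

optimizer = tf.keras.optimizers.RMSprop(0.001)

def train_step(x, y, model, loss_fn, w_conv2):
    with tf.GradientTape() as tape:
        loss = loss_fn(y, model(x, training=True))
    grads = tape.gradient(loss, model.trainable_variables)
    # zero the pruned kernel's gradient before applying updates
    grads = [g * mask if v is w_conv2 else g
             for g, v in zip(grads, model.trainable_variables)]
    optimizer.apply_gradients(zip(grads, model.trainable_variables))
    return loss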