Finding a variable-length loss mask in Keras/TensorFlow

I am trying to build a loss function that masks the output values once 'end of sequence' (EOS) is encountered.
Given a tensor of shape [BatchSize, MaxSequenceLength, OutputNodes], consider the example below:
batch size = 3
max sequence length = 4
output nodes = 3
predicted = [[[0.1,0.3,0.2],[0.4,0.6,0.8],[0.5,0.2,0.3],[0.0,0.0,0.99]],
[[0.1,0.3,0.2],[0.4,0.9,0.8],[0.5,0.2,0.9],[0.4,0.6,0.8]],
[[0.1,0.3,0.2],[0.4,0.9,0.8],[0.5,0.2,0.1],[0.4,0.6,0.1]]]
I am dedicating the last output node to signal 'end of sequence' (EOS); here that is node 2. Nodes are labelled 0, 1 and 2.
Based on the predicted values, I have to return a mask that locates the first occurrence of EOS.
In the above example,
the first row has the sequence (argmax) => 1,2,0,2
the second row has the sequence => 1,1,2,2
the third row has the sequence => 1,1,0,1
So my mask should be
[[1,0,0,0],
[1,1,0,0],
[1,1,1,1]]
The mask ensures that the values from the first EOS onward are ignored when calculating the loss.
Below is the code snippet I tried:
sequence_cluster_asign = keras.backend.argmax(sequence_values,axis=-1)
loss_mask = []
for seq in K.tf.unstack(sequence_cluster_asign):
    # append EOS to make sure tf.where is not empty
    seq = tf.concat([seq,endOfSequenceTensor],axis=0)
    endOfSequenceLocation = K.tf.where(K.tf.equal(seq,endOfSequence))[0][0]
    loss_mask.append(tf.sequence_mask(endOfSequenceLocation,max_decoder_seq_length,dtype=tf.float32))
final_mask = K.stack(loss_mask)
Error encountered: ValueError: Cannot infer num from shape (?,?)

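To get the mask described in your question, you can use the following method. It marks every step whose argmax is not the EOS node with 1, then takes a cumulative product along the sequence axis, so everything from the first EOS onward becomes 0.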
import tensorflow as tf
import keras
from keras import backend as K
sequence_values = K.placeholder(shape=(None, 4, 3))
sequence_cluster_asign = keras.backend.argmax(sequence_values,axis=-1)
# keras version
result = K.cast(K.less(sequence_cluster_asign,sequence_values.get_shape().as_list()[-1]-1),dtype='int32')
result = K.cumprod(result,axis=-1)
# tensorflow version
# result = tf.cast(tf.less(sequence_cluster_asign,sequence_values.get_shape().as_list()[-1]-1),dtype=tf.int32)
# result = tf.cumprod(result,axis=-1)
predicted = [[[0.1,0.3,0.2],[0.4,0.6,0.8],[0.5,0.2,0.3],[0.0,0.0,0.99]],
[[0.1,0.3,0.2],[0.4,0.9,0.8],[0.5,0.2,0.9],[0.4,0.6,0.8]],
[[0.1,0.3,0.2],[0.4,0.9,0.8],[0.5,0.2,0.1],[0.4,0.6,0.1]]]
with tf.Session() as sess:
    print(result.eval(feed_dict={sequence_values:predicted}))
Output:
[[1 0 0 0]
[1 1 0 0]
[1 1 1 1]]
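The cumulative-product logic can be sanity-checked in plain Python without a TensorFlow session. This is only a sketch: eos_mask and eos_index are names made up here for illustration and are not part of Keras or TensorFlow.

```python
def eos_mask(predicted, eos_index):
    """Return a 0/1 mask per sequence: 1 before the first EOS, 0 from EOS on."""
    masks = []
    for sequence in predicted:
        keep = 1  # running product of the "not EOS" flags
        row = []
        for step in sequence:
            # argmax over the output nodes of this time step
            label = max(range(len(step)), key=lambda i: step[i])
            if label == eos_index:
                keep = 0  # once EOS is seen, the product stays 0
            row.append(keep)
        masks.append(row)
    return masks


predicted = [[[0.1, 0.3, 0.2], [0.4, 0.6, 0.8], [0.5, 0.2, 0.3], [0.0, 0.0, 0.99]],
             [[0.1, 0.3, 0.2], [0.4, 0.9, 0.8], [0.5, 0.2, 0.9], [0.4, 0.6, 0.8]],
             [[0.1, 0.3, 0.2], [0.4, 0.9, 0.8], [0.5, 0.2, 0.1], [0.4, 0.6, 0.1]]]

mask = eos_mask(predicted, eos_index=2)
print(mask)  # [[1, 0, 0, 0], [1, 1, 0, 0], [1, 1, 1, 1]]
```

Note that the EOS step itself is masked out as well, which matches the mask requested in the question.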

How to extract variable values that equal a certain value (pyomo)?

I am building a routing optimization model using Pyomo in Python.
I have solved my model, but I am trying to extract the decision variable information. The model is binary, and the values I am looking for are the entries of my model.z decision variable that equal 1.
When I write instance.pprint() I get the following sample of output. I therefore want to code something that gives me only the decision variables that are equal to 1, such as z(1,4).
A sample of my code is shown below:
model.I = RangeSet(5)
model.J = RangeSet(5)
model.z = Var(model.I, model.J, domain = Binary)
def constraint(model, i):
    return sum(model.z[i,j] - model.z[j,i] for j in model.J if i != j) == 0
model.constraint = Constraint(model.I, rule=constraint)
print()
z_values = pd.Series(model.z[i,j].extract_values(), name = model.z.name)
print(z_values)
I have tried the above code, but as some of my values are 0 (because they have not been visited), I have been getting the following error message.
ValueError: Error retrieving component z[5,4]: The component has not been constructed.
Ideally the output should be something like this:
(0,3) -- 1
(1,2) -- 1
(2,4) -- 1
(3,1) -- 1
(4,5) -- 1
(5,0) -- 1
Any ideas?
This should work (and answer your other derivative question)
# value extract
import pyomo.environ as pyo
nodes = [1,2,3,4,5,6]
model = pyo.ConcreteModel()
model.N = pyo.Set(initialize=nodes)
model.Z = pyo.Var(model.N, model.N, domain=pyo.Binary, initialize=0) # only initializing here for demo...
# blah blah constraints & solve
# stuff in some fake results...
model.Z[1, 2] = 1
model.Z[2, 6] = 1
model.Z[3, 5] = 1
model.Z[6, 3] = 1
# model.display()
# make a dictionary of the route ...
# recall that binary "1" variables evaluate as True
route = {start: stop for (start, stop) in model.Z.index_set() if pyo.value(model.Z[start, stop])}
# print(route)
start_node = 1
print(f'from {start_node} ', end='')
while start_node in route.keys():
    end_node = route.get(start_node)
    print(f'-> {end_node} ' , end='')
    start_node = end_node
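One thing to watch with the loop above: if the solved route is a closed tour (it returns to the start node), the while loop never ends. Below is a minimal sketch of the same idea with a guard against that. It assumes the solved values were already copied into a plain dict keyed by (start, stop); follow_route and the sample values are hypothetical, and round() is used only to tolerate solver results such as 0.9999999.

```python
def follow_route(values, start_node):
    """Walk the arcs whose binary value is 1, starting at start_node."""
    # Keep only the arcs that are "on"; round() absorbs small solver noise.
    route = {i: j for (i, j), v in values.items() if round(v) == 1}
    path = [start_node]
    visited = {start_node}
    node = start_node
    while node in route:
        node = route[node]
        path.append(node)
        if node in visited:  # closed tour: stop instead of looping forever
            break
        visited.add(node)
    return path


# Hypothetical solved values, mirroring the fake results used above.
values = {(1, 2): 1.0, (2, 6): 1.0, (3, 5): 0.9999999, (6, 3): 1.0,
          (1, 3): 0.0, (5, 1): 0.0}
path = follow_route(values, start_node=1)
print(' -> '.join(str(n) for n in path))  # 1 -> 2 -> 6 -> 3 -> 5
```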

How to concatenate two tensors with intervals in tensorflow?

I want to concatenate two tensors in a checkerboard pattern ("checkerboard-ly") in TensorFlow 2, as in the examples shown below:
example 1:
a = [[1,1],[1,1]]
b = [[0,0],[0,0]]
concated_a_and_b = [[1,0,1,0],[0,1,0,1]]
example 2:
a = [[1,1,1],[1,1,1],[1,1,1]]
b = [[0,0,0],[0,0,0],[0,0,0]]
concated_a_and_b = [[1,0,1,0,1,0],[0,1,0,1,0,1],[1,0,1,0,1,0]]
Is there a decent way in tensorflow2 to concatenate them like this?
A bit of background for this:
I first split a tensor c with a checkerboard mask into two halves, a and b. After some transformation I have to concatenate them back into the original shape and order.
What I mean by checkerboard-ly:
Step 1: Generate a matrix with alternated values
You can do this by first concatenating into [1, 0] pairs, and then by applying a final reshape.
Step 2: Reverse some rows
I split the matrix into two parts, reverse each row of the second part, and then rebuild the full matrix by picking rows alternately from the first and second part.
Code sample:
import math
import numpy as np
import tensorflow as tf
a = tf.ones(shape=(3, 4))
b = tf.zeros(shape=(3, 4))
x = tf.expand_dims(a, axis=-1)
y = tf.expand_dims(b, axis=-1)
paired_ones_zeros = tf.concat([x, y], axis=-1)
alternated_values = tf.reshape(paired_ones_zeros, [-1, a.shape[1] + b.shape[1]])
num_samples = alternated_values.shape[0]
middle = math.ceil(num_samples / 2)
is_num_samples_odd = middle * 2 != num_samples
# Gather first part of the matrix, don't do anything to it
first_elements = tf.gather_nd(alternated_values, [[index] for index in range(middle)])
# Gather second part of the matrix and reverse its elements
second_elements = tf.reverse(tf.gather_nd(alternated_values, [[index] for index in range(middle, num_samples)]), axis=[1])
# Pick alternatively between first and second part of the matrix
indices = np.concatenate([[[index], [index + middle]] for index in range(middle)], axis=0)
if is_num_samples_odd:
    indices = indices[:-1]
output = tf.gather_nd(
    tf.concat([first_elements, second_elements], axis=0),
    indices
)
print(output)
I know this is not an efficient way, since it costs extra time and space, but it solves the above problem:
def concat(tf1, tf2):
    result = []
    for (index, (tf_item1, tf_item2)) in enumerate(zip(tf1, tf2)):
        item = []
        for (subitem1, subitem2) in zip(tf_item1, tf_item2):
            # even rows start with tf1, odd rows start with tf2
            if index % 2 == 0:
                item.append(subitem1)
                item.append(subitem2)
            else:
                item.append(subitem2)
                item.append(subitem1)
        result.append(item)
    return result
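To tie this back to the background in the question (split c with a checkerboard mask, then merge again), here is a plain-Python round-trip sketch. It assumes the mask selects positions where (row + column) is even for a and odd for b, and that c has an even number of columns; split_checkerboard and merge_checkerboard are illustrative names, not TensorFlow functions.

```python
def split_checkerboard(c):
    """Split matrix c into a (even row+col positions) and b (odd positions)."""
    a, b = [], []
    for r, row in enumerate(c):
        a.append([v for col, v in enumerate(row) if (r + col) % 2 == 0])
        b.append([v for col, v in enumerate(row) if (r + col) % 2 == 1])
    return a, b


def merge_checkerboard(a, b):
    """Interleave a and b so that even rows start with a and odd rows with b."""
    merged = []
    for r, (row_a, row_b) in enumerate(zip(a, b)):
        first, second = (row_a, row_b) if r % 2 == 0 else (row_b, row_a)
        row = []
        for x, y in zip(first, second):
            row.extend([x, y])
        merged.append(row)
    return merged


c = [[1, 2, 3, 4], [5, 6, 7, 8], [9, 10, 11, 12]]
a, b = split_checkerboard(c)
restored = merge_checkerboard(a, b)
print(restored == c)  # True
```

With a filled with ones and b with zeros, merge_checkerboard reproduces example 1 from the question.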

How should I append an element to each sequence data by tf.data.Dataset

I want to get sequence data with char2int['EOS'] appended to the end of each sequence using tf.data.Dataset.
The code I wrote is below:
import tensorflow as tf
def _get_generator(list_of_text, char2int):
    def gen():
        for text in list_of_text:
            yield [char2int[x] for x in text] # transform char to int
    return gen
def get_dataset(list_of_text, char2int):
    gen = _get_generator(list_of_text, char2int)
    dataset = tf.data.Dataset.from_generator(gen, (tf.int32), tf.TensorShape([None]))
    dataset = dataset.map(lambda seq: seq+[char2int['EOS']]) # append EOS to the end of line
    data_iter = dataset.make_initializable_iterator()
    return dataset, data_iter
char2int = {'EOS':1, 'a':2, 'b':3, 'c':4}
list_of_text = ['aaa', 'abc'] # the sequence data
with tf.Graph().as_default():
    dataset, data_iter = get_dataset(list_of_text, char2int)
    with tf.Session() as sess:
        sess.run(data_iter.initializer)
        tt1 = sess.run(data_iter.get_next())
        tt2 = sess.run(data_iter.get_next())
        print(tt1) # got [3 3 3] but I want [2 2 2 1]
        print(tt2) # got [3 4 5] but I want [2 3 4 1]
But I can't get what I want: it performs element-wise addition on each sequence. How should I fix it? Thanks.
In your map function you are adding 1 to each value instead of concatenating the EOS value. You can change your _get_generator to:
def _get_generator(list_of_text, char2int):
    def gen():
        for text in list_of_text:
            yield [char2int[x] for x in text] + [char2int['EOS']] # transform char to int, then append EOS
    return gen
and remove the dataset.map call.
As Vijay points out in his answer, the + operator on a tf.Tensor of type tf.int32 performs addition rather than concatenation. To concatenate an additional symbol onto the end of the sequence, instead use tf.concat() in the Dataset.map():
dataset = dataset.map(lambda seq: tf.concat([seq, [char2int['EOS']]], axis=0))
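The generator-based fix can be exercised on its own, before it is wrapped in a Dataset, because the + here acts on plain Python lists and therefore concatenates. A small sketch (make_generator and eos_token are illustrative names, not part of tf.data):

```python
def make_generator(list_of_text, char2int, eos_token='EOS'):
    """Build a generator function yielding int sequences with EOS appended."""
    def gen():
        for text in list_of_text:
            # list + list concatenates, unlike + on an int32 tensor
            yield [char2int[ch] for ch in text] + [char2int[eos_token]]
    return gen


char2int = {'EOS': 1, 'a': 2, 'b': 3, 'c': 4}
sequences = list(make_generator(['aaa', 'abc'], char2int)())
print(sequences)  # [[2, 2, 2, 1], [2, 3, 4, 1]]
```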

Padding Labels for Tensorflow CTC Loss?

I would like to pad my labels so that they are of equal length before being passed into the ctc_loss function. Apparently, -1 is not allowed. If I apply padding, should the padding value be part of the labels for CTC?
Update
I have this code that converts dense labels into sparse ones to be passed to the ctc_loss function which I think is related to the problem.
def dense_to_sparse(dense_tensor, out_type):
    indices = tf.where(tf.not_equal(dense_tensor, tf.constant(0, dense_tensor.dtype)))
    values = tf.gather_nd(dense_tensor, indices)
    shape = tf.shape(dense_tensor, out_type=out_type)
    return tf.SparseTensor(indices, values, shape)
Actually, -1 values are allowed in the y_true argument of ctc_batch_cost, with one limitation: they must not appear within the actual label "content", which is specified by label_length (the i-th label's "content" occupies indices 0 through label_length[i] - 1).
So it is perfectly fine to pad labels with -1 so that they are of equal length, as you intended. The only thing you have to take care of is to calculate and pass the corresponding label_length values correctly.
Here is sample code, a modified version of the test_ctc unit test from Keras:
import numpy as np
from tensorflow.keras import backend as K
number_of_categories = 4
number_of_timesteps = 5
labels = np.asarray([[0, 1, 2, 1, 0], [0, 1, 1, 0, -1]])
label_lens = np.expand_dims(np.asarray([5, 4]), 1)
# dimensions are batch x time x categories
inputs = np.zeros((2, number_of_timesteps, number_of_categories), dtype=np.float32)
input_lens = np.expand_dims(np.asarray([5, 5]), 1)
k_labels = K.variable(labels, dtype="int32")
k_inputs = K.variable(inputs, dtype="float32")
k_input_lens = K.variable(input_lens, dtype="int32")
k_label_lens = K.variable(label_lens, dtype="int32")
res = K.eval(K.ctc_batch_cost(k_labels, k_inputs, k_input_lens, k_label_lens))
It runs perfectly fine even with -1 as the last element of the second label sequence, because the corresponding (second) label_lens item specifies that its length is 4.
If we change that length to 5, or change some other label value to -1, we get the "All labels must be nonnegative integers" exception that you mentioned. But this just means that our label_lens is invalid.
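If the padded labels are all you have, the lengths can be derived from them. A minimal sketch in plain Python, assuming -1 is used only as trailing padding (label_lengths is a hypothetical helper, not a Keras function):

```python
def label_lengths(padded_labels, pad_value=-1):
    """Count the real labels in each row, stopping at the first pad value."""
    lengths = []
    for row in padded_labels:
        n = 0
        for v in row:
            if v == pad_value:
                break
            n += 1
        lengths.append([n])  # one column per row, i.e. shape (batch, 1)
    return lengths


labels = [[0, 1, 2, 1, 0], [0, 1, 1, 0, -1]]
label_lens = label_lengths(labels)
print(label_lens)  # [[5], [4]]
```

The nested lists mirror the (batch, 1) shape produced by np.expand_dims in the sample above.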
Here's how I do it. I have a dense tensor labels that includes padding with -1, so that all targets in a batch have the same length. Then I use
labels_sparse = dense_to_sparse(labels, sparse_val=-1)
where
def dense_to_sparse(dense_tensor, sparse_val=0):
    """Inverse of tf.sparse_to_dense.
    Parameters:
        dense_tensor: The dense tensor. Duh.
        sparse_val: The value to "ignore": Occurrences of this value in the
                    dense tensor will not be represented in the sparse tensor.
                    NOTE: When/if later restoring this to a dense tensor, you
                    will probably want to choose this as the default value.
    Returns:
        SparseTensor equivalent to the dense input.
    """
    with tf.name_scope("dense_to_sparse"):
        sparse_inds = tf.where(tf.not_equal(dense_tensor, sparse_val),
                               name="sparse_inds")
        sparse_vals = tf.gather_nd(dense_tensor, sparse_inds,
                                   name="sparse_vals")
        dense_shape = tf.shape(dense_tensor, name="dense_shape",
                               out_type=tf.int64)
        return tf.SparseTensor(sparse_inds, sparse_vals, dense_shape)
This creates a sparse tensor of the labels, which is what you need to put into the ctc loss. That is, you call tf.nn.ctc_loss(labels=labels_sparse, ...) The padding (i.e. all values equal to -1 in the dense tensor) is simply not represented in this sparse tensor.
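To see what ends up in the sparse tensor, here is a plain-Python sketch of the same filtering. It only mimics the indices/values/shape triple and is not TensorFlow code; dense_to_sparse_parts is a made-up name.

```python
def dense_to_sparse_parts(dense, pad_value=-1):
    """Return (indices, values, shape) for all entries not equal to pad_value."""
    indices, values = [], []
    for r, row in enumerate(dense):
        for c, v in enumerate(row):
            if v != pad_value:
                indices.append((r, c))
                values.append(v)
    shape = (len(dense), len(dense[0]) if dense else 0)
    return indices, values, shape


labels = [[0, 1, 2, 1, 0], [0, 1, 1, 0, -1]]
indices, values, shape = dense_to_sparse_parts(labels)
print(values)  # [0, 1, 2, 1, 0, 0, 1, 1, 0]
print(shape)   # (2, 5)

# Ignoring 0 instead of -1 drops the real label 0 and keeps the padding.
_, values_ignoring_zero, _ = dense_to_sparse_parts(labels, pad_value=0)
print(values_ignoring_zero)  # [1, 2, 1, 1, 1, -1]
```

The last call shows why the ignored value matters when 0 is also a valid label.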

How to keep calculated values in a Tensorflow graph (on the GPU)?

How can we make sure that a calculated value will not be copied back to CPU/python memory, but is still available for calculations in the next step?
The following code obviously doesn't do it:
import tensorflow as tf
a = tf.Variable(tf.constant(1.),name="a")
b = tf.Variable(tf.constant(2.),name="b")
result = a + b
stored = result
with tf.Session() as s:
    val = s.run([result,stored],{a:1.,b:2.})
    print(val) # 3
    val=s.run([result],{a:4.,b:5.})
    print(val) # 9
    print(stored.eval()) # 3 NOPE:
Error : Attempting to use uninitialized value _recv_b_0
The answer is to keep the value in a tf.Variable, writing to it with the assign operation.
Working code:
import tensorflow as tf
with tf.Session() as s:
    a = tf.Variable(tf.constant(1.),name="a")
    b = tf.Variable(tf.constant(2.),name="b")
    result = a + b
    stored = tf.Variable(tf.constant(0.),name="stored_sum")
    assign_op=stored.assign(result)
    val,_ = s.run([result,assign_op],{a:1.,b:2.})
    print(val) # 3
    val=s.run(result,{a:4.,b:5.})
    print(val) # 9 (a single fetch returns a scalar, so no indexing)
    print(stored.eval()) # ok, still 3