I am trying to print and log a custom metric (dice score) for all classes on the validation set during training. I want Keras to calculate the custom metric on the validation set after each epoch. My current program works, but I have to use some tricks that ultimately cause memory problems during training.
The issue is printing and logging the dice scores of all classes: the calculations are done on tensors, which I am unable to print. I can't use eager mode due to some compatibility issues with TensorFlow 2.0, so I am forced to initialize another session.
My custom metrics class is given below:
class Metrics(tf.keras.callbacks.Callback):
    def on_train_begin(self, logs={}):
        self.val_lv = []
        self.val_rk = []
        self.val_lk = []
        self.val_sp = []

    def on_epoch_end(self, batch, logs={}):
        layer_name = 'loss6'
        self.intermediate_layer_model = tf.keras.models.Model(inputs=self.model.input,
                                                              outputs=self.model.get_layer(layer_name).output)
        for batch_index in range(0, len(self.validation_data)):
            temp_targ = self.validation_data[batch_index][1][0]
            temp_targ = temp_targ.astype('float32')
            temp_predict = (np.asarray(self.intermediate_layer_model.predict(
                self.validation_data[batch_index][0]))).round()
            val_lvs = tf.reduce_mean((dice_coef(temp_targ[:, 1, :, :], temp_predict[:, 1, :, :])))
            val_rks = tf.reduce_mean(dice_coef(temp_targ[:, 2, :, :], temp_predict[:, 2, :, :]))
            val_lks = tf.reduce_mean(dice_coef(temp_targ[:, 3, :, :], temp_predict[:, 3, :, :]))
            val_sps = tf.reduce_mean(dice_coef(temp_targ[:, 4, :, :], temp_predict[:, 4, :, :]))
            self.val_lv.append(val_lvs)
            self.val_rk.append(val_rks)
            self.val_lk.append(val_lks)
            self.val_sp.append(val_sps)
        sess = tf.Session()
        print('liver-score:', sess.run(tf.reduce_mean(self.val_lv)))
        print('rk-score:', sess.run(tf.reduce_mean(self.val_rk)))
        print('lk-score:', sess.run(tf.reduce_mean(self.val_lk)))
        print('sp-score:', sess.run(tf.reduce_mean(self.val_sp)))
        logs['liver-score'] = sess.run(tf.reduce_mean(self.val_lv))
        logs['rk-score'] = sess.run(tf.reduce_mean(self.val_rk))
        logs['lk-score'] = sess.run(tf.reduce_mean(self.val_lk))
        logs['sp-score'] = sess.run(tf.reduce_mean(self.val_sp))
        sess.close()
        return
Note that the variables lv, rk, lk and sp are abbreviations for my class names.
Is there any alternative way to print and log the metrics without using a session?
As far as I understand, temp_targ and temp_predict are NumPy arrays, so the only reason you end up with tensors is that you are using tf.reduce_mean. You can replace it with np.mean. This will only work if dice_coef has no TensorFlow ops; if it does, you will have to replace them with NumPy functions. Once you do that, you won't have to open new sessions.
Also, instead of creating a new model at the end of every epoch (intermediate_layer_model), you can construct a Keras backend function using tf.keras.backend.function; see its documentation for more details.
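For illustration, here is a minimal sketch of how the callback could look without a session (my own rewrite, not the asker's code). It assumes dice_coef is, or is rewritten as, a pure NumPy function, that self.validation_data is populated as in the question, and it reuses the 'loss6' layer name and validation-data layout from the question:
import numpy as np
import tensorflow as tf

class Metrics(tf.keras.callbacks.Callback):
    def on_train_begin(self, logs={}):
        # build the intermediate-output function once instead of rebuilding a
        # Model at the end of every epoch
        self.get_loss6 = tf.keras.backend.function(
            [self.model.input], [self.model.get_layer('loss6').output])

    def on_epoch_end(self, epoch, logs={}):
        class_names = ['liver-score', 'rk-score', 'lk-score', 'sp-score']
        scores = {name: [] for name in class_names}
        for batch_index in range(len(self.validation_data)):
            temp_targ = self.validation_data[batch_index][1][0].astype('float32')
            temp_predict = np.asarray(
                self.get_loss6([self.validation_data[batch_index][0]])[0]).round()
            for class_idx, name in enumerate(class_names, start=1):
                # dice_coef is assumed here to be a NumPy implementation
                scores[name].append(np.mean(
                    dice_coef(temp_targ[:, class_idx, :, :],
                              temp_predict[:, class_idx, :, :])))
        for name in class_names:
            logs[name] = float(np.mean(scores[name]))
            print(name, ':', logs[name])
Because everything stays in NumPy, nothing needs to be evaluated in a separate session at the end of the epoch.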
My goal is to build a script to change an operation into another one using TF's graph editor. So far I tried making a script that just changes the input kernel weights of a Conv2D, but to no avail, as the interface is pretty confusing.
with tf.Session() as sess:
    model_filename = sys.argv[1]
    with gfile.FastGFile(model_filename, 'r') as f:
        graph_def = graph_pb2.GraphDef()
        text_format.Merge(f.read(), graph_def)
        importer.import_graph_def(graph_def)

    #my_sgv = ge.sgv("Conv2D", graph=tf.get_default_graph())
    #print my_sgv

    convs = find_conv2d_ops(tf.get_default_graph())
    print convs
    my_sgv = ge.sgv(convs)
    print my_sgv

    conv_tensor = tf.get_default_graph().get_tensor_by_name(convs[0].name + ':0')
    conv_weights_input = tf.get_default_graph().get_tensor_by_name(convs[0].inputs[1].name)
    weights_new = tf.Variable(tf.truncated_normal([1, 1, 1, 8], stddev=0.03),
                              name='Wnew')
    ge.graph_replace(conv_tensor, {conv_weights_input: weights_new})
The error is "input needs to be a Tensor: ". Can someone please provide some insights?
Since you are dealing with a tf.Variable, you don't need to use the graph editor; tf.assign will be sufficient.
You can use it like the following:
assign_op = tf.assign(conv_weights_input, weights_new)
with tf.Session() as sess:
    sess.run(assign_op)
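One small caveat (my addition, not part of the answer above): weights_new is itself a freshly created tf.Variable, so it has to be initialised before the assign can read its value, for example:
with tf.Session() as sess:
    sess.run(weights_new.initializer)  # initialise the new variable first
    sess.run(assign_op)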
If you are looking to sub out operations and not weights, consider the following example (modified from this example):
import tensorflow as tf
import tensorflow.contrib.graph_editor as ge

def build():
    a_pl = tf.placeholder(dtype=tf.float32, name="a")
    b_pl = tf.placeholder(dtype=tf.float32, name="b")
    c = tf.add(a_pl, b_pl, name="c")

build()  # or load graph from disk

a = tf.constant(1.0, shape=[2, 3], name="a_const")
b = tf.constant(2.0, shape=[2, 3], name="b_const")
a_pl = tf.get_default_graph().get_tensor_by_name("a:0")
b_pl = tf.get_default_graph().get_tensor_by_name("b:0")
c = tf.get_default_graph().get_tensor_by_name("c:0")
c_ = ge.graph_replace(c, {a_pl: a, b_pl: b})

with tf.Session() as sess:
    # no need for placeholders
    print(sess.run(c_))
    # will give an error since a_pl and b_pl have no value
    print(sess.run(c))
The issue with your code is that you're dealing with weights, not tensors. The crux of the above example is that the first argument is the target (output) tensor that has the to-be-replaced tensors as dependencies; the second argument is a mapping from the tensors you want to replace to their replacements.
It's also worth noting that conv_weights_input is actually a tensor, whereas weights_new is a tf.Variable. I believe what you want is to replace weights_new with a new conv operation with random weight initialisation.
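For illustration only, here is a minimal sketch of swapping a Conv2D's kernel input tensor for a plain random tensor via graph_replace. The op name is hypothetical (imported graphs get an "import/" prefix), the kernel shape is taken from the question, and a tensor rather than a tf.Variable is used so the mapping stays tensor-to-tensor:
import tensorflow as tf
import tensorflow.contrib.graph_editor as ge

graph = tf.get_default_graph()
conv_op = graph.get_operation_by_name("import/Conv2D")  # hypothetical op name
conv_out = conv_op.outputs[0]
old_kernel = conv_op.inputs[1]  # the kernel input tensor of the conv
# plain random tensor (not a Variable); shape [1, 1, 1, 8] as in the question
new_kernel = tf.truncated_normal([1, 1, 1, 8], stddev=0.03, name="Wnew")
new_conv_out = ge.graph_replace(conv_out, {old_kernel: new_kernel})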
I want to know whether the TensorFlow operations in this link have a gradient defined. I am asking because I am implementing a custom loss function, and when I run it I always get this error:
ValueError: An operation has `None` for gradient. Please make sure that all of your ops have a gradient defined (i.e. are differentiable). Common ops without gradient: K.argmax, K.round, K.eval.
This is my custom Loss function:
def calculate_additional_loss(y_true, y_pred):
    # additional loss
    x_decoded_normalized = original_dim * y_pred
    #y_true = K.print_tensor(y_true, message='y_true = ')
    #y_pred = K.print_tensor(y_pred, message='y_pred = ')
    error = tf.constant(0, dtype=tf.float32)
    additional_loss = tf.constant(0, dtype=tf.float32)
    final_loss = tf.constant(0, dtype=tf.float32)
    for k in range(batch_size):
        # add padding
        reshaped_elem_1 = K.reshape(x_decoded_normalized[k], [DIM, DIM])

        a = K.reshape(reshaped_elem_1[:, DIM - 1], [DIM, 1])
        b = K.reshape(reshaped_elem_1[:, 1], [DIM, 1])
        reshaped_elem_1 = tf.concat([b, reshaped_elem_1], axis=1)
        reshaped_elem_1 = tf.concat([reshaped_elem_1, a], axis=1)

        c = K.reshape(reshaped_elem_1[DIM - 1, :], [1, DIM + 2])
        d = K.reshape(reshaped_elem_1[1, :], [1, DIM + 2])
        reshaped_elem_1 = tf.concat([d, reshaped_elem_1], axis=0)
        reshaped_elem_1 = tf.concat([reshaped_elem_1, c], axis=0)

        for (i, j) in range(reshaped_elem_1.shape[0], reshaped_elem_1.shape[1]):
            error = tf.add(error,
                           tf.pow((reshaped_elem_1[i, j] - reshaped_elem_1[i, j + 1]), -2),
                           tf.pow((reshaped_elem_1[i, j] - reshaped_elem_1[i, j - 1]), -2),
                           tf.pow((reshaped_elem_1[i, j] - reshaped_elem_1[i - 1, j]), -2),
                           tf.pow((reshaped_elem_1[i, j] - reshaped_elem_1[i + 1, j]), -2))
        additional_loss = tf.add(additional_loss, tf.divide(error, original_dim))
        final_loss += tf.divide(additional_loss, batch_size)
    print('final_loss', final_loss)
    return final_loss
And this is where I am calling it:
models = (encoder, decoder)
additional_loss = calculate_additional_loss(inputs,outputs)
vae.add_loss(additional_loss)
vae.compile(optimizer='adam')
vae.summary()
plot_model(vae,to_file='vae_mlp.png',show_shapes=True)
vae.fit(x_train, epochs=epochs, batch_size=batch_size, validation_data=(x_test, None), verbose = 1, callbacks=[CustomMetrics()])
Thank you in advance.
Most ops have a defined gradient. There are some ops for which a gradient is not defined, and the error message you get gives you some examples.
Having said that, there are a couple of mistakes I see in your code:
final_loss is defined as a tf.constant, but you are trying to increment it.
You are trying to unpack a tuple (i, j) from range(), which yields single integers.
error is defined as a tf.constant, but you are trying to increment it.
Don't use a Python for loop over batch_size in this way; use TensorFlow functions to handle the batch dimension directly (see the sketch below). The loop only proliferates graph nodes.
The way you have written your code suggests you are thinking of TensorFlow as pure Python. It is not: you define the graph and then execute it inside a session, so inside the loss function use TF ops only to define the computations.
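As an illustration only, here is a minimal vectorized sketch of the neighbour-difference term (my own rewrite, not the asker's code). It assumes y_pred has shape [batch_size, DIM*DIM] and that DIM, original_dim and batch_size are the same globals used in the question; the padding and the pow(-2) terms mirror the original loop:
def calculate_additional_loss_vectorized(y_true, y_pred):
    x = original_dim * y_pred
    x = tf.reshape(x, [-1, DIM, DIM])
    # pad one column on each side (column 1 on the left, column DIM-1 on the
    # right), then one row on each side, mirroring the concat-based padding above
    x = tf.concat([x[:, :, 1:2], x, x[:, :, DIM - 1:DIM]], axis=2)
    x = tf.concat([x[:, 1:2, :], x, x[:, DIM - 1:DIM, :]], axis=1)
    centre = x[:, 1:-1, 1:-1]
    right = x[:, 1:-1, 2:]
    left = x[:, 1:-1, :-2]
    up = x[:, :-2, 1:-1]
    down = x[:, 2:, 1:-1]
    # the four neighbour terms from the question, computed for every pixel at once
    error = (tf.pow(centre - right, -2) + tf.pow(centre - left, -2) +
             tf.pow(centre - up, -2) + tf.pow(centre - down, -2))
    per_sample = tf.reduce_sum(error, axis=[1, 2]) / original_dim
    # mean over the batch replaces the explicit division by batch_size
    return tf.reduce_mean(per_sample)
Because there is no Python loop, the graph size no longer grows with batch_size, and every op used here has a defined gradient.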
I created a char-level language generation model with TensorFlow here. I used the tf.placeholder API, which according to the Google docs:
Feeding is the least efficient way to feed data into a TensorFlow program.
So I decided to change my code and replace it with the new TensorFlow Dataset API.
I used from_generator to create Dataset:
dataset = tf.data.Dataset.from_generator(gen, (tf.int32, tf.int32),
                                          (tf.TensorShape([None, None]),
                                           tf.TensorShape([None, None])))
self.iterator = dataset.make_initializable_iterator()
self.inp, self.target = self.iterator.get_next()
As can be seen in the above code, I used [None, None] for the TensorShapes to give the model more generality. During training everything is perfectly fine, but at inference some problems arise. With the tf.placeholder API, I used the following code to generate characters:
def inference(self):
    converter = utils.TextReader(filename=FLAGS.CONVERTER_PATH)
    with tf.Session() as sess:
        sess.run(tf.global_variables_initializer())
        samples = []
        new_state = sess.run(self.init_state)
        c = 12  # random starting token
        samples.append(c)
        for i in range(1000):
            x = np.zeros((1, 1))
            x[0, 0] = c
            feed_dict = {
                self.inp: x,
                self.init_state: new_state
            }
            preds, new_state = sess.run([self.prediction, self.final_state], feed_dict=feed_dict)
            c = utils.pick_top_n(preds, converter.vocab_size)
            samples.append(c)
        samples = np.array(samples)
        print(converter.arr_to_text(samples))
With the Dataset API, I don't have a tf.placeholder to feed my previous character into, and when I use the above code, as expected, the following error happens:
InvalidArgumentError (see above for traceback): ConcatOp : Dimensions of inputs should match: shape[0] = [1,50] vs. shape[1] = [32,50]
At inference, the model uses the same input shape ([32,50]) that I used for training, which is not what I want (I defined TensorShape([None, None]) precisely to handle this, but it does not work).
How can I fix this issue with the new Dataset API?
Complete code.
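For what it's worth, one common workaround is to build a second, placeholder-backed dataset for inference and switch between the two datasets with a feedable iterator; a minimal sketch under my own assumptions (not taken from the linked code), where gen and the [None, None] shapes come from the question:
train_dataset = tf.data.Dataset.from_generator(
    gen, (tf.int32, tf.int32),
    (tf.TensorShape([None, None]), tf.TensorShape([None, None])))

# inference dataset built from placeholders, yielding one element at a time
infer_inp = tf.placeholder(tf.int32, [None, None])
infer_target = tf.placeholder(tf.int32, [None, None])
infer_dataset = tf.data.Dataset.from_tensors((infer_inp, infer_target))

handle = tf.placeholder(tf.string, [])
iterator = tf.data.Iterator.from_string_handle(
    handle, train_dataset.output_types, train_dataset.output_shapes)
inp, target = iterator.get_next()  # replaces self.inp, self.target from the question

train_iterator = train_dataset.make_initializable_iterator()
infer_iterator = infer_dataset.make_initializable_iterator()

with tf.Session() as sess:
    train_handle, infer_handle = sess.run(
        [train_iterator.string_handle(), infer_iterator.string_handle()])
    # training steps feed {handle: train_handle}
    # at generation time, re-initialize the inference iterator with one character
    x = np.zeros((1, 1), dtype=np.int32)
    sess.run(infer_iterator.initializer,
             feed_dict={infer_inp: x, infer_target: x})
    # preds = sess.run(prediction, feed_dict={handle: infer_handle, ...})
This keeps the [None, None] shapes intact, so the graph is not tied to the training batch shape at inference.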
I have created a sliding window algorithm using numpy that slides over a wav audio file and feeds slices of it to my NN in tensorflow, which detects features in the audio slices. Once tensorflow does its thing, it returns its output to numpy land, where I reassemble the slices into an array of predictions that match each sample position of the original file:
import tensorflow as tf
import numpy as np
import nn
def slide_predict(layers, X, modelPath):
    output = None
    graph = tf.Graph()
    with graph.as_default():
        input_layer_size, hidden_layer_size, num_labels = layers

        X_placeholder = tf.placeholder(tf.float32, shape=(None, input_layer_size), name='X')
        Theta1 = tf.Variable(nn.randInitializeWeights(input_layer_size, hidden_layer_size), name='Theta1')
        bias1 = tf.Variable(nn.randInitializeWeights(hidden_layer_size, 1), name='bias1')
        Theta2 = tf.Variable(nn.randInitializeWeights(hidden_layer_size, num_labels), name='Theta2')
        bias2 = tf.Variable(nn.randInitializeWeights(num_labels, 1), name='bias2')
        hypothesis = nn.forward_prop(X_placeholder, Theta1, bias1, Theta2, bias2)

        sess = tf.Session(graph=graph)
        saver = tf.train.Saver()
        init = tf.global_variables_initializer()
        sess.run(init)
        saver.restore(sess, modelPath)

        window_size = layers[0]

        pad_amount = (window_size * 2) - (X.shape[0] % window_size)
        X = np.pad(X, (pad_amount, 0), 'constant')

        for w in range(window_size):
            start = w
            end = -window_size + w
            X_shifted = X[start:end]
            X_matrix = X_shifted.reshape((-1, window_size))

            prediction = sess.run(hypothesis, feed_dict={X_placeholder: X_matrix})
            output = prediction if (output is None) else np.hstack((output, prediction))

        sess.close()

    output.shape = (X.size, -1)

    return output
Unfortunately, this algorithm is quite slow. I placed some logs along the way and by far the slowest portion is the part where I actually run my tensorflow graph. This could be due to the actual tensorflow calculations being slow (if so, I'm probably just SOL), but I am wondering if a large part of the slowness isn't because I am transferring large audio files repeatedly back and forth in and out of tensorflow. So my questions are:
1) Is feeding a placeholder repeatedly like this going to be noticeably slower than feeding it once and calculating the values for X inside TensorFlow?
2) If yes, what's the best way to implement a sliding window algorithm inside TensorFlow to do this calculation?
The first issue is that your algorithm has quadratic time complexity in window_size, because you call np.hstack() in each iteration to build the output array, which copies both the current contents of output and prediction into a new array:
for w in range(window_size):
    # ...
    output = prediction if (output is None) else np.hstack((output, prediction))
Instead of calling np.hstack() in every iteration, it would be more efficient to build a Python list of the prediction arrays, and call np.hstack() on them once, after the loop terminates:
output_list = []

for w in range(window_size):
    # ...
    prediction = sess.run(...)
    output_list.append(prediction)

output = np.hstack(output_list)
The second issue is that feeding large values to TensorFlow can be inefficient if the amount of computation in the sess.run() call is small, because those values are (currently) copied into C++ (and the results are copied back out). One useful strategy is to move the sliding-window loop into the TensorFlow graph, using the tf.map_fn() construct. For example, you could restructure your program as follows:
# NOTE: If you call this function often, you may want to (i) move the `np.pad()`
# into the graph as `tf.pad()`, and (ii) replace `X_t` with a placeholder.
X = np.pad(X, (pad_amount, 0), 'constant')
X_t = tf.convert_to_tensor(X)

def window_func(w):
    start = w
    end = w - window_size
    X_matrix = tf.reshape(X_t[start:end], (-1, window_size))
    return nn.forward_prop(X_matrix, Theta1, bias1, Theta2, bias2)

output_t = tf.map_fn(window_func, tf.range(window_size))

# ...
output = sess.run(output_t)
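One detail worth flagging (my assumption, not part of the original answer): tf.map_fn infers its output dtype from elems, and here tf.range(window_size) is int32 while window_func returns float32, so the dtype may need to be passed explicitly:
output_t = tf.map_fn(window_func, tf.range(window_size), dtype=tf.float32)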
I am trying to make a generative RNN model in TensorFlow. What is annoying me is that, with the new switch to state_is_tuple being True by default in the RNN library, I am having a hard time finding the best way to save state between batches. I know I can change it back to False, but I don't want to because it is deprecated. When I am done with training I need to be able to preserve the hidden states between calls to session.run, since I will be generating the sequences one sample at a time. I figured out that I can return the state of the RNN as follows.
rnn = tf.nn.rnn_cell.MultiRNNCell(cells)
zero_state = rnn.zero_state(batch_size, tf.float32)
output, final_state = tf.nn.dynamic_rnn(rnn, self.input_sound, initial_state = zero_state)
sess = tf.Session()
sess.run(tf.initialize_all_variables())
state_output = sess.run(final_state, feed_dict = {self.input_sound: np.zeros((64, 32, 512))})
This would be great, but the issue emerges when I want to pass state_output back into the model. Since a placeholder can only be a tensor object, I can't feed the state_output tuple back in.
I am looking for a very generic solution. The RNN could be a MultiRNNCell or a single LSTMCell or any other combination imaginable.
I think I figured it out. I used the following code to flatten the state tuples into a single 1D tensor. I can then chop it up when I pass it back into the model, according to the size specification of the RNN cell.
def flatten_state_tupel(x):
    result = []
    for x_ in x:
        if isinstance(x_, tf.Tensor) or not hasattr(x_, '__iter__'):
            result.append(x_)
        else:
            result.extend(flatten_state_tupel(x_))
    return result

def pack_state_tupel(state):
    return tf.concat(0, [tf.reshape(s, (-1,)) for s in flatten_state_tupel(state)])

def unpack_state_tupel(state, size):
    state = tf.reshape(state, (-1, tf.reduce_sum(flatten_state_tupel(size))))
    def _make_state_tupel(sz, i):
        if hasattr(sz, '__iter__'):
            result = []
            for s in sz:
                # advance the offset i as each sub-state is consumed
                i, y = _make_state_tupel(s, i)
                result.append(y)
            return i, (tf.nn.rnn_cell.LSTMStateTuple(*result)
                       if isinstance(sz, tf.nn.rnn_cell.LSTMStateTuple) else tuple(result))
        else:
            return i + sz, state[..., i:i + sz]
    return _make_state_tupel(size, 0)[-1]
I use the functions as follows.
rnn = tf.nn.rnn_cell.MultiRNNCell(cells)
zero_state = pack_state_tupel(rnn.zero_state(batch_size, tf.float32))
self.initial_state = tf.placeholder_with_default(zero_state, None)
output, final_state = tf.nn.dynamic_rnn(rnn, self.input_sound, initial_state = unpack_state_tupel(self.initial_state, rnn.state_size))
packed_state = pack_state_tupel(final_state)
sess = tf.Session()
sess.run(tf.initialize_all_variables())
state_output = sess.run(packed_state, feed_dict = {self.input_sound: np.zeros((64, 32, 512))})
print(state_output.shape)
state_output = sess.run(packed_state, feed_dict = {self.input_sound: np.zeros((64, 32, 512)), self.initial_state: np.zeros(state_output.shape[0])})
print(state_output)
This way it zeroes the state if I do not pass anything (which will be the case during training), but I can save and pass the state between batches during generation.