In Tensorflow, how do I generate a scalar summary? - tensorflow

Does anyone have a minimal example of using a SummaryWriter with a scalar_summary in order to see (say) a cross entropy result during a training run?
The example given in the documentation:
merged_summary_op = tf.merge_all_summaries()
summary_writer = tf.train.SummaryWriter('/tmp/mnist_logs', sess.graph_def)
total_step = 0
while training:
    total_step += 1
    session.run(training_op)
    if total_step % 100 == 0:
        summary_str = session.run(merged_summary_op)
        summary_writer.add_summary(summary_str, total_step)
When I run it, this raises an error:
TypeError: Fetch argument None of None has invalid type , must be a string or Tensor. (Can not convert a NoneType into a Tensor or Operation.)
If I add a:
tf.scalar_summary('cross entropy', cross_entropy)
operation after my cross entropy calculation, then instead I get the error:
InvalidArgumentError: You must feed a value for placeholder tensor 'Placeholder_2' with dtype float
Which suggests that I need to add a feed_dict to the
summary_str = session.run(merged_summary_op)
call, but I am not clear what that feed_dict should contain.

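The feed_dict should contain the same values that you use for running the training_op. It specifies the input values to your network for which you want to calculate the summaries. In other words, pass the same dictionary as the feed_dict argument of session.run(merged_summary_op, feed_dict=...) that you pass when running training_op.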

The error is probably coming from:
session.run(training_op)
Did you paste the example code into a version of the mnist code that requires a feed_dict for feeding in training examples? Check the backtrace it gave you (and include it above if that doesn't solve the problem).

Related

Passing random value in tensorflow function as a parameter

I have this code in my tf.data augmentation pipeline:
# BLUR
filter_size = tf.random.uniform(shape=[], minval=0, maxval=5)
image = tfa.image.mean_filter2d(image, filter_shape=filter_size)
But I keep getting this error:
TypeError: The `filter_shape` argument must be a tuple of 2 integers. Received: Tensor("filter_shape:0", shape=(), dtype=int32)
I tried getting a static value from the random tensor like this:
# BLUR
filter_size = tf.get_static_value(tf.random.uniform(shape=[], minval=0, maxval=5))
image = tfa.image.mean_filter2d(image, filter_shape=filter_size)
But then I get this error:
TypeError: The `filter_shape` argument must be a tuple of 2 integers. Received: None
These errors make me sad :(
I want to create an augmentation pipeline for tf.data, by the way.

You should specify an output shape. However, when I did that I ran into another error, which hints that the shape passed to mean_filter2d must not be a Tensor. Therefore, I decided to simply use Python's random module to generate a random tuple for modifying your image.
import random
import tensorflow_addons as tfa
filter_size = tuple(random.randrange(0, 5) for _ in range(2))
image_blur = tfa.image.mean_filter2d(image, filter_shape=filter_size)
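As a hedged sketch of the workaround above: the helper below (random_filter_shape is a hypothetical name, not part of tensorflow_addons) draws the two sizes as plain Python ints, which is the kind of value the error message asks for. The lower bound of 1 is an assumption, since a zero-sized filter is unlikely to be meaningful. Also note that values drawn with Python's random module inside a function passed to dataset.map are, as far as I know, evaluated once when the function is traced, so every image would then get the same shape.

```python
import random

def random_filter_shape(min_size=1, max_size=5, rng=random):
    """Return a (height, width) tuple of plain Python ints.

    min_size and max_size are illustrative defaults, not values
    required by tfa.image.mean_filter2d.
    """
    # randint includes both endpoints, so sizes fall in [min_size, max_size].
    return tuple(rng.randint(min_size, max_size) for _ in range(2))

# Seeded generator so the example is reproducible.
shape = random_filter_shape(rng=random.Random(0))
print(shape)  # a tuple of two ints, usable as filter_shape
```

The resulting tuple can be passed as filter_shape in place of the tensor-valued filter_size from the question.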

Tensorflow loss function no gradient provided

I am currently trying to write my own loss function, but when returning the result (a tensor that consists of a list with the loss values) I get the following error:
ValueError: No gradients provided for any variable: ['conv2d/kernel:0', 'conv2d/bias:0', 'conv2d_1/kernel:0', 'conv2d_1/bias:0', 'dense/kernel:0', 'dense/bias:0', 'dense_1/kernel:0', 'dense_1/bias:0', 'dense_2/kernel:0', 'dense_2/bias:0'].
However, the tutorials and the docs also use tf.reduce_mean, and when I use it the way they do (they showed how to write an MSE loss function) I don't get the error, so it seems that I am missing something.
My code:
gl = tfa.losses.GIoULoss()
def loss(y_true, y_pred):
    batch_size = y_true.shape[0]
    # now contains 32 lists (a batch) of bbxs -> shape is (32, 7876)
    bbx_true = y_true.numpy()
    # now contains 32 lists (a batch) of bbxs here we have to double access [0] in order to get the entry itself
    # -> shape is (32, 1, 1, 7876)
    bbx_pred = y_pred.numpy()
    losses = []
    curr_true = []
    curr_pred = []
    for i in range(batch_size):
        curr_true = bbx_true[i]
        curr_pred = bbx_pred[i][0][0]
        curr_true = [curr_true[x:x+4] for x in range(0, len(curr_true), 4)]
        curr_pred = [curr_pred[x:x+4] for x in range(0, len(curr_pred), 4)]
        if len(curr_true) == 0:
            curr_true.append([0., 0.,0.,0.])
        curr_loss = gl(curr_true, curr_pred)
        losses.append(curr_loss)
    return tf.math.reduce_mean(losses, axis=-1)
Basically, I want to achieve bounding box regression, and because of that I want to use the GIoULoss loss function. Because my model outputs 7896 neurons (the maximum number of bounding boxes I want to predict according to my training set, times 4) and the GIoULoss function needs its input as an array of lists with 4 elements each, I have to perform this transformation.
How do I have to change my code so that it also builds up a gradient?

NumPy doesn't provide autograd functions, so you need to use TensorFlow tensors exclusively in your loss (otherwise the gradient is lost during backpropagation). So avoid using .numpy() and use TensorFlow operators and slicing on TensorFlow tensors instead (for example, tf.reshape can group a flat vector into rows of 4).

Strange output of Conv2D in tflite graph

I have a tflite graph, a fragment of which is depicted in the attached picture.
I needed to debug its behavior, and already at the first step I got quite puzzling results.
When I feed a tensor of zeros as input, I expect the output of the first Conv2D to consist only of the Conv2D bias values (since all kernel elements get multiplied by zeros), but instead I got a tensor of seemingly random data. Here is the code snippet:
def test_graph(path=PATH_DEFAULT):
    interp = tf.lite.Interpreter(path)
    interp.allocate_tensors()
    input_details = interp.get_input_details()
    in_idx = input_details[0]['index']
    zeros = np.zeros(shape=(1, 256, 256, 3), dtype=np.float32)
    interp.set_tensor(in_idx, zeros)
    interp.invoke()
    # index of output of first conv2d operator is 3 (see netron pic)
    after_conv_2d = interp.get_tensor(3)
    # shape of bias is just [count of output channels]
    n, h, w, c = after_conv_2d.shape
    # if we feed zeros as input, we can expect that the only values we get are the values of bias
    # since all kernel elems in that case are multiplied by zeros
    uniq_vals_cnt = len(np.unique(after_conv_2d))
    assert uniq_vals_cnt <= c, f"There are {uniq_vals_cnt} in output, should be <= than {c}"
output:
AssertionError: There are 287928 in output, should be <= than 24
Can someone help me with my misunderstanding?

It seems my assumption that I can get any intermediate tensor from the interpreter is wrong. We can do that only for outputs, even though the interpreter does not raise an error and even returns tensors of the right shape for indices of non-output tensors.
One way to debug such a graph would be to make all tensors outputs, but it seems the easiest way to do that would be converting the tflite file to pb with toco and then converting the pb back to tflite with new outputs specified. This approach is not ideal, though, because toco support for tflite -> pb conversion was removed after 1.9, and using earlier versions can break (in my case it does break) on some graphs.
More on this here:
tflite: get_tensor on non-output tensors gives random values

How do I create multiple custom AUC metrics, one for each of the outputs, in TensorFlow?

In TensorFlow 2.0, there's the class tf.keras.metrics.AUC. It can easily be added to the list of metrics of the compile method as follows.
# Example taken from the documentation
model.compile('sgd', loss='mse', metrics=[tf.keras.metrics.AUC()])
However, in my case, the output of my neural network is an NxM tensor, where N is the batch size and M is the number of separate outputs. I would like to compute the AUC metric for each of these M outputs separately (across all N instances of the batch). So, there should be M AUC metrics, each of them is computed with N observations. I tried to create a custom metric, but I am facing some issues. The following is my first attempt.
def get_custom_auc(output):
    auc = tf.metrics.AUC()
    @tf.function
    def custom_auc(y_true, y_pred):
        y_true = y_true[:, output]
        y_pred = y_pred[:, output]
        auc.update_state(y_true, y_pred)
        return auc.result()
    custom_auc.__name__ = "custom_auc_" + str(output)
    return custom_auc
The need to rename custom_auc.__name__ is described in the following post: Is it possible to have a metric that returns an array (or tensor) rather than a number?. However, this implementation raises an error.
tensorflow.python.framework.errors_impl.InvalidArgumentError: assertion failed: [predictions must be >= 0] [Condition x >= y did not hold element-wise:] [x (strided_slice_1:0) = ] [3.14020467 3.06779885 2.86414027...] [y (Cast_1/x:0) = ] [0]
[[{{node metrics/custom_auc_2/StatefulPartitionedCall/assert_greater_equal/Assert/AssertGuard/else/_161/Assert}}]] [Op:__inference_keras_scratch_graph_5149]
I have also tried to create the AUC object inside custom_auc, but this is not possible because I am using @tf.function, so I get the error ValueError: tf.function-decorated function tried to create variables on non-first call. Even if I remove the @tf.function (which I may need because I may use some if-else statements inside the implementation), I get another error
tensorflow.python.framework.errors_impl.FailedPreconditionError: Error while reading resource variable _AnonymousVar33 from Container: localhost. This could mean that the variable was uninitialized. Not found: Resource localhost/_AnonymousVar33/N10tensorflow3VarE does not exist.
[[node metrics/custom_auc_0/add/ReadVariableOp (defined at /train.py:173) ]] [Op:__inference_keras_scratch_graph_5174]
Note that, currently, I am adding these AUC metrics, one for each of the M outputs, as described in this answer. Furthermore, I cannot simply return the object auc, because apparently Keras expects the output of the custom metric to be a tensor and not an AUC object. So, if you do that, you get the following error.
TypeError: To be compatible with tf.contrib.eager.defun, Python functions must return zero or more Tensors; in compilation of .custom_auc at 0x1862e6680>, found return value of type , which is not a Tensor.
I've also tried to implement a custom metric class as follows.
class CustomAUC(tf.metrics.Metric):
    def __init__(self, num_outputs, name="custom_auc", **kwargs):
        super(CustomAUC, self).__init__(name=name, **kwargs)
        assert num_outputs >= 1
        self.num_outputs = num_outputs
        self.aucs = [tf.metrics.AUC() for _ in range(self.num_outputs)]
    def update_state(self, y_true, y_pred, sample_weight=None):
        for output in range(self.num_outputs):
            y_true1 = y_true[:, output]
            y_pred1 = y_pred[:, output]
            self.aucs[output].update_state(y_true1, y_pred1)
    def result(self):
        return [auc.result() for auc in self.aucs]
However, I am currently getting the error
ValueError: Shapes (200,) and () are incompatible
This error seems to be related to reset_states, so maybe I should also override this method. In fact, if I override reset_states with the following implementation
def reset_states(self):
    for auc in self.aucs:
        auc.reset_states()
I don't get this error anymore, but I get another error
tensorflow.python.framework.errors_impl.InvalidArgumentError: assertion failed: [predictions must be >= 0] [Condition x >= y did not hold element-wise:] [x (strided_slice_1:0) = ] [-1.38822043 1.24234951 -0.254447281...] [y (Cast_1/x:0) = ] [0]
[[{{node metrics/custom_auc/PartitionedFunctionCall/assert_greater_equal/Assert/AssertGuard/else/_98/Assert}}]] [Op:__inference_keras_scratch_graph_5248]
So, how do I implement this custom AUC metric, one for each of the M outputs of the network? Basically, I want to do something similar to the solution described in this answer, but with the AUC metric.
I have also opened a related issue on TensorFlow's GitHub issue tracker.
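Both assertion failures above report "predictions must be >= 0" with values such as 3.14 and -1.38, which suggests the metric is receiving raw logits, whereas tf.keras.metrics.AUC expects predictions in [0, 1]. The sketch below is plain Python rather than a Keras metric: it only illustrates the intended computation (squash each column with a sigmoid, then compute one AUC per output column over the N rows). The names sigmoid, pairwise_auc and per_output_auc are hypothetical, and the exact pairwise AUC used here is not the threshold-based approximation that Keras computes.

```python
import math

def sigmoid(x):
    """Map a raw logit to a probability in (0, 1)."""
    return 1.0 / (1.0 + math.exp(-x))

def pairwise_auc(labels, scores):
    """Exact AUC: the fraction of (positive, negative) pairs in which the
    positive example gets the higher score; ties count as half a win."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        return float("nan")  # undefined when one class is missing
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))

def per_output_auc(y_true, logits):
    """Return M AUC values, one per column of the N x M inputs."""
    num_outputs = len(y_true[0])
    results = []
    for j in range(num_outputs):
        col_true = [row[j] for row in y_true]
        col_prob = [sigmoid(row[j]) for row in logits]
        results.append(pairwise_auc(col_true, col_prob))
    return results

# Made-up example data: N = 4 rows, M = 2 outputs.
y_true = [[1, 0], [0, 1], [1, 0], [0, 1]]
logits = [[3.1, -1.4], [-0.5, 1.2], [2.9, 0.3], [0.1, -0.2]]
print(per_output_auc(y_true, logits))  # [1.0, 0.75]
```

In a Keras setting, the same per-column slicing would happen inside the metric, with the predictions passed through a sigmoid (or the model's last layer changed accordingly) before they reach the AUC object.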

I have a problem similar to yours. I have a model with 3 outputs, and I want to compute a custom metric (ConfusionMatricMetric) for the 3 outputs (which each have a different number of classes). I used the solution from https://keras.io/guides/customizing_what_happens_in_fit/ (the "Going lower level" section). My problem now is that I can't train the model because of
ValueError: tf.function-decorated function tried to create variables on non-first call.
Then I used
tf.config.run_functions_eagerly(True)
and now the model trains, very slowly, but it can be saved.
P.S. I also used tf.keras.metrics.KLDivergence() instead of my custom metric and reproduced the same experiment with the same results as above: trained and saved (tf.saved_model.save).

How to accomplish elif in tensorflow?

I am trying to implement if... elif... elif... else... in TensorFlow, but some errors occurred. Then I tried tf.cond, but it only handles a single branch.
labels is defined as a placeholder, it is a tensor that needs to be fed when training. The range of labels and newlogits is [0,27], but when computing accuracy, I want to map the labels and the logits to [0,3].
def tower_acc(logits, labels, batch_size):
    newlogits=tf.argmax(logits,1)
    resultlabels =[]
    resultlogits =[]
    for i in range(batch_size):
        if labels[i]<=4:
            tmplabel=0
        elif 5<labels[i]<=9:
            tmplabel=1
        elif 10<labels[i]<=14:
            tmplabel=2
        else:
            tmplabel=3
        resultlabels.append(tmplabel)
    for i in range(batch_size):
        if newlogits[i]<=4:
            tmplogit=0
        elif 5<newlogits[i]<=9:
            tmplogit=1
        elif 10<newlogits[i]<=14:
            tmplogit=2
        else:
            tmplogit=3
        resultlogits.append(tmplogit)
    correct_pred = tf.equal(resultlogits, resultlabels)
    accuracy = tf.reduce_mean(tf.cast(correct_pred, tf.float32))
    return accuracy
The error is the following:
raise TypeError("Using a tf.Tensor as a Python bool is not allowed. "
TypeError: Using a tf.Tensor as a Python bool is not allowed. Use if t is not None: instead of if t: to test if a tensor is defined, and use TensorFlow ops such as tf.cond to execute subgraphs conditioned on the value of a tensor.

You have to review TensorFlow basics.
As the error says, you cannot treat TensorFlow tensors as Python booleans. labels[i]<=4 is a (boolean) TensorFlow tensor. Think of it as a pointer into your TensorFlow graph: it doesn't have a value by itself (in your case, its value obviously depends on the placeholder you feed). Another problem with your code is that TensorFlow doesn't support the a<x<b notation (you would need tf.logical_and for that).
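To see why both the plain if and the chained comparison fail, here is a toy sketch in plain Python. SymbolicValue and SymbolicBool are hypothetical stand-ins for a graph tensor, not TensorFlow classes; the only assumption carried over is that a comparison result refuses conversion to a Python bool. Python evaluates a < x <= b as (a < x) and (x <= b), and the `and` forces exactly that conversion.

```python
class SymbolicBool:
    """Toy comparison result with no concrete truth value (not a TensorFlow class)."""
    def __bool__(self):
        raise TypeError("a symbolic boolean cannot be used as a Python bool")

class SymbolicValue:
    """Toy stand-in for a tensor whose value is only known when the graph runs."""
    def __lt__(self, other):
        return SymbolicBool()
    def __le__(self, other):
        return SymbolicBool()
    def __gt__(self, other):
        return SymbolicBool()
    def __ge__(self, other):
        return SymbolicBool()

def raises_type_error(fn):
    """Return True if calling fn() raises TypeError."""
    try:
        fn()
    except TypeError:
        return True
    return False

x = SymbolicValue()

# A single comparison only builds a symbolic result; nothing fails yet.
comparison = x <= 4
print(type(comparison).__name__)  # SymbolicBool

# Using the result as a condition asks Python for its truth value.
print(raises_type_error(lambda: 0 if x <= 4 else 3))  # True

# The chained form expands to (5 < x) and (x <= 9); `and` calls bool().
print(raises_type_error(lambda: 5 < x <= 9))  # True
```

In TensorFlow itself, the element-wise replacement for the chained form is tf.logical_and applied to the two separate comparisons, as noted above.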
While in principle it is possible to nest tf.cond operations by using an inner tf.cond within the false_fn of an outer tf.cond, your entire approach to remapping integers is inappropriate: by using a for loop and ifs, you are trying to force the GPU to work serially.
Instead, define a lookup table with 28 elements, mapping each integer to 0, 1, 2 or 3 and use 'tf.gather' to map all of the labels from their 28-class representation to a 4-class representation. This mapping can be done at the same time for all of the labels, no loops needed.
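A minimal sketch of the lookup-table idea in plain Python, so the mapping itself can be checked without a session. The bucket boundaries (0-4, 5-9, 10-14, 15-27) are an assumption about what the question intended, and the names LOOKUP, remap and remapped_accuracy are hypothetical. In TensorFlow, the list indexing in remap would be replaced by a single tf.gather call on a constant table, applied to both the labels and tf.argmax(logits, 1).

```python
NUM_CLASSES = 28  # labels and predictions lie in [0, 27]

def build_lookup_table():
    """Build a 28-element table mapping each original class to 0, 1, 2 or 3.

    The boundaries are assumed from the question, not confirmed by it.
    """
    table = []
    for label in range(NUM_CLASSES):
        if label <= 4:
            table.append(0)
        elif label <= 9:
            table.append(1)
        elif label <= 14:
            table.append(2)
        else:
            table.append(3)
    return table

LOOKUP = build_lookup_table()

def remap(values):
    """Plain-Python counterpart of gathering LOOKUP at the given indices."""
    return [LOOKUP[v] for v in values]

def remapped_accuracy(predictions, labels):
    """Fraction of positions where prediction and label share a coarse class."""
    coarse_pred = remap(predictions)
    coarse_true = remap(labels)
    matches = sum(p == t for p, t in zip(coarse_pred, coarse_true))
    return matches / len(coarse_true)

print(remap([0, 4, 5, 9, 10, 14, 15, 27]))  # [0, 0, 1, 1, 2, 2, 3, 3]
print(remapped_accuracy([3, 7, 20, 11], [4, 12, 27, 13]))  # 0.75
```

Because the table is indexed for the whole batch at once, no per-element loop or conditional is needed in the graph.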