How are tensors added in TensorFlow? - tensorflow

I am trying to understand some basics about tensorflow by reading this tutorial here: https://www.guru99.com/tensor-tensorflow.html
What I cannot understand is why when running these commands:
# Add
tensor_a = tf.constant([[1,2]], dtype = tf.int32)
tensor_b = tf.constant([[3, 4]], dtype = tf.int32)
tensor_add = tf.add(tensor_a, tensor_b)
print(tensor_add)
I get this result:
Tensor("Add:0", shape=(1, 2), dtype=int32)
I did the calculation on paper and when adding these 2 vectors, I get something complete different(4,6), why is that? What is a "tensor" anyway?

A "tensor" in TensorFlow is a computational object. What you get with tf.add is a NODE that adds its inputs, tensor_a and tensor_b - which is what you're seeing with Tensor("Add:0") (the :0 is its form of 'id'). This node, however, does nothing until executed - it's just "there" (see below). To execute, run
with tf.Session() as sess: # 'with' ensures computing resources are
print(sess.run(tensor_add)) # properly handled, but isn't required
I suggest you check out some starter tutorials, as TF isn't exactly intuitive - e.g. here. Good luck

Related

How can I see values in tensor object? How can we see what's going on inside tensor object?

Why can't I see values in the tensorflow object? I don't know what values are going in object and how to see them. Seeing values in objects will solve my problem. I am finding tensorflow difficult because you can't see what's going on inside objects.
I have tried tf.Print() but it is not working
How can I see "predict_op" value? I don't know what is inside it. It is really important for me to see the values.
predict_op = tf.argmax(Z3, 1) #Will return max value column index.
correct_prediction = tf.equal(predict_op, tf.argmax(Y, 1))
# Calculate accuracy on the test set
accuracy = tf.reduce_mean(tf.cast(correct_prediction, "float"))
train_accuracy = accuracy.eval({X: X_train, Y: Y_train})
test_accuracy = accuracy.eval({X: X_test, Y: Y_test})
print("Train Accuracy:", train_accuracy)
print("Test Accuracy:", test_accuracy)
Also if I run below code it gives error because I don't know what "tf.argmax(Y, 1)" is giving me.
con = tf.confusion_matrix(labels=tf.argmax(Y, 1),
predictions=tf.argmax(Z3, 1))
sess = tf.Session()
with sess.as_default():
print(sess.run(con))
In tensorflow, a tensor is, roughly, something that has a shape, a numerical representation in some curcumstances. Namely, a variable is a tensor and a tf.matmul produces a tensor, and a tf.placeholder is a tensor. All of them have a shape, but act drastically different when it comes to "what is a value of a tensor question?".
A variable once initialized always has a value - that is what we all are familiar with. A tensor like tf.matmul is an operation. Operations only describe what should be done with it's inputs. Operations only have value once you provide an input (or an input of an input, if op depends on another op). They are like functions, that descrive what to do, but you can never tell what is the ouput without providing an input. Placeholders, while still being a tensor, never have a value at all.
That said, if you, for example, want to debug a line tf.matmul(a, b) you must go on with running next code:
a_mul_b_op = tf.matmul(a, b)
a, b, a_mul_b = sess.run([a, b, a_mul_b_op], {x: input_x, y: input_y, etc: etc})
print(a, b, a_mul_b)
If you would like to read a value of variable (variables persist in memory in between calls to sess.run unlike operational tensors) you can go for either of next 2 ways that are equivalent:
print(var_conv42.eval())
print(sess.run([var_conv42]))
You probably need to go through the Introduction to TensorFlow article to understand how TensorFlow works. But here's a brief summary.
Define-by-run vs define-then-run
A TensorFlow program doesn't execute like a normal python script. A python scripts are define-by-run programs, meaning anything once defined you can change/see values. However TensorFlow programs are define-then-run. TensorFlow first builds a computational graph and then executes parts of/whole graph using a Session object. More info in the linke above.
Solving the problem with your code
If you want to see the value of predict_op you need to feed in the inputs/placeholders required to compute that particular tensor. For example say (I don't know how you are computing Z3 so I am assuming a simple computation),
X1 = tf.placeholder(…)
X2 = tf.placeholder(…)
Z3 = X1 + X2
predict_op = tf.argmax(Z3, 1)
Then you need to do the following to get the value of predict_op,
sess.run(predict_op, feed_dict={X1:<value>, X2:<value>})

How to properly use tf.metrics.mean_iou in Tensorflow to show confusion matrix on Tensorboard?

I found evaluation script in Tensorflow official implementation of DeeplabV3+ (eval.py) uses tf.metrics.mean_iou to update mean IOU, and adds it to Tensorboard for record.
tf.metrics.mean_iou actually returns 2 tensors, one is calculated mean IOU, the other is an opdate_op, and according to official doc (doc), confusion matrix. It seems every time if you want to get calculated mean_iou, you have to call that update_op first.
I am trying to add this update_op into summary as a tensor, but it does not work. My question is how to add this confusion matrix into Tensorboard?
I saw some other threads on how to calculate confusion matrix and add it to Tensorboard, with extra operations. I just would like to know if one can do this without those extra operations.
Any help would be appreciated.
I will post my answer here since someone upvoted it.
Let's say you defined mean_iou op in the following manner:
miou, update_op = tf.metrics.mean_iou(
predictions, labels, dataset.num_of_classes, weights=weights)
tf.summary.scalar(predictions_tag, miou)
If you see your graph in Tensorboard, you will find there is a node named 'mean_iou', and after expanding this node, you will find there is an op called 'total_confucion_matrix'. This is what you will need to calculate recall and precision for each class.
After you get the node name, you can add it to your tensorboard via tf.summary.text or print in your terminal bytf.print function. An example is posted below:
miou, update_op = tf.metrics.mean_iou(
predictions, labels, dataset.num_of_classes, weights=weights)
tf.summary.scalar(predictions_tag, miou)
# Get the correct tensor name of confusion matrix, different graphs may vary
confusion_matrix = tf.get_default_graph().get_tensor_by_name('mean_iou/total_confusion_matrix:0')
# Calculate precision and recall matrix
precision = confusion_matrix / tf.reshape(tf.reduce_sum(confusion_matrix, 1), [-1, 1])
recall = confusion_matrix / tf.reshape(tf.reduce_sum(confusion_matrix, 0), [-1, 1])
# Print precision, recall and miou in terminal
precision_op = tf.print("Precision:\n", precision,
output_stream=sys.stdout)
recall_op = tf.print("Recall:\n", recall,
output_stream=sys.stdout)
miou_op = tf.print("Miou:\n", miou,
output_stream=sys.stdout)
# Add precision and recall matrix in Tensorboard
tf.summary.text('recall_matrix', tf.dtypes.as_string(recall, precision=4))
tf.summary.text('precision_matrix', tf.dtypes.as_string(precision, precision=4))
# Create summary hooks
summary_op = tf.summary.merge_all()
summary_hook = tf.contrib.training.SummaryAtEndHook(
log_dir=FLAGS.eval_logdir, summary_op=summary_op)
precision_op_hook = tf.train.FinalOpsHook(precision_op)
recall_op_hook = tf.train.FinalOpsHook(recall_op)
miou_op_hook = tf.train.FinalOpsHook(miou_op)
hooks = [summary_hook, precision_op_hook, recall_op_hook, miou_op_hook]
num_eval_iters = None
if FLAGS.max_number_of_evaluations > 0:
num_eval_iters = FLAGS.max_number_of_evaluations
if FLAGS.quantize_delay_step >= 0:
tf.contrib.quantize.create_eval_graph()
tf.contrib.training.evaluate_repeatedly(
master=FLAGS.master,
checkpoint_dir=FLAGS.checkpoint_dir,
eval_ops=[update_op],
max_number_of_evaluations=num_eval_iters,
hooks=hooks,
eval_interval_secs=FLAGS.eval_interval_secs)
Then you will have your precision and recall matrix summarised in your Tensorboard:

Output from TensorFlow `py_func` has unknown rank/shape

I am trying to create a simple neural net in TensorFlow. The only tricky part is I have a custom operation that I have implemented with py_func. When I pass the output from py_func to a Dense layer, TensorFlow complains that the rank should be known. The specific error is:
ValueError: Inputs to `Dense` should have known rank.
I don't know how to preserve the shape of my data when I pass it through py_func. My question is how do I get the correct shape? I have a simple example below to illustrate the problem.
def my_func(x):
return np.sinh(x).astype('float32')
inp = tf.convert_to_tensor(np.arange(5))
y = tf.py_func(my_func, [inp], tf.float32, False)
with tf.Session() as sess:
with sess.as_default():
print(inp.shape)
print(inp.eval())
print(y.shape)
print(y.eval())
The output from this snippet is:
(5,)
[0 1 2 3 4]
<unknown>
[ 0.
1.17520118 3.62686038 10.01787472 27.28991699]
Why is y.shape <unknown>? I want the shape to be (5,) the same as inp. Thanks!
Since py_func can execute arbitrary Python code and output anything, TensorFlow can't figure out the shape (it would require analyzing Python code of function body) You can instead give the shape manually
y.set_shape(inp.get_shape())

How to freeze/lock weights of one TensorFlow variable (e.g., one CNN kernel of one layer)

I have a TensorFlow CNN model that is performing well and we would like to implement this model in hardware; i.e., an FPGA. It's a relatively small network but it would be ideal if it were smaller. With that goal, I've examined the kernels and find that there are some where the weights are quite strong and there are others that aren't doing much at all (the kernel values are all close to zero). This occurs specifically in layer 2, corresponding to the tf.Variable() named, "W_conv2". W_conv2 has shape [3, 3, 32, 32]. I would like to freeze/lock the values of W_conv2[:, :, 29, 13] and set them to zero so that the rest of the network can be trained to compensate. Setting the values of this kernel to zero effectively removes/prunes the kernel from the hardware implementation thus achieving the goal stated above.
I have found similar questions with suggestions that generally revolve around one of two approaches;
Suggestion #1:
tf.Variable(some_initial_value, trainable = False)
Implementing this suggestion freezes the entire variable. I want to freeze just a slice, specifically W_conv2[:, :, 29, 13].
Suggestion #2:
Optimizer = tf.train.RMSPropOptimizer(0.001).minimize(loss, var_list)
Again, implementing this suggestion does not allow the use of slices. For instance, if I try the inverse of my stated goal (optimize only a single kernel of a single variable) as follows:
Optimizer = tf.train.RMSPropOptimizer(0.001).minimize(loss, var_list = W_conv2[:,:,0,0]))
I get the following error:
NotImplementedError: ('Trying to optimize unsupported type ', <tf.Tensor 'strided_slice_2228:0' shape=(3, 3) dtype=float32>)
Slicing tf.Variables() isn't possible in the way that I've tried it here. The only thing that I've tried which comes close to doing what I want is using .assign() but this is extremely inefficient, cumbersome, and caveman-like as I've implemented it as follows (after the model is trained):
for _ in range(10000):
# get a new batch of data
# reset the values of W_conv2[:,:,29,13]=0 each time through
for m in range(3):
for n in range(3):
assign_op = W_conv2[m,n,29,13].assign(0)
sess.run(assign_op)
# re-train the rest of the network
_, loss_val = sess.run([optimizer, loss], feed_dict = {
dict_stuff_here
})
print(loss_val)
The model was started in Keras then moved to TensorFlow since Keras didn't seem to have a mechanism to achieve the desired results. I'm starting to think that TensorFlow doesn't allow for pruning but find this hard to believe; it just needs the correct implementation.
A possible approach is to initialize these specific weights with zeros, and modify the minimization process such that gradients won't be applied to them. It can be done by replacing the call to minimize() with something like:
W_conv2_weights = np.ones((3, 3, 32, 32))
W_conv2_weights[:, :, 29, 13] = 0
W_conv2_weights_const = tf.constant(W_conv2_weights)
optimizer = tf.train.RMSPropOptimizer(0.001)
W_conv2_orig_grads = tf.gradients(loss, W_conv2)
W_conv2_grads = tf.multiply(W_conv2_weights_const, W_conv2_orig_grads)
W_conv2_train_op = optimizer.apply_gradients(zip(W_conv2_grads, W_conv2))
rest_grads = tf.gradients(loss, rest_of_vars)
rest_train_op = optimizer.apply_gradients(zip(rest_grads, rest_of_vars))
tf.group([rest_train_op, W_conv2_train_op])
I.e,
Preparing a constant Tensor for canceling the appropriate gradients
Compute gradients only for W_conv2, then multiply element-wise with the constant W_conv2_weights to zero the appropriate gradients and only then apply gradients.
Compute and apply gradients "normally" to the rest of the variables.
Group the 2 train ops to a single training op.

Using external optimizers with tensorflow and stochastic network elements

I have been using Tensorflow with the l-bfgs optimizer from openopt. It was pretty easy to setup callbacks to allow Tensorflow to compute gradients and loss evaluations for the l-bfgs, however, I am having some trouble figuring out how to introduce stochastic elements like dropout into the training procedure.
During the line search, l-bfgs performs multiple evaluations of the loss function, which need to operate on the same network as the prior gradient evaluation. However, it seems that for each evaluation of the tf.nn.dropout function, a new set of dropouts is created. I am looking for a way to fix the dropout over multiple evaluations of the loss function, and then allow it to change between the gradient steps of the l-bfgs. I'm assuming this has something to do with the control flow ops in tensorflow, but there isn't really a good tutorial on how to use these and they are a little enigmatic to me.
Thanks for your help!
Drop-out relies on uses random_uniform which is a stateful op, and I don't see a way to reset it. However, you can hack around it by substituting your own random numbers and feeding them to the same input point as random_uniform, replacing the generated values
Taking the following code:
tf.reset_default_graph()
a = tf.constant([1, 1, 1, 1, 1], dtype=tf.float32)
graph_level_seed = 1
operation_level_seed = 1
tf.set_random_seed(graph_level_seed)
b = tf.nn.dropout(a, 0.5, seed=operation_level_seed)
Visualize the graph to see where random_uniform is connected
You can see dropout takes input of random_uniform through the Add op which has a name mydropout/random_uniform/(random_uniform). Actually the /(random_uniform) suffix is there for UI reasons, and the true name is mydropout/random_uniform as you can see by printing tf.get_default_graph().as_graph_def(). That gives you shortened tensor name. Now you append :0 to get actual tensor name. (side-note: operation could produce multiple tensors which correspond to suffixes :0, :1 etc. Since having one output is the most common case, :0 is implicit in GraphDef and node input is equivalent to node:0. However :0 is not implicit when using feed_dict so you have to explicitly write node:0)
So now you can fix the seed by generating your own random numbers (of the same shape as incoming tensor), and reusing them between invocations.
tf.reset_default_graph()
a = tf.constant([1, 1, 1, 1, 1], dtype=tf.float32)
graph_level_seed = 1
operation_level_seed = 1
tf.set_random_seed(graph_level_seed)
b = tf.nn.dropout(a, 0.5, seed=operation_level_seed, name="mydropout")
random_numbers = np.random.random(a.get_shape()).astype(dtype=np.float32)
sess = tf.Session()
print sess.run(b, feed_dict={"mydropout/random_uniform:0":random_numbers})
print sess.run(b, feed_dict={"mydropout/random_uniform:0":random_numbers})
You should see the same set of numbers with 2 run calls.