How to understand using tf.cond with tf.Print? - tensorflow

Look at the code:
import tensorflow as tf
x = tf.constant(1.0)
y = tf.constant(2.0)
z = tf.constant(3.0)
def f1():
    return tf.Print(x, [x])

def f2():
    return tf.Print(z, [z])

op = tf.cond(x > y, f1, f2)

with tf.Session() as sess:
    sess.run(op)
I'm very puzzled: the output of tf.Print is 3.0.
As I understand it, tf.Print(z, [z]) will output the value of z only when z is evaluated, but I don't think I have evaluated z.
Another question is about tf.cond: how does it add nodes to the graph? For example, how does it add tf.Print to the graph? I think it must relate some tensor to the return value of tf.Print, otherwise tf.Print won't be executed.
I'm so puzzled.

I think you might have got the order of the arguments to tf.cond confused. The call:
tf.cond(predicate, f, g)
is equivalent to "if predicate is true then evaluate f, otherwise evaluate g"
In your example, since the predicate x > y is false, f2 is evaluated.
Note
Since TensorFlow 1.4, tf.cond accepts the keyword arguments true_fn and false_fn, so you can avoid any confusion by writing:
tf.cond(predicate, true_fn=f, false_fn=g)
# Or equivalently...
tf.cond(predicate, false_fn=g, true_fn=f)
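Applied to the code in the question (a minimal sketch in TF 1.x graph mode, as in the question), the keyword form makes it explicit which branch runs:
import tensorflow as tf

x = tf.constant(1.0)
y = tf.constant(2.0)
z = tf.constant(3.0)

op = tf.cond(x > y,
             true_fn=lambda: tf.Print(x, [x]),
             false_fn=lambda: tf.Print(z, [z]))

with tf.Session() as sess:
    print(sess.run(op))  # prints 3.0: x > y is False, so false_fn runs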

Related

How to: TensorFlow-Probability custom loss that ignores NA values (or otherwise masks loss)

I seek to implement in TensorFlow-Probability a masked loss function that can ignore NAs in the labels.
This is a well-worn task for regular tensors, but I cannot find an example for distributions.
My distributions are sized (batch, time-steps, outputs) (512, 251 days, 1 to 8 time series).
The traditional loss function given in examples uses the distribution's log probability:
neg_log_likelihood <- function (x, rv_x) {
  -1 * (rv_x %>% tfd_log_prob(x))
}
When I replace NAs with zeros, the model trains fine and converges. When I leave in NAs it produces NaN losses as expected.
I've experimented with many different permutations of tf$where to replace loss with 0, the label with 0, etc. In each of those cases the model stops training and loss stays near some constant. That's the case even when there's just a single NA in the labels.
neg_log_likelihood_missing <- function (x, rv_x) {
  loss = -1 * (rv_x %>% tfd_log_prob(x))
  loss_nonan = tf$where(tf$math$is_finite(x), loss, 0)
  return(loss_nonan)
}
My use of R here is incidental; I can translate examples in Python or anything else. If there's a correct way to do this so that losses back-propagate correctly, I would greatly appreciate it.
If you are using gradient-based inference, you may need the "double where" trick.
While this gets you a correct value of y:
y = computation(x)
tf.where(is_nan(y), 0, y)
...the derivative of the tf.where can still have a nan.
Instead write:
safe_x = tf.where(is_unsafe(x), some_safe_x, x)
y = computation(safe_x)
tf.where(is_unsafe(x), 0, y)
...to get both a safe y out and a safe dy/dx.
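For instance, a minimal sketch of the difference, using tf.sqrt as a stand-in for the unsafe computation (TF 1.x graph mode, matching the rest of this thread):
import tensorflow as tf

x = tf.constant([-1.0, 4.0])

# Naive masking: the forward value is fine, but the gradient at x = -1.0 is NaN,
# because backprop multiplies 0 by the NaN gradient of tf.sqrt(-1.0).
y_naive = tf.where(x < 0.0, tf.zeros_like(x), tf.sqrt(x))

# Double-where: the unsafe input never reaches tf.sqrt at all.
safe_x = tf.where(x < 0.0, tf.ones_like(x), x)
y_safe = tf.where(x < 0.0, tf.zeros_like(x), tf.sqrt(safe_x))

grad_naive = tf.gradients(y_naive, x)[0]
grad_safe = tf.gradients(y_safe, x)[0]

with tf.Session() as sess:
    print(sess.run(grad_naive))  # [nan, 0.25]
    print(sess.run(grad_safe))   # [0.,  0.25]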
For the case you're considering, perhaps write:
class MyMaskedDist(tfd.Distribution):
  ...

  def _log_prob(self, x):
    safe_x = tf.where(tf.is_nan(x), self.mode(), x)
    lp = compute_log_prob(safe_x)
    lp = tf.where(tf.is_nan(x), tf.zeros([], lp.dtype), lp)
    return lp
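You asked for Python you can translate; here is a sketch of the same idea applied directly to the loss rather than inside a distribution subclass. It assumes an element-wise distribution (e.g. tfd.Normal), so log_prob has the same shape as the labels; the function name is illustrative:
import tensorflow as tf

def masked_neg_log_likelihood(y_true, rv_y):
    # Mask of positions that have a real (finite) label.
    mask = tf.math.is_finite(y_true)
    # First where: give the distribution a safe stand-in where the label is NaN.
    safe_y = tf.where(mask, y_true, tf.zeros_like(y_true))
    loss = -rv_y.log_prob(safe_y)
    # Second where: zero those positions out so they contribute nothing,
    # in value or in gradient.
    return tf.where(mask, loss, tf.zeros_like(loss))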

Triple tensor product with Tensorflow

Suppose I have a matrix A and two vectors x,y, of appropriate dimensions. I want to compute the dot product x' * A * y, where x' denotes the transpose. This should result in a scalar.
Is there a convenient API function in Tensorflow to do this?
(Note that I am using Tensorflow 2).
Use tf.linalg.tensordot(). See the documentation.
As you mentioned in the question, you are trying to compute a dot product. tf.matmul() will not work directly here, since it performs matrix multiplication on rank-2 (or higher) tensors, while x and y are rank-1 vectors.
Demo code snippet:
import tensorflow as tf
A = tf.constant([[1,4,6],[2,1,5],[3,2,4]])
x = tf.constant([3,2,7])
result = tf.linalg.tensordot(tf.transpose(x), A, axes=1)
result = tf.linalg.tensordot(result, x, axes=1)
print(result)
And the result will be
>>>tf.Tensor(532, shape=(), dtype=int32)
A few points I want to mention here:
Don't forget the axes argument inside tf.linalg.tensordot().
When you create tf.zeros(5) it creates a rank-1 tensor of shape (5,), i.e. [0,0,0,0,0], and transposing it gives you the same tensor back. But if you create it as tf.zeros((5,1)), it is a column vector of shape (5,1):
[
[0],[0],[0],[0],[0]
]
Transposing that does change the shape (it becomes (1,5)), but I recommend sticking with the code snippet above; for a dot product you don't have to bother much about this.
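A quick way to see the shape behaviour described above (TF 2 eager execution assumed):
import tensorflow as tf

v = tf.zeros(5)       # rank-1, shape (5,)
m = tf.zeros((5, 1))  # rank-2 column vector, shape (5, 1)

print(tf.transpose(v).shape)  # (5,)  -- transposing a rank-1 tensor changes nothing
print(tf.transpose(m).shape)  # (1, 5)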
If you are still facing issues, I will be very happy to help you.
Just do the following,
import tensorflow as tf
x = tf.constant([1,2])
a = tf.constant([[2,3],[3,4]])
y = tf.constant([2,3])
z = tf.reshape(tf.matmul(tf.matmul(x[tf.newaxis,:], a), y[:, tf.newaxis]),[])
print(z.numpy())
Returns
>>> 49
Conceptually you just want tf.transpose(x) · A · y. Keep in mind that the * operator in TensorFlow is element-wise multiplication, so to get the matrix product you need tf.matmul (or the @ operator) on rank-2 tensors rather than *.
Based on your example:
x = tf.zeros(5)
A = tf.zeros((5,5))
How about
x = tf.expand_dims(x, -1)
tf.matmul(tf.matmul(x, A, transpose_a=True), x)
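A quick check of that expand_dims + matmul pattern with the concrete numbers from the tensordot demo above (a sketch; TF 2 eager assumed, and y is just set equal to x here since that demo only defines one vector). Note the result has shape (1, 1), so squeeze it if you want a true scalar:
import tensorflow as tf

A = tf.constant([[1, 4, 6], [2, 1, 5], [3, 2, 4]])
x = tf.constant([3, 2, 7])
y = tf.constant([3, 2, 7])  # reusing x's values as y, purely for the check

xc = tf.expand_dims(x, -1)  # shape (3, 1)
yc = tf.expand_dims(y, -1)  # shape (3, 1)

# x' * A * y as a chain of matmuls; the result has shape (1, 1)
result = tf.matmul(tf.matmul(xc, A, transpose_a=True), yc)

print(tf.squeeze(result).numpy())  # 532, matching the tensordot result above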

Tensorflow indexing into python list during tf.while_loop

I have this annoying problem and I don't know how to solve it.
I am reading in batches of data from a CSV using a dataset reader and want to gather certain columns. The reader returns a tuple of tensors and, depending on which reader I use, columns are indexed either by integer or by string.
I can easily enough do a for loop in Python and slice the columns I want; however, I want to do this in a tf.while_loop to take advantage of parallel execution.
This is where my issue lies: the iterator in the while loop is a tensor, and I cannot use it to index into my dataset. If I try to evaluate it, I get an error about the session not being the same, etc.
How can I use a while loop (or a map function) and have the function index into a Python list/dict without evaluating or running the iterator tensor?
Simple example:
some_data = [1, 2, 3, 4, 5]
x = tf.constant(0)
y = len(some_data)
c = lambda x: tf.less(x, y)
b = lambda x: some_data[x]  # <-- You cannot index like this!
tf.while_loop(c, b, [x])
Does this fit your requirement somewhat? It does nothing apart from printing the value.
import tensorflow as tf
from tensorflow.python.framework import tensor_shape

some_data = [11, 222, 33, 4, 5, 6, 7, 8]

def func(v):
    print(some_data[v])
    return some_data[v]

with tf.Session() as sess:
    r = tf.while_loop(
        lambda i, v: i < 4,
        lambda i, v: [i + 1, tf.py_func(func, [i], [tf.int32])[0]],
        [tf.constant(0), tf.constant(2, tf.int32)],
        [tensor_shape.unknown_shape(), tensor_shape.unknown_shape()])
    r[1].eval()
It prints
11
4
222
33
The order changes every time, but I guess tf.control_dependencies may be useful to control that.
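Not part of the answer above, but if the data can be converted to a tensor up front, a common alternative is to index inside the loop body with tf.gather, which avoids tf.py_func entirely (a minimal sketch):
import tensorflow as tf

some_data = tf.constant([11, 222, 33, 4, 5, 6, 7, 8])

r = tf.while_loop(
    lambda i, v: i < 4,
    lambda i, v: [i + 1, tf.gather(some_data, i)],  # index with the loop tensor
    [tf.constant(0), tf.constant(2, tf.int32)])

with tf.Session() as sess:
    print(sess.run(r))  # [4, 4] -- v ends up holding some_data[3]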

Clipping(Filtering) tf.placeholder values

I want to change my tf.placeholder values such that:
values < SmallConstant are set to 0.
It's not exactly clipping, so I can't use: tf.clip_by_value()
I tried the suggestion in Conditional assignment of tensor values in TensorFlow, and this is what I have so far:
x = tf.placeholder(tf.float32, None)
condition = tf.less(x, tf.constant(SmallConst))
tf.assign(x, tf.where(condition, tf.zeros_like(x), x))
However, on running this, I get an error saying
AttributeError: 'Tensor' object has no attribute 'assign'
It seems tf.assign() can be done on tf.Variable but not on tf.placeholder.
Is there any other way I can do this?
Thank you!
Yes, it's even easier than you think:
x = tf.placeholder(tf.float32, None)
# create a bool tensor the same shape as x
condition = x < SmallConst
# create tensor same shape as x, with values greater than SmallConst set to 0
to_remove = x*tf.to_float(condition)
# set all values of x less than SmallConst to 0
x_clipped = x - to_remove
I'd normally just put that into one line like:
x_clipped = x - x*tf.to_float(x < small_const)
note: using tf.to_float on a tensor of type bool will give you 0.0s in place of Falses and 1.0s in place of Trues
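For example, feeding some values through that one-liner (a quick sketch, taking small_const = 1.0):
import tensorflow as tf

small_const = 1.0
x = tf.placeholder(tf.float32, None)
x_clipped = x - x * tf.to_float(x < small_const)

with tf.Session() as sess:
    print(sess.run(x_clipped, {x: [0.5, 1.0, 2.5]}))  # [0.  1.  2.5]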
Additional information for cleaner code:
The numerical operators (e.g. <, >=, +, -, etc., but not ==) are overloaded for TensorFlow tensors, so you can combine native Python values with tensors to get a new tensor that is the result of that operation. So tf.constant() is actually rarely needed. Example of this in action:
a = tf.placeholder(tf.int32)
b = a + 1
c = a > 0
print(b) # gives "<tf.Tensor 'add:0' shape=<unknown> dtype=int32>"
print(c) # gives "<tf.Tensor 'Greater:0' shape=<unknown> dtype=bool>"
sess.run(b, {a: 1}) # gives scalar int32 numpy array with value 2
sess.run(c, {a: 1}) # gives scalar bool numpy array with value True
This is also true of numpy.
tf.assign() only works on Variables because, as its documentation puts it, it will
Update 'ref' by assigning 'value' to it.
Tensors in TensorFlow are immutable. The result of any operation on a tensor is another tensor, but the original tensor never changes. Variables, however, are mutable, and you change their value with tf.assign().
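For completeness, a small sketch of the Variable case, which is the one tf.assign does support (the values here are just illustrative):
import tensorflow as tf

small_const = 1.0
v = tf.Variable([0.5, 2.0, 3.0])

# Variables are mutable, so tf.assign works; calling it on a plain Tensor or
# placeholder raises the AttributeError shown in the question.
clip_op = tf.assign(v, tf.where(v < small_const, tf.zeros_like(v), v))

with tf.Session() as sess:
    sess.run(tf.global_variables_initializer())
    sess.run(clip_op)
    print(sess.run(v))  # [0. 2. 3.]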

(Tensorflow)Does the op assign change the gradient computation?

I use the op "assign" to change the value of variables instead of "=", but I found the gradient I got is quite different. Could anyone tell me the difference and why? Thanks!
For example, changing w = w1 to op1 = tf.assign(w, w1); sess.run(op1).
= and tf.assign are different operations.
= is a python operation, in which you assign a python value to a python variable
tf.assign is a Tensorflow operation that assigns the value to the variable ref and returns the assign operation.
= is executed in python and doesn't affect the computation graph.
tf.assign is a node in the computational graph.
To understand, let's run this simple script
import tensorflow as tf

x = tf.Variable(1)
y = tf.Variable(2)
x = y
print(x.name, y.name)

a = tf.Variable(1)
b = tf.Variable(2)
# override a, otherwise a content is 1
a = a.assign(b)
print(a.name, b.name)

init = tf.global_variables_initializer()
with tf.Session() as sess:
    sess.run(init)
    print(sess.run([x, y, a, b]))
print(x.name, y.name) outputs Variable_1:0 Variable_1:0
because = is executed in python and you've overwritten the variable x.
print(a.name, b.name) outputs Assign:0 Variable_3:0 because you defined an assign op in the computational graph, now a is an assign op.
When you run the defined graph, you get:
[2, 2, 2, 2]
But these values are computed differently: one is a computation in the graph, the others are not.
If you don't assign a to the assign op created with tf.assign (i.e. you change the line a = a.assign(b) to just a.assign(b)), then when you evaluate the graph you'll get:
[2, 2, 1, 2]