Computing Hessians w.r.t. a higher-rank variable works with neither tf.hessians() nor tf.gradients()

When we need a second derivative or Hessian in TensorFlow, we can use tf.hessians(F(x), x) or tf.gradients(tf.gradients(F(x), x)[0], x)[0]. However, when x is not rank one, tf.hessians() raises the following error:
ValueError: Cannot compute Hessian because element 0 of xs does not
have rank one.. Tensor model_inputs/action:0 must have rank 1.
Received rank 2, shape (?, 1)
in the following code:
with tf.name_scope("1st scope"):
    self.states = tf.placeholder(tf.float32, (None, self.state_dim), name="states")
    self.action = tf.placeholder(tf.float32, (None, self.action_dim), name="action")
with tf.name_scope("2nd scope"):
    with tf.variable_scope("3rd scope"):
        self.policy_outputs = self.policy_network(self.states)
        # use tf.gradients twice
        self.actor_action_gradients = tf.gradients(self.policy_outputs, self.action)[0]
        self.actor_action_hessian = tf.gradients(self.actor_action_gradients, self.action)[0]
        # or use tf.hessians
        self.actor_action_hessian = tf.hessians(self.policy_outputs, self.action)
Using tf.gradients() twice also raises an error:
in create_variables self.actor_action_hessian =
tf.gradients(self.actor_action_gradients, self.action)[0]
AttributeError: 'NoneType' object has no attribute 'dtype'
How can I fix this? Can neither tf.gradients() nor tf.hessians() be used in this case?

The second approach is fine; the error is somewhere else: your graph is not connected.
self.actor_action_gradients = tf.gradients(self.policy_outputs, self.action)[0]
self.actor_action_hessian = tf.gradients(self.actor_action_gradients, self.action)[0]
The error is thrown in the second line because self.actor_action_gradients is None, so you can't compute its gradient. Nothing in your code suggests that self.policy_outputs depends on self.action (and it shouldn't, since it's the action that depends on the policy, not the policy on the action).
Once you fix this, you will notice that the "hessian" is not really a Hessian but a vector. To form a proper Hessian of f w.r.t. x, you have to iterate over all values returned by tf.gradients and compute tf.gradients of each one independently. This is a known limitation in TF, and no simpler way is available right now.
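To illustrate the row-by-row construction described above without depending on a particular TensorFlow version, here is a minimal pure-Python sketch. It is not TensorFlow code: numerical_gradient and numerical_hessian are hypothetical helpers that use central finite differences in place of tf.gradients, and the test function f is made up. The point is the structure: each gradient component is differentiated on its own, and the results are stacked into a matrix.

```python
def numerical_gradient(f, x, h=1e-3):
    """Central-difference gradient of a scalar function f at point x (a list)."""
    grad = []
    for i in range(len(x)):
        up = list(x)
        up[i] += h
        down = list(x)
        down[i] -= h
        grad.append((f(up) - f(down)) / (2 * h))
    return grad


def numerical_hessian(f, x, h=1e-3):
    """Build the Hessian one row at a time.

    Row i is the gradient of the i-th gradient component, which mirrors
    calling tf.gradients once per element of the first gradient.
    """
    rows = []
    for i in range(len(x)):
        # Treat the i-th gradient component as its own scalar function.
        def grad_i(point, i=i):
            return numerical_gradient(f, point, h)[i]
        rows.append(numerical_gradient(grad_i, x, h))
    return rows


def f(v):
    # Made-up test function: f(x, y) = x^2 * y + y^3
    x, y = v
    return x * x * y + y ** 3


hessian = numerical_hessian(f, [1.0, 2.0])
# Analytically the Hessian is [[2y, 2x], [2x, 6y]], i.e. [[4, 2], [2, 12]] at (1, 2).
print(hessian)
```

In a real graph the inner call would be tf.gradients applied to one element of the first gradient, and the rows would be stacked into a single tensor.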

Related

How to get batch_size if shape method in Keras & TF returns None for the batch_size?

I'm wrapping a function as a layer. In this function, I need to know the shape of the input. The first entry of the shape is the batch_size, and I need to know it! The problem is that K.int_shape returns something like (None, 2, 10). This None should be known at runtime, right? But it is still None and causes an error.
Basically, in my function I want to create a constant that is as long as the batch_size.
Here is my function, for what it's worth:
def func(inputs):
    max_iter = 3
    x, y = inputs
    c = tf.complex(x, y)
    print(K.int_shape(c))
    z = tf.zeros(shape=K.int_shape(c), dtype='complex64')
    #b=K.switch(K.greater( tf.abs(c) , 4), K.constant(1, shape=(1,1)), K.constant(0, shape=(1,1)))
    for i in range(max_iter):
        c = c * c + z
    return c
layer = Lambda(func)
You can see where I created the constant z. I want its shape to be equal to the input shape, but this causes an error with a massive traceback. If I replace it with a fixed shape, it works. I traced the error to this damn None thing.
Instead of using int_shape, you can use tf.zeros_like to create z, which takes its shape from c at run time:
z= tf.zeros_like(c, dtype='complex64')

TypeError: 'TensorShape' object is not callable

I am new to TensorFlow programming. I was trying out some functions and got this error in the following snippet:
with tf.Session() as sess_1:
    c = tf.constant(5)
    d = tf.constant(6)
    e = c + d
    print(sess_1.run(e))
    print(sess_1.run(e.shape()))
Error found:
Traceback (most recent call last):
File "C:/Users/Ashu/PycharmProjects/untitled/Bored.py", line 15, in
print(sess_1.run(e.shape()))
TypeError: 'TensorShape' object is not callable
I couldn't find this covered here, so can anyone please clarify this doubt? I am a new learner, so sorry for any mistakes.
I have one more doubt: when I simply use the eval() function, it doesn't print anything in PyCharm, so I had to wrap it in print(). And when print() is used, it doesn't print the dtype of the tensor; it only prints the tensor or Python object value. Why am I not getting output in a format like array([1., 1.], dtype=float32)? Is this how PyCharm prints tensors in the new version, or am I doing something wrong? I'd love to understand what is behind this, so please help, and pardon me if I am wrong anywhere.
One confusing aspect of TensorFlow for beginners is that there are two types of shape: dynamic shape, given by tf.shape(x), and static shape, given by x.shape (assuming x is a tensor). While they represent the same concept, they are used very differently.
Static shape is the shape of a tensor known at graph construction time. It's a data type in its own right, but it can be converted to a list using as_list().
x = tf.placeholder(tf.float32, shape=(None, 3, 4))
static_shape = x.shape
shape_list = x.shape.as_list()
print(shape_list) # [None, 3, 4]
y = tf.reduce_sum(x, axis=1)
print(y.shape.as_list()) # [None, 4]
During operations, TensorFlow tracks static shapes as best it can. In the above example, y's shape was calculated from the partially known shape of x. Note that we haven't even created a session, but the static shape is still known.
Since the batch size is not known, you can't use the static first entry in calculations.
z = tf.reduce_sum(x) / tf.cast(x.shape.as_list()[0], tf.float32) # ERROR
(we could have divided by x.shape.as_list()[1], since that dimension is known at graph construction time, but that wouldn't demonstrate anything here)
If we need to use a value which is not known statically - i.e. at graph construction time - we can use the dynamic shape of x. The dynamic shape is a tensor - like other tensors in tensorflow - which is evaluated using a session.
z = tf.reduce_sum(x) / tf.cast(tf.shape(x)[0], tf.float32) # all good!
You can't call as_list on the dynamic shape, nor can you inspect its values without going through a session evaluation.
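As stated in the documentation, you can only call a session's run method with tensors, operations, or lists of tensors/operations. Your last line of code writes e.shape(), but e.shape is an attribute of type TensorShape, not a method, so the parentheses try to call it and raise the TypeError before run is even reached. Dropping the parentheses is not enough on its own either, because the session can't execute a TensorShape argument. Use print(e.shape) for the static shape, or sess_1.run(tf.shape(e)) for the dynamic one.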
When you call print with a tensor, the system prints the tensor's content. If you want to print the tensor's type, use code like print(type(tensor)).
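The TypeError in the question comes from treating an attribute as if it were a method. Here is a minimal pure-Python sketch of that mechanism. FakeShape and FakeTensor are hypothetical stand-ins written for this illustration and are not TensorFlow classes; they only mimic the fact that shape is plain data with an as_list() method.

```python
class FakeShape:
    """Hypothetical stand-in for TensorShape: plain data, not a function."""

    def __init__(self, dims):
        self.dims = list(dims)

    def as_list(self):
        return list(self.dims)


class FakeTensor:
    """Hypothetical stand-in for a tensor whose shape is an attribute."""

    def __init__(self, dims):
        self.shape = FakeShape(dims)


t = FakeTensor([None, 3, 4])

# Attribute access works and needs no parentheses.
shape_list = t.shape.as_list()
print(shape_list)

# Adding parentheses tries to call the shape object itself and fails.
try:
    t.shape()
    error_message = None
except TypeError as exc:
    error_message = str(exc)
print(error_message)
```

The same reading applies to the real API: x.shape is looked up, never called, while tf.shape(x) is a function that returns a tensor.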

Tensorflow: InvalidArgumentError: Input ... incompatible with expected float_ref

The following code results in a very unhelpful error:
import tensorflow as tf
x = tf.Variable(tf.constant(0.), name="x")
with tf.Session() as s:
    val = s.run(x.assign(1))
    print(val) # 1
    val = s.run(x, {x: 2})
    print(val) # 2
    val = s.run(x.assign(1), {x: 0.}) # InvalidArgumentError
tensorflow.python.framework.errors_impl.InvalidArgumentError: Input 0 of node Assign_1 was passed float from _arg_x_0_0:0 incompatible with expected float_ref.
How did I get this error?
Why do I get this error?
Here's what I could infer.
How did I get this error?
This error is seen when attempting to perform the following two operations in a single session run:
A Tensorflow variable is assigned a value
That same variable is also passed a value as part of the feed_dict
This is why the first two runs succeed: neither of them attempts both operations at once.
Why do I get this error?
I am not sure, but I don't think this was an intentional design choice by Google. Here's my explanation:
First, the TensorFlow (TF) source code (basically) resolves x.assign(1) to tf.assign(x, 1), which hints at what the error message means by Input 0.
The error message refers to x when it says Input 0 of the assign op.
It goes on to say that the first argument of the assign op was passed a float from _arg_x_0_0:0, presumably the fed value, whereas the op expects a float_ref, that is, a reference to a variable.
TLDR
Thus for a run where a TF variable is provided as a feed, that variable will no longer be treated as a variable (but instead as the value it was assigned), and thus any attempts at further assigning a value to it would be erroneous since only TF variables can be assigned a value in the graph.
Fix
If your graph has a variable assignment operation, don't pass a value to that same variable in your feed_dict. ¯_(ツ)_/¯. Assuming you're using the feed_dict to provide an initial value, you could instead assign it a value in a prior session run. Or use tf.control_dependencies when building your graph to assign it an initial value from a placeholder, as shown below:
import tensorflow as tf
x = tf.Variable(tf.constant(0.), name="x")
initial_x = tf.placeholder(tf.float32)
assign_from_placeholder = x.assign(initial_x)
with tf.control_dependencies([assign_from_placeholder]):
    x_assign = x.assign(1)
with tf.Session() as s:
    val = s.run(x_assign, {initial_x: 0.}) # Success!

Unable to obtain moments using tensorflow

I want to calculate the moments of a vector x = np.random.normal(0,1,[1,500]). When I do mean, std = tf.nn.moments(x,axes=[0]), it throws this error:
File "/tmp/venv/local/lib/python2.7/site-packages/tensorflow/python/ops/nn.py", line 830, in moments
y = math_ops.cast(x, dtypes.float32) if x.dtype == dtypes.float16 else x
TypeError: data type not understood
I am using tensorflow==0.11.0. What is the correct syntax?
As shown in the documentation for tf.nn.moments, the input x must be a Tensor.
You should use something like the following:
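import numpy as np
import tensorflow as tf
x = tf.placeholder("float", [None, 500])
# tf.nn.moments returns the mean and the variance (not the standard deviation)
mean, variance = tf.nn.moments(x, axes=[0])
sess = tf.Session()
# No variables are defined, so no initializer needs to run
sample_mean, sample_variance = sess.run([mean, variance],
                                        feed_dict={x: np.random.normal(0, 1, [1, 500])})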
Note: This particular calculation does not make much sense, since there is only one data value along axis 0. You may want to either increase the shape to something like [32, 500], or more likely change the axes from [0] to [1].
Regardless, the calculation will complete without errors. The second value returned by tf.nn.moments, which is the variance rather than the standard deviation, will be 0 everywhere, because the moments are calculated along an axis of length one.
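The effect of the axis choice can be checked without TensorFlow. Here is a minimal pure-Python sketch, assuming a hypothetical helper moments(rows, axis) that reduces a 2-D list the way the answer describes. It uses the population variance (divide by N), which is an assumption of this sketch, and the random data is seeded only to make the run repeatable.

```python
import random


def moments(rows, axis):
    """Return (means, variances) of a 2-D list along the given axis.

    Uses the population variance (divide by N); this normalisation is an
    assumption of the sketch.
    """
    # Reducing along axis 0 walks down columns; axis 1 walks along rows.
    groups = list(zip(*rows)) if axis == 0 else rows
    means = [sum(g) / len(g) for g in groups]
    variances = [
        sum((v - m) ** 2 for v in g) / len(g)
        for g, m in zip(groups, means)
    ]
    return means, variances


random.seed(0)
data = [[random.gauss(0, 1) for _ in range(500)]]  # shape [1, 500]

mean0, var0 = moments(data, axis=0)  # 500 groups of one value each
mean1, var1 = moments(data, axis=1)  # 1 group of 500 values

print(max(var0))  # every group holds a single value, so the variance is 0.0
print(var1[0])    # close to 1 for a standard normal sample
```

Reducing along axis 1 is the version that actually estimates the spread of the 500 samples.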

Placeholders for LSTM-RNN parameters in TensorFlow

I would like to use placeholders for the dropout rate, number of hidden units, and number of layers in an LSTM-based RNN. Below is the code I am currently trying.
dropout_rate = tf.placeholder(tf.float32)
n_units = tf.placeholder(tf.uint8)
n_layers = tf.placeholder(tf.uint8)
net = rnn_cell.BasicLSTMCell(n_units)
net = rnn_cell.DropoutWrapper(net, output_keep_prob = dropout_rate)
net = rnn_cell.MultiRNNCell([net] * n_layers)
The last line gives the following error:
TypeError: Expected uint8, got <tensorflow.python.ops.rnn_cell.DropoutWrapper
object ... of type 'DropoutWrapper' instead.
I would appreciate any help.
The error is raised by the following expression: [net] * n_layers.
You are trying to make a list looking like [net, net, ..., net] (with a length of n_layers), but n_layers is now a placeholder of unknown value.
I can't think of a way to do that with a placeholder, so I guess you must go back to a standard n_layers=3. (Anyway, putting n_layers as a placeholder was not a good practice in the first place.)
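The underlying constraint is plain Python: a list can only be built or repeated with a concrete integer that exists while the graph is being constructed. Here is a minimal sketch with hypothetical stand-ins (FakeCell and FakePlaceholder are not TensorFlow classes). TensorFlow's own message differs from the one Python raises here, presumably because its tensors overload multiplication and try to convert the list into a uint8 tensor.

```python
class FakeCell:
    """Hypothetical stand-in for an RNN cell object."""


class FakePlaceholder:
    """Hypothetical stand-in for a value that is unknown until run time."""


# With an ordinary Python int the number of layers is known while the
# graph is being built, so the list can be constructed directly.
n_layers = 3
cells = [FakeCell() for _ in range(n_layers)]  # one distinct cell per layer
print(len(cells))  # 3

# A symbolic value has no concrete count to repeat the list by.
try:
    [FakeCell()] * FakePlaceholder()
    failed = False
except TypeError:
    failed = True
print(failed)  # True
```

Hyperparameters such as the layer count and hidden size shape the graph itself, so they stay ordinary Python values, while a quantity like the dropout rate only changes a value flowing through the graph and can remain a placeholder.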