I want to create a temporary variable in TF and then substract it from my input variable if it is a traning phase. This is simplified code that I use. Please, could you give me a piece of advice how to make it work?
Please, keep in mind that I don't want to create a variable if it is not traning phase.
import tensorflow as tf
def some_transformation(x):
x0 = tf.get_variable('x0', initializer=tf.random_uniform([1],
maxval=0.3, dtype=tf.float32), dtype=tf.float32)
return tf.subtract(x, x0)
x = tf.placeholder("float", [])
is_traning = tf.placeholder(tf.int32, None)
x_transformed = tf.cond(is_traning > 0, lambda: some_transformation(x), lambda: x)
#x_transformed = some_transformation(x)
init = tf.global_variables_initializer()
with tf.Session() as sess:
sess.run(init)
out = sess.run(x_transformed, feed_dict={x: 10, is_traning: 1})
print(out)
Please do post your error messages along with your code, after running your code I see you're getting this error:
ValueError: Initializer for variable x0/ is from inside a control-flow construct, such as a loop or conditional. When creating a variable inside a loop or conditional, use a lambda as the initializer.
That appears to be because you are trying to call tf.get_variable from inside some_transformation and it's telling you to change the property initializer=tf.random_uniform(...) to initializer=lambda: tf.random_uniform(...).
You might also opt to define x0 outside of the transformation and pass it in as:
x_transformed = tf.cond(is_traning > 0, lambda: some_transformation(x, x0), lambda: x)
If that's valid in your use case.
Related
I want to know whether the tensorflow operations in this link, have a gradient defined. I am asking because I am implementing a custom loss function and when I run it I always have this error :
ValueError: An operation has `None` for gradient. Please make sure that all of your ops have a gradient defined (i.e. are differentiable). Common ops without gradient: K.argmax, K.round, K.eval.
This is my custom Loss function:
def calculate_additional_loss(y_true,y_pred):
#additional loss
x_decoded_normalized = original_dim* y_pred
#y_true = K.print_tensor(y_true, message='y_true = ')
#y_pred = K.print_tensor(y_pred, message='y_pred = ')
error = tf.constant(0, dtype= tf.float32)
additional_loss= tf.constant(0, dtype= tf.float32)
final_loss= tf.constant(0, dtype= tf.float32)
for k in range(batch_size):
#add padding
reshaped_elem_1 = K.reshape(x_decoded_normalized[k], [DIM,DIM])
a = K.reshape(reshaped_elem_1[:,DIM-1], [DIM,1])
b = K.reshape(reshaped_elem_1[:,1], [DIM,1])
reshaped_elem_1 = tf.concat ([b,reshaped_elem_1], axis= 1)
reshaped_elem_1 = tf.concat ([reshaped_elem_1,a], axis= 1)
c= K.reshape(reshaped_elem_1[DIM-1,:], [1,DIM+2])
d= K.reshape(reshaped_elem_1[1,:], [1,DIM+2])
reshaped_elem_1 = tf.concat ([d,reshaped_elem_1],axis=0)
reshaped_elem_1 = tf.concat ([reshaped_elem_1,c],axis=0)
for (i,j) in range(reshaped_elem_1.shape[0],reshaped_elem_1.shape[1]):
error = tf.add(error, tf.pow((reshaped_elem_1[i,j]-
reshaped_elem_1[i,j+1]),-2),
tf.pow((reshaped_elem_1[i,j]-reshaped_elem_1[i,j-
1]),-2), tf.pow((reshaped_elem_1[i,j]-
reshaped_elem_1[i-1,j]),-2),
tf.pow((reshaped_elem_1[i,j]-reshaped_elem_1[i+1,j]),-2))
additional_loss = tf.add(additional_loss, tf.divide(error, original_dim))
final_loss += tf.divide(additional_loss, batch_size)
print('final_loss', final_loss)
return final_loss
and This is where I am calling it:
models = (encoder, decoder)
additional_loss = calculate_additional_loss(inputs,outputs)
vae.add_loss(additional_loss)
vae.compile(optimizer='adam')
vae.summary()
plot_model(vae,to_file='vae_mlp.png',show_shapes=True)
vae.fit(x_train, epochs=epochs, batch_size=batch_size, validation_data=(x_test, None), verbose = 1, callbacks=[CustomMetrics()])
Thank you in advance.
Most ops have a defined gradient. There are some ops for which a gradient is not defined and the error message you get gives you some examples.
Having said that, there are couple of mistakes I see in your code :
final_loss is defined as tf.constant, but you are trying to increment it.
You are taking a tuple from range
error is defined as tf.constant, but you are trying to increment it.
Don't use for loop in this way over batch_size. Instead use TensorFlow functions to handle batch dimension directly. This way you are just proliferating your nodes.
The way you have written your code makes me think that you're thinking of TensorFlow as pure python. It is not. You define the graph and then you execute it inside a session. So, in the function use TF functions to just define the computations.
I want to use a function that creates weights for a normal dense layer, it basically behaves like an initialization function, only that it "initializes" before every new forward pass.
The flow for my augmented linear layer looks like this:
input = (x, W)
W_new = g(x,W)
output = tf.matmul(x,W_new)
However, g(x,W) is not differentiable, as it involves some sampling. Luckily it also doesn't have any parameters I want to learn so I just try to do the forward and backward pass, as if I would have never replaced W.
Now I need to tell the automatic differentiation to not backpropagate through g(). I do this with:
W_new = tf.stop_gradient(g(x,W))
Unfortunately this does not work, as it complains about non-matching shapes.
What does work is the following:
input = (x, W)
W_new = W + tf.stop_gradient(g(x,W) - W)
output = tf.matmul(x,W_new)
as suggested here: https://stackoverflow.com/a/36480182
Now the forward pass seems to be OK, but I don't know how to override the gradient for the backward pass. I know, that I have to use: gradient_override_map for this, but could not transfer applications I have seen to my particular usecase (I am still quite new to TF).
However, I am not sure how to do this and if there isn't an easier way. I assume something similar has to be done in the first forward pass in a given model, where all weights are initialized while we don't have to backpropagate through the init functions as well.
Any help would be very much appreciated!
Hey #jhj I too faced the same problem fortunately I found this gist. Hope this helps :)
Sample working -
import tensorflow as tf
from tensorflow.python.framework import ops
import numpy as np
Define custom py_func which takes also a grad op as argument:
def py_func(func, inp, Tout, stateful=True, name=None, grad=None):
# Need to generate a unique name to avoid duplicates:
rnd_name = 'PyFuncGrad' + str(np.random.randint(0, 1E+8))
tf.RegisterGradient(rnd_name)(grad) # see _MySquareGrad for grad example
g = tf.get_default_graph()
with g.gradient_override_map({"PyFunc": rnd_name, "PyFuncStateless": rnd_name}):
return tf.py_func(func, inp, Tout, stateful=stateful, name=name)
Def custom square function using np.square instead of tf.square:
def mysquare(x, name=None):
with ops.name_scope(name, "Mysquare", [x]) as name:
sqr_x = py_func(np.square,
[x],
[tf.float32],
name=name,
grad=_MySquareGrad) # <-- here's the call to the gradient
return sqr_x[0]
Actual gradient:
def _MySquareGrad(op, grad):
x = op.inputs[0]
return grad * 20 * x # add a "small" error just to see the difference:
with tf.Session() as sess:
x = tf.constant([1., 2.])
y = mysquare(x)
tf.global_variables_initializer().run()
print(x.eval(), y.eval(), tf.gradients(y, x)[0].eval())
I am running the following code:
import tensorflow as tf
sess = tf.InteractiveSession()
y = tf.Variable(initial_value=[1,2])
sess.run(y, feed_dict={y: [100,2]})
Gives:
[100,2]
However, after that:
sess.run(y)
Gives the origianl value of y: [1,2].
Why doesn't the:
sess.run(y, feed_dict={y: [100,2]})
updates the value of y, and saves it?
Because feed_dict overrides the values of the keys of the dictionary.
With the statement:
sess.run(y, feed_dict={y: [100,2]})
you're telling tensorflow to replace the values of y with [100, 2] for the current computation. This is not an assignment.
Therefore, the next call
sess.run(y)
fetches the original variables and uses it.
If you want to assign a value to a variable, you have to define this operation in the computational graph, using tf.assing
If you want to use a feed dictionary, initialize a placeholder instead of a variable and define the output.
As an example (in the same style as your question code),
import tensorflow as tf
import numpy as np
sess = tf.InteractiveSession()
inputs = tf.placeholder(tf.int32, shape = (2,2))
output = tf.matmul(inputs, tf.transpose(inputs))
test_input = np.array([[10,2], [4,4]])
print test_input.shape
# (2,2)
sess.run(output, feed_dict = {inputs : test_input})
# array([[104, 48], [48, 32]], dtype=int32)
If you just want to change the value of a variable look to nessuno's answer.
I've been trying to understand how variables are initialized in Tensorflow. Below, I created a simple example which defines a variable in some variable_scope and the process is wrapped in the subfunction.
In my understanding, this code creates a variable 'x' inside the 'test_scope' at tf.initialize_all_variables() stage and it can always be accessed after that using tf.get_variable(). But this code ended up with the Attempting to use uninitialized value error at print(x.eval()) line.
I don't have any idea about how Tensorflow initializes variables. Can I get any help? Thank you.
import tensorflow as tf
def create_var_and_prod_with(y):
with tf.variable_scope('test_scope'):
x = tf.Variable(0.0, name='x', trainable=False)
return x * y
s = tf.InteractiveSession()
y = tf.Variable(1.0, name='x', trainable=False)
create_var_and_prod_with(y)
s.run(tf.initialize_all_variables())
with tf.variable_scope('test_scope'):
x = tf.get_variable('x', [1], initializer=tf.constant_initializer(0.0), trainable=False)
print(x.eval())
print(y.eval())
If you want to reuse a variable, you have to declare it using get_variables and than explicitly ask to the scope to make the variables reusable.
If you change the line
x = tf.Variable(0.0, name='x', trainable=False)
with:
x = tf.get_variable('x', [1], trainable=False)
And you ask to the scope to make the already defined variable available:
with tf.variable_scope('test_scope') as scope:
scope.reuse_variables()
x = tf.get_variable('x', [1], initializer=tf.constant_initializer(0.0), trainable=False)
Then you can run print(x.eval(), y.eval()) without problems.
If you want to reuse a variable with tf.get_variable('x'), the variable has to be created in the first place with tf.get_variable('x').
Moreover, when you want to retrieve a created variable, you need to be in a scope withreuse=True`.
Here is what your code should look like:
import tensorflow as tf
def create_var_and_prod_with(y):
with tf.variable_scope('test_scope'):
x = tf.get_variable('x', [1], initializer=tf.constant_initializer(0.0), trainable=False)
return x * y
y = tf.Variable(1.0, name='x', trainable=False)
create_var_and_prod_with(y)
with tf.variable_scope('test_scope', reuse=True):
x = tf.get_variable('x') # you only need the name to retrieve x
# Try to put the session only at the end when it is needed
with tf.Session() as sess:
sess.run(tf.initialize_all_variables())
print(x.eval())
print(y.eval())
You can read more about it in this tutorial.
I was looking at the mechanics section for Tensorflow, specifically on shared variables. In the section "The problem", they are dealing with a convolutional neural net, and provide the following code (which runs an image through the model):
# First call creates one set of variables.
result1 = my_image_filter(image1)
# Another set is created in the second call.
result2 = my_image_filter(image2)
If the model was implemented in such a way, would it then be impossible to learn/update the parameters because there's a new set of parameters for each image in my training set?
Edit:
I've also tried "the problem" approach on a simple linear regression example, and there do not appear to be any issues with this method of implementation. Training seems to work as well as can be shown by the last line of the code. So I'm wondering if there is a subtle discrepancy in the tensorflow documentation and what I'm doing. :
import tensorflow as tf
import numpy as np
trX = np.linspace(-1, 1, 101)
trY = 2 * trX + np.random.randn(*trX.shape) * 0.33 # create a y value which is approximately linear but with some random noise
X = tf.placeholder("float") # create symbolic variables
Y = tf.placeholder("float")
def model(X):
with tf.variable_scope("param"):
w = tf.Variable(0.0, name="weights") # create a shared variable (like theano.shared) for the weight matrix
return tf.mul(X, w) # lr is just X*w so this model line is pretty simple
y_model = model(X)
cost = (tf.pow(Y-y_model, 2)) # use sqr error for cost function
train_op = tf.train.GradientDescentOptimizer(0.01).minimize(cost) # construct an optimizer to minimize cost and fit line to my data
sess = tf.Session()
init = tf.initialize_all_variables() # you need to initialize variables (in this case just variable W)
sess.run(init)
with tf.variable_scope("train"):
for i in range(100):
for (x, y) in zip(trX, trY):
sess.run(train_op, feed_dict={X: x, Y: y})
print sess.run(y_model, feed_dict={X: np.array([1,2,3])})
One has to create the variable set only once per whole training (and testing) set. The goal of variable scopes is to allow for modularization of subsets of parameters, such as those belonging to layers (e.g. when architecture of a layer is repeated, the same names can be used within each layer scope).
In your example you create parameters only in the model function. You can print out your variable names to see that it is assigned to the specified scope:
from __future__ import print_function
X = tf.placeholder("float") # create symbolic variables
Y = tf.placeholder("float")
print("X:", X.name)
print("Y:", Y.name)
def model(X):
with tf.variable_scope("param"):
w = tf.Variable(0.0, name="weights") # create a shared variable (like theano.shared) for the weight matrix
print("w:", w.name)
return tf.mul(X, w)
The call to sess.run(train_op, feed_dict={X: x, Y: y}) only evaluates the value of train_op given the provided values of X and Y. No new variables (incl. parameters) are created there; therefore, it has no effect. You can make sure the variable names stay the same by again printing them out:
with tf.variable_scope("train"):
print("X:", X.name)
print("Y:", Y.name)
for i in range(100):
for (x, y) in zip(trX, trY):
sess.run(train_op, feed_dict={X: x, Y: y})
You will see that variable names stay the same, as they are already initialized.
If you'd like to retrieve a variable using its scope, you need to use get_variable within a tf.variable_scope enclosure:
with tf.variable_scope("param"):
w = tf.get_variable("weights", [1])
print("w:", w.name)