I've been trying to understand how variables are initialized in TensorFlow. Below is a simple example that defines a variable inside a variable_scope, with the creation wrapped in a helper function.
As I understand it, this code creates a variable 'x' inside 'test_scope' at the tf.initialize_all_variables() stage, and after that the variable can always be accessed using tf.get_variable(). But the code fails with an "Attempting to use uninitialized value" error at the print(x.eval()) line.
I don't understand how TensorFlow initializes variables. Can I get any help? Thank you.
import tensorflow as tf
def create_var_and_prod_with(y):
    with tf.variable_scope('test_scope'):
        x = tf.Variable(0.0, name='x', trainable=False)
    return x * y
s = tf.InteractiveSession()
y = tf.Variable(1.0, name='x', trainable=False)
create_var_and_prod_with(y)
s.run(tf.initialize_all_variables())
with tf.variable_scope('test_scope'):
    x = tf.get_variable('x', [1], initializer=tf.constant_initializer(0.0), trainable=False)
    print(x.eval())
    print(y.eval())
If you want to reuse a variable, you have to declare it using tf.get_variable and then explicitly ask the scope to make its variables reusable.
If you change the line
x = tf.Variable(0.0, name='x', trainable=False)
with:
x = tf.get_variable('x', [1], trainable=False)
and you ask the scope to make the already defined variable available:
with tf.variable_scope('test_scope') as scope:
    scope.reuse_variables()
    x = tf.get_variable('x', [1], initializer=tf.constant_initializer(0.0), trainable=False)
Then you can run print(x.eval(), y.eval()) without problems.
If you want to reuse a variable with tf.get_variable('x'), the variable has to be created in the first place with tf.get_variable('x').
Moreover, when you want to retrieve a created variable, you need to be in a scope with reuse=True.
Here is what your code should look like:
import tensorflow as tf
def create_var_and_prod_with(y):
    with tf.variable_scope('test_scope'):
        x = tf.get_variable('x', [1], initializer=tf.constant_initializer(0.0), trainable=False)
    return x * y
y = tf.Variable(1.0, name='x', trainable=False)
create_var_and_prod_with(y)
with tf.variable_scope('test_scope', reuse=True):
    x = tf.get_variable('x')  # you only need the name to retrieve x
# Try to put the session only at the end when it is needed
with tf.Session() as sess:
    sess.run(tf.initialize_all_variables())
    print(x.eval())
    print(y.eval())
You can read more about it in this tutorial.
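The reuse rules described in these answers can be pictured with a small plain-Python model. This is only a sketch of the lookup behaviour, not TensorFlow's implementation: ToyVariableStore and plain_variable are hypothetical names made up for illustration, and the real store also tracks shapes, dtypes and initializers.

```python
class ToyVariableStore:
    """Hypothetical, simplified stand-in for the store behind tf.get_variable."""

    def __init__(self):
        self._vars = {}  # maps "scope/name" -> variable record

    def get_variable(self, scope, name, initial_value=None, reuse=False):
        full_name = scope + "/" + name
        if reuse:
            # reuse=True only looks up; it never creates.
            if full_name not in self._vars:
                raise ValueError("Variable %s does not exist" % full_name)
            return self._vars[full_name]
        # reuse=False only creates; a name clash is an error.
        if full_name in self._vars:
            raise ValueError("Variable %s already exists" % full_name)
        record = {"name": full_name, "value": initial_value}
        self._vars[full_name] = record
        return record


def plain_variable(scope, name, initial_value):
    """Models tf.Variable: builds a new record without registering it in a store."""
    return {"name": scope + "/" + name, "value": initial_value}


store = ToyVariableStore()

# A variable made the tf.Variable way is invisible to get_variable lookups.
unregistered = plain_variable("test_scope", "x", 0.0)
try:
    store.get_variable("test_scope", "x", reuse=True)
    lookup_failed = False
except ValueError:
    lookup_failed = True

# Created through the store, the same record comes back under reuse=True.
created = store.get_variable("test_scope", "x", initial_value=0.0)
retrieved = store.get_variable("test_scope", "x", reuse=True)
print(lookup_failed, retrieved is created)
```

In the question's code the lookup step is what went wrong: tf.get_variable did not find the x made with tf.Variable, so it created a fresh variable after the initializer had already run.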
Related
I am trying to implement a noisy linear layer in TensorFlow, inheriting from tf.keras.layers.Layer. Everything works fine except for reusing variables. This seems to stem from some issue with the scoping: whenever I use the add_weight function from the superclass and a weight with the same name already exists, it seems to ignore the reuse flag given in the scope and creates a new variable instead. Interestingly, it does not append a 1 to the variable name, as usually happens in similar cases, but appends the 1 to the scope name instead.
import tensorflow as tf
class NoisyDense(tf.keras.layers.Layer):
    def __init__(self, output_dim):
        self.output_dim = output_dim
        super(NoisyDense, self).__init__()
    def build(self, input_shape):
        self.input_dim = input_shape.as_list()[1]
        self.noisy_kernel = self.add_weight(name='noisy_kernel', shape=(self.input_dim, self.output_dim))
def noisydense(inputs, units):
    layer = NoisyDense(units)
    return layer.apply(inputs)
inputs = tf.placeholder(tf.float32, shape=(1, 10), name="inputs")
scope = "scope"
with tf.variable_scope(scope):
    inputs3 = noisydense(inputs, 1)
    my_variable = tf.get_variable("my_variable", [1, 2, 3], trainable=True)
with tf.variable_scope(scope, reuse=tf.AUTO_REUSE):
    inputs2 = noisydense(inputs, 1)
    my_variable = tf.get_variable("my_variable", [1, 2, 3], trainable=True)
tvars = tf.trainable_variables()
init = tf.global_variables_initializer()
with tf.Session() as sess:
    sess.run(init)
    tvars_vals = sess.run(tvars)
    for var, val in zip(tvars, tvars_vals):
        print(var.name, val)
This results in the variables
scope/noisy_dense/noisy_kernel:0
scope_1/noisy_dense/noisy_kernel:0
scope/my_variable:0
being printed. I would like it to reuse the noisy kernel instead of creating a second one, as it is done for my_variable.
I want to create a temporary variable in TF and then subtract it from my input variable during the training phase. This is a simplified version of the code I use. Could you please give me some advice on how to make it work?
Please keep in mind that I don't want to create the variable outside the training phase.
import tensorflow as tf
def some_transformation(x):
    x0 = tf.get_variable('x0', initializer=tf.random_uniform([1], maxval=0.3, dtype=tf.float32), dtype=tf.float32)
    return tf.subtract(x, x0)
x = tf.placeholder("float", [])
is_traning = tf.placeholder(tf.int32, None)
x_transformed = tf.cond(is_traning > 0, lambda: some_transformation(x), lambda: x)
#x_transformed = some_transformation(x)
init = tf.global_variables_initializer()
with tf.Session() as sess:
    sess.run(init)
    out = sess.run(x_transformed, feed_dict={x: 10, is_traning: 1})
    print(out)
Please do post your error messages along with your code. After running your code, I see you're getting this error:
ValueError: Initializer for variable x0/ is from inside a control-flow construct, such as a loop or conditional. When creating a variable inside a loop or conditional, use a lambda as the initializer.
That appears to be because you are calling tf.get_variable from inside some_transformation, which runs inside tf.cond. The message is telling you to change the argument initializer=tf.random_uniform(...) to initializer=lambda: tf.random_uniform(...).
You might also opt to define x0 outside of the transformation and pass it in as:
x_transformed = tf.cond(is_traning > 0, lambda: some_transformation(x, x0), lambda: x)
If that's valid in your use case.
I was trying to assign a variable y in tensorflow which is to be dependent on x. But, even upon changing the value of x, y doesn't change.
import tensorflow as tf
sess = tf.Session()
x=tf.Variable(4,name='x')
model = tf.global_variables_initializer()
sess.run(model)
y=tf.Variable(2*x,name='y')
model = tf.global_variables_initializer()
sess.run(model)
sess.run(x)
sess.run(tf.assign(x,2))
print(sess.run(y))
I am expecting an output 4, but I'm getting 8. Any help would be appreciated.
Gramercy...
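y=tf.Variable(2*x, name='y') only means that y is initialized with the value of 2*x at the time its initializer runs; after that, y is an independent variable and does not follow x. Change this line to y = 2 * x and it will behave as you expect.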
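The difference between a variable initialized from an expression and the expression itself can be sketched in plain Python. This is an analogy only, not TensorFlow code: the ToyVar class and its methods are hypothetical names, chosen to mirror the values in the question (x starts at 4 and is later assigned 2).

```python
class ToyVar:
    """Hypothetical stand-in for a variable whose initial value is computed once."""

    def __init__(self, initial_value):
        # initial_value is a number or a zero-argument callable evaluated at init time
        self._initial_value = initial_value
        self.value = None

    def initialize(self):
        init = self._initial_value
        self.value = init() if callable(init) else init

    def assign(self, new_value):
        self.value = new_value


x = ToyVar(4)
x.initialize()

# Like y = tf.Variable(2*x): the expression is evaluated once, when y is initialized.
y_variable = ToyVar(lambda: 2 * x.value)
y_variable.initialize()


# Like y = 2 * x: the expression is re-evaluated every time it is read.
def y_expression():
    return 2 * x.value


x.assign(2)
print(y_variable.value, y_expression())
```

After the assignment, the variable still holds the snapshot taken at initialization, while the expression reflects the new value of x.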
In tensorflow, we can define our own op and its gradient by:
https://gist.github.com/harpone/3453185b41d8d985356cbe5e57d67342
However, can we modify any variable in the computational graph from inside these Python functions, for example in the "_MySquareGrad" function?
I assume we can get the variable by:
var = tf.get_variable('var')
and then do something to change its value and then assign it back?
e.g.
tmp = var*10
var.assign(tmp)
Thanks!
Also when we do var*10, do we have to convert it to numpy?
Background: I'm familiar with automatic differentiation, but new to Tensorflow and Python. So please point out any syntactic problem and let me know if my intention is clear.
You can modify the variables in the computational graph in these python functions. Your example code with tmp = var*10 will work and does not convert anything to numpy.
In fact you should try to avoid converting to numpy as much as possible since it will slow down the computation.
edit:
You can make your code part of the gradient computation graph of the _MySquareGrad function like this:
def _MySquareGrad(op, grad):
    #first get a Variable that was created using tf.get_variable()
    with tf.variable_scope("", reuse=True):
        var = tf.get_variable('var')
    #now create the assign graph:
    tmp = var*10.
    assign_op = var.assign(tmp)
    #now make the assign operation part of the grad calculation graph:
    with tf.control_dependencies([assign_op]):
        x = tf.identity(op.inputs[0])
    return grad * 20 * x
Here is a working example:
import tensorflow as tf
from tensorflow.python.framework import ops
import numpy as np
# Define custom py_func which takes also a grad op as argument:
def py_func(func, inp, Tout, stateful=True, name=None, grad=None):
    # Need to generate a unique name to avoid duplicates:
    rnd_name = 'PyFuncGrad' + str(np.random.randint(0, 1E+8))
    tf.RegisterGradient(rnd_name)(grad)  # see _MySquareGrad for grad example
    g = tf.get_default_graph()
    with g.gradient_override_map({"PyFunc": rnd_name}):
        return tf.py_func(func, inp, Tout, stateful=stateful, name=name)
# Def custom square function using np.square instead of tf.square:
def mysquare(x, name=None):
    with ops.name_scope(name, "Mysquare", [x]) as name:
        sqr_x = py_func(np.square,
                        [x],
                        [tf.float32],
                        name=name,
                        grad=_MySquareGrad)  # <-- here's the call to the gradient
        return sqr_x[0]
### Actual gradient:
##def _MySquareGrad(op, grad):
##    x = op.inputs[0]
##    return grad * 20 * x  # add a "small" error just to see the difference:
def _MySquareGrad(op, grad):
    #first get a Variable that was created using tf.get_variable()
    with tf.variable_scope("", reuse=True):
        var = tf.get_variable('var')
    #now create the assign graph:
    tmp = var*10.
    assign_op = var.assign(tmp)
    #now make the assign operation part of the grad calculation graph:
    with tf.control_dependencies([assign_op]):
        x = tf.identity(op.inputs[0])
    return grad * 20 * x
with tf.Session() as sess:
    x = tf.constant([1., 2.])
    var = tf.get_variable(name="var", shape=[], initializer=tf.constant_initializer(0.2))
    y = mysquare(x)
    tf.global_variables_initializer().run()
    print(x.eval(), y.eval(), tf.gradients(y, x)[0].eval())
    print("Now var is 10 times larger:", var.eval())
I was looking at the mechanics section for Tensorflow, specifically on shared variables. In the section "The problem", they are dealing with a convolutional neural net, and provide the following code (which runs an image through the model):
# First call creates one set of variables.
result1 = my_image_filter(image1)
# Another set is created in the second call.
result2 = my_image_filter(image2)
If the model was implemented in such a way, would it then be impossible to learn/update the parameters because there's a new set of parameters for each image in my training set?
Edit:
I've also tried "the problem" approach on a simple linear regression example, and there do not appear to be any issues with this method of implementation. Training seems to work as well, as the last line of the code shows. So I'm wondering whether there is a subtle discrepancy between the TensorFlow documentation and what I'm doing:
import tensorflow as tf
import numpy as np
trX = np.linspace(-1, 1, 101)
trY = 2 * trX + np.random.randn(*trX.shape) * 0.33 # create a y value which is approximately linear but with some random noise
X = tf.placeholder("float") # create symbolic variables
Y = tf.placeholder("float")
def model(X):
    with tf.variable_scope("param"):
        w = tf.Variable(0.0, name="weights") # create a shared variable (like theano.shared) for the weight matrix
    return tf.mul(X, w) # lr is just X*w so this model line is pretty simple
y_model = model(X)
cost = (tf.pow(Y-y_model, 2)) # use sqr error for cost function
train_op = tf.train.GradientDescentOptimizer(0.01).minimize(cost) # construct an optimizer to minimize cost and fit line to my data
sess = tf.Session()
init = tf.initialize_all_variables() # you need to initialize variables (in this case just variable W)
sess.run(init)
with tf.variable_scope("train"):
    for i in range(100):
        for (x, y) in zip(trX, trY):
            sess.run(train_op, feed_dict={X: x, Y: y})
print sess.run(y_model, feed_dict={X: np.array([1,2,3])})
The set of variables has to be created only once for the whole training (and testing) set, not once per example. The goal of variable scopes is to allow for modularization of subsets of parameters, such as those belonging to layers (e.g. when the architecture of a layer is repeated, the same names can be used within each layer scope).
In your example you create parameters only in the model function. You can print out your variable names to see that it is assigned to the specified scope:
from __future__ import print_function
X = tf.placeholder("float") # create symbolic variables
Y = tf.placeholder("float")
print("X:", X.name)
print("Y:", Y.name)
def model(X):
    with tf.variable_scope("param"):
        w = tf.Variable(0.0, name="weights") # create a shared variable (like theano.shared) for the weight matrix
    print("w:", w.name)
    return tf.mul(X, w)
The call to sess.run(train_op, feed_dict={X: x, Y: y}) only evaluates train_op given the provided values of X and Y. No new variables (including parameters) are created there; therefore, the enclosing tf.variable_scope("train") has no effect. You can make sure the variable names stay the same by printing them out again:
with tf.variable_scope("train"):
    print("X:", X.name)
    print("Y:", Y.name)
    for i in range(100):
        for (x, y) in zip(trX, trY):
            sess.run(train_op, feed_dict={X: x, Y: y})
You will see that the variable names stay the same, because the variables were already created when the graph was built.
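The distinction between building the model once and running the training step many times can be sketched without TensorFlow. The following is a plain-Python analogy under stated assumptions, not the library's mechanics: build_model, train_step and created_params are hypothetical names, the data mimics the question's noisy line with slope 2, and the update is ordinary gradient descent on the squared error.

```python
import random

created_params = []  # records every parameter the construction step creates


def build_model():
    """Plays the role of model(X): called once, creates one parameter."""
    param = {"name": "param/weights", "value": 0.0}
    created_params.append(param["name"])
    return param


def train_step(param, x, y, learning_rate=0.01):
    """Plays the role of sess.run(train_op, ...): updates the existing parameter."""
    error = y - param["value"] * x
    gradient = -2.0 * error * x  # derivative of (y - w*x)**2 with respect to w
    param["value"] -= learning_rate * gradient


random.seed(0)
tr_x = [-1.0 + 2.0 * i / 100 for i in range(101)]
tr_y = [2.0 * x + random.gauss(0.0, 0.33) for x in tr_x]

w = build_model()  # construction happens once
for _ in range(100):  # running happens many times
    for x, y in zip(tr_x, tr_y):
        train_step(w, x, y)

print(created_params, round(w["value"], 2))
```

Only one parameter is ever created, however many training steps run, which is why training works in the question's regression example. A second call to build_model would correspond to the documentation's "problem" case of calling my_image_filter twice.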
If you'd like to retrieve a variable using its scope, you need to use get_variable within a tf.variable_scope enclosure:
with tf.variable_scope("param"):
    w = tf.get_variable("weights", [1])
print("w:", w.name)