The following code results in a very unhelpful error:
import tensorflow as tf
x = tf.Variable(tf.constant(0.), name="x")
with tf.Session() as s:
    val = s.run(x.assign(1))
    print(val)  # 1
    val = s.run(x, {x: 2})
    print(val)  # 2
    val = s.run(x.assign(1), {x: 0.})  # InvalidArgumentError
tensorflow.python.framework.errors_impl.InvalidArgumentError: Input 0 of node Assign_1 was passed float from _arg_x_0_0:0 incompatible with expected float_ref.
How did I get this error?
Why do I get this error?
Here's what I could infer.
How did I get this error?
This error is seen when attempting to perform the following two operations in a single session run:
A Tensorflow variable is assigned a value
That same variable is also passed a value as part of the feed_dict
This is why the first two runs succeed: neither of them attempts both operations in the same run.
Why do I get this error?
I am not sure, but I don't think this was an intentional design choice by Google. Here's my explanation:
Firstly, the TF (TensorFlow) source code (basically) resolves x.assign(1) to tf.assign(x, 1), which helps us better understand the error message when it says Input 0.
The error message refers to x when it says Input 0 of the assign op.
It goes on to say that the first argument of the assign op was passed float from _arg_x_0_0:0.
TLDR
Thus, for a run where a TF variable is provided as a feed, that variable is no longer treated as a variable (but instead as the value it was fed), so any attempt to further assign a value to it is erroneous, since only TF variables can be assigned a value in the graph. A short demonstration follows.
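To make that concrete, here is a minimal sketch (reusing the x from the snippet at the top of this question): feeding x overrides what downstream ops see for that run only, while the stored value is untouched.
with tf.Session() as s:
    s.run(x.assign(5.))
    print(s.run(x + 1., {x: 10.}))  # 11.0 -- downstream ops see the fed value, not the variable
    print(s.run(x))                 # 5.0  -- the stored value was never changed by the feed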
Fix
If your graph has a variable assignment operation, don't pass a value for that same variable in your feed_dict. ¯\_(ツ)_/¯ Assuming you're using the feed_dict to provide an initial value, you could instead assign the variable a value in a prior session run (a sketch of that follows the snippet below), or leverage tf.control_dependencies when building your graph to assign it an initial value from a placeholder, as shown below:
import tensorflow as tf
x = tf.Variable(tf.constant(0.), name="x")
initial_x = tf.placeholder(tf.float32)
assign_from_placeholder = x.assign(initial_x)
with tf.control_dependencies([assign_from_placeholder]):
    x_assign = x.assign(1)
with tf.Session() as s:
    val = s.run(x_assign, {initial_x: 0.})  # Success!
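And a minimal sketch of the other option mentioned above (assigning the initial value in a prior session run instead of feeding the variable itself):
import tensorflow as tf
x = tf.Variable(tf.constant(0.), name="x")
initial_x = tf.placeholder(tf.float32)
set_initial = x.assign(initial_x)
x_assign = x.assign(1)
with tf.Session() as s:
    s.run(set_initial, {initial_x: 0.})  # first run: only the initial assignment
    val = s.run(x_assign)                # second run: no feed for x, so no conflict
    print(val)                           # 1.0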
I'm trying to understand how local and global variables are different in tensorflow and what's the right way to initialize the variables.
According to the doc, tf.local_variables_initializer:
Returns an Op that initializes all local variables.
This is just a shortcut for variables_initializer(local_variables())
So the essential part is tf.local_variables. The doc:
Local variables - per process variables, usually not saved/restored to checkpoint and used for temporary or intermediate values. For example, they can be used as counters for metrics computation or number of epochs this machine has read data.
It sounds logical; however, no matter what I tried, I couldn't make any variable local.
features = 2
hidden = 3
with tf.variable_scope('start'):
    x = tf.placeholder(tf.float32, shape=[None, features], name='x')
    y = tf.placeholder(tf.float32, shape=[None], name='y')
with tf.variable_scope('linear'):
    W = tf.get_variable(name='W', shape=[features, hidden])
    b = tf.get_variable(name='b', shape=[hidden], initializer=tf.zeros_initializer)
    z = tf.matmul(x, W) + b
with tf.variable_scope('optimizer'):
    predict = tf.reduce_sum(z, axis=1)
    loss = tf.reduce_mean(tf.square(y - predict))
    optimizer = tf.train.AdamOptimizer(0.1).minimize(loss)
print(tf.local_variables())
The output is always an empty list. How should I create local variables, and should I be creating them at all?
A local variable is just a regular variable that's added to a "special" collection.
The collection is tf.GraphKeys.LOCAL_VARIABLES.
You can pick any variable definition and just add the parameter collections=[tf.GraphKeys.LOCAL_VARIABLES] to add the variable to the specified collection list.
Think I found it. The magic addition that makes a variable local is collections=[tf.GraphKeys.LOCAL_VARIABLES] in tf.get_variable. This way W becomes a local variable:
W = tf.get_variable(name='W', shape=[features, hidden], collections=[tf.GraphKeys.LOCAL_VARIABLES])
The documentation mentions one more possibility that also works:
q = tf.contrib.framework.local_variable(0.0, name='q')
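A quick way to check either approach is the sketch below (the variable name is made up for illustration): the variable shows up in tf.local_variables(), stays out of tf.global_variables(), and is initialized by tf.local_variables_initializer().
import tensorflow as tf
w = tf.get_variable('w_local', shape=[2],
                    collections=[tf.GraphKeys.LOCAL_VARIABLES])
print(tf.local_variables())   # [<tf.Variable 'w_local:0' shape=(2,) ...>]
print(tf.global_variables())  # [] -- not added to the GLOBAL_VARIABLES collection
with tf.Session() as sess:
    sess.run(tf.local_variables_initializer())
    print(sess.run(w))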
I want to use a variable where the shape is unknown in advance and it will change from time to time (although ndim is known and fixed).
I declare it like:
initializer = tf.random_uniform_initializer()
shape = (s0, s1, s2) # these are symbolic vars
foo_var = tf.Variable(initializer(shape=shape), name="foo", validate_shape=False)
This seems to work while I build the computation graph, right up to the point where I want to optimize w.r.t. this variable, i.e.:
optimizer = tf.train.AdamOptimizer(learning_rate=0.1, epsilon=1e-4)
optim = optimizer.minimize(loss, var_list=[foo_var])
That fails in the optimizer in some function create_zeros_slot where it seems to depend on the static shape information (it uses primary.get_shape().as_list()). (I reported this upstream here.)
So, using the optimizer works only with variables with static shape?
I.e. for every change of the shape of the variable, I need to rebuild the computation graph?
Or is there any way to avoid the recreation?
What you are doing does not make any sense. How would you optimize the values of a dynamic variable if its shape keeps changing? Sometimes a value would be there and sometimes it would not. When you go to save the graph, which shape would the variable be in? The Adam optimizer also needs to store information about each parameter in a variable between executions, which it cannot do without knowing the size.
Now, if what you are looking to do is only use part of the variable at a time, you can take slices of it and use them. This will work fine as long as the variable has a fixed shape equal to the maximum bounds of the slice. For instance...
initializer = tf.random_uniform_initializer()
shape = (S0, S1, S2) # these are now constants for the max bounds of si
foo_var = tf.Variable(initializer(shape=shape), name="foo")
foo_var_sub = foo_var[:s0, :s1, :s2]
In this case the optimizer will only act on the parts of foo_var which contributed to the slice.
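For instance, a runnable sketch of that idea (the toy loss and the bound values here are made up; s0, s1, s2 are assumed to be scalar int32 placeholders as in the question):
import tensorflow as tf
S0, S1, S2 = 4, 5, 6  # fixed maximum bounds
s0 = tf.placeholder(tf.int32, shape=[])
s1 = tf.placeholder(tf.int32, shape=[])
s2 = tf.placeholder(tf.int32, shape=[])
foo_var = tf.Variable(tf.random_uniform((S0, S1, S2)), name="foo")
foo_var_sub = foo_var[:s0, :s1, :s2]
loss = tf.reduce_sum(tf.square(foo_var_sub))  # toy loss defined on the slice only
optim = tf.train.AdamOptimizer(0.1).minimize(loss, var_list=[foo_var])
with tf.Session() as sess:
    sess.run(tf.global_variables_initializer())
    sess.run(optim, feed_dict={s0: 2, s1: 3, s2: 4})  # gradients reach only the sliced entries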
My solution at the moment is a bit ugly but works.
def _tf_create_slot_var(primary, val, scope):
    """Helper function for creating a slot variable."""
    from tensorflow.python.ops import variables
    slot = variables.Variable(val, name=scope, trainable=False,
                              validate_shape=primary.get_shape().is_fully_defined())
    # pylint: disable=protected-access
    if isinstance(primary, variables.Variable) and primary._save_slice_info:
        # Primary is a partitioned variable, so we need to also indicate that
        # the slot is a partitioned variable. Slots have the same partitioning
        # as their primaries.
        real_slot_name = scope[len(primary.op.name + "/"):-1]
        slice_info = primary._save_slice_info
        slot._set_save_slice_info(variables.Variable.SaveSliceInfo(
            slice_info.full_name + "/" + real_slot_name,
            slice_info.full_shape[:],
            slice_info.var_offset[:],
            slice_info.var_shape[:]))
    # pylint: enable=protected-access
    return slot

def _tf_create_zeros_slot(primary, name, dtype=None, colocate_with_primary=True):
    """Create a slot initialized to 0 with same shape as the primary object.

    Args:
        primary: The primary `Variable` or `Tensor`.
        name: Name to use for the slot variable.
        dtype: Type of the slot variable. Defaults to the type of `primary`.
        colocate_with_primary: Boolean. If True the slot is located
            on the same device as `primary`.

    Returns:
        A `Variable` object.
    """
    if dtype is None:
        dtype = primary.dtype
    from tensorflow.python.ops import array_ops
    val = array_ops.zeros(
        primary.get_shape().as_list() if primary.get_shape().is_fully_defined()
        else tf.shape(primary),
        dtype=dtype)
    from tensorflow.python.training import slot_creator
    return slot_creator.create_slot(primary, val, name,
                                    colocate_with_primary=colocate_with_primary)

def monkey_patch_tf_slot_creator():
    """
    The TensorFlow optimizers cannot handle variables with unknown shape.
    We hack this.
    """
    from tensorflow.python.training import slot_creator
    slot_creator._create_slot_var = _tf_create_slot_var
    slot_creator.create_zeros_slot = _tf_create_zeros_slot
Then I need to call monkey_patch_tf_slot_creator() at some point.
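For context, a hedged usage sketch (loss and foo_var are the ones from the question; the only requirement is that the patch runs before the optimizer creates its slots):
monkey_patch_tf_slot_creator()  # must happen before minimize() builds the slot variables
optimizer = tf.train.AdamOptimizer(learning_rate=0.1, epsilon=1e-4)
optim = optimizer.minimize(loss, var_list=[foo_var])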
Both a and b below are scalars. I'm trying to reassign the variable, but I'm unable to because every iteration changes its size. I tried all kinds of transformations, but they do not work. Any ideas?
I'm basically trying to get behaviour like appending an element to a list.
a = tf.Variable(0.00, tf.float32)
b = tf.Variable(0.00, tf.float32)
a.assign(tf.pack([a, b]))
This gives an error:
ValueError: Shapes () and (2,) are not compatible
a is a single scalar variable
b is a single scalar variable
You can only assign other single scalar values to these variables (it's not like Python, where you can assign anything).
When you do tf.pack([a, b]) you are creating a tensor of shape (2,), and you cannot assign that tensor to a single scalar variable. You have to create a new Variable.
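If you really do need append-like growth in a single variable, one possible workaround (a sketch under TF 1.x graph-mode assumptions, not the only option) is to create the variable with validate_shape=False and assign without shape validation; note that tf.pack was later renamed tf.stack, and the append itself is done with tf.concat here:
import tensorflow as tf
a = tf.Variable(0.00, dtype=tf.float32, validate_shape=False, name="a")
b = tf.Variable(1.00, dtype=tf.float32, name="b")
# flatten the current contents of a and append the scalar b to it
appended = tf.concat([tf.reshape(a, [-1]), tf.reshape(b, [1])], axis=0)
append_op = tf.assign(a, appended, validate_shape=False)
with tf.Session() as s:
    s.run(tf.global_variables_initializer())
    print(s.run(append_op))  # [0. 1.]
    print(s.run(append_op))  # [0. 1. 1.] -- the stored shape grew between runs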
What's the differences between these functions?
tf.variable_op_scope(values, name, default_name, initializer=None)
Returns a context manager for defining an op that creates variables.
This context manager validates that the given values are from the same graph, ensures that that graph is the default graph, and pushes a name scope and a variable scope.
tf.op_scope(values, name, default_name=None)
Returns a context manager for use when defining a Python op.
This context manager validates that the given values are from the same graph, ensures that that graph is the default graph, and pushes a name scope.
tf.name_scope(name)
Wrapper for Graph.name_scope() using the default graph.
See Graph.name_scope() for more details.
tf.variable_scope(name_or_scope, reuse=None, initializer=None)
Returns a context for variable scope.
Variable scope allows to create new variables and to share already created ones while providing checks to not create or share by accident. For details, see the Variable Scope How To, here we present only a few basic examples.
Let's begin with a short introduction to variable sharing. It is a mechanism in TensorFlow that allows sharing variables accessed in different parts of the code without passing references to the variable around.
The method tf.get_variable can be used with the name of the variable as the argument to either create a new variable with that name or retrieve the one that was created before. This is different from using the tf.Variable constructor, which creates a new variable every time it is called (potentially adding a suffix to the variable name if a variable with that name already exists), as the short example below illustrates.
It is for the purpose of the variable sharing mechanism that a separate type of scope (variable scope) was introduced.
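A tiny sketch of that suffix behaviour (names chosen just for illustration):
import tensorflow as tf
a = tf.Variable(0., name="same_name")
b = tf.Variable(0., name="same_name")
print(a.name)  # same_name:0
print(b.name)  # same_name_1:0 -- the constructor silently uniquified the name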
As a result, we end up having two different types of scopes:
name scope, created using tf.name_scope
variable scope, created using tf.variable_scope
Both scopes have the same effect on all operations as well as variables created using tf.Variable, i.e., the scope will be added as a prefix to the operation or variable name.
However, name scope is ignored by tf.get_variable. We can see that in the following example:
with tf.name_scope("my_scope"):
v1 = tf.get_variable("var1", [1], dtype=tf.float32)
v2 = tf.Variable(1, name="var2", dtype=tf.float32)
a = tf.add(v1, v2)
print(v1.name) # var1:0
print(v2.name) # my_scope/var2:0
print(a.name) # my_scope/Add:0
The only way to place a variable accessed using tf.get_variable in a scope is to use a variable scope, as in the following example:
with tf.variable_scope("my_scope"):
v1 = tf.get_variable("var1", [1], dtype=tf.float32)
v2 = tf.Variable(1, name="var2", dtype=tf.float32)
a = tf.add(v1, v2)
print(v1.name) # my_scope/var1:0
print(v2.name) # my_scope/var2:0
print(a.name) # my_scope/Add:0
This allows us to easily share variables across different parts of the program, even within different name scopes:
with tf.name_scope("foo"):
with tf.variable_scope("var_scope"):
v = tf.get_variable("var", [1])
with tf.name_scope("bar"):
with tf.variable_scope("var_scope", reuse=True):
v1 = tf.get_variable("var", [1])
assert v1 == v
print(v.name) # var_scope/var:0
print(v1.name) # var_scope/var:0
UPDATE
As of version r0.11, op_scope and variable_op_scope are both deprecated and replaced by name_scope and variable_scope.
Both variable_op_scope and op_scope are now deprecated and should not be used at all.
Regarding the other two, I also had problems understanding the difference between variable_scope and name_scope (they looked almost the same) before I tried to visualize everything by creating a simple example:
import tensorflow as tf

def scoping(fn, scope1, scope2, vals):
    with fn(scope1):
        a = tf.Variable(vals[0], name='a')
        b = tf.get_variable('b', initializer=vals[1])
        c = tf.constant(vals[2], name='c')
        with fn(scope2):
            d = tf.add(a * b, c, name='res')
        print('\n '.join([scope1, a.name, b.name, c.name, d.name]), '\n')
        return d

d1 = scoping(tf.variable_scope, 'scope_vars', 'res', [1, 2, 3])
d2 = scoping(tf.name_scope, 'scope_name', 'res', [1, 2, 3])

with tf.Session() as sess:
    writer = tf.summary.FileWriter('logs', sess.graph)
    sess.run(tf.global_variables_initializer())
    print(sess.run([d1, d2]))
    writer.close()
Here I create a function that creates some variables and constants and groups them in scopes (depending on the scope type provided). In this function, I also print the names of all the variables. After that, I execute the graph to get the resulting values and save the event files so they can be investigated in TensorBoard. If you run this, you will get the following:
scope_vars
scope_vars/a:0
scope_vars/b:0
scope_vars/c:0
scope_vars/res/res:0
scope_name
scope_name/a:0
b:0
scope_name/c:0
scope_name/res/res:0
You see a similar pattern if you open TensorBoard (note that b is outside the scope_name rectangle):
This gives you the answer:
Now you see that tf.variable_scope() adds a prefix to the names of all variables (no matter how you create them), ops, and constants. On the other hand, tf.name_scope() ignores variables created with tf.get_variable() because it assumes that you know which variable, and in which scope, you want to use.
The documentation on Sharing variables tells you that
tf.variable_scope(): Manages namespaces for names passed to tf.get_variable().
The same documentation provides more details on how variable scope works and when it is useful.
A namespace is a way to organize names for variables and operators in a hierarchical manner (e.g. "scopeA/scopeB/scopeC/op1").
tf.name_scope creates namespace for operators in the default graph.
tf.variable_scope creates namespace for both variables and operators in the default graph.
tf.op_scope: the same as tf.name_scope, but for the graph in which the specified variables were created.
tf.variable_op_scope: the same as tf.variable_scope, but for the graph in which the specified variables were created.
Links to the sources above help to disambiguate this documentation issue.
This example shows that all types of scopes define namespaces for both variables and operators, with the following differences:
scopes defined by tf.variable_op_scope or tf.variable_scope are compatible with tf.get_variable (the other two scopes are ignored by it)
tf.op_scope and tf.variable_op_scope just select a graph from a list of specified variables to create a scope for; other than that, their behavior is equal to tf.name_scope and tf.variable_scope respectively
tf.variable_scope and tf.variable_op_scope add a specified or default initializer (a small sketch follows this list)
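A small sketch of that last point (the constant value 3. is arbitrary): a variable scope can carry a default initializer that tf.get_variable picks up when none is given explicitly.
import tensorflow as tf
with tf.variable_scope("init_scope", initializer=tf.constant_initializer(3.)):
    w = tf.get_variable("w", shape=[1])  # no initializer given: the scope's default is used
with tf.Session() as s:
    s.run(tf.global_variables_initializer())
    print(s.run(w))  # [3.]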
Let's make it simple: just use tf.variable_scope. Quoting a TF developer:
Currently, we recommend everyone to use variable_scope and not use name_scope except for internal code and libraries.
Besides the fact that variable_scope's functionality basically extends that of name_scope, together they behave in a way that may surprise you:
with tf.name_scope('foo'):
    with tf.variable_scope('bar'):
        x = tf.get_variable('x', shape=())
        x2 = tf.square(x**2, name='x2')

print(x.name)
# bar/x:0
print(x2.name)
# foo/bar/x2:0
This behavior has its uses and justifies the coexistence of both scopes; but unless you know what you are doing, sticking to variable_scope only will spare you some headaches.
As of API r0.11, op_scope and variable_op_scope are both deprecated.
name_scope and variable_scope can be nested:
with tf.name_scope('ns'):
    with tf.variable_scope('vs'):  # scope creation
        v1 = tf.get_variable("v1", [1.0])   # v1.name = 'vs/v1:0'
        v2 = tf.Variable([2.0], name='v2')  # v2.name = 'ns/vs/v2:0'
        v3 = v1 + v2                        # v3.name = 'ns/vs/add:0'
You can think of them as two groups: variable_op_scope and op_scope take a set of variables as input and are designed to create operations. The difference is in how they affect the creation of variables with tf.get_variable:
def mysum(a, b, name=None):
    with tf.op_scope([a, b], name, "mysum") as scope:
        v = tf.get_variable("v", 1)
        v2 = tf.Variable([0], name="v2")
        assert v.name == "v:0", v.name
        assert v2.name == "mysum/v2:0", v2.name
        return tf.add(a, b)

def mysum2(a, b, name=None):
    with tf.variable_op_scope([a, b], name, "mysum2") as scope:
        v = tf.get_variable("v", 1)
        v2 = tf.Variable([0], name="v2")
        assert v.name == "mysum2/v:0", v.name
        assert v2.name == "mysum2/v2:0", v2.name
        return tf.add(a, b)

with tf.Graph().as_default():
    op = mysum(tf.Variable(1), tf.Variable(2))
    op2 = mysum2(tf.Variable(1), tf.Variable(2))
    assert op.name == 'mysum/Add:0', op.name
    assert op2.name == 'mysum2/Add:0', op2.name
Notice the name of the variable v in the two examples. The same holds for tf.name_scope and tf.variable_scope:
with tf.Graph().as_default():
    with tf.name_scope("name_scope") as scope:
        v = tf.get_variable("v", [1])
        op = tf.add(v, v)
        v2 = tf.Variable([0], name="v2")
        assert v.name == "v:0", v.name
        assert op.name == "name_scope/Add:0", op.name
        assert v2.name == "name_scope/v2:0", v2.name

with tf.Graph().as_default():
    with tf.variable_scope("name_scope") as scope:
        v = tf.get_variable("v", [1])
        op = tf.add(v, v)
        v2 = tf.Variable([0], name="v2")
        assert v.name == "name_scope/v:0", v.name
        assert op.name == "name_scope/Add:0", op.name
        assert v2.name == "name_scope/v2:0", v2.name
You can read more about variable scope in the tutorial.
A similar question was asked before on Stack Overflow.
From the last section of this page of the TensorFlow documentation: Names of ops in tf.variable_scope()
[...] when we do with tf.variable_scope("name"), this implicitly opens a tf.name_scope("name"). For example:
with tf.variable_scope("foo"):
x = 1.0 + tf.get_variable("v", [1])
assert x.op.name == "foo/add"
Name scopes can be opened in addition to a variable scope, and then they will only affect the names of the ops, but not of variables.
with tf.variable_scope("foo"):
with tf.name_scope("bar"):
v = tf.get_variable("v", [1])
x = 1.0 + v
assert v.name == "foo/v:0"
assert x.op.name == "foo/bar/add"
When opening a variable scope using a captured object instead of a string, we do not alter the current name scope for ops.
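There is no example for that last sentence in the quoted section, so here is a sketch of what it means (assumed TF 1.x graph mode; the exact op name may vary by version): variables keep the captured scope's prefix, while ops keep whatever name scope is current at the point of re-entry.
import tensorflow as tf
with tf.variable_scope("foo") as foo_scope:
    v = tf.get_variable("v", [1])
with tf.variable_scope(foo_scope, reuse=True):  # captured object, not a string
    v_again = tf.get_variable("v", [1])         # same variable: foo/v:0
    x = 1.0 + v_again
print(v_again.name)  # foo/v:0
print(x.op.name)     # not under "foo/" -- the name scope for ops was not re-opened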
Tensorflow 2.0 Compatible Answer: The explanations of Andrzej Pronobis and Salvador Dali are very detailed about the functions related to scope.
Of the scope functions discussed above, the ones that are still active as of now (17th Feb 2020) are variable_scope and name_scope.
Below are the 2.0-compatible calls for those functions, for the benefit of the community.
Function in 1.x → Respective function in 2.x:
tf.variable_scope → tf.compat.v1.variable_scope
tf.name_scope → tf.name_scope (tf.compat.v2.name_scope if migrated from 1.x to 2.x)
For more information about migrating from 1.x to 2.x, please refer to this Migration Guide.
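A minimal sketch of those 2.x-compatible calls (assumes TensorFlow 2.x; eager execution is disabled here only to keep the graph-mode naming semantics shown throughout this thread):
import tensorflow as tf
tf.compat.v1.disable_eager_execution()  # keep graph-mode semantics for this example
with tf.compat.v1.variable_scope("my_scope"):
    v1 = tf.compat.v1.get_variable("var1", [1], dtype=tf.float32)
with tf.name_scope("my_ops"):
    a = tf.add(v1, 1.0, name="add")
print(v1.name)  # my_scope/var1:0
print(a.name)   # my_ops/add:0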