Dynamic function call, depending on condition in tensorflow graph - tensorflow

I'm trying to implement a dynamic_rnn_decoder. However, I get an exception, because after the second element the Tensors in the cell have already been created. Thus I want to set reuse=True after the first iteration.
Is there an op which dynamically calls a function depending on a condition (like fn_dyn = tf.cond(cond, fn1, fn2))?
In other words, I want to implement this dynamically:
if i > 0:
    variable_scope.get_variable_scope().reuse_variables()
A simplified _time_step function for _dynamic_rnn_loop could then look something like this:
def _time_step(time, output_ta_t, *state):
    input_t = input_ta.read(time)
    # Restore some shape information
    input_t.set_shape([const_batch_size, const_depth])
    # Pack state back up for use by cell
    state = (_packed_state(structure=state_size, state=state)
             if state_is_tuple else state[0])

    def call_with_previous(feed_previous_t):
        if feed_previous_t:
            prev = output_ta_t.read(time - 1)
            if output_projection is not None:
                prev = nn_ops.xw_plus_b(prev, output_projection[0], output_projection[1])
            cell_input = math_ops.reduce_max(prev, 1)
            print(cell_input.get_shape())
            cell_input.set_shape([const_batch_size, const_depth])
        else:
            cell_input = input_t

        def call_cell_t(cell_input_t, state_t):
            # set reuse after the first call
            output_t, state_t = cell(cell_input_t, state_t)
            variable_scope.get_variable_scope().reuse_variables()
            return output_t, state_t

        return lambda: call_cell_t(cell_input, state)

    # >>> doesn't work
    call_cell = tf.cond(tf.equal(time, tf.constant(0, dtype=tf.int32)),
                        call_with_previous(False),
                        call_with_previous(True))

    if sequence_length is not None:
        (output, new_state) = _rnn_step(
            time=time,
            sequence_length=sequence_length,
            min_sequence_length=min_sequence_length,
            max_sequence_length=max_sequence_length,
            zero_output=zero_output,
            state=state,
            call_cell=call_cell,
            state_size=state_size,
            skip_conditionals=True)
    else:
        (output, new_state) = call_cell()

    # Pack state if using state tuples
    new_state = (tuple(_unpacked_state(new_state)) if state_is_tuple else (new_state,))
    output_ta_t = output_ta_t.write(time, output)
    return (time + 1, output_ta_t) + new_state
Thanks, cheers!

while_loop only calls the underlying body function once, to build the graph; it is not called dynamically for every time step. If you're getting an error when getting the variable, it's because you also access the variable elsewhere in your code.
In this case, it looks like it's because of your cond statement: it causes two calls to cell(). Try to factor the code so the cell call is outside the cond, as sketched below.
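For illustration, a minimal sketch of that factoring (compute_previous_input is a hypothetical helper standing in for the read/projection logic above; only the input selection goes through the cond, and the cell is called exactly once):
def use_input():
    return input_t

def use_previous():
    return compute_previous_input()  # hypothetical: read output_ta_t at time - 1 and project it

# Both branches must return tensors of the same shape and dtype.
cell_input = tf.cond(tf.equal(time, 0), use_input, use_previous)
output, new_state = cell(cell_input, state)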
Alternatively, as a hack, put the cell call inside a try/except block. If you get a variable-access error, just set reuse on the variable scope and call it again, for example:
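A minimal sketch of that hack (assuming the second call raises the usual ValueError that tf.get_variable produces when a variable already exists):
def call_cell():
    try:
        return cell(cell_input, state)
    except ValueError:
        # Variables were already created on an earlier call: reuse them.
        variable_scope.get_variable_scope().reuse_variables()
        return cell(cell_input, state)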
Source: I wrote dynamic_rnn.

Related

Printing value of tensorflow variable inside object

I am trying to print the value of a TensorFlow variable that is defined inside an object. To better illustrate my issue, I am currently trying to run the monodepth library. It has two main files, dataloader and main. Basically, dataloader iterates over a text file of filenames:
class MonodepthDataloader(object):
    """monodepth dataloader"""

    def __init__(self, data_path, filenames_file, params, dataset, mode):
        self.data_path = data_path
        self.params = params
        self.dataset = dataset
        self.mode = mode
        self.left_image_batch = None
        self.right_image_batch = None

        input_queue = tf.train.string_input_producer([filenames_file], shuffle=False)
        line_reader = tf.TextLineReader()
        _, line = line_reader.read(input_queue)
        split_line = tf.string_split([line]).values
        ...
        left_image_path = tf.string_join([self.data_path, split_line[0]])
        left_image_o = self.read_image(left_image_path)
I am trying to print out left_image_path to verify that it is being generated correctly. However, it lives in an object that is instantiated by monodepth_main. That is, monodepth_main calls the dataloader with the following lines:
dataloader = MonodepthDataloader(args.data_path, args.filenames_file, params, args.dataset, args.mode)
left = dataloader.left_image_batch
As a result I can't just use sess.run(x). I have also tried using tf.Print(line, [line]) but nothing shows up.
How do I print out the value of a tensorflow variable inside an object? Specifically left_image_path?
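One thing worth checking: tf.Print is an identity op with a printing side effect, so it prints only when the tensor it returns is evaluated. The returned tensor has to be spliced back into the graph, roughly like this (a sketch based on the code above):
# Sketch: route the graph through the tensor returned by tf.Print.
left_image_path = tf.string_join([self.data_path, split_line[0]])
left_image_path = tf.Print(left_image_path, [left_image_path],
                           message="left_image_path: ")
left_image_o = self.read_image(left_image_path)
Also note that with tf.train.string_input_producer, nothing flows through the pipeline (and nothing prints) until the queue runners are started with tf.train.start_queue_runners(sess).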

Tensorflow: LSTM with moving average of weights of another LSTM

I would like to have an LSTM in tensorflow whose weights are the exponential moving average of the weights of another LSTM. So basically I have this code with some input placeholder and some initial state placeholder:
def test_lstm(input_ph, init_ph, scope):
    cell = tf.nn.rnn_cell.LSTMCell(128, use_peepholes=True)
    input_ph = tf.transpose(input_ph, [1, 0, 2])
    input_list = tf.unpack(input_ph)
    with tf.variable_scope(scope) as vs:
        outputs, states = tf.nn.rnn(cell, input_list, initial_state=init_ph)
        theta = [v for v in tf.all_variables() if v.name.startswith(vs.name)]
    return outputs, theta

lstm_1, theta_1 = test_lstm(input_1, state_init_1, scope="lstm_1")
What I would like to do now is something along these lines (which doesn't actually work, because the exponential moving average puts the tag "ema" behind the variable names of the weights, and they do not appear in the variable scope because they were not created with tf.get_variable):
ema = tf.train.ExponentialMovingAverage(decay=1-self.tau, name="ema")
with tf.variable_scope("lstm_2"):
    maintain_averages_theta_1 = ema.apply(theta_1)
    theta_2_1 = [ema.average(x) for x in theta_1]
lstm_2, theta_2_2 = test_lstm(input_2, state_init_2, scope="lstm_2")
where eventually theta_2_1 would be equal to theta_2_2 (or throw an exception because the variables already exist).
Change (3/May/2018): I added one line to the old answer. The old answer is self-sufficient for this question, but in real practice we initialise the 'target network' to the same values as the 'behavioural network'. I added this point; search for the line marked 'Change0'. You need to do this only once, right after the target network's parameters are initialised. Without 'Change0', your first round of gradient-based updates would be nasty, since Q_target and Q_behaviour are not correlated while you need both to be somewhat related for the update, e.g. TD = r + gamma*Q_target - Q_behaviour.
Seems a bit late but hope this helps.
The critical problem with the TF RNN classes is that, unlike plain feedforward or convolutional NNs, we cannot directly designate the variables of an RNN, so we can't do the simple thing: get the EMA-ed variables and plug them into the network.
Let's get into the real deal (I attached practice code for this, so please refer to EMA_LSTM.py).
Say there is a network function containing an LSTM:
def network(in_x, in_h):
    # Input: 'in_x' is the input sequence, 'in_h' is the initial hidden cell (c, m)
    # Output: 'hidden_outputs' is the output sequence (c-sequence), 'net' is the list of parameters used in this network function
    cell = tf.nn.rnn_cell.BasicLSTMCell(3, state_is_tuple=True)
    in_h = tf.nn.rnn_cell.LSTMStateTuple(in_h[0], in_h[1])
    hidden_outputs, last_tuple = tf.nn.dynamic_rnn(cell, in_x, dtype=tf.float32, initial_state=in_h)
    net = [v for v in tf.trainable_variables() if tf.contrib.framework.get_name_scope() in v.name]
    return hidden_outputs, net
Then you declare tf.placeholder for the necessary inputs, which are:
in_x = tf.placeholder("float", [None,None,6])
in_c = tf.placeholder("float", [None,3])
in_m = tf.placeholder("float", [None,3])
in_h = (in_c, in_m)
Lastly, we run a session that executes the network() function with the specified inputs:
init_cORm = np.zeros(shape=(1,3))
input = np.ones(shape=(1,1,6))

print '========================new-1(beh)=============================='
with tf.Session() as sess:
    with tf.variable_scope('beh', reuse=False) as beh:
        result, net1 = network(in_x, in_h)
    sess.run(tf.global_variables_initializer())
    list = sess.run([result, in_x, in_h, net1], feed_dict={
        in_x: input,
        in_c: init_cORm,
        in_m: init_cORm
    })
    print 'result:', list[0]
    print 'in_x:', list[1]
    print 'in_h:', list[2]
    print 'net1:', list[3][0][0][:4]
Now we are going to make a var_list called 'net4' that contains the ExponentialMovingAverage(EMA)-ed values of 'net1'. The original 'net1' is first assigned in the 'beh' session above, and below it is newly assigned by adding 1. to each element:
ema = tf.train.ExponentialMovingAverage(decay=0.5)
target_update_op = ema.apply(net1)
init_new_vars_op = tf.initialize_variables(var_list=[v for v in tf.global_variables() if 'ExponentialMovingAverage' in v.name])  # 'initialize_variables' will be replaced with 'variables_initializer' in 2017
sess.run(init_new_vars_op)
sess.run([param4.assign(param1.eval()) for param4, param1 in zip(net4, net1)])  # Change0
len_net1 = len(net1)
net1_ema = [[] for i in range(len_net1)]
for i in range(len_net1):
    sess.run(net1[i].assign(1. + net1[i]))
sess.run(target_update_op)
Note that:
- We only initialised (by declaring 'init_new_vars_op' and then running it) the variables whose names contain 'ExponentialMovingAverage'; otherwise the variables in net1 would be freshly initialised too.
- 'net1' is newly assigned, with +1 added to every element of its variables. If an element of 'net1' was -0.5 and is now 0.5 after the +1, then we want 'net4' to be 0. when the EMA decay rate is 0.5 (see the quick check below).
- Finally, we run the EMA job with 'sess.run(target_update_op)'.
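As a quick sanity check of that arithmetic (a worked example of my own, using the decay=0.5 setting from above):
decay = 0.5
old, new = -0.5, 0.5                 # element value before and after the +1. assignment
ema_value = decay * old + (1 - decay) * new
print(ema_value)                     # 0.0, the value 'net4' should end up holding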
Eventually, we declare 'net4' with the 'network()' function and then assign and run the EMA(net1) values into 'net4'. When you run 'sess.run(result)', it will then be the one computed with the EMA(net1)-ed variables.
with tf.variable_scope('tare', reuse=False) as tare:
    result, net4 = network(in_x, in_h)

len_net4 = len(net4)
target_assign = [[] for i in range(len_net4)]
for i in range(len_net4):
    target_assign[i] = net4[i].assign(ema.average(net1[i]))
    sess.run(target_assign[i].op)

list = sess.run([result, in_x, in_h, net4], feed_dict={
    in_x: input,
    in_c: init_cORm,
    in_m: init_cORm
})
What happened here? You just indirectly declared the LSTM variables as 'net4' in the 'network()' function. Then, in the for loop, we point out that 'net4' is actually the EMA of 'net1' and 'net1 + 1.'. Lastly, with 'net4' told what to do (via 'network()') and what values it takes (via the for loop of '.assign(ema.average())' onto itself), you run the process.
It is somewhat counter-intuitive that we declare 'result' first and specify the values of the parameters second. However, that's the nature of TF, as it is always logical to set up variables, processes, and their relations first, then to assign values, then to run the processes.
Lastly, a few things to go further for real machine-learning code:
Here I just assigned 'net1' a second time with 'net1 + 1.'. In a real case, this '+1.' step is where you 'sess.run()' your optimiser (after you declare it somewhere). So, every time after you 'sess.run(optimiser)', 'sess.run(target_update_op)' and then 'sess.run(target_assign[i].op)' should follow to update your 'net4' along the EMA of 'net1'. Concretely, you can do this job in a different order, as below:
ema = tf.train.ExponentialMovingAverage(decay=0.5)
target_update_op = ema.apply(net1)

with tf.variable_scope('tare', reuse=False) as tare:
    result, net4 = network(in_x, in_h)

len_net4 = len(net4)
target_assign = [[] for i in range(len_net4)]
for i in range(len_net4):
    target_assign[i] = net4[i].assign(ema.average(net1[i]))

init_new_vars_op = tf.initialize_variables(var_list=[v for v in tf.global_variables() if 'ExponentialMovingAverage' in v.name])  # 'initialize_variables' will be replaced with 'variables_initializer' in 2017
sess.run(init_new_vars_op)
sess.run([param4.assign(param1.eval()) for param4, param1 in zip(net4, net1)])  # Change0
Lastly, be aware that you must sess.run(ema.apply(net)) right after the net changes, and only then sess.run(net_ema.assign(ema.average(net))). Without .apply, net_ema will not get assigned the averaged value:
len_net1 = len(net1)
net1_ema = [[] for i in range(len_net1)]
for i in range(len_net1):
    sess.run(net1[i].assign(1. + net1[i]))
sess.run(target_update_op)

for i in range(len_net4):
    sess.run(target_assign[i].op)

list = sess.run([result, in_x, in_h, net4], feed_dict={
    in_x: input,
    in_c: init_cORm,
    in_m: init_cORm
})
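Putting the per-step order together, a hypothetical training loop might look like this ('optimiser' and 'num_steps' stand in for whatever you actually declared):
for step in range(num_steps):
    sess.run(optimiser)            # 1. update net1 (the '+1.' assignment above stands in for this)
    sess.run(target_update_op)     # 2. refresh the EMA shadow values of net1
    for assign_op in target_assign:
        sess.run(assign_op)        # 3. copy the EMA values into net4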

Why can't I access the variable I create using the variable name plus scope path in TensorFlow?

I am trying to get a variable I created in a simple function, but I keep getting errors. I am doing:
x = tf.get_variable('quadratic/x')
but the python complains as follow:
python qm_tb_scopes.py
quadratic/x:0
Traceback (most recent call last):
File "qm_tb_scopes.py", line 24, in <module>
x = tf.get_variable('quadratic/x')
File "/Users/my_username/path/tensor_flow_experiments/venv/lib/python2.7/site-packages/tensorflow/python/ops/variable_scope.py", line 732, in get_variable
partitioner=partitioner, validate_shape=validate_shape)
File "/Users/my_username/path/tensor_flow_experiments/venv/lib/python2.7/site-packages/tensorflow/python/ops/variable_scope.py", line 596, in get_variable
partitioner=partitioner, validate_shape=validate_shape)
File "/Users/my_username/path/tensor_flow_experiments/venv/lib/python2.7/site-packages/tensorflow/python/ops/variable_scope.py", line 161, in get_variable
caching_device=caching_device, validate_shape=validate_shape)
File "/Users/my_username/path/tensor_flow_experiments/venv/lib/python2.7/site-packages/tensorflow/python/ops/variable_scope.py", line 457, in _get_single_variable
"but instead was %s." % (name, shape))
ValueError: Shape of a new variable (quadratic/x) must be fully defined, but instead was <unknown>.
It seems it's trying to create a new variable, but I am simply trying to get one that is already defined. Why is it doing this?
The whole code is:
import tensorflow as tf

def get_quaratic():
    # x variable
    with tf.variable_scope('quadratic'):
        x = tf.Variable(10.0, name='x')
        # b placeholder (simulates the "data" part of the training)
        b = tf.placeholder(tf.float32, name='b')
        # make model (1/2)(x-b)^2
        xx_b = 0.5*tf.pow(x-b, 2)
        y = xx_b
    return y, x

y, x = get_quaratic()
learning_rate = 1.0
# get optimizer
opt = tf.train.GradientDescentOptimizer(learning_rate)
# gradient variable list = [ (gradient, variable) ]
print x.name
x = tf.get_variable('quadratic/x')
x = tf.get_variable(x.name)
You need to pass the option reuse=True to tf.variable_scope() if you want to get the same variable twice. See the documentation (https://www.tensorflow.org/versions/r0.9/how_tos/variable_scope/index.html) for more details.
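A minimal sketch of that reuse pattern (note the variable must be created with tf.get_variable, not tf.Variable, and with a fully defined shape):
with tf.variable_scope('quadratic'):
    x = tf.get_variable('x', shape=(), initializer=tf.constant_initializer(10.0))

with tf.variable_scope('quadratic', reuse=True):
    x_again = tf.get_variable('x')  # returns the same variable object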
Alternatively, you could get the variable once, outside your Python function, and pass it in as an argument in Python. I find that a bit cleaner, since it makes it explicit which variables the code uses.
I hope that helps!
This is not the best solution, but try creating the variable through tf.get_variable() with reuse=False to ensure a new variable is created. Then, when obtaining the variable, use tf.get_variable() with reuse=True to get the current variable. Setting reuse to tf.AUTO_REUSE risks creating a new variable if the exact variable is not present. Also make sure to specify the shape of the variable in tf.get_variable().
import tensorflow as tf

def get_quaratic():
    # x variable
    with tf.variable_scope('quadratic', reuse=False):
        x = tf.get_variable('x', ())
        tf.assign(x, 10)
        # b placeholder (simulates the "data" part of the training)
        b = tf.placeholder(tf.float32, name='b')
        # make model (1/2)(x-b)^2
        xx_b = 0.5*tf.pow(x-b, 2)
        y = xx_b
    return y, x

y, x = get_quaratic()
learning_rate = 1.0
# get optimizer
opt = tf.train.GradientDescentOptimizer(learning_rate)
# gradient variable list = [ (gradient, variable) ]
print (x.name)

with tf.variable_scope('', reuse=True):
    x = tf.get_variable('quadratic/x', shape=())
print(tf.global_variables())  # there is only 1 variable
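For completeness, a small sketch of the tf.AUTO_REUSE behaviour mentioned above: it returns the existing variable when one is present, but silently creates a new one otherwise, which is exactly the risk.
with tf.variable_scope('quadratic', reuse=tf.AUTO_REUSE):
    x = tf.get_variable('x', shape=())  # created on the first call, reused on later calls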

PyOmo/Ipopt fails with "can't evaluate pow"

I am using PyOmo to generate a nonlinear model which will ultimately be solved with Ipopt. The model is as follows:
from pyomo.environ import *
from pyomo.dae import *

m = ConcreteModel()
m.t = ContinuousSet(bounds=(0,100))
m.T = Param(default=100, mutable=True)
m.a = Param(default=0.1)
m.kP = Param(default=20)
m.P = Var(m.t, bounds=(0,None))
m.S = Var(m.t, bounds=(0,None))
m.u = Var(m.t, bounds=(0,1), initialize=0.5)
m.Pdot = DerivativeVar(m.P)
m.Sdot = DerivativeVar(m.S)
m.obj = Objective(expr=m.S[100], sense=maximize)

def _Pdot(M, i):
    if i == 0:
        return Constraint.Skip
    return M.Pdot[i] == (1 - M.u[i])*(M.P[i]**0.75)

def _Sdot(M, i):
    if i == 0:
        return Constraint.Skip
    return M.Sdot[i] == M.u[i]*0.2*(M.P[i]**0.75)

def _init(M):
    yield M.P[0] == 2
    yield M.S[0] == 0
    yield ConstraintList.End

m.Pdotcon = Constraint(m.t, rule=_Pdot)
m.Sdotcon = Constraint(m.t, rule=_Sdot)
m.init_conditions = ConstraintList(rule=_init)

discretizer = TransformationFactory('dae.collocation')
discretizer.apply_to(m, nfe=100, ncp=3, scheme='LAGRANGE-RADAU')
discretizer.reduce_collocation_points(m, var=m.u, ncp=1, contset=m.t)

solver = SolverFactory('ipopt')
results = solver.solve(m, tee=False)
Running the model results in the following error:
Error evaluating constraint 1: can't evaluate pow'(0,0.75).
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/usr/local/lib/python3.5/dist-packages/pyomo/opt/base/solvers.py", line 577, in solve
    "Solver (%s) did not exit normally" % self.name)
pyutilib.common._exceptions.ApplicationError: Solver (asl) did not exit normally
The first part of the error comes from Ipopt, whereas the second part comes from PyOmo. Evidently the issue has something to do with the term M.P[i]**0.75 in the constraints, but changing the power does not resolve the issue (though 2.0 did work).
How can I resolve this?
The error message states that pow'(0,0.75) cannot be evaluated. The ' character in this function indicates the first derivative ('' would indicate the second derivative). The message is effectively saying that the first derivative does not exist or results in an infinity at zero.
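To see why (a quick check of my own, not from the original answer): the first derivative of x**0.75 is 0.75*x**(-0.25), which has a singularity at x = 0:
# d/dx x**0.75 = 0.75 * x**(-0.25); this blows up as x -> 0+.
def dpow(x, p=0.75):
    return p * x**(p - 1)

print(dpow(1e-8))   # 75.0 -- already large close to zero
# dpow(0.0) raises ZeroDivisionError; Ipopt hits the same singularity at the bound.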
Resolving the issue is easy: bound your variables to a non-zero value as follows:
m.P = Var(m.t, bounds=(1e-20,None))
m.S = Var(m.t, bounds=(1e-20,None))
I would add to Richard's answer: you might also need to update the initial value of your variables, as Ipopt assumes 0 if none is specified, so it will evaluate the variables at 0 on the first iteration.
Hence:
m.P = Var(m.t, bounds=(1e-20,None), initialize=1e-20)
m.S = Var(m.t, bounds=(1e-20,None), initialize=1e-20)
Instead of 1e-20, you might use an initial value more relevant to your problem.

How to get the value of a variable defined in tf.name_scope()?

with tf.name_scope('hidden4'):
    weights = tf.Variable(tf.convert_to_tensor(weights4))
    biases = tf.Variable(tf.convert_to_tensor(biases4))
    hidden4 = tf.sigmoid(tf.matmul(hidden3, weights) + biases)
I want to use tf.get_variable to get the variable hidden4/weights defined above, but it fails as below:
hidden4weights = tf.get_variable("hidden4/weights:0")
*** ValueError: Variable hidden4/weights:0 already exists, disallowed. Did you mean to set reuse=True in VarScope? Originally defined at:
  File "<stdin>", line 1, in <module>
  File "/usr/local/lib/python2.7/pdb.py", line 234, in default
    exec code in globals, locals
  File "/usr/local/lib/python2.7/cmd.py", line 220, in onecmd
    return self.default(line)
Then I tried hidden4/weights.eval(sess), but it also failed:
(Pdb) hidden4/weights.eval(sess)
*** NameError: name 'hidden4' is not defined
tf.name_scope() is used to organize names for visualization:
tf.name_scope(name): Wrapper for Graph.name_scope() using the default graph.
What I think you are looking for is tf.variable_scope(). The variable scope mechanism in TensorFlow consists of 2 main functions:
tf.get_variable(<name>, <shape>, <initializer>): Creates or returns a variable with a given name.
tf.variable_scope(<scope_name>): Manages namespaces for names passed to tf.get_variable().
with tf.variable_scope('hidden4'):
    # No variable in this scope with this name exists yet, so it creates the variable
    weights = tf.get_variable("weights", <shape>, tf.convert_to_tensor(weights4))  # Shape of a new variable (hidden4/weights) must be fully defined
    biases = tf.get_variable("biases", <shape>, tf.convert_to_tensor(biases4))  # Shape of a new variable (hidden4/biases) must be fully defined
    hidden4 = tf.sigmoid(tf.matmul(hidden3, weights) + biases)

with tf.variable_scope('hidden4', reuse=True):
    hidden4weights = tf.get_variable("weights")
    assert weights == hidden4weights
That should do it.
I have solved the problem above:
classifyerlayer_W = [v for v in tf.all_variables() if v.name == "softmax_linear/weights:0"][0]  # find the variable by name "softmax_linear/weights:0"
init = numpy.random.randn(2048, 4382)  # create an array to re-initialize the variable with
assign_op = classifyerlayer_W.assign(init)  # create an assign operation
sess.run(assign_op)  # run the op to finish the assign