Tensorflow - running total

How can I add the number 5 after every iteration of the loop?
I want to do something like this:
weight = 0.225
for i in range(10):
    weight += 5
    print(weight)
Here is how I am trying it in tensorflow, but it never updates the weight:
import tensorflow as tf

def dummy(x):
    weights['h0'] = tf.add(weights['h0'], 5)
    res = tf.add(weights['h0'], x)
    return res

# build computational graph
a = tf.placeholder('float', None)
d = dummy(a)

weights = {
    'h0': tf.Variable(tf.random_normal([1]))
}

# initialize variables
init = tf.global_variables_initializer()

# create session and run the graph
with tf.Session() as sess:
    sess.run(init)
    for i in range(10):
        print(sess.run(d, feed_dict={a: [2]}))

# close session
sess.close()

There's an operation explicitly created for adding a value and assigning the result back to the input node: tf.assign_add.
You should use it instead of tf.assign + tf.add.
More importantly, you should understand why your previous code doesn't work.
weights['h0'] = tf.add(weights['h0'], 5)
res = tf.add(weights['h0'], x)
In the first line, you're defining an add node whose inputs are weights['h0'] and 5, and you're assigning this node to the python variable weights['h0'].
Thus, weights['h0'] is now a python variable holding a tensorflow node.
In the next line, you're defining another add node, between the previous node and x, and you return this node.
When the graph is evaluated, you evaluate the node pointed to by res, which forces the evaluation of the previous node (because res is a function of the node held by weights['h0']).
The problem is that the assignment in the first line is a python assignment, not a tensorflow assignment.
That assignment is executed only in the python environment; it does not define an assign node in the tensorflow graph.
P.S.: when you use with, you're defining a context manager that handles the closing operations for you. You can thus remove sess.close(), because it is executed automatically when you exit that context.
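For reference, here is a minimal sketch of your example rewritten with tf.assign_add (keeping the random initial value and the feed value from the question; the delta is passed as [5.] so it matches the variable's shape):

import tensorflow as tf

weights = {'h0': tf.Variable(tf.random_normal([1]))}
a = tf.placeholder('float', None)

# tf.assign_add adds 5 to the variable and writes the result back into it;
# using its output as an input to the add below forces the update on every run
increment_op = tf.assign_add(weights['h0'], [5.0])
res = tf.add(increment_op, a)

with tf.Session() as sess:
    sess.run(tf.global_variables_initializer())
    for i in range(10):
        print(sess.run(res, feed_dict={a: [2]}))  # grows by 5 on each iteration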

Apparently there is an assign operator
https://www.tensorflow.org/api_docs/python/tf/assign
weights['h0'] = tf.assign(weights['h0'], tf.add(weights['h0'], 5))


How to apply a function to the value of a tensor and then assigning the output to the same tensor

I want to project the updated weights of my network (after performing an optimization step) onto a special space, and for that I need the value of the weight tensor to be passed. The function which applies the projection takes a numpy array as input. Is there a way I can do this?
I used tf.assign() as a solution but since my function accepts arrays and not tensors it failed.
Here is a sketch of what I want to do:
W = tf.Variable(...)
...
opt = tf.train.AdamOptimizer(learning_rate).minimize(loss, var_list=['W'])
W = my_function(W)
It seems that tf.control_dependencies is what you need.
One simple example:
import tensorflow as tf

var = tf.get_variable('var', initializer=0.0)

# replace `tf.add` with your custom function
addop = tf.add(var, 1)
with tf.control_dependencies([addop]):
    updateop = tf.assign(var, addop)

config = tf.ConfigProto()
config.gpu_options.allow_growth = True  # pylint: disable=no-member
with tf.Session(config=config) as sess:
    sess.run(tf.global_variables_initializer())
    updateop.eval()
    print(var.eval())
    updateop.eval()
    print(var.eval())
    updateop.eval()
    print(var.eval())
output:
1.0
2.0
3.0
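Since the question mentions that the projection function only accepts numpy arrays, another option (not part of the answer above; my_projection below is just a hypothetical stand-in for the real projection) is to wrap the numpy function with tf.py_func and assign its output back to the variable, roughly like this:

import numpy as np
import tensorflow as tf

def my_projection(w):
    # stand-in for the real projection; receives and returns a numpy array
    return np.clip(w, -1.0, 1.0).astype(np.float32)

W = tf.get_variable('W', shape=[4], initializer=tf.random_normal_initializer())
projected = tf.py_func(my_projection, [W], tf.float32)
projected.set_shape(W.get_shape())
project_op = tf.assign(W, projected)

with tf.Session() as sess:
    sess.run(tf.global_variables_initializer())
    # run project_op after each optimizer step to push W back into the feasible set
    print(sess.run(project_op))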

tensorflow error - you must feed a value for placeholder tensor 'in'

I'm trying to implement queues for my tensorflow prediction but get the following error -
you must feed a value for placeholder tensor 'in' with dtype float and shape [1024,1024,3]
The program works fine if I use feed_dict; I am trying to replace the feed_dict with queues.
The program basically takes a list of positions and passes the image np array to the input tensor.
for each in positions:
    y, x = each
    images = img[y:y+1024, x:x+1024, :]
    a = images.astype('float32')

q = tf.FIFOQueue(capacity=200, dtypes=dtypes)
enqueue_op = q.enqueue(a)
qr = tf.train.QueueRunner(q, [enqueue_op] * 1)
tf.train.add_queue_runner(qr)
data = q.dequeue()
graph = load_graph('/home/graph/frozen_graph.pb')

with tf.Session(graph=graph, config=tf.ConfigProto(log_device_placement=True)) as sess:
    p_boxes = graph.get_tensor_by_name("cat:0")
    p_confs = graph.get_tensor_by_name("sha:0")
    y = [p_confs, p_boxes]
    x = graph.get_tensor_by_name("in:0")

    coord = tf.train.Coordinator()
    threads = tf.train.start_queue_runners(coord=coord, sess=sess)
    confs, boxes = sess.run(y)
    coord.request_stop()
    coord.join(threads)
How can I make sure the input data that I populated into the queue is recognized while running the graph in the session?
In my original run I call:
confs, boxes = sess.run([p_confs, p_boxes], feed_dict=feed_dict_testing)
I'd suggest not using queues for this problem, and switching to the new tf.data API. In particular tf.data.Dataset.from_generator() makes it easier to feed in data from a Python function. You can rewrite your code to be much simpler, as follows:
def generator():
    for y, x in positions:
        images = img[y:y+1024, x:x+1024, :]
        yield images.astype('float32')

dataset = tf.data.Dataset.from_generator(
    generator, tf.float32, [1024, 1024, img.shape[2]])

# Add any extra transformations in here, like `dataset.batch()` or
# `dataset.repeat()`.
# ...

iterator = dataset.make_one_shot_iterator()
data = iterator.get_next()
Note that in your program, there's no connection between the data tensor and the graph you loaded in load_graph() (at least, assuming that load_graph() doesn't grab data from the global state!). You will probably need to use tf.import_graph_def() and the input_map argument to associate data with one of the tensors in your frozen graph (possibly "in:0"?) to complete the task.
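To make that connection concrete, here is a hedged sketch (assuming the frozen graph really exposes an input placeholder named "in:0", as the error message suggests, and the output tensors "cat:0" and "sha:0" from the question) that loads the GraphDef directly and maps the dataset tensor onto the input, reusing the generator defined above:

import tensorflow as tf

graph_def = tf.GraphDef()
with tf.gfile.GFile('/home/graph/frozen_graph.pb', 'rb') as f:
    graph_def.ParseFromString(f.read())

graph = tf.Graph()
with graph.as_default():
    dataset = tf.data.Dataset.from_generator(
        generator, tf.float32, [1024, 1024, 3])
    data = dataset.make_one_shot_iterator().get_next()
    # Splice the dataset tensor into the frozen graph in place of the 'in' placeholder.
    tf.import_graph_def(graph_def, input_map={'in:0': data}, name='')

with tf.Session(graph=graph) as sess:
    p_boxes = graph.get_tensor_by_name('cat:0')
    p_confs = graph.get_tensor_by_name('sha:0')
    confs, boxes = sess.run([p_confs, p_boxes])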

How to understand sess.as_default() and sess.graph.as_default()?

I read the docs of sess.as_default()
N.B. The default session is a property of the current thread. If you create a new thread, and wish to use the default session in that thread, you must explicitly add a with sess.as_default(): in that thread's function.
My understanding is that if there are two or more sessions when a new thread is created, we must set a session to run TensorFlow code in it. So, to do this, a session is chosen and its as_default() is called.
N.B. Entering a with sess.as_default(): block does not affect the current default graph. If you are using multiple graphs, and sess.graph is different from the value of tf.get_default_graph, you must explicitly enter a with sess.graph.as_default(): block to make sess.graph the default graph.
Within a sess.as_default() block, must one also call sess.graph.as_default() in order to run a specific graph?
The tf.Session API mentions that a graph is launched in a session. The following code illustrates this:
import tensorflow as tf

graph1 = tf.Graph()
graph2 = tf.Graph()

with graph1.as_default() as graph:
    a = tf.constant(0, name='a')
    graph1_init_op = tf.global_variables_initializer()

with graph2.as_default() as graph:
    a = tf.constant(1, name='a')
    graph2_init_op = tf.global_variables_initializer()

sess1 = tf.Session(graph=graph1)
sess2 = tf.Session(graph=graph2)
sess1.run(graph1_init_op)
sess2.run(graph2_init_op)

# Both tensor names are a!
print(sess1.run(graph1.get_tensor_by_name('a:0')))  # prints 0
print(sess2.run(graph2.get_tensor_by_name('a:0')))  # prints 1

with sess1.as_default() as sess:
    print(sess.run(sess.graph.get_tensor_by_name('a:0')))  # prints 0

with sess2.as_default() as sess:
    print(sess.run(sess.graph.get_tensor_by_name('a:0')))  # prints 1

with graph2.as_default() as g:
    with sess1.as_default() as sess:
        print(tf.get_default_graph() == graph2)   # prints True
        print(tf.get_default_session() == sess1)  # prints True

        # This is the interesting line
        print(sess.run(sess.graph.get_tensor_by_name('a:0')))  # prints 0
        print(sess.run(g.get_tensor_by_name('a:0')))           # fails

print(tf.get_default_graph() == graph2)   # prints False
print(tf.get_default_session() == sess1)  # prints False
You don't need to call sess.graph.as_default() to run the graph, but you need to get the correct tensors or operations in the graph to run it. The context allows you to get the graph or session using tf.get_default_graph or tf.get_default_session.
In the interesting line above, the default session is sess1 and it implicitly uses sess1.graph, which is the graph in sess1, namely graph1, and hence it prints 0.
In the line following that, it fails because it is trying to run an operation in graph2 with sess1.
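As for the first quoted note about threads, here is a minimal sketch of the per-thread behaviour of the default session (the constant and the worker function below are only for illustration):

import threading
import tensorflow as tf

c = tf.constant(42)
sess = tf.Session()

def worker():
    # The default session is thread-local, so without this block
    # tf.get_default_session() would return None inside this thread.
    with sess.as_default():
        print(tf.get_default_session() is sess)  # True
        print(c.eval())  # 42, evaluated with the default session

t = threading.Thread(target=worker)
t.start()
t.join()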

What is the effect of calling TensorArray.close()?

(tensorflow version: '0.12.head')
The documentation of TensorArray.close says that it closes the current TensorArray. What does this mean for the status of the TensorArray? I tried the following code:
import tensorflow as tf
sess = tf.InteractiveSession()
a1 = tf.TensorArray(tf.int32, 2)
a1.close().run()
a2 = a1.write(0, 0)
a2.close().run()
print(a2.read(0).eval())
And there are no errors. What is the usage of close?
Learning-to-learn includes TensorArray.close in the reset operations of the network. I can't figure out what the comment Empty array as part of the reset process means.
Update
For example,
import tensorflow as tf

sess = tf.InteractiveSession()
N = 3

def cond(i, arr):
    return i < N

def body(i, arr):
    arr = arr.write(i, i)
    i += 1
    return i, arr

arr = tf.TensorArray(tf.int32, N)
_, result_arr = tf.while_loop(cond, body, [0, arr])

reset = arr.close()  # corresponds to https://github.com/deepmind/learning-to-learn/blob/6ee52539e83d0452051fe08699b5d8436442f803/meta.py#L370

NUM_EPOCHS = 3
for _ in range(NUM_EPOCHS):
    reset.run()  # corresponds to https://github.com/deepmind/learning-to-learn/blob/6ee52539e83d0452051fe08699b5d8436442f803/util.py#L32
    print(result_arr.stack().eval())
Why doesn't arr.close() make the while loop fail? What are the advantages of calling arr.close() at the beginning of each epoch?
This is a Python op which wraps a native op and both have help strings, but the native op help string is more informative. If you look at inspect.getsourcefile(fx_array.close) it'll point you to tensorflow/python/ops/tensor_array_ops.py. Inside the implementation you see that it defers to _tensor_array_close_v2. So you can do this
> from tensorflow.python.ops import gen_data_flow_ops
> help(gen_data_flow_ops._tensor_array_close_v2)
Delete the TensorArray from its resource container. This enables
the user to close and release the resource in the middle of a step/run.
That same doc string is also in tensorflow/core/ops/ops.pbtxt under TensorArrayCloseV2
Looking at tensorflow/core/kernels/tensor_array_ops.cc you see that TensorArrayCloseOp is the implementation registered for TensorArrayCloseV2, and it has more info:
// Delete the TensorArray from its resource container. This enables
// the user to close and release the resource in the middle of a step/run.
// TODO(ebrevdo): decide whether closing the grad op should happen
// here or on the python side.
class TensorArrayCloseOp : public OpKernel {
 public:
  explicit TensorArrayCloseOp(OpKernelConstruction* context)
      : OpKernel(context) {}

  void Compute(OpKernelContext* ctx) override {
    TensorArray* tensor_array;
    OP_REQUIRES_OK(ctx, GetTensorArray(ctx, &tensor_array));
    core::ScopedUnref unref(tensor_array);
    // Instead of deleting this TA from the ResourceManager, we just
    // clear it away and mark it as closed. The remaining memory
    // consumed store its mutex and handle Tensor. This will be
    // cleared out at the end of the step anyway, so it's fine to keep
    // it around until the end of the step. Further calls to the
    // TensorArray will fail because TensorArray checks internally to
    // see if it is closed or not.
The description seems inconsistent with the behavior you are seeing, could be a bug.
The TensorArray that is being closed in the Learning-to-learn example is not the original TensorArray that's being passed to the while-loop.
# original array (fx_array) declared here
fx_array = tf.TensorArray(tf.float32, size=len_unroll + 1,
                          clear_after_read=False)

# new array (fx_array) returned here
_, fx_array, x_final, s_final = tf.while_loop(
    cond=lambda t, *_: t < len_unroll,
    body=time_step,
    loop_vars=(0, fx_array, x, state),
    parallel_iterations=1,
    swap_memory=True,
    name="unroll")
Any subsequent calls to fx_array.close() from here close the new array returned by the while-loop and not the original array passed to the loop in the first iteration.
If you want to see how close behaves as expected then run:
session.run([reset, loss])
This will fail with "TensorArray has already been closed", since the loss op tries to run pack() on the closed array.
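If you want to reproduce that failure in the toy example from the question, here is a small sketch; the control dependency is only there to force close() to run before stack() within the same step (without it the execution order inside a single run is not guaranteed):

import tensorflow as tf

sess = tf.InteractiveSession()
arr = tf.TensorArray(tf.int32, 2)
arr = arr.write(0, 0).write(1, 1)

close_op = arr.close()
with tf.control_dependencies([close_op]):
    stack_after_close = arr.stack()

try:
    sess.run(stack_after_close)
except tf.errors.OpError as e:
    # the stack op fails because the TensorArray was closed earlier in this step
    print(e.message)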

Tensorflow: LSTM with moving average of weights of another LSTM

I would like to have an LSTM in tensorflow whose weights are the exponential moving average of the weights of another LSTM. So basically I have this code with some input placeholder and some initial state placeholder:
def test_lstm(input_ph, init_ph, scope):
    cell = tf.nn.rnn_cell.LSTMCell(128, use_peepholes=True)
    input_ph = tf.transpose(input_ph, [1, 0, 2])
    input_list = tf.unpack(input_ph)
    with tf.variable_scope(scope) as vs:
        outputs, states = tf.nn.rnn(cell, input_list, initial_state=init_ph)
        theta = [v for v in tf.all_variables() if v.name.startswith(vs.name)]
    return outputs, theta

lstm_1, theta_1 = test_lstm(input_1, state_init_1, scope="lstm_1")
What I would like to do now is something along these lines (which doesn't actually work, because the exponential moving average appends the tag "ema" to the variable names of the weights, and those variables do not appear in the variable scope because they were not created with tf.get_variable):
ema = tf.train.ExponentialMovingAverage(decay=1-self.tau, name="ema")
with tf.variable_scope("lstm_2"):
    maintain_averages_theta_1 = ema.apply(theta_1)
    theta_2_1 = [ema.average(x) for x in theta_1]
    lstm_2, theta_2_2 = test_lstm(input_2, state_init_2, scope="lstm_2")
where eventually theta_2_1 would be equal to theta_2_2 (or throw an exception because the variables already exist).
Change (3/May/2018): I added one line to the old answer. The old answer by itself is sufficient for this question, but in real practice we initialize the 'target network' to the same values as the 'behavioural network'. I added this point; search for the line tagged 'Change0'. You need to do this only once, right after the target network's parameters have been initialized. Without 'Change0', your first round of gradient-based updates would be nasty, since Q_target and Q_behaviour are uncorrelated, while you need them to be somewhat related, e.g. TD = r + gamma*Q_target - Q_behaviour for the update.
Seems a bit late but hope this helps.
The critical problem with the TF-RNN series is that we cannot directly designate the variables for RNNs, unlike plain feedforward or convolutional NNs, so we can't do the simple thing of getting the EMA-ed vars and plugging them into the network.
Let's get into the real deal (I attached practice code for this, so please refer to EMA_LSTM.py).
Say there is a network function containing an LSTM:
def network(in_x, in_h):
    # Input: 'in_x' is the input sequence, 'in_h' is the initial hidden cell (c, m)
    # Output: 'hidden_outputs' is the output sequence (c-sequence), 'net' is the list of parameters used in this network function
    cell = tf.nn.rnn_cell.BasicLSTMCell(3, state_is_tuple=True)
    in_h = tf.nn.rnn_cell.LSTMStateTuple(in_h[0], in_h[1])
    hidden_outputs, last_tuple = tf.nn.dynamic_rnn(cell, in_x, dtype=tf.float32, initial_state=in_h)
    net = [v for v in tf.trainable_variables() if tf.contrib.framework.get_name_scope() in v.name]
    return hidden_outputs, net
Then you declare tf.placeholder for the necessary inputs, which are:
in_x = tf.placeholder("float", [None,None,6])
in_c = tf.placeholder("float", [None,3])
in_m = tf.placeholder("float", [None,3])
in_h = (in_c, in_m)
Lastly, we run a session that executes the network() function with the specified inputs:
init_cORm = np.zeros(shape=(1,3))
input = np.ones(shape=(1,1,6))

print '========================new-1(beh)=============================='
with tf.Session() as sess:
    with tf.variable_scope('beh', reuse=False) as beh:
        result, net1 = network(in_x, in_h)
        sess.run(tf.global_variables_initializer())
        list = sess.run([result, in_x, in_h, net1], feed_dict={
            in_x : input,
            in_c : init_cORm,
            in_m : init_cORm
        })
        print 'result:', list[0]
        print 'in_x:' , list[1]
        print 'in_h:', list[2]
        print 'net1:', list[3][0][0][:4]
Now, we are going to make a var_list called 'net4' that contains the ExponentialMovingAverage (EMA) values of 'net1'. The original 'net1' is first assigned in the 'beh' session above and then newly assigned below by adding 1. to each of its elements:
ema = tf.train.ExponentialMovingAverage(decay=0.5)
target_update_op = ema.apply(net1)

init_new_vars_op = tf.initialize_variables(var_list=[v for v in tf.global_variables() if 'ExponentialMovingAverage' in v.name])  # 'initialize_variables' will be replaced with 'variables_initializer' in 2017
sess.run(init_new_vars_op)
sess.run([param4.assign(param1.eval()) for param4, param1 in zip(net4, net1)])  # Change0

len_net1 = len(net1)
net1_ema = [[] for i in range(len_net1)]
for i in range(len_net1):
    sess.run(net1[i].assign(1. + net1[i]))
sess.run(target_update_op)
Note that
we only initialise (by declaring 'init_new_vars_op' and then running it) the variables whose names contain 'ExponentialMovingAverage'; otherwise the variables in net1 would also be re-initialised.
'net1' is then newly assigned, with +1 added to every element of its variables. If an element of 'net1' was -0.5 and is now 0.5 after the +1, then we want 'net4' to be 0. when the EMA decay rate is 0.5.
Finally, we run the EMA update with 'sess.run(target_update_op)'.
Eventually, we declare 'net4' first with the 'network()' function and then assign and run the EMA(net1) values into 'net4'. When you run 'sess.run(result)', it will be computed with the EMA(net1)-ed variables.
with tf.variable_scope('tare', reuse=False) as tare:
    result, net4 = network(in_x, in_h)

len_net4 = len(net4)
target_assign = [[] for i in range(len_net4)]
for i in range(len_net4):
    target_assign[i] = net4[i].assign(ema.average(net1[i]))
    sess.run(target_assign[i].op)

list = sess.run([result, in_x, in_h, net4], feed_dict={
    in_x : input,
    in_c : init_cORm,
    in_m : init_cORm
})
What happened here? You just indirectly declared the LSTM variables as 'net4' in the 'network()' function. Then, in the for loop, we point out that 'net4' is actually the EMA of 'net1' and 'net1 + 1.'. Lastly, with 'net4' told what to do (via 'network()') and what values to take (via the for loop of '.assign(ema.average())' onto itself), you run the process.
It is somewhat counter-intuitive that we declare 'result' first and specify the values of the parameters second. However, that's the nature of TF: it is always logical to set up variables, processes, and their relations first, then to assign values, and then run the processes.
Lastly, a few things to go further for real machine-learning code:
Here, I just re-assigned 'net1' with 'net1 + 1.'. In a real case, this '+1.' step is where you 'sess.run()' your optimiser (after you declare it somewhere). So every time after you 'sess.run(optimiser)', 'sess.run(target_update_op)' and then 'sess.run(target_assign[i].op)' should follow to update your 'net4' along the EMA of 'net1'. Concretely, you can do this job in a different order, like below:
ema = tf.train.ExponentialMovingAverage(decay=0.5)
target_update_op = ema.apply(net1)

with tf.variable_scope('tare', reuse=False) as tare:
    result, net4 = network(in_x, in_h)

len_net4 = len(net4)
target_assign = [[] for i in range(len_net4)]
for i in range(len_net4):
    target_assign[i] = net4[i].assign(ema.average(net1[i]))

init_new_vars_op = tf.initialize_variables(var_list=[v for v in tf.global_variables() if 'ExponentialMovingAverage' in v.name])  # 'initialize_variables' will be replaced with 'variables_initializer' in 2017
sess.run(init_new_vars_op)
sess.run([param4.assign(param1.eval()) for param4, param1 in zip(net4, net1)])  # Change0
Lastly, be aware that you must sess.run(ema.apply(net)) right after the net changes, and then sess.run(net_ema.assign(ema.average(net))). Without .apply, net_ema will not get assigned the averaged value:
len_net1 = len(net1)
net1_ema = [[] for i in range(len_net1)]
for i in range(len_net1):
    sess.run(net1[i].assign(1. + net1[i]))
sess.run(target_update_op)

for i in range(len_net4):
    sess.run(target_assign[i].op)

list = sess.run([result, in_x, in_h, net4], feed_dict={
    in_x : input,
    in_c : init_cORm,
    in_m : init_cORm
})