Using TensorArrays in the context of a while_loop to accumulate values - tensorflow

Below I have an implementation of a Tensorflow RNN Cell, designed to emulate Alex Graves' algorithm ACT in this paper: http://arxiv.org/abs/1603.08983.
At a single timestep in the sequence called via rnn.rnn(with a static sequence_length parameter, so the rnn is unrolled dynamically - I am using a fixed batch size of 20), we recursively call ACTStep, producing outputs of size(1,200) where the hidden dimension of the RNN cell is 200 and we have a batch size of 1.
Using the while loop in Tensorflow, we iterate until the accumulated halting probability is high enough. All of this works reasonably smoothly, but I am having problems accumulating states, probabilities and outputs within the while loop, which we need to do in order to create weighted combinations of these as the final cell output/state.
I have tried using a simple list, as below, but this fails when the graph is compiled as the outputs are not in the same frame(is it possible to use the "switch" function in control_flow_ops to forward the tensors to the point at which they are required, ie the add_n function just before we return the values?). I have also tried using the TensorArray structure, but I am finding this difficult to use as it seems to destroy shape information and replacing it manually hasn't worked. I also haven't been able to find much documentation on TensorArrays, presumably as they are, I imagine, mainly for internal TF use.
Any advice on how it might be possible to correctly accumulate the variables produced by ACTStep would be much appreciated.
class ACTCell(RNNCell):
"""An RNN cell implementing Graves' Adaptive Computation time algorithm"""
def __init__(self, num_units, cell, epsilon, max_computation):
self.one_minus_eps = tf.constant(1.0 - epsilon)
self._num_units = num_units
self.cell = cell
self.N = tf.constant(max_computation)
#property
def input_size(self):
return self._num_units
#property
def output_size(self):
return self._num_units
#property
def state_size(self):
return self._num_units
def __call__(self, inputs, state, scope=None):
with vs.variable_scope(scope or type(self).__name__):
# define within cell constants/ counters used to control while loop
prob = tf.get_variable("prob", [], tf.float32,tf.constant_initializer(0.0))
counter = tf.get_variable("counter", [],tf.float32,tf.constant_initializer(0.0))
tf.assign(prob,0.0)
tf.assign(counter, 0.0)
# the predicate for stopping the while loop. Tensorflow demands that we have
# all of the variables used in the while loop in the predicate.
pred = lambda prob,counter,state,input,\
acc_state,acc_output,acc_probs:\
tf.logical_and(tf.less(prob,self.one_minus_eps), tf.less(counter,self.N))
acc_probs = []
acc_outputs = []
acc_states = []
_,iterations,_,_,acc_states,acc_output,acc_probs = \
control_flow_ops.while_loop(pred,
self.ACTStep,
[prob,counter,state,input,acc_states,acc_outputs,acc_probs])
# TODO:fix last part of this, need to use the remainder.
# TODO: find a way to accumulate the regulariser
# here we take a weighted combination of the states and outputs
# to use as the actual output and state which is passed to the next timestep.
next_state = tf.add_n([tf.mul(x,y) for x,y in zip(acc_probs,acc_states)])
output = tf.add_n([tf.mul(x,y) for x,y in zip(acc_probs,acc_outputs)])
return output, next_state
def ACTStep(self,prob,counter,state,input, acc_states,acc_outputs,acc_probs):
output, new_state = rnn.rnn(self.cell, [input], state, scope=type(self.cell).__name__)
prob_w = tf.get_variable("prob_w", [self.cell.input_size,1])
prob_b = tf.get_variable("prob_b", [1])
p = tf.nn.sigmoid(tf.matmul(prob_w,new_state) + prob_b)
acc_states.append(new_state)
acc_outputs.append(output)
acc_probs.append(p)
return [tf.add(prob,p),tf.add(counter,1.0),new_state, input,acc_states,acc_outputs,acc_probs]

I'm going to preface this response that this is NOT a complete solution, but rather some commentary on how to improve your cell.
To start off, in your ACTStep function, you call rnn.rnn for one timestep (as defined by [input]. If you're doing a single timestep, it is probably more efficient to simple use the actual self.cell call function. You'll see this same mechanism used in tensorflow rnncell wrappers
You mentioned that you have tried using TensorArrays. Did you pack and unpack the tensorarrays appropriately? Here is a repo where you'll find under model.py the tensorarrays are packed and unpacked properly.
You also asked if there is a function in control_flow_ops that will require all the tensors to be accumulated. I think you are looking for tf.control_dependencies
You can list all of your output tensors operations in control_dependicies and that will require tensorflow to compute all tensors up into that point.
Also, it looks like your counter variable is trainable. Are you sure you want this to be the case? If you're adding plus one to your counter, that probably wouldn't yield the correct result. On the other hand, you could have purposely kept it trainable to differentiate it at the end for the ponder cost function.
Also I believe the Remainder function should be in your script:
remainder = 1.0 - tf.add_n(acc_probs[:-1])
#note that there is a -1 in the list as you do not want to grab the last probability
Here is my version of your code edited:
class ACTCell(RNNCell):
"""An RNN cell implementing Graves' Adaptive Computation time algorithm
Notes: https://www.evernote.com/shard/s189/sh/fd165646-b630-48b7-844c-86ad2f07fcda/c9ab960af967ef847097f21d94b0bff7
"""
def __init__(self, num_units, cell, max_computation = 5.0, epsilon = 0.01):
self.one_minus_eps = tf.constant(1.0 - epsilon) #episolon is 0.01 as found in the paper
self._num_units = num_units
self.cell = cell
self.N = tf.constant(max_computation)
#property
def input_size(self):
return self._num_units
#property
def output_size(self):
return self._num_units
#property
def state_size(self):
return self._num_units
def __call__(self, inputs, state, scope=None):
with vs.variable_scope(scope or type(self).__name__):
# define within cell constants/ counters used to control while loop
prob = tf.constant(0.0, shape = [batch_size])
counter = tf.constant(0.0, shape = [batch_size])
# the predicate for stopping the while loop. Tensorflow demands that we have
# all of the variables used in the while loop in the predicate.
pred = lambda prob,counter,state,input,acc_states,acc_output,acc_probs:\
tf.logical_and(tf.less(prob,self.one_minus_eps), tf.less(counter,self.N))
acc_probs, acc_outputs, acc_states = [], [], []
_,iterations,_,_,acc_states,acc_output,acc_probs = \
control_flow_ops.while_loop(
pred,
self.ACTStep, #looks like he purposely makes the while loop here
[prob, counter, state, input, acc_states, acc_outputs, acc_probs])
'''mean-field updates for states and outputs'''
next_state = tf.add_n([x*y for x,y in zip(acc_probs,acc_states)])
output = tf.add_n([x*y for x,y in zip(acc_probs,acc_outputs)])
remainder = 1.0 - tf.add_n(acc_probs[:-1]) #you take the last off to avoid a negative ponder cost #the problem here is we need to take the sum of all the remainders
tf.add_to_collection("ACT_remainder", remainder) #if this doesnt work then you can do self.list based upon timesteps
tf.add_to_collection("ACT_iterations", iterations)
return output, next_state
def ACTStep(self,prob, counter, state, input, acc_states, acc_outputs, acc_probs):
'''run rnn once'''
output, new_state = rnn.rnn(self.cell, [input], state, scope=type(self.cell).__name__)
prob_w = tf.get_variable("prob_w", [self.cell.input_size,1])
prob_b = tf.get_variable("prob_b", [1])
halting_probability = tf.nn.sigmoid(tf.matmul(prob_w,new_state) + prob_b)
acc_states.append(new_state)
acc_outputs.append(output)
acc_probs.append(halting_probability)
return [p + prob, counter + 1.0, new_state, input,acc_states,acc_outputs,acc_probs]
def PonderCostFunction(self, time_penalty = 0.01):
'''
note: ponder is completely different than probability and ponder = roe
the ponder cost function prohibits the rnn from cycling endlessly on each timestep when not much is needed
'''
n_iterations = tf.get_collection_ref("ACT_iterations")
remainder = tf.get_collection_ref("ACT_remainder")
return tf.reduce_sum(n_iterations + remainder) #completely different from probability
This is a complicated paper to implement that I have been working on myself. I wouldn't mind collaborating with you to get it done in Tensorflow. If you're interested, please add me at LeavesBreathe on Skype and we can go from there.

Related

How to avoid memory leakage in an autoregressive model within tensorflow

Recently, I am training a LSTM with attention mechanism for regressionin tensorflow 2.9 and I met an problem during training with model.fit():
At the beginning, the training time is okay, like 7s/step. However, it was increasing during the process and after several steps, like 1000, the value might be 50s/step. Here below is a part of the code for my model:
class AttentionModel(tf.keras.Model):
def __init__(self, encoder_output_dim, dec_units, dense_dim, batch):
super().__init__()
self.dense_dim = dense_dim
self.batch = batch
encoder = Encoder(encoder_output_dim)
decoder = Decoder(dec_units,dense_dim)
self.encoder = encoder
self.decoder = decoder
def call(self, inputs):
# Creat a tensor to record the result
tempt = list()
encoder_output, encoder_state = self.encoder(inputs)
new_features = np.zeros((self.batch, 1, 1))
dec_initial_state = encoder_state
for i in range(6):
dec_inputs = DecoderInput(new_features=new_features, enc_output=encoder_output)
dec_result, dec_state = self.decoder(dec_inputs, dec_initial_state)
tempt.append(dec_result.logits)
new_features = dec_result.logits
dec_initial_state = dec_state
result=tf.concat(tempt,1)
return result
In the official documents for tf.function, I notice: "Don't rely on Python side effects like object mutation or list appends".
Since I use a dynamic python list with append() to record the intermediate variables, I guess each time during training, a new tf.graph was added. Is the reason my training is getting slower and slower?
Additionally, what should I use instead of python list to avoid this? I have tried with a numpy.zeros matrix but it will lead to another problem:
tempt = np.zeros(shape=(1,6))
...
for i in range(6):
dec_inputs = DecoderInput(new_features=new_features, enc_output=encoder_output)
dec_result, dec_state = self.decoder(dec_inputs, dec_initial_state)
tempt[i]=(dec_result.logits)
...
Cannot convert a symbolic tf.Tensor (decoder/dense_3/BiasAdd:0) to a numpy array. This error may indicate that you're trying to pass a Tensor to a NumPy call, which is not supported.

Tensorflow: OOM when batch size too large

My script is failing due to too high memory usage. When I reduce the batch size it works.
#tf.function(autograph=not DEBUG)
def step(prev_state, input_b):
input_b = tf.reshape(input_b, shape=[1,input_b.shape[0]])
state = FastALIFStateTuple(v=prev_state[0], z=prev_state[1], b=prev_state[2], r=prev_state[3])
new_b = self.decay_b * state.b + (tf.ones(shape=[self.units],dtype=tf.float32) - self.decay_b) * state.z
thr = self.thr + new_b * self.beta
z = state.z
i_in = tf.matmul(input_b, W_in)
i_rec = tf.matmul(z, W_rec)
i_t = i_in + i_rec
I_reset = z * thr * self.dt
new_v = self._decay * state.v + (1 - self._decay) * i_t - I_reset
# Spike generation
is_refractory = tf.greater(state.r, .1)
zeros_like_spikes = tf.zeros_like(z)
new_z = tf.where(is_refractory, zeros_like_spikes, self.compute_z(new_v, thr))
new_r = tf.clip_by_value(state.r + self.n_refractory * new_z - 1,
0., float(self.n_refractory))
return [new_v, new_z, new_b, new_r]
#tf.function(autograph=not DEBUG)
def evolve_single(inputs):
accumulated_state = tf.scan(step, inputs, initializer=state0)
Z = tf.squeeze(accumulated_state[1]) # -> [T,units]
if self.model_settings['avg_spikes']:
Z = tf.reshape(tf.reduce_mean(Z, axis=0), shape=(1,-1))
out = tf.matmul(Z, W_out) + b_out
return out # - [BS,Num_labels]
# # - Using a simple loop
# out_store = []
# for i in range(fingerprint_3d.shape[0]):
# out_store.append(tf.squeeze(evolve_single(fingerprint_3d[i,:,:])))
# return tf.reshape(out_store, shape=[fingerprint_3d.shape[0],self.d_out])
final_out = tf.squeeze(tf.map_fn(evolve_single, fingerprint_3d)) # -> [BS,T,self.units]
return final_out
This code snippet is inside a tf.function, but I omitted it since I don't think it's relevant.
As can be seen, I run the code on fingerprint_3d, a tensor that has the dimension [BatchSize,Time,InputDimension], e.g. [50,100,20]. When I run this with BatchSize < 10 everything works fine, although tf.scan already uses a lot of memory for that.
When I now execute the code on a batch of size 50, suddenly I get an OOM, even though I am executing it in an iterative matter (here commented out).
How should I execute this code so that the Batch Size doesn't matter?
Is tensorflow maybe parallelizing my for loop so that it executed over multiple batches at once?
Another unrelated question is the following: What function instead of tf.scan should I use if I only want to accumulate one state variable, compared to the case for tf.scan where it just accumulates all the state variables? Or is that possible with tf.scan?
As mentioned in the discussions here, tf.foldl, tf.foldr, and tf.scan all require keeping track of all values for all iterations, which is necessary for computations like gradients. I am not aware of any ways to mitigate this issue; still, I would also be interested if anyone has a better answer than mine.
When I used
#tf.function
def get_loss_and_gradients():
with tf.GradientTape(persistent=False) as tape:
logits, spikes = rnn.call(fingerprint_input=graz_dict["train_input"], W_in=W_in, W_rec=W_rec, W_out=W_out, b_out=b_out)
loss = loss_normal(tf.cast(graz_dict["train_groundtruth"],dtype=tf.int32), logits)
gradients = tape.gradient(loss, [W_in,W_rec,W_out,b_out])
return loss, logits, spikes, gradients
it works.
When I remove the #tf.function decorator the memory blows up. So it really seems important that tensorflow can create a graph for you computations.

TensorFlow: LSTM State Saving/Updating within Graph

I am working with Reinforcement Learning and wanting to reduce the amount of data I feed through the sess.run() during training to speed up learning.
I was looking into the LSTM and with the need to look forward and reset to find proper Q values, I crafted a solution such as this with tf.case():
CurrentStateOption = tf.Variable(0, trainable=False, name='SavedState')
with tf.name_scope("LSTMLayer") as scope:
initializer = tf.random_uniform_initializer(-.1, .1)
lstm_cell_L1 = tf.nn.rnn_cell.LSTMCell(self.input_sizes, forget_bias=1.0, initializer=initializer, state_is_tuple=True)
self.cell_L1 = tf.nn.rnn_cell.MultiRNNCell([lstm_cell_L1] *self.NumberLSTMLayers, state_is_tuple=True)
self.state = self.cell_L1.zero_state(1,tf.float64)
self.SavedState = self.cell_L1.zero_state(1,tf.float64) #tf.Variable(state, trainable=False, name='SavedState')
#SaveCond = tf.cond(tf.equal(CurrentStateOption,tf.constant(1)), self.SaveState, self.SameState)
#RestoreCond = tf.cond(tf.equal(CurrentStateOption,tf.constant(-1)), self.RestoreState, self.SameState)
#ZeroCond = tf.cond(tf.less(CurrentStateOption,tf.constant(-1)), self.ZeroState, self.SameState)
self.state = tf.case({tf.equal(CurrentStateOption,tf.constant(1)): self.SaveState, tf.equal(CurrentStateOption,tf.constant(-1)): self.RestoreState,
tf.less(CurrentStateOption,tf.constant(-1)): self.ZeroState}, default=self.SameState, exclusive=True)
RunConditions = tf.group([SaveCond, RestoreCond, ZeroCond])
self.Xinputs = [tf.concat(1,[Xinputs])]
outputs, stateFINAL_L1 = rnn.rnn(self.cell_L1,self.Xinputs, initial_state=self.state, dtype=tf.float32)
def RestoreState(self):
#self.state = self.state.assign(self.SavedState)
self.state = self.SavedState
return self.state
def ZeroState(self):
self.state = self.cell_L1.zero_state(1,tf.float64)
return self.state
def SaveState(self):
#self.SavedState = self.SavedState.assign(self.state)
self.SavedState = self.state
return self.SavedState
def SameState(self):
return self.state
This seems to work well in concept as now I can feed an INT to instruct the LSTM Graph what to do with the state. If I Pass "1" it will save the state before executing, if I pass "-1" it will Restore the last saved state, if I pass "< -1" it will zero the state. If "0" it will use what is in the LSTM from last run (inference). I have tried a few different approaches, include a simpler tf.cond() approach.
The issue I think stems from the tf.case() Op needing tensors, but the LSTM state is a Tuple (and non-tuple is going to be depreciated). This became clear when I tried to tf.assign() the value to the graph variable.
My end goal is to leave the "state" within the graph, but pass an INT to instruct what to do with the state. In the future I would like to have multiple "store" locations for various look-backs.
Any ideas how to handle tf.case() type of structure with tuples vs tensors?
I believe having one tf.case() per element in the state tuple should work, since the tuple is just a python tuple.

Tensorflow: LSTM with moving average of weights of another LSTM

I would like to have an LSTM in tensorflow whose weights are the exponential moving average of the weights of another LSTM. So basically I have this code with some input placeholder and some initial state placeholder:
def test_lstm(input_ph, init_ph, scope):
cell = tf.nn.rnn_cell.LSTMCell(128, use_peepholes=True)
input_ph = tf.transpose(input_ph, [1, 0, 2])
input_list = tf.unpack(input_ph)
with tf.variable_scope(scope) as vs:
outputs, states = tf.nn.rnn(cell, input_list, initial_state=init_ph)
theta = [v for v in tf.all_variables() if v.name.startswith(vs.name)]
return outputs, theta
lstm_1, theta_1 = test_lstm(input_1, state_init_1, scope="lstm_1")
What I would like to do now is something similar along these lines (which don't actually work because the exponential moving average puts the tag "ema" behind the variable name of the weights and they do not appear in variable scope because they were not created with tf.get_variable):
ema = tf.train.ExponentialMovingAverage(decay=1-self.tau, name="ema")
with tf.variable_scope("lstm_2"):
maintain_averages_theta_1 = ema.apply(theta_1)
theta_2_1 = [ema.average(x) for x in theta_1]
lstm_2 , theta_2_2 = test_lstm(input_2, state_init_2, scope="lstm_2"
where eventually theta_2_1 would be equal to theta_2_2 (or throw an exception because the variables already exist).
Change (3/May/2018): I added one line apart from the old answer. The old one itself is self-sufficient to this question, but it real practice we initialize 'target network' as the same value to the 'behavioural network'. I added this point. Search the line having 'Change0': You need to do this only once right after when the target network's parameters are just initialized. Without 'Change0', you first round of gradient-based update would become nasty since Q_target and Q_behaviour are not correlated while you need both to be somewhat related ex) TD = r + gamma*Q_target - Q_behaviour for the update.
Seems a bit late but hope this helps.
The critical problem of TF-RNN series possess is that we cannot directly designate the variables for RNNs unlike plain feedforward or convolutional NNs, so that we can't do simple work - get EMAed vars and plug into the network.
Let's go into the real deal(I attached a practice code to look-up for this, so please refer EMA_LSTM.py).
Shall we say, there is a network function containing LSTM:
def network(in_x,in_h):
# Input: 'in_x' is the input sequence, 'in_h' is the initial hidden cell(c,m)
# Output: 'hidden_outputs' is the output sequence(c-sequence), 'net' is the list of parameters used in this network function
cell = tf.nn.rnn_cell.BasicLSTMCell(3, state_is_tuple=True)
in_h = tf.nn.rnn_cell.LSTMStateTuple(in_h[0], in_h[1])
hidden_outputs, last_tuple = tf.nn.dynamic_rnn(cell, in_x, dtype=tf.float32, initial_state=in_h)
net = [v for v in tf.trainable_variables() if tf.contrib.framework.get_name_scope() in v.name]
return hidden_outputs, net
Then you declare tf.placeholder for the necessary inputs, which are:
in_x = tf.placeholder("float", [None,None,6])
in_c = tf.placeholder("float", [None,3])
in_m = tf.placeholder("float", [None,3])
in_h = (in_c, in_m)
Lastly, we run a session that proceeds the network() function with specified inputs:
init_cORm = np.zeros(shape=(1,3))
input = np.ones(shape=(1,1,6))
print '========================new-1(beh)=============================='
with tf.Session() as sess:
with tf.variable_scope('beh', reuse=False) as beh:
result, net1 = network(in_x,in_h)
sess.run(tf.global_variables_initializer())
list = sess.run([result, in_x, in_h, net1], feed_dict={
in_x : input,
in_c : init_cORm,
in_m : init_cORm
})
print 'result:', list[0]
print 'in_x:' , list[1]
print 'in_h:', list[2]
print 'net1:', list[3][0][0][:4]
Now, we are going to make the var_list called 'net4' that contains the ExponentialMovingAverage(EMA)-ed values of 'net1', as in below with original 'net1' being first assigned in beh session above and newly assigned in below by adding 1. for each element:
ema = tf.train.ExponentialMovingAverage(decay=0.5)
target_update_op = ema.apply(net1)
init_new_vars_op = tf.initialize_variables(var_list=[v for v in tf.global_variables() if 'ExponentialMovingAverage' in v.name]) # 'initialize_variables' will be replaced with 'variables_initializer' in 2017
sess.run(init_new_vars_op)
sess.run([param4.assign(param1.eval()) for param4, param1 in zip(net4,net1)]) # Change0
len_net1 = len(net1)
net1_ema = [[] for i in range(len_net1)]
for i in range(len_net1):
sess.run(net1[i].assign(1. + net1[i]))
sess.run(target_update_op)
Note that
we only initialised(by declaring 'init_new_vars_op' then running the declaration job) the variables of their name containing 'ExponentialMovingAverage', if not the variables in net1 will also be newly initialised.
'net1' is newly assigned with +1 for every elements of the variables in 'net1'. If an element of 'net1' was -0.5 and is now 0.5 by +1, then we want to have 'net4' as 0. when the EMA decay rate is 0.5
Finally, we run the EMA job with 'sess.run(target_update_op)'
Eventually, we declare 'net4' first with the 'network()' function and then assign&run the EMA(net1) values into 'net4'. When you run 'sess.run(result)', then it will be the one with EMA(net1)-ed variables.
with tf.variable_scope('tare', reuse=False) as tare:
result, net4 = network(in_x,in_h)
len_net4 = len(net4)
target_assign = [[] for i in range(len_net4)]
for i in range(len_net4):
target_assign[i] = net4[i].assign(ema.average(net1[i]))
sess.run(target_assign[i].op)
list = sess.run([result, in_x, in_h, net4], feed_dict={
in_x : input,
in_c : init_cORm,
in_m : init_cORm
})
What's happened in here? You just indirectly declares the LSTM variables as 'net4' in the 'network()' function. Then in the for loop, we points out that 'net4' is actually EMA of 'net1' and 'net1+1.' . Lastly, with the net4 specified what to do with(via 'network()') and what values it takes(via 'for loop of .assign(ema.average()) to itself'), you run the process.
It is somewhat counter-intuitive that we declare the 'result' first and specifies the value of parameters second. However, that's the nature of TF what they exactly look for, as it is always logical to set variables, processes, and their relation first then to assign values, then runs the processes.
Lastly, few things to go further for the real machine learning codes:
In here, I just second assigned 'net1' with 'net1 + 1.'. In real case, this '+1.' step is where you 'sess.run()'(after you 'declare' in somewhere) your optimiser. So every-time after you 'sess.run(optimiser)', 'sess.run(target_update_op)' and then'sess.run(target_assign[i].op)' should follow to update your 'net4' along the EMA of 'net1'. Conceretly, you can do this job with different order like below:
ema = tf.train.ExponentialMovingAverage(decay=0.5)
target_update_op = ema.apply(net1)
with tf.variable_scope('tare', reuse=False) as tare:
result, net4 = network(in_x,in_h)
len_net4 = len(net4)
target_assign = [[] for i in range(len_net4)]
for i in range(len_net4):
target_assign[i] = net4[i].assign(ema.average(net1[i]))
init_new_vars_op = tf.initialize_variables(var_list=[v for v in tf.global_variables() if 'ExponentialMovingAverage' in v.name]) # 'initialize_variables' will be replaced with 'variables_initializer' in 2017
sess.run(init_new_vars_op)
sess.run([param4.assign(param1.eval()) for param4, param1 in zip(net4,net1)]) # Change0
Lastly, be aware that you must sess.run(ema.apply(net)) right after the net changes, and then sess.run(net_ema.assign(ema.average(net)). W/O .apply, net_ema will not get assigned with averaged value
len_net1 = len(net1)
net1_ema = [[] for i in range(len_net1)]
for i in range(len_net1):
sess.run(net1[i].assign(1. + net1[i]))
sess.run(target_update_op)
for i in range(len_net4):
sess.run(target_assign[i].op)
list = sess.run([result, in_x, in_h, net4], feed_dict={
in_x : input,
in_c : init_cORm,
in_m : init_cORm
})

How can I feed last output y(t-1) as input for generating y(t) in tensorflow RNN?

I want to design a single layer RNN in Tensorflow such that last output (y(t-1)) is participated in updating the hidden state.
h(t) = tanh(W_{ih} * x(t) + W_{hh} * h(t) + **W_{oh}y(t - 1)**)
y(t) = W_{ho}*h(t)
How can I feed last input y(t - 1) as input for updating the hidden state?
Is y(t-1) the last input or output? In both cases it is not a straight fit with the TensorFlow RNN cell abstraction. If your RNN is simple you can just write the loop on your own, then you have full control. Another way that I would use is to pre-process your RNN input, e.g., do something like:
processed_input[t] = tf.concat(input[t], input[t-1])
Then call the RNN cell with processed_input and split there.
One possibility is to use tf.nn.raw_rnn which I found in this article. Check my answer to this related post.
I would call what you described an "autoregressive RNN". Here's an (incomplete) code snippet that shows how you can create one using tf.nn.raw_rnn:
import tensorflow as tf
LSTM_SIZE = 128
BATCH_SIZE = 64
HORIZON = 10
lstm_cell = tf.nn.rnn_cell.LSTMCell(LSTM_SIZE, use_peepholes=True)
class RnnLoop:
def __init__(self, initial_state, cell):
self.initial_state = initial_state
self.cell = cell
def __call__(self, time, cell_output, cell_state, loop_state):
emit_output = cell_output # == None for time == 0
if cell_output is None: # time == 0
initial_input = tf.fill([BATCH_SIZE, LSTM_SIZE], 0.0)
next_input = initial_input
next_cell_state = self.initial_state
else:
next_input = cell_output
next_cell_state = cell_state
elements_finished = (time >= HORIZON)
next_loop_state = None
return elements_finished, next_input, next_cell_state, emit_output, next_loop_state
rnn_loop = RnnLoop(initial_state=initial_state_tensor, cell=lstm_cell)
rnn_outputs_tensor_array, _, _ = tf.nn.raw_rnn(lstm_cell, rnn_loop)
rnn_outputs_tensor = rnn_outputs_tensor_array.stack()
Here we initialize internal state of LSTM with some vector initial_state_tensor, and feed zero array as input at t=0. After that, the output of the current timestep is the input for the next timestep.