Batch Normalization - TensorFlow

I have looked at a few BN examples but am still a bit confused. I am currently using the function below, which calls batch_norm as documented here:
https://github.com/tensorflow/tensorflow/blob/master/tensorflow/g3doc/api_docs/python/functions_and_classes/shard4/tf.contrib.layers.batch_norm.md
from tensorflow.contrib.layers.python.layers import batch_norm
import tensorflow as tf

def bn(x, is_training, name):
    bn_train = batch_norm(x, decay=0.9, center=True, scale=True,
                          updates_collections=None,
                          is_training=True,
                          reuse=None,
                          trainable=True,
                          scope=name)
    bn_inference = batch_norm(x, decay=1.00, center=True, scale=True,
                              updates_collections=None,
                              is_training=False,
                              reuse=True,
                              trainable=False,
                              scope=name)
    z = tf.cond(is_training, lambda: bn_train, lambda: bn_inference)
    return z
The following part is a toy run where I just check that the function reuses the means and variances computed during training for two features. Running this part of the code in test mode, i.e. is_training=False, the running means/variances computed in the training step are changing, which can be seen when we print out the BN variables obtained by calling bnParams.
if __name__ == "__main__":
    print("Example")
    import os
    import numpy as np
    import scipy.stats as stats
    np.set_printoptions(suppress=True, linewidth=200, precision=3)
    np.random.seed(1006)
    import pdb

    path = "batchNorm/"
    if not os.path.exists(path):
        os.mkdir(path)
    savePath = path + "bn.model"

    nFeats = 2
    X = tf.placeholder(tf.float32, [None, nFeats])
    is_training = tf.placeholder(tf.bool, name="is_training")
    Y = bn(X, is_training=is_training, name="bn")
    mvn = stats.multivariate_normal([0, 100])
    bs = 4
    load = 0
    train = 1
    saver = tf.train.Saver()

    def bnCheck(batch, mu, std):
        # Checking calculation
        return (batch - mu) / (std + 0.001)

    with tf.Session() as sess:
        if load == 1:
            saver.restore(sess, savePath)
        else:
            tf.global_variables_initializer().run()

        #### TRAINING #####
        if train == 1:
            for i in xrange(100):
                x = mvn.rvs(bs)
                y = Y.eval(feed_dict={X: x, is_training.name: True})

        def bnParams():
            beta, gamma, mean, var = [v.eval() for v in tf.get_collection(tf.GraphKeys.GLOBAL_VARIABLES, scope="bn")]
            return beta, gamma, mean, var

        beta, gamma, mean, var = bnParams()

        #### TESTING #####
        for i in xrange(10):
            x = mvn.rvs(1).reshape(1, -1)
            check = bnCheck(x, mean, np.sqrt(var))
            y = Y.eval(feed_dict={X: x, is_training.name: False})
            print("x = {0}, y = {1}, check = {2}".format(x, y, check))
            beta, gamma, mean, var = bnParams()
            print("BN Params: Beta {0} Gamma {1} mean {2} var{3} \n".format(beta, gamma, mean, var))
        saver.save(sess, savePath)
The first three iterations of the test loop look as follows:
x = [[ -1.782 100.941]], y = [[-1.843 1.388]], check = [[-1.842 1.387]]
BN Params: Beta [ 0. 0.] Gamma [ 1. 1.] mean [ -0.2 99.93] var[ 0.818 0.589]
x = [[ -1.245 101.126]], y = [[-1.156 1.557]], check = [[-1.155 1.557]]
BN Params: Beta [ 0. 0.] Gamma [ 1. 1.] mean [ -0.304 100.05 ] var[ 0.736 0.53 ]
x = [[ -0.107 99.349]], y = [[ 0.23 -0.961]], check = [[ 0.23 -0.96]]
BN Params: Beta [ 0. 0.] Gamma [ 1. 1.] mean [ -0.285 99.98 ] var[ 0.662 0.477]
I am not doing backprop, so beta and gamma won't change. However, my running means/variances are changing. Where am I going wrong?
EDIT:
It would be good to know why these arguments need (or do not need) to change between test and train:
updates_collections, reuse, trainable

Your bn function is wrong. Use this instead:
def bn(x, is_training, name):
    return batch_norm(x, decay=0.9, center=True, scale=True,
                      updates_collections=None,
                      is_training=is_training,
                      reuse=None,
                      trainable=True,
                      scope=name)
is_training is a 0-D boolean tensor that signals whether to update the running mean, etc. By just changing the value fed to the is_training tensor you signal whether you are in the training or the test phase.
EDIT:
Many operations in TensorFlow accept tensors rather than constant True/False arguments.
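For concreteness, here is a minimal sketch of that usage (the feature size, session structure and feed values are illustrative assumptions, not from the question): a single batch_norm call is built once, and the phase is switched purely through the is_training placeholder.
import tensorflow as tf
from tensorflow.contrib.layers.python.layers import batch_norm

x = tf.placeholder(tf.float32, [None, 2])
is_training = tf.placeholder(tf.bool, name="is_training")
y = batch_norm(x, decay=0.9, center=True, scale=True,
               updates_collections=None, is_training=is_training, scope="bn")

with tf.Session() as sess:
    sess.run(tf.global_variables_initializer())
    batch = [[0.0, 100.0]]
    # Training step: the moving mean/variance are updated.
    sess.run(y, feed_dict={x: batch, is_training: True})
    # Inference step: the stored moving statistics are used and left unchanged.
    sess.run(y, feed_dict={x: batch, is_training: False})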

When you use slim.batch_norm, be sure to use slim.learning.create_train_op instead of tf.train.GradientDescentOptimizer(lr).minimize(loss) or another optimizer's minimize(). Try it to see if it works!
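As a rough sketch of what that looks like (the dummy network and loss below are assumptions for illustration), create_train_op picks up the moving-average update ops that slim.batch_norm registers in tf.GraphKeys.UPDATE_OPS, which a bare minimize(loss) would not run:
import tensorflow as tf
slim = tf.contrib.slim

x = tf.placeholder(tf.float32, [None, 4])
labels = tf.placeholder(tf.float32, [None, 1])

# Toy network whose layers are batch-normalized by slim.
net = slim.fully_connected(x, 8, normalizer_fn=slim.batch_norm,
                           normalizer_params={"is_training": True})
predictions = slim.fully_connected(net, 1, activation_fn=None)
loss = tf.losses.mean_squared_error(labels, predictions)

optimizer = tf.train.GradientDescentOptimizer(learning_rate=1e-3)
# create_train_op attaches the UPDATE_OPS (moving mean/variance updates) to the train op.
train_op = slim.learning.create_train_op(loss, optimizer)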

Related

Seed for dropout in Tensorflow LSTM - Difference in model(X) and model.predict(X)

Outputs of an LSTM layer in TensorFlow differ between model(X) and model.predict(X) when using dropout.
Let's call the output of model(X) the Fwd Pass and that of model.predict(X) the Prediction.
For a regular dropout layer we can specify the seed, but the LSTM layer doesn't have such an argument. I'm guessing this is what causes the difference between the Fwd Pass and the Prediction.
In the following code sample, if dropout=0.4 the outputs are different, but when dropout=0.0 they match exactly. This makes me believe that every evaluation is using a different operation-level seed.
Is there a way to set that? I've already set the global seed for tensorflow.
Is there something else going on that I am not aware of?
PS: I want to use dropout during inference, so that is by design.
Code
import os
import numpy as np
import tensorflow as tf
from tensorflow import keras
from tensorflow.keras import layers
from tensorflow.keras.initializers import GlorotUniform

SEED = 200
HIDDEN_UNITS = 4
N_OUTPUTS = 1
N_INPUTS = 4
BATCH_SIZE = 4
N_SAMPLES = 4

np.random.seed(SEED)
tf.random.set_seed(SEED)

# Simple LSTM Model
def my_model():
    inputs = x = keras.Input(shape=(N_INPUTS, 1))
    initializer = GlorotUniform(seed=SEED)
    x = layers.LSTM(HIDDEN_UNITS,
                    kernel_initializer=initializer,
                    recurrent_dropout=0.0,
                    dropout=0.4,
                    # return_sequences=True,
                    use_bias=False)(x, training=True)
    output = x
    model = keras.Model(inputs=inputs, outputs=[output])
    return model

# Create Sample Data
# Target Function
def f_x(x):
    y = x[:, 0] + x[:, 1] ** 2 + np.sin(x[:, 2]) + np.sin(x[:, 3] ** 3)
    y = y[:, np.newaxis]
    return y

# Generate random inputs
d = np.linspace(0.1, 1, N_SAMPLES)
X = np.transpose(np.vstack([d*0.25, d*0.5, d*0.75, d]))
X = X[:, :, np.newaxis]
Y = f_x(X)

# PRINT FWD PASS
model = my_model()
n_out = model(X).numpy()
print('FWD PASS:')
print(n_out, '\n')

# PRINT PREDICT OUTPUT
print('PREDICT:')
out = model.predict(X)
print(out)
Output (dropout=0.4) - outputs do not match
FWD PASS:
[[ 0. 0. 0. 0. ]
[ 0. 0. 0. 0. ]
[ 0.0526864 -0.13284351 0.02326298 -0.30357683]
[ 0.06297918 -0.14084947 0.02214929 -0.44425806]]
PREDICT:
[[ 0.00975818 -0.029404 0.00678372 -0.03232396]
[ 0.0347842 -0.0974849 0.01938616 -0.15696262]
[ 0. 0. 0. 0. ]
[ 0.06297918 -0.14084947 0.02214929 -0.44425806]]
Output (dropout=0.0) - no dropout, outputs match
FWD PASS:
[[ 0.00593475 -0.01799661 0.00424165 -0.01876264]
[ 0.02226446 -0.06519517 0.01399653 -0.08595844]
[ 0.03620889 -0.10084937 0.01987283 -0.1663805 ]
[ 0.0475584 -0.12453148 0.02269932 -0.2541136 ]]
PREDICT:
[[ 0.00593475 -0.01799661 0.00424165 -0.01876264]
[ 0.02226446 -0.06519517 0.01399653 -0.08595844]
[ 0.03620889 -0.10084937 0.01987283 -0.1663805 ]
[ 0.0475584 -0.12453148 0.02269932 -0.2541136 ]]

TypeError: Fetch argument None has invalid type <class 'NoneType'> on operation that seems to be not none

N.B. Tensorflow version less than 2.0
In the following reproducible code, wd_d_op = sess.run([wd_d_op], feed_dict={X: x}) runs successfully, but grads_and_vars = sess.run([grad_and_vars], feed_dict={X: x}) raises the mentioned NoneType error. If grad_and_vars cannot be fetched, how come the next operation, which uses it, runs successfully?
import tensorflow as tf
import numpy as np
from sklearn.datasets import make_blobs

## function for creating a layer with fixed weights, don't worry about this
def fc_layer(input_tensor, input_dim, output_dim, component_name, act=tf.nn.relu, input_type='dense'):
    # weight = tf.Variable(tf.truncated_normal([input_dim, output_dim], stddev=1. / tf.sqrt(input_dim / 2.)), name='weight')
    if component_name == "weight1":
        weight = tf.Variable([[-0.46401197, -0.02868146, -0.02945778, -0.19310321],
                              [-0.06130088, -0.3782992 , -1.4025078 , -0.8482222 ]])
        bias = tf.Variable([0.1, 0.1, 0.1, 0.1])
    else:
        weight = tf.Variable([[ 0.27422005], [-1.2150304 ], [-0.43404067], [-0.3352416 ]])
        bias = tf.Variable([0.1])
    # weight=tf.Print(weight,[weight],component_name,summarize=-1)
    bias = tf.Variable(tf.constant(0.1, shape=[output_dim]), name='bias')
    # bias=tf.Print(bias,[type(bias)],component_name+"bias",summarize=-1)
    weight = tf.cast(weight, tf.float32)
    bias = tf.cast(bias, tf.float32)
    input_tensor = tf.cast(input_tensor, tf.float32)
    if input_type == 'sparse':
        activations = act(tf.sparse_tensor_dense_matmul(input_tensor, weight) + bias)
    else:
        activations = act(tf.matmul(input_tensor, weight) + bias, name="features")
    return activations

"""fixed input"""
x = np.array([[-0.9233333412304945, -0.5148649076298134],
              [-0.9366679176350374, -2.086600005395918],
              [50.366624846708156, -9.02965996391532],
              [51.09416621163187, -12.101430685982692]])
lr_wd_D = 1e-3

with tf.name_scope('input'):
    X = tf.placeholder(dtype=tf.float32, name="exmaple")
with tf.name_scope('generator'):
    h1 = fc_layer(X, 2, 4, component_name="weight1", input_type='dense')
    output = fc_layer(h1, 4, 1, component_name="weight2", act=tf.identity, input_type='dense')
    # output=tf.Print(output,[output],"output",summarize=-1)
    output = tf.convert_to_tensor(output, dtype=tf.float32)
    critic_s = tf.slice(output, [0, 0], [2, -1])
    critic_t = tf.slice(output, [2, 0], [2, -1])
    wd_loss = (tf.reduce_mean(critic_s) - tf.reduce_mean(critic_t))
    # wd_loss=tf.convert_to_tensor(wd_loss, dtype=tf.float32)

theta_C = [v for v in tf.global_variables() if 'generator' in v.name]
wd_op = tf.train.AdamOptimizer(lr_wd_D)

"""only calling this operation does not work, raised the mentioned error"""
grad_and_vars = wd_op.compute_gradients(wd_loss, var_list=theta_C)
"""But the following operation works even though it uses the previous variable"""
wd_d_op = wd_op.apply_gradients(grad_and_vars)

with tf.Session() as sess:
    sess.run(tf.global_variables_initializer())
    # this works
    wd_loss, theta_C = sess.run([wd_loss, theta_C], feed_dict={X: x})
    print("wd_loss")
    print(wd_loss)
    print("theta_C")
    print(theta_C)
    # this works
    wd_d_op = sess.run([wd_d_op], feed_dict={X: x})
    # this does not work, even though grads_and_vars is used by wd_d_op
    grads_and_vars = sess.run([grad_and_vars], feed_dict={X: x})
Solution:
If you comment out the following two lines of code, it will run correctly.
# bias=tf.Variable([0.1,0.1,0.1,0.1])
# bias=tf.Variable([0.1])
Explanation:
Gradients are returned as None if there is no explicit connection between wd_loss and a variable in the graph. If you print theta_C, you will find two bias variables; these two bias variables don't actually participate in the calculation of wd_loss.
Below I give an example of the error that occurs when w3 does not participate in the calculation of y but we differentiate y with respect to it.
import tensorflow as tf
w1 = tf.Variable([[1.,2.]])
w2 = tf.Variable([[9.],[10.]])
w3 = tf.Variable([[5.,6.]])
y = tf.matmul(w1, w2)
# this work
grads = tf.gradients(y,[w1,w2])
# this does not work, TypeError: Fetch argument None has invalid type <class 'NoneType'>
# grads = tf.gradients(y,[w1,w2,w3])
with tf.Session() as sess:
    sess.run(tf.global_variables_initializer())
    gradval = sess.run(grads)
    print(gradval)
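As an aside, and only if your TensorFlow version has it (tf.gradients gained an unconnected_gradients argument around 1.12, which is an assumption to check against your install), you can ask for zeros instead of None for unconnected variables so the fetch no longer fails:
import tensorflow as tf

w1 = tf.Variable([[1., 2.]])
w2 = tf.Variable([[9.], [10.]])
w3 = tf.Variable([[5., 6.]])   # does not participate in y
y = tf.matmul(w1, w2)

# Unconnected variables get zero gradients instead of None.
grads = tf.gradients(y, [w1, w2, w3],
                     unconnected_gradients=tf.UnconnectedGradients.ZERO)

with tf.Session() as sess:
    sess.run(tf.global_variables_initializer())
    print(sess.run(grads))  # the entry for w3 is a zero tensor, not None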

Sampling from tensor that depends on a random variable in tensorflow

Is it possible to get samples from a tensor that depends on a random variable in tensorflow? I need to get an approximate sample distribution to use in a loss function to be optimized. Specifically, in the example below, I want to be able to obtain samples of Y_output in order to be able to calculate the mean and variance of the output distribution and use these parameters in a loss function.
def sample_weight(mean, phi, seed=1):
    P_epsilon = tf.distributions.Normal(loc=0., scale=1.0)
    epsilon_s = P_epsilon.sample([1])
    s = tf.multiply(epsilon_s, tf.log(1.0 + tf.exp(phi)))
    weight_sample = mean + s
    return weight_sample

X = tf.placeholder(tf.float32, shape=[None, 1], name="X")
Y_labels = tf.placeholder(tf.float32, shape=[None, 1], name="Y_labels")
sw0 = sample_weight(u0, p0)
sw1 = sample_weight(u1, p1)
Y_output = sw0 + tf.multiply(sw1, X)
loss = tf.losses.mean_squared_error(labels=Y_labels, predictions=Y_output)
train_op = tf.train.AdamOptimizer(0.5e-1).minimize(loss)
init_op = tf.global_variables_initializer()
losses = []
predictions = []
Fx = lambda x: 0.5*x + 5.0
xrnge = 50
xs, ys = build_toy_data(funcx=Fx, stdev=2.0, num=xrnge)

with tf.Session() as sess:
    sess.run(init_op)
    iterations = 1000
    for i in range(iterations):
        stat = sess.run(loss, feed_dict={X: xs, Y_labels: ys})
Not sure if this answers your question, but: when you have a Tensor downstream from a sampling Op (e.g., the Op created by your call to P_epsilon.sample([1])), any time you call sess.run on the downstream Tensor, the sample op will be re-run and produce a new random value. Example:
import tensorflow as tf
from tensorflow_probability import distributions as tfd
n = tfd.Normal(0., 1.)
s = n.sample()
y = s**2
sess = tf.Session() # Don't actually do this -- use context manager
print(sess.run(y))
# ==> 0.13539088
print(sess.run(y))
# ==> 0.15465781
print(sess.run(y))
# ==> 4.7929106
If you want a bunch of samples of y, you could do
import tensorflow as tf
from tensorflow_probability import distributions as tfd
n = tfd.Normal(0., 1.)
s = n.sample(100)
y = s**2
sess = tf.Session() # Don't actually do this -- use context manager
print(sess.run(y))
# ==> vector of 100 squared random normal values
We also have some cool tools in tensorflow_probability to do the kind of thing you're driving at here. Namely the Bijector API and, somewhat simpler, the trainable_distributions API.
(Another minor point: I'd suggest using tf.nn.softplus, or at a minimum tf.log1p(tf.exp(x)), instead of tf.log(1.0 + tf.exp(x)). The latter has poor numerical properties due to floating-point imprecision, which the former are written to avoid.)
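A minimal NumPy sketch of the failure mode (the specific value 90 is just an illustrative assumption; anything above roughly 88 overflows float32 exp):
import numpy as np

x = np.float32(90.0)

# Naive form: exp(90) exceeds the float32 maximum (~3.4e38), so the result is inf.
naive = np.log(1.0 + np.exp(x))

# Stable softplus rewrite: log(1 + exp(x)) = max(x, 0) + log1p(exp(-|x|)).
stable = np.maximum(x, 0) + np.log1p(np.exp(-np.abs(x)))

print(naive, stable)  # inf 90.0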
Hope this is some help!

Unable to use core Estimator with contrib Predictor

I'm using canned estimators and am struggling with poor predict performance, so I'm trying to use tf.contrib.predictor to improve my inference performance. I've made this minimalistic example to reproduce my problem:
import tensorflow as tf
from tensorflow.contrib import predictor

def serving_input_fn():
    x = tf.placeholder(dtype=tf.string, shape=[1], name='x')
    inputs = {'x': x}
    return tf.estimator.export.ServingInputReceiver(inputs, inputs)

input_feature_column = tf.feature_column.numeric_column('x', shape=[1])
estimator = tf.estimator.DNNRegressor(
    feature_columns=[input_feature_column],
    hidden_units=[10, 20, 10],
    model_dir="model_dir\\predictor-test")

estimator_predictor = predictor.from_estimator(estimator, serving_input_fn)
estimator_predictor({"inputs": ["1.0"]})
This yields the following exception:
UnimplementedError (see above for traceback): Cast string to float is not supported
[[Node: dnn/input_from_feature_columns/input_layer/x/ToFloat = Cast[DstT=DT_FLOAT, SrcT=DT_STRING, _device="/job:localhost/replica:0/task:0/device:CPU:0"](dnn/input_from_feature_columns/input_layer/x/ExpandDims)]]
I've tried using tf.estimator.export.TensorServingInputReceiver instead of ServingInputReceiver in my serving_input_fn(), so that I can feed my model with a numerical tensor which is what I want:
def serving_input_fn():
    x = tf.placeholder(dtype=tf.float32, shape=[1], name='x')
    return tf.estimator.export.TensorServingInputReceiver(x, x)
but then I get the following exception in my predictor.from_estimator() call:
ValueError: features should be a dictionary of Tensors. Given type: <class 'tensorflow.python.framework.ops.Tensor'>
Any ideas?
My understanding of all of this is not really solid, but I got it working, and given the size of the community I'll try to share what I did.
First, I'm running tensorflow 1.5 binaries with this patch applied manually.
The exact code I'm running is this:
def serving_input_fn():
    x = tf.placeholder(dtype=tf.float32, shape=[3500], name='x')
    inputs = {'x': x}
    return tf.estimator.export.ServingInputReceiver(inputs, inputs)

estimator = tf.estimator.Estimator(
    model_fn=model_fn,
    model_dir="{}/model_dir_{}/model.ckpt-103712".format(script_dir, 3))

estimator_predictor = tf.contrib.predictor.from_estimator(
    estimator, serving_input_fn)

p = estimator_predictor(
    {"x": np.array(sample.normalized.input_data)})
My case is a bit different than your example because I'm using a custom Estimator but in your case, I guess you should try something like this:
def serving_input_fn():
    x = tf.placeholder(dtype=tf.float32, shape=[1], name='x')
    inputs = {'x': x}
    return tf.estimator.export.ServingInputReceiver(inputs, inputs)

estimator = ...

estimator_predictor = tf.contrib.predictor.from_estimator(
    estimator, serving_input_fn)

estimator_predictor({"x": [1.0]})
The error is in the following line:
estimator_predictor({"inputs": ["1.0"]})
Please take 1.0 out of the quotes; currently it's a string.
After having worked on this for a couple of days, I want to share what I have done. The following code is also available from https://github.com/dage/tensorflow-estimator-predictor-example
TL;DR: predictor works best with custom estimators and the performance increase is massive.
import tensorflow as tf
import numpy as np
import datetime
import time

FEATURES_RANK = 3  # The number of inputs
LABELS_RANK = 2  # The number of outputs

# Returns a numpy array of rank LABELS_RANK based on the features argument.
# Can be used when creating a training dataset.
def features_to_labels(features):
    sum_column = features.sum(1).reshape(features.shape[0], 1)
    labels = np.hstack((sum_column*i for i in range(1, LABELS_RANK+1)))
    return labels

def serving_input_fn():
    x = tf.placeholder(dtype=tf.float32, shape=[None, FEATURES_RANK], name='x')  # match dtype in input_fn
    inputs = {'x': x}
    return tf.estimator.export.ServingInputReceiver(inputs, inputs)

def model_fn(features, labels, mode):
    net = features["x"]  # input
    for units in [4, 8, 4]:  # hidden units
        net = tf.layers.dense(net, units=units, activation=tf.nn.relu)
        net = tf.layers.dropout(net, rate=0.1)
    output = tf.layers.dense(net, LABELS_RANK, activation=None)
    if mode == tf.estimator.ModeKeys.PREDICT:
        return tf.estimator.EstimatorSpec(mode, predictions=output, export_outputs={"out": tf.estimator.export.PredictOutput(output)})
    loss = tf.losses.mean_squared_error(labels, output)
    if mode == tf.estimator.ModeKeys.EVAL:
        return tf.estimator.EstimatorSpec(mode, loss=loss)
    optimizer = tf.train.AdagradOptimizer(learning_rate=0.1)
    train_op = optimizer.minimize(loss, global_step=tf.train.get_global_step())
    return tf.estimator.EstimatorSpec(mode, loss=loss, train_op=train_op)

# expecting a numpy array of shape (1, FEATURE_RANK) for constant_feature argument
def input_fn(num_samples, constant_feature=None, is_infinite=True):
    feature_values = np.full((num_samples, FEATURES_RANK), constant_feature) if isinstance(constant_feature, np.ndarray) else np.random.rand(num_samples, FEATURES_RANK)
    feature_values = np.float32(feature_values)  # match dtype in serving_input_fn
    labels = features_to_labels(feature_values)
    dataset = tf.data.Dataset.from_tensors(({"x": feature_values}, labels))
    if is_infinite:
        dataset = dataset.repeat()
    return dataset.make_one_shot_iterator().get_next()

estimator = tf.estimator.Estimator(
    model_fn=model_fn,
    model_dir="model_dir\\estimator-predictor-test-{date:%Y-%m-%d %H.%M.%S}".format(date=datetime.datetime.now()))

train = estimator.train(input_fn=lambda: input_fn(50), steps=500)
evaluate = estimator.evaluate(input_fn=lambda: input_fn(20), steps=1)
predictor = tf.contrib.predictor.from_estimator(estimator, serving_input_fn)

consistency_check_features = np.random.rand(1, FEATURES_RANK)
consistency_check_labels = features_to_labels(consistency_check_features)

num_calls_predictor = 100
predictor_input = {"x": consistency_check_features}
start_time_predictor = time.clock()
for i in range(num_calls_predictor):
    predictor_prediction = predictor(predictor_input)
delta_time_predictor = 1./num_calls_predictor*(time.clock() - start_time_predictor)

num_calls_estimator_predict = 10
estimator_input = lambda: input_fn(1, consistency_check_features, False)
start_time_estimator_predict = time.clock()
for i in range(num_calls_estimator_predict):
    estimator_prediction = list(estimator.predict(input_fn=estimator_input))
delta_time_estimator = 1./num_calls_estimator_predict*(time.clock() - start_time_estimator_predict)

print("{} --> {}\n predictor={}\n estimator={}.\n".format(consistency_check_features, consistency_check_labels, predictor_prediction, estimator_prediction))
print("Time used per estimator.predict() call: {:.5f}s, predictor(): {:.5f}s ==> predictor is {:.0f}x faster!".format(delta_time_estimator, delta_time_predictor, delta_time_estimator/delta_time_predictor))
On my laptop I get the following results:
[[0.55424854 0.98057611 0.98604857]] --> [[2.52087322 5.04174644]]
predictor={'output': array([[2.5221248, 5.049496 ]], dtype=float32)}
estimator=[array([2.5221248, 5.049496 ], dtype=float32)].
Time used per estimator.predict() call: 0.30071s, predictor(): 0.00057s ==> predictor is 530x faster!

Tensorflow value error: Variable already exists, disallowed

I am predicting financial time series with different time periods using tensorflow. In order to divide the input data, I made sub-samples and used a for loop.
However, I got a ValueError like this:
ValueError: Variable rnn/basic_lstm_cell/weights already exists, disallowed. Did you mean to set reuse=True in VarScope? Originally defined at:
Without the sub-samples this code works well.
Below is my code.
import tensorflow as tf
import numpy as np
import matplotlib
import os
import matplotlib.pyplot as plt

class lstm:
    def __init__(self, x, y):
        # train Parameters
        self.seq_length = 50
        self.data_dim = x.shape[1]
        self.hidden_dim = self.data_dim*2
        self.output_dim = 1
        self.learning_rate = 0.0001
        self.iterations = 5  # originally 500

    def model(self, x, y):
        # build a dataset
        dataX = []
        dataY = []
        for i in range(0, len(y) - self.seq_length):
            _x = x[i:i + self.seq_length]
            _y = y[i + self.seq_length]
            dataX.append(_x)
            dataY.append(_y)

        train_size = int(len(dataY) * 0.7977)
        test_size = len(dataY) - train_size
        trainX, testX = np.array(dataX[0:train_size]), np.array(dataX[train_size:len(dataX)])
        trainY, testY = np.array(dataY[0:train_size]), np.array(dataY[train_size:len(dataY)])
        print(train_size, test_size)

        # input place holders
        X = tf.placeholder(tf.float32, [None, self.seq_length, self.data_dim])
        Y = tf.placeholder(tf.float32, [None, 1])

        # build a LSTM network
        cell = tf.contrib.rnn.BasicLSTMCell(num_units=self.hidden_dim, state_is_tuple=True, activation=tf.tanh)
        outputs, _states = tf.nn.dynamic_rnn(cell, X, dtype=tf.float32)
        self.Y_pred = tf.contrib.layers.fully_connected(outputs[:, -1], self.output_dim, activation_fn=None)
        # We use the last cell's output

        # cost/loss
        loss = tf.reduce_sum(tf.square(self.Y_pred - Y))  # sum of the squares
        # optimizer
        optimizer = tf.train.AdamOptimizer(self.learning_rate)
        train = optimizer.minimize(loss)

        # RMSE
        targets = tf.placeholder(tf.float32, [None, 1])
        predictions = tf.placeholder(tf.float32, [None, 1])
        rmse = tf.sqrt(tf.reduce_mean(tf.square(targets - predictions)))

        # training
        with tf.Session() as sess:
            init = tf.global_variables_initializer()
            sess.run(init)

            # Training step
            for i in range(self.iterations):
                _, step_loss = sess.run([train, loss], feed_dict={X: trainX, Y: trainY})

            # prediction
            train_predict = sess.run(self.Y_pred, feed_dict={X: trainX})
            test_predict = sess.run(self.Y_pred, feed_dict={X: testX})

        return train_predict, test_predict

# variables definition
tsx = []
tsy = []
tsr = []
trp = []
tep = []

x = np.loadtxt('data.csv', delimiter=',')  # data for analysis
y = x[:, [-1]]
z = np.loadtxt('rb.csv', delimiter=',')  # data for time series
z1 = z[:, 0]  # start cell
z2 = z[:, 1]  # end cell

for i in range(1):  # need to change to len(z)
    globals()['x_%s' % i] = x[int(z1[i]):int(z2[i]), :]  # definition of x
    tsx.append(globals()["x_%s" % i])
    globals()['y_%s' % i] = y[int(z1[i])+1:int(z2[i])+1, :]  # definition of y
    tsy.append(globals()["y_%s" % i])
    globals()['a_%s' % i] = lstm(tsx[i], tsy[i])  # definition of class
    globals()['trp_%s' % i], globals()['tep_%s' % i] = globals()["a_%s" % i].model(tsx[i], tsy[i])
    trp.append(globals()["trp_%s" % i])
    tep.append(globals()["tep_%s" % i])
Every time the model method is called, you are building the computational graph of your LSTM. The second time the model method is called, tensorflow discovers that you have already created variables with the same name. If the reuse flag of the scope in which the variables are created is set to False, a ValueError is raised.
To solve this problem you have to set the reuse flag to True by calling tf.get_variable_scope().reuse_variables() at the end of your loop.
Note that you can't add this in the beginning of your loop, because then you are trying to reuse variables that have not yet been created.
You can find more info in the tensorflow docs here.
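A minimal sketch of that pattern (the placeholder shape and the helper that builds the LSTM are illustrative assumptions, not the questioner's exact code):
import tensorflow as tf

def build_rnn(x):
    cell = tf.contrib.rnn.BasicLSTMCell(num_units=4, state_is_tuple=True)
    outputs, _ = tf.nn.dynamic_rnn(cell, x, dtype=tf.float32)
    return outputs

X = tf.placeholder(tf.float32, [None, 50, 2])

with tf.variable_scope("lstm_model"):
    for i in range(2):
        outputs = build_rnn(X)
        # Allow the next iteration to reuse the variables created in this one
        # instead of raising "Variable ... already exists, disallowed".
        tf.get_variable_scope().reuse_variables()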
You define some variables in the "model" function.
Try this when you want to call the "model" function multiple times:
with tf.variable_scope("model_fn") as scope:
    train_predict, test_predict = model(input1)
with tf.variable_scope(scope, reuse=True):
    train_predict, test_predict = model(input2)