Predict next number in a pattern - tensorflow

I am trying to write a simple program using TensorFlow to predict the next number in a sequence.
I am not experienced in TensorFlow so instead of starting from scratch I started with this guide: http://monik.in/a-noobs-guide-to-implementing-rnn-lstm-using-tensorflow/
However, in contrast to the implementation in the link above, I do not want to treat the problem as a classification problem with n possible outcomes, but instead compute a single value for each sequence.
I tried modifying the code to fit my problem:
import numpy as np
import random
from random import shuffle
import tensorflow as tf
NUM_EXAMPLES = 10000
train_input = ['{0:020b}'.format(i) for i in range(2**20)]
shuffle(train_input)
train_input = [map(int,i) for i in train_input]
ti = []
for i in train_input:
    temp_list = []
    for j in i:
        temp_list.append([j])
    ti.append(np.array(temp_list))
train_input = ti
train_output = []
for i in train_input:
    count = 0
    for j in i:
        if j[0] == 1:
            count += 1
    #temp_list = ([0]*21)
    #temp_list[count] = 1
    #train_output.append(temp_list)
    train_output.append(count)
test_input = train_input[NUM_EXAMPLES:]
test_output = train_output[NUM_EXAMPLES:]
train_input = train_input[:NUM_EXAMPLES]
train_output = train_output[:NUM_EXAMPLES]
print "test and training data loaded"
target = tf.placeholder(tf.float32, [None, 1])
data = tf.placeholder(tf.float32, [None, 20,1]) #Number of examples, number of input, dimension of each input
#target = tf.placeholder(tf.float32, [None, 1])
#print('target shape: ', target.get_shape())
#print('shape[0]', target.get_shape()[1])
#print('int(shape) ', int(target.get_shape()[1]))
num_hidden = 24
cell = tf.nn.rnn_cell.LSTMCell(num_hidden)
val, _ = tf.nn.dynamic_rnn(cell, data, dtype=tf.float32)
val = tf.transpose(val, [1, 0, 2])
print('val shape, ', val.get_shape())
last = tf.gather(val, int(val.get_shape()[0]) - 1)
weight = tf.Variable(tf.truncated_normal([num_hidden, int(target.get_shape()[1])]))
bias = tf.Variable(tf.constant(0.1, shape=[target.get_shape()[1]]))
#prediction = tf.nn.softmax(tf.matmul(last, weight) + bias)
prediction = tf.matmul(last, weight) + bias
cross_entropy = -tf.reduce_sum(target - prediction)
optimizer = tf.train.AdamOptimizer()
minimize = optimizer.minimize(cross_entropy)
mistakes = tf.not_equal(tf.argmax(target, 1), tf.argmax(prediction, 1))
error = tf.reduce_mean(tf.cast(mistakes, tf.float32))
init_op = tf.initialize_all_variables()
sess = tf.Session()
sess.run(init_op)
batch_size = 100
no_of_batches = int(len(train_input)) / batch_size
epoch = 500
for i in range(epoch):
    ptr = 0
    for j in range(no_of_batches):
        inp, out = train_input[ptr:ptr+batch_size], train_output[ptr:ptr+batch_size]
        ptr += batch_size
        sess.run(minimize, {data: inp, target: out})
    print "Epoch ", str(i)
incorrect = sess.run(error,{data: test_input, target: test_output})
#print sess.run(prediction,{data: [[[1],[0],[0],[1],[1],[0],[1],[1],[1],[0],[1],[0],[0],[1],[1],[0],[1],[1],[1],[0]]]})
#print('Epoch {:2d} error {:3.1f}%'.format(i + 1, 100 * incorrect))
sess.close()
It is still a work in progress, since the input is bogus, as is the cross-entropy calculation.
However, my main problem is that the code doesn't run at all.
I get this error:
ValueError: Cannot feed value of shape (100,) for Tensor
u'Placeholder:0', which has shape '(?, 1)'
The number 100 comes from the batch_size, and the (?, 1) comes from the fact that my prediction is a one-dimensional number. However, I have no idea where the problem is in my code.
Can anyone help me get the dimensions to match?

This error means your target placeholder is being fed a flat batch of scalars with shape (100,), while it expects shape (batch_size, 1). To fix it, I think you should reshape your targets, something like test_output.reshape([-1, 1]).
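For example, a minimal sketch (note that the targets in the question are plain Python lists, so np.reshape is safer than calling the .reshape method on them):

import numpy as np

out = [3, 7, 0, 12]             # scalar targets, shape (4,)
out = np.reshape(out, (-1, 1))  # shape (4, 1) -- matches the (?, 1) placeholder
print(out.shape)                # (4, 1)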

To fix the placeholder's shape, change your code to:
for i in range(epoch):
    ptr = 0
    for j in range(no_of_batches):
        inp = train_input[ptr:ptr+batch_size]
        out = train_output[ptr:ptr+batch_size]
        ptr += batch_size
        out = np.reshape(out, (100, 1))  # reshape
        sess.run(minimize, {data: inp, target: out})
    print("Epoch ", str(i))
test_output = np.reshape(test_output, (1038576, 1))  # reshape
incorrect = sess.run(error, {data: test_input, target: test_output})
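Since the question notes the cross-entropy expression is bogus: for predicting a single continuous value, a squared-error loss is the usual choice. A minimal sketch (an assumption on my part, not from either answer), replacing the cross_entropy line in the question:

# Mean squared error for a scalar regression target; the original
# -tf.reduce_sum(target - prediction) is not a valid loss function.
loss = tf.reduce_mean(tf.square(target - prediction))
minimize = tf.train.AdamOptimizer().minimize(loss)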

Related

Time-Series Transformer Model trains well but performs worse on Test Data

I have created a transformer model for multivariate time series predictions (many-to-one classification model).
Details about the Dataset
I have hourly data, i.e., 8 different features (hour, month, temperature, humidity, wind speed, solar radiation concentration, etc.), and with them I am trying to predict a time sequence (the energy consumption of a building). So my input has the shape X.shape = (8783, 168, 8), i.e., 8783 time sequences, each sequence containing 168 hourly entries/vectors and each vector containing 8 features. My output has the shape Y.shape = (8783, 1), i.e., 8783 sequences, each containing 1 output value (the building's energy consumption for the following hour).
Model Details
I took as a model an example from the official Keras site. It was created for classification problems; I modified it for my regression problem by changing the activation of the last output layer from sigmoid to relu.
Input shape (train_f) = (8783, 168, 8)
Output shape (train_P) = (8783,1)
When I train the model for 100 epochs, it converges in fewer epochs than my reference models (LSTMs and LSTMs with self-attention). But after training, when the model makes predictions on the test data, its performance is worse than the reference models'.
I would be grateful if you could have a look at the code and suggest potential steps to improve the prediction/test accuracy.
Here is the code;
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
import seaborn as sb
import missingno as msna  # assumption: msna refers to the missingno package
import tensorflow as tf
from tensorflow import keras
from tensorflow.keras import layers
from math import sqrt
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import MinMaxScaler
from sklearn.metrics import mean_squared_error, r2_score, mean_absolute_error

df_weather = pd.read_excel(r"Downloads\WeatherData.xlsx")
df_energy = pd.read_excel(r"Downloads\Building_energy_consumption_record.xlsx")
visa = pd.concat([df_weather, df_energy], axis=1)
df_data = visa.loc[:, ~visa.columns.isin(["Time1", "TD", "U", "DR", "FX"])]
msna.bar(df_data)
plt.figure(figsize=(16, 6))
sb.heatmap(df_data.corr(), annot=True, linewidths=1, fmt=".2g", cmap='coolwarm')
plt.xticks(rotation='horizontal')  # orientation of the axis titles
extract_for_normalization = list(df_data)[1:9]
df_data_float = df_data[extract_for_normalization].astype(float)
train_X, test_X = train_test_split(df_data_float, train_size=0.7, shuffle=False)
scaler = MinMaxScaler(feature_range=(0, 1))
scaled_train_X = scaler.fit_transform(train_X)

# Converting train_X into the required shape (inputs, sequences, features)
train_f = []  # feature inputs from the training data
train_p = []  # prediction values
#test_q = []
#test_r = []
n_future = 1  # number of steps we want to predict into the future
n_past = 168  # number of past time steps used as input for training
for val in range(n_past, len(scaled_train_X) - n_future + 1):
    train_f.append(scaled_train_X[val - n_past:val, 0:scaled_train_X.shape[1]])
    train_p.append(scaled_train_X[val + n_future - 1:val + n_future, -1])
train_f, train_p = np.array(train_f), np.array(train_p)
# Transformer model
def transformer_encoder(inputs, head_size, num_heads, ff_dim, dropout=0):
    # Normalization and attention
    x = layers.LayerNormalization(epsilon=1e-6)(inputs)
    x = layers.MultiHeadAttention(
        key_dim=head_size, num_heads=num_heads, dropout=dropout
    )(x, x)
    x = layers.Dropout(dropout)(x)
    res = x + inputs
    # Feed-forward part
    x = layers.LayerNormalization(epsilon=1e-6)(res)
    x = layers.Conv1D(filters=ff_dim, kernel_size=1, activation="relu")(x)
    x = layers.Dropout(dropout)(x)
    x = layers.Conv1D(filters=inputs.shape[-1], kernel_size=1)(x)
    return x + res

def build_model(
    input_shape,
    head_size,
    num_heads,
    ff_dim,
    num_transformer_blocks,
    mlp_units,
    dropout=0,
    mlp_dropout=0,
):
    inputs = keras.Input(shape=input_shape)
    x = inputs
    for _ in range(num_transformer_blocks):
        x = transformer_encoder(x, head_size, num_heads, ff_dim, dropout)
    x = layers.GlobalAveragePooling1D(data_format="channels_first")(x)
    for dim in mlp_units:
        x = layers.Dense(dim, activation="relu")(x)
        x = layers.Dropout(mlp_dropout)(x)
    outputs = layers.Dense(train_p.shape[1])(x)
    return keras.Model(inputs, outputs)
input_shape = (train_f.shape[1], train_f.shape[2])
model = build_model(
    input_shape,
    head_size=256,
    num_heads=4,
    ff_dim=4,
    num_transformer_blocks=4,
    mlp_units=[128],
    mlp_dropout=0.4,
    dropout=0.25,
)
model.compile(loss=tf.keras.losses.mean_absolute_error,
              optimizer=tf.keras.optimizers.Adam(learning_rate=0.0001),
              metrics=["mse"])
model.summary()
history = model.fit(train_f, train_p, epochs=100, batch_size=32, validation_split=0.15, verbose=1)
trainYPredict = model.predict(train_f)

# Inverse-transform the predictions and keep the last value (the output)
trainYPredict1 = np.repeat(trainYPredict, scaled_train_X.shape[1], axis=-1)
trainYPredict_actual = scaler.inverse_transform(trainYPredict1)[:, -1]
train_p_actual = np.repeat(train_p, scaled_train_X.shape[1], axis=-1)
train_p_actual1 = scaler.inverse_transform(train_p_actual)[:, -1]
Prediction_mse = mean_squared_error(train_p_actual1, trainYPredict_actual)
print("Mean Squared Error of prediction is:", str(Prediction_mse))
Prediction_rmse = sqrt(Prediction_mse)
print("Root Mean Squared Error of prediction is:", str(Prediction_rmse))
prediction_r2 = r2_score(train_p_actual1, trainYPredict_actual)
print("R2 score of predictions is:", str(prediction_r2))
prediction_mae = mean_absolute_error(train_p_actual1, trainYPredict_actual)
print("Mean absolute error of prediction is:", prediction_mae)

# Testing the model
scaled_test_X = scaler.transform(test_X)
test_q = []
test_r = []
for val in range(n_past, len(scaled_test_X) - n_future + 1):
    test_q.append(scaled_test_X[val - n_past:val, 0:scaled_test_X.shape[1]])
    test_r.append(scaled_test_X[val + n_future - 1:val + n_future, -1])
test_q, test_r = np.array(test_q), np.array(test_r)
testPredict = model.predict(test_q)
The training and validation loss curves are attached as an image: [Figure: Training and validation loss]
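One hedged note on the output head: the build_model above already leaves the final Dense layer linear (the Keras default), which is the usual choice for regression. If the relu mentioned in the text is actually in place, it clamps negative predictions to zero; an explicitly linear head would be:

# Explicit linear regression head; a relu here would clip negative outputs to 0.
outputs = layers.Dense(train_p.shape[1], activation="linear")(x)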

Chamfer distance Raises Error in a Keras autoencoder model

I am currently designing an autoencoder using the PointNet algorithm, and I am facing a problem, apparently in my Chamfer-distance loss function.
When I try to run the training process, the following error is raised.
I would really appreciate your help.
ValueError: Expected list with element dtype float but got list with element dtype double for '{{node gradient_tape/chamfer_distance_tf/map/while/gradients/chamfer_distance_tf/map/while/TensorArrayV2Write/TensorListSetItem_grad/TensorListGetItem}} = TensorListGetItem[element_dtype=DT_FLOAT](gradient_tape/chamfer_distance_tf/map/while/gradients/grad_ys_0, gradient_tape/chamfer_distance_tf/map/while/gradients/chamfer_distance_tf/map/while/TensorArrayV2Write/TensorListSetItem_grad/TensorListSetItem/TensorListPopBack:1, gradient_tape/chamfer_distance_tf/map/while/gradients/chamfer_distance_tf/map/while/TensorArrayV2Write/TensorListSetItem_grad/Shape)' with input shapes: [], [], [0].
ERROR:tensorflow:Error: Input value Tensor("chamfer_distance_tf/map/while/add:0", shape=(), dtype=float32) has dtype <dtype: 'float32'>, but expected dtype <dtype: 'float64'>. This leads to undefined behavior and will be an error in future versions of TensorFlow.
Here is my loss function:
def distance_matrix(array1, array2):
    num_point, num_features = array1.shape
    expanded_array1 = tf.tile(array1, (num_point, 1))
    expanded_array2 = tf.reshape(
        tf.tile(tf.expand_dims(array2, 1),
                (1, num_point, 1)),
        (-1, num_features))
    distances = tf.norm(expanded_array1 - expanded_array2, axis=1)
    distances = tf.reshape(distances, (num_point, num_point))
    return distances

def av_dist(array1, array2):
    distances = distance_matrix(array1, array2)
    distances = tf.reduce_min(distances, axis=1)
    distances = tf.reduce_mean(distances)
    return distances

def av_dist_sum(arrays):
    array1, array2 = arrays
    av_dist1 = av_dist(array1, array2)
    av_dist2 = av_dist(array2, array1)
    return av_dist1 + av_dist2

def chamfer_distance_tf(array1, array2):
    batch_size, num_point, num_features = array1.shape
    dist = tf.reduce_mean(
        tf.map_fn(av_dist_sum, elems=(array1, array2), dtype=tf.float64), axis=-1)
    return dist

print(chamfer_distance_tf(X, X))

model.compile(
    loss=chamfer_distance_tf,
    optimizer=keras.optimizers.Adam(learning_rate=0.001),
    metrics=["sparse_categorical_accuracy"],)
model.fit(train_data, train_data, epochs=5, batch_size=10)
and Here is the first part of the code:
list_pcl = []
for filename in os.listdir(directory):
    if filename.endswith(".ply"):
        directory_file = os.path.join(directory, filename)
        print(directory_file)
        pcl = PlyData.read(directory_file)
        data = pcl.elements[0].data
        data = np.asarray(data.tolist())
        data.resize(1024, 3)
        print(type(data))
        print(data.shape)
        list_pcl.append(data)
print(len(list_pcl))
#X = np.asarray(list_pcl[0:73999])
#X_val = np.asarray(list_pcl[74000:74299])
#X_test = np.asarray(list_pcl[74300:74329])
X = np.asarray(list_pcl[0:200])
X_val = np.asarray(list_pcl[200:220])
X_test = np.asarray(list_pcl[220:228])
random.shuffle(X)
random.shuffle(X_val)
random.shuffle(X_test)
"""**Reshaping the dataset**
The neural network is unable to treat data with different input size, that's why we apply a zero padding to all the data to reach the size of the point cloud data with the biggest number of raws.
We additioally reshape the outcome by adding one dimension corresponidng to the number of channels to the tesors.
"""
train_num = X.shape[0]
val_num = X_val.shape[0]
test_num = X_test.shape[0]
points_num = X.shape[1]
features_num = X.shape[2]
train_data = X.reshape([-1, points_num, features_num]).astype(float)
val_data = X_val.reshape([-1, points_num, features_num]).astype(float)
test_data = X_test.reshape([-1, points_num, features_num]).astype(float)
print(train_data.shape)
print(val_data.shape)
print(test_data.shape)
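One hedged reading of the error: it says float32 was produced where float64 was expected, which points at the dtype=tf.float64 declared in tf.map_fn while Keras feeds the loss float32 tensors. A minimal sketch of a fix, aligning everything on float32 (an assumption, not a confirmed solution):

def chamfer_distance_tf(array1, array2):
    # Declare the map_fn output as float32 to match the float32 tensors
    # Keras passes into the loss during training.
    batch_size, num_point, num_features = array1.shape
    dist = tf.reduce_mean(
        tf.map_fn(av_dist_sum, elems=(array1, array2), dtype=tf.float32),
        axis=-1)
    return dist

# ...and cast the inputs so eager checks like print(chamfer_distance_tf(X, X))
# also see float32 data:
train_data = train_data.astype(np.float32)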

I don't understand the code related to RNN

from __future__ import print_function
import tensorflow as tf
import numpy as np
from tensorflow.contrib import rnn
tf.set_random_seed(777) # reproducibility
sentence = ("if you want to build a ship, don't drum up people together to "
"collect wood and don't assign them tasks and work, but rather "
"teach them to long for the endless immensity of the sea.")
char_set = list(set(sentence))
char_dic = {w: i for i, w in enumerate(char_set)}
data_dim = len(char_set)
hidden_size = len(char_set)
num_classes = len(char_set)
sequence_length = 10 # Any arbitrary number
learning_rate = 0.1
dataX = []
dataY = []
for i in range(0, len(sentence) - sequence_length):
    x_str = sentence[i:i + sequence_length]
    y_str = sentence[i + 1: i + sequence_length + 1]
    print(i, x_str, '->', y_str)
    x = [char_dic[c] for c in x_str]  # x str to index
    y = [char_dic[c] for c in y_str]  # y str to index
    dataX.append(x)
    dataY.append(y)
batch_size = len(dataX)
X = tf.placeholder(tf.int32, [None, sequence_length])
Y = tf.placeholder(tf.int32, [None, sequence_length])
# One-hot encoding
X_one_hot = tf.one_hot(X, num_classes)
print(X_one_hot) # check out the shape
def lstm_cell():
    cell = rnn.BasicLSTMCell(hidden_size, state_is_tuple=True)
    return cell
multi_cells = rnn.MultiRNNCell([lstm_cell() for _ in range(2)], state_is_tuple=True)
# outputs: unfolding size x hidden size, state = hidden size
outputs, _states = tf.nn.dynamic_rnn(multi_cells, X_one_hot, dtype=tf.float32)
# FC layer
X_for_fc = tf.reshape(outputs, [-1, hidden_size])
outputs = tf.contrib.layers.fully_connected(X_for_fc, num_classes, activation_fn=None)
# reshape out for sequence_loss
outputs = tf.reshape(outputs, [batch_size, sequence_length, num_classes])
# All weights are 1 (equal weights)
weights = tf.ones([batch_size, sequence_length])
sequence_loss = tf.contrib.seq2seq.sequence_loss(
    logits=outputs, targets=Y, weights=weights)
mean_loss = tf.reduce_mean(sequence_loss)
train_op = tf.train.AdamOptimizer(learning_rate=learning_rate).minimize(mean_loss)
sess = tf.Session()
sess.run(tf.global_variables_initializer())
for i in range(500):
    _, l, results = sess.run(
        [train_op, mean_loss, outputs], feed_dict={X: dataX, Y: dataY})
    for j, result in enumerate(results):
        index = np.argmax(result, axis=1)
        print(i, j, ''.join([char_set[t] for t in index]), l)

# Let's print the last char of each result to check it works
results = sess.run(outputs, feed_dict={X: dataX})
for j, result in enumerate(results):
    index = np.argmax(result, axis=1)
    if j == 0:  # print all for the first result to make a sentence
        print(''.join([char_set[t] for t in index]), end='')
    else:
        print(char_set[index[-1]], end='')
'''
0 167 tttttttttt 3.23111
0 168 tttttttttt 3.23111
0 169 tttttttttt 3.23111
…
499 167 of the se 0.229616
499 168 tf the sea 0.229616
499 169 the sea. 0.229616
g you want to build a ship, don't drum up people together to collect wood and don't assign them tasks and work, but rather teach them to long for the endless immensity of the sea.
'''
(Please understand that English is not my native language.)
I don't understand the last if/else part of the code above; can anyone explain it?
Why does it print(''.join([char_set[t] for t in index]), end='') only when j is 0,
and in the else case, why does it print(char_set[index[-1]], end='')?
Please explain how this code works.
That last bit is just checking whether the network works. It generates the results first and then iterates through them. I guess the creator of this code snippet wanted to print the whole sentence for the first result, and then only the last character of each of the remaining results. That is entirely up to you; change it if you want.
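To make the window logic concrete, here is a small standalone sketch (with hypothetical strings, not taken from the post): consecutive results are windows of the sentence shifted by one character, so after the first window is printed in full, each later window contributes only its final character.

# Hypothetical predicted windows, each shifted one character to the right:
results = ["if you wan", "f you want", " you want "]
for j, result in enumerate(results):
    if j == 0:
        print(result, end='')      # first window: print all of it
    else:
        print(result[-1], end='')  # later windows: only the new last char
# Output: "if you want " -- overlapping characters are printed only once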

Understanding model loss/accuracy and how not to leak information

This question is related to the starting one posted here.
The problem is to classify rows such that the classification of row i can rely on the data from all previous rows, including their class membership. The linked post contains an answer, which is posted below.
For the sake of experimentation, I've used a set of randomly crafted data, where the classifying property is a uniform random 0/1 variable.
What strikes me is that the loss of the model in the above example is really low and the accuracy is 99%, whereas I would expect something in the 50% range.
So I am assuming that the way the model's classification is evaluated is leaking information somehow.
Does anybody happen to see what's the issue? What would be the proper way to evaluate the accuracy in such scenario?
import tensorflow as tf
import pandas as pd
import numpy as np
from sklearn.preprocessing import MinMaxScaler, OneHotEncoder
from random import randint
SIZE = 100
df = pd.DataFrame({'Temperature': list(range(SIZE)),
                   'Weight': [randint(1, 100) for _ in range(SIZE)],
                   'Size': [randint(1, 10000) for _ in range(SIZE)],
                   'Property': [randint(0, 1) for _ in range(SIZE)]})
df.Property = df.Property.shift(-1)
print(df.head())
# parameters
time_steps = 1
inputs = 3
outputs = 2
df = df.iloc[:-1,:]
df = df.values
train_X = df[:, :-1]
train_y = df[:, -1]
scaler = MinMaxScaler(feature_range=(0, 1))
train_X = scaler.fit_transform(train_X)
train_X = train_X[:,None,:]
onehot_encoder = OneHotEncoder()
encode_categorical = train_y.reshape(len(train_y), 1)
train_y = onehot_encoder.fit_transform(encode_categorical).toarray()
learning_rate = 0.001
epochs = 50000
batch_size = int(train_X.shape[0]/2)
length = train_X.shape[0]
display = 100
neurons = 100
tf.reset_default_graph()
X = tf.placeholder(tf.float32, [None, time_steps, inputs])
y = tf.placeholder(tf.float32, [None, outputs])
cell = tf.contrib.rnn.BasicLSTMCell(num_units=neurons, activation=tf.nn.relu)
cell_outputs, states = tf.nn.dynamic_rnn(cell, X, dtype=tf.float32)
stacked_outputs = tf.reshape(cell_outputs, [-1, neurons])
out = tf.layers.dense(inputs=stacked_outputs, units=outputs)
loss = tf.reduce_mean(tf.nn.softmax_cross_entropy_with_logits_v2(
    labels=y, logits=out))
optimizer = tf.train.AdamOptimizer(learning_rate=learning_rate)
training_op = optimizer.minimize(loss)
accuracy = tf.metrics.accuracy(labels=tf.argmax(y, 1),
                               predictions=tf.argmax(out, 1),
                               name="accuracy")
precision = tf.metrics.precision(labels=tf.argmax(y, 1),
                                 predictions=tf.argmax(out, 1),
                                 name="precision")
recall = tf.metrics.recall(labels=tf.argmax(y, 1),
                           predictions=tf.argmax(out, 1),
                           name="recall")
f1 = 2 * precision[1] * recall[1] / (precision[1] + recall[1])
with tf.Session() as sess:
    tf.global_variables_initializer().run()
    tf.local_variables_initializer().run()
    for steps in range(epochs):
        mini_batch = zip(range(0, length, batch_size),
                         range(batch_size, length + 1, batch_size))
        for (start, end) in mini_batch:
            sess.run(training_op, feed_dict={X: train_X[start:end, :, :],
                                             y: train_y[start:end, :]})
        if (steps + 1) % display == 0:
            loss_fn = loss.eval(feed_dict={X: train_X, y: train_y})
            print('Step: {} \tTraining loss: {}'.format((steps + 1), loss_fn))
    acc, prec, recall, f1 = sess.run([accuracy, precision, recall, f1],
                                     feed_dict={X: train_X, y: train_y})
    print('\nEvaluation on training set')
    print('Accuracy:', acc[1])
    print('Precision:', prec[1])
    print('Recall:', recall[1])
    print('F1 score:', f1)
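A hedged observation on the suspected leak: the metrics above are computed with feed_dict={X: train_X, y: train_y}, i.e., on the very rows the model was trained on, so near-perfect accuracy on random labels reflects memorization of the training set rather than genuine predictive power. A minimal holdout evaluation would look like this (a sketch with a hypothetical 80/20 split; variable names follow the question's code):

# Hold out the last 20% of rows; train only on the first 80%.
split = int(0.8 * train_X.shape[0])
fit_X, fit_y = train_X[:split], train_y[:split]
hold_X, hold_y = train_X[split:], train_y[split:]
# ...run training_op only on fit_X/fit_y, then reset the streaming metric
# counters (tf.metrics accumulates across sess.run calls) and evaluate:
sess.run(tf.local_variables_initializer())
holdout_acc = sess.run(accuracy[1], feed_dict={X: hold_X, y: hold_y})
print('Holdout accuracy:', holdout_acc)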

LSTM model error is percent of one output class

I'm having a rough time trying to figure out what's wrong with my LSTM model. I have 11 inputs and 2 output classes (one-hot encoded), and very quickly, within a batch or so, the error just settles at the base rate of one of the output classes and stays there.
I tried printing the weights and biases, but they all seem to be full of NaNs.
If I decrease the learning rate, or mess around with the layers/units, I can get it to arrive at the one-class error rate more slowly, but it always seems to end up there.
Here's the code:
num_units = 30
num_layers = 50
dropout_rate = 0.80
learning_rate=0.0001
batch_size = 180
epoch = 1
input_classes = len(train_input[0])
output_classes = len(train_output[0])
data = tf.placeholder(tf.float32, [None, input_classes, 1]) #Number of examples, number of input, dimension of each input
target = tf.placeholder(tf.float32, [None, output_classes]) #one-hot encoded: [1,0] = bad, [0,1] = good
dropout = tf.placeholder(tf.float32)
cell = tf.contrib.rnn.LSTMCell(num_units, state_is_tuple=True)
cell = tf.contrib.rnn.DropoutWrapper(cell, output_keep_prob=dropout)
cell = tf.contrib.rnn.MultiRNNCell([cell] * num_layers, state_is_tuple=True)
#Input shape [batch_size, max_time, depth], output shape: [batch_size, max_time, cell.output_size]
val, _ = tf.nn.dynamic_rnn(cell, data, dtype=tf.float32)
val = tf.transpose(val, [1, 0, 2]) #reshapes it to [sequence_size, batch_size, depth]
#get last entry as it includes previous results
last = tf.gather(val, int(val.get_shape()[0]) - 1)
weight = tf.get_variable("W", shape=[num_units, output_classes], initializer=tf.contrib.layers.xavier_initializer())
bias = tf.get_variable("B", shape=[output_classes], initializer=tf.contrib.layers.xavier_initializer())
logits = tf.matmul(last, weight) + bias
prediction = tf.nn.softmax_cross_entropy_with_logits(logits=logits, labels=target)
prediction = tf.clip_by_value(prediction, 1e-10,100.0)
cost = tf.reduce_mean(prediction)
optimizer = tf.train.AdamOptimizer(learning_rate=learning_rate)
minimize = optimizer.minimize(cost)
mistakes = tf.not_equal(tf.argmax(target, 1), tf.argmax(logits, 1))
error = tf.reduce_mean(tf.cast(mistakes, tf.float32))
init_op = tf.global_variables_initializer()
saver = tf.train.Saver()
sess = tf.Session()
sess.run(init_op)
no_of_batches = int((len(train_input)) / batch_size)
for i in range(epoch):
    ptr = 0
    for j in range(no_of_batches):
        inp, out = train_input[ptr:ptr+batch_size], train_output[ptr:ptr+batch_size]
        ptr += batch_size
        sess.run(minimize, {data: inp, target: out, dropout: dropout_rate})
sess.close()
Since your targets are one-hot encoded, you could convert them to integer class indices and use sparse_softmax_cross_entropy_with_logits (which expects class indices rather than one-hot vectors) instead of tf.nn.softmax_cross_entropy_with_logits.
Refer to this Stack Overflow answer to understand the difference between the two functions.
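A short sketch of that change (hedged; target and logits are the tensors defined in the question's code):

# Convert one-hot targets back to integer class indices, then use the
# sparse variant of the softmax cross-entropy loss.
labels = tf.argmax(target, 1)
cross_entropy = tf.nn.sparse_softmax_cross_entropy_with_logits(
    logits=logits, labels=labels)
cost = tf.reduce_mean(cross_entropy)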