I've been walking through some TensorFlow tutorials and am cobbling together a pet experiment. However, I am running into some dimension errors and I can't seem to figure them out.
My goal: I have an input matrix of shape 1xN and a training set of dimension 10xN (1 and 10 were chosen arbitrarily). N is the number of samples in the training set: each sample maps one input value to one vector of outputs. You can think of this as 1 input neuron and m output neurons. The training set is a set of these single values, each mapped to a 1-D vector. I wish to train the network by running these mapped inputs and outputs through it and reducing the error.
The simple algorithm that I am trying to accomplish:
For each value in the input vector
Load the input neuron with that value
Feed forward
Evaluate against the corresponding vector
Repeat to minimize error.
However, I seem to be getting mixed up with how to format the data to feed to the network. I have one placeholder for 1 input neuron and one for n output neurons. I want to follow the above algorithm, but I am not sure if I am doing it right:
# Data parameters
num_frames = 10
stimuli_value_low = .00001
stimuli_value_high = 100
pixel_value_low = .00001
pixel_value_high = 256.0
stimuli_dimension = 1
frame_dimension = 10
stimuli = np.random.uniform(stimuli_value_low, stimuli_value_high, (stimuli_dimension, num_frames))
frames = np.random.uniform(pixel_value_low, pixel_value_high, (frame_dimension, num_frames))
# Parameters
learning_rate = 0.01
training_iterations = 1000
display_iteration = 10
# Network Parameters
n_hidden_1 = 100
n_hidden_2 = 100
num_input_neurons = stimuli_dimension
num_output_neurons = frame_dimension
# Create placeholders
input_placeholder = tf.placeholder("float", [None, num_input_neurons])
output_placeholder = tf.placeholder("float", [None, num_output_neurons])
# Store layers weight & bias
weights = {
'h1': tf.Variable(tf.random_normal([num_input_neurons, n_hidden_1])),
'h2': tf.Variable(tf.random_normal([n_hidden_1, n_hidden_2])),
'out': tf.Variable(tf.random_normal([n_hidden_2, num_output_neurons]))
}
biases = {
'b1': tf.Variable(tf.random_normal([n_hidden_1])),
'b2': tf.Variable(tf.random_normal([n_hidden_2])),
'out': tf.Variable(tf.random_normal([num_output_neurons]))
}
# Create model
def neural_net(input_placeholder):
    # Hidden fully connected layer
    layer_1 = tf.add(tf.matmul(input_placeholder, weights['h1']), biases['b1'])
    # Hidden fully connected layer
    layer_2 = tf.add(tf.matmul(layer_1, weights['h2']), biases['b2'])
    # Output fully connected layer with a neuron for each pixel
    out_layer = tf.matmul(layer_2, weights['out']) + biases['out']
    return out_layer
# Construct model
logits = neural_net(input_placeholder)
# Define loss operation and optimizer
loss_operation = tf.reduce_mean(tf.nn.softmax_cross_entropy_with_logits(logits = logits, labels = output_placeholder))
optimizer = tf.train.AdamOptimizer(learning_rate = learning_rate)
train_operation = optimizer.minimize(loss_operation)
# Evaluate model (with test logits, for dropout to be disabled)
correct_pred = tf.equal(tf.argmax(logits, 1), tf.argmax(output_placeholder, 1))
accuracy_operation = tf.reduce_mean(tf.cast(correct_pred, tf.float32))
# Initialize the variables (i.e. assign their default value)
init = tf.global_variables_initializer()
# Start Training
with tf.Session() as sess:
    # Run the initializer
    sess.run(init)
    for step in range(1, training_iterations + 1):
        sess.run(train_operation, feed_dict = {input_placeholder: stimuli, output_placeholder: frames})
        if step % display_iteration == 0 or step == 1:
            loss, accuracy = sess.run([loss_operation, accuracy_operation], feed_dict = {input_placeholder: stimuli, output_placeholder: frames})
            print("Step " + str(step) +
                  ", Loss = " + "{:.4f}".format(loss) +
                  ", Training Accuracy= " + \
                  "{:.3f}".format(accuracy))
    print("Optimization finished!")
I think it is something to do with how I am structuring my data or feeding it to the run function.
Here is the error I am getting:
ValueError Traceback (most recent call last)
<ipython-input-420-7517598734d6> in <module>()
6 for step in range(1, training_iterations + 1):
7
----> 8 sess.run(train_operation, feed_dict = {X: stimuli, Y: frames})
9
10 if iteration % display_iteration == 0 or iteration == 1:
1 frames
/usr/local/lib/python3.6/dist-packages/tensorflow/python/client/session.py in _run(self, handle, fetches, feed_dict, options, run_metadata)
1147 'which has shape %r' %
1148 (np_val.shape, subfeed_t.name,
-> 1149 str(subfeed_t.get_shape())))
1150 if not self.graph.is_feedable(subfeed_t):
1151 raise ValueError('Tensor %s may not be fed.' % subfeed_t)
ValueError: Cannot feed value of shape (1, 10) for Tensor 'Placeholder_6:0', which has shape '(?, 1)'
How can I ensure I am formatting my input data correctly and forming my network correspondingly?
Turns out I had the dimensions of the arrays I was generating backwards:
stimuli = np.random.uniform(stimuli_value_low, stimuli_value_high, (stimuli_dimension, num_frames))
frames = np.random.uniform(pixel_value_low, pixel_value_high, (frame_dimension, num_frames))
should be:
stimuli = np.random.uniform(stimuli_value_low, stimuli_value_high, (num_frames, stimuli_dimension))
frames = np.random.uniform(pixel_value_low, pixel_value_high, (num_frames, frame_dimension))
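A quick way to catch this kind of mismatch before calling sess.run is to compare each array's shape with the shape the placeholder expects, where None matches any batch size. The sketch below uses only NumPy; the helper name matches_placeholder is made up for illustration and is not part of TensorFlow.

```python
import numpy as np

num_frames = 10
stimuli_dimension = 1
frame_dimension = 10

def matches_placeholder(array, placeholder_shape):
    """Return True if array.shape fits a shape like [None, 1] (None = any size)."""
    if array.ndim != len(placeholder_shape):
        return False
    return all(expected is None or expected == actual
               for expected, actual in zip(placeholder_shape, array.shape))

# Original (transposed) layout: features first, samples second.
bad_stimuli = np.random.uniform(0.00001, 100, (stimuli_dimension, num_frames))
# Corrected layout: one row per sample, one column per feature.
stimuli = np.random.uniform(0.00001, 100, (num_frames, stimuli_dimension))
frames = np.random.uniform(0.00001, 256.0, (num_frames, frame_dimension))

print(matches_placeholder(bad_stimuli, [None, stimuli_dimension]))  # False
print(matches_placeholder(stimuli, [None, stimuli_dimension]))      # True
print(matches_placeholder(frames, [None, frame_dimension]))         # True
```

Note that frames is 10x10 in either layout because frame_dimension and num_frames are both 10, so only the stimuli array triggered the error. Picking different values for the two would expose a transposed frames array as well.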
I am trying to train a dataset of 10,000+ images using Tensorflow GPU (GTX 1060 Max-Q 6GB). Because the size of the image in my dataset is huge (512 x 424 pixels) I get a MemoryError.
Traceback (most recent call last):
File "train.py", line 33, in <module>
data = dataset.read_train_sets(train_path, img_size, classes, validation_size=validation_size)
File "/home/nabeel/tf-realsense-gesture/dataset.py", line 103, in read_train_sets
images, labels, img_names, cls = shuffle(images, labels, img_names, cls)
File "/home/nabeel/anaconda3/envs/tensorflow/lib/python2.7/site-packages/sklearn/utils/__init__.py", line 403, in shuffle
return resample(*arrays, **options)
File "/home/nabeel/anaconda3/envs/tensorflow/lib/python2.7/site-packages/sklearn/utils/__init__.py", line 327, in resample
resampled_arrays = [safe_indexing(a, indices) for a in arrays]
File "/home/nabeel/anaconda3/envs/tensorflow/lib/python2.7/site-packages/sklearn/utils/__init__.py", line 216, in safe_indexing
return X.take(indices, axis=0)
MemoryError
The problem with my code is that I am training on all seven classes at the same time, which is why I get a memory error. I want to process a single class at a time.
I have tried to implement a while/for loop inside, but every time a loop finishes, the .meta file is overwritten and the model only works on one class. Is there any way to train multiple classes at the same time, or one by one?
train.py
batch_size = 1
# 7 classes for recognition
#classes = ['up']
classes = ['up','down','left','right','forward','backward','none']
#classes = ['up','down','left','right','forward','backward','none']
num_classes = len(classes)
# 20% of the data will automatically be used for validation
validation_size = 0.2
img_size = 200
num_channels = 3
train_path='training_data'
# load all the training and validation images and labels into memory
data = dataset.read_train_sets(train_path, img_size, classes, validation_size=validation_size)
print("Complete reading input data. Will Now print a snippet of it")
print("Number of files in Training-set:\t\t{}".format(len(data.train.labels)))
print("Number of files in Validation-set:\t{}".format(len(data.valid.labels)))
session = tf.Session()
x = tf.placeholder(tf.float32, shape=[batch_size,img_size,img_size,num_channels], name='x')
# labels
y_true = tf.placeholder(tf.float32, shape=[None, num_classes], name='y_true')
y_true_cls = tf.argmax(y_true, dimension=1)
#Network graph params
filter_size_conv1 = 3
num_filters_conv1 = 32
filter_size_conv2 = 3
num_filters_conv2 = 32
filter_size_conv3 = 3
num_filters_conv3 = 64
filter_size_conv4 = 3
num_filters_conv4 = 128
filter_size_conv5 = 3
num_filters_conv5 = 256
filter_size_conv6 = 3
num_filters_conv6 = 512
filter_size_conv7 = 3
num_filters_conv7= 1024
fc_layer_size = 2048
def create_weights(shape):
    return tf.Variable(tf.truncated_normal(shape, stddev=0.05))
def create_biases(size):
    return tf.Variable(tf.constant(0.05, shape=[size]))
def create_convolutional_layer(input,num_input_channels,conv_filter_size,num_filters):
    # define the weights that will be trained
    weights = create_weights(shape=[conv_filter_size, conv_filter_size, num_input_channels, num_filters])
    # create biases
    biases = create_biases(num_filters)
    # Create convolutional layer
    layer = tf.nn.conv2d(input=input,filter=weights,strides=[1, 1, 1, 1],padding='SAME')
    layer += biases
    # max-pooling
    layer = tf.nn.max_pool(value=layer,
                           ksize=[1, 2, 2, 1],
                           strides=[1, 2, 2, 1],
                           padding='SAME')
    # Relu is the activation function
    layer = tf.nn.relu(layer)
    return layer
def create_flatten_layer(layer):
    layer_shape = layer.get_shape()
    num_features = layer_shape[1:4].num_elements()
    # Flatten the layer so reshape to num_features
    layer = tf.reshape(layer, [-1, num_features])
    return layer
def create_fc_layer(input,
                    num_inputs,
                    num_outputs,
                    use_relu=True):
    # define trainable weights and biases.
    weights = create_weights(shape=[num_inputs, num_outputs])
    biases = create_biases(num_outputs)
    # Fully connected layer
    layer = tf.matmul(input, weights) + biases
    if use_relu:
        layer = tf.nn.relu(layer)
    return layer
layer_conv1 = create_convolutional_layer(input=x,num_input_channels=num_channels,conv_filter_size=filter_size_conv1,
num_filters=num_filters_conv1)
layer_conv2 = create_convolutional_layer(input=layer_conv1,
num_input_channels=num_filters_conv1,
conv_filter_size=filter_size_conv2,
num_filters=num_filters_conv2)
layer_conv3= create_convolutional_layer(input=layer_conv2,
num_input_channels=num_filters_conv2,
conv_filter_size=filter_size_conv3,
num_filters=num_filters_conv3)
layer_conv4= create_convolutional_layer(input=layer_conv3,
num_input_channels=num_filters_conv3,
conv_filter_size=filter_size_conv4,
num_filters=num_filters_conv4)
layer_conv5= create_convolutional_layer(input=layer_conv4,
num_input_channels=num_filters_conv4,
conv_filter_size=filter_size_conv5,
num_filters=num_filters_conv5)
layer_conv6= create_convolutional_layer(input=layer_conv5,
num_input_channels=num_filters_conv5,
conv_filter_size=filter_size_conv6,
num_filters=num_filters_conv6)
layer_conv7= create_convolutional_layer(input=layer_conv6,
num_input_channels=num_filters_conv6,
conv_filter_size=filter_size_conv7,
num_filters=num_filters_conv7)
layer_flat = create_flatten_layer(layer_conv7)
layer_fc1 = create_fc_layer(input=layer_flat,num_inputs=layer_flat.get_shape()[1:4].num_elements(),num_outputs=fc_layer_size,
use_relu=True)
layer_fc2 = create_fc_layer(input=layer_fc1, num_inputs=fc_layer_size,num_outputs=num_classes, use_relu=False)
y_pred = tf.nn.softmax(layer_fc2,name='y_pred')
y_pred_cls = tf.argmax(y_pred, dimension=1)
session.run(tf.global_variables_initializer())
cross_entropy = tf.nn.softmax_cross_entropy_with_logits(logits=layer_fc2,labels=y_true)
cost = tf.reduce_mean(cross_entropy)
optimizer = tf.train.AdamOptimizer(learning_rate=1e-4).minimize(cost)
correct_prediction = tf.equal(y_pred_cls, y_true_cls)
accuracy = tf.reduce_mean(tf.cast(correct_prediction, tf.float32))
session.run(tf.global_variables_initializer())
def show_progress(epoch, feed_dict_train, feed_dict_validate, val_loss):
    acc = session.run(accuracy, feed_dict=feed_dict_train)
    val_acc = session.run(accuracy, feed_dict=feed_dict_validate)
    msg = "Training Epoch {0} --- Training Accuracy: {1:>6.1%}, Validation Accuracy: {2:>6.1%}, Validation Loss: {3:.3f}"
    print(msg.format(epoch + 1, acc, val_acc, val_loss))
total_iterations = 0
saver = tf.train.Saver()
def train(num_iteration):
    global total_iterations
    for i in range(total_iterations,total_iterations + num_iteration):
        x_batch, y_true_batch, _, cls_batch = data.train.next_batch(batch_size)
        x_valid_batch, y_valid_batch, _, valid_cls_batch = data.valid.next_batch(batch_size)
        feed_dict_tr = {x: x_batch,y_true: y_true_batch}
        feed_dict_val = {x: x_valid_batch,y_true: y_valid_batch}
        session.run(optimizer, feed_dict=feed_dict_tr)
        if i % int(data.train.num_examples/batch_size) == 0:
            val_loss = session.run(cost, feed_dict=feed_dict_val)
            epoch = int(i / int(data.train.num_examples/batch_size))
            show_progress(epoch, feed_dict_tr, feed_dict_val, val_loss)
            saver.save(session, '/home/nabeel/tf-realsense-gesture/')
    total_iterations += num_iteration
train(num_iteration=6000)
Since you are facing an out-of-memory issue in a CNN, you can try the steps below:
Increase the strides of the convolutional layer, i.e., instead of using Sh = 1 and Sw = 1, use Sh = 2 and Sw = 2. This reduces the spatial dimensions of the layer's output and hence the RAM consumption. Code for the same is shown below:
layer = tf.nn.conv2d(input=input,filter=weights,strides=[1, 2, 2, 1],padding='SAME')
Verify whether you really require 7 convolutional layers. You can try fewer convolutional layers (4, 5, or 6) and check the performance, because each convolutional layer and its filters add to the memory usage.
Replace tf.float32 with tf.float16 and check whether it works without any errors.
Use an Inception module instead of a plain convolutional layer.
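To put rough numbers on these suggestions, the sketch below estimates two things with plain arithmetic: the host memory needed to hold the whole dataset at once, and the size of the first convolutional layer's output at stride 1 versus stride 2. It assumes exactly 10,000 images (the question says 10,000+) and takes img_size = 200, num_channels = 3 and 32 first-layer filters from train.py. The helper names are made up for illustration.

```python
import math

def same_padding_output(size, stride):
    """Spatial output size of a conv or pool op with padding='SAME'."""
    return math.ceil(size / stride)

def dataset_bytes(num_images, height, width, channels, bytes_per_value=4):
    """Bytes needed to hold every image in one dense array."""
    return num_images * height * width * channels * bytes_per_value

def conv_output_bytes(conv_stride, img_size=200, num_filters=32,
                      batch=1, bytes_per_value=4):
    """Bytes of the first conv layer's output, before pooling."""
    side = same_padding_output(img_size, conv_stride)
    return batch * side * side * num_filters * bytes_per_value

full_f32 = dataset_bytes(10000, 200, 200, 3)       # float32 storage
full_f16 = dataset_bytes(10000, 200, 200, 3, 2)    # float16 storage

print("dataset, float32 (GB):", full_f32 / 1e9)
print("dataset, float16 (GB):", full_f16 / 1e9)
print("conv1 output bytes, stride 1:", conv_output_bytes(1))
print("conv1 output bytes, stride 2:", conv_output_bytes(2))
```

Under these assumptions the float32 dataset alone is about 4.8 GB. The traceback shows shuffle calling X.take, which returns a copy, so the peak host memory is roughly double that. This may be why the MemoryError is raised while loading the data rather than during training on the GPU. Doubling the stride cuts the first layer's output to a quarter.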
I have written tensorflow code based on:
http://www.wildml.com/2015/12/implementing-a-cnn-for-text-classification-in-tensorflow/
but using precomputed word embeddings from the GoogleNews word2vec 300 dimension model.
I created my own data from the UCML News Aggregator Dataset in which I parsed the content of the news articles and have created my own labels.
Due to the size of the articles I use TF-IDF to filter out the top 120 words per article and embed those into 300 dimensions.
When I run the CNN I created, it converges to a low overall accuracy of around 38%, regardless of the hyperparameters.
Hyper parameters changed:
Various filter sizes:
I've tried a single filter of 1,2,3
Combinations of filters [3,4,5], [1,3,4]
Learning Rate:
I've varied this from very low to very high. Very low values don't converge to 38%, but anything between 0.0001 and 0.4 does.
Batch Size:
Tried many ranges between 5 and 100.
Weight and Bias Initialization:
Set stddev of weights between 0.4 and 0.01.
Set bias initial values between 0 and 0.1.
Tried using the xavier initializer for the conv2d weights.
Dataset Size:
I have only tried two partial data sets, one with 15 000 training examples and the other with the 5000 test examples. In total I have 263 000 examples to train on. There is no accuracy difference whether the model is trained and evaluated on the 15 000 training examples or uses the 5000 test examples as the training data (to save testing time).
I've run successful classifications on the 15 000 / 5000 split using a feed forward network with a BoW input (93% accurate), TF-IDF with SVM (92%), and TF-IDF with Naive Bayes (91.5%). So I don't think it is the data.
What does this imply? Is the model just a poor model for this task? Is there an error in my work?
I feel like my do_eval function is incorrect to evaluate the accuracy / loss over an epoch of the data:
def do_eval(data_set,
            label_set,
            batch_size):
    """
    Runs one evaluation against the full epoch of data.
    data_set: The set of embeddings to eval
    label_set: the set of labels to eval
    """
    # And run one epoch of eval.
    true_count = 0  # Counts the number of correct predictions.
    steps_per_epoch = len(label_set) // batch_size
    num_examples = steps_per_epoch * batch_size
    totalLoss = 0
    # Need to compute eval accuracy
    for evalStep in xrange(steps_per_epoch):
        input_batch, label_batch = nextBatch(data_set, label_set, batch_size)
        evalAcc, evalLoss = eval_step(input_batch, label_batch)
        true_count += evalAcc * batch_size
        totalLoss += evalLoss
    precision = float(true_count) / num_examples
    print(' Num examples: %d Num correct: %d Precision # 1: %0.04f' % (num_examples, true_count, precision))
    print("Eval Loss: " + str(totalLoss))
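One thing that stands out in do_eval: nextBatch draws random indices with np.random.choice, which samples with replacement by default, so an "epoch" of evaluation does not visit each example exactly once, and the loss is summed rather than averaged. A minimal sketch of a sequential alternative follows. The names iterate_batches and evaluate are hypothetical, and eval_fn stands in for eval_step.

```python
import numpy as np

def iterate_batches(data, labels, batch_size):
    """Yield consecutive (data, labels) batches; drops the last partial batch."""
    steps = len(labels) // batch_size
    for i in range(steps):
        start = i * batch_size
        yield data[start:start + batch_size], labels[start:start + batch_size]

def evaluate(data, labels, batch_size, eval_fn):
    """Average accuracy and loss over one pass; eval_fn returns (accuracy, loss)."""
    correct, total_loss, steps = 0.0, 0.0, 0
    for x, y in iterate_batches(data, labels, batch_size):
        acc, loss = eval_fn(x, y)
        correct += acc * len(y)
        total_loss += loss
        steps += 1
    return correct / (steps * batch_size), total_loss / steps

# Toy data: 10 examples, batch size 3, so examples 0..8 are each seen once.
data = np.arange(20).reshape(10, 2)
labels = np.arange(10)
seen = [int(y) for _, ys in iterate_batches(data, labels, 3) for y in ys]
print(seen)

# A fake eval_fn that reports constant values, just to exercise the averaging.
print(evaluate(data, labels, 3, lambda x, y: (0.5, 2.0)))
```

Dropping the last partial batch keeps every batch at the fixed size that the placeholders in this model require.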
The entire model is as follows:
class TextCNN(object):
"""
A CNN for text classification
Uses a convolutional, max-pooling and softmax layer.
"""
def __init__(
self, batchSize, numWords, num_classes,
embedding_size, filter_sizes, num_filters):
# Set place holders
self.input_placeholder = tf.placeholder(tf.float32,[batchSize,numWords,embedding_size,1])
self.labels = tf.placeholder(tf.int32, [batchSize,num_classes])
self.pKeep = tf.placeholder(tf.float32)
# Inference
'''
Ready to build conv layers followed by max pooling layers
Each conv layer produces a different shaped output so need to loop over
them and create a layer for each and then merge the results
'''
pooled_outputs = []
for i, filter_size in enumerate(filter_sizes):
with tf.name_scope("conv-maxpool-%s" % filter_size):
# Convolution Layer
filter_shape = [filter_size, embedding_size, 1, num_filters]
# W: Filter matrix
W = tf.Variable(tf.truncated_normal(filter_shape,stddev=0.01), name='W')
b = tf.Variable(tf.constant(0.0,shape=[num_filters]),name="b")
# Valid padding: Narrow convolution (no edge padded so filter slides over everything)
# Output size = (input_size (numWords in this case) + 2 * padding (0 in this case) - filter_size) + 1
conv = tf.nn.conv2d(
self.input_placeholder,
W,
strides=[1, 1, 1, 1],
padding="VALID",
name="conv")
# Apply nonlinearity i.e add the bias to Wx + b
# Where Wx is the conv layer above
# Then run it through the activation function
h = tf.nn.relu(tf.nn.bias_add(conv, b),name='relu')
# Max-pooling over the outputs
# Max-pool to control the output size
# By taking only the best features determined by the filter
# Ksize is the size of the window of the input tensor
pooled = tf.nn.max_pool(
h,
ksize=[1, numWords - filter_size + 1, 1, 1],
strides=[1, 1, 1, 1],
padding='VALID',
name="pool")
# Each pooled outputs a tensor of size
# [batchSize, 1, 1, num_filters] where num_filters represents the
# Number of features we wanted pooled
pooled_outputs.append(pooled)
# Combine all pooled features
num_filters_total = num_filters * len(filter_sizes)
# Concat the pool output along the 3rd (num_filters / feature size) dimension
self.h_pool = tf.concat(pooled_outputs, 3)
# Flatten
self.h_pool_flat = tf.reshape(self.h_pool, [-1, num_filters_total])
# Add drop out to regularize the learning curve / accuracy
with tf.name_scope("dropout"):
self.h_drop = tf.nn.dropout(self.h_pool_flat,self.pKeep)
# Fully connected output layer
with tf.name_scope("output"):
W = tf.Variable(tf.truncated_normal([num_filters_total,num_classes],stddev=0.01),name="W")
b = tf.Variable(tf.constant(0.0,shape=[num_classes]), name='b')
self.logits = tf.nn.xw_plus_b(self.h_drop, W, b, name='logits')
self.predictions = tf.argmax(self.logits, 1, name='predictions')
# Loss
with tf.name_scope("loss"):
losses = tf.nn.softmax_cross_entropy_with_logits(labels=self.labels,logits=self.logits, name="xentropy")
self.loss = tf.reduce_mean(losses)
# Accuracy
with tf.name_scope("accuracy"):
correct_predictions = tf.equal(self.predictions, tf.argmax(self.labels,1))
self.accuracy = tf.reduce_mean(tf.cast(correct_predictions, "float"), name="accuracy")
##################################################################################################################
# Running the training
# Define various parameters for network
batchSize = 100
numWords = 120
embedding_size = 300
num_classes = 4
filter_sizes = [3,4,5] # slide over a the number of words, i.e 3 words, 4 words etc...
num_filters = 126
maxSteps = 5000
initial_learning_rate = 0.001
dropoutRate = 1
data_set = np.load("/home/kevin/Documents/NSERC_2017/articles/classifyDataSet/TestSmaller_CNN_inputMat_0.npy")
labels_set = np.load("Test_NN_target_smaller.npy")
with tf.Graph().as_default():
sess = tf.Session()
with sess.as_default():
cnn = TextCNN(batchSize=batchSize,
numWords=numWords,
num_classes=num_classes,
num_filters=num_filters,
embedding_size=embedding_size,
filter_sizes=filter_sizes)
# Define training operation
# Pick an optimizer, set it's learning rate, and tell it what to minimize
global_step = tf.Variable(0,name='global_step', trainable=False)
optimizer = tf.train.AdamOptimizer(initial_learning_rate)
grads_and_vars = optimizer.compute_gradients(cnn.loss)
train_op = optimizer.apply_gradients(grads_and_vars, global_step=global_step)
# Summaries to save for tensor board
# Set directory
out_dir = "/home/kevin/Documents/NSERC_2017/articles/classifyDataSet/tf_logs/CNN_Embedding/"
# Loss and accuracy summaries
loss_summary = tf.summary.scalar("loss",cnn.loss)
acc_summary = tf.summary.scalar("accuracy", cnn.accuracy)
# Train summaries
train_summary_op = tf.summary.merge([loss_summary,acc_summary])
train_summary_dir = out_dir + "train/"
train_summary_writer = tf.summary.FileWriter(train_summary_dir, sess.graph)
# Test summaries
test_summary_op = tf.summary.merge([loss_summary, acc_summary])
test_summary_dir = out_dir + "test/"
test_summary_write = tf.summary.FileWriter(test_summary_dir, sess.graph)
# Init all variables
init = tf.global_variables_initializer()
sess.run(init)
############################################################################################
def train_step(input_data, labels_data):
'''
Single training step
:param input_data: input
:param labels_data: labels to train to
'''
feed_dict = {
cnn.input_placeholder: input_data,
cnn.labels: labels_data,
cnn.pKeep: dropoutRate
}
_, step, summaries, loss, accuracy = sess.run(
[train_op, global_step, train_summary_op, cnn.loss, cnn.accuracy],
feed_dict=feed_dict)
train_summary_writer.add_summary(summaries, step)
###############################################################################################
def eval_step(input_data, labels_data, writer=None):
"""
Evaluates model on a test set
Single step
"""
feed_dict = {
cnn.input_placeholder: input_data,
cnn.labels: labels_data,
cnn.pKeep: 1.0
}
step, summaries, loss, accuracy = sess.run(
[global_step, test_summary_op, cnn.loss, cnn.accuracy],
feed_dict)
if writer:
writer.add_summary(summaries, step)
return accuracy, loss
###############################################################################
def nextBatch(data_set, labels_set, batchSize):
'''
Get the next batch of data
:param data_set: entire training or test data set
:param labels_set: entire training or test label set
:param batchSize: batch size
:return: a batch of the data and it's corresponding labels
'''
# Generate random row indices for the documents
rand_index = np.random.choice(data_set.shape[0], size=batchSize)
# Grab the data to give to the feed dicts
data_batch, labels_batch = data_set[rand_index, :, :], labels_set[rand_index, :]
# Resize for tensorflow
data_batch = data_batch.reshape([data_batch.shape[0],data_batch.shape[1],data_batch.shape[2],1])
return data_batch, labels_batch
################################################################################
def do_eval(data_set,
label_set,
batch_size):
"""
Runs one evaluation against the full epoch of data.
data_set: The set of embeddings to eval
label_set: the set of labels to eval
"""
# And run one epoch of eval.
true_count = 0 # Counts the number of correct predictions.
steps_per_epoch = len(label_set) // batch_size
num_examples = steps_per_epoch * batch_size
totalLoss = 0
# Need to compute eval accuracy
for evalStep in xrange(steps_per_epoch):
input_batch, label_batch = nextBatch(data_set, labels_set, batchSize)
evalAcc, evalLoss = eval_step(input_batch, label_batch)
true_count += evalAcc * batchSize
totalLoss += evalLoss
precision = float(true_count) / num_examples
print(' Num examples: %d Num correct: %d Precision # 1: %0.04f' % (num_examples, true_count, precision))
print("Eval Loss: " + str(totalLoss))
######################################################################################################
# Training Loop
for step in range(maxSteps):
input_batch, label_batch = nextBatch(data_set,labels_set,batchSize)
train_step(input_batch,label_batch)
# Evaluate over the entire data set on last eval
if step % 100 == 0:
print "On Step : " + str(step) + " of " + str(maxSteps)
do_eval(data_set, labels_set,batchSize)
The embedding is done before the model:
def createInputEmbeddedMatrix(corpusPath, maxWords, svName):
# Create a [docNum, Words per Art, Embedding Size] matrix to fill
genDocsPath = "gen_docs_classifyData_smallerTest_TFIDF.npy"
# corpus = "newsCorpus_word2vec_All_Corpus.mm"
dictPath = 'news_word2vec_smallerDict.dict'
tf_idf_path = "news_tfIdf_word2vec_All.tfidf_model"
gen_docs = np.load(genDocsPath)
dictionary = gensim.corpora.dictionary.Dictionary.load(dictPath)
tf_idf = gensim.models.tfidfmodel.TfidfModel.load(tf_idf_path)
corpus = corpora.MmCorpus(corpusPath)
numOfDocs = len(corpus)
embedding_size = 300
id2embedding = np.load("smallerID2embedding.npy").item()
# Need to process in batches as takes up a ton of memory
step = 5000
totalSteps = int(np.ceil(numOfDocs / step))
for i in range(totalSteps):
# inputMatrix = scipy.sparse.csr_matrix([step,maxWords,embedding_size])
inputMatrix = np.zeros([step, maxWords, embedding_size])
start = i * step
end = start + step
for docNum in range(start, end):
print "On docNum " + str(docNum) + " of " + str(numOfDocs)
# Extract the top N words
topWords, wordVal = tf_idfTopWords(docNum, gen_docs, dictionary, tf_idf, maxWords)
# doc = corpus[docNum]
# Need to track word dex and doc dex seperate
# Doc dex because of the batch processing
wordDex = 0
docDex = 0
for wordID in wordVal:
inputMatrix[docDex, wordDex, :] = id2embedding[wordID]
wordDex += 1
docDex += 1
# Save the batch of input data
# scipy.sparse.save_npz(svName + "_%d" % i, inputMatrix)
np.save(svName + "_%d.npy" % i, inputMatrix)
#####################################################################################
Turns out my error was in the creation of the input matrix.
for i in range(totalSteps):
    # inputMatrix = scipy.sparse.csr_matrix([step,maxWords,embedding_size])
    inputMatrix = np.zeros([step, maxWords, embedding_size])
    start = i * step
    end = start + step
    for docNum in range(start, end):
        print "On docNum " + str(docNum) + " of " + str(numOfDocs)
        # Extract the top N words
        topWords, wordVal = tf_idfTopWords(docNum, gen_docs, dictionary, tf_idf, maxWords)
        # doc = corpus[docNum]
        # Need to track word dex and doc dex separately
        # Doc dex because of the batch processing
        wordDex = 0
        docDex = 0  # BUG: resets the row index for every document
        for wordID in wordVal:
            inputMatrix[docDex, wordDex, :] = id2embedding[wordID]
            wordDex += 1
        docDex += 1
docDex should not have been reset to 0 on each iteration of the inner loop. I was effectively overwriting the first row of my input matrix on every document, so all the remaining rows stayed zeros.
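The effect is easy to reproduce on a tiny matrix. The sketch below uses made-up sizes, a made-up id2embedding dictionary and a hard-coded list of word IDs in place of tf_idfTopWords. It only illustrates the indexing bug, not the original pipeline.

```python
import numpy as np

step, max_words, embedding_size = 3, 2, 4
# Hypothetical stand-in for id2embedding: word ID -> constant vector.
id2embedding = {w: np.full(embedding_size, float(w)) for w in range(1, 7)}
# Hypothetical word IDs per document (stand-in for tf_idfTopWords output).
docs = [[1, 2], [3, 4], [5, 6]]

def fill(reset_inside_loop):
    matrix = np.zeros([step, max_words, embedding_size])
    doc_dex = 0
    for word_ids in docs:
        if reset_inside_loop:
            doc_dex = 0  # the bug: every document is written to row 0
        for word_dex, word_id in enumerate(word_ids):
            matrix[doc_dex, word_dex, :] = id2embedding[word_id]
        doc_dex += 1
    return matrix

buggy = fill(reset_inside_loop=True)
fixed = fill(reset_inside_loop=False)

# Count documents (rows) that contain any non-zero embedding.
print(np.count_nonzero(buggy.any(axis=(1, 2))))  # 1
print(np.count_nonzero(fixed.any(axis=(1, 2))))  # 3
```

A check like the last two lines, run on the saved .npy files, would have flagged the problem before training: nearly every document in a batch was an all-zero input.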
I have an error like this:
InvalidArgumentError (see above for traceback): logits and labels must
be same size: logits_size=[10,9] labels_size=[7040,9] [[Node:
SoftmaxCrossEntropyWithLogits =
SoftmaxCrossEntropyWithLogits[T=DT_FLOAT,
_device="/job:localhost/replica:0/task:0/gpu:0"](Reshape, Reshape_1)]]
But I can't find the tensor that causes this error. I think it is caused by a size mismatch.
My input size is batch_size * n_steps * n_input, so it will be 10*704*100. I want the bidirectional RNN to produce an output of size batch_size * n_steps * n_classes, so it will be 10*704*9.
How should I change this code to fix the error?
batch_size is the number of data sequences, like this:
data 1 : ABCABCABCAAADDD...
...
data 10 : ABCCCCABCDBBAA...
n_steps is the length of each sequence (each sequence was padded with 'O' to a fixed length): 704
n_input is how each letter in a sequence is represented, like this:
A - [1, 2, 1, -1, ..., -1]
And the output of the learning should be like this:
output of data 1 : XYZYXYZYYXY ...
...
output of data 10 : ZXYYRZYZZ ...
Each letter of the output is affected by the surrounding letters and their order in the input.
learning_rate = 0.001
training_iters = 100000
batch_size = 10
display_step = 10
# Network Parameters
n_input = 100
n_steps = 704 # timesteps
n_hidden = 50 # hidden layer num of features
n_classes = 9
x = tf.placeholder("float", [None, n_steps, n_input])
y = tf.placeholder("float", [None, n_steps, n_classes])
weights = {
'out': tf.Variable(tf.random_normal([2*n_hidden, n_classes]))
}
biases = {
'out': tf.Variable(tf.random_normal([n_classes]))
}
def BiRNN(x, weights, biases):
    x = tf.unstack(tf.transpose(x, perm=[1, 0, 2]))
    # Forward direction cell
    lstm_fw_cell = rnn.BasicLSTMCell(n_hidden, forget_bias=1.0)
    # Backward direction cell
    lstm_bw_cell = rnn.BasicLSTMCell(n_hidden, forget_bias=1.0)
    # Get lstm cell output
    try:
        outputs, _, _ = rnn.static_bidirectional_rnn(lstm_fw_cell, lstm_bw_cell, x,
                                                     dtype=tf.float32)
    except Exception: # Old TensorFlow version only returns outputs not states
        outputs = rnn.static_bidirectional_rnn(lstm_fw_cell, lstm_bw_cell, x,
                                               dtype=tf.float32)
    # Linear activation, using rnn inner loop last output
    return tf.matmul(outputs[-1], weights['out']) + biases['out']
pred = BiRNN(x, weights, biases)
# Define loss and optimizer
cost = tf.reduce_mean(tf.nn.softmax_cross_entropy_with_logits(logits=pred, labels=y))
optimizer = tf.train.AdamOptimizer(learning_rate=learning_rate).minimize(cost)
# Evaluate model
correct_pred = tf.equal(tf.argmax(pred,1), tf.argmax(y,1))
accuracy = tf.reduce_mean(tf.cast(correct_pred, tf.float32))
# Initializing the variables
init = tf.global_variables_initializer()
# Launch the graph
with tf.Session() as sess:
    sess.run(init)
    step = 1
    while step * batch_size < training_iters:
        batch_x, batch_y = next_batch(batch_size, r_big_d, y_r_big_d)
        #batch_x = batch_x.reshape((batch_size, n_steps, n_input))
        # Run optimization op (backprop)
        sess.run(optimizer, feed_dict={x: batch_x, y: batch_y})
        if step % display_step == 0:
            # Calculate batch accuracy
            acc = sess.run(accuracy, feed_dict={x: batch_x, y: batch_y})
            # Calculate batch loss
            loss = sess.run(cost, feed_dict={x: batch_x, y: batch_y})
            print("Iter " + str(step*batch_size) + ", Minibatch Loss= " + \
                  "{:.6f}".format(loss) + ", Training Accuracy= " + \
                  "{:.5f}".format(acc))
        step += 1
    print("Optimization Finished!")
    test_x, test_y = next_batch(batch_size, v_big_d, y_v_big_d)
    print("Testing Accuracy:", \
          sess.run(accuracy, feed_dict={x: test_x, y: test_y}))
The first return value of static_bidirectional_rnn is a list of tensors, one for each RNN step. By using only the last one in your tf.matmul, you're discarding all the rest. Instead, stack them into a single tensor of the appropriate shape, reshape it for the matmul, then reshape back.
outputs = tf.stack(outputs, axis=1)
outputs = tf.reshape(outputs, (batch_size*n_steps, 2*n_hidden))  # forward and backward outputs are concatenated
outputs = tf.matmul(outputs, weights['out']) + biases['out']
outputs = tf.reshape(outputs, (batch_size, n_steps, n_classes))
Alternatively, you could use tf.einsum:
outputs = tf.stack(outputs, axis=1)
outputs = tf.einsum('ijk,kl->ijl', outputs, weights['out']) + biases['out']
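The two variants can be checked against each other with NumPy alone. This is a sketch with small made-up sizes and random data standing in for the RNN outputs. It assumes each per-step output has width 2 * n_hidden, since a bidirectional RNN concatenates the forward and backward outputs (which is also why weights['out'] has 2*n_hidden rows).

```python
import numpy as np

batch_size, n_steps, n_hidden, n_classes = 2, 5, 3, 4
rng = np.random.default_rng(0)

# One (batch_size, 2 * n_hidden) array per time step, like the RNN output list.
outputs = [rng.normal(size=(batch_size, 2 * n_hidden)) for _ in range(n_steps)]
w_out = rng.normal(size=(2 * n_hidden, n_classes))
b_out = rng.normal(size=n_classes)

stacked = np.stack(outputs, axis=1)  # (batch_size, n_steps, 2 * n_hidden)

# Option 1: flatten to 2-D, matmul, reshape back to 3-D.
flat = stacked.reshape(batch_size * n_steps, 2 * n_hidden)
logits_a = (flat @ w_out + b_out).reshape(batch_size, n_steps, n_classes)

# Option 2: einsum contracts the feature axis directly.
logits_b = np.einsum('ijk,kl->ijl', stacked, w_out) + b_out

print(logits_a.shape)
print(np.allclose(logits_a, logits_b))
```

Either way the result has one row of class scores per time step, which matches a label tensor of shape (batch_size, n_steps, n_classes) once both are flattened to (batch_size * n_steps, n_classes).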
The 1-D data consists of 80 samples, each of length 1089. I want to use 70 samples for training and 10 samples for testing.
I am a total beginner in Python and TensorFlow, so I am using code that was written for processing images (which are two-dimensional). Here is the code I use (all the parameters are set quite low because I just want to test the code):
import tensorflow as tf
import scipy.io as sc
from tensorflow.python.ops import rnn, rnn_cell
# data read
feature_training = sc.loadmat("feature_training.mat")
feature_training = feature_training['feature_training']
print (feature_training.shape)
feature_testing = sc.loadmat("feature_testing.mat")
feature_testing = feature_testing['feature_testing']
print (feature_testing.shape)
label_training = sc.loadmat("label_training.mat")
label_training = label_training['label_training']
print (label_training.shape)
label_testing = sc.loadmat("label_testing.mat")
label_testing = label_testing['label_testing']
print (label_testing.shape)
# parameters
learning_rate = 0.1
training_iters = 100
batch_size = 70
display_step = 10
# network parameters
n_input = 70 # MNIST data input (img shape: 28*28)
n_steps = 100 # timesteps
n_hidden = 10 # hidden layer num of features
n_classes = 2 # MNIST total classes (0-9 digits)
# tf Graph input
x = tf.placeholder("float", [None, n_steps, n_input])
y = tf.placeholder("float", [None, n_classes])
# Define weights
weights = {
'out': tf.Variable(tf.random_normal([n_hidden, n_classes]))
}
biases = {
'out': tf.Variable(tf.random_normal([n_classes]))
}
def RNN(x, weights, biases):
    # Prepare data shape to match `rnn` function requirements
    # Current data input shape: (batch_size, n_steps, n_input)
    # Required shape: 'n_steps' tensors list of shape (batch_size, n_input)
    # Permuting batch_size and n_steps
    x = tf.transpose(x, [1, 0, 2])
    # Reshaping to (n_steps*batch_size, n_input)
    x = tf.reshape(x, [-1, n_input])
    # Split to get a list of 'n_steps' tensors of shape (batch_size, n_input)
    x = tf.split(0, n_steps, x)
    # Define a lstm cell with tensorflow
    lstm_cell = rnn_cell.BasicLSTMCell(n_hidden, forget_bias=1.0)
    # Get lstm cell output
    outputs, states = rnn.rnn(lstm_cell, x, dtype=tf.float32)
    # Linear activation, using rnn inner loop last output
    return tf.matmul(outputs[-1], weights['out']) + biases['out']
pred = RNN(x, weights, biases)
# Define loss and optimizer
cost = tf.reduce_mean(tf.nn.softmax_cross_entropy_with_logits(pred, y))
optimizer = tf.train.AdamOptimizer(learning_rate=learning_rate).minimize(cost)
# Evaluate model
correct_pred = tf.equal(tf.argmax(pred, 1), tf.argmax(y, 1))
accuracy = tf.reduce_mean(tf.cast(correct_pred, tf.float32))
# Initializing the variables
init = tf.initialize_all_variables()
# Launch the graph
with tf.Session() as sess:
    sess.run(init)
    step = 1
    # Keep training until reach max iterations
    while step * batch_size < training_iters:
        batch_x, batch_y = feature_training.next_batch(batch_size)
        # Reshape data to get n_steps sequences of n_input elements
        batch_x = batch_x.reshape((batch_size, n_steps, n_input))
        # Run optimization op (backprop)
        sess.run(optimizer, feed_dict={x: batch_x, y: batch_y})
        if step % display_step == 0:
            # Calculate batch accuracy
            acc = sess.run(accuracy, feed_dict={x: batch_x, y: batch_y})
            # Calculate batch loss
            # loss = sess.run(cost, feed_dict={x: batch_x, y: batch_y})
            print ("Iter " + str(step*batch_size) + ", Training Accuracy= " +
                   "{:.5f}".format(acc))
        step += 1
    print ("Optimization Finished!")
    # Calculate accuracy for 10 testing data
    test_len = 10
    test_data = feature_testing[:test_len].reshape((-1, n_steps, n_input))
    test_label = label_testing[:test_len]
    print ("Testing Accuracy:",
           sess.run(accuracy, feed_dict={x: test_data, y: test_label}))
Running it ends with this error:
Traceback (most recent call last):
File "/home/xiangzhang/MNIST data test.py", line 92, in <module>
batch_x, batch_y = feature_training.batch(batch_size)
AttributeError: 'numpy.ndarray' object has no attribute 'next_batch'
I think it must be related to the dimensions of the data, but I do not know how to fix it. Any help would be much appreciated.
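The AttributeError itself is not about dimensions: next_batch is a method of the dataset helper object used in the MNIST tutorial, whereas sc.loadmat returns plain NumPy arrays, which have no such method. One possible workaround is to slice the arrays manually. The following is a minimal sketch; the helper name iterate_batches and the toy data are hypothetical, and plain lists are used so that it runs without any dependencies (the same slicing works on NumPy arrays).

```python
def iterate_batches(features, labels, batch_size):
    """Yield consecutive (features, labels) slices of at most batch_size rows."""
    if len(features) != len(labels):
        raise ValueError("features and labels must have the same length")
    for start in range(0, len(features), batch_size):
        end = start + batch_size
        yield features[start:end], labels[start:end]


# Toy stand-ins for feature_training and label_training (hypothetical values).
toy_features = [[0.1, 0.2], [0.3, 0.4], [0.5, 0.6], [0.7, 0.8], [0.9, 1.0]]
toy_labels = [[1, 0], [0, 1], [1, 0], [0, 1], [1, 0]]

batches = list(iterate_batches(toy_features, toy_labels, batch_size=2))
for batch_x, batch_y in batches:
    print(len(batch_x), len(batch_y))
```

In the training loop, each batch_x, batch_y pair produced this way could take the place of the feature_training.next_batch(batch_size) call, with label_training supplying the labels. Separately, the reshape to (batch_size, n_steps, n_input) will likely fail next, because n_steps * n_input = 100 * 70 = 7000 while each sample holds 1089 values; the two would need to multiply to 1089, for example 33 and 33.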
I am building a fully connected deep neural network for MNIST input. I have a function that takes a placeholder as input:
# Create model
def multilayer_perceptron(x, activation_fn, weights, biases, dbg=False):
    layerDatas = OrderedDict()
    # get each layer data
    prev = x
    for i in range(len(weights)-1):
        weight = weights.items()[i][1]
        bias = biases.items()[i][1]
        var = 'layer_' + str(i+1)
        layerData = tf.add(tf.matmul(prev, weight), bias)
        layerData = activation_fn(layerData)
        prev = layerData
        layerDatas[var] = layerData
    # output layer with linear function, using the last layer output value
    val = tf.matmul(prev, weights['out'])
    out_layer = tf.matmul(prev, weights['out']) + biases['out']
    print x.eval() # debug the data
    return out_layer
which takes multiple layers in weights and biases. I call the main program with
sess = tf.InteractiveSession() # start a session
print 'Data', n_input, n_classes
print 'Train', train_set_x.shape, train_set_y.shape
(weights, biases) = createWeightsBiases(layers, n_input, n_classes, dbg)
# tf Graph input
x = tf.placeholder("float", [None, n_input])
y = tf.placeholder("float", [None, n_classes])
# Construct model
pred = multilayer_perceptron(x, activation_fn, weights, biases, dbg)
cost = tf.reduce_mean(tf.nn.softmax_cross_entropy_with_logits(pred, y))
optimizer = tf.train.AdamOptimizer(learning_rate=learning_rate).minimize(cost)
# Initializing the variables
init = tf.initialize_all_variables()
done_looping = False
display_step = 1
# Add ops to save and restore all the variables.
saver = tf.train.Saver()
# Launch the graph
sess.run(init)
# Training cycle
epochs = 1000
for epoch in range(epochs):
    avg_cost = 0.
    total_batch = int(len(train_set_x)/batch_size)
    print 'Batch', total_batch, batch_size
    # Loop over all batches
    for i in range(total_batch):
        batch_x = train_set_x[i * batch_size: (i + 1) * batch_size]
        batch_y = train_set_y[i * batch_size: (i + 1) * batch_size]
        # Run optimization op (backprop) and cost op (to get loss value)
        _, c = sess.run([optimizer, cost], feed_dict={x: batch_x,
                                                      y: batch_y})
        # Compute average loss
        avg_cost += c / total_batch
    # Display logs per epoch step
    if epoch % display_step == 0:
        print "Epoch:", '%04d' % (epoch+1), "cost=", \
            "{:.9f}".format(avg_cost)
print "Optimization Finished!"
# Test model
correct_prediction = tf.equal(tf.argmax(pred, 1), tf.argmax(y, 1))
# Calculate accuracy
accuracy = tf.reduce_mean(tf.cast(correct_prediction, "float"))
print "Accuracy:", accuracy.eval({x: valid_set_x, y: valid_set_y})
When I try to print the tensor in my multilayer_perceptron function, I get a crash with
tensorflow.python.framework.errors.InvalidArgumentError: You must feed a value for placeholder tensor 'Placeholder' with dtype float
[[Node: Placeholder = Placeholder[dtype=DT_FLOAT, shape=[], _device="/job:localhost/replica:0/task:0/cpu:0"]()]]
I would appreciate help in getting around this.
You can't eval a placeholder without feeding it. A placeholder has no value of its own: you have to feed the graph a value for it, and only then can you extract its content (which is just the value you fed).
So, remove the print x.eval() # debug the data line from the multilayer_perceptron function.
To inspect the value of a placeholder, you have to feed it and then fetch the value you just fed (side note: this is useless).
If you really want to do this, here is how:
placeholder_value = sess.run(x, feed_dict={x: [1,2,3,4]})
print placeholder_value
This prints the value that was fed, [1,2,3,4].