I've been walking through some TensorFlow tutorials and am cobbling together a pet experiment. However, I am running into some dimension errors and I can't seem to figure them out.
My goal: I have an input matrix of shape 1xN and a training set of dimension 10xN (1 and 10 were chosen arbitrarily). N is intended to represent N samples in a training set: one input value mapped to one vector of outputs. You can think of this as one input neuron and m output neurons (here m = 10). The training set is a set of these single values, each mapped to a 1-d vector. I wish to train the network by running this set of mapped inputs and outputs against it and reducing the error.
The simple algorithm that I am trying to accomplish (sketched in code right after the list):
For each value in the input vector
Load the input neuron with that value
Feed forward
Evaluate against the corresponding vector
Repeat to minimize error.
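In code, the intended loop would look something like this (an illustrative sketch only, assuming one row per sample and the placeholder/operation names used further down):
for step in range(training_iterations):
    for i in range(num_frames):
        single_input = stimuli[i].reshape(1, stimuli_dimension)   # load the input neuron with one value
        single_target = frames[i].reshape(1, frame_dimension)     # the corresponding output vector
        sess.run(train_operation, feed_dict={input_placeholder: single_input,
                                             output_placeholder: single_target})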
However, I seem to be getting mixed up with how to format the data to feed to the network. I have a placeholder of 1 input neuron and one of n output neurons. I want to follow the above algorithm, but I am not sure if I am doing it right:
import numpy as np
import tensorflow as tf

# Data parameters
num_frames = 10
stimuli_value_low = .00001
stimuli_value_high = 100
pixel_value_low = .00001
pixel_value_high = 256.0
stimuli_dimension = 1
frame_dimension = 10
stimuli = np.random.uniform(stimuli_value_low, stimuli_value_high, (stimuli_dimension, num_frames))
frames = np.random.uniform(pixel_value_low, pixel_value_high, (frame_dimension, num_frames))
# Parameters
learning_rate = 0.01
training_iterations = 1000
display_iteration = 10
# Network Parameters
n_hidden_1 = 100
n_hidden_2 = 100
num_input_neurons = stimuli_dimension
num_output_neurons = frame_dimension
# Create placeholders
input_placeholder = tf.placeholder("float", [None, num_input_neurons])
output_placeholder = tf.placeholder("float", [None, num_output_neurons])
# Store layers weight & bias
weights = {
'h1': tf.Variable(tf.random_normal([num_input_neurons, n_hidden_1])),
'h2': tf.Variable(tf.random_normal([n_hidden_1, n_hidden_2])),
'out': tf.Variable(tf.random_normal([n_hidden_2, num_output_neurons]))
}
biases = {
'b1': tf.Variable(tf.random_normal([n_hidden_1])),
'b2': tf.Variable(tf.random_normal([n_hidden_2])),
'out': tf.Variable(tf.random_normal([num_output_neurons]))
}
# Create model
def neural_net(input_placeholder):
# Hidden fully connected layer
layer_1 = tf.add(tf.matmul(input_placeholder, weights['h1']), biases['b1'])
# Hidden fully connected layer
layer_2 = tf.add(tf.matmul(layer_1, weights['h2']), biases['b2'])
# Output fully connected layer with a neuron for each pixel
out_layer = tf.matmul(layer_2, weights['out']) + biases['out']
return out_layer
# Construct model
logits = neural_net(input_placeholder)
# Define loss operation and optimizer
loss_operation = tf.reduce_mean(tf.nn.softmax_cross_entropy_with_logits(logits = logits, labels = output_placeholder))
optimizer = tf.train.AdamOptimizer(learning_rate = learning_rate)
train_operation = optimizer.minimize(loss_operation)
# Evaluate model (with test logits, for dropout to be disabled)
correct_pred = tf.equal(tf.argmax(logits, 1), tf.argmax(output_placeholder, 1))
accuracy_operation = tf.reduce_mean(tf.cast(correct_pred, tf.float32))
# Initialize the variables (i.e. assign their default value)
init = tf.global_variables_initializer()
# Start Training
with tf.Session() as sess:
# Run the initializer
sess.run(init)
for step in range(1, training_iterations + 1):
sess.run(train_operation, feed_dict={input_placeholder: stimuli, output_placeholder: frames})
if step % display_iteration == 0 or step == 1:
    loss, accuracy = sess.run([loss_operation, accuracy_operation],
                              feed_dict={input_placeholder: stimuli, output_placeholder: frames})
    print("Step " + str(step) + ", Loss = " + "{:.4f}".format(loss) +
          ", Training Accuracy = " + "{:.3f}".format(accuracy))
print("Optimization finished!")
I think it has something to do with how I am structuring my data or feeding it to the run function.
Here is the error I am getting:
ValueError Traceback (most recent call last)
<ipython-input-420-7517598734d6> in <module>()
6 for step in range(1, training_iterations + 1):
7
----> 8 sess.run(train_operation, feed_dict = {X: stimuli, Y: frames})
9
10 if iteration % display_iteration == 0 or iteration == 1:
1 frames
/usr/local/lib/python3.6/dist-packages/tensorflow/python/client/session.py in _run(self, handle, fetches, feed_dict, options, run_metadata)
1147 'which has shape %r' %
1148 (np_val.shape, subfeed_t.name,
-> 1149 str(subfeed_t.get_shape())))
1150 if not self.graph.is_feedable(subfeed_t):
1151 raise ValueError('Tensor %s may not be fed.' % subfeed_t)
ValueError: Cannot feed value of shape (1, 10) for Tensor 'Placeholder_6:0', which has shape '(?, 1)'
How can I ensure I am formatting my input data correctly and forming my network correspondingly?
Turns out I had the dimensions of the arrays I was generating backwards:
stimuli = np.random.uniform(stimuli_value_low, stimuli_value_high, (stimuli_dimension, num_frames))
frames = np.random.uniform(pixel_value_low, pixel_value_high, (frame_dimension, num_frames))
should be:
stimuli = np.random.uniform(stimuli_value_low, stimuli_value_high, (num_frames, stimuli_dimension))
frames = np.random.uniform(pixel_value_low, pixel_value_high, (num_frames, frame_dimension))
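As a general rule in TF 1.x, a placeholder declared as [None, n] expects data laid out as (num_samples, n). A quick sanity check before feeding (a minimal sketch using the names above) catches this kind of mismatch early:
assert stimuli.shape == (num_frames, stimuli_dimension)
assert frames.shape == (num_frames, frame_dimension)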
I've been working on a simple TensorFlow neural network. My input placeholder is
x = tf.placeholder(tf.float32, shape=[None, 52000, 3]).
My weight matrix is initialized to all zeros as
W = tf.Variable(tf.zeros([52000, 10])).
I tried different combinations with and without the 3 for color channels, but I guess I'm just not understanding the dimensionality because I got the error:
Traceback (most recent call last): File
"C:\Users\Everybody\AppData\Local\Programs\Python\Python35\lib\site-packages\tensorflow\python\framework\common_shapes.py",
line 686, in _call_cpp_shape_fn_impl
input_tensors_as_shapes, status) File "C:\Users\Everybody\AppData\Local\Programs\Python\Python35\lib\site-packages\tensorflow\python\framework\errors_impl.py",
line 473, in __exit__
c_api.TF_GetCode(self.status.status)) tensorflow.python.framework.errors_impl.InvalidArgumentError: Shape
must be rank 2 but is rank 3 for 'MatMul' (op: 'MatMul') with input
shapes: [?,52000,3], [52000,10].
During handling of the above exception, another exception occurred:
Traceback (most recent call last): File "rating.py", line 65, in
y = tf.matmul(x, W) + b # "fake" outputs to train/test File "C:\Users\Everybody\AppData\Local\Programs\Python\Python35\lib\site-packages\tensorflow\python\ops\math_ops.py",
line 1891, in matmul
a, b, transpose_a=transpose_a, transpose_b=transpose_b, name=name) File
"C:\Users\Everybody\AppData\Local\Programs\Python\Python35\lib\site-packages\tensorflow\python\ops\gen_math_ops.py",
line 2436, in _mat_mul
name=name) File "C:\Users\Everybody\AppData\Local\Programs\Python\Python35\lib\site-packages\tensorflow\python\framework\op_def_library.py",
line 787, in _apply_op_helper
op_def=op_def) File "C:\Users\Everybody\AppData\Local\Programs\Python\Python35\lib\site-packages\tensorflow\python\framework\ops.py",
line 2958, in create_op
set_shapes_for_outputs(ret) File "C:\Users\Everybody\AppData\Local\Programs\Python\Python35\lib\site-packages\tensorflow\python\framework\ops.py",
line 2209, in set_shapes_for_outputs
shapes = shape_func(op) File "C:\Users\Everybody\AppData\Local\Programs\Python\Python35\lib\site-packages\tensorflow\python\framework\ops.py",
line 2159, in call_with_requiring
return call_cpp_shape_fn(op, require_shape_fn=True) File "C:\Users\Everybody\AppData\Local\Programs\Python\Python35\lib\site-packages\tensorflow\python\framework\common_shapes.py",
line 627, in call_cpp_shape_fn
require_shape_fn) File "C:\Users\Everybody\AppData\Local\Programs\Python\Python35\lib\site-packages\tensorflow\python\framework\common_shapes.py",
line 691, in _call_cpp_shape_fn_impl
raise ValueError(err.message) ValueError: Shape must be rank 2 but is rank 3 for 'MatMul' (op: 'MatMul') with input shapes: [?,52000,3],
[52000,10].
At first, I thought my next_batch() function was the culprit, since I had to write my own after loading my images "manually" using scipy.misc.imread(). Its definition reads:
q = 0
def next_batch(batch_size):
    global q  # q is reassigned below, so it must be declared global here
    x = images[q:q + batch_size]
    y = one_hots[q:q + batch_size]
    q = (q + batch_size) % len(images)
    return x, y
However, after looking through it, I don't see what's wrong with it, so I imagine that I'm just confused about dimensionality. Each input is supposed to be a "flattened" 200x260 color image. It just occurred to me now that maybe I have to flatten the color channels as well (see the sketch just below)? I will place my full code below if curious. I'm a bit new to TensorFlow, so thanks, all. (Yes, it is not a CNN yet; I decided to start simple just to make sure I'm importing my dataset right. And I know it is tiny; I'm starting my dataset small too.)
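On that flattening point: folding the channels in as well is indeed one way to make the plain matmul version line up, since 200 * 260 * 3 = 156000. A hedged sketch (not code from the original post):
import numpy as np
flat_images = np.stack([np.reshape(img, [-1]) for img in images])  # shape (num_images, 156000)
# the placeholder and weights would then be:
# x = tf.placeholder(tf.float32, shape=[None, 156000])
# W = tf.Variable(tf.zeros([156000, 10]))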
############# IMPORT DEPENDENCIES ####################################
import tensorflow as tf
sess = tf.InteractiveSession() #start session
import scipy.misc
import numpy as np
######################################################################
#SET UP DATA #########################################################
images = []
one_hots = []
########### IMAGES ##################################################
#put all the images in a list
for i in range(60):
images.append(scipy.misc.imread('./shoes/%s.jpg' % str(i+1)))
print("One image appended...\n")
#normalize them, "divide" by 255
#(note: float division is required; in-place integer division on uint8 pixel data truncates the values to 0)
images = [image / 255.0 for image in images]
#flatten each image to [52000, 3]
#(tf.reshape returns a new tensor rather than modifying its argument, so the
# result must be kept; np.reshape keeps the data as plain numpy arrays)
images = [np.reshape(image, (52000, 3)) for image in images]
########################################################################
################# ONE-HOT VECTORS ######################################
f = open('rateVectors.txt')
lines = f.readlines()
for i in range(0, 600, 10):
fillerlist = []
for j in range(10):
fillerlist.append(float(lines[i+j][:-1]))
one_hots.append(fillerlist)
print("One one-hot vector added...\n")
########################################################################
#set placeholders and such for input, output, weights, biases
x = tf.placeholder(tf.float32, shape=[None, 52000, 3])
y_ = tf.placeholder(tf.float32, shape=[None, 10])
W = tf.Variable(tf.zeros([52000, 10])) # These are our weights and biases
b = tf.Variable(tf.zeros([10])) # initialized as zeroes.
#########################################################################
sess.run(tf.global_variables_initializer()) #initialize variables in the session
y = tf.matmul(x, W) + b # "fake" outputs to train/test
##################### DEFINING OUR MODEL ####################################
#our loss function
cross_entropy = tf.reduce_mean(tf.nn.softmax_cross_entropy_with_logits(logits=y, labels=y_))
#defining our training as gradient descent
train_step = tf.train.GradientDescentOptimizer(0.5).minimize(cross_entropy)
###################### TRAINING #############################################
#################### OUR CUSTOM BATCH FUNCTION ##############################
q = 0
def next_batch(batch_size):
    global q  # q is reassigned below, so it must be declared global here
    x = images[q:q + batch_size]
    y = one_hots[q:q + batch_size]
    q = (q + batch_size) % len(images)
    return x, y
#train
for i in range(6):
    batch = next_batch(10)
    train_step.run(feed_dict={x: batch[0], y_: batch[1]})
    print("Batch Number: " + str(i) + "\n")
print("Done training...\n")
################ RESULTS #################################################
#calculating accuracy
correct_prediction = tf.equal(tf.argmax(y,1), tf.argmax(y_,1))
accuracy = tf.reduce_mean(tf.cast(correct_prediction, tf.float32))
#print accuracy
print(accuracy.eval(feed_dict={x: images, y_: one_hots}))
Your placeholder should have dimensions [None, 200, 260, 3], where None is the batch size, 200 x 260 is the image size, and 3 is the number of channels.
Your weights should be [filter_height, filter_width, num_channels, num_filters]
Your bias should be [num_filters]
And the dimensions for the labels should be [None, num_classes] where None is the batch size, and num_classes is the number of classes that your images have.
These are just to make sure the math works.
I took these from here
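A minimal sketch of those shapes in TF 1.x (the 5x5 filter size and the filter count are illustrative values, not from the post):
import tensorflow as tf
x = tf.placeholder(tf.float32, [None, 200, 260, 3])  # [batch, height, width, channels]
y_ = tf.placeholder(tf.float32, [None, 10])          # [batch, num_classes]
num_filters = 32                                     # illustrative
W = tf.Variable(tf.truncated_normal([5, 5, 3, num_filters], stddev=0.1))  # [filter_height, filter_width, num_channels, num_filters]
b = tf.Variable(tf.zeros([num_filters]))             # [num_filters]
conv = tf.nn.relu(tf.nn.conv2d(x, W, strides=[1, 1, 1, 1], padding='SAME') + b)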
Version of Tensorflow: 1.2.1
Version of Python: 3.5
Operating System: Windows 10
Another poster has asked about this same problem on StackOverflow here, and he appears to be using code from the same Udacity Word2Vec tutorial. Maybe I'm dense, but the code in this example is so busy and complex that I can't tell what fixed his problem.
The error occurs when I call tf.reduce_mean:
loss = tf.reduce_mean(
tf.nn.sampled_softmax_loss(softmax_weights, softmax_biases, embed,
train_labels, num_sampled, vocabulary_size))
Right before the call to tf.reduce_mean, the key variables have the following data types:
train_dataset.dtype
>> tf.int32
train_labels.dtype
>> tf.int32
valid_dataset.dtype
>> tf.int32
embeddings.dtype
>> tf.float32_ref
softmax_weights.dtype
>> tf.float32_ref
softmax_biases.dtype
>> tf.float32_ref
embed.dtype
>> tf.float32
I tried every permutation of data type in the definitions of train_dataset, train_labels and valid_dataset: making them all int64, all float32, all float64, and combinations of integer and floating point. Nothing worked. I didn't try altering the data types of softmax_weights and softmax_biases, because I'm afraid that might foul up the optimization algorithm. Don't these need to be floats to support the calculus that is done during backpropagation? (TensorFlow is often a very opaque black box with documentation that verges on completely useless, so I can suspect things but never know for sure.)
Program Flow at Time of Error:
After the call to reduce_mean, program control transfers to sampled_softmax_loss() in file nn_impl.py, which in turn calls _compute_sampled_logits():
logits, labels = _compute_sampled_logits(
weights=weights,
biases=biases,
labels=labels,
inputs=inputs,
num_sampled=num_sampled,
num_classes=num_classes,
num_true=num_true,
sampled_values=sampled_values,
subtract_log_q=True,
remove_accidental_hits=remove_accidental_hits,
partition_strategy=partition_strategy,
name=name)
At this point I check the data types of the passed-in parameters and get the following:
weights.dtype
>> tf.float32_ref
biases.dtype
>> tf.float32_ref
labels.dtype
>> tf.float32
inputs.dtype
>> tf.int32
On the very next step an exception occurs, and I am thrown into the StreamWrapper class in file ansitowin32.py. Running to the end, I get the following Traceback:
---------------------------------------------------------------------------
ValueError Traceback (most recent call last)
C:\Anaconda3\envs\aind-dog\lib\site-packages\tensorflow\python\framework\op_def_library.py in apply_op(self, op_type_name, name, **keywords)
489 as_ref=input_arg.is_ref,
--> 490 preferred_dtype=default_dtype)
491 except TypeError as err:
C:\Anaconda3\envs\aind-dog\lib\site-packages\tensorflow\python\framework\ops.py in internal_convert_to_tensor(value, dtype, name, as_ref, preferred_dtype)
740 if ret is None:
--> 741 ret = conversion_func(value, dtype=dtype, name=name, as_ref=as_ref)
742
C:\Anaconda3\envs\aind-dog\lib\site-packages\tensorflow\python\framework\ops.py in _TensorTensorConversionFunction(t, dtype, name, as_ref)
613 "Tensor conversion requested dtype %s for Tensor with dtype %s: %r"
--> 614 % (dtype.name, t.dtype.name, str(t)))
615 return t
ValueError: Tensor conversion requested dtype int32 for Tensor with dtype float32: 'Tensor("sampled_softmax_loss/Reshape_1:0", shape=(?, 1, ?), dtype=float32, device=/device:CPU:0)'
During handling of the above exception, another exception occurred:
TypeError Traceback (most recent call last)
<ipython-input-7-66d378b94a16> in <module>()
34 loss = tf.reduce_mean(
35 tf.nn.sampled_softmax_loss(softmax_weights, softmax_biases, embed,
---> 36 train_labels, num_sampled, vocabulary_size))
37
38 # Optimizer.
C:\Anaconda3\envs\aind-dog\lib\site-packages\tensorflow\python\ops\nn_impl.py in sampled_softmax_loss(weights, biases, labels, inputs, num_sampled, num_classes, num_true, sampled_values, remove_accidental_hits, partition_strategy, name)
1266 remove_accidental_hits=remove_accidental_hits,
1267 partition_strategy=partition_strategy,
-> 1268 name=name)
1269 sampled_losses = nn_ops.softmax_cross_entropy_with_logits(labels=labels,
1270 logits=logits)
C:\Anaconda3\envs\aind-dog\lib\site-packages\tensorflow\python\ops\nn_impl.py in _compute_sampled_logits(weights, biases, labels, inputs, num_sampled, num_classes, num_true, sampled_values, subtract_log_q, remove_accidental_hits, partition_strategy, name)
1005 row_wise_dots = math_ops.multiply(
1006 array_ops.expand_dims(inputs, 1),
-> 1007 array_ops.reshape(true_w, new_true_w_shape))
1008 # We want the row-wise dot plus biases which yields a
1009 # [batch_size, num_true] tensor of true_logits.
C:\Anaconda3\envs\aind-dog\lib\site-packages\tensorflow\python\ops\math_ops.py in multiply(x, y, name)
284
285 def multiply(x, y, name=None):
--> 286 return gen_math_ops._mul(x, y, name)
287
288
C:\Anaconda3\envs\aind-dog\lib\site-packages\tensorflow\python\ops\gen_math_ops.py in _mul(x, y, name)
1375 A `Tensor`. Has the same type as `x`.
1376 """
-> 1377 result = _op_def_lib.apply_op("Mul", x=x, y=y, name=name)
1378 return result
1379
C:\Anaconda3\envs\aind-dog\lib\site-packages\tensorflow\python\framework\op_def_library.py in apply_op(self, op_type_name, name, **keywords)
524 "%s type %s of argument '%s'." %
525 (prefix, dtypes.as_dtype(attrs[input_arg.type_attr]).name,
--> 526 inferred_from[input_arg.type_attr]))
527
528 types = [values.dtype]
TypeError: Input 'y' of 'Mul' Op has type float32 that does not match type int32 of argument 'x'.
Here's the complete program:
# These are all the modules we'll be using later.
# Make sure you can import them before proceeding further.
# %matplotlib inline
from __future__ import print_function
import collections
import math
import numpy as np
import os
import random
import tensorflow as tf
import zipfile
from matplotlib import pylab
from six.moves import range
from six.moves.urllib.request import urlretrieve
from sklearn.manifold import TSNE
print("Working directory = %s\n" % os.getcwd())
def read_data(filename):
"""Extract the first file enclosed in a zip file as a list of words"""
with zipfile.ZipFile(filename) as f:
data = tf.compat.as_str(f.read(f.namelist()[0])).split()
return data
filename = 'text8.zip'
words = read_data(filename)
print('Data size %d' % len(words))
vocabulary_size = 50000
def build_dataset(words):
count = [['UNK', -1]]
count.extend(collections.Counter(words).most_common(vocabulary_size - 1))
dictionary = dict()
# Loop through the keys of the count collection dictionary
# (apparently, zeroing out counts)
for word, _ in count:
dictionary[word] = len(dictionary)
data = list()
unk_count = 0 # count of unknown words
for word in words:
if word in dictionary:
index = dictionary[word]
else:
index = 0 # dictionary['UNK']
unk_count = unk_count + 1
data.append(index)
count[0][1] = unk_count
reverse_dictionary = dict(zip(dictionary.values(), dictionary.keys()))
return data, count, dictionary, reverse_dictionary
data, count, dictionary, reverse_dictionary = build_dataset(words)
print('Most common words (+UNK)', count[:5])
print('Sample data', data[:10])
del words # Hint to reduce memory.
data_index = 0
def generate_batch(batch_size, num_skips, skip_window):
global data_index
assert batch_size % num_skips == 0
assert num_skips <= 2 * skip_window
batch = np.ndarray(shape=(batch_size), dtype=np.int32)
labels = np.ndarray(shape=(batch_size, 1), dtype=np.int32)
span = 2 * skip_window + 1 # [ skip_window target skip_window ]
buffer = collections.deque(maxlen=span)
for _ in range(span):
buffer.append(data[data_index])
data_index = (data_index + 1) % len(data)
for i in range(batch_size // num_skips):
target = skip_window # target label at the center of the buffer
targets_to_avoid = [ skip_window ]
for j in range(num_skips):
while target in targets_to_avoid:
target = random.randint(0, span - 1)
targets_to_avoid.append(target)
batch[i * num_skips + j] = buffer[skip_window]
labels[i * num_skips + j, 0] = buffer[target]
buffer.append(data[data_index])
data_index = (data_index + 1) % len(data)
return batch, labels
print('data:', [reverse_dictionary[di] for di in data[:8]])
for num_skips, skip_window in [(2, 1), (4, 2)]:
data_index = 0
batch, labels = generate_batch(batch_size=8, num_skips=num_skips, skip_window=skip_window)
print('\nwith num_skips = %d and skip_window = %d:' % (num_skips, skip_window))
print(' batch:', [reverse_dictionary[bi] for bi in batch])
print(' labels:', [reverse_dictionary[li] for li in labels.reshape(8)])
batch_size = 128
embedding_size = 128 # Dimension of the embedding vector.
skip_window = 1 # How many words to consider left and right.
num_skips = 2 # How many times to reuse an input to generate a label.
# We pick a random validation set to sample nearest neighbors. here we limit the
# validation samples to the words that have a low numeric ID, which by
# construction are also the most frequent.
valid_size = 16 # Random set of words to evaluate similarity on.
valid_window = 100 # Only pick dev samples in the head of the distribution.
valid_examples = np.array(random.sample(range(valid_window), valid_size))
num_sampled = 64 # Number of negative examples to sample.
graph = tf.Graph()
with graph.as_default(), tf.device('/cpu:0'):
# Input data.
train_dataset = tf.placeholder(tf.int32, shape=[batch_size])
train_labels = tf.placeholder(tf.int32, shape=[batch_size, 1])
valid_dataset = tf.constant(valid_examples, dtype=tf.int32)
# Variables.
embeddings = tf.Variable(
tf.random_uniform([vocabulary_size, embedding_size], -1.0, 1.0))
softmax_weights = tf.Variable(
tf.truncated_normal([vocabulary_size, embedding_size],
stddev=1.0 / math.sqrt(embedding_size)))
softmax_biases = tf.Variable(tf.zeros([vocabulary_size]))
# Model.
# Look up embeddings for inputs.
embed = tf.nn.embedding_lookup(embeddings, train_dataset)
# Compute the softmax loss, using a sample of the negative labels each time.
loss = tf.reduce_mean(
tf.nn.sampled_softmax_loss(softmax_weights, softmax_biases, embed,
train_labels, num_sampled, vocabulary_size))
# Optimizer.
# Note: The optimizer will optimize the softmax_weights AND the embeddings.
# This is because the embeddings are defined as a variable quantity and the
# optimizer's `minimize` method will by default modify all variable quantities
# that contribute to the tensor it is passed.
# See docs on `tf.train.Optimizer.minimize()` for more details.
optimizer = tf.train.AdagradOptimizer(1.0).minimize(loss)
# Compute the similarity between minibatch examples and all embeddings.
# We use the cosine distance:
norm = tf.sqrt(tf.reduce_sum(tf.square(embeddings), 1, keep_dims=True))
normalized_embeddings = embeddings / norm
valid_embeddings = tf.nn.embedding_lookup(
normalized_embeddings, valid_dataset)
similarity = tf.matmul(valid_embeddings, tf.transpose(normalized_embeddings))
I had the same issue, and it looks like two of the parameters passed to the loss function have been swapped around.
If you look at the TensorFlow documentation for 'sampled_softmax_loss' (https://www.tensorflow.org/api_docs/python/tf/nn/sampled_softmax_loss):
sampled_softmax_loss(
weights,
biases,
labels,
inputs,
num_sampled,
num_classes,
num_true=1,
sampled_values=None,
remove_accidental_hits=True,
partition_strategy='mod',
name='sampled_softmax_loss'
)
The third expected parameter is 'labels' and the fourth is 'inputs'. In the supplied code, these two parameters have been switched around. I'm a bit puzzled how this is possible; maybe this used to be different in an older version of TF. Anyway, swapping those two parameters around (as shown below) will solve the problem.
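Passing them by keyword makes the fix explicit:
loss = tf.reduce_mean(
    tf.nn.sampled_softmax_loss(weights=softmax_weights,
                               biases=softmax_biases,
                               labels=train_labels,  # was incorrectly in the 4th ('inputs') slot
                               inputs=embed,         # was incorrectly in the 3rd ('labels') slot
                               num_sampled=num_sampled,
                               num_classes=vocabulary_size))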
In machine learning, it is common to represent a categorical (specifically: nominal) feature with one-hot encoding. I am trying to learn how to use TensorFlow's embedding layer to represent a categorical feature in a classification problem. I have TensorFlow version 1.0.1 installed and I am using Python 3.6.
I am aware of the TensorFlow tutorial for word2vec, but it is not very instructive for my case: while building the tf.Graph, it uses NCE-specific weights and tf.nn.nce_loss.
I just want a simple feed-forward net with the input layer being an embedding. My attempt is below. It complains about shape incompatibility when I try to matrix-multiply the embedding with the hidden layer. Any ideas how I can fix this?
from __future__ import print_function
import pandas as pd
import tensorflow as tf
import numpy as np
from sklearn.preprocessing import LabelEncoder
if __name__ == '__main__':
# 1 categorical input feature and a binary output
df = pd.DataFrame({'cat2': np.array(['o', 'm', 'm', 'c', 'c', 'c', 'o', 'm', 'm', 'm']),
'label': np.array([0, 0, 1, 1, 0, 0, 1, 0, 1, 1])})
encoder = LabelEncoder()
encoder.fit(df.cat2.values)
X = encoder.transform(df.cat2.values)
Y = np.zeros((len(df), 2))
Y[np.arange(len(df)), df.label.values] = 1
# Neural net parameters
training_epochs = 5
learning_rate = 1e-3
cardinality = len(np.unique(X))
embedding_size = 2
input_X_size = 1
n_labels = len(np.unique(Y))
n_hidden = 10
# Placeholders for input, output
x = tf.placeholder(tf.int32, [None, 1], name="input_x")
y = tf.placeholder(tf.float32, [None, 2], name="input_y")
# Neural network weights
embeddings = tf.Variable(tf.random_uniform([cardinality, embedding_size], -1.0, 1.0))
h = tf.get_variable(name='h2', shape=[embedding_size, n_hidden],
initializer=tf.contrib.layers.xavier_initializer())
W_out = tf.get_variable(name='out_w', shape=[n_hidden, n_labels],
initializer=tf.contrib.layers.xavier_initializer())
# Neural network operations
embedded_chars = tf.nn.embedding_lookup(embeddings, x)
layer_1 = tf.matmul(embedded_chars,h)
layer_1 = tf.nn.relu(layer_1)
out_layer = tf.matmul(layer_1, W_out)
# Define loss and optimizer
cost = tf.reduce_mean(tf.nn.softmax_cross_entropy_with_logits(logits=out_layer, labels=y))
optimizer = tf.train.AdamOptimizer(learning_rate=learning_rate).minimize(cost)
# Initializing the variables
init = tf.global_variables_initializer()
# Launch the graph
with tf.Session() as sess:
sess.run(init)
for epoch in range(training_epochs):
avg_cost = 0.
# Run optimization op (backprop) and cost op (to get loss value)
_, c = sess.run([optimizer, cost],
feed_dict={x: X, y: Y})
print("Optimization Finished!")
EDIT:
Please see below the error message:
Traceback (most recent call last):
File "/home/anaconda3/lib/python3.6/site-packages/tensorflow/python/framework/common_shapes.py", line 671, in _call_cpp_shape_fn_impl
input_tensors_as_shapes, status)
File "/home/anaconda3/lib/python3.6/contextlib.py", line 89, in __exit__
next(self.gen)
File "/home/anaconda3/lib/python3.6/site-packages/tensorflow/python/framework/errors_impl.py", line 466, in raise_exception_on_not_ok_status
pywrap_tensorflow.TF_GetCode(status))
tensorflow.python.framework.errors_impl.InvalidArgumentError: Shape must be rank 2 but is rank 3 for 'MatMul' (op: 'MatMul') with input shapes: [?,1,2], [2,10].
Just make your x placeholder have size [None] instead of [None, 1].
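With a one-dimensional placeholder, tf.nn.embedding_lookup returns a rank-2 tensor that the matmul accepts; a minimal sketch of the affected lines:
x = tf.placeholder(tf.int32, [None], name="input_x")
embedded_chars = tf.nn.embedding_lookup(embeddings, x)  # shape [None, embedding_size]
layer_1 = tf.matmul(embedded_chars, h)                  # [None, embedding_size] x [embedding_size, n_hidden]
With shape [None, 1], the lookup returns a rank-3 tensor of shape [None, 1, embedding_size], which is exactly what the "Shape must be rank 2 but is rank 3" error above is complaining about.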
I'm trying to train an LSTM network; it trains successfully one way but throws an error the other way. In the first example I reshape the input array X using numpy's reshape, and in the other I reshape it using TensorFlow's reshape.
Works fine:
import numpy as np
import tensorflow as tf
import tensorflow.contrib.learn as learn
# Parameters
learning_rate = 0.1
training_steps = 3000
batch_size = 128
# Network Parameters
n_input = 4
n_steps = 10
n_hidden = 128
n_classes = 6
X = np.ones([1770,4])
y = np.ones([177])
# NUMPY RESHAPE OUTSIDE RNN_MODEL
X = np.reshape(X, (-1, n_steps, n_input))
def rnn_model(X, y):
# TENSORFLOW RESHAPE INSIDE RNN_MODEL
#X = tf.reshape(X, [-1, n_steps, n_input]) # (batch_size, n_steps, n_input)
# # permute n_steps and batch_size
X = tf.transpose(X, [1, 0, 2])
# # Reshape to prepare input to hidden activation
X = tf.reshape(X, [-1, n_input]) # (n_steps*batch_size, n_input)
# # Split data because rnn cell needs a list of inputs for the RNN inner loop
X = tf.split(0, n_steps, X) # n_steps * (batch_size, n_input)
# Define a GRU cell with tensorflow
lstm_cell = tf.nn.rnn_cell.BasicLSTMCell(n_hidden)
# Get lstm cell output
_, encoding = tf.nn.rnn(lstm_cell, X, dtype=tf.float32)
return learn.models.logistic_regression(encoding, y)
classifier = learn.TensorFlowEstimator(model_fn=rnn_model, n_classes=n_classes,
batch_size=batch_size,
steps=training_steps,
learning_rate=learning_rate)
classifier.fit(X,y)
Does not work:
import numpy as np
import tensorflow as tf
import tensorflow.contrib.learn as learn
# Parameters
learning_rate = 0.1
training_steps = 3000
batch_size = 128
# Network Parameters
n_input = 4
n_steps = 10
n_hidden = 128
n_classes = 6
X = np.ones([1770,4])
y = np.ones([177])
# NUMPY RESHAPE OUTSIDE RNN_MODEL
#X = np.reshape(X, (-1, n_steps, n_input))
def rnn_model(X, y):
# TENSORFLOW RESHAPE INSIDE RNN_MODEL
X = tf.reshape(X, [-1, n_steps, n_input]) # (batch_size, n_steps, n_input)
# # permute n_steps and batch_size
X = tf.transpose(X, [1, 0, 2])
# # Reshape to prepare input to hidden activation
X = tf.reshape(X, [-1, n_input]) # (n_steps*batch_size, n_input)
# # Split data because rnn cell needs a list of inputs for the RNN inner loop
X = tf.split(0, n_steps, X) # n_steps * (batch_size, n_input)
# Define a GRU cell with tensorflow
lstm_cell = tf.nn.rnn_cell.BasicLSTMCell(n_hidden)
# Get lstm cell output
_, encoding = tf.nn.rnn(lstm_cell, X, dtype=tf.float32)
return learn.models.logistic_regression(encoding, y)
classifier = learn.TensorFlowEstimator(model_fn=rnn_model, n_classes=n_classes,
batch_size=batch_size,
steps=training_steps,
learning_rate=learning_rate)
classifier.fit(X,y)
The latter throws the following error:
WARNING:tensorflow:<tensorflow.python.ops.rnn_cell.BasicLSTMCell object at 0x7f1c67c6f750>: Using a concatenated state is slower and will soon be deprecated. Use state_is_tuple=True.
Traceback (most recent call last):
File "/home/blabla/test.py", line 47, in <module>
classifier.fit(X,y)
File "/usr/local/lib/python2.7/dist-packages/tensorflow/contrib/learn/python/learn/estimators/base.py", line 160, in fit
monitors=monitors)
File "/usr/local/lib/python2.7/dist-packages/tensorflow/contrib/learn/python/learn/estimators/estimator.py", line 484, in _train_model
monitors=monitors)
File "/usr/local/lib/python2.7/dist-packages/tensorflow/contrib/learn/python/learn/graph_actions.py", line 328, in train
reraise(*excinfo)
File "/usr/local/lib/python2.7/dist-packages/tensorflow/contrib/learn/python/learn/graph_actions.py", line 254, in train
feed_dict = feed_fn() if feed_fn is not None else None
File "/usr/local/lib/python2.7/dist-packages/tensorflow/contrib/learn/python/learn/io/data_feeder.py", line 366, in _feed_dict_fn
out.itemset((i, self.y[sample]), 1.0)
IndexError: index 974 is out of bounds for axis 0 with size 177
A couple of suggestions:
* use input_fn instead of X, y with fit
* use learn.Estimator instead of learn.TensorFlowEstimator
Since you have small data, the following should work; otherwise you need to batch your data.
def _my_inputs():
    return tf.constant(np.ones([1770, 4])), tf.constant(np.ones([177]))
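A hedged sketch of how that might wire together (the contrib.learn Estimator API changed between releases, so treat the exact signatures as illustrative rather than definitive):
classifier = learn.Estimator(model_fn=rnn_model)
classifier.fit(input_fn=_my_inputs, steps=training_steps)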
I was able to get this working with a couple small changes:
# Parameters
learning_rate = 0.1
training_steps = 10
batch_size = 8
# Network Parameters
n_input = 4
n_steps = 10
n_hidden = 128
n_classes = 6
X = np.ones([177, 10, 4]) # <---- Use shape [batch_size, n_steps, n_input] here.
y = np.ones([177])
def rnn_model(X, y):
X = tf.transpose(X, [1, 0, 2]) #|
X = tf.unpack(X) #| These two lines do the same thing as your code, just a bit simpler ;)
# Define a LSTM cell with tensorflow
lstm_cell = tf.nn.rnn_cell.BasicLSTMCell(n_hidden)
# Get lstm cell output
outputs, _ = tf.nn.rnn(lstm_cell, X, dtype=tf.float64) # <---- I think you want to use the first return value here.
return tf.contrib.learn.models.logistic_regression(outputs[-1], y) # <----uses just the last output for classification, as is typical with RNNs.
classifier = tf.contrib.learn.TensorFlowEstimator(model_fn=rnn_model,
n_classes=n_classes,
batch_size=batch_size,
steps=training_steps,
learning_rate=learning_rate)
classifier.fit(X,y)
I think the central problem you were having was that X has to be shape [batch,...] when passed to fit(...). When you used numpy to reshape it outside the rnn_model() function, X had this shape so training worked.
I can't speak for the quality of the model this solution will produce, but at least it runs!