I'm a total beginner to TensorFlow, and I'm trying to multiply two matrices together, but I keep getting an exception that says:
ValueError: Shapes TensorShape([Dimension(2)]) and TensorShape([Dimension(None), Dimension(None)]) must have the same rank
Here's a minimal example:
import numpy as np
import tensorflow as tf
data = np.array([0.1, 0.2])
x = tf.placeholder("float", shape=[2])
T1 = tf.Variable(tf.ones([2,2]))
l1 = tf.matmul(T1, x)
init = tf.initialize_all_variables()
with tf.Session() as sess:
    sess.run(init)
    sess.run(l1, feed_dict={x: data})
Confusingly, the following very similar code works fine:
data = np.array([0.1, 0.2])
x = tf.placeholder("float", shape=[2])
T1 = tf.Variable(tf.ones([2,2]))
init = tf.initialize_all_variables()
with tf.Session() as sess:
    sess.run(init)
    print(sess.run(T1 * x, feed_dict={x: data}))
Can anyone point to what the issue is? I must be missing something obvious here..

The tf.matmul() op requires that both of its inputs are matrices (i.e. 2-D tensors)*, and doesn't perform any automatic conversion. Your T1 variable is a matrix, but your x placeholder is a length-2 vector (i.e. a 1-D tensor), which is the source of the error.
By contrast, the * operator (an alias for tf.multiply()) is a broadcasting element-wise multiplication. It will convert the vector argument to a matrix by following NumPy broadcasting rules.
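The difference between the two operations can be sketched in plain NumPy, whose broadcasting rules the * operator follows. This is only an illustration, not TensorFlow code. Note one assumption that does not carry over: NumPy's @ operator also accepts 1-D operands, whereas the tf.matmul() discussed here does not, so the vector is reshaped to a column explicitly.

```python
import numpy as np

T1 = np.ones((2, 2))        # 2-D: a matrix
x = np.array([0.1, 0.2])    # 1-D: a length-2 vector

# Element-wise product: x is broadcast across every row of T1.
elementwise = T1 * x

# Matrix product, with x first turned into an explicit 2x1 column.
column = x.reshape(2, 1)
product = T1 @ column

print(elementwise.shape, product.shape)  # (2, 2) (2, 1)
```

The element-wise result keeps the matrix shape, while the matrix product collapses the inner dimension, which is why the two snippets in the question behave so differently.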
To make your matrix multiplication work, you can either require that x is a matrix:
data = np.array([[0.1], [0.2]])
x = tf.placeholder(tf.float32, shape=[2, 1])
T1 = tf.Variable(tf.ones([2, 2]))
l1 = tf.matmul(T1, x)
init = tf.initialize_all_variables()
with tf.Session() as sess:
    sess.run(init)
    print(sess.run(l1, feed_dict={x: data}))
...or you could use the tf.expand_dims() op to convert the vector to a matrix:
data = np.array([0.1, 0.2])
x = tf.placeholder(tf.float32, shape=[2])
T1 = tf.Variable(tf.ones([2, 2]))
l1 = tf.matmul(T1, tf.expand_dims(x, 1))
init = tf.initialize_all_variables()
with tf.Session() as sess:
    # ...
* This was true when I first posted this answer, but tf.matmul() now also supports batched matrix multiplication, which still requires both arguments to have at least 2 dimensions. See the documentation for more details.
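As a rough picture of what batched matrix multiplication means, here is a NumPy sketch. np.matmul is used only as a stand-in (an assumption for illustration); the shapes and values are made up. The leading axis is treated as the batch, and the last two axes are multiplied as ordinary matrices.

```python
import numpy as np

batch = 3
A = np.ones((batch, 2, 2))                                    # three 2x2 matrices
v = np.arange(batch * 2, dtype=float).reshape(batch, 2, 1)    # three 2x1 columns

# Each A[i] is multiplied with the matching v[i]; the batch axis is preserved.
out = np.matmul(A, v)

print(out.shape)  # (3, 2, 1)
```

Every operand still has at least two dimensions here, which matches the requirement stated in the footnote.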


How to shape a Tensor array?

I have lately been vexed by the following error message:
ValueError: Cannot feed value of shape (2455040,) for Tensor 'Placeholder:0', which has shape '(2455040, ?)'
It is produced by running the following code:
# set up to feed an array of images [images, size_of_image]
x = tf.placeholder(tf.float32, [NUMPIXELS,None])
# Define loss and optimizer..why is this 2d?
y_ = tf.placeholder(tf.float32, [None,NUMCLASSES])
sess = tf.InteractiveSession()
tl = get_tensor_list()
for f, n in tl:
    str = '/users/me/downloads/train/' + f
    mm =
    mm = mm.convert('F')
    i = mma.flatten()  # now this is an array of floats of size NUMPIXELS
    sess.run(..., feed_dict={x: i, y_: n})  # <<DEATH
Somehow, that array is getting a shape that tf does not like [(x,) when it wants (x,?)]. How to satisfy the tensorgods in this case? The tensor must be what it must be for other mathematical reasons not discussed.
Reshaping the array might help:
i = mma.flatten().reshape((NUMPIXELS,1))
The error happens because the two tensors have different ranks: the fed array with shape (2455040,) has rank 1, while the placeholder with shape (2455040, ?) has rank 2.
You can do this:
x = tf.placeholder(tf.float32, [None])
x = tf.reshape(x, [NUMPIXELS,-1])
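To see the rank difference concretely, here is a small NumPy sketch. The 3x4 image and the tiny NUMPIXELS value are made-up stand-ins for the real data in the question.

```python
import numpy as np

NUMPIXELS = 12  # stand-in value; the question uses 2455040

image = np.zeros((3, 4), dtype=np.float32)  # hypothetical 3x4 image
flat = image.flatten()                      # rank 1: shape (12,)
column = flat.reshape((NUMPIXELS, 1))       # rank 2: shape (12, 1)

print(flat.ndim, flat.shape)      # 1 (12,)
print(column.ndim, column.shape)  # 2 (12, 1)
```

Only the rank-2 version is compatible with a placeholder declared as [NUMPIXELS, None].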

Set value of loss function when calculating/applying gradients

I am using TensorFlow as a part of a larger system where I want to apply the gradient updates in batches. Ideally I'd like to do something along the lines of (in pseudo-code):
grads_and_vars = tf.gradients(loss, [vars])
list_of_losses = [2, 1, 3, ...]
for loss_vals in list_of_losses:
    tf.apply_gradients(grads_and_vars, feed_dict={loss: loss_vals})
My loss function depends on earlier predictions from my neural network and takes a long time to compute, hence my need for this.
When you call tf.gradients, the grad_ys argument lets you specify custom values for the upstream part of the backprop graph. If you don't specify them, you end up with a node that assumes the upstream gradient is a tensor of 1's (a Fill node). So you could either call tf.gradients with a placeholder that lets you specify custom upstream values, or just feed the Fill node.
sess = tf.Session()
a = tf.constant(2.)
b = tf.square(a)
grads = tf.gradients(b, [a])
sess.run(grads, feed_dict={"gradients/Fill:0": 0})
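The effect of the upstream value is just the chain rule, which can be sketched in plain Python without TensorFlow. The function name square_backward is hypothetical and only mirrors the b = a^2 example; it is not part of any library.

```python
def square_backward(a, upstream=1.0):
    """Gradient of b = a**2 with respect to a, scaled by an upstream gradient."""
    local_grad = 2.0 * a           # db/da
    return upstream * local_grad   # chain rule: upstream * local

a = 2.0
default_grad = square_backward(a)      # upstream of 1, the default Fill behaviour
scaled_grad = square_backward(a, 0.5)  # custom upstream value
zeroed_grad = square_backward(a, 0.0)  # upstream of 0, as in the feed above

print(default_grad, scaled_grad, zeroed_grad)  # 4.0 2.0 0.0
```

Feeding a different upstream value therefore rescales every gradient that flows back through the graph, which is what makes it usable for injecting an externally computed loss signal.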
(Posted on behalf of the OP.)
Thanks for your suggestions Yaroslav! Below is the code I put together based on your suggestions. I think this solves my problem:
with tf.Session() as sess:
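    X = tf.placeholder("float", name="X")
    W = tf.Variable(1.0, name="weight")
    b = tf.Variable(0.5, name="bias")
    pred = tf.sigmoid(tf.add(tf.multiply(X, W), b))
    opt = tf.train.AdagradOptimizer(1.0)
    # grad_ys replaces the default upstream gradient of ones
    gvs = tf.gradients(pred, [W, b], grad_ys=0.5)
    train_step = opt.apply_gradients(zip(gvs, [W, b]))
    # variables (including the optimizer's slots) must be initialized before training
    sess.run(tf.global_variables_initializer())
    for i in range(50):
        val, _ = sess.run([pred, train_step], feed_dict={X: 2})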

restore a model trained with variable input length in tensorflow results in InvalidArgumentError

I am rather new to tensorflow and am currently experimenting with models of varying complexity. I have a problem with the save and restore functionality of the package. As far as I understood the tutorials, I should be able to restore a trained graph and run it with some new input at some later point. However, I get the following error when I try to do just that:
InvalidArgumentError (see above for traceback): Shape [-1,10] has negative dimensions
[[Node: Placeholder = Placeholder[dtype=DT_FLOAT, shape=[?,10], _device="/job:localhost/replica:0/task:0/cpu:0"]()]]
My understanding of the message is that the restored graph does not like one dimension to be left arbitrary, which in turn is necessary for practical cases where I don't know beforehand how large my input will be. A code snippet as a minimal example, producing the error above, can be found below. I know how to restore each tensor individually but this gets impractical pretty quickly when the models grow in complexity. I am thankful for any help I get and apologize in case my question is stupid.
import numpy as np
import tensorflow as tf
def generate_random_input():
    alist = []
    for _ in range(10):
        alist.append(np.random.uniform(-1, 1, 100))
    return np.array(alist).T
def generate_random_target():
    return np.random.uniform(-1, 1, 100)
x = tf.placeholder('float', [None, 10])
y = tf.placeholder('float')
# the model
w1 = tf.get_variable('w1', [10, 1], dtype=tf.float32, initializer=tf.contrib.layers.xavier_initializer(seed=1))
b1 = tf.get_variable('b1', [1], dtype=tf.float32, initializer=tf.contrib.layers.xavier_initializer(seed=1))
result = tf.add(tf.matmul(x, w1), b1, name='result')
loss = tf.reduce_mean(tf.losses.mean_squared_error(predictions=result, labels=y))
optimizer = tf.train.AdamOptimizer(0.03).minimize(loss)
saver = tf.train.Saver()
with tf.Session() as sess:
    sess.run(tf.global_variables_initializer())
    sess.run([optimizer, loss], feed_dict={x: generate_random_input(), y: generate_random_target()})
    saver.save(sess, 'file_name')
# now load the model in another session:
sess2 = tf.Session()
saver = tf.train.import_meta_graph('file_name.meta')
saver.restore(sess2, tf.train.latest_checkpoint('./'))
graph = tf.get_default_graph()
pred = graph.get_operation_by_name('result')
test_result = sess2.run(pred, feed_dict={x: generate_random_input()})
In the last line, you don't pass data for the label placeholder in feed_dict. So in that placeholder, the dimension shown as -1 is still -1 rather than the batch size. That's the cause.
I'm having the exact same problem as you. I'm importing and testing a bunch of different CNNs with different layer sizes and testing on various datasets. You can stick your model creation in a function like so and recreate it in your other code:
def create_model():
    x = tf.placeholder('float', [None, 10])
    y = tf.placeholder('float')
    w1 = tf.get_variable('w1', [10, 1], dtype=tf.float32, initializer=tf.contrib.layers.xavier_initializer(seed=1))
    b1 = tf.get_variable('b1', [1], dtype=tf.float32, initializer=tf.contrib.layers.xavier_initializer(seed=1))
    result = tf.add(tf.matmul(x, w1), b1, name='result')
    return x, y, result
x, y, result = create_model()
loss = tf.reduce_mean(tf.losses.mean_squared_error(predictions=result, labels=y))
optimizer = tf.train.AdamOptimizer(0.03).minimize(loss)
saver = tf.train.Saver()
with tf.Session() as sess:
    sess.run(tf.global_variables_initializer())
    sess.run([optimizer, loss], feed_dict={x: generate_random_input(), y: generate_random_target()})
    saver.save(sess, 'file_name')
# now load the model in another session:
sess2 = tf.Session()
# This stuff is optional if everything is the same scope
x, y, result = create_model()
saver = tf.train.Saver()
# loss = ... if you want loss
# Now just restore the weights and run
saver.restore(sess2, 'file_name')
test_result = sess2.run(result, feed_dict={x: generate_random_input()})
This is a bit tedious if I want to import many complex architectures with different dimensions. For our situation, I don't know if there's any other way to restore an entire model than to recreate that architecture first in your second session.

tensor flow character recognition with softmax results in accuracy 1 due to [NaN...NaN] prediction

I am trying to use the softmax regression method discussed in to recognize characters.
My code is as follows.
import numpy as np
import pandas as pd
import tensorflow as tf
train_data = pd.read_csv('CharDataSet/train.csv')
x = tf.placeholder(tf.float32, [None, 130])
W = tf.Variable(tf.zeros([130, 26]))
b = tf.Variable(tf.zeros([26]))
y = tf.nn.softmax(tf.matmul(x, W) + b)
y_ = tf.placeholder(tf.float32, [None, 26])
cross_entropy = tf.reduce_mean(-tf.reduce_sum(y_ * tf.log(y), reduction_indices=[1]))
train_step = tf.train.GradientDescentOptimizer(0.5).minimize(cross_entropy)
sess = tf.InteractiveSession()
sess.run(tf.global_variables_initializer())
for _ in range(10):
    batch_xs = train_data.iloc[:, 2:]
    batch_ys = getencodedbatch(train_data.iloc[:, 1])
    print(batch_ys)
    sess.run(train_step, feed_dict={x: batch_xs, y_: batch_ys})
However, I am getting an accuracy of 1, which shouldn't be the case.
The reason is that my y tensor evaluates to an array like
[nan, ..., nan]
Can anyone explain to me what is wrong in my code?
I converted each character to a one-hot encoding using the method below
def getencodedbatch(param):
    s = (param.shape[0], 26)
    y_encode = np.zeros(s)
    # print(y_encode)
    row = 0  # index of the sample currently being encoded
    for val in param:
        col = ord(val) - 97  # 'a' -> 0, ..., 'z' -> 25
        y_encode[row, col] = 1
        row += 1
    return pd.DataFrame(y_encode)
Here is the problem you are having:
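You set your initial weights and biases to 0. For a single softmax layer this alone is survivable (the first prediction is simply uniform, 1/26 per class), but it is not a good default, and with a learning rate of 0.5 the logits can grow very quickly, especially if the inputs are not scaled to a small range.
The result is that some entries of y end up as exactly 0.
You take the log of y, and the log of 0 is not defined (in floating point it evaluates to -inf, and 0 * -inf is NaN). Hence the NaN.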
Good luck!
Edit to tell you how to fix it: look for an example on classifying MNIST characters and see what they do. You probably want to initialise your weights to be random normals ;)
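The mechanics of the NaN can be reproduced in a few lines of NumPy. This is only a sketch with three classes instead of 26, and the extreme logits are made up to force saturation. It also shows that all-zero logits give a uniform softmax rather than zeros, so exact zeros in y come from saturated outputs.

```python
import numpy as np

def softmax(logits):
    shifted = logits - np.max(logits)  # subtract the max for numerical stability
    exps = np.exp(shifted)
    return exps / np.sum(exps)

label = np.array([0.0, 1.0, 0.0])  # one-hot target

uniform = softmax(np.zeros(3))                         # zero logits -> 1/3 each
saturated = softmax(np.array([0.0, -1000.0, 1000.0]))  # -> [0, 0, 1]

# Silence the divide-by-zero / invalid-value warnings that log(0) triggers.
with np.errstate(divide="ignore", invalid="ignore"):
    loss_uniform = -np.sum(label * np.log(uniform))      # finite, equals log(3)
    loss_saturated = -np.sum(label * np.log(saturated))  # 0 * -inf -> nan

print(loss_uniform, loss_saturated)
```

Once a single NaN appears in the loss, the gradient update spreads it to every weight, which is why the whole prediction vector turns into NaN.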

Tensorflow error for zero vector: dims must be a vector of int32

I am trying to initialize zero vectors in tensorflow as follows:
W = tf.Variable(tf.zeros([784, 10]))
b = tf.Variable(tf.zeros([10]))
init = tf.initialize_all_variables()
# Tensorflow run
sess = tf.Session()
sess.run(init)
However, I am getting the following error:
InvalidArgumentError: dims must be a vector of int32.
Can you please help me fix this problem?
It works for me if you do this:
W = tf.zeros([784, 10])
b = tf.zeros([10])
init = tf.initialize_all_variables()
# Tensorflow run
sess = tf.Session()
sess.run(init)
Also, if you do it the way you're doing it, you'd still need to initialize W and b later on anyway, as W below won't be initialized by the zeros tensor.
W = tf.Variable(tf.zeros([3,4]), name='x')
b = tf.Variable(W + 6, name='y')
model = tf.initialize_all_variables()
with tf.Session() as session:
    session.run(model)
    print(session.run(b))
    # Error: Attempting to use uninitialized value b
The example above will give an error but the one below won't and will give the correct answer.
W = tf.zeros([3,4], name='x')
b = tf.Variable(W + 6, name='y')
model = tf.initialize_all_variables()
with tf.Session() as session:
    session.run(model)
    print(session.run(b))
If you want to do it your way, with weights and biases (which I'm guessing is what W and b stand for), I suggest looking here.
Let me know if you still have any questions.