Clarification on tf.Tensor.set_shape() - tensorflow

I have an image with 478 x 717 x 3 = 1028178 pixels, currently stored as a rank-1 tensor. I verified this by calling tf.shape and tf.rank.
When I call image.set_shape([478, 717, 3]), it throws the following error.
"Shapes %s and %s must have the same rank" % (self, other))
ValueError: Shapes (?,) and (478, 717, 3) must have the same rank
I tested again by first setting the shape to (1028178,), but the error still occurs.
ValueError: Shapes (1028178,) and (478, 717, 3) must have the same rank
Well, that does make sense, because one is of rank 1 and the other is of rank 3. But why is it necessary to throw an error when the total number of pixels still matches?
I could of course use tf.reshape, and it works, but I think that's not optimal.
As stated in the TensorFlow FAQ:
What is the difference between x.set_shape() and x = tf.reshape(x)?
The tf.Tensor.set_shape() method updates the static shape of a Tensor
object, and it is typically used to provide additional shape
information when this cannot be inferred directly. It does not change
the dynamic shape of the tensor.
The tf.reshape() operation creates a new tensor with a different dynamic shape.
Creating a new tensor involves memory allocation, which could become costly as more training examples are involved. Is this by design, or am I missing something here?

As far as I know (and I wrote that code), there isn't a bug in Tensor.set_shape(). I think the misunderstanding stems from the confusing name of that method.
To elaborate on the FAQ entry you quoted, Tensor.set_shape() is a pure-Python function that improves the shape information for a given tf.Tensor object. By "improves", I mean "makes more specific".
Therefore, when you have a Tensor object t with shape (?,), that is a one-dimensional tensor of unknown length. You can call t.set_shape((1028178,)), and then t will have shape (1028178,) when you call t.get_shape(). This doesn't affect the underlying storage, or indeed anything on the backend: it merely means that subsequent shape inference using t can rely on the assertion that it is a vector of length 1028178.
If t has shape (?,), a call to t.set_shape((478, 717, 3)) will fail, because TensorFlow already knows that t is a vector, so it cannot have shape (478, 717, 3). If you want to make a new Tensor with that shape from the contents of t, you can use reshaped_t = tf.reshape(t, (478, 717, 3)). This creates a new tf.Tensor object in Python; the actual implementation of tf.reshape() does this using a shallow copy of the tensor buffer, so it is inexpensive in practice.
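As a minimal sketch of both behaviors (TF 1.x graph mode assumed):
import tensorflow as tf

t = tf.placeholder(tf.float32, shape=(None,))   # static shape (?,), rank 1
t.set_shape((1028178,))          # OK: makes the static shape more specific
print(t.get_shape())             # (1028178,)

# t.set_shape((478, 717, 3))     # ValueError: ranks differ (1 vs. 3)

reshaped_t = tf.reshape(t, (478, 717, 3))       # a new rank-3 Tensor
print(reshaped_t.get_shape())    # (478, 717, 3)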
One analogy is that Tensor.set_shape() is like a run-time cast in an object-oriented language like Java. For example, if you have a pointer to an Object but know that, in fact, it is a String, you might do the cast (String) obj in order to pass obj to a method that expects a String argument. However, if you have a String s and try to cast it to a java.util.Vector, the compiler will give you an error, because these two types are unrelated.

Related

tf.reshape with the tensor size raises mismatched number of values

I have the following code:
shape = tf.shape(tensor, out_type=tf.int64, name='sparse_shape')
nelems = tf.size(tensor, out_type=tf.int64, name='num_elements')
indices = tf.transpose(
    tf.unravel_index(tf.range(nelems, dtype=tf.int64), shape),
    name='sparse_indices')
values = tf.reshape(tensor, [nelems], name='sparse_values')
This code snippet simply transforms a dense tensor into a sparse tensor. However, I found that the reshape op sometimes raises an error at runtime:
tensorflow.python.framework.errors_impl.InvalidArgumentError: Input to reshape is a tensor with 906 values, but the requested shape has 1024
It's hard to write a simple demo that reproduces this bad case, so please understand that I cannot provide one.
But notice that my code is very simple. The reshape op simply reshapes the tensor into a 1-D tensor whose dimension is the tensor's size, i.e. the number of elements of the tensor (as described in TensorFlow's docs). And to my mind, the number of elements here is exactly the "number of values" in the error message. Thus the above error should never appear.
I tried to use the product of the shape dimensions as the target dimension size instead of tf.size, but it was no use:
shape = tf.shape(tensor, out_type=tf.int64, name='sparse_shape')
# use the product of the shape as the number of elements
nelems = tf.reduce_prod(shape, name='num_elements')
....
values = tf.reshape(tensor, [nelems], name='sparse_values')
So my question is: why is it possible that, for a certain tensor tensor, tf.size(tensor) or tf.shape(tensor) does not report the actual number of elements of tensor? Have I missed anything? Thanks.
I have figured out the problem myself.
Problem:
In my project, tensor is produced by a third-party library. The library calls tensor.set_shape([1024]) before returning tensor, even though it cannot guarantee that tensor actually contains 1024 elements.
Looking at TensorFlow's Python frontend implementation, when the static shape is fully determined, tf.shape and tf.size take a fast path: instead of actually running the Shape or Size op, they return a constant tensor built from the statically determined shape dimensions.
As a result, in my case the shape is statically fully determined as [1024], so the code takes the fast path and returns tf.constant([1024]). However, the real shape of the tensor in the backend is [906].
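A minimal sketch of the effect (TF 1.x graph mode assumed; the set_shape call below stands in for what the third-party library did):
import tensorflow as tf

x = tf.placeholder(tf.float32, shape=[None])
x.set_shape([1024])    # static claim that the backend cannot verify
n = tf.size(x)         # fast path: folded into the constant 1024
y = tf.reshape(x, [n])

with tf.Session() as sess:
    # Feeding only 906 values: reshape fails because n is already 1024.
    sess.run(y, feed_dict={x: [0.0] * 906})
    # InvalidArgumentError: Input to reshape is a tensor with 906 values,
    # but the requested shape has 1024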
Solution
Looking at the same code, we can see that tf.shape and tf.size actually call shape_internal and size_internal, defined in tensorflow.python.ops.array_ops. These functions take one extra argument, optimize, with a default value of True; if optimize is False, the fast path is skipped.
So the solution is to replace the tf.shape or tf.size with shape_internal or size_internal, and pass optimize=False.
# internal functions are not exposed by `tensorflow` root package
# so we have to import the `array_ops` module manually
from tensorflow.python.ops import array_ops
....
shape = tf.shape(tensor, out_type=tf.int64, name='sparse_shape')
#nelems = tf.size(tensor, out_type=tf.int64, name='num_elements')
nelems = array_ops.size_internal(tensor, optimize=False, out_type=tf.int64, name='num_elements')
....
values = tf.reshape(tensor, [nelems], name='sparse_values')

In TensorFlow, what is the difference between name, ^name, and name:digits? [duplicate]

I wonder if this is the correct understanding:
All tensors are derived from some operation, and operations are either given a name in the constructor or the default name for that kind of operation. If the name is not unique, TensorFlow automatically handles this by appending "_1", "_2", and so on. An operation with n tensor outputs names these tensors "op_name:0", "op_name:1", ..., "op_name:n-1".
One problem seems to arise: if x is a tf.Variable, then x.name gives "variable_name:0". This is confusing: to what does "variable_name" refer?
Your observations on Tensor naming are absolutely correct: the name of a Tensor is the concatenation of
the name of the operation that produced it,
a colon (:), and
the index of that tensor in the outputs of the operation that produced it.
Therefore the tensor named "foo:2" is the output of the op named "foo" at position 2 (with indices starting from zero).
The naming of tf.Variable objects is slightly strange. Every tf.Variable contains a mutable tensor that holds the state of the variable (and a few other tensors). A "Variable" op (which has the name "variable_name" in your example) "produces" this mutable tensor as its 0th output each time it is run, so the name of the mutable tensor is "variable_name:0".
Since a tf.Variable is mostly indistinguishable from a tf.Tensor—in that it can be used in the same places—we took the decision to make variable names resemble tensor names, so the Variable.name property returns the name of the mutable tensor. (This contrasts with tf.QueueBase and tf.ReaderBase objects, which are not usable directly as tensors (instead you have to call methods on them to create ops that operate on their state), so these do not have a tensor-like name.)
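A short sketch of the naming scheme (TF 1.x graph mode assumed):
import tensorflow as tf

a = tf.constant(1.0, name="foo")
print(a.op.name)  # "foo"
print(a.name)     # "foo:0", the 0th output of the op named "foo"

v = tf.Variable(2.0, name="bar")
print(v.name)     # "bar:0", the name of the variable's mutable tensor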

What exactly qualifies as a 'Tensor' in TensorFlow?

I am new to TensorFlow. I just went through the eager execution tutorial and came across the tf.decode_csv function. Not knowing it, I read the documentation: https://www.tensorflow.org/api_docs/python/tf/decode_csv
I don't really understand it.
The documentation says 'records: A Tensor of type string.'
So, my question is: What qualifies as a 'Tensor'?
I tried the following code:
dec_res = tf.decode_csv('0.1,0.2,0.3', [[0.0], [0.0], [0.0]])
print(dec_res, type(dec_res))
l = [[1,2,3],[4,5,6],[7,8,9]]
r = tf.reshape(l, [9,-1])
print(l, type(l))
print(r, type(r))
So the list dec_res contains tf.Tensor objects. That seems reasonable to me. But is an ordinary string also a 'Tensor' according to the documentation?
Then I tried something else with the tf.reshape function. The documentation (https://www.tensorflow.org/api_docs/python/tf/reshape) says 'tensor: A Tensor.' So l is supposed to be a tensor, but it is not of type tf.Tensor; it is simply a Python list. This is confusing.
Then the documentation says
Returns:
A Tensor. Has the same type as tensor.
But the type of l is list, while the type of r is tensorflow.python.framework.ops.Tensor. So the types are not the same.
Then I thought that TensorFlow might be very generous about what counts as a tensor. So I tried:
class car(object):
    def __init__(self, color):
        self.color = color

red_car = car('red')
#test_reshape = tf.reshape(red_car, [1, -1])
print(red_car.color)  # to check that red_car exists
Now, the commented-out line results in an error.
So, can anyone help me to find out, what qualifies as a 'Tensor'?
P.S.: I tried to read the source code of tf.reshape as given in the documentation
Defined in tensorflow/python/ops/gen_array_ops.py.
But this file does not exist in the GitHub repo. Does anyone know how to read it?
https://www.tensorflow.org/programmers_guide/tensors
TensorFlow, as the name indicates, is a framework to define and run
computations involving tensors. A tensor is a generalization of
vectors and matrices to potentially higher dimensions. Internally,
TensorFlow represents tensors as n-dimensional arrays of base
datatypes.
What you are observing comes from the fact that TensorFlow operations (like reshape) can be built from various Python types using the function tf.convert_to_tensor:
https://www.tensorflow.org/api_docs/python/tf/convert_to_tensor
All standard Python op constructors apply this function to each of
their Tensor-valued inputs, which allows those ops to accept numpy
arrays, Python lists, and scalars in addition to Tensor objects.
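A minimal sketch, reusing the list l from the question:
import tensorflow as tf

l = [[1, 2, 3], [4, 5, 6], [7, 8, 9]]
t = tf.convert_to_tensor(l)
print(type(t))  # <class 'tensorflow.python.framework.ops.Tensor'>
# tf.reshape applies the same conversion implicitly, which is why a plain
# Python list is accepted where the docs say "tensor: A Tensor".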

TensorFlow shape checker

Unlike most programming languages, TensorFlow does not regard the shape of an array as part of the type. The downside is that if you make a mistake and accidentally provide data of the wrong shape, the program may silently compute a wrong answer (e.g. "Slightly different shape converges to wrong number - why?"), which makes debugging difficult.
Does there exist a shape checker for TF? That is, a function or program that can analyze a graph (with sample feed_dict if need be) and raise the alarm if there is a shape mismatch?
TensorFlow does offer a shape-checking mechanism: the shape argument you can specify when declaring placeholders. By default (shape=None), a placeholder accepts data of any shape. But if you do specify the shape when declaring your placeholders, a shape error will be raised whenever data of an incorrect or conflicting shape is fed. For example,
let's say I declare a placeholder named X and specify its shape argument:
X=tf.placeholder(dtype=tf.float32, shape=[None,256])
Now, this means that the number of rows of X can vary, but the number of features will always be 256. If I mistakenly feed data with, say, 1000 rows and 20 features, a shape error will be raised.
Also, check this link: https://www.tensorflow.org/api_docs/python/tf/placeholder
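As a minimal sketch (TF 1.x graph mode assumed), the mismatch described above is caught at feed time:
import numpy as np
import tensorflow as tf

X = tf.placeholder(dtype=tf.float32, shape=[None, 256])
total = tf.reduce_sum(X)

with tf.Session() as sess:
    # Feeding 1000 x 20 data conflicts with the declared shape [None, 256]
    # and raises a ValueError at feed time.
    sess.run(total, feed_dict={X: np.zeros((1000, 20))})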
Use:
print(tf.shape(test_tensor))  # where test_tensor is whatever your tensor's name is
Documentation available here: https://www.tensorflow.org/api_docs/python/tf/shape
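Note that in graph mode this prints the symbolic shape tensor, not its value; to see the actual dimensions you need to evaluate it (a sketch, TF 1.x assumed):
import tensorflow as tf

test_tensor = tf.zeros([3, 4])
shape_op = tf.shape(test_tensor)
print(shape_op)                # a symbolic Tensor of shape (2,), dtype int32
with tf.Session() as sess:
    print(sess.run(shape_op))  # [3 4]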

Tensorflow error using while_loop: "List of Tensors when single Tensor expected"

I'm getting a TypeError("List of Tensors when single Tensor expected") when I run a Tensorflow while_loop. The error is from the third parameter, which should be a list of Tensors, according to the documentation. x, W, Win, Y, temp, and Wout are all previously declared as floats and arrays of floats. cond2 and test2 are functions I've written to be the condition and body. I use an almost identical call earlier in the program with no issues.
t = 0
t, x, W, Win, Y, temp, Wout = sess.run(tf.while_loop(
    cond2, test2,
    [t, tf.Variable(x), tf.constant(W),
     tf.constant(Win), tf.Variable(Y),
     tf.Variable(temp), tf.constant(Wout)],
    shape_invariants=[tf.TensorShape(None),
                      tf.TensorShape(None),
                      tf.TensorShape(None),
                      tf.TensorShape(None),
                      tf.TensorShape(None),
                      tf.TensorShape(None),
                      tf.TensorShape(None)]))
I fixed the error by removing the tf.constant() for Wout, since Wout was already declared as a tensor.
This would be easier to diagnose with (a) your definitions for the condition and body, and (b) the full error output from TensorFlow (it usually also dumps the input tensors when issuing these errors).
With that said, the source of the problem seems to be that TensorFlow is viewing your loop_vars list as a single Tensor, and/or that your cond2 and test2 functions each accept only a single argument. If neither of these is true, then providing more detail would help answer the question (specifically the full error message and the definition of every value/tensor/function you're passing to tf.while_loop). I've found that the majority of while_loop errors can be fixed by paying attention to the tensors in the error output.
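For reference, a minimal self-contained while_loop call looks like this (a sketch assuming TF 1.x graph mode; note that cond and body must each accept the full set of loop variables):
import tensorflow as tf

# cond and body take the loop variables (i, acc) and return,
# respectively, a boolean and the updated loop variables.
def cond(i, acc):
    return i < 5

def body(i, acc):
    return i + 1, acc + tf.cast(i, tf.float32)

loop = tf.while_loop(cond, body, [tf.constant(0), tf.constant(0.0)])

with tf.Session() as sess:
    print(sess.run(loop))  # [5, 10.0]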
The while_loop can throw pretty confusing errors at times, so I'd like to help; I'll check back and update my answer if more info is provided.