How to implement tf.nn.sigmoid_cross_entropy_with_logits - tensorflow

I am currently learning TensorFlow, and I have run into an issue with tf.nn.sigmoid_cross_entropy_with_logits(labels=y, logits=logits). The function description says that labels and logits must be of the same type. I am using the function below to classify MNIST images. These are the key sections of my code:
X = tf.placeholder(tf.float32, shape=(None, n_inputs), name="X")
y = tf.placeholder(tf.int32, shape=(None), name="y")

def neuron_layer(X, W, b, n_neurons, name, activation=None):
    with tf.name_scope(name):
        n_inputs = int(X.get_shape()[1])
        stddev = 2 / np.sqrt(n_inputs)
        z = tf.matmul(X, W) + b
        if activation == "sigmoid":
            return tf.math.sigmoid(z)
        else:
            return z

with tf.name_scope("dnn"):
    hidden1 = neuron_layer(X, W1, b1, n_hidden1, "hidden", activation="sigmoid")
    logits = neuron_layer(hidden1, W2, b2, n_outputs, "outputs", activation="sigmoid")

with tf.name_scope("loss"):
    xentropy = tf.nn.sigmoid_cross_entropy_with_logits(labels=y, logits=logits)
    loss = tf.reduce_mean(xentropy, name="loss")
I get the error: input 'y' of 'Mul' Op has type int32 that does not match type float32 of argument 'x'.
If I change the placeholder to
y = tf.placeholder(tf.float32, shape=(None), name="y")
I get the error:
Value passed to parameter 'targets' has DataType float32 not in list of allowed values: int32, int64
Yet logits can only be float32 or float64. Please help me fix the issue. Thanks.

As mentioned in the comments, tf.nn.sigmoid_cross_entropy_with_logits is the wrong function. In your case you should use tf.nn.softmax_cross_entropy_with_logits instead (actually, that one yields a deprecation warning, so tf.nn.softmax_cross_entropy_with_logits_v2 is the correct one). Also note, as mentioned in the comments, that the point of these two functions is that they have a sigmoid (or softmax, respectively) built in, so your model shouldn't have any activation function on the last layer.
Regarding the issue: I just tried it with TensorFlow version 1.14.0. There, the issue still occurs if y has type int32, but it works smoothly if both y and logits have type float32.
It's kind of inconsistent that tf.nn.sigmoid_cross_entropy_with_logits does not perform this cast itself, while tf.nn.softmax_cross_entropy_with_logits has no issue with y being int32.
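A minimal sketch of the suggested fix, assuming the labels are fed as one-hot float32 vectors (variable names follow the question, and neuron_layer is the helper defined above):

y = tf.placeholder(tf.float32, shape=(None, n_outputs), name="y")  # one-hot, float32

with tf.name_scope("dnn"):
    hidden1 = neuron_layer(X, W1, b1, n_hidden1, "hidden", activation="sigmoid")
    logits = neuron_layer(hidden1, W2, b2, n_outputs, "outputs")  # no activation on the last layer

with tf.name_scope("loss"):
    xentropy = tf.nn.softmax_cross_entropy_with_logits_v2(labels=y, logits=logits)
    loss = tf.reduce_mean(xentropy, name="loss")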

Related

tf.reshape with the tensor size raises mismatched number of values

I have the following code:
shape = tf.shape(tensor, out_type=tf.int64, name='sparse_shape')
nelems = tf.size(tensor, out_type=tf.int64, name='num_elements')
indices = tf.transpose(
    tf.unravel_index(tf.range(nelems, dtype=tf.int64), shape),
    name='sparse_indices')
values = tf.reshape(tensor, [nelems], name='sparse_values')
This code snippet simply transforms a dense tensor into a sparse tensor. However, I found that the reshape op sometimes raises an error at runtime:
tensorflow.python.framework.errors_impl.InvalidArgumentError: Input to reshape is a tensor with 906 values, but the requested shape has 1024
It's hard to write a simple demo that reproduces this bad case, so please understand that I cannot provide a reproducible example.
But notice that my code is very simple. The reshape op simply reshapes the tensor into a 1D tensor whose single dimension is the tensor's size, i.e. the number of elements of the tensor (as described in TensorFlow's docs). In my mind, the number of elements here is exactly the number of values mentioned in the error message, so the above error should never appear.
I tried to use the product of the shape as the target dimension size instead of tf.size, but it was no use:
shape = tf.shape(tensor, out_type=tf.int64, name='sparse_shape')
# use the product of the shape dimensions as the number of elements
nelems = tf.reduce_prod(shape, name='num_elements')
....
values = tf.reshape(tensor, [nelems], name='sparse_values')
So my question is: why is it possible that, for a certain tensor tensor, tf.size(tensor) or tf.shape(tensor) does not report the actual number of elements of tensor? Can anyone point out what I have missed? Thanks.
I have figured out the problem myself.
Problem:
In my project, the problem is that tensor is produced by a third-party library. That library calls tensor.set_shape([1024]) before returning tensor, even though it cannot guarantee that tensor actually contains 1024 elements.
According to these codes in TensorFlow's Python frontend implementation, when the static shape is fully determined, tf.shape and tf.size take a fast path: instead of actually running the Shape or Size op, they return a constant tensor built from the statically determined shape.
As a result, in my case the static shape is "fully determined" as [1024], so the code takes the fast path and returns tf.constant([1024]), whereas the real shape of the Tensor object in the backend is [906].
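For what it's worth, here is a small hypothetical reproduction of this fast path going wrong (tf.py_func stands in for the third-party op: it really yields 906 values, while the declared static shape claims 1024):

import numpy as np
import tensorflow as tf

# stand-in for the third-party op: it actually produces 906 values...
tensor = tf.py_func(lambda: np.zeros(906, dtype=np.float32), [], tf.float32)
tensor.set_shape([1024])  # ...but the library declares a static shape of [1024]

nelems = tf.size(tensor, out_type=tf.int64)  # fast path: folded into a constant 1024
values = tf.reshape(tensor, [nelems])

with tf.Session() as sess:
    sess.run(values)  # InvalidArgumentError: 906 values, but the requested shape has 1024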
Solution:
According to the previously mentioned code, tf.shape and tf.size actually call shape_internal and size_internal, defined in tensorflow.python.ops.array_ops. These internal functions take one extra argument, optimize, which defaults to True; if optimize is False, the fast path is skipped.
So the solution is to replace tf.shape or tf.size with shape_internal or size_internal and pass optimize=False.
# internal functions are not exposed by the `tensorflow` root package,
# so we have to import the `array_ops` module manually
from tensorflow.python.ops import array_ops
....
shape = tf.shape(tensor, out_type=tf.int64, name='sparse_shape')
#nelems = tf.size(tensor, out_type=tf.int64, name='num_elements')
nelems = array_ops.size_internal(tensor, optimize=False, out_type=tf.int64, name='num_elements')
....
values = tf.reshape(tensor, [nelems], name='sparse_values')

How to return a Tensor type or an IndexedSlices type via tf.cond()?

I want to use the origin sparse tensor (tf.IndexedSlices type) when pct < 0.75, otherwise use a dense tensor (tf.Tensor type, created by tf.convert_to_tensor). Here is the code
def fn1():
    return tf.convert_to_tensor(sparse_gradient)

def fn2():
    return sparse_gradient

final_gradient = tf.cond(tf.less(pct, tf.constant(value=0.75, dtype=tf.float64)), fn1, fn2)
However, tf.cond requires fn1() and fn2() to have the same return type, so this code throws an error:
ValueError: The two structures don't have the same nested structure.
How can I fix this? The control flow is part of the computation graph, so I have to use tf.cond. Is there any other way to work it out?
I found that this is impossible in static graph mode (eager mode may not have this problem), because the output type is fixed when the graph is compiled, so we cannot pick a different type based on a runtime tensor value.
We can also see this in the merge function, which is one of the base ops of TensorFlow's control flow:
def merge(inputs, name=None):
    """
    ...
    This op handles both `Tensor`s and `IndexedSlices`. If inputs has a mix of
    `Tensor`s and `IndexedSlices`, all inputs are converted to IndexedSlices
    before merging.
    ...
    """

TFP Linear Regression yhat=model(x_tst) - doesn't work for other data

I cannot see the difference between what I am doing and the working Google TFP example, whose structure I am following. What am I doing wrong/should I be doing differently?
[Setup: Win 10 Home 64-bit 20H2, Python 3.7, TF2.4.1, TFP 0.12.2, running in Jupyter Lab]
I have been building a model step by step, following the example of TFP Probabilistic Layers Regression. The Case 1 code runs fine, but my parallel model doesn't, and I cannot see the difference that might cause
yhat = model(x_tst)
to fail with message Input 0 of layer sequential_14 is incompatible with the layer: : expected min_ndim=2, found ndim=1. Full shape received: (2019,) (which is the correct 1D size of x_tst)
For comparison: Google's load_dataset function for the TFP example returns y, x, x_tst, which are all np.ndarray of size 150, whereas I read data from a csv file with pandas.read_csv, split it into train_ and test_datasets and then take 1 col of data as independent variable 'g' and dependent variable 'redz' from the training dataset.
I know x, y, etc. need to be np.ndarray, but one does not create an ndarray directly, so I have...
x = np.array(train_dataset['g'])
y = np.array(train_dataset['redz'])
x_tst = np.array(test_dataset['g'])
where x, y, x_tst are all 1-dimensional - just like the TFP example.
The model itself runs
model = tf.keras.Sequential([
    tf.keras.layers.Dense(1),
    tfp.layers.DistributionLambda(lambda t: tfd.Normal(loc=t, scale=1)),
])

# Do inference.
model.compile(optimizer=tf.optimizers.Adam(learning_rate=0.01), loss=negloglik)
model.fit(x, y, epochs=1, verbose=False);
(When plotted, this gives the expected output for the Google data; with my data I don't get that far.)
But, per the example, when I try to "profit" by doing yhat = model(x_tst), I get the dimensions error given above.
What's wrong?
(If I try model.predict I think I hit a known bug/gap in TFP; it then fails the assert.)
Update - Explicit Reshape Resolves Issue
The hint from Frightera led to further investigation: x_tst had shape (2019,).
Reshaping via x_tst = x_tst.reshape(2019, 1) resolved the issue. Is TF inconsistent in its requirements, or is there some good reason that the explicit final dimension of 1 was required? Who knows. At least predictions can be made now.
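A minimal sketch of that reshape (using -1 so it works for any test-set length, not just 2019):

x_tst = x_tst.reshape(-1, 1)  # (2019,) -> (2019, 1): Keras Dense expects (batch, features)
yhat = model(x_tst)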
In this question, Difference between numpy.array shape (R, 1) and (R,), the OP asked about the difference between (R,) and (R,1), but the answers given did not address this specific point.
Similarly in this question: Difference between these array shapes in numpy.
I believe the answer lies in the numpy glossary, where it says of (n,) that:
A parenthesized number followed by a comma denotes a tuple with one element. The trailing comma distinguishes a one-element tuple from a parenthesized n.
Which, naturally, echoes the Python statements concerning tuples here
Thus an array of shape (R,) is a tuple describing an array as being 1D of a certain extent R, where the comma is appended to distinguish the tuple (R,) from the non-tuple (R).
However, for a 1D array there is no sense of row or column ordering; (R,1) is R rows by 1 column, whereas (1,R) would be 1 row of R columns. Although this shouldn't matter to a 1D iterator, either it does matter, or the iterator doesn't correctly recognise (R,) and treats it as 2D. (I don't know the technical details of that part, but these seem to be the only options that account for the behaviour.)
This issue is unrelated to the indeterminacy of size that occurs in tensor definition in Tensorflow. In the context of Tensorflow, Tensors (arrays) may have indeterminate shapes, so that more data may be added along a certain axis as processing occurs, e.g. in batches, in which case the initial Tensor shape includes a leading None to indicate where array expansion is expected to occur. (See e.g. tensor's shape here)
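A tiny numpy illustration of the distinction discussed above (independent of TensorFlow):

import numpy as np

a = np.zeros(3)       # shape (3,): 1D, no notion of rows or columns
b = np.zeros((3, 1))  # shape (3, 1): 3 rows, 1 column
c = np.zeros((1, 3))  # shape (1, 3): 1 row, 3 columns
print(a.shape, b.shape, c.shape)  # (3,) (3, 1) (1, 3)
print(len(a), len(b), len(c))     # 3 3 1 - iteration over the first axis differs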

What exactly qualifies as a 'Tensor' in TensorFlow?

I am new to TensorFlow and just went through the eager execution tutorial and came across the tf.decode_csv function. Not knowing about it, I read the documentation. https://www.tensorflow.org/api_docs/python/tf/decode_csv
I don't really understand it.
The documentation says 'records: A Tensor of type string.'
So, my question is: What qualifies as a 'Tensor'?
I tried the following code:
dec_res = tf.decode_csv('0.1,0.2,0.3', [[0.0], [0.0], [0.0]])
print(dec_res, type(dec_res))
l = [[1,2,3],[4,5,6],[7,8,9]]
r = tf.reshape(l, [9,-1])
print(l, type(l))
print(r, type(r))
So the list dec_res contains tf.Tensor objects. That seems reasonable to me. But is an ordinary string also a 'Tensor' according to the documentation?
Then I tried something else with the tf.reshape function. In the documentation https://www.tensorflow.org/api_docs/python/tf/reshape it says that 'tensor: A Tensor.' So l is supposed to be a tensor, but it is not of type tf.Tensor; it is simply a Python list. This is confusing.
Then the documentation says
Returns:
A Tensor. Has the same type as tensor.
But the type of l is list where the type of r is tensorflow.python.framework.ops.Tensor. So the types are not the same.
Then I thought that TensorFlow is very generous with things being a tensor. So I tried:
class car(object):
    def __init__(self, color):
        self.color = color

red_car = car('red')
#test_reshape = tf.reshape(red_car, [1, -1])
print(red_car.color)  # to check that red_car exists
Now, the commented-out line results in an error when it is uncommented.
So, can anyone help me to find out, what qualifies as a 'Tensor'?
P.S.: I tried to read the source code of tf.reshape as given in the documentation
Defined in tensorflow/python/ops/gen_array_ops.py.
But this file does not exist in the GitHub repo. Does anyone know how to read it?
https://www.tensorflow.org/programmers_guide/tensors
TensorFlow, as the name indicates, is a framework to define and run computations involving tensors. A tensor is a generalization of vectors and matrices to potentially higher dimensions. Internally, TensorFlow represents tensors as n-dimensional arrays of base datatypes.
What you are observing comes from the fact that TensorFlow operations (like reshape) can be built from various Python types using the function tf.convert_to_tensor:
https://www.tensorflow.org/api_docs/python/tf/convert_to_tensor
All standard Python op constructors apply this function to each of their Tensor-valued inputs, which allows those ops to accept numpy arrays, Python lists, and scalars in addition to Tensor objects.
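A small sketch of that conversion behaviour (nothing here is specific to the question's setup):

import numpy as np
import tensorflow as tf

print(tf.convert_to_tensor([[1, 2, 3], [4, 5, 6]]))  # Python list -> tf.Tensor
print(tf.convert_to_tensor(np.arange(9)))            # numpy array -> tf.Tensor
print(tf.convert_to_tensor(3.14))                    # Python scalar -> tf.Tensor

class Car(object):
    pass
# tf.convert_to_tensor(Car())  # raises an error: arbitrary objects are not convertible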

Tensorflow error using while_loop: "List of Tensors when single Tensor expected"

I'm getting a TypeError("List of Tensors when single Tensor expected") when I run a Tensorflow while_loop. The error is from the third parameter, which should be a list of Tensors, according to the documentation. x, W, Win, Y, temp, and Wout are all previously declared as floats and arrays of floats. cond2 and test2 are functions I've written to be the condition and body. I use an almost identical call earlier in the program with no issues.
t = 0
t, x, W, Win, Y, temp, Wout = sess.run(tf.while_loop(cond2, test2,
    [t, tf.Variable(x), tf.constant(W),
     tf.constant(Win), tf.Variable(Y),
     tf.Variable(temp), tf.constant(Wout)],
    shape_invariants=[tf.TensorShape(None),
                      tf.TensorShape(None),
                      tf.TensorShape(None),
                      tf.TensorShape(None),
                      tf.TensorShape(None),
                      tf.TensorShape(None),
                      tf.TensorShape(None)]))
I fixed the error by removing the tf.constant() for Wout, since Wout was already declared as a tensor.
This would be easier to diagnose with (a) your definitions for the condition and body, and (b) the full error output from TensorFlow (it usually also outputs a full dump of the input tensors when issuing these errors).
With that said, the source of the problem seems to be that TensorFlow is viewing your loop_vars list as a single Tensor, and/or your cond2 and test2 functions only accept a single argument each. If neither of these is true, then providing more detail would help answer the question (specifically the full error message and the definition of every value/tensor/function you're passing to tf.while_loop). I've found that the majority of while_loop errors can be fixed by paying attention to the tensors in the error output.
The while_loop can throw pretty confusing errors at times, so I'd like to help; I'll check back and update/edit my answer if more info is provided.
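For reference, a minimal correctly structured while_loop in which the condition and body each accept one argument per loop variable (a generic sketch, not the question's actual cond2/test2):

import tensorflow as tf

def cond(i, acc):
    return tf.less(i, 10)

def body(i, acc):
    return i + 1, acc + tf.cast(i, tf.float32)

i0 = tf.constant(0)
acc0 = tf.constant(0.0)
i_final, acc_final = tf.while_loop(cond, body, [i0, acc0])

with tf.Session() as sess:
    print(sess.run([i_final, acc_final]))  # [10, 45.0]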