Reshape tensor using placeholder value - tensorflow

I want to reshape a tensor using the [int, -1] notation (to flatten an image, for example). But I don't know the first dimension ahead of time. One use case is train on a large batch, then evaluate on a smaller batch.
Why does this give the following error: got list containing Tensors of type '_Message'?
import tensorflow as tf
import numpy as np
x = tf.placeholder(tf.float32, shape=[None, 28, 28])
batch_size = tf.placeholder(tf.int32)
def reshape(_batch_size):
return tf.reshape(x, [_batch_size, -1])
reshaped = reshape(batch_size)
with tf.Session() as sess:
sess.run([reshaped], feed_dict={x: np.random.rand(100, 28, 28), batch_size: 100})
# Evaluate
sess.run([reshaped], feed_dict={x: np.random.rand(8, 28, 28), batch_size: 8})
Note: when I have the reshape outside of the function it seems to work, but I have very large models that I use multiple times, so I need to keep them in a function and pass the dim using an argument.

To make this work, replace the function:
def reshape(_batch_size):
return tf.reshape(x, [_batch_size, -1])
…with the function:
def reshape(_batch_size):
return tf.reshape(x, tf.pack([_batch_size, -1]))
The reason for the error is that tf.reshape() expects a value that is convertible to a tf.Tensor as its second argument. TensorFlow will automatically convert a list of Python numbers to a tf.Tensor but will not automatically convert a mixed list of numbers and tensors (such as a tf.placeholder())—instead raising the somewhat unintuitive error message you saw.
The tf.pack() op takes a list of objects convertible to a tensor, and converts each element individually, so it can handle the combination of a placeholder and an integer.

hi all the issue is due to Keras version. I tried above all without any success. Uninstall Keras and install via pip. It worked for me.
I was facing this error with Keras 1.0.2 & resolved with Keras 1.2.0
Hope this will help. Thank you

Related

K-Means of Tensorflow - Graph disconnected error

I am trying to write a function that runs KMeans on a dataset and outputs the cluster centroids. My aim is to use this in a custom keras layer, so I am using TensorFlow's implementation of KMeans that takes a tensor as the input dataset.
My problem however is that I can't make it work even as a standalone function. The problem comes from the fact that KMeans accepts a generator function that provides mini-batches instead of a plain tensor, but when I am using closure to do that, I get a graph disconnected error:
import tensorflow as tf # version: 2.4.1
from tensorflow.compat.v1.estimator.experimental import KMeans
#tf.function
def KMeansCentroids(inputs, num_clusters, steps, use_mini_batch=False):
# `inputs` is a 2D tensor
def input_fn():
# Each one of the lines below results in the same "Graph Disconnected" error. Tuples don't really needed but just to be consistent with the documentation
return (inputs, None)
return (tf.data.Dataset.from_tensor_slices(inputs), None)
return (tf.convert_to_tensor(inputs), None)
kmeans = KMeans(
num_clusters=num_clusters,
use_mini_batch=use_mini_batch)
kmeans.train(input_fn, steps=steps) # This is where the error happens
return kmeans.cluster_centers()
>>> x = tf.random.uniform((100, 2))
>>> c = KMeansCentroids(x, 5, 10)
The exact error is:
ValueError:
Tensor("strided_slice:0", shape=(), dtype=int32)
must be from the same graph as
Tensor("Equal:0", shape=(), dtype=bool)
(graphs are FuncGraph(name=KMeansCentroids, id=..) and <tensorflow.python.framework.ops.Graph object at ...>).
If I were to use a numpy dataset and convert to tensor inside the function, the code would work just fine.
Also, making input_fn() return directly tf.random.uniform((100, 2)) (ignoring the inputs argument), would again work. That's why I am guessing that tensorflow doesn't support closures since it needs to build the computation graph at the beginning.
But I don't see how to work around that.
Could it be a version error due to KMeans being a compat.v1.experimental module?
Note that the documentation of KMeans states for the input_fn():
The function should construct and return one of the following:
A tf.data.Dataset object: Outputs of Dataset object must be a tuple (features, labels) with same constraints as below.
A tuple (features, labels): Where features is a tf.Tensor or a dictionary of string feature name to Tensor and labels is a Tensor or a dictionary of string label name to Tensor. Both features and labels are consumed by model_fn. They should satisfy the expectation of model_fn from inputs.
The problem you're facing is more about invoking tensor outside the created graph. Basically, when you called the .train function, a new graph will be created and that is with the graph defined in that input_fn and the graph defined in the model_fn.
kmeans.train(input_fn, steps=steps)
And, after that all the tensors those coming outside these functions will be treated as outsiders and won't part of this new graph. That's why you're getting a graph disconnected error for trying to use outsider tensor. To resolve this, you need to create the necessary tensors within these graphs.
import tensorflow as tf
from tensorflow.compat.v1.estimator.experimental import KMeans
#tf.function
def KMeansCentroids(num_clusters, steps, use_mini_batch=False):
def input_fn(batch_size):
pinputs = tf.random.uniform((100, 2))
dataset = tf.data.Dataset.from_tensor_slices((pinputs))
dataset = dataset.shuffle(1000).repeat()
return dataset.batch(batch_size)
kmeans = KMeans(
num_clusters=num_clusters,
use_mini_batch=use_mini_batch)
kmeans.train(input_fn = lambda: input_fn(5),
steps=steps)
return kmeans.cluster_centers()
c = KMeansCentroids(5, 10)
Here is some more info for reading, 1. FYI, I tested your code with a few versions of tf > 2, and I don't think it's related to version error or something.
Re-mentioning here for future readers. An alternative of using KMeans within Keras layers:
tf_kmeans.py
ClusteringLayer

ValueError from tensorflow estimator RNNClassifier with gcloud ml-engine job

I am working on the task.py file for submitting a gcloud MLEngine job. Previously I was using tensorflow.estimator.DNNClassifier successfully to submit jobs with my data (which consists solely of 8 columns of sequential numerical data for cryptocurrency prices & volume; no categorical).
I have now switched to the tensorflow contrib estimator RNNClassifier. This is my current code for the relevant portion:
def get_feature_columns():
return [
tf.feature_column.numeric_column(feature, shape=(1,))
for feature in column_names[:len(column_names)-1]
]
def build_estimator(config, learning_rate, num_units):
return tf.contrib.estimator.RNNClassifier(
sequence_feature_columns=get_feature_columns(),
num_units=num_units,
cell_type='lstm',
rnn_cell_fn=None,
optimizer=tf.train.AdamOptimizer(learning_rate=learning_rate),
config=config)
estimator = build_estimator(
config=run_config,
learning_rate=args.learning_rate,
num_units=[32, 16])
tf.estimator.train_and_evaluate(estimator, train_spec, eval_spec)
However, I'm getting the following ValueError:
ValueError: All feature_columns must be of type _SequenceDenseColumn. You can wrap a sequence_categorical_column with an embedding_column or indicator_column. Given (type <class 'tensorflow.python.feature_column.feature_column_v2.NumericColumn'>): NumericColumn(key='LTCUSD_close', shape=(1,), default_value=None, dtype=tf.float32, normalizer_fn=None)
I don't understand this, as the data is not categorical.
As #Ben7 pointed out sequence_feature_columns accepts columns like sequence_numeric_column. However, according to the documentation, RNNClassifier sequence_feature_columns expects SparseTensors and sequence_numeric_column is a dense tensor. This seems to be contradictory.
Here is a workaround I used to solve this issue (I took the to_sparse_tensor function from this answer):
def to_sparse_tensor(dense):
# sequence_numeric_column default is float32
zero = tf.constant(0.0, dtype=tf.dtypes.float32)
where = tf.not_equal(dense, zero)
indices = tf.where(where)
values = tf.gather_nd(dense, indices)
return tf.SparseTensor(indices, values, tf.shape(dense, out_type=tf.dtypes.int64))
def get_feature_columns():
return [
tf.feature_column.sequence_numeric_column(feature, shape=(1,), normalizer_fn=to_sparse_tensor)
for feature in column_names[:len(column_names)-1]
]
you got this error because you use a numeric feature column whereas this kind of estimator can only accept sequence feature columns as you can see it on the init function.
So, instead of using numeric column you have to use sequence_numeric_column.

tensorflow Dataset.from_generator using an generator that yield tensors

I am trying to convert some code into the new dataset API so that I can use the distribution strategy. Below is what I am trying to do.
def dataset_generator():
while True:
features, labels = ex_lib.get_image_batch(), ex_lib.get_feature_batch()
yield features, labels
def get_ssf_input_fn():
def input_fn():
return tf.data.Dataset.from_generator(dataset_generator,
(tf.float32, tf.float32), ([None, config.image_height, config.image_width, config.image_channels], [None, 256]))
return input_fn
the problem is ex_lib.get_image_batch and ex_lib.get_feature_batch gives me a tensor instead of a numpy array, and I cannot change the code in ex_lib. Also I cannot convert the tensor to numpy array here since I have no access to the sess here. With this code, it will throw
`generator` yielded an element that could not be converted to the expected type. The expected type was float32, but the yielded element was Tensor("GetImageBatch:0", dtype=uint8)
Is there a way to let my input_fn return a Dataset instead?
I am able to work around this problem with the following trick. Its efficiency is OK.
tf.data.Dataset.from_tensors(0).repeat().map(lambda _: dataset_generator())

Output from TensorFlow `py_func` has unknown rank/shape

I am trying to create a simple neural net in TensorFlow. The only tricky part is I have a custom operation that I have implemented with py_func. When I pass the output from py_func to a Dense layer, TensorFlow complains that the rank should be known. The specific error is:
ValueError: Inputs to `Dense` should have known rank.
I don't know how to preserve the shape of my data when I pass it through py_func. My question is how do I get the correct shape? I have a simple example below to illustrate the problem.
def my_func(x):
return np.sinh(x).astype('float32')
inp = tf.convert_to_tensor(np.arange(5))
y = tf.py_func(my_func, [inp], tf.float32, False)
with tf.Session() as sess:
with sess.as_default():
print(inp.shape)
print(inp.eval())
print(y.shape)
print(y.eval())
The output from this snippet is:
(5,)
[0 1 2 3 4]
<unknown>
[ 0.
1.17520118 3.62686038 10.01787472 27.28991699]
Why is y.shape <unknown>? I want the shape to be (5,) the same as inp. Thanks!
Since py_func can execute arbitrary Python code and output anything, TensorFlow can't figure out the shape (it would require analyzing Python code of function body) You can instead give the shape manually
y.set_shape(inp.get_shape())

writing a custom cost function in tensorflow

I'm trying to write my own cost function in tensor flow, however apparently I cannot 'slice' the tensor object?
import tensorflow as tf
import numpy as np
# Establish variables
x = tf.placeholder("float", [None, 3])
W = tf.Variable(tf.zeros([3,6]))
b = tf.Variable(tf.zeros([6]))
# Establish model
y = tf.nn.softmax(tf.matmul(x,W) + b)
# Truth
y_ = tf.placeholder("float", [None,6])
def angle(v1, v2):
return np.arccos(np.sum(v1*v2,axis=1))
def normVec(y):
return np.cross(y[:,[0,2,4]],y[:,[1,3,5]])
angle_distance = -tf.reduce_sum(angle(normVec(y_),normVec(y)))
# This is the example code they give for cross entropy
cross_entropy = -tf.reduce_sum(y_*tf.log(y))
I get the following error:
TypeError: Bad slice index [0, 2, 4] of type <type 'list'>
At present, tensorflow can't gather on axes other than the first - it's requested.
But for what you want to do in this specific situation, you can transpose, then gather 0,2,4, and then transpose back. It won't be crazy fast, but it works:
tf.transpose(tf.gather(tf.transpose(y), [0,2,4]))
This is a useful workaround for some of the limitations in the current implementation of gather.
(But it is also correct that you can't use a numpy slice on a tensorflow node - you can run it and slice the output, and also that you need to initialize those variables before you run. :). You're mixing tf and np in a way that doesn't work.
x = tf.Something(...)
is a tensorflow graph object. Numpy has no idea how to cope with such objects.
foo = tf.run(x)
is back to an object python can handle.
You typically want to keep your loss calculation in pure tensorflow, so do the cross and other functions in tf. You'll probably have to do the arccos the long way, as tf doesn't have a function for it.
just realized that the following failed:
cross_entropy = -tf.reduce_sum(y_*np.log(y))
you cant use numpy functions on tf objects, and the indexing my be different too.
I think you can use "Wraps Python function" method in tensorflow. Here's the link to the documentation.
And as for the people who answered "Why don't you just use tensorflow's built in function to construct it?" - sometimes the cost function people are looking for cannot be expressed in tf's functions or extremely difficult.
This is because you have not initialized your variable and because of this it does not have your Tensor there right now (can read more in my answer here)
Just do something like this:
def normVec(y):
print y
return np.cross(y[:,[0,2,4]],y[:,[1,3,5]])
t1 = normVec(y_)
# and comment everything after it.
To see that you do not have a Tensor now and only Tensor("Placeholder_1:0", shape=TensorShape([Dimension(None), Dimension(6)]), dtype=float32).
Try initializing your variables
init = tf.initialize_all_variables()
sess = tf.Session()
sess.run(init)
and evaluate your variable sess.run(y). P.S. you have not fed your placeholders up till now.