How to apply tf.map_fn to a sequence feature? Getting the error: TensorArray dtype is string but Op is trying to write dtype uint8

I am writing a sequence-to-sequence model that maps video to text. I have the frames of the video encoded as JPEG strings in a sequence feature of the SequenceExample proto. When building my input pipeline, I do the following to get an array of decoded JPEGs:
encoded_video, caption = parse_sequence_example(
    serialized_sequence_example,
    video_feature="video/frames",
    caption_feature="video/caption_ids")
decoded_video = tf.map_fn(lambda x: tf.image.decode_jpeg(x, channels=3), encoded_video)
However, I am getting the following error:
InvalidArgumentError (see above for traceback): TensorArray dtype is string but Op is trying to write dtype uint8.
My goal is to apply image = tf.image.convert_image_dtype(image, dtype=tf.float32) after decoding, to convert the uint8 pixel values in [0, 255] to float values in [0, 1].
I tried the following:
decoded_video = tf.map_fn(lambda x: tf.image.decode_jpeg(x, channels=3), encoded_video, dtype=tf.uint8)
converted_video = tf.map_fn(lambda x: tf.image.convert_image_dtype(x, dtype=tf.float32), decoded_video)
However, I still get the same error. Does anybody have an idea what might be going wrong? Thanks in advance.

Never mind. I just had to explicitly pass dtype=tf.float32 in the following line:
converted_video = tf.map_fn(lambda x: tf.image.convert_image_dtype(x, dtype=tf.float32), decoded_video, dtype=tf.float32)

Related

rescale image in tensorflow to fall between [0,1]

I am fairly new to TensorFlow and I have a TFLite model which needs to run inference on a single image (i.e. no datasets). The docs say the input should be 224x224x3 and scaled to [0, 1] (https://www.tensorflow.org/lite/tutorials/model_maker_image_classification#advanced_usage), but I am having trouble doing this rescaling to [0, 1].
Currently I have something like so:
img = tf.io.read_file(image_path)
img = tf.io.decode_image(img, channels=3)
img = tf.image.convert_image_dtype(img, tf.uint8)
print('min max img value',tf.reduce_min(img),tf.reduce_max(img))
The min and max are 0 and 255, respectively. I would like to scale this to [0, 1].
I am on TF 2.5 and I do not see a built-in method to do this.
I tried doing this:
img = tf.io.read_file(image_path)
img = tf.io.decode_image(img, channels=3)
scale=1./255
img=img*scale
img = tf.image.convert_image_dtype(img, tf.uint8)
print('min max img value',tf.reduce_min(img),tf.reduce_max(img))
and I get thrown:
TypeError: Cannot convert 0.00392156862745098 to EagerTensor of dtype uint8
I think there is some casting error :(
To avoid the
TypeError: Cannot convert 0.00392156862745098 to EagerTensor of dtype uint8
error, you have to cast img from tf.uint8 to tf.float32 before scaling, like
img = tf.cast(img, dtype=tf.float32) / 255.0
print('min max img value', tf.reduce_min(img), tf.reduce_max(img))
Converting an image tensor that is already normalized to the [0, 1] scale in tf.float32 back to tf.uint8 is probably not a good idea.

split tensors based on size not known at graph time

In TensorFlow I want to do the following:
receive N 1D tensors
concat them into one big 1D tensor of shape [m]
call a function that processes this tensor and generates a tensor of shape [m]
split the resulting tensor back into N 1D tensors
However at graph creation time, I don't know the size of each of the 1D tensors, which creates issues. Here's a snippet of what I'm doing:
def stack(tensors):
    sizes = tf.convert_to_tensor([t.shape[0].value for t in tensors])
    tensor_stacked = tf.concat(tensors, axis=0)
    res = my_function(tensor_stacked)
    return tf.split(res, sizes, 0)

tensor_A = tf.placeholder(
    tf.int32,
    shape=[None],
    name=None
)
tensor_B = tf.placeholder(
    tf.int32,
    shape=[None],
    name=None
)
res = stack([tensor_A, tensor_B])
This will fail on the "concat" line with the message
TypeError: Failed to convert object of type to Tensor. Contents: [None, None]. Consider casting elements to a supported type.
Is there any way I can do this in TensorFlow? At graph time the "sizes" variable will always contain unknown sizes, because the length of the 1D tensors is never known.
OK, in the meantime I found the answer.
Apparently it's enough to replace the call to tensor.shape[0] with tf.shape(tensor)[0].
So now I have:
def stack(tensors):
    sizes = tf.convert_to_tensor([tf.shape(t)[0] for t in tensors])
    print(sizes)
    tensor_stacked = tf.concat(tensors, axis=0)
    res = my_function(tensor_stacked)
    return tf.split(res, sizes, 0)

TensorFlow DecodePng throws Value Error

For decoding a PNG image, we normally use the following segment of code.
image_placeholder = tf.placeholder(tf.string)
image_tensor = tf.read_file(image_placeholder)
image_tensor = tf.image.decode_png(image_tensor, channels=1)
For deploying a model with TensorFlow Serving, I followed the Inception_saved_model example for my own version of the model. Below is the code used in that program to read the incoming TensorProto.
image_placeholder = tf.placeholder(tf.string, name='images')
feature_configs = {'images': tf.FixedLenFeature(shape=[], dtype=tf.string), }
tf_example = tf.parse_example(image_placeholder, feature_configs)
image_tensor = tf_example['images']
image_tensor = tf.image.decode_png(image_tensor, channels=1)
When I use this code, decode_png throws a ValueError:
ValueError: Shape must be rank 0 but is rank 1 for 'DecodePng' (op: 'DecodePng') with input shapes: [?].
Can someone help me on where I am going wrong? The code I presented here is similar to the one given in the Inception example.
tf.parse_example operates on a batch ("rank 1"), and decode_png expects a single image (a scalar string, "rank 0"). I'd either use tf.parse_single_example or add a reshape to scalar (shape=[]) before using decode_png.

List of Tensors when single Tensor expected

I use concat to build a batch of tensors as the input of a CNN, but I get the error: List of Tensors when single Tensor expected
image_raw = img.tobytes()
image = tf.decode_raw(image_raw, tf.uint8)
image = tf.reshape(image, [1, image_height, image_width, 3])
image_val = image
for i in range(batch_size - 1):
    image_val = tf.concat(0, [image_val, image])
return image_val
I have searched the answers to this question and added image_val = tf.stack([image_val], 0) before the return, but I still get the same error. Why?
Build environment:
TensorFlow version 0.12
python 3.5
The error List of Tensors when single Tensor expected comes from the fact that you wrote tf.concat(0, [image_val, image]) instead of tf.concat([image_val, image], 0).
Maybe check again the type of image_height, image_width because sometimes it is necessary to cast these into an integer dtype, e.g. tf.cast(image_height, tf.int32)

Feeding dtype np.float32 to TensorFlow placeholder

I am trying to feed a numpy ndarray of dtype float32 to a TensorFlow placeholder, but it gives me the following error:
You must feed a value for placeholder tensor 'Placeholder' with dtype float
My placeholders are defined as:
n_steps = 10
n_input = 13
n_classes = 1201
x = tf.placeholder("float", [None, n_steps, n_input])
y = tf.placeholder("float", [None, n_classes])
And the line it's giving me the above error is:
sess.run(optimizer, feed_dict={x: batch_x, y: batch_y})
where my batch_x and batch_y are numpy ndarrays of dtype('float32'). The following are the types that I printed using pdb:
(Pdb) batch_x.dtype
dtype('float32')
(Pdb) x.dtype
tf.float32
I have also tried type-casting batch_x and batch_y to tf.float32, since x appears to be of dtype tf.float32, but running the code with the cast:
sess.run(optimizer, feed_dict={x: tf.to_float(batch_x), y: tf.to_float(batch_y)})
gives the following error:
TypeError: The value of a feed cannot be a tf.Tensor object. Acceptable feed values include Python scalars, strings, lists, or numpy ndarrays.
How should I feed the placeholders? What type should I use?
Any help/advice will be much appreciated!
For your first problem, are you sure that batch_y is also float32? You only provide the trace for batch_x's type, and batch_y is more likely to be an integer type, since it appears to be a one-hot encoding of your classes.
For the second problem, what you are doing wrong is using tf.to_float, which is a tensor operation, on a regular numpy array. You should use a numpy cast instead:
sess.run(optimizer, feed_dict={x: batch_x.astype(np.float32), y: batch_y.astype(np.float32)})