Batch input to a certain layer in tensorflow

I'm working on a network based on Inception-v3. I trained the network successfully, and now I want to feed a batch of OpenCV images to it and get some output.
The original placeholder of the network accepts a string and decodes it as a JPEG. But I read the video frames with OpenCV and convert them into a list of numpy arrays:
for cnt in range(batch_size):
    frameBuffer = []
    if (currentPosition >= nFrames):
        break
    ret, frame = vidFile.read()
    img_data = np.asarray(cv2.cvtColor(frame, cv2.COLOR_BGR2RGB))
    frameBuffer.append(img_data)
    currentPosition += multiplier
If I want to work with a single image (because I read frames directly from OpenCV), I convert it to a numpy array and then feed it to the "Cast:0" tensor of the Inception network:
pred = sess.run([predictions], {'Cast:0': img_data})
Results are OK up to this point. But I want to feed a batch of frames, so I tried to build the feed_dict like this:
images = tf.placeholder(tf.float32, [batch_size,width,height, 3])
image_batch = tf.stack(frameBuffer)
feed_dict = {images: image_batch}
avgRepresentation, pred = sess.run([pool_avg, predictions],{'Cast:0': feed_dict})
But I got errors; I know I have a mistake in feeding the batch. Do you have any suggestions on how I can feed a batch of images to a certain layer of a network?

There is (at least) one problem with your feed_dict: a feed_dict is a dictionary whose keys are tensors or strings (tensor names) and whose values are given as ordinary Python types, numpy arrays, etc.
Here you're using {'Cast:0': feed_dict}, so the value of your dictionary is itself a dictionary, which makes no sense to TensorFlow. You need to put the values there, i.e. the concatenation of the images (decoded, converted, etc.). Also, sorry if I'm missing something, but I guess that frameBuffer should contain all the images of the batch, so it should be initialized outside the for loop.
This code should work:
frameBuffer = []
for cnt in range(batch_size):
    if (currentPosition >= nFrames):
        break
    ret, frame = vidFile.read()
    img_data = np.asarray(cv2.cvtColor(frame, cv2.COLOR_BGR2RGB))
    frameBuffer.append(img_data)
    currentPosition += multiplier

avgRepresentation, pred = sess.run([pool_avg, predictions], {'Cast:0': np.asarray(frameBuffer)})
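If the shapes still don't line up, a quick sanity check (a minimal sketch; the names 'Cast:0', pool_avg and predictions are taken from the question) is to look up the target tensor and compare its shape with the stacked frames before running:
cast_tensor = sess.graph.get_tensor_by_name('Cast:0')
print(cast_tensor.shape)            # shape the graph expects at this point

batch = np.asarray(frameBuffer)     # (num_frames, height, width, 3)
print(batch.shape)

avgRepresentation, pred = sess.run([pool_avg, predictions], {cast_tensor: batch})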

Related

How to make image cube for 3d convolution with file path

Recently I have been studying 3D convolution for video processing with TensorFlow.
I built a model following a tutorial blog, but now I want to use my own custom dataset. My input images have shape (128,128,3) and I want to build image cubes of shape (128,128,100,3). I am using tf.data.Dataset and tried to write a map function the way I did for 2D convolution. I want to build each image cube from a path array of shape (number of image cubes, 100) with a tf.data.Dataset map function, because I run out of memory when doing this with NumPy.
I tried code like the following:
def load_image(path):
    images = []
    for i, p in enumerate(path):
        image_string = tf.io.read_file(p)
        image = tf.io.decode_jpeg(image_string, channels=3)
        image = tf.reshape(image, [128, 128, 1, 3])
        image = image / 255
        images.append(image)
    image_block = tf.concat(images, axis=2)
    return image_block

train_data = tf.data.Dataset.from_tensor_slices(total_files)  # shape (1077, 100)
train_data = train_data.map(load_image, num_parallel_calls=tf.data.experimental.AUTOTUNE)
But I get an error saying the tensor's shape changes. I also tried tf.Variable with .assign, but got a similar error.
How can I build the 3D convolution's input image cube from paths? I am using TensorFlow 2.0.
You cannot iterate over a tensor like for x in tensor. Instead you can, for example, iterate over a range and get each value by index:
for x in range(tf.shape(tensor)[0]):
    y = tensor[x]
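Applied to the question's load_image, a minimal sketch of the same idea using tf.map_fn, so the per-path loop stays inside the graph (the name load_image_cube and the final transpose to (128,128,100,3) are my assumptions about the intended layout):
def load_image_cube(paths):  # paths: 1-D string tensor with the 100 frame paths of one cube
    def _load_one(p):
        image = tf.io.decode_jpeg(tf.io.read_file(p), channels=3)
        image = tf.reshape(image, [128, 128, 3])
        return tf.cast(image, tf.float32) / 255.0
    images = tf.map_fn(_load_one, paths, dtype=tf.float32)  # (100, 128, 128, 3)
    return tf.transpose(images, [1, 2, 0, 3])               # (128, 128, 100, 3)

train_data = tf.data.Dataset.from_tensor_slices(total_files)
train_data = train_data.map(load_image_cube, num_parallel_calls=tf.data.experimental.AUTOTUNE)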

Why can't I use my dataset anymore after using InceptionV3?

I'm currently working on video-captioning (frame-sequence to natural language).
I recently started using the tf.data.Dataset class instead of the feed_dict argument in TensorFlow.
My goal is to feed these frames to a pretrained CNN (InceptionV3), extract the feature vectors, and then feed them to my RNN seq2seq network.
I've got a problem with TensorFlow types after mapping my Dataset with the Inception model: the dataset becomes totally unusable, via either dataset.batch() or dataset.take(). I can't even make a one-shot iterator!
Here is how I proceed to build my Dataset:
Step 1: I first extract the same number of frames from every video and store them all in a numpy array of shape (nb_videos, nb_frames, width, height, channels).
Note that in this dataset, every video has the same size and has 3 color channels.
Step 2: Then I create a tf.data.Dataset object using this big numpy array
Note that printing this dataset via python gives:
With n_videos=2; width=240; height=320; channels=3
I already don't understand what "DataAdapter" stands for.
At this point, I can create a one-shot iterator, but using dataset.batch(1) returns:
I don't understand why the shape is "?" and not "1".
Step 3: I use the map function on the dataset to resize all the frames of all the videos to 299×299×3 (required by InceptionV3).
At this point, I can use the data in my dataset and make a one-shot iterator.
Step 4: I use the map function again to extract the features using the pretrained InceptionV3 model.
The problem occurs at this point:
Printing the dataset gives:
Ok looks good
However, it's now impossible to make a one-shot iterator for this dataset.
Step 1:
X_train_slice, Y_train = build_dataset(number_of_samples)
Step 2:
X_train = tf.data.Dataset.from_tensor_slices(X_train_slice)
Step 3:
def format_video(video):
    frames = tf.image.resize_images(video, (299, 299))
    frames = tf.keras.applications.inception_v3.preprocess_input(frames)
    return frames

X_train = X_train.map(lambda video: format_video(video))
Step 4:
Inception model:
image_model = tf.keras.applications.InceptionV3(include_top=False,
                                                weights='imagenet')
new_input = image_model.input
hidden_layer = image_model.layers[-1].output
image_features_extract_model = tf.keras.Model(new_input, hidden_layer)
For the tf.reduce_mean; see how-to-get-pool3-features-of-inception-v3-model-using-keras (SO)
def extract_video_features(video):
    batch_features = image_features_extract_model(video)
    batch_features = tf.reduce_mean(batch_features, axis=(1, 2))
    return batch_features

X_train = X_train.map(lambda video: extract_video_features(video))
Creating the iterator:
iterator = X_train.make_one_shot_iterator()
Here is the output:
ValueError: Failed to create a one-shot iterator for a dataset.
`Dataset.make_one_shot_iterator()` does not support datasets that capture
stateful objects, such as a `Variable` or `LookupTable`. In these cases, use
`Dataset.make_initializable_iterator()`. (Original error: Cannot capture a
stateful node (name:conv2d/kernel, type:VarHandleOp) by value.)
I don't really get it: it asks me to use an initializable iterator, but that kind of iterator is meant for placeholders. Here, I've got raw data!
You're using the pipelines wrong.
The idea of tf.data is to provide input pipelines to a model, not to contain the model itself. What you're trying to do is fit the model in as a step of the pipeline (your step 4), but, as the error shows, this won't work.
What you should do instead is build the model as you are doing and then call model.predict on the input data, to obtain the features you want (as computed values). If you want to add further computation, add it in the model, since the predict call will run the model and return the values of the output layers.
Side note: image_features_extract_model = tf.keras.Model(new_input, hidden_layer) is completely irrelevant, given the choice you made for input and output tensors: the input is image_model's input and the output is image_model's output, so image_features_extract_model is identical to image_model.
The final code should be:
X_train_slice, Y_train = build_dataset(number_of_samples)
X_train = tf.data.Dataset.from_tensor_slices(X_train_slice)

def format_video(video):
    frames = tf.image.resize_images(video, (299, 299))
    frames = tf.keras.applications.inception_v3.preprocess_input(frames)
    return frames

X_train = X_train.map(lambda video: format_video(video))

image_model = tf.keras.applications.InceptionV3(include_top=False,
                                                weights='imagenet')
bottlenecks = image_model.predict(X_train)

# Do something with your bottlenecks
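As an illustration of that last comment, one possible next step (a sketch, assuming bottlenecks comes back as a numpy array aligned with Y_train) is to build the input pipeline for the seq2seq part from the extracted features:
train_ds = tf.data.Dataset.from_tensor_slices((bottlenecks, Y_train))
train_ds = train_ds.shuffle(buffer_size=256).batch(16)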

run_inference_for_single_image(image, graph) - Tensorflow, object detection

In reference to object_detection_tutorial.ipynb, I am wondering if it's possible to run inference on all the images in a directory.
Rather than writing a for loop and calling run_inference_for_single_image(image, graph) for each one, is there a way to run inference on all the images in a directory, or on multiple images at once?
for f in files:
    if f.lower().endswith(('.png', '.jpg', '.jpeg')):
        image_path = files_dir + '/' + f
        ...  # Read image etc.
        output_dict = run_inference_for_single_image(image_np, detection_graph)
This will create a tf.Session each time, and I think that's computationally expensive. Please correct me if I am wrong.
As you know, the run_inference_for_single_image method creates a new session each time it is called.
If you want to run inference on multiple images, you should change the code like this:
Method Call
images = []
for f in files:
    if f.lower().endswith(('.png', '.jpg', '.jpeg')):
        image_path = files_dir + '/' + f
        image = ...  # Read image etc.
        images.append(image)

output_dicts = run_inference_for_multiple_images(images, detection_graph)
run_inference_for_multiple_images
def run_inference_for_multiple_images(images, graph):
    with graph.as_default():
        with tf.Session() as sess:
            output_dicts = []
            for index, image in enumerate(images):
                ...  # same as inferencing for a single image
                output_dicts.append(output_dict)
    return output_dicts
This way the tf.Session is created only once instead of once per image.
I found this tutorial from Google: creating-object-detection-application-tensorflow. Looking at its GitHub page (object_detection_app → app.py), we only need to run the detect_objects(image_path) function each time we want to detect objects.
It is possible to run inference on a batch of images, depending on the computational power of the GPU and the size of the images.
Step 1: stack all the test images into one array:
image_array = []
for image_path in glob.glob(PATH_TO_TEST_IMAGES_DIR + '/*.jpg'):
    image_np = io.imread(image_path)
    image_array.append(image_np)
image_array = np.array(image_array)
Step 2: run inference in batches (a higher batch size might cause out-of-memory issues):
BATCH_SIZE = 5
output_dict_array = []
for i in range(0, image_array.shape[0], BATCH_SIZE):
    output_dict = sess.run(tensor_dict, feed_dict={image_tensor: image_array[i:i+BATCH_SIZE]})
    print("number of images inferenced = ", i + BATCH_SIZE)
    output_dict_array.append(output_dict)
Make sure the dimensions of image_tensor and image_array match. In this example image_array has shape (?, height, width, 3).
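For completeness, tensor_dict and image_tensor above are usually pulled from the detection graph roughly like this (a sketch; the tensor names follow the object detection tutorial's exported graph and may differ for your model):
image_tensor = detection_graph.get_tensor_by_name('image_tensor:0')
tensor_dict = {
    name: detection_graph.get_tensor_by_name(name + ':0')
    for name in ['num_detections', 'detection_boxes',
                 'detection_scores', 'detection_classes']
}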
Some tips:
You will want to load the graph only once, as it takes a few seconds to load.
I observed that skimage.io.imread() or cv2.imread() is pretty fast at loading images. These functions load images directly as numpy arrays.
skimage or OpenCV is also faster than matplotlib for saving images.

Organizing tensor into batches of dynamically shaped tensors

I have the following situation:
I want to deploy a face detector model using Tensorflow Serving: https://www.tensorflow.org/serving/.
In Tensorflow Serving, there is a command line option called --enable_batching. This causes the model server to automatically batch the requests to maximize throughput. I want this to be enabled.
My model takes in a set of images (called images), which is a tensor of shape (batch_size, 640, 480, 3).
The model has two outputs, of shapes (number_of_faces, 4) and (number_of_faces,). The first output will be called faces. The second output, which we can call partitions, is the index in the original batch of the corresponding face. For example, if I pass in a batch of 4 images and get 7 faces, this tensor might be [0, 0, 1, 2, 2, 2, 3]: the first two faces belong to the first image, the third face to the second image, the third image has 3 faces, and so on.
My issue is this:
In order for the --enable_batching flag to work, the output from my model needs to have the 0th dimension the same as the input. That is, I need a tensor with the following shape: (batch_size, ...). I suppose this is so that the model server can know which grpc connection to send each output in the batch towards.
What I want to do is to convert my output tensor from the face detector from this shape (number_of_faces, 4) to this shape (batch_size, None, 4). That is, an array of batches, where each batch can have a variable number of faces (e.g. one image in the batch may have no faces, and another might have 3).
What I tried:
tf.dynamic_partition. On the surface, this function looks perfect. However, I ran into difficulties after realizing that the num_partitions parameter cannot be a tensor, only an integer:
tensorflow_serving_output = tf.dynamic_partition(faces, partitions, batch_size)
If tf.dynamic_partition accepted a tensor value for num_partitions, then it seems my problem would be solved. However, I am back to square one since this is not the case.
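(To make the constraint concrete, a small sketch with toy constants, showing that a plain Python int for num_partitions works, which is exactly what a dynamic batch size prevents here:)
faces = tf.constant([[1., 1., 1., 1.], [2., 2., 2., 2.], [3., 3., 3., 3.]])
partitions = tf.constant([0, 0, 1])
parts = tf.dynamic_partition(faces, partitions, num_partitions=2)
# parts[0] has shape (2, 4), parts[1] has shape (1, 4)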
Thank you all for your help! Let me know if anything is unclear
P.S. Here is a visual representation of the intended process:
I ended up finding a solution to this using TensorArray and tf.while_loop:
def batch_reconstructor(tensor, partitions, batch_size):
    """
    Take a tensor of shape (number_of_faces, 4) and a 1-D partitions tensor, as well as the scalar batch_size,
    and reconstruct a TensorArray that preserves the original batching.

    From the partitions we can get the maximum number of faces within a batch. This informs the padding we need to use.

    Params:
    - tensor: The tensor to convert to a batch
    - partitions: A list of batch indices. The tensor at position i corresponds to batch # partitions[i]
    """
    tfarr = tf.TensorArray(tf.float32, size=batch_size, infer_shape=False)  # dtype must match `tensor`

    _, _, count = tf.unique_with_counts(partitions)
    maximum_tensor_size = tf.cast(tf.reduce_max(count), tf.int32)

    padding_tensor_index = tf.cast(tf.gather(tf.shape(tensor), 0), tf.int32)
    padding_tensor = tf.expand_dims(tf.cast(tf.fill([4], -1), tf.float32), axis=0)  # fill with [-1, -1, -1, -1]
    tensor = tf.concat([tensor, padding_tensor], axis=0)

    def cond(i, acc):
        return tf.less(i, batch_size)

    def body(i, acc):
        partition_indices = tf.reshape(tf.cast(tf.where(tf.equal(partitions, i)), tf.int32), [-1])
        partition_size = tf.gather(tf.shape(partition_indices), 0)

        # pad partition_indices with padding_size copies of padding_tensor_index
        padding_size = tf.subtract(maximum_tensor_size, partition_size)
        padding_indices = tf.reshape(tf.fill([padding_size], padding_tensor_index), [-1])
        partition_indices = tf.concat([partition_indices, padding_indices], axis=0)

        return (tf.add(i, 1), acc.write(i, tf.gather(tensor, partition_indices)))

    _, reconstructed = tf.while_loop(
        cond,
        body,
        (tf.constant(0), tfarr),
        name='batch_reconstructor'
    )

    reconstructed = reconstructed.stack()
    return reconstructed
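A small usage sketch (toy values; faces and partitions mirror the example from the question, batch_size is assumed known on the serving side, and the usual import numpy as np / import tensorflow as tf are in scope):
faces_ph = tf.placeholder(tf.float32, shape=[None, 4])
partitions_ph = tf.placeholder(tf.int32, shape=[None])
batched_faces = batch_reconstructor(faces_ph, partitions_ph, batch_size=4)

with tf.Session() as sess:
    out = sess.run(batched_faces, feed_dict={
        faces_ph: np.arange(28, dtype=np.float32).reshape(7, 4),
        partitions_ph: [0, 0, 1, 2, 2, 2, 3],
    })
    print(out.shape)  # (4, 3, 4); rows padded with -1 where an image has fewer faces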

Tensorflow: Create a batch from a list of image tensors

I have a function get_image(...) that performs preprocessing on my input images. I gather all images that belong to the same batch in a list like this:
batch = [get_image(file_path) for file_path in batch_files]
Now I want to convert this list into one single tensor whose first dimension is the batch dimension, so that I can feed it to the input placeholder of my network.
_ = self.sess.run([loss],feed_dict={ input_placeholder: batch })
Any idea how I could do that?
batch_concat = tf.placeholder(shape=[None] + self.image_shape, dtype=tf.float32)

for i in xrange(0, self.batch_size):
    if i == 0:
        tmp_batch = tf.expand_dims(batch[i], 0)
        batch_concat = tmp_batch
    else:
        tmp_batch = tf.expand_dims(batch[i], 0)
        batch_concat = tf.concat(0, [batch_concat, tmp_batch])
When I try to concatenate all tensors, I get the following error:
TypeError: The value of a feed cannot be a tf.Tensor object. Acceptable feed values include Python scalars, strings, lists, or numpy ndarrays.
So maybe it would be enough to convert the tensor back into a numpy array before feeding it to the network?
Note: in TF r1.1, tf.pack has been replaced with tf.stack.
You can use tf.pack to pack a list of tensors into a batch.
image_list = [get_image(file_path) for file_path in batch_files]
image_batch = tf.pack(image_list)
You can also use tf.concat to concatenate the list along the first dimension and reshape it.
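With newer TF versions the same thing, per the note above, would be (a sketch):
image_batch = tf.stack(image_list)  # shape (batch_size, height, width, channels)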
The issue here is using a tensor as a value in feed_dict. Instead of feeding batch as the value for input_placeholder, why don't you use batch instead of input_placeholder, assuming batch is your batched tensor?
So, instead of:
input_placeholder = tf.placeholder(tf.int32)
loss = some_function(input_placeholder)
sess.run(loss, feed_dict={input_placeholder: batch})
Do:
loss = some_function(batch)
sess.run(loss)
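Alternatively, to answer the last question in the post: yes, evaluating the stacked tensor once and feeding the resulting numpy array also works (a sketch that keeps the original placeholder setup):
batch_np = sess.run(tf.stack(batch))  # numpy array of shape (batch_size, ...) built by running the preprocessing graph
_ = sess.run([loss], feed_dict={input_placeholder: batch_np})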