Generate variable length data with Tensorflow ops

I am trying to train a classifier on audio files. A custom Python function reads my WAV files and converts them to a sequence of spectrogram images for training. The function is called with tf.py_func and returns an array of images that all have the same shape. In other words, the image shape is well defined, yet the number of images is dynamic (e.g. 3 spectrograms for a short audio snippet, 15 for a long one).
Is there a way to unpack the resulting list for further processing / enqueueing in tf.train.batch_join()? The undefined sequence length seems to be a problem for many TF ops. Can the length be inferred somehow?
...
# Read the audio file name and label from a CSV file
audio_file, label = tf.decode_csv(csv_content)
def wav_to_spectrogram(audio_file):
    signal = read_wav(audio_file)
    images = [generate_image(segment) for segment in split_audio(signal)]
    # This output is of varying length depending on the length of the audio file.
    return images
# Convert audio file to a variable length sequence of images
# Shape: <unknown>, which is to be expected from tf.py_func
image_sequence = tf.py_func(wav_to_spectrogram, [audio_file], [tf.float32])[0]
# Auxiliary function to set a shape for the images defined in tf.py_func
def process_image(in_image):
    image = tf.image.convert_image_dtype(in_image, dtype=tf.float32)
    image.set_shape([600, 39, 1])
    return (image, label)
# Shape: (?, 600, 39, 1)
images_and_labels = tf.map_fn(process_image, image_sequence, dtype=(tf.float32, tf.int32))
# This will not work. 'images_and_labels' needs to be a list
images, label_index_batch = tf.train.batch_join(
    images_and_labels,
    batch_size=batch_size,
    capacity=2 * num_preprocess_threads * batch_size,
    shapes=[data_shape, []],
)

You can return a variable-size tensor from py_func and pass it to tf.train.batch with enqueue_many=True, which treats the first dimension of that tensor as a batch of examples of variable size.
Below is an example in which py_func generates variable-size batches and tf.train.batch with enqueue_many=True converts them into constant-size batches.
import numpy as np
import tensorflow as tf
tf.reset_default_graph()
# start with time-out to prevent hangs when experimenting
config = tf.ConfigProto()
config.operation_timeout_in_ms = 2000
sess = tf.InteractiveSession(config=config)
# initialize first queue with 1, 2, 1, 2
queue1 = tf.FIFOQueue(capacity=4, dtypes=[tf.int32])
queue1_input = tf.placeholder(tf.int32)
queue1_enqueue = queue1.enqueue(queue1_input)
sess.run(queue1_enqueue, feed_dict={queue1_input: 1})
sess.run(queue1_enqueue, feed_dict={queue1_input: 2})
sess.run(queue1_enqueue, feed_dict={queue1_input: 1})
sess.run(queue1_enqueue, feed_dict={queue1_input: 2})
sess.run(queue1.close())
# range_func produces variable-size arrays, so call_func is a variable-size tensor
def range_func(x):
    return np.array(range(x), dtype=np.int32)
[call_func] = tf.py_func(range_func, [queue1.dequeue()], [tf.int32])
# enqueue_many=True: every element along the first dimension becomes one example
queue2_dequeue = tf.train.batch([call_func], batch_size=3, shapes=[[]], enqueue_many=True)
coord = tf.train.Coordinator()
threads = tf.train.start_queue_runners(coord=coord)
try:
    while True:
        print(sess.run(queue2_dequeue))
except tf.errors.OutOfRangeError:
    pass
finally:
    coord.request_stop()
    coord.join(threads)
    sess.close()
You should see
[0 0 1]
[0 0 1]
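To see why the output comes out in groups of three, the rebatching can be pictured in plain Python. This is only an illustrative sketch, not TensorFlow code: the helper name rebatch is hypothetical, and dropping the incomplete final batch is meant to mirror the default allow_smaller_final_batch=False of tf.train.batch.

```python
from itertools import chain, islice

def rebatch(chunks, batch_size):
    """Yield fixed-size lists from an iterable of variable-length chunks."""
    flat = chain.from_iterable(chunks)  # one stream of single examples
    while True:
        batch = list(islice(flat, batch_size))
        if len(batch) < batch_size:
            return  # drop the incomplete final batch
        yield batch

# Same inputs as the queue above: range(1), range(2), range(1), range(2)
chunks = [list(range(n)) for n in [1, 2, 1, 2]]
batches = list(rebatch(chunks, 3))
print(batches)
```

The chunk boundaries disappear once the examples are flattened, which is why a chunk of length 1 followed by a chunk of length 2 fills exactly one batch of size 3.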

Related

Input shape axis 0 must equal 4, got shape [5] when trying to crop image batch in tensorflow dataset pipeline

I get the following error when I try to crop a batch of images inside a tf.data.Dataset pipeline:
InvalidArgumentError: Input shape axis 0 must equal 4, got shape [5]
[[{{node crop_to_bounding_box/unstack}}]] [Op:IteratorGetNext]
def crop(img_batch, label_batch):
    #cropped_image = img_batch
    cropped_image = tf.image.crop_to_bounding_box(img_batch, 0, 0, 100, 100)
    return cropped_image, label_batch
train_dataset_cropped = train_dataset.map(crop)
But when I try to run the following for loop I get the mentioned error:
for img_batch, label_batch in train_dataset_cropped:
    print(type(img_batch), img_batch.shape, label_batch.shape)
Note that the pipeline works without the tf.image.crop_to_bounding_box inside the crop function (directly using cropped_image = img_batch).
Do you know how to correctly crop a batch of images inside a tf.data.Dataset pipeline?
I didn't find any documentation for this, but it seems tf.image.crop_to_bounding_box cannot handle the tensor it receives inside tf.data.Dataset.map here: the error says its shape has 5 entries, while the function expects a 3-D image or a 4-D batch. A simple workaround for your problem is to crop by slicing instead:
def crop(img_batch, label_batch):
    cropped_image = img_batch[:, :100, :100]  # if your dataset is already batched
    # cropped_image = img_batch[:100, :100]  # otherwise
    return cropped_image, label_batch
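To check which axes such a slice touches, the resulting shape can be worked out without TensorFlow. The sketch below is plain Python; the helper sliced_shape and the example shapes (32 images of 224x224x3) are assumptions made for illustration, not values from the question.

```python
def sliced_shape(shape, index):
    """Return the shape obtained by applying a tuple of slices to `shape`."""
    out = []
    for axis, dim in enumerate(shape):
        if axis < len(index):
            # slice.indices clips the slice to the axis length
            out.append(len(range(*index[axis].indices(dim))))
        else:
            out.append(dim)  # trailing axes are left untouched
    return tuple(out)

batched_index = (slice(None), slice(None, 100), slice(None, 100))  # [:, :100, :100]
unbatched_index = (slice(None, 100), slice(None, 100))             # [:100, :100]

batched = sliced_shape((32, 224, 224, 3), batched_index)
unbatched = sliced_shape((224, 224, 3), unbatched_index)
print(batched, unbatched)
```

The leading `:` keeps the batch axis whole, so the two crops land on height and width; without it, the first crop would be applied to the batch axis.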

TypeError: <lambda>() takes 1 positional argument but 2 were given

Here is my code:
img_gen = tf.keras.preprocessing.image.ImageDataGenerator()
gen = img_gen.flow_from_directory('/train/',(224, 224),'rgb', batch_size = 2)
training_set = tf.data.Dataset.from_generator(lambda : gen, output_types=(tf.float32, tf.float32), output_shapes = ([2,224,224,3],[2,2]))
def read_images(features):
    return features['image']
training_set = training_set.map(lambda x: read_images(x), num_parallel_calls=tf.data.experimental.AUTOTUNE)
The error was:
TypeError: <lambda>() takes 1 positional argument but 2 were given
So how can I solve the problem in the function read_images?
The documentation of flow_from_directory says:
Returns
A DirectoryIterator yielding tuples of (x, y) where x is a numpy array containing a batch of images with shape (batch_size, *target_size, channels) and y is a numpy array of corresponding labels.
You can see that it returns a tuple with 2 elements, so the function you pass to map needs to accept two arguments.
def read_images(features):
    # some processing
    output = features
    return output
training_set = training_set.map(lambda image, label: read_images(image), num_parallel_calls=tf.data.experimental.AUTOTUNE)
The ImageDataGenerator itself has a lot of processing options available.
You can also check out other tutorials on the TensorFlow site, such as the one on loading images.
Looking at the dataset content also helps when debugging issues like this:
for line in training_set.take(1):
    print(len(line))
    print(line)
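The error itself can be reproduced without TensorFlow. In the sketch below, apply_map is a hypothetical stand-in for the way Dataset.map passes a tuple element to the mapped function as separate positional arguments; the strings stand in for the image and label batches.

```python
def apply_map(fn, element):
    """Call fn like a dataset map would: tuple elements are unpacked."""
    if isinstance(element, tuple):
        return fn(*element)
    return fn(element)

element = ("image_batch", "label_batch")  # stand-in for one (x, y) tuple

# A one-argument lambda fails because two positional arguments arrive.
try:
    apply_map(lambda x: x, element)
    error = None
except TypeError as exc:
    error = str(exc)

# A two-argument lambda matches the (x, y) structure of the element.
images = apply_map(lambda image, label: image, element)
print(error)
print(images)
```

Matching the lambda's parameters to the element structure is the whole fix; what the function then does with the label is a separate choice.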

How to use Tensorflow's tf.cond() with two different Dataset iterators without iterating both?

I want to feed a CNN with the tensor "images". I want this tensor to contain images from the training set ( which have FIXED size ) when the placeholder is_training is True, otherwise I want it to contain images from the test set ( which are of NOT FIXED size ).
This is needed because in training I take a random fixed crop from the training images, while in test I want to perform a dense evaluation and feed the entire images inside the network ( it is fully convolutional so it will accept them)
The current NOT WORKING way is to create two different iterators, and try to select the training/test input with tf.cond at the session.run(images,{is_training:True/False}).
The problem is that BOTH the iterators are evaluated. The training and test dataset are also of different size so I cannot iterate both of them until the end. Is there a way to make this work? Or to rewrite this in a smarter way?
I've seen some questions/answers about this but they always used tf.assign which takes a numpy array and assigns it to a tensor. In this case I cannot use tf.assign because I already have a tensor coming from the iterators.
The current code that I have is this one. It simply checks the shape of the tensor "images":
train_filenames, train_labels = list_images(args.train_dir)
val_filenames, val_labels = list_images(args.val_dir)
graph = tf.Graph()
with graph.as_default():
    # Preprocessing (for both training and validation):
    def _parse_function(filename, label):
        image_string = tf.read_file(filename)
        image_decoded = tf.image.decode_jpeg(image_string, channels=3)
        image = tf.cast(image_decoded, tf.float32)
        return image, label
    # Preprocessing (for training)
    def training_preprocess(image, label):
        # Random flip and crop
        image = tf.image.random_flip_left_right(image)
        image = tf.random_crop(image, [args.crop, args.crop, 3])
        return image, label
    # Preprocessing (for validation)
    def val_preprocess(image, label):
        flipped_image = tf.image.flip_left_right(image)
        batch = tf.stack([image, flipped_image], axis=0)
        return batch, label
    # Training dataset
    train_filenames = tf.constant(train_filenames)
    train_labels = tf.constant(train_labels)
    train_dataset = tf.contrib.data.Dataset.from_tensor_slices((train_filenames, train_labels))
    train_dataset = train_dataset.map(_parse_function, num_threads=args.num_workers, output_buffer_size=args.batch_size)
    train_dataset = train_dataset.map(training_preprocess, num_threads=args.num_workers, output_buffer_size=args.batch_size)
    train_dataset = train_dataset.shuffle(buffer_size=10000)
    batched_train_dataset = train_dataset.batch(args.batch_size)
    # Validation dataset
    val_filenames = tf.constant(val_filenames)
    val_labels = tf.constant(val_labels)
    val_dataset = tf.contrib.data.Dataset.from_tensor_slices((val_filenames, val_labels))
    val_dataset = val_dataset.map(_parse_function, num_threads=1, output_buffer_size=1)
    val_dataset = val_dataset.map(val_preprocess, num_threads=1, output_buffer_size=1)
    train_iterator = tf.contrib.data.Iterator.from_structure(batched_train_dataset.output_types, batched_train_dataset.output_shapes)
    val_iterator = tf.contrib.data.Iterator.from_structure(val_dataset.output_types, val_dataset.output_shapes)
    train_images, train_labels = train_iterator.get_next()
    val_images, val_labels = val_iterator.get_next()
    train_init_op = train_iterator.make_initializer(batched_train_dataset)
    val_init_op = val_iterator.make_initializer(val_dataset)
    # Indicates whether we are in training or in test mode
    is_training = tf.placeholder(tf.bool)
    def f_true():
        with tf.control_dependencies([tf.identity(train_images)]):
            return tf.identity(train_images)
    def f_false():
        return val_images
    images = tf.cond(is_training, f_true, f_false)
    num_images = images.shape
with tf.Session(graph=graph) as sess:
    sess.run(train_init_op)
    #sess.run(val_init_op)
    img = sess.run(images, {is_training: True})
    print(img.shape)
The problem is that when I want to use only the training iterator, I comment out the line that runs val_init_op, and then I get the following error:
FailedPreconditionError (see above for traceback): GetNext() failed because the iterator has not been initialized. Ensure that you have run the initializer operation for this iterator before getting the next element.
[[Node: IteratorGetNext_1 = IteratorGetNext[output_shapes=[[2,?,?,3], []], output_types=[DT_FLOAT, DT_INT32], _device="/job:localhost/replica:0/task:0/cpu:0"](Iterator_1)]]
If I do not comment out that line, everything works as expected: when is_training is True I get training images, and when is_training is False I get validation images. The issue is that both iterators need to be initialized, and when I evaluate one of them the other is advanced too. Since, as I said, they are of different sizes, this causes an issue.
I hope there is a way to solve it! Thanks in advance
The trick is to call iterator.get_next() inside the f_true() and f_false() functions:
def f_true():
    train_images, _ = train_iterator.get_next()
    return train_images
def f_false():
    val_images, _ = val_iterator.get_next()
    return val_images
images = tf.cond(is_training, f_true, f_false)
The same advice applies to any TensorFlow op that has a side effect, like assigning to a variable: if you want that side effect to happen conditionally, the op must be created inside the appropriate branch function passed to tf.cond().
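A plain-Python analogy may make the difference easier to see. This is not TensorFlow code and only mimics the behaviour described above: choose_eager and choose_lazy are hypothetical helpers, and Python iterators stand in for the dataset iterators.

```python
def choose_eager(flag, if_true, if_false):
    """Both values were already computed by the caller before the choice."""
    return if_true if flag else if_false

def choose_lazy(flag, true_fn, false_fn):
    """Only the selected branch function is called."""
    return true_fn() if flag else false_fn()

# get_next() outside the branches: both iterators advance on every step.
train, val = iter([1, 2, 3]), iter([10, 20])
eager = choose_eager(True, next(train), next(val))
eager_left = (list(train), list(val))

# get_next() inside the branches: only the chosen iterator advances.
train, val = iter([1, 2, 3]), iter([10, 20])
lazy = choose_lazy(True, lambda: next(train), lambda: next(val))
lazy_left = (list(train), list(val))

print(eager, eager_left)
print(lazy, lazy_left)
```

In the eager case the validation iterator has lost an element even though its value was never used, which corresponds to the unwanted increment in the question.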

Strange values of training and testing when running my CNN in Tensorflow

I've been trying to train and evaluate a convolutional neural network using my own data, which consists of 200 training images and 20 testing images. My complete script is here:
Error while running a convolutional network using my own data in Tensorflow
When I run it, I don't get any error and it seems to complete the whole process just fine, but the training values and testing result change randomly each time I run it, so I think that it's not training anything at all.
When I print the values of image_train_batch_eval and label_train_batch_eval I get a tensor with 5 examples and 5 labels (as batch_size_train is 5), so I think that the batching process works fine.
I don't really know what the problem might be, but there must be something I'm missing. Thank you in advance.
EDIT: These are the results I get.
Step 0, Traininig accuracy: 0.2
Step 2, Traininig accuracy: 0.4
Step 4, Traininig accuracy: 1
Step 6, Traininig accuracy: 1
Step 8, Traininig accuracy: 0.6
Step 10, Traininig accuracy: 0.8
Step 12, Traininig accuracy: 0.8
Step 14, Traininig accuracy: 0
Step 16, Traininig accuracy: 0.8
Step 18, Traininig accuracy: 0
Step 20, Traininig accuracy: 0.8
Step 22, Traininig accuracy: 0
Step 24, Traininig accuracy: 0
Step 26, Traininig accuracy: 0.2
Step 28, Traininig accuracy: 0.8
Step 30, Traininig accuracy: 0.4
Step 32, Traininig accuracy: 0
Step 34, Traininig accuracy: 1
Step 36, Traininig accuracy: 1
Step 38, Traininig accuracy: 0
Step 40, Traininig accuracy: 0.2
Step 42, Traininig accuracy: 0
Step 44, Traininig accuracy: 0.8
Step 46, Traininig accuracy: 0
Step 48, Traininig accuracy: 0.8
Testing accuracy: 0
But these values change every time.
Since I can't follow your code, here is an example of a full convolutional network script using TensorFlow.
First: if you're working with images it really does make sense to serialize your data; convolution operations are intense enough as it is!
The following script serializes your images in TFRecords format (based on the Inception example).
'''
Converts image data to TFRecords file format with Example protos.
The image data set is expected to reside in JPEG files located in the
following directory structure.
    trainingset/label_0/image0.jpeg
    trainingset/label_0/image1.jpg
    ...
    testset/label_1/weird-image.jpeg
    testset/label_1/my-image.jpeg
'''
from __future__ import absolute_import
from __future__ import division
from __future__ import print_function
from datetime import datetime
import os
import random
import sys
import threading
import numpy as np
import tensorflow as tf
tf.app.flags.DEFINE_string('train_directory', '/tmp/',
                           'Training data directory')
tf.app.flags.DEFINE_string('validation_directory', '/tmp/',
                           'Validation data directory')
tf.app.flags.DEFINE_string('output_directory', '/tmp/',
                           'Output data directory')
tf.app.flags.DEFINE_integer('train_shards', 2,
                            'Number of shards in training TFRecord files.')
tf.app.flags.DEFINE_integer('validation_shards', 2,
                            'Number of shards in validation TFRecord files.')
tf.app.flags.DEFINE_integer('num_threads', 2,
                            'Number of threads to preprocess the images.')
# The labels file contains the list of valid labels.
# Assumes that the file contains entries as such:
#   dog
#   cat
#   flower
# where each line corresponds to a label. We map each label contained in
# the file to an integer based on its line number. Note that
# _find_image_files below starts counting at 1 and leaves 0 for a
# background class.
tf.app.flags.DEFINE_string('labels_file', '', 'Labels file')
FLAGS = tf.app.flags.FLAGS
def _int64_feature(value):
    """Wrapper for inserting int64 features into Example proto."""
    if not isinstance(value, list):
        value = [value]
    return tf.train.Feature(int64_list=tf.train.Int64List(value=value))
def _bytes_feature(value):
    """Wrapper for inserting bytes features into Example proto."""
    return tf.train.Feature(bytes_list=tf.train.BytesList(value=[value]))
def _convert_to_example(filename, image_buffer, label, text, height, width):
    """Build an Example proto for an example.
    Args:
        filename: string, path to an image file, e.g., '/path/to/example.JPG'
        image_buffer: string, JPEG encoding of RGB image
        label: integer, identifier for the ground truth for the network
        text: string, unique human-readable, e.g. 'dog'
        height: integer, image height in pixels
        width: integer, image width in pixels
    Returns:
        Example proto
    """
    colorspace = 'RGB'
    channels = 3
    image_format = 'JPEG'
    example = tf.train.Example(features=tf.train.Features(feature={
        'image/height': _int64_feature(height),
        'image/width': _int64_feature(width),
        'image/colorspace': _bytes_feature(tf.compat.as_bytes(colorspace)),
        'image/channels': _int64_feature(channels),
        'image/class/label': _int64_feature(label),
        'image/class/text': _bytes_feature(tf.compat.as_bytes(text)),
        'image/format': _bytes_feature(tf.compat.as_bytes(image_format)),
        'image/filename': _bytes_feature(tf.compat.as_bytes(os.path.basename(filename))),
        'image/encoded': _bytes_feature(tf.compat.as_bytes(image_buffer))}))
    return example
class ImageCoder(object):
    """Helper class that provides TensorFlow image coding utilities."""
    def __init__(self):
        # Create a single Session to run all image coding calls.
        self._sess = tf.Session()
        # Initializes function that converts PNG to JPEG data.
        self._png_data = tf.placeholder(dtype=tf.string)
        image = tf.image.decode_png(self._png_data, channels=3)
        self._png_to_jpeg = tf.image.encode_jpeg(image, format='rgb', quality=100)
        # Initializes function that decodes RGB JPEG data.
        self._decode_jpeg_data = tf.placeholder(dtype=tf.string)
        self._decode_jpeg = tf.image.decode_jpeg(self._decode_jpeg_data, channels=3)
    def png_to_jpeg(self, image_data):
        return self._sess.run(self._png_to_jpeg,
                              feed_dict={self._png_data: image_data})
    def decode_jpeg(self, image_data):
        image = self._sess.run(self._decode_jpeg,
                               feed_dict={self._decode_jpeg_data: image_data})
        assert len(image.shape) == 3
        assert image.shape[2] == 3
        return image
def _is_png(filename):
    """Determine if a file contains a PNG format image.
    Args:
        filename: string, path of the image file.
    Returns:
        boolean indicating if the image is a PNG.
    """
    return '.png' in filename
def _process_image(filename, coder):
    """Process a single image file.
    Args:
        filename: string, path to an image file e.g., '/path/to/example.JPG'.
        coder: instance of ImageCoder to provide TensorFlow image coding utils.
    Returns:
        image_buffer: string, JPEG encoding of RGB image.
        height: integer, image height in pixels.
        width: integer, image width in pixels.
    """
    # Read the image file.
    with tf.gfile.FastGFile(filename, 'rb') as f:
        image_data = f.read()
    # Convert any PNG to JPEG for consistency.
    if _is_png(filename):
        print('Converting PNG to JPEG for %s' % filename)
        image_data = coder.png_to_jpeg(image_data)
    # Decode the RGB JPEG.
    image = coder.decode_jpeg(image_data)
    # Check that the image was converted to RGB.
    assert len(image.shape) == 3
    height = image.shape[0]
    width = image.shape[1]
    assert image.shape[2] == 3
    return image_data, height, width
def _process_image_files_batch(coder, thread_index, ranges, name, filenames,
                               texts, labels, num_shards):
    """Processes and saves list of images as TFRecord in 1 thread.
    Args:
        coder: instance of ImageCoder to provide TensorFlow image coding utils.
        thread_index: integer, unique batch to run index is within [0, len(ranges)).
        ranges: list of pairs of integers specifying ranges of each batches to
            analyze in parallel.
        name: string, unique identifier specifying the data set
        filenames: list of strings; each string is a path to an image file
        texts: list of strings; each string is human readable, e.g. 'dog'
        labels: list of integer; each integer identifies the ground truth
        num_shards: integer number of shards for this data set.
    """
    # Each thread produces N shards where N = int(num_shards / num_threads).
    # For instance, if num_shards = 128, and the num_threads = 2, then the first
    # thread would produce shards [0, 64).
    num_threads = len(ranges)
    assert not num_shards % num_threads
    num_shards_per_batch = int(num_shards / num_threads)
    shard_ranges = np.linspace(ranges[thread_index][0],
                               ranges[thread_index][1],
                               num_shards_per_batch + 1).astype(int)
    num_files_in_thread = ranges[thread_index][1] - ranges[thread_index][0]
    counter = 0
    for s in range(num_shards_per_batch):
        # Generate a sharded version of the file name, e.g. 'train-00002-of-00010'
        shard = thread_index * num_shards_per_batch + s
        output_filename = '%s-%.5d-of-%.5d' % (name, shard, num_shards)
        output_file = os.path.join(FLAGS.output_directory, output_filename)
        writer = tf.python_io.TFRecordWriter(output_file)
        shard_counter = 0
        files_in_shard = np.arange(shard_ranges[s], shard_ranges[s + 1], dtype=int)
        for i in files_in_shard:
            filename = filenames[i]
            label = labels[i]
            text = texts[i]
            try:
                image_buffer, height, width = _process_image(filename, coder)
            except Exception as e:
                print(e)
                print('SKIPPED: Unexpected error while decoding %s.' % filename)
                continue
            example = _convert_to_example(filename, image_buffer, label,
                                          text, height, width)
            writer.write(example.SerializeToString())
            shard_counter += 1
            counter += 1
            if not counter % 1000:
                print('%s [thread %d]: Processed %d of %d images in thread batch.' %
                      (datetime.now(), thread_index, counter, num_files_in_thread))
                sys.stdout.flush()
        writer.close()
        print('%s [thread %d]: Wrote %d images to %s' %
              (datetime.now(), thread_index, shard_counter, output_file))
        sys.stdout.flush()
        shard_counter = 0
    print('%s [thread %d]: Wrote %d images to %d shards.' %
          (datetime.now(), thread_index, counter, num_files_in_thread))
    sys.stdout.flush()
def _process_image_files(name, filenames, texts, labels, num_shards):
    """Process and save list of images as TFRecord of Example protos.
    Args:
        name: string, unique identifier specifying the data set
        filenames: list of strings; each string is a path to an image file
        texts: list of strings; each string is human readable, e.g. 'dog'
        labels: list of integer; each integer identifies the ground truth
        num_shards: integer number of shards for this data set.
    """
    assert len(filenames) == len(texts)
    assert len(filenames) == len(labels)
    # Break all images into batches with a [ranges[i][0], ranges[i][1]].
    # (the np.int alias was removed from recent NumPy releases, so use int)
    spacing = np.linspace(0, len(filenames), FLAGS.num_threads + 1).astype(int)
    ranges = []
    for i in range(len(spacing) - 1):
        ranges.append([spacing[i], spacing[i + 1]])
    # Launch a thread for each batch.
    print('Launching %d threads for spacings: %s' % (FLAGS.num_threads, ranges))
    sys.stdout.flush()
    # Create a mechanism for monitoring when all threads are finished.
    coord = tf.train.Coordinator()
    # Create a generic TensorFlow-based utility for converting all image codings.
    coder = ImageCoder()
    threads = []
    for thread_index in range(len(ranges)):
        args = (coder, thread_index, ranges, name, filenames,
                texts, labels, num_shards)
        t = threading.Thread(target=_process_image_files_batch, args=args)
        t.start()
        threads.append(t)
    # Wait for all the threads to terminate.
    coord.join(threads)
    print('%s: Finished writing all %d images in data set.' %
          (datetime.now(), len(filenames)))
    sys.stdout.flush()
def _find_image_files(data_dir, labels_file):
    """Build a list of all images files and labels in the data set.
    Args:
        data_dir: string, path to the root directory of images.
            Assumes that the image data set resides in JPEG files located in
            the following directory structure.
                data_dir/dog/another-image.JPEG
                data_dir/dog/my-image.jpg
            where 'dog' is the label associated with these images.
        labels_file: string, path to the labels file.
            The list of valid labels are held in this file. Assumes that the file
            contains entries as such:
                dog
                cat
                flower
            where each line corresponds to a label. We map each label contained in
            the file to an integer in line order. Index 0 is left empty as a
            background class, so the label in the first line gets the integer 1.
    Returns:
        filenames: list of strings; each string is a path to an image file.
        texts: list of strings; each string is the class, e.g. 'dog'
        labels: list of integer; each integer identifies the ground truth.
    """
    print('Determining list of input files and labels from %s.' % data_dir)
    unique_labels = [l.strip() for l in tf.gfile.FastGFile(
        labels_file, 'r').readlines()]
    labels = []
    filenames = []
    texts = []
    # Leave label index 0 empty as a background class.
    label_index = 1
    # Construct the list of JPEG files and labels.
    for text in unique_labels:
        jpeg_file_path = '%s/%s/*' % (data_dir, text)
        matching_files = tf.gfile.Glob(jpeg_file_path)
        labels.extend([label_index] * len(matching_files))
        texts.extend([text] * len(matching_files))
        filenames.extend(matching_files)
        if not label_index % 100:
            print('Finished finding files in %d of %d classes.' % (
                label_index, len(labels)))
        label_index += 1
    # Shuffle the ordering of all image files in order to guarantee
    # random ordering of the images with respect to label in the
    # saved TFRecord files. Make the randomization repeatable.
    shuffled_index = list(range(len(filenames)))
    random.seed(12345)
    random.shuffle(shuffled_index)
    filenames = [filenames[i] for i in shuffled_index]
    texts = [texts[i] for i in shuffled_index]
    labels = [labels[i] for i in shuffled_index]
    print('Found %d JPEG files across %d labels inside %s.' %
          (len(filenames), len(unique_labels), data_dir))
    return filenames, texts, labels
def _process_dataset(name, directory, num_shards, labels_file):
    """Process a complete data set and save it as a TFRecord.
    Args:
        name: string, unique identifier specifying the data set.
        directory: string, root path to the data set.
        num_shards: integer number of shards for this data set.
        labels_file: string, path to the labels file.
    """
    filenames, texts, labels = _find_image_files(directory, labels_file)
    _process_image_files(name, filenames, texts, labels, num_shards)
def main(unused_argv):
    assert not FLAGS.train_shards % FLAGS.num_threads, (
        'Please make the FLAGS.num_threads commensurate with FLAGS.train_shards')
    assert not FLAGS.validation_shards % FLAGS.num_threads, (
        'Please make the FLAGS.num_threads commensurate with '
        'FLAGS.validation_shards')
    print('Saving results to %s' % FLAGS.output_directory)
    # Run it!
    _process_dataset('validation', FLAGS.validation_directory,
                     FLAGS.validation_shards, FLAGS.labels_file)
    _process_dataset('train', FLAGS.train_directory,
                     FLAGS.train_shards, FLAGS.labels_file)
if __name__ == '__main__':
    tf.app.run()
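The way the script divides the files among threads and shards can be sketched in plain Python. This is a rough emulation of the np.linspace(...).astype(int) calls, not the script's own code: split_ranges is a hypothetical helper, and the sizes (10 files, 2 threads, 4 shards) are made up for illustration.

```python
def split_ranges(num_items, num_parts):
    """Split range(num_items) into num_parts contiguous (start, end) pairs."""
    edges = [int(num_items * i / num_parts) for i in range(num_parts + 1)]
    return [(edges[i], edges[i + 1]) for i in range(num_parts)]

num_files, num_threads, num_shards = 10, 2, 4  # hypothetical sizes

# The script asserts this: the shard count must be a multiple of the thread count.
assert num_shards % num_threads == 0

thread_ranges = split_ranges(num_files, num_threads)
shards_per_thread = num_shards // num_threads
shard_ranges = [
    [(start + a, start + b) for a, b in split_ranges(end - start, shards_per_thread)]
    for start, end in thread_ranges
]
print(thread_ranges)
print(shard_ranges)
```

Each thread owns one contiguous slice of the shuffled file list and writes its own shard files, which is why the threads never need to coordinate over a shared writer.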
You need to start the script as follows:
python Building_Set.py --train_directory=TrainingSet --output_directory=TF_Recordsfolder --validation_directory=ReferenceSet --labels_file=labels.txt --train_shards=1 --validation_shards=1 --num_threads=1
PS: you need a labels.txt file that lists the labels, one per line.
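As a rough illustration of how the labels file is turned into integer labels, the sketch below mirrors the numbering in _find_image_files, which starts at 1 and leaves 0 for a background class. The helper build_label_index and the three example labels are assumptions, not part of the script.

```python
def build_label_index(label_lines):
    """Map each label line to an integer, starting at 1 (0 is left unused)."""
    labels = [line.strip() for line in label_lines]
    return {text: index for index, text in enumerate(labels, start=1)}

# Stand-in for the lines read from labels.txt
label_lines = ["dog\n", "cat\n", "flower\n"]
label_index = build_label_index(label_lines)
print(label_index)
```

Because the stored labels start at 1, a consumer that needs zero-based class indices has to subtract 1 when reading the records back.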
After generating the serialized files for both the training and test sets, you can use the data in the following convNN script:
import tensorflow as tf
import sys
import numpy as np
import matplotlib.pyplot as plt
filter_max_dimension = 50
filter_max_depth = 30
filter_h_and_w = [3,3]
filter_depth = [3,3]
numberOFclasses = 21
TensorBoard = "TB_conv2NN"
TF_Records = "TF_Recordsfolder"
learning_rate = 1e-5
max_numberofiteretion =100000
batchSize = 21
img_height = 128
img_width = 128
# 1st function to read images from TF_Record
def getImage(filename):
    with tf.device('/cpu:0'):
        # convert filenames to a queue for an input pipeline.
        filenameQ = tf.train.string_input_producer([filename],num_epochs=None)
        # object to read records
        recordReader = tf.TFRecordReader()
        # read the full set of features for a single example
        key, fullExample = recordReader.read(filenameQ)
        # parse the full example into its component features.
        features = tf.parse_single_example(
            fullExample,
            features={
                'image/height': tf.FixedLenFeature([], tf.int64),
                'image/width': tf.FixedLenFeature([], tf.int64),
                'image/colorspace': tf.FixedLenFeature([], dtype=tf.string,default_value=''),
                'image/channels': tf.FixedLenFeature([], tf.int64),
                'image/class/label': tf.FixedLenFeature([],tf.int64),
                'image/class/text': tf.FixedLenFeature([], dtype=tf.string,default_value=''),
                'image/format': tf.FixedLenFeature([], dtype=tf.string,default_value=''),
                'image/filename': tf.FixedLenFeature([], dtype=tf.string,default_value=''),
                'image/encoded': tf.FixedLenFeature([], dtype=tf.string, default_value='')
            })
        # now we are going to manipulate the label and image features
        label = features['image/class/label']
        image_buffer = features['image/encoded']
        # Decode the jpeg
        with tf.name_scope('decode_img',[image_buffer], None):
            # decode
            image = tf.image.decode_jpeg(image_buffer, channels=3)
            # and convert to single precision data type
            image = tf.image.convert_image_dtype(image, dtype=tf.float32)
        # cast image into a single array, where each element corresponds to the greyscale
        # value of a single pixel.
        # the "1-.." part inverts the image, so that the background is black.
        image=tf.reshape(1-tf.image.rgb_to_grayscale(image),[img_height*img_width])
        # re-define label as a "one-hot" vector of length numberOFclasses
        # (with two classes it would be [0,1] or [1,0]).
        # Labels in the TFRecord start at 1, hence label-1.
        label=tf.stack(tf.one_hot(label-1, numberOFclasses))
        # note the order: label first, then image
        return label, image
with tf.device('/cpu:0'):
    # note: getImage returns (label, image), so the *_img names here actually hold
    # labels and the *_label names hold images; the feeding function further down
    # unpacks the batches in the matching swapped order.
    train_img,train_label = getImage(TF_Records+"/train-00000-of-00001")
    validation_img,validation_label=getImage(TF_Records+"/validation-00000-of-00001")
    # associate the "label_batch" and "image_batch" objects with a randomly selected batch
    # of labels and images respectively
    train_imageBatch, train_labelBatch = tf.train.shuffle_batch([train_img, train_label], batch_size=batchSize,capacity=50,min_after_dequeue=10)
    # and similarly for the validation data
    validation_imageBatch, validation_labelBatch = tf.train.shuffle_batch([validation_img, validation_label],
                                                                          batch_size=batchSize,capacity=50,min_after_dequeue=10)
def train():
    with tf.device('/gpu:0'):
        config =tf.ConfigProto(log_device_placement=False, allow_soft_placement=True)
        #config.gpu_options.allow_growth = True
        #config.gpu_options.per_process_gpu_memory_fraction=0.9
        sess = tf.InteractiveSession(config = config)
        # defining the tensorflow graph:
        with tf.name_scope("input"):
            x = tf.placeholder(tf.float32,[None, img_width*img_height],name ="pixels_values")
            y_= tf.placeholder(tf.float32,[None,numberOFclasses],name='Prediction')
        with tf.name_scope("input_reshape"):
            image_shaped =tf.reshape(x,[-1,img_height,img_width,1])
            tf.summary.image('input_img',image_shaped,numberOFclasses)
        # defining weights and biases:
        def weights_variable (shape):
            return tf.Variable(tf.truncated_normal(shape,stddev=0.1))
        def bias_variable(shape):
            return tf.Variable(tf.constant(0.1,shape=shape))
        # helper function that generates summaries for a given variable
        def variable_summaries(var):
            with tf.name_scope('summaries'):
                mean = tf.reduce_mean(var)
                tf.summary.scalar('mean', mean)
                with tf.name_scope('stddev'):
                    stddev = tf.sqrt(tf.reduce_mean(tf.square(var - mean)))
                tf.summary.scalar('stddev', stddev)
                tf.summary.scalar('max', tf.reduce_max(var))
                tf.summary.scalar('min', tf.reduce_min(var))
                tf.summary.histogram('histogram', var)
        def conv2d(x, W):
            return tf.nn.conv2d(x, W, strides=[1, 1, 1, 1], padding='SAME')
        def max_pool_2x2(x):
            return tf.nn.max_pool(x, ksize=[1, 2, 2, 1],strides=[1, 2, 2, 1], padding='SAME')
        with tf.name_scope('1st_conv_layer'):
            W_conv1 = weights_variable([filter_h_and_w[0],filter_h_and_w[0], 1, filter_depth[0]])
            b_conv1 = bias_variable([filter_depth[0]])
            h_conv1 = tf.nn.relu(conv2d(tf.reshape(x,[-1,img_width,img_height,1]), W_conv1) + b_conv1)
        with tf.name_scope('1st_Pooling_layer'):
            h_conv1 = max_pool_2x2(h_conv1)
        with tf.name_scope('2nd_conv_layer'):
            W_conv2 = weights_variable([filter_h_and_w[1],filter_h_and_w[1], filter_depth[0], filter_depth[1]])
            b_conv2 = bias_variable([filter_depth[1]])
            h_conv2 = tf.nn.relu(conv2d(h_conv1, W_conv2) + b_conv2)
        with tf.name_scope('1st_Full_connected_Layer'):
            # after the single 2x2 pooling the feature map is
            # (img_height/2) x (img_width/2) x filter_depth[1] per image
            flat_size = (img_height // 2) * (img_width // 2) * filter_depth[1]
            W_fc1 = weights_variable([flat_size, 1024])
            b_fc1 = bias_variable([1024])
            h_pool_flat = tf.reshape(h_conv2, [-1, flat_size])
            h_fc1 = tf.nn.relu(tf.matmul(h_pool_flat, W_fc1) + b_fc1)
        with tf.name_scope('Dropout'):
            keep_prob = tf.placeholder("float")
            h_fc1_drop = tf.nn.dropout(h_fc1, keep_prob)
        with tf.name_scope('Output_layer'):
            W_fc3 = weights_variable([1024, numberOFclasses])
            b_fc3 = bias_variable([numberOFclasses])
            # keep the raw logits for the loss; softmax is only needed for predictions
            logits = tf.matmul(h_fc1_drop, W_fc3) + b_fc3
            y_conv = tf.nn.softmax(logits)
        with tf.name_scope('cross_entropy'):
            # The raw formulation of cross-entropy,
            #
            # tf.reduce_mean(-tf.reduce_sum(y_ * tf.log(tf.nn.softmax(y)),
            #                               reduction_indices=[1]))
            #
            # can be numerically unstable.
            #
            # So here we use tf.nn.softmax_cross_entropy_with_logits on the
            # raw (pre-softmax) outputs of the layer above, and then average
            # across the batch.
            diff = tf.nn.softmax_cross_entropy_with_logits(labels=y_, logits=logits)
            with tf.name_scope('total'):
                cross_entropy = tf.reduce_mean(diff)
        tf.summary.scalar('cross_entropy', cross_entropy)
        with tf.name_scope('train'):
            train_step = tf.train.AdamOptimizer(learning_rate).minimize(cross_entropy)
        with tf.name_scope('accuracy'):
            with tf.name_scope('correct_prediction'):
                correct_prediction = tf.equal(tf.argmax(y_conv, 1), tf.argmax(y_, 1))
            with tf.name_scope('accuracy'):
                accuracy = tf.reduce_mean(tf.cast(correct_prediction, tf.float32))
        tf.summary.scalar('accuracy', accuracy)
# Merging Summaries
merged = tf.summary.merge_all()
train_writer = tf.summary.FileWriter(TensorBoard + '/train', sess.graph)
test_writer = tf.summary.FileWriter(TensorBoard + '/test')
# initialize the variables
sess.run(tf.global_variables_initializer())
# start the threads used for reading files
coord = tf.train.Coordinator()
threads = tf.train.start_queue_runners(sess=sess,coord=coord)
# feeding function
def feed_dict(train):
if train:
#img_batch, labels_batch= tf.train.shuffle_batch([train_label,train_img],batch_size=batchSize,capacity=500,min_after_dequeue=200)
img_batch , labels_batch = sess.run([ train_labelBatch ,train_imageBatch])
dropoutValue = 0.7
else:
# img_batch,labels_batch = tf.train.shuffle_batch([validation_label,validation_img],batch_size=batchSize,capacity=500,min_after_dequeue=200)
img_batch,labels_batch = sess.run([ validation_labelBatch,validation_imageBatch])
dropoutValue = 1
return {x:img_batch,y_:labels_batch,keep_prob:dropoutValue}
for i in range(max_numberofiteretion):
if i%10 == 0:#Run a Test
summary, acc = sess.run([merged,accuracy],feed_dict=feed_dict(False))
#plt.imshow(output[0,:,:,1],cmap='gray')
#plt.show()
test_writer.add_summary(summary,i)# Save to TensorBoard
else: # Training
if i % 100 == 99: # Record execution stats
run_options = tf.RunOptions(trace_level=tf.RunOptions.FULL_TRACE)
run_metadata = tf.RunMetadata()
summary, _ = sess.run([merged, train_step],
feed_dict=feed_dict(True),
options=run_options,
run_metadata=run_metadata)
train_writer.add_run_metadata(run_metadata, 'step%03d' % i)
train_writer.add_summary(summary, i)
else: # Record a summary
output , summary, _ = sess.run([h_conv1,merged, train_step], feed_dict=feed_dict(True))
train_writer.add_summary(summary, i)
# finalise
coord.request_stop()
coord.join(threads)
train_writer.close()
test_writer.close()
filter_h_and_w[0] = np.random.randint(3, filter_max_dimension)
filter_h_and_w[1] = np.random.randint(3, filter_max_dimension)
filter_depth[0] = np.random.randint(3, filter_max_depth)
filter_depth[1] = np.random.randint(3, filter_max_depth)
TensorBoard = "ConV2NN/_filter"+str(filter_h_and_w[0])+"To"+str(filter_h_and_w[1])+"D"+str(filter_depth[0])+"To"+str(filter_depth[1])+"R10e5"
with tf.device('/gpu:0') :
train()
The script uses both the GPU and the CPU; if you don't have a GPU, TensorFlow will use your device's CPU. The code is self-explanatory: you need to change the image resolution values and the number of classes. You also need to start TensorBoard. The script saves a test folder and a train folder for TensorBoard, which you then just open in your browser.
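When changing the image resolution, note that the hard-coded 64 in filter_depth[1]*64 also depends on it. The script applies a single 2x2 max-pool with padding='SAME', so 64 would match, for example, a 16x16 input (the actual resolution is not shown here, so that is an assumption). A minimal sketch of how the flattened size could be computed; flattened_size is a hypothetical helper, not part of the script:

```python
import math

def flattened_size(img_width, img_height, depth, num_pools=1, pool_stride=2):
    """Features per example after `num_pools` SAME-padded poolings (hypothetical helper)."""
    w, h = img_width, img_height
    for _ in range(num_pools):
        # With padding='SAME', each pooling yields ceil(input / stride) per dimension.
        w = math.ceil(w / pool_stride)
        h = math.ceil(h / pool_stride)
    # SAME-padded stride-1 convolutions keep width and height unchanged.
    return w * h * depth

print(flattened_size(16, 16, depth=8))  # 512  (8 * 8 * 8)
print(flattened_size(28, 28, depth=8))  # 1568 (14 * 14 * 8)
```

The result would replace filter_depth[1]*64 in both W_fc1 and the reshape before the fully connected layer.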
Since you have only 2 classes, I think two conv layers are enough; if you think you need more, it is pretty easy to add them.
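About the cross-entropy comment in the script: tf.nn.softmax_cross_entropy_with_logits applies softmax internally, so it expects unscaled logits rather than softmax outputs. The plain-Python sketch below illustrates why working from logits is more stable. It uses the log-sum-exp trick for a single example; softmax_cross_entropy is a made-up name for illustration, not TensorFlow's implementation:

```python
import math

def softmax_cross_entropy(logits, labels):
    """Cross-entropy between `labels` and softmax(`logits`) for one example."""
    m = max(logits)
    # log(sum(exp(z))) computed stably by shifting with the max logit.
    log_sum_exp = m + math.log(sum(math.exp(z - m) for z in logits))
    # log softmax(z_i) = z_i - log_sum_exp, so no log(0) can occur.
    return -sum(y * (z - log_sum_exp) for y, z in zip(labels, logits))

# A naive softmax would call math.exp(1000.0), which overflows.
loss = softmax_cross_entropy([1000.0, 0.0], [0.0, 1.0])
print(loss)  # 1000.0

# Two equal logits give probability 0.5 each, i.e. a loss of log(2).
print(softmax_cross_entropy([0.0, 0.0], [1.0, 0.0]))
```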
I hope this will help

Updating the Initial state of a recurrent neural network in tensorflow

Currently I have the following code:
init_state = tf.Variable(tf.zeros([batch_partition_length, state_size])) # -> [16, 1024].
final_state = tf.Variable(tf.zeros([batch_partition_length, state_size]))
And inside my inference method, which is responsible for producing the output, I have the following:
def inference(frames):
    # Note that I write final_state as a global variable to avoid the shadowing issue, since it is referenced at the dynamic_rnn line.
    global final_state
    # .... Here we have some conv layers and so on...
    # Now the RNN cell
    with tf.variable_scope('local1') as scope:
        # Move everything into depth so we can perform a single matrix multiply.
        shape_d = pool3.get_shape()
        shape = shape_d[1] * shape_d[2] * shape_d[3]
        # tf_shape = tf.stack(shape)
        tf_shape = 1024
        print("shape:", shape, shape_d[1], shape_d[2], shape_d[3])
        # So note that tf_shape = 1024, this means that we have 1024 features fed into the network. And
        # the batch size = 1024. Therefore, the aim is to divide the batch_size into num_steps so that
        reshape = tf.reshape(pool3, [-1, tf_shape])
        # Now we need to reshape/divide the batch_size into num_steps so that we would be feeding a sequence
        rnn_inputs = tf.reshape(reshape, [batch_partition_length, step_size, tf_shape])
        print('RNN inputs shape: ', rnn_inputs.get_shape())  # -> (16, 64, 1024).
        cell = tf.contrib.rnn.BasicRNNCell(state_size)
        # note that rnn_outputs are the outputs but not multiplied by W.
        rnn_outputs, final_state = tf.nn.dynamic_rnn(cell, rnn_inputs, initial_state=init_state)
    # linear Wx + b
    with tf.variable_scope('softmax_linear') as scope:
        weight_softmax = \
            tf.Variable(
                tf.truncated_normal([state_size, n_classes], stddev=1 / state_size, dtype=tf.float32, name='weight_softmax'))
        bias_softmax = tf.constant(0.0, tf.float32, [n_classes], name='bias_softmax')
        softmax_linear = tf.reshape(
            tf.matmul(tf.reshape(rnn_outputs, [-1, state_size]), weight_softmax) + bias_softmax,
            [batch_size, n_classes])
        print('Output shape:', softmax_linear.get_shape())
    return softmax_linear
# Here we define the loss, accuracy and the optimzer.
# now run the graph:
with tf.Session() as sess:
    _, accuracy_train, loss_train, summary = \
        sess.run([optimizer, accuracy, cost_scalar, merged],
                 feed_dict={x: image_batch,
                            y_valence: valences,
                            confidence_holder: confidences})
    ....
Problem: How can I assign init_state the value stored in final_state? That is, how do I update one Variable with the value of another?
I have used the following:
tf.assign(init_state, final_state.eval())
inside the session, after the sess.run call. But this throws an error:
You must feed a value for placeholder tensor 'inputs' with dtype float
where the placeholder named 'inputs' is declared as follows:
x = tf.placeholder(tf.float32, [None, 112, 112, 3], name='inputs')
And the feeding is done after reading the images from the tfRecords through the following command:
example = tf.train.Example()
example.ParseFromString(string_record)
height = int(example.features.feature['height']
.int64_list
.value[0])
width = int(example.features.feature['width']
.int64_list
.value[0])
img_string = (example.features.feature['image_raw']
.bytes_list
.value[0])
img_1d = np.fromstring(img_string, dtype=np.uint8)
reconstructed_img = img_1d.reshape((height, width, -1)) # Where this is added to the image_batch list, which is fed into the placeholder.
And if I try the following instead:
img_1d = np.fromstring(img_string, dtype=np.float32)
This will produce the following error:
ValueError: cannot reshape array of size 9408 into shape (112,112,newaxis)
Any help is much appreciated!!
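The ValueError above follows from the byte count: a 112x112x3 image stored as uint8 takes 37632 bytes, and reading the same bytes as 4-byte float32 values yields only 37632 / 4 = 9408 elements, which cannot fill a (112, 112, -1) shape. A small stdlib-only sketch of that arithmetic, where raw is a stand-in for img_string and the 3 channels are assumed from the placeholder shape:

```python
from array import array

height, width, channels = 112, 112, 3
raw = bytes(height * width * channels)  # stand-in for img_string: one byte per pixel value

as_uint8 = array('B', raw)   # interpret each byte as one element
as_float32 = array('f')
as_float32.frombytes(raw)    # interpret every 4 bytes as one element

print(len(as_uint8))    # 37632 = 112 * 112 * 3
print(len(as_float32))  # 9408  = 37632 / 4
```

With NumPy, the usual approach would be to decode with dtype=np.uint8, reshape, and only then convert with .astype(np.float32).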
So here are the mistakes that I had made so far. After some revision I figured out the following:
I shouldn't create final_state as a tf.Variable. Since tf.nn.dynamic_rnn returns its outputs and final state as tensors, I should not instantiate final_state at the beginning, and I should not use global final_state inside the function definition.
To assign final_state to the initial state, I used:
tf.assign(init_state, final_state)
And things work out.
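Two details are worth keeping in mind here. tf.assign only builds an op, so the value is copied when that op is run in the session. And final_state.eval() without a feed_dict re-evaluates a tensor that depends on the 'inputs' placeholder, which is most likely why the original attempt raised the placeholder error. The underlying idea, reusing one batch's final state as the next batch's initial state, can be illustrated without TensorFlow. The sketch below uses a made-up toy tanh RNN in plain Python; rnn_step, run_rnn and the weights are hypothetical and not part of the code above:

```python
import math

def rnn_step(state, x, w_state=0.5, w_input=1.0):
    # One step of a toy tanh RNN; each state unit is updated independently.
    return [math.tanh(w_state * s + w_input * x) for s in state]

def run_rnn(inputs, init_state):
    # Unroll the toy RNN over `inputs`, starting from `init_state`.
    state = list(init_state)
    for x in inputs:
        state = rnn_step(state, x)
    return state

sequence = [0.1, -0.2, 0.3, 0.4, -0.5, 0.6]
zero_state = [0.0, 0.0]

# Reference: process the whole sequence in one go.
full_final = run_rnn(sequence, zero_state)

# Chunked: two batches, carrying the state across the boundary.
state = zero_state
for chunk in (sequence[:3], sequence[3:]):
    state = run_rnn(chunk, state)  # previous final state is the new initial state

# Without carrying, the second chunk restarts from zeros.
restarted = run_rnn(sequence[3:], zero_state)

print(state == full_final)      # True
print(restarted == full_final)  # False
```

In TF 1.x graph mode, a common equivalent is to fetch final_state in the same sess.run call as the training step and either feed it back through feed_dict or run an assign op on init_state.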
Note: in TensorFlow, running an operation in a session returns the data as a NumPy array in Python and as a tensorflow::Tensor in C and C++.
Have a look at https://www.tensorflow.org/versions/r0.10/get_started/basic_usage for more information.