What is the best way to run saved model with different batch size in TensorFlow? - tensorflow

I trained the CIFAR-10 example model from TensorFlow's repository with batch_size 128 and it worked fine. Then I froze the graph and managed to run it from C++, just as they do in their C++ label image example.
The only problem was that, to classify a single image from C++, I had to artificially generate a tensor of shape [128, image_height, image_width, channels], because the saved model expects a batch of 128 samples: that is the number of samples that comes from the queue.
I tried training the CIFAR-10 example with batch_size = 1, and then I could classify examples one by one when running the model from C++, but that doesn't seem like a great solution. I also tried manually changing the tensor shapes in the saved graph file, but it didn't work.
My question is: what is the best way to train a model with a fixed batch size (32, 64, 128, etc.) and then save it so that it can be used with an arbitrary batch size? If that's not possible, how can I save the model so that it can classify samples one by one?

It sounds like the problem is that TensorFlow is "baking in" the batch size to other tensors in the graph (e.g. if the graph contains tf.shape(t) for some tensor t whose shape depends on the batch size, the batch size might be stored in the graph as a constant). The solution is to change your program slightly so that tf.train.batch() returns tensors with a variable batch size.
The tf.train.batch() method accepts a tf.Tensor for the batch_size argument. Perhaps the simplest way to modify your program for variable-sized batches would be to define a placeholder for the batch size:
# Define a scalar tensor for the batch size, so that you can alter it at
# Session.run()-time.
batch_size_tensor = tf.placeholder(tf.int32, shape=[])
input_tensors = tf.train.batch(..., batch_size=batch_size_tensor, ...)
This would prevent the batch size from being baked into your GraphDef, so you should be able to feed values of any batch size in C++. However, this modification would require you to feed a value for the batch size on every step, which is slightly tedious.
Assuming that you always want to train with batch size 128, but retain the flexibility to change the batch size later, you could use a tf.placeholder_with_default() to specify that the batch size should be 128 when you don't feed an alternative value:
# Define a scalar tensor for the batch size, so that you can alter it at
# Session.run()-time.
batch_size_tensor = tf.placeholder_with_default(128, shape=[])
input_tensors = tf.train.batch(..., batch_size=batch_size_tensor, ...)
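As a loose plain-Python analogy (not TensorFlow code), tf.placeholder_with_default() behaves much like a default function argument: the default is used unless the caller supplies another value. The function name next_batch and the sample data below are made up for illustration.

```python
def next_batch(samples, start, batch_size=128):
    """Return up to `batch_size` samples beginning at index `start`.

    The default of 128 plays the role of the placeholder's default value;
    passing another number is the analogue of feeding the placeholder.
    """
    return samples[start:start + batch_size]


samples = list(range(1000))  # stand-in for a dataset of 1000 examples

train_batch = next_batch(samples, 0)                # default size: 128
single_item = next_batch(samples, 0, batch_size=1)  # overridden at call time

print(len(train_batch), len(single_item))
```

Training code never has to mention the batch size, while inference code can override it whenever it needs to.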

Is there a reason you need fixed batch size in the graph?
I think a good way is to build the graph with a variable batch size by putting None as the first dimension. During training, you can then pass the batch size flag to your data provider, so it feeds the desired amount of data in each iteration.
After the model is trained, you can export the graph using tf.train.Saver(), which exports the metagraph. To do inference, you can load the exported files and evaluate with any number of examples, even just one.
Note that this is different from the frozen graph.
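A small NumPy sketch (an illustration, not the TensorFlow graph itself) of why leaving the first dimension open works: the weights have no batch dimension, so the same parameters apply to any number of rows. The layer sizes and the name dense_forward are assumptions made for the example.

```python
import numpy as np

rng = np.random.default_rng(0)
num_features, num_classes = 8, 3

# Parameters depend only on the feature and class counts, never on the batch.
weights = rng.normal(size=(num_features, num_classes))
bias = np.zeros(num_classes)


def dense_forward(x):
    """Apply one dense layer to an input of shape [batch, num_features]."""
    return x @ weights + bias


train_out = dense_forward(rng.normal(size=(128, num_features)))  # training-size batch
single_out = dense_forward(rng.normal(size=(1, num_features)))   # one example

print(train_out.shape, single_out.shape)
```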


VGG19 .h5 file modifying

I'm using a pretrained VGG19 in my modified neural style transfer code (the Gatys algorithm), but my PC doesn't let me use the input image at its original size: the original height is 2499 pixels, but with 20 GB of RAM I can only go up to 1000 pixels.
From what I've read, the solution would be to decrease batch_size. So my question is: how can I modify the VGG19 .h5 file to change the batch_size inside it? Or can I override its batch_size in my code?
Assuming the pretrained model was trained on ImageNet, the input size it expects for a single sample (with its fully connected layers included) is 224x224.
If you try to pass a larger input, it's possible your deep learning framework will reshape it into many images to be classified at once.
If you resize your input data to 224x224, you will run with a single image (a batch size of 1).
You could make a custom implementation of your model that takes larger input sizes. However, sizing down to 224x224 generally gives good results, depending on the task.
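If you resize rather than change the model, a small helper can compute target dimensions that keep the aspect ratio. This is a sketch in plain Python: the function name fit_within and the example width of 1666 pixels are assumptions (the question only gives the height), and the 1000-pixel limit is the one reported in the question.

```python
def fit_within(width, height, max_side):
    """Scale (width, height) so the longer side is at most `max_side`.

    The aspect ratio is preserved; images that already fit are unchanged.
    """
    longest = max(width, height)
    if longest <= max_side:
        return width, height
    scale = max_side / longest
    return max(1, round(width * scale)), max(1, round(height * scale))


# Hypothetical 1666x2499 image, limited to 1000 pixels on its longer side.
new_width, new_height = fit_within(1666, 2499, max_side=1000)
print(new_width, new_height)
```

The resulting pair can be passed to whatever image-resizing routine your code already uses before the image is fed to the network.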

TensorFlow Increase Batch Size mid experiment

I am looking to replicate some of the behavior in this paper "don't decay the learning rate, increase the batch size" and I am wondering if there is a simple approach to increase the batch size within a GCMLE experiment. I have a custom estimator and I am trying to think of any ways to adjust the batch size within the experiment. I realize that I could run with one batch size for a certain number of epochs and then load this saved graph and kick off a subsequent experiment, but I am wondering if there are any other options to update the batch size within the same experiment?
Setting up your graph to support a variable batch size is pretty easy: just use None as the first dimension of the shape. Take a look at this article:
Build a graph that works with variable batch size using Tensorflow
Then you can feed a batch of any size at every sess.run(train_op, feed_dict={X: data, Y: labels}), where the first dimension of X, your batch, is variable length.
It pretty much just works as you'd expect.
Example graph structure with variable batch size:
X = tf.placeholder("float", [None, num_input])
Y = tf.placeholder("float", [None, num_classes])
Any dimension you declare as None is left unknown when the graph is built. TensorFlow fills it in from the actual data you pass at runtime.
In this example, in your first iterations your data shape might be [10, 784] (batches of 10), and in later iterations maybe your shape becomes [50, 784] (batches of 50). The rest of your graph setup will work without change.
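Because the placeholders accept any first dimension, growing the batch mid-run only changes how many rows you slice per step. Here is a minimal sketch of such a schedule in plain Python. The starting size, growth factor, interval, and cap are arbitrary assumptions, not values from the paper, and both function names are made up for illustration.

```python
def batch_size_at(step, initial=10, factor=5, every=1000, maximum=250):
    """Multiply the batch size by `factor` every `every` steps, up to `maximum`."""
    size = initial * factor ** (step // every)
    return min(size, maximum)


def iterate_batches(num_examples, num_steps):
    """Yield (start, stop) row ranges whose length follows the schedule.

    Assumes num_examples is at least as large as the biggest batch.
    """
    start = 0
    for step in range(num_steps):
        size = batch_size_at(step)
        if start + size > num_examples:
            start = 0  # wrap around to the beginning of the data
        yield start, start + size
        start += size


for step in (0, 999, 1000, 2000, 5000):
    print(step, batch_size_at(step))
```

Each (start, stop) pair would be used to slice the data and labels, e.g. data[start:stop], before passing them through feed_dict.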
One approach is to set delay_workers_by_global_step=True in the constructor to Experiment.
This works because the effective batch size is batch_size * num_workers. So if you delay the start of the other workers, your batch size will gradually increase.
Of course, your throughput will be correspondingly lower in the early phases.
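The arithmetic can be spelled out in a few lines of plain Python. The per-worker batch size of 128 and the four workers are assumed values chosen only for illustration.

```python
def effective_batch_size(per_worker_batch_size, active_workers):
    """Examples contributing to the gradients per step across active workers."""
    return per_worker_batch_size * active_workers


# As delayed workers come online one by one, the effective batch size grows.
for active in range(1, 5):
    print(active, effective_batch_size(128, active))
```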
If you directly want to control the batch_size, you will have to effectively replicate the behavior of learn_runner.run in your own code. That wouldn't be too bad, except for the fact that deep down in experiment.py, it starts a server which, AFAICT, cannot be disabled.

Neural Network with my own dataset

I have downloaded many face images from the web. In order to learn TensorFlow, I want to feed those images to a simple fully-connected neural network with a single hidden layer. I have found some example code here.
Since I am a beginner, I don't know how to train, evaluate, and test the network with the downloaded images. The code owner used a '.mat' file and a .pkl file. I don't understand how he organized the training and test sets.
In order to run the code with my images:
Do I need to divide my images into training, test, and validation folders and turn each folder into a mat file? How am I going to provide labels for the training?
Besides, I don't understand why he used a '.pkl' file.
All in all, I would like to change this code so that I can measure classification performance on the test, training, and validation sets of my image dataset.
It might be an easy question, but it is important for me as it is a starting step. Thanks for your understanding.
First, you don't have to use .mat files or pickles. TensorFlow expects NumPy arrays.
For instance, let's say you have 70000 images of size 28x28 (=784 dimensions) belonging to 10 classes. Let's also assume that you'd like to train a simple feedforward neural network to classify the images.
The first step would be to split the images between train and test (and validation, but let's put this aside for the sake of simplicity). For the sake of the example, let's imagine that you chose randomly 60000 images for your training set and 10000 for your test set.
The second step would be to ensure that your data has the right format. Here, you'd like your training set to consist of one numpy array of shape (60000, 784) for the images and another one of shape (60000, 10) for the labels (if you use one-hot encoding to represent your classes). As for your test set, you should have an array of shape (10000, 784) for the images and one of shape (10000, 10) for the labels.
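The first two steps might look roughly like this in NumPy. It is a sketch on synthetic data: 700 random "images" stand in for the 70000 real ones to keep it quick, and the function name split_and_encode is made up for illustration.

```python
import numpy as np


def split_and_encode(images, class_ids, num_classes, num_test, seed=0):
    """Shuffle, split into train/test, and one-hot encode the labels."""
    rng = np.random.default_rng(seed)
    order = rng.permutation(len(images))
    images, class_ids = images[order], class_ids[order]
    # Row i of the identity matrix is the one-hot vector for class i.
    one_hot = np.eye(num_classes, dtype=np.float32)[class_ids]
    return (images[num_test:], one_hot[num_test:],
            images[:num_test], one_hot[:num_test])


# Synthetic stand-in: 700 flattened 28x28 "images" in 10 classes.
images = np.random.default_rng(1).random((700, 784), dtype=np.float32)
class_ids = np.arange(700) % 10

x_train, y_train, x_test, y_test = split_and_encode(
    images, class_ids, num_classes=10, num_test=100)
print(x_train.shape, y_train.shape, x_test.shape, y_test.shape)
```

With real images you would first load each file, resize it to a common size, and flatten it into one row of the images array.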
Once you have these big numpy arrays, you should define placeholders that will allow you to feed data to your network during training and evaluation.
images = tf.placeholder(tf.float32, shape=[None, 784])
labels = tf.placeholder(tf.int64, shape=[None, 10])
The None here means that you can feed a batch of any size, i.e. as many images as you want, as long as your numpy array is of shape (anything, 784).
The third step consists in defining your model as well as the loss function and the optimizer.
The fourth step consists in training your network by feeding it with random batches of data using the placeholders created above. As your network is training, you can periodically print its performance like the training loss/accuracy as well as the test loss/accuracy.
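Drawing the random batches for the fourth step could be sketched as follows. The generator name random_batches, the batch size of 32, and the zero-filled arrays are placeholders for illustration only.

```python
import numpy as np


def random_batches(x, y, batch_size, num_steps, seed=0):
    """Yield `num_steps` random (inputs, labels) batches drawn from x and y."""
    rng = np.random.default_rng(seed)
    for _ in range(num_steps):
        # Sample distinct row indices so a batch never repeats an example.
        idx = rng.choice(len(x), size=batch_size, replace=False)
        yield x[idx], y[idx]


x_train = np.zeros((600, 784), dtype=np.float32)  # stand-in training images
y_train = np.zeros((600, 10), dtype=np.float32)   # stand-in one-hot labels

batches = list(random_batches(x_train, y_train, batch_size=32, num_steps=5))
print(len(batches), batches[0][0].shape, batches[0][1].shape)
```

In the TensorFlow program, each pair would be fed through the placeholders defined earlier, e.g. feed_dict={images: batch_x, labels: batch_y}.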
You can find a complete and very simple example here.

reduce size of pretrained deep learning model for feature generation

I am using a pretrained model in Keras to generate features for a set of images:
model = InceptionV3(weights='imagenet', include_top=False)
train_data = model.predict(data).reshape(data.shape[0],-1)
However, I have a lot of images and the Imagenet model outputs 131072 features (columns) for each image.
With 200k images I would get an array of (200000, 131072) which is too large to fit into memory.
More importantly, I need to save this array to disk, and it would take 100 GB of space when saved as .npy or .h5.
I could circumvent the memory problem by feeding only batches of like 1000 images and saving them to disk, but not the disk space problem.
How can I make the model smaller without losing too much information?
As the answer suggested, I included the next layer in the model as well:
base_model = InceptionV3(weights='imagenet')
model = Model(inputs=base_model.input, outputs=base_model.get_layer('avg_pool').output)
This reduced the output to (200000, 2048).
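For reference, the sizes involved can be checked with a few lines of plain Python. This assumes 4-byte float32 values stored uncompressed; the helper name array_size_gb is made up for illustration.

```python
BYTES_PER_FLOAT32 = 4


def array_size_gb(num_rows, num_columns, bytes_per_value=BYTES_PER_FLOAT32):
    """Uncompressed size of a dense array in gigabytes (1 GB = 1e9 bytes)."""
    return num_rows * num_columns * bytes_per_value / 1e9


full = array_size_gb(200_000, 131_072)  # features without the pooling layer
pooled = array_size_gb(200_000, 2_048)  # features after 'avg_pool'

print(f"{full:.1f} GB -> {pooled:.2f} GB")
```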
Update 2:
Another interesting solution may be the bcolz package, which reduces the size of numpy arrays: https://github.com/Blosc/bcolz
I see at least two solutions to your problem:
Apply AveragePooling2D((8, 8), strides=(8, 8)) to the output of the InceptionV3 model you loaded (without top), e.g. features = AveragePooling2D((8, 8), strides=(8, 8))(model.output). This is the next step in the InceptionV3 architecture, so one may reasonably assume that these features still hold plenty of discriminative information.
Apply some kind of dimensionality reduction (e.g. PCA) fitted on a sample of the data, then reduce the dimensionality of all the data to get a reasonable file size.
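The second option could be sketched in NumPy as follows. This is a hand-rolled PCA via SVD on random stand-in data, not a tuned implementation; the function names, the 64 input features, and the 16 retained components are assumptions for the example.

```python
import numpy as np


def fit_pca(sample, n_components):
    """Learn a PCA projection from `sample` (rows are feature vectors)."""
    mean = sample.mean(axis=0)
    # Rows of vt are the principal directions, ordered by explained variance.
    _, _, vt = np.linalg.svd(sample - mean, full_matrices=False)
    return mean, vt[:n_components]


def apply_pca(data, mean, components):
    """Project `data` onto the learned principal directions."""
    return (data - mean) @ components.T


rng = np.random.default_rng(0)
features = rng.normal(size=(500, 64))  # stand-in for model.predict(...) output

# Fit on a random subset only, then transform the full feature matrix.
sample = features[rng.choice(500, size=100, replace=False)]
mean, components = fit_pca(sample, n_components=16)
reduced = apply_pca(features, mean, components)
print(reduced.shape)
```

Because only the mean and the component matrix are needed to transform new rows, the full data set can be reduced batch by batch and never has to be in memory at once.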

Feeding individual examples into TensorFlow graph trained on files?

I'm new to TensorFlow and am getting a bit tripped up on the mechanics of reading data. I set up a TensorFlow graph on the mnist data, but I'd like to modify it so that I can run one program to train it + save the model out, and run another to load said graph, make predictions, and compute test accuracy.
Where I'm getting confused is how to bypass the original I/O system in the training graph and "inject" an image to predict or an (image, label) tuple of test data for accuracy testing. To read the training data, I'm using this code:
_, input_data = util.read_examples(...)
feature_map = {
    'label': tf.FixedLenFeature(
        shape=[], dtype=tf.int64, default_value=[-1]),
    'image': tf.FixedLenFeature(
        shape=[NUM_PIXELS * NUM_PIXELS], dtype=tf.int64),
}
example = tf.parse_example(input_data, features=feature_map)
I then feed example to a convolution layer, etc. and generate the output.
Now imagine that I train my graph with that code specifying the input, save out the graph and weights, and then restore the graph and weights in another script for prediction -- I'd like to take (say) 10 images and feed them to the graph to generate predictions. How do I "inject" those 10 images so that the predictions come out the other end?
I played around with feed dictionaries and placeholders, but I'm not sure if they're the right things for me to use... it seems like they rely on having data in memory, as opposed to reading from a queue of test data, for example.
A feed dictionary with placeholders would make sense if you wanted to perform a small number of inferences/evaluations (i.e. enough to fit in memory) - e.g. if you were serving a simple model or running small eval loops.
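For the in-memory case, feeding a handful of examples in small chunks might be sketched like this in plain Python. The helper predict_in_chunks is hypothetical, and fake_predict merely stands in for a call such as sess.run on your restored graph.

```python
def predict_in_chunks(predict_fn, inputs, chunk_size):
    """Call `predict_fn` on consecutive slices of `inputs` and join the results."""
    outputs = []
    for start in range(0, len(inputs), chunk_size):
        outputs.extend(predict_fn(inputs[start:start + chunk_size]))
    return outputs


def fake_predict(batch):
    """Hypothetical stand-in for running the restored graph on one batch,
    e.g. sess.run(predictions, feed_dict={image_placeholder: batch})."""
    return [x * 2 for x in batch]


results = predict_in_chunks(fake_predict, list(range(10)), chunk_size=4)
print(results)
```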
If you specifically want to infer or evaluate large batches then you should use the same approach you've used for training, but with a different path to your test/eval/live data. e.g.
_, eval_data = util.read_examples(
    paths_to_files,  # CHANGE THIS BIT
    ...)
You can use this as a normal python variable and set up successive, dependent steps to use this as a provided variable. e.g.
def get_example(data):
    return tf.parse_example(data, features=feature_map)