I gather from this question and its answer [ feeding image data in tensorflow for transfer learning ] that adding a new op to the imported graph will help, but it isn't clear to me if the resulting graph will handle both png and jpeg inputs automatically, and at the same time.
The answer to the above question suggests the following:
png_data = tf.placeholder(tf.string, shape=[])
decoded_png = tf.image.decode_png(png_data, channels=3)
# ...
graph_def = ...
softmax_tensor = tf.import_graph_def(
    graph_def,
    input_map={'DecodeJpeg:0': decoded_png},
    return_elements=['softmax:0'])
sess.run(softmax_tensor, {png_data: ...})
Does this mean that a PNG input must be passed in as
sess.run(softmax_tensor, {png_data: image_array})
And a JPEG input must be given to the graph as
sess.run(softmax_tensor, {'DecodeJpeg:0': image_array})
Would the second statement work after the graph has been modified and an op added at the bottom?
The answers in the previous question center around switching the graph from taking JPEGs to PNGs. With the network as specified, there's no way for it to handle both.
You have a few options if you need to deal with both types.
1) Handle the decoding yourself, either with PIL, or TensorFlow, and feed the decoded image bytes into the graph at the output of the existing decode node.
2) If you're happy feeding the network, then do a two-step operation where you re-plumb the input to read from a variable, and create two new nodes that write decoded output to that variable.
    sess.run(feed_jpeg, feed_dict={in_jpg: my_jpg})
    sess.run(the_network)
or
    sess.run(feed_png, feed_dict={in_png: my_png})
    sess.run(the_network)
3) Create a more complex conditional input path where you can feed a flag value that tells it what data type it is, and uses TF conditionals to only pull on the specified decode node.
4) Write a new op that dispatches to either decode_png or decode_jpeg as necessary, based upon the format string at the start of the data.
I'm hoping we'll expose some string comparison ops so that you could write (4) in pure TensorFlow, but I don't have a timeline for any of that.
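Until such ops exist, the format check can be done on the Python side before anything is fed to the graph. The following is only a sketch of that idea, not part of the original answer: the function name sniff_image_format is made up for illustration, and it relies only on the standard leading bytes of PNG and JPEG files.

```python
# Hypothetical helper: decide which decoder to use from the first bytes.
PNG_SIGNATURE = b"\x89PNG\r\n\x1a\n"  # fixed 8-byte PNG file signature
JPEG_PREFIX = b"\xff\xd8\xff"         # JPEG start-of-image marker plus next marker byte


def sniff_image_format(data):
    """Return 'png', 'jpeg', or None for a bytes object holding an encoded image."""
    if data.startswith(PNG_SIGNATURE):
        return "png"
    if data.startswith(JPEG_PREFIX):
        return "jpeg"
    return None
```

With the result in hand, you could decode with PIL (or with tf.image.decode_png / tf.image.decode_jpeg in a separate small graph) and feed the decoded array at the output of the existing decode node, as in the first option above.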
Related
I have a network with weights filled by manual tf.assign, and now I want to save the network with the weight values but without the placeholder inputs. It seems tf.train.Saver works only when I have the feed_dict available, and tf.train.export_meta_graph only saves the network structure. I tried pickle and dill but they both have errors. Are there any better solutions for this kind of saving?
Placeholders convert the input data into Tensors so I guess they are an important part of the Graph and I don't understand why you don't want to include them.
Even if you use tf.assign, you can freeze the graph, which means combining the structure with the weights. What freezing does is to convert Tensorflow variables into constants.
You have to save the structure of your graph:
gdef = g.as_graph_def()
tf.train.write_graph(gdef,".","graph.pb",False)
Then save the weights (after training)
saver.save(sess, 'tmp/my-weights')
And freeze the graph according to the tutorial in https://github.com/tensorflow/tensorflow/tree/master/tensorflow/contrib/lite
After that, you can use the Graph.
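As a rough illustration of what freezing does, here is a small sketch that skips the separate checkpoint step and converts the variables to constants directly in the session. It is written against tensorflow.compat.v1 so that it also runs on newer installs; the tiny graph and the names x, w and y are invented for the example, and the freezing call used is tf.graph_util.convert_variables_to_constants rather than the script from the linked tutorial.

```python
import numpy as np
import tensorflow.compat.v1 as tf

tf.disable_v2_behavior()


def build_and_freeze():
    """Build a toy graph, set its weight with tf.assign, and freeze it."""
    graph = tf.Graph()
    with graph.as_default():
        x = tf.placeholder(tf.float32, shape=[None, 2], name="x")
        w = tf.Variable(tf.zeros([2, 1]), name="w")
        tf.matmul(x, w, name="y")
        assign_w = tf.assign(w, [[1.0], [2.0]])  # weights filled manually
        with tf.Session(graph=graph) as sess:
            sess.run(tf.global_variables_initializer())
            sess.run(assign_w)
            # Only variables are evaluated here, so no feed_dict is needed
            # for the placeholder.
            return tf.graph_util.convert_variables_to_constants(
                sess, graph.as_graph_def(), output_node_names=["y"])


def run_frozen(frozen_graph_def, x_value):
    """Import the frozen GraphDef into a fresh graph and evaluate y."""
    graph = tf.Graph()
    with graph.as_default():
        tf.import_graph_def(frozen_graph_def, name="")
    with tf.Session(graph=graph) as sess:
        return sess.run("y:0", feed_dict={"x:0": x_value})
```

The returned GraphDef can be written to disk with tf.train.write_graph, as in the snippet above, and it still contains the placeholder as the input node.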
I'm using Tensorflow's 1.3 Estimator API to perform some image classification. Since I have a considerable amount of data, I gave the TFRecords a go. Saved the file and can read the examples to a Dataset using a parser function inside the input_fn of the estimator model. So far so good.
The issue is when I want to do some image augmentation (rotating and shearing in this case).
1) I tried using tf.contrib.keras.preprocessing.image.random_shear and the like. It turns out Keras doesn't like the format of TF's shape ('Dimension'), and I can't cast it to a list because its arguments are the axis indexes, not the actual values.
2) Then I tried using tf.contrib.image.rotate and tf.contrib.image.transform with random values in my chosen range. This time I get the error NotFoundError: Op type not registered 'ImageProjectiveTransform' in binary running on MYPC. Make sure the Op and Kernel are registered in the binary running in this process. This is an open issue (https://github.com/tensorflow/tensorflow/issues/9672). At the moment I can't move from Windows, so I would be very interested in possible alternatives.
3) I searched for a way to read the TFRecords, convert them to NumPy arrays, and do the augmentation with other tools, but I can't find a way to do that from within the input_fn, where I can't access the session.
Thanks!
Have you tried using the function from the answer to this question? tensorflow: how to rotate an image for data augmentation?
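If the contrib ops are unavailable, one possible fallback is to do the rotation and shear in plain NumPy. The sketch below is not the function from the linked answer: the names rotation_shear_matrix, affine_warp and random_augment are made up, it uses nearest-neighbour sampling only, and whether it can be hooked into your input_fn (for example through tf.py_func) is an assumption you would need to check in your setup.

```python
import numpy as np


def rotation_shear_matrix(angle_rad, shear, height, width):
    """2x3 matrix mapping output (x, y, 1) to input (x, y), about the centre."""
    centre = np.array([(width - 1) / 2.0, (height - 1) / 2.0])
    cos_a, sin_a = np.cos(angle_rad), np.sin(angle_rad)
    rotation = np.array([[cos_a, -sin_a], [sin_a, cos_a]])
    shearing = np.array([[1.0, shear], [0.0, 1.0]])
    linear = rotation @ shearing
    offset = centre - linear @ centre
    return np.hstack([linear, offset[:, None]])


def affine_warp(image, matrix, fill=0):
    """Warp an HxW or HxWxC array with nearest-neighbour sampling."""
    height, width = image.shape[:2]
    ys, xs = np.mgrid[0:height, 0:width]
    coords = np.stack([xs.ravel(), ys.ravel(), np.ones(height * width)])
    src = np.asarray(matrix, dtype=np.float64) @ coords
    src_x = np.rint(src[0]).astype(np.int64)
    src_y = np.rint(src[1]).astype(np.int64)
    # Output pixels whose source falls outside the image keep the fill value.
    valid = (src_x >= 0) & (src_x < width) & (src_y >= 0) & (src_y < height)
    flat_in = image.reshape(height * width, -1)
    flat_out = np.full(flat_in.shape, fill, dtype=image.dtype)
    flat_out[valid] = flat_in[src_y[valid] * width + src_x[valid]]
    return flat_out.reshape(image.shape)


def random_augment(image, rng, max_angle_rad=0.3, max_shear=0.2):
    """Apply one random rotation and shear drawn uniformly from the ranges."""
    angle = rng.uniform(-max_angle_rad, max_angle_rad)
    shear = rng.uniform(-max_shear, max_shear)
    height, width = image.shape[:2]
    return affine_warp(image, rotation_shear_matrix(angle, shear, height, width))
```

For example, random_augment(image, np.random.RandomState(0)) returns an array with the same shape and dtype as image. The default ranges are arbitrary placeholders.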
I want to use transfer learning with Google's Inception network for an image recognition problem. I am using retrain.py from the TensorFlow example source for inspiration.
In retrain.py, the Inception graph is loaded and a feed dict is used to feed the new images into the model's input layer. However, I have my data serialized in TFRecord files and have been using an input pipeline to feed in my inputs, as demonstrated here.
So I have a tensor images which returns my input data in batches when run. But how can I feed these images into Inception? I can't use a feed dict since my inputs are tensors, not NumPy arrays. My two ideas are
1) simply call sess.run() on each batch to convert it to a NumPy array, and then use a feed dict to pass it to Inception.
2) replace the input node in the Inception graph with my own batch input tensor
I think (1) would work, but it seems a little inelegant. (2) seems more natural to me, but I can't do exactly that because TensorFlow graphs can only be appended to and not otherwise modified.
Is there a better approach?
You can implement option (2), replacing the input node, but you will need to modify retrain.py to do so. The tf.import_graph_def() function supports a limited form of modification to the imported graph, by remapping tensors in the imported graph to existing tensors in the target graph.
This line in retrain.py calls tf.import_graph_def() to import the Inception model, where jpeg_data_tensor becomes the tensor that you feed with input data:
bottleneck_tensor, jpeg_data_tensor, resized_input_tensor = (
    tf.import_graph_def(graph_def, name='', return_elements=[
        BOTTLENECK_TENSOR_NAME, JPEG_DATA_TENSOR_NAME,
        RESIZED_INPUT_TENSOR_NAME]))
Instead of retrieving jpeg_data_tensor from the imported graph, you can remap it to an input pipeline that you construct yourself:
# Output of a training pipeline, returning a `tf.string` tensor containing
# a JPEG-encoded image.
jpeg_data_tensor = ...
bottleneck_tensor, resized_input_tensor = (
    tf.import_graph_def(
        graph_def,
        input_map={JPEG_DATA_TENSOR_NAME: jpeg_data_tensor},
        return_elements=[BOTTLENECK_TENSOR_NAME, RESIZED_INPUT_TENSOR_NAME]))
Wherever you previously fed jpeg_data_tensor, you no longer need to feed it, because the inputs will be read from the input pipeline you constructed. (Note that you might need to handle resized_input_tensor as well... I'm not intimately familiar with retrain.py, so some restructuring might be necessary.)
I trained the Cifar10 example model from TensorFlow's repository with batch_size 128 and it worked fine. Then I froze the graph and managed to run it from C++, just like they do in their C++ label image example.
The only problem was that I had to artificially generate a tensor of shape [128, image_height, image_width, channels] to classify a single image with C++, because the saved model expects an input of 128 samples per batch, since that is the number of samples that comes from the queue.
I tried training the Cifar10 example with batch_size = 1, and then I managed to classify examples one by one when running the model from C++, but that doesn't seem like a great solution. I also tried manually changing the tensor shapes in the saved graph file, but it didn't work.
My question is: what is the best way to train a model with a fixed batch size (like 32, 64, 128, etc.) and then save it so that it can be used with a batch of arbitrary size? If that's not possible, how can I save the model so that it can classify samples one by one?
It sounds like the problem is that TensorFlow is "baking in" the batch size to other tensors in the graph (e.g. if the graph contains tf.shape(t) for some tensor t whose shape depends on the batch size, the batch size might be stored in the graph as a constant). The solution is to change your program slightly so that tf.train.batch() returns tensors with a variable batch size.
The tf.train.batch() method accepts a tf.Tensor for the batch_size argument. Perhaps the simplest way to modify your program for variable-sized batches would be to define a placeholder for the batch size:
# Define a scalar tensor for the batch size, so that you can alter it at
# Session.run()-time.
batch_size_tensor = tf.placeholder(tf.int32, shape=[])
input_tensors = tf.train.batch(..., batch_size=batch_size_tensor, ...)
This would prevent the batch size from being baked into your GraphDef, so you should be able to feed values of any batch size in C++. However, this modification would require you to feed a value for the batch size on every step, which is slightly tedious.
Assuming that you always want to train with batch size 128, but retain the flexibility to change the batch size later, you could use a tf.placeholder_with_default() to specify that the batch size should be 128 when you don't feed an alternative value:
# Define a scalar tensor for the batch size, so that you can alter it at
# Session.run()-time.
batch_size_tensor = tf.placeholder_with_default(128, shape=[])
input_tensors = tf.train.batch(..., batch_size=batch_size_tensor, ...)
Is there a reason you need fixed batch size in the graph?
I think a good way is to build a graph with a variable batch size - by putting None as the first dimension. During training, you can then pass the batch size flag to your data provider, so it feeds the desired amount of data in each iteration.
After the model is trained, you can export the graph using tf.train.Saver(), which exports the metagraph. To do inference, you can load the exported files and just evaluate with any number of examples - also just one.
Note, this is different from the frozen graph.
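To make the None-as-first-dimension idea concrete, here is a minimal sketch written against tensorflow.compat.v1 so that it also runs on newer installs. The image shape, the single matmul "model" and all names are placeholders chosen for the example, not taken from the Cifar10 code.

```python
import numpy as np
import tensorflow.compat.v1 as tf

tf.disable_v2_behavior()

IMAGE_SHAPE = [24, 24, 3]  # assumed input size, for illustration only
NUM_CLASSES = 10


def build_graph():
    """Build a toy classifier whose batch dimension is left unspecified."""
    graph = tf.Graph()
    with graph.as_default():
        # None in the first dimension means "any batch size".
        images = tf.placeholder(
            tf.float32, shape=[None] + IMAGE_SHAPE, name="images")
        num_features = int(np.prod(IMAGE_SHAPE))
        flat = tf.reshape(images, [-1, num_features])
        weights = tf.Variable(tf.zeros([num_features, NUM_CLASSES]))
        logits = tf.matmul(flat, weights, name="logits")
        init = tf.global_variables_initializer()
    return graph, images, logits, init


def classify(batch):
    """Run the same graph on a batch of any length."""
    graph, images, logits, init = build_graph()
    with tf.Session(graph=graph) as sess:
        sess.run(init)
        return sess.run(logits, feed_dict={images: batch})
```

The same graph then accepts a batch of 128 during training and a batch of 1 at inference time, since only the trailing dimensions are fixed.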
I want to do real-time data augmentation by chaining different image transformation operators in TensorFlow. My code begins with image decoding and then runs different transformations, but it throws a ValueError('\'image\' must be fully defined.'). Here is an example that reproduces this error:
def decode_and_augment(image_raw):
    decoded = tf.image.decode_jpeg(image_raw)
    flipped = tf.image.random_flip_left_right(decoded)
    return flipped
This error arises because the tf.image.random_flip_left_right() op checks the static shape of its input when you build the graph, and tf.image.decode_jpeg() produces tensors that have a data dependency on the contents of image_raw, so the shape isn't statically known. Currently the only way to work around this is to set the static shape of the decoded tensor using Tensor.set_shape(), as follows:
decoded = tf.image.decode_jpeg(image_raw)
decoded.set_shape([IMAGE_HEIGHT, IMAGE_WIDTH, NUM_CHANNELS])
flipped = tf.image.random_flip_left_right(decoded)
The downside of this is that all images must now have the same size (and number of channels).
Many of the image ops don't follow the same gradual and dynamic shape inference as the rest of TensorFlow (which allows you to have unknown shapes or dimensions, assumes that the program is correct as you build the graph, and checks the real shapes at runtime). This is considered a bug at the present time, and we'll figure out a way to fix it.