How to predict with a Faster RCNN model in batches? - tensorflow

I have a trained RCNN (Keras-RetinaNet) model, but I can only predict one image at a time:
boxes, scores, labels = model.predict_on_batch(np.expand_dims(image, axis=0))
Full script is here.
Is there a way to predict multiple images at a time?
Thanks

The reason you expand the dims there is to artificially add a batch dimension. Simply stack a bunch of images together and pass them into this function to get a batch of results.
Said differently, right now you are passing in:
[image].
You could instead pass in:
[image_1, image_2, image_3, ...] (as a numpy array, not a Python list)
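A minimal sketch of that, assuming model is your trained keras-retinanet model and image_1, image_2, image_3 are preprocessed images that all have the same shape:
import numpy as np

# stack the preprocessed images along a new batch dimension: shape (3, H, W, 3)
batch = np.stack([image_1, image_2, image_3], axis=0)

# one forward pass returns one set of boxes/scores/labels per image in the batch
boxes, scores, labels = model.predict_on_batch(batch)
# boxes[i], scores[i], labels[i] belong to the i-th image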

Related

Subsection of grid as input to cnn

I have two huge grids (input and output) representing some spatial data of the same area. I want to be able to generate the output pixel-by-pixel by feeding a neural network a small part of the input grid, around the pixel of interest.
The naive way of training and evaluating the CNN would be to extract the sections separately and give those to the fit() function. But if the sub-grid the CNN operates on is e.g. a 256×256 area of the input, then I would copy each data point 65536 (!!!) times per epoch.
So is there any way to have keras just use subsections of a bigger data structure as training data?
To me, this sounds a bit like training RNNs on sequential sections of a data series, instead of copying each section separately.
The performance consideration is mainly in the case of evaluating the model. I want to use this model to generate an output grid of a huge geographical area (Denmark) at a resolution of 12.5 cm.
It seems to me that you are looking for a fully convolutional network (FCN).
By using only layers whose output size scales with their input (banishing the use of dense layers specifically), an FCN is able to produce an output whose spatial extent grows proportionally with that of the input; typically, the output has the same resolution as the input, as in your case.
If your inputs are very large, you can still train an FCN on sub-images. Then, for inference, you can
run the network on your entire image: indeed, sometimes the inputs are too big to be batched together during training, but can be fed one at a time for inference;
or split your input into sub-images and tile the results back together. In that case, I would probably use overlapping tiles to avoid potential border effects.
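As a rough illustration (not your exact architecture; the layer sizes and channel counts here are made up), a fully convolutional model in Keras could look like this:
from keras.models import Model
from keras.layers import Input, Conv2D

# spatial dimensions are left as None, so the same model accepts 256x256 training
# patches and much larger tiles at inference time
inp = Input(shape=(None, None, 1))
x = Conv2D(32, 3, padding='same', activation='relu')(inp)
x = Conv2D(32, 3, padding='same', activation='relu')(x)
out = Conv2D(1, 1, padding='same')(x)   # one output value per input pixel
fcn = Model(inp, out)
fcn.compile(optimizer='adam', loss='mse')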
You can probably do well with a keras.utils.Sequence generator.
You will still have to create slices for each batch, but taking slices isn't slow at all compared with the CNN operations.
And by using a keras.utils.Sequence, the batches are generated in parallel with the model's execution, so there is no penalty:
import numpy as np
import keras

class GridGenerator(keras.utils.Sequence):
    def __init__(self, originalGrid_maybeFileName, outputGrid, subGridSize):
        self.originalGrid = originalGrid_maybeFileName
        self.outputGrid = outputGrid
        self.subGridSize = subGridSize

        # naive implementation, assuming square grids whose side is a multiple of subGridSize
        self.divs = self.originalGrid.shape[0] // self.subGridSize

    def __len__(self):
        return self.divs * self.divs

    def __getitem__(self, i):
        row, column = divmod(i, self.divs)
        r0 = row * self.subGridSize
        c0 = column * self.subGridSize

        # using channels_last: grids have shape (height, width, channels)
        x = self.originalGrid[r0:r0 + self.subGridSize, c0:c0 + self.subGridSize]
        y = self.outputGrid[r0:r0 + self.subGridSize, c0:c0 + self.subGridSize]

        # each item is one batch, so add the batch dimension the model expects
        return np.expand_dims(x, 0), np.expand_dims(y, 0)
If the full grid doesn't fit your PC's memory, then you should find ways of loading parts of the grid at a time. (Use the generator to load these parts)
Create the generator and train with fit_generator:
generator = GridGenerator(xGrid, yGrid, subSize)
#you can create additional generators to take a part of that as training and another part as validation
model.fit_generator(generator, steps_per_epoch=len(generator), ..., workers=4)
The workers argument determines how many batches will be loaded in parallel before being sent to the model.

How to plot an attention vector on a TensorBoard graph

The attention vector for a sequence-to-sequence model is basically an array of shape [batch_size, time_steps, 1], which indicates the weight of each time step.
But if I use tf.summary.histogram to show it on TensorBoard, TensorFlow will only show the distribution of the weights; I can't tell which time step is more important. I could use tf.summary.scalar, but the length of my source sequence is 128, which is too many plots. The most natural way of showing this kind of data is a picture like this, but how can I do it in TensorBoard?
TensorBoard does not currently support visualizing tensor summaries. There is a summary op for it, but TensorBoard will just skip it when reading summaries from disk at the moment. I am also not aware of any third-party plugins that support this, though it is very much doable.
In your plot, it seems like there are only 19 time steps. One way is to create 19 scalar summaries. Alternatively, you can use the tf.summary.tensor_summary op, but you will then have to process the TensorBoard event file (containing this data) with your own script.
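A minimal sketch of the first option (TF1-style summaries; the attention placeholder below is a hypothetical stand-in for your real attention tensor):
import tensorflow as tf

time_steps = 19
# stand-in for your attention tensor of shape [batch_size, time_steps, 1]
attention = tf.placeholder(tf.float32, shape=[None, time_steps, 1])

for t in range(time_steps):
    # one scalar plot per time step: the attention weight averaged over the batch
    tf.summary.scalar('attention/step_%02d' % t, tf.reduce_mean(attention[:, t, 0]))

merged_summary = tf.summary.merge_all()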

Can I save a graph with its values without saving the inputs?

I have a network with weights filled by manual tf.assign, and now I want to save the network with the weight values but without the placeholder inputs. It seems tf.train.Saver works only when I have the feed_dict available, and tf.train.export_meta_graph only saves the network structure. I tried pickle and dill but they both have errors. Are there any better solutions for this kind of saving?
Placeholders convert the input data into Tensors, so I guess they are an important part of the Graph, and I don't understand why you don't want to include them.
Even if you use tf.assign, you can freeze the graph, which means combining the structure with the weights. What freezing does is convert TensorFlow variables into constants.
You have to save the structure of your graph:
# g is the tf.Graph that holds your network
gdef = g.as_graph_def()
tf.train.write_graph(gdef, ".", "graph.pb", False)
Then save the weights (after training):
saver.save(sess, 'tmp/my-weights')
And freeze the graph according to the tutorial in https://github.com/tensorflow/tensorflow/tree/master/tensorflow/contrib/lite
After that, you can use the Graph.
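If you prefer to do the freezing step in your own script instead of with the tutorial's tool, here is a minimal sketch (TF1; it assumes a session sess that holds the assigned weights, and that your output node is named 'output', which you need to adjust to your graph):
import tensorflow as tf

# sess is the session in which you ran tf.assign / training
frozen_gdef = tf.graph_util.convert_variables_to_constants(
    sess,
    sess.graph.as_graph_def(),   # graph structure
    ['output'])                  # names of the output nodes you want to keep
tf.train.write_graph(frozen_gdef, ".", "frozen_graph.pb", False)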

Neural Network with my own dataset

I have downloaded many face images from the web. In order to learn TensorFlow, I want to feed those images to a simple fully-connected neural network with a single hidden layer. I found an example code here.
Since I am a beginner, I don't know how to train, evaluate, and test the network with the downloaded images. The code owner used a '.mat' file and a '.pkl' file. I don't understand how he organized the training and test sets.
In order to run the code with my images;
Do I need to divide my images into training, test, and validation folders and turn each folder into a mat file? How am I going to provide labels for the training?
Besides, I don't understand why he used a '.pkl' file?
All in all, I would like to change this code so that I can find test, training, and validation set classification performance with my image dataset.
It might be an easy question, but it is important for me as it is a starting step. Thanks for your understanding.
First, you don't have to use .mat files or pickles. TensorFlow expects numpy arrays.
For instance, let's say you have 70000 images of size 28x28 (=784 dimensions) belonging to 10 classes. Let's also assume that you'd like to train a simple feedforward neural network to classify the images.
The first step would be to split the images between train and test (and validation, but let's put this aside for the sake of simplicity). For the sake of the example, let's imagine that you randomly chose 60000 images for your training set and 10000 for your test set.
The second step would be to ensure that your data has the right format. Here, you'd like your training set to consist of one numpy array of shape (60000, 784) for the images and another one of shape (60000, 10) for the labels (if you use one-hot encoding to represent your classes). As for your test set, you should have an array of shape (10000, 784) for the images and one of shape (10000, 10) for the labels.
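For example, a hypothetical sketch of getting raw data into that format with numpy (the array names and the random data are placeholders for your real images):
import numpy as np

# stand-in for your real data: 70000 images of 28x28 pixels, integer labels 0-9
raw_images = np.random.rand(70000, 28, 28).astype(np.float32)
raw_labels = np.random.randint(0, 10, size=70000)

images_flat = raw_images.reshape(-1, 784)    # (70000, 784)
labels_onehot = np.eye(10)[raw_labels]       # (70000, 10) one-hot encoding

# split: first 60000 examples for training, last 10000 for test
train_images, test_images = images_flat[:60000], images_flat[60000:]
train_labels, test_labels = labels_onehot[:60000], labels_onehot[60000:]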
Once you have these big numpy arrays, you should define placeholders that will allow you to feed data to your network during training and evaluation.
images = tf.placeholder(tf.float32, shape=[None, 784])
labels = tf.placeholder(tf.int64, shape=[None, 10])
The None here means that you can feed a batch of any size, i.e. as many images as you want, as long as your numpy array is of shape (anything, 784).
The third step consists in defining your model as well as the loss function and the optimizer.
The fourth step consists in training your network by feeding it with random batches of data using the placeholders created above. As your network is training, you can periodically print its performance like the training loss/accuracy as well as the test loss/accuracy.
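A minimal sketch of the third and fourth steps (TF1-style; the hidden-layer size, learning rate, and batch size are made-up values, and train_images/train_labels are the numpy arrays described above):
import numpy as np
import tensorflow as tf

images = tf.placeholder(tf.float32, shape=[None, 784])
labels = tf.placeholder(tf.int64, shape=[None, 10])

# simple network with a single hidden layer
hidden = tf.layers.dense(images, 256, activation=tf.nn.relu)
logits = tf.layers.dense(hidden, 10)

# the one-hot labels are cast to float to match the logits
loss = tf.losses.softmax_cross_entropy(onehot_labels=tf.cast(labels, tf.float32),
                                       logits=logits)
train_op = tf.train.AdamOptimizer(1e-3).minimize(loss)

with tf.Session() as sess:
    sess.run(tf.global_variables_initializer())
    for step in range(1000):
        idx = np.random.choice(len(train_images), 128)   # random mini-batch
        _, batch_loss = sess.run([train_op, loss],
                                 feed_dict={images: train_images[idx],
                                            labels: train_labels[idx]})
        if step % 100 == 0:
            print(step, batch_loss)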
You can find a complete and very simple example here.

How should I structure my labels for TensorFlow?

I'm trying to use TensorFlow to train output servo commands given an input image.
I plan on using a file as #mrry suggested in this question, with the images like so:
../some/path/some_img.JPG *some_label*
My question is, what are the label formats I can provide to TensorFlow and what structures are suggested?
My data is basically n servo commands from 0-10 seconds. A vector would work great:
[0,2,4,3]
or similarly:
[0,.25,.4,.3]
I couldn't find much about labels in the docs. Can anyone shed any light on TensorFlow labels?
And a very related question is what is the best way to structure these for TensorFlow to properly learn from them?
In TensorFlow, labels are just generic tensors. You can use any kind of tensor to store your labels. In your case, a 1-D tensor with shape (4,) seems to be what you want.
Labels only differ from the rest of the data by their use in the computational graph. (Usually) labels are only used inside the loss function, while you propagate the other data through the whole network. For your problem, a 4-dimensional regression output should work.
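A rough sketch of that idea (the input size, the single dense layer, and all names here are made up for illustration; replace the middle part with your real network):
import tensorflow as tf

# input image and the 4 servo commands used as the regression target
image = tf.placeholder(tf.float32, shape=[None, 224, 224, 3])
servo_target = tf.placeholder(tf.float32, shape=[None, 4])   # e.g. [0, .25, .4, .3]

features = tf.layers.flatten(image)
servo_pred = tf.layers.dense(features, 4)    # 4 regression outputs, one per command

# the labels only appear here, inside the loss
loss = tf.losses.mean_squared_error(labels=servo_target, predictions=servo_pred)
train_op = tf.train.AdamOptimizer(1e-4).minimize(loss)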
Also, look at my newest comment to the (old) question. Using the slice_input_producer seems to be preferable in your case.