Modifying the VGG19 .h5 file - tensorflow

I'm using pretrained VGG19 in my modified neural style transfer code (the Gatys algorithm), but my PC can't handle the input image at its original size (the original height is 2499 px, but with 20 GB of RAM I can only go up to 1000 px).
From what I've read, the solution is to decrease batch_size. So my question is: how can I modify the VGG19 .h5 file to change the batch_size inside it? Or can I override its batch_size in my code?

Assuming the pretrained model was trained on ImageNet, the input size it expects for a single sample is 224*224.
If you pass a larger input, your deep learning framework may reshape it into many images to be classified at once.
If you resize your input data to 224*224, you will run with a single image (a batch size of 1).
You could make a custom implementation of your model to take larger input sizes. However, sizing down to 224*224 generally gives good results, depending on the task.
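For the larger-input route, a minimal sketch assuming a tf.keras workflow (the 512-pixel test input is illustrative): loading VGG19 without its fully connected top leaves a purely convolutional base, which has neither a fixed spatial size nor a fixed batch size baked in.
import numpy as np
from tensorflow.keras.applications import VGG19

# Without the fully connected top, the convolutional base of VGG19
# accepts arbitrary spatial dimensions; the batch dimension is free too.
model = VGG19(weights='imagenet', include_top=False)

# A single large image, i.e. a batch size of 1:
features = model.predict(np.zeros((1, 512, 512, 3), dtype='float32'))
print(features.shape)  # (1, 16, 16, 512)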

Related

HOW do Keras and PyTorch handle the last batch when batch_size is not a multiple of the training data size?

Before you suggest anything or report this as a duplicate: I am asking HOW the two libraries handle the problem, NOT WHAT HAPPENS, because I already know a batch can be made from the remaining data. I know it is handled; all I am asking is HOW.
If I have 100 images as training data and batch_size=15, the last batch will have 10 images to train on. My question: the Input() layer already expects the data to come in with shape (batch_size, channels, width, height) for PyTorch and (batch_size, width, height, channels) for Keras with the TensorFlow backend.
If the last batch has size 10, shouldn't the model throw an error, since it will get (10,1,28,28) in place of (15,1,28,28), given (28,28)-pixel grayscale images?
What is happening behind the scenes?
If you look at the docs, for example, you can see that the batch size is optional. That is because the batch size is treated as a variable in every iteration.
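A minimal sketch in Keras (the MNIST-like shapes are illustrative) showing that only the per-sample shape is fixed, while the batch dimension stays None until run time:
from tensorflow.keras import Input, Model
from tensorflow.keras.layers import Dense, Flatten

# Input() fixes only the per-sample shape; the batch axis is symbolic.
inp = Input(shape=(28, 28, 1))
out = Dense(10)(Flatten()(inp))
model = Model(inp, out)

print(model.input_shape)  # (None, 28, 28, 1) - the batch size is a variable
# So a final batch of 10 samples is just as valid as one of 15.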

Why do we have target_size for DeepLab when a CNN can accept any size?

There is a concept I still have not understood. One reason we use a fully convolutional layer at the end of a CNN is to handle different image sizes during training. My question is: if this is the case, why do we always crop or squeeze images to square sizes at the input? Please do not say the question is a duplicate or answer with "we use square images to make it easier", "check pyramid pooling", and so on.
For example, Here's a link
DeepLab can accept images of different sizes. But in its code there is a target_size of 513. Now, if a CNN can accept images of different sizes, why do we need target_size at all? If it is for converting images into a standard format, why 513?
During training, we should specify a batch size. What would our batch be in this case: (5, None, None, None)? Is it possible to have images of different sizes in one batch?
I have read many posts and I am still confused by these questions:
- How can we train a model on images of different sizes (imagine the sizes themselves are standard)? I see some code using a batch size of one; I do not think that is a solution.
- Is there any code snippet that shows how to define batches for a model like FCN so it accepts a dataset with different sizes?
- In this paper: Here's a link my problem was explained, but the authors again resized images to a square format. If we can use batches made up of images of different sizes, why did they propose the idea of using square images between 180 by 180 and 224 by 224?
Has DeepLab used this part: link to bring images into a standard format, or for some other reason?
# DeepLab's preprocessing: scale the longer side down to 513 px,
# preserving the aspect ratio.
width, height = image.size
resize_ratio = 1.0 * 513 / max(width, height)
target_size = (int(resize_ratio * width), int(resize_ratio * height))
I could not find the place in their code where they train the model on the PASCAL dataset.
I expected to find simple Keras or TensorFlow code that clearly shows how to apply a CNN model such as FCN or DeepLab to a dataset such as PASCAL VOC2012 (for segmentation) with images of different sizes, without any resizing or cropping. I am still looking.
Thank you in advance for detailed answers. Please do not repeat answers like "you can use batch size one", "square images are common and better", "you can add black margins to the images", "the fully connected layer is the problem", "you can use global max pooling", and so on. I am looking for code that works on images of different sizes.
I could not find the place in the DeepLab model on TensorFlow's GitHub where it accepts batches of different sizes: here.
Also, in here, FCN is trained on the COCO dataset with a target_size of 320 by 320. Why? An FCN should accept any size.
Also, could someone explain how we can have a batch of images of different sizes? Can we have an np array of differently sized images? Batch = [5, None, None, 3], each of the 5 with a different size (see the sketch after this question).
I also found another confusing part in semantic segmentation: using Keras augmentation, we cannot augment an image with more than 4 channels. That means we cannot train on the PASCAL dataset with 21 channels using Keras augmentation. Is that right?
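On the batch question above: a plain np array cannot be ragged, but tf.data can pad each batch up to its largest member, which is one concrete way to get a (batch, None, None, 3) pipeline. A hedged sketch (the toy generator and sizes are illustrative):
import numpy as np
import tensorflow as tf

def gen():
    # Three images with different spatial sizes.
    for h, w in [(100, 80), (120, 90), (60, 140)]:
        yield np.zeros((h, w, 3), dtype=np.float32)

ds = tf.data.Dataset.from_generator(
    gen, output_types=tf.float32, output_shapes=[None, None, 3])
# Each batch is zero-padded up to the largest image it contains.
ds = ds.padded_batch(2, padded_shapes=[None, None, 3])

for batch in ds:        # TF 2.x eager iteration assumed
    print(batch.shape)  # (2, 120, 90, 3), then (1, 60, 140, 3)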

tensorflow: batches of variable-sized images

When one passes tensors to tf.train.batch, it looks like the shape of each element has to be fully defined, or else it complains that "All shapes must be fully defined" if there exist tensors with shape Dimension(None). How, then, does one train on images of different sizes?
You could set dynamic_pad=True in the arguments of tf.train.batch.
dynamic_pad: Boolean. Allow variable dimensions in input shapes. The given dimensions are padded upon dequeue so that tensors within a batch have the same shapes.
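A minimal sketch of that flag using the TF1 queue API (the random variable-height tensor is only a stand-in for a real reader pipeline):
import tensorflow as tf  # TF 1.x API

# A toy image whose height varies between enqueues.
h = tf.random_uniform([], minval=10, maxval=20, dtype=tf.int32)
image = tf.random_uniform(tf.stack([h, 15, 3]))
image.set_shape([None, 15, 3])  # rank must be known; height may be None

# dynamic_pad=True uses a PaddingFIFOQueue: each dequeued batch is
# zero-padded up to the largest shape present in that batch.
batch = tf.train.batch([image], batch_size=4, dynamic_pad=True)

with tf.Session() as sess:
    coord = tf.train.Coordinator()
    threads = tf.train.start_queue_runners(sess=sess, coord=coord)
    print(sess.run(batch).shape)  # e.g. (4, 19, 15, 3)
    coord.request_stop()
    coord.join(threads)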
Usually, images are resized to a certain number of pixels.
Depending on your task, you might be able to use other techniques to process images of varying sizes. For example, in face recognition and OCR a fixed-size window is used and then moved over the image. For other tasks, convolutional neural networks with pooling layers or recurrent neural networks can be helpful.
I see that this is quite an old question, but in case someone is searching for how variable-size images can still be used in batches, I can share what I did for an image-to-image convolutional network (inference), which was trained for variable image sizes and batch size 1. Why: when I tried to process images in batches using padding, the results became much worse, because the signal was "spreading" inside the network and started to influence its convolution pyramids.
So what I did is possible when you have the source code and can load weights manually into the convolutional layers. I modified the network in the following way: along with a batch of zero-padded images, I added an extra placeholder which received a batch of binary masks, with 1 where actual data was on the patch and 0 where padding was applied. Then I multiplied the signal by these masks after each convolutional layer inside the network, fighting the "spreading". Multiplication is not an expensive operation, so it did not affect performance much.
The result was no longer deformed, but it still had some border artifacts, so I refined this approach further by adding small (2 px) symmetric padding around the input images (the kernel size of all layers of my CNN was 3), and keeping it during propagation by using a slightly bigger (+[2px, 2px]) mask.
One can apply the same approach to training as well. Then some sort of "masked" loss is needed, where only the ROI on each patch is used to compute the loss. For example, for an L1/L2 loss you can compute the difference image between the generated and label images and apply the masks before summing up. More complicated losses might involve unstacking or iterating over the batch and extracting the ROI using tf.where or tf.boolean_mask.
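A sketch of such a masked L1 loss (TF1-style; pred, label, and mask stand for the zero-padded batches and validity masks described above):
import tensorflow as tf

pred = tf.placeholder(tf.float32, [None, None, None, 3])   # network output
label = tf.placeholder(tf.float32, [None, None, None, 3])  # ground truth
mask = tf.placeholder(tf.float32, [None, None, None, 1])   # 1 = real pixel

# Zero out padded regions, then normalize by the number of valid pixels
# so small and large ROIs contribute comparably.
masked_diff = tf.abs(pred - label) * mask
loss = tf.reduce_sum(masked_diff) / tf.maximum(tf.reduce_sum(mask), 1.0)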
Such training can indeed be beneficial in some cases, because you can combine small and big inputs for the network without the small inputs being affected by the loss from the big padded surroundings.

reduce size of pretrained deep learning model for feature generation

I am using a pretrained model in Keras to generate features for a set of images:
from keras.applications.inception_v3 import InceptionV3

model = InceptionV3(weights='imagenet', include_top=False)
train_data = model.predict(data).reshape(data.shape[0], -1)  # data: array of images
However, I have a lot of images, and the ImageNet model outputs 131072 features (columns) for each image.
With 200k images I would get an array of shape (200000, 131072), which is too large to fit into memory.
More importantly, I need to save this array to disk, and it would take 100 GB of space when saved as .npy or .h5py.
I could circumvent the memory problem by feeding only batches of, say, 1000 images and saving them to disk incrementally, but that does not solve the disk space problem.
How can I make the model smaller without losing too much information?
Update: as the answer suggested, I included the next layer in the model as well:
base_model = InceptionV3(weights='imagenet')
model = Model(input=base_model.input, output=base_model.get_layer('avg_pool').output)
This reduced the output to (200000, 2048).
Update 2: another interesting solution may be the bcolz package, which reduces the on-disk size of numpy arrays: https://github.com/Blosc/bcolz
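For reference, a minimal sketch of the bcolz route (the file name and stand-in data are illustrative):
import bcolz
import numpy as np

features = np.random.rand(1000, 2048).astype('float32')  # stand-in data

# carray stores the array chunked and compressed on disk.
c = bcolz.carray(features, rootdir='features.bcolz', mode='w')
c.flush()

# Read it back without loading everything into memory at once:
features_again = bcolz.open('features.bcolz')[:]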
I see at least two solutions to your problem:
Apply pooled = AveragePooling2D((8, 8), strides=(8, 8))(model.output), where model is the InceptionV3 object you loaded (without the top). This is the next step in the InceptionV3 architecture, so one may reasonably assume that these features still hold plenty of discriminative clues.
Apply some kind of dimensionality reduction (e.g. PCA) on a sample of the data, then use it to reduce the dimensionality of all the data to get a reasonable file size.
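A sketch of the second option with scikit-learn (the stand-in sample and the choice of 512 components are illustrative):
import numpy as np
from sklearn.decomposition import IncrementalPCA

features = np.random.rand(5000, 2048).astype('float32')  # stand-in sample

# Fit on a sample, then transform the full dataset chunk by chunk.
pca = IncrementalPCA(n_components=512, batch_size=1000)
pca.fit(features)
reduced = pca.transform(features)
print(reduced.shape)  # (5000, 512)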

What is the best way to run saved model with different batch size in TensorFlow?

I trained the CIFAR-10 example model from TensorFlow's repository with batch_size 128, and it worked fine. Then I froze the graph and managed to run it from C++, just like they do in their C++ label_image example.
The only problem was that I had to artificially generate a tensor of shape [128, image_height, image_width, channels] to classify a single image from C++, because the saved model expects a batch of 128 samples, since that is the number of samples that comes from the queue.
I tried training the CIFAR-10 example with batch_size = 1, and then I managed to classify examples one by one when running the model from C++, but that does not seem like a great solution. I also tried manually changing the tensor shapes in the saved graph file, but it did not work.
My question is: what is the best way to train a model with a fixed batch size (like 32, 64, 128, etc.) and then save it so that it can be used with a batch of arbitrary size? If that is not possible, then how do I save the model so that it can classify samples one by one?
It sounds like the problem is that TensorFlow is "baking in" the batch size to other tensors in the graph (e.g. if the graph contains tf.shape(t) for some tensor t whose shape depends on the batch size, the batch size might be stored in the graph as a constant). The solution is to change your program slightly so that tf.train.batch() returns tensors with a variable batch size.
The tf.train.batch() method accepts a tf.Tensor for the batch_size argument. Perhaps the simplest way to modify your program for variable-sized batches would be to define a placeholder for the batch size:
# Define a scalar tensor for the batch size, so that you can alter it at
# Session.run()-time.
batch_size_tensor = tf.placeholder(tf.int32, shape=[])
input_tensors = tf.train.batch(..., batch_size=batch_size_tensor, ...)
This would prevent the batch size from being baked into your GraphDef, so you should be able to feed values of any batch size in C++. However, this modification would require you to feed a value for the batch size on every step, which is slightly tedious.
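Each training step would then look something like this (a sketch; train_op is illustrative):
sess.run(train_op, feed_dict={batch_size_tensor: 128})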
Assuming that you always want to train with batch size 128, but retain the flexibility to change the batch size later, you could use a tf.placeholder_with_default() to specify that the batch size should be 128 when you don't feed an alternative value:
# Define a scalar tensor for the batch size, so that you can alter it at
# Session.run()-time.
batch_size_tensor = tf.placeholder_with_default(128, shape=[])
input_tensors = tf.train.batch(..., batch_size=batch_size_tensor, ...)
Is there a reason you need a fixed batch size in the graph?
I think a good way is to build a graph with a variable batch size, by putting None as the first dimension. During training, you can then pass the batch size flag to your data provider, so it feeds the desired amount of data in each iteration.
After the model is trained, you can export the graph using tf.train.Saver(), which exports the metagraph. To do inference, you can load the exported files and just evaluate with any number of examples, even just one.
Note that this is different from the frozen graph.
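A compact sketch of that pattern (TF1-style; the tiny network and shapes are illustrative):
import tensorflow as tf

# The batch dimension is left as None, so any batch size is accepted.
images = tf.placeholder(tf.float32, shape=[None, 32, 32, 3], name='images')
logits = tf.layers.dense(tf.layers.flatten(images), 10, name='logits')

saver = tf.train.Saver()
with tf.Session() as sess:
    sess.run(tf.global_variables_initializer())
    # ... train with feeds of shape (128, 32, 32, 3) ...
    saver.save(sess, 'model.ckpt')
# The restored metagraph then evaluates with any number of examples,
# including a single one of shape (1, 32, 32, 3).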