tensorflow retrain.py understanding train_batch_size - tensorflow

I'm working my way through the Tensorflow InceptionV3 tutorial: https://www.tensorflow.org/tutorials/image_retraining#bottlenecks
I come across the following pargraph:
By default this script will run 4,000 training steps. Each step chooses ten images at random from the training set, finds their bottlenecks from the cache, and feeds them into the final layer to get predictions. Those predictions are then compared against the actual labels to update the final layer's weights through the back-propagation process.
Do the "ten images at random" mean that train_batch_size=10? Meanwhile in the source code I found this:
parser.add_argument(
'--train_batch_size',
type=int,
default=100,
help='How many images to train on at a time.'
)
Does this mean I'm interpreting the paragraph incorrectly? If so, what does train_batch_size mean, and how is it different from the ten random images? Or does it simply mean that the tutorial page is out of date with the actual code?
Source Code: https://github.com/tensorflow/tensorflow/blob/master/tensorflow/examples/image_retraining/retrain.py

Turns out it was a typo. The 10 random images is actually supposed to be 100 random images, which corresponds to train_batch_size.
Pull request that addressed the issue:
https://github.com/tensorflow/tensorflow/pull/17638

Related

HOW (how) does Keras and Pytorch handle the last batch when batch_size is not a multiple of size of training data?

Before you happen to suggest or report as duplicate, I am asking HOW does the two libraries handle the problem NOT WHAT HAPPENS because I know it can make a batch from remaining data. And I do know that it handles but all I am asking is HOW.
If I have 100 images as training data and batch_size=15, the last batch will have 10 images to train. My Question is that when the Input() layer already knows that the data is coming as a shape of (Batch_size,channel,width,height) for PyTorch and (batch_size,width,height,channels) for Keras using Tensorflow as backend.
If the last batch has size=10, isn't the model supposed to throw an error because it will get (10,1,28,28) in place of (15,1,28,28) given we have (28,28) pixels Grayscale images?
What is happening behind the scenes?
If you look at the doc for example you can see that the batch size is optional. That is because the batch size is treated a variable in any iteration

Using ssd_inception_v2 to train on different resolution

The dataset contains images of different sizes.
The pretrained weights are trained on 300x300 resolution.
I am training on widerface dataset where objects are as small as 15x15.
Q1. I want to train with 800x800 resolution do i need to resize all the images manually or this will be done by Tensorflow automatically ?
I am using the following command to train:
python3 /opt/github/models/research/object_detection/legacy/train.py --logtostderr --train_dir=/opt/github/object_detection_retraining/wider_face_checkpoint/ --pipeline_config_path=/opt/github/object_detection_retraining/models/ssd_inception_v2_coco_2018_01_28/pipeline.config
Q2. I also tried training it using the model_main.py but after 1000 iterations it is evaluating the dataset with each iteration.
I am using the following command to train:
python3 /opt/github/models/research/object_detection/model_main.py --num_train_steps=200000 --logtostderr --model_dir=/opt/github/object_detection_retraining/wider_face_checkpoint/ --pipeline_config_path=/opt/github/object_detection_retraining/models/ssd_inception_v2_coco_2018_01_28/pipeline.config
Q3. Also if you can suggest any model i should use for real time face detection apart from mobilenet and inception, please suggest.
Thanks.
Q1. No you do not need to resize manually. See this detailed answer.
Q2. By 1000 iterations you meant steps right? (An iteration counts as a complete cycle of the dataset.) Usually the model performed evaluation after a certain amount of time, e.g. 10 minutes. So in every 10 minutes, the checkpoints are saved and an evaluation of the model on evaluation set is performed.
Q3. SSD models with mobilenet is one of the fast detectors, apart from that you can try YOLO models for real time detection

Replicating results in tensorflow estimators

I am building a classifier using tensorflow estimators.
But I am getting two separate results for the same experiment if run multiple times.
These experiments have all the same parameters starting from the batches of data to the architecture of the network.
I set the random seed in the config of the estimator as well.
Is there any other random seed that I am not aware of?
I have checked the data side of the experiment, I am making sure of sending the same batches and in the same sequence in these experiments.
The training curve looks close of each other but the validation curve is way off. As you can see in the image below. These are the loss curve for two same experiments, I ran them on the same dataset creating same batches at each step
This code tells how I set the random seed of tensorflow estimators.
config = tf.estimator.RunConfig(save_summary_steps=t.save_summary_steps,
log_step_count_steps=t.log_step_count_steps,
save_checkpoints_steps=t.save_checkpoints_steps,
keep_checkpoint_max=t.keep_checkpoint_max,
tf_random_seed=t.random_number
)

Use Tensorflow LSTM PTB example for scoring sentences

I try to use an example LSTM, trained according to Tensorflow LSTM example. This example allows to get perplexity on whole test set. But I need to use the trained model to score (get loglikes) of each sentence separately (to score hypotheses of STT decoder output). I modified reader a bit and used code:
mtests=list()
with tf.name_scope("Test"):
for test_data_item in test_data:
test_input.append(PTBInput(config=eval_config, data=test_data_item, name="TestInput"))
with tf.variable_scope("Model", reuse=True, initializer=initializer):
for test_input_item in test_input:
mtests.append(PTBModel(is_training=False, config=eval_config,
input_=test_input_item))
sv = tf.train.Supervisor(logdir=FLAGS.model_dir)
with sv.managed_session() as session:
checkpoint=tf.train.latest_checkpoint(FLAGS.model_dir)
sv.saver.restore(session, checkpoint)
sys.stderr.write("model restored\n")
for mtest in mtests:
score, test_perplexity = run_epoch_test(session, mtest)
print(score)
So, using that code, I get score of each sentence independently. If I pass 5 sentences, it works ok. But if I pass 1k sentences to this code, it works extremely slow and uses a lot of memory, because I create 1k models mtest. So, could you tell me another way to reach my goal? Thank you.
It seems like the model can take a batch of inputs, which is set to 20 in all cases by default. You should be able to feed a larger batch of sentences to one test model to get the output for all of them without having to create multiple models instances. This probably involves some experimenting with the reader, which you are already familiar with.

Neural Network with my own dataset

I have downloaded many face images from web. In order to learn Tensorflow I want to feed those images to a simple fully-connected neural network with a single hidden layer. I have found an example code in here.
Since I am a beginner, I don't know how to train, evaluate, and test the network with the downloaded images. The code owner used a '.mat' file and a .pkl file. I don't understand how he organized training and test set.
In order to run the code with my images;
Do I need to divide my images into training, test, and validation folders and turn each folder into a mat file? How am I going to provide labels for the training?
Besides, I don't understand why he used a '.pkl' file?
All in all, I would like to change this code so that I can find test, training , and validation set classification performance with my image dataset.
It might be an easy question, but it is important for me as it is a starting step. Thanks for your understanding.
First, you don't have to use .mat files nor pickles. Tensorflow expects numpy array.
For instance, let's say you have 70000 images of size 28x28 (=784 dimensions) belonging to 10 classes. Let's also assume that you'd like to train a simple feedforward neural network to classify the images.
The first step would be to split the images between train and test (and validation, but let's put this aside for the sake of simplicity). For the sake of the example, let's imagine that you chose randomly 60000 images for your training set and 10000 for your test set.
The second step would be to ensure that your data has the right format. Here, you'd like your training set to consist in one numpy array of shape (60000, 784) for the images and another one of shape (60000, 10) for the labels (if you use one-hot encoding to represent your classes). As for your test set, you should have an array of shape (10000, 784) for the images and one of shape (10000, 10) for the labels.
Once you have these big numpy arrays, you should define placeholders that will allow you to feed data to you network during training and evaluation.
images = tf.placeholder(tf.float32, shape=[None, 784])
labels = tf.placeholder(tf.int64, shape=[None, 10])
The None here means that you can feed a batch of any size, i.e. as many images as you want, as long as you numpy array is of shape (anything, 784).
The third step consists in defining your model as well as the loss function and the optimizer.
The fourth step consists in training your network by feeding it with random batches of data using the placeholders created above. As your network is training, you can periodically print its performance like the training loss/accuracy as well as the test loss/accuracy.
You can find a complete and very simple example here.