How can I use a TensorFlow pretrained model (i.e. Inception v3) in batch mode on a GPU?

I want to use the Inception v3 model in TensorFlow for feature extraction. However, I have a large number of images, so it takes a long time to run, and I would like to use the GPU. I have installed CUDA 7.5 and cuDNN correctly.
I am using the following code in CPU mode for one image:
with tf.Session() as sess:
    # Assumes the Inception graph has already been imported into the default graph.
    softmax_tensor = sess.graph.get_tensor_by_name('pool_3:0')
    feat_vect = numpy.squeeze(sess.run(softmax_tensor, {'DecodeJpeg:0': in_image}))
So my question is: how should I change my code so that I can process many images in batches on the GPU?
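One way to approach the batching part of this question is to group the images into fixed-size batches and evaluate the model once per batch. The sketch below is plain Python and is not tied to any TensorFlow call: `iter_batches`, `extract_features`, and `run_batch` are hypothetical names, and `run_batch` stands in for whatever function evaluates the model on a whole batch (for example a wrapper around `sess.run` on an input tensor that accepts a leading batch dimension, which is an assumption about the graph).

```python
def iter_batches(items, batch_size):
    """Yield consecutive slices of `items`, each with at most `batch_size` elements."""
    if batch_size < 1:
        raise ValueError("batch_size must be at least 1")
    for start in range(0, len(items), batch_size):
        yield items[start:start + batch_size]


def extract_features(images, run_batch, batch_size=32):
    """Run the hypothetical `run_batch` callable once per batch and collect the results.

    `run_batch` must accept a list of images and return one feature
    vector per image, in the same order.
    """
    features = []
    for batch in iter_batches(images, batch_size):
        features.extend(run_batch(batch))
    return features
```

With a GPU build of TensorFlow, the per-batch call is where the GPU work would happen. A larger `batch_size` generally amortizes per-call overhead until GPU memory becomes the limit.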

Related

Counting FLOPS in tensorflow

Is there a way to count FLOPS for the training and prediction of tensorflow models?
The models run on a CPU using TensorFlow 2.8.0, and I would prefer not to use an external (e.g. command-line) tool.
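As a rough alternative to a profiling tool, forward-pass FLOPs can be estimated by hand from layer shapes. The sketch below is such an estimate and does not use any TensorFlow API; the function names are hypothetical, each multiply-add is counted as 2 FLOPs, and bias additions and activations are ignored (all of these are assumptions of this sketch).

```python
def dense_flops(in_features, out_features):
    """Forward-pass FLOPs of a dense layer for one input sample (bias ignored)."""
    # One multiply and one add per weight.
    return 2 * in_features * out_features


def conv2d_flops(out_height, out_width, kernel_height, kernel_width,
                 in_channels, out_channels):
    """Forward-pass FLOPs of a 2D convolution for one input sample (bias ignored)."""
    # Each output element is a dot product over the kernel window and input channels.
    macs_per_output = kernel_height * kernel_width * in_channels
    outputs = out_height * out_width * out_channels
    return 2 * macs_per_output * outputs


if __name__ == "__main__":
    # A 7x7 convolution with 32 filters over a 100x100x3 input without padding
    # produces a 94x94x32 output.
    print(conv2d_flops(94, 94, 7, 7, 3, 32))
    print(dense_flops(128, 10))
```

Summing these per-layer figures gives a prediction estimate for one sample; training cost is higher because of the backward pass, which this sketch does not model.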

Question about GPU usage in Google Colab when training Keras/TF models

I have a quick question: when using Google Colab with the GPU enabled, does all of the code already run on the GPU then or is there some setting in the code that we must change to make it run on the GPU? Specifically, if I am training a neural network model in Keras/TF, do I need to edit my code in any way to ensure that the model is trained on the GPU?
Thanks!
As shown in the TensorFlow with GPU example notebook, you can run your model in the following way to make sure it runs on the chosen device:
def gpu():
    with tf.device('/device:GPU:0'):
        random_image_gpu = tf.random.normal((100, 100, 100, 3))
        net_gpu = tf.keras.layers.Conv2D(32, 7)(random_image_gpu)
        return tf.math.reduce_sum(net_gpu)

Keras getting frozen when using regularizer in CNN model

I have a custom CNN implementation in Keras running on the TensorFlow backend. To improve generalization, I have been adding regularization to the model. The model works fine without any activity/kernel regularization. As soon as I add an activity or kernel regularizer, training freezes partway through an epoch, typically between batches (e.g. at batch 67/172). The issue is reproducible on my system, and I was able to localize it to the regularization. I could not find similar reports from others. If any additional information is needed, I would be happy to provide it; any guidance on the issue would be greatly appreciated.
Here is some information about the libraries and dependencies:
Keras 2.4.3
Tensorflow 2.3.1
GPU: NVIDIA 1070 TI (8GB)
cudart64_101.dll was successfully opened
The code was written in Spyder running on Python 3.8
Input: batch size 32, input shape (32, 256, 64, 1)
Using model.fit function to train the model
100,277 parameters, 99,523 trainable
Actually, I think this issue was fixed after I updated the NVIDIA software to the latest version (11.1) and added the most recent ones to the path.

Can we run training and validation on separate GPUs using tensorflow object detection API running on tensorflow 1.12?

I have two Nvidia Titan X cards on my machine and want to fine-tune a COCO-pretrained Inception V2 model on a single specific class. I have created the train/val tfrecords and changed the config to run the TensorFlow Object Detection training pipeline.
I am able to start the training, but it hangs (without any OOM) whenever it tries to evaluate a checkpoint. Currently it uses only GPU 0, with other resources (RAM, CPU, IO, etc.) in the normal range, so I am guessing that the GPU is the bottleneck. I wanted to try splitting training and validation across separate GPUs to see if that works.
I tried to find a place where I could set "CUDA_VISIBLE_DEVICES" differently for the two processes, but unfortunately the latest TensorFlow Object Detection API code (using TensorFlow 1.12) makes that very difficult. I am also unable to verify my assumption that training and validation run in the same process, because my machine hangs. Could someone please suggest where to look to solve this?
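If training and evaluation can be started as two separate commands (an assumption; the question above is unsure whether they share a process), each one can be restricted to a single GPU by setting `CUDA_VISIBLE_DEVICES` in its environment at launch. A minimal sketch using Python's `subprocess` module, where `launch_on_gpu` is a hypothetical helper and the probe command is a placeholder for the real training or evaluation command:

```python
import os
import subprocess
import sys


def launch_on_gpu(command, gpu_index, **popen_kwargs):
    """Start `command` in a child process whose CUDA_VISIBLE_DEVICES is `gpu_index`."""
    env = os.environ.copy()
    # The child process sees only this device; the parent environment is unchanged.
    env["CUDA_VISIBLE_DEVICES"] = str(gpu_index)
    return subprocess.Popen(command, env=env, **popen_kwargs)


if __name__ == "__main__":
    # Placeholder command that only prints the variable it received.
    probe = [sys.executable, "-c",
             "import os; print(os.environ['CUDA_VISIBLE_DEVICES'])"]
    train_proc = launch_on_gpu(probe, 0)  # stands in for the training command
    eval_proc = launch_on_gpu(probe, 1)   # stands in for the evaluation command
    train_proc.wait()
    eval_proc.wait()
```

The variable has to be set before the child process initializes CUDA, which is why it is passed at launch rather than changed from inside a running process.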

Difference between `train.py` and `model_main.py` in Tensorflow Object Detection API

I usually just use train.py to train with the TensorFlow Object Detection API. However, I read at https://www.kaggle.com/c/rsna-pneumonia-detection-challenge/discussion/68581 that you can also use model_main.py to train your model and see real-time plots and images on TensorBoard.
How exactly do you use model_main.py with TensorBoard?
What is the difference between train.py and model_main.py?
On TensorBoard, model_main.py outputs graphs similar to those of train.py, but with model_main.py the performance of the model on the evaluation dataset is measured too.
model_main.py is the newer script in the TensorFlow Object Detection API. It is used for both training and evaluating the model. When using train.py, we have to run a separate program for evaluation (eval.py), while model_main.py does both. For example, training runs for a certain interval (for example 5 minutes or 2000 steps), then training is paused and evaluation runs. After the evaluation has finished, training resumes, and the cycle repeats.
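The alternating cycle described above can be sketched as a simple loop. This is a plain-Python illustration of the control flow only, not the Object Detection API's actual implementation; `run_train_eval_cycle`, `train_step`, and `evaluate` are hypothetical names, and the step-based interval is just one of the triggers mentioned above.

```python
def run_train_eval_cycle(total_steps, eval_every, train_step, evaluate):
    """Train step by step, pausing to evaluate every `eval_every` steps and at the end.

    Returns a list of (step, evaluation_result) pairs.
    """
    eval_log = []
    for step in range(1, total_steps + 1):
        train_step(step)
        # Training pauses here while the evaluation runs, then resumes.
        if step % eval_every == 0 or step == total_steps:
            eval_log.append((step, evaluate(step)))
    return eval_log


if __name__ == "__main__":
    log = run_train_eval_cycle(
        total_steps=5000,
        eval_every=2000,
        train_step=lambda step: None,         # placeholder for one training step
        evaluate=lambda step: 1.0 / step,     # placeholder metric
    )
    print([step for step, _ in log])
```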
Newer versions of the TensorFlow Object Detection API offer model_main.py, which trains as well as evaluates the model using the various pre-conditions and preprocessing, whereas older versions use train.py for training and eval.py for evaluating.
Reference: https://github.com/EdjeElectronics/TensorFlow-Object-Detection-API-Tutorial-Train-Multiple-Objects-Windows-10