Why does an RTX 2070 Max-Q take longer to train a neural network than a GTX 960m? - tensorflow

I have two laptops, both with Windows 10, that I use for work:
MSI GE70: i7 4720, 12 GB RAM, GTX 960m 2 GB, 258 GB SSD.
Dell G7: i7 9750, 32 GB RAM, RTX 2070 Max-Q 8 GB, 500 GB SSD.
I made a 'mirror' installation of TensorFlow on both laptops, following the official TensorFlow page.
On both laptops I installed Python 3.6.8, TensorFlow 2.2, CUDA 10.1, cuDNN 7.6 and NVIDIA driver version 456.71. When I run the following line in CMD I can see that both GPUs are visible to TensorFlow and ready to use:
python -c "import tensorflow as tf;print(tf.reduce_sum(tf.random.normal([1000, 1000])))"
Output on the MSI with the GTX 960m (screenshot)
Output on the Dell with the RTX 2070 Max-Q (screenshot)
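As a complementary check (a minimal sketch of my own, assuming the TensorFlow 2.2 install described above), you can also ask TensorFlow directly which GPUs it sees:

import tensorflow as tf

# List the GPUs TensorFlow can see and confirm the build has CUDA support
print(tf.config.list_physical_devices('GPU'))
print(tf.test.is_built_with_cuda())  # True for the GPU-enabled build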
Then, when I train the same neural network on both laptops, I can see that the MSI takes 7 minutes per epoch, while the Dell G7 takes almost an hour per epoch. Why does the 2070 Max-Q take so much longer to train the neural network than the 960m? Is there some problem with the Dell G7 that I can't see?
This is the structure of the NN:
import tensorflow as tf
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Bidirectional, LSTM, Dropout, Dense

# Two stacked bidirectional LSTM layers with dropout, then a 3-unit output layer
modelo = Sequential()
modelo.add(Bidirectional(LSTM(units=na, return_sequences=True), input_shape=dim_entrada))
modelo.add(Dropout(0.25))
modelo.add(Bidirectional(LSTM(units=na)))
modelo.add(Dropout(0.25))
modelo.add(Dense(units=3))

opt = tf.optimizers.Adam(learning_rate=0.0015)
modelo.compile(optimizer=opt, loss='mse', metrics=['accuracy'])
modelo.fit(X_train, Y_train, epochs=20, batch_size=32,
           validation_data=(X_validacion_imu12, Y_validacion_vi12))

I found the problem. I don't know why, but the Dell G7 must be plugged into AC power. I think there is a power-saving option that prevents the use of the GPU when the laptop runs on battery.
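One way to confirm whether the GPU is actually doing the work in each case (plugged in vs. on battery) is device-placement logging; a minimal sketch of my own, assuming the TF 2.2 setup above:

import tensorflow as tf

# Log which device (CPU or GPU) each op actually runs on,
# so a silently ignored GPU shows up immediately
tf.debugging.set_log_device_placement(True)

a = tf.random.normal([1000, 1000])
b = tf.random.normal([1000, 1000])
print(tf.reduce_sum(tf.matmul(a, b)))  # placement lines should mention device:GPU:0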

Related

Our YOLOv4-tiny suddenly loses accuracy

I'm training YOLOv4-tiny on a custom dataset, and suddenly the loss and other metrics drop to -nan.
As you can see on the chart, all progress is lost after some iterations (around 800 iterations).
YOLOv4 accuracy chart
Training log for the given chart:
Darknet training log
Any ideas on this problem? It is running on Ubuntu with 4 x GeForce GTX 1080 6 GB.
When training the same network on a PC with a single GeForce GTX 1060 6 GB, it does not crash.

How long does it take to train over the fashion-MNIST database?

I'm new to deep learning. I wanted to build an image classifier using a CNN to classify clothing images. I decided to train on the Fashion-MNIST dataset, which is a dataset of 60,000 images. But I'm aware that training is a very heavy task.
I wanted to know how long my PC will take to train on this dataset, and whether I should go for pre-trained models instead, with a compromise in accuracy.
My PC configurations are:
- Intel Core i5-6400 CPU @ 2.70 GHz
- 8GB RAM.
- NVIDIA GeForce GTX 1050 Ti.
Even though it depends on the dataset size and the number of epochs (I tried with 50 epochs), the images here are small (28x28).
So for me, when I tried on a machine with
Intel Core i7-6400 CPU @ 2.70 GHz
8 GB RAM
NVIDIA GeForce GTX 1050 Ti
with the image size (28x28) as provided for the MNIST dataset on tensorflow.org, it took less than 5 minutes.
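For a rough sense of the workload behind timings like these, here is a minimal Fashion-MNIST CNN sketch in tf.keras; it is my own illustration, not the asker's exact model:

import tensorflow as tf

# Load the 60,000 training / 10,000 test images (28x28 grayscale) and scale to [0, 1]
(x_train, y_train), (x_test, y_test) = tf.keras.datasets.fashion_mnist.load_data()
x_train, x_test = x_train[..., None] / 255.0, x_test[..., None] / 255.0

# A small CNN: one conv block, then a dense classifier over the 10 clothing classes
model = tf.keras.Sequential([
    tf.keras.layers.Conv2D(32, 3, activation='relu', input_shape=(28, 28, 1)),
    tf.keras.layers.MaxPooling2D(),
    tf.keras.layers.Flatten(),
    tf.keras.layers.Dense(64, activation='relu'),
    tf.keras.layers.Dense(10, activation='softmax'),
])
model.compile(optimizer='adam',
              loss='sparse_categorical_crossentropy',
              metrics=['accuracy'])
model.fit(x_train, y_train, epochs=5, validation_data=(x_test, y_test))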

TensorFlow training: Xeon CPU or 2 GTX 750 GPUs. Which is faster?

I use a Xeon E5-1650 CPU (3.2 GHz, 6 cores, 12 threads) for training a TensorFlow model.
But training is so slow...
If I use a desktop computer with a typical CPU and 2 GeForce GTX 750 GPUs (2 GB), will it be faster?
Using the GPUs will be faster. The only things to keep in mind are that the size of your model is then constrained by the memory of the GPUs, and that you have to choose the right set of version numbers and drivers so that your GPU is supported.
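As a rough back-of-the-envelope illustration of that memory constraint (my own sketch, with a hypothetical parameter count): float32 weights take 4 bytes each, and gradients, optimizer state and activations come on top of that, all of which has to fit in the 2 GB of each card.

# Hypothetical 50M-parameter model stored as float32 (4 bytes per parameter)
params = 50_000_000
bytes_per_param = 4
weights_gb = params * bytes_per_param / 1024**3
print(f"weights alone: {weights_gb:.2f} GB")  # ~0.19 GB, before gradients, optimizer state, activations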

Why did TensorFlow not get faster after a GPU upgrade?

I have the TensorFlow 1.4 GPU version installed. CUDA 8 is installed too.
I trained my pretty simple GAN network on MNIST data.
I have an AMD FX 8320 CPU, 16 GB of system memory and an SSD hard drive.
It took about 17 seconds per epoch on a GeForce 720 GPU with 1 GB of memory.
The training utilized about 25% of the GPU and 99% of its memory. The CPU load was pretty high, close to 100%.
Then I swapped in another video board with a GeForce 1050 Ti GPU and 4 GB of memory instead of the previous one. The GPU was loaded only 5-6%, and its memory was utilized 93%.
But I still got about 17 s per epoch and a high CPU load.
So maybe TensorFlow has some settings to utilize the GPU more?
Or what is the cause of the high CPU load and low GPU load?
If you are training a simple GAN network, it is fairly likely that your old GPU was not the bottleneck in the first place, so improving it had no effect. If the amount of work done per sess.run() call is very small, the overheads (executing your Python code, copying the input data to the GPU, starting and running the TensorFlow executor, scheduling all the operations to the GPU, etc.) can dominate your computation.
The only sure way of knowing what happens is to profile. You can take a look at https://www.tensorflow.org/performance/performance_guide as a starting point. The timeline tool it mentions can be fairly useful. See here for more details: Can I measure the execution time of individual operations with TensorFlow?
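For reference, a minimal sketch of the TF 1.x timeline tool mentioned above; sess, train_op and feed_dict stand in for your own session, training op and input data:

import tensorflow as tf
from tensorflow.python.client import timeline

# Record detailed per-op timing for one training step
# (sess, train_op and feed_dict are placeholders for your own graph/session)
run_options = tf.RunOptions(trace_level=tf.RunOptions.FULL_TRACE)
run_metadata = tf.RunMetadata()
sess.run(train_op, feed_dict=feed_dict,
         options=run_options, run_metadata=run_metadata)

# Write a Chrome trace (open it at chrome://tracing) showing which ops
# ran on the GPU vs. the CPU and how long each one took
trace = timeline.Timeline(step_stats=run_metadata.step_stats)
with open('timeline.json', 'w') as f:
    f.write(trace.generate_chrome_trace_format())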
I agree: for MNIST datasets there are probably other bottlenecks in the system, not the GPU. I ran two side-by-side TensorFlow setups:
Intel i7 4600M with an NVIDIA Quadro K1100M GPU and 12 GB of RAM, which is a 4th Gen Haswell Intel machine, and
Intel i5 8300U with no CUDA GPU and 16 GB of RAM.
Basically an 8th Gen Kaby Lake Intel CPU vs a 4th Gen Intel, and I got:
4th Gen Intel chip with NVIDIA GPU:
311.5 sec, 315.9 sec, 313.0 sec to complete all 10 epochs on an MNIST run
8th Gen Intel chip with no GPU:
252.7 sec, 243.5 sec, 254.9 sec
So I'm running 20% faster with no GPU, just a newer generation of Intel chip.

Recommended GPUs for Tensorflow

I understand that TensorFlow requires (for GPU computation) a GPU with NVIDIA compute capability >= 3.0. There are many such GPUs to choose from. The gaming-oriented GPUs, e.g. GeForce models, are much less expensive than the compute-oriented models, e.g. Tesla. My limited understanding is that the compute-oriented models may lack video output (not needed for computation) and that the gaming models may do 32-bit math instead of 64-bit. Assuming that TensorFlow uses (or prefers) 64-bit, does this mean that the gaming models will not work, or that they will produce deficient results if used with TensorFlow? What attributes should one look for when choosing a GPU to use with TensorFlow?
The GPU-enabled version of TensorFlow has the following requirements:
64-bit Linux
Python 2.7
NVIDIA CUDA® 7.5 (CUDA 8.0 required for Pascal GPUs)
NVIDIA cuDNN v4.0 (minimum) or v5.1 (recommended)
TensorFlow GPU support requires having a GPU card with NVidia Compute Capability >= 3.0. Supported cards include but are not limited to:
NVidia Titan
NVidia Titan X
NVidia K20
NVidia K40
You can see their official docs: TensorFlow GPU support.
Gaming GPUs can work quite well. You want a very recent GPU with lots of memory and CUDA cores. Most people training neural nets these days on GPU use 32 bit floats.
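To illustrate the point about 32-bit floats (a small sketch of my own): TensorFlow's default floating-point type is float32, which gaming GPUs handle at full speed, while float64 is only used if you ask for it explicitly:

import tensorflow as tf

# TensorFlow tensors default to 32-bit floats unless a dtype is requested
print(tf.constant([1.0, 2.0]).dtype)                    # tf.float32
print(tf.constant([1.0, 2.0], dtype=tf.float64).dtype)  # tf.float64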