Can't use GPU with Pytorch - tensorflow

I keep getting this error when trying to use Pytorch.
RuntimeError: Attempting to deserialize object on a CUDA device but torch.cuda.is_available() is False. If you are running on a CPU-only machine, please use torch.load with map_location=torch.device('cpu') to map your storages to the CPU.
I installed PyTorch using conda install pytorch torchvision cudatoolkit=10.1 -c pytorch.
With TensorFlow my GPU runs just fine.

You can fix this error by installing CUDA 10.2 (the latest version at the time of writing) and then reinstalling PyTorch with this command:
conda install pytorch torchvision cudatoolkit=10.2 -c pytorch

Use the official installation command from pytorch.org.
For Windows with a GPU (for other platforms, check pytorch.org):
pip install torch===1.5.0 torchvision===0.6.0 -f https://download.pytorch.org/whl/torch_stable.html
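Before reinstalling, it can help to confirm which build of PyTorch is actually installed. The sketch below is not part of the original answers: choose_device is a hypothetical helper and "model.pt" is a placeholder path. It reports whether the installed build sees CUDA and shows the map_location workaround that the error message itself suggests.

```python
import importlib.util


def choose_device(cuda_available: bool) -> str:
    """Hypothetical helper: pick the device string to map a checkpoint onto."""
    return "cuda" if cuda_available else "cpu"


if importlib.util.find_spec("torch") is not None:
    import torch

    # torch.version.cuda is None for CPU-only builds of PyTorch.
    print("torch:", torch.__version__, "built for CUDA:", torch.version.cuda)
    device = choose_device(torch.cuda.is_available())
    print("loading checkpoints onto:", device)
    # "model.pt" is a placeholder path, so the call is left commented out:
    # checkpoint = torch.load("model.pt", map_location=torch.device(device))
else:
    print("PyTorch is not installed in this environment")
```

If "built for CUDA" prints None, a CPU-only build was installed, which would be consistent with torch.cuda.is_available() returning False even though the GPU works with TensorFlow.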

Related

Using Object Detection API on local GPU but not last version (v2.5.0)

I am trying to use my local GPU to train an EfficientDet D0 model. I already have a good pipeline (it works on Google Colab, for example) and I modified it slightly to run locally, but the same problem occurs every time I launch the training.
I use conda to install tensorflow-gpu with CUDA and cuDNN, which creates a TensorFlow v2.4.1 environment, but when I launch the training the Object Detection API automatically installs TensorFlow v2.5.0. As a result my env does not use the GPU for training, because the installed CUDA and cuDNN match TensorFlow v2.4.1 and not v2.5.0.
Is there a way to get the Object Detection API to work with v2.4.1 instead of v2.5.0?
I tried many things but none of them work (the training either fails or falls back to the CPU).
Here is the code that installs the dependencies and overwrites the TensorFlow version with v2.5.0:
os.system("cp object_detection/packages/tf2/setup.py .")
os.system("python -m pip install .")
SYSTEM:
gpu : Nvidia RTX 3070
os : Ubuntu 20.04 LTS
tensorflow: 2.4.1
P.S.: I used conda install -c conda-forge tensorflow-gpu to install TensorFlow, CUDA and cuDNN in my training env because installing them manually led to a dependency problem, so I took the easy way.
EDIT: solution found; it is explained in the comments.
Follow these steps to install a specific version of TensorFlow with GPU support:
1. Set Up Anaconda Environments
conda create -n tf_gpu cudatoolkit=11.0
2. Activate the new Environment
source activate tf_gpu
3. Install tensorflow-gpu 2.4.1
pip install tensorflow==2.4.1
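After step 3 it is worth confirming that the pin held and that TensorFlow sees the GPU. A minimal sketch, assuming the versions from this question: EXPECTED_CUDA and expected_cuda are hypothetical names, and the two table rows are the CUDA versions listed in TensorFlow's tested build configurations for 2.4 and 2.5.

```python
import importlib.util

# Assumed subset of TensorFlow's tested build configurations (TF minor -> CUDA).
EXPECTED_CUDA = {"2.4": "11.0", "2.5": "11.2"}


def expected_cuda(tf_version: str) -> str:
    """Return the CUDA version a TF release was built against, or 'unknown'."""
    minor = ".".join(tf_version.split(".")[:2])
    return EXPECTED_CUDA.get(minor, "unknown")


if importlib.util.find_spec("tensorflow") is not None:
    import tensorflow as tf

    print("TensorFlow:", tf.__version__, "expects CUDA", expected_cuda(tf.__version__))
    # An empty list means TensorFlow will fall back to the CPU.
    print("GPUs:", tf.config.list_physical_devices("GPU"))
else:
    print("TensorFlow is not installed in this environment")
```

If the printed version is 2.5.0 after running the Object Detection setup, the pin was overwritten and the cudatoolkit=11.0 environment no longer matches.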
Try to run object_detection without "installing" it. Don't run setup.py. Just set up the necessary paths and packages manually.
Or edit setup.py to skip installing the specific version of TF. I guess that this version is a requirement of some of the packages installed by setup.py.
I use object_detection without running setup.py or doing any "installation", and I have no problems.
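Setting up the paths manually can be sketched as follows. This assumes the TensorFlow models repository is cloned locally: MODELS_DIR is a hypothetical location, and the research and research/slim subdirectories are the ones the Object Detection API is normally imported from.

```python
import sys
from pathlib import Path

# Hypothetical clone location of the tensorflow/models repository.
MODELS_DIR = Path.home() / "models"


def add_object_detection_paths(models_dir: Path) -> list:
    """Prepend the research directories to sys.path and return them."""
    paths = [str(models_dir / "research"), str(models_dir / "research" / "slim")]
    for p in reversed(paths):
        if p not in sys.path:
            sys.path.insert(0, p)
    return paths


added = add_object_detection_paths(MODELS_DIR)
print(added)
# With the paths in place, `import object_detection` resolves to the clone
# instead of an installed package, so pip never touches the TensorFlow pin.
```

The API's .proto files still have to be compiled with protoc, and the remaining dependencies installed by hand, since setup.py is no longer doing that work.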

Installed pytorch with conda which changed my TF version to 1.13.0 now conda install tensorflow-gpu=2.0 not working?

As the title says, I installed PyTorch with conda install and that downgraded my TensorFlow version to 1.13.0. Now conda install tensorflow-gpu=2.0 is not working. How can I get the command to execute?
I would suggest that you try to install TensorFlow with pip: pip install -U tensorflow-gpu
https://www.tensorflow.org/install/gpu
I am using pytorch, but my env has pytorch 1.2 + tensorflow 2.1
You should have installed PyTorch in a separate virtual environment, but it has already been installed.
I would recommend creating a new virtual environment and installing TF plus the other libraries in it, because you are unlikely to use both PyTorch and TF in the same ML program.
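To see what a given environment actually contains before and after an install, the package versions can be listed from Python. A small sketch using only the standard library (Python 3.8+); installed_version is a hypothetical helper, and the names are pip distribution names, which may differ from the conda package names.

```python
from importlib import metadata
from typing import Optional


def installed_version(package: str) -> Optional[str]:
    """Return the installed version of a distribution, or None if absent."""
    try:
        return metadata.version(package)
    except metadata.PackageNotFoundError:
        return None


for name in ("torch", "tensorflow", "tensorflow-gpu"):
    print(name, "->", installed_version(name))
```

Running this inside each environment makes it obvious when one install has silently replaced another framework's version.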

xgboost install on tensorflow GPU support

I have already installed TensorFlow with GPU support.
I then tried to install xgboost alongside it with
'conda install -c anaconda py-xgboost'
I wonder whether this xgboost has GPU support or not.
I did not follow the instructions at https://xgboost.readthedocs.io/en/latest/build.html#building-with-gpu-support
and only have TensorFlow's GPU support installed.
Do I need to install xgboost's GPU support separately if I want to use xgboost on the GPU?
You can check whether your xgboost is compiled for GPU: just try to train a model with tree_method='gpu_hist' or another GPU method.
If it raises an error saying that xgboost is not compiled for GPU, reinstall it following the instructions you have found.
You probably don't need to install CUDA (if you have successfully installed tensorflow-gpu and it works, then CUDA must already be installed), but you definitely need a GPU-enabled build of xgboost.
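The check described above can be sketched as a small function. xgboost_has_gpu is a hypothetical helper, not an xgboost API, and the random data is only there to give the trainer something to fit. Newer xgboost releases deprecate 'gpu_hist' in favour of tree_method='hist' with device='cuda', so treat the result as a hint rather than a definitive answer.

```python
import importlib.util


def xgboost_has_gpu() -> bool:
    """Try one boosting round with a GPU tree method; False if it fails."""
    if importlib.util.find_spec("xgboost") is None:
        return False
    import numpy as np
    import xgboost as xgb

    # Tiny random dataset, just enough to run a single boosting round.
    X = np.random.rand(32, 4)
    y = np.random.randint(0, 2, size=32)
    dtrain = xgb.DMatrix(X, label=y)
    try:
        xgb.train({"tree_method": "gpu_hist"}, dtrain, num_boost_round=1)
    except xgb.core.XGBoostError:
        # Raised, for example, when xgboost was built without GPU support.
        return False
    return True


print("xgboost GPU training works:", xgboost_has_gpu())
```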

Installation issues with Tensorflow in Windows10

Installation method:
I'm using the Anaconda distribution of Python instead of having multiple versions of python on my computer. I used the instructions under TensorFlow with Anaconda
(link1)(link2) with the following commands:
C:> conda create -n tensorflow python=3.6
C:> activate tensorflow
(tensorflow)C:> pip install --ignore-installed --upgrade tensorflow
Error:
When running the test hello world code from within a tensorflow environment I received the following errors:
>>> import tensorflow as tf
>>> hello = tf.constant('Hello, TensorFlow!')
>>> sess = tf.Session()
2018-01-23 02:44:09.201798: I C:\tf_jenkins\home\workspace\rel-win\M\windows\PY\36\tensorflow\core\platform\cpu_feature_guard.cc:137] Your CPU supports instructions that this TensorFlow binary was not compiled to use: AVX AVX2
>>> print(sess.run(hello))
b'Hello, TensorFlow!'
Questions:
Does this mean my CPU does not support Tensorflow? (i7-6500U, 2.59GHz)
Does the b' signify an environment output, or is this an error?
I noticed the TensorFlow library doesn't appear in my CMD prompt version of python, nor in my Spyder executable. Should I use pip and install a second version of the library? Or does TensorFlow require an active environment to invoke the library?
Edit:
I just noticed this line in a re-read:
In Anaconda, you may use conda to create a virtual environment.
However, within Anaconda, we recommend installing TensorFlow with the
pip install command, not with the conda install command.
If you have installed TensorFlow 1.8 for CPU correctly on Windows 10 (Python 3.5.x) and you get an error, try changing the version to 1.5:
pip3 install tensorflow==1.5
I spent a day figuring this out :)
You should be able to run TensorFlow just fine with that installation. The message only means that your CPU supports instruction sets (AVX, AVX2) that the prebuilt binary was not compiled to use; a TensorFlow build compiled with those instruction sets would make computation faster.
Read this guide to find out how to build from source and improve your performance: https://www.tensorflow.org/install/install_sources
Or feel free to continue using the installation you have now.
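On the two remaining questions: the b' prefix is how Python 3 prints a bytes object, not an error, and the AVX line is an informational log message. A short sketch; the TF_CPP_MIN_LOG_LEVEL setting only has an effect if it is set before TensorFlow is imported.

```python
import os

# Filter INFO-level messages from TensorFlow's C++ backend (set before import).
os.environ["TF_CPP_MIN_LOG_LEVEL"] = "1"

# sess.run(hello) returns bytes; decode to get an ordinary str.
result = b"Hello, TensorFlow!"
text = result.decode("utf-8")
print(text)  # prints: Hello, TensorFlow!
```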

Keras with Tensorflow backend on GPU. MKL ERROR: Parameter 4 was incorrect on entry to DLASCL

I installed TensorFlow with GPU support and Keras into an Anaconda (v1.6.5) environment using the following commands:
conda install -n EnvName tensorflow-gpu
conda install -n EnvName -c conda-forge keras-gpu
I have NVIDIA Quadro 2200K on my machine with driver v384.66, cuda-8.0, cudnn 7.0
When I run Python code with Keras, I get the following at the training stage:
Intel MKL ERROR: Parameter 4 was incorrect on entry to DLASCL.
and later
File "/home/User/anaconda3/envs/keras_gpu/lib/python3.6/site-packages/numpy/linalg/linalg.py", line 99, in _raise_linalgerror_svd_nonconvergence
    raise LinAlgError("SVD did not converge")
numpy.linalg.linalg.LinAlgError: SVD did not converge
Other relevant sources suggest checking the data for NaNs and Infs, but my data is definitely clean. The CPU version of the installation works fine; the issue occurs only when running on the GPU.
I tried reinstalling Anaconda, CUDA and numpy, but that did not help.
The problem was the mkl package (2018.0.0): it seems to have been released recently and conflicts with the versions of some packages that conda supplies with TensorFlow (1.3.0) and Keras (2.0.5).
So I manually downgraded mkl to v11.3.3 using Anaconda Navigator, which automatically downgraded other packages as well, and everything is working well now.
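The NaN/Inf check mentioned in the question can be made explicit, which helps rule out the data before downgrading packages. A minimal sketch with a hypothetical helper all_finite; for NumPy arrays, np.isfinite(arr).all() performs the same check.

```python
import math
from typing import Iterable


def all_finite(values: Iterable[float]) -> bool:
    """Return True if no value is NaN or +/-Inf."""
    return all(math.isfinite(v) for v in values)


print(all_finite([0.5, 1.0, -3.2]))      # True
print(all_finite([0.5, float("nan")]))   # False
print(all_finite([float("inf"), 1.0]))   # False
```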