Questions about docker, nvidia-docker, cudnn and tensorflow-gpu

I have some questions about cuDNN and the tensorflow-gpu Docker image. Here is what I did on my computer:
I installed docker-CE
I installed nvidia-docker
I pulled the tensorflow/tensorflow:latest-gpu-py3-jupyter image from Docker Hub.
I created a container with the following command:
sudo docker create -ti --runtime=nvidia -p 21001:22 -v /home/project:/project tensorflow/tensorflow:latest-gpu-py3-jupyter /bin/bash
I started the container and got into it, then trained my model, which is built with Keras, but I got this error message: 'Segmentation fault (core dumped)'
It seems that cuDNN is not installed in my container. Does anybody know why this happened? Does the tensorflow image not include the cuDNN library?
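One rough way to test the suspicion that cuDNN is missing is to ask the dynamic loader from inside the container. This is only a sketch, assuming Python 3 is available in the container; find_shared_library is a hypothetical helper name, and a segmentation fault can have other causes, so finding the library does not rule those out.

```python
from ctypes.util import find_library


def find_shared_library(name):
    """Return the loader's name for lib<name>, or None if it cannot be found."""
    return find_library(name)


if __name__ == "__main__":
    # "cudnn" and "cudart" are the short names of libcudnn and libcudart.
    for lib in ("cudnn", "cudart"):
        found = find_shared_library(lib)
        print(f"{lib}: {found if found else 'not found on the library search path'}")
```

If cudnn comes back as not found, the library is either absent or outside the loader's search path in that container.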

Related

E tensorflow/stream_executor/cuda/cuda_driver.cc:351] failed call to cuInit: UNKNOWN ERROR (303)

I am using rasa 1.9.6 on Ubuntu in VMware. I have been getting this error both when training and when running the model. It allows training the model, but I am unable to run it. I need to run my bot; can someone please help?
According to the Rasa forum, this issue comes from the TensorFlow and graphics card configuration. GPUs do not typically provide an advantage for Rasa models, so the message can be safely ignored.
Installing nvidia-modprobe can solve this issue.
sudo apt install nvidia-modprobe
Other solutions you can try are:
Uninstall and install CUDA and cuDNN.
Install tensorflow-gpu.
Uninstall and install different Nvidia driver versions.
The problem could also be that only some of the /dev/nvidia* files are present before running Python with sudo. Check with $ ls /dev/nvidia*; after running the device node verification script, the /dev/nvidia-uvm file gets added.
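The device node check can also be scripted. A minimal sketch, assuming a single-GPU machine; the EXPECTED_NODES list and the missing_nodes helper are illustrative assumptions, not something the driver documents in this form.

```python
import glob

# Device nodes commonly present on a single-GPU host; this list is an assumption.
EXPECTED_NODES = ("/dev/nvidia0", "/dev/nvidiactl", "/dev/nvidia-uvm")


def missing_nodes(present, expected=EXPECTED_NODES):
    """Return the expected device nodes that are absent from `present`."""
    present = set(present)
    return [node for node in expected if node not in present]


if __name__ == "__main__":
    found = sorted(glob.glob("/dev/nvidia*"))
    print("present:", found or "none")
    print("missing:", missing_nodes(found) or "none")
```

If /dev/nvidia-uvm shows up as missing, that matches the situation described above.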

conda install -c conda-forge tensorflow just stuck in Solving environment

I am trying to run this statement on macOS.
conda install -c conda-forge tensorflow
It just gets stuck at
Solving Environment:
and never finishes.
$ conda --version
conda 4.5.12
Nothing worked until I ran this in the conda terminal:
conda upgrade conda
Note that this was for poppler (conda install -c conda-forge poppler)
On Win10 I waited about 5-6 minutes, but it depends on the number of installed Python packages and your internet connection.
You can also install it via Anaconda Navigator.
One can also resolve the "Solving environment" issue by using the mamba package manager.
I installed tensorflow-gpu==2.6.2 on Linux (CentOS Stream 8) using the following commands
conda create --name deeplearning python=3.8
conda activate deeplearning
conda install -c conda-forge mamba
mamba install -c conda-forge tensorflow-gpu
To check the successful usage of GPU, simply run either of the commands
python -c "import tensorflow as tf;print('\n\n\n====================== \n GPU Devices: ',tf.config.list_physical_devices('GPU'), '\n======================')"
python -c "import tensorflow as tf;print('\n\n\n====================== \n', tf.reduce_sum(tf.random.normal([1000, 1000])), '\n======================' )"
References
Conda Forge blog post
mamba install instead of conda install
The same error happened to me. I tried to install tensorboard from the Anaconda Prompt, but it got stuck on solving the environment, so I added these paths to my environment variables:
C:\Anaconda3
C:\Anaconda3\Library\mingw-w64\bin
C:\Anaconda3\Library\usr\bin
C:\Anaconda3\Library\bin
C:\Anaconda3\Scripts
and it worked well.
Follow the instruction by nekomatic.
I left it running for 1 hour. Yes, it finally finished.
But now I got these conflicts:
Solving environment: failed
UnsatisfiableError: The following specifications were found to be in conflict:
- anaconda==2018.12=py37_0 -> bleach==3.0.2=py37_0
- anaconda==2018.12=py37_0 -> html5lib==1.0.1=py37_0
- anaconda==2018.12=py37_0 -> numexpr==2.6.8=py37h7413580_0
- anaconda==2018.12=py37_0 -> scikit-learn==0.20.1=py37h27c97d8_0
- tensorflow
Use "conda info <package>" to see the dependencies for each package.
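To read an error like this, it can help to group the pinned dependencies by the spec that pins them. The sketch below is just an illustration; pinned_by is a hypothetical helper, and the "- parent -> dependency" line format is assumed from the message above rather than taken from conda's documentation.

```python
import re


def pinned_by(message):
    """Map each parent spec to the pinned dependencies listed under it."""
    pins = {}
    for line in message.splitlines():
        # Lines look like "- parent==ver=build -> dep==ver=build".
        match = re.match(r"\s*-\s*(\S+)\s*->\s*(\S+)", line)
        if match:
            parent, dep = match.groups()
            pins.setdefault(parent, []).append(dep)
    return pins


if __name__ == "__main__":
    sample = """\
- anaconda==2018.12=py37_0 -> bleach==3.0.2=py37_0
- anaconda==2018.12=py37_0 -> html5lib==1.0.1=py37_0
- tensorflow
"""
    for parent, deps in pinned_by(sample).items():
        print(parent, "pins", ", ".join(deps))
```

Here every pin comes from the anaconda metapackage, which fixes exact builds of its dependencies. That suggests installing tensorflow into a fresh environment created with conda create rather than into the base one, though I have not verified that this resolves this particular conflict.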

Tensorflow GPU for debian

Is it possible to install tensorflow GPU on Debian? I am using an Nvidia GTX 1070 Ti and Debian 9.3.0. I have tried several tutorials for Ubuntu, but they failed because Debian doesn't have the same PPA repositories that Ubuntu supports. I also saw many people saying that adding Ubuntu's repositories to Debian is not recommended.
It is possible, but it's a hassle :)
I got Debian 9.3 with Openbox to work nicely with Tensorflow 1.6 and Cuda 9.0 + cuDNN 7.0.5.15 (eventually also Wavenet).
https://github.com/ella1011/debian_gpu_jungle
You may want to consider using docker/nvidia-docker. TensorFlow binary releases include docker images, so you could use those to avoid having to mess with your local environment.
Once you have docker/nvidia-docker installed, it would be something like this:
docker run -it --runtime=nvidia --rm tensorflow/tensorflow:1.6.0-gpu
And of course, you can use the -v flag to make directories in your host machine visible to the docker container.
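If you end up scripting this, here is a small sketch of how the -v mounts could be assembled. docker_run_command is a hypothetical helper, and /home/project and /project are only example paths.

```python
import shlex


def docker_run_command(image, mounts=(), runtime="nvidia"):
    """Build a `docker run` argument list with one -v flag per (host, container) pair."""
    cmd = ["docker", "run", "-it", "--rm", f"--runtime={runtime}"]
    for host_dir, container_dir in mounts:
        cmd += ["-v", f"{host_dir}:{container_dir}"]
    cmd.append(image)
    return cmd


if __name__ == "__main__":
    cmd = docker_run_command(
        "tensorflow/tensorflow:1.6.0-gpu",
        mounts=[("/home/project", "/project")],  # example paths, adjust as needed
    )
    # Print the command for copy-pasting instead of executing it.
    print(shlex.join(cmd))
```

Each (host, container) pair becomes one -v host:container argument placed before the image name.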

Tensorflow on windows 10

The docker image does not provide an updated version of tensorflow. How should I upgrade to the 0.12.0 CPU version?
I tried getting the latest-devel cpu version using:
docker run -it -p 8888:8888 -v /notebooks_proj b.gcr.io/tensorflow/tensorflow:latest-devel
but it is version 0.8.0. How do I get a 0.12.0 docker image?
The latest-devel and latest tags on gcr.io and docker hub should both be up-to-date (0.12.0-rc1 currently)
For gcr.io
docker run -it --rm gcr.io/tensorflow/tensorflow:latest-devel python -c "import tensorflow as tf; print(tf.__version__)"
gives 0.12.0-rc1
For docker hub
docker run -it --rm tensorflow/tensorflow:latest-devel python -c "import tensorflow as tf; print(tf.__version__)"
gives 0.12.0-rc1

How do I install TensorFlow's tensorboard?

How do I install TensorFlow's tensorboard?
The steps to install Tensorflow are here: https://www.tensorflow.org/install/
For example, on Linux for CPU-only (no GPU), you would type this command:
pip install -U pip
pip install tensorflow
Since TensorFlow depends on TensorBoard, running the following command should not be necessary:
pip install tensorboard
Try typing which tensorboard in your terminal. It should exist if you installed with pip as mentioned in the tensorboard README (although the documentation doesn't tell you that you can now launch tensorboard without doing anything else).
You need to give it a log directory. If you are in the directory where you saved your graph, you can launch it from your terminal with something like:
tensorboard --logdir .
or more generally:
tensorboard --logdir /path/to/log/directory
for any log directory.
Then open your favorite web browser and type in localhost:6006 to connect.
That should get you started. As for logging anything useful in your training process, you need to use the TensorFlow Summary API. You can also use the TensorBoard callback in Keras.
If your Tensorflow install is located here:
/usr/local/lib/python2.7/dist-packages/tensorflow
then the python command to launch Tensorboard is:
$ python /usr/local/lib/python2.7/dist-packages/tensorflow/tensorboard/tensorboard.py --logdir=/home/user/Documents/.../logdir
The installation from pip allows you to use:
$ tensorboard --logdir=/home/user/Documents/.../logdir
It may be helpful to make an alias for it.
Install and find your tensorboard location:
pip install tensorboard
pip show tensorboard
Add the following alias in .bashrc:
alias tensorboard='python pathShownByPip/tensorboard/main.py'
Open another terminal or run exec bash.
For Windows users, cd into pathShownByPip\tensorboard and run python main.py from there.
For Python 3.x, use pip3 instead of pip, and don't forget to use python3 in the alias.
TensorBoard isn't a separate component. TensorBoard comes packaged with TensorFlow.
Adding this just for the sake of completeness of this question (some questions may get closed as duplicate of this one).
I usually use user mode for pip, i.e. pip install --user, even if the instructions assume root mode. That way, my tensorboard installation ended up in ~/.local/bin/tensorboard, which was not on my PATH (not ideal either), so I was not able to run it.
In this case, running
sudo ln -s ~/.local/bin/tensorboard /usr/bin
should fix it.
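Before creating the symlink, it may be worth checking whether the user-level bin directory is on PATH at all. A minimal sketch; dir_on_path is a hypothetical helper, and the bin subdirectory assumes a Linux-style layout.

```python
import os
import site


def dir_on_path(directory, path_env):
    """Return True if `directory` is one of the entries in a PATH-style string."""
    target = os.path.normpath(directory)
    entries = [os.path.normpath(p) for p in path_env.split(os.pathsep) if p]
    return target in entries


if __name__ == "__main__":
    # site.getuserbase() is ~/.local on typical Linux setups.
    user_bin = os.path.join(site.getuserbase(), "bin")
    if dir_on_path(user_bin, os.environ.get("PATH", "")):
        print(f"{user_bin} is already on PATH")
    else:
        print(f'not on PATH; consider adding: export PATH="{user_bin}:$PATH"')
```

Adding the directory to PATH in .bashrc is an alternative to the symlink that does not need sudo.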
pip install tensorflow.tensorboard # install tensorboard
pip show tensorflow.tensorboard
# Location: c:\users\<name>\appdata\roaming\python\python35\site-packages
# now just run tensorboard as:
python c:\users\<name>\appdata\roaming\python\python35\site-packages\tensorboard\main.py --logdir=<logdir>
If you're using the anaconda distribution of Python, then simply do:
$ conda install -c conda-forge tensorboard
or
$ conda install -c anaconda tensorboard
You can also look at the various builds by searching the package repo with:
$ anaconda search -t conda tensorboard
which lists the channels and the corresponding builds, the supported OS, Python versions, etc.
The pip package you are looking for is tensorflow-tensorboard developed by Google.
If you installed TensorFlow using pip, then the location of TensorBoard can be retrieved by issuing the command which tensorboard on the terminal. You can then edit the TensorBoard file, if necessary.
It is better not to mix up virtual environments or install into the root environment. The steps I took for a hassle-free installation are below. I used conda instead of pip to install all my dependencies. I'm answering with extra detail because when I tried to install TensorBoard and TensorFlow in my root env, it messed things up.
Create a virtual env
conda create --name my_env python=3.6
Activate virtual environment
source activate my_env
Install basic required modules
conda install pandas
conda install tensorflow
Install TensorBoard
conda install -c conda-forge tensorboard
Hope that helps
I have a local install of tensorflow 1.15.0 (with tensorboard obviously included) on MacOS.
For me, the path to the relevant file within my user directory is Library/Python/3.7/lib/python/site-packages/tensorboard/main.py. So the which command does not work for me, and you have to look for the file named main.py, which is odd, since it is apparently named something else for other users.
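Instead of hunting for main.py by hand, you can ask Python where the package lives. A rough sketch, assuming tensorboard is installed for the interpreter you run it with; find_module_file is a hypothetical helper name.

```python
import importlib.util
import os


def find_module_file(package, filename="main.py"):
    """Return the path to `filename` inside an installed package, or None."""
    spec = importlib.util.find_spec(package)
    if spec is None or not spec.submodule_search_locations:
        return None
    for location in spec.submodule_search_locations:
        candidate = os.path.join(location, filename)
        if os.path.isfile(candidate):
            return candidate
    return None


if __name__ == "__main__":
    path = find_module_file("tensorboard")
    print(path or "tensorboard is not installed for this interpreter")
```

If the file is found, python -m tensorboard.main --logdir /path/to/log/directory should launch it without needing the full path or an alias.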