TensorFlow GPU /device:GPU:0 not found on Ubuntu Bionic - tensorflow

I have a p3.2xlarge instance on Amazon EC2 running Ubuntu Bionic. It is supposed to have a GPU device installed, but when I run this code it says there is no GPU. This is a virtual machine, not a physical one, but there is still supposed to be a GPU. Note that I started TensorFlow using Docker, which should not work if the GPU were missing:
sudo docker run -it -p 8888:8888 tensorflow/tensorflow
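import tensorflow as tf

# Pin the matmul to the first GPU (the error below refers to /device:GPU:0).
with tf.device('/device:GPU:0'):
    a = tf.constant([1.0, 2.0, 3.0, 4.0, 5.0, 6.0], shape=[2, 3], name='a')
    b = tf.constant([1.0, 2.0, 3.0, 4.0, 5.0, 6.0], shape=[3, 2], name='b')
    c = tf.matmul(a, b)
# TensorFlow 1.x session API, as used in the original post.
with tf.Session() as sess:
    print(sess.run(c))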
I get this error:
InvalidArgumentError: Cannot assign a device for operation 'MatMul': Operation was explicitly assigned to /device:GPU:0 but available devices are [ /job:localhost/replica:0/task:0/device:CPU:0 ]
But there are NVIDIA drivers loaded, and you cannot load those without a device:
lspci -nn | grep '\[03'
00:02.0 VGA compatible controller [0300]: Cirrus Logic GD 5446 [1013:00b8]
00:1e.0 3D controller [0302]: NVIDIA Corporation GV100GL [Tesla V100 SXM2 16GB] [10de:1db1] (rev a1)
dpkg -l "*cuda*"
| Status=Not/Inst/Conf-files/Unpacked/halF-conf/Half-inst/trig-aWait/Trig-pend
|/ Err?=(none)/Reinst-required (Status,Err: uppercase=bad)
||/ Name Version Architecture Description
ii cuda-command-l 9.0.176-1 amd64 CUDA command-line tools
ii cuda-core-9-0 amd64 CUDA core tools
ii cuda-cublas-9- amd64 CUBLAS native runtime libraries
ii cuda-cudart-9- 9.0.176-1 amd64 CUDA Runtime native Libraries
ii cuda-cudart-de 9.0.176-1 amd64 CUDA Runtime native dev links, he
ii cuda-cufft-9-0 9.0.176-1 amd64 CUFFT native runtime libraries
ii cuda-curand-9- 9.0.176-1 amd64 CURAND native runtime libraries
ii cuda-cusolver- 9.0.176-1 amd64 CUDA solver native runtime librar
ii cuda-cusparse- 9.0.176-1 amd64 CUSPARSE native runtime libraries
ii cuda-driver-de 9.0.176-1 amd64 CUDA Driver native dev stub libra
ii cuda-license-9 9.0.176-1 amd64 CUDA licenses
ii cuda-misc-head 9.0.176-1 amd64 CUDA miscellaneous headers
ii cuda-repo-ubun 9.1.85-1 amd64 cuda repository configuration fil
un libcuda1-340 <none> <none> (no description available)
ii nvinfer-runtim 1.0-1 amd64 nvinfer-runtime-trt repository co
ii nvinfer-runtim 1-1 amd64 nvinfer-runtime-trt repository co
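The lspci check above can be scripted. The following is a minimal sketch, not part of the original post, that scans `lspci -nn` output for display-class devices whose PCI vendor ID is 10de (the NVIDIA ID visible in the output above). The function name and regular expression are illustrative assumptions. Note that it only confirms the device is visible on the machine where lspci was run; it says nothing about whether a process inside a container can use it.

```python
import re

# Matches display-class (03xx) lines of `lspci -nn`, for example:
# 00:1e.0 3D controller [0302]: NVIDIA Corporation GV100GL [...] [10de:1db1] (rev a1)
LSPCI_LINE = re.compile(
    r"^(?P<slot>\S+)\s+(?P<cls>.+?)\s+\[(?P<class_id>03[0-9a-f]{2})\]:\s+"
    r"(?P<name>.+?)\s+\[(?P<vendor>[0-9a-f]{4}):(?P<device>[0-9a-f]{4})\]"
)

NVIDIA_VENDOR_ID = "10de"


def find_nvidia_controllers(lspci_output):
    """Return (slot, name) pairs for display-class devices made by NVIDIA."""
    found = []
    for line in lspci_output.splitlines():
        match = LSPCI_LINE.match(line.strip())
        if match and match.group("vendor") == NVIDIA_VENDOR_ID:
            found.append((match.group("slot"), match.group("name")))
    return found


# The two lines reported in the question.
sample = """\
00:02.0 VGA compatible controller [0300]: Cirrus Logic GD 5446 [1013:00b8]
00:1e.0 3D controller [0302]: NVIDIA Corporation GV100GL [Tesla V100 SXM2 16GB] [10de:1db1] (rev a1)
"""
controllers = find_nvidia_controllers(sample)
print(controllers)
```

On a live system the input would come from running `lspci -nn` and passing its text to the function.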

I am going to put this on hold temporarily. CUDA does not support Ubuntu Bionic yet. I tried to use the latest version of CUDA anyway and got these errors:
cat /var/log/nvidia-installer.log
nvidia-installer log file '/var/log/nvidia-installer.log'
creation time: Mon Aug 13 10:23:44 2018
installer version: 396.37
PATH: /usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin:/snap/bin
nvidia-installer command line:
Using built-in stream user interface
-> Detected 8 CPUs online; setting concurrency level to 8.
-> Installing NVIDIA driver version 396.37.
-> For some distributions, Nouveau can be disabled by adding a file in the modprobe configuration directory. Would you like nvidia-installer to attempt to create this modprobe file for you? (Answer: Yes)
-> One or more modprobe configuration files to disable Nouveau have been written. For some distributions, this may be sufficient to disable Nouveau; other distributions may require modification of the initial ramdisk. Please reboot your system and attempt NVIDIA driver installation again. Note if you later wish to reenable Nouveau, you will need to delete these files: /etc/modprobe.d/nvidia-installer-disable-nouveau.conf
-> Would you like to register the kernel module sources with DKMS? This will allow DKMS to automatically build a new module, if you install a different kernel later. (Answer: Yes)
-> Installing both new and classic TLS OpenGL libraries.
-> Installing both new and classic TLS 32bit OpenGL libraries.
-> Install NVIDIA's 32-bit compatibility libraries? (Answer: Yes)
-> Will install GLVND GLX client libraries.
-> Will install GLVND EGL client libraries.
-> Skipping GLX non-GLVND file: "libGL.so.396.37"
-> Skipping GLX non-GLVND file: "libGL.so.1"
-> Skipping GLX non-GLVND file: "libGL.so"
-> Skipping EGL non-GLVND file: "libEGL.so.396.37"
-> Skipping EGL non-GLVND file: "libEGL.so"
-> Skipping EGL non-GLVND file: "libEGL.so.1"
-> Skipping GLX non-GLVND file: "./32/libGL.so.396.37"
-> Skipping GLX non-GLVND file: "libGL.so.1"
-> Skipping GLX non-GLVND file: "libGL.so"
-> Skipping EGL non-GLVND file: "./32/libEGL.so.396.37"
-> Skipping EGL non-GLVND file: "libEGL.so"
-> Skipping EGL non-GLVND file: "libEGL.so.1"
Looking for install checker script at ./libglvnd_install_checker/check-libglvnd-install.sh
executing: '/bin/sh ./libglvnd_install_checker/check-libglvnd-install.sh'...
Checking for libglvnd installation.
Checking libGLdispatch...
Checking libGLdispatch dispatch table
Checking call through libGLdispatch
All OK
libGLdispatch is OK
Checking for libGLX
libGLX is OK
Checking for libEGL
eglGetDisplay failed
Checking entrypoint library libOpenGL.so.0
Checking call through libGLdispatch
Checking call through library libOpenGL.so.0
dlopen("libOpenGL.so.0") failed: libOpenGL.so.0: cannot open shared object file: No such file or directory
Checking entrypoint library libGL.so.1
Checking call through libGLdispatch
Checking call through library libGL.so.1
glGetString was not called
Found libglvnd libraries: libGLX.so.0 libGLdispatch.so.0
Missing libglvnd libraries: libGL.so.1 libOpenGL.so.0 libEGL.so.1
-> An incomplete installation of libglvnd was found. Do you want to install a full copy of libglvnd? This will overwrite any existing libglvnd libraries. (Answer: Abort installation.)
ERROR: Installation has failed. Please see the file '/var/log/nvidia-installer.log' for details. You may find suggestions on fixing installation problems in the README available on the Linux driver download page at www.nvidia.com.
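The log is long, and the lines that explain the abort are the libglvnd summary near the end. As a rough sketch (the function name is made up for illustration), the "Found" and "Missing" summary lines can be pulled out of an nvidia-installer log like this:

```python
def libglvnd_status(log_text):
    """Collect the libglvnd libraries the installer reported as found or missing."""
    status = {"found": [], "missing": []}
    for raw_line in log_text.splitlines():
        line = raw_line.strip()
        if line.startswith("Found libglvnd libraries:"):
            status["found"] = line.split(":", 1)[1].split()
        elif line.startswith("Missing libglvnd libraries:"):
            status["missing"] = line.split(":", 1)[1].split()
    return status


# The two summary lines from the log above.
sample_log = """\
Found libglvnd libraries: libGLX.so.0 libGLdispatch.so.0
Missing libglvnd libraries: libGL.so.1 libOpenGL.so.0 libEGL.so.1
"""
status = libglvnd_status(sample_log)
print(status["missing"])
```

For the real file, the text would be read from /var/log/nvidia-installer.log.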


ValueError: invalid literal for int() with base 10: '' while building tensorflow from source with gpu support [duplicate]

When I install tensorflow-gpu through Conda, it gives me the following output:
conda install tensorflow-gpu
Collecting package metadata (current_repodata.json): done
Solving environment: done
## Package Plan ##
environment location: /home/psychotechnopath/anaconda3/envs/DeepLearning3.6
added / updated specs:
- tensorflow-gpu
The following packages will be downloaded:
package | build
_tflow_select-2.1.0 | gpu 2 KB
cudatoolkit-10.1.243 | h6bb024c_0 347.4 MB
cudnn-7.6.5 | cuda10.1_0 179.9 MB
cupti-10.1.168 | 0 1.4 MB
tensorflow-2.1.0 |gpu_py36h2e5cdaa_0 4 KB
tensorflow-base-2.1.0 |gpu_py36h6c5654b_0 155.9 MB
tensorflow-gpu-2.1.0 | h0d30ee6_0 3 KB
Total: 684.7 MB
The following NEW packages will be INSTALLED:
cudatoolkit pkgs/main/linux-64::cudatoolkit-10.1.243-h6bb024c_0
cudnn pkgs/main/linux-64::cudnn-7.6.5-cuda10.1_0
cupti pkgs/main/linux-64::cupti-10.1.168-0
tensorflow-gpu pkgs/main/linux-64::tensorflow-gpu-2.1.0-h0d30ee6_0
I see that installing tensorflow-gpu automatically triggers the installation of cudatoolkit and cudnn. Does this mean that I no longer need to install CUDA and cuDNN manually to be able to use tensorflow-gpu? Where does this conda installation of CUDA reside?
I first installed CUDA and cuDNN the old way (i.e. by following these installation instructions: https://docs.nvidia.com/cuda/cuda-installation-guide-linux/index.html ).
Then I noticed that tensorflow-gpu was also installing CUDA and cuDNN.
Do I now have two versions of CUDA/cuDNN installed, and how do I check this?
Do I now have two versions of CUDA installed and how do I check this?
conda installs the bare minimum redistributable library components required to support the CUDA accelerated packages they offer. The package name cudatoolkit is a complete misnomer. It is nothing of the sort. Even though it is now greatly expanded in scope from what it used to be (literally 5 files -- I think at some point they must have gotten a licensing deal from NVIDIA because some of this wasn't/isn't on the official "freely redistributable" list AFAIK), it still is basically just a handful of libraries.
You can check this for yourself:
cat /opt/miniconda3/conda-meta/cudatoolkit-10.1.168-0.json
"build": "0",
"build_number": 0,
"channel": "https://repo.anaconda.com/pkgs/main/linux-64",
"constrains": [],
"depends": [],
"extracted_package_dir": "/opt/miniconda3/pkgs/cudatoolkit-10.1.168-0",
"features": "",
"files": [
i.e. what you get is (keeping in mind most of those "files" above are just symlinks)
CUBLAS runtime
The CUDA runtime library
CUFFT runtime
CUrand runtime
CUsparse runtime
CUsolver runtime
NPP runtime
nvblas runtime
NVTX runtime
NVgraph runtime
NVjpeg runtime
NVRTC/NVVM runtime
The CUDNN package that conda installs is the redistributable binary distribution, which is identical to what NVIDIA distributes: exactly two files, a header file and a library.
You would still require a supported NVIDIA driver installation to make the tensorflow which conda installs work.
If you want to actually compile and build CUDA code, you need to install a separate CUDA toolkit, which contains all the development components that conda deliberately omits from its distribution.
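To see whether both a conda copy and a system copy of the runtime libraries are present, one option is to look for libcudart and libcudnn under each installation prefix. The sketch below is an illustration only: the function name is hypothetical, and it assumes the libraries sit in a lib or lib64 directory directly under each prefix (for a conda environment on Linux that is normally the environment's lib directory). The demo builds a throwaway directory tree so that it runs on any machine.

```python
import glob
import os
import tempfile


def find_cuda_libraries(prefixes):
    """Map each prefix to the libcudart/libcudnn files in its lib and lib64 dirs."""
    found = {}
    for prefix in prefixes:
        hits = []
        for libdir in ("lib", "lib64"):
            for pattern in ("libcudart.so*", "libcudnn.so*"):
                hits.extend(glob.glob(os.path.join(prefix, libdir, pattern)))
        if hits:
            found[prefix] = sorted(hits)
    return found


# Demo on temporary directories; the names only stand in for real prefixes.
with tempfile.TemporaryDirectory() as tmp:
    conda_env = os.path.join(tmp, "conda_env")    # stands in for $CONDA_PREFIX
    system_cuda = os.path.join(tmp, "cuda-10.1")  # stands in for /usr/local/cuda-10.1
    os.makedirs(os.path.join(conda_env, "lib"))
    os.makedirs(os.path.join(system_cuda, "lib64"))
    open(os.path.join(conda_env, "lib", "libcudart.so.10.1"), "w").close()
    open(os.path.join(system_cuda, "lib64", "libcudart.so.10.1"), "w").close()
    demo = find_cuda_libraries([conda_env, system_cuda])
    demo_counts = {os.path.basename(p): len(v) for p, v in demo.items()}

print(demo_counts)
```

On a real machine the prefixes would be something like the value of the CONDA_PREFIX environment variable plus whatever `glob.glob("/usr/local/cuda*")` returns. If more than one prefix shows up in the result, more than one copy is installed.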

Installing Rakudo on Android with ARM processor architecture

I am trying to install Rakudo on my Android with armv7l processor architecture using Termux.
I tried compiling from source, but it didn't work. Then someone pointed out the Termux user its-pointless and his package for this, but that package does not work on my phone.
How can I run Raku on my phone while it is offline? I'm open to solutions not using Termux.
Termux on SSH results:
u0_a74#localhost ~/rakudo [100]> pkg show rakudo -a
Package: rakudo Version: 2020.05 Maintainer: Termux members #termux
Installed-Size: 37.7 MB Depends: moarvm Homepage: https://rakudo.org
Download-Size: 5062 kB APT-Manual-Installed: yes APT-Sources:
https://its-pointless.github.io/files/24 termux/extras arm Packages
Description: Perl 6 implementation on top of Moar virtual machine
Package: rakudo Version: 2020.01-1 Maintainer: Fredrik Fornwall
#fornwall Installed-Size: 93.1 MB Depends: moarvm Homepage:
https://rakudo.org Download-Size: 10.9 MB APT-Sources:
https://its-pointless.github.io/files/24 termux/extras arm Packages
Description: Perl 6 implementation on top of Moar virtual machine
u0_a74#localhost ~/rakudo> raku
CANNOT LINK EXECUTABLE "raku": cannot locate symbol "ffi_type_double"
referenced by "/data/data/com.termux/files/usr/lib/libmoar.so"...
u0_a74#localhost ~/rakudo> raku --version
CANNOT LINK EXECUTABLE "raku": cannot locate symbol "ffi_type_double"
referenced by "/data/data/com.termux/files/usr/lib/libmoar.so"...
u0_a74#localhost ~/rakudo> raku --help
CANNOT LINK EXECUTABLE "raku": cannot locate symbol "ffi_type_double"
referenced by "/data/data/com.termux/files/usr/lib/libmoar.so"...
u0_a74#localhost ~/rakudo> uname -a
Linux localhost 3.4.42-g3d041de #1 SMP PREEMPT Sat Dec 24 19:56:29 PST
2016 armv7l Android
Does it have to be on Termux? I have successfully installed Raku on Android via UserLand, using Debian SSH. sudo apt-get install rakudo works.

Loaded runtime CuDNN library: 7.1.2 but source was compiled with: 7.6.0; Ubuntu 18.04

I am trying to address the issue in the title:
Loaded runtime CuDNN library: 7.1.2 but source was compiled with: 7.6.0. CuDNN library major and minor version needs to match or have higher minor version in case of CuDNN 7.0 or later version
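The rule stated in that message can be written down as a small check. This is only my reading of the error text, not TensorFlow's actual implementation, and the function name is hypothetical: the major versions must match, and for CuDNN 7.0 or later the loaded minor version may be equal to or higher than the one the binary was compiled with.

```python
def cudnn_runtime_is_compatible(loaded, compiled):
    """Apply the rule from the error message to (major, minor, ...) version tuples."""
    loaded_major, loaded_minor = loaded[0], loaded[1]
    compiled_major, compiled_minor = compiled[0], compiled[1]
    if loaded_major != compiled_major:
        return False
    if compiled_major >= 7:
        # CuDNN 7.0 or later: a higher runtime minor version is accepted.
        return loaded_minor >= compiled_minor
    # Older releases: major and minor must both match.
    return loaded_minor == compiled_minor


# The case from the error message: 7.1.2 loaded, 7.6.0 expected.
print(cudnn_runtime_is_compatible((7, 1, 2), (7, 6, 0)))
# A runtime that is at least as new as the compile-time version.
print(cudnn_runtime_is_compatible((7, 6, 5), (7, 6, 0)))
```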
I have read several other posts (for example: Loaded runtime CuDNN library: 5005 (compatibility version 5000) but source was compiled with 5103 (compatibility version 5100))
that basically tell me that my machine has CuDNN 7.1.2 but I need 7.6.0. The answer is then to download and install 7.6.*.
The only issue is that I thought I had done that by following the instructions in the NVIDIA archive (https://developer.nvidia.com/rdp/cudnn-archive),
and if I go to /usr/local/cuda/include and read cudnn.h, it shows:
#if !defined(CUDNN_H_)
#define CUDNN_H_
#define CUDNN_MAJOR 7
#define CUDNN_MINOR 6
Currently I have CUDA 10.0, 10.1, and 10.2 installed, with my .bashrc set to 10.0 (although nvcc --version states I have CUDA 9.1, another issue I can't seem to fix).
Any suggestions? I have been trying to tackle this for days but no luck.
Here are the paths I have
export PATH=$PATH:/usr/local/cuda-10.0/bin
export LD_LIBRARY_PATH=$LD_LIBRARY_PATH:/usr/local/cuda-10.0/lib64
export CUDA_HOME=/usr/local/cuda
Before this is closed, could you please help by either suggesting a proper path to set or a way to find the old cuDNN?
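The cudnn.h check mentioned in the question can be automated with a few lines. This is a sketch under assumptions: the function name is made up, and it reads the CUDNN_MAJOR, CUDNN_MINOR and CUDNN_PATCHLEVEL macros from the header text. Keep in mind that the header only describes the copy installed next to it; the library that is actually loaded at run time can come from a different directory.

```python
import re


def parse_cudnn_version(header_text):
    """Extract the CUDNN_MAJOR/MINOR/PATCHLEVEL macros that appear in cudnn.h."""
    version = {}
    for name in ("MAJOR", "MINOR", "PATCHLEVEL"):
        pattern = r"^\s*#define\s+CUDNN_%s\s+(\d+)" % name
        match = re.search(pattern, header_text, re.MULTILINE)
        if match:
            version[name.lower()] = int(match.group(1))
    return version


# The excerpt shown in the question (the patch level line was not quoted there).
excerpt = """\
#if !defined(CUDNN_H_)
#define CUDNN_H_
#define CUDNN_MAJOR 7
#define CUDNN_MINOR 6
"""
header_version = parse_cudnn_version(excerpt)
print(header_version)
```

For the real file, the text would be read from /usr/local/cuda/include/cudnn.h, and repeating this for every CUDA directory shows which installation carries which header.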
I hit a very similar error:
Loaded runtime CuDNN library: 7.1.4 but source was compiled with: 7.6.5. CuDNN library major and minor version needs to match or have higher minor version in case of CuDNN 7.0 or later version. If using a binary install, upgrade your CuDNN library. If building from sources, make sure the library loaded at runtime is compatible with the version specified during compile configuration.
and tracked it down to accidentally having an older CuDNN in my ldconfig:
$ sudo ldconfig -p | grep libcudnn
libcudnn.so.7 (libc6,x86-64) => /usr/local/cuda-9.0/lib64/libcudnn.so.7
libcudnn.so.7 (libc6,x86-64) => /usr/lib/x86_64-linux-gnu/libcudnn.so.7
libcudnn.so (libc6,x86-64) => /usr/local/cuda-9.0/lib64/libcudnn.so
libcudnn.so (libc6,x86-64) => /usr/lib/x86_64-linux-gnu/libcudnn.so
The libcudnn.so.7 file in the cuda-9.0 directory was pointing to the older version:
ls -alh /usr/local/cuda-9.0/lib64/libcudnn.so.7
lrwxrwxrwx 1 root root 17 Dec 16 2018 /usr/local/cuda-9.0/lib64/libcudnn.so.7 -> libcudnn.so.7.1.4
But I had compiled tensorflow against the newer version:
ls -alh /usr/lib/x86_64-linux-gnu/libcudnn.so.7
lrwxrwxrwx 1 root root 17 Oct 27 2019 /usr/lib/x86_64-linux-gnu/libcudnn.so.7 -> libcudnn.so.7.6.5
Since ldconfig uses /etc/ld.so.conf to determine where to look for libraries (the dynamic loader additionally consults LD_LIBRARY_PATH at run time), I checked it and it showed:
include /etc/ld.so.conf.d/*.conf
When I listed the files in that directory, I spotted the problem file and removed it:
$ cat /etc/ld.so.conf.d/cuda9.conf
$ sudo rm /etc/ld.so.conf.d/cuda9.conf
After that I re-ran ldconfig to reload the config, and then everything worked as expected and the error disappeared.
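The same diagnosis can be done mechanically by grouping the `ldconfig -p` output by library name and flagging names that resolve to more than one path. The sketch below is illustrative (the function name is hypothetical) and deliberately ignores the architecture field, so a 32-bit and a 64-bit copy of the same library would also be reported.

```python
from collections import defaultdict


def duplicate_sonames(ldconfig_output):
    """Return {soname: [paths]} for names listed more than once by `ldconfig -p`."""
    paths_by_name = defaultdict(list)
    for line in ldconfig_output.splitlines():
        if "=>" not in line:
            continue  # skip the summary header and blank lines
        left, path = line.split("=>", 1)
        fields = left.split()
        if not fields:
            continue
        paths_by_name[fields[0]].append(path.strip())
    return {name: paths for name, paths in paths_by_name.items() if len(paths) > 1}


# The cache listing from the answer above.
listing = """\
libcudnn.so.7 (libc6,x86-64) => /usr/local/cuda-9.0/lib64/libcudnn.so.7
libcudnn.so.7 (libc6,x86-64) => /usr/lib/x86_64-linux-gnu/libcudnn.so.7
libcudnn.so (libc6,x86-64) => /usr/local/cuda-9.0/lib64/libcudnn.so
libcudnn.so (libc6,x86-64) => /usr/lib/x86_64-linux-gnu/libcudnn.so
"""
duplicates = duplicate_sonames(listing)
for name, paths in duplicates.items():
    print(name, paths)
```

Each reported path can then be inspected with ls -l, as above, to see which concrete version its symlink points to.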

Matplotlib 2.2.2 installation error on High Sierra

I'm running macOS 10.13.5 and struggling to install Matplotlib on Python 3.7. Any help is greatly appreciated.
Here is the error that I get when I use pip3 install matplotlib:
matplotlib: yes [2.2.2]
python: yes [3.7.0 (v3.7.0:1bf9cc5093, Jun 26 2018,
23:26:24) [Clang 6.0 (clang-600.0.57)]]
platform: yes [darwin]
numpy: yes [version 1.14.5]
install_requires: yes [handled by setuptools]
libagg: yes [Requires patches that have not been merged
upstream. Using local copy.]
freetype: no [The C/C++ header for freetype2 (ft2build.h)
could not be found. You may need to install the
development package.]
png: yes [version 1.6.34]
However I have already installed and linked freetype via Homebrew:
Ocean-Gypsys-MacBook-Pro:~ Aysegul$ more /usr/X11/lib/pkgconfig/freetype2.pc
Name: FreeType 2
URL: http://freetype.org
Description: A free, high-quality, and portable font engine.
Version: 18.6.12
Libs: -L${libdir} -lfreetype
Libs.private: -lz -lbz2
Cflags: -I${includedir}/freetype2
/usr/X11/lib/pkgconfig/freetype2.pc (END)
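The build message is about the header ft2build.h rather than the pkg-config file, so it can help to check directly which include directories contain that header. A minimal sketch follows, with a hypothetical helper name; the demo uses temporary directories so that it runs anywhere.

```python
import os
import tempfile

HEADER = "ft2build.h"


def find_header(header_name, include_dirs):
    """Return the full path of header_name in every directory that contains it."""
    return [
        os.path.join(directory, header_name)
        for directory in include_dirs
        if os.path.isfile(os.path.join(directory, header_name))
    ]


# Demo: one directory with the header, one without.
with tempfile.TemporaryDirectory() as tmp:
    with_header = os.path.join(tmp, "include", "freetype2")
    without_header = os.path.join(tmp, "other", "include")
    os.makedirs(with_header)
    os.makedirs(without_header)
    open(os.path.join(with_header, HEADER), "w").close()
    demo_hits = find_header(HEADER, [with_header, without_header])
    demo_found_in_first = demo_hits == [os.path.join(with_header, HEADER)]

print(demo_found_in_first)
```

On the machine in question, the directories to pass in would be the ones printed by `pkg-config --cflags freetype2` (with the leading -I removed). An empty result would suggest the build cannot see the header even though the .pc file exists.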

syntaxnet ./configure error

I am trying to use SyntaxNet and have finished most of the setup steps. I upgraded Bazel to version 0.43 in case of errors (Ubuntu 16.04, Anaconda Python 2.7).
However, I am having trouble with the ./configure step. I am following the official instructions from the TensorFlow GitHub repository.
git clone --recursive https://github.com/tensorflow/models.git
cd models/syntaxnet/tensorflow
cd ..
bazel test syntaxnet/... util/utf8/...
# On Mac, run the following:
bazel test --linkopt=-headerpad_max_install_names \
syntaxnet/... util/utf8/...
The following log should help show what is going on on my machine. Thanks for any advice.
Please specify the location of python. [Default is /home/ryan/anaconda2/bin/python]:
Do you wish to build TensorFlow with Google Cloud Platform support? [y/N] n
No Google Cloud Platform support will be enabled for TensorFlow
Do you wish to build TensorFlow with Hadoop File System support? [y/N] n
No Hadoop File System support will be enabled for TensorFlow
Found possible Python library paths:
Please input the desired Python library path to use. Default is [/home/ryan]
Do you wish to build TensorFlow with GPU support? [y/N] y
GPU support will be enabled for TensorFlow
Please specify which gcc should be used by nvcc as the host compiler. [Default is /usr/bin/gcc]:
Please specify the Cuda SDK version you want to use, e.g. 7.0. [Leave empty to use system default]: 8.0
Please specify the location where CUDA 8.0 toolkit is installed. Refer to README.md for more details. [Default is /usr/local/cuda]:
Please specify the Cudnn version you want to use. [Leave empty to use system default]: 5.0
Please specify the location where cuDNN 5.0 library is installed. Refer to README.md for more details. [Default is /usr/local/cuda]:
Invalid path to cuDNN toolkit. Neither of the following two files can be found:
Please specify the Cudnn version you want to use. [Leave empty to use system default]:
Please specify the location where cuDNN library is installed. Refer to README.md for more details. [Default is /usr/local/cuda]:
libcudnn.so resolves to libcudnn.5
Please specify a list of comma-separated Cuda compute capabilities you want to build with.
You can find the compute capability of your device at: https://developer.nvidia.com/cuda-gpus.
Please note that each additional compute capability significantly increases your build time and binary size.
[Default is: "3.5,5.2"]:
INFO: Options provided by the client:
Inherited 'common' options: --isatty=1 --terminal_columns=120
INFO: Reading options for 'clean' from /home/ryan/git_ryan/models/syntaxnet/tensorflow/tools/bazel.rc:
Inherited 'build' options: --force_python=py2 --host_force_python=py2 --python2_path=/home/ryan/anaconda2/bin/python --define=use_fast_cpp_protos=true --define=allow_oversize_protos=true --define PYTHON_BIN_PATH=/home/ryan/anaconda2/bin/python --spawn_strategy=standalone --genrule_strategy=standalone
INFO: Reading options for 'clean' from /etc/bazel.bazelrc:
Inherited 'build' options: --action_env=PATH --action_env=LD_LIBRARY_PATH --action_env=TMPDIR --test_env=PATH --test_env=LD_LIBRARY_PATH
Unrecognized option: --action_env=PATH
ERROR: /home/ryan/git_ryan/models/syntaxnet/tensorflow/tensorflow/tensorflow.bzl:568:26: Traceback (most recent call last):
File "/home/ryan/git_ryan/models/syntaxnet/tensorflow/tensorflow/tensorflow.bzl", line 562
rule(attrs = {"srcs": attr.label_list..."), <3 more arguments>)}, <2 more arguments>)
File "/home/ryan/git_ryan/models/syntaxnet/tensorflow/tensorflow/tensorflow.bzl", line 568, in rule
attr.label_list(cfg = "data", allow_files = True)
expected ConfigurationTransition or NoneType for 'cfg' while calling label_list but got string instead: data.
ERROR: com.google.devtools.build.lib.packages.BuildFileContainsErrorsException: error loading package '': Extension file 'tensorflow/tensorflow.bzl' has errors.
Configuration finished
I think your version of Bazel is too new for SyntaxNet. Please try bazel-0.3.1.
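Comparing version strings by eye is error-prone, so here is a small sketch of a numeric comparison. The helper names are hypothetical, and 0.3.1 is simply the version suggested above, not a verified compatibility limit. The input would be the version number reported by the installed Bazel.

```python
def parse_version(text):
    """Turn a dotted version string such as '0.3.1' into a tuple of integers."""
    return tuple(int(part) for part in text.strip().split("."))


def is_newer_than(installed, suggested):
    """True if the installed version is strictly newer than the suggested one."""
    return parse_version(installed) > parse_version(suggested)


# Example values for illustration only.
print(is_newer_than("0.4.3", "0.3.1"))
print(is_newer_than("0.3.1", "0.3.1"))
```

Tuples are compared element by element, which avoids the pitfall of plain string comparison, where "0.10.0" would sort before "0.9.0".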