TensorFlow Bazel build failing - not generating bazel-bin directory

I'm trying to install TensorFlow from source using the following configuration:
NVIDIA GTX 1070
UBUNTU 16.04
CUDA 8.0
cuDNN v5.0
I have followed the following steps from here:
installed bazel
installed dependencies
installed CUDA support
./configure with CUDA 8.0 support
bazel build -c opt --config=cuda //tensorflow/tools/pip_package:build_pip_package
After this step, to my knowledge, there should be a bazel-bin directory, so that I can subsequently execute
$ bazel-bin/tensorflow/tools/pip_package/build_pip_package /tmp/tensorflow_pkg
# The name of the .whl file will depend on your platform.
$ sudo pip install /tmp/tensorflow_pkg/tensorflow-0.10.0rc0-py2-none-any.whl
However, there is no such directory.
I have a feeling this error message might have something to do with it?
ERROR: /usr/local/lib/python2.7/dist-packages/tensorflow_clone/tensorflow/contrib/rnn/BUILD:45:1: error while parsing .d file: /home/volcart/.cache/bazel/_bazel_volcart/62dff5ffffc63bcd8a9350984645e0be/execroot/tensorflow_clone/bazel-out/local_linux-opt/bin/tensorflow/contrib/rnn/_objs/python/ops/_lstm_ops_gpu/tensorflow/contrib/rnn/kernels/lstm_ops_gpu.cu.pic.d (No such file or directory).
nvcc warning : option '--relaxed-constexpr' has been deprecated and replaced by option '--expt-relaxed-constexpr'.
In file included from third_party/gpus/cuda/include/cuda_runtime.h:78:0,
from <command-line>:0:
third_party/gpus/cuda/include/host_config.h:115:2: error: #error -- unsupported GNU version! gcc versions later than 5.3 are not supported!
#error -- unsupported GNU version! gcc versions later than 5.3 are not supported!
Upon re-executing bazel build ... I found this...
WARNING: /usr/local/lib/python2.7/dist-packages/tensorflow/util/python/BUILD:11:16: in includes attribute of cc_library rule //util/python:python_headers: 'python_include' resolves to 'util/python/python_include' not in 'third_party'. This will be an error in the future.
I should also add this...
$ bazel version
Build label: 0.3.1
Build target: bazel-out/local-fastbuild/bin/src/main/java/com/google/devtools/build/lib/bazel/BazelServer_deploy.jar
Build time: Fri Jul 29 09:09:52 2016 (1469783392)
Build timestamp: 1469783392
Build timestamp as int: 1469783392

Running
bazel build -c opt --config=cuda //tensorflow/tools/pip_package:build_pip_package
caused a permissions issue, so I added sudo:
sudo bazel build -c opt --config=cuda //tensorflow/tools/pip_package:build_pip_package
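The #error line in the log above says that the CUDA headers reject gcc versions later than 5.3, which is a likely reason the build stops before the pip package script appears under bazel-bin. A minimal sketch for checking the host compiler against that limit follows. The helper name gcc_within_limit is hypothetical, and comparing only the major and minor numbers is an assumption about how the limit is applied.

```shell
# Succeeds (exit 0) when gcc version $1 is not newer than major $2, minor $3.
# Only major.minor are compared (an assumption), so 5.3.1 still counts as 5.3.
gcc_within_limit() {
    IFS=. read -r _maj _min _rest <<EOF
$1
EOF
    _min="${_min:-0}"   # "gcc -dumpversion" may print only the major number
    [ "$_maj" -lt "$2" ] || { [ "$_maj" -eq "$2" ] && [ "$_min" -le "$3" ]; }
}

# Limit taken from the error text in the build log: 5.3
if command -v gcc >/dev/null 2>&1; then
    host_gcc="$(gcc -dumpversion)"
    if gcc_within_limit "$host_gcc" 5 3; then
        echo "gcc $host_gcc is within the 5.3 limit"
    else
        echo "gcc $host_gcc is newer than 5.3; the CUDA headers will reject it"
    fi
fi
```

If the check reports a newer compiler, one option is to install an older gcc alongside it and point ./configure at that binary when it asks for the host compiler.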

Related

Bazel build error, to use graph transform to reduce the size of .pb model

I ran into a problem while using bazel build to build TensorFlow. After I entered this command:
bazel build tensorflow/tools/graph_transforms:transform_graph
the compilation started, and about 3 minutes later it failed with the output below (which also shows my cuDNN and CUDA version check).
Could anyone help me with this?
Works on my machine
Just did a checkout of TensorFlow:
git clone https://github.com/tensorflow/tensorflow.git
Then build the target (using bazelisk):
bazelisk build tensorflow/tools/graph_transforms:transform_graph
The build of the target succeeds:
INFO: Elapsed time: 2111.652s, Critical Path: 293.22s
INFO: 5811 processes: 5811 local.
INFO: Build completed successfully, 5969 total actions
CUDA version check:
cat /usr/local/cuda/version.txt gives me CUDA Version 10.0.130
bazelisk version gives me 1.1.0
lsb_release -a
results in:
No LSB modules are available.
Distributor ID: Ubuntu
Description: Ubuntu 18.04.3 LTS
Release: 18.04
Codename: bionic
Your machine
There is an error reported:
unsupported gpu architecture 'compute_36'
It seems that nvcc cannot compile your CUDA code because the specified GPU architecture is not supported.
Bazel also advises you to use the --verbose_failures flag to get more details about the error.
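The architecture name in the error can be traced back to a compute capability value. As a rough illustration, assuming the build turns each configured capability such as 3.5 into an nvcc architecture name such as compute_35, a configured value of 3.6 would yield the unsupported compute_36. The helper name caps_to_arch is hypothetical.

```shell
# Convert a comma-separated capability list such as "3.5,6.1"
# into architecture names of the form compute_XX, one per line.
caps_to_arch() {
    echo "$1" | tr ',' '\n' | sed 's/\.//; s/^/compute_/'
}

caps_to_arch "3.6"   # prints compute_36, the name nvcc rejects in the error
```

Comparing the result with the compute capability of the installed GPU shows whether the configured value needs to be changed by rerunning ./configure.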

Building from sources on debian 10 with cuda 9.1, cudnn 7.1

I am trying to build TensorFlow 1.9 on Debian 10 with CUDA 9.1.85 and cuDNN 7.1.4.18.
When using gcc-6 as the compiler and the build command
bazel build --verbose_failures --config=opt --config=cuda --cxxopt="-D_GLIBCXX_USE_CXX11_ABI=0" //tensorflow/tools/pip_package:build_pip_package
I get
INFO: From Compiling external/nccl_archive/src/libwrap.cu.cc:
/usr/lib/cuda/include/cuda_fp16.h(2958): error: identifier "__float2half_rn" is undefined
/usr/lib/cuda/include/cuda_fp16.h(3000): error: identifier "__float2half_rn" is undefined
2 errors detected in the compilation of "/tmp/tmpxft_000070b1_00000000-6_libwrap.cu.cpp1.ii".
This happened because the build was picking up CUDA headers left over from a previous CUDA 8.0 installation, in which these functions are not implemented. After cleaning those up, the build completed successfully.
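To spot leftover headers like the ones described above, it can help to list every copy of the affected header on the system. A minimal sketch, in which the helper name find_header_copies and the list of directories searched are assumptions that should be adjusted to the local layout:

```shell
# Print every copy of header $1 found under the remaining directory arguments.
# Missing directories are ignored.
find_header_copies() {
    _name="$1"
    shift
    find "$@" -name "$_name" 2>/dev/null || true
}

# More than one result suggests that two CUDA installations are mixed.
find_header_copies cuda_fp16.h /usr/include /usr/local /usr/lib/cuda
```

Any path that does not belong to the CUDA version being built against is a candidate for cleanup.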

Tensorflow Custom Compile on Windows

I've installed Bazel via Chocolatey, Python 3.5 and 2.7, CUDA v8, cuDNN v6, and JDK 8.0. I'm now trying to custom-build TensorFlow on my Windows 10 device with AVX, AVX2 and CUDA. The pre-built TensorFlow-GPU does work; I've already tested and run it successfully.
I've followed the instructions from other articles, both on TensorFlow's own site (trying to adapt some sections from the Linux/Mac installs) and on here. The furthest I've made it is cloning the GitHub repository via MSYS2 and running configure.py. When I then attempt to build via bazel build --config=opt --config=cuda //tensorflow/tools/pip_package:build_pip_package, I receive an error, the header of which is:
Error reading java.io.IOException: CreateProcess(): The system cannot find the file specified.
: C:/Program Files/NVIDIA GPU Computing Toolkit/CUDA/v8.0/include/cudnn.h
I've double checked, that file does exist, so I'm not sure why I'm getting this error.
EDIT: Also attempted to run via Powershell, reached the same point.
Any help would be much appreciated.
I had the exact same error while trying to build Tensorflow on Windows (using cuDNN 5.1). I fixed it by launching bazel from the msys2 terminal (instead of from the windows command prompt) and manually setting the BAZEL_SH environment variable before attempting to build.
export BAZEL_SH=c:/tools/msys64/usr/bin/bash.exe
bazel build -c opt --config=win-cuda tensorflow/cc:cc_ops
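Since the fix depends on BAZEL_SH pointing at a real bash executable, a quick check before starting a long build can save time. A minimal sketch to run in the msys2 terminal; the helper name check_bazel_sh is hypothetical:

```shell
# Report whether BAZEL_SH is set and points at an existing file.
check_bazel_sh() {
    if [ -z "${BAZEL_SH:-}" ]; then
        echo "BAZEL_SH is not set"
        return 1
    fi
    if [ ! -f "$BAZEL_SH" ]; then
        echo "BAZEL_SH points to a missing file: $BAZEL_SH"
        return 1
    fi
    echo "BAZEL_SH=$BAZEL_SH"
}

check_bazel_sh || echo "Set BAZEL_SH before running bazel build"
```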
The following steps helped me to compile Tensorflow on Windows 10.
pacman -Syuu patch
ln -s "c:\python27\python.exe" /usr/bin/python
export BAZEL_SH=c:/tools/msys64/usr/bin/bash.exe
"C:\Documents and Settings\All Users\chocolatey\bin\bazel.exe" build --config=opt --config=win-cuda //tensorflow/tools/pip_package:build_pip_package
But after 1 hour of compilation I got another error:
C:\tools\msys64\tmp_bazel_dmitry\x1e5egqw\execroot\org_tensorflow\external\protobuf_archive\python\google\protobuf\internal\api_implementation.cc: fatal error C1083: Cannot open compiler generated file: '': Invalid argument
Target //tensorflow/tools/pip_package:build_pip_package failed to build

TensorFlow: Extension file not found: 'google/protobuf/protobuf.bzl'

I am following this tutorial to install GPU-enabled TensorFlow that is compatible with CUDA Compute Capability 3.0.
I installed Java-JDK8, Bazel 0.1.0, TensorFlow 0.6.0, and changed the configurations to run on CUDA Compute Capability 3.0. Everything is good so far.
But when I enter this command:
$HOME/bin/bazel build -c opt --config=cuda //tensorflow/cc:tutorials_example_trainer
I see this output:
Extracting Bazel installation...
.....
ERROR: /home/me/tensorflow/tensorflow/core/BUILD:1: Extension file not found: 'google/protobuf/protobuf.bzl'.
ERROR: /home/me/tensorflow/tensorflow/cc/BUILD:65:1: error loading package 'tensorflow/core': Extension file not found: 'google/protobuf/protobuf.bzl' and referenced by '//tensorflow/cc:tutorials_example_trainer'.
ERROR: Loading failed; build aborted.
INFO: Elapsed time: 1.006s
Any advice?
The problem is fixed by running this command:
$ git clone -b 0.6.0 --recurse-submodules https://github.com/tensorflow/tensorflow.git
The error message I received is documented here. Pulling all submodules fixed the problem.
I've had issues with the command above, -recurse-submodules does not exist
Try this:
$ git clone --recursive git@github.com:tensorflow/tensorflow.git
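For a repository that was already cloned without its submodules, cloning again is not the only option. A minimal sketch, assuming the missing file is expected at google/protobuf/protobuf.bzl relative to the repository root (as the error message suggests) and that the clone lives in the hypothetical location $HOME/tensorflow:

```shell
# Succeeds when the protobuf extension file is present under repo root $1.
has_protobuf_bzl() {
    [ -f "$1/google/protobuf/protobuf.bzl" ]
}

repo="${TF_REPO:-$HOME/tensorflow}"   # TF_REPO is a hypothetical override
if [ -d "$repo/.git" ] && ! has_protobuf_bzl "$repo"; then
    echo "protobuf.bzl is missing; try:"
    echo "  git -C $repo submodule update --init --recursive"
fi
```

The sketch only prints the suggested git command instead of running it, so it is safe to try on any checkout.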

How do I check Bazel version?

I am trying to find out which version of Bazel I currently have on my computer to install TensorFlow from source (it requires version 0.1.4)
evan@evan-box:~/Apps/tensorflow$ bazel --version
Unknown Bazel startup option: '--version'.
For more info, run 'blaze help startup_options'.
evan@evan-box:~/Apps/tensorflow$ bazel version
Build label: head (@125b349)
Build target: bazel-out/local_linux-fastbuild/bin/src/main/java/bazel-main_deploy.jar
Build time: Fri Nov 13 01:23:30 2015 (1447377810)
Build timestamp: 1447377810
Build timestamp as int: 1447377810
So where is the version actually?
See the Bazel user manual.
From the command line:
$ bazel version
Build label: 0.1.1
Since Bazel 0.27.0, bazel --version is also available; it does not have the side effect of starting the server daemon.
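When only the version string is needed, for example in a setup script, the Build label line can be extracted from the bazel version output. A minimal sketch; the helper name build_label is hypothetical:

```shell
# Read "bazel version" output on stdin and print only the Build label value.
build_label() {
    sed -n 's/^Build label: //p'
}

if command -v bazel >/dev/null 2>&1; then
    bazel version 2>/dev/null | build_label
fi
```

For a build made from an unreleased checkout, the label is whatever that build reports (such as the head label shown in the question) rather than a release number.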