TensorFlow installation from source: issue at the "Build your target with GPU support" step - tensorflow

This is my first time installing TensorFlow. I followed the instructions on the official website here. I installed from source because some people advise doing that instead of a pip installation. I used version r0.9 of TensorFlow on Ubuntu MATE 15.10. I followed all the instructions and installed Python, CUDA 7.5, and cuDNN release 5. The GPU is an Nvidia 5200 with compute capability 2.1. Everything went well until I reached the bazel build step here:
Latitude-E6430:~/tensorflow$ bazel build -c opt --config=cuda //tensorflow/cc:tutorials_example_trainer
"Latitude-E6430" is the user of the machine.
I got the following errors immediately:
ERROR: /home/meqdad/tensorflow/tensorflow/core/BUILD:91:1: //tensorflow/core:protos_all_py: no such attribute 'imports' in 'py_library' rule.
ERROR: /home/meqdad/tensorflow/tensorflow/cc/BUILD:28:1: Target '//tensorflow/core:sparse_ops_op_lib' contains an error and its package is in error and referenced by '//tensorflow/cc:ops/sparse_ops_gen_cc'.
ERROR: Loading failed; build aborted.
I tried to find a solution to this problem. I was using bazel 0.1, then I upgraded to bazel 0.3. There are several issues like this one on the net, but none of them was related to this one. I re-installed py_library, but that did nothing to solve the problem.
Please can you advise me how to move on with this step of the installation?

Related

Building tensorflow from source - missing input file

newbie here, first message :)
I need TensorFlow with CUDA, AVX and SSE on a Windows machine. As far as I understand, I need to build it myself. I first tried with Anaconda, but it was a mess, so I uninstalled anything related to Python and started following the official guide step by step.
I used several commands to build, for instance:
bazel build -c opt --copt=-march=native --copt=-mfpmath=both --config=cuda -k //tensorflow/tools/pip_package:build_pip_package
bazel build -c opt --copt=-mavx --copt=-mavx2 --copt=-mfma --copt=-mfpmath=both --copt=-msse4.2 --config=cuda -k //tensorflow/tools/pip_package:build_pip_package
bazel build --config=opt --config=cuda --define=no_tensorflow_py_deps=true //tensorflow/tools/pip_package:build_pip_package
The first two commands came from here and might be old.
The build always fails with this message:
ERROR: missing input file 'external/llvm-project/mlir/include/mlir/Interfaces/SideEffectInterfaces.h', owner: '@llvm-project//mlir:include/mlir/Interfaces/SideEffectInterfaces.h'
Does anybody understand what is going on here?
Also, what is the best command to build among the ones I used?
Is there any way to install it in Anaconda on Windows (with CUDA, AVX and SSE capabilities)?
Building TensorFlow on Windows can be tough; there are many ways for it to fail. The procedure varies with the version you are compiling. For the most part, the instructions on the TensorFlow site are correct if followed verbatim. I think the trickiest part is matching the versions of the build tools to the version of TensorFlow you are working with.
My suggestion would be to lock onto a particular version of TensorFlow and stick with it until you succeed. If you git clone the source from GitHub, I would suggest git checkout r2.2. This will put you on the most recent version at the time of writing.
I would avoid Anaconda, as it complicates matters with the Python version you are working with. I have had good results with Python 3.6.8, but it may be possible to use 3.7.
You will need a specific version of Bazel as well; 2.0.0 is what works with TensorFlow r2.2. Be mindful that you need to set the BAZEL_VC environment variable before you start compiling. It should look something like C:\Program Files (x86)\Microsoft Visual Studio\2019\Community\VC. You can run bazel clean --expunge after the variable is set to avoid some confusion.
TensorFlow r2.2 also requires MSVC 2019; it will not compile with other versions. You will need the build tools for this version as well.
The last bazel build command you showed is the correct one. Don't forget to run python ./configure.py before starting a fresh compile.
If my guess is correct, the error message you are getting comes from using an older version of MSVC with the later TensorFlow source code, but that's just a guess.
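The environment requirements listed in this answer can be checked before starting a long compile. The following is only a minimal Python sketch, not part of the TensorFlow tooling: the preflight helper and its messages are hypothetical, the pinned values simply restate the versions quoted above, and the check on the BAZEL_VC path is a rough heuristic.

```python
import os

# Versions quoted in the answer above for TensorFlow r2.2 (not an official matrix).
PINNED = {"bazel": "2.0.0", "msvc": "2019"}


def preflight(env, bazel_version):
    """Return a list of human-readable problems; an empty list means OK."""
    problems = []
    bazel_vc = env.get("BAZEL_VC", "")
    if not bazel_vc:
        problems.append("BAZEL_VC is not set")
    elif PINNED["msvc"] not in bazel_vc:
        # Heuristic only: the default install path contains the MSVC year.
        problems.append("BAZEL_VC does not look like an MSVC 2019 path: " + bazel_vc)
    if bazel_version != PINNED["bazel"]:
        problems.append("expected Bazel %s, found %s" % (PINNED["bazel"], bazel_version))
    return problems


if __name__ == "__main__":
    # The Bazel version is passed in by hand here (assumption: you looked it up yourself).
    for line in preflight(dict(os.environ), "2.0.0") or ["looks fine"]:
        print(line)
```

An empty result does not guarantee a successful build; it only rules out the two mismatches this answer warns about.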

Error while building TensorFlow from source

I am trying to build CPU-only TensorFlow r1.11 from source on Debian, following this tutorial: https://www.tensorflow.org/install/source
I installed bazel as indicated using this tutorial https://docs.bazel.build/versions/master/install-ubuntu.html from the binary installer as recommended.
Then I followed each step and everything worked fine until this command:
bazel test -c opt -- //tensorflow/... -//tensorflow/compiler/... -//tensorflow/contrib/lite/...
It shows me this error:
ERROR: error loading package '': Encountered error while reading extension file 'closure/defs.bzl': no such package '@io_bazel_rules_closure//closure': The native http_archive rule is deprecated. load("@bazel_tools//tools/build_defs/repo:http.bzl", "http_archive") for a drop-in replacement.
Use --incompatible_remove_native_http_archive=false to temporarily continue using the native rule.
ERROR: error loading package '': Encountered error while reading extension file 'closure/defs.bzl': no such package '@io_bazel_rules_closure//closure': The native http_archive rule is deprecated. load("@bazel_tools//tools/build_defs/repo:http.bzl", "http_archive") for a drop-in replacement.
Use --incompatible_remove_native_http_archive=false to temporarily continue using the native rule.
INFO: Elapsed time: 0.088s
INFO: 0 processes.
FAILED: Build did NOT complete successfully (0 packages loaded)
FAILED: Build did NOT complete successfully (0 packages loaded)
I read online that this is likely related to bazel, so I tried reinstalling bazel via the "Using Bazel custom APT repository" method, but I got the same error.
As of version 1.12.0, TensorFlow uses some deprecated Bazel features that are being completely dropped in recent versions of Bazel. Instead of using the most recent version, try an older one for now. I was able to build TensorFlow 1.12.0 on Windows using Bazel 0.18.1; that should most likely work with TensorFlow 1.11 too.
I agree with @jdehesa. I was also struggling to build TensorFlow from scratch. I tried different Bazel versions (0.26, 0.21, 0.19.1), and it finally worked with 0.18.1. So it is a Bazel issue rather than a TensorFlow one. TF_version=1.12.0
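Because the fix in both answers is to pin Bazel to a version reported to work, it can be handy to compare the installed version against that pin. The sketch below is an illustration in Python, under the assumption that the text you feed it contains a "Build label: X.Y.Z" line, as printed by bazel version. The helper names and the sample string are hypothetical.

```python
import re


def parse_build_label(version_output):
    """Extract (major, minor, patch) from text containing 'Build label: X.Y.Z'."""
    match = re.search(r"Build label:\s*(\d+)\.(\d+)(?:\.(\d+))?", version_output)
    if match is None:
        raise ValueError("no 'Build label' line found")
    major, minor, patch = match.groups()
    # A missing patch component (e.g. "0.26") is treated as 0.
    return int(major), int(minor), int(patch or 0)


def is_known_good(version_output, known_good=(0, 18, 1)):
    """True if the installed Bazel equals the version reported to work above."""
    return parse_build_label(version_output) == known_good


# Illustrative input only, not captured from a real run.
sample = "Build label: 0.18.1\n"
print(is_known_good(sample))
```

The default of 0.18.1 comes from the answers above for TensorFlow 1.12.0; other TensorFlow releases would need a different pin.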

Getting ERROR: Config value cuda is not defined in any .rc file when trying to train MobileNet in TensorFlow

I'm trying to run MobileNet_v1 on ImageNet. For that, I'm using the official TensorFlow Models repository and following its guide.
However, when I tried to start the training for MobileNet_v1 by first running:
models/research/slim$ bazel build -c opt --config=cuda mobilenet_v1_{eval,train}
I got:
Starting local Bazel server and connecting to it...
INFO: Options provided by the client:
Inherited 'common' options: --isatty=1 --terminal_columns=146
ERROR: Config value cuda is not defined in any .rc file
All previous steps have been successful and everything seems fine except this one.
I'm using:
Ubuntu 16.04
TF version: v1.7.0 (and 1.10.1)
Cuda v9.0
cuDNN v7.0 (and 7.1.3)
bazel release 0.16.1
According to the Bazel releases page, as of version 0.16.0:
--[no]allow_undefined_configs no longer exists, passing undefined configs is an error.
Using an older version of Bazel should solve your problem for now.
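Since newer Bazel rejects any --config name that no rc file defines, it can help to list which config names an rc file actually provides. What follows is a rough Python sketch, assuming the usual rc line shape of command:config followed by flags (for example build:cuda). The defined_configs helper and the sample rc text are hypothetical, and a real .bazelrc will differ.

```python
def defined_configs(bazelrc_text):
    """Collect config names from lines such as 'build:cuda --some_flag'."""
    configs = set()
    for raw in bazelrc_text.splitlines():
        line = raw.split("#", 1)[0].strip()  # drop comments and whitespace
        if not line:
            continue
        command = line.split()[0]
        if ":" in command:
            configs.add(command.split(":", 1)[1])
    return configs


# Hypothetical rc contents, for illustration only.
example_rc = """
# flags applied only with --config=cuda
build:cuda --define=using_cuda=true
build:opt --copt=-O2
build --verbose_failures
"""
print(sorted(defined_configs(example_rc)))
```

If cuda is missing from the result for every rc file Bazel reads in the directory you build from, --config=cuda will fail with the error above on Bazel 0.16.0 and later.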

Tensorflow Custom Compile on Windows

So, I've installed Bazel via Chocolatey, Python 3.5 and 2.7, CUDA v8, cuDNN v6, and JDK 8.0. I'm now trying to custom-build TensorFlow on my Windows 10 device with AVX, AVX2 and CUDA. The pre-built TensorFlow-GPU does work; I've already tested and run it successfully.
I've followed the instructions of other articles, both on TensorFlow's actual site (trying to adapt some sections from the Linux/Mac installs) and on here. The furthest I've made it is cloning the GitHub repository via Msys2 and running configure.py. When I then attempt to build via bazel build --config=opt --config=cuda //tensorflow/tools/pip_package:build_pip_package, I receive an error, the header of which is:
Error reading java.io.IOException: CreateProcess(): The system cannot find the file specified.
: C:/Program Files/NVIDIA GPU Computing Toolkit/CUDA/v8.0/include/cudnn.h
I've double-checked that the file does exist, so I'm not sure why I'm getting this error.
EDIT: Also attempted to run via Powershell, reached the same point.
Any help would be much appreciated.
I had the exact same error while trying to build Tensorflow on Windows (using cuDNN 5.1). I fixed it by launching bazel from the msys2 terminal (instead of from the windows command prompt) and manually setting the BAZEL_SH environment variable before attempting to build.
export BAZEL_SH=c:/tools/msys64/usr/bin/bash.exe
bazel build -c opt --config=win-cuda tensorflow/cc:cc_ops
The following steps helped me to compile Tensorflow on Windows 10.
pacman -Syuu patch
ln -s "c:\python27\python.exe" /usr/bin/python
export BAZEL_SH=c:/tools/msys64/usr/bin/bash.exe
"C:\Documents and Settings\All Users\chocolatey\bin\bazel.exe" build --config=opt --config=win-cuda //tensorflow/tools/pip_package:build_pip_package
But after 1 hour of compilation I got another error:
C:\tools\msys64\tmp_bazel_dmitry\x1e5egqw\execroot\org_tensorflow\external\protobuf_archive\python\google\protobuf\internal\api_implementation.cc : fatal error C1083: Cannot open compiler generated file: '': Invalid argument
Target //tensorflow/tools/pip_package:build_pip_package failed to build

Tensorflow with gpu support installation error - the specified --crosstool_top is not a valid cc_toolchain_suite rule

I've been trying to install tensorflow with GPU support using these steps:
http://www.nvidia.com/object/gpu-accelerated-applications-tensorflow-installation.html
and also using:
http://thelazylog.com/install-tensorflow-with-gpu-support-on-sandbox-redhat/
This is the error message that I'm getting when I try to run the bazel build command for building the TensorFlow pip package (with the --config=cuda flag set):
The specified --crosstool_top '//third_party/gpus/crosstool:crosstool' is not a valid cc_toolchain_suite rule.
What's strange is that if I remove the --config=cuda flag, I don't get the error message while building and I'm able to install TensorFlow successfully, but without GPU support.
I experienced the same issue, using the nvidia instructions. What I did was to drop the git reset line in the instructions, and it works.
Details (from the error message):
Close, reopen terminal
Run git clone (again), and cd tensorflow
Run ./configure
Bazel build, etc
This may be unrelated, but I also had an issue with the .whl line: the error message said the wheel could not be found, or something along those lines. This is in the "And finally install the TensorFlow pip package" section. To resolve it in my case, I typed the path in the terminal up to "..._pkg/tensorflow" and then pressed Tab for auto-completion. The file name that popped up was significantly longer than the one in the guide, but it worked. Also, if anyone gets a "numpy not installed" message when following the nvidia instructions, replace python-pip and dev with python-numpy and run that line again to install it.
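The tab-completion trick works because the real wheel file name carries version and platform tags that a guide cannot know in advance. As a small illustration, the same lookup can be scripted in Python. The find_wheels helper is hypothetical, and the directory you pass is whatever output directory you gave to build_pip_package.

```python
import glob
import os


def find_wheels(pkg_dir):
    """Return full paths of TensorFlow wheels in pkg_dir, sorted by name."""
    pattern = os.path.join(pkg_dir, "tensorflow*.whl")
    return sorted(glob.glob(pattern))


if __name__ == "__main__":
    import sys
    # Pass the package output directory as the first argument (defaults to ".").
    for path in find_wheels(sys.argv[1] if len(sys.argv) > 1 else "."):
        print(path)
```

The printed path can then be handed to pip install in place of the shorter file name from the guide.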
Configuration: Fresh Ubuntu 16.04, GTX970M, running driver 367.48 (from CUDA installation), CUDA 8.0, CuDNN 5.1
Full setup path:
Fresh Ubuntu, with downloads and 3rd party apps selected during installation.
Control panel => Software and updates => Other Software => Canonical ticked
Install CUDA using nvidia instructions in CUDA documentation, .deb format
CuDNN 5.1 installed, the rest from the nvidia link.
I hope everything works out for you!
(I'm sorry for the poor formatting)
I was going through the same problem and recently found the solution. The problem is with the installation of Bazel, which leads to this kind of error.
After installing Bazel from the installer, make sure that you add the correct path to ~/.bashrc and then activate it using
source "path-to-your-bin-directory-for-bazel"
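To confirm that the PATH change took effect in the current shell, one quick check is to ask Python where it finds bazel. This is a minimal sketch using the standard library's shutil.which; the locate_bazel wrapper is a hypothetical name.

```python
import shutil


def locate_bazel(path_value=None):
    """Return the path of the first `bazel` executable on the search path, or None."""
    # path_value=None makes shutil.which fall back to the PATH environment variable.
    return shutil.which("bazel", path=path_value)


if __name__ == "__main__":
    print(locate_bazel() or "bazel not found on PATH; check ~/.bashrc")
```

If this reports that bazel is not found, the directory added to ~/.bashrc is probably wrong or the file has not been sourced yet.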
Please change the git source version slightly, as shown below:
$ git clone https://github.com/tensorflow/tensorflow
$ cd tensorflow
# $ git reset --hard 70de76e
$ git reset --hard 287db3a
Please also refer to the link below:
https://github.com/tensorflow/tensorflow/issues/4944
Also, zlib has been updated since this TF build. You need to check http://www.zlib.net/ to get the latest version and SHA-256, then update tensorflow/workspace.bzl with that information (lines 254-266 in this build). At this time, the correct version info would include the following:
url = "http://zlib.net/zlib-1.2.11.tar.gz",
sha256 = "c3e5e9fdd5004dcb542feda5ee4f0ff0744628baf8ed2dd5d66f8ca1197cb1a1",
strip_prefix = "zlib-1.2.11",
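Before pasting a new sha256 value into tensorflow/workspace.bzl, it is worth checking it against the archive you actually downloaded. Below is a minimal Python sketch using the standard hashlib module. The helper names are hypothetical, and the file path is whatever local copy of the zlib tarball you have.

```python
import hashlib


def sha256_of_file(path, chunk_size=1 << 20):
    """Hex SHA-256 digest of a file, read in chunks to bound memory use."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches(path, expected_hex):
    """Compare a file's digest with an expected value, ignoring case and padding."""
    return sha256_of_file(path) == expected_hex.strip().lower()
```

For example, matches("zlib-1.2.11.tar.gz", "c3e5e9fdd5004dcb542feda5ee4f0ff0744628baf8ed2dd5d66f8ca1197cb1a1") should return True for an intact copy of the archive referenced above, assuming the file sits in the current directory.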