tensorflow build fails with "missing dependency" error

I'm completely new to bazel and tensorflow so the solution to this may be obvious to someone with some experience. My bazel build of tensorflow fails with a "missing dependency" error message. Here is the relevant sequence of build commands and output:
(tf-gpu)kss@linux-9c32:~/projects> git clone --recurse-submodules https://github.com/tensorflow/tensorflow tensorflow-nogpu
Cloning into 'tensorflow-nogpu'...
remote: Counting objects: 16735, done.
remote: Compressing objects: 100% (152/152), done.
remote: Total 16735 (delta 73), reused 0 (delta 0), pack-reused 16583
Receiving objects: 100% (16735/16735), 25.25 MiB | 911.00 KiB/s, done.
Resolving deltas: 100% (10889/10889), done.
Checking connectivity... done.
Submodule 'google/protobuf' (https://github.com/google/protobuf.git) registered for path 'google/protobuf'
Cloning into 'google/protobuf'...
remote: Counting objects: 30266, done.
remote: Compressing objects: 100% (113/113), done.
remote: Total 30266 (delta 57), reused 0 (delta 0), pack-reused 30151
Receiving objects: 100% (30266/30266), 28.90 MiB | 1.98 MiB/s, done.
Resolving deltas: 100% (20225/20225), done.
Checking connectivity... done.
Submodule path 'google/protobuf': checked out '0906f5d18a2548024b511eadcbb4cfc0ca56cd67'
(tf-gpu)kss@linux-9c32:~/projects> cd tensorflow-nogpu/
(tf-gpu)kss@linux-9c32:~/projects/tensorflow-nogpu> ./configure
Please specify the location of python. [Default is /home/kss/.venv/tf-gpu/bin/python]:
Do you wish to build TensorFlow with GPU support? [y/N]
No GPU support will be enabled for TensorFlow
Configuration finished
(tf-gpu)kss@linux-9c32:~/projects/tensorflow-nogpu> bazel build -c opt //tensorflow/tools/pip_package:build_pip_package
Sending SIGTERM to previous Bazel server (pid=8491)... done.
....
INFO: Found 1 target...
ERROR: /home/kss/.cache/bazel/_bazel_kss/b97e0e942a10977a6b42467ea6712cbf/external/re2/BUILD:9:1: undeclared inclusion(s) in rule '@re2//:re2':
this rule is missing dependency declarations for the following files included by 'external/re2/re2/perl_groups.cc':
'/usr/lib64/gcc/x86_64-suse-linux/4.8/include/stddef.h'
'/usr/lib64/gcc/x86_64-suse-linux/4.8/include/stdarg.h'
'/usr/lib64/gcc/x86_64-suse-linux/4.8/include/stdint.h'
'/usr/lib64/gcc/x86_64-suse-linux/4.8/include/x86intrin.h'
'/usr/lib64/gcc/x86_64-suse-linux/4.8/include/ia32intrin.h'
'/usr/lib64/gcc/x86_64-suse-linux/4.8/include/mmintrin.h'
'/usr/lib64/gcc/x86_64-suse-linux/4.8/include/xmmintrin.h'
'/usr/lib64/gcc/x86_64-suse-linux/4.8/include/mm_malloc.h'
'/usr/lib64/gcc/x86_64-suse-linux/4.8/include/emmintrin.h'
'/usr/lib64/gcc/x86_64-suse-linux/4.8/include/immintrin.h'
'/usr/lib64/gcc/x86_64-suse-linux/4.8/include/fxsrintrin.h'
'/usr/lib64/gcc/x86_64-suse-linux/4.8/include/adxintrin.h'.
Target //tensorflow/tools/pip_package:build_pip_package failed to build
Use --verbose_failures to see the command lines of failed build steps.
INFO: Elapsed time: 144.661s, Critical Path: 1.18s
(tf-gpu)kss@linux-9c32:~/projects/tensorflow-nogpu>
The version of bazel I'm using is release 0.1.4, and I'm running on openSUSE 13.2. I confirmed that the header files do exist, which is probably to be expected:
(tf-gpu)kss@linux-9c32:~/projects/tensorflow-nogpu> ll /usr/lib64/gcc/x86_64-suse-linux/4.8/include/stddef.h
-rw-r--r-- 1 root root 13619 Oct 6 2014 /usr/lib64/gcc/x86_64-suse-linux/4.8/include/stddef.h
Note for anyone who finds this question:
Use Damien's answer below, except that you have to use --crosstool_top rather than --crosstool. Also, if you are building for GPU acceleration, you will need to modify the CROSSTOOL file in the tensorflow repo like this:
(tf-gpu)kss@linux-9c32:~/projects/tensorflow-gpu> git diff third_party/gpus/crosstool/CROSSTOOL | cat
diff --git a/third_party/gpus/crosstool/CROSSTOOL b/third_party/gpus/crosstool/CROSSTOOL
index dfde7cd..b63f950 100644
--- a/third_party/gpus/crosstool/CROSSTOOL
+++ b/third_party/gpus/crosstool/CROSSTOOL
@@ -56,6 +56,7 @@ toolchain {
cxx_builtin_include_directory: "/usr/lib/gcc/"
cxx_builtin_include_directory: "/usr/local/include"
cxx_builtin_include_directory: "/usr/include"
+ cxx_builtin_include_directory: "/usr/lib64/gcc"
tool_path { name: "gcov" path: "/usr/bin/gcov" }
# C(++) compiles invoke the compiler (as that is the one knowing where

You should tweak the C++ toolchain configuration.
To do so, here's the best way to proceed:
Edit the file tools/cpp/CROSSTOOL (https://github.com/bazelbuild/bazel/blob/master/tools/cpp/CROSSTOOL) in your package path directory (it should be under ~/.bazel/base_workspace; you can find it with bazel info package_path) and add a line cxx_builtin_include_directory: /usr/lib64/gcc around line 100 (see https://github.com/bazelbuild/bazel/blob/master/tools/cpp/CROSSTOOL#L101).
Then run echo "build --crosstool=//tools/cpp:toolchain" >> ~/.bazelrc and retry the build.
Sorry for the mess, we are working on making C++ toolchain work better out of the box.
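Put together, the sequence looks roughly like this (the same steps spelled out; the exact line number in CROSSTOOL is approximate, and as noted in the question above, newer Bazel versions want --crosstool_top instead of --crosstool):
$ bazel info package_path
# edit tools/cpp/CROSSTOOL under that path and add, next to the other
# cxx_builtin_include_directory entries (around line 100):
#   cxx_builtin_include_directory: /usr/lib64/gcc
$ echo "build --crosstool=//tools/cpp:toolchain" >> ~/.bazelrc
$ bazel build -c opt //tensorflow/tools/pip_package:build_pip_package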

Bazel complains about system header files because the compiler uses the -MD flag (as opposed to -MMD) when generating dependencies. While using -MD is reasonable for an environment that changes often, listing dependencies on system header files causes the 'missing dependency declarations' errors.
What helped me was converting the '-MD' flag into '-MMD' in the compiler wrapper files: in third_party/gpus/crosstool/clang/bin/crosstool_wrapper_driver_is_not_gcc.tpl, just before 'subprocess.call([CPU_COMPILER]...)':
cpu_compiler_flags = ['-MMD' if flag == '-MD' else flag for flag in cpu_compiler_flags]
and in third_party/sycl/crosstool/computecpp.tpl, in a similar place:
computecpp_device_compiler_flags = ['-MMD' if flag == '-MD' else flag for flag in computecpp_device_compiler_flags]
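For orientation, here is a minimal, illustrative sketch of the CPU-compiler path of such a wrapper with the substitution in place (the real .tpl files differ in detail, and CPU_COMPILER is a placeholder that the template normally fills in):
#!/usr/bin/env python
import subprocess
import sys

# Placeholder: in the real wrapper this value is substituted in by the crosstool template.
CPU_COMPILER = '/usr/bin/gcc'

def main():
    # Flags Bazel passed to the wrapper, forwarded to the host compiler.
    cpu_compiler_flags = sys.argv[1:]
    # Rewrite -MD to -MMD so the generated dependency files omit system headers,
    # which is what silences the 'missing dependency declarations' errors.
    cpu_compiler_flags = ['-MMD' if flag == '-MD' else flag for flag in cpu_compiler_flags]
    return subprocess.call([CPU_COMPILER] + cpu_compiler_flags)

if __name__ == '__main__':
    sys.exit(main())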

Related

Tensorflow bazel quantization build error

I am trying to build the tensorflow tools package with bazel 0.18.0.
The following steps are OK:
git clone https://github.com/tensorflow/tensorflow
bazel build --config=cuda //tensorflow/tools/pip_package:build_pip_package
bazel build --config=cuda //tensorflow/examples/label_image:label_image
until I try to run this command line:
bazel build --config=cuda //tensorflow/contrib/quantization:quantize_graph
It shows an error, so should I give something other than quantize_graph? And what can I use, or where can I find it?
root@24663fb1018d:/srv/wu/tensorflow-src/tensorflow# bazel build --config=cuda //tensorflow/contrib/quantization:quantize_graph
WARNING: Duplicate rc file: /srv/wu/tensorflow-src/tensorflow/tools/bazel.rc is read multiple times, most recently imported from /srv/wu/tensorflow-src/tensorflow/.bazelrc
WARNING: Processed legacy workspace file /srv/wu/tensorflow-src/tensorflow/tools/bazel.rc. This file will not be processed in the next release of Bazel. Please read https://github.com/bazelbuild/bazel/issues/6319 for further information, including how to upgrade.
Starting local Bazel server and connecting to it...
WARNING: The following configs were expanded more than once: [cuda]. For repeatable flags, repeats are counted twice and may lead to unexpected behavior.
ERROR: Skipping '//tensorflow/contrib/quantization:quantize_graph': no such target '//tensorflow/contrib/quantization:quantize_graph': target 'quantize_graph' not declared in package 'tensorflow/contrib/quantization' defined by /srv/wu/tensorflow-src/tensorflow/tensorflow/contrib/quantization/BUILD
WARNING: Target pattern parsing failed.
ERROR: no such target '//tensorflow/contrib/quantization:quantize_graph': target 'quantize_graph' not declared in package 'tensorflow/contrib/quantization' defined by /srv/wu/tensorflow-src/tensorflow/tensorflow/contrib/quantization/BUILD
INFO: Elapsed time: 1.195s
INFO: 0 processes.
FAILED: Build did NOT complete successfully (1 packages loaded)
And then I tried with the tools path, with no luck:
bazel build --config=cuda //tensorflow/tools/quantization:quantize_graph
WARNING: Duplicate rc file: /srv/wu/tensorflow-src/tensorflow/tools/bazel.rc is read multiple times, most recently imported from /srv/wu/tensorflow-src/tensorflow/.bazelrc
WARNING: Processed legacy workspace file /srv/wu/tensorflow-src/tensorflow/tools/bazel.rc. This file will not be processed in the next release of Bazel. Please read https://github.com/bazelbuild/bazel/issues/6319 for further information, including how to upgrade.
WARNING: The following configs were expanded more than once: [cuda]. For repeatable flags, repeats are counted twice and may lead to unexpected behavior.
ERROR: Skipping '//tensorflow/tools/quantization:quantize_graph': no such package 'tensorflow/tools/quantization': BUILD file not found on package path
WARNING: Target pattern parsing failed.
ERROR: no such package 'tensorflow/tools/quantization': BUILD file not found on package path
INFO: Elapsed time: 0.506s
INFO: 0 processes.
FAILED: Build did NOT complete successfully (0 packages loaded)
Also, toco is not working:
bazel build --config=cuda tensorflow/contrib/lite/toco:toco
WARNING: Duplicate rc file: /srv/wu/tensorflow-src/tensorflow/tools/bazel.rc is read multiple times, most recently imported from /srv/wu/tensorflow-src/tensorflow/.bazelrc
WARNING: Processed legacy workspace file /srv/wu/tensorflow-src/tensorflow/tools/bazel.rc. This file will not be processed in the next release of Bazel. Please read https://github.com/bazelbuild/bazel/issues/6319 for further information, including how to upgrade.
WARNING: The following configs were expanded more than once: [cuda]. For repeatable flags, repeats are counted twice and may lead to unexpected behavior.
ERROR: Skipping 'tensorflow/contrib/lite/toco:toco': no such package 'tensorflow/contrib/lite/toco': BUILD file not found on package path
WARNING: Target pattern parsing failed.
ERROR: no such package 'tensorflow/contrib/lite/toco': BUILD file not found on package path
INFO: Elapsed time: 0.500s
INFO: 0 processes.
FAILED: Build did NOT complete successfully (0 packages loaded)
To verify where targets are, look into the BUILD file in the package directory.
The quantize_graph target has been moved to the //tensorflow/contrib/quantize package. This should work:
$ bazel build --config=cuda //tensorflow/contrib/quantize:quantize_graph
The toco target has been moved from //tensorflow/contrib/lite/toco to //tensorflow/lite/toco. Like quantize_graph, this should work:
$ bazel build --config=cuda //tensorflow/lite/toco:toco
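More generally, if a target seems to have moved, you can ask Bazel what a package actually declares instead of guessing (a generic query, not specific to these targets):
$ bazel query //tensorflow/contrib/quantize:all
$ bazel query //tensorflow/lite/toco:all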

error bazel build in tensorflow

At first, I wanted to use bazel to help me run tensorflow with SSE and AVX, so I tried this within the workspace:
bazel build -c opt --copt=-mavx --copt=-mavx2 --copt=-mfma --copt=-mfpmath=both --copt=-msse4.2 --config=cuda -k //tensorflow/tools/pip_package:build_pip_package
But it gives me a new error like the following. I wonder what is wrong and what I should do. Thanks for the help.
WARNING: Config values are not defined in any .rc file: cuda
ERROR: Skipping '//tensorflow/tools/pip_package:build_pip_package': no such package 'tensorflow/tools/pip_package': BUILD file not found on package path
WARNING: Target pattern parsing failed.
INFO: Analysed 0 targets (2 packages loaded).
INFO: Found 0 targets...
ERROR: command succeeded, but there were errors parsing the target pattern
INFO: Elapsed time: 2.727s, Critical Path: 0.02s
FAILED: Build did NOT complete successfully
You probably have an outdated bazel. I am not sure but you can try to use --config=opt instead of -c opt for initial versions.
You have to run ./configure. That will create a .bazelrc and .tf_configure.bazel file in your Tensorflow workspace.
The --config=cuda Bazel flag refers to entries in those two files (they are both text files). The entries typically look like this: build:cuda --some_bazel_flag.
It was answered here
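For illustration only, the CUDA-related entries that ./configure writes typically look along these lines; the exact file name and flags depend on your TensorFlow version and the answers you gave to ./configure, so treat these as examples rather than expected contents:
build --action_env PYTHON_BIN_PATH="/usr/bin/python"
build:cuda --crosstool_top=@local_config_cuda//crosstool:toolchain
build:cuda --define=using_cuda=true --define=using_cuda_nvcc=true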

Compiling Tensorflow with Bazel

I tried compiling tensorflow 1.3 from the HEAD of the master branch using the following shell command after running ./configure:
sudo bazel build -c opt --copt=-mavx --copt=-mavx2 --copt=-mfma --copt=-mfpmath=both --copt=-msse4.1 --copt=-msse4.2 --config=cuda -k --verbose_failures //tensorflow/tools/pip_package:build_pip_package
I get the following error in the end.
At global scope:
cc1plus: warning: unrecognized command line option '-Wno-self-assign'
Target //tensorflow/tools/pip_package:build_pip_package failed to build
INFO: Elapsed time: 3834.785s, Critical Path: 196.95s
FAILED: Build did NOT complete successfully
These were the warnings it gave initially.
WARNING: /home/pranav/tensorflow_install/tensorflow/tensorflow/core/BUILD:1634:1: in includes attribute of cc_library rule //tensorflow/core:framework_headers_lib: '../../external/nsync/public' resolves to 'external/nsync/public' not below the relative path of its package 'tensorflow/core'. This will be an error in the future. Since this rule was created by the macro 'cc_header_only_library', the error might have been caused by the macro implementation in /home/pranav/tensorflow_install/tensorflow/tensorflow/tensorflow.bzl:911:30
WARNING: /home/pranav/tensorflow_install/tensorflow/tensorflow/contrib/learn/BUILD:15:1: in py_library rule //tensorflow/contrib/learn:learn: target '//tensorflow/contrib/learn:learn' depends on deprecated target '//tensorflow/contrib/session_bundle:exporter': No longer supported. Switch to SavedModel immediately.
WARNING: /home/pranav/tensorflow_install/tensorflow/tensorflow/contrib/learn/BUILD:15:1: in py_library rule //tensorflow/contrib/learn:learn: target '//tensorflow/contrib/learn:learn' depends on deprecated target '//tensorflow/contrib/session_bundle:gc': No longer supported. Switch to SavedModel immediately.
INFO: Analysed target //tensorflow/tools/pip_package:build_pip_package (208 packages loaded).
Then loads of INFO. I'm not sure if it is of any help.
Bazel Version:
Build label: 0.5.4
Build target: bazel-out/local-fastbuild/bin/src/main/java/com/google/devtools/build/lib/bazel/BazelServer_deploy.jar
Build time: Fri Aug 25 10:00:00 2017 (1503655200)
Build timestamp: 1503655200
Build timestamp as int: 1503655200
I read in some answer to run the following command:
$ bazel query --output=build 'somepath("//tensorflow/core:version_info_gen", "//tensorflow/tools/git:gen/spec.json")'
And it gave me this. Maybe it will be of help.
# /home/pranav/tensorflow_install/tensorflow/tensorflow/core/BUILD:1546:1
genrule(
name = "version_info_gen",
generator_name = "version_info_gen",
generator_function = "tf_version_info_genrule",
generator_location = "tensorflow/core/BUILD:1546",
srcs = ["//tensorflow/tools/git:gen/spec.json", "//tensorflow/tools/git:gen/head", "//tensorflow/tools/git:gen/branch_ref"],
tools = ["//tensorflow/tools/git:gen_git_source.py"],
outs = ["//tensorflow/core:util/version_info.cc"],
cmd = "$(location //tensorflow/tools/git:gen_git_source.py) --generate $(SRCS) \"$@\"",
local = True,
)
Also, "the bazel command I wrote" > log.txt doesn't fill the text file with the terminal output.
If you guys want more information to help me, suggest a way to copy the terminal output to a text file so that I can upload it to GitHub and give you the link.
I also used --explain to write all explanations to a file. I can upload that as well if you want.
I also tried --local_resources 2048,.5,1.0 to reduce my memory allocation in case of memory issues. It still doesn't work.
Thanks a lot in advance.

TensorFlow Serving Error

I have installed Tensorflow Serving as outlined on the install page at https://tensorflow.github.io/serving/setup. However, when I follow the build instructions on the page I get the following error:
$ bazel build tensorflow_serving/...
ERROR: /home/**PATH**/external/org_tensorflow/third_party/py/python_configure.bzl:183:20: unexpected keyword 'environ' in call to repository_rule(implementation: function, *, attrs: dict or NoneType = None, local: bool = False).
ERROR: com.google.devtools.build.lib.packages.BuildFileContainsErrorsException: error loading package '': Extension file 'third_party/py/python_configure.bzl' has errors.
INFO: Elapsed time: 0.623s
I am running on Ubuntu with a TensorFlow 1.0.1 build. I am using Python 2.7 and have set up a virtualenv.
I can successfully build the bazel hello example, and I am also able to complete the gRPC quick start found at http://www.grpc.io/docs/quickstart/python.html.
Any suggestions?
-Dave
The trouble was an old copy of bazel. To determine your version
$ bazel version
Build label: 0.4.5
Build target: bazel-out/local-fastbuild/bin/src/main/java/com/google/devtools/build/lib/bazel/BazelServer_deploy.jar
Build time: Thu Mar 16 12:19:38 2017 (1489666778)
Build timestamp: 1489666778
Build timestamp as int: 1489666778
In my case it required a manual removal of the old version
rm -fr ~/.bazel ~/.bazelrc
Next, I chose to install using the installer for Ubuntu.
$ ./bazel-0.4.5-installer-linux-x86_64.sh
Bazel installer
---------------
Bazel is bundled with software licensed under the GPLv2 with Classpath exception.
You can find the sources next to the installer on our release page:
https://github.com/bazelbuild/bazel/releases
# Release 0.4.5 (2017-03-16)
There was still another trick to getting it to work.
$ cd ..
$ bazel test tensorflow_serving/...
Python Configuration Error: 'PYTHON_BIN_PATH' environment variable is not set
This error is also related to versioning, but in this case it was an issue with serving. The solution was to revert to an earlier version and update the submodule from git (I had previously cloned the repository). From the serving directory:
$ git checkout 0.5.1
M tensorflow
M tf_models
Note: checking out '0.5.1'.
You are in 'detached HEAD' state. You can look around, make experimental
changes and commit them, and you can discard any commits you make in this
state without impacting any branches by performing another checkout.
If you want to create a new branch to retain commits you create, you may
do so (now or later) by using -b with the checkout command again. Example:
git checkout -b <new-branch-name>
HEAD is now at 51bb356... Merge pull request #325 from kirilg/0.5.1
(tensorflow) $ git submodule update
Submodule path 'tensorflow': checked out '07bb8ea2379bd459832b23951fb20ec47f3fdbd4'
Submodule path 'tf_models': checked out '2fd3dcf3f31707820126a4d9ce595e6a1547385d'
(tensorflow) $ bazel test tensorflow_serving/...
Serving now reports success:
INFO: Found 199 targets and 57 test targets...
[1,299 / 4,037] Still waiting for 200 jobs to complete:
Running (standalone):

Problems adding DKMS support to kernel module

I'm trying to add DKMS support to a kernel module I'm working on.
I have placed the kernel module source with a static lib to be linked against in the following directory:
/usr/src/dpx/1.0
With the following files:
dkms.conf
Makefile
dpxmtt.c
lib.a
dkms.conf file is like this:
MAKE="make"
CLEAN="make clean"
BUILT_MODULE_NAME=dpx
BUILT_MODULE_LOCATION=src/
DEST_MODULE_LOCATION=/kernel/drivers/input/touchscreen
PACKAGE_NAME=dpxm
PACKAGE_VERSION=1.0
REMAKE_INITRD=yes
And the makefile is like this:
EXTRA_CFLAGS+=-DLINUX_DRIVER -mhard-float
obj-m += dpx.o
dpx-objs:= dpxmtt.o ../source/lib.a
all:
	make -C /lib/modules/$(shell uname -r)/build M=$(PWD) modules
clean:
	make -C /lib/modules/$(shell uname -r)/build M=$(PWD) clean
The ../source/lib.a is a hack: when the makefile is invoked by the dkms build system, it complained that the library couldn't be found in the build directory, but since it was being copied to the source directory, I'm referencing it relatively.
When I call
sudo dkms build -m dpx -v 1.0
The result is almost perfect:
santos@NS-PC:~$ sudo dkms build -m dpx -v 1.0
Kernel preparation unnecessary for this kernel. Skipping...
Building module:
cleaning build area....
make KERNELRELEASE=3.0.0-14-generic....
ERROR (dkms apport): binary package for dpx: 1.0 not found
Error! Build of dpx.ko failed for: 3.0.0-14-generic (i686)
Consult the make.log in the build directory
/var/lib/dkms/dpx/1.0/build/ for more information.
nsantos@NS-PC:~$
And the content of the log file is:
DKMS make.log for dpx-1.0 for kernel 3.0.0-14-generic (i686)
Thu Jan 19 11:07:54 WET 2012
make -C /lib/modules/3.0.0-14-generic/build M=/var/lib/dkms/dpx/1.0/build modules
make[1]: Entering directory `/usr/src/linux-headers-3.0.0-14-generic'
CC [M] /var/lib/dkms/dpx/1.0/build/dpxmtt.o
LD [M] /var/lib/dkms/dpx/1.0/build/dpx.o
Building modules, stage 2.
MODPOST 1 modules
CC /var/lib/dkms/dpx/1.0/build/dpx.mod.o
LD [M] /var/lib/dkms/dpx/1.0/build/dpx.ko
make[1]: Leaving directory `/usr/src/linux-headers-3.0.0-14-generic'
The module was built correctly but it ends with the error:
ERROR (dkms apport): binary package for dpx: 1.0 not found
Error! Build of dpx.ko failed for: 3.0.0-14-generic (i686)
And I don't know what it means. Does anybody know?
Using:
$(shell uname -r)
in the Makefile might also be wrong! "shell uname -r" refers to the currently running kernel, but the main reason to use dkms is that it offers an automated way to recompile out-of-tree kernel modules for every newly installed kernel. In other words, the Makefile might refer to a different kernel than the one dkms is building the module for.
Use:
${kernelver} instead.
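A sketch of how that can look, assuming you forward the kernel version from dkms.conf to the Makefile through a make variable (KVERSION is an arbitrary name chosen here; ${kernelver} is the dkms substitution):
In dkms.conf:
MAKE="make KVERSION=${kernelver}"
CLEAN="make clean"
In the Makefile:
KVERSION ?= $(shell uname -r)
obj-m += dpx.o
all:
	make -C /lib/modules/$(KVERSION)/build M=$(PWD) modules
clean:
	make -C /lib/modules/$(KVERSION)/build M=$(PWD) clean
That way a plain make still builds against the running kernel, while dkms builds against whichever kernel it is processing.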
I had a similar problem. I think your BUILT_MODULE_LOCATION is set incorrectly to the src directory. It should be set in your example to the current directory, or you can just omit this variable and dkms would default to the current directory.
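Concretely, for the dkms.conf in the question that means either dropping the BUILT_MODULE_LOCATION=src/ line entirely or pointing it at the directory where dpx.ko actually ends up, e.g. keeping just:
BUILT_MODULE_NAME=dpx
DEST_MODULE_LOCATION=/kernel/drivers/input/touchscreen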