TensorFlow Serving Compile Error Using Docker on OSX

I'm trying to install TensorFlow serving on OSX El Capitan using Docker but keep running into an error. Here is the tutorial I'm following:
https://tensorflow.github.io/serving/docker.html
Here is the command causing the error:
bazel test tensorflow_serving/...
Here's the error I'm getting:
for (int i = 0; i < suffix.size(); ++i) {
^
ERROR: /root/.cache/bazel/_bazel_root/f8d1071c69ea316497c31e40fe01608c/external/tf/tensorflow/core/kernels/BUILD:212:1: C++ compilation of rule '@tf//tensorflow/core/kernels:mirror_pad_op' failed: gcc failed: error executing command /usr/bin/gcc -U_FORTIFY_SOURCE '-D_FORTIFY_SOURCE=1' -fstack-protector -Wall -Wunused-but-set-parameter -Wno-free-nonheap-object -fno-omit-frame-pointer '-std=c++0x' -iquote external/tf -iquote ... (remaining 65 argument(s) skipped): com.google.devtools.build.lib.shell.BadExitStatusException: Process exited with status 4.
gcc: internal compiler error: Killed (program cc1plus)
Please submit a full bug report,
with preprocessed source if appropriate.

Solved! It looks like the VM was running out of memory, which is why gcc was killed.
Here's how I fixed it:
1) When creating the machine, make sure it has more memory (mine was only 1GB). Here is how you create a docker machine with 4GB:
docker-machine create -d virtualbox --virtualbox-memory 4096 default
2) When running the bazel command, pass in a parameter limiting the amount of memory it may use. Here I'm limiting it to 2GB:
bazel build -c opt --copt=-mavx --verbose_failures --local_resources 2048,2.0,1.0 -j 1 //tensorflow_serving/example:mnist_export
Where the original command was:
bazel build //tensorflow_serving/example:mnist_export
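A minimal sketch of how the memory figure in step 2 could be derived instead of hard-coded. It assumes you are inside the Linux container (so /proc/meminfo exists). half_mem_mb is a hypothetical helper name, not part of Bazel or Docker, and "half of the total RAM" is only a conservative rule of thumb.

```shell
#!/usr/bin/env bash
# half_mem_mb: hypothetical helper, not a Bazel or Docker command.
# Prints half of MemTotal in megabytes, read from a meminfo-style file.
# $1 is optional and defaults to /proc/meminfo (Linux only).
half_mem_mb() {
  awk '/^MemTotal:/ { printf "%d\n", $2 / 1024 / 2 }' "${1:-/proc/meminfo}"
}

# Possible usage inside the container (same flags as in the answer above):
#   bazel build -c opt --local_resources "$(half_mem_mb)",2.0,1.0 -j 1 \
#       //tensorflow_serving/example:mnist_export
```

With a 4GB machine this yields roughly the 2048 used above. The comma-separated --local_resources form is the one from the answer. I believe newer Bazel releases split it into separate flags, so check `bazel help build` for your version.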

Related

Compile errors running the ot-br-posix ./script/setup on RPi4

I'm trying to run the ./script/setup, but get compile errors:
Please note that only 65 steps are listed below because I restarted the setup script. The initial run had closer to 465 steps.
[1/65] Building CXX object src/common/CMakeFiles/otbr-common.dir/mainloop.cpp.o
FAILED: src/common/CMakeFiles/otbr-common.dir/mainloop.cpp.o
/usr/bin/c++ -DHAVE_LIBSYSTEMD=1 -DOTBR_ENABLE_BACKBONE_ROUTER=1 -DOTBR_ENABLE_BORDER_AGENT=1 -DOTBR_ENABLE_BORDER_ROUTING=1 -DOTBR_ENABLE_BORDER_ROUTING_COUNTERS=1 -DOTBR_ENABLE_DBUS_SERVER=1 -DOTBR_ENABLE_DNSSD_DISCOVERY_PROXY=1 -DOTBR_ENABLE_NAT64=1 -DOTBR_ENABLE_NOTIFY_UPSTART=1 -DOTBR_ENABLE_REST_SERVER=1 -DOTBR_ENABLE_SRP_ADVERTISING_PROXY=1 -DOTBR_ENABLE_SRP_SERVER_AUTO_ENABLE_MODE=1 -DOTBR_ENABLE_VENDOR_INFRA_LINK_SELECT=0 -DOTBR_MESHCOP_SERVICE_INSTANCE_NAME="\"OpenThread BorderRouter\"" -DOTBR_PACKAGE_NAME=\"OpenThread_BorderRouter\" -DOTBR_PACKAGE_VERSION=\"0.3.0-0cdef3c\" -DOTBR_PRODUCT_NAME=\"BorderRouter\" -DOTBR_SYSLOG_FACILITY_ID=LOG_USER -DOTBR_VENDOR_NAME=\"OpenThread\" -I../../include -I../../src -Ithird_party/openthread/repo/etc/cmake -I../../third_party/openthread/repo/etc/cmake -I../../third_party/openthread/repo/include -I../../third_party/openthread/repo/src/posix/platform/include -I../../third_party/openthread/repo/src -Wall -Wextra -Werror -Wfatal-errors -Wuninitialized -Wno-missing-braces -std=c++11 -MD -MT src/common/CMakeFiles/otbr-common.dir/mainloop.cpp.o -MF src/common/CMakeFiles/otbr-common.dir/mainloop.cpp.o.d -o src/common/CMakeFiles/otbr-common.dir/mainloop.cpp.o -c ../../src/common/mainloop.cpp
In file included from /usr/include/c++/8/list:63,
from ../../src/common/mainloop_manager.hpp:41,
from ../../src/common/mainloop.cpp:30:
/usr/include/c++/8/bits/stl_list.h:811:19: error: expected ‘)’ before ‘&’ token
list(_InputIterat&... __args)`
compilation terminated due to -Wfatal-errors.
I receive a lot more errors, but they follow the same pattern as above.
I followed the guide from openthread.io to set up an OpenThread Border Router.
The execution of the bootstrap script ran smoothly.
Additional information:
Git local repository path: ~/src/openthread/ot-br-posix
Command for executing the setup script:
pi@raspberrypi:~/src/openthread/ot-br-posix$> INFRA_IF_NAME=eth0 ./script/setup
RPi OS: Recommended image from the guide Raspberry Pi OS lite
Libgcc versions:
libgcc-8-dev/oldstable,now 8.3.0-6+rpi1 armhf [installed,automatic]
libgcc1/oldstable,now 1:8.3.0-6+rpi1 armhf [installed]
Cmake versions:
cmake-data/oldstable,now 3.16.3-3~bpo10+1 all [installed,automatic]
cmake/oldstable,now 3.16.3-3~bpo10+1 armhf [installed]

Checkpoint Resuming Throwing Assertion - ARM Arch

I am trying to create and resume from a checkpoint for an ARM compiled binary (LLVM Test Suite).
I cross compiled the LLVM Test Suite with the following command in a Makefile:
./arm-linux-gnueabihf-gcc -O0 -ggdb3 -std=c99 -static $< -o $@
(basically using the arm-linux-gnueabihf-gcc cross compiler version 7.4)
I created the checkpoints using the following command:
./build/ARM/gem5.opt --outdir=chkpt_only/ configs/example/se.py --checkpoint-dir chkpt_only/ --take-checkpoints=0,20000000000 --cpu-type=AtomicSimpleCPU --cmd=../../../Benchmarks/LLVM_Test_Suite/SingleSource/Benchmarks/Stanford/Towers
I tried to resume from the checkpoint with the following command:
./build/ARM/gem5.opt --outdir=chkpt_only/ configs/example/se.py -r 1 --checkpoint-dir chkpt_only/ --cpu-type=O3_ARM_v7a_3 --caches --cmd=../../../Benchmarks/LLVM_Test_Suite/SingleSource/Benchmarks/Stanford/Towers
The above seems to work when the --cpu-type is In-order but for any O3 CPU I get the following assertion:
gem5.opt: build/ARM/cpu/o3/rename_map.hh:282: const PhysRegId* UnifiedRenameMap::lookup(const RegId&) const: Assertion `vecMode == Enums::Elem' failed.
Can someone please help me to understand/fix this assertion?
PS: The git commit is 2775f55447edb344d99f30273ad93fea515d7e2b

How to edit the linker flags bazel uses to build syntaxnet/tensorflow

I can't get TensorFlow with SyntaxNet to build with CUDA on Ubuntu 16.04.
I have built it successfully without CUDA on this system.
The error is most likely rooted in the configuration. The Bazel build of TensorFlow with CUDA generates linker commands for shared libraries that include the linker option -pie, which is meant for building position-independent executables. This causes the error "undefined reference to `main'".
/home/patrick/.cache/bazel/_bazel_patrick/5b9c9cf56f3e0138be05b0752b134bcb/external/com_google_absl/absl/base/BUILD.bazel:28:1: Linking of rule '@com_google_absl//absl/base:spinlock_wait' failed (Exit 1):
crosstool_wrapper_driver_is_not_gcc failed: error executing command
(cd /home/patrick/.cache/bazel/_bazel_patrick/5b9c9cf56f3e0138be05b0752b134bcb/execroot/__main__ && exec env - \
CUDA_TOOLKIT_PATH=/usr/local/cuda \
CUDNN_INSTALL_PATH=/usr/local/cuda \
GCC_HOST_COMPILER_PATH=/usr/bin/gcc \
LD_LIBRARY_PATH=/usr/local/cuda-9.0/lib64:/usr/local/cuda-9.0/extras/CUPTI/lib64:/usr/local/cuda-9.0/nvvm/lib64 \
NCCL_INSTALL_PATH=/usr \
PATH=/home/patrick/bin:/home/patrick/.local/bin:/usr/local/cuda/bin:/usr/bin:/bin \
PWD=/proc/self/cwd \
PYTHON_BIN_PATH=/usr/bin/python \
PYTHON_LIB_PATH=/usr/local/lib/python2.7/dist-packages \
TF_CUDA_CLANG=0 \
TF_CUDA_COMPUTE_CAPABILITIES=6.1 \
TF_CUDA_VERSION=9.0 \
TF_CUDNN_VERSION=7 \
TF_NCCL_VERSION=2 \
TF_NEED_CUDA=1 \
TF_NEED_OPENCL_SYCL=0 \
external/local_config_cuda/crosstool/clang/bin/crosstool_wrapper_driver_is_not_gcc -shared -o bazel-out/k8-opt/bin/external/com_google_absl/absl/base/libspinlock_wait.so -Wl,-no-as-needed -B/usr/bin/ -pie -Wl,-z,relro,-z,now -no-canonical-prefixes -pass-exit-codes '-Wl,--build-id=md5' '-Wl,--hash-style=gnu' -Wl,--gc-sections -Wl,@bazel-out/k8-opt/bin/external/com_google_absl/absl/base/libspinlock_wait.so-2.params)
/usr/lib/gcc/x86_64-linux-gnu/5/../../../x86_64-linux-gnu/Scrt1.o: In function `_start':
(.text+0x20): undefined reference to `main'
collect2: error: ld returned 1 exit status
This linking command succeeds when removing the option -pie.
I would appreciate help either with editing the linker flags Bazel uses or with spotting the configuration error I made, ideally from someone who has hit a similar problem. I don't think posting my configuration steps would lead to suggestions other than the ones I have already read in other posts. The build process looks rather fragile to me.
I have already looked at the definitions in the CROSSTOOL and BUILD files. I did not edit them and they look OK (-pie is only enabled for linking executables).
I work with
Bazel 0.15.2
Tensorflow 1.8.0
Ubuntu 16.04
gcc 5.4
CUDA 9.0
CUDNN 7.1
NCCL 2.1
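Since the link succeeds once -pie is removed, one stopgap would be to filter that flag out before the real linker sees it. The sketch below shows only the filtering logic. filter_link_args is a hypothetical helper, and where to hook it in (for example a wrapper around the compiler driver) is an assumption that depends on your crosstool setup.

```shell
#!/usr/bin/env bash
# filter_link_args: hypothetical helper, not part of Bazel or TensorFlow.
# Prints each argument on its own line and drops -pie, but only when
# -shared is also present (the combination that fails in the log above).
filter_link_args() {
  local shared=0 arg
  for arg in "$@"; do
    if [ "$arg" = "-shared" ]; then
      shared=1
    fi
  done
  for arg in "$@"; do
    if [ "$shared" -eq 1 ] && [ "$arg" = "-pie" ]; then
      continue
    fi
    printf '%s\n' "$arg"
  done
}

# Example: filter_link_args -shared -o libspinlock_wait.so -pie -Wl,-z,relro
```

Executable links (no -shared) pass through unchanged, so -pie stays where it is actually wanted.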

can't MAKE tensorflow Raspberry pi examples

I installed TensorFlow on a Raspberry Pi 3 running Jessie, and did so in two ways. First via the .whl file and pip install for Python 2.7:
https://github.com/samjabrahams/tensorflow-on-raspberry-pi
as well as a full compile via:
https://github.com/tensorflow/tensorflow/tree/master/tensorflow/contrib/makefile
Both installs (I did them on different OS images) were successful and went through without error.
Next, I wanted to compile the official Raspberry Pi examples from tensorflow's git repository:
https://github.com/tensorflow/tensorflow/tree/master/tensorflow/contrib/pi_examples
However, none of the examples compiles (neither the camera example nor the image recognition one). Both give the same error, and searching Google turned up nothing:
make -f tensorflow/contrib/pi_examples/camera/Makefile
gcc --std=c++11 -O0 -I/usr/local/include -I. -I/home/pi/tensorflow/tensorflow/contrib/pi_examples/camera/../../makefile/downloads
-I/home/pi/tensorflow/tensorflow/contrib/pi_examples/camera/../../makefile/downloads/eigen/
-I/home/pi/tensorflow/tensorflow/contrib/pi_examples/camera/../../makefile/gen/proto/
-I/home/pi/tensorflow/tensorflow/contrib/pi_examples/camera/../../makefile/gen/proto_text/
-c tensorflow/contrib/pi_examples/camera/camera.cc -o /home/pi/tensorflow/tensorflow/contrib/pi_examples/camera/gen/obj/tensorflow/contrib/pi_examples/camera/camera.o
In file included from ./tensorflow/core/platform/mutex.h:31:0,
from ./tensorflow/core/framework/variant.h:31,
from ./tensorflow/core/framework/allocator.h:26,
from ./tensorflow/core/framework/tensor.h:20,
from tensorflow/contrib/pi_examples/camera/camera.cc:33:
./tensorflow/core/platform/default/mutex.h:25:22: fatal error: nsync_cv.h: No such file or directory
#include "nsync_cv.h"
^
compilation terminated.
tensorflow/contrib/pi_examples/camera/Makefile:80: recipe for target '/home/pi/tensorflow/tensorflow/contrib/pi_examples/camera/gen/obj/tensorflow/contrib/pi_examples/camera/camera.o' failed
make: *** [/home/pi/tensorflow/tensorflow/contrib/pi_examples/camera/gen/obj/tensorflow/contrib/pi_examples/camera/camera.o] Error 1
as well as:
make -f tensorflow/contrib/pi_examples/label_image/Makefile
gcc --std=c++11 -O0 -I/usr/local/include -I. -I/home/pi/tensorflow/tensorflow/contrib/pi_examples/label_image/../../makefile/downloads
-I/home/pi/tensorflow/tensorflow/contrib/pi_examples/label_image/../../makefile/downloads/eigen/
-I/home/pi/tensorflow/tensorflow/contrib/pi_examples/label_image/../../makefile/gen/proto/
-I/home/pi/tensorflow/tensorflow/contrib/pi_examples/label_image/../../makefile/gen/proto_text/
-c tensorflow/contrib/pi_examples/label_image/label_image.cc -o /home/pi/tensorflow/tensorflow/contrib/pi_examples/label_image/gen/obj/tensorflow/contrib/pi_examples/label_image/label_image.o
In file included from ./tensorflow/core/platform/mutex.h:31:0,
from ./tensorflow/core/framework/variant.h:31,
from ./tensorflow/core/framework/allocator.h:26,
from ./tensorflow/core/framework/tensor.h:20,
from tensorflow/contrib/pi_examples/label_image/label_image.cc:33:
./tensorflow/core/platform/default/mutex.h:25:22: fatal error: nsync_cv.h: No such file or directory
#include "nsync_cv.h"
^
compilation terminated.
tensorflow/contrib/pi_examples/label_image/Makefile:79: recipe for target '/home/pi/tensorflow/tensorflow/contrib/pi_examples/label_image/gen/obj/tensorflow/contrib/pi_examples/label_image/label_image.o' failed
make: *** [/home/pi/tensorflow/tensorflow/contrib/pi_examples/label_image/gen/obj/tensorflow/contrib/pi_examples/label_image/label_image.o] Error 1
How can I locate / add / compile "nsync_cv.h"?
And btw:
export HOST_NSYNC_LIB=`tensorflow/contrib/makefile/compile_nsync.sh`
gives me:
g++ -M -std=c++11 -DNSYNC_USE_CPP11_TIMEPOINT -DNSYNC_ATOMIC_CPP11
-I../../platform/c++11 -I../../platform/gcc -I../../platform/posix -pthread -I../../public -I../../internal ../../internal/*.c ../../testing/*.c ../../platform/c++11/src/nsync_semaphore_mutex.cc
../../platform/c++11/src/per_thread_waiter.cc
../../platform/c++11/src/yield.cc
../../platform/c++11/src/time_rep_timespec.cc
../../platform/c++11/src/nsync_panic.cc \
../../platform/c++11/src/start_thread.cc > dependfile
make: 'nsync.a' is up to date.
I once compiled TensorFlow with the makefile on an NVIDIA Jetson TX1, and I could compile and run the examples after adding some lines to the examples' Makefiles:
after line 18:
# change default.linux.c++11 to the folder that contains your libnsync.a
NSYNCLIBDIR := $(TFMAKEFILE_DIR)/downloads/nsync/builds/default.linux.c++11
NSYNCLIBS := $(NSYNCLIBDIR)/libnsync.a
after line 26:
NSYNC := $(TFMAKEFILE_DIR)/downloads/nsync/public/
after line 36:
-L$(NSYNCLIBDIR) \
after line 43:
-I$(NSYNC) \
after line 51:
-lnsync \
change line 72 to:
$(EXECUTABLE_NAME): $(EXECUTABLE_OBJS) $(TFLIBS) $(NSYNCLIBS)
Hope it works with these changes, good luck :)
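To find the right values for NSYNC and NSYNCLIBDIR on your own machine, you can search the downloaded nsync tree for the header and the built archive. This is a rough sketch. find_nsync is a hypothetical helper, and the pattern '*nsync.a' is an assumption meant to match both nsync.a (the name in the compile_nsync.sh output above) and libnsync.a.

```shell
#!/usr/bin/env bash
# find_nsync: hypothetical helper, not part of the TensorFlow makefile build.
# Lists every nsync_cv.h header and every *nsync.a archive below directory $1.
find_nsync() {
  find "$1" \( -name 'nsync_cv.h' -o -name '*nsync.a' \) -print | sort
}

# Possible usage from the TensorFlow source root:
#   find_nsync tensorflow/contrib/makefile/downloads/nsync
```

The directory that holds nsync_cv.h is what -I$(NSYNC) should point at, and the directory that holds the archive is what NSYNCLIBDIR should be set to.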

TensorFlow 1.1 "bazel build tensorflow/tools/graph_transforms:transform_graph"

When I build the quantization tool with the command
"bazel build tensorflow/tools/graph_transforms:transform_graph"
the compile result is as below:
WARNING: Sandboxed execution is not supported on your system and thus hermeticity of actions cannot be guaranteed. See http://bazel.build/docs/bazel-user-manual.html#sandboxing for more information. You can turn off this warning via --ignore_unsupported_sandboxing.
INFO: Found 1 target...
ERROR: /root/tensorflow-master/tensorflow/core/BUILD:1287:1: C++ compilation of rule '//tensorflow/core:framework_internal' failed: gcc failed: error executing command /usr/bin/gcc -U_FORTIFY_SOURCE -fstack-protector -Wall -B/usr/bin -B/usr/bin -Wunused-but-set-parameter -Wno-free-nonheap-object -fno-omit-frame-pointer -g0 -O2 '-D_FORTIFY_SOURCE=1' -DNDEBUG ... (remaining 106 argument(s) skipped): com.google.devtools.build.lib.shell.BadExitStatusException: Process exited with status 1.
tensorflow/core/framework/reader_op_kernel.cc:20:61: error: definition of implicitly-declared 'tensorflow::ReaderOpKernel::ReaderOpKernel(tensorflow::OpKernelConstruction*)'
ReaderOpKernel::ReaderOpKernel(OpKernelConstruction* context)
^
tensorflow/core/framework/reader_op_kernel.cc:27:33: error: definition of implicitly-declared 'virtual tensorflow::ReaderOpKernel::~ReaderOpKernel()'
ReaderOpKernel::~ReaderOpKernel() {
^
tensorflow/core/framework/reader_op_kernel.cc:34:50: error: no 'void tensorflow::ReaderOpKernel::Compute(tensorflow::OpKernelContext*)' member function declared in class 'tensorflow::ReaderOpKernel'
void ReaderOpKernel::Compute(OpKernelContext* ctx) {
^
Target //tensorflow/tools/graph_transforms:transform_graph failed to build
Use --verbose_failures to see the command lines of failed build steps.
INFO: Elapsed time: 5.989s, Critical Path: 5.02s
What's the problem?
Thanks!
Please try two things:
(1) Configure the build environment first and then compile, like this:
./configure
bazel build tensorflow/tools/graph_transforms:transform_graph
(2) You can also try:
sudo bazel build tensorflow/tools/graph_transforms:transform_graph
Option (2) sometimes worked for me when the failure came from header-file dependencies.
Also download the latest TensorFlow and Bazel if needed.