Errors when trying to build label_image neural net with bazel - tensorflow

Environment info
Operating System: El Capitan, 10.11.1
I'm doing this tutorial: https://petewarden.com/2016/09/27/tensorflow-for-mobile-poets/
I'm trying to classify images using TensorFlow in an iOS app.
When I try to build my net using bazel:
bazel build tensorflow/examples/label_image:label_image
I get these errors:
https://gist.github.com/galharth/36b8f6eeb12f847ab120b2642083a732

From the related GitHub issue https://github.com/tensorflow/tensorflow/issues/6487 I think we narrowed it down to a lack of resources on the virtual machine. Bazel tends to get flaky with only 2 GB of RAM allocated to it.
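If that is the cause, one workaround (a sketch; the flag values are illustrative, not a recommendation) is to cap the resources Bazel schedules against so it does not over-commit the VM's memory:

```shell
# Illustrative: tell Bazel's scheduler it has 2 GB RAM and one core
# (--local_resources takes RAM in MB, CPU cores, I/O capability).
bazel build --local_resources 2048,1.0,1.0 \
    tensorflow/examples/label_image:label_image
```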

Related

TFLite cross-compiling arm64 build error?

As I am currently trying to build TensorFlow Lite for the ARM64 architecture, I am following the instructions below:
https://www.tensorflow.org/lite/guide/build_arm64#cross-compile_for_arm64
But I simply get a compilation error:
tensorflow/lite/tools/make/downloads/ruy/ruy/cpuinfo.cc:9:21: fatal error: cpuinfo.h: No such file or directory
I am surprised this basic build does not work out of the box.
By the way, I am trying to do the above in an Ubuntu 16.04 VM.
Has anybody had the same issue?
Solved now. You need to explicitly install Bazel for this.
The tensorflow-lite dependencies script does not do this. Not sure why; it should have been part of it.
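For reference, a minimal sketch of installing Bazel on Ubuntu 16.04 via the official installer script (the version number below is illustrative; use whichever release your TensorFlow checkout requires, e.g. as pinned in its .bazelversion file):

```shell
# Illustrative Bazel install; pick the version your TF branch needs.
BAZEL_VERSION=3.1.0
wget https://github.com/bazelbuild/bazel/releases/download/${BAZEL_VERSION}/bazel-${BAZEL_VERSION}-installer-linux-x86_64.sh
chmod +x bazel-${BAZEL_VERSION}-installer-linux-x86_64.sh
./bazel-${BAZEL_VERSION}-installer-linux-x86_64.sh --user
export PATH="$HOME/bin:$PATH"
bazel version   # sanity check
```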

Tensorflow does not generate GPU tracing information

I started a new machine learning project.
According to this document (https://www.tensorflow.org/tensorboard/tensorboard_profiling_keras),
TF with TensorBoard appears to support GPU profiling, so I used the same code in my Jupyter Notebook for testing.
The sample code generates profiling results. However, there is no GPU tracing information in the resulting file, only CPU traces.
This is my main problem.
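For context, the profiling setup in that tutorial boils down to a Keras TensorBoard callback with profile_batch set. A minimal self-contained sketch (the model, data, and log path are placeholders, not the tutorial's exact code) looks like:

```python
import numpy as np
import tensorflow as tf

# Placeholder data standing in for the tutorial's MNIST set.
x = np.random.rand(64, 28, 28).astype("float32")
y = np.random.randint(0, 10, size=(64,))

model = tf.keras.Sequential([
    tf.keras.layers.Input(shape=(28, 28)),
    tf.keras.layers.Flatten(),
    tf.keras.layers.Dense(64, activation="relu"),
    tf.keras.layers.Dense(10, activation="softmax"),
])
model.compile(optimizer="adam", loss="sparse_categorical_crossentropy")

# profile_batch asks TensorBoard to trace that batch; GPU kernels
# show up in the trace only if TF can load CUPTI.
tb = tf.keras.callbacks.TensorBoard(log_dir="logs/profile", profile_batch=2)
model.fit(x, y, epochs=1, batch_size=16, callbacks=[tb])
```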
I am using two RTX 2080 Ti graphics cards, and they were both working when running the code.
The sample code does not use MirroredStrategy, so I could see that one of them was running.
At first I thought TensorBoard was the problem, but I soon realized that TF does not generate the GPU tracing information.
The resulting file (local.trace) contained no GPU data.
This is my system specification:
OS ubuntu 18.04
jupyter-client 5.3.4
jupyter-core 4.6.1
jupyter-tensorboard 0.1.10
tensorflow-gpu 2.0.0
tensorflow-estimator 2.0.1
tensorflow-metadata 0.15.1
tensorboard 2.0.2
NVIDIA driver 410.104
CUDA 10.0
anaconda 4.7.12 (with python 3.6)
It may be irrelevant, but there was also a warning message.
I have tested this on another PC and got the same result. Perhaps GPU profiling is only supported on Google Colab (I am still confused). I have been searching Google to fix the problem, but I still could not find the answer.
Is there anyone using GPU profiling on their own system instead of Google Colab?
Please give me some advice.
I figured out what caused the problem.
It was related to CUPTI (the CUDA Profiling Tools Interface).
In contrast to Jupyter Notebook, there was a warning message when the code was run from the Ubuntu shell:
CUPTI error: CUPTI could not be loaded or symbol could not be found.
TF could not find the CUPTI libraries; this is the root cause of the problem.
After adding the path to LD_LIBRARY_PATH as in the link below, the problem was fixed!
https://stackoverflow.com/a/58752904/5553618
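In concrete terms, the fix is along these lines (the CUDA 10.0 install path below is an assumption; adjust it to wherever your toolkit actually lives):

```shell
# CUPTI ships with the CUDA toolkit but is not on the default
# loader path; prepend it so TF can load the library at runtime.
export LD_LIBRARY_PATH="/usr/local/cuda-10.0/extras/CUPTI/lib64:${LD_LIBRARY_PATH}"
```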

Is it possible to compile tensorflow in Mac?

So I started to build TensorFlow on a Mac, and the thing is that it doesn't seem possible to build it on the macOS platform.
After following instructions in here, I get this package directory.
It seems like the build settings for Bazel are Linux-only. The reason I think so is that there is a .so file in the package directory that needs to be linked after importing TensorFlow using the Python binary.
This is the result I get after importing TensorFlow using Python.
Is there any other way I can build TensorFlow on macOS?
It seems like there is no option but to install TensorFlow with pip. So I created a new virtual machine, installed Ubuntu 16.04, and used it as my Docker host. That way I can create a new Docker container which can link and execute the Linux library.
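As a sketch of that setup (the image tag and test command are illustrative; any official TensorFlow image would do), the container can be started on the Ubuntu Docker host like this:

```shell
# Illustrative: run the official TensorFlow image and verify that
# the Linux .so loads correctly inside the container.
docker run -it --rm tensorflow/tensorflow \
    python -c "import tensorflow as tf; print(tf.__version__)"
```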

Bazel builds cause issues when I install TensorFlow using pip

So the documentation mentions that it is better to install from source and then build a pip package. Why is this recommended over a direct pip install using the wheel file provided on the downloads page? here
I tried the direct pip install and then ran some scripts in the inception folder. This results in errors with Bazel not finding some of the dependencies. I am guessing this is related to not building TensorFlow from source, but I can't figure out why. Any pointers? Thanks!
Installing from pip is supported. Can you provide more details on your OS and the specific errors you saw?
The main reason to build from source is simply performance.
Building and installing from source
The default TensorFlow binaries target the broadest range of hardware to make TensorFlow accessible to everyone. If using CPUs for training or inference, it is recommended to compile TensorFlow with all of the optimizations available for the CPU in use. Speedups for training and inference on CPU are documented below in Comparing compiler optimizations.
To install the most optimized version of TensorFlow, build and install from source. If there is a need to build TensorFlow on a platform that has different hardware than the target, then cross-compile with the highest optimizations for the target platform. The following command is an example of using Bazel to compile for a specific platform:
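An example command along these lines (a sketch; the -march value and --config flags are illustrative and should match your target CPU and whether you want CUDA support):

```shell
# Compile the pip package with CPU-specific optimizations;
# adjust -march for the machine the wheel will actually run on.
bazel build -c opt --copt=-march="broadwell" --config=cuda \
    //tensorflow/tools/pip_package:build_pip_package
```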
ref: https://www.tensorflow.org/performance/performance_guide

Building TensorFlow from source on Ubuntu 14.04 LTS: gcc: internal compiler error: Killed (program cc1plus)

I have successfully built TensorFlow from source under Debian, but at present I cannot get it to build on a new virtual machine running Ubuntu 14.04 LTS. IIRC, for Debian I tried g++/gcc 5.2 but had to downgrade to g++/gcc 4.9, and it worked. Following the instructions in Installing from sources, the g++ version installed is 4.8, and the build failed:
gcc: internal compiler error: Killed (program cc1plus)
I have not tried 4.9 yet.
I checked the info on the last Jenkins build but could not find anything listed for the tools and their versions. I even opened an issue: Build tools and versions listed in Jenkins build log.
What version(s) of g++/gcc are known to work?
What version of g++/gcc do the build machines use?
EDIT
Found this: TensorFlow.org Continuous Integration
The problem is not the g++/gcc version but the number of CPU cores Bazel uses to build TensorFlow.
Running multiple builds on VMware Workstation 7.1 with a fresh install of Ubuntu 14.04 LTS, one CPU core, 2 GB RAM, a 2 GB swap partition, and a 2 GB swap file, the builds run fastest. This may not be the best setup, but it is the best one I have found so far that consistently works. If I allow 4 cores via VMware and build with Bazel, the build fails. If I limit the resources with the Bazel option --local_resources:

--local_resources 2048,2.0,1.0 builds successfully:
INFO: Elapsed time: 11683.908s, Critical Path: 11459.26s

--local_resources 4096,2.0,1.0 builds successfully:
INFO: Elapsed time: 39765.257s, Critical Path: 39578.52s

--local_resources 4096,1.0,1.0 builds successfully:
INFO: Elapsed time: 6562.744s, Critical Path: 6443.80s

--local_resources 6144,1.0,1.0 builds successfully:
INFO: Elapsed time: 2810.509s, Critical Path: 2654.90s
In summary, more memory and fewer CPU cores work best for my environment.
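For context (as I understand the flag; check your Bazel version's documentation), the three comma-separated values passed to --local_resources are available RAM in MB, CPU cores, and I/O capability. So the fastest configuration caps Bazel at 6 GB of RAM and a single core (the build target below is illustrative):

```shell
# RAM (MB), CPU cores, I/O capability; Bazel schedules compile
# actions within these limits instead of probing the machine.
bazel build --local_resources 6144,1.0,1.0 \
    //tensorflow/tools/pip_package:build_pip_package
```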
TL;DR:
While keeping an eye on the build process, I noticed that certain source files take a long time to compile and appear to throttle the build's throughput. It is as if they compete for a resource with other source files, and Bazel does not know about this critical resource, so it lets the competing files compile at the same time. The more files competing for the unknown resource, the slower the build.