I have set up the TensorFlow Serving server in a Docker machine running on a Windows machine by following the instructions from https://tensorflow.github.io/serving/serving_basic. I am able to build and run the mnist_model successfully. However, when I try to build the model for the wide_n_deep_tutorial example by running the command "bazel build //tensorflow_serving/example:wide_n_deep_tutorial.py", the model does not build: no files are generated in the bazel-bin folder.
Since no error message is displayed while building the model, I am unable to figure out the problem. I would really appreciate it if someone could help me debug and solve this.
You are just guessing at the command line here: there is no target in the TensorFlow Serving BUILD file for wide_n_deep_tutorial.py.
You can only build the mnist and inception targets as of today.
Adding a target for the wide-and-deep model to the BUILD file solves the problem.
Added the following to the BUILD file:
py_binary(
    name = "wide_n_deep_model",
    srcs = [
        "wide_n_deep_model.py",
    ],
    deps = [
        "//tensorflow_serving/apis:predict_proto_py_pb2",
        "//tensorflow_serving/apis:prediction_service_proto_py_pb2",
        "@org_tensorflow//tensorflow:tensorflow_py",
    ],
)
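With that target in place, the example builds with the usual bazel invocation; note that it names the target from the rule above, not the .py file:
bazel build //tensorflow_serving/example:wide_n_deep_model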
I installed the TensorFlow nightly build via the command
pip install tf-nightly-gpu --prefix=/tf/install/path
When I try to run any XLA example, TensorFlow reports the error "Unable to find libdevice dir. Using '.' Failed to compile ptx to cubin. Will attempt to let GPU driver compile the ptx. Not found: /usr/local/cuda-10.0/bin/ptxas not found".
So apparently TensorFlow cannot find my CUDA path. On my system, CUDA is installed in /cm/shared/apps/cuda/toolkit/10.0.130. Since I didn't build TensorFlow from source, XLA by default searches /usr/local/cuda-*; because that folder does not exist on my machine, the error is raised.
My current workaround is to create a symbolic link. I checked the TensorFlow source code in tensorflow/compiler/xla/service/gpu/nvptx_compiler.cc. A comment in that file says "// CUDA location explicitly specified by user via --xla_gpu_cuda_data_dir has highest priority." So how do I pass a value to this flag? I tried the following two environment variables, but neither of them works:
export XLA_FLAGS="--xla_gpu_cuda_data_dir=/cm/shared/apps/cuda10.0/toolkit/10.0.130/"
export TF_XLA_FLAGS="--xla_gpu_cuda_data_dir=/cm/shared/apps/cuda10.0/toolkit/10.0.130/"
So how do I use the flag "--xla_gpu_cuda_data_dir"? Thanks.
You can run export XLA_FLAGS=--xla_gpu_cuda_data_dir=/path/to/cuda in the terminal.
There is a code change for this issue, but it is not clear how to use it. See https://github.com/tensorflow/tensorflow/issues/23783
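For example, a minimal sketch using the CUDA path from the question (the script name is a placeholder, not from the original thread):
export XLA_FLAGS="--xla_gpu_cuda_data_dir=/cm/shared/apps/cuda/toolkit/10.0.130"
python your_xla_example.py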
I am following the instructions to build an Android application with the TensorFlow engine on the Hexagon DSP.
My device is a Pixel 1.
1. libhexagon_controller.so - there is no compilation problem in android_Release mode, but when I try to build android_Release_aarch64 I get a linkage problem:
warning: libadsprpc.so, needed by /Tensorflow_Hexagon/tensorflow-master/tensorflow/contrib/makefile/downloads/hexagon/libs/libhexagon_controller.so, not found (try using -rpath or -rpath-link)
I resolved it by adding -ladsprpc to the Makefile in tensorflow/contrib/makefile (sketched below). If -ladsprpc is needed for android_Release_aarch64, why does android_Release build fine without it?
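For reference, a hedged sketch of the kind of change described; the exact variable name in tensorflow/contrib/makefile/Makefile is an assumption, not verified against the source:
# hypothetical excerpt from tensorflow/contrib/makefile/Makefile:
# append the Qualcomm FastRPC runtime so the aarch64 link can resolve libadsprpc.so
LIBS += -ladsprpc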
2. I followed the steps in ${QUALCOMM_SDK}/docs/Tools_Signing.html to generate the signed test shared library.
My problem is with adb push output/testsig-0x<serial number> /system/lib/rfsa/adsp/
I am getting adb: error: failed to copy 'testsig-xxxx.so' to '/system/lib/rfsa/adsp/': remote couldn't create file: Is a directory
What should I do?
Thanks.
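One hedged guess, not from the original thread: /system is usually mounted read-only, and the destination may need an explicit filename rather than a directory; the serial placeholder is kept as in the question:
adb root
adb remount
adb push output/testsig-0x<serial number>.so /system/lib/rfsa/adsp/testsig-0x<serial number>.so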
I am learning transfer learning according to How to Retrain Inception's Final Layer for New Categories. However, when I build retrain.py using bazel, the following error occurs:
python configuration error: 'PYTHON_BIN_PATH' environment variable is not set and referenced by '//third_party/py/numpy:headers'
I am sorry, I did my best to include a screenshot of the error, but unfortunately I failed.
I am using Python 2.7, Anaconda 2, Bazel 0.6.1, and TensorFlow 1.3.
I would appreciate any reply!
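Since the error says PYTHON_BIN_PATH is unset, one hedged workaround is to export it before building (rerunning ./configure, which prompts for the interpreter, should also set it); the target label is an assumption based on where retrain.py lived in TensorFlow 1.x:
export PYTHON_BIN_PATH=$(which python)
bazel build tensorflow/examples/image_retraining:retrain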
I was building a new TensorFlow op with external libraries yesterday and getting errors. Today, when I ran the same code, for some reason I ended up with this output instead:
(vent)user@server:/dir/tensorflow/tensorflow/core/user_ops$ bazel build --config opt //tensorflow/core/user_ops:my_op.cc
INFO: Found 1 target...
INFO: Elapsed time: 1.493s, Critical Path: 0.01s
(vent)user@server:/dir/tensorflow/tensorflow/core/user_ops$
I thought something was wrong with my cache, so I did a bazel clean and then tried to rebuild the example op zero_out.so, but I got the same problem, even though yesterday I was able to successfully run zero_out.so from //bazel-bin/tensorflow/core/user_ops. There is nothing wrong with bazel itself, since I was able to start building tensorflow from source without it quitting on me. My BUILD file in //tensorflow/core/user_ops looks like this:
load("//tensorflow:tensorflow.bzl", "tf_custom_op_library")
tf_custom_op_library(
name = "zero_out.so",
srcs = ["zero_out.cc"],
)
tf_custom_op_library(
name = "my_op.so",
srcs = ["my_op.cc"],
deps = ["#t//:libt"]
)
I've been looking around for a couple of hours, but I can't find any help, and I don't think I'm looking in the right places. Does this have something to do with bazel clean deleting some important files? None of my BUILD or WORKSPACE files were changed, and nothing on my server has changed.
I'm using Bazel v0.5.1 on Linux with TF v1.2.
The solution was simple: I had accidentally typed my_op.cc instead of my_op.so in the build command.
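For completeness, the corrected invocation names the .so rule defined in the BUILD file rather than the source file:
bazel build --config opt //tensorflow/core/user_ops:my_op.so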
Description of the problem / feature request / question:
I am trying to use bazel to build the TensorFlow library. It builds fine.
Additional feature:
I would like to add OpenCL code to one of the files of TensorFlow. I added all the required code
AND added the following to one of the build files (tensorflow/core/BUILD), considering 'opencl' as the root directory of OpenCL.
cc_library( name = "opencl", hdrs=glob(["opencl/include/CL/*h"]),
visibility =["//visibility:public"], )
cc_library( name="all_kernels" , visibility= ["//visibility:public"],
copts=tf_copts() + ["-Ithird_party/opencl/include"], deps= [
"//third_party/opencl", ],
Example to reproduce the problem:
Run
bazel build //tensorflow/examples/android:tensorflow_demo --fat_apk_cpu=armeabi-v7a --copt="-Ithird_party/opencl/include"
Issues faced while building:
error: undefined reference to 'clEnqueueReadBuffer'
error: undefined reference to 'clReleaseMemObject'
error: undefined reference to 'clReleaseMemObject'
etc.
Environment info
Operating System: Ubuntu 17.04
Bazel version (output of bazel info release): release 0.5.1
Relevant searches on the web:
How to add external header files during bazel/tensorflow build
Information or logs or outputs that would be helpful:
bazel-out/android-arm-linux-androideabi-4.9-v7a-gnu-libstdcpp-fastbuild/bin/tensorflow/core/kernels/libandroid_tensorflow_kernels.lo(conv_ops.o):conv_ops.cc:function matrixMul(float*, float*, int, int, int, int, int, int): error: undefined reference to 'clGetPlatformIDs'
I tried linking directly to libOpenCL.so as shown below, referring to https://bazel.build/versions/master/docs/tutorial/cpp.html#adding-dependencies-on-precompiled-libraries, but I still hit the same issue:

cc_library(
    name = "opencl",
    srcs = glob(["lib/x86_64/*.so"]),
    hdrs = glob(["include/CL/*.h"]),
    visibility = ["//visibility:public"],
)
Please help me resolve these issues.
The libOpenCL.so showed up red in the terminal listing, which meant it was an archive rather than a real shared object; replacing the file resolved the issue.
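As a general sanity check, not from the original answer, the file type can be inspected before replacing it; the path is illustrative:
file lib/x86_64/libOpenCL.so
# a usable library should report something like: ELF 64-bit LSB shared object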