I've been trying to compile tensorflow 1.1 inception with bazel 0.4.2 on windows 10 using CUDA 8.0.
I haven't been able to find a corresponding error online.
C:\Users\me\Anaconda3\envs\tensorflow_gpu\tensorflow>bazel build --config=opt tensorflow/examples/image_retraining:retrain
ERROR: C:/users/me/appdata/local/temp/_bazel_simon/qco1pmlq/external/local_config_cuda/cuda/BUILD:172:12: in outs attribute of genrule rule @local_config_cuda//cuda:cuda-include: Genrules without outputs don't make sense.
After digging into the BUILD file named in the error message, I found these genrules, whose outs and cmd are empty:
genrule(
name = "cuda-include",
outs = [
],
cmd = """
""",
)
genrule(
name = "cuda-nvvm",
outs = [
],
cmd = """
""",
)
genrule(
name = "cuda-extras",
outs = [
],
cmd = """
""",
)
I am assuming these should have been generated? I do have examples of other rules that follow this format, but I am unsure which files they should list and whether filling them in by hand is the right way to go. Any help would be greatly appreciated.
Tensorflow GPU Build with Bazel on Windows is not very stable. Currently it's broken from both sides.
This change was just sent to fix problems in Bazel.
And this PR will make Cuda configuration work on Windows again.
You can first build Bazel from HEAD. And after the PR is merged, use your custom Bazel to build TensorFlow from HEAD.
The correct command on Windows would be:
bazel build -c opt --config=win-cuda --cpu=x64_windows_msvc --host_cpu=x64_windows_msvc --copt=-w --host_copt=-w tensorflow/tools/pip_package:build_pip_package
FYI, there is a script for building TensorFlow on Windows:
https://github.com/tensorflow/tensorflow/blob/master/tensorflow/tools/ci_build/windows/gpu/pip/build_tf_windows.sh
I have a C++ project that uses Bazel as its build tool and TensorFlow as an external dependency. With Bazel 0.24.1 and TensorFlow 1.14.0 it works fine (my project uses the tbb target from TensorFlow as a dep in a Bazel rule).
But after updating Bazel to 3.1.0 and TensorFlow to 2.3.0, I get this error:
ERROR: external/tbb/BUILD.bazel:11:1: in cmd attribute of genrule rule @tbb//:build_tbb: $(AR) not defined
My WORKSPACE file:
http_archive(
name = "org_tensorflow",
strip_prefix = "tensorflow-2.3.0",
sha256 = "2595a5c401521f20a2734c4e5d54120996f8391f00bb62a57267d930bce95350",
urls = ["https://github.com/tensorflow/tensorflow/archive/tensorflow-2.3.0.tar.gz"]
)
My BUILD file:
cc_library(
name = "tf_utils",
srcs = ["tf_utils.cpp"],
hdrs = ["tf_utils.h"],
deps = ["@org_tensorflow//tensorflow/cc/saved_model:loader", "utils"],
visibility = ["//visibility:public"]
)
cc_library(
name = "utils",
hdrs = ["utils.h"],
srcs = ["utils.cpp"],
deps = ["@boost//:filesystem",
"@eigen_archive//:eigen",
"@tbb//:tbb",], # I use the tbb in the tensorflow
visibility = ["//visibility:public"]
)
First, I tested tbb inside the TensorFlow project itself, and it gives the same error.
To test, I added the following C++ rule to tensorflow/BUILD:
cc_library(
name = "test-tbb",
srcs = ["test.cpp"],
deps = ["@ngraph_tf//:ngraph_tf"] # ngraph_tf's deps has tbb
)
and then executed bazel build //tensorflow:test-tbb, which gives the same error:
@tbb//:build_tbb: $(AR) not defined
Second, I suspected the problem was in tbb itself, so I tested tbb in two ways:
1) I downloaded the tbb sources and, in the tbb root directory, created an empty WORKSPACE and a BUILD file (tbb.BUILD renamed to BUILD), then executed bazel build //:build_tbb. With Bazel 0.24.1 everything works, but with Bazel 3.1.0 I get the error above.
2) I used tbb as the only external dependency of a mini project whose WORKSPACE has an http_archive named tbb with build_file pointing to tbb.BUILD. I wrote a cc_library in the BUILD file with deps = ["@tbb:build_tbb"]. The result is the same as in 1): Bazel 0.24.1 works, 3.1.0 is broken.
tbb.BUILD I used: https://github.com/tensorflow/tensorflow/blob/r2.3/third_party/ngraph/tbb.BUILD
Third, following the Bazel documentation (https://docs.bazel.build/versions/master/be/make-variables.html#custom_variables), I set toolchains = ["@bazel_tools//tools/cpp:current_cc_toolchain"] on the rule "build_tbb" in tbb.BUILD. The error message then becomes "in cmd attribute of genrule rule //:build_tbb: $(CC_FLAGS) not defined".
Fourth, following the solution in this issue: https://github.com/rnburn/satyr/issues/2, I updated the tbb.BUILD file and retried. The error message is then "in cmd attribute of genrule rule @tbb//:build_tbb: $(C_COMPILER) not defined".
I just want to use TensorFlow as an external dependency and reuse the Bazel target "tbb" declared in TensorFlow (since it is already in TensorFlow, I don’t want to compile it multiple times).
Does anyone know how to correctly use TensorFlow and tbb as external dependencies with Bazel 3.0.0+? Thanks!
I tried compiling tensorflow 1.3 from the HEAD of the master branch using the following line of shell command after running ./configure
sudo bazel build -c opt --copt=-mavx --copt=-mavx2 --copt=-mfma --copt=-mfpmath=both --copt=-msse4.1 --copt=-msse4.2 --config=cuda -k --verbose_failures //tensorflow/tools/pip_package:build_pip_package
I get the following error in the end.
At global scope:
cc1plus: warning: unrecognized command line option '-Wno-self-assign'
Target //tensorflow/tools/pip_package:build_pip_package failed to build
INFO: Elapsed time: 3834.785s, Critical Path: 196.95s
FAILED: Build did NOT complete successfully
These were the warnings it gave initially.
WARNING: /home/pranav/tensorflow_install/tensorflow/tensorflow/core/BUILD:1634:1: in includes attribute of cc_library rule //tensorflow/core:framework_headers_lib: '../../external/nsync/public' resolves to 'external/nsync/public' not below the relative path of its package 'tensorflow/core'. This will be an error in the future. Since this rule was created by the macro 'cc_header_only_library', the error might have been caused by the macro implementation in /home/pranav/tensorflow_install/tensorflow/tensorflow/tensorflow.bzl:911:30
WARNING: /home/pranav/tensorflow_install/tensorflow/tensorflow/contrib/learn/BUILD:15:1: in py_library rule //tensorflow/contrib/learn:learn: target '//tensorflow/contrib/learn:learn' depends on deprecated target '//tensorflow/contrib/session_bundle:exporter': No longer supported. Switch to SavedModel immediately.
WARNING: /home/pranav/tensorflow_install/tensorflow/tensorflow/contrib/learn/BUILD:15:1: in py_library rule //tensorflow/contrib/learn:learn: target '//tensorflow/contrib/learn:learn' depends on deprecated target '//tensorflow/contrib/session_bundle:gc': No longer supported. Switch to SavedModel immediately.
INFO: Analysed target //tensorflow/tools/pip_package:build_pip_package (208 packages loaded).
Then loads of INFO. I'm not sure if it is of any help.
Bazel Version:
Build label: 0.5.4
Build target: bazel-out/local-fastbuild/bin/src/main/java/com/google/devtools/build/lib/bazel/BazelServer_deploy.jar
Build time: Fri Aug 25 10:00:00 2017 (1503655200)
Build timestamp: 1503655200
Build timestamp as int: 1503655200
An answer I read suggested running the following command:
$ bazel query --output=build 'somepath("//tensorflow/core:version_info_gen", "//tensorflow/tools/git:gen/spec.json")'
It gave me the following output. Maybe it will be of help.
# /home/pranav/tensorflow_install/tensorflow/tensorflow/core/BUILD:1546:1
genrule(
name = "version_info_gen",
generator_name = "version_info_gen",
generator_function = "tf_version_info_genrule",
generator_location = "tensorflow/core/BUILD:1546",
srcs = ["//tensorflow/tools/git:gen/spec.json", "//tensorflow/tools/git:gen/head", "//tensorflow/tools/git:gen/branch_ref"],
tools = ["//tensorflow/tools/git:gen_git_source.py"],
outs = ["//tensorflow/core:util/version_info.cc"],
cmd = "$(location //tensorflow/tools/git:gen_git_source.py) --generate $(SRCS) \"$@\"",
local = True,
)
Also, redirecting the bazel command with > log.txt doesn't fill the text file with the terminal output.
If you need more information to help me, please suggest a way to copy the terminal output to a text file so that I can upload it to GitHub and give you the link.
I also used --explain to write all explanations to a file. I can upload that as well if you want.
I also tried --local_resources 2048,.5,1.0 to reduce my memory allocation in case of memory issues. Still doesn't work.
Thanks a lot in advance.
I'm stumbling around trying to get bazel working with pypi dependencies.
./pypi.bzl:
def _impl(ctx):
    ctx.actions.run_shell(
        command = "pip download %s" % ctx.package
    )

_pypi_package = rule(
    implementation=_impl,
    attrs={"package": attr.label(mandatory=True)},
)

def pypi_package(package):
    _pypi_package(name = package, package = package)
./BUILD:
py_binary(
name = "app",
srcs = ["app.py"],
deps = [":python-dateutil"]
)
load("//:pypi.bzl", "pypi_package")
pypi_package(
package="python-dateutil",
)
Trying to build:
$ bazel build app
ERROR: /path/to/cwd/BUILD:9:1: in _pypi_package rule //:python-dateutil: cycle in dependency graph:
//:app
.-> //:python-dateutil [self-edge]
`--
This cycle occurred because of a configuration option.
ERROR: Analysis of target '//:app' failed; build aborted.
INFO: Elapsed time: 0.219s
No idea if this is even the right approach for working with external dependencies, but ignoring that, I don't understand where the self-dep here is coming from. In fact, I don't see that I'm declaring any deps for the pypi_package rule at all. What's going on?
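The issue was attr.label: "label" here means a BUILD label, so the string "python-dateutil" was resolved to the target //:python-dateutil. Because the macro passes the same string as both name and package, the target ended up depending on itself. I should have been using attr.string.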
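A minimal sketch of how pypi.bzl might look with that change, for illustration only. It assumes the attribute is read through ctx.attr.package, and it declares an output directory (the "_download" suffix is a made-up name) because run_shell needs at least one declared output. It only addresses the dependency cycle; whether py_binary then accepts this target in deps, and whether pip can reach the network from inside the action, are separate questions not covered here.

```starlark
def _impl(ctx):
    # Hypothetical output directory so the action has something to produce.
    out = ctx.actions.declare_directory(ctx.label.name + "_download")
    ctx.actions.run_shell(
        outputs = [out],
        # ctx.attr.package is now a plain string, not a label.
        command = "pip download --dest %s %s" % (out.path, ctx.attr.package),
    )
    return [DefaultInfo(files = depset([out]))]

_pypi_package = rule(
    implementation = _impl,
    # attr.string keeps "python-dateutil" as text instead of resolving it
    # to the target //:python-dateutil.
    attrs = {"package": attr.string(mandatory = True)},
)

def pypi_package(package):
    _pypi_package(name = package, package = package)
```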
I am trying to get R + deepwater + tensorflow to work on a MBP.
The following have been installed.
Python 3.6.1
TensorFlow 1.1
The Hello, TensorFlow example on the TensorFlow website is working fine.
R version 3.4.0
curl -O http://h2o-release.s3.amazonaws.com/h2o/master/3904/R/src/contrib/h2o_3.11.0.3904.tar.gz
R CMD INSTALL h2o_3.11.0.3904.tar.gz
curl -O http://s3.amazonaws.com/h2o-deepwater/public/nightly/latest/h2o_3.11.0.tar.gz
R CMD INSTALL h2o_3.11.0.tar.gz
I am trying to run the following example provided on the h2o website.
require(h2o)
h2o.init()
train <- h2o.importFile("https://h2o-public-test-data.s3.amazonaws.com/bigdata/laptop/mnist/train.csv.gz")
target <- "C785"
features <- setdiff(names(train), target)
train[target] <- as.factor(train[target])
model <- h2o.deepwater(x=features, y=target, training_frame=train, epochs=100, activation="Rectifier",
hidden=c(200,200), ignore_const_cols=FALSE, mini_batch_size=256, input_dropout_ratio=0.1,
hidden_dropout_ratios=c(0.5,0.5), stopping_rounds=3, stopping_tolerance=0.05,
stopping_metric="misclassification", score_interval=2, score_duty_cycle=0.5, score_training_samples=1000,
score_validation_samples=1000, nfolds=5, gpu=FALSE, seed=1234, backend="tensorflow")
The error I get is Error: java.lang.RuntimeException: Unable to initialize backend: Cannot find TensorFlow native library for OS: darwin, architecture: x86_64. Based on what I read on SO and the git page, I was under the impression that one does not need to build for the Mac platform.
One other thing that I tried was to use the info from https://github.com/rstudio/tensorflow. When I run install_tensorflow() I get Error: Prerequisites for installing TensorFlow not available. Please help!
We don't provide H2O+deepwater builds for MacOS. The one you downloaded is built for Linux machines (tested on Ubuntu), it's mentioned on the download page.
If you want to run it on MacOS you'd have to build both DeepWater and then H2O yourself.
Essentially, I want to run TensorFlow with a custom LLVM repository and not the llvm-mirror that bazel pulls from.
I made the following changes:
Changed the temp_workaround_http_archive rule in //tensorflow/workspace.bzl to:
native.local_repository (
name = "llvm",
path = "/git/llvm/",
)
In /git/llvm I added the file WORKSPACE containing:
workspace( name = "llvm" )
However, I know that an llvm.build file is required, but since I am new to bazel, I am not sure where it should be located.
I am getting the following error log:
bazel build --config=opt //tensorflow/tools/pip_package:build_pip_package
ERROR: /git/tensorflow/tensorflow/tools/pip_package/BUILD:81:1: no such package '@llvm//': BUILD file not found on package path and referenced by '//tensorflow/tools/pip_package:licenses'.
ERROR: Analysis of target '//tensorflow/tools/pip_package:build_pip_package' failed; build aborted.
INFO: Elapsed time: 0.219s
I installed TensorFlow from source. Here is the version info:
$ git rev-parse HEAD
4c3bb1aeb7bb46bea35036433742a720f39ce348
$ bazel version
Build label: 0.4.5
Build target: bazel-out/local-fastbuild/bin/src/main/java/com/google/devtools/build/lib/bazel/BazelServer_deploy.jar
Build time: Thu Mar 16 12:19:38 2017 (1489666778)
Build timestamp: 1489666778
Build timestamp as int: 1489666778
Thanks in advance for the help!
Found the fix. Quite simple actually.
The local_repository rule in bazel is for external bazel repositories only. To use a non-bazel external repository, we need to use new_local_repository which takes build_file as an argument.
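A rough sketch of what that change could look like in tensorflow/workspace.bzl. It assumes the llvm.BUILD file that TensorFlow already ships under //third_party/llvm also fits the layout of the custom checkout in /git/llvm/, which may not hold if the custom tree differs much from the pinned revision. The exact form build_file accepts can also vary between Bazel versions.

```starlark
# Sketch only: point the "llvm" repository at a local, non-Bazel checkout.
native.new_local_repository(
    name = "llvm",
    path = "/git/llvm/",
    # Assumption: reuse the BUILD file TensorFlow provides for LLVM.
    build_file = str(Label("//third_party/llvm:llvm.BUILD")),
)
```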
You can use Python's built-in HTTP server to set up a local file server, like:
python3 -m http.server
Then edit the file "tensorflow/workspace.bzl"
tf_http_archive(
name = "llvm",
urls = [
"https://mirror.bazel.build/**/195a164675af86f390f9816e53291013d1b551d7.tar.gz",
"http://localhost:8000/195a164675af86f390f9816e53291013d1b551d7.tar.gz",
"https://github.com/**/195a164675af86f390f9816e53291013d1b551d7.tar.gz",
],
sha256 = "57a8333f8e6095d49f1e597ca18e591aba8a89d417f4b58bceffc5fe1ffcc02b",
strip_prefix = "llvm-195a164675af86f390f9816e53291013d1b551d7",
build_file = str(Label("//third_party/llvm:llvm.BUILD")),
)
Add the local server URL as the middle entry of urls (as shown above), then rebuild.