I have access to a large IBM Power8 machine, and would like to install TensorFlow on it. Naturally, I tried the quick pip install, but it failed:
sudo pip install https://storage.googleapis.com/tensorflow/linux/cpu/tensorflow-0.6.0-cp27-none-linux_x86_64.whl
tensorflow-0.6.0-cp27-none-linux_x86_64.whl is not a supported wheel on this platform.
Storing debug log for failure in /home/pv/.pip/pip.log
Unfortunately, pip.log contains little useful info:
/usr/bin/pip run on Sat Feb 6 17:29:34 2016
tensorflow-0.6.0-cp27-none-linux_x86_64.whl is not a supported wheel on this platform.
Exception information:
Traceback (most recent call last):
File "/usr/lib/python2.7/dist-packages/pip/basecommand.py", line 122, in main
status = self.run(options, args)
File "/usr/lib/python2.7/dist-packages/pip/commands/install.py", line 283, in run
InstallRequirement.from_line(name, None))
File "/usr/lib/python2.7/dist-packages/pip/req.py", line 168, in from_line
raise UnsupportedWheel("%s is not a supported wheel on this platform." % wheel.filename)
UnsupportedWheel: tensorflow-0.6.0-cp27-none-linux_x86_64.whl is not a supported wheel on this platform.
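The rejection comes from the platform tag in the wheel's file name rather than from anything inside the package. As a rough sketch (the helpers `wheel_platform_tag` and `matches_machine` are made up for illustration, and this is a simplification, not pip's actual compatibility check), the tag can be read off the file name and compared with the machine architecture:

```python
import platform


def wheel_platform_tag(filename):
    """Return the platform tag, the last dash-separated field of a wheel name."""
    stem = filename[:-len(".whl")] if filename.endswith(".whl") else filename
    return stem.split("-")[-1]


def matches_machine(filename, machine):
    """Hypothetical check: 'any' wheels fit everywhere, others must name the machine."""
    tag = wheel_platform_tag(filename)
    return tag == "any" or tag.endswith("_" + machine)


if __name__ == "__main__":
    name = "tensorflow-0.6.0-cp27-none-linux_x86_64.whl"
    # platform.machine() reports e.g. 'ppc64le' on a little-endian Power8 box
    print(wheel_platform_tag(name), matches_machine(name, platform.machine()))
```

On a ppc64le machine the second value would be False, which is consistent with the "not a supported wheel" message.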
Next, I tried to build TensorFlow from source, to no avail: every attempt ended with a "cannot execute binary file: Exec format error" message, e.g.:
/usr/local/bin/bazel: line 86: /usr/local/lib/bazel/bin/bazel-real: cannot execute binary file: Exec format error
So I tried to compile Bazel from source, which failed with a similar error:
me#machine:~/bazel-0.1.5$ ./compile.sh
INFO: You can skip this first step by providing a path to the bazel binary as second argument:
INFO: ./compile.sh compile /path/to/bazel
🍃 Building Bazel from scratch.
Compiling Java stubs for protocol buffers...
third_party/protobuf/protoc-linux-x86_32.exe -Isrc/main/protobuf/ --java_out=/tmp/bazel.T9C83cNa/src src/main/protobuf/android_studio_ide_info.proto
scripts/bootstrap/buildenv.sh: line 63: third_party/protobuf/protoc-linux-x86_32.exe: cannot execute binary file: Exec format error
pv@sardonis:~/bazel-0.1.5$ ^C
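An "Exec format error" usually means the binary was built for a different architecture. A minimal sketch for checking that, assuming the file is an ELF binary (the function names and the small machine table are illustrative; the numeric codes are e_machine values from the ELF specification):

```python
import struct

# A few e_machine values from the ELF specification.
ELF_MACHINES = {3: "x86 (32-bit)", 20: "PowerPC", 21: "PowerPC64", 62: "x86-64"}


def elf_machine(header):
    """Name the target architecture from the first 20 bytes of an ELF file."""
    if len(header) < 20 or header[:4] != b"\x7fELF":
        return None  # not an ELF binary (or too short to tell)
    # Byte 5 (EI_DATA) gives the byte order: 1 = little-endian, 2 = big-endian.
    order = "<" if header[5] == 1 else ">"
    (machine,) = struct.unpack_from(order + "H", header, 18)
    return ELF_MACHINES.get(machine, "unknown (%d)" % machine)


def elf_machine_of_file(path):
    """Read just enough of the file at path to identify its architecture."""
    with open(path, "rb") as f:
        return elf_machine(f.read(20))
```

Pointing `elf_machine_of_file` at third_party/protobuf/protoc-linux-x86_32.exe would be expected to report an x86 target, which a Power8 kernel cannot execute.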
However, I found this post, http://www.cnblogs.com/rodenpark/p/5007744.html, which explains how to build the Protobuf compiler from source on a Power8 machine. This worked, and after the modifications described in the author's other post, http://www.cnblogs.com/rodenpark/p/5007846.html, I managed to at least get the compilation started. But now it crashes with a ton of errors. Each seems minor on its own, but the sheer number of them makes it look hopeless. I posted them at http://pastebin.com/KjkseaGx for reference.
So... I'm running out of inspiration. What can I do to make TensorFlow work on the Power8 machine?
Install bazel 0.2.0-ppc
tf@ubuntu16:~$ git clone https://github.com/ibmsoe/bazel
tf@ubuntu16:~/bazel$ git checkout v0.2.0-ppc
tf@ubuntu16:~/bazel$ ./compile.sh
Install tensorflow
tf@ubuntu16:~$ git clone --recurse-submodules https://github.com/tensorflow/tensorflow
tf@ubuntu16:~/tensorflow$ git checkout v0.10.0rc0
tf@ubuntu16:~/tensorflow$ git commit -m"v0.10.0rc0"
tf@ubuntu16:~/tensorflow$ git cherry-pick ce70f6cf842a46296119337247c24d307e279fa0
tf@ubuntu16:~/tensorflow$ git cherry-pick f1acb3bd828a73b15670fc8019f06a5cd51bd564
tf@ubuntu16:~/tensorflow$ git cherry-pick 9b6215a691a2eebaadb8253bd0cf706f2309a0b8
tf@ubuntu16:~/tensorflow$ ./configure
tf@ubuntu16:~/tensorflow$ bazel build -c opt --config=cuda //tensorflow/tools/pip_package:build_pip_package
Here you'll encounter an error, something like this:
ERROR: /home/tf/.cache/bazel/_bazel_tf/b2f766da603b0bed56d4c1d0b178456a/external/farmhash_archive/BUILD:5:1: Executing genrule @farmhash_archive//:configure failed: bash failed: error executing command /bin/bash -c ... (remaining 1 argument(s) skipped): com.google.devtools.build.lib.shell.BadExitStatusException: Process exited with status 1.
/home/tf/.cache/bazel/_bazel_tf/b2f766da603b0bed56d4c1d0b178456a/tensorflow/external/farmhash_archive/farmhash-34c13ddfab0e35422f4c3979f360635a8c050260 /home/tf/.cache/bazel/_bazel_tf/b2f766da603b0bed56d4c1d0b178456a/tensorflow
/tmp/tmp.XdCPQefJyZ /home/tf/.cache/bazel/_bazel_tf/b2f766da603b0bed56d4c1d0b178456a/tensorflow/external/farmhash_archive/farmhash-34c13ddfab0e35422f4c3979f360635a8c050260 /home/tf/.cache/bazel/_bazel_tf/b2f766da603b0bed56d4c1d0b178456a/tensorflow
You'll have to edit config.guess as below to insert a stanza for ppc64le:
tf@ubuntu16:~/.cache/bazel/_bazel_tf/b2f766da603b0bed56d4c1d0b178456a/external/farmhash_archive/farmhash-34c13ddfab0e35422f4c3979f360635a8c050260$ vi config.guess
*:BSD/OS:*:*)
echo ${UNAME_MACHINE}-unknown-bsdi${UNAME_RELEASE}
exit ;;
+ ppc64le:Linux:*:*)
+ echo powerpc64le-unknown-linux-gnu
+ exit ;;
*:FreeBSD:*:*)
case ${UNAME_MACHINE} in
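The added stanza amounts to a lookup from `uname` output to a GNU triplet. A minimal sketch of that idea (the `guess_triplet` function and its one-entry table are illustrative and not part of config.guess):

```python
import platform

# Only the entry taken from the stanza above; everything else is left unknown.
TRIPLETS = {("ppc64le", "Linux"): "powerpc64le-unknown-linux-gnu"}


def guess_triplet(machine=None, system=None):
    """Return the triplet for a (machine, system) pair, or None if not listed."""
    machine = machine or platform.machine()
    system = system or platform.system()
    return TRIPLETS.get((machine, system))
```

A `None` result corresponds to the situation before the edit, where config.guess had no matching case for the machine.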
tf@ubuntu16:~/tensorflow$ bazel build -c opt --config=cuda //tensorflow/tools/pip_package:build_pip_package
tf@ubuntu16:~/tensorflow$ bazel-bin/tensorflow/tools/pip_package/build_pip_package /tmp/tensorflow_pkg
tf@ubuntu16:~/tensorflow$ sudo pip install /tmp/tensorflow_pkg/tensorflow*.whl
tf#ubuntu16:~/tensorflow/bazel-bin/tensorflow/tools/pip_package/build_pip_package.runfi
tf@ubuntu16:~/tensorflow$ mkdir _python_build
tf@ubuntu16:~/tensorflow$ cd _python_build
tf@ubuntu16:~/tensorflow/_python_build$ ln -s ~/tensorflow/bazel-bin/tensorflow/tools/pip_package/build_pip_package.runfiles/tensorflow/* .
tf@ubuntu16:~/tensorflow/_python_build$ ln -s ~/tensorflow/tensorflow/tools/pip_package/* .
tf@ubuntu16:~/tensorflow/_python_build$ python setup.py develop
Using miniconda:
Installing miniconda:
wget https://repo.continuum.io/miniconda/Miniconda3-latest-Linux-ppc64le.sh -O miniconda.sh
bash miniconda.sh
Accept the license terms and allow the installer to add conda to PATH.
rm miniconda.sh
echo export IBM_POWERAI_LICENSE_ACCEPT=yes >> ~/.bashrc
source ~/.bashrc
This should add (base) to the terminal prompt. Add the correct channel as first priority:
conda config --add default_channels https://repo.anaconda.com/pkgs/main
conda config --prepend channels https://public.dhe.ibm.com/ibmdl/export/pub/software/server/ibm-ai/conda/
Create environment (it is good practice not to install packages on base)
conda create -n ai python=3.7
conda activate ai
conda install --strict-channel-priority tensorflow-gpu
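After the install finishes, one quick way to confirm that the active environment can see the package is to ask the import system, without importing TensorFlow itself. A minimal sketch (the `is_installed` helper is made up for illustration):

```python
import importlib.util


def is_installed(module_name):
    """True if the active interpreter can locate the named top-level module."""
    return importlib.util.find_spec(module_name) is not None


if __name__ == "__main__":
    print("tensorflow importable:", is_installed("tensorflow"))
```

Run it with the `ai` environment activated; a False result would suggest the package landed in a different environment.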
For more information on miniconda on IBM Power 8 and Anaconda: IBM Source & Anaconda Source
I want to build a program with mingw w64 and I have msys2 installed.
I tried to work with pacman from the msys2 prompt.
$ pacman -Q libpng
error: package 'libpng' was not found
$ pacman -S libpng
error: target not found: libpng
$ pacman -S *libpng
error: target not found: *libpng
I attempted to use google and came up with:
$ pacman -S mingw-w64-libpng
error: target not found: mingw-w64-libpng
$ pacman -F mingw-w64-libpng
warning: database file for 'mingw32' does not exist (use '-Fy' to download)
warning: database file for 'mingw64' does not exist (use '-Fy' to download)
warning: database file for 'msys' does not exist (use '-Fy' to download)
error: no options specified (use -h for help)
It is very peculiar that these database files don't seem to exist after all the downloading I did, which I distinctly recall included a database for pacman.
$ pacman -Fy mingw-w64-libpng
[... stuff downloads ... ]
error: no options specified (use -h for help)
$ pacman -U mingw-w64-libpng
loading packages...
error: 'mingw-w64-libpng': could not find or read package
So now the questions are,
1) How in the future will I find the magic prefix for a well-known library in order to be able to tell pacman what to install?
2) How at the moment do I instruct pacman to install the libpng package which seems to be in the mingw-w64-libpng package?
3) Is that the package with the development headers or is that yet another package, as I have adjusted to on Deb/Ubuntu by looking for something like libpng-dev?
Have you tried pacman -Ss libpng? This will list all packages mentioning libpng, prefix and all:
$ pacman -Ss libpng
mingw32/mingw-w64-i686-libpng 1.6.35-1
A collection of routines used to create PNG format graphics (mingw-w64)
mingw64/mingw-w64-x86_64-libpng 1.6.35-1 [installed]
A collection of routines used to create PNG format graphics (mingw-w64)
I notice that these names include an architecture (i686/x86_64), which is fairly common in MinGW package names.
EDIT: The headers end up here:
$ ls /mingw64/include/libpng16/
png.h pngconf.h pnglibconf.h
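If you want to pick the full package name out of the search results programmatically, the output shown above is easy to parse. A small sketch, assuming the `repo/name version [installed]` line layout seen in this answer (the `parse_pacman_search` function is hypothetical and has not been checked against every pacman version):

```python
import re

# Matches header lines such as "mingw64/mingw-w64-x86_64-libpng 1.6.35-1 [installed]".
HEADER = re.compile(r"^(?P<repo>[\w.-]+)/(?P<name>\S+)\s+(?P<version>\S+)(?P<rest>.*)$")


def parse_pacman_search(text):
    """Return one dict per package header line; description lines are skipped."""
    packages = []
    for line in text.splitlines():
        match = HEADER.match(line)
        if match:
            packages.append({
                "repo": match.group("repo"),
                "name": match.group("name"),
                "version": match.group("version"),
                "installed": "[installed]" in match.group("rest"),
            })
    return packages
```

The `name` field is what you would then pass to `pacman -S`.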
I have the below files:
1. retrained_graph.pb
2. retrained_labels.txt
3. _retrain_checkpoint.meta
4. _retrain_checkpoint.index
5. _retrain_checkpoint.data-00000-of-00001
6. checkpoint
Command Executed:
python freeze_graph.py \
  --input_graph=/Users/saurav/Desktop/example/tmp/retrained_graph.pb \
  --input_checkpoint=./_retrain_checkpoint \
  --output_graph=/Users/saurav/Desktop/example/tmp/frozen_graph.pb \
  --output_node_names=softmax
Getting error:
UnicodeDecodeError: 'utf-8' codec can't decode byte 0xff in position 44: invalid start byte
Finally I found the answer. To freeze a graph you need to build with "bazel".
1. Install bazel using Homebrew: brew install bazel
2. If you don't have Homebrew, install it first:
/usr/bin/ruby -e "$(curl -fsSL \
https://raw.githubusercontent.com/Homebrew/install/master/install)"
3. Clone TensorFlow: git clone https://github.com/tensorflow/tensorflow
4. Change into the tensorflow directory in the terminal.
5. Run ./configure. It asks a few questions; answer according to your needs. You can answer "no" to most of them. When it asks for the path to Python, specify it or just hit Enter to accept the default.
6. Build freeze_graph with bazel:
bazel build tensorflow/python/tools:freeze_graph
7. Keep the retrained graph and the checkpoint files in one folder.
8. Run the built freeze_graph binary to freeze the graph:
bazel-bin/tensorflow/python/tools/freeze_graph \
  --input_graph=YourDirectory/retrained_graph.pb \
  --input_checkpoint=YourDirectory/_retrain_checkpoint \
  --output_graph=YourDirectory/frozen_graph.pb
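For context on the original UnicodeDecodeError: byte 0xff can never start a UTF-8 sequence, so this is the message Python produces when binary data is decoded as text. One possible cause, offered as an assumption rather than a diagnosis, is that the binary retrained_graph.pb was read as if it were a text-format graph. The sketch below only reproduces the message with made-up bytes and does not touch TensorFlow:

```python
def try_decode(data):
    """Return None if data is valid UTF-8, otherwise the decoder's error text."""
    try:
        data.decode("utf-8")
    except UnicodeDecodeError as err:
        return str(err)
    return None


# 44 ASCII bytes followed by 0xff, mimicking the offset in the error above.
sample = b"a" * 44 + b"\xff"

if __name__ == "__main__":
    print(try_decode(sample))
```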
This is about a TensorBoard built from source, not the pip-installed one.
I was able to build it successfully.
$ git clone https://github.com/tensorflow/tensorboard.git
$ cd tensorboard/
$ bazel build //tensorboard
tensorflow/tensorboard$ bazel build //tensorboard
Starting local Bazel server and connecting to it...
......................................
: (log messages here)
Target //tensorboard:tensorboard up-to-date:
bazel-bin/tensorboard/tensorboard
INFO: Elapsed time: 326.553s, Critical Path: 187.92s
INFO: 619 processes: 456 linux-sandbox, 12 local, 151 worker.
INFO: Build completed successfully, 1268 total actions
Then yes I can run it as documented in tensorboard/README.md, and it works.
$ ./bazel-bin/tensorboard/tensorboard --logdir path/to/logs
The problem is, I'd like to run it as if installed via pip like this:
$ tensorboard --logdir path/to/logs
But as far as I can tell, no script is provided to create a .whl file that could be pip-installed locally, unlike TensorFlow, which provides one like this:
$ bazel-bin/tensorflow/tools/pip_package/build_pip_package /tmp/tensorflow_pkg
$ sudo pip install /tmp/tensorflow_pkg/tensorflow-1.8.0-py2-none-any.whl
So... can anybody show how to do that? Writing a packaging script would solve this, but one should already exist somewhere, since tensorboard is distributed via pip anyway. :)
My workaround so far is not clean enough:
$ ln -s /my/build/folder/tensorboard/bazel-bin/tensorboard/tensorboard ~/bin
$ ln -s /my/build/folder/tensorboard/bazel-bin/tensorboard/tensorboard.runfiles ~/bin
I appreciate your suggestions, thanks!
Update July-21:
Thanks to W JC, I found that the instructions are already there in tensorboard/pip_package/BUILD:
# rm -rf /tmp/tensorboard
# bazel run //tensorboard/pip_package:build_pip_package
# pip install -U /tmp/tensorboard/*py2*.pip
Unfortunately it fails with an error in my environment; I guess that is a local issue, perhaps because I'm using Anaconda.
But the problem is basically resolved: it should work in a supported environment.
It seems there is a script in tensorboard/pip_package that builds the wheels.
bazel run //tensorboard/pip_package:build_pip_package ./ did generate the wheel, but in the folder that bazel-bin points to. In my case, it was generated at ~/.cache/bazel/_bazel_peijia/b64ba42719633ff75eec6880decefcd3/execroot/org_tensorflow_tensorboard/bazel-out/k8-fastbuild/bin/tensorboard/pip_package/build_pip_package.runfiles/org_tensorflow_tensorboard/tensorboard-2.10.0a0-py3-none-any.whl
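Since the wheel ends up deep inside the Bazel output tree, it can be easier to search for it than to guess the path. A minimal sketch, assuming you pass the directory that bazel-bin points to (the `find_wheels` helper and the default file pattern are illustrative):

```python
from pathlib import Path


def find_wheels(root, pattern="tensorboard-*.whl"):
    """Recursively list wheel files under root, newest first."""
    base = Path(root).resolve()  # follow a symlink such as bazel-bin to its target
    wheels = [p for p in base.rglob(pattern) if p.is_file()]
    return sorted(wheels, key=lambda p: p.stat().st_mtime, reverse=True)


if __name__ == "__main__":
    for wheel in find_wheels("bazel-bin"):
        print(wheel)
```

The first path printed can then be handed to pip install.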
I am trying to install TensorFlow Serving on Ubuntu via a Docker image.
I have cloned the TensorFlow Serving repo from "https://github.com/tensorflow/serving" and am trying to create a Docker image with the command below:
docker build --pull -t $USER/tensorflow-serving-devel -f tensorflow_serving/tools/docker/Dockerfile.devel .
When I ran it, I got the following error:
Reading package lists...
W: The repository 'http://security.ubuntu.com/ubuntu xenial-security Release' does not have a Release file.
W: The repository 'http://archive.ubuntu.com/ubuntu xenial Release' does not have a Release file.
W: The repository 'http://archive.ubuntu.com/ubuntu xenial-updates Release' does not have a Release file.
W: The repository 'http://archive.ubuntu.com/ubuntu xenial-backports Release' does not have a Release file.
E: Failed to fetch http://security.ubuntu.com/ubuntu/dists/xenial-security/universe/source/Sources Error writing to output file - write (28: No space left on device) Error writing to file - write (28: No space left on device) [IP: 91.189.88.161 80]
E: Failed to fetch http://archive.ubuntu.com/ubuntu/dists/xenial/universe/source/Sources Error writing to output file - write (28: No space left on device) Error writing to file - write (28: No space left on device) [IP: 91.189.91.26 80]
E: Failed to fetch http://archive.ubuntu.com/ubuntu/dists/xenial-updates/universe/source/Sources Error writing to output file - write (28: No space left on device) Error writing to file - write (28: No space left on device) [IP: 91.189.91.26 80]
E: Failed to fetch http://archive.ubuntu.com/ubuntu/dists/xenial-backports/main/binary-amd64/Packages Error writing to output file - write (28: No space left on device) Error writing to file - write (28: No space left on device) [IP: 91.189.91.26 80]
E: Some index files failed to download. They have been ignored, or old ones used instead.
The command '/bin/sh -c apt-get update && apt-get install -y build-essential curl git libfreetype6-dev libpng12-dev libzmq3-dev mlocate pkg-config python-dev python-numpy python-pip software-properties-common swig zip zlib1g-dev libcurl3-dev openjdk-8-jdk openjdk-8-jre-headless wget && apt-get clean && rm -rf /var/lib/apt/lists/*' returned a non-zero code: 100
Does Ubuntu Xenial not have a release for TensorFlow Serving? Or am I missing something?
Please help.
Read through this link:
https://hub.docker.com/r/yesuprelease/tensorflow-serving-devel/
It looks like Ubuntu 16.04 is supported for TensorFlow Serving. Most likely the fetches failed because your system ran out of disk space, as the "No space left on device" messages indicate.
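Before retrying the build, it may help to check how much space is left on the filesystem Docker writes to. A small sketch using the standard library (the 10 GiB threshold is an arbitrary assumption, and /var/lib/docker is only the usual default data directory, so adjust both to your setup):

```python
import shutil


def free_gib(path="/"):
    """Free space, in GiB, on the filesystem that contains path."""
    return shutil.disk_usage(path).free / 2**30


def has_room(path="/", needed_gib=10.0):
    """True if at least needed_gib GiB are free on path's filesystem."""
    return free_gib(path) >= needed_gib


if __name__ == "__main__":
    # Falls back to "/" here; point it at /var/lib/docker if that exists on your host.
    print("free GiB: %.1f, enough: %s" % (free_gib("/"), has_room("/")))
```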
I just created a docker container and tried to install SQL Relay inside it.
I've checked the prerequisites here and followed the installation documents here.
However, at the end of make install of sqlrelay, I saw an error like this:
update-rc.d: /etc/init.d/sqlrelay: file does not exist
update-rc.d: /etc/init.d/sqlrcachemanager: file does not exist
make[1]: *** [install] Error 1
make[1]: Leaving directory `/sqlrelay-0.66.0/init'
make: *** [install-init] Error 2
What might be wrong with my installation?
Here's the docker file I used to start my installation:
FROM ubuntu:trusty
RUN apt-get update && \
    apt-get install libxml2-dev libpcre3 libpcre3-dev libmysqld-dev -y
RUN apt-get install mysql-server libmysqlclient-dev -y
# sql relay prerequisites
RUN apt-get install g++ make perl php5-dev python-dev ruby-dev \
    tcl-dev openjdk-7-jdk erlang-dev nodejs-dev node-gyp mono-devel \
    libmariadbclient-dev libpq-dev firebird-dev libfbclient2 libsqlite3-dev \
    unixodbc-dev freetds-dev mdbtools-dev -y
COPY rudiments-0.56.0.tar.gz /
COPY sqlrelay-0.66.0.tar.gz /
EXPOSE 80
Here are the outputs of ./configure, make, and make install inside sqlrelay-0.66.0 folder:
configure_log
make_log
make_install_log
If you need more information of my installation process, just let me know. I can provide it.
I think you should use ADD instead of COPY in lines such as
COPY rudiments-0.56.0.tar.gz /
COPY only copies the .tar.gz files; it does not unpack them, whereas ADD does:
If the <src> parameter of ADD is an archive in a recognised compression format, it will be unpacked
This is extracted from
What is the difference between the `COPY` and `ADD` commands in a Dockerfile?
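To see the difference locally: COPY leaves the archive as a single file, whereas ADD on a local tar archive leaves the extracted tree. A small Python sketch of that unpacking step (the `unpack` function and its arguments are placeholders; this imitates the effect and is not what Docker runs internally):

```python
import tarfile


def unpack(archive, dest):
    """Extract a (possibly compressed) tar archive into dest and list its members."""
    with tarfile.open(archive, "r:*") as tar:
        names = tar.getnames()
        # Use the safer "data" extraction filter where this Python supports it.
        kwargs = {"filter": "data"} if hasattr(tarfile, "data_filter") else {}
        tar.extractall(dest, **kwargs)
    return names
```

For example, `unpack("sqlrelay-0.66.0.tar.gz", "/")` would be the rough equivalent of what the ADD line does during the image build.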
I recently hit the same issue. The cause I found was that the init Makefile incorrectly detected systemd on Ubuntu Trusty (it only tests whether /lib/systemd/system exists) and put the scripts there. Later on, the install would look for the scripts in init.d and fail.
The solution is to edit the Makefile, sqlrelay-X.X.X/init/Makefile.
Replace:
install:
if ( test -d "/lib/systemd/system" ); \
With:
install:
if ( test -d "/lib/systemd/system_x" ); \
Make a similar change to the uninstall target later in the Makefile, and it will install correctly on Ubuntu.
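The underlying problem is that the mere existence of /lib/systemd/system does not prove systemd is the running init. A hedged sketch of a stricter test, which looks for /run/systemd/system (the directory that sd_booted() checks); the `detect_init` function and its return labels are hypothetical and not part of SQL Relay:

```python
import os


def detect_init(root="/"):
    """Guess the init system from marker directories under root."""
    if os.path.isdir(os.path.join(root, "run/systemd/system")):
        return "systemd"  # systemd is actually running
    if os.path.isdir(os.path.join(root, "lib/systemd/system")):
        return "sysv-with-unit-dir"  # unit directory exists, systemd not running
    return "sysv"
```

On the Trusty container described here, the second branch would be the expected outcome, which is the case the Makefile's test gets wrong.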