LightGBM GPU install not working in Colab

I'm trying to install LightGBM with GPU support in Colab, but every method I've tried ends with LightGBM raising an error about unexpected keyword arguments (e.g., early_stopping_rounds) when I train the model. If I remove the kwarg that caused the error, I get the same error for a different kwarg that is still included. The code works fine with the pre-installed version of lightgbm that ships with Colab environments, but every GPU-enabled install that completes successfully leads to this error.
Any idea what I am doing wrong here?
Here's the most recent method I followed to install lgbm with gpu support enabled:
! git clone --recursive https://github.com/Microsoft/LightGBM
! cd LightGBM && rm -rf build && mkdir build && cd build && cmake -DUSE_GPU=1 ../../LightGBM && make -j4 && cd ../python-package && python3 setup.py install --precompile --gpu;
%cd /content/LightGBM/python-package
!python3 setup.py install --gpu
Any ideas why this might be happening?

Well, I'm not entirely sure why, but it is working now. I installed it following the instructions in this Stack Overflow question. Everything was the same as before; the only difference is that I ran the Colab notebook from my iMac instead of my MacBook. I don't know what difference that would make, since my understanding is that Colab shouldn't be influenced by the local system, but here we are.
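One possible explanation, which is an assumption and not something confirmed in this thread: building from the repository's default branch installs a newer LightGBM release than the one pre-installed in Colab, and newer releases no longer accept some training keyword arguments such as early_stopping_rounds (early stopping is configured through a callback like lgb.early_stopping instead). A minimal sketch for checking which keyword arguments an installed function actually accepts is shown below; the train function in it is a hypothetical stand-in, not LightGBM's real signature.

```python
import inspect


def unsupported_kwargs(func, kwargs):
    """Return the names in kwargs that func's signature does not accept."""
    params = inspect.signature(func).parameters
    # A **kwargs parameter means any keyword name is accepted.
    if any(p.kind is inspect.Parameter.VAR_KEYWORD for p in params.values()):
        return []
    return [name for name in kwargs if name not in params]


# Hypothetical stand-in for a training function from a newer release
# (an assumption for illustration only).
def train(params, train_set, num_boost_round=100, callbacks=None):
    return None


rejected = unsupported_kwargs(
    train, {"early_stopping_rounds": 10, "num_boost_round": 5}
)
print(rejected)
```

In a notebook, the same helper could be pointed at the real function after importing lightgbm, together with printing lightgbm.__version__ for both the pre-installed and the GPU build to see whether the versions differ.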

Related

Stuck at "Do you want to continue" in Jupyter notebook

First off, I am using Linux.
The problem is that I am stuck at "Do you want to continue? [Y/n]". I can't download the remaining files because I can't figure out how to enter a "y" to continue.
Here is a snippet of the code where I think the problem lies:
if os.name=='posix':
    !apt-get install protobuf-compiler
    !cd Tensorflow/models/research && protoc object_detection/protos/*.proto --python_out=. && cp object_detection/packages/tf2/setup.py . && python -m pip install .
The output that appears is this:
Reading package lists... Done
Building dependency tree... Done
Reading state information... Done
The following package was automatically installed and is no longer required:
systemd-hwe-hwdb
Use 'apt autoremove' to remove it.
The following additional packages will be installed:
libprotobuf-dev libprotobuf-lite23 libprotoc23
Suggested packages:
protobuf-mode-el
The following NEW packages will be installed:
libprotobuf-dev libprotobuf-lite23 libprotoc23 protobuf-compiler
0 upgraded, 4 newly installed, 0 to remove and 96 not upgraded.
Need to get 2,246 kB of archives.
After this operation, 14.6 MB of additional disk space will be used.
Do you want to continue? [Y/n]
Here we can see that I am stuck at "Do you want to continue? [Y/n]".
The first thing I tried was running Jupyter Notebook as the root user, which helped for a while until I reached this part.
The second thing I tried was adding a "y" underneath the snippet in the notebook, hoping that it would continue downloading the files.
After doing some research I figured it out: the command was missing something to answer the prompt.
If anyone is stuck here, please change your code to something like this:
!yes | apt-get install protobuf-compiler
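An alternative to piping yes into the command is apt-get's own -y flag, which assumes "yes" for its prompts. The sketch below builds such a command from Python and runs it with subprocess. The helper names are made up for illustration, and the call still needs enough privileges to install packages.

```python
import os
import subprocess


def apt_install_command(packages):
    """Build an apt-get command that answers prompts with "yes" via -y."""
    return ["apt-get", "install", "-y", *packages]


def apt_install(packages):
    """Run the install non-interactively; raises if apt-get fails."""
    # DEBIAN_FRONTEND=noninteractive also suppresses debconf dialogs.
    env = dict(os.environ, DEBIAN_FRONTEND="noninteractive")
    return subprocess.run(apt_install_command(packages), env=env, check=True)


print(" ".join(apt_install_command(["protobuf-compiler"])))
```

In a notebook cell the equivalent one-liner would be !apt-get install -y protobuf-compiler.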

How to Install Tensorflow on Google Coral Dev Board?

How do I install TensorFlow on the Coral Dev Board?
I'm getting errors when following this on the Coral Dev Board. One of them:
    Error Message: compile.sh not found.
Please give me some additional instructions, thanks.
It is really not going to be possible to help if you don't give details on what you've done or what errors you ran into while trying to install it.
However, since the objective is to install tensorflow on the board, you can just do this using this pre-built package:
$ wget https://github.com/lhelontra/tensorflow-on-arm/releases/download/v2.0.0/tensorflow-2.0.0-cp37-none-linux_aarch64.whl
$ sudo apt-get install -y python3-dev libhdf5-dev python3-h5py
$ sudo pip3 install tensorflow-2.0.0-cp37-none-linux_aarch64.whl
Also, please note that using the full tensorflow on the board isn't going to gain any performance from the TPU.
[Edit] Apologies, forgot to credit lhelontra/tensorflow-on-arm repo for the prebuilt package!
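A wheel's filename encodes which interpreter and platform it was built for, so it can be worth checking those tags against the board before installing. The following is a rough sketch that splits a filename into its components under the standard wheel naming convention. The function name is hypothetical, and it does not check compatibility itself.

```python
def parse_wheel_filename(filename):
    """Split a wheel filename into name, version, and compatibility tags."""
    suffix = ".whl"
    if not filename.endswith(suffix):
        raise ValueError("not a wheel file: " + filename)
    parts = filename[: -len(suffix)].split("-")
    if len(parts) == 5:
        name, version, python_tag, abi_tag, platform_tag = parts
        build = None
    elif len(parts) == 6:
        # An optional build tag sits between the version and the Python tag.
        name, version, build, python_tag, abi_tag, platform_tag = parts
    else:
        raise ValueError("unexpected wheel filename layout: " + filename)
    return {
        "name": name,
        "version": version,
        "build": build,
        "python_tag": python_tag,
        "abi_tag": abi_tag,
        "platform_tag": platform_tag,
    }


info = parse_wheel_filename("tensorflow-2.0.0-cp37-none-linux_aarch64.whl")
print(info["python_tag"], info["platform_tag"])
```

For the wheel above, the tags indicate CPython 3.7 on 64-bit ARM Linux, so pip3 on the board would need to belong to a Python 3.7 interpreter for the install to be accepted.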

Can you use rmagic (rpy2) in google colaboratory?

I know Google Colaboratory doesn't yet support an R kernel. What about rmagic? Can I use rpy2?
I tried:
!pip install rpy2==2.8.6
And got:
Collecting rpy2==2.8.6
Using cached https://files.pythonhosted.org/packages/32/54/d102eec14f9cabd0df60682a38bd45c36169a1ec8fb8a690bf436cb6d758/rpy2-2.8.6.tar.gz
Complete output from command python setup.py egg_info:
Error: Tried to guess R's HOME but no command 'R' in the PATH.
----------------------------------------
Command "python setup.py egg_info" failed with error code 1 in /tmp/pip-install-3bSiiD/rpy2/
I'm guessing that it isn't working because R isn't installed on the cloud machine this notebook runs on, and that it probably isn't possible to install it. But I'm hoping I'm wrong and that someone knows of a workaround.
OK, I answered my own question. I thought for sure this would fail, but tried anyway:
!apt-get update
!apt-get install r-base
!pip install rpy2==2.8.6
And it worked!
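The original failure came from the rpy2 setup script not finding an R executable on the PATH. A small sketch for checking that precondition from a notebook before running pip is shown below. The function name is an illustrative assumption, and the which parameter exists only so the check can be exercised without R installed.

```python
import shutil


def r_is_installed(which=shutil.which):
    """Return True if an executable named 'R' can be found on PATH."""
    return which("R") is not None


if r_is_installed():
    print("R found on PATH; the rpy2 build should be able to locate it")
else:
    print("R not found; install r-base before installing rpy2")
```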

tensorboard: command not found

I installed TensorFlow from source on my MacBook Pro (macOS 10.12.5), following the steps described here:
https://www.tensorflow.org/install/install_sources
TensorFlow itself works well but I cannot run TensorBoard.
It seems tensorboard is not installed properly.
When I try running tensorboard --logdir=... it says -bash: tensorboard: command not found. And locate tensorboard returns empty.
Do I need any additional step to install tensorboard?
You could call tensorboard as a python module like this:
python3 -m tensorboard.main --logdir=~/my/training/dir
or add this to your .profile
alias tensorboard='python3 -m tensorboard.main'
If none of the other methods work, try this one.
1. Check the location of TensorFlow:
pip show tensorflow
It will show output something like this.
...
Name: tensorflow
Version: 1.4.0
Location: /home/abc/xy/.local/lib/python2.7/site-packages
...
2. Go to that location you get from the above output.
cd /home/abc/xy/.local/lib/python2.7/site-packages
There you can see a directory named tensorboard.
cd tensorboard
3. There must be a file named 'main.py'.
4. Execute the following command to launch the tensorboard.
python main.py --logdir=/path/to/log_file/
If you installed TensorFlow with Virtualenv, first check whether you have activated the tensorflow environment.
If it is activated, your command prompt will be prefixed with the environment name, e.g. (tensorflow)$
If not, run the command below and try running tensorboard again.
source ~/tensorflow/bin/activate
Run this command:
python3 -m tensorboard.main --logdir=logdir
To use your own log directory, change =logdir to the directory path, e.g. ="dir/TensorFlow".
What version of Tensorflow are you running? Older versions don't include Tensorboard.
If you do have a newer version, I see you are using OSX, which apparently caused some problems for other people: https://github.com/tensorflow/tensorflow/issues/2115 Check this page to fix it!
As a MacPorts user, I'm used to running things from out of the path
/opt/local/bin. When you install a python package via MacPorts, that's
where the executables go --- even if they're just symbolic links to
files to a main python repository in
/opt/local/Library/Frameworks/Python.framework/Versions/2.7/bin/
pip installs things into the latter directory, but apparently does NOT
add the symbolic link to /opt/local/bin
This has never been an issue (or even come up) for me before, because
I've only used pip to install (non-executable) packages which load
from within python. In conclusion, there is a
/opt/local/Library/Frameworks/Python.framework/Versions/2.7/bin/tensorboard
This is a pip / MacPorts-SOP mismatch / user error*, and nothing to do
with tensorboard in particular. Please close this issue. Thanks for
your help.
*my 'locate' database was in the process of updating but hadn't completed
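The situation described above, where pip places the tensorboard executable in a directory that is not on the PATH, can be checked directly. The following is a minimal sketch, assuming the package was installed into the current interpreter's default scheme (a pip install --user places scripts elsewhere). The function name is hypothetical.

```python
import os
import sysconfig


def scripts_dir_on_path(path_env=None, scripts_dir=None):
    """Check whether the interpreter's scripts directory is listed in PATH."""
    if scripts_dir is None:
        # Directory where this interpreter's installs put console scripts.
        scripts_dir = sysconfig.get_path("scripts")
    if path_env is None:
        path_env = os.environ.get("PATH", "")
    entries = [os.path.normpath(p) for p in path_env.split(os.pathsep) if p]
    return os.path.normpath(scripts_dir) in entries


print(sysconfig.get_path("scripts"))
print(scripts_dir_on_path())
```

If the second line prints False, adding the printed directory to PATH, or calling the module with python3 -m tensorboard.main, sidesteps the missing command.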
Quickest solution -
echo "alias tensorboard='python3 -m tensorboard.main'" >> ~/.bash_profile
After adding this to your .bash_profile you can use
tensorboard --logdir=/path
If you are using PyCharm on Windows, this may help:
python -m tensorboard.main --logdir=logs

Tensorflow with gpu support installation error - the specified --crosstool_top is not a valid cc_toolchain_suite rule

I've been trying to install tensorflow with GPU support using these steps:
http://www.nvidia.com/object/gpu-accelerated-applications-tensorflow-installation.html
and also using:
http://thelazylog.com/install-tensorflow-with-gpu-support-on-sandbox-redhat/
This is the error message that I get when I run the bazel build command for building the TensorFlow pip package (with the --config=cuda flag set):
The specified --crosstool_top '//third_party/gpus/crosstool:crosstool' is not a valid cc_toolchain_suite rule.
What's strange is that if I remove the --config=cuda flag, I don't get the error message while building and I'm able to install TensorFlow successfully, but without GPU support.
I experienced the same issue, using the nvidia instructions. What I did was to drop the git reset line in the instructions, and it works.
Details (from the error message):
Close, reopen terminal
Run git clone (again), and cd tensorflow
Run ./configure
Bazel build, etc
This may be unrelated, but I also had an issue with the .whl line: the error message said that the wheel could not be found, or something along those lines. This is in the "And finally install the TensorFlow pip package" section. To resolve it in my case, I typed the path in the terminal up to "..._pkg/tensorflow" and then pressed Tab for auto-completion. The file name that popped up was significantly longer than the one in the guide, but it worked. Also, if anyone gets a "numpy not installed" message while following the NVIDIA instructions, replace python-pip and dev with python-numpy and run that line again to install it.
Configuration: Fresh Ubuntu 16.04, GTX970M, running driver 367.48 (from CUDA installation), CUDA 8.0, CuDNN 5.1
Full setup path:
Fresh Ubuntu, with downloads and 3rd party apps selected during installation.
Control panel => Software and updates => Other Software => Canonical ticked
Install CUDA using nvidia instructions in CUDA documentation, .deb format
CuDNN 5.1 installed, the rest from the nvidia link.
I hope everything works out for you!
(I'm sorry for the poor formatting)
I was going through the same problem and recently found the solution. The problem is with the Bazel installation, which leads to this kind of error.
After installing Bazel from the installer, make sure that you add the correct path to ~/.bashrc and then activate it using
source "path-to-your-bin-directory-for-bazel"
Please change the git source version slightly, as shown below:
$ git clone https://github.com/tensorflow/tensorflow
$ cd tensorflow
# $ git reset --hard 70de76e
$ git reset --hard 287db3a
And please refer to the link below:
https://github.com/tensorflow/tensorflow/issues/4944
Also, zlib has been updated since this TF build. You need to check http://www.zlib.net/ to get the latest version and SHA-256, then update tensorflow/workspace.bzl with that information (lines 254-266 in this build). At this time, the correct version info would include the following:
url = "http://zlib.net/zlib-1.2.11.tar.gz",
sha256 = "c3e5e9fdd5004dcb542feda5ee4f0ff0744628baf8ed2dd5d66f8ca1197cb1a1",
strip_prefix = "zlib-1.2.11",
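Before editing the sha256 field, it may help to compute the digest of the archive you actually downloaded and compare it with the published value. A minimal sketch using Python's hashlib is shown below; the helper names and the example file name in the usage note are placeholders.

```python
import hashlib


def sha256_hex(data):
    """Return the SHA-256 hex digest of a bytes object."""
    return hashlib.sha256(data).hexdigest()


def sha256_of_file(path, chunk_size=1 << 20):
    """Return the SHA-256 hex digest of a file, read in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        # Read fixed-size chunks so large archives need not fit in memory.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()
```

For example, sha256_of_file("zlib-1.2.11.tar.gz") on the downloaded archive should match the sha256 string placed in tensorflow/workspace.bzl. If it does not, the download is corrupt or a different version.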