XGBoost error for AutoML: Cannot build an XGBoost model - no backend found

I am running H2O AutoML in an Anaconda env on a CentOS 7 Linux machine (CentOS Linux release 7.7.1908 (Core)). I get this error for the XGBoost model:
Cannot build an XGBoost model - no backend found.
According to the docs, the XGBoost option should work on a CentOS machine. I am using the latest h2o, 3.28.0.3 (pip-installed in my Anaconda env).
Any help is much appreciated.
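For what it's worth, a quick way to check from Python whether a native XGBoost backend is visible to the cluster, and to keep AutoML running in the meantime by excluding XGBoost, is roughly the following sketch (the frame and column names are placeholders, not from the question):
import h2o
from h2o.automl import H2OAutoML
from h2o.estimators.xgboost import H2OXGBoostEstimator
h2o.init()
# True only if a native XGBoost backend was found for this platform
print(H2OXGBoostEstimator.available())
# Possible workaround: let AutoML skip XGBoost instead of trying to build those models
# aml = H2OAutoML(max_models=10, exclude_algos=["XGBoost"])
# aml.train(y="target", training_frame=train)   # "target" and train are placeholders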

Related

Tensorflow Loss function is NAN when using GPU

I am trying to train a custom object detection model using a pre-trained model from the TensorFlow 1 Model Zoo.
I am using the model ssd_mobilenet_v2_coco_2018_03_29.
I created a suitable environment for training following this tutorial: https://tensorflow-object-detection-api-tutorial.readthedocs.io/en/tensorflow-1.14/training.html
The thing is, when I tried to train the model using tensorflow-gpu==1.14.0, I always got an error saying Model diverged with loss = NaN.
Then I uninstalled tensorflow-gpu==1.14.0 and installed tensorflow==1.14.0 (so it did not use my GPU), and all of a sudden it started to work!
I have no idea how that is possible...
Command I am using -
python model_main.py --alsologtostderr --model_dir=models\ssd_mobilenet_v2_coco_2018_03_29\export --pipeline_config_path=models\ssd_mobilenet_v2_coco_2018_03_29\pipeline.config --num_train_steps=2000
Python version is 3.7
OS is Windows 10
My graphics card is an Nvidia GeForce RTX 3050; I used CUDA v10.0 and cuDNN v7.4.1
Any ideas ?
This is because RTX 30-series cards don't support CUDA 10. If you need TF v1 (1.15), you can install NVIDIA's TensorFlow build (1.15), which runs on CUDA 11:
pip install nvidia-pyindex
pip install nvidia-tensorflow[horovod]
Note: it only supports Python 3.6 or 3.8 (not 3.7).
https://developer.nvidia.com/blog/accelerating-tensorflow-on-a100-gpus/
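After installing nvidia-tensorflow, a quick sanity check along these lines (just a sketch) should show whether the 1.15 build actually sees the RTX card:
import tensorflow as tf
# should list the GPU and print True once the CUDA 11 build is in use
print(tf.config.experimental.list_physical_devices("GPU"))
print(tf.test.is_gpu_available())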

Trouble installing tensorflow on linux

I am having trouble importing TensorFlow in VS Code on Ubuntu Linux. I installed it using pip (my CPU is an Intel Pentium G2020), and after import tensorflow as tf I get:
[error screenshot from the original post not included]
My TensorFlow version is 2.7
and my Python version is 2.7.18.
Maybe your env is incompatible. You can try to create a virtual env for your code, such as:
conda create -n yourname_env
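Given the versions in the question (TensorFlow 2.7 but Python 2.7.18), the interpreter itself is a likely culprit: TF 2.x only ships wheels for Python 3 (3.7-3.9 for TF 2.7), so the new env needs a Python 3 interpreter. A minimal check, as a sketch:
import sys
print(sys.version)
# TensorFlow 2.7 needs Python 3.7-3.9; under 2.7.18 the import cannot work
assert sys.version_info >= (3, 7), "create the conda env with a Python 3 interpreter"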

Running Train a GPT-2 (or GPT Neo) Text-Generating Model w/ GPU on Colab

When I start "Running Train a GPT-2 (or GPT Neo) Text-Generating Model w/ GPU on Colab" in my Colab, the following errors come up:
ERROR: tensorflow 2.5.0 has requirement tensorboard~=2.5, but you'll have tensorboard 2.4.1 which is incompatible.
ERROR: pytorch-lightning 1.3.8 has requirement PyYAML<=5.4.1,>=5.1, but you'll have pyyaml 3.13 which is incompatible.
What should I do? Is it because of my Mac, or would upgrading my Colab account help?
The problem comes from the default packages installed in the Colab environment. It does not depend on the platform you are using to access Colab or on the type of your subscription.
You have to upgrade the Python packages using pip.
In general you can run shell commands like pip in Colab by prepending a ! character,
so in your case the following lines should be enough to fix the problem:
!pip install tensorboard==2.5
!pip install pyyaml==5.4.1
If you need to run more shell commands, you can use more user-friendly methods (see the answers to this question).
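After the two installs (and a runtime restart, if Colab asks for one), a quick check like the following should report the pinned versions; it assumes nothing beyond the two packages above:
import tensorboard, yaml
print(tensorboard.__version__)  # expected: 2.5.x
print(yaml.__version__)         # expected: 5.4.1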

Using Object Detection API on local GPU but not last version (v2.5.0)

I am trying to use my local GPU to train an EfficientDet-D0 model. I already have a good pipeline (it works on Google Colab, for example); I modified it a bit to use it locally, but the same problem happens every time I launch the training.
I use conda to install tensorflow-gpu with CUDA and cuDNN, but that creates a TensorFlow v2.4.1 environment, and when I launch the training the Object Detection API automatically installs TensorFlow v2.5.0. So my env is not using the GPU for training, because CUDA and cuDNN expect TensorFlow v2.4.1, not v2.5.0.
Is there a way to get the Object Detection API in v2.4.1 and not v2.5.0?
I tried many things but nothing works (training either fails or falls back to CPU training).
Here is the code that installs the dependencies and overwrites the TensorFlow version to v2.5.0:
os.system("cp object_detection/packages/tf2/setup.py .")  # copy the TF2 setup.py into the working directory
os.system("python -m pip install .")  # installing the API pulls in TensorFlow 2.5.0
SYSTEM:
gpu : Nvidia RTX 3070
os : Ubuntu 20.04 LTS
tensorflow: 2.4.1
P.S.: I went with conda install -c conda-forge tensorflow-gpu to install TensorFlow, CUDA and cuDNN in my training env, because installing them manually gave a dependency problem, so I took the easy way.
EDIT: solution found, explained in the comments.
Follow these steps to install a specific version of tensorflow-gpu:
1. Set up an Anaconda environment
conda create -n tf_gpu cudatoolkit=11.0
2. Activate the new environment
source activate tf_gpu
3. Install tensorflow-gpu 2.4.1
pip install tensorflow==2.4.1
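After these three steps, a short check like this (just a sketch) should confirm that TF 2.4.1 is installed and that the cudatoolkit from the env is found:
import tensorflow as tf
print(tf.__version__)                          # expected: 2.4.1
print(tf.config.list_physical_devices("GPU"))  # should list the RTX 3070 if CUDA/cuDNN match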
Try to run object_detection without "installing" it. Don't run setup.py; just set up the necessary paths and packages manually.
Or edit setup.py to skip installing the specific version of TF. I guess that this version is a requirement of one of the packages installed by setup.py.
I use object_detection without running setup.py or doing any "installation", without any problems.
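As a rough sketch of that "no installation" route (the ~/models path is an assumption; point it at your own clone of the TensorFlow models repo):
import os, sys
research_dir = os.path.expanduser("~/models/research")   # hypothetical checkout location
sys.path.append(research_dir)
sys.path.append(os.path.join(research_dir, "slim"))
# with the paths set manually, the API imports without pip ever replacing TensorFlow
from object_detection import model_lib_v2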

Jupyter Notebook kernel dies when importing tensorflow 1.5.0

I have read a lot of posts about this, but they all had higher TensorFlow versions and solved it by downgrading to 1.5.0. I also had a higher version and followed the advice to downgrade, but I still have the problem.
Does anyone know what to try next?
pip install h5py==2.8.0
worked for me
When trying from the command prompt, I got an error message that is (I think) not related to the TensorFlow issue:
"Warning! HDF5 library version mismatched error"
The key information in that message body was "Headers are 1.10.1, library is 1.10.2", so I downgraded the hdf5 library with conda install -c anaconda hdf5=1.10.1, and now the error message is gone and the kernel no longer dies when importing tensorflow.
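If you want to see which versions are actually in play before or after the downgrade, something like this small sketch prints what h5py reports:
import h5py
print(h5py.__version__)             # version of the h5py package itself
print(h5py.version.hdf5_version)    # version of the HDF5 library h5py is using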
I had similar problems: any TensorFlow or TensorFlow-related package (e.g. keras) made my kernel die when loading, from any interface (jupyter, spyder, console, ...).
For those having this kind of problem, try running Python from the console in verbose mode (python -v), then import tensorflow and look for errors.
I spotted errors related to h5py, similar to the reply of #DBSE. I just upgraded the h5py package and everything was solved!
If you are using a conda environment, the easiest fix for this issue is to create a new environment and install TensorFlow with a single command. I had the same issue and tried most combinations of Python and TensorFlow versions, but in the end I got it configured in a single step.
Run this command to install the GPU version:
conda create --name tf_gpu tensorflow-gpu
The above command automatically installs the versions of Python and TF that are compatible with your GPU or CPU.
For the CPU version, run this command:
conda create --name tf_env tensorflow
Both of these commands worked 100% on my system for GPU and CPU access and will download the latest versions that are compatible with your system. This also resolved the "Illegal instruction (core dumped)" error.
pip install h5py==3.1.0
This is the most up-to-date version, and it is the one that worked for me.
Try importing numpy before Keras and TensorFlow.
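In other words, an import order along these lines (a sketch of the suggestion above):
import numpy as np        # load numpy first
import tensorflow as tf   # then TensorFlow (and Keras, if you use it)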