Trouble installing the TensorFlow 2 Object Detection API

I tried to build the TensorFlow 2 Object Detection API on Ubuntu 20.04.
CUDA version: 10.1, cuDNN version: 7.6.5, TensorFlow version: 2.2.
After completing all the steps, I hit an error when running this command:
python -m pip install .
It installs a lot of packages and then stops.
pip version: 22.1.0.

Related

Error: "An error ocurred while starting the kernel" when using GPU tensorflow (spyder)

I get a kernel error when I enable my GPU for TensorFlow. I have installed TensorFlow 2.7, CUDA 11.5 and cuDNN 8.3.2, and I have tried environments with Python 3.9 and 3.8.12. The GPU is detected correctly, but for some reason the kernel stops working as soon as the epoch calculation starts. I have tried every solution I could find on the web and nothing works for me.
Console Output
If you are using Anaconda, check the installed CUDA and cuDNN versions with the following commands:
conda list cudatoolkit
conda list cudnn
If either of them is not installed, install it:
conda install -c anaconda cudatoolkit
conda install -c anaconda cudnn
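Once both packages are present, a quick check from Python can show whether TensorFlow was built with CUDA and whether it sees the GPU. This is only a sketch and not part of the original answer: the helper name `gpu_report` is hypothetical, and `tf.sysconfig.get_build_info()` is assumed to exist only on reasonably recent TensorFlow 2.x releases, so the lookup is guarded.

```python
def gpu_report():
    """Describe TensorFlow's GPU setup, or return None if TensorFlow is not importable."""
    try:
        import tensorflow as tf
    except ImportError:
        return None

    report = {
        "tf_version": tf.__version__,
        "built_with_cuda": tf.test.is_built_with_cuda(),
        "visible_gpus": [d.name for d in tf.config.list_physical_devices("GPU")],
    }
    # get_build_info() may be missing on older releases, so guard the lookup.
    get_build_info = getattr(tf.sysconfig, "get_build_info", None)
    if get_build_info is not None:
        info = get_build_info()
        report["cuda_version"] = info.get("cuda_version")
        report["cudnn_version"] = info.get("cudnn_version")
    return report


if __name__ == "__main__":
    print(gpu_report())
```

An empty `visible_gpus` list together with `built_with_cuda` being true usually points at a CUDA/cuDNN version mismatch rather than a missing driver.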

Using Object Detection API on local GPU but not last version (v2.5.0)

I am trying to use my local GPU to train an EfficientDetD0 model. I already have a good pipeline (that works on Google Colab for example), I modified it a bit to use it locally, but one problem happens every time I launch the training.
I use conda to install tensorflow-gpu with CUDA and cuDNN, which creates a TensorFlow v2.4.1 environment, but when I launch the training the Object Detection API automatically installs TensorFlow v2.5.0. As a result my environment does not use the GPU for training, because the installed CUDA and cuDNN match TensorFlow v2.4.1 and not v2.5.0.
Is there a way to make the Object Detection API use TensorFlow v2.4.1 instead of v2.5.0?
I tried many things but none of them work (training either fails or falls back to the CPU).
Here is the code that installs the dependencies and overwrites the TensorFlow version with v2.5.0:
os.system("cp object_detection/packages/tf2/setup.py .")
os.system("python -m pip install .")
SYSTEM:
gpu : Nvidia RTX 3070
os : Ubuntu 20.04 LTS
tensorflow: 2.4.1
P.S.: I used conda install -c conda-forge tensorflow-gpu to install TensorFlow, CUDA and cuDNN in my training environment because installing them manually caused a dependency problem, so I took the easy way.
EDIT : solution found explained in comments.
Follow these steps to install a specific version of TensorFlow with GPU support:
1. Create the Anaconda environment
conda create -n tf_gpu cudatoolkit=11.0
2. Activate the new Environment
source activate tf_gpu
3. Install tensorflow-gpu 2.4.1
pip install tensorflow==2.4.1
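After step 3 it may be worth confirming that the pinned version is still the one installed, since a later `python -m pip install .` can replace it. A small sketch using only the standard library (Python 3.8+); the helper name `installed_version` is hypothetical:

```python
from importlib import metadata


def installed_version(package):
    """Return the installed version string of `package`, or None if it is not installed."""
    try:
        return metadata.version(package)
    except metadata.PackageNotFoundError:
        return None


if __name__ == "__main__":
    expected = "2.4.1"  # the version pinned in step 3
    found = installed_version("tensorflow")
    if found != expected:
        print(f"tensorflow is {found!r}, expected {expected!r}")
    else:
        print("tensorflow version matches")
```

Running this before and after installing the Object Detection API shows whether the install step changed the TensorFlow version.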
Try to run object_detection without "installing" it. Don't run setup.py; just set up the necessary paths and packages manually.
Alternatively, edit setup.py so that it does not install a specific version of TF. I guess that this version is a requirement of one of the packages installed by setup.py.
I use object_detection without running setup.py or doing any "installation", and I have no problems.
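The "no installation" route can be sketched as follows. This is an example under assumptions, not the answerer's exact setup: it assumes a local clone of the TensorFlow models repository with the usual `research` and `research/slim` directories, and both `add_object_detection_paths` and the `~/models` location are hypothetical. The `.proto` files under `object_detection/protos` would still need to be compiled with `protoc`, and the remaining dependencies installed by hand.

```python
import os
import sys


def add_object_detection_paths(models_root):
    """Prepend the research and research/slim directories of a models clone to sys.path."""
    research = os.path.join(os.path.abspath(models_root), "research")
    paths = [research, os.path.join(research, "slim")]
    # Insert in reverse so that `research` ends up ahead of `research/slim`.
    for path in reversed(paths):
        if path not in sys.path:
            sys.path.insert(0, path)
    return paths


if __name__ == "__main__":
    # Hypothetical clone location; adjust to wherever the repository lives.
    add_object_detection_paths(os.path.expanduser("~/models"))
    # From here on, `import object_detection` resolves against the clone,
    # and no pip install step runs that could change the TensorFlow version.
```

Setting the `PYTHONPATH` environment variable to the same two directories has the same effect without touching the training script.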

Error importing TensorFlow: the TensorFlow library was compiled to use AVX instructions, but these aren't available on your machine

System information
- Linux Ubuntu 16.04
TensorFlow installed from binary (pip install)
TensorFlow version:
Python version: 3.5
Installed using virtualenv? pip? conda?: pip and virtualenv
Bazel version (if compiling from source):
GCC/Compiler version (if compiling from source):
CUDA/cuDNN version:
GPU model and memory:
Problem description
I was following this tutorial on using the Intel Neural Compute Stick 2 for object detection: https://towardsdatascience.com/speed-up-predictions-on-low-power-devices-using-neural-compute-stick-and-openvino-98f3ae9dcf41
In the example, I installed the prerequisites with this command:
sudo ./opt/intel/computer_vision_sdk/deployment_tools/model_optimizer/install_prerequisites/install_prerequisites.sh
TensorFlow was installed along with the prerequisites, and I also installed TensorFlow using pip install. But when I run the next command:
mo_tf.py \
--input_model ~/Downloads/ssd_mobilenet_v1_coco_2018_01_28/frozen_inference_graph.pb \
--tensorflow_use_custom_operations_config /opt/intel/computer_vision_sdk/deployment_tools/model_optimizer/extensions/front/tf/ssd_support.json \
--tensorflow_object_detection_api_pipeline_config ~/Downloads/ssd_mobilenet_v1_coco_2018_01_28/pipeline.config \
--data_type FP16
I get the following error:
F tensorflow/core/platform/cpu_feature_guard.cc:37]
The tensorflow library was compiled to use AVX instructions, but these aren't available in your machine
Aborted (core dumped)
I get the same error when I try to import tensorflow.
What should I do to solve this error?
The error message indicates that the machine does not support AVX. Is that the case? You can refer to "How to tell if a Linux machine supports AVX/AVX2 instructions?" to check.
If your machine does not support AVX, the solution is to build TensorFlow from source with the AVX optimizations disabled.
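On Linux, AVX support can also be checked from Python by reading the CPU flags in `/proc/cpuinfo`. A minimal sketch, assuming an x86 Linux machine where that file contains a `flags` line; the function names are hypothetical:

```python
def cpu_flags(cpuinfo_text):
    """Return the set of CPU feature flags found in /proc/cpuinfo-style text."""
    for line in cpuinfo_text.splitlines():
        key, _, value = line.partition(":")
        if key.strip() == "flags":
            return set(value.split())
    return set()


def supports_avx(cpuinfo_text):
    """True if the exact flag 'avx' is listed (avx2 alone does not count)."""
    return "avx" in cpu_flags(cpuinfo_text)


if __name__ == "__main__":
    try:
        with open("/proc/cpuinfo") as f:
            print("AVX supported:", supports_avx(f.read()))
    except OSError:
        # /proc/cpuinfo only exists on Linux.
        print("Could not read /proc/cpuinfo on this system")
```

If this reports no AVX support, the prebuilt pip wheels that require AVX will abort on import, which matches the error shown in the question.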

Keras with Tensorflow backend on GPU. MKL ERROR: Parameter 4 was incorrect on entry to DLASCL

I installed TensorFlow with GPU support and Keras into an environment in Anaconda (v1.6.5) using the following commands:
conda install -n EnvName tensorflow-gpu
conda install -n EnvName -c conda-forge keras-gpu
I have NVIDIA Quadro 2200K on my machine with driver v384.66, cuda-8.0, cudnn 7.0
When I run Python code with Keras, I get the following at the training stage:
Intel MKL ERROR: Parameter 4 was incorrect on entry to DLASCL.
and later
File "/home/User/anaconda3/envs/keras_gpu/lib/python3.6/site-packages/numpy/linalg/linalg.py", line 99, in _raise_linalgerror_svd_nonconvergence
    raise LinAlgError("SVD did not converge")
numpy.linalg.linalg.LinAlgError: SVD did not converge
Other relevant sources suggest checking the data for NaNs and Infs, but my data is definitely clean. By the way, the CPU version of the installation works fine; the issue occurs only when running on the GPU.
I tried reinstalling Anaconda, CUDA and numpy, but it didn't help.
The problem was the mkl package (2018.0.0): it seems to have been released recently and conflicts with the versions of some packages supplied with TensorFlow (1.3.0) and Keras (2.0.5) via conda.
So I manually downgraded mkl to v11.3.3 using Anaconda Navigator, which automatically downgraded other packages as well, and everything works now.

TensorFlow: UnsatisfiableError: The following specifications were found to be in conflict

I am trying to install TensorFlow in Anaconda with Python 2.7 on Windows 10, using conda:
conda install -c conda-forge tensorflow=1.1.0
Then, I get the error message:
- python 2.7*
- tensorflow 1.1.0* -> python 3.5*
Use 'conda info <package>' to see the dependencies for each package.
Does the message mean I need to use python 3.5?
Yes.
TensorFlow only supports version 3.5.x of Python on Windows. Note that Python 3.5.x comes with the pip3 package manager, which is the program you'll use to install TensorFlow.
There are instructions for installing TF with Anaconda on Win10 on that page.
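A small guard like the following could make the requirement explicit before attempting the install. It is only a sketch: the `(3, 5)` requirement is taken from the conda message above, and `python_matches` is a hypothetical helper.

```python
import sys


def python_matches(required, version_info=None):
    """True if the (major, minor) of the interpreter equals `required`."""
    if version_info is None:
        version_info = sys.version_info
    return tuple(version_info[:2]) == tuple(required)


if __name__ == "__main__":
    required = (3, 5)  # from "tensorflow 1.1.0* -> python 3.5*"
    if python_matches(required):
        print("Python version satisfies the package requirement")
    else:
        found = "{}.{}".format(*sys.version_info[:2])
        print("Found Python {}, but the package requires {}.{}".format(found, *required))
```

In practice the mismatch is resolved by creating a separate conda environment with the required Python version rather than changing the base interpreter.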