OSError: libcudart.so.10.2: cannot open shared object file: No such file or directory - google-colaboratory

For some reason, I am getting this error on Colab, even if I don't use a GPU. Any help would be greatly appreciated! Thanks! The error message is as follows:
OSError: libcudart.so.10.2: cannot open shared object file: No such file or directory

The reason is a mismatch of CUDA versions. I ran into this issue because the preinstalled version of pytorch did not match the default version of torchaudio, which I installed using %pip install torchaudio (built for CUDA 10.2). print(torch.__version__) gives 1.10.0+cu111 (CUDA 11.1).
So I reinstalled pytorch, torchaudio and torchvision with the command stated on the pytorch website:
%pip install torch==1.10.0+cu113 torchvision==0.11.1+cu113 torchaudio==0.10.0+cu113 -f https://download.pytorch.org/whl/cu113/torch_stable.html
After restarting the environment, it should work.
This method uninstalls pytorch and reinstalls another version. It would be faster to install only the torchaudio build that matches the preinstalled version of pytorch, in my case:
%pip install -q torchaudio==0.10.0+cu111 -f https://download.pytorch.org/whl/cu111/torch_stable.html
I don't know if it would be better to install the cu113 variant.
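As a rough sketch of the matching step above: the build tag after the "+" in torch.__version__ can be read off and reused to pick the matching wheel index. The helper names (cuda_tag, wheel_index_url, install_command) are hypothetical, the URL pattern is simply the one used in the commands above, and the torchaudio version that pairs with a given torch release still has to be looked up by hand.

```python
import re
from typing import Optional


def cuda_tag(torch_version: str) -> Optional[str]:
    """Return the build tag of a PyTorch version string, e.g. 'cu111', or None."""
    match = re.search(r"\+(cu\d+|cpu)$", torch_version)
    return match.group(1) if match else None


def wheel_index_url(tag: str) -> str:
    """Build the wheel index URL, following the pattern used in the answer above."""
    return f"https://download.pytorch.org/whl/{tag}/torch_stable.html"


def install_command(package: str, package_version: str, torch_version: str) -> str:
    """Compose a notebook install command whose build tag matches the installed torch."""
    tag = cuda_tag(torch_version)
    if tag is None:
        raise ValueError(f"no build tag found in {torch_version!r}")
    return (
        f"%pip install -q {package}=={package_version}+{tag} "
        f"-f {wheel_index_url(tag)}"
    )


if __name__ == "__main__":
    # Values taken from the answer above; in a notebook, torch.__version__
    # would be passed instead of the hard-coded string.
    print(install_command("torchaudio", "0.10.0", "1.10.0+cu111"))
```

With the values from this answer, the printed command is the same cu111 install line shown above.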

Also, I would suggest checking the error logs to find out which Python package causes the error. In my case, it was raised in torch-cluster, and it was resolved simply by downgrading torch-cluster to 1.5.9 (the most recent version, 1.6.0, was released just a couple of weeks ago and was installed by default).
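One way to do that check, as a minimal sketch: when the error message or traceback contains a path under site-packages (or dist-packages), the first folder after it is usually the package that failed to load. The function name package_from_error is hypothetical, and the sample message is only an illustration of the path shape. A message without a path, like the bare libcudart one, yields nothing, so the full traceback has to be inspected instead.

```python
import re
from typing import Optional


def package_from_error(message: str) -> Optional[str]:
    """Return the top-level package folder named in an error path, if any."""
    match = re.search(r"(?:site|dist)-packages/([^/\s:]+)", message)
    return match.group(1) if match else None


if __name__ == "__main__":
    sample = (
        "OSError: /opt/conda/lib/python3.7/site-packages/torch_sparse/"
        "_version_cuda.so: undefined symbol: _ZN5torch3jit17parseSchemaOrNameERKSs"
    )
    print(package_from_error(sample))
    print(package_from_error("OSError: libcudart.so.10.2: cannot open shared object file"))
```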

I've solved it by replacing the version of torchaudio installed by pip with the one from conda.
pip uninstall torchaudio
conda install torchaudio -c pytorch
Notice the message from conda: it installs the version with the bundled CUDA library:
The following NEW packages will be INSTALLED:
torchaudio pytorch/linux-64::torchaudio-0.11.0-py38_cu113


Could not import torch_geometric, it says "undefined symbol: _ZN5torch3jit17parseSchemaOrNameERKSs"

I am trying to find a solution to the error:
OSError: /opt/conda/lib/python3.7/site-packages/torch_sparse/_version_cuda.so: undefined symbol: _ZN5torch3jit17parseSchemaOrNameERKSs.
arising from the statement from torch_geometric.data import Data in a Kaggle notebook.
There are solutions on GitHub and Stack Overflow, but none of them work.
-- "nvcc --version" shows
"nvcc: NVIDIA (R) Cuda compiler driver
Copyright (c) 2005-2020 NVIDIA Corporation
Built on Wed_Jul_22_19:09:09_PDT_2020
Cuda compilation tools, release 11.0, V11.0.221
Build cuda_11.0_bu.TC445_37.28845127_0"
I tried to install torch-geometric by
!conda install pyg -c pyg -c conda-forge
!pip install pyg-lib torch-scatter torch-sparse torch-cluster torch-spline-conv torch-geometric -f https://data.pyg.org/whl/torch-1.12.0+cu113.html
from here.
The first statement took more than 1 hour, so I moved on to the second, which installed it. But the error didn't go away.
It runs without any error in Colab.
This issue is mentioned at https://pytorch-geometric.readthedocs.io/en/latest/notes/installation.html :
undefined symbol: make_function_schema: This issue signals (1) a version conflict
between your installed PyTorch version and the ${TORCH} version
specified to install the extension packages, or (2) a version conflict
between the installed CUDA version of PyTorch and the ${CUDA} version
specified to install the extension packages. Please verify that your
PyTorch version and its CUDA version match with your installation
command:
python -c "import torch; print(torch.__version__)"
python -c "import torch; print(torch.version.cuda)"
nvcc --version
For re-installation, ensure that you do not run into any caching issues by
using the pip --force-reinstall --no-cache-dir flags. In addition, the
pip --verbose option may help to track down any issues during
installation. If you still do not find any success in installation,
please try to install the extension packages from source.
So, I would try these commands, and re-install all or part of the packages into a fresh environment.
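To make the version check concrete, here is a small sketch that derives the -f wheel index from the values printed by the two python -c commands above. The function name pyg_wheel_url is hypothetical. The URL shape follows the install command in the question, and using "cpu" when torch.version.cuda is None is an assumption. An index may not exist for every patch version, so the result should be checked before use.

```python
from typing import Optional


def pyg_wheel_url(torch_version: str, cuda_version: Optional[str]) -> str:
    """Build the PyG wheel index URL from torch.__version__ and torch.version.cuda."""
    # Drop any local build suffix such as "+cu113" from the torch version.
    base = torch_version.split("+")[0]
    # "11.3" -> "cu113"; a CPU-only build reports None (assumed to map to "cpu").
    cuda = ("cu" + cuda_version.replace(".", "")) if cuda_version else "cpu"
    return f"https://data.pyg.org/whl/torch-{base}+{cuda}.html"


if __name__ == "__main__":
    try:
        import torch
    except ImportError:
        print("torch is not installed in this interpreter")
    else:
        print(pyg_wheel_url(torch.__version__, torch.version.cuda))
```

If the printed URL differs from the one passed to pip install -f, the extension packages were built against a different PyTorch or CUDA version than the one installed.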

Install TensorFlow addons

I have a venv with the following details:
python 3.6
TensorFlow 2.0.0
I tried to install tensorflow-addons using the following:
pip install -q --no-deps tensorflow-addons~=0.6
But then I keep receiving the following error:
Could not find a version that satisfies the requirement tensorflow-addons~=0.6 (from versions: )
No matching distribution found for tensorflow-addons~=0.6
You are using pip version 18.0, however version 19.3.1 is available.
You should consider upgrading via the 'python -m pip install --upgrade pip' command.
I also tried other versions of tensorflow-addons, e.g., 0.4.0, 0.5.0, ..., but that did not work either.
I came across this problem twice, and each time I had to solve it in a different way.
1. Solution:
Upgrade pip/pip3 using the following command:
python3 -m pip install --upgrade pip
Select the appropriate version of tensorflow-addons using the compatibility matrix at the following link:
https://github.com/tensorflow/addons#python-op-compatibility-matrix
Install it using the following command:
pip install tensorflow-addons==version
2. Solution:
Go to https://pypi.org/project/tensorflow-addons/#history
Click on the appropriate version.
Click on "Download files" in the menu on the left.
Click on a .whl file that matches your system requirements/specifications.
Go to the directory where you downloaded the .whl file and run the following:
pip install tensorflow_addons-name.whl
The problem appears to have been related to installing on Windows platforms with the earlier versions of tensorflow-addons. As of the time of updating this comment, the issue should no longer occur.
In fact, the developers state that it has been solved, as shown here:
FYI stable release for windows is out. pip install tensorflow-addons
https://github.com/tensorflow/addons/issues/173#issuecomment-573106184
At your command prompt, simply specify the version you want to install.
For me, my python version is 3.7.4 and Tensorflow version is 2.2.0
Therefore, the tensorflow-addons version that matches my python and tensorflow version is 0.10.0
pip install tensorflow-addons==0.10.0
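An empty "(from versions: )" list generally means pip found no distribution files compatible with the current interpreter, platform, and pip itself. The sketch below only gathers those facts so they can be compared by hand against the compatibility matrix. The function name environment_summary is hypothetical.

```python
import platform
from typing import Dict, Optional


def pip_version() -> Optional[str]:
    """Return the version of pip importable from this interpreter, if any."""
    try:
        import pip
    except ImportError:
        return None
    return pip.__version__


def environment_summary() -> Dict[str, Optional[str]]:
    """Collect the values that decide which wheels pip considers compatible."""
    return {
        "python": platform.python_version(),
        "system": platform.system(),
        "machine": platform.machine(),
        "pip": pip_version(),
    }


if __name__ == "__main__":
    for key, value in environment_summary().items():
        print(f"{key:8} {value}")
```

Running it inside the venv shows the versions pip actually sees, which may differ from the system-wide ones.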

No module named tensorflow even after installing with pip

I'm trying to follow this guide to test this new algorithm: https://github.com/lalonderodney/SegCaps
I can't do it on my PC, so I'm using another server through PuTTY. I'm now connected to that server.
First of all, I installed TensorFlow as indicated in the guide with:
pip install -r requirements.txt
After that I ran: ./main.py segcaps.png
where segcaps.png is the image that I want to use.
Finally I ran python main.py --data_root_dir data
which is the only required parameter: the directory containing the imgs and masks folders.
Now it gives me an error:
ModuleNotFoundError: No module named 'tensorflow.python.framework'
I looked for the directory tensorflow/python/framework and it exists.
So, I don't know how to solve it. Ideas?
If you have multiple Python versions installed, then you'll (most likely) have multiple pip versions installed too. Make sure that the pip command you use installs the package(s) into the Python version you want it to. It may so happen that the package got installed into python2 but you wanted it in python3.
Since using pip did not install the packages into python3, pip3 is most likely the pip that belongs to python3. Try
pip3 install -r requirements.txt
and that should work.
In case you get an EnvironmentError, you can try this (bad idea):
pip3 install -r requirements.txt --user
This solves the problem most of the time on standalone machines. I'm not sure about the server; insufficient permissions might block this.
Why is the --user flag a bad idea? Read: What is the purpose “pip install --user …”?
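To see which interpreter is actually running and whether it can find a package, a short check like the following may help. It is a sketch: module_location is a hypothetical helper name, and the printed install command is only a suggestion that ties pip to the interpreter running the script.

```python
import importlib.util
import sys
from typing import Optional


def module_location(name: str) -> Optional[str]:
    """Return where a top-level module would be imported from, or None if missing."""
    try:
        spec = importlib.util.find_spec(name)
    except (ImportError, ValueError):
        return None
    if spec is None:
        return None
    # Namespace packages have no single origin file.
    return spec.origin or "(namespace package)"


if __name__ == "__main__":
    print("interpreter :", sys.executable)
    print("tensorflow  :", module_location("tensorflow"))
    # Running pip through the same interpreter avoids the pip/pip3 ambiguity.
    print("install with:", f"{sys.executable} -m pip install -r requirements.txt")
```

If the script is run with the same command used for main.py (python or python3) and tensorflow shows as None, the package was installed into a different Python.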
You can use pip show tensorflow to see if it is installed or not.
As for the ModuleNotFoundError, try uninstalling keras and reinstalling an earlier version with pip install keras==2.1.6

Error installing library of Scrapy in PyCharm

I can install other packages, but can't install Scrapy. I get the following errors:
warning: build_py: byte-compiling is disabled, skipping.
running build_ext
building 'lxml.etree' extension
error: Microsoft Visual C++ 10.0 is required (Unable to find vcvarsall.bat).
However, Visual C++ is installed; I have installed it numerous times. I have both x86 and 64-bit installations. I'm not sure whether any of them is 10.0, but I have the 2013-2017 versions installed.
Please upgrade pip with the following command.
python -m pip install --upgrade pip
Then install Scrapy with the following command.
pip install Scrapy
Download the latest Twisted package and install it with pip.
https://www.lfd.uci.edu/~gohlke/pythonlibs/#twisted
After that, install Scrapy.
In my case, I found that pywin32 was not installed...
So I did the following:
Download the latest Twisted package from https://www.lfd.uci.edu/~gohlke/pythonlibs/#twisted
You want the amd64 build if you have 64-bit Windows (regardless of whether the processor is Intel or not).
You can use any browser for the download, then copy/paste the file into the project folder of your current PyCharm project.
Then in PyCharm, type this:
pip install Twisted-20.3.0-cp39-cp39-win_amd64.whl
(assuming that your package was Twisted-20.3.0-cp39-cp39-win_amd64.whl)
then proceed with:
pip install Scrapy
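Picking the right .whl comes down to reading the tags in its filename. The sketch below splits a wheel filename into its parts and compares the Python tag with the running interpreter. The helper names are hypothetical. The comparison assumes CPython and a single Python tag, so compound tags such as py2.py3 are not handled.

```python
import sys
from typing import Dict


def parse_wheel_filename(filename: str) -> Dict[str, str]:
    """Split 'name-version-python-abi-platform.whl' into its components."""
    if not filename.endswith(".whl"):
        raise ValueError(f"not a wheel file: {filename!r}")
    stem = filename[: -len(".whl")]
    try:
        # The last three dash-separated fields are always the tags.
        rest, python_tag, abi_tag, platform_tag = stem.rsplit("-", 3)
    except ValueError:
        raise ValueError(f"unexpected wheel filename: {filename!r}") from None
    parts = rest.split("-")
    if len(parts) < 2:
        raise ValueError(f"unexpected wheel filename: {filename!r}")
    return {
        "name": parts[0],
        "version": parts[1],
        "python_tag": python_tag,
        "abi_tag": abi_tag,
        "platform_tag": platform_tag,
    }


def running_python_tag() -> str:
    """Tag of the current interpreter, e.g. 'cp39' (assumes CPython)."""
    return f"cp{sys.version_info.major}{sys.version_info.minor}"


def matches_interpreter(filename: str) -> bool:
    """True if the wheel's Python tag equals the running interpreter's tag."""
    return parse_wheel_filename(filename)["python_tag"] == running_python_tag()


if __name__ == "__main__":
    wheel = "Twisted-20.3.0-cp39-cp39-win_amd64.whl"
    print(parse_wheel_filename(wheel))
    print("matches this interpreter:", matches_interpreter(wheel))
```

A cp39 wheel installs only on Python 3.9, so a mismatch here means a different file should be downloaded.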

TensorFlow pip installation issue: cannot import name 'descriptor'

I'm seeing the following error when installing TensorFlow:
ImportError: Traceback (most recent call last):
File ".../graph_pb2.py", line 6, in
from google.protobuf import descriptor as _descriptor
ImportError: cannot import name 'descriptor'
This error signals a mismatch between protobuf and TensorFlow versions.
Take the following steps to fix this error:
Uninstall TensorFlow.
Uninstall protobuf (if protobuf is installed).
Reinstall TensorFlow, which will also install the correct protobuf dependency.
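Before uninstalling anything, it can help to look at what is installed and what TensorFlow itself declares as its protobuf requirement. This is a sketch using importlib.metadata from the standard library (Python 3.8+). The helper names are hypothetical, and the output depends entirely on the local environment.

```python
from importlib import metadata
from typing import List, Optional


def installed_version(dist: str) -> Optional[str]:
    """Return the installed version of a distribution, or None if absent."""
    try:
        return metadata.version(dist)
    except metadata.PackageNotFoundError:
        return None


def declared_requirements(dist: str, needle: str) -> List[str]:
    """Return the requirement strings of `dist` that mention `needle`."""
    try:
        requirements = metadata.requires(dist) or []
    except metadata.PackageNotFoundError:
        return []
    return [req for req in requirements if needle.lower() in req.lower()]


if __name__ == "__main__":
    print("tensorflow:", installed_version("tensorflow"))
    print("protobuf  :", installed_version("protobuf"))
    print("tensorflow requires:", declared_requirements("tensorflow", "protobuf"))
```

If the installed protobuf version falls outside the range TensorFlow declares, that is consistent with the mismatch described above.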
I faced a similar issue. After some trial and error, I used the command below:
pip install --upgrade --no-deps --force-reinstall tensorflow
This makes sure the package is uninstalled and reinstalled from scratch. It works!
I would be extra careful before uninstalling/reinstalling other packages such as protobuf. I think the most likely issue is a difference in versions. As of writing this, the most recent release of Python is 3.7, while TensorFlow is only compatible up to 3.6.
If you're using a 3rd party distribution like Anaconda, this can get hidden from you. In this case I would recommend creating a new environment in Anaconda, with python 3.6 and then installing tensorflow: https://conda.io/projects/conda/en/latest/user-guide/getting-started.html#managing-python
Try this:
pip uninstall protobuf
brew install protobuf
mkdir -p /Users/alexeibendebury/Library/Python/2.7/lib/python/site-packages
echo 'import site; site.addsitedir("/usr/local/lib/python2.7/site-packages")' >> /Users/alexeibendebury/Library/Python/2.7/lib/python/site-packages/homebrew.pth