The kernel appears to have died. It will restart automatically. Jupyter notebook [duplicate] - tensorflow

I am using a MacBook Pro with M1 processor, macOS version 11.0.1, Python 3.8 in PyCharm, Tensorflow version 2.4.0rc4 (also tried 2.3.0, 2.3.1, 2.4.0rc0). I am trying to run the following code:
import tensorflow
This causes the error message:
Process finished with exit code 132 (interrupted by signal 4: SIGILL)
The code runs fine on my Windows and Linux machines.
What does the error message mean and how can I fix it?

This problem seems to happen when you have multiple Python interpreters installed and some of them are built for different architectures (x86_64 vs. arm64). You need to make sure the correct Python interpreter is being used. If you installed Apple's version of TensorFlow, it probably requires an arm64 interpreter.
If you use Rosetta (Apple's x86_64 emulator), you need to use an x86_64 Python interpreter. If you somehow load the arm64 Python interpreter, you will get the illegal instruction error (which totally makes sense).
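To connect this to the message in the question: exit code 132 follows the common shell convention of 128 plus the signal number, and signal 4 is SIGILL (illegal instruction). A minimal sketch of that decoding, where decode_exit_code is a hypothetical helper name and not part of any library:

```python
import signal


def decode_exit_code(code):
    """Return the name of the signal behind a shell-style exit code, or None.

    Assumes the "128 + signal number" convention used by common shells.
    """
    if code > 128:
        try:
            return signal.Signals(code - 128).name
        except ValueError:
            # The number does not correspond to a known signal on this platform.
            return None
    return None


if __name__ == "__main__":
    print(decode_exit_code(132))
```

On Linux and macOS this reports SIGILL for 132, which matches "interrupted by signal 4: SIGILL" in the PyCharm output.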
If you use any script that installs new python interpreters, then you need to make sure the correct interpreter for the architecture is installed (most likely arm64).
Overall, I think this problem happens because the Python environment setup is not designed for systems that can run multiple instruction sets/architectures. pip does check the architecture of packages and the host system, but it seems you can run an x86_64 interpreter to load a package meant for arm64, and this produces the problem.
For reference there is an issue in tensorflow_macos that people can check.
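One way to check which interpreter you are actually running is to ask it directly. The sketch below uses only the standard library; the sysctl.proc_translated query is the macOS setting that reports Rosetta translation, and the function names are my own, not from any package:

```python
import platform
import subprocess
import sys


def running_under_rosetta():
    """Return True/False on macOS, or None when it cannot be determined."""
    if sys.platform != "darwin":
        return None
    try:
        out = subprocess.run(
            ["sysctl", "-n", "sysctl.proc_translated"],
            capture_output=True, text=True, check=False,
        ).stdout.strip()
    except OSError:
        return None
    if out == "1":
        return True
    if out == "0":
        return False
    return None


def describe_interpreter():
    """Collect the facts that matter for an architecture mismatch."""
    return {
        "executable": sys.executable,      # which python binary is running
        "machine": platform.machine(),     # e.g. arm64 or x86_64
        "rosetta": running_under_rosetta(),
    }


if __name__ == "__main__":
    for key, value in describe_interpreter().items():
        print(f"{key}: {value}")
```

Run it with the same interpreter PyCharm or Jupyter is configured to use. If "machine" says x86_64 on an M1 Mac while the installed TensorFlow is the arm64 build, that mismatch is the likely cause.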

For M1 Macs, the following steps from the Apple developer page worked:
First, download the Conda environment (Miniforge) installer from here, then follow these instructions (assuming the script is downloaded to the ~/Downloads folder):
chmod +x ~/Downloads/Miniforge3-MacOSX-arm64.sh
sh ~/Downloads/Miniforge3-MacOSX-arm64.sh
source ~/miniforge3/bin/activate
Reload the shell and run:
python -m pip uninstall tensorflow-macos
python -m pip uninstall tensorflow-metal
conda install -c apple tensorflow-deps
python -m pip install tensorflow-macos
python -m pip install tensorflow-metal
If the above doesn't work for some reason, there are some edge cases and additional information provided at the Apple developer page
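After the install steps, it can help to confirm which of these packages the active environment really contains. This is a small sketch using importlib.metadata from the standard library (Python 3.8+); installed_versions is a hypothetical helper name:

```python
from importlib import metadata


def installed_versions(names):
    """Map each distribution name to its installed version, or None if absent."""
    versions = {}
    for name in names:
        try:
            versions[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            versions[name] = None
    return versions


if __name__ == "__main__":
    packages = ["tensorflow", "tensorflow-macos", "tensorflow-metal"]
    for name, version in installed_versions(packages).items():
        print(f"{name}: {version or 'not installed'}")
```

Run it inside the activated Miniforge environment, so it inspects the same interpreter that pip installed into.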

Installing Tensorflow version 1.15 fixed this for me.
$ conda install tensorflow==1.15

I was able to resolve this issue by using Miniforge instead of Anaconda as the Python environment. Anaconda doesn't support the arm64 architecture yet.

I had the same issue.
This is because of the M1 chip. There is now a pre-release that delivers hardware-accelerated TensorFlow and TensorFlow Addons for macOS 11.0+. Native hardware acceleration is supported on M1 Macs and Intel-based Macs through Apple’s ML Compute framework.
You need to install the TensorFlow build that supports the M1 chip. Simply pull this tensorflow_macos repository and run ./scripts/download_and_install.sh

Related

Can't install tensorflow on python 3.9

When I try to install tensorflow on python 3.9 I get following error:
ERROR: Could not find a version that satisfies the requirement tensorflow (from versions: none)
ERROR: No matching distribution found for tensorflow
Is there no TensorFlow build for 3.9?
What do you guys recommend?
Can I install other version of python beside the existing version?
Right now TensorFlow does not have a build for Python 3.9.
The latest one is for Python 3.8.
You can check the build files on PyPI:
https://pypi.org/project/tensorflow/#files
Yes, you can install another version of Python.
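As a quick sanity check before running pip, you can compare the running interpreter against the range of Python versions that has builds. The bounds below are assumptions for illustration (newest taken from this answer, oldest picked arbitrarily); check the PyPI file list for the real range of the release you want:

```python
import sys


def python_supported(version=None, oldest=(3, 5), newest=(3, 8)):
    """Return True if the (major, minor) version lies within the assumed range.

    `oldest` and `newest` are illustrative defaults, not authoritative values.
    """
    major_minor = tuple((version or sys.version_info)[:2])
    return oldest <= major_minor <= newest


if __name__ == "__main__":
    current = ".".join(str(part) for part in sys.version_info[:2])
    if python_supported():
        print(f"Python {current} is inside the assumed supported range")
    else:
        print(f"Python {current} is outside the assumed supported range")
```

If the check fails, pip has no matching wheel to offer, which is what "from versions: none" in the error is telling you.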
The original poster did not mention what type of computer or operating system he was using while attempting to install TensorFlow alongside Python 3.9. The error could be linked to working on a 64-bit Mac with the M1 chip (I recently experienced the same error described above while working on a Mac M1 in a Miniconda environment with Python 3.9.13). I solved the error by running
python3 -m pip install tensorflow-macos
from Terminal (in the Miniconda environment). TensorFlow installed normally alongside Python 3.9.13.
I do recommend installing Miniconda (or Anaconda as others have suggested), because it will allow you to easily create development environments with whatever version of Python modules or dependencies you require at the moment. See https://docs.conda.io/en/latest/miniconda.html. The larger Anaconda comes with a user-friendly 'Navigator' GUI which enables you to choose which environment is used to open a Jupyter notebook or other development environment, several of which come with Anaconda. See https://docs.anaconda.com/anaconda/install/
This is a common problem with newer versions of Python that are not yet compatible with machine learning packages.
My approach is to keep the existing Python 3.9 installation and use Anaconda to create a virtual environment with Python 3.7. When using VS Code or PyCharm, just remember to point it to that Python 3.7 environment.

Python 3.8.3 incompatible with tensorflow

I recently installed Python 3.8.3 and upgraded pip to 20.1.1. According to this page, conda install -c conda-forge tensorflow should work. However, I get this result:
Solving environment: failed with repodata from current_repodata.json, will retry with next repodata source.
Collecting package metadata (repodata.json): done
Solving environment: failed with initial frozen solve. Retrying with flexible solve.
Solving environment: -
Found conflicts! Looking for incompatible packages.
This can take several minutes. Press CTRL-C to abort.
failed
UnsatisfiableError: The following specifications were found
to be incompatible with the existing python installation in your environment:
Specifications:
- tensorflow -> python[version='3.5.*|3.6.*|>=3.5,<3.6.0a0|>=3.6,<3.7.0a0|>=3.7,<3.8.0a0|3.7.*']
Your python: python=3.8
If python is on the left-most side of the chain, that's the version you've asked for.
When python appears to the right, that indicates that the thing on the left is somehow
not available for the python version you are constrained to. Note that conda will not
change your python version to a different minor version unless you explicitly specify
that.
I get this because I am using:
(base) C:\Users\ivan>python --version
Python 3.8.3
(base) C:\Users\ivan>pip --version
pip 20.1.1 from C:\Users\ivan\anaconda3\lib\site-packages\pip (python 3.8)
I wonder if it is possible to solve this issue without downgrading. For users of anaconda 2020.07, python 3.8 is used by default. Downgrading it will break anaconda.
People have reported problems using TensorFlow with Python 3.8, so it is best to use 3.7. You are incorrect about breaking Anaconda. Here is what to do.
On the Anaconda home page, click on Environments. At the bottom left of the page, click on Create. A window will appear. Give the new environment a name (say python3.7) and select 3.7 in the drop-down menu. A new environment is now created using Python 3.7. In the conda terminal, type conda activate python3.7, then use conda to install tensorflow. It will install version 2.1.1, the CUDA toolkit version 10.1.243 and cuDNN version 7.6.5. Note that conda can only install TensorFlow up to version 2.1.1. If you want TensorFlow 2.2, install it with pip using pip install tensorflow==2.2.0 after you have installed 2.1.1. The CUDA toolkit and cuDNN work with version 2.2. Now use pip or conda to install any other packages you need in your python3.7 environment and you should be good to go!

Can I install Tensorflow 1.15 with GPU support on Ubuntu 20.04.1 LTS?

I am building a Deep Learning rig with a GeForce RTX 2060.
I want to use baselines-stable, which isn't TensorFlow 2.0 compatible yet.
According to here and here, tensorflow-gpu-1.15 is only listed as compatible with CUDA 10.0, not CUDA 10.1.
Attempting to download CUDA from Nvidia, the option for Ubuntu 20.04 is not available for CUDA 10.0.
Searching the apt-cache does not result in CUDA 10.0 either.
$ sudo apt-cache policy nvidia-cuda-toolkit
[sudo] password for lansford:
nvidia-cuda-toolkit:
Installed: (none)
Candidate: 10.1.243-3
Version table:
10.1.243-3 500
500 http://us.archive.ubuntu.com/ubuntu focal/multiverse amd64 Packages
I would highly prefer not to have to reinstall the OS with an older version of Ubuntu. However experimenting with reinforcement learning was the motive for purchasing this PC.
I see some clues that it might be possible to build tensorflow-gpu-1.15 from source with CUDA 10.1 support. I also saw a random comment that tensorflow-gpu-1.15 will just work with tf 1.15, but I don't want to make a misstep installing things until I have a signal that this is the direction to go. Uninstalling things isn't always straightforward.
Should I install CUDA 10.1 and cross my fingers that 1.15 will like it?
Should I download the installer for CUDA 10.0 for the older Ubuntu version and see if it will install anyway?
Should I attempt to compile TensorFlow from source against CUDA 10.1 (heh heh heh)?
Should I install an older version of Ubuntu and hope I don't go obsolete too quickly?
Given the situation is there a way to run tensorflow 1.15 with gpu support on Ubuntu 20.04.1?
As this also bothered me I found a working solution that I think is more versatile than using docker containers.
The main idea is from here (not to claim credit from others).
To make a working solution for Ubuntu 20.04 and TensorFlow 1.15 one needs:
Cuda 10.0 (to work with tf 1.15).
I had some trouble finding this version because it's not officially available for Ubuntu 20.04. I resorted to the Ubuntu 18.04 version, which works fine.
Archive toolkits here.
Final toolkit for Ubuntu here (obviously, no 20.04 version is available).
I chose the runfile method, which resulted in one main runfile and one patch runfile being available:
cuda_10.0.130_410.48_linux.run
cuda_10.0.130.1_linux.run
The toolkit can be safely installed using the instructions provided with no risk since each version allocates a different folder in the system (typically this would be /usr/local/cuda-10.0/).
The corresponding cuDNN for CUDA 10.0.
I had this one from a previous installation, but it shouldn't be hard to download it as well. The version I used is cudnn-10.0-linux-x64-v7.6.5.32.tgz.
Installing cuDNN basically just means copying files to the right places (nothing is actually installed, that is). So extracting the compressed file and copying its contents to the CUDA folder suffices:
$ sudo cp cuda/include/cudnn.h /usr/local/cuda-10.0/include
$ sudo cp cuda/lib64/libcudnn* /usr/local/cuda-10.0/lib64
$ sudo chmod a+r /usr/local/cuda-10.0/include/cudnn.h /usr/local/cuda-10.0/lib64/libcudnn*
Up to this point, although CUDA 10.0 is installed, the system is unaware of its presence, so all calls to it will fail as if it did not exist. We should update the relevant system environment for CUDA 10.0. One system-wide way (there are others) is to create (if it does not exist) a /etc/profile.d/cuda.sh file containing the update to the LD_LIBRARY_PATH variable. It should contain something like:
export LD_LIBRARY_PATH=/usr/local/cuda/lib64:/usr/local/cuda-11.3/lib64:/usr/local/cuda-10.0/lib64:$LD_LIBRARY_PATH
This command would normally do the work:
$ sudo sh -c 'echo export LD_LIBRARY_PATH=/usr/local/cuda/lib64:/usr/local/cuda-11.3/lib64:/usr/local/cuda-10.0/lib64:\$LD_LIBRARY_PATH > /etc/profile.d/cuda.sh'
I think this requires a restart to take effect. Anyway, this way the system will search for the relevant .so files in:
a) /usr/local/cuda/lib64 (the default symbolic link), where it will fail,
b) /usr/local/cuda-11.3/lib64 (virtually the same as the former), where it will also fail, BUT it will also search
c) /usr/local/cuda-10.0/lib64 which will be successful.
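The search order described above can be illustrated with a small simulation. This is only a sketch of the "first directory that contains the file wins" idea: the real dynamic loader also consults other sources (such as the ldconfig cache), the library name libcudart.so.10.0 is my assumption about what TensorFlow 1.15 asks for, and the directories are temporary stand-ins for the /usr/local/cuda-* folders:

```python
import os
import tempfile


def first_dir_with(library, search_path):
    """Return the first directory in a colon-separated path that contains `library`."""
    for directory in search_path.split(":"):
        if directory and os.path.isfile(os.path.join(directory, library)):
            return directory
    return None


def demo():
    """Build two fake CUDA lib folders and show which one satisfies the lookup."""
    with tempfile.TemporaryDirectory() as root:
        newer = os.path.join(root, "cuda-11.3", "lib64")
        older = os.path.join(root, "cuda-10.0", "lib64")
        os.makedirs(newer)
        os.makedirs(older)
        # Only the 10.0 folder holds the (assumed) library name.
        open(os.path.join(older, "libcudart.so.10.0"), "w").close()
        found = first_dir_with("libcudart.so.10.0", f"{newer}:{older}")
        return os.path.relpath(found, root)


if __name__ == "__main__":
    print(demo())
```

The lookup skips the 11.3 folder because the file is not there and succeeds in the 10.0 folder, which is why appending the extra directory to LD_LIBRARY_PATH is enough.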
The supported Python versions for this setup (TensorFlow 1.15 with CUDA 10.0) end at 3.7, so an older Python version should be installed. This necessarily means a virtual environment (since messing with the system Python is never a good idea).
One can install Python 3.7, for example, using this repository, which contains old (and new) versions of Python:
sudo add-apt-repository ppa:deadsnakes/ppa
sudo apt-get install python3.7
This just installs python3.7 on the system; it does not make it the default. The default is still the previous one.
Create a virtual environment and add the desired python as the default interpreter. For me this works:
virtualenv -p python3.7 ~/tensorflow_1-15
which creates a new venv with Python 3.7 in it.
Now populate with all required modules and you are set to go.
I went ahead and went with the docker approach. The Tensorflow documentation seems to be pushing in that direction anyway. Using docker only the Nvidia driver needs to be installed. You do need to have nvidia support installed in docker for it to work.
This contains the CUDA environment with the Tensorflow version so I can work with 1.15 and with the latest 2.x versions of Tensorflow on the same computer which require different CUDA versions.
It doesn't install anything besides docker stuff to get messy on the computer and difficult to pull back out.
I can still install TensorFlow natively on the computer at some point in the future, when the libraries become available without compiling from source.
Here is the command which launches jupyter and mounts the current directory from my computer to /tf/bob which shows up in jupyter.
docker run -it --mount type=bind,source="$(pwd)",target=/tf/bob -u $(id -u):$(id -g) -p 8888:8888 tensorflow/tensorflow:1.15.2-gpu-py3-jupyter

Tensorflow will not run on GPU

I'm a newbie when it comes to AWS and Tensorflow and I've been learning about CNNs over the last week via Udacity's Machine Learning course.
Now I've a need to use an AWS instance of a GPU. I launched a p2.xlarge instance of Deep Learning AMI with Source Code (CUDA 8, Ubuntu) (that's what they recommended)
But now, it seems that tensorflow is not using the GPU at all. It's still training using the CPU. I did some searching and I found some answers to this problem and none of them seemed to work.
When I run the Jupyter notebook, it still uses the CPU
What do I do to get it to run on the GPU and not the CPU?
The problem of tensorflow not detecting GPU can possibly be due to one of the following reasons.
Only the tensorflow CPU version is installed in the system.
Both tensorflow CPU and GPU versions are installed in the system, but the Python environment is preferring CPU version over GPU version.
Before proceeding to solve the issue, we assume that the installed environment is an AWS Deep Learning AMI having CUDA 8.0 and tensorflow version 1.4.1 installed. This assumption is derived from the discussion in comments.
To solve the problem, we proceed as follows:
Check the installed version of tensorflow by executing the following command from the OS terminal.
pip freeze | grep tensorflow
If only the CPU version is installed, then remove it and install the GPU version by executing the following commands.
pip uninstall tensorflow
pip install tensorflow-gpu==1.4.1
If both CPU and GPU versions are installed, then remove both of them, and install the GPU version only.
pip uninstall tensorflow
pip uninstall tensorflow-gpu
pip install tensorflow-gpu==1.4.1
At this point, if all the dependencies of tensorflow are installed correctly, the tensorflow GPU version should work fine. A common error at this stage (as encountered by OP) is a missing cuDNN library, which can result in the following error while importing tensorflow into a Python module:
ImportError: libcudnn.so.6: cannot open shared object file: No such file or directory
It can be fixed by installing the correct version of NVIDIA's cuDNN library. Tensorflow version 1.4.1 depends upon cuDNN version 6.0 and CUDA 8, so we download the corresponding version from the cuDNN archive page (Download Link). We have to log in to an NVIDIA developer account to be able to download the file, so it is not possible to download it using command line tools such as wget or curl. A possible solution is to download the file on the host system and use scp to copy it onto AWS.
Once copied to AWS, extract the file using the following command:
tar -xzvf cudnn-8.0-linux-x64-v6.0.tgz
The extracted directory should have structure similar to the CUDA toolkit installation directory. Assuming that CUDA toolkit is installed in the directory /usr/local/cuda, we can install cuDNN by copying the files from the downloaded archive into corresponding folders of CUDA Toolkit installation directory followed by linker update command ldconfig as follows:
cp cuda/include/* /usr/local/cuda/include
cp cuda/lib64/* /usr/local/cuda/lib64
ldconfig
After this, we should be able to import tensorflow GPU version into our python modules.
A few considerations:
If we are using Python3, pip should be replaced with pip3.
Depending upon user privileges, the commands pip, cp and ldconfig may need to be run with sudo.
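The decision in the first steps (CPU only, GPU only, or both installed) can be sketched as a small function over the text that pip freeze prints. classify_tensorflow_install is a hypothetical helper and the sample versions are only examples:

```python
def classify_tensorflow_install(freeze_output):
    """Classify `pip freeze` text as 'cpu-only', 'gpu-only', 'both' or 'none'."""
    found = set()
    for line in freeze_output.splitlines():
        # Lines look like "name==version"; keep only the package name.
        name = line.split("==", 1)[0].strip().lower()
        if name in ("tensorflow", "tensorflow-gpu"):
            found.add(name)
    if found == {"tensorflow", "tensorflow-gpu"}:
        return "both"
    if found == {"tensorflow-gpu"}:
        return "gpu-only"
    if found == {"tensorflow"}:
        return "cpu-only"
    return "none"


if __name__ == "__main__":
    sample = "numpy==1.13.3\ntensorflow==1.4.1\ntensorflow-gpu==1.4.1\n"
    print(classify_tensorflow_install(sample))
```

Related packages such as tensorflow-tensorboard are ignored on purpose, because only the two main distributions decide whether the CPU or GPU build gets imported.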

TensorFlow on Windows: "not a supported wheel on this platform" error

Was happy to know Tensorflow is made available for Windows and we don't have to use Docker.
I tried to install as per instructions but I get this error.
pip install --upgrade https://storage.googleapis.com/tensorflow/windows/cpu/tensorflow-0.12.0rc0-cp35-cp35m-win_amd64.whl
tensorflow-0.12.0rc0-cp35-cp35m-win_amd64.whl is not a supported wheel on this platform.
What does that error mean?
I am running latest version of Python.
python --version
Python 3.5.2
This is most likely to be a 64-bit versus 32-bit issue. The pre-built TensorFlow pip package is 64-bit only, but the default version of Python 3.5.2 on Python.org is 32-bit. You can download the 64-bit release from here (select one of the "Windows x86-64" options).
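To see why pip rejects the file, you can compare the tags in the wheel's filename with the interpreter you are running. The sketch below is a simplified check with hypothetical helper names; pip's real logic compares full compatibility tags, so treat this only as a rough diagnosis:

```python
import struct
import sys


def parse_wheel_tags(filename):
    """Split a wheel filename into (python_tag, abi_tag, platform_tag)."""
    stem = filename[:-len(".whl")] if filename.endswith(".whl") else filename
    # Layout: name-version[-build]-python-abi-platform
    return tuple(stem.split("-")[-3:])


def interpreter_summary():
    """Describe the running interpreter in the same terms as the wheel tags."""
    return {
        "python_tag": f"cp{sys.version_info.major}{sys.version_info.minor}",
        "bits": struct.calcsize("P") * 8,  # pointer size: 32 or 64
    }


def explain_mismatch(filename, python_tag, bits):
    """List the reasons this wheel would not fit the given interpreter."""
    wheel_python, _abi, wheel_platform = parse_wheel_tags(filename)
    problems = []
    if wheel_python != python_tag:
        problems.append(f"wheel is for {wheel_python}, interpreter is {python_tag}")
    if wheel_platform.endswith("amd64") and bits != 64:
        problems.append(f"wheel is 64-bit, interpreter is {bits}-bit")
    return problems


if __name__ == "__main__":
    wheel = "tensorflow-0.12.0rc0-cp35-cp35m-win_amd64.whl"
    me = interpreter_summary()
    for problem in explain_mismatch(wheel, me["python_tag"], me["bits"]):
        print(problem)
```

With a 32-bit Python 3.5.2 it reports only the 64-bit mismatch, which is the case described here; with Python 3.6 it reports the cp35 versus cp36 mismatch instead.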
It's only available for Python 3.5.x not 3.6.
You can quickly create a 3.5 environment with:
conda create -n tensorflow python=3.5
You must have a 3.5.x version of Python. The 3.6 version won't work.
If you have installed an Anaconda that contains Python 3.6, you need to downgrade its Python to 3.5.2.
Open the Anaconda Prompt as administrator, and run:
conda install python=3.5.2
After the installation is finished, you can follow the rest of the steps on tensorflow website.
Do you have Python and Anaconda installed? I had a similar issue until I uninstalled Anaconda and then the setup was fine.
I did the following steps and it worked.(Anaconda 4.4 x64)
1- Go to Windows 10 command prompt (right click and Run as admin)
2- If conda is on your PATH, you can run it from anywhere; if not, go to .../anaconda3/scripts and run the conda command from there. Then do the following (the main trick was to change 35 to 36):
1- conda create -n tensorflow python=3.5
2- activate tensorflow
3- pip install --upgrade https://storage.googleapis.com/tensorflow/windows/cpu/tensorflow-0.12.0rc0-cp36-cp36m-win_amd64.whl
The issue was fixed when I downgraded from 3.6 to 3.5 using the command below:
conda install python=3.5.2
There can be two reasons:
1) You are using a 32-bit Python package. TensorFlow does not support 32-bit, only 64-bit.
Check your system settings for this. If this is fine, refer to the second point.
2) You are using Python 3.7.
Python 3.7 isn't yet officially supported by TensorFlow. It's still in beta testing,
and very much under active development.
Consider downgrading to a lower version of python. For now, stick with Python 3.6 or 3.5.