How to get the most recent version of pandas in Google Colab?

I have successfully downloaded a version of Python 3.10 to Colab, and Colab says I have the most up to date version of pandas. However, whenever I import pandas, the version is 1.3.5, while I need it to be up to date.
I tried restarting the kernel, but this did not fix it. The code and its output are below, except for the long output I get when I install the newer version of Python.
!wget https://github.com/korakot/kora/releases/download/v0.10/py310.sh
!bash ./py310.sh -b -f -p /usr/local
!python -m ipykernel install --name "py310" --user
!python --version
Python 3.10.6
!pip3 install pandas --upgrade --no-deps
Requirement already satisfied: pandas in /usr/local/lib/python3.10/site-packages (1.5.2)
WARNING: Running pip as the 'root' user can result in broken permissions and conflicting behaviour with the system package manager. It is recommended to use a virtual environment instead: https://pip.pypa.io/warnings/venv
import pandas as pd
print(pd.__version__)
1.3.5
Thank you so much for your time.
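One thing worth checking here is whether the notebook kernel is actually running the interpreter that pip installed into. The sketch below is only a diagnostic under that assumption: kernel_report is a hypothetical helper, and the path passed to it is copied from the pip output above.

```python
import sys


def kernel_report(expected_site_packages):
    """Describe the interpreter that is executing this notebook cell."""
    return {
        # Path of the Python binary behind the running kernel.
        "executable": sys.executable,
        # Version of that interpreter as "major.minor.micro".
        "version": ".".join(str(part) for part in sys.version_info[:3]),
        # True only if the kernel searches the directory pip installed into.
        "on_sys_path": expected_site_packages in sys.path,
    }


# Path taken from the pip output in the question.
report = kernel_report("/usr/local/lib/python3.10/site-packages")
for key, value in report.items():
    print(f"{key}: {value}")
```

If the reported version is not 3.10, or on_sys_path is False, the kernel is importing pandas from a different installation than the one pip upgraded, which would be consistent with the 1.3.5 shown above.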

Related

tensorflow-data-validation cannot be pip installed

Since I'm moving away from pandas DataFrames to TensorFlow datasets, I'd like to use tensorflow-data-validation instead of the more traditional pandas-profiling when it comes to data exploration and validation.
However, pip install tensorflow-data-validation gives the following error:
ERROR: Could not find a version that satisfies the requirement tensorflow-data-validation (from versions: none)
ERROR: No matching distribution found for tensorflow-data-validation
What could be the problem? This old GitHub issue explains how this could be due to the Python version, but Apache Beam (on which tensorflow-data-validation presumably relies) is now fully compatible with Python 3, so it must be something else.
My environment is as follows:
Python 3.9.2
TensorFlow 2.6.0
Debian GNU/Linux 11 (bullseye)
pip 21.3
I got the same error when using Python 3.9. After downgrading to Python 3.8, pip install tensorflow-data-validation ran successfully.
Regarding your comment about Apache Beam, it looks like the Python SDK currently supports Python 3.8 (and earlier) but not yet Python 3.9.
My environment:
Python 3.8.10
TensorFlow 2.8.0
macOS Monterey (12.0.1)
pip 21.1.1
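Since the fix here came down to the interpreter version, a small guard can make that explicit before attempting the install. This is a minimal sketch: python_in_range is a hypothetical helper, and the bounds used below are assumptions for illustration (this answer only reports that 3.8 worked and 3.9 did not at the time), so check the package's current release notes for the real range.

```python
import sys


def python_in_range(min_version, max_version, current=None):
    """Return True if `current` (major, minor) lies within the inclusive range.

    When `current` is omitted, the running interpreter's version is used.
    """
    if current is None:
        current = sys.version_info[:2]
    return tuple(min_version) <= tuple(current) <= tuple(max_version)


# Assumed bounds, for illustration only.
ASSUMED_MIN = (3, 7)
ASSUMED_MAX = (3, 8)

if not python_in_range(ASSUMED_MIN, ASSUMED_MAX):
    print("This interpreter is outside the assumed supported range.")
```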
Try this
pip install --upgrade --force-reinstall tensorflow-data-validation[all]
It might be a version compatibility issue with tensorflow==2.6.0.
Try
pip install tensorflow-data-validation==1.3.0
I was able to install the tensorflow_data_validation library successfully with the command below in my Google Colab notebook.
!pip install -U tensorflow \
tensorflow-data-validation \
apache-beam[gcp]

How to update to the latest version of sklearn in colab notebook

I am a beginner in machine learning. I am using Colab as my primary development platform.
I would like to use the latest version of sklearn in my coding projects. However, Colab's sklearn version is 0.22.
Can I update the scikit-learn version in Colab?
Thank you.
You mentioned that you have version 0.22. Presumably you checked it and got the following (default) result in your Colab notebook.
!pip list | grep scikit-learn
scikit-learn 0.22.2.post1
I presume that you have already tried the following command. Because scikit-learn is already installed, pip leaves the existing version in place instead of upgrading it.
!pip install scikit-learn
Requirement already satisfied: scikit-learn in /usr/local/lib/python3.7/dist-packages (0.22.2.post1)
Requirement already satisfied: numpy>=1.11.0 in /usr/local/lib/python3.7/dist-packages (from scikit-learn) (1.19.5)
Requirement already satisfied: joblib>=0.11 in /usr/local/lib/python3.7/dist-packages (from scikit-learn) (1.0.1)
Requirement already satisfied: scipy>=0.17.0 in /usr/local/lib/python3.7/dist-packages (from scikit-learn) (1.4.1)
According to the scikit-learn documentation at the time of writing, 0.24.2 is the latest version. You can install it by pinning the version number, as in the following command.
!pip install scikit-learn==0.24.2
Then verify the scikit-learn version with the following command.
!pip list | grep scikit-learn
scikit-learn 0.24.2
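The same check can be done in Python by comparing version strings. This is a simplified sketch: release_tuple and meets_minimum are hypothetical helpers that only look at the leading numeric part of a version (so a suffix such as ".post1" is ignored); for full version semantics a dedicated parser would be the safer choice.

```python
import re


def release_tuple(version):
    """Return the leading numeric part of a version string as a tuple of ints."""
    match = re.match(r"\d+(?:\.\d+)*", version)
    if match is None:
        raise ValueError(f"no numeric release found in {version!r}")
    return tuple(int(part) for part in match.group(0).split("."))


def meets_minimum(installed, minimum):
    """Compare two versions numerically, padding the shorter one with zeros."""
    a, b = release_tuple(installed), release_tuple(minimum)
    width = max(len(a), len(b))
    a += (0,) * (width - len(a))
    b += (0,) * (width - len(b))
    return a >= b


# Version strings taken from the pip output in this answer.
print(meets_minimum("0.22.2.post1", "0.24.2"))
print(meets_minimum("0.24.2", "0.24.2"))
```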
Alternatively, you can upgrade the package in a Colab environment with the --upgrade flag.
!pip install scikit-learn --upgrade
You will need to reinstall it in each new session. As a workaround, you can save the current configuration to your Google Drive. The following script was suggested in another post.
from google.colab import drive
drive.mount('/content/gdrive')
!pip freeze --local > /content/gdrive/My\ Drive/colab_installed.txt
Use the following script to restore the environment from the file.
from google.colab import drive
drive.mount('/content/gdrive')
!pip install --upgrade --force-reinstall `cat /content/gdrive/My\ Drive/colab_installed.txt`

Where to find the existing package installed in the google Colab

I want to check whether some packages are installed in Colab. What is the specific folder where installed packages (e.g., keras) are stored?
You can use the pip tool to list installed Python packages and their locations on the system:
!pip list -v | grep [Kk]eras
# Keras 2.2.5 /usr/local/lib/python3.6/dist-packages pip
# Keras-Applications 1.0.8 /usr/local/lib/python3.6/dist-packages pip
# Keras-Preprocessing 1.1.0 /usr/local/lib/python3.6/dist-packages pip
# keras-vis 0.4.1 /usr/local/lib/python3.6/dist-packages pip
Note that in Colab and other Jupyter notebook frontends, the ! character is used to execute a shell command.
If you cannot find the package, you may have downloaded it instead of installing it.
Incorrect command:
!pip download transformers
Correct command:
!pip install transformers
Next, to find the package, run the following command:
!pip show transformers
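The location can also be looked up from Python for the interpreter that is running the notebook. A minimal sketch, assuming Python 3.8 or newer for importlib.metadata; install_location is a hypothetical helper name:

```python
from importlib import metadata


def install_location(dist_name):
    """Return the directory a distribution is installed in, or None if absent."""
    try:
        dist = metadata.distribution(dist_name)
    except metadata.PackageNotFoundError:
        return None
    # locate_file("") resolves to the directory that contains the package files.
    return str(dist.locate_file(""))


print(install_location("transformers"))
```

A result of None means the running interpreter cannot see the package, even if a download or an install into another Python succeeded.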

Transfer code from Colab to Jupyter Notebook

I am trying to transfer some code from Colab to a Jupyter notebook.
The code in Colab is:
# Use some functions from tensorflow_docs
!pip install -q git+https://github.com/tensorflow/docs
I get the error:
ERROR: Could not detect requirement name for 'git+https://github.com/tensorflow/docs', please specify one with #egg=your_package_name"
Also, if I try it without the "!" at the beginning:
pip install -q git+https://github.com/tensorflow/docs
I get an error:
File "<ipython-input-11-8fda094c7d6e>", line 5
pip install -q git+https://github.com/tensorflow/docs
^
SyntaxError: invalid syntax
Could someone please help me?
It makes sense that pip without the ! does not work, because ! is what invokes the shell from within the Jupyter/IPython environment.
I tried the following command in a Google Cloud Platform notebook (JupyterLab version 1.1.4) running Python 3.5.3:
!pip3 install -q git+https://github.com/tensorflow/docs --user
and it worked perfectly.
!pip3 freeze | grep tensorflow
tensorflow==1.15.0
tensorflow-datasets==1.2.0
tensorflow-docs==0.0.0
tensorflow-estimator==1.15.1
tensorflow-hub==0.6.0
tensorflow-io==0.8.0
tensorflow-metadata==0.15.0
tensorflow-probability==0.8.0
tensorflow-serving-api==1.14.0
Also, if I try it without the "!" at the beginning:
You need the ! at the beginning if you're trying to run a shell command. Otherwise, Jupyter will try to run it as Python, which won't work.
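If you would rather stay in plain Python than rely on the ! syntax, the install can be launched through the kernel's own interpreter. This is a sketch, not the only way to do it: pip_install_command and pip_install are hypothetical helper names, and the example only builds and prints the command rather than running it.

```python
import subprocess
import sys


def pip_install_command(requirement, quiet=False):
    """Build a pip command bound to the interpreter running this kernel."""
    command = [sys.executable, "-m", "pip", "install"]
    if quiet:
        command.append("-q")
    command.append(requirement)
    return command


def pip_install(requirement, quiet=False):
    """Run the install; raises CalledProcessError if pip exits with an error."""
    subprocess.run(pip_install_command(requirement, quiet), check=True)


command = pip_install_command("git+https://github.com/tensorflow/docs", quiet=True)
print(" ".join(command))
```

Using sys.executable rather than a bare pip means the package ends up in the same environment the notebook imports from.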
Do you have any other code in your notebook? When I tried the code below in the most recent version of Colab and python3, it worked for me:
!pip install git+https://github.com/tensorflow/docs
produced:
Collecting git+https://github.com/tensorflow/docs
Cloning https://github.com/tensorflow/docs to /tmp/pip-req-build-mrqr1fk8
Running command git clone -q https://github.com/tensorflow/docs /tmp/pip-req-build-mrqr1fk8
Requirement already satisfied (use --upgrade to upgrade): tensorflow-docs==0.0.0 from git+https://github.com/tensorflow/docs in /usr/local/lib/python3.6/dist-packages
Requirement already satisfied: astor in /usr/local/lib/python3.6/dist-packages (from tensorflow-docs==0.0.0) (0.8.0)
Requirement already satisfied: absl-py in /usr/local/lib/python3.6/dist-packages (from tensorflow-docs==0.0.0) (0.8.1)
Requirement already satisfied: six in /usr/local/lib/python3.6/dist-packages (from tensorflow-docs==0.0.0) (1.12.0)
Requirement already satisfied: pathlib2 in /usr/local/lib/python3.6/dist-packages (from tensorflow-docs==0.0.0) (2.3.5)
Requirement already satisfied: pyyaml in /usr/local/lib/python3.6/dist-packages (from tensorflow-docs==0.0.0) (3.13)
Building wheels for collected packages: tensorflow-docs
Building wheel for tensorflow-docs (setup.py) ... done
Created wheel for tensorflow-docs: filename=tensorflow_docs-0.0.0-cp36-none-any.whl size=80507 sha256=bb4cb3656cd0f5954db502b9812d3ddd49cd1186042a300813874cf1ad84fd3f
Stored in directory: /tmp/pip-ephem-wheel-cache-yl2quvxi/wheels/eb/1b/35/fce87697be00d2fc63e0b4b395b0d9c7e391a10e98d9a0d97f
Successfully built tensorflow-docs
Have you tried resetting your runtime and running the code again? Do you have anything else in your notebook?
To find the problem, I created a new environment from the Anaconda prompt with the following commands:
conda create -n regression python=3.7
conda activate regression
pip install ipykernel
python -m ipykernel install --user --name regression --display-name "regression"
conda install tensorflow-gpu
pip install keras
Then I activated the "regression" environment and started Jupyter Notebook.
The code I am using is from here.
I tried it again:
# Use seaborn for pairplot
!pip install -q seaborn
That works. But when I execute:
# Use some functions from tensorflow_docs
!pip install -q git+https://github.com/tensorflow/docs
I now get the error:
" ERROR: Error [WinError 2] The system cannot find the file specified while executing command git clone -q https://github.com/tensorflow/docs 'C:\Users\MASTER~1\AppData\Local\Temp\pip-req-build-n2je0pjv'
ERROR: Cannot find command 'git' - do you have 'git' installed and in your PATH?
"
With the code:
!pip3 install -q git+https://github.com/tensorflow/docs
I get the error:
"The command "pip3" is either misspelled or could not be found."
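Both errors say that an executable could not be found on PATH. A quick way to confirm which tools are missing before running pip is sketched below; missing_tools is a hypothetical helper, and the tool names are taken from the two commands above.

```python
import shutil


def missing_tools(tools):
    """Return the tools from `tools` that cannot be found on PATH."""
    return [tool for tool in tools if shutil.which(tool) is None]


# pip needs git to clone a git+https requirement.
print(missing_tools(["git", "pip3"]))
```

Any name that shows up in the printed list has to be installed, or added to PATH, before the corresponding command can work.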

ImportError: Can't determine version for tables

After I upgraded from Python 2 to Python 3.7, I can no longer use pandas to load HDF files. The following code worked without problems before, but after updating to Python 3.7, I get the error message "Can't determine version for tables."
My Python version is 3.7, but the previous Python 2.7 paths are still present. Please see the following:
$ python --version
Python 3.7.3
$ whereis python
python: /usr/bin/python /usr/bin/python2.7 /usr/bin/python2.7-config /usr/lib/python2.7 /usr/lib64/python2.7 /etc/python /usr/include/python2.7 /home/yun.wei/anaconda3/bin/python /home/yun.wei/anaconda3/bin/python3.7 /home/yun.wei/anaconda3/bin/python3.7-config /home/yun.wei/anaconda3/bin/python3.7m /home/yun.wei/anaconda3/bin/python3.7m-config /usr/share/man/man1/python.1.gz
Is this error due to the old Python versions?
import pandas as pd
filename = 'filename.h5'
df = pd.read_hdf(filename, key='data', mode='r')
ImportError: Can't determine version for tables
I ran into this error because I did not have pytables installed. I solved it by running
conda install pytables
Or if you use pip you should be able to run
pip install tables
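To check whether this is the cause in your case, you can test whether the tables module is importable from the interpreter that runs your script. A minimal sketch with a hypothetical helper name, has_module; it only detects a missing module, not a broken install:

```python
import importlib.util


def has_module(name):
    """Return True if `name` can be located by the current interpreter."""
    return importlib.util.find_spec(name) is not None


if has_module("tables"):
    print("PyTables can be found by this interpreter.")
else:
    print("PyTables is missing; install it with conda or pip as shown above.")
```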