cannot import name 'register_extension_dtype' while importing pandas - pandas

I am trying to import pandas, but it gives me the error below. Earlier it raised different errors, which I fixed, but now I am stuck on this one. I have never had such a problem importing pandas before.

Here are your to-dos:
1. Try shutting down and then restarting the notebook.
2. If step 1 does not work, reinstall pandas using "conda install -f pandas" if you are using Anaconda. Do not forget to shut down and restart the notebook after the installation.
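Before reinstalling, it can help to confirm which pandas build the notebook kernel actually sees. The following is only a sketch (the helper name installed_version is hypothetical, not part of pandas or conda). It reads the package metadata through the standard library without importing pandas, so it still works when the import itself fails.

```python
from importlib import metadata
import sys


def installed_version(package):
    """Return the installed version string of `package`, or None if it is missing."""
    try:
        return metadata.version(package)
    except metadata.PackageNotFoundError:
        return None


if __name__ == "__main__":
    # The interpreter path shows which environment the kernel is running in.
    print("interpreter:", sys.executable)
    print("pandas:", installed_version("pandas"))
```

If the interpreter path is not the environment you reinstalled pandas into, the reinstall will not affect the notebook.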

Related

why does matplotlib refuses to show plots in pycharm [duplicate]

I am trying to plot a simple graph using pyplot, e.g.:
import matplotlib.pyplot as plt
plt.plot([1,2,3],[5,7,4])
plt.show()
but the figure does not appear and I get the following message:
UserWarning: Matplotlib is currently using agg, which is a non-GUI backend, so cannot show the figure.
I saw in several places that one had to change the configuration of matplotlib using the following:
import matplotlib
matplotlib.use('TkAgg')
import matplotlib.pyplot as plt
I did this, but then got an error message because it cannot find a module:
ModuleNotFoundError: No module named 'tkinter'
Then, I tried to install "tkinter" using pip install tkinter (inside the virtual environment), but it does not find it:
Collecting tkinter
Could not find a version that satisfies the requirement tkinter (from versions: )
No matching distribution found for tkinter
I should also mention that I am running all this on Pycharm Community Edition IDE using a virtual environment, and that my operating system is Linux/Ubuntu 18.04.
I would like to know how I can solve this problem in order to be able to display the graph.
Solution 1: is to install the GUI backend tk
I found a solution to my problem (thanks to the help of ImportanceOfBeingErnest).
All I had to do was to install tkinter through the Linux bash terminal using the following command:
sudo apt-get install python3-tk
instead of installing it with pip or directly in the virtual environment in Pycharm.
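To check whether the system package made tkinter visible to the interpreter used by the virtual environment, a small probe such as the one below may help. It is a sketch, not an official matplotlib recipe: pick_backend is a hypothetical helper, and falling back to 'Agg' is an assumption about what you want when no GUI toolkit is present.

```python
def pick_backend():
    """Return 'TkAgg' if tkinter can be imported by this interpreter, else 'Agg'."""
    try:
        import tkinter  # noqa: F401  (imported only to test availability)
    except ImportError:
        return "Agg"
    return "TkAgg"


if __name__ == "__main__":
    backend = pick_backend()
    print("suggested backend:", backend)
    # To apply it (requires matplotlib to be installed):
    #   import matplotlib
    #   matplotlib.use(backend)
```

Run it with the same interpreter that PyCharm uses for the project; a result of 'Agg' means tkinter is still not importable there.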
Solution 2: install any of the matplotlib supported GUI backends
Solution 1 works because it gives you a GUI backend, in this case TkAgg.
However, you can also fix the issue by installing any of the GUI backends that matplotlib supports, such as Qt5Agg, GTKAgg, Qt4Agg, etc.
For example, pip install pyqt5 will also fix the issue.
NOTE:
usually this error appears when you pip install matplotlib and you are trying to display a plot in a GUI window and you do not have a python module for GUI display.
The authors of matplotlib made the pypi software deps not depend on any GUI backend because some people need matplotlib without any GUI backend.
In my case, the error message was implying that I was working in a headless console. So plt.show() could not work. What worked was calling plt.savefig:
import matplotlib.pyplot as plt
plt.plot([1, 2, 3], [5, 7, 4])
plt.savefig("mygraph.png")
I found the answer on a github repository.
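The same idea can be made automatic: show the figure when a display is available and save it otherwise. The sketch below assumes a Linux session where the DISPLAY or WAYLAND_DISPLAY environment variable indicates a usable display; that check is a heuristic, and has_display and show_or_save are hypothetical helper names.

```python
import os


def has_display(env=None):
    """Heuristic: True if an X11 or Wayland display variable is set (Linux only)."""
    env = os.environ if env is None else env
    return bool(env.get("DISPLAY") or env.get("WAYLAND_DISPLAY"))


def show_or_save(path="mygraph.png"):
    """Plot the example data, then show it if a display exists, otherwise save it."""
    import matplotlib

    if not has_display():
        matplotlib.use("Agg")  # non-GUI backend, selected before plotting
    import matplotlib.pyplot as plt

    plt.plot([1, 2, 3], [5, 7, 4])
    if has_display():
        plt.show()
    else:
        plt.savefig(path)
```

Calling show_or_save() in a headless console writes mygraph.png instead of raising the warning.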
If you use Arch Linux or a derivative (such as Manjaro or Antergos), simply type:
sudo pacman -S tk
And all will work perfectly!
Simple install
pip3 install PyQt5==5.9.2
It works for me.
Try import tkinter, because PyCharm may already have installed tkinter for you (I looked at "Install tkinter for Python").
You can maybe try:
import tkinter
import matplotlib
import matplotlib.pyplot as plt
matplotlib.use('TkAgg')
plt.plot([1,2,3],[5,7,4])
plt.show()
as a tkinter-installing way
I've tried your approach; it runs without error on my computer and shows the figure, maybe because PyCharm has tkinter as a system package, so you don't need to install it. If you can't find tkinter, you can go to TkDocs to see how to install it; as it mentions, tkinter is a core package for Python.
I added %matplotlib inline
and my plot showed up in Jupyter Notebook.
The answer has been given a few times but it is not obvious, one needs to install graphics, this works.
pip3 install PyQt5
I too had this issue in PyCharm. This issue is because you don't have tkinter module in your machine.
To install follow the steps given below (select your appropriate os)
For ubuntu users
sudo apt-get install python-tk
or
sudo apt-get install python3-tk
For Centos users
sudo yum install python-tkinter
or
sudo yum install python3-tkinter
for Arch Users
sudo pacman -S tk
or
sudo pamac install tk
For Windows, use pip to install tk
After installing tkinter restart your Pycharm and run your code, it will work
This worked with R reticulate. Found it here.
1: matplotlib.use( 'tkagg' )
or
2: matplotlib$use( 'tkagg' )
For example:
import numpy as np
import matplotlib.pyplot as plt
from matplotlib import style
import matplotlib
matplotlib.use( 'tkagg' )
style.use("ggplot")
from sklearn import svm
x = [1, 5, 1.5, 8, 1, 9]
y = [2, 8, 1.8, 8, 0.6, 11]
plt.scatter(x,y)
plt.show()
If using Jupyter notebook try the following:
%matplotlib inline
This should render the plot even if not specifying the
plt.show()
command.
None of these answers worked for me using Pycharm Professional edition 2021.3
Regular matplotlib graphs did work on the scientific view, but it did not allow me to add images to the plots.
What did work for me is adding this line before I try plotting anything:
plt.switch_backend('TkAgg')
issue = “UserWarning: Matplotlib is currently using agg, which is a non-GUI backend, so cannot show the figure.”
And this worked for me
import matplotlib
import matplotlib.pyplot as plt
matplotlib.use('Qt5Agg')
For Windows 10, if using pip install tk does not work for you, try:
Download and run official python installer for windows. Even if you
already have it downloaded, run it again.
When (re)installing python, make sure you chose "advanced" options, and
set the checkbox "tcl/tk and IDLE" to true.
If you already had python installed, select the "Modify" option, and
make sure that checkbox is selected.
Source of my fix:
https://stackoverflow.com/a/59970646/2506354
I have solved it by putting matplotlib.use('TkAgg') after all import statements.
I use python 3.8.5 VSCODE and anaconda.
No other tricks worked.
I installed python3-tk , on Ubuntu 20.04 and using WSL2
import numpy as np
import matplotlib.pyplot as plt
import matplotlib
matplotlib.use( 'tkagg')
and then I installed GWSL from the Windows Store which seems to solve problem of WSL2 rendering out of the box
This will solve the issue. It works well in jupyter.
%matplotlib inline
The comment by @xicocaio should be highlighted.
tkinter is Python version-specific in the sense that sudo apt-get install python3-tk will install tkinter exclusively for your default version of Python. If you have different Python versions within various virtual environments, you will have to install tkinter for the Python version used in that virtual environment, for example sudo apt-get install python3.7-tk. Not doing this will still lead to No module named 'tkinter' errors, even after installing it for the global Python version.
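To find out which versioned package matches the interpreter inside a virtual environment, you can derive the name from the running Python. This is a sketch only: tk_package_hint is a hypothetical helper, and the pythonX.Y-tk naming pattern is taken from the python3.7-tk example above, so it is an assumption that your distribution or PPA ships a package with that name.

```python
import sys


def tk_package_hint(version_info=None):
    """Guess the apt package name for tkinter, e.g. 'python3.7-tk'.

    Assumes the distribution follows the pythonX.Y-tk naming pattern.
    """
    version_info = sys.version_info if version_info is None else version_info
    major, minor = version_info[0], version_info[1]
    return f"python{major}.{minor}-tk"


if __name__ == "__main__":
    print("interpreter:", sys.executable)
    print("try: sudo apt-get install", tk_package_hint())
```

Run it with the virtual environment activated so that sys.version_info reflects the interpreter that raises the error.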
On Mac OS, I made it work with:
import matplotlib
matplotlib.use('MacOSX')
Ubuntu 20.04 command line setup. I install the following to make Matplotlib stop throwing the error UserWarning: Matplotlib is currently using agg, which is a non-GUI backend, so cannot show the figure.
I installed python-tk through the steps:
apt-get update
apt-get install python3.8-tk
Just in case if this helps anybody.
Python version: 3.7.7
platform: Ubuntu 18.04.4 LTS
This came with the default Python version 3.6.9; however, I had installed my own Python 3.7.7 on it (built from source).
tkinter was not working even though help('modules') showed tkinter in the list.
The following steps worked for me:
1. Install the Tk development headers:
sudo apt-get install tk-dev
2. Rebuild Python. Navigate to your Python source folder and run the checks:
cd Python-3.7.7
sudo ./configure --enable-optimizations
3. Build using the make command (here 8 is the number of processors; check yours using the nproc command):
sudo make -j 8
4. Install using:
sudo make altinstall
Don't use sudo make install; it will overwrite the default 3.6.9 version, which might be messy later.
5. Check tkinter now:
python3.7 -m tkinter
A window will pop up; your tkinter is ready now.
After upgrading lots of packages (Spyder 3 to 4, Keras and Tensorflow and lots of their dependencies), I had the same problem today! I cannot figure out what happened, but the (conda-based) virtual environment that kept using Spyder 3 did not have the problem. Although installing tkinter or changing the backend, via matplotlib.use('TkAgg') as shown above, or this nice post on how to change the backend, might well resolve the problem, I don't see these as robust solutions. For me, uninstalling matplotlib and reinstalling it was magic and the problem was solved.
pip uninstall matplotlib
... then, install
pip install matplotlib
From all the above, this could be a package management problem, and BTW, I use both conda and pip, whenever feasible.
You can change the matplotlib backend from Agg to the Tkinter-based TkAgg using the command below (older matplotlib versions also accepted warn=False; that argument was removed in later releases):
matplotlib.use('TkAgg', force=True)
Works if you use some third party code in your project. It probably contains the following line
matplotlib.use('Agg')
Search for it and comment it out.
If you have no clue about what it is you are probably not using this part of the code.
Solutions about using another backend GUI may be cleaner, so choose your fighter.
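Searching a project and its vendored code for such a line can be scripted. The following is a minimal sketch using only the standard library; find_agg_calls and the regular expression are illustrative choices, and the pattern only catches the literal form .use('Agg') or .use("agg"), not backends set through configuration files or environment variables.

```python
import re
from pathlib import Path

# Matches calls such as matplotlib.use('Agg') or mpl.use("agg"), but not 'TkAgg'.
AGG_CALL = re.compile(r"""\.use\(\s*['"]agg['"]""", re.IGNORECASE)


def find_agg_calls(root):
    """Return (path, line_number, line) for each line under `root` that forces Agg."""
    hits = []
    for path in sorted(Path(root).rglob("*.py")):
        try:
            text = path.read_text(encoding="utf-8", errors="ignore")
        except OSError:
            continue  # unreadable file; skip it
        for number, line in enumerate(text.splitlines(), start=1):
            if AGG_CALL.search(line):
                hits.append((str(path), number, line.strip()))
    return hits


if __name__ == "__main__":
    for path, number, line in find_agg_calls("."):
        print(f"{path}:{number}: {line}")
```

Run it from the project root; each reported line is a candidate to comment out.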
The solution that worked for me:
Install tkinter
import tkinter into the module
make sure that matplotlib uses (TkAgg) instead of (Agg)
matplotlib.use('TkAgg')
execute the following command before plotting
%matplotlib inline
Try:
%matplotlib inline
I had the same problem and it worked for me. I tested it on my Jupyter notebooks and visual studio code, so you should have no problems.
On WSL with X server
Make sure that your X server works. Matplotlib reports this error if it can't connect to the X display.
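One way to test the connection from inside WSL is to open a TCP socket to the display named in DISPLAY. The sketch below relies on the X11 convention that display number N listens on TCP port 6000 + N; parse_display and can_reach_x_server are hypothetical helper names, and the check does not cover local Unix-socket displays such as ":0".

```python
import os
import socket


def parse_display(display):
    """Split an X11 DISPLAY string 'host:display[.screen]' into (host, tcp_port)."""
    host, _, rest = display.rpartition(":")
    number = int(rest.split(".")[0])
    return host, 6000 + number


def can_reach_x_server(display, timeout=2.0):
    """Return True if a TCP connection to the X server named by `display` succeeds."""
    host, port = parse_display(display)
    if not host:
        return False  # local Unix-socket display; not covered by this TCP check
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


if __name__ == "__main__":
    display = os.environ.get("DISPLAY", "")
    if display:
        print(display, "reachable:", can_reach_x_server(display))
    else:
        print("DISPLAY is not set")
```

A False result with a correct DISPLAY value usually points at the X server not running or at a firewall rule blocking the port.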
Windows Firewall configuration
Pay attention to the Windows firewall! I changed from WSL Debian to Ubuntu and forgot about the firewall rule.
I used this post to configure the Windows firewall rule that makes the X server work. This method avoids an overly permissive rule that would allow anyone to use your X server.
It said:
If you already had installed an X11 server, Windows may have created firewall rules that will mess with the above configuration. Search for them and delete them in "Windows Defender Firewall with Advanced Security."
You will now need to configure Windows Firewall to permit connections from WSL2 to the X11 display server. You will install the display server in the next step. We do this step first to avoid Windows Firewall from auto-creating an insecure firewall rule when you run the X11 display server. Many guides on X11 forwarding and WSL2 make this firewall rule too permissive, allowing connections from any computer to your computer. This means someone could theoretically, if they are on your same network, start sending graphical display information to your computer.
To avoid this, we will make Windows Firewall only accept internet traffic from the WSL2 instance.
To set this up, you can copy the below to a script and run it from within WSL2:
#!/bin/sh
LINUX_IP=$(ip addr | awk '/inet / && !/127.0.0.1/ {split($2,a,"/"); print a[1]}')
WINDOWS_IP=$(ip route | awk '/^default/ {print $3}')
# Elevate to administrator status then run netsh to add firewall rule
powershell.exe -Command "Start-Process netsh.exe -ArgumentList \"advfirewall firewall add rule name=X11-Forwarding dir=in action=allow program=%ProgramFiles%\VcXsrv\vcxsrv.exe localip=$WINDOWS_IP remoteip=$LINUX_IP localport=6000 protocol=tcp\" -Verb RunAs"
Manual method :
Alternatively, you can manually add the rule through a GUI by doing the following:
Open "Windows Defender Firewall with Advanced Security"
Click add new rule brings up the New Rule Wizard (next to navigate between each section):
Rule type: Custom
Program: "This program path:" %ProgramFiles%\VcXsrv\vcxsrv.exe
Protocol and ports
Protocol type: TCP
Local port: 6000
Remote port: any
Scope
Local IP address: Obtain the IP address to put in by running the below command in WSL2
ip route | awk '/^default/ {print $3}'
remote IP addresses
Obtain IP address to enter by running the below in WSL2
ip addr | awk '/inet / && !/127.0.0.1/ {split($2,a,"/"); print a[1]}'
Action: "Allow the connection"
Profile: Select Domain, Private, and Public
Name: "X11 forwarding"
Linux Mint 19. Helped for me:
sudo apt install tk-dev
P.S. Recompile python interpreter after package install.
When I ran into this error on Spyder, I changed from running my code line by line to highlighting my block of plotting code and running that all at once. Voila, the image appeared.
If you install Python versions using pyenv on Debian-based systems, be sure to run sudo apt install tk-dev before pyenv install. If the Python version is already installed, remove it with pyenv uninstall and install it again after installing tk-dev. With tk-dev in place, there is no need to set any env variables when running pyenv install.

Issue with 'pandas on spark' used with conda: "No module named 'pyspark.pandas'" even though both pyspark and pandas are installed

I have installed both Spark 3.1.3 and Anaconda 4.12.0 on Ubuntu 20.04.
I have set PYSPARK_PYTHON to be the python bin of a conda environment called my_env
export PYSPARK_PYTHON=~/anaconda3/envs/my_env/bin/python
I installed several packages on conda environment my_env using pip. Here is a portion of the output of pip freeze command:
numpy==1.22.3
pandas==1.4.1
py4j==0.10.9.3
pyarrow==7.0.0
N.B: package pyspark is not installed on the conda environment my_env. I would like to be able to launch a pyspark shell on different conda environments without having to reinstall pyspark in every environment (I would like to only modify PYSPARK_PYTHON). This would also avoids having different versions of Spark on different conda environments (which is sometimes desirable but not always).
When I launch a pyspark shell using pyspark command, I can indeed import pandas and numpy which confirms that PYSPARK_PYTHON is properly set (my_env is the only conda env with pandas and numpy installed, moreover pandas and numpy are not installed on any other python installation even outside conda, and finally if I change PYSPARK_PYTHON I am no longer able to import pandas or numpy).
Inside the pyspark shell, the following code works fine (creating and showing a toy Spark dataframe):
sc.parallelize([(1,2),(2,4),(3,5)]).toDF(["a", "b"]).show()
However, if I try to convert the above dataframe into a pandas on spark dataframe it does not work. The command
sc.parallelize([(1,2),(2,4),(3,5)]).toDF(["t", "a"]).to_pandas_on_spark()
returns:
AttributeError: 'DataFrame' object has no attribute 'to_pandas_on_spark'
I tried to first import pandas (which works fine) and then pyspark.pandas before running the above command but when I run
import pyspark.pandas as ps
I obtain the following error:
ModuleNotFoundError: No module named 'pyspark.pandas'
Any idea why this happens ?
Thanks in advance
From here, it seems that you need Apache Spark 3.2 or later, not 3.1.3. Update to 3.2 and you will have the desired API.
pip install pyspark  # pyspark.pandas requires Spark 3.2 or later
import pyspark.pandas as ps
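A quick version check makes the requirement explicit before attempting the import. This is a sketch; supports_pandas_on_spark is a hypothetical helper that only compares the major and minor parts of a version string against 3.2.

```python
def supports_pandas_on_spark(version):
    """True if a Spark version string such as '3.1.3' is at least 3.2."""
    major, minor = (int(part) for part in version.split(".")[:2])
    return (major, minor) >= (3, 2)


if __name__ == "__main__":
    try:
        import pyspark
    except ImportError:
        print("pyspark is not importable from this interpreter")
    else:
        print(pyspark.__version__, supports_pandas_on_spark(pyspark.__version__))
```

In the pyspark shell from the question, the check would return False for 3.1.3, which matches the ModuleNotFoundError.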

Python, Pandas datareader and Yahoo Error RemoteDataError: Unable to read URL

I am trying to download historical data from Yahoo using Pandas datareader. This is the code that I normally use:
import pandas_datareader as pdr
df = pdr.get_data_yahoo('SPY')
However, I started receiving this error today: RemoteDataError: Unable to read URL: https://finance.yahoo.com/quote/SPY/history?period1=1467511200&period2=1625277599&interval=1d&frequency=1d&filter=history
Does anyone know how to solve it?
Thank you very much in advance!
This has already been answered here. Since Yahoo now requires headers, pandas and pandas-datareader must be updated. Other libraries that work with pdr might give you issues until they get updated or you modify the part of the code that retrieves data.
Have a nice day ;).
pip install --upgrade pandas
pip install --upgrade pandas-datareader
If you are using Colab, run:
!pip install --upgrade pandas-datareader
...
Installing collected packages: pandas-datareader
Attempting uninstall: pandas-datareader
Found existing installation: pandas-datareader 0.9.0
Uninstalling pandas-datareader-0.9.0:
Successfully uninstalled pandas-datareader-0.9.0
Successfully installed pandas-datareader-0.10.0
WARNING: The following packages were previously imported in this runtime:
[pandas_datareader]
You must restart the runtime in order to use newly installed versions.
Go to Runtime -> Restart runtime. Then you can import pandas_datareader and check that it's the right version:
import pandas_datareader
pandas_datareader.__version__ # Should show 0.10.0

pandas_profiling.ProfileReport(dataframe) in google colab

Why doesn't pandas_profiling.ProfileReport(dataframe) work in google colab?
Returns a type error.
TypeError: concat() got an unexpected keyword argument 'join_axes'
Just use pandas-profiling version 2.7.1 and you are good to go. Run this command in Colab: !pip install pandas-profiling==2.7.1
Aishah Ismail's post on Medium may help you fix this issue.
Install the pandas-profiling package using pip.
! pip install https://github.com/pandas-profiling/pandas-profiling/archive/master.zip
Restart your kernel: go to "Runtime" in the menu and click "Reset All Runtimes".
Execute your code to create your dataframe and create the pandas profile.
import pandas as pd
import numpy as np
from pandas_profiling import ProfileReport
df = pd.read_excel('fileName.xlsx')
profile = ProfileReport(df)
profile.to_notebook_iframe()
You may need to pip install pandas-profiling if the import above does not work.
!pip install pandas-profiling==2.7.1
Re-execute your code after the pip install.
When you try to display the profile do not use .to_widgets()--it isn't working in Colab.
If the above doesn't work, I suggest switching to Jupyter Lab or Jupyter Notebook. The pandas profile dashboard works well in the Jupyter environment.
I hope this helps! Pandas-Profiling is a wonderful EDA tool and such a time saver.

ImportError: Install xlrd >= 0.9.0 for Excel support when using pd.readexcel to read .xlsx file : never happened before

Something strange is going on. Just today when trying to read in a dataframe from an xlsx file:
import pandas as pd
df = pd.read_excel('vlnew.xlsx',sheet_name='Sheet1')
I am getting the following error:
ImportError: Install xlrd >= 0.9.0 for Excel support
I am fully aware that plain and simple the instructions are to install xlrd, but I should not have to install xlrd when I was never getting this error before, and also, xlrd only applies to the old .xls file format. I am using .xlsx.
I can't understand why today all of a sudden this error is popping up. This is very strange indeed, at least to me.
Update:
When I execute this script in the Spyder IDE, I do not get the xlrd import error, but just today I ran this script in the Conda command prompt and only then does it report the xlrd error. Why are there inconsistencies between the Conda command prompt and Spyder IDE?
Try writing following command into the terminal
pip install xlrd
And then import xlrd alongside pandas:
import xlrd
import pandas as pd
I was getting an error "ImportError: Install xlrd >= 1.0.0 for Excel support" on Pycharm for below code
import pandas as pd
df2 = pd.read_excel("data.xlsx")
print(df2.head(3))
print(df2.tail(3))
Solution : pip install xlrd
It resolved error after using this.
Also no need to use "import xlrd" in program
(2021.01.18)
NOTICE: the current version of "xlrd" reads only ".xls" files.
To read ".xlsx" files, install the openpyxl package.
Then, in your Python environment (mine is "repl.it"), write
import xlrd
or
import openpyxl
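Since the reader library depends on the file format, the engine can be chosen from the file extension and passed to read_excel explicitly. The sketch below is illustrative: engine_for and the ENGINES table are hypothetical names, and the mapping covers only the two formats discussed here.

```python
from pathlib import Path

# File extension -> pandas read_excel engine, for the two formats discussed here.
ENGINES = {".xls": "xlrd", ".xlsx": "openpyxl"}


def engine_for(path):
    """Return the read_excel engine name for `path`, based on its extension."""
    suffix = Path(path).suffix.lower()
    try:
        return ENGINES[suffix]
    except KeyError:
        raise ValueError(f"no engine known for {suffix!r} files") from None


if __name__ == "__main__":
    # Usage with pandas (requires pandas plus the chosen engine to be installed):
    #   import pandas as pd
    #   df = pd.read_excel("vlnew.xlsx", engine=engine_for("vlnew.xlsx"))
    print(engine_for("vlnew.xlsx"), engine_for("titanic3.xls"))
```

Passing the engine explicitly makes a missing package show up as an error about that specific library rather than a generic message about xlrd.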
As you and others have correctly mentioned, xlrd needs to be installed; read_excel requires the xlrd package.
One possible reason for the difference between Spyder and the Conda prompt is that they use different conda environments, one of which contains the xlrd package while the other does not. This usually happens when we use different virtual environments for our work; it has happened to me many times.
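To confirm whether Spyder and the Conda prompt really use different environments, run the same small report in both and compare the output. This is a sketch; describe_env is a hypothetical helper, and the default module list is just the packages mentioned in this thread.

```python
import importlib.util
import sys


def describe_env(modules=("pandas", "xlrd", "openpyxl")):
    """Report the interpreter path and whether each module can be found by it."""
    report = {"executable": sys.executable}
    for name in modules:
        report[name] = importlib.util.find_spec(name) is not None
    return report


if __name__ == "__main__":
    for key, value in describe_env().items():
        print(f"{key}: {value}")
```

If the executable paths differ, install xlrd (or openpyxl) into the environment that reports False.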
You should try
pip install --upgrade xlrd
Just type
pip install xlrd
and use it like this
import xlrd
import pandas as pd
data=pd.read_excel('titanic3.xls')