Unable to use seaborn.countplot - pandas

I'm trying to plot some graphs using the latest version of PyCharm as my Python IDE.
As the interpreter, I'm using Anaconda with Python 3.4.3-0.
Using conda install, I have installed the newest versions of pandas (0.17.0), seaborn (0.6.0), numpy (1.10.1), matplotlib (1.4.3) and ipython (4.0.1).
Inside nesarc_pds.csv I have this:
IDNUM,S1Q2I
39191,1
39787,1
40082,1
40189,1
40226,1
40637,1
41306,1
41627,1
41710,1
42113,1
42120,1
42720,1
42909,1
43092,1
7,2
15,2
25,2
40,2
46,2
49,2
57,2
63,2
68,2
100,2
104,2
116,2
125,2
136,2
137,2
145,2
168,2
3787,9
6554,9
7616,9
11686,9
12431,9
14889,9
17694,9
19440,9
20141,9
21540,9
22476,9
24207,9
25762,9
29045,9
29731,9
So, that being said, this is my code:
import pandas as pd
import numpy
import seaborn as snb
import matplotlib.pyplot as plt
data = pd.read_csv("nesarc_pds.csv", low_memory=False)
#converting variable to numeric
data["S1Q2I"] = pd.to_numeric(data["S1Q2I"], errors='coerce')
#setting a new dataset...
sub1=data[(data["S1Q2I"]==1) & (data["S3BQ1A5"]==1)]
sub2 = sub1.copy()
#setting the missing data 9 = unknown into NaN
sub2["S1Q2I"] = sub2["S1Q2I"].replace(9, numpy.nan)
#setting data to categorical type
sub2["S1Q2I"] = sub2["S1Q2I"].astype('category')
#plotting
snb.countplot(x="S1Q2I", data=sub2)
plt.xlabel("blablabla")
plt.title("lalala")
And this is the error:
Traceback (most recent call last):
File "C:/Users/LPForGE_1/PycharmProjects/guido/haha.py", line 49, in <module>
snb.countplot(x="S1Q2I", data=sub2)
File "C:\Anaconda3\lib\site-packages\seaborn\categorical.py", line 2544, in countplot
errcolor)
File "C:\Anaconda3\lib\site-packages\seaborn\categorical.py", line 1263, in __init__
self.establish_colors(color, palette, saturation)
File "C:\Anaconda3\lib\site-packages\seaborn\categorical.py", line 300, in establish_colors
l = min(light_vals) * .6
ValueError: min() arg is an empty sequence
Any help would be really nice. I pretty much exhausted my intelligence trying to understand how to solve this.

Related

cannot import name 'int' from 'numpy'

I was just getting started with PyCharm and python for statistics.
And I got this error:
ImportError: cannot import name 'int' from 'numpy' (/home/tetiana/.local/lib/python3.8/site-packages/numpy/__init__.py)
Full traceback looks like this:
Traceback (most recent call last):
File "/home/tetiana/forVScode/python/first/first_try.py", line 1, in <module>
from scipy import stats
File "/usr/lib/python3/dist-packages/scipy/stats/__init__.py", line 379, in <module>
from .stats import *
File "/usr/lib/python3/dist-packages/scipy/stats/stats.py", line 180, in <module>
import scipy.special as special
File "/usr/lib/python3/dist-packages/scipy/special/__init__.py", line 643, in <module>
from .basic import *
File "/usr/lib/python3/dist-packages/scipy/special/basic.py", line 19, in <module>
from . import orthogonal
File "/usr/lib/python3/dist-packages/scipy/special/orthogonal.py", line 81, in <module>
from numpy import (exp, inf, pi, sqrt, floor, sin, cos, around, int,
ImportError: cannot import name 'int' from 'numpy' (/home/tetiana/.local/lib/python3.8/site-packages/numpy/__init__.py)
Process finished with exit code 1
How can I fix it?
Here is my code:
from scipy import stats
import pandas as pd
state = pd.read_csv('state_murder_rate_test_table.csv')
state['Population'].mean()
stats.trim_mean(state['Population'], 0.1)
state['Population'].median()
I checked that the Python versions in the OS and in the project match, and they do. I have Python 3.8.10 and my OS is Ubuntu 20.04.
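According to the current numpy documentation, there is no type called numpy.int that you can import: the alias was deprecated in NumPy 1.20 and removed in NumPy 1.24. The types that do exist are numpy.integer and numpy.int_.
The code you provided does not contain a statement like from numpy import int. The traceback shows that the import comes from SciPy's own orthogonal.py under /usr/lib/python3/dist-packages, so the system-wide SciPy is older than the NumPy installed in ~/.local. Upgrading SciPy, or installing a NumPy version older than 1.24, should bring the two back in line.
I hope this answer will be somewhat useful.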
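The traceback mixes two install locations: SciPy is loaded from /usr/lib/python3/dist-packages while NumPy comes from ~/.local. As a minimal sketch, the standard-library helper below reports where Python would load a package from without importing it. The name module_origin is a hypothetical one chosen for illustration.

```python
import importlib.util


def module_origin(name):
    """Return the file a top-level module would be loaded from, or None if it cannot be found."""
    spec = importlib.util.find_spec(name)
    return None if spec is None else spec.origin


# Compare the locations of the two packages named in the traceback.
for package in ("numpy", "scipy"):
    print(package, "->", module_origin(package))
```

If the two paths point at different site directories, as they do in the traceback, the packages were probably installed by different tools (apt and pip) and may not be compatible with each other.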

Seaborn fails to plot heatmap for a particular feature (titanic dataset)

I am working with some neural networks and I am struggling to plot a correlation heatmap for the titanic dataset using seaborn. To be concise: it seems that there is a problem with the 'n_siblings_spouses' features during the plotting. I don't know if the problem is due to the feature itself (spacing, maybe?) or if there is an intrinsic issue with seaborn.
Would it be possible to solve the issue without the need to remove the feature from the dataset?
Here is a MWE. And thanks in advance!
from __future__ import absolute_import,division,print_function,unicode_literals
import numpy as np
import pandas as pd
import matplotlib
import matplotlib.pyplot as plt
from matplotlib import rc, font_manager
%matplotlib inline
from IPython.display import clear_output
from six.moves import urllib
import tensorflow.compat.v2.feature_column as fc
import tensorflow as tf
import seaborn as sns
rc('text', usetex=True)
matplotlib.rcParams['text.latex.preamble'] = [r'\usepackage{amsmath}']
# only if needed
#!apt install texlive-fonts-recommended texlive-fonts-extra cm-super dvipng
plt.rc('font', family='serif')
# URL address of data
TRAIN_DATA_URL = "https://storage.googleapis.com/tf-datasets/titanic/train.csv"
# Downloading data
train_file_path = tf.keras.utils.get_file("train.csv", TRAIN_DATA_URL)
# Setting numpy default values.
np.set_printoptions(precision=3, suppress=True)
# Reading data
data_train = pd.read_csv(train_file_path)
print("\n TRAIN DATA SET")
print(data_train.head(),"\n")
def heatMap(df):
    #Create Correlation df
    corr = df.corr()
    #Plot figsize
    fig, ax = plt.subplots(figsize=(10, 10))
    #Generate Color Map
    colormap = sns.diverging_palette(220, 10, as_cmap=True)
    #Generate Heat Map, allow annotations and place floats in map
    sns.heatmap(corr, cmap=colormap, annot=True, fmt=".2f")
    #Apply xticks
    plt.xticks(range(len(corr.columns)), corr.columns);
    #Apply yticks
    plt.yticks(range(len(corr.columns)), corr.columns)
    #show plot
    plt.show()
heatMap(data_train)
Here is the issue that is raised when trying to execute the heatMap function (I am working in Colab. However, this also happens in console):
---------------------------------------------------------------------------
CalledProcessError Traceback (most recent call last)
/usr/local/lib/python3.7/dist-packages/matplotlib/texmanager.py in _run_checked_subprocess(self, command, tex)
305 cwd=self.texcache,
--> 306 stderr=subprocess.STDOUT)
307 except FileNotFoundError as exc:
22 frames
CalledProcessError: Command '['latex', '-interaction=nonstopmode', '--halt-on-error', '/root/.cache/matplotlib/tex.cache/bf616eae1512bede263889c8e1d8fb21.tex']' returned non-zero exit status 1.
The above exception was the direct cause of the following exception:
RuntimeError Traceback (most recent call last)
/usr/local/lib/python3.7/dist-packages/matplotlib/texmanager.py in _run_checked_subprocess(self, command, tex)
317 prog=command[0],
318 tex=tex.encode('unicode_escape'),
--> 319 exc=exc.output.decode('utf-8'))) from exc
320 _log.debug(report)
321 return report
RuntimeError: latex was not able to process the following string:
b'n_siblings_spouses'
Here is the full report generated by latex:
This is pdfTeX, Version 3.14159265-2.6-1.40.18 (TeX Live 2017/Debian) (preloaded format=latex)
restricted \write18 enabled.
entering extended mode
(/root/.cache/matplotlib/tex.cache/bf616eae1512bede263889c8e1d8fb21.tex
LaTeX2e <2017-04-15>
Babel <3.18> and hyphenation patterns for 3 language(s) loaded.
(/usr/share/texlive/texmf-dist/tex/latex/base/article.cls
Document Class: article 2014/09/29 v1.4h Standard LaTeX document class
(/usr/share/texlive/texmf-dist/tex/latex/base/size10.clo))
(/usr/share/texlive/texmf-dist/tex/latex/type1cm/type1cm.sty)
(/usr/share/texmf/tex/latex/cm-super/type1ec.sty
(/usr/share/texlive/texmf-dist/tex/latex/base/t1cmr.fd))
(/usr/share/texlive/texmf-dist/tex/latex/base/textcomp.sty
(/usr/share/texlive/texmf-dist/tex/latex/base/ts1enc.def))
(/usr/share/texlive/texmf-dist/tex/latex/base/inputenc.sty
(/usr/share/texlive/texmf-dist/tex/latex/base/utf8.def
(/usr/share/texlive/texmf-dist/tex/latex/base/t1enc.dfu)
(/usr/share/texlive/texmf-dist/tex/latex/base/ot1enc.dfu)
(/usr/share/texlive/texmf-dist/tex/latex/base/omsenc.dfu)
(/usr/share/texlive/texmf-dist/tex/latex/base/ts1enc.dfu)))
(/usr/share/texlive/texmf-dist/tex/latex/geometry/geometry.sty
(/usr/share/texlive/texmf-dist/tex/latex/graphics/keyval.sty)
(/usr/share/texlive/texmf-dist/tex/generic/oberdiek/ifpdf.sty)
(/usr/share/texlive/texmf-dist/tex/generic/oberdiek/ifvtex.sty)
(/usr/share/texlive/texmf-dist/tex/generic/ifxetex/ifxetex.sty)
Package geometry Warning: Over-specification in `h'-direction.
`width' (5058.9pt) is ignored.
Package geometry Warning: Over-specification in `v'-direction.
`height' (5058.9pt) is ignored.
) (/usr/share/texlive/texmf-dist/tex/latex/amsmath/amsmath.sty
For additional information on amsmath, use the `?' option.
(/usr/share/texlive/texmf-dist/tex/latex/amsmath/amstext.sty
(/usr/share/texlive/texmf-dist/tex/latex/amsmath/amsgen.sty))
(/usr/share/texlive/texmf-dist/tex/latex/amsmath/amsbsy.sty)
(/usr/share/texlive/texmf-dist/tex/latex/amsmath/amsopn.sty))
(./bf616eae1512bede263889c8e1d8fb21.aux)
(/usr/share/texlive/texmf-dist/tex/latex/base/ts1cmr.fd)
*geometry* driver: auto-detecting
*geometry* detected driver: dvips
! Missing $ inserted.
<inserted text>
$
l.19 {\rmfamily n_
siblings_spouses}
No pages of output.
Transcript written on bf616eae1512bede263889c8e1d8fb21.log.
<Figure size 720x720 with 2 Axes>
While looking into this problem, I found that Colab needs additional TeX-related packages. There is also an excellent answer about this on SO.
You will need to install the following:
! sudo apt-get install texlive-latex-recommended
! sudo apt-get install dvipng texlive-fonts-recommended
! wget http://mirrors.ctan.org/macros/latex/contrib/type1cm.zip
! unzip type1cm.zip -d /tmp/type1cm
! cd /tmp/type1cm/type1cm/ && sudo latex type1cm.ins
! sudo mkdir /usr/share/texmf/tex/latex/type1cm
! sudo cp /tmp/type1cm/type1cm/type1cm.sty /usr/share/texmf/tex/latex/type1cm
! sudo texhash
! sudo apt install cm-super
from __future__ import absolute_import,division,print_function,unicode_literals
import numpy as np
import pandas as pd
import matplotlib
import matplotlib.pyplot as plt
# from matplotlib import rc, font_manager
%matplotlib inline
from IPython.display import clear_output
from six.moves import urllib
import tensorflow.compat.v2.feature_column as fc
import tensorflow as tf
import seaborn as sns
# rc('text', usetex=True)
# matplotlib.rcParams['text.latex.preamble'] = [r'\usepackage{amsmath}']
# only if needed
#!apt install texlive-fonts-recommended texlive-fonts-extra cm-super dvipng
# plt.rc('font', family='serif')
# URL address of data
TRAIN_DATA_URL = "https://storage.googleapis.com/tf-datasets/titanic/train.csv"
# Downloading data
train_file_path = tf.keras.utils.get_file("/content/sample_data/train.csv", TRAIN_DATA_URL)
# Setting numpy default values.
np.set_printoptions(precision=3, suppress=True)
# Reading data
data_train = pd.read_csv(train_file_path)
print("\n TRAIN DATA SET")
print(data_train.head(),"\n")
def heatMap(df):
    #Create Correlation df
    corr = df.corr()
    print(corr)
    #Plot figsize
    fig, ax = plt.subplots(figsize=(10, 10))
    #Generate Color Map
    colormap = sns.diverging_palette(220, 10, as_cmap=True)
    #Generate Heat Map, allow annotations and place floats in map
    sns.heatmap(corr, cmap=colormap, annot=True, fmt=".2f")
    #Apply xticks
    plt.xticks(range(len(corr.columns)), corr.columns);
    #Apply yticks
    plt.yticks(range(len(corr.columns)), corr.columns)
    #show plot
    plt.show()
heatMap(data_train)
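The LaTeX log in the question stops at n_ with "Missing $ inserted", which suggests that the underscore in the column name is what LaTeX rejects when usetex is on. Leaving usetex off, as in the code above, sidesteps this. If LaTeX rendering is still wanted, one option is to escape the underscores in the labels before handing them to matplotlib. A minimal sketch, where escape_underscores is a hypothetical helper and the label list is only sample data:

```python
def escape_underscores(label):
    """Escape underscores so LaTeX treats them as literal characters in text mode."""
    return label.replace("_", r"\_")


# Sample column names; only the second one contains underscores.
labels = ["age", "n_siblings_spouses", "parch"]
safe_labels = [escape_underscores(label) for label in labels]
print(safe_labels)
```

With usetex enabled, the escaped names could then be passed to sns.heatmap through its xticklabels and yticklabels arguments instead of the raw column names.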

Importing numpy and pandas gives an error in Jupyter; the same happens in Spyder, but Spyder works after starting a new kernel

When I import numpy and pandas in Jupyter, it gives an error. The same happens in Spyder, but in Spyder it works after starting a new kernel.
import numpy as np
NameError Traceback (most recent call last)
<ipython-input-1-0aa0b027fcb6> in <module>
----> 1 import numpy as np
~\numpy.py in <module>
1 from numpy import*
2
----> 3 arr = array([1,2,3,4])
NameError: name 'array' is not defined
The NameError comes from this line:
arr=array([1,2,3,4])
Try something like this instead:
arr=np.array([1,2,3,4])
I found the error. It was a very bad mistake: my files on the C: drive included a program named numpy.py, so when importing numpy, Python was loading that file instead of the numpy module. I deleted it and everything worked fine.
Try this:
arr=np.array([1,2,3,4])
As you are using numpy as np, to create an array the following syntax is needed:
arr=np.array([1,2,3])
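The cause found above, a local numpy.py taking the place of the real package, can be checked for ahead of time. A minimal sketch using only the standard library; find_shadowing_files and its default list of names are hypothetical choices for illustration:

```python
from pathlib import Path


def find_shadowing_files(directory, package_names=("numpy", "pandas", "matplotlib")):
    """List .py files in `directory` whose names match the given package names."""
    wanted = set(package_names)
    return sorted(
        path.name
        for path in Path(directory).glob("*.py")
        if path.stem in wanted
    )


# Scan the current working directory, which is on the import path
# in an interactive session or notebook.
print(find_shadowing_files("."))
```

Any file name this prints is a candidate for renaming, since it would be imported in place of the installed package.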

ImportError: No module named 'matplotlib.pyplot'; matplotlib is not a package

I am trying to use matplotlib for real-time analysis from ECG-signals, but the problem starts even before.
I use the PyCharm IDE, currently working with Python 3.3, and my OS is Windows 8.1.
For Matplotlib I downloaded matplotlib and the dependencies (numpy, six, dateutil, pyparsing, pytz) from here (the versions for Python 3.3): http://www.lfd.uci.edu/~gohlke/pythonlibs/ and installed it in the Python33 folder.
Now if I try:
from matplotlib.pyplot import plot, show
plot(range(10))
show()
or:
import pylab
from pylab import *
xAchse=pylab.arange(0,100,1)
yAchse=pylab.array([0]*100)
fig = pylab.figure(1)
ax = fig.add_subplot(111)
ax.grid(True)
ax.set_title("Realtime Waveform Plot")
ax.set_xlabel("Time")
ax.set_ylabel("Amplitude")
ax.axis([0,100,-1.5,1.5])
line1=ax.plot(xAchse,yAchse,'-')
manager = pylab.get_current_fig_manager()
values=[]
values = [0 for x in range(100)]
Ta=0.01
fa=1.0/Ta
fcos=3.5
Konstant=cos(2*pi*fcos*Ta)
T0=1.0
T1=Konstant
def SinwaveformGenerator(arg):
    global values,T1,Konstant,T0
    #ohmegaCos=arccos(T1)/Ta
    #print "fcos=", ohmegaCos/(2*pi), "Hz"
    Tnext=((Konstant*T1)*2)-T0
    if len(values)%100>70:
        values.append(random()*2-1)
    else:
        values.append(Tnext)
    T0=T1
    T1=Tnext
def RealtimePloter(arg):
    global values
    CurrentXAxis=pylab.arange(len(values)-100,len(values),1)
    line1[0].set_data(CurrentXAxis,pylab.array(values[-100:]))
    ax.axis([CurrentXAxis.min(),CurrentXAxis.max(),-1.5,1.5])
    manager.canvas.draw()
    #manager.show()
timer = fig.canvas.new_timer(interval=20)
timer.add_callback(RealtimePloter, ())
timer2 = fig.canvas.new_timer(interval=20)
timer2.add_callback(SinwaveformGenerator, ())
timer.start()
timer2.start()
pylab.show()
For these small tests, I get two different errors. For the first one it is the following:
Traceback (most recent call last):
File "<frozen importlib._bootstrap>", line 1519, in _find_and_load_unlocked
AttributeError: 'module' object has no attribute __path__
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "C:/Users/Timo/PycharmProjects/GUI_test/matplotlib.py", line 1, in <module>
from matplotlib.pyplot import plot, show
File "C:\Users\Timo\PycharmProjects\GUI_test\matplotlib.py", line 1, in <module>
from matplotlib.pyplot import plot, show
ImportError: No module named 'matplotlib.pyplot'; matplotlib is not a package
And for the second bigger example it is this:
Traceback (most recent call last):
File "C:/Users/Timo/PycharmProjects/GUI_test/matplotlib.py", line 1, in <module>
import pylab
File "C:\Python33\lib\site-packages\pylab.py", line 1, in <module>
from matplotlib.pylab import *
File "C:\Users\Timo\PycharmProjects\GUI_test\matplotlib.py", line 4, in <module>
xAchse=pylab.arange(0,100,1)
AttributeError: 'module' object has no attribute 'arange'
Afterwards I changed the imports to the ones PyCharm wanted me to use (from matplotlib import pylab), but this only resulted in an ImportError: cannot import pylab.
The funny thing is, if I run these small tests in the Python Console they work just fine, so my guess is that it has something to do with PyCharm...
I also tried adding the exact path of matplotlib to the Path variable, but that resulted in another error.
Your current project folder C:/Users/Timo/PycharmProjects/GUI_test contains a file named matplotlib.py, which causes this issue: Python imports that file instead of the installed matplotlib package. Rename the file to anything that is not the name of a Python package.
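The message "matplotlib is not a package" fits this diagnosis: a single file named matplotlib.py is a plain module, so it cannot contain submodules such as pyplot. A minimal sketch using the standard library to check what a name resolves to; is_package is a hypothetical helper name:

```python
import importlib.util


def is_package(name):
    """Return True if `name` resolves to a package, i.e. something that can contain submodules."""
    spec = importlib.util.find_spec(name)
    return spec is not None and spec.submodule_search_locations is not None


print(is_package("json"))  # a standard-library package
print(is_package("os"))    # a single-file standard-library module
```

Running is_package("matplotlib") from inside the project would be expected to report False for as long as the local matplotlib.py is the one being found.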

How to resample a time series Pandas dataframe?

I am trying to resample 1-minute data to daily data. I have tried the following code in IPython:
import pandas as pd
import numpy as np
from pandas import Series, DataFrame, Panel
import matplotlib.pyplot as plt
%matplotlib inline
data = pd.read_csv("DATALOG_22_01_2014.csv",\
names = ['DATE','TIME','HUM1','TMP1','HUM2','TMP2','HUM3','TMP3','WS','WD'])
data.set_index(['DATE','TIME'])
data.resample('D',how=mean)
But I got the following error
---------------------------------------------------------------------------
NameError Traceback (most recent call last)
<ipython-input-75-aa63b6b16877> in <module>()
----> 1 data.resample('D', how=mean)
NameError: name 'mean' is not defined
Could you help me?
Thank you
Hugo
Try
data.resample('D', how='mean')
instead. Right now you're asking Python to pass the mean object to the resample method as the how argument, but you don't have one defined.
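To illustrate the point about the undefined name: Python evaluates the arguments before the call is made, so a bare mean fails with a NameError regardless of what the method would have done with it. A minimal sketch with a hypothetical stand-in function (resample_stub is not part of pandas):

```python
def resample_stub(rule, how=None):
    """Hypothetical stand-in for a method that accepts a rule and an aggregation name."""
    return rule, how


try:
    resample_stub('D', how=mean)  # bare name: looked up before the call happens
except NameError as exc:
    error_message = str(exc)

print(error_message)
print(resample_stub('D', how='mean'))  # a string literal needs no prior definition
```

Later pandas releases removed the how argument; there the equivalent call is data.resample('D').mean(), which also needs a datetime-like index on the data.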