Pandas does not plot if label includes Turkish Characters like ş,ö,ü - matplotlib

My problem is that I can't plot when a label includes Turkish characters such as ş, ö, ü, İ, Ğ, ğ.
It only prints this output --> matplotlib.figure.Figure at 0x7f8499386050 and does not show the graph.
How can I fix it?
Here is my code:
def draw_pie(year,y,by):
    v=data[data['yil']==year]
    v=v.sort(y,ascending=False).head(20)
    v=v.groupby(by).aggregate('sum')
    veri= v[ y ]
    veri.plot(figsize=(10,10),kind='pie', labels=v.index,
              autopct='%.2f', fontsize=20)
    plt.show()
draw_pie(2014,'toplam_hasilat','tur')
The plot is not shown because tur contains values such as 'Aşk' and 'Gençlik'.
It works fine without Turkish characters.
Thank you in advance.

Assuming you decide to use a universal encoding:
You first have to declare that the source encoding is UTF-8 (first line below). Then you have to decode the strings as UTF-8. In Python 3 this should just work; in Python 2 you can make string literals behave as they do in Python 3:
Example (Python 2.7.8)
# -*- coding: utf-8 -*-
from __future__ import unicode_literals
import matplotlib.pyplot as plt
plt.plot(range(10), range(10))
plt.title("Simple Plot şöüİĞğ")
plt.show()
You can also explicitly decode each string, but that might not be very convenient (two ways are shown below):
# -*- coding: utf-8 -*-
import matplotlib.pyplot as plt
plt.plot(range(10), range(10))
plt.title("Simple Plot şöüİĞğ".decode('utf-8'))
plt.xlabel(u"Simple Plot şöüİĞğ")
plt.show()
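For comparison, in Python 3 every str is already Unicode, so neither the coding line nor an explicit decode is needed. The following is a minimal sketch, not the asker's actual code: the column names tur and toplam_hasilat come from the question, but the values are made up, and the headless Agg backend is an assumption so that the sketch runs without a display.

```python
import matplotlib
matplotlib.use("Agg")  # assumption: headless backend, no window is opened
import matplotlib.pyplot as plt
import pandas as pd

# Hypothetical data standing in for the question's table
data = pd.DataFrame({
    "tur": ["Aşk", "Gençlik", "Komedi", "Aşk"],
    "toplam_hasilat": [120, 80, 200, 60],
})

# Sum the revenue per genre; the index holds the Turkish labels
totals = data.groupby("tur")["toplam_hasilat"].sum()

# Series.plot returns the matplotlib Axes the pie was drawn on
ax = totals.plot(kind="pie", autopct="%.2f", figsize=(6, 6))
ax.figure.canvas.draw()

# Collect the text objects on the axes (wedge labels and percentages)
labels = [text.get_text() for text in ax.texts]
print(labels)
```

With an interactive backend, calling plt.show() instead of canvas.draw() would display the figure.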

I've used the solution for my problem as follows.
I was trying to read data containing Turkish characters from Excel. Pandas failed to plot the data in my Jupyter notebook. Then I added
# -*- coding: utf-8 -*-
from __future__ import unicode_literals
to the first cell and reran all cells. Then I got my plots.

Related

Unable to generate plot using matplotlib

I am a beginner in Python and am experimenting with a plot. The script runs fine, but the plot does not show up.
The matplotlib and numpy libraries are installed.
import h5py
import numpy as np
f= h5py.File('3DIMG_05JUN2021_0000_L3B_HEM_DLY.h5','r')
#Studying the structure of the file by printing what HDF5 groups are present
for key in f.keys():
    print(key) #Names of the groups in HDF5 file.
# will print the variables in the file
#Get the HDF5 group
ls=list(f.keys())
print("ls")
print(ls)
tsurf = f['HEM_DLY'][:]
print("tsurf")
print(tsurf)
tsurf1=np.squeeze(tsurf)
print(tsurf1.shape)
import matplotlib.pyplot as plt
im= plt.plot(tsurf1)
#plt.colorbar()
plt.imshow(im)
I am running Python 3 on Ubuntu.
It is difficult to give an exact answer without the dataset (please update the question with it), but plt.plot certainly does not return an object that can be displayed with plt.imshow.
Try instead:
ax = plt.plot(tsurf1)
plt.show()
The error is probably in the final plotting call. Try this:
import h5py
import numpy as np
import matplotlib.pyplot as plt
f= h5py.File('/path','r')
ls=list(f.keys())
tsurf = f['your_key_str'][:]
tsurf1=np.squeeze(tsurf)
im= plt.plot(tsurf1)
plt.show() # <-- plt.show(), NOT plt.imshow()
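Which call fits depends on the shape of the squeezed array: plt.imshow expects the array itself (2-D data), while plt.plot draws lines. Below is a hedged sketch with synthetic data, since the original dataset is not available; the helper name show_array and the Agg backend are assumptions made for illustration only.

```python
import matplotlib
matplotlib.use("Agg")  # assumption: headless backend for a runnable sketch
import matplotlib.pyplot as plt
import numpy as np

def show_array(arr):
    """Draw a 2-D array as an image and a 1-D array as a line (hypothetical helper)."""
    arr = np.squeeze(np.asarray(arr))
    fig, ax = plt.subplots()
    if arr.ndim == 2:
        # Pass the array itself to imshow, not the return value of plot
        image = ax.imshow(arr)
        fig.colorbar(image, ax=ax)
    elif arr.ndim == 1:
        ax.plot(arr)
    else:
        raise ValueError("expected 1-D or 2-D data after squeeze")
    return fig, ax

# Synthetic stand-in for a dataset with a leading axis of length 1
grid = np.arange(20, dtype=float).reshape(1, 4, 5)
fig, ax = show_array(grid)
fig.savefig("array_preview.png")
```

In an interactive session, plt.show() after the call would open the figure window.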

change decimal point to comma in matplotlib plot

I'm using Python 2.7.13 with matplotlib 2.0.0 on Debian. I want to change the decimal marker to a comma in my matplotlib plot, on both the axes and the annotations. However, the solution posted here does not work for me. The locale option successfully changes the decimal point, but the change is not applied in the plot. How can I fix it? I would like to use the locale option in combination with the rcParams setup. Thank you for your help.
#!/usr/bin/env python
# -*- coding: utf-8 -*-
import numpy as np
#Locale settings
import locale
# Set to German locale to get comma decimal separator
locale.setlocale(locale.LC_NUMERIC, 'de_DE.UTF-8')
print locale.localeconv()
import matplotlib.pyplot as plt
#plt.rcdefaults()
# Tell matplotlib to use the locale we set above
plt.rcParams['axes.formatter.use_locale'] = True
# make the figure and axes
fig,ax = plt.subplots(1)
# Some example data
x=np.arange(0,10,0.1)
y=np.sin(x)
# plot the data
ax.plot(x,y,'b-')
ax.plot([0,10],[0.8,0.8],'k-')
ax.text(2.3,0.85,0.8)
plt.savefig('test.png')
Here is the produced output: plot with point as decimal separator
I think that the answer lies in using Python's formatted print, see Format Specification Mini-Language. I quote:
Type: 'n'
Meaning: Number. This is the same as 'g', except that it uses the current locale setting to insert the appropriate number separator characters.
For example
import locale
locale.setlocale(locale.LC_ALL, 'de_DE')
'{0:n}'.format(1.1)
Gives '1,1'.
This can be applied to your example using matplotlib.ticker. It allows you to specify the print format for the ticks along the axis. Your example then becomes:
import numpy as np
import matplotlib.pyplot as plt
import matplotlib.ticker as ticker
import locale
# apply German locale settings
locale.setlocale(locale.LC_ALL, 'de_DE')
# make the figure and axes
fig, ax = plt.subplots()
# some example data
x = np.arange(0,10,0.1)
y = np.sin(x)
# plot the data
ax.plot(x, y, 'b-')
ax.plot([0,10],[0.8,0.8],'k-')
# plot annotation
ax.text(2.3,0.85,'{:#.2n}'.format(0.8))
# reformat y-axis entries
ax.yaxis.set_major_formatter(ticker.StrMethodFormatter('{x:#.2n}'))
# save
plt.savefig('test.png')
plt.show()
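This results in a plot in which the y-axis tick labels and the annotation use a comma as the decimal separator.
Note that there is one slightly disappointing thing: like g, the n format treats the precision as a number of significant digits, so a fixed number of decimal places apparently cannot be set with it. See this answer.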
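If a fixed number of decimal places matters more than full locale support, one possible workaround is to format with the f type and swap the separator by hand. This is only a sketch: the function name comma_decimal and its decimals parameter are made up here, and it ignores the locale entirely (it does not insert thousands separators).

```python
def comma_decimal(value, pos=None, decimals=2):
    """Format value with a fixed number of decimals and a comma separator.

    The (value, pos) signature matches what matplotlib.ticker.FuncFormatter
    passes to its callback; pos is unused in this sketch.
    """
    # Fixed-point formatting first, then replace the decimal point
    return "{:.{d}f}".format(value, d=decimals).replace(".", ",")

print(comma_decimal(0.8))                 # 0,80
print(comma_decimal(2.3456, decimals=3))  # 2,346
```

It could be attached to an axis with ax.yaxis.set_major_formatter(ticker.FuncFormatter(comma_decimal)) and used directly for annotations, for example ax.text(2.3, 0.85, comma_decimal(0.8)).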

Matplotlib: emoji font does not work when using backend_pdf

I want to use the emoji font "Symbola.ttf" to label my plots. This works when I use plt.show(), but it does not work when using backend_pdf: only two emojis are shown, in a mixed-up order.
Example images (not reproduced here): one rendered with plt.show() and one with backend_pdf.
Here is my code to produce these examples:
import matplotlib.backends.backend_pdf
import matplotlib.pyplot as plt
import emoji
from matplotlib.font_manager import FontProperties
emojis = [emoji.EMOJI_UNICODE[e] for e in list(emoji.EMOJI_UNICODE.keys())[620:630]]
prop = FontProperties(fname='./Symbola.ttf', size=30)
# backend_pdf plot
pdf = matplotlib.backends.backend_pdf.PdfPages("output.pdf")
plt.xticks(range(len(emojis)), emojis, fontproperties=prop)
pdf.savefig()
pdf.close()
# plt.show() plot
plt.xticks(range(len(emojis)), emojis, fontproperties=prop)
plt.show()
I'm running this on a Linux machine.
I think I have found the problem. It seems that my Symbola.ttf was broken. When I use this .ttf file everything works great.

Solving error "delimiter must be a 1-character string" while writing a dataframe to a csv file

Using this question: Pandas writing dataframe to CSV file as a model, I wrote the following code to make a csv file:
df.to_csv('/Users/Lab/Desktop/filteredwithheading.txt', sep='\s+', header=True)
But it returns the following error:
TypeError: "delimiter" must be an 1-character string
I have looked up the documentation for this here http://pandas.pydata.org/pandas-docs/stable/generated/pandas.DataFrame.to_csv.html but I can't figure out what I am missing, or what that error means. I also tried using (sep='\s') in the code, but got the same error.
As mentioned in the issue discussion (here), this is not considered a pandas issue but rather a compatibility issue between Python's csv module and Python 2.x.
The workaround is to wrap the separator in str(..). For example, here is how you can reproduce the problem and then solve it:
from __future__ import unicode_literals
import pandas as pd
df = pd.DataFrame([['a', 'A'], ['b', 'B']])
df.to_csv(sep=',')
This will raise the following error:
TypeError ....
----> 1 df.to_csv(sep=',')
TypeError: "delimiter" must be an 1-character string
The following however, will show the expected result
from __future__ import unicode_literals
import pandas as pd
df = pd.DataFrame([['a', 'A'], ['b', 'B']])
df.to_csv(sep=str(','))
Output:
',0,1\n0,a,A\n1,b,B\n'
In your case, note that '\s+' is a regular expression, which to_csv does not accept as a separator. Use a single character instead, for example a space:
df.to_csv('/Users/Lab/Desktop/filteredwithheading.txt', sep=str(' '), header=True)
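Since the error originates in Python's csv module, the one-character rule can be checked without pandas. The sketch below uses only the standard library under Python 3; the helper name rows_to_text is made up for illustration.

```python
import csv
import io

def rows_to_text(rows, delimiter):
    """Write rows with csv.writer and return the text (hypothetical helper)."""
    buffer = io.StringIO()
    csv.writer(buffer, delimiter=delimiter).writerows(rows)
    return buffer.getvalue()

rows = [["a", "A"], ["b", "B"]]

# A single character such as a tab is accepted
tab_separated = rows_to_text(rows, "\t")

# A regex-like, multi-character separator is rejected with a TypeError
try:
    rows_to_text(rows, r"\s+")
    regex_error = None
except TypeError as exc:
    regex_error = str(exc)

print(repr(tab_separated))
print(regex_error)
```

The exact wording of the error message may differ between Python versions.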
Note that although the solution to this error was to use a single string character instead of a regex, pandas also raises this error when from __future__ import unicode_literals is used with valid unicode characters. As of 2015-11-16, release 0.16.2, this error is still a known bug in pandas:
"to_csv chokes if not passed sep as a string, even when encoding is set to unicode" #6035
For example, where df is a pandas DataFrame:
# -*- coding: utf-8 -*-
from __future__ import unicode_literals
import pandas as pd
df.to_csv(pdb_seq_fp, sep='\t', encoding='utf-8')
TypeError: "delimiter" must be an 1-character string
Using a byte literal (b'\t') together with the declared source encoding (# -*- coding: utf-8 -*-; UTF-8 is the default in Python 3) resolves this in pandas 0.16.2. I haven't tested with previous versions or with 0.17.0.
# -*- coding: utf-8 -*-
from __future__ import unicode_literals
import pandas as pd
df.to_csv(pdb_seq_fp, sep=b'\t', encoding='utf-8')
(Note that with versions 0.13.0 - ???, it was necessary to import u from pandas.compat; but by 0.16.2 the byte literal is the way to go.)

How can I fit my plots from measured data?

How can I fit my plots?
Up to now, I've got the following code, which plots a variety of graphs from an array (data from an experiment) as it is placed in a loop:
import matplotlib.pyplot as plt
plt.figure(6)
plt.semilogx(Tau_Array, Correlation_Array, '+-')
plt.ylabel('Correlation')
plt.xlabel('Tau')
plt.title("APD" + str(detector) + "_Correlations_log_graph")
plt.savefig(DataFolder + "/APD" + str(detector) + "_Correlations_log_graph.png")
This works so far with a logarithmic plot, but I am wondering how the fitting process could work right here. In the end I would like to have some kind of a formula or/and a graph which best describes the data I measured.
I would be pleased if someone could help me!
You can use curve_fit from scipy.optimize for this. Here is an example:
# -*- coding: utf-8 -*-
import numpy as np
import matplotlib.pyplot as plt
from scipy.optimize import curve_fit
def func(x,a):
    return np.exp(a*x)
x,y,z = np.loadtxt("fit3.dat",unpack=True)
popt,pcov = curve_fit(func,x,y)
y_fit = np.exp(popt[0]*x)
plt.plot(x,y,'o')
plt.errorbar(x,y,yerr=z)
plt.plot(x,y_fit)
plt.savefig("fit3_plot.png")
plt.show()
In your case, you can define func accordingly. popt is an array containing the values of your fitting parameters.
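To see what popt and pcov contain without a data file, here is a small sketch that fits the same one-parameter exponential to synthetic data. The true parameter value, the noise level, and the starting guess p0 are all assumptions chosen for illustration; the square root of the diagonal of pcov is commonly used as a one-standard-deviation estimate for each parameter.

```python
import numpy as np
from scipy.optimize import curve_fit

def model(x, a):
    """Same one-parameter exponential as in the answer above."""
    return np.exp(a * x)

# Synthetic measurements (assumed values, not real data)
rng = np.random.default_rng(0)
x = np.linspace(0.0, 2.0, 50)
true_a = 0.7
y = model(x, true_a) + rng.normal(scale=0.02, size=x.size)

# p0 is the starting guess for the parameter
popt, pcov = curve_fit(model, x, y, p0=[0.5])

fitted_a = popt[0]
perr = np.sqrt(np.diag(pcov))  # estimated standard deviation per parameter

print("a =", fitted_a, "+/-", perr[0])
```

For measured data, replace model with a function that describes the expected behaviour of the correlation and pass Tau_Array and Correlation_Array as x and y.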