Change number formatting on y axis in matplotlib graph - matplotlib

I've made a bar chart in matplotlib and managed to change the number formatting on the y-axis. The only problem is that the thousands separator is now a comma (','). Since I live in the Netherlands, I would very much like to change it to a dot ('.'). It looks like a simple thing, but I can't find anywhere how to do it. Please see the code below.
import pandas as pd
import matplotlib.pyplot as plt
import matplotlib as mpl
%matplotlib inline
top = [72000, 73000, 78000, 71000]
base = [15000, 12000, 13000, 14000]
x = ['2017', '2018', '2019', '2020']
x_ax = pd.Series(x)
fig, ax = plt.subplots()
ax.bar(x, top, label='TOP CUSTOMERS')
ax.bar(x, base, bottom=top, label='BASE CUSTOMERS')
ax.set_title('TON YTD JULY')
ax.legend(loc='lower center')
ax.set_ylabel('Ton')
ax.yaxis.set_major_formatter(mpl.ticker.StrMethodFormatter('{x:,.0f}'))
plt.savefig('C:/Users/MBRU/Box Sync/Data_Source/out/320/example.png', bbox_inches='tight')
plt.show()
Example of graph with code
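One possible approach, sketched here under assumptions rather than taken from an accepted answer: format with the comma separator as before, then swap the comma for a dot inside a FuncFormatter callback. The helper name dutch_thousands is hypothetical; the data are the values from the question.

```python
import matplotlib.pyplot as plt
from matplotlib.ticker import FuncFormatter


def dutch_thousands(value, pos=None):
    """Format a tick value with '.' as thousands separator and no decimals.

    `pos` is the tick position that FuncFormatter passes in; it is unused here.
    """
    return f"{value:,.0f}".replace(",", ".")


top = [72000, 73000, 78000, 71000]
base = [15000, 12000, 13000, 14000]
x = ['2017', '2018', '2019', '2020']

fig, ax = plt.subplots()
ax.bar(x, top, label='TOP CUSTOMERS')
ax.bar(x, base, bottom=top, label='BASE CUSTOMERS')
ax.set_ylabel('Ton')
ax.legend(loc='lower center')

# Replace the StrMethodFormatter from the question with the custom callback.
ax.yaxis.set_major_formatter(FuncFormatter(dutch_thousands))
```

Call plt.show() or plt.savefig(...) afterwards as in the question. The string replacement only works this simply because no decimals are printed; with decimals, the comma and the decimal point would have to be swapped with more care.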

Related

Directly annotate matplotlib stacked bar graph [duplicate]

I would like to create an annotation to a bar chart that compares the value of the bar to two reference values. An overlay such as shown in the picture, a kind of staff gauge, is possible, but I'm open to more elegant solutions.
The bar chart is generated with the pandas API to matplotlib (e.g. data.plot(kind="bar")), so it would be a plus if the solution played nicely with that.
You may use smaller bars for the target and benchmark indicators. Pandas cannot annotate bars automatically, but you can simply loop over the values and use matplotlib's pyplot.annotate instead.
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
a = np.random.randint(5,15, size=5)
t = (a+np.random.normal(size=len(a))*2).round(2)
b = (a+np.random.normal(size=len(a))*2).round(2)
df = pd.DataFrame({"a":a, "t":t, "b":b})
fig, ax = plt.subplots()
df["a"].plot(kind='bar', ax=ax, legend=True)
df["b"].plot(kind='bar', position=0., width=0.1, color="lightblue",legend=True, ax=ax)
df["t"].plot(kind='bar', position=1., width=0.1, color="purple", legend=True, ax=ax)
for i, rows in df.iterrows():
    plt.annotate(rows["a"], xy=(i, rows["a"]), rotation=0, color="C0")
    plt.annotate(rows["b"], xy=(i+0.1, rows["b"]), color="lightblue", rotation=+20, ha="left")
    plt.annotate(rows["t"], xy=(i-0.1, rows["t"]), color="purple", rotation=-20, ha="right")
ax.set_xlim(-1,len(df))
plt.show()
There's no direct way to annotate a bar plot (as far as I am aware). Some time ago I needed to annotate one, so I wrote this; perhaps you can adapt it to your needs.
import matplotlib.pyplot as plt
import numpy as np
ax = plt.subplot(111)
ax.set_xlim(-0.2, 3.2)
# Pass the on/off flag positionally; the 'b' keyword was removed in newer matplotlib.
ax.grid(True, which='major', color='k', linestyle=':', lw=.5, zorder=1)
# x,y data
x = np.arange(4)
y = np.array([5, 12, 3, 7])
# Define upper y limit leaving space for the text above the bars.
up = max(y) * .03
ax.set_ylim(0, max(y) + 3 * up)
ax.bar(x, y, align='center', width=0.2, color='g', zorder=4)
# Add text to bars
for xi, yi, l in zip(x, y, map(str, y)):
    ax.text(xi - len(l) * .02, yi + up, l,
            bbox=dict(facecolor='w', edgecolor='w', alpha=.5))
ax.set_xticks(x)
ax.set_xticklabels(['text1', 'text2', 'text3', 'text4'])
ax.tick_params(axis='x', which='major', labelsize=12)
plt.show()
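For readers on matplotlib 3.4 or newer, Axes.bar_label may cover the simple case of writing each bar's value above it. A minimal sketch, reusing the x and y values from the answer above; the padding and margin values are arbitrary choices, not part of the original answer.

```python
import matplotlib.pyplot as plt
import numpy as np

x = np.arange(4)
y = np.array([5, 12, 3, 7])

fig, ax = plt.subplots()
bars = ax.bar(x, y, align='center', width=0.2, color='g')

# bar_label takes the container returned by ax.bar and returns the text objects.
labels = ax.bar_label(bars, padding=3)

# Leave some headroom so the topmost label is not clipped.
ax.margins(y=0.1)
ax.set_xticks(x)
ax.set_xticklabels(['text1', 'text2', 'text3', 'text4'])
```

The manual ax.text loop remains useful when you need a background box or custom offsets per label.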

For my code I am trying to graph two separate waveforms using matplotlib. My output does not show two clear waveforms. How do I fix this?

import numpy as np
import matplotlib.pyplot as plt
data = np.loadtxt('Lab6.csv', dtype='float', delimiter=',', unpack=True)
t = (data[:1])
yt = np.cos(4 * np.pi * t )
y = data[1:2]
plt.figure()
plt.plot(t,yt,t,y,'o')
plt.xlabel('Time, s')
plt.ylabel('Voltage, V')
plt.legend(('Signal 1', 'Signal 1'))
plt.show()
I imported my data from a CSV file. I am looking to have two waveforms with lines between the data points and a separate color for each wave.
Output from code
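A likely cause, offered as a sketch rather than a confirmed diagnosis: data[:1] and data[1:2] are slices, so they stay 2-D with shape (1, N), and plt.plot then treats every column as its own one-point series. Indexing with data[0] and data[1] gives 1-D arrays. The file Lab6.csv is not available here, so the inline CSV text below is a hypothetical stand-in with two columns (time, voltage).

```python
import io

import matplotlib.pyplot as plt
import numpy as np

# Hypothetical stand-in for 'Lab6.csv': one "time,voltage" pair per row.
csv_text = "0.0,1.0\n0.1,0.3\n0.2,-0.8\n0.3,-0.8\n0.4,0.3\n0.5,1.0\n"
data = np.loadtxt(io.StringIO(csv_text), dtype=float, delimiter=',', unpack=True)

t = data[0]                  # 1-D, shape (N,); data[:1] would have shape (1, N)
y = data[1]
yt = np.cos(4 * np.pi * t)

fig, ax = plt.subplots()
# '-o' draws a line between the points and a marker on each point.
ax.plot(t, yt, '-o', color='tab:blue', label='Signal 1')
ax.plot(t, y, '-o', color='tab:orange', label='Signal 2')
ax.set_xlabel('Time, s')
ax.set_ylabel('Voltage, V')
ax.legend()
```

With the real file, replace io.StringIO(csv_text) with 'Lab6.csv' and finish with plt.show().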

Pandas histogram plot with Y axis or colorbar

In pandas, I am trying to generate a ridgeline plot in which the density values are shown (either as a y-axis or as a color ramp). I am using joypy's joyplot, but any alternative is fine.
So, first I created the Ridge plot to show the different distribution plot for each condition (you can reproduce it using this code):
import numpy as np  # needed for the np.random and np.arange calls below
import pandas as pd
import joypy
import matplotlib
import matplotlib.pyplot as plt
df1 = pd.DataFrame({'Category1':np.random.choice(['C1','C2','C3'],1000),'Category2':np.random.choice(['B1','B2','B3','B4','B5'],1000),
'year':np.arange(start=1900, stop=2900, step=1),
'Data':np.random.uniform(0,1,1000),"Period":np.random.choice(['AA','CC','BB','DD'],1000)})
data_pivot=df1.pivot_table('Data', ['Category1', 'Category2','year'], 'Period')
fig, axes = joypy.joyplot(data_pivot, column=['AA', 'BB', 'CC', 'DD'], by="Category1", ylim='own', figsize=(14,10), legend=True, alpha=0.4)
This generates the figure, but without my desired y-axis. Based on this post, I could add a color ramp, but it neither makes sense nor shows the differences between the distributions of the different categories on each line:
ar=df1['Data'].plot.kde().get_lines()[0].get_ydata() ## a workaround to get the probability values to set the colorramp max and min
norm = plt.Normalize(ar.min(), ar.max())
original_cmap = plt.cm.viridis
cmap = matplotlib.colors.ListedColormap(original_cmap(norm(ar)))
sm = matplotlib.cm.ScalarMappable(cmap=original_cmap, norm=norm)
sm.set_array([])
# plotting ....
fig, axes = joypy.joyplot(data_pivot,colormap = cmap , column=['AA', 'BB', 'CC', 'DD'], by="Category1", ylim='own', figsize=(14,10), legend=True, alpha=0.4)
fig.colorbar(sm, ax=axes, label="density")
But what I want is something like either of these figures (preferably with a color ramp):

How do I connect two sets of XY scatter values in MatPlotLib?

I am using Matplotlib to plot data read from an Excel file as a scatter plot.
Here is a minimal sample table
In my scatter plot, I have two sets of XY values. In both sets, my X values are country population. I have Renewable Energy Consumed as my Y value in one set and Non-Renewable Energy Consumed in the other set.
For each Country, I would like to have a line from the renewable point to the non-renewable point.
My example code is as follows
import pandas as pd
import matplotlib.pyplot as plt
excel_file = 'example_graphs.xlsx'
datasheet = pd.read_excel(excel_file, sheet_name=0, index_col=0)
ax = datasheet.plot.scatter("Xcol","Y1col",c="b",label="set_one")
datasheet.plot.scatter("Xcol","Y2col",c="r",label="set_two", ax=ax)
plt.show()
And it produces the following plot
I would love to be able to draw a line between the two sets of points, preferably a line I can change the thickness and color of.
As commented, you could simply loop over the dataframe and plot a line for each row.
import pandas as pd
import matplotlib.pyplot as plt
datasheet = pd.DataFrame({"Xcol" : [1,2,3],
"Y1col" : [25,50,75],
"Y2col" : [75,50,25]})
ax = datasheet.plot.scatter("Xcol","Y1col",c="b",label="set_one")
datasheet.plot.scatter("Xcol","Y2col",c="r",label="set_two", ax=ax)
for n,row in datasheet.iterrows():
    ax.plot([row["Xcol"]]*2,row[["Y1col", "Y2col"]], color="limegreen", lw=3, zorder=0)
plt.show()
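Because both points of each pair share the same x value, the connecting lines are vertical, so Axes.vlines can draw all of them in one call instead of looping over rows. This is an alternative sketch using the same sample data as the answer above, not the answerer's own method; color and linewidth are the knobs for the line's appearance.

```python
import matplotlib.pyplot as plt
import pandas as pd

datasheet = pd.DataFrame({"Xcol": [1, 2, 3],
                          "Y1col": [25, 50, 75],
                          "Y2col": [75, 50, 25]})

ax = datasheet.plot.scatter("Xcol", "Y1col", c="b", label="set_one")
datasheet.plot.scatter("Xcol", "Y2col", c="r", label="set_two", ax=ax)

# One vertical segment per row, from (x, Y1col) to (x, Y2col).
# zorder=0 keeps the lines behind the scatter markers.
connectors = ax.vlines(datasheet["Xcol"], datasheet["Y1col"], datasheet["Y2col"],
                       colors="limegreen", linewidth=3, zorder=0)
```

If the two sets ever have different x values per row, the loop with ax.plot is the more general choice.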

Plotting period series in matplotlib pyplot

I'm trying to plot time-series revenue data by quarter with matplotlib.pyplot but keep getting an error. The desired behavior is to plot the revenue data by quarter using matplotlib. When I try to do this, I get:
TypeError: Axis must have `freq` set to convert to Periods
Is it because timeseries dates expressed as periods cannot be plotted in matplotlib? Below is my code.
def parser(x):
    return pd.to_datetime(x, format='%m%Y')
tot = pd.read_table('C:/Desktop/data.txt', parse_dates=[2], index_col=[2], date_parser=parser)
tot = tot.dropna()
tot = tot.to_period('Q').reset_index().groupby(['origin', 'date'], as_index=False).agg(sum)
tot.head()
origin date rev
0 KY 2016Q2 1783.16
1 TN 2014Q1 32128.36
2 TN 2014Q2 16801.40
3 TN 2014Q3 33863.39
4 KY 2014Q4 103973.66
plt.plot(tot.date, tot.rev)
If you want to use matplotlib, the following code should give you the desired plot:
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
df = pd.DataFrame({'origin': ['KY','TN','TN','TN','KY'],
'date': ['2016Q2','2014Q1','2014Q2','2014Q3','2014Q4'],
'rev': [1783.16, 32128.36, 16801.40, 33863.39, 103973.66]})
x = np.arange(0,len(df),1)
fig, ax = plt.subplots(1,1)
ax.plot(x,df['rev'])
ax.set_xticks(x)
ax.set_xticklabels(df['date'])
plt.show()
You could use the xticks command and represent the data with a bar chart with the following code:
plt.bar(range(len(df.rev)), df.rev, align='center')
plt.xticks(range(len(df.rev)), df.date, size='small')
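Both snippets above plot the rows in the order they appear in the frame, so 2016Q2 ends up to the left of 2014Q1. A hedged sketch of one way around that: parse the quarter strings into pandas periods, sort by them, and pass the period labels to matplotlib as plain strings so that no period-to-date conversion is needed. The frame is the sample data from the answer above.

```python
import matplotlib.pyplot as plt
import pandas as pd

df = pd.DataFrame({'origin': ['KY', 'TN', 'TN', 'TN', 'KY'],
                   'date': ['2016Q2', '2014Q1', '2014Q2', '2014Q3', '2014Q4'],
                   'rev': [1783.16, 32128.36, 16801.40, 33863.39, 103973.66]})

# Parse the strings as quarterly periods so that sorting is chronological.
df['date'] = pd.PeriodIndex(df['date'], freq='Q')
df = df.sort_values('date')

# Strings are plotted on a categorical axis, in the order given.
quarter_labels = df['date'].astype(str)

fig, ax = plt.subplots()
ax.plot(quarter_labels, df['rev'], marker='o')
ax.set_xlabel('Quarter')
ax.set_ylabel('rev')
```

The categorical axis spaces the quarters evenly, so the gap between 2014Q4 and 2016Q2 is not drawn to scale; converting the periods to timestamps would be needed for a true time axis.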
It seems like a bug.
DataFrame.plot works for me:
tot.plot(x='date', y='rev')