How to manually create a label in Matplotlib.pyplot

I am trying to compare 2 figures.
I am using the function subplots() from matplotlib.pyplot to plot these 2 images, and this is my code:
def Plot_Labels(data,data_check):
    data = np.reshape(data,(414,507))
    #data = np.reshape(data,(414,507))
    shape = data.shape
    nan_mask = np.isnan(data_check)
    data_mask = data[~nan_mask]
    valid_data = np.empty(shape)
    valid_data[~nan_mask] = data_mask
    valid_data[nan_mask] = np.nan
    fig, axes = plt.subplots()
    im = axes.imshow(valid_data)
    bar = plt.colorbar(im)
    bar.set_label('Flood Hazard')
    plt.show()
However, the colorbar labels in these 2 figures are different. For example, the values 4 and 5 are drawn in the same yellow color, each in one of the 2 figures. Is there a way to manually create label 5 in the second figure, so that the 2 figures become comparable?
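One approach that may help here: imshow accepts vmin and vmax, so both images can be forced onto the same color scale, and the colorbar ticks can be set explicitly. The sketch below is only an illustration under assumptions: the two small arrays stand in for the real flood-hazard grids, and the helper name plot_labels and its arguments are made up for this example.

```python
import matplotlib
matplotlib.use("Agg")  # headless backend so the sketch runs without a display
import matplotlib.pyplot as plt
import numpy as np

def plot_labels(data, vmin, vmax, ax):
    """Draw data on ax with a fixed color range and a labelled colorbar."""
    im = ax.imshow(data, vmin=vmin, vmax=vmax)
    # One tick per integer class, identical for every figure
    bar = ax.figure.colorbar(im, ax=ax, ticks=np.arange(vmin, vmax + 1))
    bar.set_label("Flood Hazard")
    return im

# Stand-in data (assumption): the first grid reaches 5, the second only 4
first = np.tile(np.arange(6.0), (6, 1))
second = np.tile(np.arange(5.0), (5, 1))

# Shared limits computed over both grids, ignoring NaN cells
vmin = min(np.nanmin(first), np.nanmin(second))
vmax = max(np.nanmax(first), np.nanmax(second))

fig, axes = plt.subplots(1, 2, figsize=(8, 4))
im_first = plot_labels(first, vmin, vmax, axes[0])
im_second = plot_labels(second, vmin, vmax, axes[1])
```

With shared limits, a value of 4 gets the same color in both images, and the second colorbar also extends to 5 even though that value never occurs in its data.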

Related

Name stacked bars after legend entry on Pandas/Matplotlib

I have a stacked bar chart that works really well for what I'm looking for. My problem is handling the labels.
I can label every single stacked bar after its value (number), but I'm looking to label it after its name (on the legend).
Does anyone have an idea on how to solve this?
ps.: Unfortunately I can't post images yet.
I have something like this:
####
#15#
####
oooo ####
oooo #35#
o55o ####
oooo ####
oooo o12o
And need like this:
####
#### A
####
oooo ####
oooo B #### A
oooo ####
oooo oooo B
I've written a short example, see the code below:
import numpy as np
import matplotlib.pyplot as plt
# Some data
x = np.array([0, 1, 2])
y1 = np.array([3, 4, 1])
y2 = np.array([2, 2, 4])
# label text
label_y1 = 'y1'
label_y2 = 'y2'
# Create the base plot
fig, ax = plt.subplots()
bars_y1 = ax.bar(x, y1, width=0.5, label=label_y1)
bars_y2 = ax.bar(x, y2, width=0.5, label=label_y2, bottom=y1)
# Function to add labels to the plot
def add_labels(ax, bars, label):
    for bar in bars:
        # Get the desired x and y locations
        xloc = bar.get_x() + 1.05 * bar.get_width()
        yloc = bar.get_y() + bar.get_height() / 2
        ax.annotate(label, xy=(xloc, yloc), va='center', ha='left', color=bar.get_facecolor())
# Add the labels in the plot
add_labels(ax, bars_y1, label_y1)
add_labels(ax, bars_y2, label_y2)
plt.show()
First of all, I generate some dummy data (x, y1 and y2). Then, I define the desired label text (label_y1 and label_y2) and lastly I make the base bar graph using Axes.bar. Note that I store the return value from the Axes.bar calls, which is a container containing all the bars!
Now, we get to the interesting part. I define a function called add_labels. As an input, it takes the Axes of interest, a container with all the bars and the desired label text. In the function body, I loop over all the bars and determine the desired x and y location for the label text. Using these values, I place label text at those coordinates using the Axes.annotate method. At the end of the script, I simply call the add_labels function with the desired arguments to get the following output:
Is this what you are looking for?
Based on Dex's answer I came up with a solution.
Using patches, you can get every single bar from the chart. The bars are ordered column by column. So if you have a 4x3 dataframe:
  zero um dois
0    a  b    c
1    d  e    f
2    g  h    i
3    j  k    l
bars.patches will have each column after the other: [a,d,g,j,b,e,h,k,c,f,i,l]
So, every 4 items (rows), it restarts. To detect that, we can use the mod operator (%) with the number of rows in the df:
i % len(df.index) == 0 #moves position counter to the next column name
The code ended up like this:
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt

# Some data
x = np.array(['zero', 'um', 'dois'])
y = np.array([[3, 4, 8], [2, 2, 4], [6, 7, 8]])
df = pd.DataFrame(y, columns=x)
print(df)
#    zero  um  dois
# 0     3   4     8
# 1     2   2     4
# 2     6   7     8

title = 'Chart Title'
fig, ax = plt.subplots()  # the Axes that df.plot.bar draws on
bars = df.plot.bar(ax=ax, stacked=True, title=title, legend=False)
plt.xlabel('x axis label')

pos = -1
for i, bar in enumerate(bars.patches):  # runs through every single bar on the chart
    if i % len(df.index) == 0:  # a new column starts every len(df.index) patches,
        pos += 1                # so move on to the next column name
    xloc = bar.get_x()
    yloc = bar.get_y() + bar.get_height() / 2
    # Only label bars taller than 30; this threshold presumably suited the
    # original data, so lower it for the small sample values above
    if bar.get_height() > 30:
        # df.columns[pos] is the column name that belongs to this bar
        ax.annotate(str(df.columns[pos]), xy=(xloc, yloc), va='center', ha='left')
So, no matter the size of the dataframe, it will plot the column names next to the bars
chart example:
https://i.stack.imgur.com/2iHau.png
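As a possible alternative to counting patches with the mod operator, the bars can be walked container by container: each Axes.bar call adds one BarContainer to ax.containers, and its get_label() returns the label passed to that call. The sketch below reuses the dummy data from Dex's example; the list placed exists only to make the result easy to inspect. pandas stacked bar plots appear to create one container per column as well, but that is an assumption worth checking in your version.

```python
import matplotlib
matplotlib.use("Agg")  # headless backend so the sketch runs without a display
import matplotlib.pyplot as plt
import numpy as np

# Same dummy data as in the earlier example
x = np.array([0, 1, 2])
y1 = np.array([3, 4, 1])
y2 = np.array([2, 2, 4])

fig, ax = plt.subplots()
ax.bar(x, y1, width=0.5, label="y1")
ax.bar(x, y2, width=0.5, label="y2", bottom=y1)

placed = []  # hypothetical bookkeeping list, only used to inspect the result
for container in ax.containers:      # one container per ax.bar call
    name = container.get_label()     # the label given to that call
    for bar in container:            # the individual rectangles
        xloc = bar.get_x() + 1.05 * bar.get_width()
        yloc = bar.get_y() + bar.get_height() / 2
        ax.annotate(name, xy=(xloc, yloc), va="center", ha="left")
        placed.append(name)
```

This avoids any index arithmetic, because the name travels with the container rather than being derived from the position of the patch.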

Subplots not plotting correctly, matplotlib

I would like to have a 2x2 grid of subplots, each showing 8 groups of three bars next to each other.
That's because I have four evaluation parameters, each of which evaluates the quality of three models on 8 languages.
I was able to plot one of them, but it doesn't work for the other three in my plot grid.
At the moment, the code looks like this:
eval_list = ["F1-Werte (micro)", "Precision", "Recall", "Accuracy"]
lang_list = ["Alle Sprachen", "Chinesisch", "Deutsch", "Englisch", "Finnisch", "Französisch", "Italienisch", "Spanisch"]
x = np.arange(start=1, stop=9, step=1)
fig, axes = plt.subplots(2, 2, figsize=(10,10), sharex=False)
df_list = list()
count = 0
for eval_elem in eval_list:
    flair = list()
    bert = list()
    roberta = list()
    for lang in lang_list:
        flair.append(dfBest.query("Eval==@eval_elem").query("Lang==@lang").query("Model=='multiflair'").iloc[0]['Average'])
        bert.append(dfBest.query("Eval==@eval_elem").query("Lang==@lang").query("Model=='bert'").iloc[0]['Average'])
        roberta.append(dfBest.query("Eval==@eval_elem").query("Lang==@lang").query("Model=='roberta'").iloc[0]['Average'])
    #ax = plt.subplot(2,2,count+1)
    a = 1 if count > 1 else 0
    b = 1 if count%2!=0 else 0
    axes[a][b].bar(x-0.2, flair, width=0.2, color='b', align='center')
    axes[a][b].bar(x, bert, width=0.2, color='g', align='center')
    axes[a][b].bar(x+0.2, roberta, width=0.2, color='r', align='center')
    plt.xticks(x, lang_list, rotation=90)
    plt.title(eval_list[count])
    plt.legend(bbox_to_anchor=(1.1, 1), labels=["multiflair", "bert", "roberta"])
    plt.show()
    count += 1
The three lists flair, bert and roberta hold the data, and each of them correctly contains eight floats in every iteration, but beyond that I don't know what's wrong with the layout.
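A likely cause is that plt.xticks, plt.title and plt.legend act on the "current" Axes rather than on axes[a][b], and that plt.show() is called inside the loop. A minimal sketch of the usual remedy follows: call the methods on each Axes object and show the figure once after the loop. Since dfBest is not available here, the random array scores is a placeholder assumption standing in for the queried averages.

```python
import matplotlib
matplotlib.use("Agg")  # headless backend so the sketch runs without a display
import matplotlib.pyplot as plt
import numpy as np

eval_list = ["F1-Werte (micro)", "Precision", "Recall", "Accuracy"]
lang_list = ["Alle Sprachen", "Chinesisch", "Deutsch", "Englisch",
             "Finnisch", "Französisch", "Italienisch", "Spanisch"]
models = ["multiflair", "bert", "roberta"]
colors = ["b", "g", "r"]
x = np.arange(len(lang_list))

# Placeholder scores (assumption): shape (evaluation, model, language)
rng = np.random.default_rng(0)
scores = rng.random((len(eval_list), len(models), len(lang_list)))

fig, axes = plt.subplots(2, 2, figsize=(10, 10))
for ax, title, per_model in zip(axes.flat, eval_list, scores):
    # Three bars per language, shifted left, centre and right
    for offset, model, color, values in zip((-0.2, 0.0, 0.2), models, colors, per_model):
        ax.bar(x + offset, values, width=0.2, color=color, label=model)
    # Axes-level calls, so each subplot gets its own ticks and title
    ax.set_xticks(x)
    ax.set_xticklabels(lang_list, rotation=90)
    ax.set_title(title)

axes[0, 1].legend(bbox_to_anchor=(1.1, 1))  # one legend is enough for the grid
fig.tight_layout()
# In an interactive session, call plt.show() here, once, after the loop
```

Iterating over axes.flat also removes the need for the a and b index arithmetic.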

Having trouble using NaN figure

I am trying to make a percent stacked bar plot with 5 bars. 2 of the bars have no data, but they cannot be excluded from the chart. I set their values to NaN (because I need to calculate means later). In this case one of these 2 is the first entry in the list, and as a result the top part of the chart is not shown. What I don't understand is that when I switch the first and second entries, making the second entry NaN, there is no problem.
Code:
Here NaN is first and 3 is second, which does not work. Switching NaN and 3 does work (see the images below).
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
from math import nan
#Data
goed1 = [nan,3,152,9, nan]
tot1 = [1,1,15,2,1]
total = [(i * 16 ) for i in tot1]
fout1 = np.zeros(5)
for i in range(len(goed1)):
    fout1[i] = total[i] - goed1[i]
data = {'Goed': goed1, 'Fout': fout1}
#Chart
fig, ax = plt.subplots()
r = [0,1,2,3,4]
df = pd.DataFrame(data)
#Convert to percentages
totaal = [i + j for i,j in zip(df['Goed'], df['Fout'])]
goed = [i / j * 100 for i,j in zip(df['Goed'], totaal)]
fout = [i / j * 100 for i,j in zip(df['Fout'], totaal)]
#plot
width = 0.85
names = ('Asphalt cover','Special constructions','Gras revetments','Non-flood defensive elements','Stone revetments')
plt.bar(r, goed, color='#b5ffb9', edgecolor='white', width=width, label="Detected")
plt.bar(r, fout, bottom=goed, color='#f9bc86', edgecolor='white', width=width, label="Missed")
# Add a legend
plt.legend(loc='upper left', bbox_to_anchor=(1,1), ncol=1)
plt.title('Boezemkade')
# Custom x axis
plt.xticks(r, names, rotation = 20, horizontalalignment = 'right')
# Show graphic
plt.show()
If anybody knows how to fix this, help is appreciated.
Plots (images not preserved): NaN first; NaN second.
You can convert your data to a NumPy array, then find the NaN entries and replace them with 0.
goed1 = np.array([nan,3,152,9, nan])
where_are_NaNs = np.isnan(goed1)
goed1[where_are_NaNs] = 0
This results in:
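Since the means still have to be computed later, it may be safer to keep the original NaN values and use a zero-filled copy only for plotting. The sketch below assumes that approach; the names goed_clean, goed_pct, fout_pct and mean_goed are made up for the example, and np.nan_to_num with the nan keyword needs a reasonably recent NumPy.

```python
import numpy as np

# Original data, NaN marks "no data" and is kept for later statistics
goed1 = np.array([np.nan, 3, 152, 9, np.nan])
tot1 = np.array([1, 1, 15, 2, 1])
total = tot1 * 16

# Zero-filled copy used only for the stacked bars
goed_clean = np.nan_to_num(goed1, nan=0.0)
fout = total - goed_clean

# Percentages per bar; every pair now sums to 100
goed_pct = goed_clean / total * 100
fout_pct = fout / total * 100

# The mean still ignores the missing entries instead of counting them as 0
mean_goed = np.nanmean(goed1)
```

goed_pct and fout_pct can then be passed to plt.bar exactly as goed and fout are in the question, with bottom=goed_pct for the second call.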

legend showing category's colors on pandas scatter plot

I am trying to show a legend for the categories in the following code, as in this discussion, but I couldn't get it to work.
import pandas as pd
import matplotlib.pyplot as plt
get_color = lambda x: 0 if x=="a" else 1
df = pd.DataFrame([[1,2,"a"],[2,2,"b"],[2,3,"a"],[3,3,"b"]])
df.plot.scatter(x=0,y=1,c=df[2].apply(get_color))
plt.legend()
How do I show a legend with the color of each category?
If you want to use pandas' df.plot(), you need to go through df.groupby and pass a label together with the color to the plot call inside the loop, like this:
Step 1: Prepare data properly
c0 = [1,2,2,3]
c1 = [2,2,3,3]
c2 = ['a','b','a','b']
df = pd.DataFrame(dict(c0=c0,c1=c1,c2=c2))
df
   c0  c1 c2
0   1   2  a
1   2   2  b
2   2   3  a
3   3   3  b
Step 2: Plot your data by looping through grouped objects:
colors = {'a': 'white','b':'black'}
_, ax = plt.subplots()
for key,group in df.groupby('c2'):
    group.plot.scatter(ax=ax, x='c0', y='c1', label=key, color = colors[key]);
Hope this helps
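For reference, here is roughly the same idea written with plain Axes.scatter calls instead of df.plot, as one self-contained script. Treat it as a sketch: the two colors are an arbitrary choice for the example (picked so both groups are visible on a white background), and legend_labels exists only to inspect what the legend contains.

```python
import matplotlib
matplotlib.use("Agg")  # headless backend so the sketch runs without a display
import matplotlib.pyplot as plt
import pandas as pd

df = pd.DataFrame({"c0": [1, 2, 2, 3],
                   "c1": [2, 2, 3, 3],
                   "c2": ["a", "b", "a", "b"]})

# Arbitrary example colors, one per category
colors = {"a": "tab:blue", "b": "tab:orange"}

fig, ax = plt.subplots()
for key, group in df.groupby("c2"):
    # One scatter call per category, so each gets its own legend entry
    ax.scatter(group["c0"], group["c1"], color=colors[key], label=key)
ax.set_xlabel("c0")
ax.set_ylabel("c1")
legend = ax.legend(title="c2")

legend_labels = [text.get_text() for text in legend.get_texts()]
```

The legend works here because every category is drawn by a separate call that carries its own label; a single scatter call with a color array has only one label to offer.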

matplotlib shared row label (not y label) in plot containing subplots

I have a trellis-like plot I am trying to produce in matplotlib. Here is a sketch of what I'm going for:
One thing I am having trouble with is getting a shared row label for each row. I.e. in my plot, I have four rows for four different sets of experiments, so I want row labels "1 source node, 2 source nodes, 4 source nodes and 8 source nodes".
Note that I am not referring to the y axis label, which is being used to label the dependent variable. The dependent variable is the same in all subplots, but the row labels I am after are to describe the four categories of experiments conducted, one for each row.
At the moment, I'm generating the plot with:
fig, axes = plt.subplots(4, 5, sharey=True)
While I've found plenty of information on sharing the y-axis label, I haven't found anything on adding a single shared row label.
As far as I know, there is no built-in row title (nothing like a ytitle). You can use text to place arbitrary text instead. Here x and y are in data coordinates, and ha and va are the horizontal and vertical alignment, respectively.
import numpy
import matplotlib
import matplotlib.pyplot as plt
n_rows = 4
n_cols = 5
fig, axes = plt.subplots(n_rows, n_cols, sharey = True)
axes[0][0].set_ylim(0,10)
for i in range(n_cols):
    axes[0][i].text(x = 0.5, y = 12, s = "column label", ha = "center")
    axes[n_rows-1][i].set_xlabel("xlabel")
for i in range(n_rows):
    axes[i][0].text(x = -0.8, y = 5, s = "row label", rotation = 90, va = "center")
    axes[i][0].set_ylabel("ylabel")
plt.show()
You could give titles to the subplots on the top row, like Robbert suggested:
fig, axes = plt.subplots(4,3)
for i, ax in enumerate(axes[0,:]):
    ax.set_title('col%i'%i)
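For the row labels themselves, one variation that may be easier to maintain is to place the text in axes coordinates (transform=ax.transAxes), so the position does not depend on the data limits. This is only a sketch: the offset -0.45 and the left margin 0.15 are guesses that will probably need tuning for a real figure, and row_texts exists only to inspect the result.

```python
import matplotlib
matplotlib.use("Agg")  # headless backend so the sketch runs without a display
import matplotlib.pyplot as plt

row_names = ["1 source node", "2 source nodes", "4 source nodes", "8 source nodes"]

fig, axes = plt.subplots(4, 5, sharey=True)

row_texts = []
for ax, name in zip(axes[:, 0], row_names):  # first Axes of every row
    ax.set_ylabel("ylabel")                   # the ordinary y label stays in place
    # (0, 0) is the lower-left and (1, 1) the upper-right corner of the Axes;
    # a negative x puts the text to the left of the y label
    text = ax.text(-0.45, 0.5, name, transform=ax.transAxes,
                   rotation=90, va="center", ha="right")
    row_texts.append(text)

fig.subplots_adjust(left=0.15)  # leave room for the extra labels
```

Because the coordinates are relative to each Axes, the labels stay vertically centred on their rows even if the y limits change later.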