I'm trying to use plt.GridSpec() to set up two subplots such that the left one takes up about 67% of the space and the right one takes up 33%.
I looked at the documentation, but I just can't seem to figure out how to set up the indexing, probably due to a lack of experience with numpy slicing.
Repeatable Example
%matplotlib inline
import matplotlib.pyplot as plt
import pandas as pd
## Dummy Data
x = [0, 0.03, 0.075, 0.108, 0.16, 0.26, 0.37, 0.49, 0.76, 1.05, 1.64,
0.015, 0.04, 0.085, 0.11, 0.165, 0.29, 0.37, 0.6, 0.78, 1.1]
y = [16.13, 0.62, 2.15, 41.083, 59.97, 13.30, 7.36, 6.80, 4.97, 3.53, 11.77,
30.21, 64.47, 57.64, 56.83, 46.69, 4.22, 30.35, 35.12, 5.22, 25.32]
label = ['blue', 'blue', 'blue', 'blue', 'blue', 'blue', 'blue', 'blue', 'blue', 'blue', 'blue',
'red', 'red', 'red', 'red', 'red', 'red', 'red', 'red', 'red', 'red', 'red']
df = pd.DataFrame(
list(zip(x, y, label)),
columns =['x', 'y', 'label']
)
## Plotting
fig = plt.figure(figsize=([10,8]))
grid = plt.GridSpec(1, 3, wspace=0.4, hspace=0.3)
ax1 = plt.subplot(grid[0, :1])
ax2 = plt.subplot(grid[0, 2], sharey = ax1)
ax1.scatter(x=df.y, y=df.x, color=df.label)
df_red = df[df['label'] == "red"]
df_blue = df[df['label'] == "blue"]
myhist = ax2.hist([df_blue.x, df_red.x],
density=False,
edgecolor='black',
color=['blue', 'red'],
cumulative=False,
bins='auto',
orientation='horizontal',
stacked=True,
label=['Blue', 'Red'])
ax1.set_xlabel('length')
ax1.set_ylabel('value')
ax2.set_xlabel('frequency')
ax2.set_ylabel('value')
Current Result
Desired Result
Same plot, just with the left:right ratio at 67% : 33% (so left plot is wider than right plot).
Here's the small modification that you need to make:
# one position less than 3rd column
ax1 = plt.subplot(grid[0, :-1])
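To see why the slice matters, here is a minimal sketch of how the column index selects GridSpec cells, using a plain list as a stand-in for the three columns. It also shows an alternative that is not part of the answer above: a two-column grid with `width_ratios=[2, 1]`. The names `fig_alt`, `grid_alt`, `ax_left` and `ax_right` are made up for this illustration, and the `Agg` backend is only an assumption so that the sketch runs without a display.

```python
import matplotlib
matplotlib.use("Agg")  # assumption: headless backend so the sketch runs anywhere
import matplotlib.pyplot as plt

# A plain list standing in for the three grid columns
columns = [0, 1, 2]
print(columns[:1])   # [0]    -> first column only (what the question's code selects)
print(columns[:-1])  # [0, 1] -> every column except the last one
print(columns[2])    # 2      -> the last column

# The fix from the answer: the left axes spans two of the three columns
fig = plt.figure(figsize=(10, 8))
grid = plt.GridSpec(1, 3, wspace=0.4, hspace=0.3)
ax1 = fig.add_subplot(grid[0, :-1])
ax2 = fig.add_subplot(grid[0, 2], sharey=ax1)

# Alternative sketch: two columns whose widths are in a 2:1 ratio
fig_alt = plt.figure(figsize=(10, 8))
grid_alt = plt.GridSpec(1, 2, width_ratios=[2, 1], wspace=0.4)
ax_left = fig_alt.add_subplot(grid_alt[0, 0])
ax_right = fig_alt.add_subplot(grid_alt[0, 1], sharey=ax_left)
```

With the three-column grid, the left axes also absorbs the gap between the first two columns, so it ends up slightly wider than twice the right axes. The `width_ratios` variant keeps the two axes at 2:1.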
Related
I have the following data frame say df =
FunderCode
HCPL1 1% 18% 50% 30% 1%
HCPL2 1% 16% 44% 37% 3%
HCPL3 1% 17% 40% 39% 3%
HCPL4 1% 20% 40% 34% 5%
I wanted to plot it like the following
I could get the following using
Piv_age_per.plot( kind = 'bar', stacked = True , legend = True)
I want a diagram with the percentages on the bars. Is there a built-in command to achieve that?
I could generate it with the following code:
import pandas as pd #1.4.4
import matplotlib.pyplot as plt # 3.5.2
# Python 3.10.6
data = pd.DataFrame(columns=range(5))
data.loc['HCPL1'] = [1, 18, 50, 30, 1]
data.loc['HCPL2'] = [1, 16, 44, 37, 3]
data.loc['HCPL3'] = [1, 17, 40, 39, 3]
data.loc['HCPL4'] = [1, 20, 40, 34, 5]
cumulative = data.cumsum(axis=1)
n_rows, n_cols = data.shape
y_pos = range(n_rows)
height = 0.35
colors = ['blue', 'darkorange', 'gray', 'yellow', 'darkblue']
fig, ax = plt.subplots(figsize=(8, 5))
for i in range(n_cols):
    left = cumulative[i]-data[i]
    labels = [f'{value:.1f}%' for value in data[i]]
    ploted = ax.barh(y_pos, data[i], height,
                     align='center',
                     left=left,
                     zorder=2,
                     color=colors[i])
    ax.bar_label(ploted, label_type='center', fontsize=12, labels=labels)
ax.set_yticks(y_pos, labels=data.index)
ax.invert_yaxis()
ax.tick_params(axis='y', pad=20)
ax.set_xticks(range(0, 101, 10))
ax.grid(axis='x', zorder=0)
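As for a more built-in route, a shorter variant is sketched below. It assumes matplotlib 3.4 or newer (for `Axes.bar_label`) and lets pandas draw the stacked bars, then labels each bar container. The frame here is rebuilt from the numbers above, and the `Agg` backend and the `all_labels` list are only there so the sketch can run headless and be inspected.

```python
import matplotlib
matplotlib.use("Agg")  # assumption: headless backend so the sketch runs anywhere
import matplotlib.pyplot as plt
import pandas as pd

data = pd.DataFrame(
    [[1, 18, 50, 30, 1],
     [1, 16, 44, 37, 3],
     [1, 17, 40, 39, 3],
     [1, 20, 40, 34, 5]],
    index=["HCPL1", "HCPL2", "HCPL3", "HCPL4"],
)

# pandas draws one bar container per column
ax = data.plot(kind="barh", stacked=True, legend=True, figsize=(8, 5))

all_labels = []
for container in ax.containers:
    # "%d%%" prints the segment length as an integer followed by a percent sign
    texts = ax.bar_label(container, label_type="center", fmt="%d%%")
    all_labels.append(texts)

ax.invert_yaxis()  # keep HCPL1 at the top, as in the code above
```

Very narrow segments (the 1% values) will still have cramped labels, so the manual approach above may be preferable when the label positions need fine control.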
I have some data that is broken down by day. For each day, I have a data point at the start and at the end of the day, each with a value between 0 and 100. I need to display this data as a grouped bar plot with the days on the x axis, the values on the y axis, and the bar colors determined by their values. For each day, the left bar shows the start value and the right bar shows the end value. The legend, however, needs to display information on the colors rather than the traces.
The plot basically needs to look like this but the legend needs to display "green", "amber", "red" instead of "start", "end".
Here is some code to reproduce the plot:
x = ["day"+str(i) for i in range(1,8)]
starts = [10, 50, 70, 75, 20, 50, 90]
ends = [95, 5, 80, 20, 50, 10, 75]
starts_colors = ['green', 'orange', 'red', 'red', 'green', 'orange', 'red']
ends_colors = ['red', 'green', 'red', 'green', 'orange', 'green', 'red']
And here is the code I have for the plot above.
import plotly.graph_objects as go
layout = go.Layout(showlegend=True)
fig = go.Figure(layout=layout)
fig.add_trace(go.Bar(x=x, y=starts, name = 'start', marker=dict(color=starts_colors)))
fig.add_trace(go.Bar(x=x, y=ends, name = 'end', marker=dict(color=ends_colors)))
fig.show()
If I rearrange the data into 3 traces (one for each color) with the corresponding values in starts and ends, I end up with gaps between the bars. For example "day1" would have a gap in the middle because there is no orange bar for "day1".
This seems like a simple problem, but I'm at a loss as to how to get it to work the way it's supposed to.
This creates exactly the graph you requested:
Start by putting your sample data into a dataframe, which opens up Plotly Express.
Then update the traces to use the color columns.
Finally, add the legend. It is not a functional legend, as it cannot be used for filtering the figure; it just shows the unique colors used in the figure. This is achieved by adding additional small traces.
import plotly.graph_objects as go
import plotly.express as px
import pandas as pd
import numpy as np
df = pd.DataFrame(
{
"day": ["day" + str(i) for i in range(1, 8)],
"starts": [10, 50, 70, 75, 20, 50, 90],
"ends": [95, 5, 80, 20, 50, 10, 75],
"starts_colors": ["green", "orange", "red", "red", "green", "orange", "red"],
"ends_colors": ["red", "green", "red", "green", "orange", "green", "red"],
}
)
# build figure, hover_data / customdata is used to hold colors
fig = px.bar(
df,
x="day",
y=["starts", "ends"],
barmode="group",
hover_data={"starts_colors":False, "ends_colors":False},
)
# update colors of bars: trace i takes entry i of each customdata row
for i, t in enumerate(fig.data):
    t.update(marker_color=[c[i] for c in t.customdata])
# just for display purpose, create traces so that legend contains colors. does not connect with
# bars
fig.update_traces(showlegend=False).add_traces(
[
go.Bar(name=c, x=[fig.data[0].x[0]], marker_color=c, showlegend=True)
for c in np.unique(df.loc[:,["starts_colors","ends_colors"]].values.ravel())
]
)
I'm trying to sort the graph with different colors in the same order as the dataframe, but when I sort the values, the colors don't change.
import pandas as pd
import numpy as np
import matplotlib.pyplot as plot
changelist = (0.1, 0.12, 0.13, -0.1, 0.05, 0.07)
assetlist = ('a', 'b', 'c', 'd', 'e', 'f')
clrs = ('yellow', 'green', 'blue', 'blue', 'green', 'yellow')
data = {"Assets":assetlist,
"Change":changelist,
"Colors":clrs,
}
dataFrame = pd.DataFrame(data=data)
dataFrame.sort_values("Change", ascending=False)
dataFrame.plot.bar(x="Assets", y="Change", rot=90, title="Desempeño Principales Activos Enero en MXN", color=clrs)
plot.show(block=True)
You need to use inplace=True to have the sorting act on the dataframe itself. Otherwise, the function returns the sorted dataframe without changing the original.
Also, you need to give the column from the sorted dataframe as the list of colors, not the original unsorted color list.
(Note that in Python strings need either single or double quotes, and commands aren't ended with a semicolon.)
import pandas as pd
import numpy as np
import matplotlib.pyplot as plot
changelist = (0.1, 0.12, 0.13, -0.1, 0.05, 0.07)
assetlist = ('a', 'b', 'c', 'd', 'e', 'f')
clrs = ('yellow', 'green', 'blue', 'blue', 'green', 'yellow')
data = {"Assets": assetlist,
"Change": changelist,
"Colors": clrs}
dataFrame = pd.DataFrame(data=data)
dataFrame.sort_values("Change", ascending=False, inplace=True)
dataFrame.plot.bar(x="Assets", y="Change", rot=90, title="Desempeño Principales Activos Enero en MXN",
color=dataFrame["Colors"])
plot.show(block=True)
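If you would rather avoid inplace=True, assigning the result of sort_values to a new name works as well. The sketch below uses the same numbers and only checks the resulting order, without plotting; the names `frame` and `ordered` are made up for this illustration.

```python
import pandas as pd

frame = pd.DataFrame({
    "Assets": ["a", "b", "c", "d", "e", "f"],
    "Change": [0.1, 0.12, 0.13, -0.1, 0.05, 0.07],
    "Colors": ["yellow", "green", "blue", "blue", "green", "yellow"],
})

# sort_values returns a new, sorted frame; `frame` itself keeps its order
ordered = frame.sort_values("Change", ascending=False)

print(frame["Assets"].tolist())    # ['a', 'b', 'c', 'd', 'e', 'f']
print(ordered["Assets"].tolist())  # ['c', 'b', 'a', 'f', 'e', 'd']
print(ordered["Colors"].tolist())  # ['blue', 'green', 'yellow', 'yellow', 'green', 'blue']
```

Passing `ordered["Colors"]` as the color argument then keeps each bar paired with its own color, because the colors travel with the rows when they are sorted.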
I am applying this strategy to place legend outside plot. The main difference here is that there are ax1 and ax2 twin axes.
The x value in bbox_to_anchor is set to 0.89 in the following MWE.
As can be seen, the legend box does not display the entire string labels for each color:
MWE:
import matplotlib.pyplot as plt
import numpy as np
suptitle_label = "rrrrrrrr # ttttt yyyyyyy. uuuuuuuuuuuuuuuuuu\n$[$Xx$_{2}$Yy$_{7}]^{-}$ + $[$XxYy$_{2}$(cccc)$_{2}]^{+}$ JjYy model"
# Plotting
fig, ax1 = plt.subplots()
ax1.set_xlabel('Time')
ax1.set_ylabel('y1label')
new_time = np.linspace(1, 8, 100)
j_data = [np.linspace(1, 4, 100), np.linspace(1, 5, 100), np.linspace(1, 6, 100), np.linspace(1, 7, 100)]
sorted_new_LABELS_fmt = ['$[$XxYy$_{2}$(cc)$_{2}]^{+}$', '$[$Xx$_{2}$Yy$_{7}]^{-}$', '$[$XxYy$_{4}]^{-}$', '$[$Xx$_{2}$Yy$_{5}$(cc)$_{2}]^{+}$']
sorted_new_LABELS_colors = ['green', 'red', 'blue', 'orange']
for j,k,c in zip(j_data, sorted_new_LABELS_fmt, sorted_new_LABELS_colors):
    ax1.plot(new_time, j, label='%s' % k, color='%s' %c)
All_e_chunks_n = np.linspace(-850, -860, 100)
ax2 = ax1.twinx()
ax2.set_ylabel('y2label')
ax2.plot(new_time, All_e_chunks_n, 'grey', alpha=0.6, linewidth=2.5, label='y2')
# Shrink current axis
box = ax1.get_position()
ax1.set_position([box.x0, box.y0, box.width * 0.9, box.height])
# Put the legend:
fig.legend(loc='center left', bbox_to_anchor=(0.89, 0.5))
fig.suptitle(suptitle_label, fontsize=15)
fig.savefig('mwe.pdf', bbox_inches='tight')
Decreasing this x value and commenting out the bbox_inches='tight' part yields the following:
For bbox_to_anchor=(0.85, 0.5), this is the result:
For bbox_to_anchor=(0.80, 0.5), this is the result:
For bbox_to_anchor=(0.7, 0.5), this is the result:
This Sankey diagram has two inputs (K and S), three outputs (H, F and Sp), and the rest (x).
The inputs should come in from the left side and the outputs should go to the right side.
The rest should go to the top.
from matplotlib.sankey import Sankey
import matplotlib.pyplot as plt
fig = plt.figure(figsize = [10,10])
ax = fig.add_subplot(1,1,1)
ax.set(yticklabels=[],xticklabels=[])
ax.text(-10,10, "xxx")
Sankey(ax=ax, flows = [ 20400,3000,-19900,-400,-2300,-800],
labels = ['K', 'S', 'H', 'F', 'Sp', 'x'],
orientations = [ 1, -1, 1, 0, -1, -1 ],
scale=1, margin=100, trunklength=1.0).finish()
plt.tight_layout()
plt.show()
I played a lot with the orientations, but nothing works or looks nice.
Also, is there a way to set a different color for every arrow?
The scale of the Sankey should be such that the input flow times the scale is about 1.0 and the output flow times the scale is about -1.0 (see docs). Therefore, about 1/25000 is a good starting point for experimentation. The margin should be a small number, maybe around 1, or leave it out. I think the only way to have individual colors is to chain multiple Sankeys together (with add), but that's probably not what you want. Use plt.axis("off") to suppress the axes completely.
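The rule of thumb for the scale can also be derived from the flows themselves instead of being guessed. The sketch below takes the reciprocal of the summed inputs, which is one possible reading of that rule rather than the only sensible choice. The names `total_in` and `scale` are made up for this illustration, and the `Agg` backend is only an assumption so that it runs without a display.

```python
import matplotlib
matplotlib.use("Agg")  # assumption: headless backend so the sketch runs anywhere
import matplotlib.pyplot as plt
from matplotlib.sankey import Sankey

flows = [20400, 3000, -19900, -400, -2300, -800]

# Inputs are the positive flows; scale them so that they sum to about 1.0
total_in = sum(f for f in flows if f > 0)
scale = 1 / total_in

fig = plt.figure(figsize=(10, 10))
ax = fig.add_subplot(1, 1, 1)
diagrams = Sankey(ax=ax, flows=flows,
                  labels=['K', 'S', 'H', 'F', 'Sp', 'x'],
                  orientations=[1, -1, 1, 0, -1, -1],
                  scale=scale, trunklength=1).finish()
plt.axis("off")
```

Here the inputs add up to 23400, so the computed scale lands close to the 1/25000 starting point mentioned above.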
My test code:
from matplotlib.sankey import Sankey
import matplotlib.pyplot as plt
fig = plt.figure(figsize = [10,10])
ax = fig.add_subplot(1,1,1)
Sankey(ax=ax, flows = [ 20400,3000,-19900,-400,-2300,-800],
labels = ['K', 'S', 'H', 'F', 'Sp', 'x'],
orientations = [ 1, -1, 1, 0, -1, -1 ],
scale=1/25000, trunklength=1,
edgecolor = '#099368', facecolor = '#099368'
).finish()
plt.axis("off")
plt.show()
Generated Sankey:
With different Colors
from matplotlib.sankey import Sankey
import matplotlib.pyplot as plt
from matplotlib import rcParams
plt.rc('font', family = 'serif')
plt.rcParams['font.size'] = 10
plt.rcParams['font.serif'] = "Linux Libertine"
fig = plt.figure(figsize = [6,4], dpi = 330)
ax = fig.add_subplot(1, 1, 1,)
s = Sankey(ax = ax, scale = 1/40000, unit = 'kg', gap = .4, shoulder = 0.05,)
s.add(
flows = [3000, 20700, -23700,],
orientations = [ 1, 1, 0, ],
labels = ["S Value", "K Value", None, ],
trunklength = 1, pathlengths = 0.4, edgecolor = '#000000', facecolor = 'darkgreen',
lw = 0.5,
)
s.add(
flows = [23700, -800, -2400, -20500],
orientations = [0, 1, -1, 0],
labels = [None, "U Value", "Sp Value", None],
trunklength=1.5, pathlengths=0.5, edgecolor = '#000000', facecolor = 'grey',
prior = 0, connect = (2,0), lw = 0.5,
)
s.add(
flows = [20500, -20000, -500],
orientations = [0, -1, -1],
labels = [None, "H Value", "F Value"],
trunklength =1, pathlengths = 0.5, edgecolor = '#000000', facecolor = 'darkred',
prior = 1, connect = (3,0), lw = 0.5,
)
diagrams = s.finish()
for d in diagrams:
    for t in d.texts:
        t.set_horizontalalignment('left')
diagrams[0].texts[0].set_position(xy = [-0.58, 0.9,]) # S
diagrams[0].texts[1].set_position(xy = [-1.5, 0.9,]) # K
diagrams[2].texts[1].set_position(xy = [ 2.35, -1.2,]) # H
diagrams[2].texts[2].set_position(xy = [ 1.75, -1.2,]) # F
diagrams[1].texts[2].set_position(xy = [ 0.7, -1.2]) # Sp
diagrams[1].texts[1].set_position(xy = [ 0.7, 0.9,]) # U
# print(diagrams[0].texts[0])
# print(diagrams[0].texts[1])
# print(diagrams[1].texts[0])
# print(diagrams[1].texts[1])
# print(diagrams[1].texts[2])
# print(diagrams[2].texts[0])
# print(diagrams[2].texts[1])
# print(diagrams[2].texts[2])
plt.axis("off")
plt.show()