I have a data frame as shown below
product bought_date Monthly_profit Average_discout
A 2016 85000000 5
A 2017 55000000 5.6
A 2018 45000000 10
A 2019 35000000 9.8
B 2016 75000000 5
B 2017 55000000 4.6
B 2018 75000000 11
B 2019 45000000 9.8
C 2016 95000000 5.3
C 2017 55000000 5.1
C 2018 50000000 10.2
C 2019 45000000 9.8
From the above I would like to plot 3 subplots, one each for products A, B and C.
In each subplot there should be line plots, where
X axis = bought_date
Y axis1 = Monthly_profit
Y axis2 = Average_discout
I tried the code below.
sns.set(style = 'darkgrid')
sns.lineplot(x = 'bought_date', y = 'Monthly_profit', style = 'product',
data = df1, markers = True, ci = 68, err_style='bars')
Variant 1: using subplots and separating the data manually
import matplotlib.pyplot as plt
import seaborn as sns

products = df['product'].unique()
fig, ax = plt.subplots(1, len(products), figsize=(20, 10))
for i, p in enumerate(products):
    # x and y are passed by keyword; recent seaborn versions reject them as positional arguments
    sns.lineplot(x='bought_date', y='Monthly_profit', data=df[df['product'] == p], ax=ax[i])
    sns.lineplot(x='bought_date', y='Average_discout', data=df[df['product'] == p], ax=ax[i].twinx(), color='orange')
    ax[i].legend([f'Product {p}'])
Variant 2: using FacetGrid:
def lineplot2(x, y, y2, **kwargs):
    # FacetGrid.map hands over the column data positionally; forward it to seaborn by keyword
    ax = sns.lineplot(x=x, y=y, **kwargs)
    ax2 = ax.twinx()
    sns.lineplot(x=x, y=y2, ax=ax2, **kwargs)

g = sns.FacetGrid(df, col='product')
g.map(lineplot2, 'bought_date', 'Monthly_profit', 'Average_discout', marker='o')
These are just rough, general examples; you'll have to tidy up axis labels etc. as needed.
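For a version that runs on its own, here is a minimal sketch using plain matplotlib with the sample data from the question typed in by hand. The colors, markers and figure size are my own choices, and the `Agg` backend line is only there so it runs without a display.

```python
import matplotlib
matplotlib.use("Agg")  # headless backend (assumption); drop this line in a notebook
import matplotlib.pyplot as plt
import pandas as pd

# Sample data copied from the question
df = pd.DataFrame({
    "product": list("AAAABBBBCCCC"),
    "bought_date": [2016, 2017, 2018, 2019] * 3,
    "Monthly_profit": [85e6, 55e6, 45e6, 35e6, 75e6, 55e6,
                       75e6, 45e6, 95e6, 55e6, 50e6, 45e6],
    "Average_discout": [5, 5.6, 10, 9.8, 5, 4.6, 11, 9.8, 5.3, 5.1, 10.2, 9.8],
})

products = df["product"].unique()
fig, axes = plt.subplots(1, len(products), figsize=(15, 4), squeeze=False)
twin_axes = []
for ax, p in zip(axes[0], products):
    sub = df[df["product"] == p]
    # Left y axis: profit
    ax.plot(sub["bought_date"], sub["Monthly_profit"], marker="o", color="tab:blue")
    ax.set_xticks(sub["bought_date"])
    ax.set_xlabel("bought_date")
    ax.set_ylabel("Monthly_profit", color="tab:blue")
    ax.set_title(f"Product {p}")
    # Right y axis: discount, sharing the same x axis
    ax2 = ax.twinx()
    ax2.plot(sub["bought_date"], sub["Average_discout"], marker="s", color="tab:orange")
    ax2.set_ylabel("Average_discout", color="tab:orange")
    twin_axes.append(ax2)
fig.tight_layout()
```

Coloring each y label like its line makes it easier to tell which axis a line belongs to, since the two scales are very different.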
I want to clean up a matplotlib plot by removing the year from every x tick label except the January ones.
My date column is in pandas datetime format.
This is my code:
df_ref1 = data.groupby(['ref_date'])['fuel'].value_counts().unstack()
fig, ax = plt.subplots(figsize=(6,4), dpi=200);
df_ref1.plot(ax=ax, kind='bar', stacked=True)
ax.set_title('Distribuição de tipo de combustível')
ax.spines[['top','right', 'left']].set_visible(False)
ax.set_xlabel('')
plt.xticks(rotation = 45,ha='right');
ax.legend(['Gasolina','Disel', 'Álcool'],bbox_to_anchor=(.85, .95, 0, 0), shadow=False, frameon=False)
plt.tight_layout()
I tried using:
ax.xaxis.set_major_formatter('%Y/%m')
ax.xaxis.set_minor_formatter('%m')
I also tried:
ax.xaxis.set_major_formatter(DateFormatter('%Y/%m'))
ax.xaxis.set_minor_formatter(DateFormatter('%m'))
but then the x tick labels all turn into 1970/01.
Neither attempt worked. Any tips?
One way to do this is to first truncate the labels to show just the year and month, and then show only every nth label (every 12th in your case), so that you see the first month of each year. This assumes the dates are in datetime format.
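As background on the 1970/01 labels: a pandas bar plot puts the bars at integer positions 0, 1, 2, ... and only writes the dates as text labels, so a date formatter ends up converting those small integers instead of your dates. The sketch below shows this on a small made-up frame (the `demo` data is an assumption for illustration) and assumes a recent matplotlib where the date epoch is 1970-01-01.

```python
import matplotlib
matplotlib.use("Agg")  # headless backend (assumption); drop this line in a notebook
import matplotlib.dates as mdates
import matplotlib.pyplot as plt
import pandas as pd

# Small made-up frame with a monthly datetime index
idx = pd.date_range("2021-01-01", periods=4, freq="MS")
demo = pd.DataFrame({"Petrol": [1975, 1976, 1977, 1978]}, index=idx)

fig, ax = plt.subplots()
demo.plot(ax=ax, kind="bar")

# The tick positions are plain integers, not dates
positions = [int(t) for t in ax.get_xticks()]

# A DateFormatter reads those integers as days since the epoch
formatter = mdates.DateFormatter("%Y/%m")
shown = [formatter(p) for p in positions]
```

This is why editing the label text directly, as done in the rest of this answer, works where the formatter does not.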
The data I used...
Date Petrol Diesel Alcohol
0 2021-01-01 1975 320 30
1 2021-02-01 1976 321 31
2 2021-03-01 1977 322 32
3 2021-04-01 1978 323 33
....
22 2022-11-01 1997 342 52
23 2022-12-01 1998 343 53
24 2023-01-01 1999 344 54
The updated code...
df_ref1=df_ref1.set_index('Date', drop=True) ## Set it so dates are in index
##Your code
fig, ax = plt.subplots(figsize=(6,4), dpi=200);
ax.set_title('Distribuição de tipo de combustível')
#ax.spines[['top','right', 'left']].set_visible(False)
ax.set_xlabel('')
df_ref1.plot(ax=ax, kind='bar', stacked=True)
plt.xticks(rotation = 45,ha='right')
ax.legend(['Gasolina','Disel', 'Álcool'],bbox_to_anchor=(.85, .95, 0, 0), shadow=False, frameon=False)
## Get the labels using get_text() and set it to show first 7 characters
labels = [item.get_text() for item in ax.get_xticklabels()]
labels = [x[:7] for x in labels]
ax.set_xticklabels(labels)
## Show only the 1st, 13th, ... label
every_nth = 12
for n, label in enumerate(ax.xaxis.get_ticklabels()):
    if n % every_nth != 0:
        label.set_visible(False)
Update:
To show the tick labels as requested in the comment (year and month for January, just the month for the others), use a similar approach: get the tick labels, keep the first 7 characters for January and only the two month characters (labels[n][5:7]) for the other months, then set them again. Updated code:
df_ref1=df_ref1.set_index('Date', drop=True)
fig, ax = plt.subplots(figsize=(6,4), dpi=200);
ax.set_title('Distribuição de tipo de combustível')
ax.set_xlabel('')
df_ref1.plot(ax=ax, kind='bar', stacked=True)
plt.xticks(rotation = 45, ha='right')
ax.legend(['Gasolina','Disel', 'Álcool'],bbox_to_anchor=(.85, .95, 0, 0), shadow=False, frameon=False)
## Get the labels using get_text()
labels = [item.get_text() for item in ax.get_xticklabels()]
## Show the correct chars depending on whether the label is Jan or not
every_nth = 12
for n in range(len(labels)):
    if n % every_nth == 0:
        labels[n] = labels[n][:7]   ## Is Jan - show the first 7 chars (YYYY-MM)
    else:
        labels[n] = labels[n][5:7]  ## Is NOT Jan - show the two month chars (MM)
## Update labels
ax.set_xticklabels(labels)
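The `every_nth` approach relies on the data starting in January and having no missing months. If that may not hold, a variant is to build the labels from the dates themselves rather than from their position. This is only a sketch; `month_labels` is a helper name I made up.

```python
from datetime import date

def month_labels(dates):
    """Return 'YYYY/MM' for January entries and 'MM' for every other month."""
    return [d.strftime("%Y/%m") if d.month == 1 else d.strftime("%m")
            for d in dates]

# Works on anything with .month and .strftime, e.g. datetime.date or pandas Timestamps
sample = [date(2021, 1, 1), date(2021, 2, 1), date(2021, 12, 1), date(2022, 1, 1)]
labels = month_labels(sample)
```

With the frame above this would be used as `ax.set_xticklabels(month_labels(df_ref1.index))` after the bar plot has been drawn.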
I want to plot two y axes: one with continuous data and the other with values ranging from 0 to 7.
Example:
ID IHC FISH1 FISH2
1 3 11.5 9.5
2 1 2.9 3.9
3 2 1.5 6.5
4 1 3.3 1.3
5 2 5.5 8.5
6 2 6.6 9.6
How can I plot a secondary y axis if it is not related to the primary y axis?
I want to do this in R.
I want a plot like this as output:
data %>%
  select(ID, IHC, FISH1, FISH2) %>%
  gather(key = "FISH_IHC", value = "FISH_val", FISH1, FISH2, IHC, -ID) %>%
  mutate(as_factor = as.factor(FISH_IHC)) %>%
  ggplot(aes(x = reorder(ID, FISH_val), y = FISH_val, group = as_factor), na.rm = TRUE) +
  geom_point(aes(shape = as_factor, color = as_factor)) +
  scale_y_continuous(limits = c(0, 10),
                     oob = function(x, ...){x},
                     expand = c(0, -1),
                     breaks = number_ticks(10),
                     sec.axis = sec_axis(scales::rescale(pretty(range(IHC)),
                                         name = "IHC")))
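One thing worth knowing here: a ggplot2 secondary axis has to be a one-to-one transformation of the primary axis, so an "unrelated" second scale is usually faked by rescaling the second variable onto the primary range and then undoing that rescaling in the axis labels. The arithmetic is sketched below in Python just to show the two formulas; the primary limit of 12 is my assumption (the FISH values go up to 11.5).

```python
PRIMARY_MAX = 12.0    # assumed upper limit of the FISH axis (data max is 11.5)
SECONDARY_MAX = 7.0   # IHC ranges from 0 to 7

def ihc_to_primary(ihc):
    """Stretch an IHC value onto the FISH scale so it can share the y axis."""
    return ihc * PRIMARY_MAX / SECONDARY_MAX

def primary_to_ihc(y):
    """Inverse mapping, used to label the secondary axis in IHC units."""
    return y * SECONDARY_MAX / PRIMARY_MAX

ihc = [3, 1, 2, 1, 2, 2]                          # IHC column from the example
plotted = [ihc_to_primary(v) for v in ihc]        # what gets drawn
recovered = [primary_to_ihc(v) for v in plotted]  # what the axis shows
```

In R the first formula would be applied to the IHC rows before plotting and the second would go into something like `sec_axis(~ . * 7 / 12, name = "IHC")`. Treat that as a sketch; I have not run it against your data.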
In the df all the color types are together after sorting; however, in the first plot Y1 ends up between the Xs (see below).
Adding col= to build a grid breaks the order up even further.
How can I keep the color types together on the plot?
I've tried hue_order, order and col_order without any success.
Plot 1: need the orange X color types to be in order X3 > X2 > X1, not X3 > X2 > Y1 > X1
Plot 2: need all the color types to be in order (one after another)
Thank you in advance!
color_data.csv:
YEAR SITE G_annotation color Type
2018 Alpha Y1 Y A
2017 Alpha X1 X A
2016 Alpha X2 X B
2018 Alpha X3 X B
2017 Alpha Z1 Z B
2017 Alpha T1 T A
2018 Alpha T2 T A
2016 Alpha T3 T A
df = pd.read_csv('color_data.csv')
g= sns.catplot(data=df.sort_values(by='color'),
x="Type", y="G_annotation", hue="color")
df = pd.read_csv('color_data.csv')
g= sns.catplot(data=df.sort_values(by='color'),
x="Type", y="G_annotation", hue="color", col="YEAR")
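One way to control the y order explicitly, independent of how seaborn and matplotlib order string values, is to map each annotation to an integer position yourself and then write the names back as tick labels. The sketch below uses plain matplotlib with the rows typed in by hand. It is an alternative to `catplot`, not a fix for it, and the names `y_pos` and `x_pos` are my own.

```python
import matplotlib
matplotlib.use("Agg")  # headless backend (assumption); drop this line in a notebook
import matplotlib.pyplot as plt

# (YEAR, G_annotation, color, Type) rows copied from color_data.csv above
rows = [
    (2018, "Y1", "Y", "A"), (2017, "X1", "X", "A"), (2016, "X2", "X", "B"),
    (2018, "X3", "X", "B"), (2017, "Z1", "Z", "B"), (2017, "T1", "T", "A"),
    (2018, "T2", "T", "A"), (2016, "T3", "T", "A"),
]

# Fix the y order once: sort by color first, then by annotation name
ordered = sorted({(color, ann) for _, ann, color, _ in rows})
y_pos = {ann: i for i, (_, ann) in enumerate(ordered)}
x_pos = {t: i for i, t in enumerate(sorted({r[3] for r in rows}))}

fig, ax = plt.subplots()
for color in sorted({r[2] for r in rows}):
    # One scatter call per color group so each gets its own legend entry
    pts = [(x_pos[t], y_pos[ann]) for _, ann, c, t in rows if c == color]
    ax.scatter(*zip(*pts), label=color)

ax.set_xticks(list(x_pos.values()))
ax.set_xticklabels(list(x_pos.keys()))
ax.set_yticks(list(y_pos.values()))
ax.set_yticklabels(list(y_pos.keys()))
ax.set_xlabel("Type")
ax.set_ylabel("G_annotation")
ax.legend(title="color")
```

For the per-year grid, the same `y_pos` mapping could be reused in a loop over the years with `plt.subplots(1, n_years, sharey=True)`, so every panel keeps the identical order.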
I have a list of t-shirt orders with the corresponding sizes, and I would like to plot a pie chart for each design showing the percentage sold per size (which size sells the most, etc.).
Design Total
0 Boba L 9
1 Boba M 4
2 Boba S 2
3 Boba XL 5
4 Burger L 6
5 Burger M 2
6 Burger S 3
7 Burger XL 1
8 Donut L 5
9 Donut M 9
10 Donut S 2
11 Donut XL 5
It is not completely clear what you are asking, but here is my interpretation:
df[['Design', 'Size']] = df['Design'].str.rsplit(n=1, expand=True)
fig, ax = plt.subplots(1, 3, figsize=(10,8))
ax = iter(ax)
for t, g in df.groupby('Design'):
    g.set_index('Size')['Total'].plot.pie(ax=next(ax), autopct='%.2f', title=f'{t}')
Maybe you want:
df = pd.read_clipboard() #create data from above text no modification
dfplot = df.loc[df.groupby(df['Design'].str.rsplit(n=1).str[0])['Total'].idxmax(), :]
ax = dfplot.set_index('Design')['Total'].plot.pie(autopct='%.2f')
ax.set_ylabel('');
Let's do groupby.plot.pie:
s = (df.Design.str.split(expand=True)
       .assign(Total=df['Total'])
       .groupby(0)
       .plot.pie(x=1, y='Total', autopct='%.1f%%')
    )
# format the plots; s holds one Axes per design
for design, ax in s.items():
    ax.set_title(design)
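To check the numbers that `autopct` prints on the pies, the shares can also be computed directly. This is a small sketch with the table from the question typed in by hand; `share` and `best` are names I made up.

```python
import pandas as pd

# Data copied from the question
df = pd.DataFrame({
    "Design": ["Boba L", "Boba M", "Boba S", "Boba XL",
               "Burger L", "Burger M", "Burger S", "Burger XL",
               "Donut L", "Donut M", "Donut S", "Donut XL"],
    "Total": [9, 4, 2, 5, 6, 2, 3, 1, 5, 9, 2, 5],
})

# Split "Boba L" into design "Boba" and size "L"
df[["Design", "Size"]] = df["Design"].str.rsplit(n=1, expand=True)

# Percentage of each size within its design
design_total = df.groupby("Design")["Total"].transform("sum")
share = df.assign(Percent=100 * df["Total"] / design_total)

# Best-selling size per design
best = share.loc[share.groupby("Design")["Percent"].idxmax(),
                 ["Design", "Size", "Percent"]]
```

`share` has one row per design and size, so it can also be fed to the pie plotting code above in place of the raw totals.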
I create a bar plot like this:
But since each x axis label is one day of January (for example 1, 3, 4, 5, 7, 8, etc.), I think the best way of showing this is something like
__________________________________________________ x axis
Jan 1 3 4 5 7 8 ...
2019
But I don't know how to do this with pandas.
Here is my code:
import pandas as pd
import matplotlib.pyplot as plt

df = pd.read_excel('solved.xlsx', sheet_name="Jan")
fig, ax = plt.subplots()
df_plot = df.plot(x='Date', y=['Solved', 'POT'], ax=ax, kind='bar',
                  width=0.9, color=('#ffc114','#0098c9'), label=('Solved','POT'))

def line_format(label):
    """
    Convert time label to the format of pandas line plot
    """
    month = label.month_name()[:3]
    if month == 'Jan':
        month += f'\n{label.year}'
    return month

ax.set_xticklabels([line_format(x) for x in df.Date])
The function comes from a solution provided here: Pandas bar plot changes date format
I don't know how to modify it to get the axis I want.
My example data, solved.xlsx:
Date Q A B Solved POT
2019-01-04 Q4 11 9 14 5
2019-01-05 Q4 9 11 14 5
2019-01-08 Q4 11 18 10 6
2019-01-09 Q4 18 19 18 5
I have found a solution:
import pandas as pd
import matplotlib.pyplot as plt

df = pd.read_excel('solved.xlsx', sheet_name="Jan")
fig, ax = plt.subplots()
df_plot = df.plot(x='Date', y=['Solved', 'POT'], ax=ax, kind='bar',
                  width=0.9, color=('#ffc114','#0098c9'), label=('Solved','POT'))

def line_format(label):
    """
    Show the day number, with the month name under the first day (the 2nd here)
    """
    day = str(label.day)
    if label.day == 2:
        day += f'\n{label.month_name()[:3]}'
    return day

ax.set_xticklabels([line_format(x) for x in df.Date])
plt.show()
In my particular case I didn't have the date 2019-01-01, so the first day for me was Jan 2.
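The `day == 2` check only works for this one file. A slightly more general sketch is to put the month and year under the first date of each month, whatever day that happens to be. `day_labels` and the fixed `MONTHS` tuple are my own additions (the tuple avoids locale-dependent month names).

```python
from datetime import date

MONTHS = ("Jan", "Feb", "Mar", "Apr", "May", "Jun",
          "Jul", "Aug", "Sep", "Oct", "Nov", "Dec")

def day_labels(dates):
    """Day number per tick; the first tick of each month also gets month and year."""
    labels = []
    previous = None  # (year, month) of the previous date
    for d in dates:
        text = str(d.day)
        if (d.year, d.month) != previous:
            text += f"\n{MONTHS[d.month - 1]}\n{d.year}"
            previous = (d.year, d.month)
        labels.append(text)
    return labels

# Dates from the example data; pandas Timestamps work the same way
sample = [date(2019, 1, 4), date(2019, 1, 5), date(2019, 1, 8), date(2019, 1, 9)]
labels = day_labels(sample)
```

It would be used as `ax.set_xticklabels(day_labels(df.Date))`, assuming the dates are sorted.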