Graph plotting in pandas and seaborn - pandas

I have a table with 5 columns and 8000 rows:
Market DeliveryWindowID #Orders #UniqueShoppersAvailable #UniqueShoppersFulfilled
NY 296 2 2 5
MA 365 3 4 8
How do I plot a graph in pandas or seaborn that shows #Orders, #UniqueShoppersAvailable and #UniqueShoppersFulfilled against the market and delivery window?

Using Seaborn, reshape your dataframe with melt first:
import seaborn as sns
df_chart = df.melt(['Market', 'DeliveryWindowID'])
sns.barplot(x='Market', y='value', hue='variable', data=df_chart)
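To see what melt hands to the plotting call, here is a minimal sketch built only from the two sample rows in the question (the real table has about 8000 rows). The column names 'variable' and 'value' are the pandas defaults for melt, which is why the barplot call refers to them.

```python
import pandas as pd

# Two sample rows copied from the question; this is illustration data only.
df = pd.DataFrame({
    "Market": ["NY", "MA"],
    "DeliveryWindowID": [296, 365],
    "#Orders": [2, 3],
    "#UniqueShoppersAvailable": [2, 4],
    "#UniqueShoppersFulfilled": [5, 8],
})

# Wide -> long: the id columns are repeated, and every other column
# becomes one (variable, value) pair per original row.
df_chart = df.melt(id_vars=["Market", "DeliveryWindowID"])
print(df_chart)
```

Each original row turns into three long-form rows, one per measure, so seaborn can colour the bars by 'variable'.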

One way, if you want a quick visualization, is to set Market as the index, which puts it on the x axis, and draw a bar chart. The bars can be stacked or not.
Not stacked:
import matplotlib.pyplot as plt
df.drop(columns=['DeliveryWindowID']).set_index('Market').plot(kind='bar')
plt.show()
Stacked:
df.drop(columns=['DeliveryWindowID']).set_index('Market').plot(kind='bar', stacked=True)
plt.show()
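With 8000 rows, plotting the frame directly draws one group of bars per row. A possible variation, which is my assumption about what is wanted rather than part of the answer above, is to sum the counts per market first so that each market gets a single bar. The four rows below are made-up illustration data.

```python
import pandas as pd

# Hypothetical sample: two delivery windows per market.
df = pd.DataFrame({
    "Market": ["NY", "NY", "MA", "MA"],
    "DeliveryWindowID": [296, 297, 365, 366],
    "#Orders": [2, 1, 3, 4],
    "#UniqueShoppersAvailable": [2, 3, 4, 1],
    "#UniqueShoppersFulfilled": [5, 2, 8, 3],
})

# Collapse to one row per market before plotting.
per_market = df.drop(columns=["DeliveryWindowID"]).groupby("Market").sum()

# One stacked bar per market, one segment per measure.
ax = per_market.plot(kind="bar", stacked=True)
ax.set_ylabel("count")
# Call matplotlib.pyplot.show() to display the figure when running as a script.
```

Summing is only one choice of aggregate; mean() or max() would slot into the same place.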

Related

Stacked bar chart for a pandas df [duplicate]

I have a df like this and would like to plot a stacked bar chart where the x-axis is Component and the y-axis shows the count of each Priority ('Major', 'Minor', etc.).
   Component      Priority
0  Browse Groups  Minor
1  Notifications  Major
2  BI             Major
3  BI             Minor
4  BI             Minor
For example, the first bar would show the first component with a count of 1 Minor, and so on; the third would have 'BI' on the x-axis with 1 Major and 2 Minor stacked.
What is the simplest way to do this in seaborn or something similar?
You can group by both columns, count the rows in each group, then unstack and plot the result as a stacked bar chart:
df.groupby(['Component', 'Priority']).Priority.count().unstack().plot.bar(stacked=True)
Example:
import pandas as pd
df = pd.DataFrame({'Component': list('abccc'), 'Priority': ['Minor', 'Major', 'Major', 'Minor', 'Minor']})
df.groupby(['Component', 'Priority']).Priority.count().unstack().plot.bar(stacked=True)
As an alternative, you can use a crosstab:
pd.crosstab(df.Component, df.Priority).plot.bar(stacked=True)
If you want to use seaborn (I only just noticed the seaborn tag), you can use displot:
import seaborn as sns
sns.displot(x='Component', hue='Priority', data=df, multiple='stack')
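As a quick check of the pandas approaches above, the sketch below builds the table from the question's five rows and shows the count matrix that gets plotted. Using size() with fill_value=0 in the groupby variant is my own substitution so that missing combinations show as 0 rather than NaN.

```python
import pandas as pd

# The five rows shown in the question.
df = pd.DataFrame({
    "Component": ["Browse Groups", "Notifications", "BI", "BI", "BI"],
    "Priority": ["Minor", "Major", "Major", "Minor", "Minor"],
})

# Rows are components, columns are priorities, cells are counts.
counts = pd.crosstab(df["Component"], df["Priority"])
print(counts)

# The groupby route yields the same matrix.
via_groupby = df.groupby(["Component", "Priority"]).size().unstack(fill_value=0)

# Each row of the matrix becomes one bar, each column one stacked segment.
ax = counts.plot.bar(stacked=True)
```

The 'BI' row holds 1 Major and 2 Minor, which matches the stacking described in the question.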

Plot a barchart after pandas.value_counts() [duplicate]

I have a dataframe with several columns, and I need to plot a graph based on the value counts in the 'Total' column.
I ran the following code:
df['Total'].value_counts()
The output is as follows:
2 10
20 15
4 8
8 20
This means that the number 2 appears in the 'Total' column 10 times, the number 20 appears 15 times, and so on.
How do I plot a bar chart with the numbers themselves on the x-axis and their occurrences on the y-axis, in ascending order of the numbers? The x-axis should read 2 -> 4 -> 8 -> 20.
What are the next steps after:
%matplotlib inline
import matplotlib.pyplot as plt
Consider this example, where the list stands in for your 'Total' column: [2,2,2,2,2,20,20,20,20,4,4,4,8,8,8,8,8]
import pandas as pd
import matplotlib.pyplot as plt
total = [2,2,2,2,2,20,20,20,20,4,4,4,8,8,8,8,8]
df = pd.DataFrame(total, columns=['total'])
# value_counts() orders by frequency; sort_index() reorders by the values themselves (2 -> 4 -> 8 -> 20)
fig, ax = plt.subplots()
df['total'].value_counts().sort_index().plot(ax=ax, kind='bar', ylabel='frequency')
plt.show()
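The ordering is the part that is easy to miss: value_counts() sorts by frequency, not by the values being counted. A minimal sketch, using a series constructed to reproduce the counts quoted in the question (2 ten times, 20 fifteen times, 4 eight times, 8 twenty times):

```python
import pandas as pd

# Constructed so the counts match those quoted in the question.
total = pd.Series([2] * 10 + [20] * 15 + [4] * 8 + [8] * 20, name="Total")

counts = total.value_counts()   # ordered by frequency, most common first
ordered = counts.sort_index()   # ordered by the counted values: 2, 4, 8, 20

ax = ordered.plot(kind="bar")
ax.set_xlabel("Total")
ax.set_ylabel("occurrences")
```

Plotting `ordered` rather than `counts` is what puts the x-axis in the order 2 -> 4 -> 8 -> 20.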

Can you make 4 bars with 4 columns of data using matplotlib?

I need to make a bar graph where the x-axis shows the index ([1992, 1993, 1994, 1995]) and the y-axis shows the 4 columns of random data below, one bar for each. How can I do that? I am pretty new to matplotlib. Please help. This is the DataFrame:
import numpy as np
import pandas as pd
np.random.seed(12345)
df = pd.DataFrame([np.random.normal(32000,200000,3650), np.random.normal(43000,100000,3650), np.random.normal(43500,140000,3650), np.random.normal(48000,70000,3650)], index=[1992,1993,1994,1995])
df.head()
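One possible reading, and it is only an assumption since the question does not say what each bar should represent, is that each bar shows the mean of one year's 3650 samples. As constructed, the frame has one row per year, so the mean is taken across each row.

```python
import numpy as np
import pandas as pd

np.random.seed(12345)
df = pd.DataFrame(
    [np.random.normal(32000, 200000, 3650),
     np.random.normal(43000, 100000, 3650),
     np.random.normal(43500, 140000, 3650),
     np.random.normal(48000, 70000, 3650)],
    index=[1992, 1993, 1994, 1995],
)

# Assumed summary: one mean per year (axis=1 averages across each row).
means = df.mean(axis=1)

# Four bars, labelled by the index values 1992..1995.
ax = means.plot(kind="bar")
ax.set_ylabel("mean of samples")
```

A different summary (median, sum) would go in place of mean(axis=1) without changing the plotting call.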

plot a stacked bar chart matplotlib pandas

I want to plot this DataFrame, but I get an error.
This is my df:
6month final-formula Question Text
166047.0 1 0.007421 bathing
166049.0 1 0.006441 dressing
166214.0 1 0.001960 feeding
166216.0 2 0.011621 bathing
166218.0 2 0.003500 dressing
166220.0 2 0.019672 feeding
166224.0 3 0.012882 bathing
166226.0 3 0.013162 dressing
166229.0 3 0.008821 feeding
160243.0 4 0.023424 bathing
156876.0 4 0.000000 dressing
172110.0 4 0.032024 feeding
How can I plot a stacked bar chart based on the Question Text?
I tried some code, but it raises an error:
dffinal.groupby(['6month','Question Text']).unstack('Question Text').plot(kind='bar',stacked=True,x='6month', y='final-formula')
import matplotlib.pyplot as plt
plt.show()
What I actually want is the 6month column on the x-axis, final-formula on the y-axis, and the Question Text values stacked.
Since there are three kinds of Question Text, each bar should have three stacked segments, and since there are 4 values of 6month, there should be 4 bars in total.
I tried to produce something like this, but it did not work.
Am I missing something?
The picture shows the result without stacking: all the Question Text values are summed into one bar. I want a separate stacked segment for each Question Text.
You missed the aggregation step after groupby, namely sum():
df = dffinal.groupby(['6month','Question Text']).sum().unstack('Question Text')
df.columns = df.columns.droplevel()
df.plot(kind='bar', stacked=True)
I dropped the outer MultiIndex level from the columns just so the legend shows only the Question Text values.
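For reference, here is a self-contained sketch using the twelve rows from the question. It selects the final-formula column before aggregating, which is a small variation on the answer above: the result then has plain column labels and no level needs to be dropped.

```python
import pandas as pd

# Data copied from the question (the original row index is omitted).
dffinal = pd.DataFrame({
    "6month": [1, 1, 1, 2, 2, 2, 3, 3, 3, 4, 4, 4],
    "final-formula": [0.007421, 0.006441, 0.001960,
                      0.011621, 0.003500, 0.019672,
                      0.012882, 0.013162, 0.008821,
                      0.023424, 0.000000, 0.032024],
    "Question Text": ["bathing", "dressing", "feeding"] * 4,
})

# Aggregate, then move Question Text from the row index into columns.
wide = (
    dffinal.groupby(["6month", "Question Text"])["final-formula"]
    .sum()
    .unstack("Question Text")
)

# One bar per 6month value, one stacked segment per Question Text.
ax = wide.plot(kind="bar", stacked=True)
ax.set_ylabel("final-formula")
```

The intermediate `wide` frame has 4 rows and 3 columns, which is exactly the 4 bars with 3 segments each that the question asks for.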

Tick labels overlap in pandas bar chart

TL;DR: In pandas how do I plot a bar chart so that its x axis tick labels look like those of a line chart?
I made a time series with evenly spaced intervals (one item each day) and can plot it just fine like this:
intensity[350:450].plot()
plt.show()
But switching to a bar chart created this mess:
intensity[350:450].plot(kind = 'bar')
plt.show()
I then created a bar chart using matplotlib directly but it lacks the nice date time series tick label formatter of pandas:
def bar_chart(series):
    fig, ax = plt.subplots(1)
    ax.bar(series.index, series)
    fig.autofmt_xdate()
    plt.show()

bar_chart(intensity[350:450])
Here's an excerpt from the intensity Series:
intensity[390:400]
2017-03-07 3
2017-03-08 0
2017-03-09 3
2017-03-10 0
2017-03-11 0
2017-03-12 0
2017-03-13 2
2017-03-14 0
2017-03-15 3
2017-03-16 0
Freq: D, dtype: int64
I could go all out and create the tick labels entirely by hand, but I'd rather not have to baby matplotlib. I'd like pandas to do its job and do what it did in the very first figure, but with a bar plot. So how do I do that?
Pandas bar plots are categorical plots. They create one tick (plus label) for each category. If the categories are dates and those dates are contiguous, one option is to leave some labels out, e.g. to label only every fifth category:
ax = series.plot(kind="bar")
ax.set_xticklabels([t if not i%5 else "" for i,t in enumerate(ax.get_xticklabels())])
In contrast, matplotlib bar charts are numerical plots. Here a suitable ticker can be applied that places ticks weekly, monthly, or at whatever interval is needed.
In addition, matplotlib gives you full control over the tick positions and their labels.
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
from matplotlib import dates
index = pd.date_range("2018-01-26", "2018-05-05")
series = pd.Series(np.random.rayleigh(size=100), index=index)
plt.bar(series.index, series.values)
plt.gca().xaxis.set_major_locator(dates.MonthLocator())
plt.gca().xaxis.set_major_formatter(dates.DateFormatter("%b\n%Y"))
plt.show()
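For comparison, the first approach (a pandas categorical bar plot with thinned-out labels) can be sketched in a self-contained form on the same synthetic series. Labelling every seventh bar and the "%Y-%m-%d" label format are arbitrary choices made for this illustration.

```python
import numpy as np
import pandas as pd

index = pd.date_range("2018-01-26", "2018-05-05")
series = pd.Series(np.random.rayleigh(size=len(index)), index=index)

# pandas draws one bar and one tick per date (categorical axis).
ax = series.plot(kind="bar")

# Keep a label on every 7th bar and blank out the rest.
labels = [
    d.strftime("%Y-%m-%d") if i % 7 == 0 else ""
    for i, d in enumerate(series.index)
]
ax.set_xticklabels(labels, rotation=45, ha="right")
# Call matplotlib.pyplot.show() to display the figure when running as a script.
```

The ticks themselves are still drawn at every bar; only the label text is suppressed, so this works best when the bars are evenly spaced in time.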