Seaborn - Change the X-Axis Range (Date field) - pandas

How can I change the x-axis so that it begins on January 1, 2022? I don't want to set the upper bound. The aim is to create a YTD chart. Thanks! (The x-axis field 'Date_reported' has dtype datetime64[ns].) PS: does anyone know why my figsize statement isn't working? I'm aiming for a 15 by 8 size, but it has no effect.
sns.relplot(kind='line', data=df_Final, x='Date_reported', y='New_cases_Mov_avg',
            hue='Continent', linewidth=1, ci=None)
sns.set_style("white")
sns.set_style('ticks')
plt.xlabel("Date Reported")
plt.ylabel("New Cases (Moving Average)")
plt.figure(figsize=(15,8))

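You could define your figure and axes beforehand, set the figsize there, and then plot onto them. This means using lineplot instead of relplot, because relplot is a figure-level function that creates its own figure and does not accept an ax argument. Your plt.figure(figsize=(15,8)) call has no effect on the chart because it opens a new, empty figure after the plot has already been drawn.
ax.set_xlim with a single argument sets only the left boundary and leaves the right one as it is; fig.autofmt_xdate rotates the x labels.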
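import datetime
import matplotlib.pyplot as plt
import seaborn as sns

# set the style before creating the figure so that it applies to the new axes
sns.set_style('ticks')
fig, ax = plt.subplots(figsize=(15,8))
sns.lineplot(data=df_Final, x='Date_reported', y='New_cases_Mov_avg',
             hue='Continent', linewidth=1, ci=None, ax=ax)
ax.set_xlabel("Date Reported")
ax.set_ylabel("New Cases (Moving Average)")
# only the left limit is given, so the right limit is left as it is
ax.set_xlim(datetime.date(2022, 1, 1))
fig.autofmt_xdate()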
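If you want to convince yourself that a single argument to set_xlim really leaves the right side alone, here is a minimal sketch using plain matplotlib and made-up data (the dates and values arrays are assumptions for illustration, not the asker's df_Final):

```python
import datetime

import matplotlib
matplotlib.use("Agg")  # headless backend, only so the sketch runs without a display
import matplotlib.dates as mdates
import matplotlib.pyplot as plt
import numpy as np

# Hypothetical data: one value per day from July 2021 through June 2022.
dates = np.arange("2021-07-01", "2022-07-01", dtype="datetime64[D]")
values = np.linspace(0.0, 1.0, len(dates))

fig, ax = plt.subplots(figsize=(15, 8))
ax.plot(dates, values)

right_before = ax.get_xlim()[1]               # right limit chosen by autoscaling
ax.set_xlim(left=datetime.date(2022, 1, 1))   # only the left bound is given
left_after, right_after = ax.get_xlim()       # limits as matplotlib date numbers
fig.autofmt_xdate()
```

Here right_after should equal right_before, and mdates.num2date(left_after) should fall on January 1, 2022.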

Related

Remove thousands(k) in plotly line plot

I have a plotly line plot whose x-axis is year and week, stored as an integer. The data is correct for year-weeks 202101, 202102, 202103, but the plot shows them as 202.1K, 202.1K, 202.1K. I would like the axis to show 202101, 202102, 202103 as well. Below is my code.
if chart_choice == 'line':
    print(dff['week'])
    dff = dff.groupby(['product','week'], as_index=False)[['CYSales']].sum()
    fig = px.line(dff, x='week', y=num_value, color='product')
    return fig
thanks for help
Does tickformat = "000" help perhaps? Have a look at the related question "Plotly for R: Remove the k that apears on the y axis when the dataset contains numbers larger than 1000".

Creating a grid of polar histograms (python)

I wish to create a sub plot that looks like the following picture,
It is supposed to contain 25 polar histograms, and I wish to add them to the plot one by one.
It needs to be in Python.
I already figured out that I need to use matplotlib, but I can't seem to work out the rest.
Thanks a lot!
You can create a grid of polar axes via projection='polar'.
hist creates a histogram, also when working with polar axes. Note that the x is in radians with a range of 2π. It works best when you give the bins explicitly as a linspace from 0 to 2π (or from -π to π, depending on the data). The third parameter of linspace should be one more than the number of bars that you'd want for the full circle.
About the exact parameters of axs[i][j].hist(x, bins=np.linspace(0, 2 * np.pi, np.random.randint(7, 30), endpoint=True), color='dodgerblue', ec='black'):
axs[i][j]: draw on the jth subplot of the ith row
.hist: create a histogram
x: the values that are put into bins
bins=: the bins, given either as a fixed number of equal-width bins between the lowest and highest x or as explicit boundaries; the default is 10 bins
np.random.randint(7, 30): a random whole number between 7 and 29
np.linspace(0, 2 * np.pi, n, endpoint=True): place n evenly spaced boundaries from 0 to 2π, which divides the range into n-1 equal parts; endpoint=True puts boundaries at 0, at 2π and at n-2 positions in between; with endpoint=False there is a boundary at 0 and at n-1 positions in between, but none at 2π
color='dodgerblue': the histogram bars will be blueish
ec='black': the edge color of the bars will be black
import numpy as np
import matplotlib.pyplot as plt
fig, axs = plt.subplots(5, 5, figsize=(8, 8),
                        subplot_kw=dict(projection='polar'))
for i in range(5):
    for j in range(5):
        x = np.random.uniform(0, 2 * np.pi, 50)
        axs[i][j].hist(x, bins=np.linspace(0, 2 * np.pi, np.random.randint(7, 30)), color='dodgerblue', ec='black')
plt.tight_layout()
plt.show()
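The relation between the number of boundaries and the number of bars can be checked with numpy alone. A small sketch, where n = 9 and the random angles are arbitrary values chosen for illustration:

```python
import numpy as np

n = 9  # number of boundaries (hypothetical value)

# endpoint=True: boundaries at 0, at 2*pi and at n-2 positions in between
edges_closed = np.linspace(0, 2 * np.pi, n, endpoint=True)
# endpoint=False: n boundaries starting at 0, none at 2*pi
edges_open = np.linspace(0, 2 * np.pi, n, endpoint=False)

# Hypothetical angles in radians, binned with the closed set of boundaries
rng = np.random.default_rng(0)
x = rng.uniform(0, 2 * np.pi, 50)
counts, _ = np.histogram(x, bins=edges_closed)
```

With n boundaries there are n-1 bars, so counts has 8 entries here, which is why the third argument of linspace should be one more than the number of bars you want.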

Pandas set labeling legend from groupby elements

I'm plotting a kde distribution of 2 dataframes on the same axis, and I need to set a legend saying which line is which dataframe. Now, this is my code:
fig, ax = plt.subplots(figsize=(15,10))
for label, df in dataframe1.groupby('ID'):
    dataframe1.Value.plot(kind="kde", ax=ax,color='r')
for label, df in dataframe2.groupby('ID'):
    dataframe2.Value.plot(kind='kde', ax=ax, color='b')
plt.legend()
plt.title('title here', fontsize=20)
plt.axvline(x=np.pi,color='gray',linestyle='--')
plt.xlabel('mmHg', fontsize=16)
plt.show()
But the result is this:
How can I show the legends inside the graph as 'values from df1' and 'results from df2'?
Edit:
With the following code I get the result I asked for in the question. But for some dataframes I get the following results:
fig, ax = plt.subplots(figsize=(15,10))
sns.kdeplot(akiPEEP['Value'], color="r", label='type 1', ax=ax)
sns.kdeplot(noAkiPEEP['Value'], color="b",label='type 2', ax=ax)
plt.legend()
plt.title('d', fontsize=20)
plt.axvline(x=np.pi,color='gray',linestyle='--')
plt.xlabel('value', fontsize=16)
plt.show()
A distribution I'm plotting now:
How do I fix this? Also, is it a good idea to plot the rolling means over this distribution as well, or does the plot become too heavy?
I'm not sure I understand your question, but from your code it looks like you are trying to plot one KDE per ID value in your dataframes. In that case you would have to do:
for label, df in dataframe1.groupby('ID'):
    df.Value.plot(kind="kde", ax=ax,color='r', label=label)
Notice that I replaced dataframe1 with df in the body of the for loop. df corresponds to the sub-dataframe in which every element of the column ID has the value label.
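Here is a self-contained sketch of that loop with made-up data (the two IDs and the normal samples are assumptions, not the asker's dataframes). Note that pandas needs scipy installed for kind="kde".

```python
import matplotlib
matplotlib.use("Agg")  # headless backend, only so the sketch runs without a display
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd

# Hypothetical data: two IDs with differently centred values.
rng = np.random.default_rng(0)
dataframe1 = pd.DataFrame({
    "ID": np.repeat(["a", "b"], 100),
    "Value": np.concatenate([rng.normal(0, 1, 100), rng.normal(3, 1, 100)]),
})

fig, ax = plt.subplots(figsize=(15, 10))
for label, df in dataframe1.groupby("ID"):
    # df holds only the rows of this ID; label becomes the legend entry
    df["Value"].plot(kind="kde", ax=ax, label=f"ID {label}")
ax.legend()

legend_labels = [t.get_text() for t in ax.get_legend().get_texts()]
```

If you only want one curve per dataframe instead of one per ID, the same label argument can be passed directly, e.g. label='values from df1'.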

plot ordering/layering julia pyplot

I have a subplot that plots a line (x,y) and a particular point (xx,yy). I want to highlight (xx,yy), so I've plotted it with scatter. However, even if I call scatter after the original plot, the new point still shows up behind the original line. How can I fix this? MWE below.
x = 1:10
y = 1:10
xx = 5
yy = 5
fig, ax = subplots()
ax[:plot](x,y)
ax[:scatter](xx,yy, color="red", label="h_star", s=100)
legend()
xlabel("x")
ylabel("y")
title("test")
grid("on")
You can control which artists are drawn on top of each other with the zorder argument. The matplotlib zorder example gives a brief explanation:
The default drawing order for axes is patches, lines, text. This
order is determined by the zorder attribute. The following defaults
are set
Artist                     Z-order
Patch / PatchCollection    1
Line2D / LineCollection    2
Text                       3
You can change the order for individual artists by setting the zorder.
Any individual plot() call can set a value for the zorder of that
particular item.
A full example in Python, based on the code in the question, is shown below:
import matplotlib.pyplot as plt
x = range(1, 11)  # 1..10, matching 1:10 in the Julia code
y = range(1, 11)
xx = 5
yy = 5
fig, ax = plt.subplots()
ax.plot(x,y)
# could set zorder very high, say 10, to "make sure" it will be on the top
ax.scatter(xx,yy, color="red", label="h_star", s=100, zorder=3)
plt.legend()
plt.xlabel("x")
plt.ylabel("y")
plt.title("test")
plt.grid(True)
plt.show()
This gives a plot in which the red point is drawn on top of the line.
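The reason the call order does not help is that scatter returns a collection, whose default zorder is lower than that of a line. A minimal sketch to inspect this (the second, green point is added only for comparison and is not part of the question):

```python
import matplotlib
matplotlib.use("Agg")  # headless backend, only so the sketch runs without a display
import matplotlib.pyplot as plt

fig, ax = plt.subplots()
(line,) = ax.plot(range(1, 11), range(1, 11))

# default zorder: drawn below the line even though it is added later
behind = ax.scatter(5, 5, color="red", s=100)
# explicit zorder: drawn above the line
on_top = ax.scatter(6, 6, color="green", s=100, zorder=3)

print(line.get_zorder(), behind.get_zorder(), on_top.get_zorder())
```

Any zorder larger than the line's value puts the point on top; 3 is just one convenient choice.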

how to plot 2 histograms side by side?

I have 2 dataframes. I want to plot a histogram based on a column 'rate' for each, side by side. How to do it?
I tried this:
import matplotlib.pyplot as plt
plt.subplot(1,2,1)
dflux.hist('rate' , bins=100)
plt.subplot(1,2,2)
dflux2.hist('rate' , bins=100)
plt.tight_layout()
plt.show()
It did not have the desired effect. It showed two blank charts then one populated chart.
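Use subplots to define a figure with two axes. Then tell hist which axis to draw on with the ax parameter. Without ax, DataFrame.hist creates a new figure of its own instead of drawing on the axes you set up with plt.subplot, which is why those stayed blank.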
fig, axes = plt.subplots(1, 2)
dflux.hist('rate', bins=100, ax=axes[0])
dflux2.hist('rate', bins=100, ax=axes[1])
Demo
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt

dflux = pd.DataFrame(dict(rate=np.random.randn(10000)))
dflux2 = pd.DataFrame(dict(rate=np.random.randn(10000)))
fig, axes = plt.subplots(1, 2)
dflux.hist('rate', bins=100, ax=axes[0])
dflux2.hist('rate', bins=100, ax=axes[1])
plt.show()