How to plot a time series with ordinal levels in Plotly Python - plotly-python

I have time series data recorded at discrete ordinal levels (e.g. 0, 1, 2), and I'd like to plot them with meaningful names (e.g. low, medium, high).
Currently I have:
import pandas as pd
import plotly.express as px
df = pd.DataFrame({
    "x": ["2022-01-01", "2022-01-02", "2022-01-03", "2022-01-04"],
    "y": [2, 1, 2, 0],
})
fig = px.line(x=df.x, y=df.y, line_shape="hv")
fig.show()
which produces:
But I'd like something like:

This feels like the easiest way:
import pandas as pd
import plotly.express as px
df = pd.DataFrame({
    "x": ["2022-01-01", "2022-01-02", "2022-01-03", "2022-01-04"],
    "y": [2, 1, 2, 0],
})
fig = px.line(x=df.x, y=df.y, line_shape="hv")
# Relabel the numeric ticks 0, 1, 2 with the level names
fig.update_yaxes(
    ticktext=["Low", "Medium", "High"],
    tickvals=[0, 1, 2],
)
fig.show()
Result:
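In Plotly terms this falls under the "categorical" umbrella.
If the order needs to be tweaked, categoryarray and categoryorder can also be set with update_yaxes. Note that these two settings only take effect on a categorical axis, i.e. when the y values are the level names themselves rather than numbers.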
https://plotly.com/python/reference/layout/yaxis/#layout-yaxis-categoryarray
https://plotly.com/python/reference/layout/yaxis/#layout-yaxis-categoryorder

Related

last tick label missing after change ticks frequency

I would like to change the x tick frequency to every 5, but the last tick is missing (20 in this case)!
#!/usr/bin/env python3
import matplotlib.pyplot as plt
import pandas as pd
import numpy as np
r = np.random.RandomState(10)
df = pd.DataFrame({
    "x": np.linspace(0, 20, 10),
    "y1": r.uniform(1, 10, 10),
    "y2": r.uniform(5, 15, 10),
})
fig, ax = plt.subplots(figsize=(8, 4))
df.plot(x='x', ax=ax)
ax.set_xticks(np.arange(min(df['x']),max(df['x']),5))
plt.legend()
plt.show()
Output:
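A likely cause is that np.arange excludes its stop value, so a range from 0 to 20 in steps of 5 ends at 15. The sketch below, which only computes the tick positions, extends the stop by one step; the `step` and `ticks` names are illustrative:

```python
import numpy as np

x = np.linspace(0, 20, 10)
step = 5

# np.arange stops before the stop value, so push the stop out by one step
ticks = np.arange(x.min(), x.max() + step, step)
print(ticks)
```

The result can then be passed to ax.set_xticks(ticks). If the maximum is not a multiple of the step, this produces one tick beyond the data, so the stop may need adjusting in that case.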

Scaling Markers with Zoom in Plotly's Scattermapbox

Does anyone know if it is possible to specify a mapping that varies the size of markers in a Plotly scattermapbox visualization as one varies the zoom level? I'd like to layer a scattermapbox visualization over a densitymapbox visualization and have the scatter plot be invisible at larger scales but come into view as one zooms in.
Thanks!
You can specify minzoom on layers.
The example below shows a density mapbox that is replaced by red markers after zooming in past zoom level 4.
This works cleanly when the markers and the density items are the same data. If they differ, it would be best to update the question with sample data.
import plotly.express as px
import pandas as pd
import geopandas as gpd
import shapely.geometry
import json
df = pd.DataFrame(
    data=(
        [
            [32.4087249155, -100.9509696428, "2013-01-01", 1],
            [31.5201976084, -102.1030942593, "2013-01-01", 1],
            [31.434573418, -102.0592907601, "2013-01-01", 1],
            [31.2635930582, -101.95341361, "2013-01-01", 1],
            [31.4287233847, -102.0253840388, "2013-01-01", 1],
            [31.4872286706, -101.5455598032, "2021-01-01", 1],
            [31.5439162579, -101.4833865708, "2021-01-01", 1],
            [31.5439362581, -101.4833065695, "2021-01-01", 1],
            [31.7980713977, -102.0937650441, "2021-01-01", 1],
            [32.02050082, -103.31736372, "2021-01-01", 1],
        ]
    ),
    columns=["Latitude", "Longitude", "Date", "Count"],
)
fig = px.density_mapbox(
    df,
    lat="Latitude",
    lon="Longitude",
    z="Count",
    radius=10,
    zoom=3,
)
# fig = go.Figure(go.Scattermapbox())
fig.update_layout(
    mapbox_layers=[
        {
            # "below": "traces",
            "circle": {"radius": 10},
            "color": "red",
            # the layer only becomes visible from zoom level 4 onwards
            "minzoom": 4,
            "source": gpd.GeoSeries(
                df.loc[:, ["Longitude", "Latitude"]].apply(
                    shapely.geometry.Point, axis=1
                )
            ).__geo_interface__,
        },
    ],
    mapbox_style="carto-positron",
)

Align bar and line plot on x axis without the use of rank and pointplot

Please note, I've looked at other questions like question and my problem is different and not a duplicate!
I would like to have two plots that share the same x axis in matplotlib. I thought this could be achieved via constrained_layout, but apparently that is not the case. Here is some example code.
import pandas as pd
import numpy as np
import seaborn as sns
import matplotlib.pyplot as plt
import matplotlib.gridspec as grd
x = np.arange(0, 30, 0.001)
df_line = pd.DataFrame({"x": x, "y": np.sin(x)})
df_bar = pd.DataFrame({
    "x_bar": [1, 7, 10, 20, 30],
    "y_bar": [0.0, 0.3, 0.4, 0.1, 0.2]
})
fig = plt.subplots(constrained_layout=True)
gs = grd.GridSpec(2, 1, height_ratios=[3, 2], wspace=0.1)
ax1 = plt.subplot(gs[0])
sns.lineplot(data=df_line, x=df_line["x"], y=df_line["y"], ax=ax1)
ax1.set_xlabel("time", fontsize="22")
ax1.set_ylabel("y values", fontsize="22")
plt.yticks(fontsize=16)
plt.xticks(fontsize=16)
plt.setp(ax1.get_legend().get_texts(), fontsize="22")
ax2 = plt.subplot(gs[1])
sns.barplot(data=df_bar, x="x_bar", y="y_bar", ax=ax2)
ax2.set_xlabel("time", fontsize="22")
ax2.set_ylabel("y values", fontsize="22")
plt.yticks(fontsize=16)
plt.xticks(fontsize=16)
This leads to the following figure.
However, I would like the corresponding x values of both plots to be aligned. How can I achieve this? Note that I've tried the approach from the following related question, but it doesn't fully apply to my situation. First, with the high number of x points (which I need in reality), a point plot makes the picture too big and slow to load. On top of that, I can't use the rank method because the categories for my bar plot are not evenly distributed. They are specific points on the x axis that should be aligned with the corresponding points on the line plot.
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
x = np.arange(0, 30, 0.001)
df_line = pd.DataFrame({"x": x, "y": np.sin(x)})
df_bar = pd.DataFrame({
    "x_bar": [1, 7, 10, 20, 30],
    "y_bar": [0.0, 0.3, 0.4, 0.1, 0.2]
})
fig, (ax1, ax2) = plt.subplots(2, 1)
ax1.plot(df_line['x'], df_line['y'])
# draw one vertical line per bar at its numeric x position
for i in range(len(df_bar['x_bar'])):
    ax2.axvline(x=df_bar['x_bar'][i], ymin=0, ymax=df_bar['y_bar'][i])
Output:
---edit---
I incorporated @mozway's advice on line width:
lw = 300 / ax1.get_xlim()[1]
for i in range(len(df_bar['x_bar'])):
    ax2.axvline(x=df_bar['x_bar'][i], ymin=0, ymax=df_bar['y_bar'][i], solid_capstyle='butt', lw=lw)
Output:
or:
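One caveat with axvline is that ymin and ymax are fractions of the axes height (0 to 1), not data values, so the lines only match the data while the y range happens to be 0 to 1. A possible alternative, sketched here under the assumption that plain matplotlib bars are acceptable, is to share the x axis and draw the bars at their numeric positions with ax.bar. The width of 0.5 is an arbitrary choice:

```python
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt

x = np.arange(0, 30, 0.001)
df_line = pd.DataFrame({"x": x, "y": np.sin(x)})
df_bar = pd.DataFrame({
    "x_bar": [1, 7, 10, 20, 30],
    "y_bar": [0.0, 0.3, 0.4, 0.1, 0.2],
})

# sharex=True ties the x limits of both axes together
fig, (ax1, ax2) = plt.subplots(2, 1, sharex=True, constrained_layout=True)
ax1.plot(df_line["x"], df_line["y"])

# ax.bar places each bar at its numeric x value, in data coordinates
ax2.bar(df_bar["x_bar"], df_bar["y_bar"], width=0.5)
ax2.set_xlabel("time")
```

Call plt.show() to display the figure. Because both axes share their x limits, a bar at x = 10 sits directly below x = 10 on the line plot.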

pandas scatter plot and groupby does not work

I am trying to do a scatter plot with pandas. Unfortunately kind='scatter' doesn't work. If I change this to kind='line' it works as expected. What can I do to fix this?
for label, d in df.groupby('m'):
    d[['te','n']].sort_values(by='n', ascending=False).plot(kind="scatter", x='n', y='te', ax=ax, label='m = '+str(label))
Use plot.scatter instead:
df = pd.DataFrame({'x': [0, 5, 7,3, 2, 4, 6], 'y': [0, 5, 7,3, 2, 4, 6]})
df.plot.scatter('x', 'y')
Use this snippet if you want individual labels and colours:
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
df = pd.DataFrame({
    'm': np.random.randint(0, 5, size=100),
    'x': np.random.uniform(size=100),
    'y': np.random.uniform(size=100),
})
fig, ax = plt.subplots()
for label, d in df.groupby('m'):
    # generate a random color:
    color = list(np.random.uniform(size=3))
    d.plot.scatter('x', 'y', label=f'group {label}', ax=ax, c=[color])

plot all columns of a pandas dataframe with matplotlib

I have a dataframe with a datetime index and 65 columns.
I want to plot all of these columns separately,
not in a grid or in one figure.
df = test[['one', 'two', 'three', 'four']]
fig, ax = plt.subplots(figsize=(24, 5))
df.plot(ax=ax)
plt.show()
The example above puts everything in one plot instead of plotting each column separately.
You can loop over the columns of the DataFrame and create a new figure for each column. This will plot them all at once. If you want the next one to show up once the previous one is closed then put the call to plt.show() inside the for loop.
import pandas as pd
import matplotlib.pyplot as plt
df = pd.DataFrame({
    'one': [1, 3, 2, 5],
    'two': [9, 6, 4, 3],
    'three': [0, 1, 1, 0],
    'four': [-3, 2, -1, 0],
})
# create a new figure and axes for each column
for col in df.columns:
    fig, ax = plt.subplots()
    df[col].plot(ax=ax, title=col)
plt.show()