seaborn.swarmplot problem with symlog scale: zeros are not expanded

I have a data set of positive values and zeros that I would like to show on a log scale. To represent the zeros I use the 'symlog' scale, but all zero values are mapped onto a single point in the swarmplot. How can I fix this?
import numpy as np
import seaborn as sns
import pandas as pd
import random
import matplotlib.pyplot as plt
n = 100
x = np.concatenate(([0]*n,np.linspace(0,1,n),[5]*n,np.linspace(10,100,n),np.linspace(100,1000,n)),axis=None)
data = pd.DataFrame({'value': x, 'category': random.choices([0,1,2,3], k=len(x))})
f, ax = plt.subplots(figsize=(10, 6))
# 'linthresh' replaces 'linthreshy', which was removed in newer matplotlib releases
ax.set_yscale("symlog", linthresh=1.e-2)
ax.set_ylim(top=1000)
sns.swarmplot(x="category", y="value", data=data)
sns.despine(left=True)
link to the resulting plot

Related

how to display netcdf raster values over map?

I'm trying to plot netCDF raster values of snowfall data as text, overlaid on what I currently have (shown further below). For example, something like this:
Example
This is all the relevant code I have so far; I excluded the non-relevant code. I tried plt.text, and it gave me "ValueError: The truth value of an array with more than one element is ambiguous. Use a.any() or a.all()".
What I have plotted so far
import numpy
from datetime import datetime
import cartopy.crs as ccrs
import cartopy.feature as cfeature
import cartopy.mpl.ticker as cticker
import matplotlib.pyplot as plt
from matplotlib import ticker, patheffects
from metpy.units import units
import numpy as np
import numpy.ma as ma
from scipy.ndimage import gaussian_filter, maximum_filter, minimum_filter
import xarray as xr
from metpy.plots import USCOUNTIES
from gradient import Gradient
import pandas as pd
import matplotlib.colors as col
#Open NOAA Snowfall dataset
ds = xr.open_dataset('sfav2_CONUS_2021093012_to_2022042512.nc')
ds
lat = ds.lat
lon = ds.lon
#converts snowfall data to inches
snowdata = ds['Data'] * 39
plt.text(lon, lat, snowdata, transform=datacrs)
As far as I know, there isn't a vectorized way of plotting text (plt.text or plt.annotate), so you'll have to loop over the arrays and plot each point.
import matplotlib.pyplot as plt
import matplotlib.patheffects as PathEffects
import cartopy.crs as ccrs
import numpy as np
data = np.random.rand(18, 9)
lons, lats = np.mgrid[-17:18:2, 8:-9:-2]
lons = lons * 10
lats = lats * 10
fig, ax = plt.subplots(figsize=(10, 5), dpi=86, facecolor="w", subplot_kw=dict(projection=ccrs.EqualEarth()))
ax.pcolormesh(lons, lats, data, cmap="coolwarm", alpha=.2, transform=ccrs.PlateCarree())
ax.coastlines()
# plt.text takes scalar coordinates, so draw one label per grid cell
for val, lat, lon in zip(data.flat, lats.flat, lons.flat):
    ax.text(
        lon, lat, f"{val:1.1f}", ha="center", va="center", transform=ccrs.PlateCarree(),
        path_effects=[PathEffects.withStroke(linewidth=3, foreground="w", alpha=.5)],
    )
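On a large raster such as a CONUS snowfall grid, labelling every cell would produce an unreadable plot, so it may help to thin the grid first. Below is a minimal sketch of that idea using plain matplotlib axes. The helper name `label_grid`, the `step` parameter and the synthetic data are all made up for illustration, and the sketch assumes `lons`, `lats` and `values` are 2D arrays of the same shape. With cartopy you would pass `transform=...` through `text_kwargs`.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend; drop this line for interactive use
import matplotlib.pyplot as plt
import numpy as np


def label_grid(ax, lons, lats, values, step=1, fmt="{:.1f}", **text_kwargs):
    """Write every `step`-th grid value as text; NaN cells are skipped."""
    sub = (slice(None, None, step), slice(None, None, step))
    texts = []
    for lon, lat, val in zip(lons[sub].flat, lats[sub].flat, values[sub].flat):
        if np.isnan(val):
            continue  # no label for missing data
        texts.append(
            ax.text(lon, lat, fmt.format(val), ha="center", va="center", **text_kwargs)
        )
    return texts


# Synthetic 6 x 10 grid standing in for the real lat/lon/snowfall arrays
lons, lats = np.meshgrid(np.arange(-100.0, -90.0, 1.0), np.arange(30.0, 36.0, 1.0))
values = np.hypot(lons + 95, lats - 33)
values[0, 0] = np.nan  # pretend one cell has no data

fig, ax = plt.subplots()
ax.set_xlim(lons.min() - 1, lons.max() + 1)
ax.set_ylim(lats.min() - 1, lats.max() + 1)
labels = label_grid(ax, lons, lats, values, step=2, fontsize=8)
```

If the coordinates in your dataset are 1D (one vector of latitudes and one of longitudes), `np.meshgrid` can expand them to 2D before calling the helper.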

Pandas.rolling().median vs Scipy.signal.medfilt()

I applied Pandas.rolling().median(), and it has a delay or phase shift (green line).
If I use Scipy.signal.medfilt() the results are not shifted (yellow line).
Why are the results not the same?
P.S. I tried to look into the medfilt implementation, which uses sigtools._order_filterND, which I assume is not written in Python.
CODE:
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
from scipy.signal import medfilt
x = np.zeros(100)
x[30:70] = 2
x+= .1 * np.random.randn(100)
y1 = medfilt(x, kernel_size=5)
plt.plot(x, '.-', label='original signal');
plt.plot(y1, '.-', alpha=.5, label='medfilt');
plt.plot(pd.Series(x).rolling(5).median(), label='rolling window');
plt.legend();
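Have you tried centering the window while rolling? By default, rolling() aligns each window with its right edge, so the output lags by (window - 1) / 2 samples, whereas medfilt centers the kernel on each sample.
pd.Series(x).rolling(5, center=True).median()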
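To check that the shift comes only from the window alignment, the three results can be compared numerically. This is a small sketch using the signal from the question; the variable names and the fixed random seed are my own choices. Only interior samples are compared, because the two libraries treat the edges differently: medfilt pads with zeros, while rolling returns NaN where the window is incomplete.

```python
import numpy as np
import pandas as pd
from scipy.signal import medfilt

rng = np.random.default_rng(0)  # fixed seed so the comparison is repeatable
x = np.zeros(100)
x[30:70] = 2
x += 0.1 * rng.standard_normal(100)

k = 5          # window / kernel size (must be odd for medfilt)
half = k // 2  # number of samples on each side of the center

y_medfilt = medfilt(x, kernel_size=k)
y_trailing = pd.Series(x).rolling(k).median().to_numpy()               # default: right-aligned
y_centered = pd.Series(x).rolling(k, center=True).median().to_numpy()  # centered window

# Centered rolling median matches medfilt away from the edges
same_when_centered = np.allclose(y_centered[half:-half], y_medfilt[half:-half])

# The default rolling median is the same curve delayed by `half` samples
trailing_is_shifted = np.allclose(y_trailing[k - 1:], y_medfilt[half:-half])
```

Both flags should come out True, which suggests the two filters compute the same median and differ only in where the window sits relative to the output sample.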

Add a category without data in it to a plot in seaborn

I am plotting some data as a catplot like this:
ax = sns.catplot(x='Kind', y='VAF', hue='Sample', jitter=True, data=df, legend=False)
The trouble is that some of the categories of 'VAF' contain no data, and the corresponding label is not added to the plot. Is there a way to retain the label but just not plot any points for it?
Here is a reproducible example to help explain:
x=pd.DataFrame({'Data':[1,3,4,6,3,2],'Number':['One','One','One','One','Three','Three']})
plt.figure()
ax = sns.catplot(x='Number', y='Data', jitter=True, data=x)
In this plot you can see that on the x-axis, samples One and Three are displayed. But imagine that there is also a sample Two that just had no data points in it. How can I display One, Two, and Three on the x-axis?
Order parameter
Of course one would need to know which categories are expected. Given a list of expected categories, one can use the order parameter to supply the expected categories.
import pandas as pd
import seaborn as sns
import matplotlib.pyplot as plt
df = pd.DataFrame({'Data':[1,3,4,6,3,2],
'Number':['One','One','One','One','Three','Three']})
exp_cats = ["One", "Two", "Three"]
ax = sns.stripplot(x='Number', y='Data', jitter=True, data=df, order=exp_cats)
plt.show()
Alternatives
The above works with matplotlib 2.2.3, but not with 3.0. It works again with the current development version (hence 3.1). For the moment, there are the following alternatives:
A. Looping over categories
Given a list of expected categories, one can just loop over them and plot a scatter of each category.
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
df = pd.DataFrame({'Data':[1,3,4,6,3,2],
'Number':['One','One','One','One','Three','Three']})
exp_cats = ["One", "Two", "Three"]
for i, cat in enumerate(exp_cats):
    # select the rows of this category (empty for "Two") and jitter them around x = i
    cdf = df[df["Number"] == cat]
    x = np.zeros(len(cdf))+i+.2*(np.random.rand(len(cdf))-0.5)
    plt.scatter(x, cdf["Data"].values)
plt.xticks(range(len(exp_cats)), exp_cats)
plt.show()
B. Map categories to numbers.
You can map the expected categories to numbers and plot numbers instead of categories.
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
df = pd.DataFrame({'Data':[1,3,4,6,3,2],
'Number':['One','One','One','One','Three','Three']})
exp_cats = ["One", "Two", "Three"]
df["IntNumber"] = df["Number"].map(dict(zip(exp_cats, range(len(exp_cats)))))
plt.scatter(df["IntNumber"] + .2*(np.random.rand(len(df))-0.5), df["Data"].values,
c = df["IntNumber"].values.astype(int))
plt.xticks(range(len(exp_cats)), exp_cats)
plt.show()
C. Appending missing categories to the dataframe
Finally you may append nan values to the dataframe to make sure each expected category appears in it.
import numpy as np
import pandas as pd
import seaborn as sns
import matplotlib.pyplot as plt
df = pd.DataFrame({'Data':[1,3,4,6,3,2],
'Number':['One','One','One','One','Three','Three']})
exp_cats = ["One", "Two", "Three"]
# DataFrame.append was removed in pandas 2.0; pd.concat does the same job
dfa = pd.concat([df, pd.DataFrame({'Data':[np.nan]*len(exp_cats), 'Number':exp_cats})],
                ignore_index=True)
ax = sns.stripplot(x='Number', y='Data', jitter=True, data=dfa, order=exp_cats)
plt.show()

matplotlib adding string to an axis

import matplotlib.pyplot as plt
import numpy as np
ydata = [55,60,65,70,75,80]
xdata = [1,2,3,4,5,6]
plt.plot(xdata, ydata)
set(plt.gca,'XTickLabel',{'Jan','Feb','Mar','April','May','June'})
plt.show()
I am using matplotlib and am trying to make text values appear on the x axis.
I have tried the following code but get this error message:
set(plt.gca,'XTickLabel',
{'Jan','Feb','Mar','April','May','June'})
TypeError: set expected at most 1 arguments, got 3
I am not sure what this is referring to: I get the current axes and set the value.
In Python, set is the built-in set data type and has nothing to do with what you want here (set(gca, 'XTickLabel', ...) is MATLAB syntax). You only need ax.set_xticks, to ensure all of the ticks show in the plot, and ax.set_xticklabels:
import matplotlib.pyplot as plt
import numpy as np
fig, ax = plt.subplots()
ydata = [55,60,65,70,75,80]
xdata = [1,2,3,4,5,6]
ax.set_xticks(xdata)
ax.set_xticklabels(['Jan','Feb','Mar','April','May','June'])
plt.plot(xdata, ydata)
plt.show()
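If you prefer the pyplot interface, `plt.xticks` accepts the tick positions and the labels in one call. A short sketch with the data from the question; the variable names `labels` and `shown` are just for illustration, and the final lines read the labels back to confirm they were applied.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend; drop this line for interactive use
import matplotlib.pyplot as plt

xdata = [1, 2, 3, 4, 5, 6]
ydata = [55, 60, 65, 70, 75, 80]
labels = ['Jan', 'Feb', 'Mar', 'April', 'May', 'June']

fig, ax = plt.subplots()
ax.plot(xdata, ydata)
plt.xticks(xdata, labels)  # positions first, then the text for each position

fig.canvas.draw()  # make sure the tick label texts are populated
shown = [t.get_text() for t in ax.get_xticklabels()]
```

Note that the labels go in a list (ordered), not in `{...}` braces, which would create an unordered Python set.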

My pandas-generated subplots are laid out incorrectly

I ran the following code to get two plots next to each other (it is a minimal working example that you can copy):
import pandas as pd
import numpy as np
from matplotlib.pylab import plt
comp1 = np.random.normal(0,1,size=200)
values = pd.Series(comp1)
plt.close("all")
f = plt.figure()
plt.show()
sp1 = f.add_subplot(2,2,1)
values.hist(bins=100, alpha=0.5, color="r", normed=True)
sp2 = f.add_subplot(2,2,2)
values.plot(kind="kde")
Unfortunately, I then get the following image:
This is also an interesting layout, but I wanted the figures to be next to each other. What did I do wrong? How can I correct it?
For clarity, I could also use this:
import pandas as pd
import numpy as np
from matplotlib.pylab import plt
comp1 = np.random.normal(0,1,size=200)
values = pd.Series(comp1)
plt.close("all")
fig, axes = plt.subplots(2,2)
plt.show()
axes[0,0].hist(values, bins=100, alpha=0.5, color="r", normed=True) # Until here, it works. You get a half-finished correct image of what I was going for (though it is 2x2 here)
axes[0,1].plot(values, kind="kde") # This does not work
Unfortunately, in this approach axes[0,1] refers to a subplot that has a plot method but does not know kind="kde". Please note that in the first version plot is called on the pandas object, whereas in the second version plot is called on the subplot, which does not accept the kind="kde" parameter.
Use the ax= argument to tell pandas which subplot object to draw on:
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
comp1 = np.random.normal(0,1,size=200)
values = pd.Series(comp1)
plt.close("all")
f = plt.figure()
sp1 = f.add_subplot(2,2,1)
# density=True replaces normed=True, which was removed in matplotlib 3.1
values.hist(bins=100, alpha=0.5, color="r", density=True, ax=sp1)
sp2 = f.add_subplot(2,2,2)
values.plot(kind="kde", ax=sp2)
plt.show()
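If all you want is the two plots next to each other, a 1 x 2 grid from plt.subplots combined with ax= may be simpler than a 2 x 2 layout. A minimal sketch, assuming a matplotlib version that supports density= and that SciPy is installed (pandas needs it for the KDE plot); the names ax_hist and ax_kde, the bin count and the figure size are arbitrary choices of mine.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend; drop this line for interactive use
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd

values = pd.Series(np.random.default_rng(0).normal(0, 1, size=200))

# One row, two columns: the axes come back already laid out side by side
fig, (ax_hist, ax_kde) = plt.subplots(1, 2, figsize=(8, 3))

# pandas draws on whichever axes is passed via ax=
values.hist(bins=30, alpha=0.5, color="r", density=True, ax=ax_hist)
values.plot(kind="kde", ax=ax_kde)

fig.tight_layout()
```

Calling plt.show() once at the very end, after everything has been drawn, avoids showing an empty figure first.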