Individual axes titles in a facet plot? - plotly-python

I have a Pandas dataframe that I would like to plot with different labels on each yaxis using plotly.
For example:
fig = dataframe.plot(
    x="t", y="value",
    facet_col="variable_label",
    facet_col_wrap=2,
    yaxes_titles="multiplier_label",  # This is unfortunately not a valid keyword argument
).update_yaxes(matches=None, showticklabels=True)
Is there a way to update each yaxis with a different title?

I found a solution - it is quite complicated due to the ordering of yaxes being from bottom to top, left to right! I needed the following utility function:
def get_yaxis(fig, i, facet_col_wrap):
    """Get the yaxis for the ith facet in a facet plot"""
    yaxes = list(fig.select_yaxes())
    n = len(yaxes)
    n_cols = min(facet_col_wrap, n)
    n_rows = n // n_cols
    idx = [[n-((1+r)*n_cols-c) for c in range(n_cols)] for r in range(n_rows)]
    idx = [item for sublist in idx for item in sublist]
    return yaxes[idx[i]]
With this function and a list of my desired yaxis_titles at hand, I can finally do:
for i, yaxis_title in enumerate(yaxis_titles):
    get_yaxis(fig, i, facet_col_wrap).update(title_text=yaxis_title)
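The index arithmetic in get_yaxis can be checked without building a figure. This is only a sketch: it assumes, as the answer does, that the y-axes come back bottom row first, left to right, and that the facets fill the grid completely. facet_axis_order is a hypothetical helper name, not part of plotly.

```python
def facet_axis_order(n_facets, facet_col_wrap):
    """Map facet number (top-left first) to its position in the bottom-up axis list."""
    n_cols = min(facet_col_wrap, n_facets)
    n_rows = n_facets // n_cols  # assumes a completely filled grid
    return [
        n_facets - ((1 + r) * n_cols - c)
        for r in range(n_rows)
        for c in range(n_cols)
    ]


print(facet_axis_order(4, 2))  # [2, 3, 0, 1]
print(facet_axis_order(6, 3))  # [3, 4, 5, 0, 1, 2]
```

So with four facets in two columns, the first facet (top left) is the third axis in the list, which is why the titles cannot simply be assigned in list order.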

Related

discrete numpy array to continuous array

I have some discrete data in an array, such that:
arr = np.array([[1,1,1],[2,2,2],[3,3,3],[2,2,2],[1,1,1]])
whose plot looks like:
I also have an index array, such that each unique value in arr is associated with a unique index value, like:
ind = np.array([[1,1,1],[2,2,2],[3,3,3],[4,4,4],[5,5,5]])
What is the most Pythonic way of converting arr from discrete values to continuous values, i.e. interpolating between the discrete points, so that the array would look like this when plotted?
I found a solution to this if anyone has a similar issue. It is maybe not the most elegant so modifications are welcome:
def ref_linear_interp(x, y):
    arr = []
    ux = np.unique(x)  # unique x values
    for u in ux:
        idx = y[x == u]
        # lo/hi instead of min/max, to avoid shadowing the builtins
        try:
            lo = y[x == u - 1][0]
            hi = y[x == u][0]
        except IndexError:
            lo = y[x == u][0]
            hi = y[x == u][0]
        # note: this second block overwrites the lo/hi set above
        try:
            lo = y[x == u][0]
            hi = y[x == u + 1][0]
        except IndexError:
            lo = y[x == u][0]
            hi = y[x == u][0]
        if lo == hi:
            sub = np.full((len(idx)), lo)
            arr.append(sub)
        else:
            sub = np.linspace(lo, hi, len(idx))
            arr.append(sub)
    return np.concatenate(arr, axis=None).ravel()
y = np.array([[1,1,1],[2,2,2],[3,3,3],[2,2,2],[1,1,1]])
x = np.array([[1,1,1],[2,2,2],[3,3,3],[4,4,4],[5,5,5]])
z = np.arange(1, 16, 1)
Here is an answer for the symmetric solution that I would expect when reading the question:
import matplotlib.pyplot as plt
import pandas as pd
import numpy as np
# create the data as described
numbers = [1,2,3,2,1]
nblock = 3
df = pd.DataFrame({
    "x": np.arange(nblock*len(numbers)),
    "y": np.repeat(numbers, nblock),
    "label": np.repeat(np.arange(len(numbers)), nblock)
})
Expecting a constant block size of 3, we could use a rolling window:
df['y-smooth'] = df['y'].rolling(nblock, center=True).mean()
# fill NaNs at both ends (assign back; an in-place fill on a column selection may not update df)
df['y-smooth'] = df['y-smooth'].bfill().ffill()
plt.plot(df['x'], df['y-smooth'], marker='*')
If the block size is allowed to vary, we could determine the block centers and interpolate piecewise.
centers = df[['x', 'y', 'label']].groupby('label').mean()
df['y-interp'] = np.interp(df['x'], centers['x'], centers['y'])
plt.plot(df['x'], df['y-interp'], marker='*')
Note: To select the left corner of the labelled blocks instead of their centers, you may also try:
centers = df[['x', 'y', 'label']].groupby('label').min()
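For anyone who wants the block-center interpolation without pandas, here is a rough NumPy-only sketch of the same idea. interp_block_centers is a hypothetical helper name, and it assumes every block has a distinct label.

```python
import numpy as np


def interp_block_centers(x, y, labels):
    """Linearly interpolate y between the mean (x, y) of each labelled block."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    _, inv = np.unique(np.asarray(labels).ravel(), return_inverse=True)
    inv = inv.ravel()
    counts = np.bincount(inv)
    cx = np.bincount(inv, weights=x) / counts  # block centers in x
    cy = np.bincount(inv, weights=y) / counts  # block means in y
    order = np.argsort(cx)                     # np.interp needs increasing x
    return np.interp(x, cx[order], cy[order])


numbers = [1, 2, 3, 2, 1]
nblock = 3
x = np.arange(nblock * len(numbers))
y = np.repeat(numbers, nblock)
labels = np.repeat(np.arange(len(numbers)), nblock)
y_interp = interp_block_centers(x, y, labels)
print(y_interp)
```

Outside the first and last block centers np.interp holds the end values constant, so the result stays flat at the two edges.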

randomly choose value between two numpy arrays

I have two numpy arrays:
left = np.array([2, 7])
right = np.array([4, 7])
right_p1 = right + 1
What I want to do is
rand = np.zeros(left.shape[0])
for i in range(left.shape[0]):
    rand[i] = np.random.randint(left[i], right_p1[i])
Is there a way I could do this without using a for loop?
You could try with:
extremes = zip(left, right_p1)
rand = map(lambda x: np.random.randint(x[0], x[1]), extremes)
This way you end up with a lazy map object. If you need to save memory, you can keep it that way; otherwise you can get the full np.array by going through a list conversion. The zip is rebuilt here because a zip iterator can only be consumed once:
rand = np.array(list(map(lambda x: np.random.randint(x[0], x[1]), zip(left, right_p1))))
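A fully vectorized alternative may also be worth trying. As far as I know, both np.random.randint and the newer Generator.integers accept array-like bounds and broadcast them elementwise; this sketch assumes a reasonably recent NumPy, and the seed is arbitrary.

```python
import numpy as np

left = np.array([2, 7])
right = np.array([4, 7])

# Legacy API: low/high are broadcast elementwise, high is exclusive.
rand_legacy = np.random.randint(left, right + 1)

# Generator API: endpoint=True makes the upper bound inclusive,
# so the right + 1 helper array is not needed.
rng = np.random.default_rng(0)  # seed chosen arbitrarily for this sketch
rand = rng.integers(left, right, endpoint=True)

print(rand_legacy, rand)
```

Both calls return one draw per pair of bounds, with no Python-level loop.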

In numpy, what is the efficient way to find the maximum values and their indices of a 3D ndarray across two axes?

How to find the correlation-peak values and coordinates of a set of 2D cross-correlation functions?
Given a 3D ndarray that contains a set of 2D cross-correlation functions, what is the efficient way to find the maximum (peak) values and their coordinates (x and y indices)?
The code below does the job, but I think it is inefficient.
import numpy as np
import numpy.matlib
ccorr = np.random.rand(7,5,5)
xind = ccorr.argmax(axis=-1)
mccorr = ccorr[np.matlib.repmat(np.arange(0,7)[:,np.newaxis],1,5),np.matlib.repmat(np.arange(0,5)[np.newaxis,:],7,1), xind]
yind = mccorr.argmax(axis=-1)
xind = xind[np.arange(0,7),yind]
values = mccorr[np.arange(0,7),yind]
print("cross-correlation functions (z,y,x)")
print(ccorr)
print("x and y indices of the maximum values")
print(xind,yind)
print("Maximum values")
print(values)
You'll want to flatten the dimensions you're searching over and then use unravel_index and take_along_axis to get the coordinates and values, respectively.
ccorr = np.random.rand(7,5,5)
cc_rav = ccorr.reshape(ccorr.shape[0], -1)
idx = np.argmax(cc_rav, axis = -1)
indices_2d = np.unravel_index(idx, ccorr.shape[1:])
vals = np.take_along_axis(cc_rav, idx[:, np.newaxis], axis=-1)[:, 0]
If you're using a numpy version older than 1.15:
vals = cc_rav[np.arange(ccorr.shape[0]), idx]
or:
vals = ccorr[np.arange(ccorr.shape[0]),
             indices_2d[0], indices_2d[1]]
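To see the flatten-then-unravel approach on data where the answer is known in advance, here is a small sketch with peaks planted at made-up positions (the array sizes and peak values are purely illustrative):

```python
import numpy as np

# Three 4x5 maps with a known peak planted in each (illustrative data).
ccorr = np.zeros((3, 4, 5))
peaks = [(0, 1), (2, 4), (3, 0)]  # (y, x) position of the peak in each map
for z, (y, x) in enumerate(peaks):
    ccorr[z, y, x] = 1.0 + z

flat = ccorr.reshape(ccorr.shape[0], -1)   # merge the y and x axes
idx = flat.argmax(axis=-1)                 # flat index of each peak
yind, xind = np.unravel_index(idx, ccorr.shape[1:])
vals = np.take_along_axis(flat, idx[:, np.newaxis], axis=-1)[:, 0]

print(yind, xind, vals)
```

Note that take_along_axis wants an index array with the same number of dimensions as the array it reads from, hence the extra axis on idx.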

Numpy: regrid by averaging?

I'm trying to regrid a numpy array onto a new grid. In this specific case, I'm trying to regrid a power spectrum onto a logarithmic grid so that the data are evenly spaced logarithmically for plotting purposes.
Doing this with straight interpolation using np.interp results in some of the original data being ignored entirely. Using digitize gets the result I want, but I have to use some ugly loops to get it to work:
xfreq = np.fft.fftfreq(100)[1:50] # only positive, nonzero freqs
psw = np.arange(xfreq.size) # dummy array for MWE
# new logarithmic grid
logfreq = np.logspace(np.log10(np.min(xfreq)), np.log10(np.max(xfreq)), 100)
inds = np.digitize(xfreq,logfreq)
# interpolation: ignores data *but* populates all points
logpsw = np.interp(logfreq, xfreq, psw)
# so average down where available...
logpsw[np.unique(inds)] = [psw[inds==i].mean() for i in np.unique(inds)]
# the new plot
loglog(logfreq, logpsw, linewidth=0.5, color='k')
Is there a nicer way to accomplish this in numpy? I'd be satisfied with just a replacement of the inline loop step.
You can use bincount() twice to calculate the average value of every bin:
logpsw2 = np.interp(logfreq, xfreq, psw)
counts = np.bincount(inds)
mask = counts != 0
logpsw2[mask] = np.bincount(inds, psw)[mask] / counts[mask]
or use unique(inds, return_inverse=True) and bincount() twice:
logpsw4 = np.interp(logfreq, xfreq, psw)
uinds, inv_index = np.unique(inds, return_inverse=True)
logpsw4[uinds] = np.bincount(inv_index, psw) / np.bincount(inv_index)
Or if you use Pandas:
import pandas as pd
logpsw4 = np.interp(logfreq, xfreq, psw)
s = pd.Series(psw).groupby(inds).mean()  # the top-level pd.groupby is gone in current pandas
logpsw4[s.index] = s.values
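The double-bincount trick is easier to follow on a tiny example. This is a sketch rather than a drop-in replacement: bin_means is a hypothetical helper, and it assumes every bin index is smaller than n_bins. Passing minlength keeps the result the same length as the target grid even when the last bins are empty.

```python
import numpy as np


def bin_means(inds, values, n_bins):
    """Mean of values per bin index; NaN where a bin received no data."""
    counts = np.bincount(inds, minlength=n_bins)
    sums = np.bincount(inds, weights=values, minlength=n_bins)
    means = np.full(n_bins, np.nan)
    filled = counts > 0
    means[filled] = sums[filled] / counts[filled]
    return means


inds = np.array([0, 0, 2, 2, 2, 3])
values = np.array([1.0, 3.0, 2.0, 4.0, 6.0, 5.0])
means = bin_means(inds, values, n_bins=5)
print(means)  # bins 1 and 4 are empty and stay NaN
```

The empty bins could then be filled from np.interp, as in the snippets above.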

N-D interpolation for equally-spaced data

I'm trying to copy the Scipy Cookbook function:
from scipy import ogrid, sin, mgrid, ndimage, array
x,y = ogrid[-1:1:5j,-1:1:5j]
fvals = sin(x)*sin(y)
newx,newy = mgrid[-1:1:100j,-1:1:100j]
x0 = x[0,0]
y0 = y[0,0]
dx = x[1,0] - x0
dy = y[0,1] - y0
ivals = (newx - x0)/dx
jvals = (newy - y0)/dy
coords = array([ivals, jvals])
newf = ndimage.map_coordinates(fvals, coords)
by using my own function that has to work for many scenarios
import scipy
import numpy as np
"""N-D interpolation for equally-spaced data"""
x = np.c_[plist['modx']]
y = np.transpose(np.c_[plist['mody']])
pdb.set_trace()
#newx,newy = np.meshgrid(plist['newx'],plist['newy'])
newx,newy = scipy.mgrid[plist['modx'][0]:plist['modx'][-1]:-plist['remapto'],
                        plist['mody'][0]:plist['mody'][-1]:-plist['remapto']]
x0 = x[0,0]
y0 = y[0,0]
dx = x[1,0] - x0
dy = y[0,1] - y0
ivals = (newx - x0)/dx
jvals = (newy - y0)/dy
coords = scipy.array([ivals, jvals])
for i in np.arange(ivals.shape[0]):
    nvals[i] = scipy.ndimage.map_coordinates(ivals[i], coords)
I'm having difficulty getting this code to work properly. The problem areas are:
1.) Recreating this line: newx,newy = mgrid[-1:1:100j,-1:1:100j]. In my case I have a dictionary with the grid in vector form. I've tried to recreate this line using np.meshgrid, but then I get an error on the line coords = scipy.array([ivals, jvals]). I'm looking for some help in recreating this Cookbook function and making it more dynamic.
Any help is greatly appreciated.
/M
You should have a look at the documentation for map_coordinates. I don't see where the actual data you are trying to interpolate appears in your code. What I mean is, presumably you have some input data which is a function of x and y, i.e. input = f(x,y), that you want to interpolate. In the first example you show, this is the array fvals. This should be your first argument to map_coordinates.
For example, if the data you are trying to interpolate is input, which should be a 2-dimensional array of shape (len(x),len(y)), then the interpolated data would be:
interpolated_data = map_coordinates(input, coords)
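Putting that together with grids given as plain 1-D vectors, a minimal sketch could look like the following. It assumes equally spaced x and y, uses np.meshgrid with indexing="ij" to mimic what mgrid produces, and picks order=1 (linear) with mode="nearest" purely for illustration; the Cookbook example uses the default spline order.

```python
import numpy as np
from scipy import ndimage

# Coarse, equally spaced grid and the data sampled on it.
x = np.linspace(-1, 1, 5)
y = np.linspace(-1, 1, 5)
fvals = np.sin(x)[:, np.newaxis] * np.sin(y)[np.newaxis, :]  # shape (len(x), len(y))

# Fine target grid built from 1-D vectors; "ij" keeps x along the first axis.
newx, newy = np.meshgrid(
    np.linspace(-1, 1, 100), np.linspace(-1, 1, 100), indexing="ij"
)

# Convert physical coordinates to fractional array indices.
ivals = (newx - x[0]) / (x[1] - x[0])
jvals = (newy - y[0]) / (y[1] - y[0])
coords = np.array([ivals, jvals])

# The data array, not the coordinates, is the first argument.
newf = ndimage.map_coordinates(fvals, coords, order=1, mode="nearest")
print(newf.shape)
```

With the default indexing of np.meshgrid the two axes come out swapped relative to mgrid, which may be what tripped up the attempt in the question.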