demarcating a value on sns.heatmap() - matplotlib

I'm plotting the magnitude of a physiological response (z) as a function of trial (y) and time sample within the trial (x). I'm wondering if it's possible to add to each row (for each trial) a point indicating the reaction time for that trial.
def plot_evoked_response_map(ordered_samples_df, fig_name,
                             fig_path=fig_path, trial_end_sample_idx=1500):
    jtplot.style('grade3', context='poster', fscale=1.4, spines=False, gridlines='--')
    ordered_samples_df = ordered_samples_df.loc[ordered_samples_df.trial_sample < trial_end_sample_idx]
    samples_sparse = ordered_samples_df[['trial_sample', 'trial_epoch',
                                         'z_pupil_diameter']]
    samples_sparse['reset_trial_epoch_idx'] = np.repeat(np.arange(0, n_trials),
                                                        trial_end_sample_idx)
    # hack to get pivot to respect the stated order of the trial epochs
    # otherwise, will sort the index ...
    samples_pivot = samples_sparse.pivot(index='reset_trial_epoch_idx',
                                         columns='trial_sample', values='z_pupil_diameter')
    plt.figure(1)
    fig, ax = plt.subplots(figsize=(10,10))
    sns.heatmap(samples_pivot, fmt="g", cmap='viridis',
                cbar_kws={'label': 'pupil diameter'}, robust=True, vmin=0, vmax=2)
    plt.title(fig_name)
    plt.ylabel('trial')
    plt.savefig(os.path.join(fig_path, fig_name + '.png'))
    return fig_name

_ = plot_evoked_response_map(rt_ordered_samples_df, fig_name='RT_ordered_evoked_responses')
[Figure: sample of the type of plot I'm generating]
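One possible way to mark a per-row value, sketched under assumptions: seaborn draws its heatmap on a matplotlib Axes where the cell in row i, column j covers the unit square from (j, i) to (j+1, i+1), so a scatter marker at (rt + 0.5, i + 0.5) sits in the centre of the matching cell. The example below uses plain pcolormesh and made-up data (response and rt_samples are hypothetical names) so that it runs without seaborn; the same ax.scatter call should work on the Axes that sns.heatmap draws on.

```python
import numpy as np
import matplotlib
matplotlib.use("Agg")  # render off-screen
import matplotlib.pyplot as plt

rng = np.random.default_rng(0)
n_trials, n_samples = 20, 150
response = rng.normal(size=(n_trials, n_samples))   # hypothetical z-scored signal
rt_samples = rng.integers(30, 120, size=n_trials)   # hypothetical reaction times, in samples

fig, ax = plt.subplots(figsize=(6, 6))
mesh = ax.pcolormesh(response, cmap="viridis", vmin=0, vmax=2)
ax.invert_yaxis()                                   # first trial on top, as sns.heatmap does
fig.colorbar(mesh, ax=ax, label="pupil diameter")

# One marker per row, centred on the cell that contains the reaction time.
rows = np.arange(n_trials)
markers = ax.scatter(rt_samples + 0.5, rows + 0.5, s=12, color="red",
                     label="reaction time")
ax.set_xlabel("trial sample")
ax.set_ylabel("trial")
ax.legend(loc="upper right")
fig.canvas.draw()
```

If the rows of the pivot table are ordered by reaction time, the markers trace a monotone curve across the map. The reaction times must be expressed in the same units as the columns (sample index within the trial) for the markers to line up.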

Related

I am passing a function through a dataframe in subgroups and I would like to get a separate plot for each subgroup. Now the plots overlap

I want to calculate the prediction bands (upb and lpb) for subgroups in my dataframe and plot each of them individually. I would like to plot the results in separate plots because right now the prediction bands overlap. I tried to save the output (lpb and upb) in a dataframe (predictionbounds), but I get an error, and I don't think this approach is elegant anyway. How can I get 4 different plots out of the loop?
## 1, 2 and 3 are functions I use in the loop
# 1) linear model (curve_fit fits every parameter after x, i.e. b and start)
def model(x, b, start):
    return (b*x) + start
# 2) Prediction band
def predband .....
    return lpb, upb
# 3) function to calculate CI of fitted parameters
def conf_int:........
    return params_ci
### prepare dataset for loop
# group dataset in subgroups. The loop is applied to each subgroup
dfforR2 = dfdata.groupby(["treatment1", "treatment2"])
variables={'treatment1':treatment1, 'treatment2':'treatment2','b':float, 'start':float,
'r_2':float}
results = pd.DataFrame(variables, index=[])
#Here I try to create an empty dataframe so I can save the variables 'lpb' and 'upb'
#bounds={'treatment1':treatment1, 'treatment2':'treatment2', 'lpb':float, 'upb':float}
#predictionbounds=pd.DataFrame(bounds, index=[])
### loop and make fit
for key, g in dfforR2:
    x = np.linspace(0, 2, )
    popt, pcov = curve_fit(model, g['x'], g['y'])
    confint = conf_int(g['y'], alpha, popt, pcov)
    lpb, upb = predband(x, g['x'], g['y'], popt, model, conf=0.95)
    new_row = {'treatment1': key[0], 'treatment2': key[1], 'slope': popt[0], 'start': popt[1],
               'r_2': r_2}
    results = results.append(new_row, ignore_index=True)
    # Problem starts below:
    # the line below does not work as I get an error:
    # AttributeError: 'dict' object has no attribute 'append'
    #new_bound = {'treatment1':key[0], 'treatment2':key[1], ' lpb': lpb, 'upb':upb}
    # this works but I would like to draw different graphs
    plt.fill_between(x, lpb, upb, color='grey', alpha=0.15)
#### Plot manually the output
# construct fitted curve for this treatment
#x= np.linspace(0, 1500, 400)
a = model(x, results.iloc[0,2], results.iloc[0,3])
plt.plot(x, a, color='tab:blue', label='Ctrl_W')
Currently this is what I get:
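A minimal sketch of one way to get a separate plot per subgroup: create a new figure inside the loop and draw on its own Axes instead of calling plt.fill_between on the shared current figure. Everything in the example is an assumption made for illustration: the four groups are synthetic, np.polyfit stands in for curve_fit, and the grey band is a crude placeholder rather than the lpb/upb returned by predband.

```python
import numpy as np
import matplotlib
matplotlib.use("Agg")  # draw off-screen
import matplotlib.pyplot as plt

rng = np.random.default_rng(1)
x_obs = np.linspace(0, 2, 25)

# Hypothetical stand-in for the four (treatment1, treatment2) subgroups.
true_slopes = {("ctrl", "W"): 1.0, ("ctrl", "D"): 1.5,
               ("trt", "W"): 2.0, ("trt", "D"): 2.5}
groups = {key: (x_obs, slope * x_obs + 0.5 + rng.normal(scale=0.2, size=x_obs.size))
          for key, slope in true_slopes.items()}

x_fit = np.linspace(0, 2, 50)
figures = {}
for key, (x, y) in groups.items():
    slope, start = np.polyfit(x, y, 1)            # straight-line fit, standing in for curve_fit
    y_fit = slope * x_fit + start
    spread = 2 * np.std(y - (slope * x + start))  # crude band, NOT the predband result

    fig, ax = plt.subplots()                      # a fresh figure for this subgroup only
    ax.scatter(x, y, s=10)
    ax.plot(x_fit, y_fit, color="tab:blue")
    ax.fill_between(x_fit, y_fit - spread, y_fit + spread, color="grey", alpha=0.15)
    ax.set_title("{} / {}".format(*key))
    figures[key] = fig
```

Each entry of figures can then be saved on its own, for example with figures[key].savefig(...). Keeping the figures in a dict keyed by group also avoids having to store lpb and upb in a dataframe just to plot them later.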

Plot random points a specified distance apart

I'm trying to come up with a function that plots n points inside the unit circle, but I need them to be sufficiently spread out.
i.e. something that looks like this:
Is it possible to write a function with two parameters, n (number of points) and min_d (minimum distance apart) such that the points are:
a) equidistant
b) no pairwise distance is smaller than a given min_d
The problem with sampling from a uniform distribution is that it could happen that two points are almost on top of each other, which I do not want to happen. I need this kind of input for a network diagram representing node clusters.
EDIT: I have found an answer to a) here: Generator of evenly spaced points in a circle in python, but b) still eludes me.
At the time this answer was provided, the question asked for random numbers. This answer thus gives a solution drawing random numbers. It ignores any edits made to the question afterwards.
One may simply draw random points and, for each one, check whether the minimum-distance condition is fulfilled. If it is not, the point is discarded. This is repeated until the list holds enough points or some break condition is met.
import numpy as np
import matplotlib.pyplot as plt

class Points():
    def __init__(self, n=10, r=1, center=(0,0), mindist=0.2, maxtrials=1000):
        self.success = False
        self.n = n
        self.r = r
        self.center = np.array(center)
        self.d = mindist
        # placeholder positions far outside the circle; removed again in fill()
        self.points = np.ones((self.n,2))*10*r+self.center
        self.c = 0
        self.trials = 0
        self.maxtrials = maxtrials
        self.tx = "rad: {}, center: {}, min. dist: {} ".format(self.r, center, self.d)
        self.fill()

    def dist(self, p, x):
        if len(p.shape) > 1:
            return np.sqrt(np.sum((p-x)**2, axis=1))
        else:
            return np.sqrt(np.sum((p-x)**2))

    def newpoint(self):
        # candidate drawn uniformly from the bounding square of the circle
        x = (np.random.rand(2)-0.5)*2
        x = x*self.r+self.center
        if self.dist(self.center, x) < self.r:
            self.trials += 1
            if np.all(self.dist(self.points, x) > self.d):
                self.points[self.c,:] = x
                self.c += 1

    def fill(self):
        while self.trials < self.maxtrials and self.c < self.n:
            self.newpoint()
        # keep only the points that were actually placed inside the circle
        self.points = self.points[self.dist(self.points,self.center) < self.r,:]
        if len(self.points) == self.n:
            self.success = True
        self.tx += "\n{} of {} found ({} trials)".format(len(self.points),self.n,self.trials)

    def __repr__(self):
        return self.tx

center = (0,0)
radius = 1
p = Points(n=40, r=radius, center=center)
fig, ax = plt.subplots()
x,y = p.points[:,0], p.points[:,1]
plt.scatter(x,y)
ax.add_patch(plt.Circle(center, radius, fill=False))
ax.set_title(p)
ax.relim()
ax.autoscale_view()
ax.set_aspect("equal")
plt.show()
If the number of points should be fixed, you may try to find this number of points for decreasing distances until the desired number of points is found.
In the following case, we look for 60 points and start with a minimum distance of 0.6, which we decrease stepwise by 0.05 until a solution is found. Note that this will not necessarily be the optimum solution, since each step makes at most maxtrials attempts. Increasing maxtrials will bring us closer to the optimum but requires more runtime.
center = (0,0)
radius = 1
mindist = 0.6
step = 0.05
success = False
while not success:
    mindist -= step
    p = Points(n=60, r=radius, center=center, mindist=mindist)
    print(p)
    if p.success:
        break
fig, ax = plt.subplots()
x,y = p.points[:,0], p.points[:,1]
plt.scatter(x,y)
ax.add_patch(plt.Circle(center, radius, fill=False))
ax.set_title(p)
ax.relim()
ax.autoscale_view()
ax.set_aspect("equal")
plt.show()
Here the solution is found for a minimum distance of 0.15.
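For comparison, the same two ideas (rejection sampling, then shrinking the minimum distance until all points fit) can be sketched as plain functions using only the standard library. This is an illustrative variant, not the code above: the names sample_points and shrink_until_full are made up, the disc is centred on the origin, and max_trials here counts every draw, including those that land outside the disc.

```python
import math
import random

def sample_points(n, min_d, radius=1.0, max_trials=1000, rng=None):
    """Rejection-sample up to n points in a disc, pairwise at least min_d apart."""
    rng = rng or random.Random()
    points = []
    for _ in range(max_trials):
        if len(points) == n:
            break
        x, y = rng.uniform(-radius, radius), rng.uniform(-radius, radius)
        if math.hypot(x, y) >= radius:
            continue  # candidate fell outside the disc
        if all(math.hypot(x - px, y - py) >= min_d for px, py in points):
            points.append((x, y))
    return points

def shrink_until_full(n, start_d=0.6, step=0.05, **kwargs):
    """Lower min_d in fixed steps until sample_points manages to place n points."""
    min_d = start_d
    while min_d > 0:
        pts = sample_points(n, min_d, **kwargs)
        if len(pts) == n:
            return min_d, pts
        min_d -= step
    raise ValueError("no placement found; increase max_trials or reduce n")

rng = random.Random(42)  # seeded so that repeated runs give the same layout
min_d, pts = shrink_until_full(60, rng=rng)
```

Passing a seeded random.Random makes the layout reproducible, which can help when the points feed a network diagram that should look the same between runs.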

Numpy: regrid by averaging?

I'm trying to regrid a numpy array onto a new grid. In this specific case, I'm trying to regrid a power spectrum onto a logarithmic grid so that the data are evenly spaced logarithmically for plotting purposes.
Doing this with straight interpolation using np.interp results in some of the original data being ignored entirely. Using digitize gets the result I want, but I have to use some ugly loops to get it to work:
xfreq = np.fft.fftfreq(100)[1:50] # only positive, nonzero freqs
psw = np.arange(xfreq.size) # dummy array for MWE
# new logarithmic grid
logfreq = np.logspace(np.log10(np.min(xfreq)), np.log10(np.max(xfreq)), 100)
inds = np.digitize(xfreq,logfreq)
# interpolation: ignores data *but* populates all points
logpsw = np.interp(logfreq, xfreq, psw)
# so average down where available...
logpsw[np.unique(inds)] = [psw[inds==i].mean() for i in np.unique(inds)]
# the new plot
loglog(logfreq, logpsw, linewidth=0.5, color='k')
Is there a nicer way to accomplish this in numpy? I'd be satisfied with just a replacement of the inline loop step.
You can use bincount() twice to calculate the average value of every bin:
logpsw2 = np.interp(logfreq, xfreq, psw)
counts = np.bincount(inds)
mask = counts != 0
logpsw2[mask] = np.bincount(inds, psw)[mask] / counts[mask]
or use unique(inds, return_inverse=True) and bincount() twice:
logpsw4 = np.interp(logfreq, xfreq, psw)
uinds, inv_index = np.unique(inds, return_inverse=True)
logpsw4[uinds] = np.bincount(inv_index, psw) / np.bincount(inv_index)
Or if you use Pandas:
import pandas as pd
logpsw4 = np.interp(logfreq, xfreq, psw)
s = pd.Series(psw).groupby(inds).mean()
logpsw4[s.index] = s.values
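The bincount idea can also be wrapped in a small helper. The sketch below is a variant under its own assumptions rather than a drop-in replacement for the snippets above: it treats the new grid as bin edges, returns NaN for empty bins instead of filling them by interpolation, and clips the digitize result so that a value equal to the last edge (or below the first edge) lands in the nearest valid bin. The name bin_average is hypothetical.

```python
import numpy as np

def bin_average(x, y, edges):
    """Average y over the bins defined by edges; NaN where a bin is empty."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    edges = np.asarray(edges, dtype=float)
    nbins = edges.size - 1
    # np.digitize is 1-based for values inside the edges; shift to 0-based and
    # clip so out-of-range values fall into the first or last bin.
    idx = np.clip(np.digitize(x, edges) - 1, 0, nbins - 1)
    sums = np.bincount(idx, weights=y, minlength=nbins)
    counts = np.bincount(idx, minlength=nbins)
    means = np.full(nbins, np.nan)
    filled = counts > 0
    means[filled] = sums[filled] / counts[filled]
    return means, counts

means, counts = bin_average([1, 2, 3, 10], [10, 20, 30, 40], [1, 4, 7, 10])
```

Using minlength keeps the output the same length as the number of bins even when the highest bins are empty, which is the case that makes the boolean mask and the target array disagree in length otherwise.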

matplotlib x-axis ticks dates formatting and locations

I've tried to duplicate plotted graphs originally created with flotr2 for PDF output with matplotlib. I must say that flotr is way easier to use, but that aside, I'm currently stuck trying to format the dates/times on the x-axis in the desired format: hours:minutes with a tick every 2 hours if the period on the x-axis is less than one day, and year-month-day with a tick every day if the period is longer than one day.
I've read through numerous examples and tried to copy them, but the outcome remains the same: hours:minutes:seconds with a 1 to 3 hour interval, depending on how long the period is.
My code:
colorMap = {
    'speed': '#3388ff',
    'fuel': '#ffaa33',
    'din1': '#3bb200',
    'din2': '#ff3333',
    'satellites': '#bfbfff'
}
otherColors = ['#00A8F0','#C0D800','#CB4B4B','#4DA74D','#9440ED','#800080','#737CA1','#E4317F','#7D0541','#4EE2EC','#6698FF','#437C17','#7FE817','#FBB117']
plotMap = {}
import datetime
import matplotlib
matplotlib.use('Agg')
import matplotlib.pyplot as plt
import matplotlib.dates as dates
fig = plt.figure(figsize=(22, 5), dpi = 300, edgecolor='k')
ax1 = fig.add_subplot(111)
realdata = data['data']
keys = list(realdata.keys())
# plot 'speed' first if it is present
if 'speed' in keys:
    speed_index = keys.index('speed')
    keys.pop(speed_index)
    keys.insert(0, 'speed')
# assign a colour to every series that has no fixed colour
i = 0
for key in keys:
    if key not in colorMap.keys():
        color = otherColors[i]
        otherColors.pop(i)
        colorMap[key] = color
        i += 1
label = u'%s' % realdata[keys[0]]['name']
ax1.set_ylabel(label)
plotMap[keys[0]] = {}
plotMap[keys[0]]['label'] = label
first_dates = [ r[0] for r in realdata[keys[0]]['data']]
date_range = first_dates[-1] - first_dates[0]
ax1.xaxis.reset_ticks()
if date_range > datetime.timedelta(days = 1):
    ax1.xaxis.set_major_locator(dates.WeekdayLocator(byweekday = 1, interval=1))
    ax1.xaxis.set_major_formatter(dates.DateFormatter('%Y-%m-%d'))
else:
    ax1.xaxis.set_major_locator(dates.HourLocator(byhour=range(24), interval=2))
    ax1.xaxis.set_major_formatter(dates.DateFormatter('%H:%M'))
ax1.xaxis.grid(True)
plotMap[keys[0]]['plot'] = ax1.plot_date(
    dates.date2num(first_dates),
    [r[1] for r in realdata[keys[0]]['data']], colorMap[keys[0]], xdate=True)
if len(keys) > 1:
    first = True
    for key in keys[1:]:
        if first:
            ax2 = ax1.twinx()
            ax2.set_ylabel(u'%s' % realdata[key]['name'])
            first = False
        plotMap[key] = {}
        plotMap[key]['label'] = u'%s' % realdata[key]['name']
        plotMap[key]['plot'] = ax2.plot_date(
            dates.date2num([ r[0] for r in realdata[key]['data']]),
            [r[1] for r in realdata[key]['data']], colorMap[key], xdate=True)
plt.legend([value['plot'] for key, value in plotMap.items()], [value['label'] for key, value in plotMap.items()], loc = 2)
plt.savefig(path +"node.png", dpi = 300, bbox_inches='tight')
Could someone point out why I'm not getting the desired results, please?
Edit1:
I moved the formatting block after the plotting and seem to be getting better results now. They are still not the desired results, though. If the period is less than a day, I get ticks every 2 hours (interval=2), but I would like those ticks at even hours rather than odd hours. Is that possible?
if date_range > datetime.timedelta(days = 1):
    xax.set_major_locator(dates.DayLocator(bymonthday=range(1,32), interval=1))
    xax.set_major_formatter(dates.DateFormatter('%Y-%m-%d'))
else:
    xax.set_major_locator(dates.HourLocator(byhour=range(24), interval=2))
    xax.set_major_formatter(dates.DateFormatter('%H:%M'))
Edit2:
This seemed to give me what i wanted:
if date_range > datetime.timedelta(days = 1):
    xax.set_major_locator(dates.DayLocator(bymonthday=range(1,32), interval=1))
    xax.set_major_formatter(dates.DateFormatter('%Y-%m-%d'))
else:
    xax.set_major_locator(dates.HourLocator(byhour=range(0,24,2)))
    xax.set_major_formatter(dates.DateFormatter('%H:%M'))
Alan
You are making this way harder on yourself than you need to. matplotlib can plot directly against datetime objects. I suspect your problem is that you set up the locators and then plot, and the plotting replaces your locators/formatters with the default auto versions. Try moving the locator logic below the plotting loop.
I think that this could replace a fair chunk of your code:
d = datetime.timedelta(minutes=2)
now = datetime.datetime.now()
times = [now + d * j for j in range(500)]
ax = plt.gca() # get the current axes
ax.plot(times, range(500))
xax = ax.get_xaxis() # get the x-axis
adf = xax.get_major_formatter() # get the auto-formatter
adf.scaled[1./24] = '%H:%M' # set the < 1d scale to H:M
adf.scaled[1.0] = '%Y-%m-%d' # set the > 1d < 1m scale to Y-m-d
adf.scaled[30.] = '%Y-%m' # set the > 1m < 1Y scale to Y-m
adf.scaled[365.] = '%Y' # set the > 1y scale to Y
plt.draw()
doc for AutoDateFormatter
I achieved what i wanted by doing this:
if date_range > datetime.timedelta(days = 1):
    xax.set_major_locator(dates.DayLocator(bymonthday=range(1,32), interval=1))
    xax.set_major_formatter(dates.DateFormatter('%Y-%m-%d'))
else:
    xax.set_major_locator(dates.HourLocator(byhour=range(0,24,2)))
    xax.set_major_formatter(dates.DateFormatter('%H:%M'))
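Put together as a self-contained sketch, the approach looks roughly like this. The helper name format_time_axis and the sample data are assumptions made for illustration; the locator and formatter calls are the ones from the snippet above, applied after plotting as suggested in the answer.

```python
import datetime
import matplotlib
matplotlib.use("Agg")  # headless backend, as in the question
import matplotlib.pyplot as plt
import matplotlib.dates as mdates

def format_time_axis(ax, times):
    """Choose tick locations and label format from the time span of `times`."""
    span = max(times) - min(times)
    if span > datetime.timedelta(days=1):
        locator = mdates.DayLocator(interval=1)                 # one tick per day
        formatter = mdates.DateFormatter("%Y-%m-%d")
    else:
        locator = mdates.HourLocator(byhour=range(0, 24, 2))    # ticks at even hours
        formatter = mdates.DateFormatter("%H:%M")
    ax.xaxis.set_major_locator(locator)
    ax.xaxis.set_major_formatter(formatter)
    return locator, formatter

start = datetime.datetime(2024, 1, 1, 0, 0)
times = [start + datetime.timedelta(minutes=10 * i) for i in range(60)]  # about 10 hours

fig, ax = plt.subplots()
ax.plot(times, range(len(times)))                  # plot first ...
locator, formatter = format_time_axis(ax, times)   # ... then set locator and formatter
fig.canvas.draw()
```

Listing the hours explicitly with byhour=range(0, 24, 2) pins the ticks to even hours, whereas interval=2 only spaces them two hours apart from wherever the rule happens to start.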

How can I make a greyscale copy of a Surface in pygame?

In pygame, I have a surface:
im = pygame.image.load('foo.png').convert_alpha()
im = pygame.transform.scale(im, (64, 64))
How can I get a grayscale copy of the image, or convert the image data to grayscale? I have numpy.
Use a Surfarray, and filter it with numpy or Numeric:
def grayscale(self, img):
    arr = pygame.surfarray.array3d(img)
    # luminosity filter
    avgs = [[(r*0.299 + g*0.587 + b*0.114) for (r,g,b) in col] for col in arr]
    arr = numpy.array([[[avg,avg,avg] for avg in col] for col in avgs])
    return pygame.surfarray.make_surface(arr)
After a lot of research, I came up with this solution, because the other answers to this question were too slow for what I needed:
def greyscale(surface: pygame.Surface):
    start = time.time()  # delete me!
    arr = pygame.surfarray.array3d(surface)
    # calculates the avg of the "rgb" values, this reduces the dim by 1
    mean_arr = np.mean(arr, axis=2)
    # restores the dimension from 2 to 3
    mean_arr3d = mean_arr[..., np.newaxis]
    # repeat the avg value obtained before over the axis 2
    new_arr = np.repeat(mean_arr3d[:, :, :], 3, axis=2)
    diff = time.time() - start  # delete me!
    # return the new surface
    return pygame.surfarray.make_surface(new_arr)
I used time.time() to calculate the time cost for this approach, so for a (800, 600, 3) array it takes: 0.026769161224365234 s to run.
As you pointed out, here is a variant that preserves the luminance:
def greyscale(surface: pygame.Surface):
    arr = pygame.surfarray.pixels3d(surface)
    # weighted sum of r, g, b; the weights sum to 1 so brightness is preserved
    mean_arr = np.dot(arr[:,:,:], [0.299, 0.587, 0.114])
    mean_arr3d = mean_arr[..., np.newaxis]
    new_arr = np.repeat(mean_arr3d[:, :, :], 3, axis=2)
    return pygame.surfarray.make_surface(new_arr)
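The array step can be tried without pygame. The sketch below is an illustration under assumptions, not part of either answer: to_greyscale is a made-up name, the weights are the common 0.299 / 0.587 / 0.114 luma weights, and the input is any (width, height, 3) array of 8-bit values such as the one pygame.surfarray.array3d returns.

```python
import numpy as np

LUMA_WEIGHTS = np.array([0.299, 0.587, 0.114])  # weights for r, g, b; they sum to 1

def to_greyscale(rgb):
    """Return a (w, h, 3) uint8 array whose three channels all hold the luma."""
    rgb = np.asarray(rgb)
    luma = rgb[..., :3] @ LUMA_WEIGHTS           # weighted sum over the colour axis
    grey = np.rint(luma).astype(np.uint8)        # round back to 8-bit values
    return np.repeat(grey[..., np.newaxis], 3, axis=2)

# white, red, green and blue pixels as a tiny 2x2 test image
img = np.array([[[255, 255, 255], [255, 0, 0]],
                [[0, 255, 0], [0, 0, 255]]], dtype=np.uint8)
out = to_greyscale(img)
```

Rounding and casting back to uint8 keeps the result in the same value range as the input, so it should be usable with pygame.surfarray.make_surface in the same way as new_arr above.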
The easiest way is to iterate over all the pixels in your image and call .get_at(...) and .set_at(...).
This will be pretty slow, so in answer to your implicit suggestion about using NumPy, look at http://www.pygame.org/docs/tut/surfarray/SurfarrayIntro.html. The concepts and most of the code are identical.