Matplotlib - Correlation plots with different range of numbers but on same scale - matplotlib

I would like to have a 2 by 3 figure with 6 correlation plots, sharing the same scale, even when the values in the plots have different ranges. Below you can see what I have so far.
In the first column, the values range from 0 to 1, with 1 on the diagonal and close to 0 elsewhere. In the other two columns, the values in the top row range from 0 to 1, whereas the values in the bottom row range from -1 to 1. The second and third columns differ in that the values in the second column are around 0.3 (and -0.3), while those in the third column are around 0.7 (and -0.7).
As you can see, several things are going wrong. First, although I want all plots to use the same color scale, with dark blue being -1 and yellow being 1, this is clearly not the case. If it were, we would see bright blue/greenish in the first column. How can I specify the range for the colors? Next, how do I change the labels of the color scale on the right? I would like it to range from -1 to 1.
My implementation is below.
fig, ax = plt.subplots(nrows=2, ncols=3, figsize=(15,8))
idx_mixed = {False: 0, True: 1}
idx_rho = {0: 0, 0.3: 1, 0.7: 2}
for mixed in [False, True]:
    for rho in [0, 0.3, 0.7]:
        ax[idx_mixed[mixed]][idx_rho[rho]].matshow(results[mixed][rho])
ax[0][0].set_title("No correlation", pad=20, fontsize=14)
ax[0][1].set_title("Weakly correlated", pad=20, fontsize=14)
ax[0][2].set_title("Strongly correlated", pad=20, fontsize=14)
ax[0][0].set_ylabel("Positive correlations", fontsize = 14)
ax[1][0].set_ylabel("Mixed correlations", fontsize = 14)
fig.colorbar(mpl.cm.ScalarMappable(), ax=fig.get_axes())

You need to provide a norm= argument to matshow() so that the data is scaled to the range [-1, 1] rather than a range defined by the min and max value present in the data. See Colormap Normalization for more details.
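import matplotlib.colors
import matplotlib.pyplot as plt

cmap = 'viridis'
# One norm shared by all panels, so every plot maps [-1, 1] to the same colors
norm = matplotlib.colors.Normalize(vmin=-1, vmax=1)
fig, axs = plt.subplots(2,3)
# data: an iterable of six 2-D arrays, one per panel
for ax, d in zip(axs.flat, data):
    m = ax.matshow(d, cmap=cmap, norm=norm)
# The colorbar built from any of these images shows the shared [-1, 1] range
fig.colorbar(m, ax=axs)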
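To make the answer easier to try out, here is a minimal self-contained sketch of the same idea applied to the 2 by 3 layout from the question. The `toy_corr` helper and its matrices are hypothetical stand-ins for `results[mixed][rho]`, which is not shown in the question; the explicit `ticks=` list on the colorbar is just one possible choice.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend so the sketch also runs headless
import matplotlib.pyplot as plt
import numpy as np
from matplotlib.colors import Normalize

rng = np.random.default_rng(0)

def toy_corr(rho, mixed, size=8):
    """Hypothetical stand-in for results[mixed][rho]: 1 on the diagonal, +/-rho elsewhere."""
    signs = rng.choice([-1.0, 1.0], size=(size, size)) if mixed else np.ones((size, size))
    signs = np.triu(signs, 1)          # keep the strict upper triangle ...
    mat = rho * (signs + signs.T)      # ... and mirror it so the matrix is symmetric
    np.fill_diagonal(mat, 1.0)
    return mat

norm = Normalize(vmin=-1, vmax=1)      # one norm shared by every panel

fig, axs = plt.subplots(nrows=2, ncols=3, figsize=(15, 8))
images = []
for row, mixed in enumerate([False, True]):
    for col, rho in enumerate([0, 0.3, 0.7]):
        im = axs[row, col].matshow(toy_corr(rho, mixed), cmap="viridis", norm=norm)
        images.append(im)

# Any image can drive the colorbar, because they all share the same norm.
cbar = fig.colorbar(images[0], ax=axs, ticks=[-1, -0.5, 0, 0.5, 1])
```

Using `enumerate` for the row and column positions also avoids the two index dictionaries from the question.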

Related

PyPlot line plot changing color by column value

I have a dataframe with a structure similar to the following example.
df = pd.DataFrame({'x': ['2008-01-01', '2008-01-02', '2008-01-03', '2008-01-04'], 'y': [1, 2, 3, 6],
'group_id': ['OBSERVED', 'IMPUTED', 'OBSERVED', 'IMPUTED'], 'color': ['blue', 'red', 'blue', 'red']})
df['x'] = pd.to_datetime(df['x'])
I.e. a dataframe where some of the values (y) are observed and others are imputed.
           x  y  group_id color
0 2008-01-01  1  OBSERVED  blue
1 2008-01-02  2   IMPUTED   red
2 2008-01-03  3  OBSERVED  blue
3 2008-01-04  6   IMPUTED   red
How do I create a single line which changes color based on the group_id (the column color is uniquely determined by group_id, as in this example)?
I have tried the following two solutions (one of them is commented out):
df_grp = df.groupby('group_id')
fig, ax = plt.subplots(1)
for id, data in df_grp:
    #ax.plot(data['x'], data['y'], label=id, color=data['color'].unique().tolist()[0])
    data.plot('x', 'y', label=id, ax=ax)
plt.legend()
plt.show()
However, the resulting plot is neither a single line nor colored correctly for each segment.
You can use the code below to color each segment by the row it starts from ("forward-looking" colors). The key is to get the data into the right shape in the dataframe so that the plotting is easy; you can print(df) after the manipulation to see what was done. Essentially, I added the x and y from the following row as additional columns in the current row, for all rows except the last. I also included a marker in the resulting color so that you can tell whether a point is red or blue. One thing to note: the dates in the x column should be in ascending order.
# Add x_end column to df from subsequent row - column x
end_date=df.iloc[1:,:]['x'].reset_index(drop=True).rename('x_end')
df = pd.concat([df, end_date], axis=1)
# Add y_end column to df from subsequent row - column y
end_y=df.iloc[1:,:]['y'].reset_index(drop=True).astype(int).rename('y_end')
df = pd.concat([df, end_y], axis=1)
# Add values for last row, same as x and y so the marker is of right color
df.iat[len(df)-1, 4] = df.iat[len(df)-1, 0]
df.iat[len(df)-1, 5] = df.iat[len(df)-1, 1]
for i in range(len(df)):
    plt.plot([df.iloc[i,0],df.iloc[i,4]],[df.iloc[i,1],df.iloc[i,5]], marker='o', c=df.iat[i,3])
Output plot
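The same segment-by-segment idea can also be written with named columns instead of positional indices. This is only a sketch under the assumption that each segment should take the color of the row it starts from; using `shift(-1)` and a separate `scatter` call for the markers are my choices here, not part of the answer above.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend so the sketch also runs headless
import matplotlib.pyplot as plt
import pandas as pd

df = pd.DataFrame({
    "x": pd.to_datetime(["2008-01-01", "2008-01-02", "2008-01-03", "2008-01-04"]),
    "y": [1, 2, 3, 6],
    "group_id": ["OBSERVED", "IMPUTED", "OBSERVED", "IMPUTED"],
    "color": ["blue", "red", "blue", "red"],
}).sort_values("x").reset_index(drop=True)

# The end point of each segment is the next row; the last row has no successor.
df["x_end"] = df["x"].shift(-1)
df["y_end"] = df["y"].shift(-1)

fig, ax = plt.subplots()
for row in df.dropna(subset=["x_end"]).itertuples():
    # Each segment is drawn in the color of the row it starts from.
    ax.plot([row.x, row.x_end], [row.y, row.y_end], color=row.color)

# Draw the points on top so that every observation shows its own color.
ax.scatter(df["x"], df["y"], c=df["color"].tolist(), zorder=3)
```

Because the segments share their end points, the result reads as one continuous line whose color changes at each data point.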

Creating a grid of polar histograms (python)

I wish to create a grid of subplots that looks like the following picture.
It is supposed to contain 25 polar histograms, and I wish to add them to the plot one by one.
It needs to be in Python.
I already figured out that I need to use matplotlib, but I can't seem to work it out completely.
Thanks a lot!
You can create a grid of polar axes via projection='polar'.
hist creates a histogram, also when working with polar axes. Note that the x is in radians with a range of 2π. It works best when you give the bins explicitly as a linspace from 0 to 2π (or from -π to π, depending on the data). The third parameter of linspace should be one more than the number of bars that you'd want for the full circle.
About the exact parameters of axs[i][j].hist(x, bins=np.linspace(0, 2 * np.pi, np.random.randint(7, 30), endpoint=True), color='dodgerblue', ec='black'):
axs[i][j]: draw on the jth subplot of the ith row
.hist: create a histogram
x: the values that are put into bins
bins=: the bins (either a fixed number of equal-width bins between the lowest and highest x, or explicit boundaries; the default is 10 bins)
np.random.randint(7, 30): a random whole number between 7 and 29
np.linspace(0, 2 * np.pi, n, endpoint=True): n evenly spaced boundaries from 0 to 2π, i.e. n-1 equal bins; endpoint=True puts boundaries at 0, at 2π and at n-2 positions in between; with endpoint=False there is a boundary at 0 and at n-1 positions in between, but none at 2π
color='dodgerblue': the color of the histogram bars will be blueish
ec='black': the edge color of the bars will be black
import numpy as np
import matplotlib.pyplot as plt
fig, axs = plt.subplots(5, 5, figsize=(8, 8),
                        subplot_kw=dict(projection='polar'))
for i in range(5):
    for j in range(5):
        x = np.random.uniform(0, 2 * np.pi, 50)
        axs[i][j].hist(x, bins=np.linspace(0, 2 * np.pi, np.random.randint(7, 30)), color='dodgerblue', ec='black')
plt.tight_layout()
plt.show()
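The relation between the third argument of linspace and the number of bars can be checked without any plotting. The short sketch below uses `np.histogram`, which bins data the same way `hist` does; the value `n = 9` and the random sample are arbitrary choices for illustration.

```python
import numpy as np

n = 9  # number of bin boundaries, i.e. 8 bars around the full circle
edges = np.linspace(0, 2 * np.pi, n)                       # endpoint=True is the default
open_edges = np.linspace(0, 2 * np.pi, n, endpoint=False)  # no boundary at 2*pi

rng = np.random.default_rng(0)
x = rng.uniform(0, 2 * np.pi, 50)
counts, _ = np.histogram(x, bins=edges)

print(len(edges), len(counts))   # n boundaries give n - 1 bins
print(edges[-1], open_edges[-1]) # only the first array ends at 2*pi
```

With `endpoint=False` the last boundary stops short of 2π, so values in the final stretch of the circle would fall outside the bins; that is why the default `endpoint=True` is the safer choice here.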

How to show axis ticks corresponding to plotted datapoints in a seaborn plot?

I am using seaborn to plot some numerical values. However, each of those numbers corresponds to a textual value, and I want those textual values to be displayed on the axes. For example, if the numerical values progress as 0, 5, 10, ..., 30, each of those encoded numbers must be linked to a textual description. How can I do this?
Main Point:
Use:
ax = plt.gca()
ax.set_xticks([<the x components of your datapoints>]);
ax.set_yticks([<the y components of your datapoints>]);
More elaborate version below.
You can go back to matplotlib and it will do it for you.
Let's say you want to plot [0, 7, 14 ... 35] against [0, 2, 4, ... 10]. The two arrays can be created by:
stepy=7
stepx=2
[stepy*y for y in range(6)]
(returning [0, 7, 14, 21, 28, 35])
and
[stepx*x for x in range(6)]
(returning [0, 2, 4, 6, 8, 10]).
Plot these with seaborn (recent seaborn versions require x and y to be passed as keyword arguments):
sns.scatterplot(x=[stepx*x for x in range(6)], y=[stepy*y for y in range(6)]).
Give current axis to matplotlib by ax = plt.gca(), finish using set_xticks and set_yticks:
ax.set_xticks([stepx*x for x in range(6)]);
ax.set_yticks([stepy*y for y in range(6)]);
The whole code together:
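import matplotlib.pyplot as plt
import seaborn as sns

stepy=7
stepx=2
# Pass x and y as keywords; recent seaborn versions reject positional data vectors
sns.scatterplot(x=[stepx*x for x in range(6)], y=[stepy*y for y in range(6)])
ax = plt.gca()
# Put a tick exactly at every plotted x and y value
ax.set_xticks([stepx*x for x in range(6)]);
ax.set_yticks([stepy*y for y in range(6)]);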
Yielding the plot:
I changed the example from the original post because, with those numbers, the plot already behaves as desired.
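The question also asks for textual descriptions at those tick positions. A possible sketch, using plain matplotlib calls on the Axes (seaborn's axes-level functions draw onto a matplotlib Axes, so the same calls should apply to the `ax` obtained via `plt.gca()`). The label strings and the y data are made up for illustration.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend so the sketch also runs headless
import matplotlib.pyplot as plt

codes = [0, 5, 10, 15, 20, 25, 30]
# Hypothetical descriptions, one per encoded value.
names = ["none", "very low", "low", "medium", "high", "very high", "max"]
values = [3, 1, 4, 1, 5, 9, 2]  # made-up y data

fig, ax = plt.subplots()
ax.scatter(codes, values)
ax.set_xticks(codes)                                  # a tick at every encoded value
ax.set_xticklabels(names, rotation=45, ha="right")    # replace the numbers by text
fig.canvas.draw()
```

The tick positions stay numeric, so the data are still plotted at 0, 5, 10, ...; only the displayed labels change.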

Matplotlib how to divide an histogram by a constant number

I would like to apply a custom normalization to histograms in matplotlib. In particular, I have two histograms and I would like to divide each of them by a given number (the number of generated events).
I don't want to normalize them in the usual way, because the usual normalization makes the area equal to 1. What I want is to divide the value of each bin by a given number N, so that if my histogram has 2 bins, one with 5 entries and one with 3, the resulting "normalized" (or "divided") histogram would have 5/N in the first bin and 3/N in the second.
I searched far and wide and found nothing really helpful. Do you have a handy solution? This is my code, working with pandas:
num_bins = 128
list_1 = dataframe_1['E']
list_2 = dataframe_2['E']
fig, ax = plt.subplots()
ax.set_xlabel('Proton energy [MeV]')
ax.set_ylabel('Normalized frequency')
ax.set_title('Proton energy distribution')
n, bins, patches = ax.hist(list_1, num_bins, density=1, alpha=0.5, color='red', ec='red', label='label_1')
n, bins, patches = ax.hist(list_2, num_bins, density=1, alpha=0.5, color='blue', ec='blue', label='label_2')
plt.legend(loc='upper center', fontsize='x-large')
fig.savefig('NiceTitle.pdf')
plt.close('all')
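One way this could be approached, offered as a sketch rather than a definitive answer: `hist` accepts a `weights=` array, so giving every entry the weight 1/N (instead of using `density=1`) makes each bin hold count/N. The energies and the value of N below are made-up stand-ins for `dataframe_1['E']` and the number of generated events.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend so the sketch also runs headless
import matplotlib.pyplot as plt
import numpy as np

rng = np.random.default_rng(0)
energies = rng.normal(100.0, 10.0, size=1000)  # made-up stand-in for dataframe_1['E']
n_generated = 5000                             # hypothetical N (number of generated events)
num_bins = 128

fig, ax = plt.subplots()
# Each entry contributes 1/N instead of 1, so every bin ends up holding count/N.
weights = np.full(len(energies), 1.0 / n_generated)
n, bins, patches = ax.hist(energies, num_bins, weights=weights,
                           alpha=0.5, color='red', ec='red', label='label_1')

# Cross-check against the unweighted counts on the same bin edges.
raw_counts, _ = np.histogram(energies, bins=bins)
```

The second histogram would get its own weights array built from its own N, so the two can be normalized independently.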

How to change colorbar's color (in some particular value interval)?

In matplotlib, I would like to change the colors of a colorbar within a particular value interval. For example, I would like to modify the seismic colormap so that the values between -0.5 and 0.5 are white. How can I do this?
Thank you very much.
You basically need to create your own colormap that has the particular features you want. Of course it is possible to make use of existing colormaps when doing so.
Colormaps are always defined on the range 0 to 1. This range is then mapped to the data interval. So in order to create whites between -0.5 and 0.5, we need to know the range of the data; let's say the data goes from -1 to 1. We can then decide to have the lower (blue) part of the seismic map go from -1 to -0.5, then have white between -0.5 and +0.5, and finally the upper (red) part of the seismic map from 0.5 to 1. In the language of a colormap, this corresponds to the ranges [0, 0.25], [0.25, 0.75] and [0.75, 1]. We can then create a list in which the first and last 25% are the colors of the seismic map and the middle 50% is white.
This list can be used to create a colormap, using matplotlib.colors.LinearSegmentedColormap.from_list("colormapname", listofcolors).
import numpy as np
import matplotlib.pyplot as plt
import matplotlib.colors
n=50
x = 0.5
lower = plt.cm.seismic(np.linspace(0, x, n))
white = plt.cm.seismic(np.ones(100)*0.5)
upper = plt.cm.seismic(np.linspace(1-x, 1, n))
colors = np.vstack((lower, white, upper))
tmap = matplotlib.colors.LinearSegmentedColormap.from_list('terrain_map_white', colors)
x = np.linspace(0,10)
X,Y = np.meshgrid(x,x)
z = np.sin(X) * np.cos(Y*0.4)
fig, ax = plt.subplots()
# Fix the data range to [-1, 1] so the white band covers exactly [-0.5, 0.5]
im = ax.imshow(z, cmap=tmap, vmin=-1, vmax=1)
plt.colorbar(im)
plt.show()
For more general cases, you may need a color normalization (using matplotlib.colors.Normalize). See e.g. this example, where a certain color in the colormap is always fixed at a data value of 0, independent of the data range.
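As an illustration of such a normalization (not necessarily the approach of the linked example), `matplotlib.colors.TwoSlopeNorm` pins the center of the colormap to a chosen data value. The sketch below assumes matplotlib 3.2 or later and uses data that is deliberately asymmetric around 0.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend so the sketch also runs headless
import matplotlib.pyplot as plt
import numpy as np
from matplotlib.colors import TwoSlopeNorm

x = np.linspace(0, 10)
X, Y = np.meshgrid(x, x)
z = 3 * np.sin(X) * np.cos(Y * 0.4) + 1  # asymmetric around 0 on purpose

# vcenter=0 maps the data value 0 to the middle of the colormap (white in seismic),
# regardless of how far the data extends on either side.
norm = TwoSlopeNorm(vmin=z.min(), vcenter=0.0, vmax=z.max())

fig, ax = plt.subplots()
im = ax.imshow(z, cmap="seismic", norm=norm)
fig.colorbar(im)
```

Note that `TwoSlopeNorm` requires `vmin < vcenter < vmax`, so it only applies when the data actually spans both sides of the center value.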