Mutiple plots in a single window - matplotlib

I need to draw many such rows (for a0 .. a128) in a single window. I've searched in FacetGrid, PairGrid and all over around but couldn't find. Only regplot has similar argument ax but it doesn't plot histograms. My data is 128 real valued features with label column [0, 1]. I need the graphs to be shown from my Python code as a separate application on Linux.
Also, it there a way to scale this histogram to show relative values on Y such that the right curve is not skewed?
g = sns.FacetGrid(df, col="Result")
g.map(plt.hist, "a0", bins=20)
plt.show()

Just a simple example using matplotlib. The code is not optimized (ugly, but simple plot-indexing):
import numpy as np
import matplotlib.pyplot as plt
N = 5
data = np.random.normal(size=(N*N, 1000))
f, axarr = plt.subplots(N, N) # maybe you want sharex=True, sharey=True
pi = [0,0]
for i in range(data.shape[0]):
if pi[1] == N:
pi[0] += 1 # next row
pi[1] = 0 # first column again
axarr[pi[0], pi[1]].hist(data[i], normed=True) # i was wrong with density;
# normed=True should be used
pi[1] += 1
plt.show()
Output:

Related

Multiple different kinds of plots on a single figure and save it to a video

I am trying to plot multiple different plots on a single matplotlib figure with in a for loop. At the moment it is all good in matlab as shown in the picture below and then am able to save the figure as a video frame. Here is a link of a sample video generated in matlab for 10 frames
In python, tried it as below
import matplotlib.pyplot as plt
for frame in range(FrameStart,FrameEnd):#loop1
# data generation code within a for loop for n frames from source video
array1 = np.zeros((200, 3800))
array2 = np.zeros((19,2))
array3 = np.zeros((60,60))
for i in range(len(array2)):#loop2
#generate data for arrays 1 to 3 from the frame data
#end loop2
plt.subplot(6,1,1)
plt.imshow(DataArray,cmap='gray')
plt.subplot(6, 1, 2)
plt.bar(data2D[:,0], data2D[:,1])
plt.subplot(2, 2, 3)
plt.contourf(mapData)
# for fourth plot, use array2[3] and array2[5], plot it as shown and keep the\is #plot without erasing for next frame
not sure how to do the 4th axes with line plots. This needs to be there (done using hold on for this axis in matlab) for the entire sequence of frames processing in the for loop while the other 3 axes needs to be erased and updated with new data for each frame in the movie. The contour plot needs to be square all the time with color bar on the side. At the end of each frame processing, once all the axes are updated, it needs to be saved as a frame of a movie. Again this is easily done in matlab, but not sure in python.
Any suggestions
thanks
I guess you need something like this format.
I have used comments # in code to answer your queries. Please check the snippet
import matplotlib.pyplot as plt
fig=plt.figure(figsize=(6,6))
ax1=fig.add_subplot(311) #3rows 1 column 1st plot
ax2=fig.add_subplot(312) #3rows 1 column 2nd plot
ax3=fig.add_subplot(325) #3rows 2 column 5th plot
ax4=fig.add_subplot(326) #3rows 2 column 6th plot
plt.show()
To turn off ticks you can use plt.axis('off'). I dont know how to interpolate your format so left it blank . You can adjust your figsize based on your requirements.
import numpy as np
from numpy import random
import matplotlib.pyplot as plt
fig=plt.figure(figsize=(6,6)) #First is width Second is height
ax1=fig.add_subplot(311)
ax2=fig.add_subplot(312)
ax3=fig.add_subplot(325)
ax4=fig.add_subplot(326)
#Bar Plot
langs = ['C', 'C++', 'Java', 'Python', 'PHP']
students = [23,17,35,29,12]
ax2.bar(langs,students)
#Contour Plot
xlist = np.linspace(-3.0, 3.0, 100)
ylist = np.linspace(-3.0, 3.0, 100)
X, Y = np.meshgrid(xlist, ylist)
Z = np.sqrt(X**2 + Y**2)
cp = ax3.contourf(X, Y, Z)
fig.colorbar(cp,ax=ax3) #Add a colorbar to a plot
#Multiple line plot
x = np.linspace(-1, 1, 50)
y1 = 2*x + 1
y2 = 2**x + 1
ax4.plot(x, y2)
ax4.plot(x, y1, color='red',linewidth=1.0)
plt.tight_layout() #Make sures plots dont overlap
plt.show()

pandas subplot, split into rows [duplicate]

I have a few Pandas DataFrames sharing the same value scale, but having different columns and indices. When invoking df.plot(), I get separate plot images. what I really want is to have them all in the same plot as subplots, but I'm unfortunately failing to come up with a solution to how and would highly appreciate some help.
You can manually create the subplots with matplotlib, and then plot the dataframes on a specific subplot using the ax keyword. For example for 4 subplots (2x2):
import matplotlib.pyplot as plt
fig, axes = plt.subplots(nrows=2, ncols=2)
df1.plot(ax=axes[0,0])
df2.plot(ax=axes[0,1])
...
Here axes is an array which holds the different subplot axes, and you can access one just by indexing axes.
If you want a shared x-axis, then you can provide sharex=True to plt.subplots.
You can see e.gs. in the documentation demonstrating joris answer. Also from the documentation, you could also set subplots=True and layout=(,) within the pandas plot function:
df.plot(subplots=True, layout=(1,2))
You could also use fig.add_subplot() which takes subplot grid parameters such as 221, 222, 223, 224, etc. as described in the post here. Nice examples of plot on pandas data frame, including subplots, can be seen in this ipython notebook.
You can plot multiple subplots of multiple pandas data frames using matplotlib with a simple trick of making a list of all data frame. Then using the for loop for plotting subplots.
Working code:
import matplotlib.pyplot as plt
import pandas as pd
import numpy as np
# dataframe sample data
df1 = pd.DataFrame(np.random.rand(10,2)*100, columns=['A', 'B'])
df2 = pd.DataFrame(np.random.rand(10,2)*100, columns=['A', 'B'])
df3 = pd.DataFrame(np.random.rand(10,2)*100, columns=['A', 'B'])
df4 = pd.DataFrame(np.random.rand(10,2)*100, columns=['A', 'B'])
df5 = pd.DataFrame(np.random.rand(10,2)*100, columns=['A', 'B'])
df6 = pd.DataFrame(np.random.rand(10,2)*100, columns=['A', 'B'])
#define number of rows and columns for subplots
nrow=3
ncol=2
# make a list of all dataframes
df_list = [df1 ,df2, df3, df4, df5, df6]
fig, axes = plt.subplots(nrow, ncol)
# plot counter
count=0
for r in range(nrow):
for c in range(ncol):
df_list[count].plot(ax=axes[r,c])
count+=1
Using this code you can plot subplots in any configuration. You need to define the number of rows nrow and the number of columns ncol. Also, you need to make list of data frames df_list which you wanted to plot.
You can use the familiar Matplotlib style calling a figure and subplot, but you simply need to specify the current axis using plt.gca(). An example:
plt.figure(1)
plt.subplot(2,2,1)
df.A.plot() #no need to specify for first axis
plt.subplot(2,2,2)
df.B.plot(ax=plt.gca())
plt.subplot(2,2,3)
df.C.plot(ax=plt.gca())
etc...
You can use this:
fig = plt.figure()
ax = fig.add_subplot(221)
plt.plot(x,y)
ax = fig.add_subplot(222)
plt.plot(x,z)
...
plt.show()
You may not need to use Pandas at all. Here's a matplotlib plot of cat frequencies:
x = np.linspace(0, 2*np.pi, 400)
y = np.sin(x**2)
f, axes = plt.subplots(2, 1)
for c, i in enumerate(axes):
axes[c].plot(x, y)
axes[c].set_title('cats')
plt.tight_layout()
Option 1: Create subplots from a dictionary of dataframes with long (tidy) data
Assumptions:
There is a dictionary of multiple dataframes of tidy data that are either:
Created by reading in from files
Created by separating a single dataframe into multiple dataframes
The categories, cat, may be overlapping, but all dataframes don't necessarily contain all values of cat
hue='cat'
This example uses a dict of dataframes, but a list of dataframes would be similar.
If the dataframes are wide, use pandas.DataFrame.melt to convert them to long form.
Because dataframes are being iterated through, there's no guarantee that colors will be mapped the same for each plot
A custom color map needs to be created from the unique 'cat' values for all the dataframes
Since the colors will be the same, place one legend to the side of the plots, instead of a legend in every plot
Tested in python 3.10, pandas 1.4.3, matplotlib 3.5.1, seaborn 0.11.2
Imports and Test Data
import pandas as pd
import numpy as np # used for random data
import matplotlib.pyplot as plt
from matplotlib.patches import Patch # for custom legend - square patches
from matplotlib.lines import Line2D # for custom legend - round markers
import seaborn as sns
import math import ceil # determine correct number of subplot
# synthetic data
df_dict = dict()
for i in range(1, 7):
np.random.seed(i) # for repeatable sample data
data_length = 100
data = {'cat': np.random.choice(['A', 'B', 'C'], size=data_length),
'x': np.random.rand(data_length), 'y': np.random.rand(data_length)}
df_dict[i] = pd.DataFrame(data)
# display(df_dict[1].head())
cat x y
0 B 0.944595 0.606329
1 A 0.586555 0.568851
2 A 0.903402 0.317362
3 B 0.137475 0.988616
4 B 0.139276 0.579745
# display(df_dict[6].tail())
cat x y
95 B 0.881222 0.263168
96 A 0.193668 0.636758
97 A 0.824001 0.638832
98 C 0.323998 0.505060
99 C 0.693124 0.737582
Create color mappings and plot
# create color mapping based on all unique values of cat
unique_cat = {cat for v in df_dict.values() for cat in v.cat.unique()} # get unique cats
colors = sns.color_palette('tab10', n_colors=len(unique_cat)) # get a number of colors
cmap = dict(zip(unique_cat, colors)) # zip values to colors
col_nums = 3 # how many plots per row
row_nums = math.ceil(len(df_dict) / col_nums) # how many rows of plots
# create the figue and axes
fig, axes = plt.subplots(row_nums, col_nums, figsize=(9, 6), sharex=True, sharey=True)
# convert to 1D array for easy iteration
axes = axes.flat
# iterate through dictionary and plot
for ax, (k, v) in zip(axes, df_dict.items()):
sns.scatterplot(data=v, x='x', y='y', hue='cat', palette=cmap, ax=ax)
sns.despine(top=True, right=True)
ax.legend_.remove() # remove the individual plot legends
ax.set_title(f'dataset = {k}', fontsize=11)
fig.tight_layout()
# create legend from cmap
# patches = [Patch(color=v, label=k) for k, v in cmap.items()] # square patches
patches = [Line2D([0], [0], marker='o', color='w', markerfacecolor=v, label=k, markersize=8) for k, v in cmap.items()] # round markers
# place legend outside of plot; change the right bbox value to move the legend up or down
plt.legend(title='cat', handles=patches, bbox_to_anchor=(1.06, 1.2), loc='center left', borderaxespad=0, frameon=False)
plt.show()
Option 2: Create subplots from a single dataframe with multiple separate datasets
The dataframes must be in a long form with the same column names.
This option uses pd.concat to combine multiple dataframes into a single dataframe, and .assign to add a new column.
See Import multiple csv files into pandas and concatenate into one DataFrame for creating a single dataframes from a list of files.
This option is easier because it doesn't require manually mapping colors to 'cat'
Combine DataFrames
# using df_dict, with dataframes as values, from the top
# combine all the dataframes in df_dict to a single dataframe with an identifier column
df = pd.concat((v.assign(dataset=k) for k, v in df_dict.items()), ignore_index=True)
# display(df.head())
cat x y dataset
0 B 0.944595 0.606329 1
1 A 0.586555 0.568851 1
2 A 0.903402 0.317362 1
3 B 0.137475 0.988616 1
4 B 0.139276 0.579745 1
# display(df.tail())
cat x y dataset
595 B 0.881222 0.263168 6
596 A 0.193668 0.636758 6
597 A 0.824001 0.638832 6
598 C 0.323998 0.505060 6
599 C 0.693124 0.737582 6
Plot a FacetGrid with seaborn.relplot
sns.relplot(kind='scatter', data=df, x='x', y='y', hue='cat', col='dataset', col_wrap=3, height=3)
Both options create the same result, however, it's less complicated to combine all the dataframes, and plot a figure-level plot with sns.relplot.
Building on #joris response above, if you have already established a reference to the subplot, you can use the reference as well. For example,
ax1 = plt.subplot2grid((50,100), (0, 0), colspan=20, rowspan=10)
...
df.plot.barh(ax=ax1, stacked=True)
Here is a working pandas subplot example, where modes is the column names of the dataframe.
dpi=200
figure_size=(20, 10)
fig, ax = plt.subplots(len(modes), 1, sharex="all", sharey="all", dpi=dpi)
for i in range(len(modes)):
ax[i] = pivot_df.loc[:, modes[i]].plot.bar(figsize=(figure_size[0], figure_size[1]*len(modes)),
ax=ax[i], title=modes[i], color=my_colors[i])
ax[i].legend()
fig.suptitle(name)
import numpy as np
import pandas as pd
imoprt matplotlib.pyplot as plt
fig, ax = plt.subplots(2,2)
df = pd.DataFrame({'A':np.random.randint(1,100,10),
'B': np.random.randint(100,1000,10),
'C':np.random.randint(100,200,10)})
for ax in ax.flatten():
df.plot(ax =ax)

How to show following data with colors and color bar. What will be suitable command for this? [duplicate]

I want to make a scatterplot (using matplotlib) where the points are shaded according to a third variable. I've got very close with this:
plt.scatter(w, M, c=p, marker='s')
where w and M are the data points and p is the variable I want to shade with respect to.
However I want to do it in greyscale rather than colour. Can anyone help?
There's no need to manually set the colors. Instead, specify a grayscale colormap...
import numpy as np
import matplotlib.pyplot as plt
# Generate data...
x = np.random.random(10)
y = np.random.random(10)
# Plot...
plt.scatter(x, y, c=y, s=500) # s is a size of marker
plt.gray()
plt.show()
Or, if you'd prefer a wider range of colormaps, you can also specify the cmap kwarg to scatter. To use the reversed version of any of these, just specify the "_r" version of any of them. E.g. gray_r instead of gray. There are several different grayscale colormaps pre-made (e.g. gray, gist_yarg, binary, etc).
import matplotlib.pyplot as plt
import numpy as np
# Generate data...
x = np.random.random(10)
y = np.random.random(10)
plt.scatter(x, y, c=y, s=500, cmap='gray')
plt.show()
In matplotlib grey colors can be given as a string of a numerical value between 0-1.
For example c = '0.1'
Then you can convert your third variable in a value inside this range and to use it to color your points.
In the following example I used the y position of the point as the value that determines the color:
from matplotlib import pyplot as plt
x = [1, 2, 3, 4, 5, 6, 7, 8, 9]
y = [125, 32, 54, 253, 67, 87, 233, 56, 67]
color = [str(item/255.) for item in y]
plt.scatter(x, y, s=500, c=color)
plt.show()
Sometimes you may need to plot color precisely based on the x-value case. For example, you may have a dataframe with 3 types of variables and some data points. And you want to do following,
Plot points corresponding to Physical variable 'A' in RED.
Plot points corresponding to Physical variable 'B' in BLUE.
Plot points corresponding to Physical variable 'C' in GREEN.
In this case, you may have to write to short function to map the x-values to corresponding color names as a list and then pass on that list to the plt.scatter command.
x=['A','B','B','C','A','B']
y=[15,30,25,18,22,13]
# Function to map the colors as a list from the input list of x variables
def pltcolor(lst):
cols=[]
for l in lst:
if l=='A':
cols.append('red')
elif l=='B':
cols.append('blue')
else:
cols.append('green')
return cols
# Create the colors list using the function above
cols=pltcolor(x)
plt.scatter(x=x,y=y,s=500,c=cols) #Pass on the list created by the function here
plt.grid(True)
plt.show()
A pretty straightforward solution is also this one:
fig, ax = plt.subplots(nrows=1, ncols=1, figsize=(8,8))
p = ax.scatter(x, y, c=y, cmap='cmo.deep')
fig.colorbar(p,ax=ax,orientation='vertical',label='labelname')

Showing several sparse matrix plots with IPython and matplotlib

I'm using IPython and matplotlib to show sparce matricies, like this:
%matplotlib inline
import math
a = [ [randint(2) for j in range(0,5)] for i in range(0, 5)]
spy(a)
Is it possible to call spy in a loop, to show several plots? This code only shows one, but I'd like it to show all five.
plots = [ [ [randint(2) for j in range(0,5)] for i in range(0, 5)] for x in range(0,5)]
for plot in plots:
spy(plot)
You can call it in a loop, but first let's make five random 5x5 sparse arrays:
ms = np.random.randint(0, 2, (5, 5, 5))
If you want them to show up as separate figures, you must create a new figure each time:
for m in ms:
plt.figure()
plt.spy(m)
Or, you can make 1 figure with 5 subplots:
f, axes = plt.subplots(1, 5) # 1 row, 5 columns
for ax, m in zip(axes, ms):
ax.spy(m)

inverse of FFT not the same as original function

I don't understand why the ifft(fft(myFunction)) is not the same as my function. It seems to be the same shape but a factor of 2 out (ignoring the constant y-offset). All the documentation I can see says there is some normalisation that fft doesn't do, but that ifft should take care of that. Here's some example code below - you can see where I've bodged the factor of 2 to give me the right answer. Thanks for any help - its driving me nuts.
import numpy as np
import scipy.fftpack as fftp
import matplotlib.pyplot as plt
import matplotlib.pyplot as plt
def fourier_series(x, y, wn, n=None):
# get FFT
myfft = fftp.fft(y, n)
# kill higher freqs above wavenumber wn
myfft[wn:] = 0
# make new series
y2 = fftp.ifft(myfft).real
# find constant y offset
myfft[1:]=0
c = fftp.ifft(myfft)[0]
# remove c, apply factor of 2 and re apply c
y2 = (y2-c)*2 + c
plt.figure(num=None)
plt.plot(x, y, x, y2)
plt.show()
if __name__=='__main__':
x = np.array([float(i) for i in range(0,360)])
y = np.sin(2*np.pi/360*x) + np.sin(2*2*np.pi/360*x) + 5
fourier_series(x, y, 3, 360)
You're removing half the spectrum when you do myfft[wn:] = 0. The negative frequencies are those in the top half of the array and are required.
You have a second fudge to get your results which is taking the real part to find y2: y2 = fftp.ifft(myfft).real (fftp.ifft(myfft) has a non-negligible imaginary part due to the asymmetry in the spectrum).
Fix it with myfft[wn:-wn] = 0 instead of myfft[wn:] = 0, and remove the fudges. So the fixed code looks something like:
import numpy as np
import scipy.fftpack as fftp
import matplotlib.pyplot as plt
def fourier_series(x, y, wn, n=None):
# get FFT
myfft = fftp.fft(y, n)
# kill higher freqs above wavenumber wn
myfft[wn:-wn] = 0
# make new series
y2 = fftp.ifft(myfft)
plt.figure(num=None)
plt.plot(x, y, x, y2)
plt.show()
if __name__=='__main__':
x = np.array([float(i) for i in range(0,360)])
y = np.sin(2*np.pi/360*x) + np.sin(2*2*np.pi/360*x) + 5
fourier_series(x, y, 3, 360)
It's really worth paying attention to the interim arrays that you are creating when trying to do signal processing. Invariably, there are clues as to what is going wrong that should direct you to the problem. In this case, you taking the real part masked the problem and made your task more difficult.
Just to add another quick point: Sometimes taking the real part of the resultant array is exactly the correct thing to do. It's often the case that you end up with an imaginary part to the signal output which is just down to numerical errors in the input to the inverse FFT. Typically this manifests itself as very small imaginary values, so taking the real part is basically the same array.
You are killing the negative frequencies between 0 and -wn.
I think what you mean to do is to set myfft to 0 for all frequencies outside [-wn, wn].
Change the following line:
myfft[wn:] = 0
to:
myfft[wn:-wn] = 0