Force grayscale legend colors in scatterplot - matplotlib

I want to prepare a grayscale figure with multiple, overlapping scatter plots. A simplified version of my code is as follows:
for obj in list_of_objects:
    plt.scatter(obj.x, obj.y, marker=obj.marker, label=obj.label)
plt.legend()
plt.show()
The problem is that the legend appears in non-grayscale color, so is there any way to force it to use grayscale colors? (i.e. a different grayscale color for the data of each object)

Matplotlib
If you need to use matplotlib, here is a sample of how you can achieve this, assuming there are 5 groups that you want to plot.
import numpy as np
import matplotlib.pyplot as plt
x = np.random.rand(100)
y = np.random.rand(100)
groups = [1, 2, 3, 4, 5] * 20
plt.scatter(x, y, c=groups, cmap='Greys')
plt.colorbar(ticks=[1, 2, 3, 4, 5])
plt.show()
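The snippet above identifies the groups with a colorbar. If a legend entry per group is wanted instead, as in the question, a minimal sketch could look like the following. The helper name `grayscale_scatter`, the `(label, x, y)` tuple layout and the random demo data are assumptions made for illustration only.

```python
import numpy as np
import matplotlib.pyplot as plt


def grayscale_scatter(groups, ax=None):
    """Draw one scatter per group, each in its own shade of gray, with a legend.

    `groups` is a list of (label, x, y) tuples (a hypothetical layout).
    """
    if ax is None:
        _, ax = plt.subplots()
    # Shades run from black (0.0) to light gray (0.8); 1.0 would be white
    # and invisible on a white background.
    shades = np.linspace(0.0, 0.8, len(groups))
    for (label, x, y), shade in zip(groups, shades):
        ax.scatter(x, y, color=plt.cm.gray(shade), label=label)
    ax.legend()
    return ax


# Demo data: 5 groups of 20 random points each.
rng = np.random.default_rng(0)
demo_groups = [(f"group {i}", rng.random(20), rng.random(20)) for i in range(1, 6)]
ax = grayscale_scatter(demo_groups)
# plt.show()  # uncomment to display the figure interactively
```

Because every scatter call gets an explicit `color`, the legend markers take the same gray shade as the points they describe.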
Seaborn
If you are able to use seaborn, it is simpler...
You can use palette = 'gray' and hue=<your group> to give it different shades of gray.
Sample code:
import seaborn as sns
tips = sns.load_dataset("tips")
sns.scatterplot(data=tips, x="total_bill", y="tip", hue="size", palette="gray")

Related

Combine two matplotlib Figures, side by side, high quality

I produced two matplotlib Figures, each 1000x1000 pixels.
Each figure is a 2x2 grid of subplots (4 Axes).
I want a single figure of size 1000x2000 (the width is 2000).
fig1
<Figure size 1000x1000 with 4 Axes>
fig2
<Figure size 1000x1000 with 4 Axes>
Now I want to combine them together.
I've searched many references:
How to make two plots side-by-side using Python?
Plotting two figures side by side
Adding figures to subplots in Matplotlib
They are not relevant because most of them suggest changing the way the initial plots were created. I don't want to change that; I want to use the Figures as they are.
I just need to place Fig1 to the left of Fig2, without changing the way Fig1 or Fig2 were created.
I also tried the PIL method: https://note.nkmk.me/en/python-pillow-concat-images/
However, the result was lower quality.
You can render your figures to arrays using the agg backend.
Then concatenate the arrays side by side and switch back to your normal backend to show the result:
import numpy as np
import matplotlib as mpl
import matplotlib.pyplot as plt
backend = mpl.get_backend()
mpl.use('agg')
dpi = 100
fig1,_ = plt.subplots(2,2, figsize=(1000/dpi, 1000/dpi), dpi=dpi)
fig1.suptitle('Figure 1')
fig2,_ = plt.subplots(2,2, figsize=(1000/dpi, 1000/dpi), dpi=dpi)
fig2.suptitle('Figure 2')
c1 = fig1.canvas
c2 = fig2.canvas
c1.draw()
c2.draw()
a1 = np.array(c1.buffer_rgba())
a2 = np.array(c2.buffer_rgba())
a = np.hstack((a1,a2))
mpl.use(backend)
fig,ax = plt.subplots(figsize=(2000/dpi, 1000/dpi), dpi=dpi)
fig.subplots_adjust(0, 0, 1, 1)
ax.set_axis_off()
ax.matshow(a)
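If the combined image looks too coarse, one option is to rasterize each figure at a higher DPI before stacking. The sketch below is a variation on the approach above, not part of the original answer: it renders through `savefig` into an in-memory PNG so no backend switch is needed. The helper names `figure_to_rgba` and `hstack_figures` are made up for this example.

```python
import io

import numpy as np
import matplotlib.pyplot as plt


def figure_to_rgba(fig, dpi):
    """Rasterize an existing Figure at the given DPI and return it as an image array."""
    buf = io.BytesIO()
    fig.savefig(buf, format="png", dpi=dpi)
    buf.seek(0)
    return plt.imread(buf, format="png")


def hstack_figures(figs, dpi=200):
    """Place same-height figures side by side as one image array."""
    return np.hstack([figure_to_rgba(fig, dpi) for fig in figs])


# Two stand-in figures; in practice these would be the existing fig1 and fig2.
fig1, _ = plt.subplots(2, 2, figsize=(4, 4))
fig2, _ = plt.subplots(2, 2, figsize=(4, 4))

# 4 inches x 100 dpi = 400 pixels per figure side.
combined = hstack_figures([fig1, fig2], dpi=100)
```

The resulting array can be written out with `plt.imsave("combined.png", combined)`. Raising `dpi` increases the pixel count of the output without touching how the figures were built.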
This does not directly merge two separate figures, but I achieved the final goal by using this reference:
https://matplotlib.org/devdocs/gallery/subplots_axes_and_figures/subfigures.html
That's the code I needed:
fig = plt.figure(constrained_layout=True, figsize=(20, 11))
titles_size = 25
labels_size = 18
subfigs = fig.subfigures(1, 2, wspace=0.02)
subfigs[0].suptitle('Title 1', fontsize=titles_size)
subfigs[1].suptitle('Title 2', fontsize=titles_size)
axsLeft = subfigs[0].subplots(2, 2)
axsRight = subfigs[1].subplots(2, 2)
for ax_idx, ax in enumerate(axsLeft.reshape(-1)):
    ax.grid(False)
    ax.set_xticklabels([])
    ax.set_yticklabels([])
    ax.axes.xaxis.set_visible(False)
    ax.axes.yaxis.set_visible(False)
for ax_idx, ax in enumerate(axsRight.reshape(-1)):
    ax.grid(False)
    ax.set_xticklabels([])
    ax.set_yticklabels([])
    ax.axes.xaxis.set_visible(False)
    ax.axes.yaxis.set_visible(False)
plt.show()

Pandas histogram plot with Y axis or colorbar

In Pandas, I am trying to generate a Ridgeline plot for which the density values are shown (either as Y axis or color-ramp). I am using the Joyplot but any other alternative ways are fine.
So, first I created the Ridge plot to show the different distribution plot for each condition (you can reproduce it using this code):
import numpy as np
import pandas as pd
import joypy
import matplotlib
import matplotlib.pyplot as plt
df1 = pd.DataFrame({'Category1':np.random.choice(['C1','C2','C3'],1000),'Category2':np.random.choice(['B1','B2','B3','B4','B5'],1000),
'year':np.arange(start=1900, stop=2900, step=1),
'Data':np.random.uniform(0,1,1000),"Period":np.random.choice(['AA','CC','BB','DD'],1000)})
data_pivot=df1.pivot_table('Data', ['Category1', 'Category2','year'], 'Period')
fig, axes = joypy.joyplot(data_pivot, column=['AA', 'BB', 'CC', 'DD'], by="Category1", ylim='own', figsize=(14,10), legend=True, alpha=0.4)
This generates the figure, but without the Y axis I want. Based on this post, I could add a color ramp, but it neither makes sense nor shows the differences between the distribution plots of the different categories on each line:
ar=df1['Data'].plot.kde().get_lines()[0].get_ydata() ## a workaround to get the probability values to set the colorramp max and min
norm = plt.Normalize(ar.min(), ar.max())
original_cmap = plt.cm.viridis
cmap = matplotlib.colors.ListedColormap(original_cmap(norm(ar)))
sm = matplotlib.cm.ScalarMappable(cmap=original_cmap, norm=norm)
sm.set_array([])
# plotting ....
fig, axes = joypy.joyplot(data_pivot,colormap = cmap , column=['AA', 'BB', 'CC', 'DD'], by="Category1", ylim='own', figsize=(14,10), legend=True, alpha=0.4)
fig.colorbar(sm, ax=axes, label="density")
But what I want is something like either of these figures (preferably with a color ramp):

Making a Scatter Plot from a DataFrame in Pandas

I have a DataFrame and need to make a scatter-plot from it.
I need to use 2 columns as the x-axis and y-axis and only need to plot 2 rows from the entire dataset. Any suggestions?
For example, my dataframe is below (50 states x 4 columns). I need to plot 'rgdp_change' on the x-axis vs 'diff_unemp' on the y-axis, and only need to plot for the states, "Michigan" and "Wisconsin".
So from the dataframe, you'll need to select the rows from a list of the states you want: ['Michigan', 'Wisconsin']
I also figured you would probably want a legend or some way to differentiate one point from the other. To do this, we create a colormap assigning a different color to each state. This way the code is generalizable for more than those two states.
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
import matplotlib.cm as cm
import matplotlib.colors as colors
# generate a random df with the relevant rows, columns to your actual df
df = pd.DataFrame({'State':['Alabama', 'Alaska', 'Michigan', 'Wisconsin'], 'real_gdp':[1.75*10**5, 4.81*10**4, 2.59*10**5, 1.04*10**5],
'rgdp_change': [-0.4, 0.5, 0.4, -0.5], 'diff_unemp': [-1.3, 0.4, 0.5, -11]})
fig, ax = plt.subplots()
states = ['Michigan', 'Wisconsin']
colormap = cm.viridis
colorlist = [colors.rgb2hex(colormap(i)) for i in np.linspace(0, 0.9, len(states))]
for state, c in zip(states, colorlist):
    # select the single row for this state so the label always matches the point
    row = df.loc[df["State"] == state]
    ax.scatter(row["rgdp_change"], row["diff_unemp"], label=state, s=50, linewidth=0.1, color=c)
ax.legend()
plt.show()
Use the DataFrame plot method, but first filter the states you need with isin on the index (this assumes the state names are the index):
states = ["Michigan", "Wisconsin"]
df[df.index.isin(states)].plot(kind='scatter', x='rgdp_change', y='diff_unemp')
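If the state names live in a regular column rather than in the index, a variant along these lines may work. The column name `State` and the sample values are taken from the example DataFrame in the other answer; the annotation step is an optional extra added here so the two points can be told apart.

```python
import pandas as pd
import matplotlib.pyplot as plt

# Small stand-in for the real 50-state DataFrame.
df = pd.DataFrame({
    "State": ["Alabama", "Alaska", "Michigan", "Wisconsin"],
    "rgdp_change": [-0.4, 0.5, 0.4, -0.5],
    "diff_unemp": [-1.3, 0.4, 0.5, -11],
})

states = ["Michigan", "Wisconsin"]

# Keep only the rows whose State is in the list, then plot them.
subset = df[df["State"].isin(states)]
ax = subset.plot(kind="scatter", x="rgdp_change", y="diff_unemp")

# Label each point with its state name.
for _, row in subset.iterrows():
    ax.annotate(row["State"], (row["rgdp_change"], row["diff_unemp"]))
# plt.show()  # uncomment to display the figure interactively
```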

Making scatter plot with different colors for different groups using matplotlib.pyplot

I am trying to make a scatter plot for my data using matplotlib.pyplot (which is imported as plt) and using the following command:
fig = plt.figure(1, figsize = (10, 6))
plt.figure()
plt.scatter(X_train_reduced[:, 0], X_train_reduced[:, 1], c = y_train, cmap = plt.cm.Paired , linewidths=10)
but it gives this error:
ValueError: 'c' argument must be a mpl color, a sequence of mpl colors or a sequence of numbers, not sample group
when I remove this argument:
c = y_train
it returns the plot, but with the same color for all groups (I have 2 groups).
Do you know how to fix it?
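One possible cause, going only by the error message, is that y_train holds non-numeric labels (for example strings), which `c=` cannot map through a colormap. Under that assumption, a sketch of a fix is to encode the labels as integer codes first. The arrays below are random stand-ins, and the label values "sample" and "group" are guesses, not the asker's data.

```python
import numpy as np
import matplotlib.pyplot as plt

# Stand-in data; replace with the real X_train_reduced and y_train.
rng = np.random.default_rng(0)
X_train_reduced = rng.normal(size=(40, 2))
y_train = np.array(["sample", "group"] * 20)  # assumed string labels

# np.unique returns the sorted distinct labels and, with return_inverse=True,
# an integer code per point that indexes into those labels.
classes, codes = np.unique(y_train, return_inverse=True)

fig, ax = plt.subplots(figsize=(10, 6))
sc = ax.scatter(X_train_reduced[:, 0], X_train_reduced[:, 1],
                c=codes, cmap=plt.cm.Paired)

# One legend handle per distinct code, labelled with the original class names.
handles, _ = sc.legend_elements(num=None)
ax.legend(handles, classes)
# plt.show()  # uncomment to display the figure interactively
```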

How to show following data with colors and color bar. What will be suitable command for this? [duplicate]

I want to make a scatterplot (using matplotlib) where the points are shaded according to a third variable. I've got very close with this:
plt.scatter(w, M, c=p, marker='s')
where w and M are the data points and p is the variable I want to shade with respect to.
However I want to do it in greyscale rather than colour. Can anyone help?
There's no need to manually set the colors. Instead, specify a grayscale colormap...
import numpy as np
import matplotlib.pyplot as plt
# Generate data...
x = np.random.random(10)
y = np.random.random(10)
# Plot...
plt.scatter(x, y, c=y, s=500) # s is the marker size
plt.gray()
plt.show()
Or, if you'd prefer a wider range of colormaps, you can also specify the cmap kwarg to scatter. There are several different grayscale colormaps pre-made (e.g. gray, gist_yarg, binary, etc). To use the reversed version of any of them, specify its "_r" name, e.g. gray_r instead of gray.
import matplotlib.pyplot as plt
import numpy as np
# Generate data...
x = np.random.random(10)
y = np.random.random(10)
plt.scatter(x, y, c=y, s=500, cmap='gray')
plt.show()
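To make the shading readable, a colorbar can be attached to the scatter. The sketch below uses the names w, M and p from the question with random stand-in values, and picks gray_r so that larger values are darker; the black marker edge is an assumption added to keep near-white points visible.

```python
import numpy as np
import matplotlib.pyplot as plt

# Random stand-ins for the question's w, M (positions) and p (shading variable).
rng = np.random.default_rng(0)
w = rng.random(10)
M = rng.random(10)
p = rng.random(10)

fig, ax = plt.subplots()
# gray_r maps low values to white and high values to black.
sc = ax.scatter(w, M, c=p, marker="s", s=200, cmap="gray_r", edgecolors="black")
# The colorbar reads its colormap and value range from the scatter itself.
cbar = fig.colorbar(sc, ax=ax, label="p")
# plt.show()  # uncomment to display the figure interactively
```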
In matplotlib, a grey level can be given as a string containing a number between 0 and 1 (0 is black, 1 is white).
For example c = '0.1'
You can then convert your third variable into a value inside this range and use it to color your points.
In the following example I used the y position of each point as the value that determines its color:
from matplotlib import pyplot as plt
x = [1, 2, 3, 4, 5, 6, 7, 8, 9]
y = [125, 32, 54, 253, 67, 87, 233, 56, 67]
color = [str(item/255.) for item in y]
plt.scatter(x, y, s=500, c=color)
plt.show()
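Dividing by 255 works here because the sample values happen to lie below 255. For a variable with an arbitrary range, a min-max rescaling is one way to get into the 0-1 interval. The helper name `to_gray_strings` is made up for this sketch.

```python
import numpy as np


def to_gray_strings(values):
    """Rescale values to 0-1 and format them as matplotlib gray-level strings."""
    values = np.asarray(values, dtype=float)
    span = values.max() - values.min()
    if span == 0:
        # All values equal: fall back to mid-gray to avoid dividing by zero.
        scaled = np.full(values.shape, 0.5)
    else:
        scaled = (values - values.min()) / span
    return [f"{v:.3f}" for v in scaled]


p = [125, 32, 54, 253, 67]
colors = to_gray_strings(p)
```

The list can then be passed as `c=colors` to `plt.scatter`. The smallest value comes out black and the largest white, so on a white background it may help to invert with `1 - scaled` or add marker edges.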
Sometimes you may need to choose each point's color based on the category given by its x-value. For example, you may have a dataframe with 3 types of variables and some data points, and you want to do the following:
Plot points corresponding to Physical variable 'A' in RED.
Plot points corresponding to Physical variable 'B' in BLUE.
Plot points corresponding to Physical variable 'C' in GREEN.
In this case, you can write a short function that maps the x-values to the corresponding color names as a list, and then pass that list to the plt.scatter command.
import matplotlib.pyplot as plt
x=['A','B','B','C','A','B']
y=[15,30,25,18,22,13]
# Function to map the colors as a list from the input list of x variables
def pltcolor(lst):
    cols=[]
    for l in lst:
        if l=='A':
            cols.append('red')
        elif l=='B':
            cols.append('blue')
        else:
            cols.append('green')
    return cols
# Create the colors list using the function above
cols=pltcolor(x)
plt.scatter(x=x,y=y,s=500,c=cols) #Pass on the list created by the function here
plt.grid(True)
plt.show()
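The same mapping can be written more compactly with a dictionary lookup. This is just an alternative sketch of the function above; the name `color_by_variable` is made up for the example.

```python
import matplotlib.pyplot as plt

x = ['A', 'B', 'B', 'C', 'A', 'B']
y = [15, 30, 25, 18, 22, 13]

# One entry per category; an unknown category raises KeyError
# instead of silently falling through to green.
color_by_variable = {'A': 'red', 'B': 'blue', 'C': 'green'}
cols = [color_by_variable[v] for v in x]

fig, ax = plt.subplots()
ax.scatter(x, y, s=500, c=cols)
ax.grid(True)
# plt.show()  # uncomment to display the figure interactively
```

Adding a fourth category then only needs one more dictionary entry rather than another elif branch.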
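Another fairly straightforward option is to pass a named colormap and add a colorbar. Note that 'cmo.deep' is not built into matplotlib; it is registered by the cmocean package, which must be installed and imported first (a built-in name such as 'gray' works the same way):
import cmocean  # registers the 'cmo.*' colormaps with matplotlib
fig, ax = plt.subplots(nrows=1, ncols=1, figsize=(8,8))
p = ax.scatter(x, y, c=y, cmap='cmo.deep')
fig.colorbar(p,ax=ax,orientation='vertical',label='labelname')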