for loop of scatterplots, setting the colorbar of a scatterplot horizontally under each plot - pandas

I am really new to plots and matplotlib.
I am trying to draw several scatterplots in a loop, using data from a pandas data frame.
Each scatterplot has one of the columns on the x axis and the last column on the y axis.
I can make the plots work, but the color bar is displayed on the right side of each plot.
I would like to place the color bar horizontally under each plot, but I don't know how to do it.
My code looks like this:
num_cols = df.columns.to_list()[:-1]
for col in num_cols:
    df.plot.scatter(x=col, y=df.columns[-1], c=col, cmap='Paired', title=col, figsize=(5, 5))
The current result looks like this (one plot from the loop): [current result] https://i.stack.imgur.com/m26wn.jpg
I tried a few lines of code, but with no success.
I would like to have something like this (excuse my Windows Paint skills): [expected result]

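You can move the color bar to the bottom by turning off the one pandas adds (colorbar=False) and adding a horizontal one yourself. Note that df.plot.scatter does not accept cbar_location, cbar_mode, cbar_pad or cbar_size; those are ImageGrid parameters from mpl_toolkits.axes_grid1.
num_cols = df.columns.to_list()[:-1]
for col in num_cols:
    ax = df.plot.scatter(x=col, y=df.columns[-1], c=col, cmap='Paired', title=col, figsize=(5, 5), colorbar=False)
    # ax.collections[0] is the scatter artist drawn by pandas
    ax.figure.colorbar(ax.collections[0], ax=ax, orientation='horizontal', pad=0.15, label=col)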
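For a self-contained illustration, here is a minimal sketch that draws the scatterplots with matplotlib directly and attaches a horizontal color bar under each one. The DataFrame, its column names and the pad value are made up for the example; only the loop over all columns but the last comes from the question.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend so the sketch runs without a display
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd

# Made-up data: three feature columns plus a target in the last column.
rng = np.random.default_rng(0)
df = pd.DataFrame(rng.integers(0, 10, size=(50, 4)), columns=["a", "b", "c", "target"])

num_cols = df.columns.to_list()[:-1]
target = df.columns[-1]
fig, axes = plt.subplots(1, len(num_cols), figsize=(5 * len(num_cols), 5.5), squeeze=False)

colorbars = []
for ax, col in zip(axes[0], num_cols):
    points = ax.scatter(df[col], df[target], c=df[col], cmap="Paired")
    ax.set(title=col, xlabel=col, ylabel=target)
    # orientation="horizontal" puts the color bar below the axes it is attached to
    cbar = fig.colorbar(points, ax=ax, orientation="horizontal", pad=0.15)
    cbar.set_label(col)
    colorbars.append(cbar)
```

The pad argument is the gap between the plot and its color bar as a fraction of the axes height; it may need adjusting so the x axis label does not collide with the bar.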

Multiple different kinds of plots on a single figure and save it to a video

I am trying to draw several different kinds of plots on a single matplotlib figure within a for loop. At the moment it all works in MATLAB, as shown in the picture below, and I am able to save the figure as a video frame. Here is a link to a sample video generated in MATLAB for 10 frames.
In Python, I tried it as below:
import numpy as np
import matplotlib.pyplot as plt
for frame in range(FrameStart, FrameEnd):  # loop 1
    # data generation code within a for loop for n frames from source video
    array1 = np.zeros((200, 3800))
    array2 = np.zeros((19, 2))
    array3 = np.zeros((60, 60))
    for i in range(len(array2)):  # loop 2
        pass  # generate data for arrays 1 to 3 from the frame data
    plt.subplot(6, 1, 1)
    plt.imshow(DataArray, cmap='gray')
    plt.subplot(6, 1, 2)
    plt.bar(data2D[:, 0], data2D[:, 1])
    plt.subplot(2, 2, 3)
    plt.contourf(mapData)
    # for the fourth plot, use array2[3] and array2[5], plot them as shown, and keep
    # the plot without erasing it for the next frame
I am not sure how to do the fourth axes with line plots. Its content needs to stay (done using hold on for this axis in MATLAB) for the entire sequence of frames processed in the for loop, while the other three axes need to be erased and updated with new data for each frame of the movie. The contour plot needs to stay square at all times, with a color bar on the side. At the end of each frame, once all the axes are updated, the figure needs to be saved as a frame of a movie. Again, this is easily done in MATLAB, but I am not sure how to do it in Python.
Any suggestions?
Thanks
I guess you need a layout like this.
I have used # comments in the code to answer your queries. Please check the snippet:
import matplotlib.pyplot as plt
fig=plt.figure(figsize=(6,6))
ax1=fig.add_subplot(311) #3rows 1 column 1st plot
ax2=fig.add_subplot(312) #3rows 1 column 2nd plot
ax3=fig.add_subplot(325) #3rows 2 column 5th plot
ax4=fig.add_subplot(326) #3rows 2 column 6th plot
plt.show()
To turn off the ticks you can use plt.axis('off'). I don't know the format of your image data, so I left the first plot blank. You can adjust figsize to your requirements.
import numpy as np
import matplotlib.pyplot as plt
fig=plt.figure(figsize=(6,6)) #First is width Second is height
ax1=fig.add_subplot(311)
ax2=fig.add_subplot(312)
ax3=fig.add_subplot(325)
ax4=fig.add_subplot(326)
#Bar Plot
langs = ['C', 'C++', 'Java', 'Python', 'PHP']
students = [23,17,35,29,12]
ax2.bar(langs,students)
#Contour Plot
xlist = np.linspace(-3.0, 3.0, 100)
ylist = np.linspace(-3.0, 3.0, 100)
X, Y = np.meshgrid(xlist, ylist)
Z = np.sqrt(X**2 + Y**2)
cp = ax3.contourf(X, Y, Z)
fig.colorbar(cp,ax=ax3) #Add a colorbar to a plot
#Multiple line plot
x = np.linspace(-1, 1, 50)
y1 = 2*x + 1
y2 = 2**x + 1
ax4.plot(x, y2)
ax4.plot(x, y1, color='red',linewidth=1.0)
plt.tight_layout() # make sure the plots don't overlap
plt.show()
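The part of the question about keeping the fourth axes while redrawing the others is not covered by the snippet above, so here is a minimal sketch of one way it could be done. Everything in it is an assumption made for illustration: the random placeholder data, the fixed 0 to 1 color scale, the five-frame loop, and collecting frames as RGBA arrays instead of writing a video file. The collected frames could then be written out with, for example, matplotlib.animation.FFMpegWriter (which needs ffmpeg installed) or an external tool.

```python
import matplotlib
matplotlib.use("Agg")  # off-screen rendering; no window is needed to grab frames
import matplotlib.pyplot as plt
import numpy as np
from matplotlib.cm import ScalarMappable
from matplotlib.colors import Normalize

rng = np.random.default_rng(0)

fig = plt.figure(figsize=(6, 6), dpi=100)
ax1 = fig.add_subplot(311)
ax2 = fig.add_subplot(312)
ax3 = fig.add_subplot(325)
ax4 = fig.add_subplot(326)

# Fixed color scale so one color bar stays valid for every frame (assumes data in [0, 1]).
norm = Normalize(vmin=0.0, vmax=1.0)
levels = np.linspace(0.0, 1.0, 11)
fig.colorbar(ScalarMappable(norm=norm, cmap="viridis"), ax=ax3)

# The line on the fourth axes is created once and only extended afterwards.
(trace,) = ax4.plot([], [], color="red")
trace_x, trace_y = [], []

frames = []
for frame in range(5):
    # Placeholder data standing in for whatever is computed from the video frame.
    image = rng.random((20, 380))
    bars = rng.random(19)
    field = rng.random((60, 60))

    # Erase and redraw the three axes that change every frame.
    for ax in (ax1, ax2, ax3):
        ax.cla()
    ax1.imshow(image, cmap="gray", aspect="auto")
    ax2.bar(np.arange(bars.size), bars)
    ax3.contourf(field, levels=levels, cmap="viridis", norm=norm)
    ax3.set_aspect("equal")  # keep the contour plot square

    # Extend the persistent line instead of clearing the fourth axes.
    trace_x.append(frame)
    trace_y.append(bars.mean())
    trace.set_data(trace_x, trace_y)
    ax4.relim()
    ax4.autoscale_view()

    # Render the figure and keep the pixels as an RGBA array (one movie frame).
    fig.canvas.draw()
    frames.append(np.asarray(fig.canvas.buffer_rgba()).copy())
```

Calling cla() only on the axes that change plays the role of MATLAB's default redraw, while never clearing the fourth axes plays the role of hold on.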

Stacked hue histogram

I don't have the reputation to add inline images, I'm sorry.
This is the code I found:
bins = np.linspace(df.Principal.min(), df.Principal.max(), 10)
g = sns.FacetGrid(df, col="Gender", hue="loan_status", palette="Set1", col_wrap=2)
g.map(plt.hist, 'Principal', bins=bins, ec="k")
g.axes[-1].legend()
plt.show()
Output:
I want to do something similar with some data I have:
bins = np.linspace(df.overall.min(), df.overall.max(), 10)
g = sns.FacetGrid(df, col="player_positions", hue="preferred_foot", palette="Set1", col_wrap=4)
g.map(plt.hist, 'overall', bins=bins, ec="k")
g.axes[-1].legend()
plt.show()
The hue "preferred_foot" is just left and right.
My output:
I am not sure why I can't see the left values on the plot
df['preferred_foot'].value_counts()
Right 13960
Left 4318
I am fairly sure those are not stacked histograms, but two histograms drawn one behind the other. I believe your "left" red bars are simply hidden behind the "right" blue bars.
You could try adding alpha=0.5 to the call to g.map, or changing the order of the hues (add hue_order=['Right','Left'] to the call to FacetGrid).
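To make the difference concrete, here is a small sketch using plain matplotlib rather than seaborn. The data is random and made up; only the two group sizes are taken from the value_counts output in the question. The left panel overlays two histograms with alpha, the right panel draws a true stacked histogram.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend so the sketch runs without a display
import matplotlib.pyplot as plt
import numpy as np

rng = np.random.default_rng(0)
# Made-up ratings; the group sizes mirror the value_counts in the question.
right = rng.normal(66, 7, 13960)
left = rng.normal(66, 7, 4318)
bins = np.linspace(min(right.min(), left.min()), max(right.max(), left.max()), 10)

fig, (ax_overlay, ax_stacked) = plt.subplots(1, 2, figsize=(10, 4), sharey=True)

# Overlaid: both histograms start at zero, so alpha is needed to see the one behind.
ax_overlay.hist(right, bins=bins, alpha=0.5, ec="k", label="Right")
ax_overlay.hist(left, bins=bins, alpha=0.5, ec="k", label="Left")
ax_overlay.set_title("overlaid (alpha=0.5)")
ax_overlay.legend()

# Stacked: the "Left" bars sit on top of the "Right" bars.
# With stacked=True the returned counts are cumulative across the datasets.
counts, _, _ = ax_stacked.hist(
    [right, left], bins=bins, stacked=True, ec="k", label=["Right", "Left"]
)
ax_stacked.set_title("stacked")
ax_stacked.legend()
```

In the overlaid panel the smaller group is fully covered wherever the larger group has more entries, which is the effect described in the answer.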

Making multiple pie charts out of a pandas dataframe (one for each column)

My question is similar to Making multiple pie charts out of a pandas dataframe (one for each row).
However, I am looking for one pie chart per column instead of per row.
I can make a pie chart for each column; however, as I have 12 columns, the pie charts are too close to each other.
I have used this code:
fig, axes = plt.subplots(4, 3, figsize=(10, 6))
for i, (idx, row) in enumerate(df.iterrows()):
    ax = axes[i // 3, i % 3]
    row = row[row.gt(row.sum() * .01)]
    ax.pie(row, labels=row.index, startangle=30)
    ax.set_title(idx)
fig.subplots_adjust(wspace=.2)
and I have the following result:
But what I want is the other way around. I need 12 pie charts (because I have 12 columns), and each pie chart should have 4 sections (leg, car, walk, and bike).
If I write this code
fig, axes = plt.subplots(4, 3)
for i, col in enumerate(df.columns):
    ax = axes[i // 3, i % 3]
    plt.plot(df[col])
then I have the following result:
and if I use:
plot = df.plot.pie(subplots=True, figsize=(17, 8), labels=['pt', 'car', 'walk', 'bike'])
then I have the following result:
This is quite close to what I am looking for, but the pie charts are not readable. A clearer output would be better.
As in your linked post, I would use matplotlib.pyplot for this. The accepted answer there uses plt.subplots(2, 3); with 12 columns you need a grid with at least 12 cells, for example 4 rows of 3 plots, and a larger figsize so the charts are not cramped.
Like this:
fig, axes = plt.subplots(4, 3, figsize=(12, 16))
for i, col in enumerate(df.columns):
    ax = axes[i // 3, i % 3]
    ax.pie(df[col], labels=df.index)
    ax.set_title(col)
Finally, I understood that if I swap rows and columns:
df_sw = df.T
then I can use the code from the examples in:
Making multiple pie charts out of a pandas dataframe (one for each row)
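As a minimal sketch of that transposition idea: the frame below is made up (4 rows labelled pt, car, walk and bike, and 12 hypothetical columns named zone_1 to zone_12), and the figure size and spacing values are guesses to keep the charts readable.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend so the sketch runs without a display
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd

rng = np.random.default_rng(0)
# Made-up counts: 4 travel modes (rows) for 12 hypothetical columns.
df = pd.DataFrame(
    rng.integers(5, 50, size=(4, 12)),
    index=["pt", "car", "walk", "bike"],
    columns=[f"zone_{i + 1}" for i in range(12)],
)

df_sw = df.T  # columns become rows, so the row-wise recipe applies unchanged

fig, axes = plt.subplots(4, 3, figsize=(12, 16))
for ax, (name, row) in zip(axes.flat, df_sw.iterrows()):
    row = row[row.gt(row.sum() * 0.01)]  # drop slices below 1 % of the total
    ax.pie(row, labels=row.index, startangle=30)
    ax.set_title(name)
fig.subplots_adjust(wspace=0.2, hspace=0.3)
```

Iterating over axes.flat avoids the i // 3, i % 3 index arithmetic and works for any grid shape that has at least as many cells as there are charts.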

numbers in yaxis is not arranged in order in matplotlib

The numbers on the y axis of the plot are not arranged in order, and the words on the x axis are too close together. I am scraping a Wikipedia table of COVID-19 cases, so I don't save it as a CSV; I am plotting it directly from the website.
My code is below:
import requests
import pandas as pd
import matplotlib.pyplot as plt
from bs4 import BeautifulSoup
URL = "https://en.wikipedia.org/wiki/COVID19_pandemic_in_Nigeria"
html = requests.get(URL)
bsObj = BeautifulSoup(html.content, 'html.parser')
states = []
cases = []
active = []
recovered = []
death = []
for items in bsObj.find("table", {"class": "wikitable sortable"}).find_all('tr')[1:37]:
    data = items.find_all(['th', {"align": "left"}, 'td'])
    states.append(data[0].a.text)
    cases.append(data[1].b.text)
    active.append(data[2].text)
    recovered.append(data[3].text)
    death.append(data[4].text)
table = ["STATES", "ACTIVE"]
tab = pd.DataFrame(list(zip(states, active)), columns=table)
tab["ACTIVE"] = tab["ACTIVE"].replace('\n', '', regex=True)
x = tab["STATES"]
y = tab["ACTIVE"]
plt.cla()
plt.bar(x, y, color="green")
plt.xticks(fontsize=5)
plt.yticks(fontsize=8)
plt.title("PLOTTERSCA")
plt.show()
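It's hard to say without the data, but the ACTIVE values are probably still strings, so matplotlib treats them as categories and draws them in the order they appear. You can try converting them to numbers and sorting the whole table, so that each state stays with its value:
tab["ACTIVE"] = pd.to_numeric(tab["ACTIVE"].str.replace(",", ""))  # assumes plain integers, possibly with thousands separators
tab = tab.sort_values("ACTIVE")
plt.bar(tab["STATES"], tab["ACTIVE"], color="green")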
Bar charts are plotted in the order in which the data is given. In this case there is no need to create a dataframe, but the scraped values are strings, so convert them to numbers before sorting. Try the following:
x = [s.replace('\n', '') for s in states]
# assumes each entry is an integer, possibly with thousands separators
y = np.array([int(a.strip().replace(',', '')) for a in active])
order = np.argsort(y)
xNew = [x[i] for i in order]
yNew = y[order]
plt.cla()
plt.bar(xNew, yNew, color="green")
plt.xticks(fontsize=5, rotation=90)
plt.yticks(fontsize=8)
plt.title("PLOTTERSCA")
plt.show()
Here, we have reordered everything based on the y values. Also, the xticks have been rotated so that they are easier to read.
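A small sketch of why the axis can look unordered when the scraped cells are still text: strings compare character by character, so their order differs from the numeric order. The values below are made up and only shaped like scraped table cells (with a trailing newline and a thousands separator).

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend so the sketch runs without a display
import matplotlib.pyplot as plt
import numpy as np

# Made-up values shaped like scraped table cells.
states = ["Lagos\n", "Kano\n", "Oyo\n", "Edo\n"]
active = ["1,204\n", "87\n", "350\n", "9\n"]

# Sorting the raw text compares characters, not magnitudes.
text_order = sorted(a.strip() for a in active)

# Converting to integers first gives the order a reader expects.
x = [s.strip() for s in states]
y = np.array([int(a.strip().replace(",", "")) for a in active])
order = np.argsort(y)
x_sorted = [x[i] for i in order]
y_sorted = y[order]

fig, ax = plt.subplots()
ax.bar(x_sorted, y_sorted, color="green")
ax.tick_params(axis="x", labelrotation=90, labelsize=5)
```

With numeric heights matplotlib also draws a proper numeric y axis instead of one tick per distinct string.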

How can I make this plot awesome (colours by group plus alpha value by second group)

I have the following dataframe:
I plotted it the following way:
Right now the plot looks ugly. Aside from using a different font size, marker edge width, marker face color etc., I would like to have two colors, one for each protein (hum1 and hum2), and within each group the different pH values should have different intensities. What makes it more difficult is that my groups do not have the same size.
Any ideas?
P.S. Such a built-in feature would be really cool, e.g. colourby = level_one thenby level_two
fig = plt.figure(figsize=(9,9))
ax = fig.add_subplot(1,1,1)
c1 = plt.cm.Greens(np.linspace(0.5, 1, 4))
c2 = plt.cm.Blues(np.linspace(0.5, 1, 4))
colors = np.vstack((c1,c2))
gr.unstack(level=(0,1))['conc_dil'].plot(marker='o',linestyle='-',color=colors,ax=ax)
plt.legend(loc=1,bbox_to_anchor = (0,0,1.5,1),numpoints=1)
gives:
P.S This post helped me:
stacked bar plot and colours
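The snippet above hard-codes 4 shades per group. For groups of different sizes, one possible generalisation is to count the columns of each first-level group and sample that many shades from the group's colormap. This is only a sketch: the frame, the level names protein and pH, and the group_cmaps mapping are made up, and it assumes the columns of each group are contiguous, as they are after unstack on a sorted index.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend so the sketch runs without a display
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd
from matplotlib.colors import to_hex

# Made-up data: two proteins with a different number of pH values each.
columns = pd.MultiIndex.from_tuples(
    [("hum1", 5.0), ("hum1", 6.0), ("hum1", 7.0), ("hum2", 5.0), ("hum2", 7.0)],
    names=["protein", "pH"],
)
rng = np.random.default_rng(0)
wide = pd.DataFrame(rng.random((6, len(columns))), columns=columns)

# One sequential colormap per first-level group (hypothetical mapping).
group_cmaps = {"hum1": plt.cm.Greens, "hum2": plt.cm.Blues}

proteins = wide.columns.get_level_values("protein")
colors = []
for protein in proteins.unique():
    n = int((proteins == protein).sum())  # number of pH values in this group
    shades = group_cmaps[protein](np.linspace(0.5, 1.0, n))
    colors.extend(to_hex(shade) for shade in shades)

fig, ax = plt.subplots(figsize=(9, 9))
wide.plot(marker="o", linestyle="-", color=colors, ax=ax)
ax.legend(loc="upper left", bbox_to_anchor=(1.0, 1.0), numpoints=1)
```

Starting the shade range at 0.5 rather than 0 keeps the lightest line visible on a white background.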