I have the following:
df.groupby(['A', 'B'])['time'].mean().unstack().plot()
This gives me a line graph like this one:
The circles in the plot indicate data points. The values on x-axis are the values of A which are discrete (100, 200 and 300). That is, I only have data points on the 100th, 200th and 300th point on the x-axis. Pandas/Matplotlib adds the intermediate values on the x-axis (125, 150, 175, 225, 250 and 275) that I don't want.
How can I plot and tell Pandas not to add the extra values on the x-axis?
Matplotlib x axis tick locators
You're looking for tick locators:
import pandas as pd
import matplotlib.ticker as ticker
df = pd.DataFrame({'y': [1, 1.75, 3]}, index=[100, 200, 300])
ax = df.plot(legend=False)
# Place a major tick at every multiple of 100 on the x-axis
ax.xaxis.set_major_locator(ticker.MultipleLocator(100))
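MultipleLocator fits when the discrete values are evenly spaced. If they are not, one alternative is to place the ticks exactly at the index values. The following is a minimal sketch under that assumption; the sample data is made up, and the Agg backend line is only there so the snippet runs without a display.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend; drop this line for interactive use
import pandas as pd

# Made-up data: unevenly spaced discrete x values in the index
df = pd.DataFrame({"y": [1, 1.75, 3, 2.5]}, index=[100, 200, 300, 450])
ax = df.plot(legend=False, marker="o")

# Put ticks only at the positions that actually carry data
ax.set_xticks(list(df.index))
tick_positions = list(ax.get_xticks())
```

With the grouped frame from the question, the same call would take the index of the unstacked result instead of `df.index`.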
I am trying to plot two data sets that have a string as x axis.
%matplotlib inline
import matplotlib.pyplot as plt
at=['00', '10', '30', '40', '50', '60', '90']
a0=[2,3,4,5,6,7,8]
a1=[1,2,3,2,1,2,1]
bt=['20', '70', '80', '100']
b0=[4,5,6,7]
b1=[1,2,3,2]
plt.figure(figsize=(12,6))
plt.grid(visible=True, axis='y')
plt.bar(at, a1, bottom=a0, color='g', width=0.2)
plt.bar(bt, b1, bottom=b0, color='c', width=0.2)
This is the result of above code:
However, I would like the blue bars to be inserted at the right positions, respecting the order. Given that the x values are strings, how can that be achieved?
I suggest plotting the bars with actual horizontal numerical positions, and then afterwards setting the tick labels as you please. An example follows. Please note that I'm using the alternative Matplotlib API (I explicitly create figures and axes). This is immaterial here, and I'm sure you're able to adapt it to your style.
import matplotlib.pyplot as plt
at=[0, 10, 30, 40, 50, 60, 90]
a0=[2,3,4,5,6,7,8]
a1=[1,2,3,2,1,2,1]
bt=[20, 70, 80, 100]
b0=[4,5,6,7]
b1=[1,2,3,2]
fig = plt.figure(figsize=(12,6))
ax = fig.add_subplot(1,1,1)
ax.grid(visible=True, axis='y')
ax.bar(at, a1, bottom=a0, color='g', width=2, align="center") # I increased the width because yours were very narrow. I also centered the bars.
ax.bar(bt, b1, bottom=b0, color='c', width=2, align="center")
ticks = at + bt
ax.set_xticks(ticks)
ax.set_xticklabels(["{:02d}".format(t) for t in ticks]) # The format string here formats integers with two digits, padding with zeros.
plt.show()
The important thing to note here is that at and bt now simply contain integers. Any formatting of what actually shows up as tick labels (such as 0 being displayed as "00") does not affect the overall placement of the bars.
The code above produces
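If you would rather keep the bars evenly spaced, as in the original categorical plot, while still respecting numeric order, one option is to sort all labels once and use each label's rank as its position. This is only a sketch of that variation, not the approach shown above; `pos` is a hypothetical helper name, and the Agg backend line is there so the snippet runs without a display.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend; drop this line for interactive use
import matplotlib.pyplot as plt

at = ['00', '10', '30', '40', '50', '60', '90']
a0 = [2, 3, 4, 5, 6, 7, 8]
a1 = [1, 2, 3, 2, 1, 2, 1]
bt = ['20', '70', '80', '100']
b0 = [4, 5, 6, 7]
b1 = [1, 2, 3, 2]

# Sort every label by its numeric value and remember its rank
labels = sorted(at + bt, key=int)
pos = {label: rank for rank, label in enumerate(labels)}

fig, ax = plt.subplots(figsize=(12, 6))
ax.grid(visible=True, axis='y')
# Bars are placed at the rank of their label, so both series interleave correctly
ax.bar([pos[t] for t in at], a1, bottom=a0, color='g', width=0.4)
ax.bar([pos[t] for t in bt], b1, bottom=b0, color='c', width=0.4)
ax.set_xticks(list(range(len(labels))))
ax.set_xticklabels(labels)
```

The trade-off is that the gaps between bars no longer reflect the numeric distance between the labels.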
I have created a seaborn boxplot graph for investment data ranging from $100,000 to $40,000,000, spread out over several years. The graph shows a boxplot for investment amounts per year. The data comes from two series in a dataframe, which I'm rendering from a CSV: investment year ('inv_year') and investment amount ('inv_amount').
The problem: the graph's y-axis is automatically shown in increments of 0.5, so:
0.0, 0.5, 1.0, 1.5, etc. (i.e. in tens of millions)
but I want the axis in increments of 5, so:
0, 5, 10, 15, etc. (i.e. in millions)
How do I change the axis's scale?
Here's my current code:
import seaborn as sns
import matplotlib.pyplot as plt
import pandas as pd
df = pd.read_excel('investments')
plt.figure(figsize=(12, 10))
sns.boxplot(x='inv_year', y='inv_amount', data=df)
plt.show()
You can define an array with the desired ticks and pass it to plt.yticks (here x stands for the values plotted on the y-axis, and NumPy is assumed to be imported as np):
plt.yticks(np.arange(min(x), max(x)+1, 5))
If you want to keep the limits but just adjust the ticks:
increments = 5
ax = plt.gca()
start, end = ax.get_ylim()
ax.yaxis.set_ticks(np.arange(start, end, increments))
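Since the amounts in the question are raw dollars, a step of 5 only helps once the axis is expressed in millions. A hedged sketch of one way to do that combines a MultipleLocator (one tick per 5 million) with a FuncFormatter that divides the label by 1e6. The data is made up and plotted with plain Matplotlib; the same two axis calls should also apply to the Axes returned by sns.boxplot.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend; drop this line for interactive use
import matplotlib.pyplot as plt
import matplotlib.ticker as ticker
import numpy as np

# Made-up investment amounts (in dollars) for two years
rng = np.random.default_rng(0)
amounts_by_year = [rng.uniform(1e5, 4e7, 50) for _ in range(2)]

fig, ax = plt.subplots(figsize=(12, 10))
ax.boxplot(amounts_by_year)
ax.set_ylim(0, 4e7)

# One tick every 5 million dollars
ax.yaxis.set_major_locator(ticker.MultipleLocator(5e6))
# Label each tick in millions (0, 5, 10, ...) instead of 0.0, 0.5, 1.0 with a 1e7 offset
millions = ticker.FuncFormatter(lambda value, pos: f"{value / 1e6:g}")
ax.yaxis.set_major_formatter(millions)
ax.set_ylabel("inv_amount (millions)")
```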
I am trying to plot multiple different plots on a single matplotlib figure within a for loop. At the moment it all works in MATLAB, as shown in the picture below, and I am then able to save the figure as a video frame. Here is a link of a sample video generated in MATLAB for 10 frames
In Python, I tried it as below:
import numpy as np
import matplotlib.pyplot as plt
for frame in range(FrameStart, FrameEnd):  # loop1
    # data generation code within a for loop for n frames from source video
    array1 = np.zeros((200, 3800))
    array2 = np.zeros((19, 2))
    array3 = np.zeros((60, 60))
    for i in range(len(array2)):  # loop2
        pass  # generate data for arrays 1 to 3 from the frame data
    # end loop2
    plt.subplot(6, 1, 1)
    plt.imshow(DataArray, cmap='gray')
    plt.subplot(6, 1, 2)
    plt.bar(data2D[:, 0], data2D[:, 1])
    plt.subplot(2, 2, 3)
    plt.contourf(mapData)
    # for the fourth plot, use array2[3] and array2[5], plot it as shown and keep this
    # plot without erasing it for the next frame
I'm not sure how to do the 4th axes with line plots. It needs to stay in place (done using hold on for this axis in MATLAB) for the entire sequence of frames processed in the for loop, while the other 3 axes need to be erased and updated with new data for each frame in the movie. The contour plot needs to be square all the time, with a color bar on the side. At the end of each frame's processing, once all the axes are updated, the figure needs to be saved as a frame of a movie. Again, this is easily done in MATLAB, but I'm not sure how to do it in Python.
Any suggestions
thanks
I guess you need something like this format.
I have used comments # in code to answer your queries. Please check the snippet
import matplotlib.pyplot as plt
fig=plt.figure(figsize=(6,6))
ax1=fig.add_subplot(311) #3rows 1 column 1st plot
ax2=fig.add_subplot(312) #3rows 1 column 2nd plot
ax3=fig.add_subplot(325) #3rows 2 column 5th plot
ax4=fig.add_subplot(326) #3rows 2 column 6th plot
plt.show()
To turn off the ticks (together with the axis lines and labels) you can use plt.axis('off'). I don't know how to reproduce your format, so I left it blank. You can adjust figsize based on your requirements.
import numpy as np
from numpy import random
import matplotlib.pyplot as plt
fig=plt.figure(figsize=(6,6)) #First is width Second is height
ax1=fig.add_subplot(311)
ax2=fig.add_subplot(312)
ax3=fig.add_subplot(325)
ax4=fig.add_subplot(326)
#Bar Plot
langs = ['C', 'C++', 'Java', 'Python', 'PHP']
students = [23,17,35,29,12]
ax2.bar(langs,students)
#Contour Plot
xlist = np.linspace(-3.0, 3.0, 100)
ylist = np.linspace(-3.0, 3.0, 100)
X, Y = np.meshgrid(xlist, ylist)
Z = np.sqrt(X**2 + Y**2)
cp = ax3.contourf(X, Y, Z)
fig.colorbar(cp,ax=ax3) #Add a colorbar to a plot
#Multiple line plot
x = np.linspace(-1, 1, 50)
y1 = 2*x + 1
y2 = 2**x + 1
ax4.plot(x, y2)
ax4.plot(x, y1, color='red',linewidth=1.0)
plt.tight_layout() #Makes sure the plots don't overlap
plt.show()
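The per-frame part of the question (redraw three axes, keep the fourth, grab each frame) can be sketched as follows. This is only one possible arrangement under stated assumptions: the data is random, the frame count is 3, the colorbar gets its own narrow axes (`cax`) through a GridSpec so it can be cleared and redrawn, and frames are collected as RGBA arrays from the Agg canvas instead of being written to a movie file.

```python
import matplotlib
matplotlib.use("Agg")  # off-screen rendering; frames are grabbed from the canvas
import matplotlib.pyplot as plt
import numpy as np

rng = np.random.default_rng(0)
fig = plt.figure(figsize=(6, 6))
# The narrow middle column in the last row holds the colorbar
gs = fig.add_gridspec(3, 3, width_ratios=[1, 0.05, 1])
ax1 = fig.add_subplot(gs[0, :])
ax2 = fig.add_subplot(gs[1, :])
ax3 = fig.add_subplot(gs[2, 0])
cax = fig.add_subplot(gs[2, 1])
ax4 = fig.add_subplot(gs[2, 2])

frames = []
for frame in range(3):
    # Erase the axes that are redrawn on every frame (and the colorbar axes)
    for ax in (ax1, ax2, ax3, cax):
        ax.cla()
    ax1.imshow(rng.random((20, 380)), cmap='gray', aspect='auto')
    ax2.bar(np.arange(19), rng.random(19))
    cp = ax3.contourf(rng.random((60, 60)))
    ax3.set_aspect('equal')  # keep the contour plot square
    fig.colorbar(cp, cax=cax)
    # ax4 is never cleared, so every frame adds one more line (like MATLAB's "hold on")
    ax4.plot([0, 1], [frame, frame + 1])
    # Render the figure and keep a copy of the pixels as an RGBA array
    fig.canvas.draw()
    frames.append(np.asarray(fig.canvas.buffer_rgba()).copy())
```

The collected arrays could then be handed to a video writer. Alternatively, matplotlib.animation.FFMpegWriter can grab frames directly from the figure if ffmpeg is installed.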
I wish to create a sub plot that looks like the following picture,
it is supposed to contain 25 polar histograms, and I wish to add them to the plot one by one.
needs to be in python.
I already figured I need to use matplotlib but can't seem to figure it out completely.
thanks a lot!
You can create a grid of polar axes via projection='polar'.
hist creates a histogram, also when working with polar axes. Note that the x is in radians with a range of 2π. It works best when you give the bins explicitly as a linspace from 0 to 2π (or from -π to π, depending on the data). The third parameter of linspace should be one more than the number of bars that you'd want for the full circle.
About the exact parameters of axs[i][j].hist(x, bins=np.linspace(0, 2 * np.pi, np.random.randint(7, 30), endpoint=True), color='dodgerblue', ec='black'):
axs[i][j]: draw on the jth subplot of the ith row
.hist: create a histogram
x: the values that are put into bins
bins=: the bins to use (either a fixed number of equal-width bins between the lowest and highest x, or explicit boundaries; the default is 10 equal-width bins)
np.random.randint(7, 30): a random whole number between 7 and 29
np.linspace(0, 2 * np.pi, n, endpoint=True): create n evenly spaced boundaries between 0 and 2π, which gives n-1 bins; endpoint=True puts boundaries at 0, at 2π and at n-2 positions in between; with endpoint=False there is a boundary at 0 and at n-1 positions in between, but none at the end
color='dodgerblue': the color of the histogram bars will be blueish
ec='black': the edge color of the bars will be black
import numpy as np
import matplotlib.pyplot as plt
fig, axs = plt.subplots(5, 5, figsize=(8, 8),
                        subplot_kw=dict(projection='polar'))
for i in range(5):
    for j in range(5):
        x = np.random.uniform(0, 2 * np.pi, 50)
        axs[i][j].hist(x, bins=np.linspace(0, 2 * np.pi, np.random.randint(7, 30)), color='dodgerblue', ec='black')
plt.tight_layout()
plt.show()
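The "one more than the number of bars" rule for the third linspace argument can be checked numerically without plotting. A small sketch, with `n_bars` and the random angles chosen arbitrarily for illustration:

```python
import numpy as np

n_bars = 8
# n_bars + 1 boundaries give n_bars bins covering the full circle
edges = np.linspace(0, 2 * np.pi, n_bars + 1)

# Arbitrary sample angles in radians
angles = np.random.default_rng(0).uniform(0, 2 * np.pi, 50)
counts, _ = np.histogram(angles, bins=edges)

print(len(edges), len(counts), counts.sum())
```

Every angle falls into one of the bins because the first and last boundaries sit exactly at 0 and 2π.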
I want to make a scatterplot (using matplotlib) where the points are shaded according to a third variable. I've got very close with this:
plt.scatter(w, M, c=p, marker='s')
where w and M are the data points and p is the variable I want to shade with respect to.
However I want to do it in greyscale rather than colour. Can anyone help?
There's no need to manually set the colors. Instead, specify a grayscale colormap...
import numpy as np
import matplotlib.pyplot as plt
# Generate data...
x = np.random.random(10)
y = np.random.random(10)
# Plot...
plt.scatter(x, y, c=y, s=500) # s is the marker size
plt.gray()
plt.show()
Or, if you'd prefer a wider range of colormaps, you can also specify the cmap kwarg to scatter. There are several pre-made grayscale colormaps (e.g. gray, gist_yarg, binary). To use the reversed version of any of them, append "_r" to its name, e.g. gray_r instead of gray.
import matplotlib.pyplot as plt
import numpy as np
# Generate data...
x = np.random.random(10)
y = np.random.random(10)
plt.scatter(x, y, c=y, s=500, cmap='gray')
plt.show()
In matplotlib, grey levels can be given as a string holding a number between 0 and 1.
For example c = '0.1'
You can then convert your third variable to a value inside this range and use it to color your points.
In the following example I used the y position of each point as the value that determines its color:
from matplotlib import pyplot as plt
x = [1, 2, 3, 4, 5, 6, 7, 8, 9]
y = [125, 32, 54, 253, 67, 87, 233, 56, 67]
color = [str(item/255.) for item in y]
plt.scatter(x, y, s=500, c=color)
plt.show()
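Dividing by 255 works here because the sample y values happen to lie between 0 and 255. For an arbitrary third variable, a min-max rescaling to the 0 to 1 range is one way to get valid grey strings. A minimal sketch, where `p` is a made-up third variable and the Agg backend line is only there so the snippet runs without a display:

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend; drop this line for interactive use
import matplotlib.pyplot as plt
import numpy as np

x = np.arange(1, 10)
y = np.array([125, 32, 54, 253, 67, 87, 233, 56, 67])
p = np.array([-4.0, 0.5, 3.0, 12.0, 7.5, 1.0, 9.0, -1.5, 6.0])  # made-up third variable

# Min-max scaling maps the smallest p to 0 (black) and the largest to 1 (white)
scaled = (p - p.min()) / (p.max() - p.min())
grey_levels = [f"{v:.3f}" for v in scaled]

fig, ax = plt.subplots()
# Black edges keep the near-white markers visible on a white background
ax.scatter(x, y, s=500, c=grey_levels, edgecolors='black')
```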
Sometimes you may need to set the color precisely based on the x-value. For example, you may have a dataframe with 3 types of variables and some data points, and you want to do the following:
Plot points corresponding to Physical variable 'A' in RED.
Plot points corresponding to Physical variable 'B' in BLUE.
Plot points corresponding to Physical variable 'C' in GREEN.
In this case, you can write a short function that maps the x-values to the corresponding color names as a list, and then pass that list to the plt.scatter command.
import matplotlib.pyplot as plt
x=['A','B','B','C','A','B']
y=[15,30,25,18,22,13]
# Function to map the colors as a list from the input list of x variables
def pltcolor(lst):
    cols=[]
    for l in lst:
        if l=='A':
            cols.append('red')
        elif l=='B':
            cols.append('blue')
        else:
            cols.append('green')
    return cols
# Create the colors list using the function above
cols=pltcolor(x)
plt.scatter(x=x,y=y,s=500,c=cols) #Pass on the list created by the function here
plt.grid(True)
plt.show()
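The same mapping can be written more compactly with a dictionary lookup. This is just an equivalent sketch of the function above; `color_of` is a hypothetical name, and the fallback to 'green' mirrors the else branch.

```python
x = ['A', 'B', 'B', 'C', 'A', 'B']

# Category -> color name; anything not listed falls back to green
color_of = {'A': 'red', 'B': 'blue', 'C': 'green'}
cols = [color_of.get(label, 'green') for label in x]
```

The resulting `cols` list can be passed to plt.scatter through the c argument exactly as before.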
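A pretty straightforward solution is also this one (the 'cmo.deep' colormap comes from the third-party cmocean package, which needs to be imported so that its colormaps are registered with Matplotlib):
import cmocean  # registers the 'cmo.*' colormaps
fig, ax = plt.subplots(nrows=1, ncols=1, figsize=(8,8))
p = ax.scatter(x, y, c=y, cmap='cmo.deep')
fig.colorbar(p,ax=ax,orientation='vertical',label='labelname')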