Say I have the following geodataframe that contains 3 polygon objects.
import geopandas as gpd
from shapely.geometry import Polygon
p1=Polygon([(0,0),(0,1),(1,1),(1,0)])
p2=Polygon([(3,3),(3,6),(6,6),(6,3)])
p3=Polygon([(3,.5),(4,2),(5,.5)])
gdf=gpd.GeoDataFrame(geometry=[p1,p2,p3])
gdf['Value1']=[1,10,20]
gdf['Value2']=[300,200,100]
gdf content:
>>> gdf
geometry Value1 Value2
0 POLYGON ((0 0, 0 1, 1 1, 1 0, 0 0)) 1 300
1 POLYGON ((3 3, 3 6, 6 6, 6 3, 3 3)) 10 200
2 POLYGON ((3 0.5, 4 2, 5 0.5, 3 0.5)) 20 100
>>>
I can make a separate figure for each column (Value1 and Value2) by calling gdf.plot() twice. However, is there a way to plot both of these maps next to each other in the same figure, as subplots?
Always always always create your matplotlib objects ahead of time and pass them to the plotting methods (or use them directly). Doing so, your code becomes:
from matplotlib import pyplot
import geopandas
from shapely import geometry
p1 = geometry.Polygon([(0,0),(0,1),(1,1),(1,0)])
p2 = geometry.Polygon([(3,3),(3,6),(6,6),(6,3)])
p3 = geometry.Polygon([(3,.5),(4,2),(5,.5)])
gdf = geopandas.GeoDataFrame(dict(
    geometry=[p1, p2, p3],
    Value1=[1, 10, 20],
    Value2=[300, 200, 100],
))
fig, (ax1, ax2) = pyplot.subplots(ncols=2, sharex=True, sharey=True)
gdf.plot(ax=ax1, column='Value1')
gdf.plot(ax=ax2, column='Value2')
Which gives me:
# for plotting a GeoDataFrame on multiple subplots
import geopandas as gpd
import matplotlib.pyplot as plt
gdf = gpd.read_file(geojson)  # geojson: path to your own file
fig, axes = plt.subplots(1, 4, figsize=(40, 10))
axes[0].set_title('Some Title')
gdf.plot(ax=axes[0], column='Some column for coloring', cmap='coloring option')
# repeat with the next Axes for each further map
axes[1].set_title('Some Title')
gdf.plot(ax=axes[1], column='Some column for coloring', cmap='coloring option')
Related
Currently my MDS plot looks like this. Instead of having the numbers on the MDS plot, I would like to replace them with dots. My desired output is that the point annotated as 'highest' will be a blue dot, the point annotated as 'lowest' will be a red dot, and all the other points will be grey dots.
Code to reproduce the MDS plot above.
import numpy as np
import scipy
import matplotlib.pyplot as plt
from sklearn.metrics import pairwise_distances #jaccard diss.
from sklearn import manifold # multidimensional scaling
foods_binary = np.random.randint(2, size=(100, 10)) #initial dataset
print(foods_binary.shape)
dis_matrix = pairwise_distances(foods_binary, metric = 'jaccard')
mds_model = manifold.MDS(n_components = 2, random_state = 123,
                         dissimilarity = 'precomputed')
mds_fit = mds_model.fit(dis_matrix)
mds_coords = mds_model.fit_transform(dis_matrix)
plt.figure()
plt.scatter(mds_coords[:,0],mds_coords[:,1],
            facecolors = 'none', edgecolors = 'none') # points in white (invisible)
labels = [ 1, 2, 3, 4, 5, 'highest', 5, 6, 7, 8, 9, 'lowest']
for label, x, y in zip(labels, mds_coords[:,0], mds_coords[:,1]):
    plt.annotate(label, (x,y), xycoords = 'data')
plt.xlabel('First Dimension')
plt.ylabel('Second Dimension')
plt.title('Dissimilarity among food items')
plt.show()
How can I achieve what I desire above? Thanks for any suggestions.
I am trying to plot an array of 101 rows * 12 Columns, with row #1 as a highlight using the code below:
plt.plot(HW.transpose()[1:101],color = 'grey', alpha = 0.1)
plt.plot(HW.transpose()[0],color = 'red', linewidth = 3, alpha = 0.7)
The only issue with this graph is that 'S1' somehow ends up at the end instead of at the beginning. What am I doing wrong?
HW.transpose()[1:101] doesn't select the desired columns, because slicing a DataFrame with [1:101] selects rows, not columns. You can use HW.transpose().iloc[:, 1:101] instead:
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd
HW = pd.DataFrame(np.random.randn(101, 12).cumsum(axis=1), columns=[f'S{i}' for i in range(1, 13)])
plt.plot(HW.transpose().iloc[:, 1:101], color='grey', alpha=0.1)
plt.plot(HW.transpose().iloc[:, 0], color='red', linewidth=3, alpha=0.7)
plt.show()
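To see the difference between the two slices, here is a small sketch that only inspects shapes and index labels. The random data and the S1..S12 column names are assumptions carried over from the example above, not the asker's real HW.

```python
import numpy as np
import pandas as pd

# Hypothetical stand-in for HW: 101 rows, 12 columns named S1..S12
HW = pd.DataFrame(np.random.randn(101, 12),
                  columns=[f'S{i}' for i in range(1, 13)])
T = HW.transpose()                # shape (12, 101), index S1..S12

row_slice = T[1:101]              # slices ROWS: only S2..S12 remain
col_slice = T.iloc[:, 1:101]      # slices COLUMNS: S1..S12 all remain

print(row_slice.shape, list(row_slice.index[:2]))
print(col_slice.shape, list(col_slice.index[:2]))
```

With the row slice, the grey lines are drawn with only S2..S12 on the x-axis, so S1 would only be added to the axis when the red line is drawn afterwards, which would explain why it lands at the end.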
I have this simple dataframe:
df = pd.DataFrame({"X": np.random.randint(50,53,size=100),
                   "Y": np.random.randint(200,300,size=100),
                   "Z": np.random.randint(400,800,size=100)})
And as I have many columns (all of them numeric), I did this loop in order to do a specific plot:
for i in df.columns:
    data = df[i]
    data.plot(kind="kde")
    plt.vlines(x=data.mean(),ymin=0, ymax=0.01, linestyles="dotted")
    plt.show()
However, I'm having trouble trying to generalize the ymax argument of plt.vlines(), as I need to get the maximum y-axis value of each density plot in order to plot the mean vline of each plot accordingly. I have tried with np.argmax(), but it doesn't seem to work.
Any suggestions?
Like pandas.DataFrame.plot(), calling plot() on a Series returns a matplotlib.axes.Axes object. You can use its get_ylim() method to get ymin and ymax.
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
df = pd.DataFrame({"X": np.random.randint(50,53,size=100),
                   "Y": np.random.randint(200,300,size=100),
                   "Z": np.random.randint(400,800,size=100)})
for i in df.columns:
    data = df[i]
    ax = data.plot(kind="kde")
    ymin, ymax = ax.get_ylim()
    plt.vlines(x=data.mean(),ymin=ymin, ymax=ymax, linestyles="dotted")
    plt.show()
To get the value of the kde corresponding to the mean, you could extract the curve from the plot and interpolate it at the position of the mean:
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
df = pd.DataFrame({"X": 20 + np.random.randint(-1, 2, size=100).cumsum(),
                   "Y": 30 + np.random.randint(-1, 2, size=100).cumsum(),
                   "Z": 40 + np.random.randint(-1, 2, size=100).cumsum()})
fig, ax = plt.subplots()
for col in df.columns:
    data = df[col]
    data.plot(kind="kde", ax=ax)
    x = data.mean()
    kdeline = ax.lines[-1]  # the kde curve that was just drawn
    ymax = np.interp(x, kdeline.get_xdata(), kdeline.get_ydata())
    ax.vlines(x=x, ymin=0, ymax=ymax, linestyles="dotted")
ax.set_ylim(bottom=0) # ax.vlines() moves the bottom ylim; set it back to 0
plt.show()
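As an alternative to reading the curve back from the plot, the density at the mean could also be evaluated directly. This is only a sketch: it assumes that pandas' kde plot is built on scipy.stats.gaussian_kde with its default bandwidth (my understanding of pandas, not something stated in this thread), and the sample data is made up.

```python
import numpy as np
import pandas as pd
from scipy.stats import gaussian_kde

rng = np.random.default_rng(0)
data = pd.Series(rng.normal(loc=50, scale=2, size=100))  # made-up sample

# Fit a Gaussian KDE with scipy's default bandwidth
kde = gaussian_kde(data.to_numpy(dtype=float))
mean = data.mean()
height = float(kde(mean)[0])  # estimated density at the mean

print(mean, height)
```

The resulting height could then be passed as ymax to ax.vlines() in place of the interpolated value.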
Use plt.axvline. You specify the limits as numbers in the range [0,1], 0 being the bottom of the plot, 1 being the top.
for i in df.columns:
    data = df[i]
    data.plot(kind="kde")
    plt.axvline(data.mean(), 0, 1, linestyle='dotted', color='black')
    plt.show()
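A minimal sketch of what those [0, 1] limits mean, using a made-up curve with a small y-range: the line returned by axvline keeps its x position in data coordinates, but its y values are fractions of the Axes height, so it spans the full plot whatever the density scale is.

```python
import matplotlib.pyplot as plt

fig, ax = plt.subplots()
ax.plot([0, 1, 2], [0.0, 0.004, 0.001])  # made-up curve, tiny y-range

line = ax.axvline(1, 0, 1, linestyle='dotted', color='black')

# x is stored in data units; y is stored as Axes fractions (0 = bottom, 1 = top)
print(list(line.get_xdata()), list(line.get_ydata()))
```

This is why no ymax needs to be computed per column, unlike with plt.vlines(), whose ymin and ymax are in data units. Call plt.show() to display the figure.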
I'd like to have two plots (a scatter and a bar) shown on the same figure, with the following layout:
| 0 | 1 | 2 | 3 |
But where 0 through 2 are filled with the first plot, and 3 is filled with the second. I've tried to use fig, ax = plt.subplots(1,4,figsize = (12, 10)) to create a 1x4 array of subplots, but when I call sns.scatterplot(..., ax=...), the ax argument is only able to accept one subplot label.
Is there a way, either in the subplot call or in the ax argument, to make a plot that is 75% of the width?
This is one way to do it with plt.subplots(), using the keyword gridspec_kw, which takes a dictionary that is passed to the GridSpec constructor used to create the grid on which the subplots are placed.
import numpy as np
from collections import Counter
import matplotlib.pyplot as plt
np.random.seed(123)
data = np.random.randint(0, 10, 100)
x, y = zip(*Counter(data).items())
fig, (ax1, ax2) = plt.subplots(1, 2, gridspec_kw={'width_ratios': [3, 1]},
                               figsize=(10, 4))
ax1.scatter(x, y)
ax2.bar(x, y)
plt.tight_layout()
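A different approach, closer to the literal | 0 | 1 | 2 | 3 | layout in the question, is to build a 1x4 grid and let the first Axes span three cells. This is a sketch with placeholder data, not the answer's method; the names ax_wide and ax_narrow are made up for illustration.

```python
import matplotlib.pyplot as plt

fig = plt.figure(figsize=(12, 4))
gs = fig.add_gridspec(1, 4)            # one row, four equal columns
ax_wide = fig.add_subplot(gs[0, :3])   # spans columns 0 to 2
ax_narrow = fig.add_subplot(gs[0, 3])  # column 3 only

ax_wide.scatter([1, 2, 3], [4, 5, 6])  # placeholder data
ax_narrow.bar([1, 2, 3], [4, 5, 6])

# Width of the wide Axes relative to the narrow one
ratio = ax_wide.get_position().width / ax_narrow.get_position().width
print(ratio)
```

Either Axes can then be handed to a plotting function that accepts an ax argument, for example sns.scatterplot(..., ax=ax_wide). The wide Axes also absorbs the gaps between the cells it spans, so the ratio comes out somewhat above 3.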
I want to show my data in two (or more) stacked bar graphs including error bars. My code is based on a working example, but uses DataFrames as input instead of arrays.
I tried converting the DataFrame output to an array, but that did not work.
from uncertain_panda import pandas as pd
import matplotlib.pyplot as plt
import numpy as np
raw_data = {'': ['Error', 'Value'],'Stars': [3, 18],'Cats': [2,15],'Planets': [1,12],'Dogs': [2,16]}
df = pd.DataFrame(raw_data)
df.set_index('', inplace=True)
print(df)
N = 2
ind = np.arange(N)
width = 0.35
first_Value = df.loc[['Value'],['Cats','Dogs']]
second_Value = df.loc[['Value'],['Stars','Planets']]
first_Error = df.loc[['Error'],['Cats','Dogs']]
second_Error = df.loc[['Error'],['Stars','Planets']]
p1 = plt.bar(ind, first_Value, width, yerr=first_Error)
p2 = plt.bar(ind, second_Value, width, yerr=second_Error, bottom=first_Value)
plt.xticks(ind, ('Pets', 'Universe'))
plt.legend((p1[0], p2[0]), ('Cats', 'Dogs', 'Stars', 'Planets'))
plt.show()
I expect an output like this:
https://matplotlib.org/3.1.0/gallery/lines_bars_and_markers/bar_stacked.html#sphx-glr-gallery-lines-bars-and-markers-bar-stacked-py
Instead I get this error:
TypeError: only size-1 arrays can be converted to Python scalars