How do I find rectangular areas in a matrix - numpy

I have a matrix with different labels.
How do I find all rectangles with identical values?
I tried:
import numpy as np
import scipy.ndimage as nd
a = np.zeros((120, 120), dtype=np.uint16)
a[:100, :100] = 5
a[10:100, 10:100] = 6
a[12:100, 12:100] = 7
for v in np.unique(a):
    if v:
        l = nd.label((a == v).astype(int))
        f = nd.find_objects(l[0])
        print(v, f)
This reports
5 [(slice(0, 100, None), slice(0, 100, None))]
6 [(slice(10, 100, None), slice(10, 100, None))]
7 [(slice(12, 100, None), slice(12, 100, None))]
But a[50, 50] == 7, so there is clearly no rectangle [0:100, 0:100] filled entirely with the value 5.
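One observation that may explain this output: find_objects returns the bounding box of each labeled region, and a bounding box is not necessarily filled by that region. A minimal sketch, assuming the goal is to keep only connected regions that completely fill their bounding box (the helper name filled_rectangles is hypothetical, and this does not enumerate rectangles contained inside larger, non-rectangular regions):

```python
import numpy as np
import scipy.ndimage as nd


def filled_rectangles(a):
    """Return {value: [slices]} for connected regions that fill their bounding box."""
    result = {}
    for v in np.unique(a):
        if not v:  # skip the background value 0
            continue
        labels, _ = nd.label(a == v)
        # find_objects yields one bounding box per label, starting at label 1
        for i, sl in enumerate(nd.find_objects(labels), start=1):
            if (labels[sl] == i).all():  # every cell in the box belongs to the region
                result.setdefault(int(v), []).append(sl)
    return result


a = np.zeros((120, 120), dtype=np.uint16)
a[:100, :100] = 5
a[10:100, 10:100] = 6
a[12:100, 12:100] = 7

rects = filled_rectangles(a)
print(rects)
```

With the array from the question, only the region with value 7 passes the check, because the regions with values 5 and 6 are L-shaped frames around it.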

Related

Numpy.polyfit Not Returning Polynomial

I am trying to create a python program in which the user inputs a set of data and the program spits out an output in which it creates a graph with a line/polynomial which best fits the data.
This is the code:
from matplotlib import pyplot as plt
import numpy as np
x = []
y = []
x_num = 0
while True:
    sequence = int(input("Input 1 number in the sequence, type 9040321 to stop"))
    if sequence == 9040321:
        poly = np.polyfit(x, y, deg=2, rcond=None, full=False, w=None, cov=False)
        plt.plot(poly)
        plt.scatter(x, y, c="blue", label="data")
        plt.legend()
        plt.show()
        break
    else:
        y.append(sequence)
        x.append(x_num)
        x_num += 1
I tested it by entering 1, 2, 4, 8, each as a separate input. Matplotlib plotted the points properly; however, for a degree of 2, the output was the following image:
This is clearly not correct, however I am unsure what the problem is. I think it has something to do with the degree, however when I change the degree to 3, it still does not fit. I am looking for a graph like y=sqrt(x) to go over each of the points and when that is not possible, create the line that fits the best.
Edit: I added a print(poly) feature and for the selected input above, it gives [0.75 0.05 1.05]. I do not know what to make of this.
Approximation by a second degree polynomial
np.polyfit gives the coefficients of a polynomial close to the given points. To plot the polynomial as a smooth curve with matplotlib, you need to calculate a lot of x,y pairs. Using np.linspace(start, stop, numsteps) for the xs, numpy's vectorization allows calculating all the corresponding ys in one go. E.g. ys = a * x**2 + b * x + c.
from matplotlib import pyplot as plt
import numpy as np
x = [0, 1, 2, 3, 4, 5, 6]
y = [1, 2, 4, 8, 16, 32, 64]
plt.scatter(x, y, color='crimson', label='given points')
poly = np.polyfit(x, y, deg=2, rcond=None, full=False, w=None, cov=False)
xs = np.linspace(min(x), max(x), 100)
ys = poly[0] * xs ** 2 + poly[1] * xs + poly[2]
plt.plot(xs, ys, color='dodgerblue', label=f'$({poly[0]:.2f})x^2+({poly[1]:.2f})x + ({poly[2]:.2f})$')
plt.legend()
plt.show()
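As a small complement, np.polyval evaluates the coefficients returned by np.polyfit directly, so the polynomial does not have to be written out term by term. A minimal sketch using the four values from the question (x = 0..3 is an assumption that mirrors how the question's loop numbers its inputs):

```python
import numpy as np

x = [0, 1, 2, 3]
y = [1, 2, 4, 8]

coeffs = np.polyfit(x, y, deg=2)       # coefficients, highest power first: [a, b, c]
xs = np.linspace(min(x), max(x), 100)  # dense x values for a smooth curve
ys = np.polyval(coeffs, xs)            # same as a * xs**2 + b * xs + c

print(np.round(coeffs, 2))
```

The rounded coefficients should agree with the values reported in the question's edit; plotting xs against ys with plt.plot(xs, ys) then draws the fitted curve.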
Higher degree approximating polynomials
Given N points with distinct x values, a polynomial of degree N-1 can pass exactly through each of them. Here is an example with 7 points and polynomials of degree 0 through 6:
from matplotlib import pyplot as plt
import numpy as np
x = [0, 1, 2, 3, 4, 5, 6]
y = [1, 2, 4, 8, 16, 32, 64]
plt.scatter(x, y, color='black', zorder=3, label='given points')
for degree in range(0, len(x)):
    poly = np.polyfit(x, y, deg=degree, rcond=None, full=False, w=None, cov=False)
    xs = np.linspace(min(x) - 0.5, max(x) + 0.5, 100)
    ys = sum(poly_i * xs**i for i, poly_i in enumerate(poly[::-1]))
    plt.plot(xs, ys, label=f'degree {degree}')
plt.legend()
plt.show()
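The claim about the degree N-1 polynomial can be checked numerically. A rough sketch, reusing the same 7 points, that compares the largest deviation at the given points for the degree-6 fit and for a degree-2 fit (the variable names are illustrative):

```python
import numpy as np

x = [0, 1, 2, 3, 4, 5, 6]
y = [1, 2, 4, 8, 16, 32, 64]

# Degree N-1: the fitted polynomial should reproduce the points up to rounding error
exact = np.polyfit(x, y, deg=len(x) - 1)
residual_exact = np.max(np.abs(np.polyval(exact, x) - y))

# Degree 2: a least-squares compromise that generally misses the points
approx = np.polyfit(x, y, deg=2)
residual_approx = np.max(np.abs(np.polyval(approx, x) - y))

print(residual_exact, residual_approx)
```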
Another example
x = [0, 1, 2, 3, 4]
y = [1, 1, 6, 5, 5]
import numpy as np
import matplotlib.pyplot as plt
x = [1, 2, 3, 4]
y = [1, 2, 4, 8]
coeffs = np.polyfit(x, y, 2)
print(coeffs)
poly = np.poly1d(coeffs)
print(poly)
x_cont = np.linspace(0, 4, 81)
y_cont = poly(x_cont)
plt.scatter(x, y)
plt.plot(x_cont, y_cont)
plt.grid(1)
plt.show()
Executing the code, you have the graph above and this is printed in the terminal:
[ 0.75 -1.45 1.75]
      2
0.75 x - 1.45 x + 1.75
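It seems to me that you had false expectations about the output of polyfit: it returns the coefficients of the polynomial (highest power first), not the fitted y values, so the polynomial has to be evaluated, e.g. with np.poly1d, before plotting.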

Plotting in horizontal with percentage on the bar

I have the following data frame, say df:
FunderCode
HCPL1 1% 18% 50% 30% 1%
HCPL2 1% 16% 44% 37% 3%
HCPL3 1% 17% 40% 39% 3%
HCPL4 1% 20% 40% 34% 5%
I wanted to plot it like the following
I could get the following using
Piv_age_per.plot( kind = 'bar', stacked = True , legend = True)
I want a horizontal diagram with the percentages shown on the bars. Is there a built-in command to achieve that?
I could use the following code to generate such a plot:
import pandas as pd #1.4.4
import matplotlib.pyplot as plt # 3.5.2
# Python 3.10.6
data = pd.DataFrame(columns=range(5))
data.loc['HCPL1'] = [1, 18, 50, 30, 1]
data.loc['HCPL2'] = [1, 16, 44, 37, 3]
data.loc['HCPL3'] = [1, 17, 40, 39, 3]
data.loc['HCPL4'] = [1, 20, 40, 34, 5]
cumulative = data.cumsum(axis=1)
n_rows, n_cols = data.shape
y_pos = range(n_rows)
height = 0.35
colors = ['blue', 'darkorange', 'gray', 'yellow', 'darkblue']
fig, ax = plt.subplots(figsize=(8, 5))
for i in range(n_cols):
    left = cumulative[i] - data[i]
    labels = [f'{value:.1f}%' for value in data[i]]
    ploted = ax.barh(y_pos, data[i], height,
                     align='center',
                     left=left,
                     zorder=2,
                     color=colors[i])
    ax.bar_label(ploted, label_type='center', fontsize=12, labels=labels)
ax.set_yticks(y_pos, labels=data.index)
ax.invert_yaxis()
ax.tick_params(axis='y', pad=20)
ax.set_xticks(range(0, 101, 10))
ax.grid(axis='x', zorder=0)
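On the question of a built-in command: pandas can draw the stacked horizontal bars itself with kind='barh', and Matplotlib's bar_label (available since Matplotlib 3.4) can then annotate each segment. A minimal sketch under the assumption that the values are stored as plain numbers, as in the code above; the integer column names 0..4 are placeholders because the original column headers are not shown:

```python
import pandas as pd
import matplotlib.pyplot as plt

data = pd.DataFrame(
    [[1, 18, 50, 30, 1],
     [1, 16, 44, 37, 3],
     [1, 17, 40, 39, 3],
     [1, 20, 40, 34, 5]],
    index=["HCPL1", "HCPL2", "HCPL3", "HCPL4"],
)

fig, ax = plt.subplots(figsize=(8, 5))
data.plot(kind="barh", stacked=True, ax=ax)

# Each column becomes one container of bar segments; label every segment
for container in ax.containers:
    ax.bar_label(container, fmt="%d%%", label_type="center")

ax.invert_yaxis()  # keep HCPL1 at the top, as in the data frame
```

Calling plt.show() afterwards displays the figure. Very narrow segments (the 1% values) may end up with labels that overlap their neighbours.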

Set real dimensions of a Chart in Matplotlib

I need to set the dimensions of a chart exactly. I tried this, but the result is not what I expected (whether I specify the size in px or in cm). In addition, I would like to know how to export the image correctly.
import matplotlib.pyplot as plt
import numpy as np
plt.rcParams['figure.dpi']=100
# create data
x = ['A', 'B', 'C', 'D']
y1 = np.array([10, 20, 10, 30])
y2 = np.array([20, 25, 15, 25])
y3 = np.array([12, 15, 19, 6])
y4 = np.array([10, 29, 13, 19])
# plot bars in stack manner
cm = 1/2.54 # centimeters in inches
px = 1/plt.rcParams['figure.dpi'] # pixel in inches
plt.figure(figsize=(800*px,1000*px))
plt.bar(x, y1, color='r')
plt.bar(x, y2, bottom=y1, color='b')
plt.bar(x, y3, bottom=y1+y2, color='y')
plt.bar(x, y4, bottom=y1+y2+y3, color='g')
plt.xlabel("Teams")
plt.ylabel("Score")
plt.legend(["Round 1", "Round 2", "Round 3", "Round 4"])
plt.title("Scores by Teams in 4 Rounds")
plt.show()
Dimensions expected: 800px x 1000 px, dpi= 100
I attach here a screenshot from Photoshop of the exported image
Not correct dimensions!
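The Figure constructor accepts figsize as a (width, height) tuple in inches, plus a dpi argument. When dpi is omitted it falls back to rcParams['figure.dpi'] (100 by default in current Matplotlib releases, 80 in versions before 2.0). The size in pixels is figsize multiplied by dpi, so pass dpi explicitly to control it: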
from matplotlib.figure import Figure
fig = Figure(figsize=(5, 4), dpi=80)
The above is 5 inches by 4 inches at 80 dpi, which is 400 px by 320 px.
If you want 800 px by 1000 px, you can use:
fig = Figure(figsize=(8, 10), dpi=100)
Exporting an image is as simple as
fig.savefig("MatPlotLib_Graph.png", dpi = 100)
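To check the exported size without opening an image editor, the figure can be written to memory and the pixel dimensions read back from the PNG header. A minimal sketch, assuming a target of 800 px by 1000 px at 100 dpi as in the question; the bar data is only a stand-in:

```python
import io
import struct

from matplotlib.figure import Figure

dpi = 100
width_px, height_px = 800, 1000

# figsize is in inches, so divide the target pixel size by the dpi
fig = Figure(figsize=(width_px / dpi, height_px / dpi), dpi=dpi)
ax = fig.add_subplot(111)
ax.bar(["A", "B", "C", "D"], [10, 20, 10, 30])

buf = io.BytesIO()
# Avoid bbox_inches="tight" here: it crops the canvas and changes the pixel size
fig.savefig(buf, format="png", dpi=dpi)

# A PNG stores width and height as big-endian 32-bit integers at bytes 16..24
png_width, png_height = struct.unpack(">II", buf.getvalue()[16:24])
print(png_width, png_height)
```

If a saved file still differs from the expected size, two things worth checking are whether savefig was called with a different dpi than the figure's, and whether bbox_inches="tight" was used.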

Create polygons from points with GeoPandas

I have a geopandas dataframe containing a list of shapely POINT geometries. There is another column with a list of ID's that specifies which unique polygon each point belongs to. Simplified input code is:
import pandas as pd
from shapely.geometry import Point, LineString, Polygon
from geopandas import GeoDataFrame
data = [[1,10,10],[1,15,20],[1,20,10],[2,30,30],[2,35,40],[2,40,30]]
df_poly = pd.DataFrame(data, columns = ['poly_ID','lon', 'lat'])
geometry = [Point(xy) for xy in zip(df_poly.lon, df_poly.lat)]
geodf_poly = GeoDataFrame(df_poly, geometry=geometry)
geodf_poly.head()
I would like to groupby the poly_ID in order to convert the geometry from POINT to POLYGON. This output would essentially look like:
poly_ID geometry
1 POLYGON ((10 10, 15 20, 20 10))
2 POLYGON ((30 30, 35 40, 40 30))
I imagine this is quite simple, but I'm having trouble getting it to work. I found the following code that allowed me to convert it to open ended polylines, but have not been able to figure it out for polygons. Can anyone suggest how to adapt this?
geodf_poly = geodf_poly.groupby(['poly_ID'])['geometry'].apply(lambda x: LineString(x.tolist()))
Simply replacing LineString with Polygon results in TypeError: object of type 'Point' has no len()
Your request is a bit tricky to accomplish in pandas because the output you want combines the text 'POLYGON' with the numbers inside the brackets.
See whether one of the options below works for you.
from itertools import chain
df_poly.groupby('poly_ID').agg(list).apply(lambda x: tuple(chain.from_iterable(zip(x['lon'], x['lat']))), axis=1).reset_index(name='geometry')
output
poly_ID geometry
0 1 (10, 10, 15, 20, 20, 10)
1 2 (30, 30, 35, 40, 40, 30)
or
from itertools import chain
df_new =df_poly.groupby('poly_ID').agg(list).apply(lambda x: tuple(chain.from_iterable(zip(x['lon'], x['lat']))), axis=1).reset_index(name='geometry')
df_new['geometry']=df_new.apply(lambda x: 'POLYGON ('+str(x['geometry'])+')',axis=1 )
df_new
Output
poly_ID geometry
0 1 POLYGON ((10, 10, 15, 20, 20, 10))
1 2 POLYGON ((30, 30, 35, 40, 40, 30))
Note: Column geometry is a string & I am not sure you can feed this directly into Shapely
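This solution works for large data via .dissolve and .convex_hull. Note that the convex hull ignores the order of the points and discards any point that is not on the hull, so it matches the intended polygon only when the points are the vertices of a convex shape.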
>>> import pandas as pd
>>> import geopandas as gpd
>>> df = pd.DataFrame(
...     {
...         "x": [0, 1, 0.1, 0.5, 0, 0, -1, 0],
...         "y": [0, 0, 0.1, 0.5, 1, 0, 0, -1],
...         "label": ['a', 'a', 'a', 'a', 'a', 'b', 'b', 'b'],
...     }
... )
>>> gdf = gpd.GeoDataFrame(
...     df,
...     geometry=gpd.points_from_xy(df["x"], df["y"]),
... )
>>> gdf
     x    y label                   geometry
0  0.0  0.0     a    POINT (0.00000 0.00000)
1  1.0  0.0     a    POINT (1.00000 0.00000)
2  0.1  0.1     a    POINT (0.10000 0.10000)
3  0.5  0.5     a    POINT (0.50000 0.50000)
4  0.0  1.0     a    POINT (0.00000 1.00000)
5  0.0  0.0     b    POINT (0.00000 0.00000)
6 -1.0  0.0     b   POINT (-1.00000 0.00000)
7  0.0 -1.0     b   POINT (0.00000 -1.00000)
>>> res = gdf.dissolve("label").convex_hull
>>> res.to_wkt()
label
a POLYGON ((0 0, 0 1, 1 0, 0 0))
b POLYGON ((0 -1, -1 0, 0 0, 0 -1))
dtype: object
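If the points are already listed in the order in which they should be connected, as in the question, another option is to build each polygon from plain coordinate tuples per group. Passing tuples instead of Point objects avoids the TypeError reported in the question. A minimal sketch, assuming the lon/lat columns and the row order from the question's data:

```python
import pandas as pd
from geopandas import GeoDataFrame
from shapely.geometry import Polygon

data = [[1, 10, 10], [1, 15, 20], [1, 20, 10],
        [2, 30, 30], [2, 35, 40], [2, 40, 30]]
df_poly = pd.DataFrame(data, columns=["poly_ID", "lon", "lat"])

rows = []
for poly_id, group in df_poly.groupby("poly_ID"):
    # Polygon takes a sequence of (x, y) tuples; the ring is closed automatically
    coords = list(zip(group["lon"], group["lat"]))
    rows.append({"poly_ID": poly_id, "geometry": Polygon(coords)})

geodf = GeoDataFrame(rows, geometry="geometry")
print(geodf)
```

Unlike the convex hull, this keeps the vertex order of the input rows, so it also works for concave outlines as long as the rows are sorted along the boundary.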

Geopandas plots as subfigures

Say I have the following geodataframe that contains 3 polygon objects.
import geopandas as gpd
from shapely.geometry import Polygon
p1=Polygon([(0,0),(0,1),(1,1),(1,0)])
p2=Polygon([(3,3),(3,6),(6,6),(6,3)])
p3=Polygon([(3,.5),(4,2),(5,.5)])
gdf=gpd.GeoDataFrame(geometry=[p1,p2,p3])
gdf['Value1']=[1,10,20]
gdf['Value2']=[300,200,100]
gdf content:
>>> gdf
geometry Value1 Value2
0 POLYGON ((0 0, 0 1, 1 1, 1 0, 0 0)) 1 300
1 POLYGON ((3 3, 3 6, 6 6, 6 3, 3 3)) 10 200
2 POLYGON ((3 0.5, 4 2, 5 0.5, 3 0.5)) 20 100
>>>
I can make a separate figure for each plot by calling geopandas.plot() twice. However, is there a way for me to plot both of these maps next to each other in the same figure as subfigures?
Always always always create your matplotlib objects ahead of time and pass them to the plotting methods (or use them directly). Doing so, your code becomes:
from matplotlib import pyplot
import geopandas
from shapely import geometry
p1 = geometry.Polygon([(0,0),(0,1),(1,1),(1,0)])
p2 = geometry.Polygon([(3,3),(3,6),(6,6),(6,3)])
p3 = geometry.Polygon([(3,.5),(4,2),(5,.5)])
gdf = geopandas.GeoDataFrame(dict(
geometry=[p1, p2, p3],
Value1=[1, 10, 20],
Value2=[300, 200, 100],
))
fig, (ax1, ax2) = pyplot.subplots(ncols=2, sharex=True, sharey=True)
gdf.plot(ax=ax1, column='Value1')
gdf.plot(ax=ax2, column='Value2')
Which gives me:
# for plotting multiple GeoDataFrames
import geopandas as gpd
import matplotlib.pyplot as plt
gdf = gpd.read_file(geojson)  # geojson: path to your own file
fig, axes = plt.subplots(1, 4, figsize=(40, 10))
axes[0].set_title('Some Title')
gdf.plot(ax=axes[0], column='Some column for coloring', cmap='coloring option')
# use the next axes for the next plot
axes[1].set_title('Some Title')
gdf.plot(ax=axes[1], column='Some column for coloring', cmap='coloring option')