I want to select the largest region from a tuple of regions (ConnectedRegions in this case).
threshold (Image, Region, 250, 255)
connection (Region, ConnectedRegions)
* TODO: Get the largest region in ConnectedRegions
What is an elegant way to achieve this?
Updated Answer
Use select_shape_std:
select_shape_std (ConnectedRegions, MaxRegion, 'max_area', 0)
For other selection criteria, there is select_shape.
Original Answer
With three operators, you can solve the task in the example: area_center, tuple_sort_index and select_obj
threshold (Image, Region, 250, 255)
connection (Region, ConnectedRegions)
* Get the area of each region. R and C return values are not used.
area_center (ConnectedRegions, Areas, R, C)
* Get the indices to sort the areas in descending order.
tuple_sort_index (- Areas, SortIndices)
* Select the region using the first index.
* We need to add 1, because control tuples use 0-based indexing,
* while object tuples are 1-based
select_obj (ConnectedRegions, MaxRegion, SortIndices[0] + 1)
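The index arithmetic of the original answer can also be sketched outside HALCON. The Python snippet below is only an illustration with made-up area values; largest_region_index is a hypothetical helper, not a HALCON operator.

```python
def largest_region_index(areas):
    """Return the 1-based index of the largest area, as select_obj expects."""
    if not areas:
        raise ValueError("areas must not be empty")
    # Sorting 0-based indices by negated area gives descending order of area,
    # the same trick as tuple_sort_index(-Areas, SortIndices).
    sort_indices = sorted(range(len(areas)), key=lambda k: -areas[k])
    # Control tuples are 0-based, object tuples are 1-based: shift by one.
    return sort_indices[0] + 1


areas = [120, 4500, 37]  # made-up areas of three regions
print(largest_region_index(areas))
```

The second region has the largest area here, so the helper returns the object index 2.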
Related
I need to find shapes that have many branches, but with only the region_features I'm not able to make this work.
Basically, I'd need a score for a "branch factor". For example, a star would have a rather high score, since each tip would be a branch. A picture of a tree branch would have a high score, since it has many smaller branches. A sphere or a cube would have a low score, since it does not have many branches.
I have tried the ratio between area and circumference, but it's not precise enough.
Here are two samples: one that should have a high score, and one that should have a low score:
These are only samples to explain what I mean by branches. The shapes can have any form.
No, there is no built-in feature of this kind.
Maybe you can extract such a score with code like this:
* load image example
read_image(Image,'ppUXL.jpg')
* create 4 Regions
binary_threshold (Image, Region, 'max_separability', 'dark', UsedThreshold)
connection (Region, Regions)
count_obj (Regions,NumRegions)
NumBranches :=[]
* for every region in Regions
for i:=1 to NumRegions by 1
* select the region
select_obj (Regions, RegionSelected, i)
* --------------------------------------------------------------------------
* Here I want to calculate the convex hull of the region,
* i.e. the smallest convex region that contains the selected region
* https://en.wikipedia.org/wiki/Convex_hull
* --------------------------------------------------------------------------
* convex hull of the region as a polygon
get_region_convex (RegionSelected, Rows, Columns)
* transform the polygon into a region
gen_region_polygon_filled (ConvexRegion, Rows, Columns)
* To avoid merging separate parts, I erode the convex region a little
erosion_circle (ConvexRegion, RegionErosion, 1.5)
* Now I remove the selected region from its (eroded) convex region.
* In most cases the result is the space between the branches
difference (RegionErosion, RegionSelected, RegionDifference)
* --------------------------------------------------------------------------
* I separate the spaces between the branches and count them
* --------------------------------------------------------------------------
* connection
connection (RegionDifference, InsideRegions)
* I remove empty regions
select_shape (InsideRegions, InsideSelectedRegions, 'area', 'and', 1, 99999999)
* I count the regions
count_obj (InsideSelectedRegions,NumInsideRegions)
* I add the result to the array
NumBranches :=[NumBranches,NumInsideRegions]
endfor
New to PostGIS/PostgreSQL...any help would be greatly appreciated!
I have two tables in a postgres db aliased as gas and ev. I'm trying to choose a specific gas station (gas.site_id=11949) and locate all EV/alternative fuel charging stations within a 1000m radius. When I run the following though, PostGIS returns a number of ev stations that are all stacked on top of each other in the map (see screenshot).
Anyone have any idea why this is happening? How can I get PostGIS to visualize the points within a 1000m radius of the specified gas station?
with myplace as (
SELECT gas.geom
from nj_gas gas
where gas.site_id = 11949 limit 1)
select myplace.*, ev.*
from alt_fuel ev, myplace
where ST_DWithin(ev.geom1, myplace.geom, 1000)
With geometry-typed parameters, the function ST_DWithin does not measure distances in meters; it uses the units of the spatial reference system.
From the documentation:
For geometry: The distance is specified in units defined by the
spatial reference system of the geometries. For this function to make
sense, the source geometries must both be of the same coordinate
projection, having the same SRID.
So, if you want to compute distances in meters, you have to use the data type geography:
For geography units are in meters and measurement is defaulted to
use_spheroid=true, for faster check, use_spheroid=false to measure
along sphere.
That being said, you have to cast your geometries to geography. Besides that, your query looks just fine, assuming your data is correct :-)
WITH myplace as (
SELECT gas.geom
FROM nj_gas gas
WHERE gas.site_id = 11949 LIMIT 1)
SELECT myplace.*, ev.*
FROM alt_fuel ev, myplace
WHERE ST_DWithin(ev.geom1::GEOGRAPHY, myplace.geom::GEOGRAPHY, 1000)
Sample data:
CREATE TABLE t1 (id INT, geom GEOGRAPHY);
INSERT INTO t1 VALUES (1,'POINT(-4.47 54.22)');
CREATE TABLE t2 (geom GEOGRAPHY);
INSERT INTO t2 VALUES ('POINT(-4.48 54.22)'),('POINT(-4.41 54.18)');
Query
WITH j AS (
SELECT geom FROM t1 WHERE id = 1 LIMIT 1)
SELECT ST_AsText(t2.geom)
FROM j,t2 WHERE ST_DWithin(t2.geom, j.geom, 1000);
st_astext
--------------------
POINT(-4.48 54.22)
(1 row)
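To see why only one of the two sample points qualifies, the distances can be approximated by hand. The sketch below uses the haversine formula on a sphere with a mean Earth radius, which is an assumption made for illustration; PostGIS measures geography distances on the spheroid by default, so its values differ slightly. The function name haversine_m is hypothetical.

```python
import math

EARTH_RADIUS_M = 6371008.8  # mean Earth radius; spherical approximation only


def haversine_m(lon1, lat1, lon2, lat2):
    """Approximate great-circle distance in meters between two lon/lat points."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlmb = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2
    return 2 * EARTH_RADIUS_M * math.asin(math.sqrt(a))


origin = (-4.47, 54.22)                         # the point in t1
candidates = [(-4.48, 54.22), (-4.41, 54.18)]   # the points in t2

# Keep only the candidates within 1000 m of the origin.
within = [c for c in candidates if haversine_m(*origin, *c) <= 1000]
print(within)
```

The first candidate is roughly 650 m away and the second several kilometers, although both differ from the origin by only a few hundredths of a degree. That is also why a radius of 1000 applied to lon/lat geometry (1000 degrees) matches everything.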
You are cross joining those tables and have PostgreSQL return the cartesian product of both when selecting myplace.* & ev.*.
So while there is only one row in myplace, its geom will be merged with every row of alt_fuel (i.e. the result set will have all columns of both tables in every possible combination of both); since the result set thus has two geometry columns, your client application likely chooses either the first, or the one called geom (as opposed to alt_fuel.geom1) to display!
I don't see that you are interested in myplace.geom in the result set anyway, so I suggest to run
WITH
myplace as (
SELECT gas.geom
FROM nj_gas gas
WHERE gas.site_id = 11949
LIMIT 1
)
SELECT ev.*
FROM alt_fuel AS ev
JOIN myplace AS mp
ON ST_DWithin(ev.geom1, mp.geom, 1000) -- ST_DWithin(ev.geom1::GEOGRAPHY, mp.geom::GEOGRAPHY, 1000)
;
If, for some reason, you also want to display myplace.geom along with the stations, you'd have to UNION[ ALL] the above with a SELECT * on myplace; note that you will also have to provide the same column list and structure (same data types!) as alt_fuel.* (or better, the other side of the UNION[ ALL]) in that SELECT!
Note the suggestions made by #JimJones about units: if your data is not projected in a meter-based CRS (but in a geographic reference system; 'LonLat'), use the cast to GEOGRAPHY to have ST_DWithin treat the distance as meters (and calculate using spheroidal algebra instead of planar (Euclidean) algebra)!
Resolved by using:
WITH
myplace as (
SELECT geom as g
FROM nj_gas
WHERE site_id = 11949 OR site_id = 11099 OR site_id = 11679 or site_id = 480522
), myresults AS (
SELECT ev.*
FROM alt_fuel AS ev
JOIN myplace AS mp
ON ST_DWithin(ev.geom, mp.g, 0.1))
select * from myresults
Thanks so much for your help #ThingumaBob and #JimJones ! Greatly appreciate it.
Let P and Q be two finite probability distributions on integers, with support between 0 and some large integer N. The one-dimensional earth mover's distance between P and Q is the minimum cost you have to pay to transform P into Q, considering that it costs r*|n-m| to "move" a probability r associated to integer n to another integer m.
There is a simple algorithm to compute this. In pseudocode:
previous = 0
sum = 0
for i from 0 to N:
    previous = P(i) - Q(i) + previous
    sum = sum + abs(previous) // abs = absolute value
return sum
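For reference, the pseudocode above translates almost line by line into Python. This is a minimal sketch that assumes both distributions are given as equally long lists indexed by the integers 0..N; the name emd_1d is made up for the example.

```python
def emd_1d(p, q):
    """One-dimensional earth mover's distance between two dense distributions.

    p[i] and q[i] are the probabilities of integer i (assumed layout).
    """
    if len(p) != len(q):
        raise ValueError("p and q must cover the same range of integers")
    carried = 0.0  # probability mass still to be moved past position i
    total = 0.0
    for p_i, q_i in zip(p, q):
        carried += p_i - q_i
        total += abs(carried)
    return total


# Moving all mass from 0 to 2 costs 1 * |0 - 2| = 2.
print(emd_1d([1.0, 0.0, 0.0], [0.0, 0.0, 1.0]))
```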
Now, suppose you have two tables that each contain a probability distribution. Column n contains integers, and column p contains the corresponding probability. The tables are correct (all probabilities are between 0 and 1, and their sum is 1). I want to compute the earth mover's distance between these two tables in BigQuery (Standard SQL).
Is it possible? I feel like one would need to use analytical functions, but I don't have much experience with them, so I don't know how to get there.
What if N (the maximum integer) is very large, but my tables are not? Can we adapt the solution to avoid doing a computation for each integer i?
Hopefully I fully understand your problem. This seems to be what you're looking for:
WITH Aggr AS (
SELECT rp.n AS n, SUM(rp.p - rq.p)
OVER(ORDER BY rp.n ROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW) AS emd
FROM P rp
LEFT JOIN Q rq
ON rp.n = rq.n
) SELECT SUM(ABS(a.emd)) AS total_emd
FROM Aggr a;
WRT question #2, note that we only scan what's actually in tables, regardless of the N, assuming a one-to-one match for every n in P with n in Q.
I adapted Michael's answer to fix its issues, here's the solution I ended up with. Suppose the integers are stored in column i and the probability in column p. First I join the two tables, then I compute EMD(i) for all i using the window, then I sum all absolute values.
WITH
joined_table AS (
  SELECT
    IFNULL(table1.i, table2.i) AS i,
    IFNULL(table1.p, 0) AS p,
    IFNULL(table2.p, 0) AS q
  FROM table1
  FULL OUTER JOIN table2
  ON table1.i = table2.i
),
aggr AS (
  SELECT
    -- The running difference up to i stays constant until the next i,
    -- so weight it by the gap to the next integer (NULL on the last row).
    (SUM(p - q) OVER (win ROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW))
      * (LEAD(i) OVER win - i) AS emd
  FROM joined_table
  WINDOW win AS (ORDER BY i)
)
SELECT SUM(ABS(emd)) AS total_emd
FROM aggr
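As a cross-check for sparse tables, the same computation can be sketched in plain Python. The running difference between the two distributions only changes at integers that appear in one of the tables, so each running value is weighted by the gap to the next such integer. The dictionary layout and the name emd_sparse are assumptions made for this example.

```python
def emd_sparse(p, q):
    """Earth mover's distance for sparse distributions.

    p and q map an integer to its probability; missing integers count as 0.
    """
    support = sorted(set(p) | set(q))
    carried = 0.0  # running sum of p - q up to the current integer
    total = 0.0
    # Pair every support point with the next one; the last point has no
    # successor, and by then the carried mass is 0 for valid distributions.
    for current, following in zip(support, support[1:]):
        carried += p.get(current, 0.0) - q.get(current, 0.0)
        total += abs(carried) * (following - current)
    return total


# All mass moves from 0 to 1_000_000 without iterating over every integer.
print(emd_sparse({0: 1.0}, {1_000_000: 1.0}))
```

The loop runs once per distinct integer in the two tables, so its cost does not depend on N.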
In a single query, I'm trying to implement decimation on a database view (the view holds on the order of 10^7 rows). To do so, I'm trying to select every Nth element of a result set. My current implementation is correct but very slow (2-3 seconds), and I'm looking for a faster alternative.
Right now I select my result set with row numbers, compute each element's mod value on that result set, and finally select all elements where the mod value equals 0 or the row number equals the total count (to retrieve the last element). To find the modulus, I need the total size of the initial result set divided by 1000 (we decimate by 1000 as a fixed number).
SELECT * FROM (
  SELECT id, age, sex, height, weight, attendence_time,
         numbered.rn, MOD(numbered.rn - 1, CEIL(COUNT(*) OVER () / 1000)) AS person_mod
  FROM (
    -- Number the filtered rows by attendance time.
    SELECT id, age, sex, height, weight, attendence_time,
           ROW_NUMBER() OVER (ORDER BY attendence_time) AS rn
    FROM person p
    WHERE p.age = :age
  ) numbered
) MODROW WHERE MODROW.person_mod = 0 OR MODROW.rn = (
  -- The total count identifies the last row.
  SELECT COUNT(*)
  FROM person p
  WHERE p.age = :age
)
I've looked into NTILE, and although I can use it to separate my result set into 1000 groups based on the row number, I'm not sure how to get just the first element from each group to have a decimated response.
Are there any recommendations on how I might be able to make it more efficient?
Since I have a Java backend, would it be better to return the initial large result set and decimate in Java?
Note I'm using Oracle SQL with JPA/JPQL.
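For comparison, the selection rule described above (every row whose zero-based position is a multiple of the step, plus the last row) can be written in a few lines outside the database. The Python sketch below only illustrates that rule; decimate and target are hypothetical names, and it says nothing about whether transferring the full result set to the backend would be faster.

```python
import math


def decimate(rows, target=1000):
    """Keep every step-th row, plus the last row, aiming for about `target` rows."""
    n = len(rows)
    if n == 0:
        return []
    # Same step as CEIL(COUNT(*) / 1000) in the query, but never below 1.
    step = max(1, math.ceil(n / target))
    picked = [row for idx, row in enumerate(rows) if idx % step == 0]
    # Append the last row unless the modulo rule already selected it.
    if (n - 1) % step != 0:
        picked.append(rows[-1])
    return picked


print(decimate(list(range(10)), target=5))
```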
I want to find POINTCOUNT values that cut the input set ADS.PREDICTOR into equally large groups. The parameter POINTCOUNT can have a different value for different predictors, so I don't want to hard-code it.
Unfortunately the code below fails with ORA-30496: Argument should be a constant... How can I overcome this (except for 300 lines of code with hard-coded threshold quantiles, of course)?
define POINTCOUNT=300;
select
*
from (
select
percentile_disc(MYQUANTILE)
within group (
order by PREDICTOR ) as THRESHOLD
from ADS
inner join (
select (LEVEL - 1)/(&POINTCOUNT-1) as MYQUANTILE
from dual
connect by LEVEL <= &POINTCOUNT
)
on 1=1
)
group by THRESHOLD
I want to draw a ROC curve. The curve will be plotted in Excel as a linear interpolation between pairs of points (X, Y) calculated in Oracle.
Each point (X, Y) is calculated using a threshold value.
I will get the best approximation of the ROC curve for a given number of pairs of points if the distance between each adjacent pair of (X, Y) is uniform.
If I cut the domain of the predicted values into N values that separate 1/Nth quantiles, I should get a fairly good set of threshold values.
PERCENTILE_DISC() only requires that the percentile value be constant within each group. Your subquery has no GROUP BY, so I think this might fix your problem:
select MYQUANTILE,
percentile_disc(MYQUANTILE) within group (order by PREDICTOR
) as THRESHOLD
from ADS cross join
(select (LEVEL - 1)/(&POINTCOUNT-1) as MYQUANTILE
from dual
connect by LEVEL <= &POINTCOUNT
)
GROUP BY MYQUANTILE;
Also, note that CROSS JOIN is the same as INNER JOIN . . . ON 1=1.
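To make the intent of the query concrete, here is a small Python sketch of the same idea: evaluate a discrete percentile at the quantiles (LEVEL - 1)/(POINTCOUNT - 1) and collect the distinct thresholds. The percentile rule below (smallest value whose cumulative share reaches the fraction) follows my understanding of PERCENTILE_DISC and should be checked against Oracle on real data; all function names are made up.

```python
import math
from fractions import Fraction


def percentile_disc(ordered, fraction):
    """Smallest value whose cumulative share k / n is >= fraction."""
    n = len(ordered)
    k = max(1, math.ceil(fraction * n))  # 1-based rank; fraction 0 maps to rank 1
    return ordered[k - 1]


def thresholds(values, point_count):
    """Distinct thresholds at quantiles 0, 1/(point_count-1), ..., 1."""
    if point_count < 2:
        raise ValueError("point_count must be at least 2")
    if not values:
        raise ValueError("values must not be empty")
    ordered = sorted(values)
    # Exact fractions avoid floating-point rounding when taking the ceiling.
    quantiles = [Fraction(level, point_count - 1) for level in range(point_count)]
    return sorted({percentile_disc(ordered, quantile) for quantile in quantiles})


print(thresholds(range(1, 11), point_count=5))
```

Collecting the results in a set plays the role of the GROUP BY THRESHOLD in the question: quantiles that fall on the same value yield a single threshold.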