Need polygon intersect counts with buffered polygon neighbors FOR EACH polygon - sql

I'm trying to find out whether it's possible in pure SQL to generate a table with the number of intersections each polygon in a layer has with the neighboring (buffered) polygons in a buffered version of the layer.
A rough and flawed version is the following:
For each value in list:
SELECT
Count(*)
INTO
intersectcounts
FROM
parcels,parcelsbuffered
WHERE
parcels.apn = value AND ST_INTERSECT(parcels.geom,parcelsbuffered.geom)
Here the geom is the polygon
I need a result like this (intersectcounts table):
APN   COUNT
100   3
101   87
...   ...
I could use a Python loop and modify the query string with a different value in the WHERE clause each time, but I don't think this would perform well: there are thousands of parcels (polygons).

SELECT parcels.apn, count(*) as intersectcounts
FROM parcels
JOIN parcelsbuffered
ON ST_Intersects(parcels.geom, parcelsbuffered.geom)
GROUP BY parcels.apn
You will probably want to exclude each parcel's intersection with its own buffered version, either by subtracting it from the count:
(count(*) - 1) as intersectcounts
or by filtering it out:
WHERE parcels.apn <> parcelsbuffered.apn
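If you want to sanity-check the join-and-group logic without a spatial database at hand, here is a minimal Python sketch using the standard-library sqlite3 module. It is only an illustration under loud assumptions: axis-aligned rectangles stand in for the polygons, a rectangle overlap test stands in for the spatial intersection predicate, and the sample rows and the 0.5 buffer distance are made up. It uses a LEFT JOIN (unlike the query above) so that parcels with no neighbors show up with a count of 0.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE parcels (apn INTEGER PRIMARY KEY, xmin REAL, ymin REAL, xmax REAL, ymax REAL);
    CREATE TABLE parcelsbuffered (apn INTEGER PRIMARY KEY, xmin REAL, ymin REAL, xmax REAL, ymax REAL);
""")

# Hypothetical sample data: two squares close together, a third one far away.
parcels = [(100, 0, 0, 1, 1), (101, 1.2, 0, 2.2, 1), (102, 10, 0, 11, 1)]
buffer_dist = 0.5  # made-up buffer distance
conn.executemany("INSERT INTO parcels VALUES (?, ?, ?, ?, ?)", parcels)
conn.executemany(
    "INSERT INTO parcelsbuffered VALUES (?, ?, ?, ?, ?)",
    [(apn, x0 - buffer_dist, y0 - buffer_dist, x1 + buffer_dist, y1 + buffer_dist)
     for apn, x0, y0, x1, y1 in parcels],
)

# One query for all parcels: join on the overlap test and skip each
# parcel's own buffer. COUNT(b.apn) ignores the NULLs from unmatched rows.
rows = conn.execute("""
    SELECT p.apn, COUNT(b.apn) AS intersectcounts
    FROM parcels AS p
    LEFT JOIN parcelsbuffered AS b
      ON p.apn <> b.apn
     AND p.xmin <= b.xmax AND b.xmin <= p.xmax
     AND p.ymin <= b.ymax AND b.ymin <= p.ymax
    GROUP BY p.apn
    ORDER BY p.apn
""").fetchall()
print(rows)
```

The point is that a single set-based query replaces the per-parcel loop; with real geometries the overlap conditions would be replaced by the spatial predicate.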

Related

function to sum all first value of Results SQL

I have a table with the columns "Number", "Name" and "Result". Result is a 2D text array, and I need to create a column named "Average" that sums the first value of each inner array in Result and divides by 2. Can somebody help me, please? I must use CREATE FUNCTION for this. It looks like this:
Table1
Number  Name   Result                         Average
01      Kevin  {{2.0,10},{3.0,50}}            2.5
02      Max    {{1.0,10},{4.0,30},{5.0,20}}   5.0
Average (Kevin) = ((2.0+3.0)/2) = 2.5
Average (Max) = ((1.0+4.0+5.0)/2) = 5.0
First of all: You should avoid storing arrays in a table (or generating them in a subquery) unless it is absolutely necessary. Normalize the data instead; it makes life much easier in nearly every use case.
Second: You should avoid multi-dimensional arrays. They are very hard to handle. See Unnest array by one level
However, in your special case you could do something like this:
SELECT
number,
name,
SUM(value::numeric) FILTER (WHERE idx % 2 = 1) / 2 AS average -- 2
FROM mytable,
unnest(result) WITH ORDINALITY AS elements(value, idx) -- 1
GROUP BY number, name
1. unnest() expands the array elements into one element per record. But this is not a one-level expansion: it expands ALL elements, at every depth. To keep track of your elements, you can add an index using WITH ORDINALITY.
2. Because you have nested two-element arrays, the unnested data can be used as follows: you want to sum the first of every two elements, which means every element at an odd index. Using the FILTER clause in the aggregation lets you aggregate exactly these elements.
However: If the array was the result of a subquery, you should think about doing the operation BEFORE the array aggregation (if the aggregation is really necessary at all). This makes things easier.
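To make the index trick concrete, here is a small Python sketch that mimics what the query does: flatten everything, number the elements from 1, and keep only the odd positions. The function name is made up for illustration, and the values are strings because the question says the array is a text array.

```python
def average_of_first_values(result):
    # Flatten every level, the way unnest() does on a multi-dimensional array.
    flat = [value for pair in result for value in pair]
    # Number the elements starting at 1, the way WITH ORDINALITY does.
    numbered = enumerate(flat, start=1)
    # Odd positions are the first element of each pair: sum them, divide by 2.
    return sum(float(value) for idx, value in numbered if idx % 2 == 1) / 2


print(average_of_first_values([["2.0", "10"], ["3.0", "50"]]))
print(average_of_first_values([["1.0", "10"], ["4.0", "30"], ["5.0", "20"]]))
```

This only works because every inner array has exactly two elements; with a different inner length the modulus would have to change accordingly.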
Assumptions:
number column is Primary key.
result column is text or varchar type
Here are the steps for your requirements:
Add the column to your table using the following query (you can skip this step if the column already exists):
alter table table1 add column average decimal;
Update the calculated value using the query below:
update table1 t1
set average = t2.value_
from
(
select
number,
sum(t::decimal)/2 as value_
from table1
cross join lateral unnest((result::text[][])[1:999][1]) as t
group by 1
) t2
where t1.number=t2.number
Explanation: Here unnest((result::text[][])[1:999][1]) returns the first value of each child array, assuming the 2D array has at most 999 child arrays. You can increase or decrease that bound as required.
Now you can create your function as per your requirement with above query.

How to find the points that did not intersect with STIntersect using SQL Server

I ran STIntersects on two tables to find the points that intersect a given polygon. I have converted all the tables so that every row has a geometry. I am having trouble writing the query for the opposite case: I am trying to find the points that did not intersect.
These are my two tables:
PO_Database = contains the points
POLY_Database = Polygon of interest
This is my script:
SELECT PO.GEOM
FROM [dbo].[PO_Database] as PO
JOIN [dbo].[POLY_Database] as p ON PO.GEOM.STIntersects(p.NEATCELL) = 1
I tried changing the value from 1 to 0, but then I get repeated geometry values when the query runs. How do I write the query so that it gives me the names of the points that did not intersect the polygon? Also, is there a way to check whether the intersections were done right?
If you get repeating values, you probably have multiple rows in the POLY_Database table. If you want to find the points that do not intersect any of those polygons, try this query:
SELECT PO.GEOM
FROM [dbo].[PO_Database] as PO
WHERE NOT EXISTS (
SELECT * FROM [dbo].[POLY_Database] as p
WHERE PO.GEOM.STIntersects(p.NEATCELL) = 1
)
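To see why the join with "= 0" repeats rows while NOT EXISTS does not, here is a toy Python sketch using sqlite3. Everything in it is an assumption made for illustration: the points and polys tables, the sample rows, and a simple point-in-rectangle test standing in for STIntersects.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE points (name TEXT, x REAL, y REAL);
    CREATE TABLE polys (id INTEGER, xmin REAL, ymin REAL, xmax REAL, ymax REAL);
    INSERT INTO points VALUES ('inside', 1, 1), ('partial', 2.5, 2.5), ('outside', 9, 9);
    INSERT INTO polys VALUES (1, 0, 0, 2, 2), (2, 0, 0, 3, 3);
""")

# Stand-in for the spatial predicate: is the point inside the rectangle?
inside = "pt.x BETWEEN p.xmin AND p.xmax AND pt.y BETWEEN p.ymin AND p.ymax"

# Joining on "does not intersect" yields one row per (point, polygon) miss,
# so 'outside' appears twice and 'partial' appears even though it hits polygon 2.
joined = conn.execute(
    f"SELECT pt.name FROM points AS pt JOIN polys AS p ON NOT ({inside}) "
    "ORDER BY pt.name"
).fetchall()

# NOT EXISTS returns each point at most once, and only if it misses every polygon.
anti = conn.execute(
    f"SELECT pt.name FROM points AS pt "
    f"WHERE NOT EXISTS (SELECT 1 FROM polys AS p WHERE {inside}) "
    "ORDER BY pt.name"
).fetchall()

print(joined)
print(anti)
```

Comparing the two result lists on a handful of known points like this is also a cheap way to check that the intersections are behaving as expected.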

Finding the pair of points whose distance from each other is maximal

I have a very small database that contains 6 points, with the columns id, the_geom and descr. My aim is to write a PL/pgSQL function that finds the pair of points whose distance from each other is maximal. As output, I would like to show the id or descr of the two points and the distance between them.
I have tried writing a function with RETURNS TABLE, but would SETOF text be a better solution?
You could use a cross join to generate all combinations and then order by the distance. If your table were named foo, something like:
SELECT set1.id, set2.id, ST_Distance(set1.the_geom, set2.the_geom) AS distance -- assumes the_geom is a PostGIS geometry; you may want the earthdistance extension here instead
FROM foo set1, foo set2
WHERE set1.id != set2.id
ORDER BY 3 DESC;
And if you need earth distance to calculate the distance itself - http://www.postgresql.org/docs/9.3/static/earthdistance.html
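The "all combinations, then take the largest" idea is easy to try outside the database. Here is a minimal Python sketch; the six sample points are made up, plain Euclidean distance is assumed, and farthest_pair is just an illustrative name.

```python
from itertools import combinations
from math import dist

# Hypothetical sample data: id -> (x, y)
points = {
    1: (0.0, 0.0), 2: (3.0, 4.0), 3: (1.0, 1.0),
    4: (-2.0, 0.5), 5: (6.0, 8.0), 6: (2.0, -1.0),
}


def farthest_pair(pts):
    # combinations() yields each unordered pair exactly once, so there is
    # no need to filter out (a, a) or the mirrored (b, a) rows.
    return max(
        ((a, b, dist(pts[a], pts[b])) for a, b in combinations(pts, 2)),
        key=lambda triple: triple[2],
    )


print(farthest_pair(points))
```

With only 6 points the quadratic number of pairs is irrelevant; for large tables the cross join gets expensive quickly.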

SQL Cross Apply Performance Issues

My database has a directory of about 2,000 locations scattered throughout the United States with zipcode information (which I have tied to lon/lat coordinates).
I also have a table function which takes two parameters (ZipCode & Miles) to return a list of neighboring zip codes (excluding the same zip code searched)
For each location I am trying to get the neighboring location ids. So if location #4 has three nearby locations, the output should look like:
4 5
4 24
4 137
That is, locations 5, 24, and 137 are within X miles of location 4.
I originally tried to use a cross apply with my function as follows:
SELECT A.SL_STORENUM,A.Sl_Zip,Q.SL_STORENUM FROM tbl_store_locations AS A
CROSS APPLY (SELECT SL_StoreNum FROM tbl_store_locations WHERE SL_Zip in (select zipnum from udf_GetLongLatDist(A.Sl_Zip,7))) AS Q
WHERE A.SL_StoreNum='04'
However, that ran for over 20 minutes with no results, so I canceled it. When I hardcoded the zip code, it immediately returned a list:
SELECT A.SL_STORENUM,A.Sl_Zip,Q.SL_STORENUM FROM tbl_store_locations AS A
CROSS APPLY (SELECT SL_StoreNum FROM tbl_store_locations WHERE SL_Zip in (select zipnum from udf_GetLongLatDist('12345',7))) AS Q
WHERE A.SL_StoreNum='04'
What is the most efficient way of accomplishing this listing of nearby locations? Keeping in mind while I used "04" as an example here, I want to run the analysis for 2,000 locations.
The "udf_GetLongLatDist" is a function which uses some math to calculate distance between two geographic coordinates and returns a list of zipcodes with a distance of > 0. Nothing fancy within it.
When you use the function, you probably have to calculate every single possible distance for each row. That is why it takes so long. Since the actual physical locations don't generally move, what we always did was precalculate the distance from each zip code to every other zip code (and update it only once a month or so, when we added new zip codes). Once the distances are precalculated, all you have to do is run a query like
select zip2 from zipprecalc where zip1 = '12345' and distance <=10
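A rough sketch of how such a precalculated table could be built, in Python with sqlite3. The zip codes and coordinates below are approximate sample values, the haversine_miles helper is my own, and the table layout (zip1, zip2, distance) is only inferred from the query above.

```python
import math
import sqlite3


def haversine_miles(lat1, lon1, lat2, lon2):
    # Great-circle distance on a sphere with the mean Earth radius in miles.
    radius = 3958.8
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlmb = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2
    return 2 * radius * math.asin(math.sqrt(a))


# Hypothetical zip centroids: zip -> (latitude, longitude)
zips = {
    "10001": (40.7506, -73.9972),
    "10002": (40.7170, -73.9870),
    "90210": (34.0901, -118.4065),
}

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE zipprecalc (zip1 TEXT, zip2 TEXT, distance REAL, PRIMARY KEY (zip1, zip2))"
)
# Pay the all-pairs cost once, up front, instead of on every lookup.
conn.executemany(
    "INSERT INTO zipprecalc VALUES (?, ?, ?)",
    [(a, b, haversine_miles(*zips[a], *zips[b])) for a in zips for b in zips if a != b],
)

nearby = [row[0] for row in conn.execute(
    "SELECT zip2 FROM zipprecalc WHERE zip1 = ? AND distance <= 10 ORDER BY zip2",
    ("10001",),
)]
print(nearby)
```

The table grows with the square of the number of zip codes, so in practice you might only store pairs below some maximum distance you will ever query.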
We have something similar and optimized it by only calculating the distance of other zipcodes whose latitude is within a bounded range. So if you want other zips within #miles, you use a
where latitude >= #targetLat - (#miles/69.2) and latitude <= #targetLat + (#miles/69.2)
Then you are only calculating the great circle distance of a much smaller subset of other zip code rows. We found this fast enough in our use to not require precalculating.
The same thing can't be done for longitude because of the variation between equator and pole of what distance a degree of longitude represents.
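Here is a small Python sketch of that latitude-band idea, in case it helps. The 69.2 miles per degree figure is the one used above; the sample coordinates, the haversine_miles helper and the zips_within name are made up for illustration.

```python
import math

MILES_PER_DEGREE_LAT = 69.2  # figure used in the filter above


def haversine_miles(lat1, lon1, lat2, lon2):
    # Great-circle distance on a sphere with the mean Earth radius in miles.
    radius = 3958.8
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlmb = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2
    return 2 * radius * math.asin(math.sqrt(a))


def zips_within(target, miles, zips):
    tlat, tlon = zips[target]
    band = miles / MILES_PER_DEGREE_LAT
    # Cheap pre-filter: keep only rows whose latitude falls inside the band.
    candidates = {
        z: (lat, lon) for z, (lat, lon) in zips.items()
        if z != target and tlat - band <= lat <= tlat + band
    }
    # Expensive great-circle check on the much smaller candidate set.
    return sorted(
        z for z, (lat, lon) in candidates.items()
        if haversine_miles(tlat, tlon, lat, lon) <= miles
    )


# Hypothetical data: B is close, C shares the latitude band but is far east,
# D is outside the latitude band altogether.
zips = {"A": (40.0, -75.0), "B": (40.05, -75.0), "C": (40.05, -70.0), "D": (41.0, -75.0)}
print(zips_within("A", 7, zips))
```

In SQL the same pre-filter benefits from an index on the latitude column, which is what makes the band check cheap.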
Other answers here involve re-working the algorithm. I personally advise the pre-calculated map of all zipcodes against each other. It should be possible to embed such optimisations in your existing udf, to minimise code-changes.
A refactoring of the query, however, could be as follows...
SELECT
A.SL_STORENUM, A.Sl_Zip, C.SL_STORENUM
FROM
tbl_store_locations AS A
CROSS APPLY
dbo.udf_GetLongLatDist(A.Sl_Zip,7) AS B
INNER JOIN
tbl_store_locations AS C
ON C.SL_Zip = B.zipnum
WHERE
A.SL_StoreNum='04'
Also, the performance of the CROSS APPLY will benefit greatly if you can ensure that the udf is INLINE rather than MULTI-STATEMENT. This allows the udf to be expanded inline (macro like) for a much cleaner execution plan.
Doing so would also allow you to return additional fields from the udf. The optimiser can then include or exclude those fields from the plan depending on whether you actually use them. Such an example would be to include the SL_StoreNum if it's easily accessible from the query in the udf, and so remove the need for the last join...

Select pair of rows that obey a rule

I have a big table (1M rows) with the following columns:
source, dest, distance.
Each row defines a link (from A to B).
I need to find the distance between a pair of nodes using another node.
An example:
If I want to find the distance between A and B,
and I find a node x such that:
x -> A
x -> B
I can add these distances to get the distance between A and B.
My question:
How can I find all the nodes (such as x) and get their distances to (A and B)?
My purpose is to select the min value of distance.
P.s: A and B are just one connection (I need to do it for 100K connections).
Thanks !
As Andomar said, you'll need Dijkstra's algorithm. Here's a link to that algorithm in T-SQL: T-SQL Dijkstra's Algorithm
Assuming you want to get the path from A to B with many intermediate steps, it is impossible to do it in plain SQL for an indefinite number of steps. Simply put, it lacks the expressive power; see http://en.wikipedia.org/wiki/Expressive_power#Expressive_power_in_database_theory . As Andomar said, load the data into a process and use Dijkstra's algorithm.
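For reference, a minimal sketch of Dijkstra's algorithm in Python over rows shaped like the question's table (source, dest, distance). The sample links are made up, links are treated as directed, and distances are assumed to be non-negative.

```python
import heapq


def dijkstra(links, source, target):
    # Build an adjacency list from (source, dest, distance) rows.
    graph = {}
    for src, dest, distance in links:
        graph.setdefault(src, []).append((dest, distance))

    best = {source: 0}
    queue = [(0, source)]
    while queue:
        dist_so_far, node = heapq.heappop(queue)
        if node == target:
            return dist_so_far
        if dist_so_far > best.get(node, float("inf")):
            continue  # stale queue entry, a shorter route was already found
        for neighbour, weight in graph.get(node, []):
            candidate = dist_so_far + weight
            if candidate < best.get(neighbour, float("inf")):
                best[neighbour] = candidate
                heapq.heappush(queue, (candidate, neighbour))
    return None  # target is not reachable from source


# Hypothetical links
links = [("A", "x", 2), ("x", "B", 3), ("A", "y", 1), ("y", "B", 10), ("A", "B", 7)]
print(dijkstra(links, "A", "B"))
```

For 100K pairs you would build the graph once and reuse it for every lookup rather than rebuilding it per call.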
This sounds like the traveling salesman problem.
From a SQL syntax standpoint: CONNECT BY PRIOR would build the tree you're after, using START WITH, and you can limit the number of levels it traverses; however, doing so will not guarantee the minimum.
I may get downvoted for this, but I find this an interesting problem. I wish that this could be a more open discussion, as I think I could learn a lot from this.
It seems like it should be possible to achieve this by doing multiple select statements - something like SELECT id FROM mytable WHERE source="A" ORDER BY distance ASC LIMIT 1. Wrapping something like this in a while loop, and replacing "A" with an id variable, would do the trick, no?
For example (A is source, B is final destination):
DECLARE var_id as INT
WHILE var_id != 'B'
BEGIN
SELECT id INTO var_id FROM mytable WHERE source="A" ORDER BY distance ASC LIMIT 1
SELECT var_id
END
Wouldn't something like this work? (The code is sloppy, but the idea seems sound.) Comments are more than welcome.
Join the table to itself with destination joined to source. Add the distance from the two links. Insert that as a new link with left side source, right side destination and total distance if that isn't already in the table. If that is in the table but with a shorter total distance then update the existing row with the shorter distance.
Repeat this until you get no new links added to the table and no updates with a shorter distance. Your table now contains a link for every possible combination of source and destination with the minimum distance between them. It would be interesting to see how many repetitions this would take.
This will not track the intermediate path between source and destination but only provides the shortest distance.
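Here is a rough Python sketch of that repeat-until-stable idea, with a dict standing in for the table. The sample links and the close_links name are made up, links are treated as directed, and distances are assumed to be positive.

```python
def close_links(links):
    # links: {(source, dest): distance}; work on a copy.
    best = dict(links)
    passes = 0
    changed = True
    while changed:
        changed = False
        passes += 1
        # "Join the table to itself": pair every link a->mid with every mid->b.
        for (a, mid1), d1 in list(best.items()):
            for (mid2, b), d2 in list(best.items()):
                if mid1 == mid2 and a != b:
                    total = d1 + d2
                    # Insert a new link, or update an existing longer one.
                    if total < best.get((a, b), float("inf")):
                        best[(a, b)] = total
                        changed = True
    return best, passes


# Hypothetical links
links = {("A", "x"): 2, ("x", "B"): 3, ("A", "B"): 7, ("B", "C"): 1}
best, passes = close_links(links)
print(best[("A", "B")], best[("A", "C")], passes)
```

Each pass is a full self-join, so on a 1M-row table the number of passes and the growth of the table are exactly the things to watch.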
IIUC this should do, but I'm not sure whether it is really viable performance-wise, given the large number of rows involved and the self-join:
SELECT
t1.source AS A,
t1.dest AS x,
t2.dest AS B,
t1.distance + t2.distance AS total_distance
FROM
big_table AS t1
INNER JOIN
big_table AS t2 ON t1.dest = t2.source
WHERE
t1.source = 'insert source (A) here' AND
t2.dest = 'insert destination (B) here'
ORDER BY
total_distance ASC
LIMIT
1
The above snippet works when you have two rows of the form A->x and x->B, but not for other combinations (e.g. A->x and B->x). Extending it to cover all four combinations should be straightforward (e.g. create a view that duplicates each row and swaps source and dest).
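A toy version of the two-hop lookup with the "duplicate and swap" view can be tried in Python with sqlite3. The sample rows and the view name links are made up, and the column names follow the question (source, dest, distance).

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE big_table (source TEXT, dest TEXT, distance REAL);
    INSERT INTO big_table VALUES
        ('x', 'A', 2), ('x', 'B', 3),
        ('y', 'A', 1), ('B', 'y', 10),
        ('A', 'z', 4);
    -- Every link listed in both directions, so all four combinations match.
    CREATE VIEW links AS
        SELECT source, dest, distance FROM big_table
        UNION ALL
        SELECT dest, source, distance FROM big_table;
""")

# Cheapest route from A to B through exactly one intermediate node.
row = conn.execute("""
    SELECT t1.dest AS via, t1.distance + t2.distance AS total_distance
    FROM links AS t1
    JOIN links AS t2 ON t1.dest = t2.source
    WHERE t1.source = ? AND t2.dest = ?
    ORDER BY total_distance ASC
    LIMIT 1
""", ("A", "B")).fetchone()
print(row)
```

On the real table, indexes on source and on dest matter a lot here, because the view doubles the number of rows the join has to consider.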