Finding the pair of points whose distance from each other is maximal - sql

I have a very small database containing 6 points, with the columns id, the_geom and descr. My aim is to write a PL/pgSQL function that finds the pair of points whose distance from each other is maximal. As output, I would like to show the id or descr of the two points and the distance between them.
I have tried writing a function with RETURNS TABLE, but would SETOF text be a better solution?

You may try a cross join to generate all combinations, then order by the distance. If your table is named foo, something similar to:
SELECT set1.id, set2.id,
       ST_Distance(set1.the_geom, set2.the_geom) AS distance -- assumes the_geom is a PostGIS geometry; you may want the earthdistance extension instead
FROM foo set1, foo set2
WHERE set1.id < set2.id -- list each pair only once
ORDER BY distance DESC
LIMIT 1; -- keep only the pair that is farthest apart
And if you need earth distance to calculate the distance itself - http://www.postgresql.org/docs/9.3/static/earthdistance.html
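The self-join idea can be tried out without a PostgreSQL server. The following is only a minimal sketch under assumptions: it uses Python's built-in sqlite3 module, stores each point as plain x and y columns instead of the_geom, and uses made-up sample coordinates. It compares squared planar distances in SQL and takes the square root afterwards.

```python
import math
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical stand-in for the real table: x/y columns replace the_geom.
conn.execute("CREATE TABLE foo (id INTEGER PRIMARY KEY, descr TEXT, x REAL, y REAL)")
conn.executemany(
    "INSERT INTO foo VALUES (?, ?, ?, ?)",
    [(1, "a", 0, 0), (2, "b", 3, 4), (3, "c", 1, 1), (4, "d", -3, -4)],
)

# Join the table to itself with p.id < q.id so every unordered pair
# appears exactly once, then keep the pair with the largest squared distance.
row = conn.execute(
    """
    SELECT p.id, q.id,
           (p.x - q.x) * (p.x - q.x) + (p.y - q.y) * (p.y - q.y) AS d2
    FROM foo AS p
    JOIN foo AS q ON p.id < q.id
    ORDER BY d2 DESC
    LIMIT 1
    """
).fetchone()

farthest_pair = (row[0], row[1])
max_distance = math.sqrt(row[2])
print(farthest_pair, max_distance)
```

With only 6 points the quadratic number of pairs is not a concern; for large tables this brute-force join would become expensive.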

Related

Fastest way to filter latitude and longitude from a sql table?

I have a huge table; sample data is shown below. I want to filter a few latitude and longitude records from it, and I am using an IN clause with a list of lat/lon values, but the query takes more than a minute to run. What would be a faster query? The list of lat/lon values has around 120-150 entries.
id longitude latitude
--------------------------
190 -0.410123 51.88409
191 -0.413256 51.84567
Query:
SELECT DISTINCT id, longitude, latitude
FROM geo_table
WHERE ROUND(longitude::numeric, 3) IN (-0.418, -0.417, -0.417, -0.416 and so on )
AND ROUND(latitude::numeric, 3) IN (51.884, 51.884, 51.883, 51.883 and so on);
If at least one of the ranges of values in X or Y is tight you can try prefiltering rows. For example, if X (longitude) values are all close together you could try:
SELECT distinct id,longitude,latitude
from (
select *
FROM geo_table
where longitude between -0.418 and -0.416 -- prefilter with index scan
and latitude between 51.883 and 51.884 -- prefilter with index filter
) x
-- now the re-check logic for exact filtering
where ROUND(longitude::numeric,3) in (-0.418, -0.417, -0.417, -0.416, ...)
and ROUND(latitude::numeric,3) in (51.884, 51.884, 51.883, 51.883, ...)
You would need an index with the form:
create index ix1 on geo_table (longitude, latitude);
First, searching for a list of latitudes and a separate list of longitudes is likely wrong if you are looking for point locations:
point: lat;long
----------------
Point A: 1;10
Point B: 2;10
Point C: 1;20
Point D: 2;20
--> If you search for latitude in (1, 2) and longitude in (10, 20), the query returns all 4 points, whereas if you search for (latitude, longitude) in ((1, 10), (2, 20)), the query returns only points A and D.
Then, since you are looking for rounded values, you must index the rounded values (the cast to numeric is needed if the columns are double precision, as the query in the question suggests):
create index latlong_rdn on geo_table (round(longitude::numeric,3),round(latitude::numeric,3));
and the query should use the exact same expression:
select *
from geo_table
where (round(longitude::numeric,3),round(latitude::numeric,3)) in
(
(-0.413,51.846),
(-0.410,51.890)
);
But here again rounding is not necessarily the best approach when dealing with locations. You may want to have a look at the PostGIS extension, to save the points as geography, add a spatial index, and to search for points within a distance (st_dwithin()) of the input locations.
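The difference between two separate IN lists and one list of (latitude, longitude) pairs can be checked with the four example points. This is a small sketch only, run against SQLite through Python's sqlite3 module rather than PostgreSQL; the table name pts is an assumption made for the illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical table holding the four example points A-D.
conn.execute("CREATE TABLE pts (name TEXT, lat INTEGER, lon INTEGER)")
conn.executemany(
    "INSERT INTO pts VALUES (?, ?, ?)",
    [("A", 1, 10), ("B", 2, 10), ("C", 1, 20), ("D", 2, 20)],
)

# Two independent lists: every combination of a listed lat and a listed lon matches.
separate_lists = [r[0] for r in conn.execute(
    "SELECT name FROM pts WHERE lat IN (1, 2) AND lon IN (10, 20) ORDER BY name"
)]

# One list of pairs: only the exact (lat, lon) combinations match.
paired = [r[0] for r in conn.execute(
    "SELECT name FROM pts WHERE (lat, lon) IN (VALUES (1, 10), (2, 20)) ORDER BY name"
)]

print(separate_lists)
print(paired)
```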

function to sum all first value of Results SQL

I have a table with the columns "Number", "Name" and "Result". Result is a 2D text array, and I need to create a column named "Average" that sums all first values of the Result array and divides by 2. Can somebody help me, please? I must use CREATE FUNCTION for this. It looks like this:
Table1
 Number | Name  | Result                       | Average
--------+-------+------------------------------+---------
 01     | Kevin | {{2.0,10},{3.0,50}}          | 2.5
 02     | Max   | {{1.0,10},{4.0,30},{5.0,20}} | 5.0
Average = ((2.0+3.0)/2) = 2.5
= ((1.0+4.0+5.0)/2) = 5.0
First of all: You should avoid storing arrays in a table (or generating them in a subquery) unless it is really necessary. Normalize the data; it makes life much easier in nearly every use case.
Second: You should avoid multi-dimensional arrays. They are very hard to handle. See Unnest array by one level
However, in your special case you could do something like this:
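SELECT
    number,
    name,
    SUM(value::numeric) FILTER (WHERE idx % 2 = 1) / 2 AS average -- step 2: sum only the odd positions, i.e. the first element of each pair
FROM mytable,
    unnest(avg_result) WITH ORDINALITY AS elements(value, idx) -- step 1: flatten the array and number its elements
GROUP BY number, name;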
Step 1: unnest() expands the array elements into one element per record. But this is not a one-level expansion: it expands ALL elements, at every depth. To keep track of your elements, you can add an index using WITH ORDINALITY.
Step 2: Because you have nested two-element arrays, the unnested data can be used as follows: you want to sum the first of every two elements, which are the elements at odd positions. Using the FILTER clause in the aggregation lets you aggregate exactly these elements.
However: if the array was the result of a subquery, you should think about doing the operation BEFORE the array aggregation (if the aggregation is really necessary). This makes things easier.
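To see why the odd positions are the first values of each pair, the flattening can be imitated outside the database. The sketch below is an illustration in Python, not PostgreSQL code: the helper names are made up, and the parsing only handles simple numeric array literals like the ones in the question.

```python
from decimal import Decimal


def unnest_with_ordinality(array_literal):
    """Flatten an array literal such as '{{2.0,10},{3.0,50}}' into
    (value, idx) pairs, with idx starting at 1 as WITH ORDINALITY does."""
    flat = array_literal.replace("{", "").replace("}", "")
    return [(Decimal(v), i) for i, v in enumerate(flat.split(","), start=1)]


def average_of_first_values(array_literal):
    """Sum the elements at odd positions (the first value of each pair)
    and divide by 2, as the question defines 'Average'."""
    pairs = unnest_with_ordinality(array_literal)
    total = sum(value for value, idx in pairs if idx % 2 == 1)
    return total / 2


kevin = average_of_first_values("{{2.0,10},{3.0,50}}")
max_ = average_of_first_values("{{1.0,10},{4.0,30},{5.0,20}}")
print(kevin, max_)
```

The two calls reproduce the values 2.5 and 5.0 given in the question's table.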
Assumptions:
number column is Primary key.
result column is text or varchar type
Here are the steps for your requirements:
Add the column in your table using following query (you can skip this step if column is already added)
alter table table1 add column average decimal;
Update the calculated value by using below query:
update table1 t1
set average = t2.value_
from
(
select
number,
sum(t::decimal)/2 as value_
from table1
cross join lateral unnest((result::text[][])[1:999][1]) as t
group by 1
) t2
where t1.number=t2.number
Explanation: Here unnest((result::text[][])[1:999][1]) returns the first value of each child array (assuming at most 999 child arrays in your 2D array; you can increase or decrease this limit as required).
Now you can create your function as per your requirement with above query.

How to get the intersection length of touching geometries with ST_Touches

I am trying to develop a query in Postgis, where it can solve this problem:
I have a geometry, and I want to know which of the polygons that touch it shares the longest contact boundary with it. Once I have identified this polygon, I will take its value from a specific column and write that value into the same column of my geometry.
Does anyone know how I can do that? I am a new user of PostgreSQL/PostGIS.
As pointed out by @JGH in the comments, the overlapping area will be zero if you use ST_Touches alone. What you can do is filter for only the geometries that touch your reference geometry, use ST_Intersection to get the shared boundary, and finally calculate its length with ST_Length.
Data Sample
The sample geometries are defined inside the CTE:
WITH j (id,geom) AS (
VALUES
(1,'POLYGON((-4.64 54.19,-4.59 54.19,-4.59 54.17,-4.64 54.17,-4.64 54.19))'),
(2,'POLYGON((-4.59 54.19,-4.56 54.19,-4.56 54.17,-4.59 54.17,-4.59 54.19))'),
(3,'LINESTRING(-4.65 54.19,-4.57 54.21)'),
(4,'POLYGON((-4.66 54.21,-4.60 54.21,-4.60 54.20,-4.66 54.20,-4.66 54.21))'),
(5,'POINT(-4.57 54.20)')
)
SELECT
id,
ST_Length(
ST_Intersection(
geom,
'POLYGON((-4.62 54.22,-4.58 54.22,-4.58 54.19,
-4.62 54.19,-4.62 54.22))')) AS touch_length
FROM j
WHERE
ST_Touches(
geom,
'POLYGON((-4.62 54.22,-4.58 54.22,-4.58 54.19,
-4.62 54.19,-4.62 54.22))')
ORDER BY touch_length DESC
LIMIT 1;
id | touch_length
----+---------------------
1 | 0.03000000000000025
(1 row)

Need polygon intersect counts with buffered polygon neighbors FOR EACH polygon

I'm trying to find out whether it is possible, purely in SQL, to generate a table with the number of intersections each polygon in a layer has with its neighboring polygons in a buffered version of the layer.
A rough and flawed version is the following:
For each value in list:
SELECT
Count(*)
INTO
intersectcounts
FROM
parcels,parcelsbuffered
WHERE
parcels.apn = value AND ST_INTERSECT(parcels.geom,parcelsbuffered.geom)
Here geom is the polygon.
I need a result like:
intersectscount table
APN COUNT
100 3
101 87
...
...
I could use a Python loop and modify the query string with a different value in the WHERE clause, but I don't think this will perform well: there are thousands of parcels (polygons).
SELECT parcels.apn, count(*) as intersectcounts
FROM parcels
JOIN parcelsbuffered
ON ST_Intersects(parcels.geom, parcelsbuffered.geom)
GROUP BY parcels.apn
You probably want to add some validation to exclude each parcel's intersection with its own buffered version, like
(count(*) - 1) as intersectcounts
or
WHERE parcels.apn <> parcelsbuffered.apn
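The join-and-group pattern, including the exclusion of a parcel's own buffer, can be sketched without PostGIS. The example below is only an approximation under stated assumptions: it runs on SQLite through Python's sqlite3 module, replaces the polygons with axis-aligned bounding boxes (xmin, ymin, xmax, ymax), replaces ST_Intersects with a box-overlap test, and uses made-up parcels. It also uses a LEFT JOIN, which differs from the query above, so that parcels with no neighbors are reported with a count of 0.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
for table in ("parcels", "parcelsbuffered"):
    conn.execute(
        f"CREATE TABLE {table} (apn INTEGER, xmin REAL, ymin REAL, xmax REAL, ymax REAL)"
    )

# Hypothetical parcels as boxes; 100 and 101 are adjacent, 102 is far away.
parcels = [(100, 0, 0, 1, 1), (101, 1, 0, 2, 1), (102, 5, 5, 6, 6)]
buffer = 0.1
conn.executemany("INSERT INTO parcels VALUES (?, ?, ?, ?, ?)", parcels)
conn.executemany(
    "INSERT INTO parcelsbuffered VALUES (?, ?, ?, ?, ?)",
    [(apn, x0 - buffer, y0 - buffer, x1 + buffer, y1 + buffer)
     for apn, x0, y0, x1, y1 in parcels],
)

# One row per parcel: count the buffered boxes of OTHER parcels that overlap it.
counts = dict(conn.execute(
    """
    SELECT p.apn, COUNT(b.apn)
    FROM parcels AS p
    LEFT JOIN parcelsbuffered AS b
      ON p.apn <> b.apn
     AND p.xmin <= b.xmax AND b.xmin <= p.xmax
     AND p.ymin <= b.ymax AND b.ymin <= p.ymax
    GROUP BY p.apn
    """
))
print(counts)
```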

Select pair of rows that obey a rule

I have a big table (1M rows) with the following columns:
source, dest, distance.
Each row defines a link (from A to B).
I need to find the distance between a pair using another node.
An example:
If I want to find the distance between A and B,
and I find a node x with:
x -> A
x -> B
I can add these distances to get the distance between A and B.
My question:
How can I find all the nodes (such as x) and get their distances to (A and B)?
My purpose is to select the min value of distance.
P.s: A and B are just one connection (I need to do it for 100K connections).
Thanks !
As Andomar said, you'll need Dijkstra's algorithm; here's a link to that algorithm in T-SQL: T-SQL Dijkstra's Algorithm
Assuming you want to get the path from A to B with many intermediate steps, it is impossible to do it in plain SQL for an indefinite number of steps. Simply put, it lacks the expressive power, see http://en.wikipedia.org/wiki/Expressive_power#Expressive_power_in_database_theory . As Andomar said, load the data into a process and use Dijkstra's algorithm.
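For readers who load the links into a process, a minimal sketch of Dijkstra's algorithm might look like the following. It is an illustration under assumptions: links are treated as directed (source, dest, distance) rows with non-negative distances, and the sample data is made up.

```python
import heapq


def dijkstra(links, start):
    """Return the shortest distance from start to every reachable node.
    links is an iterable of (source, dest, distance) rows, treated as directed."""
    graph = {}
    for source, dest, distance in links:
        graph.setdefault(source, []).append((dest, distance))

    best = {start: 0}
    queue = [(0, start)]
    while queue:
        dist, node = heapq.heappop(queue)
        if dist > best.get(node, float("inf")):
            continue  # stale queue entry, a shorter route was already found
        for neighbor, weight in graph.get(node, []):
            candidate = dist + weight
            if candidate < best.get(neighbor, float("inf")):
                best[neighbor] = candidate
                heapq.heappush(queue, (candidate, neighbor))
    return best


links = [("A", "x", 2), ("x", "B", 3), ("A", "y", 1), ("y", "B", 7), ("A", "B", 10)]
shortest = dijkstra(links, "A")
print(shortest["B"])
```

If links should count in both directions, each row can be added to the graph twice, once in each direction.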
This sounds like the traveling salesman problem.
From a SQL syntax standpoint: CONNECT BY PRIOR would build the tree you're after, using START WITH, and can limit the number of layers it traverses; however, doing so will not guarantee the minimum.
I may get downvoted for this, but I find this an interesting problem. I wish that this could be a more open discussion, as I think I could learn a lot from this.
It seems like it should be possible to achieve this by doing multiple select statements - something like SELECT id FROM mytable WHERE source="A" ORDER BY distance ASC LIMIT 1. Wrapping something like this in a while loop, and replacing "A" with an id variable, would do the trick, no?
For example (A is source, B is final destination):
DECLARE var_id as INT
WHILE var_id != 'B'
BEGIN
SELECT id INTO var_id FROM mytable WHERE source="A" ORDER BY distance ASC LIMIT 1
SELECT var_id
END
Wouldn't something like this work? (The code is sloppy, but the idea seems sound.) Comments are more than welcome.
Join the table to itself with destination joined to source. Add the distance from the two links. Insert that as a new link with left side source, right side destination and total distance if that isn't already in the table. If that is in the table but with a shorter total distance then update the existing row with the shorter distance.
Repeat this until you get no new links added to the table and no updates with a shorter distance. Your table now contains a link for every possible combination of source and destination with the minimum distance between them. It would be interesting to see how many repetitions this would take.
This will not track the intermediate path between source and destination but only provides the shortest distance.
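The repeat-until-stable idea can be sketched in a few lines to see how it behaves on a small example. This is an in-memory Python illustration, not SQL: the function name is made up, links are treated as directed, and distances are assumed to be positive. It also counts the repetitions, since that number was raised as an open question.

```python
def close_links(links):
    """Repeatedly combine pairs of links (a -> x, x -> b) into a -> b,
    keeping the shortest total, until a full pass changes nothing.
    Returns the final link table and the number of passes made."""
    best = {}
    for source, dest, distance in links:
        key = (source, dest)
        if distance < best.get(key, float("inf")):
            best[key] = distance

    rounds = 0
    changed = True
    while changed:
        changed = False
        rounds += 1
        for (a, x), d1 in list(best.items()):
            for (y, b), d2 in list(best.items()):
                if x == y and a != b:
                    total = d1 + d2
                    if total < best.get((a, b), float("inf")):
                        best[(a, b)] = total  # new link, or a shorter one
                        changed = True
    return best, rounds


links = [("A", "x", 2), ("x", "B", 3), ("A", "B", 10), ("B", "C", 1)]
best, rounds = close_links(links)
print(best[("A", "B")], best[("A", "C")], rounds)
```

Each pass is a full self-join, so on a table with a million rows the cost per repetition would be the main concern.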
IIUC this should do, but I'm not sure it is really viable (performance-wise) given the large number of rows involved and the self-join:
SELECT
    t1.source AS a,
    t1.dest AS x,
    t2.dest AS b,
    t1.distance + t2.distance AS total_distance
FROM
    big_table AS t1
INNER JOIN
    big_table AS t2 ON t1.dest = t2.source
WHERE
    t1.source = 'insert source (A) here' AND
    t2.dest = 'insert destination (B) here'
ORDER BY
    total_distance ASC
LIMIT
    1;
The above snippet works for the case in which you have two rows of the form A->x and x->B, but not for other combinations (e.g. A->x and B->x). Extending it to cover all four combinations should be trivial (e.g. create a view that duplicates each row and swaps source and dest).
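The view-based extension can be tried on a toy table. The following is a sketch under assumptions: it runs on SQLite through Python's sqlite3 module, the view name links and the sample rows are made up, and the column names follow the question (source, dest, distance).

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE big_table (source TEXT, dest TEXT, distance REAL)")
conn.executemany(
    "INSERT INTO big_table VALUES (?, ?, ?)",
    [("x", "A", 2.0), ("x", "B", 3.0), ("A", "y", 1.0), ("B", "y", 1.5), ("z", "A", 9.0)],
)

# Hypothetical view: every link plus its reverse, so direction no longer matters.
conn.execute(
    """
    CREATE VIEW links AS
    SELECT source, dest, distance FROM big_table
    UNION ALL
    SELECT dest, source, distance FROM big_table
    """
)

# Best intermediate node between A and B, using exactly two links.
row = conn.execute(
    """
    SELECT t1.dest AS x, t1.distance + t2.distance AS total_distance
    FROM links AS t1
    JOIN links AS t2 ON t1.dest = t2.source
    WHERE t1.source = ? AND t2.dest = ?
    ORDER BY total_distance ASC
    LIMIT 1
    """,
    ("A", "B"),
).fetchone()
print(row)
```

Here the pair x -> A, x -> B from the question is found as well, but the route through y is shorter and wins.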