I have trouble using geography to calculate distance in miles using the format of my table.
Latitude and Longitude of both locations are side by side:
id | A_Latitude | A_Longitude | B_Latitude | B_Longitude
I'm trying to get the point of A, along with the point of B, and return the distance between A and B in miles.
I've tried a few things, including something similar to:
DECLARE @orig geography
DECLARE @end geography
SELECT
@orig = geography::Point(A_LATITUDE, A_LONGITUDE, 4326)
,@end = geography::Point(B_LATITUDE, B_LONGITUDE, 4326)
FROM table1
SELECT
ROUND(@END.STDistance(@ORIG)/1609.344,2) AS MILES
,@END.STDistance(@ORIG) AS METERS
,table1.*
FROM table1;
With this I get the same value for miles and meters on every row. Could anyone please suggest how I should structure this query to get a distance per row?
EDIT: Thanks SQL Surfer!
WITH X AS
(SELECT
geography::Point(A_LATITUDE, A_LONGITUDE, 4326) A
,geography::Point(B_LATITUDE, B_LONGITUDE, 4326) B
,ID
FROM TABLE1)
SELECT
ROUND(X.B.STDistance(X.A)/1609.344,2) AS MILES
,X.B.STDistance(X.A) AS METERS
,T.*
FROM TABLE1 T
LEFT JOIN X ON T.ID = X.ID
Here's what I'd do:
WITH X AS
(
SELECT
geography::Point(A_LATITUDE, A_LONGITUDE, 4326) A
,geography::Point(B_LATITUDE, B_LONGITUDE, 4326) B
,*
FROM TABLE1
)
SELECT
ROUND(B.STDistance(A)/1609.344, 2) AS MILES
,B.STDistance(X.A) AS METERS
,*
FROM X
If you're thinking of doing this often (and on the fly), consider adding computed columns to your table for not only the geography types, but also the distance between them. At that point, it's just a simple query from the table!
Related
I have a table with about 100,000 names/rows that looks something like this. There are about 3,000 different Refnrs. The names are clustered geographically around their Refnr. The problem is that some names have the wrong location, and I need to find the rows that don't fit in with the others. I figured I would do this by finding the rows whose Latitude OR Longitude is too far away from the Latitude and Longitude of the rest of the same Refnr. For example, in the first Refnr two names are located at Latitude 10.67xxx and one is located at Latitude 10.34xxx.
So I want to compare all the names within each Refnr and pick out the ones where the value up to the 2nd decimal differs from the rest of the names.
Is there any way to do this so that I don't have to manually run a query 3,000 times?
Refnr | Latitude | Longitude | Name
123 | 10.67643 | 50.67523 | bob
123 | 10.67143 | 50.67737 | joe
123 | 10.34133 | 50.67848 | al
234 | 11.56892 | 50.12324 | berny
234 | 11.56123 | 50.12432 | bonny
234 | 11.98135 | 50.12223 | arby
567 | 10.22892 | 50.67143 | nilly
567 | 10.22123 | 50.67236 | tilly
567 | 10.22148 | 50.22422 | billy
I need a select to give me this.
Refnr | Latitude | Longitude | Name
123 | 10.34133 | 50.67848 | al
234 | 11.98135 | 50.12223 | arby
567 | 10.22148 | 50.22422 | billy
Thanks for the help.
Here's what is hopefully a working solution. It returns the 3 outliers from your sample data; it will be interesting to see if it works on your larger data set.
Create one CTE for latitude and one for longitude. Each truncates the value to 2 decimal places (Round(x,2,1) truncates rather than rounds), counts the rows per refnr and truncated value, and keeps the groups with the lowest count via top (1) with ties. Those groups are treated as the outliers. Note that the lowest count is taken across the whole table, not per refnr.
Join the results with the main table and filter to only rows matching an outlier lat or long.
with outlierLat as (
select top (1) with ties refnr, Round(latitude,2,1) latitude
from t
group by refnr, Round(latitude,2,1)
order by Count(*)
), outlierLong as (
select top (1) with ties refnr, Round(Longitude,2,1) Longitude
from t
group by refnr, Round(Longitude,2,1)
order by Count(*)
)
select t.*
from t
left join outlierLat lt on lt.refnr=t.refnr and Round(t.latitude,2,1)=lt.latitude
left join outlierLong lo on lo.refnr=t.refnr and Round(t.Longitude,2,1)=lo.Longitude
where lt.latitude is not null or lo.Longitude is not null
See demo Fiddle
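To sanity-check the idea outside the database, here is a minimal Python sketch on the sample rows from the question. It is a variation, not a translation of the query above: it flags every row whose truncated latitude or longitude differs from the most common truncated value within the same Refnr. The names ROWS, truncate and find_outliers are made up for this illustration, and ties between equally common values are resolved arbitrarily.

```python
from collections import Counter, defaultdict
from decimal import Decimal, ROUND_DOWN

# Sample rows from the question: (Refnr, Latitude, Longitude, Name).
# Coordinates are kept as strings so Decimal sees the exact digits.
ROWS = [
    (123, "10.67643", "50.67523", "bob"),
    (123, "10.67143", "50.67737", "joe"),
    (123, "10.34133", "50.67848", "al"),
    (234, "11.56892", "50.12324", "berny"),
    (234, "11.56123", "50.12432", "bonny"),
    (234, "11.98135", "50.12223", "arby"),
    (567, "10.22892", "50.67143", "nilly"),
    (567, "10.22123", "50.67236", "tilly"),
    (567, "10.22148", "50.22422", "billy"),
]


def truncate(value):
    """Cut a value to 2 decimals without rounding, like ROUND(x, 2, 1) in T-SQL."""
    return Decimal(value).quantize(Decimal("0.01"), rounding=ROUND_DOWN)


def find_outliers(rows):
    """Return rows whose truncated lat or lon is not the most common one in their Refnr."""
    lat_counts = defaultdict(Counter)
    lon_counts = defaultdict(Counter)
    for refnr, lat, lon, _name in rows:
        lat_counts[refnr][truncate(lat)] += 1
        lon_counts[refnr][truncate(lon)] += 1

    outliers = []
    for refnr, lat, lon, name in rows:
        common_lat = lat_counts[refnr].most_common(1)[0][0]
        common_lon = lon_counts[refnr].most_common(1)[0][0]
        if truncate(lat) != common_lat or truncate(lon) != common_lon:
            outliers.append((refnr, lat, lon, name))
    return outliers


if __name__ == "__main__":
    for row in find_outliers(ROWS):
        print(row)
```

Because the comparison is done per Refnr, a Refnr with no disagreement contributes no rows, which is the behaviour the question asks for.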
This got overly complex, and may not be that useful. Still, it was interesting to work on.
First, set up the test data
DROP TABLE IF EXISTS #Test
GO
CREATE TABLE #Test
(
Refnr int not null
,Latitude decimal(7,5) not null
,Longitude decimal(7,5) not null
,Name varchar(100) not null
)
INSERT #Test VALUES
(123, 10.67643, 50.67523, 'bob')
,(123, 10.67143, 50.67737, 'joe')
,(123, 10.34133, 50.67848, 'al')
,(234, 11.56892, 50.12324, 'berny')
,(234, 11.56123, 50.12432, 'bonny')
,(234, 11.98135, 50.12223, 'arby')
,(567, 10.22892, 50.67143, 'nilly')
,(567, 10.22123, 50.67236, 'tilly')
,(567, 10.22148, 50.22422, 'billy')
SELECT *
from #Test
As requirements are a tad imprecise, use this to round lat, lon to the desired precision. Adjust as necessary.
DECLARE @Precision TINYINT = 1
--SELECT
-- Latitude
-- ,round(Latitude, @Precision)
-- from #Test
Then it gets messy. Problems will come up if there are multiple "outliers" by EITHER latitude OR longitude. I think this will account for all of them and remove duplicates, but further review and testing is called for.
;WITH cteGroups as (
-- Set up groups by lat/lon proximity
SELECT
Refnr
,'Latitude' Type
,round(Latitude, @Precision) Proximity
,count(*) HowMany
from #Test
group by
Refnr
,round(Latitude, @Precision)
UNION ALL SELECT
Refnr
,'Longitude' Type
,round(Longitude, @Precision) Proximity
,count(*) HowMany
from #Test
group by
Refnr
,round(Longitude, @Precision)
)
,cteOutliers as (
-- Identify outliers
select
Type
,Refnr
,Proximity
,row_number() over (partition by Type, Refnr order by HowMany desc) Ranking
from cteGroups
)
-- Pull out all items that match with outliers
select te.*
from cteOutliers cte
inner join #Test te
on te.Refnr = cte.Refnr
and ( (cte.Type = 'Latitude' and round(te.Latitude, @Precision) = Proximity)
or (cte.Type = 'Longitude' and round(te.Longitude, @Precision) = Proximity) )
where cte.Ranking > 1 -- Not in the larger groups
This computes the average position of all locations in the table and ranks the rows by how far they are from it (the sum of the absolute latitude and longitude differences):
SELECT *
, ABS((SELECT Sum(Latitude) / COUNT(*) FROM #Test) - Latitude)
+ ABS((SELECT Sum(Longitude) / COUNT(*) FROM #Test) - Longitude) as Awayfromhome
from #Test
Order by Awayfromhome desc
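Since the names are clustered per Refnr, it may be more useful to measure the distance from the center of each Refnr rather than from the center of the whole table. The following is a rough Python sketch of that per-group variation on the sample data; ROWS and rank_by_distance_from_group_center are hypothetical names, and the "distance" is still just the sum of absolute differences in degrees, as in the query above.

```python
from collections import defaultdict

# Sample rows from the question: (Refnr, Latitude, Longitude, Name).
ROWS = [
    (123, 10.67643, 50.67523, "bob"),
    (123, 10.67143, 50.67737, "joe"),
    (123, 10.34133, 50.67848, "al"),
    (234, 11.56892, 50.12324, "berny"),
    (234, 11.56123, 50.12432, "bonny"),
    (234, 11.98135, 50.12223, "arby"),
    (567, 10.22892, 50.67143, "nilly"),
    (567, 10.22123, 50.67236, "tilly"),
    (567, 10.22148, 50.22422, "billy"),
]


def rank_by_distance_from_group_center(rows):
    """Return (away_from_home, refnr, name) tuples, largest deviation first."""
    groups = defaultdict(list)
    for row in rows:
        groups[row[0]].append(row)

    ranked = []
    for refnr, members in groups.items():
        # Average position of this Refnr only.
        mean_lat = sum(m[1] for m in members) / len(members)
        mean_lon = sum(m[2] for m in members) / len(members)
        for _refnr, lat, lon, name in members:
            away = abs(mean_lat - lat) + abs(mean_lon - lon)
            ranked.append((away, refnr, name))

    ranked.sort(reverse=True)
    return ranked


if __name__ == "__main__":
    for away, refnr, name in rank_by_distance_from_group_center(ROWS):
        print(refnr, name, round(away, 5))
```

One thing to keep in mind is that an outlier pulls its own group's average toward itself, so with only three rows per group the gap between outlier and non-outlier is smaller than you might expect.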
As the title says, I am running a query on bikesharing data stored in BigQuery.
I am able to extract the data and arrange it in the correct order to be displayed in a path chart. Some routes in the data have only start and end longitude and latitude, and some have only a start longitude and latitude. How do I remove any route with fewer than 4 points?
This is the code. I am also limited to SELECT statements only.
SELECT
routeID ,
json_extract(st_asgeojson(st_makeline( array_agg(st_geogpoint(locs.lon, locs.lat) order by locs.date))),'$.coordinates') as geo
FROM
howardcounty.routebatches
cross join UNNEST(locations) as locs
where unlockedAt between {{start_date}} and {{end_date}}
GROUP BY routeID
order by routeID
limit 10
I have also included a screenshot for clarity.
To apply a condition after a GROUP BY, use a HAVING clause. For a simple condition (are there at least two coordinates for the route?), this query can be used:
With dummy as (
Select 1 as routeID, [struct(current_timestamp() as date, 1 as lon, 2 as lat),struct(current_timestamp() as date, 3 as lon, 4 as lat)] as locations
Union all select 2 as routeID, [struct(current_timestamp() as date, 10 as lon, 20 as lat)]
)
SELECT
routeID , count(locs.date) as amountcoord,
json_extract(st_asgeojson(st_makeline( array_agg(st_geogpoint(locs.lon, locs.lat) order by locs.date))),'$.coordinates') as geo
FROM
#howardcounty.routebatches
dummy
#where unlockedAt between {{start_date}} and {{end_date}}
cross join UNNEST(locations) as locs
GROUP BY routeID
having count(locs.date)>1
order by routeID
limit 10
For more complex ones, a nested select may do the job:
Select *
from (
--- your code ---
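) where length(geo)-length(replace(geo,"]","")) >= 1+4
The JSON is transformed to a string in your code. If you count the ] characters and subtract one for the closing bracket of the outer array, you get the number of inner arrays, that is, the number of points. The condition above therefore keeps routes with at least 4 points.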
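The bracket arithmetic is easy to get off by one, so here is a small Python sketch that mirrors it. It assumes geo is the coordinates string of a line, i.e. a nested array of [lon, lat] pairs; count_points and has_min_points are names invented for this example.

```python
def count_points(geo):
    """Count coordinate pairs in a string like '[[1,2],[3,4]]'.

    Every inner pair contributes one ']' and the outer array contributes
    one more, so the number of points is the number of ']' minus one.
    """
    return geo.count("]") - 1


def has_min_points(geo, minimum=4):
    """Mirror of: length(geo) - length(replace(geo, ']', '')) >= 1 + minimum."""
    return count_points(geo) >= minimum


if __name__ == "__main__":
    two = "[[1,2],[3,4]]"
    four = "[[1,2],[3,4],[5,6],[7,8]]"
    print(count_points(two), has_min_points(two))
    print(count_points(four), has_min_points(four))
```

If the only goal is a minimum number of points, a HAVING clause on the row count is simpler; the string trick is mainly useful when the filter has to be applied to the already-built geo column.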
I have a very large number of lat/long coordinates in Table 1, as well as in Table 2. For example, let's say there are 100,000 coordinates in each table. For each unique item in Table 1, I need to return the closest coordinate in Table 2, as long as it is within a set maximum distance (let's say 100 meters). My expected output is therefore up to 100,000 items, culled down to those that have a match within 100 meters.
I am fairly familiar with the Geometry and Geography parts of MSSQL, and would traditionally approach the following with something like this:
Select
Table1ID = T1.ID,
Table2ID = T2.ID,
Distance = T1.PointGeog.STDistance(T2.PointGeog),
Keep = 0
into #Distance
From #Table1 T1
cross join #Table2 T2
where T1.PointGeog.STDistance(T2.PointGeog) <= 100
which would return all items from Table2 that are within 100 meters of Table1
Then, to limit to only the closest items, I could:
Update #Distance
set Keep = 1
from #Distance D
inner join
(select shortestDist = min(Distance), Table1ID from #Distance GROUP BY
Table1ID) A
on A.Table1ID = D.Table1ID and A.shortestDist = D.Distance
and then delete anything where keep <> 1
This works, however it takes absolutely forever. The cross join creates an absurd amount of calculations that SQL needs to handle, which results in ~ 9 minute queries on MSSQL 2016. I can limit the range of the portion of Table 1 and Table 2 that I compare with SOME criteria, but really not much. I'm just really not sure how I could make the process quicker. Ultimately, I just need: closest item, distance from T2 to T1.
I have played around with a few different solutions, but I wanted to see if the SO community has any additional ideas on how I could better optimize something like this.
Try CROSS APPLY:
SELECT
T1.ID, TT2.ID, T1.PointGeog.STDistance(TT2.PointGeog)
FROM #Table1 as T1
CROSS APPLY (SELECT TOP 1 T2.ID, T2.PointGeog
FROM #Table2 as T2
WHERE T1.PointGeog.STDistance(T2.PointGeog) <= 100
ORDER BY T1.PointGeog.STDistance(T2.PointGeog) ASC) as TT2
I have played around with a new option, and I think this is the fastest I have gotten the calculation - to about 3 minutes.
I changed Table1 to be:
select
ID,
PointGeog,
Buffer = PointGeom.STBuffer(8.997741566866716e-4)
into #Table1
where the buffer is 100/111139, which converts 100 meters to degrees (assuming about 111,139 meters per degree)
and then
if object_id('tempdb.dbo.#Distance') is not null
drop table #Distance
Select
T1ID = T1.ID,
T1Geog = T1.PointGeog,
T2ID = T2.ID,
T2Geog = T2.PointGeog,
DistanceMeters = cast(null as float),
DistanceMiles = cast(null as float),
Keep = 0
Into #Distance
From #Table1 T1
cross join #Table2 T2
Where T1.Buffer.STIntersects(T2.PointGeom) = 1
which does not calculate the distance, but first culls the dataset to anything within 100 meters. I can then pass an update statement to calculate the distance on a substantially more manageable dataset.
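One caveat with the 100/111139 conversion used for the buffer: on a flat geometry in degrees, one degree of longitude covers fewer meters than one degree of latitude once you move away from the equator. The Python sketch below is only a back-of-the-envelope check under a spherical approximation; METERS_PER_DEGREE is the figure from the post, and the function names are made up here.

```python
import math

# Meters per degree of latitude, the figure used in the post above.
METERS_PER_DEGREE = 111139


def meters_to_degrees(meters):
    """Convert a north-south distance in meters to degrees of latitude."""
    return meters / METERS_PER_DEGREE


def east_west_reach_m(buffer_degrees, latitude_deg):
    """Approximate east-west reach in meters of a buffer given in degrees.

    A degree of longitude shrinks by cos(latitude) on a sphere.
    """
    return buffer_degrees * METERS_PER_DEGREE * math.cos(math.radians(latitude_deg))


if __name__ == "__main__":
    buffer_deg = meters_to_degrees(100)
    print(buffer_deg)
    for lat in (0, 35, 60):
        print(lat, round(east_west_reach_m(buffer_deg, lat), 1))
```

So at higher latitudes a buffer sized this way may miss points that are within 100 meters to the east or west. If that matters for your data, consider a slightly larger buffer for the cull and then apply the exact STDistance filter afterwards.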
Create a spatial index on the spatial column of both tables and performance shouldn't be too bad.
Something like:
CREATE SPATIAL INDEX spat_t ON [#Table1]
(
[PointGeog]
)
I ran some tests with 100k points on my laptop and the "join" took 3 minutes.
Finding distances on the surface of the earth means using great-circle distances, worked out with either the Haversine formula or the equivalent spherical law of cosines formula (the query below uses the latter).
The problem is this: Given a table of locations with latitudes and longitudes, which of those locations are nearest to a given location?
I have the following query:
SELECT z.id,
z.latitude, z.longitude,
p.radius,
p.distance_unit
* DEGREES(ACOS(COS(RADIANS(p.latpoint))
* COS(RADIANS(z.latitude))
* COS(RADIANS(p.longpoint - z.longitude))
+ SIN(RADIANS(p.latpoint))
* SIN(RADIANS(z.latitude)))) AS distance
FROM doorbots as z
JOIN ( /* these are the query parameters */
SELECT 34.0480698 AS latpoint, -118.3589196 AS longpoint,
2 AS radius, 111.045 AS distance_unit
) AS p ON 1=1
WHERE z.latitude between ... and
z.longitude between ...
How can I use the earthdistance extension to replace the complicated formula in my query?
Is the following an equivalent change?
SELECT z.id,
z.latitude, z.longitude,
p.radius,
round(earth_distance(ll_to_earth(p.latpoint, p.longpoint), ll_to_earth(z.latitude, z.longitude))::NUMERIC,0) AS distance
FROM doorbots as z
JOIN ( /* these are the query parameters */
SELECT 34.0480698 AS latpoint, -118.3589196 AS longpoint,
2 AS radius, 111.045 AS distance_unit
) AS p ON 1=1
WHERE z.latitude between ... and
z.longitude between ...
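On the equivalence question, a quick check outside the database suggests the two queries are not quite the same. The first returns kilometers (distance_unit of 111.045 km per degree), while earth_distance returns meters on a sphere whose radius comes from earth(), which is 6378168 meters by default as far as I know. The Python sketch below compares the two under that assumption; the second coordinate pair is an arbitrary point picked for illustration.

```python
import math

EARTH_RADIUS_M = 6378168.0  # assumed default of earth() in the earthdistance extension
KM_PER_DEGREE = 111.045     # distance_unit from the original query


def cosine_law_km(lat1, lon1, lat2, lon2):
    """Spherical law of cosines, as written in the first query (result in km)."""
    cos_angle = (
        math.cos(math.radians(lat1))
        * math.cos(math.radians(lat2))
        * math.cos(math.radians(lon1 - lon2))
        + math.sin(math.radians(lat1)) * math.sin(math.radians(lat2))
    )
    cos_angle = max(-1.0, min(1.0, cos_angle))  # guard against rounding drift
    return KM_PER_DEGREE * math.degrees(math.acos(cos_angle))


def great_circle_m(lat1, lon1, lat2, lon2, radius_m=EARTH_RADIUS_M):
    """Haversine great-circle distance in meters on a sphere of radius_m."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlam = math.radians(lon2 - lon1)
    h = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlam / 2) ** 2
    return 2 * radius_m * math.asin(math.sqrt(h))


if __name__ == "__main__":
    origin = (34.0480698, -118.3589196)  # latpoint, longpoint from the question
    other = (36.0, -115.0)               # hypothetical second location
    km = cosine_law_km(*origin, *other)
    m = great_circle_m(*origin, *other)
    print(km, m, m / (km * 1000))
```

Under these assumptions the results differ by a factor of 1000 for the unit plus roughly a quarter of a percent for the radius, so a radius of 2 in the first query would correspond to about 2000 in the second.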
You can get the most out of earthdistance with the following queries:
Locations close enough (i.e. within 1000000.0 meters -- 621.371192 miles) to (34.0480698, -118.3589196):
select *
from doorbots z
where earth_distance(ll_to_earth(z.latitude, z.longitude), ll_to_earth(34.0480698, -118.3589196)) < 1000000.0; -- in meters
select *
from doorbots z
where point(z.longitude, z.latitude) <@> point(-118.3589196, 34.0480698) < 621.371192; -- in miles
Top 5 locations closest to (34.0480698, -118.3589196):
select *
from doorbots z
order by earth_distance(ll_to_earth(z.latitude, z.longitude), ll_to_earth(34.0480698, -118.3589196))
limit 5;
select *
from doorbots z
order by point(z.longitude, z.latitude) <@> point(-118.3589196, 34.0480698)
limit 5;
To use indexes, apply the following one to your table:
create index idx_doorbots_latlong
on doorbots using gist (earth_box(ll_to_earth(latitude, longitude), 0));
Use index for: locations close enough (i.e. within 1000000.0 meters -- 621.371192 miles) to (34.0480698, -118.3589196):
with p as (
select 34.0480698 as latitude,
-118.3589196 as longitude,
1000000.0 as max_distance_in_meters
)
select z.*
from p, doorbots z
where earth_box(ll_to_earth(z.latitude, z.longitude), 0) <@ earth_box(ll_to_earth(p.latitude, p.longitude), p.max_distance_in_meters)
and earth_distance(ll_to_earth(z.latitude, z.longitude), ll_to_earth(p.latitude, p.longitude)) < p.max_distance_in_meters;
Use index for: top 5 locations closest to (34.0480698, -118.3589196):
select z.*
from doorbots z
order by earth_box(ll_to_earth(z.latitude, z.longitude), 0) <-> earth_box(ll_to_earth(34.0480698, -118.3589196), 0)
limit 5;
http://rextester.com/WQAY4056
This might be a duplicate question, but none of the other answers I found made sense to me. I am still learning SQL, so I would appreciate it if you would guide me through the process of doing this. Thanks in advance.
So the problem is: I have this table (with more data in it) and I need to get the name of the airport that is the farthest away from Fiumicino airport (that means I only have 1 set of longitude and latitude data), and I have to do it with the distance function.
You can simply run the following SQL query:
SELECT *,
( 3959 * acos( cos( radians(37) ) * cos( radians( lat ) ) * cos( radians( lng ) - radians(-122) ) + sin( radians(37) ) * sin( radians( lat ) ) ) ) AS distance FROM table_name;
Where:
37 is your input latitude
-122 is your input longitude
lat is the table column that contains the airport latitude values
lng is the table column that contains the airport longitude values
To get the distance in kilometers instead of miles, replace 3959 with 6371.
More details answer: Creating a store locator
Whatever distance function you are using (simple straight line Pythagorean for short distances, or Great circle formula for anything over a few thousand miles),
Select * from table
where [DistanceFunction]
(Latitude, Longitude, FiumicinoLatitude, FiumicinoLongitude) =
(Select Max([DistanceFunction]
(Latitude, Longitude, FiumicinoLatitude, FiumicinoLongitude))
From table)
If you need to find the airport farthest away from some arbitrary airport (not always Fiumicino), then, assuming @code is the airport code of that airport:
Select * from table t
join table r on r.code = @code
where [DistanceFunction]
(t.Latitude, t.Longitude, r.Latitude, r.Longitude) =
(Select Max([DistanceFunction]
(Latitude, Longitude, r.Latitude, r.Longitude))
From table)
SQL SERVER
You will need a function if you are trying to find the furthest one from each airport. But since you said FCO, I did it for FCO.
--temp table for testing
select 'FCO' as code, 'Fiumicino' as name, 'Rome' as city, 'Italy' as country, 41.7851 as latitude, 12.8903 as longitude into #airports
union all
select 'VCE', 'Marco Polo','Venice','Italy',45.5048,12.3396
union all
select 'NAP', 'capodichino','Naples','Italy',40.8830,14.2866
union all
select 'CDG', 'Charles de Gaulle','Paris','France',49.0097,2.5479
--create a point from your LAT/LON
;with cte as(
select
*,
geography::Point(latitude,longitude,4326) as Point --WGS 84 datum
from #airports),
--Get the distance from your airport of interest and all others.
cteDistance as(
select
*,
Point.STDistance((select Point from cte where code = 'FCO')) as MetersToFiuminico
from cte)
--this is the one that's furthest away. Remove the inner join to see them all
select d.*
from
cteDistance d
inner join(select max(MetersToFiuminico) as m from cteDistance where MetersToFiuminico > 0) d2 on d.MetersToFiuminico = d2.m
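If you want to cross-check the result outside the database, here is a small Python sketch that applies the 3959-mile spherical formula quoted earlier in this thread to the four test rows above. It is only a sanity check under a spherical-earth assumption, so the numbers will differ slightly from what STDistance returns. AIRPORTS, distance_miles and farthest_from are names invented for this example.

```python
import math

# Test rows from the temp table above: (code, name, latitude, longitude).
AIRPORTS = [
    ("FCO", "Fiumicino", 41.7851, 12.8903),
    ("VCE", "Marco Polo", 45.5048, 12.3396),
    ("NAP", "capodichino", 40.8830, 14.2866),
    ("CDG", "Charles de Gaulle", 49.0097, 2.5479),
]


def distance_miles(lat1, lon1, lat2, lon2, radius_miles=3959):
    """Spherical law of cosines; use radius 6371 to get kilometers instead."""
    cos_angle = (
        math.cos(math.radians(lat1))
        * math.cos(math.radians(lat2))
        * math.cos(math.radians(lon2) - math.radians(lon1))
        + math.sin(math.radians(lat1)) * math.sin(math.radians(lat2))
    )
    cos_angle = max(-1.0, min(1.0, cos_angle))  # guard against rounding drift
    return radius_miles * math.acos(cos_angle)


def farthest_from(code, airports):
    """Return the airport row farthest from the airport with the given code."""
    origin = next(a for a in airports if a[0] == code)
    others = [a for a in airports if a[0] != code]
    return max(
        others,
        key=lambda a: distance_miles(origin[2], origin[3], a[2], a[3]),
    )


if __name__ == "__main__":
    print(farthest_from("FCO", AIRPORTS))
```

Passing a different code to farthest_from covers the "arbitrary airport" case without changing anything else.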