Help structure query to list concerts/venues by distance - sql

SQL noob here needing some help. I've got an idea of how to do this in PHP/SQL, but I would really like to condense this into one SELECT statement. OK:
The site I am working on is a list of concerts and venues. Venues have a latitude and longitude, and so do accounts, corresponding to that user's location.
I have three tables: accounts (users), concerts, and venues. I would like to SELECT a list of concerts (and join on venues for that info) that are happening at venues within x miles of the account, using this cheap formula for distance calculation (the site only lists venues in the UK, so the error is acceptable):
x = 69.1 * (accountLatitude - venueLatitude);
y = 69.1 * (accountLongitude - venueLongitude) * cos(venueLatitude / 57.3);
distance = sqrt(x * x + y * y);
How can I achieve this in a single query?
Thanks in advance xD

This is done exactly as your formulas suggest.
Just substitute x and y into the distance formula.
If this is for MySQL the below should work (just replace the correct table names / column names).
SELECT concert.name, venue.name,
SQRT(POW(69.1 * (account.Latitude - venue.Latitude), 2) + POW(69.1 * (account.Longitude - venue.Longitude) * cos(venue.Latitude / 57.3), 2)) AS distance
FROM account, venue
LEFT JOIN concert ON concert.ID = venue.concertID
WHERE account.id = UserWhoIsLoggedIn
ORDER BY 3;
This should return all concert names, venue names and the distance from the user, ordered by distance.
If you are not using MySQL you may need to change either the POW or SQRT functions.
Also be aware that the sin and cos functions take their inputs in radians.
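To sanity-check the formula outside the database, here is a minimal Python sketch of the same calculation. The function name and the sample coordinates are made up for illustration; the constants 69.1 (miles per degree) and 57.3 (degrees per radian, roughly 180/pi) come from the question.

```python
import math

def approx_distance_miles(lat1, lon1, lat2, lon2):
    """Flat-earth approximation from the question; all inputs in degrees."""
    x = 69.1 * (lat1 - lat2)
    # Dividing by 57.3 converts degrees to radians before calling cos().
    y = 69.1 * (lon1 - lon2) * math.cos(lat2 / 57.3)
    return math.sqrt(x * x + y * y)

# Hypothetical coordinates, roughly London and Manchester.
print(round(approx_distance_miles(51.5074, -0.1278, 53.4808, -2.2426), 1))
```

One degree of latitude comes out as 69.1 miles by construction, which is a quick way to confirm the units.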

For everyone else's benefit, here's how I did it in the end:
$q = 'SELECT gigs.date, bands.idbands, venues.latitude, venues.longitude, venues.idvenues, bands.name AS band, venues.name AS venue,
SQRT(POW(69.1 * (' . $lat . ' - venues.latitude), 2) + POW(69.1 * (' . $lon . ' - venues.longitude) * cos(venues.latitude / 57.3), 2)) AS distance
FROM gigs
LEFT JOIN bands ON bands.idbands=gigs.bands_idbands
LEFT JOIN venues ON venues.idvenues=gigs.venues_idvenues
WHERE 1
ORDER BY distance';
where $lat and $lon are the latitude and longitude of the user currently logged in.
This selects every single gig that's happening at every venue and arranges them in order of distance.

Related

Is there a way to compare Lat/long of two tables

I have two tables:
address_points
kmldata
address_points table columns: ID address Latitude1 Longitude2
kmldata table columns: Locname Lat Long
Now I want to see all the records of the address_points table whose Latitude1 and Longitude2 values fall in the range of Lat and Long values of kmldata table.
I have not handled comparison of locations in SQL Server before, so I don't know which function I can use here. I thought of the BETWEEN operator but can't seem to use it properly here. Any guidance on how I can achieve this?
You need to use the spatial functions within SQL Server. First you will need to aggregate all the points in your kmldata table to create an area, and then use STWithin to check which points fall within that area:
declare @kmlarea geography
select @kmlarea = geography::EnvelopeAggregate(geography::Point(Lat, Long, 4326))
from kmldata
select *
from address_points a
where geography::Point(Latitude1, Longitude2, 4326).STWithin(@kmlarea) = 1
There are several ways of calculating the geographical distance between two sets of coordinates. Already posted is the built in Geography method. There are also several good "home grown" functions based on a round earth model.
The hard part is making the actual comparison when you have large row counts in your source and destination tables. Comparing every source to every destination creates an unnecessarily large Cartesian product. I say "unnecessarily" because there's no point in calculating the distance between a source in Florida and a destination in California when I'm only interested in destinations within 15 miles of the source.
To solve the problem, I created a "bounding box" function that calculates a square(ish) box around a single set of coordinates. The code is posted below...
CREATE FUNCTION dbo.tfn_LatLngBoundingBox
/* ===================================================================
12/03/2019 JL, Created: Calculates the bounding box for a given set of Lat/Lng coordinates.
=================================================================== */
--===== Define I/O parameters
(
@Lat DECIMAL(8,5),
@Lng DECIMAL(8,5),
@MaxDistance DECIMAL(8,3),
@DistanceUnit CHAR(1) -- 'M'=miles ; 'K'=kilometers
)
RETURNS TABLE WITH SCHEMABINDING AS
RETURN
SELECT
MinLat = CONVERT(decimal(8, 5), (x.MinLat / PI()) * 180),
MaxLat = CONVERT(decimal(8, 5), (x.MaxLat / PI()) * 180),
MinLng = CONVERT(decimal(8, 5), (y.MinLng / PI()) * 180),
MaxLng = CONVERT(decimal(8, 5), (y.MaxLng / PI()) * 180)
FROM
( VALUES (
CASE
WHEN @DistanceUnit = 'K' THEN @MaxDistance / 6366.707019 -- Earth sphere radius in kilometers
WHEN @DistanceUnit = 'M' THEN (@MaxDistance * 1.609344) / 6366.707019
END,
(@Lat / 180) * PI(),
(@Lng / 180) * PI()
) ) r (DistRad, rLat, rLng)
CROSS APPLY ( VALUES (r.rLat - r.DistRad, r.rLat + r.DistRad) ) x (MinLat, MaxLat)
CROSS APPLY ( VALUES (ASIN(SIN(r.rLat) / COS(r.DistRad))) ) lt (LatT) -- = 1.4942 rad
CROSS APPLY ( VALUES (ACOS( ( COS(r.DistRad) - sin(lt.LatT) * sin(r.rLat) ) / ( cos(lt.LatT) * cos(r.rLat) ) ) ) ) dl (DeltaLng)
CROSS APPLY ( VALUES (r.rLng - dl.DeltaLng, r.rLng + dl.DeltaLng) ) y (MinLng, MaxLng);
GO
The use case looks like the following...
SELECT
s.Lat,
s.Lng,
d.Lat,
d.Lng,
dm.DistanceInMiles
FROM
dbo.[Source] s
CROSS APPLY dbo.tfn_LatLngBoundingBox(s.Lat, s.Lng, 15, 'M') bb
LEFT JOIN dbo.Destination d
ON d.lat BETWEEN bb.MinLat AND bb.MaxLat
AND d.Lng BETWEEN bb.MinLng AND bb.MaxLng
CROSS APPLY dbo.tfn_LatLonDistanceInMiles(s.Lat, s.Lng, d.Lat, d.Lng) dm
WHERE
dm.DistanceInMiles <= 15;
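For readers who want to check the bounding-box arithmetic outside SQL Server, the following is a rough Python port of the function above. It is only a sketch: the Python names are my own, the sample coordinates are hypothetical, and like the SQL version it makes no attempt to handle the poles or the 180th meridian.

```python
import math

EARTH_RADIUS_KM = 6366.707019  # same sphere radius the SQL function uses
KM_PER_MILE = 1.609344

def lat_lng_bounding_box(lat, lng, max_distance, unit="M"):
    """Return (min_lat, max_lat, min_lng, max_lng) in degrees.

    unit is 'M' for miles or 'K' for kilometers.
    """
    km = max_distance * KM_PER_MILE if unit == "M" else max_distance
    dist_rad = km / EARTH_RADIUS_KM            # angular distance in radians
    r_lat, r_lng = math.radians(lat), math.radians(lng)

    min_lat, max_lat = r_lat - dist_rad, r_lat + dist_rad
    # Latitude at which the circle around the point touches a meridian.
    lat_t = math.asin(math.sin(r_lat) / math.cos(dist_rad))
    delta_lng = math.acos(
        (math.cos(dist_rad) - math.sin(lat_t) * math.sin(r_lat))
        / (math.cos(lat_t) * math.cos(r_lat))
    )
    return tuple(
        math.degrees(v)
        for v in (min_lat, max_lat, r_lng - delta_lng, r_lng + delta_lng)
    )

# Hypothetical point in central Florida, 15-mile box.
print(lat_lng_bounding_box(28.0, -81.0, 15, "M"))
```

Away from the equator the box is wider in degrees of longitude than in degrees of latitude, which is why the longitude delta is computed separately instead of reusing dist_rad.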

Assistance with Percentage calculation in SQL Server

DBMS: SQL Server
I am trying to calculate the percentage of how many times a certain language is used for subtitles. For this I tried to retrieve how many times a language is used from my database.
The calculation I am trying to do is: languagesusedforsubtitles / totalamountofsubtitlesused * 100
My guess is that I need to retrieve the values in a different manner in my SELECT statement, but I can't seem to figure out how.
This is my "working" query so far, which displays how many times each language is used for subtitling:
-- Sorted language count / total used languages * 100
SELECT DISTINCT [language].name AS "Taal", COUNT(*) AS "Percentage"
FROM [profile]
INNER JOIN [watched_media] ON [watched_media].profileid = [profile].profile_ID
INNER JOIN subtitles ON [watched_media].subtitlesid = [subtitles].subtitles_ID
INNER JOIN language ON [language].language_ID = [subtitles].languageid
GROUP BY [language].name;
And this is what I tried in the first place, but it only returns 0s as a result:
SELECT DISTINCT [language].name AS "Taal", COUNT(*) / (SELECT COUNT(watched_media_ID) FROM [watched_media]) * 100 AS "Hoevaak gebruikt"
FROM [profile]
INNER JOIN [watched_media] ON [watched_media].profileid = [profile].profile_ID
INNER JOIN subtitles ON [watched_media].subtitlesid = [subtitles].subtitles_ID
INNER JOIN language ON [language].language_ID = [subtitles].languageid
GROUP BY [language].name;
Here are the results of my first query:
SQL Server does integer division, so 1/2 is 0 rather than 0.5. A simple solution for you is:
COUNT(*) * 100.0 / (SELECT COUNT(watched_media_ID) FROM [watched_media])
The .0 changes the value from an integer to a decimal number -- and division works more intuitively.
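The effect is easy to reproduce in isolation. The sketch below uses Python's built-in sqlite3 module as a stand-in, on the assumption that SQLite truncates integer division the same way SQL Server does for this case; the literals 1 and 4 are arbitrary sample counts.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Integer / integer is truncated before the multiplication happens.
int_result = conn.execute("SELECT 1 / 4 * 100").fetchone()[0]

# Multiplying by 100.0 first promotes the whole expression to a real number.
dec_result = conn.execute("SELECT 1 * 100.0 / 4").fetchone()[0]

print(int_result, dec_result)  # 0 25.0
conn.close()
```

Note that the order matters as well: multiplying before dividing keeps the intermediate value from being truncated to 0.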

find the maximum in a column, but only when two other columns match

I need help in PostgreSQL.
I have two tables
Prediction - predicts future disasters and casualties for each city.
Measures - matches damage control providers to each type of disaster (incl. cost and percent of "averted casualties").
Each disaster and provider combination has an amount of averted casualties (the percent from measures * amount of predicted casualties for that disaster*0.01).
For each combination of city and disaster, I need to find two providers that
1) their combined cost is less than a million
2) have the biggest amount of combined averted casualties.
My work and product so far
select o1.cname, o1.etype, o1.provider as provider1, o2.provider as provider2, (o1.averted + o2.averted) averted_casualties
from (select cname, m.etype, provider, mcost, (percent*Casualties*0.01)averted
from measures m, prediction p
where (m.etype = p.etype)) as o1, (select cname, m.etype, provider, mcost, (percent*Casualties*0.01)averted
from measures m, prediction p
where (m.etype = p.etype)) as o2
where (o1.cname = o2.cname) and (o1.etype = o2.etype) and (o1.provider < o2.provider) and (o1.mcost + o2.mcost < 1000000)
How do I change this query so it will show me the best averted_casualties for each city/disaster combo (not just the max of the whole table, but the max for each combo)?
This is the desired outcome:
P.S. I'm not allowed to use ordering, views or functions.
First, construct all pairs of providers and do the casualty and cost calculation:
select p.*, m1.provider as provider_1, m2.provider as provider_2,
p.casualties * (1 - m1.percent / 100.0) * (1 - m2.percent / 100.0) as net_casualties,
(m1.mcost + m2.mcost) as total_cost
from measures m1 join
measures m2
on m1.etype = m2.etype and m1.provider < m2.provider join
prediction p
on m1.etype = p.etype;
Then, apply your conditions. Normally, you would use window functions, but since ordering isn't allowed for this exercise, you want to use a subquery:
with pairs as (
select p.*, m1.provider as provider_1, m2.provider as provider_2,
p.casualties * (1 - m1.percent / 100.0) * (1 - m2.percent / 100.0) as net_casualties,
(m1.mcost + m2.mcost) as total_cost
from measures m1 join
measures m2
on m1.etype = m2.etype and m1.provider < m2.provider join
prediction p
on m1.etype = p.etype
)
select p.*
from pairs p
where p.total_cost < 1000000 and
p.net_casualties = (select min(p2.net_casualties)
from pairs p2
where p2.cname = p.cname and p2.etype = p.etype and
p2.total_cost < 1000000
);
The biggest number of averted casualties results in the smallest number of net casualties. They are the same thing.
As for your attempted solution: just seeing the , in the from clause tells me that you need to study up on join. Simple rule: Never use commas in the from clause. Always use proper, explicit, standard join syntax.
Your repeated subqueries also suggest that you need to learn about CTEs.
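To make the "best pair per city/disaster" logic concrete, here is a small Python sketch of the same selection. The sample rows and the function name are hypothetical; the net-casualties formula and the budget of one million follow the queries above. Unlike the SQL, which returns every tied pair, this sketch keeps only the first pair it finds with the minimum.

```python
from itertools import combinations

# Hypothetical sample rows; column names follow the question.
measures = [  # (etype, provider, mcost, percent)
    ("flood", "A", 400_000, 20),
    ("flood", "B", 500_000, 30),
    ("flood", "C", 700_000, 50),
    ("flood", "D", 300_000, 10),
]
predictions = [("Springfield", "flood", 1000)]  # (cname, etype, casualties)

def best_pairs(measures, predictions, budget=1_000_000):
    """Map (cname, etype) -> (provider_1, provider_2, net_casualties)."""
    best = {}
    for cname, etype, casualties in predictions:
        providers = [m for m in measures if m[0] == etype]
        # combinations() yields each unordered pair once, like provider_1 < provider_2.
        for (_, p1, c1, pct1), (_, p2, c2, pct2) in combinations(providers, 2):
            if c1 + c2 >= budget:
                continue  # combined cost must be under the budget
            net = casualties * (1 - pct1 / 100.0) * (1 - pct2 / 100.0)
            key = (cname, etype)
            if key not in best or net < best[key][2]:
                best[key] = (p1, p2, net)
    return best

print(best_pairs(measures, predictions))
```

With this sample data the pairs involving C are all at or over budget, so the choice is between A+B, A+D and B+D.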

SQL Query to find 5 mile radius of a given zip code

I'm currently using Redshift and I have a table that has a list of zip codes along with their latitudes and longitudes. I'm trying to write a sql statement where I can specify a given zip code and have it return all of the zip codes within a 5 mile radius.
Any ideas on how I can approach this?
Here's what I had tried:
SELECT zip, city, latitude, longitude,
69.0* DEGREES(ACOS(COS(RADIANS(latpoint))
* COS(RADIANS(latitude))
* COS(RADIANS(longpoint) - RADIANS(longitude))
+ SIN(RADIANS(latpoint))
* SIN(RADIANS(latitude)))) AS distance_in_miles
FROM zip_code_db
JOIN (
SELECT 42.81 AS latpoint, -70.81 AS longpoint
) AS p ON 1=1
ORDER BY distance_in_miles
I'm trying to see if there is a way to use zip codes instead of specifying latitudes and longitudes.
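One way to think about it is as two steps: look up the latitude and longitude of the chosen zip code, then run the same distance formula against every row. The Python sketch below illustrates that idea only; the zip codes and coordinates in ZIP_COORDS are made-up stand-ins for zip_code_db, and the function names are hypothetical. In SQL the equivalent would presumably be joining zip_code_db to itself, with one side restricted to the chosen zip, in place of the hard-coded latpoint/longpoint subquery.

```python
import math

# Hypothetical rows standing in for zip_code_db: zip -> (latitude, longitude).
ZIP_COORDS = {
    "01950": (42.81, -70.88),
    "01951": (42.78, -70.85),
    "02108": (42.36, -71.06),
}

def distance_in_miles(lat1, lon1, lat2, lon2):
    """Spherical law of cosines, scaled by 69.0 miles per degree as in the query."""
    cos_angle = (
        math.cos(math.radians(lat1)) * math.cos(math.radians(lat2))
        * math.cos(math.radians(lon1) - math.radians(lon2))
        + math.sin(math.radians(lat1)) * math.sin(math.radians(lat2))
    )
    # Clamp to [-1, 1]: rounding can push the value just outside acos's domain.
    return 69.0 * math.degrees(math.acos(max(-1.0, min(1.0, cos_angle))))

def zips_within(center_zip, radius_miles=5.0):
    """Return (zip, distance) pairs within the radius, nearest first."""
    lat0, lon0 = ZIP_COORDS[center_zip]
    hits = [
        (z, distance_in_miles(lat0, lon0, lat, lon))
        for z, (lat, lon) in ZIP_COORDS.items()
    ]
    return sorted((h for h in hits if h[1] <= radius_miles), key=lambda h: h[1])

print(zips_within("01950"))
```

The center zip itself appears in the result with a distance of approximately zero; filter it out if that is not wanted.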

SQL. Calculation correlations of asset classes

I have a database with 101 simulations for, let's say, 5 different asset classes' returns.
I need to write a query that will calculate the respective correlations between each of the 5 classes. Table will look something like this:
AssetClass_ID | Simulation | AssetClass_Value
Any ideas? I am struggling to get even close.
(Depending on difficulty I may end up having to tell the end user to just download all the simulations and do the stats using inbuilt EXCEL functions, but I am unlikely to be popular for doing so)
Ok, with some google and some work I came up with:
SELECT
AssetID_1, AssetID_2,
((psum - (sum1 * sum2 / n)) / sqrt((sum1sq - sum1*sum1 / n) * (sum2sq - sum2*sum2 / n))) AS [Correlation Coefficient],
n
FROM
(SELECT
n1.AssetClass_ID AS AssetID_1,
n2.AssetClass_ID AS AssetID_2,
SUM(n1.RunResults_Value) AS sum1,
SUM(n2.RunResults_Value) AS sum2,
SUM(n1.RunResults_Value * n1.RunResults_Value) AS sum1sq,
SUM(n2.RunResults_Value * n2.RunResults_Value) AS sum2sq,
SUM(n1.RunResults_Value * n2.RunResults_Value) AS psum,
COUNT(*) AS n
FROM
dbo.tbl_RunResults AS n1
LEFT JOIN dbo.tbl_RunResults AS n2 ON n1.Simulation_ID = n2.Simulation_ID
WHERE
n1.AssetClass_ID < n2.AssetClass_ID AND
n1.series_ID = 2332 AND
n2.series_ID = 2332
GROUP BY
n1.AssetClass_ID, n2.AssetClass_ID) AS step1
ORDER BY
AssetID_1
The answers match Excel's inbuilt functions, so far so good.
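For anyone wanting to verify the arithmetic without Excel, here is a minimal Python sketch of the same sums-based Pearson formula. The variable names mirror the column aliases in the query; the return series are made up for illustration.

```python
import math

def pearson_from_sums(xs, ys):
    """Pearson correlation computed from running sums, as in the SQL above."""
    n = len(xs)
    sum1, sum2 = sum(xs), sum(ys)
    sum1sq = sum(x * x for x in xs)
    sum2sq = sum(y * y for y in ys)
    psum = sum(x * y for x, y in zip(xs, ys))
    num = psum - (sum1 * sum2 / n)
    # The denominator is zero if either series is constant; not handled here.
    den = math.sqrt((sum1sq - sum1 * sum1 / n) * (sum2sq - sum2 * sum2 / n))
    return num / den

# Made-up simulated returns for three asset classes.
a = [0.01, 0.03, -0.02, 0.05, 0.00]
b = [2 * x for x in a]   # moves exactly with a
c = [-x for x in a]      # moves exactly against a

print(pearson_from_sums(a, b), pearson_from_sums(a, c))
```

The two printed values should be very close to 1 and -1 respectively, which is a cheap check that the formula is wired up correctly.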