SQL Query count unique based on Latitude Longitude radius - google-bigquery

Hi, I have a table that contains register_date, customer_id, and lat_long (formatted as -6.2134,106.783876). I want to count the unique customer IDs that registered between 21 and 23 January 2022 and are within a 500 m radius of longitude 106.8486583 and latitude -6.14462529. How do I do this in Google BigQuery?
Currently I am stuck at this:
WITH params AS (
SELECT ST_GeogPoint(106.8486583, -6.14462529) AS center,
0.5 AS maxdist_km -- 500 m radius, as described above
),
distance_from_center AS (
SELECT
customer_id,
register_date,
ST_GeogPoint(CAST(SPLIT(lat_long,",")[OFFSET(1)] AS float64), CAST(SPLIT(lat_long,",")[OFFSET(0)] AS float64)) AS loc,
ST_Distance(ST_GeogPoint(CAST(SPLIT(lat_long,",")[OFFSET(1)] AS float64), CAST(SPLIT(lat_long,",")[OFFSET(0)] AS float64)), params.center) AS dist_meters
FROM
`lat long data`,
params
WHERE ST_DWithin(ST_GeogPoint(CAST(SPLIT(lat_long,",")[OFFSET(1)] AS float64), CAST(SPLIT(lat_long,",")[OFFSET(0)] AS float64)), params.center, params.maxdist_km*1000)
)
SELECT
*
FROM distance_from_center
WHERE
date(register_date)
BETWEEN
'2022-01-21'
AND
'2022-01-23'
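As a way to sanity-check the logic outside BigQuery, here is a minimal Python sketch of the same filter: parse lat_long, keep rows inside the date range and radius, and count distinct customers. The sample rows, the helper names, and the Earth radius constant are assumptions made for illustration; they are not taken from the question's table.

```python
from datetime import date
from math import asin, cos, radians, sin, sqrt

EARTH_RADIUS_M = 6_371_008.8  # assumed mean Earth radius in meters


def haversine_m(lat1, lon1, lat2, lon2):
    """Great-circle distance in meters between two latitude/longitude points."""
    phi1, phi2 = radians(lat1), radians(lat2)
    dphi = phi2 - phi1
    dlmb = radians(lon2 - lon1)
    a = sin(dphi / 2) ** 2 + cos(phi1) * cos(phi2) * sin(dlmb / 2) ** 2
    return 2 * EARTH_RADIUS_M * asin(sqrt(a))


def count_unique_customers(rows, center_lat, center_lon, radius_m, start, end):
    """Count distinct customer_id values inside the radius and the date range."""
    seen = set()
    for customer_id, register_date, lat_long in rows:
        # lat_long is stored as "latitude,longitude"
        lat_text, lon_text = lat_long.split(",")
        dist = haversine_m(float(lat_text), float(lon_text), center_lat, center_lon)
        if start <= register_date <= end and dist <= radius_m:
            seen.add(customer_id)
    return len(seen)


# Hypothetical sample rows: (customer_id, register_date, lat_long)
rows = [
    ("c1", date(2022, 1, 21), "-6.14462529,106.8486583"),  # at the center
    ("c1", date(2022, 1, 22), "-6.1450,106.8490"),         # same customer, nearby
    ("c2", date(2022, 1, 23), "-6.1470,106.8500"),         # roughly 300 m away
    ("c3", date(2022, 1, 22), "-6.2134,106.783876"),       # several km away
    ("c4", date(2022, 1, 25), "-6.14462529,106.8486583"),  # outside the date range
]

result = count_unique_customers(
    rows, -6.14462529, 106.8486583, 500, date(2022, 1, 21), date(2022, 1, 23)
)
print(result)
```

The set plays the role that a distinct count over customer_id would play in SQL: customer c1 registers twice inside the area but is counted once.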

Related

SQL: Calculate distance between two points using coordinates

I've got a table with lat and lng coordinates, and need to add the distance as a new column called 'distance' in BigQuery.
table
start_lat  end_lat  start_lng  end_lng
41.8964    41.9322  -87.661    -87.6586
41.9244    41.9306  -87.7154   -87.7238
41.903     41.8992  -87.6975   -87.6722
I haven't a clue how to do it. I saw some examples, but simply couldn't apply them to this case.
Any tips?
The ST_DISTANCE function will calculate the distance (in meters) between 2 points.
with my_data as (
select 1 as trip_id, 41.8964 as start_lat, 41.9322 as end_lat, -87.661 as start_lng, -87.6586 as end_lng union all
select 2, 41., 41.9306, -87.7154, -87.7238
)
select trip_id,
ST_DISTANCE(ST_GEOGPOINT(start_lng, start_lat), ST_GEOGPOINT(end_lng, end_lat)) as distance_in_meters
from my_data
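Output (the second sample row uses 41. as start_lat instead of the 41.9244 from the question, which is why its distance is about 103 km):
trip_id  distance_in_meters
1        3985.735019583467
2        103480.52812005761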
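To see where these numbers come from, the following Python sketch recomputes both distances with the haversine formula on a sphere. The Earth radius value and the function name are assumptions for illustration, so the results should land close to the ST_DISTANCE output rather than match it digit for digit.

```python
from math import asin, cos, radians, sin, sqrt

EARTH_RADIUS_M = 6_371_008.8  # assumed mean Earth radius in meters


def haversine_m(lat1, lng1, lat2, lng2):
    """Great-circle distance in meters between two latitude/longitude points."""
    phi1, phi2 = radians(lat1), radians(lat2)
    dphi = phi2 - phi1
    dlmb = radians(lng2 - lng1)
    a = sin(dphi / 2) ** 2 + cos(phi1) * cos(phi2) * sin(dlmb / 2) ** 2
    return 2 * EARTH_RADIUS_M * asin(sqrt(a))


# Same sample rows as the SQL above: (trip_id, start_lat, end_lat, start_lng, end_lng)
trips = [
    (1, 41.8964, 41.9322, -87.661, -87.6586),
    (2, 41.0, 41.9306, -87.7154, -87.7238),  # 41. exactly as written in the sample row
]

distances = {
    trip_id: haversine_m(start_lat, start_lng, end_lat, end_lng)
    for trip_id, start_lat, end_lat, start_lng, end_lng in trips
}

for trip_id, meters in distances.items():
    print(trip_id, round(meters, 1))
```

Note the argument order: the Python helper takes latitude first, while ST_GEOGPOINT takes longitude first.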

Bigquery - Sum Product of multiple value within two columns

I have this value
I want to calculate a new column, which will add the product of multiplication from ticket_units_count and price, so it must be:
5 * 33104.0 + 4 * 23449.0 = 259316
How to do that in bigquery?
I tried this one
SELECT
SUM(CAST(price AS FLOAT64) * CAST(ticket_units_count AS INT64))
FROM table
But it shows this error: Bad double value: 33104.0;23449.0
Need your help to specify the query to get the expected result
Consider below approach
select *,
( select sum(cast(_count as int64) * cast(_price as float64))
from unnest(split(ticket_units_count, ';')) _count with offset
join unnest(split(price, ';')) _price with offset
using (offset)
) as total
from your_table
If applied to the sample data in your question, the total column should match the expected 259316.
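The pairing by offset in the query above can be sketched in Python as follows. The input "5;4" for ticket_units_count is an assumption inferred from the expected calculation in the question, and the function name is hypothetical.

```python
def sum_product(counts_text, prices_text, sep=";"):
    """Multiply the i-th count by the i-th price and sum the products."""
    counts = counts_text.split(sep)
    prices = prices_text.split(sep)
    if len(counts) != len(prices):
        # Design choice of this sketch: fail loudly on mismatched lengths.
        raise ValueError("counts and prices must have the same number of items")
    return sum(int(count) * float(price) for count, price in zip(counts, prices))


total = sum_product("5;4", "33104.0;23449.0")
print(total)  # 259316.0
```

The original error appears because the whole string 33104.0;23449.0 was cast to FLOAT64 at once; splitting first, then casting each piece, avoids that.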

how to remove coordinates from geojson with less than 4 values

As the title says, I am running a query on bikesharing data stored in BigQuery.
I am able to extract the data and arrange it in the correct order to be displayed in a path chart. In the data, there are routes whose coordinates contain only a start and end long and lat, or sometimes only a start long and lat. How do I remove anything with less than 4 points?
This is the code. I am also limited to SELECT statements only.
SELECT
routeID ,
json_extract(st_asgeojson(st_makeline( array_agg(st_geogpoint(locs.lon, locs.lat) order by locs.date))),'$.coordinates') as geo
FROM
howardcounty.routebatches
cross join UNNEST(locations) as locs
where unlockedAt between {{start_date}} and {{end_date}}
GROUP BY routeID
order by routeID
limit 10
have also included a screen shot for clarity
To apply a condition after a GROUP BY, use a HAVING clause. For a simple condition (are there at least two data points for the route?), this query can be used:
With dummy as (
Select 1 as routeID, [struct(current_timestamp() as date, 1 as lon, 2 as lat),struct(current_timestamp() as date, 3 as lon, 4 as lat)] as locations
Union all select 2 as routeID, [struct(current_timestamp() as date, 10 as lon, 20 as lat)]
)
SELECT
routeID , count(locs.date) as amountcoord,
json_extract(st_asgeojson(st_makeline( array_agg(st_geogpoint(locs.lon, locs.lat) order by locs.date))),'$.coordinates') as geo
FROM
#howardcounty.routebatches
dummy
#where unlockedAt between {{start_date}} and {{end_date}}
cross join UNNEST(locations) as locs
GROUP BY routeID
having count(locs.date)>1
order by routeID
limit 10
For more complex ones, a nested select may do the job:
Select *
from (
--- your code ---
) where length(geo)-length(replace(geo,"]","")) > 1+4
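In your code the JSON is converted to a string. A line with n coordinate pairs contains n+1 closing brackets ], so counting the ] characters and subtracting one for the end of the JSON gives the number of inner arrays. The condition above therefore keeps routes with more than 4 points; adjust the constant for a different threshold.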
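The bracket-counting trick can be checked quickly in Python. This is only an illustration of the counting logic on made-up coordinate strings; the function name is hypothetical.

```python
import json


def count_points(geo):
    """Number of coordinate pairs in a LineString coordinates string."""
    # Each pair closes with one "]", plus one "]" for the outer array.
    return geo.count("]") - 1


two_points = "[[1,2],[3,4]]"
five_points = "[[0,0],[1,1],[2,2],[3,3],[4,4]]"

# Same comparison as the SQL condition: number of "]" > 1 + 4
kept = [geo for geo in (two_points, five_points) if geo.count("]") > 1 + 4]

print(count_points(two_points), count_points(five_points))
print(len(json.loads(five_points)))  # cross-check by actually parsing the JSON
```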

How to split multiple character types into different columns? Using SQL in BigQuery

I have 2 different characters ('|' and ',') in one column in Bigquery. Using SQL standard how do I split a column with the string from these characters below into multiple columns separating by '|' and ',' ?
Inbr | Evermore | In Banner Video, Canary Island | 702B6
Thank you. Here is the code I have so far; how do I apply that with the other columns I need in the table?
SELECT CAST(Date AS DATE) Date,
Data_Source_type,
Data_Source_id,
Campaign,
Data_Source,
Data_Source_name,
Data_Source_type_name,
Ad_legacy__AdWords,
Ad_Group_Name__AdWords,
Ad_Type__AdWords,
SPLIT(Campaign,'|')[safe_ordinal(1)] as Media,
SPLIT(Campaign,'|')[safe_ordinal(2)] as Client,
SPLIT(Campaign,'|')[safe_ordinal(3)] as Market_Type,
SPLIT(Campaign,'|')[safe_ordinal(4)] as Market,
SPLIT(Campaign,'|')[safe_ordinal(5)] as Market_ID,
City__AdWords,
FROM `data.aud_summary`
Consider below (as this is Campaign info - I assume the structure of string in column is consistent across rows and has same number of columns to be extracted)
select * except(key) from (
select to_json_string(t) key, offset, value
from `project.dataset.table` t,
unnest(regexp_extract_all(Campaign, r'[^,|]+')) value with offset
)
pivot(max(value) for offset in (0 as Media, 1 as Client, 2 as Market_Type, 3 as Market, 4 as Code))
If applied to the sample data in your question, the output has one column per extracted part (Media, Client, Market_Type, Market, Code).
> how do I apply that with the other columns I need in the table?
Just add t.* as in the example below:
select * from (
select t.*, offset, value
from `project.dataset.table` t,
unnest(regexp_extract_all(Campaign, r'[^,|]+')) value with offset
)
pivot(max(value) for offset in (0 as Media, 1 as Client, 2 as Market_Type, 3 as Market, 4 as Code))
Use REPLACE to turn each , into | before splitting the column.
WITH
SampleData AS (
SELECT
"Inbr | Evermore | In Banner Video, Canary Island | 702B6" AS DATA )
SELECT
a[safe_ORDINAL(1)] AS Media,
a[safe_ORDINAL(2)] AS Client,
a[safe_ORDINAL(3)] AS Market_Type,
a[safe_ORDINAL(4)] AS Market,
a[safe_ORDINAL(5)] AS Fifth,
FROM (
SELECT
SPLIT(REPLACE(DATA, ",", "|"),'|') AS a
FROM
SampleData)
Result
Media  Client    Market_Type      Market         Fifth
Inbr   Evermore  In Banner Video  Canary Island  702B6
Finally, applied to the whole table:
SELECT
* EXCEPT(a),
a[safe_ORDINAL(1)] AS Media,
a[safe_ORDINAL(2)] AS Client,
a[safe_ORDINAL(3)] AS Market_Type,
a[safe_ORDINAL(4)] AS Market,
a[safe_ORDINAL(5)] AS Fifth,
FROM (
SELECT
CAST(Date AS DATE) Date,
* EXCEPT(Date),
SPLIT(REPLACE(Campaign, ',', '|'),'|') AS a
FROM
`data.aud_summary`)
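Both answers rely on treating , and | as equivalent separators. A small Python sketch of that idea, using the same character class as the regexp_extract_all answer, is shown here. Stripping the surrounding spaces is an addition of this sketch (the SQL above does not trim), and the function and column names are only illustrative.

```python
import re


def split_campaign(campaign):
    """Split on either ',' or '|' and trim whitespace around each part."""
    return [part.strip() for part in re.findall(r"[^,|]+", campaign)]


columns = ["Media", "Client", "Market_Type", "Market", "Code"]
sample = "Inbr | Evermore | In Banner Video, Canary Island | 702B6"

row = dict(zip(columns, split_campaign(sample)))
print(row)
```

In BigQuery, the same trimming could be done by wrapping each extracted value in TRIM.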

SQL Distance calculation between 1 point and any other

This might be a duplicate question, but none of the other answers I found made any sense to me. I am still learning SQL, so I would appreciate it if you would guide me through the process of doing this. Thanks in advance.
So the problem is: I have this table (with more data in it) and I need to get the name of the airport that is the farthest away from Fiumicino airport (that means I only have 1 set of longitude and latitude data), and I have to do it with the distance function.
You can simply run the following SQL query:
SELECT *,
( 3959 * acos( cos( radians(37) ) * cos( radians( lat ) ) * cos( radians( lng ) - radians(-122) ) + sin( radians(37) ) * sin( radians( lat ) ) ) ) AS distance FROM table_name;
Where:
37 is your input latitude
-122 is your input longitude
lat is the table column that contains the airport latitude values
lng is the table column that contains the airport longitude values
To get the distance in kilometers instead of miles, replace 3959 with 6371.
More detailed answer: Creating a store locator
Whatever distance function you are using (simple straight line Pythagorean for short distances, or Great circle formula for anything over a few thousand miles),
Select * from table
where [DistanceFunction]
(Latitude, Longitude, FiumicinoLatitude, FiumicinoLongitude) =
(Select Max([DistanceFunction]
(Latitude, Longitude, FiumicinoLatitude, FiumicinoLongitude))
From table)
If you need to find the airport farthest away from some arbitrary airport (not always Fiumicino), then, assuming #code is the airport code of the arbitrary airport:
Select * from table t
join table r on r.code = #code
where [DistanceFunction]
(t.Latitude, t.Longitude, r.Latitude, r.Longitude) =
(Select Max([DistanceFunction]
(Latitude, Longitude, r.Latitude, r.Longitude))
From table)
SQL SERVER
You will need a function if you are trying to find the furthest one from each airport. But since you said FCO, I did it for FCO.
--temp table for testing
select 'FCO' as code, 'Fiumicino' as name, 'Rome' as city, 'Italy' as country, 41.7851 as latitude, 12.8903 as longitude into #airports
union all
select 'VCE', 'Marco Polo','Venice','Italy',45.5048,12.3396
union all
select 'NAP', 'capodichino','Naples','Italy',40.8830,14.2866
union all
select 'CDG', 'Charles de Gaulle','Paris','France',49.0097,2.5479; --terminate the statement so the WITH below parses
--create a point from your LAT/LON
with cte as(
select
*,
geography::Point(latitude,longitude,4326) as Point --WGS 84 datum
from #airports),
--Get the distance from your airport of interest and all others.
cteDistance as(
select
*,
Point.STDistance((select Point from cte where code = 'FCO')) as MetersToFiuminico
from cte)
--this is the one that's furthest away. Remove the inner join to see them all
select d.*
from
cteDistance d
inner join(select max(MetersToFiuminico) as m from cteDistance where MetersToFiuminico > 0) d2 on d.MetersToFiuminico = d2.m
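For readers who want to check the result outside the database, here is a hedged Python sketch that applies the spherical law of cosines expression from the first answer (with 6371 for kilometers) to the four sample airports from the SQL Server answer. The function name and the clamping step are additions of this sketch.

```python
from math import acos, cos, radians, sin


def distance_km(lat1, lng1, lat2, lng2):
    """Spherical law of cosines, same shape as the SQL expression with 6371."""
    cos_angle = (
        cos(radians(lat1)) * cos(radians(lat2)) * cos(radians(lng2) - radians(lng1))
        + sin(radians(lat1)) * sin(radians(lat2))
    )
    # Clamp to [-1, 1] so floating-point rounding cannot push acos out of range.
    return 6371 * acos(min(1.0, max(-1.0, cos_angle)))


# Sample coordinates copied from the temp table above: code -> (latitude, longitude)
airports = {
    "FCO": (41.7851, 12.8903),
    "VCE": (45.5048, 12.3396),
    "NAP": (40.8830, 14.2866),
    "CDG": (49.0097, 2.5479),
}

origin = airports["FCO"]
farthest = max(
    (code for code in airports if code != "FCO"),
    key=lambda code: distance_km(*origin, *airports[code]),
)
print(farthest)
```

Taking the maximum over all other airports mirrors the MAX subquery and the inner join on the largest distance in the SQL versions.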