Calculate distance in BigQuery - google-bigquery

I'm trying to calculate the distance between sequential points, partitioned by the ID number, in BigQuery.
Here's what my table looks like:
OBJECTID ID DateAndTime Lat Long
1 1 2002-11-26T12:00:00 38.82551095 -109.9709871
2 1 2002-11-29T13:00:00 38.541137 -109.677575
3 2 2002-11-03T10:00:00 38.550676 -109.901774
4 2 2002-11-04T10:00:00 38.53689 -109.683531
5 2 2002-11-05T10:00:00 38.45689 -109.683531
Based on the above table, I want the query to calculate the distance between ObjectID 1 & 2, then between ObjectID 3 & 4, and then between 4 & 5.
Here's a query I've started that orders by DateAndTime and finds the time difference. In this query I was trying to find time differences of more than 12 hours. Is the logic similar? How can I calculate distances between sequential points in BigQuery?
SELECT *,
DATETIME_DIFF( prev_DateAndTime, DateAndTime, hour) as diff_hours
FROM
(SELECT points.ID, points.DateAndTime,
LAG(DateAndTime) OVER (PARTITION BY points.ID ORDER BY points.DateAndTime) as prev_DateAndTime
FROM `table1` AS table1 INNER JOIN
`table2` AS points ON table1.ID = points.ID
WHERE
(points.DateAndTime BETWEEN table1.BeginDate AND COALESCE (table1.EndDate, CURRENT_DATE() + 1))
And points.DateAndTime between '2020-12-01T00:00:00' and CURRENT_DATE()
) d
WHERE
DATETIME_DIFF(prev_DateAndTime, DateAndTime, hour) > 12

Below is an example for BigQuery Standard SQL:
#standardSQL
with `project.dataset.table` as (
select 1 objectid, 1 id, timestamp '2002-11-26T12:00:00' DateAndTime, 38.82551095 lat, -109.9709871 long union all
select 2, 1, '2002-11-29T13:00:00', 38.541137, -109.677575 union all
select 3, 2, '2002-11-03T10:00:00', 38.550676, -109.901774 union all
select 4, 2, '2002-11-04T10:00:00', 38.53689, -109.683531 union all
select 5, 2, '2002-11-05T10:00:00', 38.45689, -109.683531
)
select *,
objectid as objectid_start,
lead(objectid) over next as objectid_next,
round(st_distance(st_geogpoint(long, lat), lead(st_geogpoint(long, lat)) over next), 2) as distance
from `project.dataset.table`
window next as (partition by id order by DateAndTime)
-- order by id, DateAndTime
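ST_DISTANCE returns metres, and the same window logic also works with LAG, which is closer to the query in the question. Here is a minimal sketch, assuming the sample rows from the question; the CTE names points and steps and the columns prev_objectid, distance_m and distance_km are made up for illustration:

```sql
-- Sketch for BigQuery Standard SQL; `points` stands in for the real table.
with points as (
  select 1 as objectid, 1 as id, timestamp '2002-11-26T12:00:00' as DateAndTime, 38.82551095 as lat, -109.9709871 as long union all
  select 2, 1, timestamp '2002-11-29T13:00:00', 38.541137, -109.677575 union all
  select 3, 2, timestamp '2002-11-03T10:00:00', 38.550676, -109.901774 union all
  select 4, 2, timestamp '2002-11-04T10:00:00', 38.53689, -109.683531 union all
  select 5, 2, timestamp '2002-11-05T10:00:00', 38.45689, -109.683531
),
steps as (
  select
    id,
    lag(objectid) over prev as prev_objectid,
    objectid,
    -- st_geogpoint takes (longitude, latitude); st_distance returns metres
    st_distance(st_geogpoint(long, lat), lag(st_geogpoint(long, lat)) over prev) as distance_m
  from points
  window prev as (partition by id order by DateAndTime)
)
select id, prev_objectid, objectid, round(distance_m / 1000, 3) as distance_km
from steps
where prev_objectid is not null  -- the first point of each id has no predecessor
order by id, objectid
```

With the sample rows this should return three rows, for the pairs 1 and 2, 3 and 4, and 4 and 5.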

Related

Non duplicate records with max date query on oracle

Hello, I have a problem with a simple query. I need to see the max date of some articles in two different sites.
This is my current query:
SELECT a.aa_codart, MAX(t.tr_fechafac), t.tr_tipo
FROM ARTALM a, traspaso t
WHERE t.tr_codart = a.aa_codart
and t.tr_tipomov > 1
and a.aa_codalm = '1'
and (t.tr_tipo >= 1 and t.tr_tipo <=2)
group by a.aa_codart, t.tr_tipo;
And the result:
01..FRB10X80 30/11/07 2
01..FRB10X80 08/03/01 1
01.32122RS 05/02/16 1
01.32122RS 02/07/10 2
01.33052Z 21/09/15 1
01.60042Z 24/02/16 2
I want to see only one row per article. For example, instead of the first two rows there should be only one, like this:
01..FRB10X80 30/11/07 2
01.32122RS 05/02/16 1
01.33052Z 21/09/15 1
01.60042Z 24/02/16 2
Taking the max date
Thanks
This calls for an analytical query. This query shows how the ROW_NUMBER() function will assign the value 1 to the row with the article's most recent date. Give it a try first to help understand the final query, coming up next:
SELECT
a.aa_codart,
t.tr_fechafac,
t.tr_tipo,
ROW_NUMBER() OVER (PARTITION BY a.aa_codart ORDER BY t.tr_fechafac DESC) as rnk
FROM artalm a
INNER JOIN traspaso t ON a.aa_codart = t.tr_codart
WHERE t.tr_tipomov > 1
AND a.aa_codalm = '1'
AND t.tr_tipo BETWEEN 1 AND 2
You can't filter on the rnk column in the same query's WHERE clause, because analytic functions are calculated after the WHERE clause is applied. You can get around this using a nested query:
SELECT * FROM (
SELECT
a.aa_codart,
t.tr_fechafac,
t.tr_tipo,
ROW_NUMBER() OVER (PARTITION BY a.aa_codart ORDER BY t.tr_fechafac DESC) as rnk
FROM artalm a
INNER JOIN traspaso t ON a.aa_codart = t.tr_codart
WHERE t.tr_tipomov > 1
AND a.aa_codalm = '1'
AND t.tr_tipo BETWEEN 1 AND 2
) WHERE rnk = 1;
I apologize in advance for any column names I may have retyped badly. The Oracle syntax should be fine; the column names maybe not so much :)
I think you may want to look at row_number() and then just pick the rows where it equals 1, something like this:
WITH t
AS (SELECT 'A' aa_codart,
TO_DATE ('17/05/00', 'dd/mm/yy') mydt,
1 tr_tipo
FROM DUAL
UNION ALL
SELECT 'A', TO_DATE ('12/04/00', 'dd/mm/yy'), 2 FROM DUAL
UNION ALL
SELECT 'B', TO_DATE ('30/06/98', 'dd/mm/yy'), 2 FROM DUAL
UNION ALL
SELECT 'C', TO_DATE ('30/06/98', 'dd/mm/yy'), 2 FROM DUAL),
t2
AS (SELECT aa_codart,
mydt,
tr_tipo,
ROW_NUMBER ()
OVER (PARTITION BY aa_codart ORDER BY mydt DESC)
rn
FROM t)
SELECT *
FROM t2
WHERE rn = 1
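Another option in Oracle is to keep the GROUP BY from the question and pick the tr_tipo that belongs to the latest date with KEEP (DENSE_RANK LAST). This is only a sketch against the table and column names from the question; the alias max_fechafac is made up for illustration:

```sql
-- Sketch: one row per article, with the tr_tipo of its most recent tr_fechafac.
SELECT a.aa_codart,
       MAX(t.tr_fechafac) AS max_fechafac,
       -- among the rows sharing the latest date, MAX() breaks the tie on tr_tipo
       MAX(t.tr_tipo) KEEP (DENSE_RANK LAST ORDER BY t.tr_fechafac) AS tr_tipo
FROM artalm a
INNER JOIN traspaso t ON t.tr_codart = a.aa_codart
WHERE t.tr_tipomov > 1
  AND a.aa_codalm = '1'
  AND t.tr_tipo BETWEEN 1 AND 2
GROUP BY a.aa_codart;
```

Unlike the ROW_NUMBER() version, this needs no nested query, but every extra column you want from the latest row needs its own KEEP expression.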

Remove + - value records in SQL where clause

I need to remove the records whose + and - values cancel each other out.
That is, I need only the two records highlighted in blue in the output window.
I hope it's clear what exactly I want.
User5 | -15
User6 | -10
The idea is to get the rows whose second column (Val, in my case) is not cancelled out by an opposite value. You can do this by taking the absolute value and assigning a row number partitioned by the absolute value and the value itself. The rows whose row number has no match on the opposite sign are the result.
WITH SampleData(UserID, Val) AS(
SELECT 'User1', -10 UNION ALL
SELECT 'User2', 10 UNION ALL
SELECT 'User3', -15 UNION ALL
SELECT 'User4', -10 UNION ALL
SELECT 'User5', -15 UNION ALL
SELECT 'User6', -10 UNION ALL
SELECT 'User7', 10 UNION ALL
SELECT 'User8', 15
)
,Numbered AS(
SELECT
UserID,
Val,
BaseVal = ABS(Val),
RN = ROW_NUMBER() OVER(PARTITION BY ABS(Val), Val ORDER BY UserId)
FROM SampleData
)
SELECT
n1.UserID,
n1.Val
FROM Numbered n1
LEFT JOIN Numbered n2
ON n2.BaseVal = n1.BaseVal
AND n2.RN = n1.rn
AND n2.UserID <> n1.UserID
WHERE n2.UserID IS NULL
ORDER BY n1.UserID
It appears that you want the rows of users whose total does not equal 0?
select
userName,
userValue
from
yourTable
where
userName in (
select userName from yourTable
group by userName
having sum (userValue) <> 0
)

Obtaining the percentage in SQLite

I made a query with the following statement:
select mood, count(*) * 100 / (select count(*) from entry) from entry group by mood having data>data-30 order by mood asc
mood is an integer from 0 to 2.
The output is:
mood count
0 96,55
1 3,44
Is there a way to add a row with mood 2 and count 0?
SELECT MOOD, SUM (COUNTER) TOTAL
FROM ( SELECT 0 MOOD, 0 COUNTER FROM DUAL
UNION ALL
SELECT 1 MOOD, 0 COUNTER FROM DUAL
UNION ALL
SELECT 2 MOOD, 0 COUNTER FROM DUAL
UNION ALL
SELECT MOOD, COUNT ( * )
* 100.0
/ (SELECT COUNT ( * )
FROM ENTRY
WHERE DATA > DATE ('now', '-30 days'))
FROM (SELECT *
FROM ENTRY
WHERE DATA > DATE ('now', '-30 days'))
GROUP BY MOOD)
GROUP BY MOOD
ORDER BY MOOD ASC;
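You have to enumerate all the possible moods (0, 1, 2, ...), each with a counter of 0.
Then you sum the counters, grouping by mood.
Please note that your condition having data>data-30 compares each value with itself, so it does not restrict the date range.
Instead, select from ENTRY only the records that satisfy a condition such as data > date('now', '-30 days'), assuming data holds ISO-8601 date strings. In SQLite the offset goes in as a modifier, because subtracting 30 from the string returned by date('now') does not give a date.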
SQLite: A VIEW named "dual" that works the same as the Oracle "dual" table can be created as follows: "CREATE VIEW dual AS SELECT 'x' AS dummy;"
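The same idea can be sketched in SQLite without a dual view by listing the moods in a VALUES clause and left-joining the recent entries onto them. The CTE names moods and recent and the alias percentage are made up for illustration, and the sketch assumes that data holds ISO-8601 date strings (YYYY-MM-DD):

```sql
-- Sketch for SQLite: every mood gets a row, even when no entry has it.
WITH moods(mood) AS (VALUES (0), (1), (2)),
recent AS (
  -- assumption: data is stored as 'YYYY-MM-DD' text, so text comparison orders by date
  SELECT mood FROM entry WHERE data > date('now', '-30 days')
)
SELECT m.mood,
       -- COUNT(r.mood) ignores the NULLs produced by the LEFT JOIN, giving 0 for missing moods
       COUNT(r.mood) * 100.0 / (SELECT COUNT(*) FROM recent) AS percentage
FROM moods m
LEFT JOIN recent r ON r.mood = m.mood
GROUP BY m.mood
ORDER BY m.mood;
```

If there are no entries in the last 30 days, the division is by zero, which SQLite evaluates to NULL rather than raising an error.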

How to recursively compute ratio of remaining amounts based on rounded values from preceding rows?

I need to split one amount into two fields. I know the total sums of the resulting fields, which give the ratio for splitting the first row, but I need to round the resulting sums and only then compute the ratio for the next row (so that the total sum of the rounded values will be correct).
How can I write this algorithm in Oracle 10g PL/SQL? I need to test some migrated data. Here is what I came up with so far:
with temp as (
select 1 id, 200 amount, 642 total_a from dual union all
select 2, 200, 642 from dual union all
select 3, 200, 642 from dual union all
select 4, 200, 642 from dual union all
select 5, 200, 642 from dual
)
select
temp2.*,
remaining_a / remaining_amount ratio,
round(amount * remaining_a / remaining_amount, 0) rounded_a,
round(amount - amount * remaining_a / remaining_amount, 0) rounded_b
from (
select
temp.id,
temp.amount,
sum(amount) over (
order by id
range between current row and unbounded following
) remaining_amount,
case when id=1 then total_a /* else ??? */ end remaining_a
from temp
) temp2
Update: the expected rounded_A values are:
1 128
2 129
3 128
4 129
5 128
Here is my suggestion. It does not give exactly what you want: by my calculation the 129 doesn't come until the 3rd row.
The idea is to add more columns. For each row, calculate the estimated split. Then keep track of the cumulative fractional remainder. When the cumulative remainder passes an integer, bump up the A amount by 1. Once you have the A amount, you can calculate the rest:
WITH temp AS (
SELECT 1 id, 200 amount, 642 total_a FROM dual UNION ALL
SELECT 2, 200, 642 FROM dual UNION ALL
SELECT 3, 200, 642 FROM dual UNION ALL
SELECT 4, 200, 642 FROM dual UNION ALL
SELECT 5, 200, 642 FROM dual
)
select temp3.*,
sum(estArem) over (order by id) as cumrem,
trunc(estA) + (case when trunc(sum(estArem) over (order by id)) > trunc(- estArem + sum(estArem) over (order by id))
then 1 else 0 end)
from (SELECT temp2.*,
trunc(Aratio*amount) as estA,
Aratio*amount - trunc(ARatio*amount) as estArem
FROM (SELECT temp.id, temp.amount,
sum(amount) over (ORDER BY id range BETWEEN CURRENT ROW AND unbounded following
) remaining_amount,
sum(amount) over (partition by null) as total_amount,
max(total_a) over (partition by null)as maxA,
(max(total_a) over (partition by null) /
sum(amount) over (partition by null)
) as ARatio
FROM temp
) temp2
) temp3
This isn't exactly a partitioning problem. This is an integer approximation problem.
If you are rounding the values rather than truncating them, then you need a slight tweak to the logic:
trunc(estA) + (case when trunc(0.5 + sum(estArem) over (order by id)) > trunc(0.5 - estArem + sum(estArem) over (order by id))
then 1 else 0 end)
The original statement just looks for the cumulative remainder passing an integer threshold. Adding 0.5 on both sides makes it look for the cumulative remainder passing a half-integer instead, which does rounding rather than truncation. With the sample data this gives 128, 129, 128, 129, 128.
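A cumulative-rounding variant can also be written with analytic functions only, which Oracle 10g supports. This is a sketch, not the recursive algorithm from the question: it rounds the running share of total_a and takes the difference between consecutive running totals, so the rounded values add up to total_a when total_a is an integer. The names running, cum_amount and total_amount are made up for illustration. On the sample data it yields 128, 129, 128, 129, 128 for rounded_a, though it is not guaranteed to match the recursive method on every input.

```sql
WITH temp AS (
  SELECT 1 id, 200 amount, 642 total_a FROM dual UNION ALL
  SELECT 2, 200, 642 FROM dual UNION ALL
  SELECT 3, 200, 642 FROM dual UNION ALL
  SELECT 4, 200, 642 FROM dual UNION ALL
  SELECT 5, 200, 642 FROM dual
),
running AS (
  SELECT id, amount, total_a,
         SUM(amount) OVER (ORDER BY id) AS cum_amount,   -- amount up to and including this row
         SUM(amount) OVER ()            AS total_amount  -- amount over all rows
  FROM temp
)
SELECT id, amount,
       -- rounded running share of A after this row, minus the same value before this row
       ROUND(total_a * cum_amount / total_amount)
         - ROUND(total_a * (cum_amount - amount) / total_amount) AS rounded_a,
       -- B is whatever is left of the row's amount
       amount
         - ROUND(total_a * cum_amount / total_amount)
         + ROUND(total_a * (cum_amount - amount) / total_amount) AS rounded_b
FROM running
ORDER BY id;
```

Because each row depends only on two running totals, no row needs the rounded result of the previous row, which is what makes the recursion unnecessary here.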