Running Sum between dates on group by clause - sql

I have the following query which shows the first 3 columns:
select
'Position Date' = todaypositiondate,
'Realized LtD SEK' = round(sum(realizedccy * spotsek), 0),
'Delta Realized SEK' = round(sum(realizedccy * spotsek) -
(SELECT sum(realizedccy*spotsek)
FROM t1
WHERE todaypositiondate = a.todaypositiondate - 1
GROUP BY todaypositiondate), 0)
FROM
t1 AS a
GROUP BY
todaypositiondate
ORDER BY
todaypositiondate DESC
Table:
Date | Realized | Delta | 5 day avg delta
-------------------------------------------------------------------
2016-09-08 | 696 981 323 | 90 526 | 336 611
2016-09-07 | 696 890 797 | 833 731 | 335 232
2016-09-06 | 696 057 066 | 85 576 | 84 467
2016-09-05 | 695 971 490 | 86 390 | 83 086
2016-09-04 | 695 885 100 | 81 434 | 80 849
2016-09-03 | 695 803 666 | 81 434 | 78 806
2016-09-02 | 695 722 231 | 79 679 | 74 500
2016-09-01 | 695 642 553 | 75 305 |
2016-08-31 | 695 567 248 | 68 515 |
How do I create the 5d average of delta realized?
Based on the delta, I tried the following, but it did not work:
select
todaypositiondate,
'30d avg delta' = (select sum(realizedccy * spotsek)
from T1
where todaypositiondate between a.todaypositiondate and a.todaypositiondate -5
group by todaypositiondate)
from
T1 as a
group by
todaypositiondate
order by
todaypositiondate desc

Do not use single quotes for column names. Only use single quotes for string and date literals.
I would write this as:
with t as (
      select todaypositiondate as PositionDate,
             round(sum(realizedccy * spotsek), 0) as RealizedSEK
      from t1
      group by todaypositiondate
     )
select t.*,
       (t.RealizedSEK - t_prev.RealizedSEK) as diff_1,
       (t.RealizedSEK - t_prev5.RealizedSEK) / 5 as avg_diff_5
from t outer apply
     (select top 1 tp.*
      from t tp
      where tp.PositionDate = t.PositionDate - 1
     ) t_prev outer apply
     (select top 1 tp.*
      from t tp
      where tp.PositionDate = t.PositionDate - 5
     ) t_prev5;
Note that the 5 day average difference is simply the most recent value minus the value from 5 days earlier, divided by 5, because the daily deltas in between telescope away.

I already have that kind of formula where I calculate the delta between 2 dates.
It's like this:
Select todaypositiondate,
'D_RealizedSEK' = round(sum(realizedccy*spotsek) -
(SELECT sum(realizedccy*spotsek)
FROM T1
WHERE todaypositiondate = a.todaypositiondate - 1
GROUP BY todaypositiondate),0)
FROM T1 AS a
group by todaypositiondate
Instead of adding 5 of these formulas and just replacing -1 with -2, -3, ..., I would like to find a way to select the sum of realizedccy for each of the previous 5 days, add them together and divide by 5.
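The five correlated subqueries can be avoided entirely with window functions. Below is a minimal sketch run in SQLite (3.25+ is needed for window functions) from Python; the table and column names (`daily`, `position_date`, `realized_sek`) are illustrative, not the asker's schema. One pass computes the day-over-day delta with LAG, a second window averages the current and up to 4 preceding deltas:

```python
import sqlite3

# Sketch of a window-function approach (illustrative schema, not the
# asker's t1). Requires SQLite >= 3.25 for window functions.
con = sqlite3.connect(":memory:")
con.execute("CREATE TABLE daily (position_date TEXT, realized_sek INTEGER)")
con.executemany("INSERT INTO daily VALUES (?, ?)", [
    ("2016-09-01", 695642553),
    ("2016-09-02", 695722231),
    ("2016-09-03", 695803666),
    ("2016-09-04", 695885100),
    ("2016-09-05", 695971490),
    ("2016-09-06", 696057066),
])

query = """
WITH d AS (
    SELECT position_date,
           realized_sek - LAG(realized_sek) OVER (ORDER BY position_date) AS delta
    FROM daily
)
SELECT position_date,
       delta,
       -- average of the current and up to 4 preceding daily deltas
       AVG(delta) OVER (ORDER BY position_date
                        ROWS BETWEEN 4 PRECEDING AND CURRENT ROW) AS avg_delta_5d
FROM d
ORDER BY position_date DESC
"""
for position_date, delta, avg_delta_5d in con.execute(query):
    print(position_date, delta, avg_delta_5d)
```

Note that AVG skips NULLs, so the earliest rows average over fewer than 5 deltas; the same pattern should carry over to SQL Server 2012+ with its window functions.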

Related

Summing column that is grouped - SQL

I have a query:
SELECT
date,
COUNT(o.row_number) FILTER (WHERE o.row_number > 1 AND date_ddr IS NOT NULL AND telephone_number <> 'Anonymous') AS repeat_calls_24h
FROM
(
SELECT
date,
telephone_number,
date_ddr,
ROW_NUMBER() OVER(PARTITION BY ddr.telephone_number ORDER BY ddr.date) row_number
FROM
table_a ddr
) o
GROUP BY 1
Generating the following table:
date       | Repeat calls_24h
-----------|-----------------
17/09/2022 | 182
18/09/2022 | 381
19/09/2022 | 81
20/09/2022 | 24
21/09/2022 | 91
22/09/2022 | 110
23/09/2022 | 231
What can I add to my query to provide a sum of the previous three days, as below?
date       | Repeat calls_24h | Repeat Calls 3d
-----------|------------------|----------------
17/09/2022 | 182              |
18/09/2022 | 381              |
19/09/2022 | 81               | 644
20/09/2022 | 24               | 486
21/09/2022 | 91               | 196
22/09/2022 | 110              | 225
23/09/2022 | 231              | 432
Thanks
We can do it using lag.
select "date"
,"Repeat calls_24h"
,"Repeat calls_24h" + lag("Repeat calls_24h") over(order by "date") + lag("Repeat calls_24h", 2) over(order by "date") as "Repeat Calls 3d"
from t
date       | Repeat calls_24h | Repeat Calls 3d
-----------|------------------|----------------
2022-09-17 | 182              | null
2022-09-18 | 381              | null
2022-09-19 | 81               | 644
2022-09-20 | 24               | 486
2022-09-21 | 91               | 196
2022-09-22 | 110              | 225
2022-09-23 | 231              | 432
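To sanity-check the lag-based query, here is a sketch run through SQLite (3.25+) from Python; the table and column names follow the answer. The first two rows come back NULL because adding a NULL lag yields NULL:

```python
import sqlite3

# Sketch of the lag-based rolling 3-day sum (SQLite >= 3.25); the
# table name t and the quoted column names follow the answer above.
con = sqlite3.connect(":memory:")
con.execute('CREATE TABLE t ("date" TEXT, "Repeat calls_24h" INTEGER)')
con.executemany("INSERT INTO t VALUES (?, ?)", [
    ("2022-09-17", 182), ("2022-09-18", 381), ("2022-09-19", 81),
    ("2022-09-20", 24), ("2022-09-21", 91), ("2022-09-22", 110),
    ("2022-09-23", 231),
])

query = """
SELECT "date",
       "Repeat calls_24h",
       "Repeat calls_24h"
         + LAG("Repeat calls_24h")    OVER (ORDER BY "date")
         + LAG("Repeat calls_24h", 2) OVER (ORDER BY "date") AS "Repeat Calls 3d"
FROM t
ORDER BY "date"
"""
for row in con.execute(query):
    print(row)
```

An equivalent frame-based form, `SUM("Repeat calls_24h") OVER (ORDER BY "date" ROWS BETWEEN 2 PRECEDING AND CURRENT ROW)`, would instead return partial sums (182 and 563) for the first two rows rather than NULL.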

What is the best way to aggregate data for the last 7, 30, 60... days in SQL

Hi, I have a table with the date and the number of views our channel had on that day:
date       | views
-----------|------
03/06/2020 | 5
08/06/2020 | 49
09/06/2020 | 50
10/06/2020 | 1
13/06/2020 | 1
16/06/2020 | 1
17/06/2020 | 102
23/06/2020 | 97
29/06/2020 | 98
07/07/2020 | 2
08/07/2020 | 198
12/07/2020 | 1
14/07/2020 | 168
23/07/2020 | 292
Now we want to see, for each calendar date, the sum of the past 7 and 30 days,
so the result will be
date sum_of_7d sum_of_30d
01/06/2020 0 0
02/06/2020 0 0
03/06/2020 5 5
04/06/2020 5 5
05/06/2020 5 5
06/06/2020 5 5
07/06/2020 5 5
08/06/2020 54 54
09/06/2020 104 104
10/06/2020 100 105
11/06/2020 100 105
12/06/2020 100 105
13/06/2020 101 106
14/06/2020 101 106
15/06/2020 52 106
16/06/2020 53 107
17/06/2020 105 209
18/06/2020 105 209
so I was wondering what is the best SQL I can write in order to get it.
I'm working on Redshift and the actual table (not this example) includes over 40B rows.
I used to do something like this:
select dates_helper.date
, tbl1.cnt
, sum(tbl1.cnt) over (order by date rows between 7 preceding and current row ) as sum_7d
, sum(tbl1.cnt) over (order by date rows between 30 preceding and current row ) as sum_30d
from bi_db.dates_helper
left join tbl1
on tbl1.invite_date = dates_helper.date
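The same idea can be sanity-checked in SQLite (3.25+) from Python; a recursive CTE stands in for the dates_helper table, and all names here are illustrative. One detail worth noting: with one row per calendar day, "past 7 days including today" is a frame of 6 PRECEDING AND CURRENT ROW, since 7 PRECEDING spans 8 rows:

```python
import sqlite3

# Sketch: calendar spine + left join + windowed sum (SQLite >= 3.25).
# A recursive CTE plays the role of the dates_helper table; the views
# table holds only the days that actually had views.
con = sqlite3.connect(":memory:")
con.execute("CREATE TABLE views (d TEXT, cnt INTEGER)")
con.executemany("INSERT INTO views VALUES (?, ?)", [
    ("2020-06-03", 5), ("2020-06-08", 49), ("2020-06-09", 50), ("2020-06-10", 1),
])

query = """
WITH RECURSIVE dates(d) AS (
    SELECT '2020-06-01'
    UNION ALL
    SELECT date(d, '+1 day') FROM dates WHERE d < '2020-06-10'
)
SELECT dates.d,
       COALESCE(v.cnt, 0) AS cnt,
       -- 7-day window = current day plus the 6 preceding calendar days
       SUM(COALESCE(v.cnt, 0)) OVER (ORDER BY dates.d
            ROWS BETWEEN 6 PRECEDING AND CURRENT ROW) AS sum_7d
FROM dates
LEFT JOIN views v ON v.d = dates.d
ORDER BY dates.d
"""
for row in con.execute(query):
    print(row)
```

This reproduces the expected output above (54 on 08/06, 104 on 09/06, 100 on 10/06 once 03/06 falls out of the window); Redshift's frame syntax is the same, only the date-spine generation differs.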

Count only original seconds with Oracle SQL

I have a table with this structure and data, with the start and stop positions of an audio/video. I have to count the original seconds and discard the non-original ones.
E.g.
CUSTOMER_ID ITEM_ID CHAPTER_ID START_POSITION END_POSITION
A 123456 1 6 0 97
B 123456 1 6 97 498
C 123456 1 6 498 678
D 123456 1 6 678 1332
E 123456 1 6 1180 1190
F 123456 1 6 1190 1206
G 123456 1 6 1364 1529
H 123456 1 6 1530 1531
Original Data
Lines "E" and "F" do not represent original seconds, because line "D" starts at 678 and finishes at 1332, so I need to create a new set of lines like this:
CUSTOMER_ID ITEM_ID CHAPTER_ID START_POSITION END_POSITION
A 123456 1 6 0 97
B 123456 1 6 97 498
C 123456 1 6 498 678
D 123456 1 6 678 1332
E 123456 1 6 1364 1529
F 123456 1 6 1530 1531
New Result Set
Can you help me with this?
If I am following you correctly, you can use not exists to filter out rows whose range is contained in the range of another row:
select t.*
from mytable t
where not exists (
select 1
from mytable t1
where
t1.customer_id = t.customer_id
and t1.start_position < t.start_position
and t1.end_position > t.end_position
)
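A quick check of the not exists version in SQLite from Python, with a deliberately simplified schema (one customer, item/chapter columns dropped); rows E and F are the ones filtered out. Note the strict inequalities: a row that merely shares an endpoint with its container is not removed:

```python
import sqlite3

# Sketch of the NOT EXISTS containment filter (simplified schema).
con = sqlite3.connect(":memory:")
con.execute("""CREATE TABLE plays
               (label TEXT, customer_id TEXT, start_pos INT, end_pos INT)""")
con.executemany("INSERT INTO plays VALUES (?, ?, ?, ?)", [
    ("A", "123456", 0, 97),      ("B", "123456", 97, 498),
    ("C", "123456", 498, 678),   ("D", "123456", 678, 1332),
    ("E", "123456", 1180, 1190), ("F", "123456", 1190, 1206),
    ("G", "123456", 1364, 1529), ("H", "123456", 1530, 1531),
])

query = """
SELECT t.*
FROM plays t
WHERE NOT EXISTS (
    SELECT 1
    FROM plays t1
    WHERE t1.customer_id = t.customer_id
      AND t1.start_pos < t.start_pos   -- strictly contained rows
      AND t1.end_pos > t.end_pos       -- are discarded
)
ORDER BY t.start_pos
"""
kept = [row[0] for row in con.execute(query)]
print(kept)
```

E (1180-1190) and F (1190-1206) both fall strictly inside D (678-1332), so only the six original rows survive.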
You can use a self join as follows:
Select distinct t.*
from your_table t
Left Join your_table tt
On t.customer_id = tt.customer_id
And t.item_id = tt.item_id
And t.chapter_id = tt.chapter_id
And t.rowid <> tt.rowid
And t.start_position between tt.start_position and tt.end_position - 1
Where tt.rowid is null

select all rows based on max(cases) and min(date)

I have a table covid, my table looks something like this:
location | date | new_cases | total_deaths | new_deaths
----------------------------------------------------------------
Afghanistan 2020-04-07 38 7 0
Afghanistan 2020-04-08 30 11 4
Afghanistan 2020-04-09 56 14 3
Afghanistan 2020-04-10 61 15 1
Afghanistan 2020-04-11 37 15 0
Afghanistan 2020-04-12 34 18 3
In this case, I want to get each location's rows based on max(new_cases); this is my query:
select a.*
from covid a
join (
select location, max(new_cases) highest_case
from covid
group by location
) b
on a.location = b.location
and a.new_cases = b.highest_case
but I found the same location and max(new_cases) value with different date values; this is the result:
location | date | new_cases | total_deaths | new_deaths
----------------------------------------------------------------
Bhutan 2020-06-08 11 0 0
Bolivia 2020-07-28 2382 2647 64
Bonaire Sint 2020-04-02 2 0 0
Bonaire Sint 2020-07-15 2 0 0
Botswana 2020-07-24 164 1 0
Now, how can I get the values based on min(date)? Please give me advice on fixing this; the output should be like this:
location | date | new_cases | total_deaths | new_deaths
----------------------------------------------------------------
Bhutan 2020-06-08 11 0 0
Bolivia 2020-07-28 2382 2647 64
Bonaire Sint 2020-04-02 2 0 0
Botswana 2020-07-24 164 1 0
Use distinct on:
select distinct on (location) c.*
from covid c
order by location, new_cases desc;
For the minimum date, use:
order by location, date asc;
You can use the window function max() to get max_cases (per location) and then number the rows (to fetch the min date):
select location, date, new_cases, total_deaths, new_deaths from
(
--get the min date among the max-case rows
select row_number() over (partition by location order by date) n, date,
location, new_cases, total_deaths, new_deaths
from
(
select location, date,
max(new_cases) over (partition by location) max_case,
new_cases, total_deaths, new_deaths
from covid --get max_case per location
) X
where new_cases = max_case --fetch only max-case rows
) Y where n = 1
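For engines without distinct on, the two criteria can also be folded into a single row_number ordering (new_cases descending, then date ascending). A sketch in SQLite (3.25+) from Python, seeded with a few rows shaped like the question's data:

```python
import sqlite3

# Sketch: one ROW_NUMBER ordered by new_cases DESC, date ASC picks the
# earliest date among each location's max-case rows (SQLite >= 3.25).
con = sqlite3.connect(":memory:")
con.execute("CREATE TABLE covid (location TEXT, date TEXT, new_cases INT)")
con.executemany("INSERT INTO covid VALUES (?, ?, ?)", [
    ("Bhutan", "2020-06-08", 11),
    ("Bonaire Sint", "2020-04-02", 2),
    ("Bonaire Sint", "2020-07-15", 2),   # tied max, later date: dropped
    ("Bonaire Sint", "2020-03-30", 1),
])

query = """
SELECT location, date, new_cases
FROM (
    SELECT location, date, new_cases,
           ROW_NUMBER() OVER (PARTITION BY location
                              ORDER BY new_cases DESC, date ASC) AS n
    FROM covid
) x
WHERE n = 1
ORDER BY location
"""
for row in con.execute(query):
    print(row)
```

The duplicate Bonaire Sint row with the later date gets n = 2 and is filtered out, matching the desired output.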

Unnest only one duplicate value per row

I have the following table -
ID A1 A2 A3 A4 A5 A6
1 324 243 3432 23423 342 342
2 342 242 4345 23423 324 342
I can unnest this table to give me counts of all numbers like so -
324 2
243 1
3432 1
23423 1
342 3
242 1
4345 1
23423 1
But how do I get it to count numbers in the same row only one time? For example, this is the output I am expecting:
324 2
243 1
3432 1
23423 1
342 2
242 1
4345 1
23423 1
342 is 2 because -
1) It is in the first row.
2) It appears 2 times in the second row, but I only want to count it once.
Simply use count(distinct):
select v.a, count(distinct t.id)
from t cross join lateral
(values (a1), (a2), (a3), (a4), (a5), (a6)
) v(a)
group by v.a;
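cross join lateral (values ...) is Postgres syntax; in an engine without lateral, the same per-row unpivot can be spelled with UNION ALL. A sketch in SQLite from Python, where count(distinct id) still collapses the duplicate 342 in row 2:

```python
import sqlite3

# Sketch: unpivot via UNION ALL instead of LATERAL VALUES, then
# COUNT(DISTINCT id) counts each value at most once per row.
con = sqlite3.connect(":memory:")
con.execute("CREATE TABLE t (id INT, a1 INT, a2 INT, a3 INT,"
            "                a4 INT, a5 INT, a6 INT)")
con.executemany("INSERT INTO t VALUES (?, ?, ?, ?, ?, ?, ?)", [
    (1, 324, 243, 3432, 23423, 342, 342),
    (2, 342, 242, 4345, 23423, 324, 342),
])

query = """
SELECT a, COUNT(DISTINCT id) AS cnt
FROM (
    SELECT id, a1 AS a FROM t
    UNION ALL SELECT id, a2 FROM t
    UNION ALL SELECT id, a3 FROM t
    UNION ALL SELECT id, a4 FROM t
    UNION ALL SELECT id, a5 FROM t
    UNION ALL SELECT id, a6 FROM t
) v
GROUP BY a
"""
counts = dict(con.execute(query))
print(counts)
```

342 appears three times across the two rows but in only two distinct ids, so its count is 2, as the question requires.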