Select Sum of Grouped Values over Date Range (Window Function) - sql

I have a table of names, dates and numeric values. For each name, I want to know the first date entry and the sum of the numeric values over the first 90 days after that first date.
For example:
name | date | value
--------------------------
Joe | 2020-10-30 | 3
Bob | 2020-12-23 | 5
Joe | 2021-01-03 | 7
Joe | 2021-05-30 | 2
I want a query that returns:
name | min_date | sum_first_90_days
-------------------------------------
Joe | 2020-10-30 | 10
Bob | 2020-12-23 | 5
So far I have
SELECT name, min(date) min_date,
sum(value) over (partition by name
order by date
rows between min(date) and dateadd(day,90,min(date))
) as first_90_days_sum
FROM table
but it's not executing. What's a good approach here? How can I set up a window function to use a dynamic date range for each partition?

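You can use window functions and aggregation. The window function attaches each name's first date to every row, and the outer query filters and sums:
select name, min_date, sum(value) as sum_first_90_days
from (select t.*,
             min(date) over (partition by name) as min_date
      from t
     ) t
where date <= min_date + interval '90 day'
group by name, min_date;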
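To check the approach against the sample data, here is a minimal sketch using Python's built-in sqlite3 module. It assumes SQLite 3.25 or later (for window functions), and it swaps the `+ interval '90 day'` arithmetic for SQLite's `date(..., '+90 days')`, so treat it as an adaptation rather than the exact query for your database.

```python
import sqlite3

# In-memory database holding the sample rows from the question.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE t (name TEXT, date TEXT, value INTEGER)")
conn.executemany(
    "INSERT INTO t VALUES (?, ?, ?)",
    [
        ("Joe", "2020-10-30", 3),
        ("Bob", "2020-12-23", 5),
        ("Joe", "2021-01-03", 7),
        ("Joe", "2021-05-30", 2),
    ],
)

# The window function tags every row with its name's first date;
# the outer query keeps rows within 90 days of it and sums them.
# SQLite dialect: date(min_date, '+90 days') stands in for
# min_date + interval '90 day'.
query = """
SELECT name, min_date, SUM(value) AS sum_first_90_days
FROM (SELECT t.*,
             MIN(date) OVER (PARTITION BY name) AS min_date
      FROM t) AS s
WHERE date <= date(min_date, '+90 days')
GROUP BY name, min_date
ORDER BY min_date
"""
rows = conn.execute(query).fetchall()
for row in rows:
    print(row)
```

Joe's 2021-05-30 row falls outside the 90-day range, so only his first two values are summed.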

Related

How to calculate average number of actions in selected month per client in Teradata SQL?

I have a table of transactions in Teradata SQL like the one below:
ID | trans_date
-------------------
123 | 2021-09-15
456 | 2021-10-20
777 | 2021-11-02
890 | 2021-02-14
... | ...
I need to calculate the average number of transactions made per client in months 09, 10 and 11, so the result should look something like this:
Month | Avg_num_trx
--------------------------------------------------------
09 | *average number of transactions per client in month 09*
10 | *average number of transactions per client in month 10*
11 | *average number of transactions per client in month 11*
How can I do that in Teradata SQL?
I'm not that familiar with Teradata, but you could probably start by extracting the month from trans_date, then grouping by id and month and counting the rows. From there you can group by month and take avg(id_count). Something like this -
WITH extraction AS (
    SELECT
        ID,
        EXTRACT (MONTH FROM trans_date) AS MM
    FROM your_table),
id_counter AS (
    SELECT
        ID,
        MM,
        COUNT(ID) AS id_count
    FROM extraction
    GROUP BY ID, MM)
SELECT
    MM,
    AVG(id_count) AS Avg_num_trx
FROM id_counter
GROUP BY MM
ORDER BY MM;
The first CTE grabs month from trans_date.
The second CTE groups ID and month with count(ID) - should give you the total actions in that month for that client ID as id_count.
The final table gets the average of id_count grouped by month, which should be the average interactions per client for the period.
If EXTRACT doesn't work for some reason you could also try STRTOK(trans_date, '-', 2).
Other potential methods to replace -
--current
EXTRACT (MONTH FROM trans_date) AS MM
--option 1
STRTOK(trans_date, '-', 2) AS MM
--option 2
LEFT(RIGHT(trans_date, 5),2) AS MM
Above reworked as subqueries - should help with debugging -
SELECT
    MM,
    AVG(id_count) AS Avg_num_trx
FROM (SELECT
          ID,
          MM,
          COUNT(ID) AS id_count
      FROM (SELECT
                ID,
                EXTRACT (MONTH FROM trans_date) AS MM
            FROM your_table) AS a
      GROUP BY ID, MM) AS b
GROUP BY MM
ORDER BY MM;
This will return the expected answer:
SELECT
Extract (MONTH From trans_date) AS MM,
Cast(Count(*) AS FLOAT) / Count(DISTINCT id)
FROM my_table
GROUP BY MM
Compare to @procopypaster's answer to see which one is more efficient for your data.
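To see that the two approaches above agree, here is a rough sketch using Python's built-in sqlite3 module rather than Teradata. The table name `trx`, the sample rows, and the use of `strftime('%m', ...)` in place of `EXTRACT(MONTH FROM ...)` are all assumptions made for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE trx (id INTEGER, trans_date TEXT)")
# Hypothetical sample rows: several transactions per client per month.
conn.executemany(
    "INSERT INTO trx VALUES (?, ?)",
    [
        (123, "2021-09-15"), (123, "2021-09-20"), (456, "2021-09-01"),
        (456, "2021-10-20"),
        (777, "2021-11-02"), (777, "2021-11-05"), (777, "2021-11-09"),
        (890, "2021-02-14"),
    ],
)

# Approach 1: count per (client, month), then average the counts per month.
cte_query = """
WITH extraction AS (
    SELECT id, strftime('%m', trans_date) AS mm FROM trx),
id_counter AS (
    SELECT id, mm, COUNT(*) AS id_count
    FROM extraction
    GROUP BY id, mm)
SELECT mm, AVG(id_count) AS avg_num_trx
FROM id_counter
WHERE mm IN ('09', '10', '11')
GROUP BY mm
ORDER BY mm
"""

# Approach 2: total transactions divided by distinct clients per month.
ratio_query = """
SELECT strftime('%m', trans_date) AS mm,
       COUNT(*) * 1.0 / COUNT(DISTINCT id) AS avg_num_trx
FROM trx
WHERE strftime('%m', trans_date) IN ('09', '10', '11')
GROUP BY mm
ORDER BY mm
"""

cte_rows = conn.execute(cte_query).fetchall()
ratio_rows = conn.execute(ratio_query).fetchall()
print(cte_rows)
print(ratio_rows)
```

Both queries average only over clients that have at least one transaction in the month; clients with none are not counted as zeros.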

Max value by ID, date and last x days

Suppose I have a table:
---------------
id | date | value
------------------
1 | Jan 1 | 10
1 | Jan 2 | 12
1 | Jan 3 | 11
2 | Jan 4 | 11
I need to get the max and median value for each id and each date, over the past 90 days. I'm using this query:
select id, date, value,
max(value) over (partition by id, date) as max_date,
median(value) over (partition by id, date) as med_date
from table
where date > date - interval '90 days'
I exported the data and checked it manually, but the result is not correct. Is there anything I missed? Thanks.
The expected output is the maximum value over the trailing 90 days. For example, if the date is April 5th, it should find the maximum value from Jan 5th (90 days back) until April 5th. When the date moves to April 6th, it should do the same for Jan 6th until April 6th, and so on for each ID.
So I'm assuming you can get several values for the same ID and date, right? Otherwise partitioning by both id and date makes no sense.
SELECT id, date, max(value), avg(value) from table where date > current_date - interval '90 days'
group by id, date
'group by' does the partitioning
Why are you using window functions? This seems to do what you describe:
select id,
       max(value) as max_date,
       percentile_disc(0.5) within group (order by value) as median_value
from table
where date > current_date - interval '90 days'
group by id;
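If you want this per id and date, use a window function for the running max. Ordered-set aggregates such as percentile_disc cannot take a window frame in PostgreSQL, so the running median uses a correlated subquery here:
select t.*
from (select t.*,
             max(value) over (partition by id order by date range between interval '89 day' preceding and current row) as running_max_value,
             (select percentile_disc(0.5) within group (order by t2.value)
              from t t2
              where t2.id = t.id
                and t2.date between t.date - interval '89 day' and t.date
             ) as running_median_value
      from t
     ) t
where date > current_date - interval '90 days';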
The filter is in the outer query so the preceding period can go back further in time.
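As a way to check the trailing-window logic by hand, here is a small sketch using Python's built-in sqlite3 module. It expresses the 90-day window as a self-join instead of a RANGE frame, and computes the max and median in Python; the sample rows (including the June row) and ISO date strings are assumptions, not data from the question. `statistics.median_low` picks the lower middle value, which is what percentile_disc(0.5) returns.

```python
import sqlite3
from statistics import median_low

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE t (id INTEGER, date TEXT, value INTEGER)")
# Hypothetical rows; the June row is far enough away to start a fresh window.
conn.executemany(
    "INSERT INTO t VALUES (?, ?, ?)",
    [
        (1, "2021-01-01", 10),
        (1, "2021-01-02", 12),
        (1, "2021-01-03", 11),
        (1, "2021-06-01", 5),
        (2, "2021-01-04", 11),
    ],
)

# Self-join: pair each row (a) with every row (b) of the same id whose date
# lies in the 90 days ending at a.date (89 days preceding plus the current day).
window_sql = """
SELECT a.id, a.date, b.value
FROM t AS a
JOIN t AS b
  ON b.id = a.id
 AND b.date <= a.date
 AND b.date > date(a.date, '-90 days')
ORDER BY a.id, a.date
"""

# Collect the values that fall inside each row's window.
windows = {}
for row_id, row_date, value in conn.execute(window_sql):
    windows.setdefault((row_id, row_date), []).append(value)

# Running max and discrete median per (id, date).
result = {key: (max(vals), median_low(vals)) for key, vals in windows.items()}
for key, stats in result.items():
    print(key, stats)
```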

how to count a column by month if the date column has time stamp?

I have two columns in a table:
id date
1 1/1/18 12:55:00 AM
2 1/2/18 01:34:00 AM
3 1/3/18 02:45:00 AM
How do I count the number of IDs per month if the time is appended into the date column?
The output would be:
Count month
3 1
In ANSI SQL, you would use:
select extract(month from date) as month, count(*)
from t
group by extract(month from date);
I think more databases support a month() function rather than extract(), though.
You have to extract the month and count using group by:
select DATE_PART('month', date) as month, count(id)
from yourtable
group by DATE_PART('month', date)
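Here is a quick sketch of the same idea using Python's built-in sqlite3 module. It assumes the timestamps are stored as ISO-8601 text (so 12:55:00 AM becomes 00:55:00) and uses SQLite's `strftime('%m', ...)` in place of extract() or DATE_PART; the February row is a hypothetical addition to show a second group.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE t (id INTEGER, date TEXT)")
conn.executemany(
    "INSERT INTO t VALUES (?, ?)",
    [
        (1, "2018-01-01 00:55:00"),
        (2, "2018-01-02 01:34:00"),
        (3, "2018-01-03 02:45:00"),
        (4, "2018-02-10 09:00:00"),  # hypothetical extra row
    ],
)

# The time-of-day part is ignored: only the month is extracted and grouped on.
counts = conn.execute(
    """
    SELECT CAST(strftime('%m', date) AS INTEGER) AS month,
           COUNT(id) AS id_count
    FROM t
    GROUP BY month
    ORDER BY month
    """
).fetchall()
print(counts)
```

If the table spans more than one year, grouping by month alone merges the same month from different years, so you would probably want to group by year as well.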

sql Select Earliest Date Multiple Rows

I have the following data:
id from_date to_date empty
1 24/03/2016 01/04/2016 Y
1 01/04/2016 23/06/2016 Y
1 05/08/2016 01/04/2017 Y
1 01/04/2017 01/04/2018 Y
1 01/04/2018 01/04/2019 Y
The current date falls between 01/04/2018 and 01/04/2019; however, the earliest from_date of the consecutive run that contains it is 05/08/2016. How can I write an SQL script to pick up the earliest from_date for the period that includes today?
Is this possible without creating a temporary table and updating the from_date for each id, where the from_date equals the to_date of the previous row?
Hope that all makes sense.
Thanks
Iain
You seem to want to group the values together. Here is one method to get the periods of continuous dates. A new group starts whenever a row's from_date does not match the previous row's to_date, and a running sum of those starts numbers the groups:
select id, min(from_date), max(to_date)
from (select t.*,
             sum(case when prev_to_date = from_date then 0 else 1 end) over (partition by id order by from_date) as grp
      from (select t.*,
                   lag(to_date) over (partition by id order by from_date) as prev_to_date
            from t
           ) t
     ) t
group by id, grp;
For filtering, you can add:
having current_date >= min(from_date) and current_date <= max(to_date)
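To try this out on the sample rows, here is a minimal sketch with Python's built-in sqlite3 module (SQLite 3.25 or later for window functions). The dates are rewritten as ISO strings so they compare correctly as text, and a fixed `:today` parameter replaces current_date so the result is reproducible; both are assumptions for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE t (id INTEGER, from_date TEXT, to_date TEXT)")
conn.executemany(
    "INSERT INTO t VALUES (?, ?, ?)",
    [
        (1, "2016-03-24", "2016-04-01"),
        (1, "2016-04-01", "2016-06-23"),
        (1, "2016-08-05", "2017-04-01"),
        (1, "2017-04-01", "2018-04-01"),
        (1, "2018-04-01", "2019-04-01"),
    ],
)

# LAG fetches the previous row's to_date; a row starts a new group (flag 1)
# when that does not equal its own from_date. The running SUM of the flags
# gives every consecutive run its own group number.
query = """
SELECT id, MIN(from_date) AS period_start, MAX(to_date) AS period_end
FROM (SELECT s.*,
             SUM(CASE WHEN prev_to_date = from_date THEN 0 ELSE 1 END)
                 OVER (PARTITION BY id ORDER BY from_date) AS grp
      FROM (SELECT t.*,
                   LAG(to_date) OVER (PARTITION BY id ORDER BY from_date)
                       AS prev_to_date
            FROM t) AS s) AS g
GROUP BY id, grp
HAVING :today >= MIN(from_date) AND :today <= MAX(to_date)
"""

# Hypothetical "today" that falls inside the last row's range.
periods = conn.execute(query, {"today": "2018-06-15"}).fetchall()
print(periods)
```

The gap between 23/06/2016 and 05/08/2016 splits the rows into two groups, and the HAVING clause keeps only the one containing the chosen date.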

Find the first missing date in a column (Oracle)

I need to find the first missing date in a date column of the plan_table table. The date must not be in holiday_table, and it must not fall on a weekend.
holiday_table stores all the holiday dates.
plan_table contains dates; this is where we have to find the first missing date:
Plan_id Date
1 10/2/2016
2 10/3/2016
3 10/6/2016
4 10/9/2016
5 10/10/2016
6 10/12/2016
7 10/13/2016
8 10/16/2016
Here the first missing date is 10/4/2016, but if this date is in holiday_table then we have to show 10/5/2016 or the next available date.
Please help me to write a query for the same.
You can use the LEAD analytic function like this:
select d
from (
    select date + 1 as d
    from (
        select date,
               lead(date) over (order by date) as next_date
        from (
            select date from plan_table
            union
            select date from holiday_table
        )
        order by date
    )
    where trunc(date) + 1 < trunc(next_date)
    order by d
)
where rownum = 1;
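Here is a rough sketch of the same LEAD-based gap search, run through Python's built-in sqlite3 module instead of Oracle (SQLite 3.25 or later). The column names `plan_date` and `holiday_date`, the ISO date strings, and the helper `first_missing_date` are assumptions for illustration. Like the query above, it treats holidays as occupied dates but does not skip weekends.

```python
import sqlite3

PLAN_DATES = [
    "2016-10-02", "2016-10-03", "2016-10-06", "2016-10-09",
    "2016-10-10", "2016-10-12", "2016-10-13", "2016-10-16",
]

# Merge plan dates and holidays, look at each date's successor with LEAD,
# and return the day after the first date whose successor is more than
# one day away. LIMIT 1 plays the role of Oracle's rownum = 1.
GAP_QUERY = """
SELECT date(d, '+1 day') AS first_missing
FROM (SELECT d, LEAD(d) OVER (ORDER BY d) AS next_d
      FROM (SELECT plan_date AS d FROM plan_table
            UNION
            SELECT holiday_date FROM holiday_table) AS all_dates) AS pairs
WHERE date(d, '+1 day') < next_d
ORDER BY d
LIMIT 1
"""


def first_missing_date(holidays):
    """Hypothetical helper: load sample data and run the gap query."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE plan_table (plan_id INTEGER, plan_date TEXT)")
    conn.execute("CREATE TABLE holiday_table (holiday_date TEXT)")
    conn.executemany(
        "INSERT INTO plan_table VALUES (?, ?)",
        list(enumerate(PLAN_DATES, start=1)),
    )
    conn.executemany(
        "INSERT INTO holiday_table VALUES (?)", [(h,) for h in holidays]
    )
    row = conn.execute(GAP_QUERY).fetchone()
    conn.close()
    return row[0] if row else None


no_holiday = first_missing_date([])
with_holiday = first_missing_date(["2016-10-04"])
print(no_holiday, with_holiday)
```

With no holidays the first gap is 2016-10-04; once that date is listed as a holiday, the answer moves on to 2016-10-05.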