How to calculate a moving sum with reset based on a condition in Teradata SQL?

I have this data and I want to compute a running sum of the field USAGE_FLAG that resets when it drops to 0 or when it moves to a new ID, keeping the dataset ordered by SU_ID and WEEK:
SU_ID WEEK USAGE_FLAG
100 1 0
100 2 7
100 3 7
100 4 0
101 1 0
101 2 7
101 3 0
101 4 7
102 1 7
102 2 7
102 3 7
102 4 0
So I want to create this table:
SU_ID WEEK USAGE_FLAG SUM
100 1 0 0
100 2 7 7
100 3 7 14
100 4 0 0
101 1 0 0
101 2 7 7
101 3 0 0
101 4 7 7
102 1 7 7
102 2 7 14
102 3 7 21
102 4 0 0
I have tried the MSUM() function with GROUP BY, but it won't keep the order I want above; it groups the 7s and the week numbers together, which I don't want.
Does anyone know if this is possible to do? I'm using Teradata.

In standard SQL a running sum can be done using a windowing function:
select su_id,
week,
usage_flag,
sum(usage_flag) over (partition by su_id order by week) as running_sum
from the_table;
I know Teradata supports windowing functions, I just don't know whether it also supports an order by in the window definition.
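For what it's worth, Teradata does accept an ORDER BY inside the window; the running-total form is commonly written with an explicit ROWS UNBOUNDED PRECEDING frame. A minimal sketch, using the question's column names and the hypothetical table name the_table:
select su_id,
       week,
       usage_flag,
       sum(usage_flag) over (partition by su_id
                             order by week
                             rows unbounded preceding) as running_sum
from the_table;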
Resetting the sum is a bit more complicated. You first need to create "group IDs" that change each time the usage_flag goes to 0. The following works in PostgreSQL; I don't know whether it works in Teradata as well:
select su_id,
       week,
       usage_flag,
       sum(usage_flag) over (partition by su_id, group_nr order by week) as running_sum
from (
    select t1.*,
           sum(group_flag) over (partition by su_id order by week) as group_nr
    from (
        select *,
               case
                   when usage_flag = 0 then 1
                   else 0
               end as group_flag
        from the_table
    ) t1
) t2
order by su_id, week;
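To see what the inner queries produce, here is a small self-contained sketch (PostgreSQL syntax; the inline VALUES rows are illustrative, covering su_id 100 only):
select su_id,
       week,
       usage_flag,
       case when usage_flag = 0 then 1 else 0 end as group_flag,
       sum(case when usage_flag = 0 then 1 else 0 end)
           over (partition by su_id order by week) as group_nr
from (values (100, 1, 0),
             (100, 2, 7),
             (100, 3, 7),
             (100, 4, 0)) as t (su_id, week, usage_flag)
order by su_id, week;
-- group_nr comes out as 1, 1, 1, 2: a new group starts on each 0,
-- so the outer running sum restarts at the week 4 row.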

Try the code below; with the RESET WHEN clause it works fine. The inner SUM over ROWS BETWEEN 1 PRECEDING AND 1 PRECEDING is simply the previous row's usage_flag, so the accumulation restarts whenever the value drops.
select su_id,
       week,
       usage_flag,
       SUM(usage_flag) OVER (
           PARTITION BY su_id
           ORDER BY week
           RESET WHEN usage_flag < /* preceding row */ SUM(usage_flag) OVER (
               PARTITION BY su_id ORDER BY week
               ROWS BETWEEN 1 PRECEDING AND 1 PRECEDING)
           ROWS UNBOUNDED PRECEDING
       )
from emp_su;

Please try the SQL below:
select su_id,
week,
usage_flag,
SUM(usage_flag) OVER (PARTITION BY su_id ORDER BY week
RESET WHEN usage_flag = 0
ROWS UNBOUNDED PRECEDING
)
from emp_su;
Here RESET WHEN usage_flag = 0 resets the sum whenever usage_flag drops to 0.
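If you want to check the behaviour without creating emp_su, here is a self-contained sketch (the inline rows are hypothetical and built with UNION ALL; on releases that refuse a FROM-less SELECT, select the literals from a one-row table instead):
SELECT su_id,
       week,
       usage_flag,
       SUM(usage_flag) OVER (PARTITION BY su_id
                             ORDER BY week
                             RESET WHEN usage_flag = 0
                             ROWS UNBOUNDED PRECEDING) AS running_sum
FROM (
    SELECT 100 AS su_id, 1 AS week, 0 AS usage_flag UNION ALL
    SELECT 100, 2, 7 UNION ALL
    SELECT 100, 3, 7 UNION ALL
    SELECT 100, 4, 0
) AS sample_data
ORDER BY su_id, week;
-- Expected running_sum: 0, 7, 14, 0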

Related

Rank customer transactions per segment in SQL Server

I have the table below, which holds a customer's transaction details.
Transaction date   CustomerID
1/27/2022          1
1/29/2022          1
2/27/2022          1
3/27/2022          1
3/29/2022          1
3/31/2022          1
4/2/2022           1
4/4/2022           1
4/6/2022           1
In this table, consecutive transactions that occur within two days of each other are considered a segment.
For example, the transactions between Jan 27th and Jan 29th are considered segment 1, and the transactions between Mar 29th and Apr 6th are considered segment 2. I need to rank the transactions per segment in date order. If a transaction does not fall under any segment, its rank is 1 by default. The expected output is below.
Segment Rank   Transaction date   CustomerID
1              1/27/2022          1
2              1/29/2022          1
1              2/27/2022          1
1              3/27/2022          1
2              3/29/2022          1
3              3/31/2022          1
4              4/2/2022           1
5              4/4/2022           1
6              4/6/2022           1
Can somebody guide me on how to achieve this in T-SQL?
Use lag() to check whether the change in TransDate is within 2 days and group such rows together (as a segment). After that, use row_number() to generate the required sequence:
with cte as
(
    select *,
           g = case when datediff(day,
                                  lag(t.TransDate) over (order by t.TransDate),
                                  t.TransDate
                                 ) <= 2
                    then 0
                    else 1
               end
    from tbl t
),
cte2 as
(
    select *, grp = sum(g) over (order by TransDate)
    from cte
)
select *, row_number() over (partition by grp order by TransDate)
from cte2
db<>fiddle demo
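A self-contained version of the same idea, with the question's dates inlined so it can be run directly (the derived-table alias v and the seg_rank alias are illustrative names):
with data as
(
    select CustomerID, cast(TransDate as date) as TransDate
    from (values (1, '20220127'), (1, '20220129'), (1, '20220227'),
                 (1, '20220327'), (1, '20220329'), (1, '20220331'),
                 (1, '20220402'), (1, '20220404'), (1, '20220406')
         ) v (CustomerID, TransDate)
),
flagged as
(
    select *,
           g = case when datediff(day,
                                  lag(TransDate) over (order by TransDate),
                                  TransDate) <= 2
                    then 0 else 1 end
    from data
),
grouped as
(
    select *, grp = sum(g) over (order by TransDate)
    from flagged
)
select TransDate, CustomerID, grp,
       seg_rank = row_number() over (partition by grp order by TransDate)
from grouped
order by TransDate;
-- seg_rank: 1, 2, 1, 1, 2, 3, 4, 5, 6 (matching the expected output)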

SQL query for incoming and outgoing stocks, first and last

I need to make a query that shows sales and stocks (incoming and outgoing) for each model in October 2021.
The point is that to obtain incoming and outgoing stocks I need to get vt_stocks_cube_sz.qty for the first day of the month and for the last day of the month, respectively.
So far I have just written the total of stocks (SUM(vt_stocks_cube_sz.qty) as stocks), but that isn't correct.
Could you help me split the stocks according to the rule above? I cannot understand how to write the query correctly.
SELECT vt_sales_cube_sz.modc_barc2 model,
SUM(vt_sales_cube_sz.qnt) sales,
SUM(vt_stocks_cube_sz.qty) as stocks
FROM vt_sales_cube_sz
LEFT JOIN vt_date_cube2
ON vt_sales_cube_sz.id_calendar_int = vt_date_cube2.id_calendar_int
LEFT JOIN vt_stocks_cube_sz ON
vt_stocks_cube_sz.parent_modc_barc = vt_sales_cube_sz.modc_barc AND
vt_stocks_cube_sz.id_stock = vt_sales_cube_sz.id_stock AND
vt_stocks_cube_sz.id_calendar_int = vt_sales_cube_sz.id_calendar_int AND
vt_stocks_cube_sz.vipusk_type = vt_sales_cube_sz.price_type
WHERE vt_date_cube2.wk_year_id = 2021
AND vt_date_cube2.wk_MoY_id = 10
AND vt_sales_cube_sz.id_stock IN
(SELECT id_stock
FROM vt_warehouse_cube
WHERE channel = 'OffLine')
GROUP BY vt_sales_cube_sz.modc_barc2
If you're looking for a robust and generalizable approach I'd suggest using analytic functions such as FIRST_VALUE, LAST_VALUE or something slightly different with RANK or ROW_NUMBER.
A simple example follows, so you can rerun it on your side and adjust it to the specific tables/fields you're using.
N.B.: You might need some tiebreakers in case you had multiple entries for the same first/last day.
with dummy_table as (
    SELECT 1 as month, 1 as day, 10 as value UNION ALL
    SELECT 1 as month, 2 as day, 20 as value UNION ALL
    SELECT 1 as month, 3 as day, 30 as value UNION ALL
    SELECT 2 as month, 1 as day, 5 as value UNION ALL
    SELECT 2 as month, 3 as day, 15 as value UNION ALL
    SELECT 2 as month, 5 as day, 25 as value
)
SELECT
    month,
    day,
    case when day = first_day then 'first' else 'last' end as type,
    value
FROM (
    SELECT *
        , FIRST_VALUE(day) over (partition by month order by day ROWS BETWEEN UNBOUNDED PRECEDING AND UNBOUNDED FOLLOWING) as first_day
        , LAST_VALUE(day) over (partition by month order by day ROWS BETWEEN UNBOUNDED PRECEDING AND UNBOUNDED FOLLOWING) as last_day
    FROM dummy_table
) tmp
WHERE day = first_day OR day = last_day
Dummy table:
Row   month   day   value
1     1       1     10
2     1       2     20
3     1       3     30
4     2       1     5
5     2       3     15
6     2       5     25
Result:
Row   month   day   type    value
1     1       1     first   10
2     1       3     last    30
3     2       1     first   5
4     2       5     last    25
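The ROW_NUMBER variant mentioned above would look roughly like this (a sketch over the same dummy_table; rn_asc and rn_desc are illustrative aliases):
SELECT month,
       day,
       case when rn_asc = 1 then 'first' else 'last' end as type,
       value
FROM (
    SELECT *,
           ROW_NUMBER() OVER (PARTITION BY month ORDER BY day ASC)  AS rn_asc,
           ROW_NUMBER() OVER (PARTITION BY month ORDER BY day DESC) AS rn_desc
    FROM dummy_table
) tmp
WHERE rn_asc = 1 OR rn_desc = 1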

Estimation of Cumulative value every 3 months in SQL

I have a table like this:
ID Date Prod
1 1/1/2009 5
1 2/1/2009 5
1 3/1/2009 5
1 4/1/2009 5
1 5/1/2009 5
1 6/1/2009 5
1 7/1/2009 5
1 8/1/2009 5
1 9/1/2009 5
And I need to get the following result:
ID Date Prod CumProd
1 2009/03/01 5 15 ---Each 3 months
1 2009/06/01 5 30 ---Each 3 months
1 2009/09/01 5 45 ---Each 3 months
What could be the best approach to take in SQL?
You can try the query below, using window functions:
select * from
(
select *,sum(prod) over(order by DATEPART(qq,dateval)) as cum_sum,
row_number() over(partition by DATEPART(qq,dateval) order by dateval) as rn
from t
)A where rn=1
How about just filtering on the month number?
select t.*
from (select id, date, prod, sum(prod) over (partition by id order by date) as running_prod
from t
) t
where month(date) in (3, 6, 9, 12);
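A quick way to sanity-check that against the question's data is to inline it (a sketch in SQL Server syntax, since the first answer uses DATEPART; dateval and the derived-table alias v are illustrative names):
with t as
(
    select id, cast(dt as date) as dateval, prod
    from (values (1, '20090101', 5), (1, '20090201', 5), (1, '20090301', 5),
                 (1, '20090401', 5), (1, '20090501', 5), (1, '20090601', 5),
                 (1, '20090701', 5), (1, '20090801', 5), (1, '20090901', 5)
         ) v (id, dt, prod)
)
select *
from (select id, dateval, prod,
             sum(prod) over (partition by id order by dateval) as CumProd
      from t
     ) x
where month(dateval) in (3, 6, 9, 12);
-- Expected rows: 2009-03-01 -> 15, 2009-06-01 -> 30, 2009-09-01 -> 45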

Oracle select sum by time window

Let's assume that we have an Oracle table of the following format and data:
TIMESTAMP MESSAGENO ORGMESSAGE
------------------------- ---------------------- -------------------------------------
27.04.13 1 START PERIOD
27.04.13 3 10
27.04.13 4 5
28.04.13 5 6
28.04.13 3 20
29.04.13 4 25
29.04.13 5 26
30.04.13 2 END PERIOD
30.04.13 1 START PERIOD
01.05.13 3 10
02.05.13 4 15
02.05.13 5 16
03.05.13 3 30
03.05.13 4 35
04.05.13 5 36
05.05.13 2 END PERIOD
I want to select the sum of all the ORGMESSAGE values for each period (the window between START PERIOD and END PERIOD), grouped by MESSAGENO.
Example output would be:
PERIOD START PERIOD END MESSAGENO SUM
------------ ------------- -------- ----
27.04.13 30.04.13 3 25
27.04.13 30.04.13 4 30
27.04.13 30.04.13 5 32
30.04.13 05.05.13 3 45
30.04.13 05.05.13 4 50
30.04.13 05.05.13 5 52
I am guessing that the use of Oracle analytic functions would be suitable, but I really don't know how and where to start.
Thanks in advance for any help.
If we assume that the period starts and ends match, then a simple way to find the matching messages is to count the preceding number of starts. This is a cumulative sum and it is easy in Oracle. The rest is just aggregation:
select min(timestamp) as periodstart, max(timestamp) as periodend, messageno, sum(to_number(orgmessage)) as msg_sum
from (select om.*,
sum(case when messageno = 1 then 1 else 0 end) over (order by timestamp) as grp
from orgmessages om
) om
where messageno not in (1, 2)
group by grp, messageno;
Note that this method (as with the others) really wants the timestamp to be unique on each record. In the data presented, these solutions will work. But if you have multiple starts and ends on the same day, none of them will work assuming that timestamp only has the date.
First find all period ends per period start. Then join with your table to group and sum.
select
dates.start_date,
dates.end_date,
messageno,
sum(to_number(orgmessage)) as period_sum
from mytable
join
(
select start_dates.timestmp as start_date, min(end_dates.timestmp) as end_date
from (select * from mytable where orgmessage = 'START PERIOD') start_dates
join (select * from mytable where orgmessage = 'END PERIOD') end_dates
on start_dates.timestmp < end_dates.timestmp
group by start_dates.timestmp
) dates on mytable.timestmp between dates.start_date and dates.end_date
where mytable.orgmessage not like '%PERIOD%'
group by dates.start_date, dates.end_date, messageno
order by dates.start_date, dates.end_date, messageno;
SQL fiddle: http://www.sqlfiddle.com/#!4/365de/15.
Please try this one; replace rrr with your table name:
select periodstart, periodend, messageno, sum(to_number(orgmessage)) s
from (select TIMESTAMP periodstart,
(select min (TIMESTAMP) from rrr r2 where orgmessage = 'END PERIOD' and r2.TIMESTAMP > r.TIMESTAMP) periodend
from rrr r
where orgmessage = 'START PERIOD'
) borders, rrr r
where r.TIMESTAMP between borders.periodstart and borders.periodend
and r.orgmessage not in ('END PERIOD', 'START PERIOD')
group by periodstart, periodend, messageno
order by periodstart, periodend, messageno

Counters in Teradata while inserting records

I am trying to insert records into a table in the below format:
Name Amount Date Counter
A 100 Jan 1 1
A 100 Jan 2 1
A 200 Jan 10 2
A 300 Mar 30 3
B 50 Jan 7 1
C 20 Jan 7 1
Could someone tell me the SQL for generating the value of the Counter field?
The counter value should increment whenever the amount changes and reset when the name changes.
What you need is the DENSE_RANK function. Unfortunately it is not natively implemented before TD 14.10, but it can be written using nested OLAP functions:
SELECT
Name
,Amount
,date_col
,SUM(flag)
OVER (PARTITION BY Name
ORDER BY date_col
ROWS UNBOUNDED PRECEDING) AS "DENSE_RANK"
FROM
(
SELECT
Name
,Amount
,date_col
,CASE
WHEN Amount = MIN(Amount)
OVER (PARTITION BY Name
ORDER BY date_col
ROWS BETWEEN 1 PRECEDING AND 1 PRECEDING)
THEN 0
ELSE 1
END AS flag
FROM dropme
) AS dt;
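A self-contained way to check the logic against the question's rows (a sketch only: the inline data, the assumed dates, and the counter_val alias are illustrative, with dropme replaced by a derived table; if your release refuses FROM-less SELECTs, select the literals from a one-row table instead):
SELECT
    Name
    ,Amount
    ,date_col
    ,SUM(flag) OVER (PARTITION BY Name
                     ORDER BY date_col
                     ROWS UNBOUNDED PRECEDING) AS counter_val
FROM (
    SELECT
        Name
        ,Amount
        ,date_col
        ,CASE WHEN Amount = MIN(Amount)
                             OVER (PARTITION BY Name
                                   ORDER BY date_col
                                   ROWS BETWEEN 1 PRECEDING AND 1 PRECEDING)
              THEN 0 ELSE 1 END AS flag
    FROM (
        SELECT 'A' AS Name, 100 AS Amount, DATE '2023-01-01' AS date_col UNION ALL
        SELECT 'A', 100, DATE '2023-01-02' UNION ALL
        SELECT 'A', 200, DATE '2023-01-10' UNION ALL
        SELECT 'A', 300, DATE '2023-03-30' UNION ALL
        SELECT 'B',  50, DATE '2023-01-07' UNION ALL
        SELECT 'C',  20, DATE '2023-01-07'
    ) AS src
) AS dt
ORDER BY Name, date_col;
-- Expected counter_val: A -> 1, 1, 2, 3;  B -> 1;  C -> 1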