PL/SQL Recursive Query - sql

Please find the tables below.
EVENT table
event_id | gross_amount | transaction_id
1 | 10 | 1
2 | 12 | 5
TRANSACTION table
trx_id | debit | credit | type | original_trx_id | last_updated
1 | 0 | 0 | payment | null | 25-JUL-11
2 | 0 | 2 | settlement | 1 | 26-JUL-11
3 | 0 | 1 | settlement | 1 | 27-JUL-11
4 | 3 | 0 | settlement | 1 | 28-JUL-11
5 | 0 | 0 | payment | null | 24-JUL-11
6 | 0 | 3 | settlement | 5 | 25-JUL-11
RESULT EXPECTED:
trx_id | debit | credit | current_gross | current_net
2 | 0 | 2 | 10 | 12
3 | 0 | 1 | 12 | 13
4 | 3 | 0 | 12 | 9
6 | 0 | 3 | 10 | 13
Explanation
Transactions 1, 2, 3, 4 fall into one set, and transactions 5, 6 fall into another set. Each transaction set can be ordered using the last_updated column.
For the calculation we do not take transactions of type "payment". The "payment" transaction is linked to the event table, from which we can find the original gross_amount for the calculation.
Steps
Find the event table's "payment" transaction in the transaction table. (Ex: transaction_id = 1; from that we can also find original gross_amount = 10.)
Take all the "settlement" transactions that have original_trx_id = 1.
Order them by last_updated time.
Apply the calculation.
Hope you have understood my question. I want to get the "RESULT EXPECTED" somehow using PL/SQL (please, no custom functions).
I cannot think of a way to apply CONNECT BY here. Your help is highly appreciated.
Please find below create table and insert statements.
create table event
(event_id number(9),
gross_amount number(9),
transaction_id number(9) );
insert into event values (1,10,1);
insert into event values (2,10,5);
create table transaction
(trx_id number(9),
debit number(9),
credit number(9),
type varchar2(50),
original_trx_id number(9),
last_updated DATE
);
-- ANSI DATE literals avoid dependence on the session's NLS_DATE_FORMAT
insert into transaction values (1,0,0,'payment',null,DATE '2011-07-25');
insert into transaction values (2,0,2,'settlement',1,DATE '2011-07-26');
insert into transaction values (3,0,1,'settlement',1,DATE '2011-07-27');
insert into transaction values (4,3,0,'settlement',1,DATE '2011-07-28');
insert into transaction values (5,0,0,'payment',null,DATE '2011-07-24');
insert into transaction values (6,0,3,'settlement',5,DATE '2011-07-25');

If I understand your question right, you don't want a hierarchical or recursive query. Just an analytic sum with a windowing clause.
SELECT T1.trx_id
, T1.debit
, T1.credit
, E2.gross_amount
+ NVL( SUM( T1.credit ) OVER( PARTITION BY T1.original_trx_id
ORDER BY T1.last_updated
RANGE BETWEEN UNBOUNDED PRECEDING
AND 1 PRECEDING ), 0 )
- NVL( SUM( T1.debit ) OVER( PARTITION BY T1.original_trx_id
ORDER BY T1.last_updated
RANGE BETWEEN UNBOUNDED PRECEDING
AND 1 PRECEDING ), 0 )
AS current_gross
, E2.gross_amount
+ SUM( T1.credit ) OVER( PARTITION BY T1.original_trx_id
ORDER BY T1.last_updated
RANGE BETWEEN UNBOUNDED PRECEDING
AND CURRENT ROW )
- SUM( T1.debit ) OVER( PARTITION BY T1.original_trx_id
ORDER BY T1.last_updated
RANGE BETWEEN UNBOUNDED PRECEDING
AND CURRENT ROW )
AS current_net
FROM transaction T1
, event E2
WHERE T1.original_trx_id = E2.transaction_id
ORDER BY T1.original_trx_id, T1.last_updated
NOTE: A few problems in your question (or at least my understanding of it):
Should the 2nd insert into event set the gross_amount to 12?
Should the current_gross of trx_id 4 in the results be 13 (instead of 12), because it includes the 1 credit from trx_id 3? And thus the net should be 10 (instead of 9).
Should the current_gross of trx_id 6 be 12 (instead of 10), because this is the gross_amount of event 2? And thus the current_net would be 15 (instead of 13).
If these assumptions are correct then the query I provided gives these results.
TRX_ID DEBIT CREDIT CURRENT_GROSS CURRENT_NET
---------- ---------- ---------- ------------- -----------
2 0 2 10 12
3 0 1 12 13
4 3 0 13 10
6 0 3 12 15
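The answer's window-frame technique can be sanity-checked outside Oracle. Below is a minimal sketch using Python's built-in sqlite3 module (SQLite also supports window functions). A few assumptions: the table is renamed trx because TRANSACTION is a SQLite keyword, the frames use ROWS instead of RANGE (equivalent here since last_updated has no ties within a set), and the data follows the insert statements exactly as given, so event 2 keeps gross_amount = 10 and trx_id 6 comes out as 10/13 rather than the 12/15 shown under the answer's corrected assumptions.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE event (event_id INT, gross_amount INT, transaction_id INT);
INSERT INTO event VALUES (1,10,1), (2,10,5);
CREATE TABLE trx (trx_id INT, debit INT, credit INT, type TEXT,
                  original_trx_id INT, last_updated TEXT);
INSERT INTO trx VALUES
  (1,0,0,'payment',NULL,'2011-07-25'),
  (2,0,2,'settlement',1,'2011-07-26'),
  (3,0,1,'settlement',1,'2011-07-27'),
  (4,3,0,'settlement',1,'2011-07-28'),
  (5,0,0,'payment',NULL,'2011-07-24'),
  (6,0,3,'settlement',5,'2011-07-25');
""")

# Same idea as the answer: running credit/debit sums over each settlement
# group, excluding the current row for current_gross, including it for net.
# Payments drop out automatically because their original_trx_id is NULL.
rows = conn.execute("""
SELECT t.trx_id, t.debit, t.credit,
       e.gross_amount
         + COALESCE(SUM(t.credit) OVER w_prev, 0)
         - COALESCE(SUM(t.debit)  OVER w_prev, 0) AS current_gross,
       e.gross_amount
         + SUM(t.credit) OVER w_cur
         - SUM(t.debit)  OVER w_cur               AS current_net
FROM trx t
JOIN event e ON t.original_trx_id = e.transaction_id
WINDOW w_prev AS (PARTITION BY t.original_trx_id ORDER BY t.last_updated
                  ROWS BETWEEN UNBOUNDED PRECEDING AND 1 PRECEDING),
       w_cur  AS (PARTITION BY t.original_trx_id ORDER BY t.last_updated
                  ROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW)
ORDER BY t.original_trx_id, t.last_updated
""").fetchall()
```

The COALESCE on the 1-PRECEDING frame handles the first settlement of each set, where the frame is empty and SUM returns NULL.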

Related

How to verify if max value of a column corresponds to the max value from another column group by a third column

I have a table
Token | Acct_No | Customer_ID
10 | 1 | ABC
7 | 2 | ABC
6 | 3 | ABC
12 | 4 | ABC
11 | 1 | ABC
8 | 1 | ABC
15 | 4 | ABC
16 | 3 | ABC
10 | 2 | CDA
I want to know if there are any rows where max(token) for max(acct_no) < max(token) for any other acct_no for a particular customer_id.
In this case, it is the 2nd last record.
You can use the window function first_value() to calculate the maximum token for the biggest acct_no for each customer. Then, for the rows that have the biggest token for each customerid/acct_no, check whether any tokens are larger:
select t.*
from (select t.*,
first_value(token) over (partition by customerid order by acct_no desc, token desc) as token_biggest_acct_no,
row_number() over (partition by customerid, acct_no order by token desc) as seqnum
from t
) t
where seqnum = 1 and -- only consider last rows for customers/accounts
token > token_biggest_acct_no;
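As a quick check, here is the same first_value()/row_number() idea run through Python's sqlite3; the table and column names (t, token, acct_no, customerid) are assumed from the answer's query.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE t (token INT, acct_no INT, customerid TEXT);
INSERT INTO t VALUES
  (10,1,'ABC'),(7,2,'ABC'),(6,3,'ABC'),(12,4,'ABC'),
  (11,1,'ABC'),(8,1,'ABC'),(15,4,'ABC'),(16,3,'ABC'),
  (10,2,'CDA');
""")

# token_biggest_acct_no: the max token on the customer's max acct_no;
# seqnum = 1 keeps only the max-token row per customer/account.
rows = conn.execute("""
SELECT token, acct_no, customerid
FROM (SELECT t.*,
             FIRST_VALUE(token) OVER (PARTITION BY customerid
                                      ORDER BY acct_no DESC, token DESC)
               AS token_biggest_acct_no,
             ROW_NUMBER() OVER (PARTITION BY customerid, acct_no
                                ORDER BY token DESC) AS seqnum
      FROM t) x
WHERE seqnum = 1 AND token > token_biggest_acct_no
""").fetchall()
```

Only the (16, 3, ABC) row survives: 16 beats 15, the biggest token on ABC's biggest acct_no (4), which matches the "2nd last record" in the question.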

Get some values from the table by selecting

I have a table:
| id | Number |Address
| -----| ------------|-----------
| 1 | 0 | NULL
| 1 | 1 | NULL
| 1 | 2 | 50
| 1 | 3 | NULL
| 2 | 0 | 10
| 3 | 1 | 30
| 3 | 2 | 20
| 3 | 3 | 20
| 4 | 0 | 75
| 4 | 1 | 22
| 4 | 2 | 30
| 5 | 0 | NULL
I need to get: the NUMBER of the last ADDRESS change for each ID.
I wrote this select:
select dh.id, dh.number from table dh where dh.number =
(select max(min(t.number)) from table t where t.id = dh.id group by t.address)
But this select does not correctly handle the case when the address first changed and then changed back to the previous value. For example, for id=1 the group by returns:
| Address |
| -------- |
| NULL |
| 50 |
I have been thinking about this select for several days, and I would be happy to receive any help.
You can do this using row_number() -- twice:
select t.id, min(number)
from (select t.*,
row_number() over (partition by id order by number desc) as seqnum1,
row_number() over (partition by id, address order by number desc) as seqnum2
from t
) t
where seqnum1 = seqnum2
group by id;
What this does is enumerate the rows by number in descending order:
Once per id.
Once per id and address.
These values are the same only when the value is 1, which is the most recent address in the data. Then aggregation pulls back the earliest row in this group.
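A small sqlite3 sketch of the double row_number() trick, using the sample data above (including the tricky id=1 case where the address reverts to a previous value). The table name t is assumed from the answer.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE t (id INT, number INT, address INT);
INSERT INTO t VALUES
  (1,0,NULL),(1,1,NULL),(1,2,50),(1,3,NULL),
  (2,0,10),
  (3,1,30),(3,2,20),(3,3,20),
  (4,0,75),(4,1,22),(4,2,30),
  (5,0,NULL);
""")

# seqnum1 counts rows per id, seqnum2 per id+address, both from the
# highest number down; they agree exactly on the rows of the most
# recent address run, and MIN(number) is where that run started.
rows = conn.execute("""
SELECT id, MIN(number)
FROM (SELECT t.*,
             ROW_NUMBER() OVER (PARTITION BY id
                                ORDER BY number DESC) AS seqnum1,
             ROW_NUMBER() OVER (PARTITION BY id, address
                                ORDER BY number DESC) AS seqnum2
      FROM t) x
WHERE seqnum1 = seqnum2
GROUP BY id
ORDER BY id
""").fetchall()
```

For id=1 this correctly yields 3 (the change from 50 back to NULL), which is exactly the case the asker's max(min(...)) query got wrong.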
I answered my question myself; if anyone needs it, here is my solution:
select * from table dh1 where dh1.number = (
select max(x.number)
from (
select
dh2.id, dh2.number, dh2.address, lag(dh2.address) over(order by dh2.number asc) as prev
from table dh2 where dh1.id=dh2.id
) x
where NVL(x.address, 0) <> NVL(x.prev, 0)
);

Add a CASE on a SUM when using conditional aggregation

I have the following table (called report) in SQL Server with millions of records so performance is a factor.
+---------+------------------------+---------+
| user_id | timestamp | balance |
+---------+------------------------+---------+
| 1 |2021-04-29 09:31:10.100 | 10 |
| 1 |2021-04-29 09:35:25.800 | 15 |
| 1 |2021-04-29 09:36:30.550 | 5 |
| 2 |2021-04-29 09:38:15.009 | 100 |
| 3 |2021-04-29 10:36:30.550 | 50 |
| 3 |2021-04-29 10:38:15.009 | 250 |
+---------+------------------------+---------+
Here are the requirements :
I would like to group the opening balance, closing balance and net movement of all users between a date range.
I require 2 queries:
all movements greater than a variable threshold (let's call it 10)
all movements less than a variable threshold (let's call it 10)
The records must also be returned using OFFSET x FETCH NEXT y ROWS for a lighter response to the UI.
Here I have a working query that does not take into account the less than / greater than threshold requirement.
select user_id,
       max(case when seqnum = 1 then balance end) as opening,
       max(case when seqnum_desc = 1 then balance end) as closing,
       sum(case when seqnum = 1 and seqnum_desc = 1 then 0
                when seqnum = 1 then -balance
                when seqnum_desc = 1 then balance
           end) as movement
from (select r.*,
             row_number() over (partition by user_id order by timestamp) as seqnum,
             row_number() over (partition by user_id order by timestamp desc) as seqnum_desc
      from report r
      where timestamp >= '2020-03-21 16:22:26.580'
        and timestamp <= '2022-03-24 16:22:26.580'
     ) r
group by user_id
order by user_id
offset 0 rows fetch next 200 rows only
Here is the output:
+---------+-----------------+-----------------+--------------+
| user_id | opening | closing | movement |
+---------+-----------------+-----------------+--------------+
| 1 | 10 | 5 | -5 |
| 2 | 100 | 100 | 0 |
| 3 | 50 | 250 | 200 |
+---------+-----------------+-----------------+--------------+
How do I conditionally return only movements greater than 10 or less than 10?
Thank you in advance.
I would suggest a CTE or subquery:
with cte as (
<your query here>
)
select cte.*
from cte
where movement < 10; -- or whatever condition
Note: You might actually want the absolute value if you really mean -10 to 10 rather than 0 to 10:
where abs(movement) < 10
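Here is a runnable sqlite3 sketch of the CTE wrapper, with the opening/closing/movement aggregation inlined; LIMIT/OFFSET stands in for SQL Server's OFFSET ... FETCH, which SQLite doesn't support.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE report (user_id INT, timestamp TEXT, balance INT);
INSERT INTO report VALUES
  (1,'2021-04-29 09:31:10.100',10),
  (1,'2021-04-29 09:35:25.800',15),
  (1,'2021-04-29 09:36:30.550',5),
  (2,'2021-04-29 09:38:15.009',100),
  (3,'2021-04-29 10:36:30.550',50),
  (3,'2021-04-29 10:38:15.009',250);
""")

# Wrap the existing aggregation in a CTE, then filter on the movement
# alias (use abs(movement) < 10 for the -10..10 interpretation).
rows = conn.execute("""
WITH cte AS (
  SELECT user_id,
         MAX(CASE WHEN seqnum = 1 THEN balance END)      AS opening,
         MAX(CASE WHEN seqnum_desc = 1 THEN balance END) AS closing,
         SUM(CASE WHEN seqnum = 1 AND seqnum_desc = 1 THEN 0
                  WHEN seqnum = 1 THEN -balance
                  WHEN seqnum_desc = 1 THEN balance END) AS movement
  FROM (SELECT r.*,
               ROW_NUMBER() OVER (PARTITION BY user_id
                                  ORDER BY timestamp)      AS seqnum,
               ROW_NUMBER() OVER (PARTITION BY user_id
                                  ORDER BY timestamp DESC) AS seqnum_desc
        FROM report r) r
  GROUP BY user_id
)
SELECT user_id, opening, closing, movement
FROM cte
WHERE movement < 10          -- threshold filter goes here
ORDER BY user_id
LIMIT 200 OFFSET 0
""").fetchall()
```

With the sample data, users 1 and 2 (movements -5 and 0) pass the filter, while user 3 (movement 200) is excluded.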

Create a 'partial' window function to update data in SQL Server

I have a table where I want to use window functions to fill in some NULLs, but I only want data to flow downwards, based on a Rank column in the table. Using window functions (PARTITION BY), all rows get assigned the same data, which is not the requirement here.
The initial table has NULL values for columns A and B where Rank=2 and ID=1, which I want to populate with the values where Rank=1. Column C is NULL where Rank=1, and 15 where Rank=2 and ID=1, which needs to stay the same way.
Here is the structure of the initial table, the desired output, as well as some sample code. I am unsure how to incorporate the rank into the partition by statement.
Initial Table
ID A B C Rank
---------------------------------
1 10 10 NULL 1
1 NULL NULL 15 2
2 10 NULL NULL 1
2 NULL NULL 15 2
2 NULL NULL 15 3
Target table
ID A B C Rank
---------------------------------
1 10 10 NULL 1
1 10 10 15 2
2 10 NULL NULL 1
2 10 NULL 15 2
2 10 NULL 15 3
SQL query:
SELECT
ID,
MAX(A) OVER (PARTITION BY ID),
MAX(B) OVER (PARTITION BY ID),
MAX(C) OVER (PARTITION BY ID),
Rank
FROM
TBL;
As expected, partitioning by both ID and Rank leads to no changes to the initial table.
You can use first_value():
select
id,
coalesce(a, first_value(a) over (partition by id order by rnk)) a,
coalesce(b, first_value(b) over (partition by id order by rnk)) b,
coalesce(c, first_value(c) over (partition by id order by rnk)) c,
rnk
from tbl;
Note that rank is a language keyword (as in window function rank() over()), hence not a good choice for a column name. I renamed it to rnk in the query.
Demo on DB Fiddle:
id | a | b | c | rnk
-: | -: | ---: | ---: | --:
1 | 10 | 10 | null | 1
1 | 10 | 10 | 15 | 2
2 | 10 | null | null | 1
2 | 10 | null | 15 | 2
2 | 10 | null | 15 | 3
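The first_value() approach can be reproduced with Python's sqlite3 as a quick check, using the same data (with the column already renamed rnk as in the answer):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE tbl (id INT, a INT, b INT, c INT, rnk INT);
INSERT INTO tbl VALUES
  (1,10,10,NULL,1),(1,NULL,NULL,15,2),
  (2,10,NULL,NULL,1),(2,NULL,NULL,15,2),(2,NULL,NULL,15,3);
""")

# COALESCE keeps each row's own value and only falls back to the
# rnk = 1 value of its id group, so data flows downwards only.
rows = conn.execute("""
SELECT id,
       COALESCE(a, FIRST_VALUE(a) OVER (PARTITION BY id ORDER BY rnk)) AS a,
       COALESCE(b, FIRST_VALUE(b) OVER (PARTITION BY id ORDER BY rnk)) AS b,
       COALESCE(c, FIRST_VALUE(c) OVER (PARTITION BY id ORDER BY rnk)) AS c,
       rnk
FROM tbl
ORDER BY id, rnk
""").fetchall()
```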
Unfortunately, SQL Server doesn't support ignore nulls in any window function, which would otherwise be the best approach here. In your example, you only have a value or NULL. If this is the case, you can use a cumulative max:
select id,
coalesce(a, max(a) over (partition by id order by rank)) as a,
coalesce(b, max(b) over (partition by id order by rank)) as b,
coalesce(c, max(c) over (partition by id order by rank)) as c,
rank
from tbl;
If this is not the case, the logic is trickier. But actually, I would suggest that you ask a new question if that is the case. Let this one be about the data as you have presented it.
You were almost there with the partitioning. You need a relatively infrequently used approach that limits the window frame - in this case, to the preceding rows only (so as not to use any 'future' rows when calculating the current row's values).
CREATE TABLE #Temp (ID int, Rnk int, A int, B int, C int)
INSERT INTO #Temp (ID, Rnk, A, B, C)
VALUES (1, 1, 10, 10, NULL),
(1, 2, NULL, NULL, 15),
(2, 1, 10, NULL, NULL),
(2, 2, NULL, NULL, 15),
(2, 3, NULL, NULL, 15)
SELECT ID, Rnk,
MAX(A) OVER (PARTITION BY ID ORDER BY Rnk ROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW) AS A_Fill,
MAX(B) OVER (PARTITION BY ID ORDER BY Rnk ROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW) AS B_Fill,
MAX(C) OVER (PARTITION BY ID ORDER BY Rnk ROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW) AS C_Fill
FROM #Temp
I believe this outputs the same results as above e.g.,
ID | Rnk | A_Fill | B_Fill | C_Fill
1 | 1 | 10 | 10 | NULL
1 | 2 | 10 | 10 | 15
2 | 1 | 10 | NULL | NULL
2 | 2 | 10 | NULL | 15
2 | 3 | 10 | NULL | 15
Note that if you add two more rows for ID = 2 (with Rnks 4 and 5) and the rest NULL, this approach puts 15 into the C column, which the COALESCE/FIRST_VALUE approach above doesn't, I believe.
ID | Rnk | A_Fill | B_Fill | C_Fill
1 | 1 | 10 | 10 | NULL
1 | 2 | 10 | 10 | 15
2 | 1 | 10 | NULL | NULL
2 | 2 | 10 | NULL | 15
2 | 3 | 10 | NULL | 15
2 | 4 | 10 | NULL | 15
2 | 5 | 10 | NULL | 15
Here is a DB Fiddle showing the differences (based on the previous ones above).
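The frame-limited MAX can also be reproduced with Python's sqlite3, including the two extra ID = 2 rows mentioned above; #Temp is SQL Server syntax, so a plain table named fill_demo stands in.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE fill_demo (id INT, rnk INT, a INT, b INT, c INT);
INSERT INTO fill_demo VALUES
  (1,1,10,10,NULL),(1,2,NULL,NULL,15),
  (2,1,10,NULL,NULL),(2,2,NULL,NULL,15),(2,3,NULL,NULL,15),
  (2,4,NULL,NULL,NULL),(2,5,NULL,NULL,NULL);  -- the two extra rows
""")

# The frame stops at the current row, so each row sees only itself and
# earlier rnk values; MAX carries the last non-NULL value downwards.
rows = conn.execute("""
SELECT id, rnk,
       MAX(a) OVER w AS a_fill,
       MAX(b) OVER w AS b_fill,
       MAX(c) OVER w AS c_fill
FROM fill_demo
WINDOW w AS (PARTITION BY id ORDER BY rnk
             ROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW)
ORDER BY id, rnk
""").fetchall()
```

The rnk 4 and 5 rows do get 15 in c_fill, confirming the difference from the COALESCE/FIRST_VALUE version.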

Setting batch number for set of records in sql

I have following table in SQL
id,date,records
1,2019-03-28 01:22:12,5
2,2019-03-29 01:23:23,5
3,2019-03-30 01:28:54,5
4,2019-03-28 01:12:21,2
5,2019-03-12 01:08:11,1
6,2019-03-28 01:01:21,12
7,2019-03-12 01:02:11,1
What I am trying to achieve is to set a batch number that increments whenever the moving sum crosses 15, at which point the moving sum resets as well; in other words, I am trying to batch records so that each batch has a total moving sum of 15.
For example, once the moving sum reaches 15, the batch number should increment, giving me groups of rows each containing a total value of 15.
so the output i am looking for is
id,date,records, moving_sum,batch_number
1,2019-03-28 01:22:12,5,5,1
2,2019-03-29 01:23:23,5,10,1
3,2019-03-30 01:28:54,5,15,1
4,2019-03-28 01:12:21,2,2,2
5,2019-03-12 01:08:11,1,3,2
6,2019-03-28 01:01:21,12,15,2
7,2019-03-12 01:02:11,1,1,3
You need a recursive query for this:
with
tab as (select t.*, row_number() over(order by id) rn from mytable t),
cte as (
select
id,
date,
records,
records moving_sum,
1 batch_number,
rn
from tab
where rn = 1
union all
select
t.id,
t.date,
t.records,
case when c.moving_sum + t.records > 15 then t.records else c.moving_sum + t.records end,
case when c.moving_sum + t.records > 15 then c.batch_number + 1 else c.batch_number end,
t.rn
from cte c
inner join tab t on t.rn = c.rn + 1
)
select id, date, records, moving_sum, batch_number from cte order by id
The syntax for recursive common table expressions slightly varies across databases, so you might need to adapt that a little depending on your database.
Also note that if ids start at 1 and are always incrementing without gaps, you don't actually need the common table expression tab, and you can replace rn with id in the second common table expression.
Demo on DB Fiddle:
id | date | records | moving_sum | batch_number
-: | :--------- | ------: | ---------: | -----------:
1 | 2019-03-28 | 5 | 5 | 1
2 | 2019-03-29 | 5 | 10 | 1
3 | 2019-03-30 | 5 | 15 | 1
4 | 2019-03-28 | 2 | 2 | 2
5 | 2019-03-12 | 1 | 3 | 2
6 | 2019-03-28 | 12 | 15 | 2
7 | 2019-03-12 | 1 | 1 | 3
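The recursive CTE translates almost directly to other engines. Here is a sqlite3 version as a check; since the ids in the sample are gapless, it joins on id directly (dropping the tab helper, as the answer notes is possible), and the date column is renamed dt to sidestep reserved-word issues.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE mytable (id INT, dt TEXT, records INT);
INSERT INTO mytable VALUES
  (1,'2019-03-28 01:22:12',5),
  (2,'2019-03-29 01:23:23',5),
  (3,'2019-03-30 01:28:54',5),
  (4,'2019-03-28 01:12:21',2),
  (5,'2019-03-12 01:08:11',1),
  (6,'2019-03-28 01:01:21',12),
  (7,'2019-03-12 01:02:11',1);
""")

# Walk the rows in id order; reset the running sum and bump the batch
# whenever adding the next row's records would push the sum past 15.
rows = conn.execute("""
WITH RECURSIVE cte AS (
  SELECT id, dt, records, records AS moving_sum, 1 AS batch_number
  FROM mytable WHERE id = 1
  UNION ALL
  SELECT t.id, t.dt, t.records,
         CASE WHEN c.moving_sum + t.records > 15
              THEN t.records ELSE c.moving_sum + t.records END,
         CASE WHEN c.moving_sum + t.records > 15
              THEN c.batch_number + 1 ELSE c.batch_number END
  FROM cte c
  JOIN mytable t ON t.id = c.id + 1
)
SELECT id, records, moving_sum, batch_number FROM cte ORDER BY id
""").fetchall()
```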