Can this cursor be replaced - sql

I am currently using a cursor in my SQL Server procedure and wanted to know if there is any way to replace it with a better approach. The process is:
1. A customer pays some money, and I create an entry for it in the payment table.
2. I start a cursor that selects all payments of that customer that have an available balance, from the PAYMENT table.
3. Then I start an inner cursor that fetches all the bills of that customer which are still unpaid, from the BILL table.
4. I pay off each bill until the current payment is exhausted and then repeat the process.
How can I remove the cursors in steps 2 and 3 in favour of a more efficient approach? Also, does the use of cursors mean that the PAYMENT and BILL tables remain locked until the procedure finishes?
Thanks

Here's one way it can be done, with made-up tables and data since we don't know what yours look like. I've put some narrative in between the pieces, but all of the code should be run as one single script.
Data setup:
create table #bills (billid int, balance decimal(38,4))
create table #payments (paymentid int, balance decimal(38,4))
insert into #bills (billid, balance) values
(1,0), (2,22.50), (3,12.75), (4,19.20)
insert into #payments (paymentid,balance) values
(1,20.19),(2,5.50),(3,20)
create table #newpayments (billid int, paymentid int,
paymentamount decimal(38,4))
I've assumed that the bills and payments tables each have a column called balance, which shows any amount not yet dealt with. Alternatively, you may have to calculate this from a couple of columns. But no sample data in your question means I get to make up an easy structure :-)
Query to populate #newpayments with which bills should be paid from which (partial) payments [1]:
; With unpaidbills as (
select billid,balance,
ROW_NUMBER() OVER (ORDER BY billid) as rn,
SUM(balance) OVER (ORDER BY billid
ROWS BETWEEN UNBOUNDED PRECEDING
AND CURRENT ROW) as endbalance,
SUM(balance) OVER (ORDER BY billid
ROWS BETWEEN UNBOUNDED PRECEDING
AND CURRENT ROW) - balance as startbalance
from #bills
where balance > 0
), unusedpayments as (
select paymentid,balance,
ROW_NUMBER() OVER (ORDER BY paymentid) as rn,
SUM(balance) OVER (ORDER BY paymentid
ROWS BETWEEN UNBOUNDED PRECEDING
AND CURRENT ROW) as endbalance,
SUM(balance) OVER (ORDER BY paymentid
ROWS BETWEEN UNBOUNDED PRECEDING
AND CURRENT ROW) - balance as startbalance
from #payments
where balance > 0
), overlaps as (
select
billid,paymentid,
CASE WHEN ub.startbalance < up.startbalance
THEN up.startbalance ELSE ub.startbalance END as overlapstart,
CASE WHEN ub.endbalance > up.endbalance
THEN up.endbalance ELSE ub.endbalance END as overlapend
from
unpaidbills ub
inner join
unusedpayments up
on
ub.startbalance < up.endbalance and
up.startbalance < ub.endbalance
)
insert into #newpayments(billid,paymentid,paymentamount)
select billid,paymentid,overlapend - overlapstart as paymentamount
from overlaps
At this point, #newpayments can be used to generate transaction history, etc.
And then, finally, we update the original tables to mark the amounts used:
;With totalpaid as (
select billid,SUM(paymentamount) as payment from #newpayments
group by billid
)
update b
set b.balance = b.balance - tp.payment
from #bills b
inner join
totalpaid tp
on b.billid = tp.billid
;With totalused as (
select paymentid,SUM(paymentamount) as payment from #newpayments
group by paymentid
)
update p
set p.balance = p.balance - tu.payment
from #payments p
inner join
totalused tu
on p.paymentid = tu.paymentid
The key part was to use SUM() with window functions to calculate the running totals of the amounts owed (bills) or amounts available (payments), in both cases using a column (billid or paymentid) to determine in what order each of these items should be dealt with. E.g. the unpaidbills CTE produces a result set like this:
billid balance rn endbalance startbalance
----------- --------- -------------------- ------------- -------------
2 22.5000 1 22.5000 0.0000
3 12.7500 2 35.2500 22.5000
4 19.2000 3 54.4500 35.2500
and unusedpayments looks like this:
paymentid balance rn endbalance startbalance
----------- ---------- -------------------- ------------ -------------
1 20.1900 1 20.1900 0.0000
2 5.5000 2 25.6900 20.1900
3 20.0000 3 45.6900 25.6900
We then create the overlaps CTE, which finds overlaps [2] between the bills and payments where (part of) a payment can be used to satisfy (part of) a bill. The size of the overlap is the actual amount to pay towards that bill.
[1] The ROW_NUMBER() calls aren't really needed. Early on in writing this query I thought I was going to use them, but that turned out to be unnecessary. Removing them doesn't shorten things enough to stop SO scrolling that query anyway, so I may as well leave them in (and not have to edit the result sets as well).
[2] Many people trying to find overlaps make things absurdly complicated and deal with many special cases to find all overlaps. This can usually be done far more simply, in the way that I show in the overlaps CTE: two ranges overlap if the first range starts before the second range ends, and the second range starts before the first range ends.
The only tricky part is deciding how to treat two ranges that abut (the first one's end value is exactly equal to the second one's start, or vice versa), but that just comes down to whether to use < or <= in the comparisons.
In this instance, a payment that exactly paid off the previous bill should not count towards the next one, so we use < to avoid treating such situations as an overlap.
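For anyone who finds the range-overlap idea easier to follow procedurally, here is a minimal sketch of the same logic in Python rather than SQL. It is only an illustration of the technique, not a replacement for the set-based query; the function names running_ranges and allocate are made up for this example, and the data is the sample data from the setup above.

```python
from decimal import Decimal

def running_ranges(items):
    """Turn (id, balance) pairs into (id, start, end) running-total ranges."""
    ranges, total = [], Decimal("0")
    for item_id, balance in sorted(items):      # process in id order, like ORDER BY
        if balance > 0:                         # mirrors "where balance > 0"
            ranges.append((item_id, total, total + balance))
            total += balance
    return ranges

def allocate(bills, payments):
    """Return (billid, paymentid, amount) for every overlapping pair of ranges."""
    result = []
    for bill_id, b_start, b_end in running_ranges(bills):
        for pay_id, p_start, p_end in running_ranges(payments):
            # Two ranges overlap if each one starts before the other ends.
            if b_start < p_end and p_start < b_end:
                amount = min(b_end, p_end) - max(b_start, p_start)
                result.append((bill_id, pay_id, amount))
    return result

bills = [(1, Decimal("0")), (2, Decimal("22.50")),
         (3, Decimal("12.75")), (4, Decimal("19.20"))]
payments = [(1, Decimal("20.19")), (2, Decimal("5.50")), (3, Decimal("20"))]

for row in allocate(bills, payments):
    print(row)
```

With the sample data, bill 2 is covered by payment 1 (20.19) and part of payment 2 (2.31), and the amounts allocated add up to the total of the payments (45.69), leaving part of bill 4 unpaid.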

Related

SQL query to add column that counts number of encounters for the past year from each encounter

I am trying to identify High-Usage status for customers, i.e. at the time of each order, how many orders the customer placed in the preceding year. Each customer has a unique ID and each order has a unique ID, with a date/time stamp at the time of order. This is not just adding a count column, but a conditional count. I can recreate this in Excel using SUMPRODUCT, but wanted to see if I can automate the process in SSMS before my pull.
I tried a subquery column and then doing a join on a subquery result:
SELECT data.*
,HU_CUSTOMER_YOY
FROM data
LEFT JOIN (SELECT MAX(ORDER_ID) AS ORDER_ID
,COUNT(CUSTOMER_ID) AS HU_CUSTOMER_YOY
FROM data AS CUS_HU
WHERE ORDER_DTTM > DATEADD(YEAR, -1, ORDER_DTTM)
GROUP BY CUS_HU.CUSTOMER_ID)
CUSHU on CUSHU.ORDER_ID = data.ORDER_ID
This pulls in a value ONLY on the most recent order and counts ALL previous orders. To reiterate, I need a value on EACH unique order to count every order for that customer for the previous year from that unique order. My issue is using the DTTM column. If I use a static date like getdate(), it will count but I need the count for the DTTM-1year on EACH order to view historical data, i.e., when a customer began and fell-off High-Usage status, what contributed to the change, etc.
This is for a rather large dataset that is refreshed daily. I would prefer to not have the main query be aggregated, if possible, which is why I thought creating and joining a reference table would be preferred.
Is this possible?
Adding expected query results:
customer_id | order_id | HU_count | order_dttm
c1 | c1-1 | 0 | 1/1/2020
c1 | c1-2 | 1 | 7/1/2020
c1 | c1-3 | 0 | 1/1/2022
c1 | c1-4 | 1 | 1/10/2022
c2 | c2-1 | 0 | 1/11/2022
c1 | c1-5 | 2 | 1/14/2022
c2 | c2-2 | 1 | 1/15/2022
I assume you are using SQL Server from your usage of the DATEADD function.
Based on my understanding of the requirements, this shows, for each order, how many earlier orders the same customer placed in the year before that order.
SELECT d.customer_id
,d.order_id
,d.order_dttm
,(SELECT COUNT(*)
FROM data AS p
WHERE p.customer_id = d.customer_id
AND p.order_dttm < d.order_dttm
AND p.order_dttm >= DATEADD(YEAR, -1, d.order_dttm)) AS HU_count
FROM data AS d
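To check the counting rule against the expected results in the question, here is a small sketch in Python. It assumes the rule is "earlier orders by the same customer within the one-year window ending just before this order"; that window definition is my reading of the expected output, and prior_year_counts is a made-up helper name.

```python
from datetime import date

def prior_year_counts(orders):
    """orders: list of (customer_id, order_id, order_date).
    Returns {order_id: number of earlier orders by that customer in the prior year}."""
    result = {}
    for customer, order_id, when in orders:
        # Assumed window: [when - 1 year, when). Note that replace() would
        # raise for a Feb 29 date; the sample data has none.
        window_start = when.replace(year=when.year - 1)
        result[order_id] = sum(
            1 for c, _, d in orders
            if c == customer and window_start <= d < when
        )
    return result

orders = [
    ("c1", "c1-1", date(2020, 1, 1)),
    ("c1", "c1-2", date(2020, 7, 1)),
    ("c1", "c1-3", date(2022, 1, 1)),
    ("c1", "c1-4", date(2022, 1, 10)),
    ("c2", "c2-1", date(2022, 1, 11)),
    ("c1", "c1-5", date(2022, 1, 14)),
    ("c2", "c2-2", date(2022, 1, 15)),
]
counts = prior_year_counts(orders)
for _, order_id, _ in orders:
    print(order_id, counts[order_id])
```

Each order is compared only against orders of the same customer, so the count resets naturally once older orders fall out of the one-year window (as for c1-3).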

Multiple sum subqueries for percentage

I need help with the following problem: I want to write a query that computes multiple sums and then uses those sums to get a percentage: percentage = s1/(s1+s2).
I have the following data as input:
Orders shipping date, Nb of orders that have arrived late, Nb of orders that have arrived on time
What I want as output: the percentage of orders that have arrived late and of orders that have arrived on time.
I want another column in the table that holds the percentage, computed in SQL.
Concrete example:
- On 2022/01/04 at 10:00 AM I have 3 orders late and 4 orders on time => 7 orders in total. Percentage = 3/7 (late), 4/7 (on time).
- On 2022/01/04 at 11:00 AM I have 5 orders late and 6 orders on time => 11 orders in total, but this entry is summed with the previous one, so: 3+5 = 8 orders late, 4+6 = 10 orders on time, 18 orders in total => percentage = 8/18 late, 10/18 on time.
To sum the order counts of previous entries with status "LATE" up to the current entry, I wrote the following SQL (sum1 = s1):
SELECT s1.EventDate, (
SELECT SUM(s2.NbOfOrders)
FROM OrderShipmentStats s2
WHERE s2.EventDate <= s1.EventDate AND s2.Status='LATE'
) AS cnt
FROM OrderShipmentStats s1
GROUP BY s1.EventDate, s1.Status
The same kind of SQL was written for "On Time" and it works. What I need to do now is combine the values from the two queries and, depending on whether the status is late or on time, compute s1/(s1+s2) or s2/(s1+s2).
My problem is that I do not know how to express this formula in a single query using those two subqueries. Any help would be great.
Picture with Table
Above there is the link with the picture containing how the table looks(I am new so I am not allowed to embed a photo).
The percentage column is the one I will add and there are lines pointing towards how that is calculated.
I created the table based on your image and added a few rows to it.
The query shows the total order count per hour, the running total per status, and the running grand total, as in your image.
The setup and query look like this:
create table OrderShipmentsStats
(
EventDate datetime not null,
Status varchar(10) not null,
OrdersCount int not null
)
insert into OrderShipmentsStats
values
('2022-01-04T10:00:00','Late',3),
('2022-01-04T10:00:00','On Time',4),
('2022-01-04T11:00:00','Late',5),
('2022-01-04T11:00:00','On Time',6),
('2022-01-04T12:00:00','Late',1),
('2022-01-04T12:00:00','On Time',2)
SELECT
EventDate,
Status,
OrdersCount,
TotalPerHour,
StatusTotal,
GrandStatusTotal,
-- multiplying by 1.0 forces decimal division, so the result is a fraction such as 0.45 or 0.123
-- multiply by 100 to get the actual percentage, e.g. 15 or 50
cast(1.0 * o.StatusTotal / o.GrandStatusTotal as decimal(5,3)) * 100 as Percentage
from
(
select
EventDate,
Status,
OrdersCount,
TotalPerHour,
StatusTotal,
SUM(TotalPerHour) over (partition by Status order by EventDate asc) as GrandStatusTotal
from
(
select
EventDate,
Status,
OrdersCount,
Sum(OrdersCount) over (partition by EventDate order by EventDate asc) as TotalPerHour,
SUM(OrdersCount) over (partition by Status order by EventDate asc) as StatusTotal
from OrderShipmentsStats
) as t
) as o
order by EventDate, Status
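As a cross-check of the arithmetic, the following Python sketch computes the same cumulative percentages for the sample rows inserted above. It is not part of the SQL solution; cumulative_share is a made-up name, and rounding to one decimal place is an assumption chosen to match what the decimal(5,3) cast produces here.

```python
from collections import defaultdict

def cumulative_share(rows):
    """rows: (event_time, status, count) in any order.
    Returns (event_time, status, percent) using running totals up to each time."""
    status_total = defaultdict(int)   # running total per status
    grand_total = 0                   # running total over all statuses
    out = []
    for t in sorted({r[0] for r in rows}):
        hour_rows = [r for r in rows if r[0] == t]
        # Add every row of this hour first, so both statuses share one grand total.
        for _, status, count in hour_rows:
            status_total[status] += count
            grand_total += count
        for _, status, _ in sorted(hour_rows, key=lambda r: r[1]):
            pct = round(100.0 * status_total[status] / grand_total, 1)
            out.append((t, status, pct))
    return out

rows = [
    ("2022-01-04T10:00:00", "Late", 3), ("2022-01-04T10:00:00", "On Time", 4),
    ("2022-01-04T11:00:00", "Late", 5), ("2022-01-04T11:00:00", "On Time", 6),
    ("2022-01-04T12:00:00", "Late", 1), ("2022-01-04T12:00:00", "On Time", 2),
]
for line in cumulative_share(rows):
    print(line)
```

At 11:00 this gives 8/18 for Late and 10/18 for On Time, which is the calculation described in the question.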

Select all rows where the sum of column X is greater than or equal to Y

I need to find a group of lots to satisfy X demand for items. I can't do it with aggregate functions, it seems to me that I need something more than a window function, do you know anything that can help me solve this problem?
For example, if I have a demand for 1 Item, the query should return any lot with a quantity greater than or equal to 1. But if I have a demand for 15, there are no lots with that availability, so it should return a lot of 10 and another with 5 or one of 10 and two of 3, etc.
With a programming language like Java this is simple, but is it possible with SQL? I am trying to achieve it with window functions, but I cannot find a way to keep adding the available quantity of the current row until the required quantity is reached.
SELECT id,VC_NUMERO_LOTE,SF_FECHA_CREACION,SI_ID_M_ARTICULO,VI_CANTIDAD,NEXT, VI_CANTIDAD + NEXT AS TOT FROM (
SELECT row_number() over (ORDER BY SF_FECHA_CREACION desc) id ,VC_NUMERO_LOTE,SF_FECHA_CREACION,SI_ID_M_ARTICULO,
VI_CANTIDAD,LEAD(VI_CANTIDAD,1) OVER (ORDER BY SF_FECHA_CREACION desc) as NEXT FROM PUBLIC.M_LOTE WHERE SI_ID_M_ARTICULO = 44974
AND VI_CANTIDAD > 0 ) AS T
WHERE MOD(id, 2) != 0
I tried using LEAD and then summing only the odd rows, but I can see that this is not the way. Any suggestions?
You need a recursive query like this:
WITH RECURSIVE lots_with_rowcount AS ( -- 1
SELECT
*,
row_number() OVER (ORDER BY avail_qty DESC) as rowcnt
FROM mytable
), lots AS ( -- 2
SELECT -- 3
lot_nr,
avail_qty,
rowcnt,
avail_qty as total_qty
FROM lots_with_rowcount
WHERE rowcnt = 1
UNION
SELECT
t.lot_nr,
t.avail_qty,
t.rowcnt,
l.total_qty + t.avail_qty -- 4
FROM lots_with_rowcount t
JOIN lots l ON t.rowcnt = l.rowcnt + 1
AND l.total_qty < 15 -- <your demand here>; 15 is only an example value
)
SELECT * FROM lots -- 5
1. This CTE only provides a row count for each record, which the recursion uses to join the next record.
2. This is the recursive CTE. A recursive CTE contains two parts: the initial SELECT statement and the recursion.
3. Initial part: queries the lot record with the highest avail_qty value. Naturally, you can order the lots any way you like. Taking the largest quantities first yields the fewest rows.
4. After the UNION comes the recursion part: the current row is joined to the previous output, with an additional condition: join only if the previous output doesn't yet cover your demand value. In that case, the next total_qty value is calculated from the previous total and the current quantity.
5. The recursion ends when there is no record left that fits the join condition. Then you can SELECT the entire recursion output.
Notice: If your demand is higher than the total of all your available quantities, this returns the entire table, because the recursion runs until the demand is reached or the table ends. You should run a check beforehand:
SELECT SUM(avail_qty) > demand FROM mytable
I gratefully played around with S-Man's fiddle and found a query that is, at least, simpler to understand:
select lot_nr, avail_qty, tot_amount from
(select lot_nr, avail_qty,
sum(avail_qty) over (order by avail_qty desc rows between unbounded preceding and current row) as tot_amount,
sum(avail_qty) over (order by avail_qty desc rows between unbounded preceding and current row) - avail_qty as last_amount
from mytable) amounts
where last_amount < 15 -- your amount here
This lists every row for which the running total of its predecessors (in descending order of avail_qty) has not yet reached the limit.
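The same selection rule, written as a loop, may make the last_amount condition easier to see. This is only a sketch in Python with made-up lot data (the question does not give quantities beyond the 10/5/3 example), and pick_lots is a hypothetical name.

```python
def pick_lots(lots, demand):
    """lots: list of (lot_nr, avail_qty).
    Returns (lot_nr, avail_qty, running_total) for the lots needed to cover demand,
    taking the largest lots first."""
    chosen, total = [], 0
    for lot_nr, qty in sorted(lots, key=lambda lot: lot[1], reverse=True):
        if total >= demand:      # the SQL keeps a row only while last_amount < demand
            break
        total += qty
        chosen.append((lot_nr, qty, total))
    return chosen

# Hypothetical sample data
lots = [("A", 10), ("B", 5), ("C", 3), ("D", 3)]
print(pick_lots(lots, 15))    # two lots: 10 + 5
print(pick_lots(lots, 1))     # a single lot is enough
print(pick_lots(lots, 100))   # demand cannot be met: every lot is returned
```

As with the SQL versions, a demand larger than the total stock returns all lots, so the caller still has to compare the final running total with the demand.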
Here is a simple old-school PL/pgSQL version that uses a (slow) loop. It returns only the lot numbers, as an illustration. It walks through the lots for a particular item_id in a certain order (one that reflects the required business rules) and allocates the available quantities until the allocated quantity equals or exceeds the required quantity.
create function get_lots(required_item integer, required_qty integer) returns setof text as
$$
declare
r record;
allocated_qty integer := 0;
begin
for r in select * from lots where item_id = required_item order by <your biz-rule> loop
return next r.lot_number;
allocated_qty := allocated_qty + r.available_qty;
exit when allocated_qty >= required_qty;
end loop;
end;
$$ language plpgsql;
-- Use
select lot_id from get_lots(1, 17) lot_id;

sql lowest running balance in a group

I've been trying for days to solve this problem, without success.
I want to get the lowest running balance in a group.
Here is some sample data.
The running balance is not part of the table; it is computed dynamically.
The problem is that I want to get the lowest running balance in a specific month (January),
so the output should be 150 for memberid 10001 and 175 for memberid 10002, as highlighted in the image.
My desired output is:
memberid | balance
10001 | 150
10002 | 175
Is that possible using a SQL query only?
PS. Using C# to compute the lowest running balance is very slow, since I have more than 600,000 records in my table.
I've updated the question.
The answer provided by Mihir Shah gave me the idea of how to solve my problem.
His answer takes too much time to process, making it as slow as the computation in my C# program, because his code loops over every record.
Here is my answer for getting the lowest running total within a specific group (a specific month) without sacrificing much performance.
with IniMonth1 as
(
select a.memberid, a.iniDeposit, a.iniWithdrawal,
(cast(a.iniDeposit as decimal(10,2)) - cast(a.iniWithdrawal as decimal(10,2))) as RunningTotal
from
(
select b.memberid, sum(b.depositamt) as iniDeposit, sum(b.withdrawalamt) as iniWithdrawal
from savings b
where trdate < '01/01/2016'
group by b.memberid
) a /*gets all the memberid, sum of deposit amount and withdrawal amt from the beginning of the savings before the specific month */
where cast(a.iniDeposit as decimal(10,2)) - cast(a.iniWithdrawal as decimal(10,2)) > 0 /*filters zero savings */
)
,DetailMonth1 as
(
select a.memberid, a.depositamt,a.withdrawalamt,
(cast(a.depositamt as decimal(10,2)) - cast(a.withdrawalamt as decimal(10,2))) as totalBal,
Row_Number() Over(Partition By a.memberid Order By a.trdate Asc) RowID
from savings a
where
a.trdate >= '01/01/2016'
and
a.trdate <= '01/31/2016'
and (a.depositamt<>0 or a.withdrawalamt<>0)
) /* gets all record within the specific month and gives a no of row as an id for the running value in the next procedure*/
,ComputedDetailMonth1 as
(
select a.memberid, min(a.runningbalance) as MinRunningBal
from
(
select a.rowid, a.memberid, a.totalbal,
(
sum(b.totalbal) +
(case
when c.runningtotal is null then 0
else c.runningtotal
end)
)as runningbalance , c.runningtotal as oldbalance
from DetailMonth1 a
inner join DetailMonth1 b
on b.rowid<=a.rowid
and a.memberid=b.memberid
left join IniMonth1 c
on a.memberid=c.memberid
group by a.rowid,a.memberid,a.totalbal,c.runningtotal
) a
group by a.memberid
) /* the loop is only for the records of the specific month only making it much faster */
/* this gets the running balance of specific month ONLY and ADD the sum total of IniMonth1 using join to get the running balance from the beginning of savings to the specific month */
/* I then get the minimum of the output using the min function*/
, OldBalanceWithNoNewSavingsMonth1 as
(
select a.memberid,a.RunningTotal
from
IniMonth1 a
left join
DetailMonth1 b
on a.memberid = b.memberid
where b.totalbal is null
)/*this gets all the savings that is not zero and has no transaction in the specific month making and this will become the default value as the lowest value if the member has no transaction in the specific month. */
,finalComputedMonth1 as
(
select a.memberid,a.runningTotal as MinRunTotal from OldBalanceWithNoNewSavingsMonth1 a
union
select b.memberid,b.MinRunningBal from ComputedDetailMonth1 b
)/*the record with minimum running total with clients that has a transaction in the specific month Unions with the members with no current transaction in the specific month*/
select * from finalComputedMonth1 order by memberid /* display the final output */
I have more than 600k records in my savings table.
Surprisingly, this code performs very well.
It takes almost 2 hours for my C# program to manually compute every record for all the members.
This code takes only 2 seconds, and at most 9 seconds, to compute everything.
Displaying the result in C# takes another 2 seconds.
The output of this code was tested and compared against the results computed by my C# program.
Maybe the following will help you:
Set Nocount On;
Create Table #CashFlow
(
savingsid Varchar(50)
,memberid Int
,trdate Date
,deposit Decimal(18,2)
,withdrawal Decimal(18,2)
)
Insert Into #CashFlow(savingsid,memberid,trdate,deposit,withdrawal) Values
('10001-0002',10001,'01/01/2015',1000,0)
,('10001-0003',10001,'01/07/2015',25,0)
,('10001-0004',10001,'01/13/2015',25,0)
,('10001-0005',10001,'01/19/2015',0,900)
,('10001-0006',10001,'01/25/2015',25,0)
,('10001-0007',10001,'01/31/2015',25,0)
,('10001-0008',10001,'02/06/2015',25,0)
,('10001-0009',10001,'02/12/2015',25,0)
,('10001-0010',10001,'02/18/2015',0,200)
,('10002-0001',10002,'01/01/2015',500,0)
,('10002-0002',10002,'01/07/2015',25,0)
,('10002-0003',10002,'01/13/2015',0,200)
,('10002-0004',10002,'01/19/2015',25,0)
,('10002-0005',10002,'01/25/2015',25,0)
,('10002-0006',10002,'01/31/2015',0,200)
,('10002-0007',10002,'02/06/2015',25,0)
,('10002-0008',10002,'02/12/2015',25,0)
,('10002-0009',10002,'02/12/2015',0,200)
;With TrialBalance As
(
Select Row_Number() Over(Partition By cf.memberid Order By cf.trdate Asc) RowNum
,cf.memberid
,cf.deposit
,cf.withdrawal
,cf.trdate
From #CashFlow As cf
)
,RunningBalance As
(
Select tb.RowNum
,tb.memberid
,tb.deposit
,tb.withdrawal
,tb.trdate
From TrialBalance As tb
Where tb.RowNum = 1
Union All
Select tb.RowNum
,rb.memberid
,Cast((rb.deposit + tb.deposit - tb.withdrawal) As Decimal(18,2))
,rb.withdrawal
,tb.trdate
From TrialBalance As tb
Join RunningBalance As rb On tb.RowNum = (rb.Rownum + 1) And tb.memberid = rb.memberid
)
Select rb.memberid
,Min(rb.deposit) As runningBalance
From RunningBalance As rb
Where Year(rb.trdate) = 2015
And Month(rb.trdate) = 1
Group By rb.memberid
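For comparison, the expected result can be reproduced with a single ordered pass over the rows. The sketch below is in Python and uses the sample rows from this answer; min_balance_in_month is a made-up name, and dates are written as ISO strings purely so that the month can be matched by prefix. Members with no transaction in the month do not appear in the result here.

```python
def min_balance_in_month(rows, month_prefix):
    """rows: (memberid, iso_date, deposit, withdrawal).
    Returns {memberid: lowest running balance reached on a row in that month}."""
    balance, lowest = {}, {}
    # sorted() is stable, so rows on the same day keep their input order.
    for member, day, deposit, withdrawal in sorted(rows, key=lambda r: (r[0], r[1])):
        balance[member] = balance.get(member, 0) + deposit - withdrawal
        if day.startswith(month_prefix):
            lowest[member] = min(lowest.get(member, balance[member]), balance[member])
    return lowest

rows = [
    (10001, "2015-01-01", 1000, 0), (10001, "2015-01-07", 25, 0),
    (10001, "2015-01-13", 25, 0),   (10001, "2015-01-19", 0, 900),
    (10001, "2015-01-25", 25, 0),   (10001, "2015-01-31", 25, 0),
    (10001, "2015-02-06", 25, 0),   (10001, "2015-02-12", 25, 0),
    (10001, "2015-02-18", 0, 200),
    (10002, "2015-01-01", 500, 0),  (10002, "2015-01-07", 25, 0),
    (10002, "2015-01-13", 0, 200),  (10002, "2015-01-19", 25, 0),
    (10002, "2015-01-25", 25, 0),   (10002, "2015-01-31", 0, 200),
    (10002, "2015-02-06", 25, 0),   (10002, "2015-02-12", 25, 0),
    (10002, "2015-02-12", 0, 200),
]
print(min_balance_in_month(rows, "2015-01"))
```

For January this gives 150 for member 10001 and 175 for member 10002, matching the desired output in the question.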

Calculating information by using values from previous line

I have the current balance for each account and I need to subtract the netamount for transactions to create the previous month's end balance for the past 24 months. Below is a sample dataset;
create table txn_by_month (
memberid varchar(15)
,accountid varchar(15)
,effective_year varchar(4)
,effective_month varchar(2)
,balance money
,netamt money
,prev_mnthendbal money)
insert into txn_by_month values
(10001,111222333,2012,12,634.15,-500,1134.15)
,(10001,111222333,2012,11,NULL,-1436,NULL)
,(10001,111222333,2012,10,NULL,600,NULL)
,(10002,111333444,2012,12,1544.20,1650,-105.80)
,(10002,111333444,2012,11,NULL,1210,NULL)
,(10002,111333444,2012,10,NULL,-622,NULL)
,(10003,111456456,2012,01,125000,1200,123800)
,(10003,111456456,2011,12,NULL,1350,NULL)
,(10003,111456456,2011,11,NULL,-102,NULL)
As you can see I already have a table of all the transactions for each month totaled up. I just need to calculate the previous month end balance on the first line and bring it down to the second, third line etc. I have been trying to use CTEs, but am not overly familiar with them and seem to be stuck at the moment. This is what I have;
;
WITH CTEtest AS
(SELECT ROW_NUMBER() OVER (PARTITION BY memberid order by(accountid)) AS Sequence
,memberid
,accountid
,prev_mnthendbal
,netamt
FROM txn_by_month)
select c1.memberid
,c1.accountid
,c1.sequence
,c2.prev_mnthendbal as prev_mnthendbal
,c1.netamt,
COALESCE(c2.prev_mnthendbal, 0) - COALESCE(c1.netamt, 0) AS cur_mnthendbal
FROM CTEtest AS c1
LEFT OUTER JOIN CTEtest AS c2
ON c1.memberid = c2.memberid
and c1.accountid = c2.accountid
and c1.Sequence = c2.Sequence + 1
This is working only for the sequence = 2. I know that my issue is that I need to bring my cur_mnthendbal value down into the next line, but I can't seem to wrap my head around how. Do I need another CTE?
Any help would be greatly appreciated!
EDIT: Maybe I need to explain it better.... If I have this;
The balance for line 2 would be the prev_mnthendbal from line 1 ($1,134.15). Then the prev_mnthendbal from line 2 would be the balance - netamt ($1,134.15 - (-$1,436) = $2,570.15). I have been trying to use CTEs, but I can't seem to figure out how to populate the balance field with the prev_mnthendbal from the previous line (since it isn't calculated until the balance is available). Maybe I can't use CTE? Do I need to use cursor?
Turns out that I needed to combine a running total with the sequential CTE I was using to begin with.
;
with CTEtest AS
(SELECT ROW_NUMBER() OVER (PARTITION BY memberid order by effective_year desc, cast(effective_month as int) desc) AS Sequence, *
FROM txn_by_month)
,test
as (select * , balance - netamt as running_sum from CTEtest where sequence = 1
union all
select t.*, t1.running_sum - t.netamt from CTEtest t inner join test t1
on t.memberid = t1.memberid and t.sequence = t1.Sequence+1 where t.sequence > 1)
select * from test
order by memberid, Sequence
Hopefully this will help someone else in the future.
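The recursion amounts to walking backwards from the current balance and subtracting each month's net amount in turn. Here is a minimal sketch of that arithmetic in Python, using the rows for member 10001 from the question; prev_month_end_balances is a made-up name, and Decimal stands in for the money type.

```python
from decimal import Decimal

def prev_month_end_balances(current_balance, net_amounts):
    """net_amounts: one net amount per month, newest month first.
    Returns the previous-month-end balance for each of those months."""
    result, balance = [], current_balance
    for net in net_amounts:
        balance -= net            # undo this month's net movement
        result.append(balance)    # balance at the end of the month before
    return result

# Member 10001: current balance 634.15, net amounts for 2012-12, 2012-11, 2012-10
balances = prev_month_end_balances(
    Decimal("634.15"),
    [Decimal("-500"), Decimal("-1436"), Decimal("600")],
)
print(balances)
```

The first two values, 1134.15 and 2570.15, match the figures worked out by hand in the question's edit.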
See LEAD/LAG analytic functions.