SQL - Minus Payment amount from rows in ascending order? - sql

I have a PaymentSchedule table that looks like the below which contains information about contracts, and when we expect to get paid on them.
contractkey  payment  total    DueDate
385884       Upfront  95.356   2022-05-17 00:00:00.000
385884       First    1        2022-06-09 00:00:00.000
385884       Final    143.034  2024-07-17 00:00:00.000
I then have another table which contains payments received at ContractKey level structured like the below..
PaymentKey  ContractKey  Total
1           385884       47.68
These tables are joined using ContractKey. What I am trying to do is add a column to my PaymentSchedule table which shows how much of each scheduled payment has already been paid off in the Payments table. In the example, 47.68 has been received for ContractKey 385884, and my calculated column should reflect that.
I have written the SQL below, but it isn't giving me the correct output for the subsequent rows:
with debitdetails as(
select contractkey,sum(total)[totalpaid] from fact.Payments
group by contractkey
)
select s.contractkey, s.Payment, s.total, [DueDate],
sum(s.total) over (partition by s.contractkey order by [DueDate] asc) - totalpaid [TotalRemaining]
from [ref].[PaymentSchedule] s
left join debitdetails dd on s.contractkey=dd.ContractKey
where s.contractkey = 385884
order by s.contractkey
This gives me the output below, which isn't what I want: for each amount due, I want to show how much remains after subtracting the amount already paid. So the second row should show 1, and the third 143.03.
contractkey  Payment  total    DueDate                  TotalRemaining
385884       Upfront  95.356   2022-05-17 00:00:00.000  47.676
385884       First    1        2022-06-09 00:00:00.000  47.676
385884       Final    143.034  2024-07-17 00:00:00.000  190.71
Can anyone help me identify where I am going wrong? I assume I am just missing something really simple..

Use a case expression to check totalpaid against the cumulative sum and calculate the remaining amount accordingly.
First condition: totalpaid is at least the cumulative sum, so this row is fully paid and Remaining = 0.
Second condition: totalpaid only partially covers the cumulative sum (it covers all earlier rows but not all of this one), so Remaining = cumulative sum - totalpaid.
Final condition (else): totalpaid does not reach this row at all, so Remaining = s.Total.
TotalRemaining = case when isnull(dd.totalpaid, 0)
                           >= sum(s.Total) over (partition by s.contractkey
                                                     order by s.DueDate)
                      then 0
                      when isnull(dd.totalpaid, 0)
                           >= sum(s.Total) over (partition by s.contractkey
                                                     order by s.DueDate)
                            - s.Total
                      then sum(s.Total) over (partition by s.contractkey
                                                  order by s.DueDate)
                            - isnull(dd.totalpaid, 0)
                      else s.Total
                 end
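To check the three branches against the sample data from the question, here is a minimal sketch that runs the same case expression in an in-memory SQLite database from Python. SQLite is only a stand-in here (an assumption, since the question is about SQL Server), so COALESCE replaces ISNULL, the schema prefixes are dropped, and the cum_total alias and the running CTE are names introduced for the example. Window functions need SQLite 3.25 or later.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE PaymentSchedule (contractkey INTEGER, payment TEXT, total REAL, DueDate TEXT);
INSERT INTO PaymentSchedule VALUES
  (385884, 'Upfront', 95.356,  '2022-05-17'),
  (385884, 'First',   1,       '2022-06-09'),
  (385884, 'Final',   143.034, '2024-07-17');
CREATE TABLE Payments (PaymentKey INTEGER, ContractKey INTEGER, Total REAL);
INSERT INTO Payments VALUES (1, 385884, 47.68);
""")

rows = conn.execute("""
WITH debitdetails AS (
    SELECT ContractKey, SUM(Total) AS totalpaid
    FROM Payments
    GROUP BY ContractKey
),
running AS (
    -- cum_total is the running total of scheduled amounts up to and including this row
    SELECT s.payment, s.total, s.DueDate,
           SUM(s.total) OVER (PARTITION BY s.contractkey ORDER BY s.DueDate) AS cum_total,
           COALESCE(dd.totalpaid, 0) AS totalpaid
    FROM PaymentSchedule s
    LEFT JOIN debitdetails dd ON s.contractkey = dd.ContractKey
)
SELECT payment,
       ROUND(CASE WHEN totalpaid >= cum_total         THEN 0                      -- fully paid
                  WHEN totalpaid >= cum_total - total THEN cum_total - totalpaid  -- partly paid
                  ELSE total                                                      -- untouched
             END, 3) AS TotalRemaining
FROM running
ORDER BY DueDate
""").fetchall()

# Expected remaining amounts: about 47.676 for Upfront, 1 for First, 143.034 for Final
for payment, remaining in rows:
    print(payment, remaining)
```

Computing the running total once in a CTE avoids repeating the window expression in every branch; the same restructuring should carry over to SQL Server.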

Related

Window function or Recursive query in Redshift

I am trying to classify customers based on monthly sales. The classification logic:
new customer - has sales and appears for the first time, or has sales and has returned after being lost (after a 4-month period, based on sales_date)
active customer - not new and not lost.
lost customer - no sales, and the gap (based on sales_date) is more than 4 months
This is the desired output I'm trying to achieve:
The window-function query below (Redshift) classifies the customers, but not correctly.
It classifies a customer as lost when the difference between months is > 4 in a single row, but it does not keep the customer as lost on the following rows where revenue stays 0, until a new status appears. How can it be updated?
with customer_status as (
select customer_id,customer_name,sales_date,sum(sales_amount) as sales_amount_calc,
nvl(DATEDIFF(month, LAG(reporting_date) OVER (PARTITION BY customer_id ORDER by sales_date ASC), sales_date),0) AS months_between_sales
from customer
GROUP BY customer_id,customer_name,sales_date
)
select *,
case
WHEN months_between_sales = 0 THEN 'New'
WHEN months_between_sales > 0 THEN 'Active'
WHEN months_between_sales > 0 AND months_between_sales <= 4 and sales_amount_calc = 0 THEN 'Active'
WHEN /* months_between_sales > 0 and */ months_between_sales > 4 and sales_amount_calc = 0 THEN 'Lost'
ELSE 'Unknown'
END AS status
from customer_status
One way to solve this might be to compute a cumulative amount partitioned on the subset of potentially lost customers (sales_amount = 0).
The cumulative amount partitioned by customer is:
sum(months_between_sales) over (PARTITION BY customer_id ORDER by sales_date ASC rows unbounded preceding) as cumulative_amount,
How can this be changed to sub-partition, for example on sales amount = 0, so that lost customers are classified correctly?
Does anyone have ideas for translating the above logic into a recursive query in Redshift?
Thank you

How to use window function to aggregate data based on date or rank column?

So I have a list of shipments, and I have the order total and the total for each individual shipment, but I'm struggling to come up with the code to create an additional column for cumulative shipments, which would include the current shipment plus all previous shipments for that order. Here's a result of what I have so far:
OrderNo  ShipDate    OrderTotal  Shipment Total  Cumulative Shipments  Rank
22396    2022-04-04  639,964     2,983           639,966               3
22396    2022-03-31  639,964     5,626           639,966               2
22396    2022-02-24  639,964     631,355         639,966               1
So these are 3 separate shipments for the same order. The 1st shipment, in row 3, is correct, but I need the cumulative shipments column for row 2 to be the sum of both shipment totals, so $631,355 + 5,626. Following that same logic, row 1 should be the sum of all 3, which at that point would be equal to the order total of $639,964. Here's what that would look like:
OrderNo  ShipDate    OrderTotal  Shipment Total  Cumulative Shipments  Rank
22396    2022-04-04  639,964     2,983           639,964               3
22396    2022-03-31  639,964     5,626           636,981               2
22396    2022-02-24  639,964     631,355         631,355               1
I'm assuming the best way to accomplish this is using over(partition by ()), but I'm struggling to come up with the code. Here's what I have so far:
SELECT
OrderNo,
ShipDate,
OrderTotal,
[Shipment Total],
SUM([Shipment Total]) OVER(PARTITION BY OrderNo) AS [Cumulative Shipments],
[Rank]
FROM Shipments
WHERE OrderNo = '22396'
The [Rank] column is from an earlier CTE which calculates the rank of that shipment based on shipdate:
ROW_NUMBER() OVER(PARTITION BY d.OrderNo ORDER BY d.ShipDate) AS [Rank]
I need something like SUM([Shipment Total]) where rank is equal or less than the current rank. Same thing can be accomplished with the date column I'm sure, but just not sure how to finish the query
You seem to be halfway there; you are just missing an ordering criterion, which is what turns the windowed sum into a cumulative sum, such as
select *,
Sum(ShipmentTotal)
over(partition by OrderNo
order by ShipDate rows between unbounded preceding and current row)
from Shipments;
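As a quick check of the ordered window, the sketch below loads the three shipments from the question into an in-memory SQLite database (a stand-in for the asker's database, which is an assumption) and runs the same cumulative sum. The column is named ShipmentTotal as in the answer, and the amounts are stored as whole numbers for simplicity.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Shipments (OrderNo TEXT, ShipDate TEXT, ShipmentTotal INTEGER);
INSERT INTO Shipments VALUES
  ('22396', '2022-04-04', 2983),
  ('22396', '2022-03-31', 5626),
  ('22396', '2022-02-24', 631355);
""")

cumulative = conn.execute("""
SELECT ShipDate,
       SUM(ShipmentTotal) OVER (
           PARTITION BY OrderNo
           ORDER BY ShipDate
           ROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW
       ) AS CumulativeShipments
FROM Shipments
ORDER BY ShipDate
""").fetchall()

for row in cumulative:
    print(row)
# ('2022-02-24', 631355)
# ('2022-03-31', 636981)
# ('2022-04-04', 639964)
```

Without the ORDER BY inside OVER, every row would get the partition-wide total, which is the behaviour described in the question.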
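I didn't bother building the CTE, but assuming you have some PK (ShipmentsID here), you can self-join. The join also has to match on OrderNo, otherwise shipments from other orders get summed in:
SELECT s.ShipmentsID, SUM(cumulative.ShipmentTotal) AS Sum
FROM Shipments s
LEFT JOIN Shipments cumulative
       ON cumulative.OrderNo = s.OrderNo
      AND cumulative.ShipDate <= s.ShipDate
GROUP BY s.ShipmentsID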
ShipmentsID  Sum
1            631355.52
2            636982.29
3            639966.04
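The self-join approach can be sketched the same way. This example assumes a ShipmentsID primary key as in the answer, uses whole-number amounts from the question, and adds one hypothetical shipment for a second order ('99999') to show that the join is restricted to the same OrderNo. SQLite from Python is used only as a convenient stand-in.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Shipments (
    ShipmentsID   INTEGER PRIMARY KEY,
    OrderNo       TEXT,
    ShipDate      TEXT,
    ShipmentTotal INTEGER
);
INSERT INTO Shipments VALUES
  (1, '22396', '2022-02-24', 631355),
  (2, '22396', '2022-03-31', 5626),
  (3, '22396', '2022-04-04', 2983),
  (4, '99999', '2022-03-01', 100);  -- hypothetical shipment for another order
""")

self_join = conn.execute("""
SELECT s.ShipmentsID, SUM(c.ShipmentTotal) AS CumulativeShipments
FROM Shipments s
JOIN Shipments c
  ON c.OrderNo = s.OrderNo        -- stay within the same order
 AND c.ShipDate <= s.ShipDate     -- this shipment and every earlier one
GROUP BY s.ShipmentsID
ORDER BY s.ShipmentsID
""").fetchall()

print(self_join)
# [(1, 631355), (2, 636981), (3, 639964), (4, 100)]
```

The self-join compares every shipment with every earlier shipment of the same order, so the windowed SUM is usually the better choice where window functions are available.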

I have 3 rows per user, need to have one row (with 3 columns) per user instead

I'm creating a table with the earliest 3 purchases by customer along with the total count of purchases by said customer, using a CTE. I did this successfully with the query below, but it shows 3 rows for each user with a row for the first purchase date, 2nd purchase date, and 3rd purchase date as separate rows. I'm trying to show the 3 purchase dates as columns, with one row for each user, instead.
This table has hundreds of rows so I can't write the needed user IDs in the code. Any ideas? Is there a way to merge 3 CTEs or write code to spit out the earliest payment date, 2nd earliest, 3rd earliest, and total amount for the user as columns. Current code is below:
WITH cte_2
AS (SELECT customer_id,
payment_date,
Row_number()
OVER (
partition BY customer_id
ORDER BY payment_date ASC) AS purchase_number
FROM payment)
SELECT cte_2.customer_id,
cte_2.payment_date,
cte_2.purchase_number,
Count(payment_id) AS total_payments
FROM payment
INNER JOIN cte_2
ON payment.customer_id = cte_2.customer_id
WHERE purchase_number <= 3
GROUP BY cte_2.customer_id,
cte_2.payment_date,
purchase_number
ORDER BY customer_id ASC
Current Output with above code:
Preferred Output:
Using pandas you can use pivot:
df = df.set_index('customer_id')
pivot_df = df.pivot(columns='purchase_number', values='payment_dates')
# To improve readability of your columns you can add a prefix:
pivot_df = pivot_df.add_prefix('payment_')
pivot_df.merge(df['total_payments'], left_index=True, right_index=True).drop_duplicates()
With the following sample data:
import pandas as pd
df = pd.DataFrame({
'customer_id':[1,1,1,2,2,2,3],
'payment_dates':['2021-01-01', '2021-01-02', '2021-01-03', '2021-01-04', '2021-01-05', '2021-01-06', '2021-01-01'],
'purchase_number':[1,2,3,1,2,3,1],
'total_payments':[4,4,4,26,26,26,1]})
Our result is:
payment_1 payment_2 payment_3 total_payments
customer_id
1 2021-01-01 2021-01-02 2021-01-03 4
2 2021-01-04 2021-01-05 2021-01-06 26
3 2021-01-01 NaN NaN 1
If your SQL product supports 'case when', you can do it with conditional aggregation, where each Max(case ...) picks out the payment date for one purchase number:
WITH
cte_2
AS (SELECT payment_id,
           payment_date,
           Row_number()
             OVER (
               partition BY customer_id
               ORDER BY payment_date ASC) AS purchase_number
    FROM payment)
SELECT pmt.customer_id,
       Max(case when cte_2.purchase_number=1 then cte_2.payment_date end) as [First Payment],
       Max(case when cte_2.purchase_number=2 then cte_2.payment_date end) as [2nd Payment],
       Max(case when cte_2.purchase_number=3 then cte_2.payment_date end) as [3rd Payment],
       Count(pmt.payment_id) AS total_payments
FROM payment pmt
LEFT JOIN cte_2
       ON pmt.payment_id=cte_2.payment_id
      and cte_2.purchase_number <= 3
GROUP BY pmt.customer_id
ORDER BY pmt.customer_id ASC
The CTE simply assigns a payment number to each payment. We then join the payment table to that CTE by payment id using a left join, because an inner join would remove payments with a payment number > 3 (but we still want to count them).
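The left join and the one-row-per-customer shape can be tried out with the sketch below, which runs against an in-memory SQLite database from Python. The sample rows are made up for illustration, the column aliases are plain identifiers because SQLite is only a stand-in, and this version uses MAX(CASE ...) over payment_date so that the three columns hold the dates themselves.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE payment (
    payment_id   INTEGER PRIMARY KEY,
    customer_id  INTEGER,
    payment_date TEXT
);
-- made-up sample data: customer 1 has four payments, customer 2 has one
INSERT INTO payment (customer_id, payment_date) VALUES
  (1, '2021-01-01'), (1, '2021-01-02'), (1, '2021-01-03'), (1, '2021-01-09'),
  (2, '2021-01-04');
""")

pivoted = conn.execute("""
WITH cte_2 AS (
    SELECT payment_id, payment_date,
           ROW_NUMBER() OVER (PARTITION BY customer_id
                              ORDER BY payment_date) AS purchase_number
    FROM payment
)
SELECT pmt.customer_id,
       MAX(CASE WHEN cte_2.purchase_number = 1 THEN cte_2.payment_date END) AS first_payment,
       MAX(CASE WHEN cte_2.purchase_number = 2 THEN cte_2.payment_date END) AS second_payment,
       MAX(CASE WHEN cte_2.purchase_number = 3 THEN cte_2.payment_date END) AS third_payment,
       COUNT(pmt.payment_id) AS total_payments
FROM payment pmt
LEFT JOIN cte_2
       ON pmt.payment_id = cte_2.payment_id
      AND cte_2.purchase_number <= 3   -- later payments stay in the count but add no dates
GROUP BY pmt.customer_id
ORDER BY pmt.customer_id
""").fetchall()

for row in pivoted:
    print(row)
# (1, '2021-01-01', '2021-01-02', '2021-01-03', 4)
# (2, '2021-01-04', None, None, 1)
```

Customer 1's fourth payment does not appear in any date column but is still counted in total_payments, which is what the left join is for.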

Using COUNT and UNION to extract data and support a scenario

I am dealing with a database with thousands of customers.
I am wanting to find groups of single customers who have exactly ONE qualifying discount voucher which is valid and exactly ONE non-qualifying voucher which is valid.
A qualifying voucher is one that has a minimum spend amount of £0.01 or more.
A non-qualifying voucher is one that does not have a minimum spend and is therefore £0.00
'Valid' refers to the 'from' date being today or before and the 'to' date being today or in the future
I have initially set up the query below but all this is doing is searching for all customers who have valid qualifying AND non-qualifying voucherS. I am trying to find customers who have JUST ONE valid qualifying voucher and JUST ONE non-qualifying voucher:
select CustomerId, VoucherId, MinimumSpendAmount, ValidFromDate, ValidToDate
from dbo.discountvoucher
where ValidFromDate <= 15/11/2013
and ValidToDate >= 15/11/2013
order by CustomerId
I think I need to split this into 2 separate SELECT statements, one looking for single customers with 1 qualifying voucher (using COUNT), and one looking for single customers with 1 non-qualifying voucher (using COUNT). And then combining them with a UNION. But I could be totally wrong...
Please can anybody help
You can use a subselect with a GROUP BY and HAVING clause to find the customers that match your criteria. The dates need to be quoted literals: unquoted, 15/11/2013 is evaluated as integer division rather than as a date.
select CustomerId, VoucherId, MinimumSpendAmount, ValidFromDate, ValidToDate
from dbo.discountvoucher
where ValidFromDate <= '20131115'
and ValidToDate >= '20131115'
and CustomerId in
    (select CustomerId
     from dbo.discountvoucher
     where ValidFromDate <= '20131115'
     and ValidToDate >= '20131115'
     group by CustomerId
     having sum(case when MinimumSpendAmount > 0 then 1 else 0 end) = 1
     and sum(case when MinimumSpendAmount = 0 then 1 else 0 end) = 1
    )
order by CustomerId
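To see which customers the HAVING clause keeps, here is a small sketch against an in-memory SQLite database from Python. The four customers and their vouchers are made up for illustration, dates are stored as ISO strings (which compare correctly as text), and the check date is passed as a parameter instead of a literal; all of these are assumptions for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE discountvoucher (
    CustomerId INTEGER, VoucherId INTEGER, MinimumSpendAmount REAL,
    ValidFromDate TEXT, ValidToDate TEXT
);
INSERT INTO discountvoucher VALUES
  (1, 10,  5.00, '2013-01-01', '2013-12-31'),  -- customer 1: one of each, both valid
  (1, 11,  0.00, '2013-01-01', '2013-12-31'),
  (2, 20,  5.00, '2013-01-01', '2013-12-31'),  -- customer 2: two qualifying vouchers
  (2, 21, 10.00, '2013-01-01', '2013-12-31'),
  (2, 22,  0.00, '2013-01-01', '2013-12-31'),
  (3, 30,  5.00, '2013-01-01', '2013-12-31'),  -- customer 3: non-qualifying one expired
  (3, 31,  0.00, '2012-01-01', '2012-12-31'),
  (4, 40,  5.00, '2013-01-01', '2013-12-31'),  -- customer 4: extra voucher is expired
  (4, 41,  0.00, '2013-01-01', '2013-12-31'),
  (4, 42,  0.00, '2012-01-01', '2012-12-31');
""")

today = '2013-11-15'
matches = conn.execute("""
SELECT CustomerId, VoucherId
FROM discountvoucher
WHERE ValidFromDate <= ? AND ValidToDate >= ?
  AND CustomerId IN (
      SELECT CustomerId
      FROM discountvoucher
      WHERE ValidFromDate <= ? AND ValidToDate >= ?
      GROUP BY CustomerId
      HAVING SUM(CASE WHEN MinimumSpendAmount > 0 THEN 1 ELSE 0 END) = 1
         AND SUM(CASE WHEN MinimumSpendAmount = 0 THEN 1 ELSE 0 END) = 1
  )
ORDER BY CustomerId, VoucherId
""", (today, today, today, today)).fetchall()

print(matches)
# [(1, 10), (1, 11), (4, 40), (4, 41)]
```

Because the validity filter is applied inside the subselect as well, expired vouchers do not count towards the "exactly one" conditions.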

Join two Queries so that the second query becomes a row in the results of query 1

I have two queries that I would like to combine so I can make a chart out of the results.
The results have to be very specific, or the chart will not display the information properly.
I am using MS SQL in Crystal Reports 11.
Below is the results I am looking for.
Date Invoice Type Amount
2012/08 Customer Payment 500
2012/08 Customer Invoice 1000
2012/08 Moving Balance 1500
2012/09 Customer Invoice 400
2012/09 Moving Balance 1900
2012/10 Interest 50
2012/10 Moving Balance 1950
So the First query returns the following results
Date Invoice Type Amount
2012/08 Customer Payment 500
2012/08 Customer Invoice 1000
2012/09 Customer Invoice 400
2012/10 Interest 50
and the second query returns
Date Invoice Type Amount
2012/08 Moving Balance 1500
2012/09 Moving Balance 1900
2012/10 Moving Balance 1950
The second query is very long and complicated, with a join.
What is the best way of joining these two queries so that I have one column called Invoice Type (as the chart is based on this field) that covers all the invoice types plus the moving balance?
I assume that the place of the Moving Balance rows inside the result set is important.
You can do something like this:
select date, invoice_type, amount
from
(
select date, invoice_type, amount from query1
union all
select date, invoice_type, amount from query2
) as combined
order by date, case invoice_type when 'Moving Balance' then 1 else 0 end
This first appends the results of the second query to the results of the first query and then reorders the resulting list first by date and then by the invoice type in such a way that the row with Moving balance will come last.
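The ordering trick can be checked with the result sets from the question. In this sketch query1 and query2 are materialised as two small tables in an in-memory SQLite database (a stand-in for the real queries, which is an assumption), and the derived table gets the alias combined, a name chosen for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE query1 (date TEXT, invoice_type TEXT, amount INTEGER);
CREATE TABLE query2 (date TEXT, invoice_type TEXT, amount INTEGER);
INSERT INTO query1 VALUES
  ('2012/08', 'Customer Payment', 500),
  ('2012/08', 'Customer Invoice', 1000),
  ('2012/09', 'Customer Invoice', 400),
  ('2012/10', 'Interest', 50);
INSERT INTO query2 VALUES
  ('2012/08', 'Moving Balance', 1500),
  ('2012/09', 'Moving Balance', 1900),
  ('2012/10', 'Moving Balance', 1950);
""")

combined = conn.execute("""
SELECT date, invoice_type, amount
FROM (
    SELECT date, invoice_type, amount FROM query1
    UNION ALL
    SELECT date, invoice_type, amount FROM query2
) AS combined
-- within each month, rows that are not 'Moving Balance' sort first (0), the balance last (1)
ORDER BY date, CASE invoice_type WHEN 'Moving Balance' THEN 1 ELSE 0 END
""").fetchall()

for row in combined:
    print(row)
```

Each month's Moving Balance row comes out last within that month. The relative order of the other rows within a month is not fixed by this ORDER BY, so add another sort key if that matters for the chart.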
With the actual queries you have given, it should look something like this:
select date, invoice_type, amount
from
(
SELECT
CONVERT(VARCHAR(7),case_createddate, 111) AS Date,
case_invoicetype as invoice_type,
Sum(case_totalexvat) as amount
FROM cases AS ca
WHERE case_primaryCompanyid = 2174 and
datediff(m,case_createddate,getDate())
union all
select
CONVERT(VARCHAR(7),ca.case_createddate, 111) AS Date,
'Moving Balance' as Invoice_Type,
sum(mb.Amount) as Amount
from
cases as ca
left join (
select
case_primaryCompanyId as ID,
case_createdDate,
case_TotalExVat as Amount
from
cases
) mb
on ca. case_primaryCompanyId = mb.ID
and ca.case_createdDate >= mb.case_CreatedDate
where
ca.case_primaryCompanyId = 2174 and
ca.case_createdDate > DATEADD(m, -12, current_timestamp)
group by
case_primaryCompanyId,
CONVERT(VARCHAR(7),ca.case_createddate, 111)
) as combined
order by date, case invoice_type when 'Moving Balance' then 1 else 0 end
You can use UNION with an ORDER BY clause (use UNION ALL if duplicate rows must be kept):
Select * from (Query 1
Union
Query 2
) as a Order by a.Date Asc