Calculate days from the last time a customer has zero balance - sql

I have a Table with these columns :
TransID, CustomerID, Date, Credit, Debit, CurrentBalance
I want to know how many days have passed since the customer last had a clear (zero) balance, because I don't give credit to customers who have not cleared their balance in the last 14 days.
Let's take a specific customer:
TransID  CustomerID  Date        Credit  Debit  CurrentBalance
1        1           01/01/2014  0       50     50
2        1           01/05/2014  50      0      0
3        1           06/28/2014  0       100    100
On 6/29/14 only 1 day has passed since their balance was clear, but if I calculate from the last row with CurrentBalance = 0 (01/05/2014), I get 175 days.

Logically, the last time the balance was zero was the instant before the most recent sale that was made while the balance was zero. That sale can be identified by its amount equalling the balance after the sale, i.e. Debit = CurrentBalance; this condition can only hold when the balance before the sale was zero.
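select
    c.id as customerid,
    coalesce(datediff(day, max(t.date), getdate()), 0) as days_clear
from customer c
-- transaction is a reserved word in SQL Server, so the table name is bracketed
left join [transaction] t on t.CustomerID = c.id
    and t.Debit = t.CurrentBalance
-- group by the column itself; SQL Server does not allow a select alias here
group by c.id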
Using the customer table with a left join to the transaction table covers the case where a customer has never made a transaction (their day count is then zero).
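The Debit = CurrentBalance idea above can be tried on the sample rows from the question. This is only a sketch in Python with an in-memory SQLite database, not SQL Server: the table name `transactions`, the second customer with no rows, and the fixed `AS_OF` date standing in for getdate() are assumptions made for the illustration, and `julianday` replaces `datediff`.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE customer (id INTEGER PRIMARY KEY);
    CREATE TABLE transactions (
        TransID INTEGER PRIMARY KEY, CustomerID INTEGER, [Date] TEXT,
        Credit REAL, Debit REAL, CurrentBalance REAL
    );
    -- customer 2 is hypothetical: it exists but has never transacted
    INSERT INTO customer VALUES (1), (2);
    INSERT INTO transactions VALUES
        (1, 1, '2014-01-01', 0, 50, 50),
        (2, 1, '2014-01-05', 50, 0, 0),
        (3, 1, '2014-06-28', 0, 100, 100);
""")

AS_OF = "2014-06-29"  # assumed "today", so the result is reproducible

rows = conn.execute("""
    SELECT c.id,
           COALESCE(CAST(julianday(?) - julianday(MAX(t.[Date])) AS INTEGER), 0)
    FROM customer c
    LEFT JOIN transactions t
           ON t.CustomerID = c.id
          AND t.Debit = t.CurrentBalance   -- sale made from a zero balance
    GROUP BY c.id
    ORDER BY c.id
""", (AS_OF,)).fetchall()

days_clear = dict(rows)
print(days_clear)
```

With the sample data, customer 1 comes out at 1 day (the sale on 06/28/2014 started from a zero balance) and the customer without transactions falls back to 0 through the coalesce.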

Is this what you are looking for?
select customerid,
datediff(day, max(case when balance = 0 then date end), getdate())
from table t
group by customerid;
This returns the number of days since the most recent 0 balance record.
EDIT:
Now I think I understand. The problem is that the 0 balance lasts until 6/28/2014, the date of the next transaction. With SQL Server 2012 or later, we can handle this with lead():
select customerid,
datediff(day, max(case when balance = 0 then nextdate end), getdate())
from (select t.*, lead(date) over (partition by customerid order by date) as nextdate
from table t
) t
group by customerid;
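To see what the lead() version returns for the question's sample customer, here is a small sketch using Python and SQLite (window functions need SQLite 3.25 or later). The table name `transactions` and the fixed `AS_OF` date are assumptions for the example; `julianday` stands in for `datediff`.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE transactions (
        TransID INTEGER PRIMARY KEY, CustomerID INTEGER, [Date] TEXT,
        Credit REAL, Debit REAL, CurrentBalance REAL
    );
    INSERT INTO transactions VALUES
        (1, 1, '2014-01-01', 0, 50, 50),
        (2, 1, '2014-01-05', 50, 0, 0),
        (3, 1, '2014-06-28', 0, 100, 100);
""")

AS_OF = "2014-06-29"  # assumed "today" in place of getdate()

rows = conn.execute("""
    SELECT CustomerID,
           CAST(julianday(?) - julianday(
                MAX(CASE WHEN CurrentBalance = 0 THEN NextDate END)) AS INTEGER)
    FROM (SELECT t.*,
                 -- date of the following transaction for the same customer
                 LEAD([Date]) OVER (PARTITION BY CustomerID
                                    ORDER BY [Date]) AS NextDate
          FROM transactions t)
    GROUP BY CustomerID
""", (AS_OF,)).fetchall()

days_since_clear = dict(rows)
print(days_since_clear)
```

The zero-balance row of 01/05/2014 is followed by the row of 06/28/2014, so the count is 1 day. Note that when the most recent row itself has a zero balance, its next date is NULL and that row is ignored by max(); that case would need separate handling.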

Related

Window function or Recursive query in Redshift

I try to classify customer based on monthly sales. Logic for calculation:
new customer – it has sales and appears for first time or has sales and returned after being lost (after 4 month period, based on sales_date)
active customer - not new and not lost.
lost customer - no sales and period (based on sales_date) more than 4 months
This is the desired output I'm trying to achieve:
The window function below, written for Redshift, classifies customers, but the result is not correct.
It assigns 'Lost' when the gap between months is greater than 4 on a single row, but it does not keep the customer as 'Lost' on the following rows where revenue is still 0, until a 'New' status appears. How can it be updated?
with customer_status as (
select customer_id,customer_name,sales_date,sum(sales_amount) as sales_amount_calc,
nvl(DATEDIFF(month, LAG(reporting_date) OVER (PARTITION BY customer_id ORDER by sales_date ASC), sales_date),0) AS months_between_sales
from customer
GROUP BY customer_id,customer_name,sales_date
)
select *,
case
WHEN months_between_sales = 0 THEN 'New'
WHEN months_between_sales > 0 THEN 'Active'
WHEN months_between_sales > 0 AND months_between_sales <= 4 and sales_amount_calc = 0 THEN 'Active'
WHEN /* months_between_sales > 0 and */ months_between_sales > 4 and sales_amount_calc = 0 THEN 'Lost'
ELSE 'Unknown'
END AS status
from customer_status
One idea is to compute a cumulative amount partitioned on the subset of potentially lost customers (sales_amount = 0).
The cumulative amount partitioned by customer is:
sum(months_between_sales) over (PARTITION BY customer_id ORDER by sales_date ASC rows unbounded preceding) as cumulative_amount,
How can it be changed to be sub-partitioned, for example when sales amount = 0, so that 'Lost' is assigned correctly?
Does anyone have ideas for translating the above logic into a recursive query in Redshift?
Thank you
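To make the intended rules concrete, here is a small sketch of the classification as a row-by-row loop, in plain Python rather than Redshift SQL. It is one possible reading of the rules above, not a confirmed specification: it assumes one entry per calendar month per customer (months without sales carry an amount of 0), and the function name `classify` and the `lost_after` parameter are made up for the example.

```python
def classify(monthly_sales, lost_after=4):
    """Return one status per month for a single customer.

    monthly_sales: sales amounts in month order, one entry per month
                   (0 for a month without sales).
    """
    statuses = []
    last_sale = None   # index of the most recent month with sales
    lost = False       # whether the customer is currently lost
    for month, amount in enumerate(monthly_sales):
        if amount > 0:
            # first appearance, or a return after being lost
            status = "New" if last_sale is None or lost else "Active"
            last_sale = month
            lost = False
        elif last_sale is not None and month - last_sale > lost_after:
            # no sales for more than `lost_after` months: stays lost
            status = "Lost"
            lost = True
        else:
            status = "Active" if last_sale is not None else "Unknown"
        statuses.append(status)
    return statuses


example = classify([100, 0, 0, 0, 0, 0, 0, 50, 20])
print(example)
```

The status of a row depends on state carried over from earlier rows (the last month with sales and whether the customer is already lost), which is why a single LAG over the previous row is not enough on its own.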

SQL Divide previous row balance by current row balance and insert that value into current rows column "Growth"

I have a table like this.
Year  ProcessDate  Month  Balance   RowNum  Calculation
2022  20220430     4      22855547  1
2022  20220330     3      22644455  2
2022  20220230     2      22588666  3
2022  20220130     1      33545444  4
2022  20221230     12     22466666  5
For each row, I need to take the Balance of the following row (the previous month) and divide it by the current row's Balance.
Ex: Row 1 Calculation should = Row 2 Balance / Row 1 Balance (22644455 / 22855547 ≈ 0.99, i.e. about 99%)
Row 2 Calculation should = Row 3 Balance / Row 2 Balance, etc.
The table is just a temporary table I created, named #MonthlyLoanBalance2.
Now I just need to take it a step further.
Let me know what and how you would go about doing this.
Thank you in advance!
Insert into #MonthlytLoanBalance2 (
Year
,ProcessDate
,Month
,Balance
,RowNum
)
select
--CloseYearMonth,
left(ProcessDate,4) as 'Year',
ProcessDate,
--x.LOANTypeKey,
SUBSTRING(CAST(x.ProcessDate as varchar(38)),5,2) as 'Month',
sum(x.currentBalance) as Balance
,ROW_NUMBER()over (order by ProcessDate desc) as RowNum
from
(
select
distinct LoanServiceKey,
LoanTypeKey,
AccountNumber,
CurrentBalance,
OpenDateKey,
CloseDateKey,
ProcessDate
from
cu.LAFactLoanSnapShot
where LoanStatus = 'Open'
and LoanTypeKey = 0
and ProcessDate in (select DateKey from dimDate
where IsLastDayOfMonth = 'Y'
and DateKey > convert(varchar, getdate()-4000, 112)
)
) x
group by ProcessDate
order by ProcessDate desc;
I am assuming your data is already prepared as shown in the table. You can use the LEAD() function to solve this. FORMAT() with 'N2' is used only to limit the result to two decimal places.
SELECT *,
-- * 1.0 avoids integer division in case Balance is an integer type
FORMAT(ISNULL(LEAD(Balance, 1) OVER (ORDER BY RowNum), 1) * 1.0 / Balance, 'N2') AS Calculation
FROM #MonthlytLoanBalance2
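As a quick check of the LEAD() approach against the sample balances, here is a sketch in Python with SQLite (3.25 or later for window functions) instead of SQL Server. The table name `MonthlyLoanBalance` is a stand-in for the temp table, and FORMAT/ISNULL are left out so the raw ratio is visible.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE MonthlyLoanBalance (
        ProcessDate TEXT, Balance INTEGER, RowNum INTEGER
    );
    INSERT INTO MonthlyLoanBalance VALUES
        ('20220430', 22855547, 1),
        ('20220330', 22644455, 2),
        ('20220230', 22588666, 3),
        ('20220130', 33545444, 4),
        ('20221230', 22466666, 5);
""")

growth = conn.execute("""
    SELECT RowNum,
           -- next row's balance divided by this row's balance;
           -- * 1.0 forces floating-point division on integer balances
           LEAD(Balance) OVER (ORDER BY RowNum) * 1.0 / Balance AS Calculation
    FROM MonthlyLoanBalance
    ORDER BY RowNum
""").fetchall()

for row_num, calculation in growth:
    print(row_num, calculation)
```

The last row has no following row, so its Calculation is NULL here; the answer's ISNULL(..., 1) would instead turn it into 1 / Balance, which may or may not be what you want.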

Can my query be made more efficient, and how about adding a GROUP BY clause?

I am trying to write a query that returns:
The total amount of transactions that occurred before a date range, for a particular customer.
The total amount of transactions that occurred within a date range, for a particular customer.
The total amount of payments that occurred before a date range, for a particular customer.
The total amount of payments that occurred within a date range, for a particular customer.
To that end, I've come up with the following query.
declare @StartDate DATE = '2016-08-01'
declare @EndDate DATE = '2016-08-31'
declare @BillingCategory INT = 0
select c.Id, c.Name, c.StartingBalance,
(select coalesce(sum(Amount), 0) from Transactions t where t.CustomerId = c.Id and t.[Date] < @StartDate) xDebits,
(select coalesce(sum(Amount), 0) from Transactions t where t.CustomerId = c.Id and t.[Date] >= @StartDate and t.[Date] <= @EndDate) Debits,
(select coalesce(sum(Amount), 0) from Payments p where p.CustomerId = c.Id and p.[Date] < @StartDate) xCredits,
(select coalesce(sum(Amount), 0) from Payments p where p.CustomerId = c.Id and p.[Date] >= @StartDate and p.[Date] <= @EndDate) Credits
from customers c
where c.BillingCategory in (0,1,2,3,4,5)
This query seems to give the results I want. I used the subqueries because I couldn't seem to figure out how to accomplish the same thing using JOINs. But I have a few questions.
Does this query retrieve the transaction and payment data for every single customer before filtering it according to my WHERE condition? If so, that seems like a big waste. Can that be improved?
I'd also like to add a GROUP BY to total each payment and transaction column by BillingCategory. But how can you add a GROUP BY clause here when the SELECTed columns are limited to aggregate functions if they are not in the GROUP BY clause?
The Transactions and Payments tables both have foreign keys to the Customers table.
Sample data (not real)
Customers:
Id Name BillingCategory
----- ------- ---------------
1 'ABC' 0
2 'DEF' 1
3 'GHI' 0
Transactions:
Id CustomerId Date Amount
----- ---------- ------------ ------
1 2 '2016-08-01' 124.90
2 2 '2016-08-04' 37.23
3 1 '2016-08-27' 450.02
Payments:
Id CustomerId Date Amount
----- ---------- ------------ ------
1 1 '2016-09-01' 50.00
2 1 '2016-09-23' 75.00
3 2 '2016-09-01' 100.00
You could build your sums separately for Transactions and Payments in CTEs and then join them together:
WITH
CustomerTransactions AS
(
    SELECT CustomerId,
        SUM(CASE WHEN [Date] < @StartDate THEN 1 ELSE 0 END * COALESCE(Amount, 0)) AS xDebits,
        -- bounded by @EndDate as well, to match the date range in the question
        SUM(CASE WHEN [Date] >= @StartDate AND [Date] <= @EndDate THEN 1 ELSE 0 END * COALESCE(Amount, 0)) AS Debits
    FROM Transactions
    GROUP BY CustomerId
),
CustomerPayments AS
(
    SELECT CustomerId,
        SUM(CASE WHEN [Date] < @StartDate THEN 1 ELSE 0 END * COALESCE(Amount, 0)) AS xCredits,
        SUM(CASE WHEN [Date] >= @StartDate AND [Date] <= @EndDate THEN 1 ELSE 0 END * COALESCE(Amount, 0)) AS Credits
    FROM Payments
    GROUP BY CustomerId
)
SELECT C.Id, C.Name, C.StartingBalance,
    COALESCE(T.xDebits, 0) AS xDebits,
    COALESCE(T.Debits, 0) AS Debits,
    COALESCE(P.xCredits, 0) AS xCredits,
    COALESCE(P.Credits, 0) AS Credits
FROM Customers C
LEFT OUTER JOIN CustomerTransactions T ON T.CustomerId = C.Id
LEFT OUTER JOIN CustomerPayments P ON P.CustomerId = C.Id
WHERE C.BillingCategory IN (0, 1, 2, 3, 4, 5);
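For anyone who wants to try the pre-aggregate-then-join pattern on the sample data from the question, here is a sketch in Python with SQLite. It is an illustration rather than the T-SQL above: the variables become named parameters (`:start_date`, `:end_date`, names chosen for the example), StartingBalance is left out because the sample data does not include it, and the in-range sums are bounded by the end date as in the original question.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Customers (Id INTEGER PRIMARY KEY, Name TEXT, BillingCategory INTEGER);
    CREATE TABLE Transactions (Id INTEGER PRIMARY KEY, CustomerId INTEGER, [Date] TEXT, Amount REAL);
    CREATE TABLE Payments (Id INTEGER PRIMARY KEY, CustomerId INTEGER, [Date] TEXT, Amount REAL);
    INSERT INTO Customers VALUES (1, 'ABC', 0), (2, 'DEF', 1), (3, 'GHI', 0);
    INSERT INTO Transactions VALUES
        (1, 2, '2016-08-01', 124.90), (2, 2, '2016-08-04', 37.23), (3, 1, '2016-08-27', 450.02);
    INSERT INTO Payments VALUES
        (1, 1, '2016-09-01', 50.00), (2, 1, '2016-09-23', 75.00), (3, 2, '2016-09-01', 100.00);
""")

per_customer = conn.execute("""
    WITH CustomerTransactions AS (
        SELECT CustomerId,
               SUM(CASE WHEN [Date] < :start_date THEN Amount ELSE 0 END) AS xDebits,
               SUM(CASE WHEN [Date] BETWEEN :start_date AND :end_date
                        THEN Amount ELSE 0 END) AS Debits
        FROM Transactions
        GROUP BY CustomerId
    ),
    CustomerPayments AS (
        SELECT CustomerId,
               SUM(CASE WHEN [Date] < :start_date THEN Amount ELSE 0 END) AS xCredits,
               SUM(CASE WHEN [Date] BETWEEN :start_date AND :end_date
                        THEN Amount ELSE 0 END) AS Credits
        FROM Payments
        GROUP BY CustomerId
    )
    -- each CTE yields at most one row per customer, so the joins cannot multiply rows
    SELECT c.Id,
           COALESCE(t.xDebits, 0), COALESCE(t.Debits, 0),
           COALESCE(p.xCredits, 0), COALESCE(p.Credits, 0)
    FROM Customers c
    LEFT JOIN CustomerTransactions t ON t.CustomerId = c.Id
    LEFT JOIN CustomerPayments p ON p.CustomerId = c.Id
    WHERE c.BillingCategory IN (0, 1, 2, 3, 4, 5)
    ORDER BY c.Id
""", {"start_date": "2016-08-01", "end_date": "2016-08-31"}).fetchall()

for row in per_customer:
    print(row)
```

With the sample rows, all payments fall in September, so every credit column is 0 for the August range, and customer 3 gets zeros throughout thanks to the left joins.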
You can do this more efficiently with sub-queries. Pre-aggregate per customer, only for those customers who qualify by the categories in question. Each of these sub-queries returns at most one record per customer, so you don't get a Cartesian result. Do that for your debits and your credits, then join back to your master list of customers with a left join, in case one side or the other (debits/credits) does not exist.
declare @StartDate DATE = '2016-08-01'
declare @EndDate DATE = '2016-08-31'
declare @BillingCategory INT = 0
select
    c.Id,
    c.Name,
    c.StartingBalance,
    coalesce( AllDebits.xDebits, 0 ) DebitsPrior,
    coalesce( AllDebits.Debits, 0 ) Debits,
    coalesce( AllCredits.xCredits, 0 ) CreditsPrior,
    coalesce( AllCredits.Credits, 0 ) Credits
from
    customers c
    LEFT JOIN
    ( select t.CustomerId,
          sum( case when t.[Date] < @StartDate then t.Amount else 0 end ) xDebits,
          -- bounded by @EndDate as well, to match the date range in the question
          sum( case when t.[Date] >= @StartDate and t.[Date] <= @EndDate then t.Amount else 0 end ) Debits
      from
          customers c1
          JOIN Transactions t
              on c1.Id = t.CustomerId
      where
          c1.BillingCategory in (0,1,2,3,4,5)
      group by
          t.CustomerId ) AllDebits
        on c.Id = AllDebits.CustomerId
    LEFT JOIN
    ( select p.CustomerId,
          sum( case when p.[Date] < @StartDate then p.Amount else 0 end ) xCredits,
          sum( case when p.[Date] >= @StartDate and p.[Date] <= @EndDate then p.Amount else 0 end ) Credits
      from
          customers c1
          JOIN Payments p
              on c1.Id = p.CustomerId
      where
          c1.BillingCategory in (0,1,2,3,4,5)
      group by
          p.CustomerId ) AllCredits
        on c.Id = AllCredits.CustomerId
where
    c.BillingCategory in (0,1,2,3,4,5)
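On the second part of the question (totals per BillingCategory): once each sub-query yields one row per customer, the outer query can group by the category and sum the pre-aggregated columns. A minimal sketch of that idea follows, in Python with SQLite and the question's sample data; the parameter names are assumptions for the example and only the in-range columns are shown.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Customers (Id INTEGER PRIMARY KEY, Name TEXT, BillingCategory INTEGER);
    CREATE TABLE Transactions (Id INTEGER PRIMARY KEY, CustomerId INTEGER, [Date] TEXT, Amount REAL);
    CREATE TABLE Payments (Id INTEGER PRIMARY KEY, CustomerId INTEGER, [Date] TEXT, Amount REAL);
    INSERT INTO Customers VALUES (1, 'ABC', 0), (2, 'DEF', 1), (3, 'GHI', 0);
    INSERT INTO Transactions VALUES
        (1, 2, '2016-08-01', 124.90), (2, 2, '2016-08-04', 37.23), (3, 1, '2016-08-27', 450.02);
    INSERT INTO Payments VALUES
        (1, 1, '2016-09-01', 50.00), (2, 1, '2016-09-23', 75.00), (3, 2, '2016-09-01', 100.00);
""")

totals = conn.execute("""
    SELECT c.BillingCategory,
           SUM(COALESCE(t.Debits, 0))  AS Debits,
           SUM(COALESCE(p.Credits, 0)) AS Credits
    FROM Customers c
    LEFT JOIN (SELECT CustomerId, SUM(Amount) AS Debits
               FROM Transactions
               WHERE [Date] BETWEEN :start_date AND :end_date
               GROUP BY CustomerId) t ON t.CustomerId = c.Id
    LEFT JOIN (SELECT CustomerId, SUM(Amount) AS Credits
               FROM Payments
               WHERE [Date] BETWEEN :start_date AND :end_date
               GROUP BY CustomerId) p ON p.CustomerId = c.Id
    WHERE c.BillingCategory IN (0, 1, 2, 3, 4, 5)
    -- only the category is selected un-aggregated, so it is the only GROUP BY column
    GROUP BY c.BillingCategory
    ORDER BY c.BillingCategory
""", {"start_date": "2016-08-01", "end_date": "2016-08-31"}).fetchall()

for row in totals:
    print(row)
```

Per-customer columns such as Name have to be dropped from the select list in this form, since they are neither aggregated nor part of the GROUP BY.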
COMMENT ADDITION
With respect to Thomas's answer: yes, they are close. My version also joins to the customer table inside each sub-query for the specific billing categories, and here is why. I don't know the size of your database, how many customers or how many transactions it has. If you are dealing with a volume large enough to affect performance, Thomas's version aggregates EVERY transaction for EVERY customer, whereas my version only aggregates for the customers that qualify under the billing category criteria you specified.
Again, not knowing the data size: with 100k records there may be no noticeable difference in performance. With 100k CUSTOMERS, it could be a totally different story.
@JonathanWood, correct, but my version has each internal subquery inclusive of the cus

Need to add total # of orders to summary query

In the following query, I need to add the total number of orders per order type (IHORDT). I tried count(t01.ihordt), but it's not valid. I need this order total to get the average amount per order.
Data expected:
Current:
IHORDT  current year  previous year
RTR     100,000       90,000
INT     2,000,000     1,500,000
New change: add one column to the above:
IHORDT  Total orders
RTR     100
INT     1000
SELECT T01.IHORDT
-- summarize by current year and previous year
,SUM( CASE WHEN YEAR(IHDOCD) = YEAR(CURRENT TIMESTAMP) - 1
THEN (T02.IDSHP#*T02.IDNTU$) ELSE 0 END) AS LastYear
,SUM( CASE WHEN YEAR(IHDOCD) = YEAR(CURRENT TIMESTAMP)
THEN (T02.IDSHP#*T02.IDNTU$) ELSE 0 END) AS CurYear
FROM ASTDTA.OEINHDIH
T01 INNER JOIN
ASTDTA.OEINDLID T02
ON T01.IHORD# = T02.IDORD#
WHERE T01.IHORDT in ('RTR', 'INT')
--------------------------------------------------------
AND ( YEAR(IHDOCD) = YEAR(CURRENT TIMESTAMP) - 1
OR YEAR(IHDOCD) = YEAR(CURRENT TIMESTAMP))
GROUP BY T01.IHORDT
To receive a count of records in a group you need to use count(*).
So here is a generic example:
select order_type,
sum(order_amount) as total_sales,
count(*) as number_of_orders
from order_header
group by order_type;
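One thing to watch when the header table is joined to a detail table, as in the question's query: count(*) then counts detail lines, not orders. A sketch of the difference, in Python with SQLite and entirely hypothetical tables (`order_header`, `order_detail` and their columns are made up for the illustration; they are not the AS/400 files from the question):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE order_header (order_no INTEGER PRIMARY KEY, order_type TEXT);
    CREATE TABLE order_detail (order_no INTEGER, qty INTEGER, unit_price REAL);
    INSERT INTO order_header VALUES (1, 'RTR'), (2, 'RTR'), (3, 'INT');
    INSERT INTO order_detail VALUES
        (1, 2, 10.0), (1, 1, 5.0),
        (2, 4, 2.5),
        (3, 1, 100.0), (3, 2, 50.0), (3, 3, 10.0);
""")

summary = conn.execute("""
    SELECT h.order_type,
           SUM(d.qty * d.unit_price)  AS total_sales,
           COUNT(*)                   AS detail_lines,      -- one per joined detail row
           COUNT(DISTINCT h.order_no) AS number_of_orders   -- one per order
    FROM order_header h
    JOIN order_detail d ON d.order_no = h.order_no
    GROUP BY h.order_type
    ORDER BY h.order_type
""").fetchall()

for row in summary:
    print(row)
```

Here both order types have three detail lines, but RTR has two orders and INT has one, so only the distinct count gives a usable divisor for an average amount per order.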

SQL query to identify seasonal sales items

I need a SQL query that will identify seasonal sales items.
My table has the following structure -
ProdId WeekEnd Sales
234 23/04/09 543.23
234 30/04/09 12.43
432 23/04/09 0.00
etc
I need a SQL query that will return all ProdIds that have 26 consecutive weeks of 0 sales. I am running SQL Server 2005. Many thanks!
Update: A colleague has suggested a solution using rank() - I'm looking at it now...
Here's my version:
DECLARE @NumWeeks int
SET @NumWeeks = 26
SELECT s1.ProdID, s1.WeekEnd, COUNT(*) AS ZeroCount
FROM Sales s1
INNER JOIN Sales s2
    ON s2.ProdID = s1.ProdID
    AND s2.WeekEnd >= s1.WeekEnd
    AND s2.WeekEnd <= DATEADD(WEEK, @NumWeeks + 1, s1.WeekEnd)
    -- count only the zero-sales weeks that follow the anchor week
    AND s2.Sales = 0
WHERE s1.Sales > 0
GROUP BY s1.ProdID, s1.WeekEnd
HAVING COUNT(*) >= @NumWeeks
Now, this is making a critical assumption, namely that there are no duplicate entries (only 1 per product per week) and that new data is actually entered every week. With these assumptions taken into account, if we look at the 27 weeks after a non-zero sales week and find that there were 26 total weeks with zero sales, then we can deduce logically that they had to be 26 consecutive weeks.
Note that this will ignore products that had zero sales from the start; there has to be a non-zero week to anchor it. If you want to include products that had no sales since the beginning, then add the following line after `WHERE s1.Sales > 0`:
OR s1.WeekEnd = (SELECT MIN(WeekEnd) FROM Sales WHERE ProdID = s1.ProdID)
This will slow the query down a lot but guarantees that the first week of "recorded" sales will always be taken into account.
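The underlying idea (find a run of 26 consecutive zero weeks per product) can also be spelled out procedurally. The sketch below is plain Python, not SQL, and rests on the same assumptions stated above: exactly one row per product per week and no missing weeks. The function name `products_with_zero_run` and the helper `weekly_rows` are made up for the example, as is the generated test data.

```python
from collections import defaultdict
from datetime import date, timedelta


def products_with_zero_run(rows, num_weeks=26):
    """Return the ProdIds having at least num_weeks consecutive zero-sales rows.

    rows: iterable of (prod_id, week_end, sales) tuples.
    """
    by_product = defaultdict(list)
    for prod_id, week_end, sales in rows:
        by_product[prod_id].append((week_end, sales))

    seasonal = set()
    for prod_id, weeks in by_product.items():
        run = 0
        for _, sales in sorted(weeks):          # walk the weeks in date order
            run = run + 1 if sales == 0 else 0  # any sale resets the run
            if run >= num_weeks:
                seasonal.add(prod_id)
                break
    return seasonal


def weekly_rows(prod_id, sales_by_week, first_week=date(2009, 4, 23)):
    """Build one row per week for a product, starting at first_week."""
    return [(prod_id, first_week + timedelta(weeks=i), s)
            for i, s in enumerate(sales_by_week)]


rows = (weekly_rows(234, [543.23] + [0.0] * 26 + [12.43])
        + weekly_rows(432, [10.0] + [0.0] * 25 + [5.0] + [0.0] * 25))
seasonal = products_with_zero_run(rows)
print(seasonal)
```

Product 234 has a full run of 26 zero weeks and is reported; product 432 has two runs of 25, separated by one week with sales, and is not.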
SELECT DISTINCT
s1.ProdId
FROM (
SELECT
ProdId,
ROW_NUMBER() OVER (PARTITION BY ProdId ORDER BY WeekEnd) AS rownum,
WeekEnd
FROM Sales
WHERE Sales <> 0
) s1
INNER JOIN (
SELECT
ProdId,
ROW_NUMBER() OVER (PARTITION BY ProdId ORDER BY WeekEnd) AS rownum,
WeekEnd
FROM Sales
WHERE Sales <> 0
) s2
ON s1.ProdId = s2.ProdId
AND s1.rownum + 1 = s2.rownum
AND DateAdd(WEEK, 26, s1.WeekEnd) = s2.WeekEnd;