Performing calculations based on dates in oracle - sql

I have the following tables.
Accounts(account_number*,balance)
Transactions(account_number*,transaction_number*,date,amount,type)
Date is the date on which the transaction happened. Amount is the amount of the transaction
and it can be positive or negative, depending on the type (withdrawal -, deposit +). I think the type is irrelevant here, as the amount already carries the proper sign.
I need to write a query that returns the account_number of every account that has had a negative balance at least once.
Here's some sample data from the Transactions table, ordered by account_number and date.
account_number transaction_number date amount type
--------------------------------------------------------------------
1 2 02/03/2013 -20000 withdrawal
1 3 03/15/2013 300 deposit
1 1 01/01/2013 100 deposit
2 1 04/15/2013 235236 deposit
3 1 06/15/2013 500 deposit
4 1 03/01/2013 10 deposit
4 2 04/01/2013 80 deposit
5 1 11/11/2013 10000 deposit
5 2 12/11/2013 20000 deposit
5 3 12/13/2013 -10002 withdrawal
6 1 03/15/2013 102300 deposit
7 1 03/15/2013 100 deposit
8 1 08/08/2013 133990 deposit
9 1 05/09/2013 10000 deposit
9 2 06/01/2013 300 deposit
9 3 10/11/2013 23 deposit

Something like this with an analytic to keep a running balance for an account:
SELECT DISTINCT account_number
FROM ( SELECT account_number
            , SUM(amount)
                OVER (PARTITION BY account_number ORDER BY date) AS running_balance
       FROM transactions
     ) x
WHERE running_balance < 0
Explanation:
It uses an analytic function: PARTITION BY breaks the table into groups identified by the account number. Within each group, the rows are ordered by date. The SUM is then evaluated for each row of the ordered group, by default summing everything from the beginning of the group up to the current row (rows sharing the same date are summed together). This gives you a running balance. Run the inner query on its own and take a look at the output, then read a bit about analytic functions. They are pretty cool.
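To make the partition, order and running-sum steps concrete, here is a minimal Python sketch of the same idea. It uses a subset of the sample rows from the question, and it assumes every account starts from a zero balance (the Accounts.balance column is ignored). The function name accounts_ever_negative is made up for this illustration.

```python
from collections import defaultdict
from datetime import date
from itertools import accumulate

# Subset of the question's sample data: (account_number, date, amount)
transactions = [
    (1, date(2013, 2, 3), -20000),
    (1, date(2013, 3, 15), 300),
    (1, date(2013, 1, 1), 100),
    (4, date(2013, 3, 1), 10),
    (4, date(2013, 4, 1), 80),
    (5, date(2013, 11, 11), 10000),
    (5, date(2013, 12, 11), 20000),
    (5, date(2013, 12, 13), -10002),
]

def accounts_ever_negative(rows):
    """Return the sorted account numbers whose running balance dipped below zero."""
    by_account = defaultdict(list)
    for account, when, amount in rows:            # PARTITION BY account_number
        by_account[account].append((when, amount))
    flagged = []
    for account, items in by_account.items():
        items.sort(key=lambda item: item[0])      # ORDER BY date
        running = accumulate(amount for _, amount in items)  # running SUM(amount)
        if any(balance < 0 for balance in running):
            flagged.append(account)
    return sorted(flagged)

print(accounts_ever_negative(transactions))
```

Only account 1 is reported: its balance goes from 100 to -19900 after the withdrawal, while accounts 4 and 5 stay positive throughout.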

Related

Need to group records based on matching reversal in sql

I have a tricky scenario to aggregate the data.
Data in my source table is as follows.
CustomerId Transaction Type Transaction Amount
1 Payment 100
1 ReversePayment -100
1 payment 100
1 ReversePayment -100
1 Payment 100
1 Payment 100
Requirement is as follows:
If the payment has an associated ReversePayment with a matching amount, sum these two records.
If the payment does not have an associated ReversePayment, consider it an orphan (don't sum it).
I want output to be like this.
CustomerId Transaction Type Transaction Amount
1 Payment,ReversePayment 0
1 payment,ReversePayment 0
1 payment 100
1 Payment 100
In this scenario:
The first record, which is a payment, has an associated reverse payment (the 2nd record), hence the sum becomes 0.
The third record, which is a payment, has an associated reverse payment (the 4th record), so the sum becomes 0.
The fifth and sixth records do not have associated reversals. Don't sum these records.
Second Example:
Data in the source as follows:
CustomerId Transaction Type Transaction Amount
1 Payment 100
1 ReversePayment -100
1 payment 300
1 ReversePayment -300
1 Payment 400
1 Payment 500
Expected Output
CustomerId Transaction Type Transaction Amount
1 Payment,ReversePayment 0
1 payment,ReversePayment 0
1 payment 400
1 Payment 500
Second example requirement:
- As the first and second records (a payment and its associated reverse payment) match, sum these two records; the output is 0.
- As the third and fourth records (a payment and its associated reverse payment) match, sum these two records; the output is 0.
- The fifth and sixth records do not have associated reversals. Don't sum these records.
I got solutions that rely on grouping, but the orphan records are not always guaranteed to be 'Payments'. Sometimes they are 'Payments' and sometimes they are 'ReversePayments'. Can someone help me get output like the below (using rank or row_number functions) so that I can group by the RRR column?
CustomerId Transaction Type Transaction Amount RRR
1 Payment 100 1
1 ReversePayment -100 1
1 payment 100 2
1 ReversePayment -100 2
1 Payment 100 3
1 Payment 100 4
CustomerId Transaction Type Transaction Amount RRR
1 Payment 100 1
1 ReversePayment -100 1
1 payment 300 2
1 ReversePayment -300 2
1 Payment 400 3
1 Payment 500 4
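You can enumerate the rows within each type and amount, and then aggregate:
select customerid,
       listagg(ttype, ',') within group (order by ttype) as types,
       sum(amount) as amount
from (select t.*,
             -- number the rows within each customer, type (ignoring case) and amount
             row_number() over (partition by customerid, upper(ttype), amount order by customerid) as seqnum
      from t
     ) t
-- abs(amount) keeps pairs with different amounts in separate groups
group by customerid, abs(amount), seqnum;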
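The enumerate-then-aggregate idea can be sketched in Python as follows. Two things here are assumptions added for the illustration rather than part of the query as posted: transaction types are compared case-insensitively, and the absolute amount is part of the grouping key so that pairs with different amounts do not merge. The function name pair_reversals is hypothetical.

```python
from collections import defaultdict

# Second example from the question: (customer_id, transaction_type, amount)
rows = [
    (1, "Payment", 100),
    (1, "ReversePayment", -100),
    (1, "payment", 300),
    (1, "ReversePayment", -300),
    (1, "Payment", 400),
    (1, "Payment", 500),
]

def pair_reversals(rows):
    """Group each payment with one reversal of the same magnitude, if there is one."""
    seen = defaultdict(int)      # rows seen so far per (customer, type, amount)
    groups = defaultdict(list)   # (customer, |amount|, seqnum) -> member rows
    for customer, ttype, amount in rows:
        part = (customer, ttype.upper(), amount)
        seen[part] += 1          # plays the role of ROW_NUMBER() in the partition
        groups[(customer, abs(amount), seen[part])].append((ttype, amount))
    return [
        (customer, ",".join(t for t, _ in members), sum(a for _, a in members))
        for (customer, _, _), members in groups.items()
    ]

for line in pair_reversals(rows):
    print(line)
```

The n-th payment of a given amount lands in the same group as the n-th reversal of that amount, so matched pairs sum to 0 and unmatched rows stay on their own, whichever side the orphan is on.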
Edited to include your second scenario:
This uses rownum to impose an ordering (i.e. it assumes the transactions happened in the order you've listed), since your example is missing a transaction id or transaction time.
SQL> select * from trans_data2;
CUSTOMER_ID TRANSACTION_TY TRANSACTION_AMOUNT
----------- -------------- ------------------
1 Payment 100
1 ReversePayment -100
1 payment 300
1 ReversePayment -300
1 Payment 400
1 Payment 500
6 rows selected.
SQL> select customer_id,
       case
         when upper(next_transaction) = 'REVERSEPAYMENT' then transaction_type||','||next_transaction
         else transaction_type
       end transaction_type,
       case
         when upper(next_transaction) = 'REVERSEPAYMENT' then transaction_amount + next_transaction_amount
         else transaction_amount
       end transaction_amount
     from (
       select customer_id, transaction_type, transaction_amount,
              lead (transaction_type) over ( partition by customer_id order by transaction_id ) next_transaction,
              nvl(lead (transaction_amount) over ( partition by customer_id order by transaction_id),0) next_transaction_amount
       from ( select rownum transaction_id, t.* from trans_data2 t )
     ) where upper(transaction_type) = 'PAYMENT'
     ;
CUSTOMER_ID TRANSACTION_TYPE TRANSACTION_AMOUNT
----------- ----------------------------- ------------------
1 Payment,ReversePayment 0
1 payment,ReversePayment 0
1 Payment 400
1 Payment 500

SQL How to calculate Average time between Order Purchases? (do sql calculations based on next and previous row)

I have a simple table that contains the customer email, their order count (so if this is their 1st order, 3rd, 5th, etc), the date that order was created, the value of that order, and the total order count for that customer.
Here is what my table looks like
Email Order Date Value Total
r2n1w#gmail.com 1 12/1/2016 85 5
r2n1w#gmail.com 2 2/6/2017 125 5
r2n1w#gmail.com 3 2/17/2017 75 5
r2n1w#gmail.com 4 3/2/2017 65 5
r2n1w#gmail.com 5 3/20/2017 130 5
ation#gmail.com 1 2/12/2018 150 1
ylove#gmail.com 1 6/15/2018 36 3
ylove#gmail.com 2 7/16/2018 41 3
ylove#gmail.com 3 1/21/2019 140 3
keria#gmail.com 1 8/10/2018 54 2
keria#gmail.com 2 11/16/2018 65 2
What I want to do is calculate the average time between purchases for each customer. Let's take customer ylove. The first purchase is on 6/15/18. The next one is on 7/16/18, so that's 31 days, and the next purchase is on 1/21/2019, so that is 189 days. The average time between orders would be 110 days.
But I have no idea how to make SQL look at the next row and calculate based on that, and then restart when it reaches a new customer.
Here is my query to get that table:
SELECT
F.CustomerEmail
,F.OrderCountBase
,F.Date_Created
,F.Total
,F.TotalOrdersBase
FROM #FullBase F
ORDER BY f.CustomerEmail
If anyone can give me some suggestions, that would be greatly appreciated.
And then maybe I can calculate value differences (as percentages). For example, ylove spent $36 on their first order and $41 on their second, which is a 14% increase. Their third order was $140, which is a 241% increase. So on average, this customer increased their order value by about 128%. Unrelated to SQL, but is this the correct way of calculating a metric like this?
Looking at your sample, you could try taking the difference between the min and max dates and dividing it by the number of gaps (total - 1):
select email, datediff(day, min(Order_Date), max(Order_Date))/(total-1) as avg_days
from your_table
group by email, total
To also handle customers with only one order:
select email,
       case when total-1 > 0 then
            datediff(day, min(Order_Date), max(Order_Date))/(total-1)
       else datediff(day, min(Order_Date), max(Order_Date)) end as avg_days
from your_table
group by email, total
The simplest formulation is:
select email,
       datediff(day, min(Order_Date), max(Order_Date)) / nullif(total-1, 0) as avg_days
from t
group by email, total;
You can see this is the case. Consider three orders with od1, od2, and od3 as the order dates. The average is:
( (od2 - od1) + (od3 - od2) ) / 2
Check the arithmetic:
--> ( od2 - od1 + od3 - od2 ) / 2
--> ( od3 - od1 ) / 2
This pretty obviously generalizes to more orders.
Hence the max() minus min().
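The cancellation can be checked numerically. The sketch below uses the three order dates of customer ylove from the question and compares the pairwise average with the max-minus-min shortcut; both function names are made up for this illustration.

```python
from datetime import date

# Order dates of customer "ylove" from the question
order_dates = [date(2018, 6, 15), date(2018, 7, 16), date(2019, 1, 21)]

def average_gap_pairwise(dates):
    """Average the gaps between consecutive orders, one pair at a time."""
    ordered = sorted(dates)
    gaps = [(later - earlier).days for earlier, later in zip(ordered, ordered[1:])]
    return sum(gaps) / len(gaps) if gaps else None

def average_gap_min_max(dates):
    """Same average via (max - min) / (n - 1); None when there is a single order."""
    if len(dates) < 2:
        return None
    return (max(dates) - min(dates)).days / (len(dates) - 1)

print(average_gap_pairwise(order_dates))
print(average_gap_min_max(order_dates))
```

Both functions give 110 days for this customer: the gaps are 31 and 189 days, and the full span is 220 days over 2 gaps. Returning None for a single order mirrors what nullif(total-1, 0) does in the query.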

Finding financial patterns of rows - sql with recursion

Suppose I have the following table:
id date transaction_type amount
1 2017-01-01 deposit 30
1 2017-01-01 deposit 20
1 2017-01-02 withdrawal -20
1 2017-01-02 deposit 40
1 2017-01-04 deposit 50
1 2017-01-05 withdrawal -100
1 2017-01-07 withdrawal -10
1 2017-01-09 deposit 100
1 2017-01-11 withdrawal -50
1 2017-01-21 deposit 20
1 2017-01-22 deposit 30
1 2017-01-31 withdrawal -60
2 2017-01-01 deposit 200
... ... ... ...
The dates in the table are ordered from the oldest to the newest for each id (timestamp is not visible). I would like to find how many times a specific transaction pattern took place:
deposit -> deposit -> withdrawal, and the time between the first deposit and the withdrawal is 7 days or less.
So for the customer with id = 1, I would have 2 such cases (the third one does not satisfy the time condition).
As a result, I would like to get the following table:
id number_of_times
1 2
2 ...
... ...
Is it something that can be done in SQL? Would I require recursion to get to the final table?
UPDATE:
As correctly pointed out, there are no intervening transactions - but what if there were some? Like any number of other transactions between 1st and 2nd deposit etc.
Assuming you have no intervening transactions:
select id, count(*)
from (select t.*,
             lead(transaction_type) over (partition by id order by date) as next_tt,
             lead(transaction_type, 2) over (partition by id order by date) as next_tt_2,
             lead(date, 2) over (partition by id order by date) as next_date_2
      from t
     ) t
where transaction_type = 'deposit' and next_tt = 'deposit' and
      next_tt_2 = 'withdrawal' and
      next_date_2 <= date + interval '7 day'  -- "7 days or less"
group by id;
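The same sliding-window check can be sketched in Python. It uses the rows for id = 1 from the question and assumes they are already in chronological order, since the hidden timestamp that breaks ties between same-day rows is not available here. The function name count_patterns is hypothetical.

```python
from datetime import date, timedelta

# Rows for id = 1 from the question, assumed to be in chronological order
rows = [
    (1, date(2017, 1, 1), "deposit"),
    (1, date(2017, 1, 1), "deposit"),
    (1, date(2017, 1, 2), "withdrawal"),
    (1, date(2017, 1, 2), "deposit"),
    (1, date(2017, 1, 4), "deposit"),
    (1, date(2017, 1, 5), "withdrawal"),
    (1, date(2017, 1, 7), "withdrawal"),
    (1, date(2017, 1, 9), "deposit"),
    (1, date(2017, 1, 11), "withdrawal"),
    (1, date(2017, 1, 21), "deposit"),
    (1, date(2017, 1, 22), "deposit"),
    (1, date(2017, 1, 31), "withdrawal"),
]

PATTERN = ("deposit", "deposit", "withdrawal")

def count_patterns(rows, max_span=timedelta(days=7)):
    """Count consecutive deposit, deposit, withdrawal runs spanning at most max_span."""
    by_id = {}
    for cid, when, ttype in rows:
        by_id.setdefault(cid, []).append((when, ttype))
    counts = {}
    for cid, items in by_id.items():
        matches = 0
        # zip over three shifted copies, like LEAD(..., 1) and LEAD(..., 2)
        for first, second, third in zip(items, items[1:], items[2:]):
            types = (first[1], second[1], third[1])
            if types == PATTERN and third[0] - first[0] <= max_span:
                matches += 1
        counts[cid] = matches
    return counts

print(count_patterns(rows))
```

For id = 1 this counts 2 matches: the third run (21 to 31 January) spans 10 days and is rejected. Unlike the query, the sketch also reports ids with zero matches.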

Average time between each transaction for customers

How can I know the Average time between each transaction for customers (In seconds)?
Time Customer ID Transaction
11/08/2020 00:00:01 1 111
11/08/2020 00:02:00 2 0
11/08/2020 00:02:07 1 0
11/08/2020 00:03:09 3 412
11/08/2020 00:04:00 1 0
Before the expected table, here are the required steps:
Customer ID 1 has 3 transactions, so there are two differences between consecutive transactions.
The difference between the first and second transactions is 126 seconds.
The difference between the second and third transactions is 113 seconds.
The Expected table:
Customer ID Average time between each transactions for customer
1 (126+113)/3
2
3
The average time is the total time divided by one less than the number of transactions. So:
select customerId,
(case when count(*) > 1
then datediff(second, min(time), max(time)) / (count(*) - 1)
end) as avg_time
from t
group by customerId;
Note: SQL Server does integer division. If you want a non-integer as a result, you might want a conversion or perhaps count(*) - 1.0 in the expression.
This does assume that the times are only increasing (which seems like a reasonable assumption for this type of problem).
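As a cross-check on the sample data, here is a small Python sketch of the same calculation. It reads the timestamps as dd/mm/yyyy, which is an assumption (all rows fall on the same day, so the choice does not change the result), and the function name average_seconds_between is made up for this illustration.

```python
from collections import defaultdict
from datetime import datetime

# Sample rows from the question: (time, customer_id)
raw = [
    ("11/08/2020 00:00:01", 1),
    ("11/08/2020 00:02:00", 2),
    ("11/08/2020 00:02:07", 1),
    ("11/08/2020 00:03:09", 3),
    ("11/08/2020 00:04:00", 1),
]

def average_seconds_between(rows):
    """Average gap in seconds per customer; None for a single transaction."""
    times = defaultdict(list)
    for stamp, customer in rows:
        # Date format is assumed to be day/month/year
        times[customer].append(datetime.strptime(stamp, "%d/%m/%Y %H:%M:%S"))
    result = {}
    for customer, stamps in times.items():
        if len(stamps) > 1:
            span = (max(stamps) - min(stamps)).total_seconds()
            result[customer] = span / (len(stamps) - 1)   # float division
        else:
            result[customer] = None                       # like the CASE without ELSE
    return result

print(average_seconds_between(raw))
```

Customer 1 gets (126 + 113) / 2 = 119.5 seconds, and customers 2 and 3 get None because they have only one transaction each. Python's / always produces a float, which corresponds to using count(*) - 1.0 in the query.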

Isolating the date a value turns 0 and aggregating another value from that date back

I'm looking to see two things:
When a customer closed all of their accounts with us (date when
accounts goes to 0)
The total interactions a customer had with us up
until that point (sum of interactions from when accounts was a
number greater than one).
Basically I'm trying to get from the top table to the bottom table in the attached image.
Customer month Accounts Interactions
12345 Jan-15 3 5
12345 Feb-15 3 1
12345 Mar-15 2 7
12345 Apr-15 1 3
12345 May-15 1 9
12345 Jun-15 1 2
12345 Jul-15 0 3
67890 Feb-15 1 4
67890 Mar-15 1 4
67890 Apr-15 1 9
67890 May-15 0 5
Customer Month close date Interactions
12345 Jul-15 30
67890 May-15 23
When I first read the question it sounded like there would be a neat solution with window functions, but after re-reading it, I don't think that's necessary. Assuming that closing his last account would be the last interaction a customer would have with you, you just need the last interaction date per customer, which means this problem can be solved with simple aggregate functions:
SELECT customer, MAX(month), SUM(interactions)
FROM mytable
GROUP BY customer
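For comparison, a more literal reading of the question can be sketched in Python: take the first month in which accounts reaches 0 as the close date, and sum the interactions up to and including that month. This is a sketch under that assumption, not the query above, and the function name close_summary is hypothetical.

```python
from datetime import datetime

# Sample rows from the question: (customer, month, accounts, interactions)
rows = [
    (12345, "Jan-15", 3, 5), (12345, "Feb-15", 3, 1), (12345, "Mar-15", 2, 7),
    (12345, "Apr-15", 1, 3), (12345, "May-15", 1, 9), (12345, "Jun-15", 1, 2),
    (12345, "Jul-15", 0, 3),
    (67890, "Feb-15", 1, 4), (67890, "Mar-15", 1, 4), (67890, "Apr-15", 1, 9),
    (67890, "May-15", 0, 5),
]

def close_summary(rows):
    """Map each closed customer to (close month, interactions up to that month)."""
    parsed = sorted(
        (customer, datetime.strptime(month, "%b-%y"), accounts, interactions)
        for customer, month, accounts, interactions in rows
    )
    totals = {}
    summary = {}
    for customer, month, accounts, interactions in parsed:
        if customer in summary:          # ignore anything after the close month
            continue
        totals[customer] = totals.get(customer, 0) + interactions
        if accounts == 0:                # first month with no accounts left
            summary[customer] = (month.strftime("%b-%y"), totals[customer])
    return summary

print(close_summary(rows))
```

Customers that never reach 0 accounts are left out of the result. On the sample rows, customer 12345 closes in Jul-15 with 30 interactions; the rows listed for customer 67890 add up to 22.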
To get the last three months you need an OLAP function:
SELECT Customer, MAX(month), SUM(Interactions)
FROM
 (
   SELECT Customer, month, Interactions
   FROM mytable
   QUALIFY
      -- only closed accounts
      MIN(Accounts) OVER (PARTITION BY Customer) = 0
      -- last three months
      AND month >= ADD_MONTHS(MAX(month) OVER (PARTITION BY Customer), -3)
 ) AS dt
GROUP BY Customer