Suppose I have the following table:
id date transaction_type amount
1 2017-01-01 deposit 30
1 2017-01-01 deposit 20
1 2017-01-02 withdrawal -20
1 2017-01-02 deposit 40
1 2017-01-04 deposit 50
1 2017-01-05 withdrawal -100
1 2017-01-07 withdrawal -10
1 2017-01-09 deposit 100
1 2017-01-11 withdrawal -50
1 2017-01-21 deposit 20
1 2017-01-22 deposit 30
1 2017-01-31 withdrawal -60
2 2017-01-01 deposit 200
... ... ... ...
The rows are ordered from oldest to newest for each id (the timestamp is not shown). I would like to count how many times a specific transaction pattern occurs:
deposit -> deposit -> withdrawal, where the time between the first deposit and the withdrawal is 7 days or less.
For the customer with id = 1, there are 2 such cases (the third candidate does not satisfy the time condition).
As a result, I would like to get the following table:
id number_of_times
1 2
2 ...
... ...
Can this be done in SQL? Would I need recursion to get the final table?
UPDATE:
As correctly pointed out, there are no intervening transactions. But what if there were some, for example any number of other transactions between the 1st and 2nd deposit?
Assuming you have no intervening transactions:
select id, count(*)
from (select t.*,
lead(transaction_type) over (partition by id order by date) as next_tt,
lead(transaction_type, 2) over (partition by id order by date) as next_tt_2,
lead(date, 2) over (partition by id order by date) as next_date_2
from t
) t
where transaction_type = 'deposit' and next_tt = 'deposit' and
next_tt_2 = 'withdrawal' and
next_date_2 <= date + interval '7 day'
group by id;
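To check the query against the sample rows for id = 1, here is a minimal sketch that runs the same lead() logic in SQLite from Python. Two things are assumptions, not part of the original: a seq column (insertion order) stands in for the hidden timestamp so that same-day rows keep their order, and julianday() replaces the interval arithmetic because SQLite has no interval type.

```python
import sqlite3

# Sample rows from the question (id = 1, plus the first row of id = 2).
rows = [
    (1, "2017-01-01", "deposit", 30), (1, "2017-01-01", "deposit", 20),
    (1, "2017-01-02", "withdrawal", -20), (1, "2017-01-02", "deposit", 40),
    (1, "2017-01-04", "deposit", 50), (1, "2017-01-05", "withdrawal", -100),
    (1, "2017-01-07", "withdrawal", -10), (1, "2017-01-09", "deposit", 100),
    (1, "2017-01-11", "withdrawal", -50), (1, "2017-01-21", "deposit", 20),
    (1, "2017-01-22", "deposit", 30), (1, "2017-01-31", "withdrawal", -60),
    (2, "2017-01-01", "deposit", 200),
]

conn = sqlite3.connect(":memory:")
# seq is a hypothetical tie-breaker standing in for the hidden timestamp.
conn.execute(
    "create table t (seq integer primary key, id integer, date text, "
    "transaction_type text, amount integer)"
)
conn.executemany(
    "insert into t (id, date, transaction_type, amount) values (?, ?, ?, ?)", rows
)

query = """
select id, count(*) as number_of_times
from (select t.*,
             lead(transaction_type) over (partition by id order by date, seq) as next_tt,
             lead(transaction_type, 2) over (partition by id order by date, seq) as next_tt_2,
             lead(date, 2) over (partition by id order by date, seq) as next_date_2
      from t
     ) x
where transaction_type = 'deposit' and next_tt = 'deposit'
  and next_tt_2 = 'withdrawal'
  and julianday(next_date_2) - julianday(date) <= 7
group by id
"""
result = conn.execute(query).fetchall()
print(result)
```

Ids with no matching pattern do not appear in the output at all, because the filter runs before the grouping. The tie-breaker matters here: ordering by date alone leaves the order of same-day rows undefined, and that can change the count.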
Related
I have invoices pending payment. Every invoice has two dates: the date when the invoice becomes payable and the date when it is paid. For a given period of time, I want to know the max debt and the avg debt.
This is the table
Id  Invoice  Amount  InvoiceDate  InvoicePayment
--  -------  ------  -----------  --------------
1   Bill 1   314     2019-01-20   2019-03-01
2   Bill 2   205     2019-01-14   2019-02-18
3   Bill 3   90      2019-02-04   2019-02-06
4   Bill 4   456     2019-01-03   2019-04-27
I would like to know the max debt amount in February and the avg debt.
You can unpivot with cross apply, and use a window sum to compute the "running" debt at each given point in time. The rest is just filtering and aggregation:
select avg(debt) avg_debt, max(debt) max_debt
from (
select x.dt, sum(x.amount) over(order by x.dt) debt
from mytable t
cross apply (values (invoicedate, amount), (invoicepayment, -amount)) as x(dt, amount)
) t
where dt >= '20190201' and dt < '20190301'
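As a rough check on the sample invoices, the sketch below runs the same idea in SQLite from Python. SQLite has no cross apply, so the unpivot is written with union all instead (a swapped-in technique, not the answer's exact syntax), and the February filter uses 2019 to match the sample data.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "create table mytable (id integer, invoice text, amount integer, "
    "invoicedate text, invoicepayment text)"
)
conn.executemany(
    "insert into mytable values (?, ?, ?, ?, ?)",
    [
        (1, "Bill 1", 314, "2019-01-20", "2019-03-01"),
        (2, "Bill 2", 205, "2019-01-14", "2019-02-18"),
        (3, "Bill 3", 90, "2019-02-04", "2019-02-06"),
        (4, "Bill 4", 456, "2019-01-03", "2019-04-27"),
    ],
)

query = """
select avg(debt) as avg_debt, max(debt) as max_debt
from (
    -- running debt after each event, over all events ordered by date
    select dt, sum(amount) over (order by dt) as debt
    from (
        select invoicedate as dt, amount from mytable      -- debt increases
        union all
        select invoicepayment as dt, -amount from mytable  -- debt decreases
    ) x
) t
where dt >= '2019-02-01' and dt < '2019-03-01'
"""
avg_debt, max_debt = conn.execute(query).fetchone()
print(avg_debt, max_debt)
```

Note that the average here is taken over the event rows that fall inside the period (three of them in February), not weighted by how many days each debt level lasted. Debt carried into the period only shows up once an event occurs inside it.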
I have the table below and need to compute the rolling average and standard deviation based on the dates. The current table and the expected result are listed below. I am trying to compute the rolling average for an id based on date; rollAvgA is computed from metricA. For the first occurrence of an id, the result should be zero, as there are no preceding values. How can this be accomplished?
Current Table :
Date id metricA
8/1/2019 100 2
8/2/2019 100 3
8/3/2019 100 2
8/1/2019 101 2
8/2/2019 101 3
8/3/2019 101 2
8/4/2019 101 2
Expected Table :
Date id metricA rollAvgA
8/1/2019 100 2 0
8/2/2019 100 3 2.5
8/3/2019 100 2 2.3
8/1/2019 101 2 0
8/2/2019 101 3 2.5
8/3/2019 101 2 2.3
8/4/2019 101 2 2.25
You seem to want a cumulative average. This is basically:
select t.*,
avg(metricA * 1.0) over (partition by id order by date) as rollingavg
from t;
The only caveat is that the first value is an average of one value. To handle this, use a case expression:
select t.*,
(case when row_number() over (partition by id order by date) > 1
then avg(metricA * 1.0) over (partition by id order by date)
else 0
end) as rollingavg
from t;
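The query with the case expression can be tried on the sample table. This is a minimal sketch using SQLite from Python; the dates are rewritten in ISO format (an assumption made here so that text ordering matches date ordering).

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("create table t (date text, id integer, metricA integer)")
conn.executemany(
    "insert into t values (?, ?, ?)",
    [
        ("2019-08-01", 100, 2), ("2019-08-02", 100, 3), ("2019-08-03", 100, 2),
        ("2019-08-01", 101, 2), ("2019-08-02", 101, 3), ("2019-08-03", 101, 2),
        ("2019-08-04", 101, 2),
    ],
)

query = """
select date, id, metricA,
       case when row_number() over (partition by id order by date) > 1
            then avg(metricA * 1.0) over (partition by id order by date)
            else 0   -- first row per id has no preceding values
       end as rollAvgA
from t
order by id, date
"""
result = conn.execute(query).fetchall()
for row in result:
    print(row)
```

The unrounded averages for the third rows are 2.333..., which the expected table in the question shows as 2.3.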
Currently, I need a simple thing:
sale_date   Gross  SUM_GROSS
2018-01-01  1      6
2018-01-02  2      6
2018-01-03  3      6
I know this question has been asked before; the difference now is that I need to calculate a sum based on the selected dates. (I use BigQuery)
SUM(SALES.GrossValueBaseCurrency) OVER(PARTITION BY ???) AS SUM_GROSS
If I use
SUM(SALES.GrossValueBaseCurrency) OVER(PARTITION BY SALES.SALE_DATE) AS SUM_GROSS
it gives me what I want ONLY if I select ONE specific day.
How can I make it work so that, when I select different dates, SUM_GROSS repeats the SUM of ALL gross values for the selected period of time?
SAMPLE DATA and Expectations:
Expecting 60 in SUM_GROSS column
Row SALE_DATE GROSS SUM_GROSS
1 25/08/2018 10.00 60
2 04/10/2018 10.00 60
3 04/07/2018 10.00 60
4 01/03/2018 10.00 60
5 10/02/2018 10.00 60
6 10/01/2018 10.00 60
If you query this table, the result should be:
SELECT SUM(GROSS) AS GROSS, SUM_GROSS FROM TABLE
WHERE SALE_DATE BETWEEN 01/01/2018 AND 01/04/2018
GROUP BY SUM_GROSS
RESULT:
GROSS SUM_GROSS
30 30
I think you want a conversion in the partition clause:
SUM(SALES.GrossValueBaseCurrency) OVER (PARTITION BY EXTRACT(YEAR from SALES.SALE_DATE), EXTRACT(MONTH from SALES.SALE_DATE)) AS SUM_GROSS
EDIT:
SELECT . . .,
SUM(SALES.GrossValueBaseCurrency) OVER () AS SUM_GROSS
FROM SALES
WHERE SALES.SALE_DATE BETWEEN '2018-01-01' AND '2018-02-01'
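The empty OVER () works because the window is evaluated after the WHERE filter, so the total covers exactly the selected rows. Here is a minimal sketch with the question's six sample rows, run in SQLite from Python rather than BigQuery; the table and column names (sales, sale_date, gross) are simplified stand-ins, and the dates are converted from dd/mm/yyyy to ISO format.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("create table sales (sale_date text, gross real)")
conn.executemany(
    "insert into sales values (?, ?)",
    [
        ("2018-08-25", 10.0), ("2018-10-04", 10.0), ("2018-07-04", 10.0),
        ("2018-03-01", 10.0), ("2018-02-10", 10.0), ("2018-01-10", 10.0),
    ],
)

query = """
select sale_date, gross,
       sum(gross) over () as sum_gross   -- total over all rows that pass the filter
from sales
where sale_date between '2018-01-01' and '2018-04-01'
order by sale_date
"""
result = conn.execute(query).fetchall()
for row in result:
    print(row)
```

With no filter the same expression would repeat 60 on every row; with the January to April filter it repeats 30, matching the expected result in the question.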
Is this what you are looking for?
SUM(CASE WHEN sales.sale_date = '2018-01-01'
THEN SALES.GrossValueBaseCurrency
ELSE 0
END) OVER () AS sales_20180101
Please consider the following table transaction: a company regularly sends invoices to its customers, and several invoices can belong to the same order. The company's clients will often pay only once every several weeks.
(trans_date in format yyyy-mm-dd)
id order_id trans_type trans_date trans_amount
----------------------------------------------------------
1 1 invoice 2017-01-10 100
2 1 invoice 2017-05-23 150
3 1 invoice 2017-05-28 200
4 2 invoice 2017-03-01 700
5 2 payment 2017-06-16 700
6 1 payment 2017-10-12 450
7 3 invoice 2017-06-24 199
The company would like to see on what date each invoice was paid for. For example: invoice (id) 1 (part of order_id=1 group) was sent on 2017-01-10 and paid on 2017-10-12 (id=6). Invoice with id=7 has not been paid at all.
The desired output would be the payment date for each invoice (payment_date):
id order_id trans_type trans_date trans_amount payment_date
--------------------------------------------------------------------------
1 1 invoice 2017-01-10 100 2017-10-12
2 1 invoice 2017-05-23 150 2017-10-12
3 1 invoice 2017-05-28 200 2017-10-12
4 2 invoice 2017-03-01 700 2017-06-16
5 2 payment 2017-06-16 700
6 1 payment 2017-10-12 450
7 3 invoice 2017-06-24 199
For transactions 5, 6 and 7, the payment_date is empty because it is either a payment (id=5 and 6) or an unpaid invoice (id=7).
I don't understand how I should solve this issue. In combination with regular scripting, I would get the whole set and loop through it to find each payment. But how can this be solved in SQL only?
Any help would be greatly appreciated!
Did you try a simple left join?
The code below is standard SQL; payment_date is NULL for payments and for unpaid invoices. (transaction is a reserved word in some databases, so the table name may need quoting.)
Select a.id, a.order_id, a.trans_type, a.trans_date, a.trans_amount, b.trans_date As payment_date
From transaction a
Left join transaction b
On a.order_id = b.order_id
And a.trans_type = 'invoice'
And b.trans_type = 'payment'
You can do a cumulative sum of payments and invoices within each order and get the first date when the payment total meets or exceeds the invoice total:
with ip as (
select i.*,
sum(case when i.trans_type = 'invoice' then i.trans_amount else 0 end) over (partition by i.order_id order by i.trans_date) as running_invoice,
sum(case when i.trans_type = 'payment' then i.trans_amount else 0 end) over (partition by i.order_id order by i.trans_date) as running_payment
from invoicepayments i
)
select ip.*,
(select min(ip2.trans_date)
from ip ip2
where ip2.order_id = ip.order_id and
ip2.running_payment >= ip.running_invoice and
ip.trans_type = 'invoice'
) as payment_date
from ip;
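The cumulative-sum approach can be checked against the sample data. In the sketch below, run in SQLite from Python, the running totals and the lookup are restricted to each order_id; that restriction is an assumption based on the question's statement that invoices and payments are grouped by order.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "create table invoicepayments (id integer, order_id integer, "
    "trans_type text, trans_date text, trans_amount integer)"
)
conn.executemany(
    "insert into invoicepayments values (?, ?, ?, ?, ?)",
    [
        (1, 1, "invoice", "2017-01-10", 100), (2, 1, "invoice", "2017-05-23", 150),
        (3, 1, "invoice", "2017-05-28", 200), (4, 2, "invoice", "2017-03-01", 700),
        (5, 2, "payment", "2017-06-16", 700), (6, 1, "payment", "2017-10-12", 450),
        (7, 3, "invoice", "2017-06-24", 199),
    ],
)

query = """
with ip as (
    select i.*,
           sum(case when trans_type = 'invoice' then trans_amount else 0 end)
               over (partition by order_id order by trans_date) as running_invoice,
           sum(case when trans_type = 'payment' then trans_amount else 0 end)
               over (partition by order_id order by trans_date) as running_payment
    from invoicepayments i
)
select ip.id,
       (select min(ip2.trans_date)       -- first date the payments cover this invoice
        from ip ip2
        where ip2.order_id = ip.order_id
          and ip2.running_payment >= ip.running_invoice
          and ip.trans_type = 'invoice'
       ) as payment_date
from ip
order by ip.id
"""
payment_dates = dict(conn.execute(query).fetchall())
print(payment_dates)
```

Payment rows and the unpaid invoice (id = 7) come back with a NULL payment_date, which Python shows as None.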
I have the following tables.
Accounts(account_number*,balance)
Transactions(account_number*,transaction_number*,date,amount,type)
Date is the date on which the transaction happened. Amount is the amount of the transaction
and it can be positive or negative, depending on the type (withdrawal -, deposit +). I think the type is irrelevant here, as the amount already carries the proper sign.
I need to write a query that returns the account_number of every account that has had a negative balance at least once.
Here's some sample data from the Transactions table, ordered by account_number and date.
account_number transaction_number date amount type
--------------------------------------------------------------------
1 2 02/03/2013 -20000 withdrawal
1 3 03/15/2013 300 deposit
1 1 01/01/2013 100 deposit
2 1 04/15/2013 235236 deposit
3 1 06/15/2013 500 deposit
4 1 03/01/2013 10 deposit
4 2 04/01/2013 80 deposit
5 1 11/11/2013 10000 deposit
5 2 12/11/2013 20000 deposit
5 3 12/13/2013 -10002 withdrawal
6 1 03/15/2013 102300 deposit
7 1 03/15/2013 100 deposit
8 1 08/08/2013 133990 deposit
9 1 05/09/2013 10000 deposit
9 2 06/01/2013 300 deposit
9 3 10/11/2013 23 deposit
Something like this, using an analytic function to keep a running balance for each account:
SELECT DISTINCT account_number
FROM ( SELECT account_number
,SUM(amount)
OVER (PARTITION BY account_number ORDER BY date) AS running_balance
FROM transactions
) x
WHERE running_balance < 0
Explanation:
This uses an analytic (window) function. PARTITION BY splits the table into groups by account number, and within each group the rows are ordered by date. SUM is then applied row by row through the ordered group; with an ORDER BY in the window, it sums by default from the beginning of the group to the current row, which gives a running balance. Run the inner query on its own and look at the output, then read a bit about analytic queries. They are pretty cool.
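To see the running balance at work, here is a minimal sketch that runs the query on part of the sample data (accounts 1, 4 and 5) in SQLite from Python. The dates are converted from mm/dd/yyyy to ISO format, an assumption made here so that ordering the text column orders the rows by date.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "create table transactions (account_number integer, "
    "transaction_number integer, date text, amount integer, type text)"
)
conn.executemany(
    "insert into transactions values (?, ?, ?, ?, ?)",
    [
        (1, 2, "2013-02-03", -20000, "withdrawal"),
        (1, 3, "2013-03-15", 300, "deposit"),
        (1, 1, "2013-01-01", 100, "deposit"),
        (4, 1, "2013-03-01", 10, "deposit"),
        (4, 2, "2013-04-01", 80, "deposit"),
        (5, 1, "2013-11-11", 10000, "deposit"),
        (5, 2, "2013-12-11", 20000, "deposit"),
        (5, 3, "2013-12-13", -10002, "withdrawal"),
    ],
)

query = """
select distinct account_number
from (select account_number,
             sum(amount) over (partition by account_number order by date)
                 as running_balance
      from transactions
     ) x
where running_balance < 0
"""
result = conn.execute(query).fetchall()
print(result)
```

Account 1 is the only one returned: its balance is 100 after the first deposit and -19900 after the withdrawal. Account 5 stays positive (10000, 30000, 19998), so it is not listed.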