How to calculate the running balance of an asset based on INPUT and OUTPUT - sql

I'm looking at different blockchain transactions and want to create a running balance of a given asset based on INPUT_ADDRESS (the address sending the currency), INPUT_AMOUNT (the amount being sent by an INPUT_ADDRESS), OUTPUT_ADDRESS (the address receiving the currency), and OUTPUT_AMOUNT (the amount being received by an OUTPUT_ADDRESS).
Here's a sample of the table I'm using:
BLOCK_DATE | BLOCK_HEIGHT | TRANS_HASH | INPUT_ADDRESS | OUTPUT_ADDRESS | INPUT_AMOUNT | OUTPUT_AMOUNT
01/11/2020 190 15c7853 abc xyz1 -0.01 0.0001
01/11/2020 190 14v9876 abc xyz2 -0.50 0.70
01/11/2020 191 19vc842 abc xyz3 -5.03 0.413
01/12/2020 192 20ff4d3 abc xyz4 -0.06 0.201
01/12/2020 192 154gf34 xyz1 abc -0.07 0.18
01/12/2020 192 45f4ti5 ggg abc -0.10 0.24
01/12/2020 192 33cv5c5 jjj abc -0.08 1.13
If I were to calculate a running sum of address abc, what's an efficient way of going about this? I tried using something like:
SELECT BLOCK_DATE, BLOCK_HEIGHT, TRANS_HASH, INPUT_ADDRESS, OUTPUT_ADDRESS, INPUT_AMOUNT, OUTPUT_AMOUNT, SUM(INPUT_AMOUNT) OVER (ORDER BY BLOCK_DATE) AS RunningAgeTotal
FROM TRANSACTION_TABLE
WHERE INPUT_ADDRESS = 'abc'
In this particular example, the total balance for abc would be the sum of OUTPUT_AMOUNT where abc is the OUTPUT_ADDRESS (i.e. 0.18 + 0.24 + 1.13) plus the sum of INPUT_AMOUNT where abc is the INPUT_ADDRESS (i.e. -0.01 + -0.50 + -5.03 + -0.06). So, 1.55 + (-5.60) = -4.05.
But I don't think this is the right way to go about it, and I'm not sure how to account for OUTPUT_AMOUNT (e.g. when abc is the OUTPUT_ADDRESS and receives an OUTPUT_AMOUNT).

Is this what you want?
select t.*,
       sum(case when input_address = 'abc' then input_amount
                when output_address = 'abc' then output_amount
           end) over (order by block_date) as running_amount
from transaction_table t
where 'abc' in (input_address, output_address);
This is a cumulative sum of the amounts aligned with the input/output columns.
EDIT:
You may want:
sum(case when input_address = 'abc' then input_amount
         when output_address = 'abc' then output_amount
    end) over (order by block_date, block_height) as running_amount
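To check the query above against the sample rows, here is a minimal sketch that runs it in an in-memory SQLite database through Python's sqlite3 module. The ISO date strings (assuming the sample dates are MM/DD/YYYY), the `:addr` parameter, and the extra `trans_hash` tie-breaker in the ordering are assumptions added for illustration; window functions need SQLite 3.25 or later.

```python
import sqlite3

# Sample rows from the question; dates rewritten as ISO strings (assumption: MM/DD/YYYY).
ROWS = [
    ("2020-01-11", 190, "15c7853", "abc", "xyz1", -0.01, 0.0001),
    ("2020-01-11", 190, "14v9876", "abc", "xyz2", -0.50, 0.70),
    ("2020-01-11", 191, "19vc842", "abc", "xyz3", -5.03, 0.413),
    ("2020-01-12", 192, "20ff4d3", "abc", "xyz4", -0.06, 0.201),
    ("2020-01-12", 192, "154gf34", "xyz1", "abc", -0.07, 0.18),
    ("2020-01-12", 192, "45f4ti5", "ggg", "abc", -0.10, 0.24),
    ("2020-01-12", 192, "33cv5c5", "jjj", "abc", -0.08, 1.13),
]

# trans_hash is added to the ordering (an assumption) so that rows in the same
# block each get their own running value instead of sharing one as peers.
QUERY = """
SELECT trans_hash,
       SUM(CASE WHEN input_address = :addr THEN input_amount
                WHEN output_address = :addr THEN output_amount
           END) OVER (ORDER BY block_date, block_height, trans_hash) AS running_amount
FROM transaction_table
WHERE :addr IN (input_address, output_address)
ORDER BY block_date, block_height, trans_hash
"""


def running_balance(address):
    """Return (trans_hash, running_amount) pairs for one address."""
    con = sqlite3.connect(":memory:")
    con.execute(
        "CREATE TABLE transaction_table (block_date TEXT, block_height INTEGER, "
        "trans_hash TEXT, input_address TEXT, output_address TEXT, "
        "input_amount REAL, output_amount REAL)"
    )
    con.executemany("INSERT INTO transaction_table VALUES (?, ?, ?, ?, ?, ?, ?)", ROWS)
    result = con.execute(QUERY, {"addr": address}).fetchall()
    con.close()
    return result


balances = running_balance("abc")
for trans_hash, amount in balances:
    print(trans_hash, round(amount, 4))
```

The last running value should agree with the -4.05 total worked out by hand in the question, up to floating-point rounding.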

Related

Match group of variables and values with the nearest datetime

I have a transaction table that looks like this:
transaction_start store_no item_no amount post_voided
2021-03-01 10:00:00 001 101 45 N
2021-03-01 10:00:00 001 105 25 N
2021-03-01 10:00:00 001 109 40 N
2021-03-01 10:05:00 002 103 35 N
2021-03-01 10:05:00 002 135 20 N
2021-03-01 10:08:00 001 140 2 N
2021-03-01 10:11:00 001 101 -45 Y
2021-03-01 10:11:00 001 105 -25 Y
2021-03-01 10:11:00 001 109 -40 Y
The table does not have an id column; two different receipts for a given store_no never share the same transaction_start.
Whenever a transaction is post voided, the transaction is repeated with the same store_no and item_no but with a negative amount and an equal or later transaction_start. The column post_voided is then set to 'Y'.
In the example above, rows 1-3 have the same transaction_start and store_no, and thus belong to the same receipt, which contains three different items (101, 105, 109). The same logic applies to the other rows: rows 4-5 belong to the same receipt, and so on. In the example, 4 different receipts can be seen. The last receipt, given by the last three rows, is a post void of the first receipt (rows 1-3).
What I want to do is change the transaction_start of the post_voided = 'Y' transactions (in my example, only one receipt, represented by the last three rows) to the closest datetime of a matching receipt: one with the same store_no and item_no and the opposite (positive) amount, but with post_voided = 'N'. In my example, the matching receipt is given by the first three rows: store_no, all item_no values, and the (positive) amounts match. The transaction_start of the post-voided receipt is always equal to or later than that of the "original" receipt.
Desired output:
transaction_start store_no item_no amount post_voided
2021-03-01 10:00:00 001 101 45 N
2021-03-01 10:00:00 001 105 25 N
2021-03-01 10:00:00 001 109 40 N
2021-03-01 10:05:00 002 103 35 N
2021-03-01 10:05:00 002 135 20 N
2021-03-01 10:08:00 001 140 2 N
2021-03-01 10:00:00 001 101 -45 Y
2021-03-01 10:00:00 001 105 -25 Y
2021-03-01 10:00:00 001 109 -40 Y
Here is a link to the table: https://dbfiddle.uk/?rdbms=sqlserver_2019&fiddle=26142fa24e46acb4213b96c86f4eb94b
Thanks in advance!
Consider below
select a.* replace(ifnull(b.transaction_start, a.transaction_start) as transaction_start)
from `project.dataset.table` a
left join (
select * replace(-amount as amount)
from `project.dataset.table`
where post_voided = 'N'
) b
using (store_no, item_no)
if applied to sample data in your question - output is
Consider below for new / extended example (https://dbfiddle.uk/?rdbms=sqlserver_2019&fiddle=91f9f180fd672e7c357aa48d18ced5fd)
select x.* replace(ifnull(y.original_transaction_start, x.transaction_start) as transaction_start)
from `project.dataset.table` x
left join (
select b.transaction_start, b.store_no, b.item_no, b.amount amount,
max(a.transaction_start) original_transaction_start
from `project.dataset.table` a
join `project.dataset.table` b
on a.store_no = b.store_no
and a.item_no = b.item_no
and a.amount = -b.amount
and a.post_voided = 'N'
and b.post_voided = 'Y'
and a.transaction_start < b.transaction_start
group by b.transaction_start, b.store_no, b.item_no, b.amount
) y
using (store_no, item_no, amount, transaction_start)
with output
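The `select * replace` syntax in the answers is BigQuery-specific. As a rough, more portable sketch of the same matching rule, the correlated subquery below looks up, for each post-voided row, the latest non-voided row with the same store_no and item_no and the opposite amount at an equal or earlier time. It runs on SQLite through Python's sqlite3; the table name `transactions` and the use of `rowid` to preserve the input order are assumptions made for this illustration.

```python
import sqlite3

# Sample rows from the question.
ROWS = [
    ("2021-03-01 10:00:00", "001", 101, 45, "N"),
    ("2021-03-01 10:00:00", "001", 105, 25, "N"),
    ("2021-03-01 10:00:00", "001", 109, 40, "N"),
    ("2021-03-01 10:05:00", "002", 103, 35, "N"),
    ("2021-03-01 10:05:00", "002", 135, 20, "N"),
    ("2021-03-01 10:08:00", "001", 140, 2, "N"),
    ("2021-03-01 10:11:00", "001", 101, -45, "Y"),
    ("2021-03-01 10:11:00", "001", 105, -25, "Y"),
    ("2021-03-01 10:11:00", "001", 109, -40, "Y"),
]

# For voided rows, take the latest matching original at or before the void;
# every other row falls back to its own transaction_start via COALESCE.
QUERY = """
SELECT COALESCE(
         (SELECT MAX(o.transaction_start)
          FROM transactions o
          WHERE v.post_voided = 'Y'
            AND o.post_voided = 'N'
            AND o.store_no = v.store_no
            AND o.item_no = v.item_no
            AND o.amount = -v.amount
            AND o.transaction_start <= v.transaction_start),
         v.transaction_start) AS transaction_start,
       v.store_no, v.item_no, v.amount, v.post_voided
FROM transactions v
ORDER BY v.rowid
"""

con = sqlite3.connect(":memory:")
con.execute(
    "CREATE TABLE transactions (transaction_start TEXT, store_no TEXT, "
    "item_no INTEGER, amount INTEGER, post_voided TEXT)"
)
con.executemany("INSERT INTO transactions VALUES (?, ?, ?, ?, ?)", ROWS)
adjusted = con.execute(QUERY).fetchall()
con.close()

for row in adjusted:
    print(row)
```

ISO-formatted timestamp strings compare correctly as text, which is why the sketch stores transaction_start as TEXT.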

TERADATA SQL- how to find a sequence in the data

I have a table with the following columns: account, validity_month, amount. The table contains data for half a year: January to June 2019. For each account, I'm trying to find a sequence in the "amount" field, meaning a 6-month sequence with similar amounts (within a range of 10 percent).
If the account has such a sequence, ind=1, else 0.
account validity_month amount
------- --------------- --------
123 201901 1000
123 201901 500
123 201902 1002
123 201902 3000
123 201903 0
123 201903 1050
123 201904 1020
123 201905 1020
123 201905 555
123 201906 998
In this example there is a match: 6 months with similar amounts (1000, 1002, 1050, 1020, 1020, 998).
The 10 percent range is calculated according to the value of the previous month.
account validity_month amount
------- --------------- --------
124 201901 500
124 201901 0
124 201902 530
124 201903 500
124 201903 2000
124 201904 2000
124 201905 60
124 201905 2100
124 201906 2000
In this example there is NO match (3 months with similar amounts, and then 3 months with different similar amounts).
In this case, this is the requested output:
account IND
------- ------
123 1
124 0
Can anyone help?
Thanks!
If I assume that all values need to be within 10% of each other, then brute force might be the only way:
select t1.account,
(case when t2.account is not null and t3.account is not null and . . .
then 1 else 0
end) as flag
from t t1 left join
t t2
on t2.account = t1.account and
t2.amount between t1.amount / 1.1 and t1.amount * 1.1 and
t2.validity_month = 201902 left join
t t3
on t3.account = t1.account and
t3.amount between t1.amount / 1.1 and t1.amount * 1.1 and
t3.validity_month = 201903 left join
. . .
where t1.validity_month = 201901
group by t1.account;
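The question states that the 10 percent range is measured against the previous month's value, which differs from the assumption in the answer above. Under that reading, one possible sketch is a recursive CTE that extends a chain month by month. It is shown here on SQLite through Python's sqlite3 rather than on Teradata; the table name `t`, the interpretation of "within 10 percent" as between 0.9 and 1.1 times the previous amount, and the `validity_month + 1` step (which only works within a single year) are all assumptions.

```python
import sqlite3

# Sample rows from the question: (account, validity_month, amount).
ROWS = [
    (123, 201901, 1000), (123, 201901, 500), (123, 201902, 1002),
    (123, 201902, 3000), (123, 201903, 0), (123, 201903, 1050),
    (123, 201904, 1020), (123, 201905, 1020), (123, 201905, 555),
    (123, 201906, 998),
    (124, 201901, 500), (124, 201901, 0), (124, 201902, 530),
    (124, 201903, 500), (124, 201903, 2000), (124, 201904, 2000),
    (124, 201905, 60), (124, 201905, 2100), (124, 201906, 2000),
]

# Start a chain at every January row, then extend it to any row of the next
# month whose amount is within 10% of the previous link in the chain.
QUERY = """
WITH RECURSIVE chain(account, validity_month, amount, len) AS (
    SELECT account, validity_month, amount, 1
    FROM t
    WHERE validity_month = 201901
    UNION ALL
    SELECT t.account, t.validity_month, t.amount, c.len + 1
    FROM chain c
    JOIN t
      ON t.account = c.account
     AND t.validity_month = c.validity_month + 1
     AND t.amount BETWEEN c.amount * 0.9 AND c.amount * 1.1
)
SELECT account, CASE WHEN MAX(len) >= 6 THEN 1 ELSE 0 END AS ind
FROM chain
GROUP BY account
ORDER BY account
"""

con = sqlite3.connect(":memory:")
con.execute("CREATE TABLE t (account INTEGER, validity_month INTEGER, amount INTEGER)")
con.executemany("INSERT INTO t VALUES (?, ?, ?)", ROWS)
flags = con.execute(QUERY).fetchall()
con.close()

print(flags)
```

Accounts with no January row never enter the chain and so do not appear in the result; an outer join against the list of accounts would be needed to report them as 0.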

query to get the SUM

Suppose I have the following data:
code_table
code_id | code_no | stats |
2 60 22A3
3 60 22A3
value_table
value_no | amount_value_one | amount_value_two | amount_diff | code_no | sample_no | code_id
1 1200.00 400.00 800.00 60 90 2
1 600.00 200.00 400.00 60 100 3
1 1800.00 600.00 1200.00 60 110 2
2 1200.00 1200.00 0.00 60 110 2
2 800.00 600.00 200.00 60 90 2
2 400.00 0.00 400.00 60 100 3
What I want is to get the SUM of amount_value_two per sample_no and retain only the first amount_value_one, the one that has value_no = 1.
The expected output is:
amount_value_one | SUM_of_amount_value_two | amount_diff | sample_no
1200.00 1000.00 200.00 90
600.00 200.00 400.00 100
1800.00 1800.00 0.00 110
So far I have the following query:
SELECT SUM(p.amount_value_one) as value_one,
SUM(p.amount_value_two) as value_two,
SUM(p.amount_diff) as amount_diff,
p.sample_no as sampleNo FROM value_table p
INNER JOIN code_table On code_table.code_no = p.code_no
WHERE code_table.code_id = p.code_id
AND code_table.stats = '22A3'
GROUP BY p.sample_no
The query above is wrong because it sums both p.amount_value_one and p.amount_diff.
It's just a test query, because I can't picture what the right query should look like.
Assuming that you have a column that specifies the ordering, then you can use that to figure out the "first" row. Then use conditional aggregation:
SELECT SUM(CASE WHEN seqnum = 1 THEN p.amount_value_one END) as value_one,
SUM(p.amount_value_two) as value_two,
SUM(p.amount_diff) as amount_diff,
p.sample_no as sampleNo
FROM (SELECT p.*,
ROW_NUMBER() OVER (PARTITION BY p.sample_no ORDER BY <ordering column>) as seqnum
FROM value_table p
) p JOIN
code_table ct
ON ct.code_no = p.code_no AND
ct.code_id = p.code_id
WHERE ct.stats = '22A3'
GROUP BY p.sample_no
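Here is a small sketch of the conditional aggregation above, run on the question's sample data in SQLite through Python's sqlite3. Two things are assumptions rather than part of the answer: value_no is used as the ordering column, and amount_diff is computed as value_one minus the summed amount_value_two, because that is what reproduces the amount_diff column of the expected output (200, 400, 0), whereas SUM(amount_diff) would not.

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.executescript("""
CREATE TABLE code_table (code_id INTEGER, code_no INTEGER, stats TEXT);
INSERT INTO code_table VALUES (2, 60, '22A3'), (3, 60, '22A3');

CREATE TABLE value_table (
    value_no INTEGER, amount_value_one REAL, amount_value_two REAL,
    amount_diff REAL, code_no INTEGER, sample_no INTEGER, code_id INTEGER
);
INSERT INTO value_table VALUES
    (1, 1200.00,  400.00,  800.00, 60,  90, 2),
    (1,  600.00,  200.00,  400.00, 60, 100, 3),
    (1, 1800.00,  600.00, 1200.00, 60, 110, 2),
    (2, 1200.00, 1200.00,    0.00, 60, 110, 2),
    (2,  800.00,  600.00,  200.00, 60,  90, 2),
    (2,  400.00,    0.00,  400.00, 60, 100, 3);
""")

# seqnum = 1 marks the first row per sample_no (ordered by value_no, an assumption).
QUERY = """
SELECT SUM(CASE WHEN p.seqnum = 1 THEN p.amount_value_one END) AS value_one,
       SUM(p.amount_value_two) AS value_two,
       SUM(CASE WHEN p.seqnum = 1 THEN p.amount_value_one END)
         - SUM(p.amount_value_two) AS amount_diff,
       p.sample_no
FROM (SELECT v.*,
             ROW_NUMBER() OVER (PARTITION BY v.sample_no ORDER BY v.value_no) AS seqnum
      FROM value_table v
     ) p
JOIN code_table ct
  ON ct.code_no = p.code_no AND ct.code_id = p.code_id
WHERE ct.stats = '22A3'
GROUP BY p.sample_no
ORDER BY p.sample_no
"""

result = con.execute(QUERY).fetchall()
con.close()

for row in result:
    print(row)
```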

Cumulative Stock Holding

I am trying to create a stock holding based on the below data.
Input and Desired Output
I have tried creating a transactions column (Starting + UK Open POs - UK Sales).
Then I used the SQL below to create a stock holding:
SUM(Transactions) OVER (PARTITION BY No_ ORDER BY Date)
But the problem is that I don't want the stock holding to go negative. I want it to show 0 instead, so that when 960 units come in on 14/04/19 the stock holding is 921 units (960 - 39) instead of 116 units.
The column highlighted in yellow is my desired output. I need this over 5k SKUs (column no_)
Any help would be very appreciated.
No_     Date        UK-Open PO  UK-Sales  Starting Stock  Trans.  Cumul Stock  Stock Level
111111  22/03/2019                    47             100      53           53           53
111111  24/03/2019                   330                    -330         -277            0
111111  31/03/2019                   443                    -443         -720            0
111111  07/04/2019                    85                     -85         -805            0
111111  14/04/2019         960        39                     921          116          921
111111  21/04/2019         960       112                     848          964         1769
111111  28/04/2019                   100                    -100          864         1669
111111  05/05/2019                   504                    -504          360         1165
111111  12/05/2019                   606                    -606         -246          559
111111  19/05/2019                   118                    -118         -364          441
111111  26/05/2019                   400                    -400         -764           41
111111  02/06/2019                   674                    -674        -1438            0
111111  09/06/2019                   338                    -338        -1776            0
111111  16/06/2019                   206                    -206        -1982            0
111111  23/06/2019                   115                    -115        -2097            0
111111  30/06/2019         500        66                     434        -1663          434
111111  07/07/2019                    33                     -33        -1696          401
Suppressing the negative numbers as you are doing requires remembering what has happened on all previous rows. Alas, this can't be done with a window function.
The alternative is a recursive CTE:
with t as (
select no_, date, starting_stock, trans,
row_number() over (partition by no_ order by date) as seqnum
from <table>
),
cte as (
select no_, date, trans, seqnum,
starting_stock as stock_level
from t
where seqnum = 1
union all
select t.no_, t.date, t.trans, t.seqnum,
(case when cte.stock_level + t.trans < 0 then 0
else cte.stock_level + t.trans
end) as stock_level
from cte join
t
on t.seqnum = cte.seqnum + 1 and
t.no_ = cte.no_
)
select *
from cte
option (maxrecursion 0);
You only need the option if the recursion goes more than 100 levels deep, which is SQL Server's default limit.
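To see the floor-at-zero recursion reproduce the Stock Level column from the question, here is a minimal sketch on SQLite through Python's sqlite3. The table name `stock`, the column name `trans_date`, and the ISO date strings are assumptions. The first row is seeded with `MAX(trans, 0)` rather than starting_stock, on the assumption that Trans. already includes the starting stock (100 - 47 = 53 in the sample).

```python
import sqlite3

# Date and Trans. columns from the question's table, dates as ISO strings.
DATES = [
    "2019-03-22", "2019-03-24", "2019-03-31", "2019-04-07", "2019-04-14",
    "2019-04-21", "2019-04-28", "2019-05-05", "2019-05-12", "2019-05-19",
    "2019-05-26", "2019-06-02", "2019-06-09", "2019-06-16", "2019-06-23",
    "2019-06-30", "2019-07-07",
]
TRANS = [53, -330, -443, -85, 921, 848, -100, -504, -606, -118, -400,
         -674, -338, -206, -115, 434, -33]

# Each step adds the row's transactions to the previous level and clamps at 0.
# The two-argument MAX is SQLite's scalar function, not the aggregate.
QUERY = """
WITH RECURSIVE t AS (
    SELECT no_, trans_date, trans,
           ROW_NUMBER() OVER (PARTITION BY no_ ORDER BY trans_date) AS seqnum
    FROM stock
),
cte AS (
    SELECT no_, trans_date, seqnum, MAX(trans, 0) AS stock_level
    FROM t
    WHERE seqnum = 1
    UNION ALL
    SELECT t.no_, t.trans_date, t.seqnum, MAX(cte.stock_level + t.trans, 0)
    FROM cte
    JOIN t
      ON t.no_ = cte.no_
     AND t.seqnum = cte.seqnum + 1
)
SELECT no_, trans_date, stock_level
FROM cte
ORDER BY no_, seqnum
"""

con = sqlite3.connect(":memory:")
con.execute("CREATE TABLE stock (no_ TEXT, trans_date TEXT, trans INTEGER)")
con.executemany(
    "INSERT INTO stock VALUES ('111111', ?, ?)", list(zip(DATES, TRANS))
)
levels = [row[2] for row in con.execute(QUERY)]
con.close()

print(levels)
```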

Incremental count in SQL Server 2005

I am working with a Raiser's Edge database using SQL Server 2005. I have written SQL that will produce a temporary table containing details of direct debit instalments. Below is a small table containing the key variables for the question I'm going to ask, with some fictional data:
Donor_ID Instalment_ID Instalment_Date Amount
1234 1111 01/01/2011 £5.00
1234 1112 01/02/2011 £0.00
1234 1113 01/03/2011 £5.00
1234 1114 01/04/2011 £5.00
1234 1115 01/05/2011 £0.00
1234 1116 01/06/2011 £0.00
2345 2111 01/01/2011 £0.00
2345 2112 01/02/2011 £5.00
2345 2113 01/03/2011 £5.00
2345 2114 01/04/2011 £0.00
2345 2115 01/05/2011 £0.00
2345 2116 01/06/2011 £0.00
As you will see, some of the values in the Amount column are £0.00. This can occur when a donor has insufficient funds in their account, for example.
What I'd like to do is write a SQL query that will create a field containing an incremental count of consecutive £0.00 payments that resets after a non-£0.00 payment or after a change in Donor_ID. I have reproduced the above data below, with the field I'd like to see.
Donor_ID Instalment_ID Instalment_Date Amount New_Field
1234 1111 01/01/2011 £5.00
1234 1112 01/02/2011 £0.00 1
1234 1113 01/03/2011 £5.00
1234 1114 01/04/2011 £5.00
1234 1115 01/05/2011 £0.00 1
1234 1116 01/06/2011 £0.00 2
2345 2111 01/01/2011 £0.00 1
2345 2112 01/02/2011 £5.00
2345 2113 01/03/2011 £5.00
2345 2114 01/04/2011 £0.00 1
2345 2115 01/05/2011 £0.00 2
2345 2116 01/06/2011 £0.00 3
To help clarify what I'm looking for, I think what I'm looking to do would be similar to a winning streak field on a list of a football team's results. For example:
Opponent Score Winning_Streak
Arsenal 1-0 1
Liverpool 0-0
Swansea 3-1 1
Chelsea 2-1 2
Fulham 4-0 3
Stoke 0-0
Man Utd 1-3
Reading 2-1 1
I've considered various options, but have made no progress. Unless I've missed something obvious, I think that a solution more advanced than my current SQL programming level might be required.
If I am thinking about this problem correctly, I believe that you want a row number when the Amount is 0.00 pounds.
Select 0 As InsufficientCount
, Donor_ID
, Instalment_ID
, Amount
From [Table]
Where Amount > 0.00
Union
Select Row_Number() Over (Partition By Donor_ID Order By Instalment_ID)
, Donor_ID
, Instalment_ID
, Amount
From [Table]
Where Amount = 0.00
This union select should only give you 'ranks' where the Amount equals 0.
I am calling your new field streakAmount:
ALTER TABLE instalments ADD streakAmount int NULL;
Then, to update the value:
UPDATE instalments
SET streakAmount =
(SELECT
COUNT(*)
FROM
instalments streak
WHERE
streak.donor_id = instalments.donor_id
AND
streak.instalment_date <= instalments.instalment_date
AND
(streak.instalment_date >
-- find previous instalment date, if any exists
COALESCE(
(
SELECT
MAX(instalment_date)
FROM
instalments prev
WHERE
prev.donor_id = instalments.donor_id
AND
prev.amount > 0
AND
prev.instalment_date < instalments.instalment_date
)
-- otherwise the minimum datetime (SQL Server 2005 has no date type)
, cast('1753-1-1' AS datetime))
)
)
WHERE
amount = 0;
http://sqlfiddle.com/#!6/a571f/18
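On engines that support running window aggregates (not SQL Server 2005, where ordered SUM() OVER is unavailable), the streak can also be computed without a correlated subquery by treating it as a gaps-and-islands problem. The sketch below is only an illustration of that swapped-in technique, run on SQLite through Python's sqlite3; the column name `new_field` follows the question, and `grp` plus the ISO dates (assuming DD/MM/YYYY in the sample) are assumptions.

```python
import sqlite3

# Sample rows from the question: (donor_id, instalment_id, instalment_date, amount).
ROWS = [
    (1234, 1111, "2011-01-01", 5.0), (1234, 1112, "2011-02-01", 0.0),
    (1234, 1113, "2011-03-01", 5.0), (1234, 1114, "2011-04-01", 5.0),
    (1234, 1115, "2011-05-01", 0.0), (1234, 1116, "2011-06-01", 0.0),
    (2345, 2111, "2011-01-01", 0.0), (2345, 2112, "2011-02-01", 5.0),
    (2345, 2113, "2011-03-01", 5.0), (2345, 2114, "2011-04-01", 0.0),
    (2345, 2115, "2011-05-01", 0.0), (2345, 2116, "2011-06-01", 0.0),
]

# grp counts the non-zero payments seen so far per donor, so it stays constant
# across a run of zero payments; counting zeros within (donor_id, grp) then
# gives the position inside the current streak.
QUERY = """
SELECT donor_id, instalment_id,
       CASE WHEN amount = 0
            THEN SUM(CASE WHEN amount = 0 THEN 1 ELSE 0 END)
                 OVER (PARTITION BY donor_id, grp ORDER BY instalment_date)
       END AS new_field
FROM (
    SELECT i.*,
           SUM(CASE WHEN amount > 0 THEN 1 ELSE 0 END)
               OVER (PARTITION BY donor_id ORDER BY instalment_date) AS grp
    FROM instalments i
) g
ORDER BY donor_id, instalment_date
"""

con = sqlite3.connect(":memory:")
con.execute(
    "CREATE TABLE instalments (donor_id INTEGER, instalment_id INTEGER, "
    "instalment_date TEXT, amount REAL)"
)
con.executemany("INSERT INTO instalments VALUES (?, ?, ?, ?)", ROWS)
streaks = con.execute(QUERY).fetchall()
con.close()

for row in streaks:
    print(row)
```

Rows with a non-zero amount get NULL (None in Python) in new_field, matching the blank cells in the question's desired output.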