Calculate account balance history in PostgreSQL - sql

I am trying to get a balance history on the account using SQL. My table in PostgreSQL looks like this:
id sender_id recipient_id amount_money
--- ----------- ---------------------- -----------------
1 1 2 60.00
2 1 2 15.00
3 2 1 35.00
so the user with id number 2 currently has 40 dollars in his account.
I would like to get this result using sql:
[60, 75, 40]
Is it possible to do something like this using sql in postgres?

To get a rolling balance, you can SUM the amounts (up to and including the current row) based on whether the id was the recipient or sender:
SELECT id, sender_id, recipient_id, amount_money,
SUM(CASE WHEN recipient_id = 2 THEN amount_money
WHEN sender_id = 2 THEN -amount_money
END) OVER (ORDER BY id) AS balance
FROM transactions
Output:
id sender_id recipient_id amount_money balance
1 1 2 60.00 60.00
2 1 2 15.00 75.00
3 2 1 35.00 40.00
If you want an array, you can use array_agg with the above query as a derived table:
SELECT array_agg(balance)
FROM (
SELECT SUM(CASE WHEN recipient_id = 2 THEN amount_money
WHEN sender_id = 2 THEN -amount_money
END) OVER (ORDER BY id) AS balance
FROM transactions
) t
Output:
[60,75,40]
Demo on dbfiddle
If you want to be more sophisticated and support balances for multiple accounts, you need to split the initial data into account ids, adding when the id is the recipient and subtracting when the sender. You can use CTEs to generate the appropriate data:
WITH trans AS (
SELECT id, sender_id AS account_id, -amount_money AS amount
FROM transactions
UNION ALL
SELECT id, recipient_id AS account_id, amount_money AS amount
FROM transactions
),
balances AS (
SELECT id, account_id, ABS(amount),
SUM(amount) OVER (PARTITION BY account_id ORDER BY id) AS balance
FROM trans
)
SELECT account_id, ARRAY_AGG(balance) AS bal_array
FROM balances
GROUP BY account_id
Output:
account_id bal_array
1 [-60,-75,-40]
2 [60,75,40]
Demo on dbfiddle

Related

Return two values using one sql in PostgreSQL

My table with user transactions in postgres looks like this:
id sender_id recipient_id amount_money
--- ----------- ---------------------- -----------------
1 1 2 60.00
2 1 2 15.00
3 2 1 35.00
The user with id number 2 has 75.00 revenues and 35.00 expenses
I would like to get a result similar to this:
[{name: 'revenues', value: 75.00}, {name:'expenses', value: 35.00}]
Can such result be done in postgres sql itself? I think about array_agg function. If it is difficult, I can handle it in javascript, but I need a query that will return two values to me: revenues and expenses
Here is one method:
select sum(amount) filter (where recipient_id = 2) as revenues,
sum(amount) filter (where sender_id = 2) as expenses
from transactions t
where 2 in (sender_id, recipient_id);
If you wanted to do this for more than one person, then you can use a lateral join and aggregation:
select v.id, sum(v.revenue) as revenues, sum(v.expenses) as expenses
from transactions t cross join lateral
(values (t.recipient_id, t.amount, 0), (t.sender_id, 0, t.amount)
) v(id, revenue, expense)
group by v.id;

Count the number of transactions per month for an individual group by date Hive

I have a table of customer transactions where each item purchased by a customer is stored as one row. So, for a single transaction there can be multiple rows in the table. I have another col called visit_date.
There is a category column called cal_month_nbr which ranges from 1 to 12 based on which month transaction occurred.
The data looks like below
Id visit_date Cal_month_nbr
---- ------ ------
1 01/01/2020 1
1 01/02/2020 1
1 01/01/2020 1
2 02/01/2020 2
1 02/01/2020 2
1 03/01/2020 3
3 03/01/2020 3
first
I want to know how many times customer visits per month using their visit_date
i.e i want below output
id cal_month_nbr visit_per_month
--- --------- ----
1 1 2
1 2 1
1 3 1
2 2 1
3 3 1
and what is the avg frequency of visit per ids
ie.
id Avg_freq_per_month
---- -------------
1 1.33
2 1
3 1
I tried with below query but it counts each item as one transaction
select avg(count_e) as num_visits_per_month,individual_id
from
(
select r.individual_id, cal_month_nbr, count(*) as count_e
from
ww_customer_dl_secure.cust_scan
GROUP by
r.individual_id, cal_month_nbr
order by count_e desc
) as t
group by individual_id
I would appreciate any help, guidance or suggestions
You can divide the total visits by the number of months:
select individual_id,
count(*) / count(distinct cal_month_nbr)
from ww_customer_dl_secure.cust_scan c
group by individual_id;
If you want the average number of days per month, then:
select individual_id,
count(distinct visit_date) / count(distinct cal_month_nbr)
from ww_customer_dl_secure.cust_scan c
group by individual_id;
Actually, Hive may not be efficient at calculating count(distinct), so multiple levels of aggregation might be faster:
select individual_id, avg(num_visit_days)
from (select individual_id, cal_month_nbr, count(*) as num_visit_days
from (select distinct individual_id, visit_date, cal_month_nbr
from ww_customer_dl_secure.cust_scan c
) iv
group by individual_id, cal_month_nbr
) ic
group by individual_id;

SQL Query to get sums among multiple payments which are greater than or less than 10k

I am trying to write a query to get sums of payments from accounts for a month. I have been able to get it for the most part but I have hit a road block. My challenge is that I need a count of the amount of payments that are either < 10000 or => 10000. The business rules are that a single payment may not exceed 10000 but there can be multiple payments made that can total more than 10000. As a simple mock database it might look like
ID | AccountNo | Payment
1 | 1 | 5000
2 | 1 | 6000
3 | 2 | 5000
4 | 3 | 9000
5 | 3 | 5000
So the results I would expect would be something like
NumberOfPaymentsBelow10K | NumberOfPayments10K+
1 | 2
I would like to avoid doing a function or stored procedure and would prefer a sub query.
Any help with this query would be greatly appreciated!
I suggest avoiding sub-queries as much as possible because it hits the performance, specially if you have a huge amount of data, so, you can use something like Common Table Expression instead. You can do the same by using:
;WITH CTE
AS
(
SELECT AccountNo, SUM(Payment) AS TotalPayment
FROM Payments
GROUP BY AccountNo
)
SELECT
SUM(CASE WHEN TotalPayment < 10000 THEN 1 ELSE 0 END) AS 'NumberOfPaymentsBelow10K',
SUM(CASE WHEN TotalPayment >= 10000 THEN 1 ELSE 0 END) AS 'NumberOfPayments10K+'
FROM CTE
You can get the totals per account using SUM and GROUP BY...
SELECT AccountNo, SUM(Payment) AS TotPay
FROM payments
GROUP BY AccountNo
You can use that result to count the number over 10000
SELECT COUNT(*)
FROM (
SELECT AccountNo, SUM(Payment) AS TotPay
FROM payments
GROUP BY AccountNo
)
WHERE TotPay>10000
You can get the the number over and the number under in a single query if you want but that's a but more complicated:
SELECT
COUNT(CASE WHEN TotPay<=10000 THEN 1 END) AS Below10K,
COUNT(CASE WHEN TotPay> 10000 THEN 1 END) AS Above10K
FROM (
SELECT AccountNo, SUM(Payment) AS TotPay
FROM payments
GROUP BY AccountNo
)

How to remove duplicate accounts in SQL?

I am using SQL Server 2008 and I was wondering how to remove duplicate customers either from the table or exclude it in my query. An Account_ID can only have 1 product associated with it. And the account with the most recent purchase date is what should be showing. An example is below:
Account_ID, Account_Purchase, Purchase_Date
1 Product 1 1/1/2016
2 Product 1 1/2/2016
3 Product 2 1/5/2016
1 Product 3 3/12/2016
4 Product 3 1/5/2016
Ideally I would only see:
Account_ID, Account_Purchase, Purchase_Date
2 Product 1 1/2/2016
3 Product 2 1/5/2016
1 Product 3 3/12/2016
4 Product 3 1/5/2016
This should not show up because it is not the most recent purchase from account 1
Account_ID, Account_Purchase, Purchase_Date
1 Product 1 1/1/2016
Thank you all for help, folks!
Simply acquire the latest purchase_date using max and group by account_id. Then use inner join to get the other details from the acquired details.
SELECT TABLE_NAME.* FROM TABLE_NAME
INNER JOIN(
SELECT Account_ID, MAX(Purchase_Date) AS Purchase_Date
GROUP BY Account_ID
) LatestPurchases
ON TABLE_NAME.Account_ID = LatestPurchases.Account_ID
AND TABLE_NAME.Purchase_Date = LatestPurchases.Purchase_Date
Try below query, please replace TABLENAME with your table
WITH CTE
AS (
SELECT
*,
ROW_NUMBER() OVER (PARTITION BY Account_ID ORDER BY Purchase_Date DESC) AS RN
FROM TABLENAME
)
SELECT
*
FROM CTE
WHERE RN = 1
Here is another query
SELECT
t.Account_id,
t.Account_Purchase,
t.Purchase_Date
FROM
tablename t
WHERE
t.Purchase_Date = (SELECT MAX(Purchase_date) FROM Tablename WHERE Account_ID = t.Account_ID)
ORDER BY
t.Purchase_Date DESC

How do I create a frequency distribution?

I'm trying to create a frequency distribution to show how many customers have transacted 1x, 2x, 3x, etc.
I have a database transactions and column user_id. Each row indicates a transaction, and if a user_id shows up in multiple rows, that user has done multiple transactions.
Now I'd like to get a list that looks something like this:
Tra. | Freq.
0 | 345
1 | 543
2 | 45
3 | 20
4 | 0
5 | 3
etc
Currently I have this, but it just shows a list of users and how many transactions they have had.
SELECT user_id, COUNT(user_id) as number_of_transactions
FROM transactions
GROUP BY user_id
ORDER BY number_of_transactions DESC;
I did some digging and was suggested that generate_series might help, but I'm stuck and don't know how to move forward.
Use the first result as input to an outer query where you apply the count again, but this time grouping on number_of_transactions:
SELECT number_of_transactions, COUNT(*) AS freq
FROM (
SELECT user_id, COUNT(user_id) as number_of_transactions
FROM transactions
GROUP BY user_id
) A
GROUP BY number_of_transactions;
This would transform a result like:
user_id number_of_transactions
----------- ----------------------
1 2
2 1
3 2
4 4
to this:
number_of_transactions freq
---------------------- -----------
1 1
2 2
4 1