Calculating user's balance with multiple cost tables


We're building a SaaS offering where a user can incur costs from various types of transactions, for example:
Making phone calls
Sending SMS messages
Storing audio recordings
We have built our system to store the costs of each service. For example, the call_audit table looks like this:
Date        Call ID  Our Cost  User Cost  Currency  Duration  User ID
----------  -------  --------  ---------  --------  --------  -------
2018-01-02  sm_123   0.01      0.02       USD       72        us_1
The sms_audit table looks like:
Date        SMS ID   Our Cost  User Cost  Currency  User ID
----------  -------  --------  ---------  --------  -------
2018-01-02  sm_123   0.01      0.02       USD       us_1
Then there is a payment_audit table with user payments and refunds:
Date        User ID  Amount  Currency  Type
----------  -------  ------  --------  ------
2018-01-02  us_1     12      USD       CHARGE
2018-01-02  us_1     -2      USD       REFUND
We also have a user table with a balance column, which we decrement when the user incurs a call cost, an SMS cost or a refund. We increment it when the user pays into their account (CHARGE as above).
But going forward I'm thinking we need something more resilient than a single balance figure which gets updated in code.
One improvement is to update the balance figure with triggers instead of in code.
Another approach would be to calculate the user's total costs and payments across multiple tables and sum the lot. As the tables grow to many thousands of transactions, I can imagine this becoming a slow computation.
Another approach we thought of was to have a balance_transactions table with debit, credit and running balance columns. This of course introduces transitive dependencies between rows, which isn't great if you are seeking a nicely normalized DB. It also means we're duplicating data, but in the real world is this an acceptable trade-off?

You can avoid duplicating the data by using materialized views. Note that updating the balance in any way (by the application, by triggers, or through partial running balances) already duplicates the data. As such, you should have some validation procedures running to alert on discrepancies. Such validation procedures have to do all the calculations anyway, so they might as well populate the materialized view.
However, the actual solution depends on how often you need this data. If you, for example, fetch all the customers' balances monthly for invoicing purposes, just don't duplicate them. But if you print the balance after each customer operation, e.g. in some kind of transaction confirmation (like a PDF generated and e-mailed to the customer), you might want to keep the running balance in the form that was presented to the customer, since the customer then holds evidence of that balance.
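The validation procedure described above could look roughly like the following. This is only a sketch under assumptions: it uses Python's built-in sqlite3 module so that it runs anywhere, the column names are guessed from the table headings in the question, everything is assumed to be in a single currency, and the 0.005 tolerance is arbitrary. A production schema would more likely use NUMERIC or integer minor units than floating point.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Column names are assumed from the table headings in the question.
    CREATE TABLE "user" (user_id TEXT PRIMARY KEY, balance REAL NOT NULL);
    CREATE TABLE call_audit (date TEXT, call_id TEXT, our_cost REAL, user_cost REAL,
                             currency TEXT, duration INTEGER, user_id TEXT);
    CREATE TABLE sms_audit (date TEXT, sms_id TEXT, our_cost REAL, user_cost REAL,
                            currency TEXT, user_id TEXT);
    CREATE TABLE payment_audit (date TEXT, user_id TEXT, amount REAL,
                                currency TEXT, type TEXT);

    INSERT INTO "user" VALUES ('us_1', 9.96);
    INSERT INTO call_audit VALUES ('2018-01-02', 'sm_123', 0.01, 0.02, 'USD', 72, 'us_1');
    INSERT INTO sms_audit VALUES ('2018-01-02', 'sm_123', 0.01, 0.02, 'USD', 'us_1');
    INSERT INTO payment_audit VALUES ('2018-01-02', 'us_1', 12, 'USD', 'CHARGE'),
                                     ('2018-01-02', 'us_1', -2, 'USD', 'REFUND');
""")

# Payments (refunds are stored as negative amounts) add to the balance,
# call and SMS user costs subtract from it.
DISCREPANCY_SQL = """
    SELECT u.user_id, u.balance, l.computed
    FROM "user" AS u
    JOIN (
        SELECT user_id, ROUND(SUM(delta), 2) AS computed
        FROM (
            SELECT user_id, amount AS delta FROM payment_audit
            UNION ALL
            SELECT user_id, -user_cost FROM call_audit
            UNION ALL
            SELECT user_id, -user_cost FROM sms_audit
        )
        GROUP BY user_id
    ) AS l ON l.user_id = u.user_id
    WHERE ABS(u.balance - l.computed) > 0.005
"""

def find_discrepancies(conn):
    """Return (user_id, stored_balance, computed_balance) for every mismatch."""
    return conn.execute(DISCREPANCY_SQL).fetchall()

matching = find_discrepancies(conn)   # stored balance agrees with the audit tables

# Simulate drift in the stored figure, then validate again.
conn.execute('UPDATE "user" SET balance = 10.00 WHERE user_id = ?', ("us_1",))
drifted = find_discrepancies(conn)    # now reports us_1
```

The inner UNION ALL query is the "sum the lot" calculation from the question; the same SELECT could back a materialized view in a database that supports them.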

Related

SQL Database Design for monthly issued coupons

I'm struggling with a database schema for a problem I'm having.
Let's say I own a business that sells monthly services (cleaning) to different companies.
However, I give companies monthly, saveable 'coupons' that each act as a reduction (of 5 dollars); the number of coupons is based on their number of users.
Example:
It's April 2018.
Company XYZ has to pay 1,000 dollars for the monthly cleaning services provided by my business.
XYZ has 5 employees, so they get 5 coupons for the month of April.
HOWEVER, since coupons can be saved (for a period of 2 months), company XYZ will use not only the coupons of April but also those of March (since they didn't use any that month, and the February coupons are already used up).
Result:
10 coupons are used on their April invoice (5 of March, 5 of April):
total amount to pay: 950 dollars
My thing is that I want to automate this. With one click on the button, my system will have to check:
How many users there are
If there are any unused coupons from last 2 months (and use those first if they exist)
Apply coupons to their invoice.
I want to design this first in a database, but I'm struggling.
This is my design
Company
CompanyID
Name
User
UserID
CompanyID
UserID
Now I'm struggling with the coupon design: how can I model this so that I can automate the process?
I will need to save coupons per company per month.
My idea is to do it like this:
Company_Month_Coupon
CompanyID
Coupon_Count
Month
I wasn't sure if I could do this in one table, and I'm not sure how to handle the following problem:
What if my program user decides to cancel an invoice? How would my system know from which month the coupons came?
What design would be advised for a coupon-sharing system?
Any advice on tackling this problem would be greatly appreciated.

I would go with your idea and have 2 more tables: Invoices and Invoices_UsedCoupons
Invoices:
ID (Primary key)
CompanyID
Month
Status (to set a cancelled status on your invoice if you don't want to delete from the DB)
Invoices_UsedCoupons:
InvoiceId (foreign key to Invoices table)
Coupon_Count
Month (this field is for the used coupons from Company_Month_Coupon table)
The reasons for this:
We should still store the issued coupons (in your Company_Month_Coupon table) because for each month, the number of employees may change. It means that you have to keep track of the issued coupons whenever the number of employees changes.
With Invoices and Invoices_UsedCoupons table, you could easily calculate the actual used coupons & the remaining coupons.
"what if my program user decides to cancel an invoice, how would my system know from which month the coupons came?"
All the information is available in Invoices and Invoices_UsedCoupons tables. If you want to reclaim coupons after cancelling the invoice, it's also easy to do.
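To make the cancellation case concrete, here is a minimal sketch of the three tables and a "remaining coupons" query, written with Python's built-in sqlite3 module. The 'YYYY-MM' month format, the 'ACTIVE' / 'CANCELLED' status values and the sample IDs are assumptions made for illustration, not part of the design above.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Company_Month_Coupon (
        CompanyID    INTEGER NOT NULL,
        Month        TEXT    NOT NULL,          -- assumed 'YYYY-MM'
        Coupon_Count INTEGER NOT NULL,          -- coupons issued for that month
        PRIMARY KEY (CompanyID, Month)
    );
    CREATE TABLE Invoices (
        ID        INTEGER PRIMARY KEY,
        CompanyID INTEGER NOT NULL,
        Month     TEXT    NOT NULL,
        Status    TEXT    NOT NULL DEFAULT 'ACTIVE'   -- assumed values: ACTIVE / CANCELLED
    );
    CREATE TABLE Invoices_UsedCoupons (
        InvoiceId    INTEGER NOT NULL REFERENCES Invoices(ID),
        Month        TEXT    NOT NULL,          -- month the coupons were issued for
        Coupon_Count INTEGER NOT NULL
    );

    -- Company 1 (XYZ): 5 coupons issued for March and 5 for April.
    INSERT INTO Company_Month_Coupon VALUES (1, '2018-03', 5), (1, '2018-04', 5);
    -- The April invoice consumes all 10 of them.
    INSERT INTO Invoices VALUES (10, 1, '2018-04', 'ACTIVE');
    INSERT INTO Invoices_UsedCoupons VALUES (10, '2018-03', 5), (10, '2018-04', 5);
""")

# Issued coupons minus coupons used on invoices that are not cancelled.
REMAINING_SQL = """
    SELECT c.Month,
           c.Coupon_Count - COALESCE((
               SELECT SUM(u.Coupon_Count)
               FROM Invoices_UsedCoupons AS u
               JOIN Invoices AS i ON i.ID = u.InvoiceId
               WHERE i.CompanyID = c.CompanyID
                 AND u.Month = c.Month
                 AND i.Status <> 'CANCELLED'
           ), 0) AS remaining
    FROM Company_Month_Coupon AS c
    WHERE c.CompanyID = ?
    ORDER BY c.Month
"""

def remaining_coupons(conn, company_id):
    """Return [(issue_month, remaining_count)] for one company."""
    return conn.execute(REMAINING_SQL, (company_id,)).fetchall()

before_cancel = remaining_coupons(conn, 1)

# Cancelling the invoice only flips its status; the usage rows stay as history,
# and the coupons become available again under their original months.
conn.execute("UPDATE Invoices SET Status = 'CANCELLED' WHERE ID = ?", (10,))
after_cancel = remaining_coupons(conn, 1)
```

Because each usage row records the month the coupons were issued for, no extra bookkeeping is needed to give them back after a cancellation.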

"I will need to save coupons per company per month."
Maybe you can do the opposite: don't store the coupons that can be used in the database, only those that have actually been used, for example in a table "used_coupons".
The idea is that coupons are granted by default, so it makes no sense to store them. You only need to save the used coupons.
At checkout you need to find out how many users the company has and how many "used coupons" were saved in the last two months.
If X coupons are returned, you need to delete the latest X coupons from the "used_coupons" table.

Calculating interest using SQL

I am using PostgreSQL, and have a table for a billing cycle and another for payments made in a billing cycle.
I am trying to figure out how to calculate interest based on how much amount was left after each billing cycle's last payment date. Problem is that every time a repayment is made, the interest has to be calculated on the amount remaining after that.
My thoughts on building this query are like this. Build data for all dates from last pay date of the billing cycle to today. Using partitioning, get the remaining amount for the first date. For second date, use amount from previous row and add interest to it, and then calculate interest on this one.
Unfortunately I am stuck just at the thought and can't figure out how to make this into a query!
Here's some sample data to make things easier to understand.
Billing Cycles:
id | ends_at
-----+---------------------
1 | 2017-11-30
2 | 2017-11-30
Payments:
  amount   | billing_cycle_id |   type    |         created_at
-----------+------------------+-----------+----------------------------
 6000.0000 |                1 | payment   | 2017-11-15 18:40:22.151713
 2000.0000 |                1 | repayment | 2017-11-19 11:45:15.6167
 2000.0000 |                1 | repayment | 2017-12-02 11:46:40.757897
As we can see, the user made a repayment on the 19th, so the amount due for interest after the end date (30th Nov 2017) is only 4000. So, from the 30th to the 2nd, interest will be calculated daily on 4000. However, from the 2nd, interest needs to be calculated on 2000 only.
Interest Calculations (today being 2017-12-04):
    date    |  amount   | interest
------------+-----------+----------
 2017-12-01 | 4000      | 100      // First day of pending dues.
 2017-12-02 | 2100      | 52.5     // Second day of pending dues.
 2017-12-03 | 2152.5    | 53.8125  // Third day of pending dues.
 2017-12-04 | 2206.3125 |          // Fourth day's interest will be added tomorrow.
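The arithmetic behind the table above can be reproduced with a short loop. This is a sketch in plain Python rather than a PostgreSQL query, and the 2.5% daily rate is an assumption inferred from the first sample row (100 / 4000); the question itself never states a rate. The function and parameter names are made up for illustration.

```python
from datetime import date, timedelta
from decimal import Decimal

DAILY_RATE = Decimal("0.025")  # assumption: inferred from 100 / 4000 in the sample table

def daily_interest_schedule(opening_due, first_day, today, repayments, rate=DAILY_RATE):
    """Return [(day, amount, interest)] from first_day through today.

    Each day: subtract that day's repayment, then compute interest on what is
    left. The interest is carried into the next day's amount. Today's interest
    is left as None because it is only added tomorrow.
    """
    rows = []
    amount = opening_due
    day = first_day
    while day <= today:
        amount -= repayments.get(day, Decimal("0"))
        interest = amount * rate if day < today else None
        rows.append((day, amount, interest))
        if interest is not None:
            amount += interest
        day += timedelta(days=1)
    return rows

schedule = daily_interest_schedule(
    opening_due=Decimal("4000"),                        # 6000 paid out, 2000 repaid before the cycle ended
    first_day=date(2017, 12, 1),
    today=date(2017, 12, 4),
    repayments={date(2017, 12, 2): Decimal("2000")},    # repayment made after the cycle ended
)

for day, amount, interest in schedule:
    print(day, amount, interest)
```

In SQL the same day-by-day dependency on the previous row is usually what pushes people towards a recursive query or a stored running balance, since each row needs the result of the one before it.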

Your data is too sparse. It doesn't make sense to solve this with a query alone, because the query will get significantly more complicated over time. What happens when interest rates change?
The table itself (or a secondary table, depending on how you want to structure it) could hold a running balance that you add to every time a deposit or withdrawal is made. (I suggest this table be append-only.) Otherwise you're making both the calculation and the accounting far harder on yourself than they should be. Even with the way you've presented the problem here, there's not enough information to do the calculation (the interest rate is missing). When that's the case, your stored procedure is going to be too complicated. Complicated means bugs, and people get irritated about bugs when you're talking about their money.

SQL SUM expression and Lock

I have a problem finding the right SQL solution.
Current situation:
My database contains a table with bank transactions (credit and debit).
Credit transactions are stored as positive amounts (+), and debit transactions as negative amounts (-).
The application which uses the DB is a multi-user web app, so the Transactions table contains many rows that reference different users.
Some web app actions need to check the current balance of the logged-in user, using the Transactions table, and save a debit transaction (the action's price).
I am thinking about the architecture of this mechanism and have some questions:
Is it a good idea to calculate the balance as a SUM of the Transactions credits and debits each time the user requests it? I know it may be inefficient for the DB. Should I maybe save a snapshot somewhere?
How can I ensure data consistency when one user checks the "balance" as a SUM of credit/debit transactions and another user saves a debit transaction at the same time (because he/she was faster)? I am thinking about a pessimistic lock, but what should I lock? I know that a lock with an aggregation (SUM) may be impossible in PostgreSQL (the database I use).
Sorry for my English, I hope my problem is understandable. :)

I would consider EITHER:
Storing a balance on the account record, along with the date for which the balance is accurate.
Getting the current balance is a matter of reading the account balance, and then including any transactions since that date.
You can have a scheduled job that recalculates and timestamps that balance at an hour past midnight.
OR (and this is my preferred solution):
Every time a transaction or batch of transactions is loaded, lock the relevant account records and update them with the values from the insert as part of the same transaction.
This has the advantage of serialising access to the account, which can then help with determining whether a transaction can go ahead or not because of decisions based on the balance calculation.
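A rough sketch of the preferred option, using Python's built-in sqlite3 module. Two things here are substitutions rather than the approach as stated: SQLite has no row-level locks, so the example takes a database-wide write lock with BEGIN IMMEDIATE (in PostgreSQL you would typically lock the single account row with SELECT ... FOR UPDATE instead), and the table and column names are assumed. Amounts are integers in minor units to avoid rounding issues.

```python
import sqlite3

# isolation_level=None puts the connection in autocommit mode so that
# transactions can be opened and closed explicitly below.
conn = sqlite3.connect(":memory:", isolation_level=None)
conn.executescript("""
    CREATE TABLE account (
        id      INTEGER PRIMARY KEY,
        balance INTEGER NOT NULL            -- stored balance, minor units
    );
    CREATE TABLE transactions (
        id         INTEGER PRIMARY KEY,
        account_id INTEGER NOT NULL REFERENCES account(id),
        amount     INTEGER NOT NULL         -- credit > 0, debit < 0
    );
    INSERT INTO account (id, balance) VALUES (1, 500);
""")

def debit(conn, account_id, price):
    """Charge `price` if the balance covers it; return True on success.

    The balance check, the balance update and the transaction insert all
    happen inside one database transaction, so concurrent debits cannot
    both spend the same money.
    """
    conn.execute("BEGIN IMMEDIATE")          # serialise writers (SQLite stand-in for a row lock)
    try:
        cur = conn.execute(
            "UPDATE account SET balance = balance - ? "
            "WHERE id = ? AND balance >= ?",
            (price, account_id, price),
        )
        if cur.rowcount != 1:                # unknown account or insufficient funds
            conn.execute("ROLLBACK")
            return False
        conn.execute(
            "INSERT INTO transactions (account_id, amount) VALUES (?, ?)",
            (account_id, -price),
        )
        conn.execute("COMMIT")
        return True
    except Exception:
        conn.execute("ROLLBACK")
        raise

first = debit(conn, 1, 200)    # succeeds, balance becomes 300
second = debit(conn, 1, 400)   # refused, balance stays 300
balance = conn.execute("SELECT balance FROM account WHERE id = 1").fetchone()[0]
ledger_total = conn.execute(
    "SELECT COALESCE(SUM(amount), 0) FROM transactions WHERE account_id = 1"
).fetchone()[0]
```

The conditional UPDATE does the balance check and the decrement in one statement, which is what makes the decision "can this transaction go ahead?" safe under concurrency.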

If you want to avoid keeping the balance on the user account, which could also perform better, the approach I would experiment with is:
Each transaction would be related to only one account.
Each transaction would have the account balance after that transaction.
Therefore, the last transaction for that account would have the current balance.
Ex.:
TransactionId | AccountId | Datetime | Amount | Balance
1             | 1         | 7/11/16  | 0      | 0
2             | 1         | 7/11/16  | 500    | 500
3             | 1         | 7/11/16  | -20    | 480
4             | 1         | 8/11/16  | 50     | 530
5             | 1         | 8/11/16  | -200   | 330
This way you would be able to get the account balance (last transaction with that accountId) and you would be able to provide a better view into the balance change over time.
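A small sketch of this layout with Python's built-in sqlite3 module, replaying the five example rows. The table and function names are assumptions, and the read-then-insert step shown here is not safe under concurrent writers on its own; it would still need some form of per-account locking.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE transactions (
        transaction_id INTEGER PRIMARY KEY AUTOINCREMENT,
        account_id     INTEGER NOT NULL,
        amount         INTEGER NOT NULL,
        balance        INTEGER NOT NULL     -- account balance after this transaction
    )
""")

def current_balance(conn, account_id):
    """The balance is simply the `balance` of the account's latest transaction."""
    row = conn.execute(
        "SELECT balance FROM transactions "
        "WHERE account_id = ? ORDER BY transaction_id DESC LIMIT 1",
        (account_id,),
    ).fetchone()
    return row[0] if row else 0

def add_transaction(conn, account_id, amount):
    """Append a row whose balance is the previous balance plus `amount`."""
    with conn:  # commit on success, roll back on error
        new_balance = current_balance(conn, account_id) + amount
        conn.execute(
            "INSERT INTO transactions (account_id, amount, balance) VALUES (?, ?, ?)",
            (account_id, amount, new_balance),
        )
    return new_balance

# Replay the amounts from the example table.
for amount in (0, 500, -20, 50, -200):
    add_transaction(conn, 1, amount)

balances = [row[0] for row in conn.execute(
    "SELECT balance FROM transactions WHERE account_id = 1 ORDER BY transaction_id"
)]
latest = current_balance(conn, 1)
```

An index on (account_id, transaction_id) would keep the "latest row" lookup cheap as the table grows.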

Different scenarios with Slowly Changing Dimensions (SCD) Type 2

We currently have a table in the data warehouse named 'Cards'. This was designed as a slowly changing dimension of type 2; where we create a new record should the card state change so that we can keep track of the state changes of the card.
We are also keeping a daily record for each card, even if no state changed - this is done to keep track of the daily balance. Example:
cardId  state    balanceAsAt  balance  ....
1       ACTIVE   2014-01-01   100.00
1       ACTIVE   2014-01-02   99.00
1       DELETED  2014-01-03   0.00
What is the optimal way to store data should I need to execute the ETL for a past date range (e.g. 2nd January 2014) today, Feb 2015 (example for 2014-01-01), assuming there is no way to retrieve the past state of the card?
Option A - insert a record with the current data for the past day
cardId  state    balanceAsAt  balance  ....
1       ACTIVE   2014-01-01   100.00
1       DELETED  2014-01-01   0.00     [new entry here? - however now the card seems to have been 're-activated' on the 2nd, which is not the case]
1       ACTIVE   2014-01-02   99.00
1       DELETED  2014-01-03   0.00
Option B - do not modify records already created in the dimension
cardId  state    balanceAsAt  balance  ....
1       ACTIVE   2014-01-01   100.00
1       ACTIVE   2014-01-02   99.00
1       DELETED  2014-01-03   0.00
Any other options/standard practices?

After reading the description, I realized you might have a very fast-changing dimension on your hands. My proposal would be, if possible, to change the balance attribute to a slowly changing dimension of type 1 (update) and keep a record of the balances in a fact table. For this you have two options:
snapshot: for each day, an entry is created for each card. This is very good if, for example, you frequently need to know the average balance of the cards on a given day (or during a given month).
transaction logging: a fact table where you keep track of the transactions on the card, and the balance before and after each transaction. Compared with the snapshot, this has the advantage of taking less disk space.
You should be aware that slowly changing dimensions are meant for attributes that change slowly; using them for attributes that change very often is not a good idea. Your cards dimension will grow too fast, and this will carry a performance burden.
My second note is that you are not implementing the slowly changing dimension properly: you are implementing neither dimensional keys nor slowly changing dimension flags.
A properly implemented dimension would look like this:
keyCard  cardId  state    startDate   endDate     balance
1        1       ACTIVE   2014-01-01  2014-01-03  100.00
2        1       ACTIVE   2014-01-03  2014-02-01  99.00
3        1       DELETED  2014-02-01  null        0.00
You can retrieve the last record by doing a:
select * from DimensionCards where endDate is null;
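As a rough illustration of how rows in such a dimension are maintained, the sketch below (Python's built-in sqlite3 module) closes the open row and appends a new version on every change, then reads the current row with the endDate IS NULL filter. The helper name record_version and the column types are assumptions; the sample values are the three rows from the table above.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE DimensionCards (
        keyCard   INTEGER PRIMARY KEY AUTOINCREMENT,  -- dimensional key
        cardId    INTEGER NOT NULL,                   -- business key
        state     TEXT    NOT NULL,
        startDate TEXT    NOT NULL,
        endDate   TEXT,                               -- NULL marks the current version
        balance   REAL    NOT NULL
    )
""")

def record_version(conn, card_id, state, balance, effective_date):
    """Close the card's open version (if any) and insert the new one."""
    with conn:  # both statements commit together or not at all
        conn.execute(
            "UPDATE DimensionCards SET endDate = ? "
            "WHERE cardId = ? AND endDate IS NULL",
            (effective_date, card_id),
        )
        conn.execute(
            "INSERT INTO DimensionCards (cardId, state, startDate, endDate, balance) "
            "VALUES (?, ?, ?, NULL, ?)",
            (card_id, state, effective_date, balance),
        )

record_version(conn, 1, "ACTIVE", 100.00, "2014-01-01")
record_version(conn, 1, "ACTIVE", 99.00, "2014-01-03")
record_version(conn, 1, "DELETED", 0.00, "2014-02-01")

current = conn.execute(
    "SELECT keyCard, state, startDate, balance FROM DimensionCards "
    "WHERE cardId = ? AND endDate IS NULL",
    (1,),
).fetchone()
history = conn.execute(
    "SELECT keyCard, startDate, endDate FROM DimensionCards ORDER BY keyCard"
).fetchall()
```

Each closed row's endDate equals the next row's startDate, so a point-in-time lookup can filter on startDate <= d AND (endDate > d OR endDate IS NULL).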

Run a query to check consistency in SQL Server

I need some help with a SQL query and logic in general. (Using MSSQL Server)
I need to check the consistency of payments at certain retailers over a period of three months.
So I've got a table with all my transactions and the following columns:
TransactionID , AccountNumber , Retailer, Date .... (few other irrelevant ones)
Now one AccountNumber could have many TransactionIDs. (One account could decide to make several payments during one month.)
I have 4 unique retailers' ids, let's call them (101,102,103,104)
Now for consistency I want to get the following data:
The count of transactions where there was only one payment per account for the month at each retailer.
So I'd have:
| # Payments For Month | Retailer | Number of Transactions
| 1 Payment | 101 | 5000
...
But I also want to see how many transactions there were from accounts that made payments at multiple retailers
So I'd want something like:
| 2 Payments | 102 & 104 | 20
Which would mean that an account made 20 payments at retailer 102 & 104.
I don't care so much about the number of accounts as about the number of transactions.
I also want it broken down by month, but I've decided to do a separate query for each month.
I've imported the data into a local DB on my personal laptop so I could go crazy, so I'll be able to try any method.
The goal of this query is to check the consistency of payments by people (accounts) at certain retailers. How many transactions do they loyally make at one retailer every month, how many transactions are there where they've gone to two retailers? or three? or all four?
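One possible reading of the requirement, sketched in plain Python rather than T-SQL: for a single month, group the transactions by account, then bucket each account by how many payments it made and at which combination of retailers, and add up the transactions per bucket. The function name and the input shape (one tuple per transaction) are assumptions, and the sample rows are invented purely to show the output shape.

```python
from collections import Counter, defaultdict

def payment_consistency(transactions):
    """Summarise one month of transactions.

    `transactions` is an iterable of (transaction_id, account_number, retailer).
    Returns a Counter keyed by (payments_per_account, retailer_combination)
    whose value is the total number of transactions in that bucket.
    """
    retailers_by_account = defaultdict(list)
    for _transaction_id, account, retailer in transactions:
        retailers_by_account[account].append(retailer)

    report = Counter()
    for retailers in retailers_by_account.values():
        bucket = (len(retailers), tuple(sorted(set(retailers))))
        report[bucket] += len(retailers)
    return report

# Invented sample data: accounts A and B each pay once at 101,
# accounts C and D each pay once at 102 and once at 104.
sample = [
    (1, "A", 101),
    (2, "B", 101),
    (3, "C", 102), (4, "C", 104),
    (5, "D", 104), (6, "D", 102),
]
report = payment_consistency(sample)

for (payments, retailers), count in sorted(report.items()):
    print(f"{payments} payment(s) | {' & '.join(map(str, retailers))} | {count}")
```

The same two-step shape (aggregate per account first, then aggregate the per-account results) carries over to SQL as a subquery or CTE grouped by AccountNumber, followed by an outer GROUP BY on the payment count and retailer combination.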
