In an app that I am developing, we have users who will make deposits into the app and that becomes their balance.
They can use the balance to perform certain actions and also withdraw it. What schema design ensures that a user can never withdraw or spend more than they have, even under concurrency?
For example:
CREATE TABLE user_transaction (
    transaction_id SERIAL NOT NULL PRIMARY KEY,
    change_value BIGINT NOT NULL,
    user_id INT NOT NULL REFERENCES users (user_id)
);
The above schema can keep track of the balance (SELECT SUM(change_value) FROM user_transaction). However, this does not hold up under concurrency: a user can post two requests simultaneously, and two records could be inserted over two simultaneous database connections.
I can't do in-app locking either (to ensure only one transaction gets written at a time), because I run multiple web servers.
Is there a database schema design that can ensure correctness?
P.S. Off the top of my head, I can imagine leveraging the uniqueness constraint in SQL. By having later transaction reference earlier transactions, and since each earlier transaction can only be referenced once, that ensures correctness at the database level.
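A minimal sketch of that chaining idea might look like the following; the previous_transaction_id column is my illustrative name, not part of the schema above. Each new transaction references the user's latest one (NULL for a user's first transaction), and the UNIQUE constraint means two concurrent inserts chaining off the same predecessor cannot both succeed, so a balance check done before the insert cannot race:
ALTER TABLE user_transaction
    ADD COLUMN previous_transaction_id INT UNIQUE
        REFERENCES user_transaction (transaction_id);
    -- UNIQUE: a given transaction can be the predecessor of at most one later transaction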
Relying on calculating an account balance every time you go to insert a new transaction is not a very good design - for one thing, as time goes by it will take longer and longer, as more and more rows appear in the transaction table.
A better idea is to store the current balance in another table - either a new table, or in the existing users table that you are already using as a foreign key reference.
It could look like this:
CREATE TABLE users (
user_id INT PRIMARY KEY,
balance BIGINT NOT NULL DEFAULT 0 CHECK(balance>=0)
);
Then, whenever you add a transaction, you update the balance like this:
UPDATE users SET balance = balance + $1 WHERE user_id = $2;
You must do this inside a transaction, in which you also insert the transaction record.
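A minimal sketch of that combined transaction, assuming the user_transaction table from the question and PostgreSQL-style placeholders, could look like this:
BEGIN;
-- record the movement (negative change_value for a withdrawal)
INSERT INTO user_transaction (change_value, user_id) VALUES ($1, $2);
-- keep the running balance in step; the CHECK constraint rejects overdrafts
UPDATE users SET balance = balance + $1 WHERE user_id = $2;
COMMIT;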
Concurrency issues are taken care of automatically: if you attempt to update the same record twice from two different transactions, then the second one will be blocked until the first one commits or rolls back. The default transaction isolation level of 'Read Committed' ensures this - see the manual section on concurrency.
You can issue the whole sequence from your application, or if you prefer you can add a trigger to the user_transaction table such that whenever a record is inserted into the user_transaction table, the balance is updated automatically.
That way, the CHECK clause ensures that no transactions can be entered into the database that would cause the balance to go below 0.
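If you go the trigger route, a hedged sketch in PostgreSQL (the function and trigger names are illustrative) might look like this:
CREATE FUNCTION apply_transaction() RETURNS trigger AS $$
BEGIN
    -- fails with a constraint violation if the balance would drop below 0
    UPDATE users
       SET balance = balance + NEW.change_value
     WHERE user_id = NEW.user_id;
    RETURN NEW;
END;
$$ LANGUAGE plpgsql;

CREATE TRIGGER user_transaction_balance
AFTER INSERT ON user_transaction
FOR EACH ROW EXECUTE PROCEDURE apply_transaction();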
I'm creating a POS-like system and I'm not really sure how to do the shopping cart part: after the cashier enters all the customer's items (from the Inventory table), the items entered should share a single transaction #, just like what we see on receipts.
Should I put a Trans_No column in the Cart table? If yes, how will I handle assigning a single Trans_No to multiple items? I'm thinking of getting the last Trans_No, incrementing it by 1, and assigning it to all the items in the cashier's shopping cart. But there's a huge possibility that if 2 cashiers are using the system simultaneously, they will both retrieve the same latest transaction #, both increment it by 1, and end up merging 2 customers' orders into a single transaction/receipt.
What's the best way to handle this?
Which data object your transaction id goes on depends on the functional requirements of your application. If everything in a cart should share a transaction id, then the cart table is the right place for it.
Database systems offer a variety of features to prevent the concurrent increment problem you describe. The easiest way to avoid this is to use a serial data type as offered e.g. by PostgreSQL. If you declare a column as serial, the database will care for generating a fresh value for each record you insert.
If no such data type is available, there might still be a mechanism for generating a unique primary key for a record. An example is the auto_increment directive for MySQL.
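As a sketch (table and column names are illustrative, assuming a cart header table plus a line-item table, and PostgreSQL's SERIAL; with MySQL you would use AUTO_INCREMENT instead), the database hands out the transaction number for you:
CREATE TABLE cart (
    trans_no SERIAL PRIMARY KEY,   -- generated by the database, never handed out twice
    cashier_id INT NOT NULL
);

CREATE TABLE cart_item (
    trans_no INT NOT NULL REFERENCES cart (trans_no),
    item_id INT NOT NULL,
    quantity INT NOT NULL,
    PRIMARY KEY (trans_no, item_id)
);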
If none of these is viable for you, e.g. because you want some fancy logic for generating your transaction ids, then the logic of reading, incrementing, and storing the value needs to be enclosed in a database transaction. Statements like
start transaction;
select key from current_key for update;  -- lock the row so concurrent sessions wait here
update current_key set key = :key + 1;
commit;
will prevent collisions on the key value. However, make sure that your transactions are short, in particular that you don't leave a transaction open during a wait for user input. Otherwise, other users' transactions may be blocked too long.
How do I write a trigger so that nobody is able to rent a movie if their unpaid balance exceeds 50 dollars?
What you have here is a cross-row table constraint - i.e. you can't just put a single Oracle CONSTRAINT on a column, as these can only look at data within a single row at a time.
Oracle has support for only two cross-row constraint types - uniqueness (e.g. primary keys and unique constraints) and referential integrity (foreign keys).
In your case, you'll have to hand-code the constraint yourself - and with that comes the responsibility to ensure that the constraint is not violated in the presence of multiple sessions, each of which cannot see data inserted/updated by other concurrent sessions (at least, until they commit).
A simplistic approach is to add a trigger that issues a query to check the member's unpaid balance; but this won't work, because the trigger cannot see rows that have been inserted/updated by other sessions but not committed yet; so the trigger will sometimes let a member slip past the limit, as long as (for example) they get two cashiers to enter the data in separate terminals.
One way to get around this problem is to put some element of serialization in - e.g. the trigger would first request a lock on the member record (e.g. with a SELECT FOR UPDATE) before it's allowed to check the rentals; that way, if a 2nd session tries to insert rentals, it will wait until the first session does a commit or rollback.
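A hedged sketch of that serialization approach in Oracle, assuming an unpaid_balance column on a member table and a rental table (all names here are illustrative, not from your schema):
CREATE OR REPLACE TRIGGER rental_check_balance
BEFORE INSERT ON rental
FOR EACH ROW
DECLARE
    v_balance member.unpaid_balance%TYPE;
BEGIN
    -- lock the member row so concurrent rentals for the same member serialize
    SELECT unpaid_balance
      INTO v_balance
      FROM member
     WHERE member_id = :NEW.member_id
       FOR UPDATE;

    IF v_balance > 50 THEN
        RAISE_APPLICATION_ERROR(-20001, 'Unpaid balance exceeds 50; rental refused.');
    END IF;
END;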
Another way around this problem is to use an aggregating Materialized View, which would be based on a query that is designed to find any rows that fail the test; the expectation is that the MV will be empty, and you put a table constraint on the MV such that if a row was ever to appear in the MV, the constraint would be violated. The effect of this is that any statement that tries to insert rows that violate the constraint will cause a constraint violation when the MV is refreshed.
Writing the query for this based on your design is left as an exercise for the reader :)
If you want to restrict something about your table data, then you should have a look at Constraints and not Triggers.
Constraints ensure that certain conditions hold for your table data, as in your example.
Triggers are fired when some action (i.e. INSERT, UPDATE, DELETE) takes place, and you can then do some work as a reaction to that action.
Each customer has several accounts. One of the accounts, the oldest, is labeled as 'Primary' and all the others are labeled as 'Secondary'. Every week I run a single update statement "ClassifyAccounts" that takes a collection of new accounts and evaluates them according to this rule.
However, if sometime later the Primary account is closed, I then need to re-evaluate the new Primary from the remaining Secondary accounts the customer has. I want to find a way to do this so that
1. it is handled from the same "ClassifyAccounts" update statement I already execute each week, and
2. the re-evaluation is optimized so that a re-evaluation does not occur unless it needs to occur.
Under these constraints, wherein I'm trying to avoid code with branches (I'm attempting a purely set-based approach), I can only achieve goal #1. The closest I can get to goal #2 is, perhaps, to set a 'NeedsReEvaluation' flag on the customer record and have the "ClassifyAccounts" update statement select any accounts that either (a) are new, with a NULL classification or (b) have a 'NeedsReEvaluation' flag set.
If I use that last trick, it would be nice to reset the 'NeedsReEvaluation' in the self-same update statement, but doing so would mean updating both sides of a join simultaneously (account and customer tables). Is this reasonable? Any other ideas?
Normalize (further) the table. One way would be:
I suppose you have a Customer and an Account table in a 1:n relationship. I also guess you have an IsPrimary flag in the Account table that is set to True for a customer's primary account and False for all others.
Create a new PrimaryAccount table with:
CREATE TABLE PrimaryAccount (
    CustomerId INT NOT NULL,   -- types assumed; use the same types as your existing keys
    AccountId  INT NOT NULL,
    PRIMARY KEY (CustomerId),
    FOREIGN KEY (CustomerId, AccountId)
        REFERENCES Account (CustomerId, AccountId)  -- requires a unique constraint on Account(CustomerId, AccountId)
        ON DELETE CASCADE
);
Then, update this table using the Account.IsPrimary flag.
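A minimal sketch of that population step, assuming IsPrimary is stored as 1/0:
INSERT INTO PrimaryAccount (CustomerId, AccountId)
SELECT CustomerId, AccountId
  FROM Account
 WHERE IsPrimary = 1;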
You can then drop that flag and modify the ClassifyAccounts you specify in your question. It will only need to change (insert or update) the PrimaryAccount table.
When a primary account is deleted, it will of course be deleted from both tables, and then ClassifyAccounts can be called.
As a side effect, you will not be able to have a customer with 2 accounts set as primary, even by mistake.
If you want to keep the current structure, you could use a transaction. See this answer for an example: how-to-update-two-tables-in-one-statement-in-sql-server-2005
What about using an update trigger on your customer table that updates the 'NeedsReEvaluation' flag for the corresponding row(s) in your account table whenever the primary account value (however that is stored) in your customer table changes?
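As a rough sketch in T-SQL, assuming the primary account is stored as a PrimaryAccountId column on Customer and the NeedsReEvaluation flag lives on Account (all names here are illustrative, not from your schema):
CREATE TRIGGER trg_Customer_PrimaryChanged
ON Customer
AFTER UPDATE
AS
BEGIN
    SET NOCOUNT ON;
    IF UPDATE(PrimaryAccountId)
    BEGIN
        -- flag every account of the affected customers for re-evaluation
        UPDATE a
           SET a.NeedsReEvaluation = 1
          FROM Account AS a
          JOIN inserted AS i ON i.CustomerId = a.CustomerId;
    END
END;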
We have a table and a set of procedures that are used for generating pk ids. The table holds the last id, and a procedure gets the id, increments it, updates the table, and then returns the newly incremented id.
This procedure could potentially be called within a transaction. The problem is that if we have a rollback, it could roll the counter back to an id earlier than ids that came into use during the transaction (say, generated from a different user or thread). Then when the id is incremented again, it will cause duplicates.
Is there any way to exclude the id generating table from a parent transaction to prevent this from happening?
To add some detail about our current problem...
First, we have a system we are preparing to migrate a lot of data into. The system consists of an ms-sql (2008) database and a textml database. The sql database houses data less than 3 days old, while the textml database acts as an archive for anything older. The textml db also relies on the sql db to provide ids for particular fields. These fields are currently Identity PKs, and are generated on insertion before publishing to the textml db. We do not want to wash all our migrated data through sql, since the records would flood the current system, both in terms of traffic and data. But at the same time we have no way of generating these ids, since they are auto-incremented values that sql server controls.
Secondly, we have a system requirement which requires us to be able to pull an old asset out of the textml database and insert it back into the sql database with its original ids. This is done for correction and editing purposes, and if we alter the ids it will break relations downstream on clients' systems, which we have no control over. Of course all this is an issue because the id columns are Identity columns.
"a procedure gets the id, increments it, updates the table, and then returns the newly incremented id"
This will cause deadlocks. The procedure must increment and return the value in one single, atomic step, e.g. by using the OUTPUT clause in SQL Server:
update ids
set id = id + 1
output inserted.id
where name = @name;
You don't have to worry about concurrency. The fact that you generate ids this way implies that only one transaction can increment an id at a time, because the update locks the row exclusively. You cannot get duplicates. You do get complete serialization of all operations (i.e. poor performance and low throughput), but that is a different issue. And this is why you should use the built-in mechanisms for generating sequences and identities. These are specific to each platform: AUTO_INCREMENT in MySQL, SEQUENCE in Oracle, IDENTITY and SEQUENCE in SQL Server (SEQUENCE only in Denali), etc.
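For instance, a minimal sketch with a SQL Server sequence (available from SQL Server 2012, code-named Denali; the sequence name is illustrative):
CREATE SEQUENCE record_id_seq
    START WITH 1
    INCREMENT BY 1;

-- each call returns a distinct value without holding a lock on a shared row
SELECT NEXT VALUE FOR record_id_seq;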
Updated
As I read your edit, the only reason why you want control of the generated identities is to be able to insert back archived records. This is already possible, simply use IDENTITY_INSERT:
"Allows explicit values to be inserted into the identity column of a table."
Turn it on when you insert back the old record, then turn it back off:
SET IDENTITY_INSERT recordstable ON;
INSERT INTO recordstable (id, ...) VALUES (@oldid, ...);
SET IDENTITY_INSERT recordstable OFF;
As for why manually generated ids serialize all operations: any transaction that generates an id will exclusively lock the row in the ids table. No other transaction can read or write that row until the first transaction commits or rolls back. Therefore only one transaction can be generating an id for a given table at any moment, i.e. serialization.
Say I have two tables, users and groups each of which has an auto-incrementing primary key. Whenever a new user is created, I would like to create a group along with that user (at the same time/same transaction). However, the users need to know which group they belong to thus each user stores a group_id.
In order to do this, I create and insert a group into the database, find out what the primary key of that group I just inserted was, and finally insert the user with that new group's primary key.
Note that I need to commit the group to the database (and thus commit it outside of any transaction that also commits the user) in order to retrieve the primary key it was assigned.
Although this will work in most situations, if there is some kind of failure (power failure, system crash, etc.) between when I insert the group and find out its primary key, and when I insert the user, I will end up with an inconsistent database.
Is there a way to do something like reserving a primary key temporarily so that if the system crashes, it won't end up in an inconsistent state?
I'm primarily concerned with MySQL databases but if there is something in standard SQL which will allow me to do this (and is thus, compatible with other database backends), I'm also interested in knowing that.
Easy, just put both operations in a transaction. Start the transaction, create the group, create the user, then commit the transaction.
SET autocommit = 0
START TRANSACTION
INSERT INTO Groups ...
INSERT INTO Users ...
COMMIT
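To get the new group's primary key for the user row, you can use MySQL's LAST_INSERT_ID() inside the same transaction; a minimal sketch (the column names are illustrative):
START TRANSACTION;
INSERT INTO Groups (name) VALUES ('new group');
-- LAST_INSERT_ID() returns the auto-increment value generated by this connection's last INSERT
INSERT INTO Users (name, group_id) VALUES ('new user', LAST_INSERT_ID());
COMMIT;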
You would have to be using an engine that supports transactions, such as InnoDB, for your tables in order for that to work though. The default MyISAM engine does not support transactions.
If you use transactions then you'll have no problem.