One-Shot Set-Logic-Only Evaluation (Improve This Query) - sql

Each customer has several accounts. One of the accounts, the oldest, is labeled as 'Primary' and all the others are labeled as 'Secondary'. Every week I run a single update statement "ClassifyAccounts" that takes a collection of new accounts and evaluates them according to this rule.
However, if sometime later the Primary account is closed, I then need to re-evaluate the new Primary from the remaining Secondary accounts the customer has. I want to find a way to do this so that
1. it is handled from the same "ClassifyAccounts" update statement I already execute each week, and
2. the re-evaluation is optimized so that it does not occur unless it needs to.
Under these constraints, where I'm trying to avoid branching code (I'm attempting a purely set-based approach), I can only achieve goal #1. The closest I can get to goal #2 is, perhaps, to set a 'NeedsReEvaluation' flag on the customer record and have the "ClassifyAccounts" update statement select any accounts that either (a) are new, with a NULL classification, or (b) have the 'NeedsReEvaluation' flag set.
If I use that last trick, it would be nice to reset the 'NeedsReEvaluation' flag in the same update statement, but doing so would mean updating both sides of a join simultaneously (the account and customer tables). Is this reasonable? Any other ideas?

Normalize (further) the table. One way would be:
I suppose you have a Customer and an Account table in a 1:n relationship. I also guess you have an IsPrimary flag in the Account table that is set to True for the primary account of a customer and False for all the others.
Create a new PrimaryAccount table with:
PrimaryAccount
--------------
CustomerId
AccountId
PRIMARY KEY (CustomerId)
FOREIGN KEY (CustomerId, AccountId)
REFERENCES Account(CustomerId, AccountId)
ON DELETE CASCADE
Then, update this table using the Account.IsPrimary flag.
You can then drop that flag and modify the ClassifyAccounts statement you describe in your question. It will only need to change (insert or update) the PrimaryAccount table.
When a Primary Account is deleted, it will of course be deleted from both tables, and then ClassifyAccounts can be called.
As a side effect, you will not be able to have a customer with 2 accounts set as primary, even by mistake.
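A rough sketch of what the modified ClassifyAccounts could do for the PrimaryAccount table is below; the OpenedDate column used to find the oldest account is an assumption, a deterministic tie-breaker would be needed if two accounts share the same date, and a filter to exclude closed accounts would also belong here:

-- Give every customer who currently has no primary account a PrimaryAccount
-- row pointing at their oldest account (OpenedDate is an assumed column).
INSERT INTO PrimaryAccount (CustomerId, AccountId)
SELECT a.CustomerId, a.AccountId
FROM Account a
JOIN (
    SELECT CustomerId, MIN(OpenedDate) AS OldestOpened
    FROM Account
    GROUP BY CustomerId
) oldest
  ON oldest.CustomerId = a.CustomerId
 AND oldest.OldestOpened = a.OpenedDate
WHERE NOT EXISTS (SELECT 1 FROM PrimaryAccount p WHERE p.CustomerId = a.CustomerId);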
If you want to keep the current structure, you could use a transaction. See this answer for an example: how-to-update-two-tables-in-one-statement-in-sql-server-2005

What about using an update trigger on your customer table that updates the 'NeedsReEvaluation' flag for the corresponding row(s) in your account table whenever the primary account value (however that is stored) in your customer table changes?
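A rough sketch of that trigger, assuming SQL Server, a Customer.PrimaryAccountId column, and a NeedsReEvaluation flag on Account (all of these names are assumptions):

CREATE TRIGGER trg_Customer_PrimaryChanged
ON Customer
AFTER UPDATE
AS
BEGIN
    -- flag the accounts of every customer whose primary-account reference changed
    UPDATE a
    SET a.NeedsReEvaluation = 1
    FROM Account AS a
    INNER JOIN inserted AS i ON i.CustomerId = a.CustomerId
    INNER JOIN deleted  AS d ON d.CustomerId = i.CustomerId
    WHERE ISNULL(i.PrimaryAccountId, -1) <> ISNULL(d.PrimaryAccountId, -1);
END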

Related

If I get many-to-many relationship data first, how do I insert it into my tables?

Say I have a customer table, a product table and an order table to record who buys what; the order table basically has two foreign keys, customer_id & product_id.
Now I get the order information first, and within it I can't find the customer in my local database. As it turns out this is a new customer, whose information will come later from another thread/queue. To make things even worse, the customer id I get from the order information is not the same one I use locally. My local customer id is an INTEGER PRIMARY KEY (I do record that "true customer id" as another column and set an index on it).
So how do I record this order information? I can come up with a clumsy solution: if I can't find the customer info, I insert a placeholder record for it first; later, after I get the real information for this customer, I update the customer and order tables. But I was wondering, is there any "standard" way to handle a situation like this?
Inserting NULL values, and then updating with the real values later, is simple and will work (if you don't have a NOT NULL constraint).
You should use a transaction so that concurrent users don't see the incomplete data.
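A minimal sketch of that placeholder approach, assuming SQLite (the question's INTEGER PRIMARY KEY suggests it) and assumed table/column names such as orders and customer.true_customer_id:

BEGIN TRANSACTION;
-- create a stub customer that carries only the external ("true") id for now
INSERT INTO customer (true_customer_id, name) VALUES ('EXT-123', NULL);
-- attach the order to the stub using the local id SQLite just assigned
INSERT INTO orders (customer_id, product_id) VALUES (last_insert_rowid(), 42);
COMMIT;

-- later, when the real customer details arrive from the other thread/queue:
UPDATE customer SET name = 'Alice' WHERE true_customer_id = 'EXT-123';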
You could use deferred foreign key constraints:
If a statement modifies the contents of the database such that a deferred foreign key constraint is violated, the violation is not reported immediately. Deferred foreign key constraints are not checked until the transaction tries to COMMIT. For as long as the user has an open transaction, the database is allowed to exist in a state that violates any number of deferred foreign key constraints.
However, deferred foreign key constraints are useful only if you are inserting values that violate the constraint; a NULL value is not considered a FK violation.
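For reference, a deferrable constraint can be declared and then temporarily violated inside a transaction like this (SQLite-style sketch with assumed names; foreign keys must be enabled with PRAGMA foreign_keys = ON):

CREATE TABLE orders (
    order_id    INTEGER PRIMARY KEY,
    product_id  INTEGER NOT NULL,
    customer_id INTEGER NOT NULL
        REFERENCES customer (customer_id) DEFERRABLE INITIALLY DEFERRED
);

BEGIN;
-- customer 999 does not exist yet; the violation is tolerated until COMMIT
INSERT INTO orders (product_id, customer_id) VALUES (42, 999);
INSERT INTO customer (customer_id, name) VALUES (999, 'Alice');
COMMIT;  -- succeeds because the constraint is satisfied by now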

DELETE FROM table becomes heavy as the number of records in the table's children increases

I have a main table called Campaign. Campaign's Id is a foreign key in another table CampaignRun and CampaignRun's Id is a foreign key in a third table CampaignRecipient. Due to my CASCADE requirements I am using
DELETE FROM Campaign WHERE Id = x
to remove all the associated information about a campaign. But this statement becomes very heavy on the server and of course locks the tables while running. I was wondering if there is a faster way of dealing with DELETE FROM. TRUNCATE is faster, but unfortunately it accepts no condition.
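(For reference, the cascade chain described here might be declared roughly like this; the column names are assumptions.)

CREATE TABLE Campaign (
    Id INT PRIMARY KEY
);

CREATE TABLE CampaignRun (
    Id         INT PRIMARY KEY,
    CampaignId INT NOT NULL REFERENCES Campaign (Id) ON DELETE CASCADE
);

CREATE TABLE CampaignRecipient (
    Id            INT PRIMARY KEY,
    CampaignRunId INT NOT NULL REFERENCES CampaignRun (Id) ON DELETE CASCADE
);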
Will appreciate any working suggestions.
maybe you can check this out, interrupt() might be the answer

Maintaining consistency when doing logical delete

I'm performing logical delete when an item should be deleted from the database.
I have added an additional DateTime column to every table where we need to perform logical deletes. So when deleting, you just update the field like...
UPDATE Client
SET deleted = GETDATE()
WHERE Client.CID = #cid
Later if it should be recovered then...
UPDATE Client
SET deleted = NULL
WHERE Client.CID = #cid
So that a typical selection statement would look like...
SELECT *
FROM client
WHERE CID = #cid AND deleted IS NULL
But the problem is how to handle dependencies to maintain the consistency of the database in this approach. For example, before deleting (actually updating) an employee I have to do several checks to see whether there are any related Attendance / Bank Account / Wages / History data etc. in related tables pertaining to the employee being deleted.
So what's the normal practice for doing such things? Do I need to check everything with
IF EXISTS (SELECT...)
statements?
EDIT:
If I want to prevent the update when the employee has related records, I could do something like this using UNION...
IF NOT EXISTS (SELECT emp_id FROM BankAccount WHERE emp_id = '100'
               UNION
               SELECT EID FROM Attendance WHERE EID = '100'
               UNION
               SELECT employee_id FROM SalaryTrans WHERE employee_id = '100')
    UPDATE Employee SET Employee.deleted = GETDATE() WHERE emp_id = '100'
Would this be an acceptable solution?
But the problem is how to handle dependencies to maintain the consistency of the database in this approach. For example, before deleting (actually updating) an employee I have to do several checks to see whether there are any related Attendance / Bank Account / Wages / History data etc. in related tables pertaining to the employee being deleted.
So what's the normal practice for doing such things?
It depends entirely on your application.
Some companies might require all the pending wages, accumulated vacation days and sick days, etc., to be "handled" before deleting a person. Handled might mean converting all those things to money, which is added to a final paycheck. Other companies might allow deleting at any time, knowing that a logical delete doesn't affect any of the related rows in other tables. Application code would be expected to know how to deal with cutting a final check to a deleted person.
Other applications might not deal with anything as important as wages and taxes. They might allow a logical delete at any time, and just not worry about the trivial consequences.
Look into triggers, they might be helpful here.
You could define a trigger on your employee table that checked to see if your logical delete would cause problems for other tables. It involves manually keeping track of what tables need access to employees, so it isn't as robust as allowing foreign key constraints to track that for you, but it can work. I'd set it up as an "AFTER UPDATE" trigger and roll back the transaction (within the trigger) if it found another table referencing the employee. They'd get a rollback anyway if they tried to actually delete an employee used in a FK constraint, so that's not that different.
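A rough sketch of such a trigger, using the table and column names from the question (the trigger name and the specific related tables checked are assumptions):

CREATE TRIGGER trg_Employee_BlockLogicalDelete
ON Employee
AFTER UPDATE
AS
BEGIN
    -- roll back a logical delete if the employee still has related rows
    IF UPDATE(deleted) AND EXISTS (
        SELECT 1
        FROM inserted AS i
        WHERE i.deleted IS NOT NULL
          AND ( EXISTS (SELECT 1 FROM BankAccount WHERE emp_id      = i.emp_id)
             OR EXISTS (SELECT 1 FROM Attendance  WHERE EID         = i.emp_id)
             OR EXISTS (SELECT 1 FROM SalaryTrans WHERE employee_id = i.emp_id) )
    )
    BEGIN
        RAISERROR('Employee still has related rows; logical delete rolled back.', 16, 1);
        ROLLBACK TRANSACTION;
    END
END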
Another approach is to use an AFTER DELETE trigger to copy deleted employees to a "deleted_employees" table, that way you're still hanging on to them, but any tables that reference that employee via FK will error and roll back the transaction before your trigger even has a chance to run.
I have to use similar logic to what you proposed (just check every time you use it) in some of my stuff, and mostly I include a bit field "IsDead" that I set when I kill a record, and then I have to reference that EVERY time I use the table. But I mostly build views because my schema is complex, and it's trivial to include IsDead = 0 in the where clause of the view. I don't know how IsDead = 0 would compare to DelDate IS NULL; if you have a large database you might test that out.
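A minimal sketch of such a view, using the Client table from the question (the view name is made up):

CREATE VIEW ActiveClient AS
SELECT *
FROM Client
WHERE deleted IS NULL;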

Database Schema Design: Tracking User Balance with concurrency

In an app that I am developing, we have users who will make deposits into the app and that becomes their balance.
They can use the balance to perform certain actions and can also withdraw it. What schema design ensures that the user can never withdraw or spend more than he has, even under concurrency?
For example:
CREATE TABLE user_transaction (
    transaction_id SERIAL NOT NULL PRIMARY KEY,
    change_value BIGINT NOT NULL,
    user_id INT NOT NULL REFERENCES user
)
The above schema can keep track of the balance (SELECT SUM(change_value) FROM user_transaction). However, this does not hold up under concurrency, because the user can post two requests simultaneously and two records could be inserted over two simultaneous database connections.
I can't do in-app locking either (to ensure only one transaction gets written at a time), because I run multiple web servers.
Is there a database schema design that can ensure correctness?
P.S. Off the top of my head, I can imagine leveraging the uniqueness constraint in SQL. By having later transaction reference earlier transactions, and since each earlier transaction can only be referenced once, that ensures correctness at the database level.
Relying on calculating an account balance every time you go to insert a new transaction is not a very good design - for one thing, as time goes by it will take longer and longer, as more and more rows appear in the transaction table.
A better idea is to store the current balance in another table - either a new table, or in the existing users table that you are already using as a foreign key reference.
It could look like this:
CREATE TABLE users (
    user_id INT PRIMARY KEY,
    balance BIGINT NOT NULL DEFAULT 0 CHECK (balance >= 0)
);
Then, whenever you add a transaction, you update the balance like this:
UPDATE users SET balance = balance + $1 WHERE user_id = $2;
You must do this inside a transaction, in which you also insert the transaction record.
Concurrency issues are taken care of automatically: if you attempt to update the same record twice from two different transactions, then the second one will be blocked until the first one commits or rolls back. The default transaction isolation level of 'Read Committed' ensures this - see the manual section on concurrency.
You can issue the whole sequence from your application, or if you prefer you can add a trigger to the user_transaction table such that whenever a record is inserted into the user_transaction table, the balance is updated automatically.
That way, the CHECK clause ensures that no transactions can be entered into the database that would cause the balance to go below 0.
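A minimal sketch of one withdrawal, using the user_transaction table from the question and the users table above (user id 1 and the amount are placeholders):

BEGIN;
INSERT INTO user_transaction (change_value, user_id) VALUES (-50, 1);
UPDATE users SET balance = balance + (-50) WHERE user_id = 1;
-- if this would push the balance below 0, the CHECK constraint makes the UPDATE
-- fail and the whole transaction rolls back, so the user can never overdraw
COMMIT;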

Fixing DB Inconsistencies - ID Fields

I've inherited a (Microsoft?) SQL database that wasn't very pristine in its original state. There are still some very strange things in it that I'm trying to fix - one of them is inconsistent ID entries.
In the accounts table, each entry has a number called accountID, which is referenced in several other tables (notes, equipment, etc.). The problem is that the numbers (for some random reason) range from about -100000 to +2000000, when there are only about 7000 entries.
Is there any good way to re-number them while changing the corresponding numbers in the other tables? At my disposal I also have ColdFusion, so anything that works with SQL and/or ColdFusion I'll accept.
For surrogate keys, they are meant to be meaningless, so unless you actually had a database integrity issue (like there were no foreign key constraints properly defined) or your identity was approaching the maximum for its datatype, I would leave them alone and go after some other low-hanging fruit that would have more impact.
In this instance, it sounds like "why" is a better question than "how". The OP notes that there is a strange problem that needs to be fixed but doesn't say why it is a problem. Is it causing problems? What positive impact would changing these numbers have? Unless you originally programmed the system and understand precisely why the numbers are in their current state, you are taking quite a risk making changes like this.
I would talk to an accountant (or at least your financial people) before messing in any way with the numbers in the accounts tables if this is a financial app. The table of accounts is very critical to how finances are reported. These IDs may have meaning you don't understand. No one puts in a negative id unless they had a reason. I would under no circumstances change that unless I understood why it was negative to begin with. You could truly screw up your tax reporting or something else by making an unneeded change.
You could probably disable the foreign key relationships (if you're able to take it offline temporarily) and then update the primary keys using a script. I've used this update script before to change values, and you could pretty easily wrap this code in a cursor to go through the key values in question, one by one, and update the arbitrary value to an incrementing value you're keeping track of.
Check out the script here: http://vyaskn.tripod.com/sql_server_search_and_replace.htm
If you just have a list of tables that use the primary key, you could set up a series of UPDATE statements that run inside your cursor, and then you wouldn't need to use this script (which can be a little slow).
It's worth asking, though, why these values appear out of whack. Does this database have values added and deleted constantly? Are the primary key values really arbitrary, or do they just appear to be, but really have meaning? Though I'm all for consolidating, you'd have to ensure that there's no purpose to those values.
With ColdFusion this shouldn't be a herculean task, but it will be messy and you'll have to be careful. One method you could use would be to script the database and then generate a brand new, blank table schema. Set the accountID as an identity field in the new database.
Then, using ColdFusion, write a query that will pull all of the old account data and insert it into the new database one row at a time. For each row, let the new database assign a new ID. After each insert, pull the new ID (using either @@IDENTITY or MAX(accountID)) and store the new ID and the old ID together in a temporary table so you know which old IDs belong to which new IDs.
Next, repeat the process with each of the child tables. For each old ID, pull its child entries and re-insert them into the new database using the new IDs. If the primary keys on the child tables are fine, you can insert them as-is or let the server assign new ones if they don't matter.
Assigning new IDs in place by disabling relationships temporarily may work, but you might run into conflicts if one of the entries is assigned an ID that is already in use by the old data.
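The temporary mapping table mentioned above might look something like this (the table name is made up, and @old_accountID stands for whatever value the ColdFusion loop is currently processing):

CREATE TABLE accountID_map (
    old_accountID int NOT NULL PRIMARY KEY,
    new_accountID int NOT NULL
);

-- after each insert into the new accounts table:
INSERT INTO accountID_map (old_accountID, new_accountID)
VALUES (@old_accountID, SCOPE_IDENTITY());  -- SCOPE_IDENTITY() is a safer alternative to @@IDENTITY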
Create a new column in the accounts table for your new ID, and new column in each of your related tables to reference the new ID column.
ALTER TABLE accounts
ADD new_accountID int IDENTITY
ALTER TABLE notes
ADD new_accountID int
ALTER TABLE equipment
ADD new_accountID int
Then you can map the new_accountID column on each of your referencing tables to the accounts table.
UPDATE notes
SET new_accountID = accounts.new_accountID
FROM accounts
INNER JOIN notes ON (notes.accountID = accounts.accountID)
UPDATE equipment
SET new_accountID = accounts.new_accountID
FROM accounts
INNER JOIN equipment ON (equipment.accountID = accounts.accountID)
At this point, each table has both accountID with the old keys, and new_accountID with the new keys. From here it should be pretty straightforward.
1. Break all of the foreign keys on accountID.
2. On each table, UPDATE [table] SET accountID = new_accountID.
3. Re-add the foreign keys for accountID.
4. Drop new_accountID from all of the tables, as it's no longer needed.
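A rough T-SQL sketch of those last four steps, assuming accountID itself is not an IDENTITY column (identity columns can't be updated in place) and using made-up constraint names:

ALTER TABLE notes     DROP CONSTRAINT FK_notes_accounts;
ALTER TABLE equipment DROP CONSTRAINT FK_equipment_accounts;

UPDATE accounts  SET accountID = new_accountID;
UPDATE notes     SET accountID = new_accountID;
UPDATE equipment SET accountID = new_accountID;

ALTER TABLE notes
    ADD CONSTRAINT FK_notes_accounts
        FOREIGN KEY (accountID) REFERENCES accounts (accountID);
ALTER TABLE equipment
    ADD CONSTRAINT FK_equipment_accounts
        FOREIGN KEY (accountID) REFERENCES accounts (accountID);

ALTER TABLE accounts  DROP COLUMN new_accountID;
ALTER TABLE notes     DROP COLUMN new_accountID;
ALTER TABLE equipment DROP COLUMN new_accountID;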