T-SQL: Trigger that runs right before the end of a modifying transaction - sql

Problem statement
I have a view that recursively collects and aggregates information from 3 different large to very large tables. The view itself takes quite a while to execute, but it is needed in many SELECT statements and is executed quite often.
The resulting view, however, is very small (a few dozen results in 2 columns).
All updating actions typically start a transaction, execute many thousands of INSERTs and then commit the transaction. This does not occur very frequently, but when something is written to the database it is usually a large amount of data.
What I tried
As the view is small, does not change frequently and is read often, I thought of creating an indexed view. Sadly, however, you cannot create an indexed view that uses CTEs, let alone recursive CTEs.
To 'emulate' an indexed or materialized view, I thought about writing a trigger that executes the view and stores the results into a table every time one of the base tables is modified. However, I guess this would take forever if a large amount of entries are UPDATEd or INSERTed, because the trigger runs for each INSERT/UPDATE statement on those tables, even if they are inside a single transaction.
Actual question
Is it possible to write a trigger that runs once before committing, after the last insert/update statement of a transaction has finished, and only if any of the statements has changed any of the three tables?

No, there's no direct way to make a trigger that runs right before the end of a transaction. DML Triggers run once per triggering DML statement (INSERT, UPDATE, DELETE), and there's no other kind of trigger related to data modification.
Indirectly, you could have all of your INSERTs go into a temporary table and then INSERT them all together from the #temp table into the real table, resulting in one trigger firing for that table. But if you are writing to multiple tables, you would still have the same problem.
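A minimal sketch of that idea; the table and column names (#NewRows, dbo.BaseTable, Id, Value) are placeholders, not from the question:

-- Stage the individual inserts in a temp table; the base table's trigger does not fire for these.
CREATE TABLE #NewRows (Id int, Value nvarchar(100));

INSERT INTO #NewRows (Id, Value) VALUES (1, N'first');
INSERT INTO #NewRows (Id, Value) VALUES (2, N'second');

-- One statement against the real table, so its trigger fires exactly once.
INSERT INTO dbo.BaseTable (Id, Value)
SELECT Id, Value FROM #NewRows;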
The SOP (Standard Operating Practice) way to address this is to have a stored procedure handle everything up front instead of a Trigger trying to catch everything on the back-side.
If data consistency is important, then I'd recommend that you follow the SOP approach based on a stored procedure that I mentioned above. Here's a high-level outline of this approach (a minimal sketch follows the list):
Use a stored procedure that dumps all of the changes into #temp tables first,
then start a transaction,
then make the changes, moving data/changes from your #temp table(s) into your actual tables,
then do the follow-up work you wanted in a trigger. If these are consistency checks and they fail, you should roll back the transaction.
Otherwise, it then commits the transaction.
This is almost always how something like this is done correctly.
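A minimal sketch of that outline, assuming a small summary table that materializes the expensive view. Every object name here (usp_LoadData, #Staging, dbo.IncomingChanges, dbo.BaseTable1, dbo.ViewSummary, dbo.ExpensiveRecursiveView) is an illustrative assumption, not something from the question:

CREATE PROCEDURE dbo.usp_LoadData
AS
BEGIN
    SET NOCOUNT ON;

    -- 1. Dump all of the incoming changes into #temp tables first.
    SELECT * INTO #Staging FROM dbo.IncomingChanges;

    BEGIN TRY
        -- 2. Start the transaction.
        BEGIN TRANSACTION;

        -- 3. Move the changes from the #temp table(s) into the actual tables.
        INSERT INTO dbo.BaseTable1 SELECT * FROM #Staging;

        -- 4. Do the follow-up work you wanted in a trigger:
        --    refresh the small summary table from the expensive view, once.
        TRUNCATE TABLE dbo.ViewSummary;
        INSERT INTO dbo.ViewSummary SELECT * FROM dbo.ExpensiveRecursiveView;

        -- 5. Otherwise, commit.
        COMMIT TRANSACTION;
    END TRY
    BEGIN CATCH
        -- If the load or a consistency check fails, roll everything back.
        IF @@TRANCOUNT > 0 ROLLBACK TRANSACTION;
        THROW;   -- THROW assumes SQL Server 2012+; use RAISERROR on older versions
    END CATCH
END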

If your view is small and queried frequently and its underlying tables rarely change, you don't need a "view". Instead you need a summary table holding the same result as the view, updated by triggers on each underlying table.
A trigger fires every time you have a data modification (insert, delete or update), but one modification statement fires it only once, whether it updates one record or one million rows. You don't need to worry about the size of an update; instead, the frequency of updates is your concern.
If you have a procedure that periodically inserts a large number of rows, or updates a large number of rows one by one, you can change the procedure to disable the triggers before the update, so the summary table is refreshed only at the end of the procedure, where you call the same "sum" procedure and re-enable those triggers.
If you HAVE TO keep the "summary" up to date all the time, even during a large number of transactions (I doubt that is very helpful or practical if your view is slow to execute), you can disable those triggers and instead do the calculation yourself after each transaction, updating the summary table from within your procedure.
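For instance, a sketch of the bulk-load variant; the trigger and procedure names here (trg_RefreshSummary, usp_RefreshSummary, dbo.BaseTable1) are made up for illustration:

-- Bulk-load path: skip the per-statement trigger, refresh the summary once at the end.
DISABLE TRIGGER trg_RefreshSummary ON dbo.BaseTable1;

-- ... many INSERT/UPDATE statements against dbo.BaseTable1 ...

EXEC dbo.usp_RefreshSummary;   -- rebuild the summary table from the expensive view, once

ENABLE TRIGGER trg_RefreshSummary ON dbo.BaseTable1;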

Related

SQL lazy update operations

We insert SQL entities into a table, one by one. It's easy and fast. After the entity insert, we execute an SP that updates several tables according to the new entity: it updates some calculated fields and some lookup tables that help to find this new entity. This takes a lot of time and sometimes ends up in a deadlock.
Inserting the main entity must be fast and reliable; updating the additional tables does not need to happen immediately. I was wondering (I am not a DB expert) if there is a SQL methodology similar to thread handling in C#: maintain an update "thread" that can be woken when a new entity arrives and that updates the additional tables after the insertion. Doing these updates in one place would avoid the deadlocks.
I can imagine a SQL job which executes every minute, searches for new entities and executes the updates, but it seems too rough to me.
What is the best practice to implement this on MS SQL side?
There are a number of ways you could achieve this. You mention that the two can be done separately and that immediate updating is not important. In that case, you could set up a SQL Agent job to run a stored procedure that checks for missing records and performs the updates.
Another approach would be to put the entire original update inside a stored procedure responsible for performing the update and all the housekeeping work, then all you would do is call the stored procedure with the right parameters and it would do all the work behind the curtain.
Another way would be to add triggers on the inserted table to do the update for you. It sounds like the first option is what you probably want.
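A rough sketch of that first option, assuming an Entity table and an EntityLookup table whose missing rows represent the pending work; every name here (usp_ProcessNewEntities, dbo.Entity, dbo.EntityLookup, CalculatedField) is an assumption:

CREATE PROCEDURE dbo.usp_ProcessNewEntities
AS
BEGIN
    SET NOCOUNT ON;

    -- Find entities whose secondary/lookup rows have not been built yet.
    DECLARE @ids TABLE (Id int PRIMARY KEY);

    INSERT INTO @ids (Id)
    SELECT e.Id
    FROM dbo.Entity AS e
    WHERE NOT EXISTS (SELECT 1 FROM dbo.EntityLookup AS l WHERE l.EntityId = e.Id);

    -- Do the follow-up work in one set-based statement, from a single session,
    -- which avoids the deadlocks caused by per-insert follow-up updates.
    INSERT INTO dbo.EntityLookup (EntityId, CalculatedField)
    SELECT i.Id, 0   -- placeholder for whatever calculation is needed
    FROM @ids AS i;
END

A SQL Agent job scheduled every minute (or whatever interval is acceptable) would then simply EXEC dbo.usp_ProcessNewEntities.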

concurrent SQL statements in different transactions

Reading through the documentation of the PL/SQL CREATE TRIGGER statement in Oracle, I came across the following bit of information:
When a trigger fires, tables that the trigger references might be undergoing changes made by SQL statements in other users' transactions. SQL statements running in triggers follow the same rules that standalone SQL statements do.
It basically says the rules that would apply to two conflicting standalone SQL statements (running at the same time) are unchanged when one of the statements is performed from within a trigger.
So we have some "usual" rules about concurrent transactions and, specifically, the following two of these rules are mentioned:
Queries in the trigger see the current read-consistent materialized view of referenced tables and any data changed in the same transaction.
Updates in the trigger wait for existing data locks to be released before proceeding.
These two rules look obscure to non-expert users.
What do they mean more precisely?
Queries in the trigger see the current read-consistent materialized view of referenced tables and any data changed in the same transaction.
This means that the data the trigger sees, for example when it does a SELECT on a different table, represents the state of that table when the statement started running. The trigger does not see rows that have been changed by other sessions but not yet committed.
Updates in the trigger wait for existing data locks to be released before proceeding.
When an Oracle statement modifies a row, the row is locked against other sessions changing it until the modifying session either commits or rolls back its transaction. So if you do an insert on table A and your trigger does an update on table B, but someone else's session has already done an uncommitted update on that same row of table B, your transaction will wait until they commit or roll back.
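A small two-session illustration of that second rule (tables a and b and their columns are invented for the example):

-- Session 1:
UPDATE b SET counter = counter + 1 WHERE id = 42;   -- this row of b is now locked, not yet committed

-- Session 2:
INSERT INTO a (id) VALUES (1);
-- a's AFTER INSERT trigger runs: UPDATE b SET counter = counter + 1 WHERE id = 42;
-- that UPDATE blocks until session 1 commits or rolls back.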

View Creation using DDL trigger

I am looking back at Oracle (11g) development after a few years, for my team's project, and need help. We are trying to implement a POC where any add/drop of a column will drop and recreate a corresponding view. The view refers to a mapping table for producing its alias names and selecting its columns.
My solutions:
--1. DDL Trigger that scans for Add Column, Drop Column -> Identifies Column Names -> Updates Field_Map table -> Drops View -> Creates View with Field_Map table alias names
Challenge: Received recursive trigger error because of View creation inside DDL
--2. DDL Trigger scans for Add Column, Drop Column -> Updates Field_Map table -> Writes identified column names and tables to Audit_DDL table -> DML trigger on Audit_DDL table fires -> Disables DDL trigger (to avoid recursion) -> Drops view -> Creates view with Field_Map table alias names
Challenge: Received a recursive trigger error. I think it is still considering the whole flow as one transaction. Separating the CREATE VIEW into the DML trigger didn't help.
so, I am thinking of alternatives:
--3. Store the trigger and tables in Schema1 and the view in Schema2. I am expecting this may avoid recursion, since the CREATE VIEW will now happen in Schema2 and the trigger is built on Schema1.
--4. Create a stored procedure which scans the Audit_DDL entries (from #2) for updated tables and columns, creates the views, and marks the processed Audit_DDL entries as checked. An hourly job then runs this procedure.
Any suggestions? Thanks in advance for helping me out!
If you want to do DDL from a trigger, it would need to be asynchronous. The simplest solution would be for the DDL trigger to submit a job using the DBMS_JOB package that would execute whatever DDL you want to do. That job would not run until the triggering transaction (the ALTER statement) committed. But it would probably run a few seconds later (depending on how many other jobs are running, how many jobs are allowed, etc.). Whether you build the DDL statement you want to execute in the trigger and just pass it to the job or whether you store the information the job will need in a table and pass some sort of key (i.e. the object name) and let the job assemble the DDL statement is an implementation detail.
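A rough sketch of that asynchronous pattern; the procedure name recreate_item_view and the filter on TABLE objects are assumptions, while the DBMS_JOB call and the ora_dict_obj_* event attributes are the standard API:

CREATE OR REPLACE TRIGGER trg_after_alter
AFTER ALTER ON SCHEMA
DECLARE
    l_job BINARY_INTEGER;
BEGIN
    IF ora_dict_obj_type = 'TABLE' THEN
        -- The job is queued here but only runs after the triggering ALTER commits.
        DBMS_JOB.SUBMIT(
            job  => l_job,
            what => 'recreate_item_view(''' || ora_dict_obj_name || ''');'
        );
    END IF;
END;
/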
That being said, this seems like a really exceptionally poor architecture. If you are adding or removing a column, that is something that should be going through a proper change control process. If the change is going through change control, it should be easy enough to include the changes to the views in the same script. And applications that depend on the views should be tested as part of the change control process. If the change is not going through change control and columns are being added to or removed from views willy-nilly, you've got much bigger problems in the business process and you're very likely to cause one or more applications to barf in strange and wonderful ways at seemingly obscure points in time.

Trigger on Audit Table failing due to update conflict

I have a number of tables that get updated through my app which return a lot of data or are difficult to query for changes. To get around this problem, I have created a "LastUpdated" table with a single row and have a trigger on these complex tables which just sets GetDate() against the appropriate column in the LastUpdated table:
CREATE TRIGGER [dbo].[trg_ListItem_LastUpdated] ON [dbo].[tblListItem]
FOR INSERT, UPDATE, DELETE
AS
UPDATE LastUpdated SET ListItems = GetDate()
GO
This way, the clients only have to query this table for the last updated value and then can decided whether or not they need to refresh their data from the complex tables. The complex tables are using snapshot isolation to prevent dirty reads.
In busy systems, around once a day we are getting errors writing or updating data in the complex tables due to update conflicts in "LastUpdated". Because this occurs in the statement executed by the trigger, the affected complex table fails to save data. The following error is logged:
Snapshot isolation transaction aborted due to update conflict. You cannot use snapshot isolation to access table 'dbo.tblLastUpdated' directly or indirectly in database 'devDB' to update, delete, or insert the row that has been modified or deleted by another transaction. Retry the transaction or change the isolation level for the update/delete statement.
What should I be doing here in the trigger to prevent this failure? Can I use some kind of query hints on the trigger to avoid this - or can I just ignore errors in the trigger? Updating the data in LastUpdated is not critical, but saving the data correctly into the complex tables is.
This is probably something very simple that I have overlooked or am not aware of. As always, thanks for any info.
I would say that you should look into using Change Tracking (http://msdn.microsoft.com/en-gb/library/cc280462%28v=sql.100%29.aspx), which is lightweight, built-in SQL Server functionality that you can use to monitor the fact that a table has changed, as opposed to logging each individual change (which you can also do with Change Data Capture). It needs snapshot isolation, which you are already using.
Because your trigger runs inside the parent transaction, once your snapshot has become out of date the whole transaction has to start again. If this is a complex workload, maintaining this last-updated data in this way will be costly.
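For reference, a minimal sketch of turning Change Tracking on for the table from the question. The retention settings are illustrative, and the join assumes Id is tblListItem's primary key:

ALTER DATABASE devDB
SET CHANGE_TRACKING = ON (CHANGE_RETENTION = 2 DAYS, AUTO_CLEANUP = ON);

ALTER TABLE dbo.tblListItem
ENABLE CHANGE_TRACKING WITH (TRACK_COLUMNS_UPDATED = OFF);

-- Clients then ask for everything changed since the version they last synced to:
DECLARE @last_sync bigint = 0;   -- the version the client stored after its previous sync
SELECT ct.SYS_CHANGE_VERSION, ct.SYS_CHANGE_OPERATION, li.*
FROM CHANGETABLE(CHANGES dbo.tblListItem, @last_sync) AS ct
LEFT JOIN dbo.tblListItem AS li ON li.Id = ct.Id;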
Short answer - don't do that! Making all the updating transactions dependent on one single shared row makes them prone to deadlocks, update conflicts and a whole gamut of nasty things.
You can either use system DMVs to determine the last update, e.g.:
SELECT
t.name
,user_seeks
,user_scans
,user_lookups
,user_updates
,last_user_seek
,last_user_scan
,last_user_lookup
,last_user_update
FROM sys.dm_db_index_usage_stats i JOIN sys.tables t
ON (t.object_id = i.object_id)
WHERE database_id = db_id()
Or, if you really insist on the solution with LastUpdated, you can implement its update from the trigger in an autonomous transaction. Even though SQL Server doesn't support autonomous transactions natively, it can be done using linked servers: How to create an autonomous transaction in SQL Server 2008
The schema needs to change. If you have to keep your update table, make a row for every table. That would greatly reduce your locking, because each table would update its very own row instead of competing for the sole row in the table.
LastUpdated
table_name (varchar(whatever)) pk
modified_date (datetime)
New Trigger for tblListItem
CREATE TRIGGER [dbo].[trg_ListItem_LastUpdated] ON [dbo].[tblListItem]
FOR INSERT, UPDATE, DELETE
AS
UPDATE LastUpdated SET modified_date = GetDate() WHERE table_name = 'tblListItem'
GO
Another option that I use a lot is having a modified_date column in every table. Then people know exactly which records to update/insert to sync with your data rather than dropping and reloading everything in the table each time one record changes or is inserted.
Alternatively, you can update the log table inside the same transaction which you use to update your complex tables, from your application, and avoid the trigger altogether.
Update
You can also opt for inserting a new row instead of updating the same row in the LastUpdated table. You can then query the max timestamp for the latest update. However, with this approach your LastUpdated table will grow each day, which you need to take care of if the volume of transactions is high.
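A quick sketch of that insert-only variant, reusing the column names from the per-table layout above (the primary key would then have to allow multiple rows per table, e.g. an identity column instead of table_name):

-- In the trigger, append instead of updating in place:
INSERT INTO LastUpdated (table_name, modified_date)
VALUES ('tblListItem', GetDate());

-- Clients read the latest timestamp for a table:
SELECT MAX(modified_date) AS last_modified
FROM LastUpdated
WHERE table_name = 'tblListItem';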

CREATE TRIGGER is taking more than 30 minutes on SQL Server 2005

On our live/production database I'm trying to add a trigger to a table, but have been unsuccessful. I have tried a few times, but it has taken more than 30 minutes for the create trigger statement to complete and I've cancelled it.
The table is one that gets read/written to often by a couple different processes. I have disabled the scheduled jobs that update the table and attempted at times when there is less activity on the table, but I'm not able to stop everything that accesses the table.
I do not believe there is a problem with the create trigger statement itself. The create trigger statement was successful and quick in a test environment, and the trigger works correctly when rows are inserted/updated in the table. However, when I created the trigger on the test database there was no load on the table and it had considerably fewer rows, which is different from the live/production database (100 vs. 13,000,000+).
Here is the create trigger statement that I'm trying to run
CREATE TRIGGER [OnItem_Updated]
ON [Item]
AFTER UPDATE
AS
BEGIN
SET NOCOUNT ON;
IF update(State)
BEGIN
/* do some stuff including for each row updated call a stored
procedure that increments a value in table based on the
UserId of the updated row */
END
END
Can there be issues with creating a trigger on a table while rows are being updated or if it has many rows?
In SQL Server, triggers are created enabled by default. Is it possible to create the trigger disabled by default?
Any other ideas?
The problem may not be in the table itself, but in the system tables that have to be updated in order to create the trigger. If you're doing any other kind of DDL as part of your normal processes they could be holding it up.
Use sp_who to find out where the block is coming from then investigate from there.
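For example, from a second session while the CREATE TRIGGER is hanging (sp_who and sys.dm_exec_requests are standard on SQL Server 2005 and later; nothing here is specific to your schema):

EXEC sp_who;

-- Or, more targeted:
SELECT session_id, blocking_session_id, wait_type, wait_resource, command
FROM sys.dm_exec_requests
WHERE blocking_session_id <> 0;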
I believe the CREATE TRIGGER will attempt to put a lock on the entire table.
If you have a lot of activity on that table, it might have to wait a long time, and you could be creating a deadlock.
For any schema changes you should really get everyone off the database.
That said it is tempting to put in "small" changes with active connections. You should take a look at the locks / connections to see where the lock contention is.
That's odd. An AFTER UPDATE trigger shouldn't need to check existing rows in the table. I suppose it's possible that you aren't able to obtain a lock on the table to add the trigger.
You might try creating a trigger that basically does nothing. If you can't create that, then it's a locking issue. If you can, then you could disable that trigger, add your intended code to the body, and enable it. (I do not believe you can disable a trigger during creation.)
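A sketch of that sequence, using the names from the question:

-- 1. Create a trigger that does nothing; if even this blocks, it is a locking issue.
CREATE TRIGGER [OnItem_Updated] ON [Item] AFTER UPDATE AS RETURN;
GO

-- 2. Disable it, put the real logic into the body, then re-enable it.
DISABLE TRIGGER [OnItem_Updated] ON [Item];
GO
ALTER TRIGGER [OnItem_Updated] ON [Item]
AFTER UPDATE
AS
BEGIN
    SET NOCOUNT ON;
    IF UPDATE(State)
    BEGIN
        -- intended logic goes here
        RETURN;
    END
END
GO
ENABLE TRIGGER [OnItem_Updated] ON [Item];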
Part of the problem may also be the trigger itself. Could your trigger accidentally be updating all rows of the table? There is a big difference between 100 rows in a test database and 13,000,000. It is a very bad idea to develop code against such a small data set when you have such a large dataset, as you have no way to predict performance. SQL that works fine for 100 records can completely lock up a system with millions of rows for hours. You really want to know that in dev, not when you promote to prod.
Calling a stored proc in a trigger is usually a very bad choice. It also means that you have to loop through records, which is an even worse choice in a trigger. Triggers must always account for multiple-record inserts, updates or deletes. If someone inserts 100,000 rows (not unlikely if you have 13,000,000 records), then looping through a record-based stored proc could take hours, lock the entire table and cause all users to want to hunt down the developer and kill (or at least maim) him because they cannot get their work done.
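As a hedged sketch of the set-based alternative, assuming the stored procedure was incrementing a per-user counter; the UserCounters table and its columns are invented for the example:

CREATE TRIGGER [OnItem_Updated]
ON [Item]
AFTER UPDATE
AS
BEGIN
    SET NOCOUNT ON;
    IF UPDATE(State)
    BEGIN
        -- One UPDATE handles every modified row, whether it is 1 or 100,000.
        UPDATE uc
        SET uc.StateChangeCount = uc.StateChangeCount + i.ChangeCount
        FROM dbo.UserCounters AS uc
        JOIN (SELECT UserId, COUNT(*) AS ChangeCount
              FROM inserted
              GROUP BY UserId) AS i
            ON i.UserId = uc.UserId;
    END
END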
I would not even consider putting this trigger on prod until you test it against a record set similar in size to prod.
My friend Dennis wrote this article that illustrates why testing against a small volume of information when you have a large volume of information can create difficulties in prod that you didn't notice in dev:
http://blogs.lessthandot.com/index.php/DataMgmt/?blog=3&title=your-testbed-has-to-have-the-same-volume&disp=single&more=1&c=1&tb=1&pb=1#c1210
Run DISABLE TRIGGER triggername ON tablename before altering the trigger, then re-enable it with ENABLE TRIGGER triggername ON tablename.