Do triggers take a lot of computation? - sql

When implementing "triggers", is it required to check the trigger's criteria/condition WHENEVER a change is made? If so, can someone have a large number of triggers?
Or else, how is it done?

Triggers add overhead to the operation they are working on. As with any other stored code, triggers can be made very complex and time consuming. You generally want to avoid those.
Part of the issue with triggers is that they often hold locks on tables/rows. This can slow down other parts of the workload.
The number of triggers depends on the database. Some limit to a single trigger per action per table.
In general, it is considered good practice to use other mechanisms when they suffice -- for instance, NOT NULL constraints, DEFAULT values, and CHECK constraints.
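For instance, all three of the following do work that people sometimes reach for a trigger to do, with no trigger at all (the Orders table here is made up purely for illustration):

```sql
-- Declarative constraints covering cases often handled in triggers.
CREATE TABLE dbo.Orders (
    OrderId   INT IDENTITY(1,1) PRIMARY KEY,
    CreatedAt DATETIME    NOT NULL DEFAULT (GETDATE()),   -- DEFAULT instead of an insert trigger
    Quantity  INT         NOT NULL CHECK (Quantity > 0),  -- CHECK instead of validation in a trigger
    Status    VARCHAR(20) NOT NULL
        CONSTRAINT CK_Orders_Status CHECK (Status IN ('NEW', 'PAID', 'SHIPPED'))
);
```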
I tend to avoid triggers because they are hard to maintain. But there are well-designed, highly performant systems that do use them.

Related

Concurrent issues in SQL Server

I have a set of validations that decide whether a record is inserted into the database with a valid status code. The issue we are facing is that many users make requests at the same time; in the middle of one transaction, another transaction comes in and both rows get inserted with valid status, which shouldn't happen. It should return an error that the record already exists, which can easily be handled with a simple query, but in specific scenarios we do allow duplicates to be inserted. I have tried sp_getapplock, which solves my problem but compromises performance big time. Are there more optimal ways to handle concurrent requests?
Thanks.
sp_getapplock is pretty much the beefiest and most arbitrary lock you can take. It functions more like the lock keyword does in OO programming: you name a resource, give it a scope (session or transaction), then lock it. Pretty much nothing can bypass that lock, which is why it has solved your race conditions. It's also probably massive overkill for what you're trying to do.
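For reference, the pattern looks roughly like this (the resource name and timeout are placeholders): every caller queues up behind the one named lock, which is exactly why it fixes the race and why it hurts throughput.

```sql
BEGIN TRAN;

DECLARE @lockResult INT;
EXEC @lockResult = sp_getapplock
    @Resource    = 'InsertValidStatusRecord',  -- any name you choose for the resource
    @LockMode    = 'Exclusive',
    @LockOwner   = 'Transaction',              -- released automatically when the transaction ends
    @LockTimeout = 5000;                       -- milliseconds to wait before giving up

IF @lockResult >= 0                            -- 0 or 1 means the lock was granted
BEGIN
    -- ... duplicate check and insert go here ...
    COMMIT TRAN;
END
ELSE
    ROLLBACK TRAN;
```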
The first code/architecture idea that comes to mind is to restructure this table. I'm going to assume you have high update volumes or you wouldn't be running into these violations. You could simply use a try/catch block, and have the catch block retry on a PK violation. Clumsy, but might just do the trick.
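A rough sketch of that retry idea, assuming a hypothetical StatusRecords table with a unique key on the record (not your actual schema):

```sql
DECLARE @RecordKey INT = 42;   -- in practice this would be a parameter of the insert proc
DECLARE @retries   INT = 3;

WHILE @retries > 0
BEGIN
    BEGIN TRY
        INSERT INTO dbo.StatusRecords (RecordKey, StatusCode)
        VALUES (@RecordKey, 'VALID');
        BREAK;                              -- success, stop retrying
    END TRY
    BEGIN CATCH
        IF ERROR_NUMBER() IN (2601, 2627)   -- unique index / primary key violation
            SET @retries = @retries - 1;    -- duplicate: re-check and retry, or bail out
        ELSE
            THROW;                          -- anything else: re-raise (SQL Server 2012+)
    END CATCH;
END;
```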
Next, you could consider altering the structure of the table which receives this stream of updates throughout the day. Give it a primary key on an identity column and pretty much nothing else. Inserts will be lightning fast, so any blocking will be negligible. You can then move this data in batches into a table better suited for batch processing (as opposed to trying to batch-process in real time).
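Something along these lines, with invented names, just to show the shape of it:

```sql
-- Narrow, append-only staging table: inserts barely block anything.
CREATE TABLE dbo.IncomingRecords (
    Id         BIGINT IDENTITY(1,1) PRIMARY KEY,
    Payload    NVARCHAR(400) NOT NULL,
    ReceivedAt DATETIME      NOT NULL DEFAULT (GETDATE())
);

-- A scheduled job then moves rows into the "real" table in batches,
-- applying the duplicate rules there, outside the hot insert path.
INSERT INTO dbo.StatusRecords (Payload, StatusCode)
SELECT i.Payload, 'VALID'
FROM dbo.IncomingRecords AS i
WHERE NOT EXISTS (SELECT 1 FROM dbo.StatusRecords AS s WHERE s.Payload = i.Payload);

DELETE FROM dbo.IncomingRecords;   -- or mark the rows as processed instead
```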
There is also a whole range of transaction isolation settings which adjust SQL Server's regular locking behaviour to support different variants (either at the session level or inline via query hints). I'd read up on those, but you might consider looking at SERIALIZABLE isolation. Various settings will enforce different runtime rules to fit your needs.
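For example (table and key names are placeholders again): the classic shape is to do the existence check and the insert inside one transaction and hold the range lock across both, either by raising the isolation level, with query hints, or both, as below.

```sql
DECLARE @RecordKey INT;
SET @RecordKey = 42;   -- placeholder value

SET TRANSACTION ISOLATION LEVEL SERIALIZABLE;
BEGIN TRAN;

IF NOT EXISTS (
    SELECT 1
    FROM dbo.StatusRecords WITH (UPDLOCK, HOLDLOCK)  -- hold a range lock so a second caller waits here
    WHERE RecordKey = @RecordKey
)
    INSERT INTO dbo.StatusRecords (RecordKey, StatusCode)
    VALUES (@RecordKey, 'VALID');

COMMIT TRAN;
```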
Also be sure to check your transactions. You probably want to lock the hell out of this table (and potentially during some other action), but once that need is gone, the lock should go too.

Trigger for updated date column vs explicitly setting it

I'm wondering if using a trigger to set the updated date column of a table (or all tables) is considered a better practice versus having the application explicitly set it. I realize this could devolve into a debate over preferences and design patterns, so in an effort to parameterize the question a bit - I'd like to get a take from a best practices stand point. Taking separation of concerns into consideration (keeping business logic out of the database type of thing) as well as making sure database columns have what is intended within them (actually updating the "last modified date" column), I'm more inclined to let the database handle this via a trigger. That said, I also tend to shy away from triggers since they tend to hide the consequences of an action in a database. I'm hoping that the many smarter people here on SO have a more concrete thought than mine.
It rather depends on what you want to do. If you want to know when any column has changed, then a trigger would be the most suitable. It would be a bit of a drag to have to change code every time you added a new column.
However, if you were only interested in certain columns, you might simply wish to handle this in the update SQL.
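That is, just stamp the column in the statement itself; a minimal sketch with made-up table and column names:

```sql
DECLARE @CustomerId INT = 1, @Name NVARCHAR(100) = N'New name';  -- placeholders for proc parameters

UPDATE dbo.Customers
SET Name = @Name,
    LastModified = GETDATE()   -- stamped explicitly, no trigger involved
WHERE CustomerId = @CustomerId;
```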
There are of course shades in between - you could choose to handle only certain columns in the trigger.
Another consideration is how many pieces of code can update a given table. You might have one all-singing, all-dancing piece of update code, but it is perfectly possible the updates are spread across several places.
One word of caution - triggers tend to be the last place you consider when tracking an issue. For this sake (and this is a personal preference), I tend to avoid them unless they're absolutely necessary.
Apart from performance tuning, the issue with triggers is that if they are disabled, you will never know it, and transactions will still be committed. This could cause a major problem if the updated date is important to your business logic (for instance for sync processes or for finance). I prefer to handle all business-related data explicitly within the procedures, and use triggers mainly for auditing.

Multiple triggers vs a single trigger

Scenario:
Each time data is inserted/updated/deleted into/in/from a table, up to 3 things need to happen:
The data needs to be logged to a separate table
Referential integrity must be enforced on implicitly related data (data that should be linked with a foreign key relationship but isn't: e.g. when updating Table1.Name, Table2.Name should also be updated to the same value)
Arbitrary business logic needs to execute
The architecture and schema of the database must not be changed and the requirements must be accomplished by using triggers.
Question
Which option is better?:
A single trigger per operation (insert/update/delete) that handles multiple concerns (logs, enforces implicit referential integrity, and executes arbitrary business logic). This trigger could be named D_TableName ("D" for delete).
Multiple triggers per operation that were segregated by concern. They could be named:
D_TableName_Logging - for logging when something is deleted from the table
D_TableName_RI
D_TableName_BL
I prefer option 2 because a single unit of code has a single concern. I am not a DBA, and know enough about SQL Server to make me dangerous.
Are there any compelling reasons to handle all of the concerns in a single trigger?
Wow, you are in a no-win situation. Whoever requested that all this stuff be done via triggers should be shot and then fired. Enforcing RI via triggers?
You said the architecture and schema of the database must not be changed. However, by creating triggers, you are, at the very least, changing the schema of the database, and, it could be argued, the architecture.
I would probably go with option #1 and create additional stored procs and UDFs that take care of logging, BL, and RI so that code is not duplicated among the individual triggers (the triggers would call these stored procs and/or UDFs). I really don't like naming the triggers the way you proposed in option 2.
BTW, please tell someone at your organization that this is insane. RI should not be enforced via triggers and business logic DOES NOT belong in the database.
Doing it all in one trigger might be more efficient in that you can possibly end up with fewer operations against the (unindexed) inserted and deleted tables.
Also, when you have multiple triggers, it is possible to set the first and last one that fires, but any others will fire in arbitrary order, so you can't control the sequence of events deterministically if you have more than 3 triggers for a particular action.
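For what it's worth, the first/last ordering is set with sp_settriggerorder; using the trigger names from the question:

```sql
-- Pin the first and last triggers for DELETE; anything in between fires in arbitrary order.
EXEC sp_settriggerorder
    @triggername = 'D_TableName_Logging',
    @order       = 'First',
    @stmttype    = 'DELETE';

EXEC sp_settriggerorder
    @triggername = 'D_TableName_BL',
    @order       = 'Last',
    @stmttype    = 'DELETE';
```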
If neither of those considerations apply then it's just a matter of preference.
Of course it goes without saying that the specification to do this with triggers sucks.
I agree with @RandyMinder. However, I would go one step further. Triggers are not the right way to approach this situation. The logic that you describe is too complicated for the trigger mechanism.
You should wrap inserts/updates/deletes in stored procedures. These stored procedures can manage the business logic and logging and so on. Also, they make it obvious what is happening. A chain of stored procedures calling stored procedures is explicit. A chain of triggers calling triggers is determined by insert/update/delete statements that do not make the call to the trigger explicit.
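As a sketch of what that might look like for the Table1/Table2 example from the question (the ChangeLog table and the Table1Id column are assumptions on my part, and error handling is omitted for brevity):

```sql
CREATE PROCEDURE dbo.Table1_UpdateName
    @Id      INT,
    @NewName NVARCHAR(100)
AS
BEGIN
    SET NOCOUNT ON;
    BEGIN TRAN;

    UPDATE dbo.Table1 SET Name = @NewName WHERE Id = @Id;

    -- the "implicit" relationship from the question, kept in sync explicitly and visibly
    UPDATE dbo.Table2 SET Name = @NewName WHERE Table1Id = @Id;

    -- logging
    INSERT INTO dbo.ChangeLog (TableName, RowId, Action, LoggedAt)
    VALUES ('Table1', @Id, 'UPDATE', GETDATE());

    COMMIT TRAN;
END;
```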
The problem with triggers is that they introduce dependencies and locking among disparate tables, and it can be a nightmare to disentangle the dependencies. Similarly, it can be a nightmare to determine performance bottlenecks when the problem may be located in a trigger calling a trigger calling a stored procedure calling a trigger.
If you are using Microsoft SQL Server and you're able to modify the code performing the DML statements, you could use the OUTPUT clause to capture the updated, inserted, and deleted values into temporary tables or table variables instead of using triggers. This keeps the performance penalty to a minimum.
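Something like this, with placeholder names; the OUTPUT clause captures the before/after values right in the DML statement:

```sql
DECLARE @audit TABLE (Id INT, OldName NVARCHAR(100), NewName NVARCHAR(100));

UPDATE dbo.Table1
SET Name = N'New value'
OUTPUT deleted.Id, deleted.Name, inserted.Name
INTO @audit (Id, OldName, NewName)
WHERE Id = 1;

-- @audit now holds the before/after values and can be written to a log table.
INSERT INTO dbo.ChangeLog (RowId, OldName, NewName)
SELECT Id, OldName, NewName FROM @audit;
```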

Is it a good idea to use a trigger to record a time a row is updated?

I hate triggers. I've been tripped up by them way too many times. But I'd like to know whether it's better to specify the time every time a row is updated, or to let a trigger take care of it to keep code to a minimum?
Just to clarify, the table the trigger would be on would have a column called something like LastModified
The particular scenario I'm dealing with is I am one of the developers who uses a database with about 400 stored procedures. There are about 20 tables that would have this LastModified column. These tables would be updated by about 10 different stored procedures each.
Triggers can definitely be a huge problem, especially if there are multiple layers of them. It makes debugging, performance tuning, and understanding the data logic almost impossible.
But if you keep your design to a single layer (one trigger per table), and it is used solely for auditing (i.e. setting the updated time), I don't think that'll be a big problem at all.
Equally, if you are using stored procedures as the gateway to your tables and views, I think it would make just as much sense (and be a lot easier to remember and look back on) to have your stored procedure set the current datetime stamp. I think that's a great design.
But if you're using ad hoc queries, and the datetime field is NOT NULL, it will be a burden to remember to supply the current datetime every time. Obviously this won't be a problem with either of the two aforementioned ideas, the stored procedure or the trigger.
So I think in this situation it should be personal preference (as long as you don't build a chain of triggers that turns into spaghetti).
You can always use a TIMESTAMP column (in MySQL). If it's MSSQL, just note that the timestamp (rowversion) column is a row version number, not a DateTime data type.
Generally, I avoid triggers like the plague (close to on par with cursors) and so always just update the column manually. To me, it helps reinforce the business logic. But in truth, "better" is a personal opinion.
Generally, triggers are most useful as a single choke point when you have a table that needs certain business logic applied on every update, and the updates come from different sources.
For example, suppose you have PHP and Python apps that both use the database. You could try to remember to update the Python code every time you update the PHP, and vice versa, and hope that they both work the same. Or you could use a trigger, so that no matter which client connects, the same thing happens to the table.
In most other cases triggers are a bad idea and will cause more pain than they are worth.
It's true that triggers can often be a source of confusion and frustration when debugging, especially when there are cascading foreign key updates.
But they do have their use.
In this case, you have a choice between updating some 400 stored procedures to set the date manually, or using a trigger. If you miss one of those stored procs, the field is as good as useless.
Also, while triggers 'hide' functionality, the same can be said of explicitly updating the column in a stored procedure: when someone writes a new stored procedure, you need to document that the field should be updated.
Personally, if it's critical that the field is up to date, I would use a trigger. Use an INSTEAD OF trigger, so that instead of doing the update and then having a trigger fire afterwards, you are simply overriding the insert/update statement.
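For reference, here is a minimal sketch of the simpler AFTER UPDATE variant of that idea (the answer above suggests INSTEAD OF; the AFTER form is the more common shape and easier to show). The Customers table and its key column are invented; only the LastModified column name comes from the question. Recursive triggers are off by default, so the trigger's own UPDATE won't re-fire it.

```sql
CREATE TRIGGER dbo.TR_Customers_SetLastModified
ON dbo.Customers
AFTER UPDATE
AS
BEGIN
    SET NOCOUNT ON;

    -- Stamp LastModified on exactly the rows that were just updated.
    UPDATE c
    SET LastModified = GETDATE()
    FROM dbo.Customers AS c
    INNER JOIN inserted AS i ON i.CustomerId = c.CustomerId;
END;
```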

Do I really need to use transactions in stored procedures? [MSSQL 2005]

I'm writing a pretty straightforward e-commerce app in asp.net, do I need to use transactions in my stored procedures?
Read/Write ratio is about 9:1
Many people ask - do I need transactions? Why do I need them? When to use them?
The answer is simple: use them all the time, unless you have a very good reason not to (for instance, don't use atomic transactions for "long running activities" between businesses). The default should always be yes. In doubt? Use transactions.
Why are transactions beneficial? They help you deal with crashes, failures, data consistency, error handling, they help you write simpler code etc. And the list of benefits will continue to grow with time.
Here is some more info from http://blogs.msdn.com/florinlazar/
Remember that in SQL Server all single-statement CRUD operations run in an implicit transaction by default. You only need to turn on explicit transactions (BEGIN TRAN) if you need multiple statements to act as an atomic unit.
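A minimal illustration of the difference, with made-up e-commerce tables: each statement below is already atomic on its own, but only BEGIN TRAN makes the pair atomic.

```sql
DECLARE @ProductId INT, @OrderId INT;
SET @ProductId = 10;   -- placeholder values; in practice these would be proc parameters
SET @OrderId   = 500;

BEGIN TRAN;

UPDATE dbo.Inventory
SET Quantity = Quantity - 1
WHERE ProductId = @ProductId;

INSERT INTO dbo.OrderLines (OrderId, ProductId, Quantity)
VALUES (@OrderId, @ProductId, 1);

COMMIT TRAN;   -- both changes become visible together, or neither does
```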
The answer is, it depends. You do not always need transaction safety. Sometimes it's overkill. Sometimes it's not.
I can see that, for example, when you implement a checkout process you only want to finalize it once you have gathered all the data, etc. Think about a payment going wrong: you can roll back. That's an example of when you need a transaction, or at least when it's wise to use one.
Do you need a transaction when you create a new user account? Maybe, if it's across 10 tables (for whatever reason), if it's just a single table then probably not.
It also depends on what you sold your client on, who they are, whether they requested it, etc. But if the decision is up to you, then I'd say: choose wisely.
My bottom line is: avoid premature optimization. Build your application, and keep in mind that you can go back and refactor/optimize later when you need to. Look at a couple of open-source projects and see how they implemented different parts of their apps, and learn from that. You'll see that most of them don't use transactions at all, yet there are huge online stores that do use them.
Of course, it depends.
It depends upon the work that the particular stored procedure performs and, perhaps, not so much on the "read/write ratio" that you suggest. In general, you should consider enclosing a unit of work within a transaction if it is a query that could be impacted by some other, simultaneously running query. If this sounds nondeterministic, it is. It is often difficult to predict under what circumstances a particular unit of work qualifies as a candidate for this.
A good place to start is to review the precise CRUD being performed within the unit of work, in this case within your stored procedure, and decide whether it (a) could be affected by some other, simultaneous operation and (b) whether that other work matters to the end result of this work (or, even, vice versa). If the answer is "yes" to both, then consider wrapping the unit of work within a transaction.
What this is suggesting is that you can't always simply decide to either use or not use transactions, rather you should apply them when it makes sense. Use the properties defined by ACID (Atomicity, Consistency, Isolation, and Durability) to help decide when this might be the case.
One other thing to consider is that in some circumstances, particularly if the system must perform many operations in quick succession, e.g., a high-volume transaction processing application, you might need to weigh the relative performance cost of the transaction. Depending upon the size of the unit of work, a commit (or rollback) of a transaction can be resource expensive, perhaps negatively impacting the performance of your system unnecessarily or, at least, with limited benefit.
Unfortunately, this is not an easy question to precisely answer: "It depends."
Use them if:
There are some errors that you may want to test for and catch which won't be caught except by you going out and doing the work (looking things up, testing values, etc.), usually from within a transaction so that you can roll back the whole operation.
There are multi-step operations of any sort, which should, logically, be rolled back as a group if they fail (see the sketch below).
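A sketch of that second case, with invented object names and placeholder values, kept to SQL Server 2005-compatible syntax since that's what you're on:

```sql
DECLARE @CustomerId INT;
SET @CustomerId = 42;   -- placeholder; in practice a proc parameter

BEGIN TRY
    BEGIN TRAN;

    DECLARE @OrderId INT;
    INSERT INTO dbo.Orders (CustomerId, CreatedAt) VALUES (@CustomerId, GETDATE());
    SET @OrderId = SCOPE_IDENTITY();

    INSERT INTO dbo.OrderLines (OrderId, ProductId, Quantity)
    VALUES (@OrderId, 10, 1);                              -- placeholder line item

    UPDATE dbo.Inventory
    SET Quantity = Quantity - 1
    WHERE ProductId = 10;

    COMMIT TRAN;
END TRY
BEGIN CATCH
    IF @@TRANCOUNT > 0 ROLLBACK TRAN;                      -- undo the whole group on any failure

    DECLARE @msg NVARCHAR(2048);
    SET @msg = ERROR_MESSAGE();
    RAISERROR(@msg, 16, 1);                                -- surface the error to the caller
END CATCH;
```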