I have an application, written by another team in our company, that inserts data into one table.
Let's say they write data into table Log1 with fields:
Id (auto-generated primary key);
KeyId;
Value1;
Value2;
Value3.
Now I need an additional record in another table (Log2) for each of their inserts, containing only part of their data:
Id (it will be my own auto-generated Id);
KeyId;
Value1.
I see 2 ways to do that:
Create a trigger that, on inserting records into Log1, automatically creates a record in Log2 with the required data;
Implement a stored procedure that accepts all the required data for the Log1 table and creates records in both tables, then ask the application's authors to use the SP instead of a direct INSERT query.
What do you think is the best way in this case and why?
Thank you very much for your help.
P.S. I'm using MS SQL 2005
Go with option 1.
It means that the tables will stay synchronised properly even if the "correct" stored procedure interface isn't used, and it will be easier and more efficient to insert multiple rows. (How would you do that with a stored procedure in SQL Server 2005 - call it multiple times? Convert all the data to XML format first?)
If you use a trigger, be aware that since both Log1 and Log2 appear to use identity columns, you can't use SELECT @@IDENTITY to return the PK of Log1 - you will need to use SCOPE_IDENTITY().
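For illustration, a minimal version of such a trigger might look like this (a sketch using the table and column names from the question):

CREATE TRIGGER trg_Log1_Insert ON dbo.Log1
AFTER INSERT
AS
BEGIN
    SET NOCOUNT ON;
    -- inserted may contain many rows, so copy them all in one set-based statement
    INSERT INTO dbo.Log2 (KeyId, Value1)
    SELECT KeyId, Value1
    FROM inserted;
END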
On the other hand, if you use a SPROC, what you can do is revoke INSERT privileges on your table from (just about) everyone, and instead grant EXECUTE on your SPROC. This way access to your table should be fairly well guarded.
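As a sketch, locking the table down could look something like this (the role and procedure names here are invented):

REVOKE INSERT ON dbo.Log1 FROM AppRole;
GRANT EXECUTE ON dbo.usp_InsertLog TO AppRole;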
The only way to really guarantee your data integrity is with a trigger. There is always a chance that someone will execute an operation (a bulk operation, an ad hoc SQL INSERT statement, etc.) that bypasses your SP.
Go with option 2.
Triggers should be avoided whenever possible.
One not-so-obvious reason: have you ever used SQL Server's replication facilities? Triggers are not very straightforward to replicate (i.e., it is not as easy as a couple of clicks, like it is for tables, for instance). But I'm going off topic... bottom line, triggers are evil... avoid them when you can.
EDIT
More reasons: triggers are not as easy to see as other objects in the DBMS. On the application side they are invisible, and if not well documented, they tend to be forgotten. If there are changes to the schema... oh well, it's just easier to maintain stuff with stored procedures.
Related
How do I debug a complex query with multiple nested sub-queries in SQL Server 2005?
I'm debugging a stored procedure and trigger in Visual Studio 2005. I'd like to be able to see what the results of these sub-queries are, as I feel that this is where the bug is coming from. An example query (slightly redacted) is below:
UPDATE
foo
SET
DateUpdated = ( SELECT TOP 1 inserted.DateUpdated FROM inserted )
...
FROM
tblEP ep
JOIN tblED ed ON ep.EnrollmentID = ed.EnrollmentID
WHERE
ProgramPhaseID = ( SELECT ...)
Visual Studio doesn't seem to offer a way for me to Watch the result of the sub query. Also, if I use a temporary table to store the results (temporary tables are used elsewhere also) I can't view the values stored in that table.
Is there any way that I can add a watch, or in some other way view these sub-queries? I would love it if there were some way to "Step Into" the query itself, but I imagine that isn't possible.
OK, first, I would be leery of using subqueries in a trigger. Triggers should be as fast as possible, so get rid of any correlated subqueries, which might run row by row instead of in a set-based fashion; rewrite them as joins. If you only want to update records based on what is in the inserted table, then join to it, and also join to the table you are updating. What exactly are you trying to accomplish with this trigger? It would be easier to give advice if we understood the business rule you are trying to implement.
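For instance, instead of the uncorrelated SELECT TOP 1 in the example, you could join directly to inserted, something like this (a sketch only - the join key is an assumption, since the query was redacted):

UPDATE ep
SET DateUpdated = i.DateUpdated
FROM tblEP ep
JOIN tblED ed ON ep.EnrollmentID = ed.EnrollmentID
JOIN inserted i ON i.EnrollmentID = ep.EnrollmentID

plus whatever WHERE conditions the original query had.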
To debug a trigger, this is what I do.
I write a script to:
Do the actual insert to the table without the trigger on it;
Create a temp table named #inserted (and/or one named #deleted);
Populate the temp table as I would expect the inserted table in the trigger to be populated by the insert;
Add the trigger code (minus the CREATE or ALTER TRIGGER parts), substituting #inserted everywhere it references inserted (if you plan to run it multiple times before you are ready to use it in a trigger, wrap it in an explicit transaction and roll back after checking your results);
Add a query to check the table(s) the trigger changes for the values you wanted to change.
Now, if you need to add debug statements to see what is happening between steps, you can do so. Run it, making changes, until you get the results you want. Once the query works as you expect it to, it is easy to take the # signs off inserted and use it to create the body of the trigger. A sketch of the whole technique follows.
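Here is a minimal sketch of that script, using the Log1/Log2 tables from the first question as stand-ins (the names and values are illustrative only):

-- Simulate the inserted pseudo-table
SELECT * INTO #inserted FROM dbo.Log1 WHERE 1 = 0; -- copy the structure only
INSERT INTO #inserted (KeyId, Value1, Value2, Value3)
VALUES (42, 'a', 'b', 'c');

BEGIN TRANSACTION;

-- The trigger body, with #inserted substituted for inserted
INSERT INTO dbo.Log2 (KeyId, Value1)
SELECT KeyId, Value1 FROM #inserted;

-- Check the results, then roll back so the script can be rerun
SELECT * FROM dbo.Log2 WHERE KeyId = 42;

ROLLBACK TRANSACTION;
DROP TABLE #inserted;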
This is what I usually do in this type of scenario:
Print out the exact SQL generated by each subquery;
Then run each of them in Management Studio, as suggested above.
You should check that each part is returning the data you expect.
I have read about temporary tables, global temporary tables, and table variables. I understand them, but cannot imagine a situation where I would need them. Please elaborate on when I should use a temporary table.
Most common scenario for using temporary tables is from within a stored procedure.
If there is logic inside a stored procedure which involves manipulation of data that cannot be done within a single query, then in such cases, the output of one query / intermediate results can be stored in a temporary table which then participates in further manipulation via joins etc to achieve the final result.
One common scenario for using temporary tables is to store the results of a SELECT INTO statement.
The table variable is relatively new (introduced in SQL Server 2000, if I remember correctly) and can be used instead of the temp table in most cases. Some differences between the two are discussed here.
In a lot of cases, especially in OLTP applications, usage of temporary tables within your procedures means that you MAY possibly have business processing logic in your database and might be a consideration for you to re-look your design - especially in case of n tier systems having a separate business layer in their application.
The main difference between the three is a matter of lifetime and scope.
By a global table, I am assuming you mean a standard, run-of-the-mill table. Tables are used for storing persistent data. They are accessible to all logged-in users. Any changes you make are visible to other users, and vice versa.
A temporary table exists solely for storing data within a session. The best time to use temporary tables is when you need to store information within SQL Server for use over a number of SQL transactions. Like a normal table, you'll create it, interact with it (insert/update/delete), and when you are done, you'll drop it. There are two differences between a table and a temporary table.
The temporary table is only visible to you. Even if someone else creates a temporary table with the same name, no one else will be able to see or affect your temporary table.
The temporary table exists for as long as you are logged in, unless you explicitly drop it. If you log out or are disconnected SQL Server will automatically clean it up for you. This also means the data is not persistent. If you create a temporary table in one session and log out, it will not be there when you log back in.
A table variable works like any other variable within SQL Server. It is used for storing data within a single batch. It is a relatively new feature of T-SQL and is generally used for passing data between procedures - like passing an array. There are three differences between a table and a table variable.
Like a temporary table, it is only visible to you.
Because it is a variable, it can be passed around between stored procedures.
The table variable only exists within the current batch. Once SQL Server finishes the batch (at a GO statement) or the variable goes out of scope, it is deallocated (see the sketch below).
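A quick illustration of the scope difference (a sketch):

-- Temporary table: visible for the whole session, until dropped or you disconnect
CREATE TABLE #Totals (CustomerID int, Total money);
INSERT INTO #Totals VALUES (1, 100.00);
SELECT * FROM #Totals;
DROP TABLE #Totals;
GO
-- Table variable: gone as soon as the batch ends
DECLARE @Totals TABLE (CustomerID int, Total money);
INSERT INTO @Totals VALUES (1, 100.00);
SELECT * FROM @Totals;
GO
-- @Totals no longer exists here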
I personally avoid using temporary tables and table variables, for a few reasons. First, the syntax for them is Microsoft-specific: if your program is going to interact with more than one RDBMS, don't use them. Also, temporary tables and table variables have a tendency to increase the complexity of SQL queries. If the same result can be accomplished with a simpler method, I'd recommend going with simple.
I am using a function in a stored procedure. The procedure contains a transaction and updates and inserts values in a table, while the function called in the procedure also fetches data from the same table.
The procedure hangs when the function is called.
Is there any solution for this?
If I'm hearing you right, you're talking about an insert BLOCKING ITSELF, not two separate queries blocking each other.
We had a similar problem, an SSIS package was trying to insert a bunch of data into a table, but was trying to make sure those rows didn't already exist. The existing code was something like (vastly simplified):
INSERT INTO bigtable
SELECT r.customerid, r.productid, ...
FROM rawtable r
WHERE NOT EXISTS (SELECT 1 FROM bigtable b
                  WHERE b.customerid = r.customerid
                    AND b.productid = r.productid)
AND ... (other conditions)
This ended up blocking itself because the select on the WHERE NOT EXISTS was preventing the INSERT from occurring.
We considered a few different options, I'll let you decide which approach works for you:
Change the transaction isolation level (see this MSDN article). Our SSIS package defaulted to SERIALIZABLE, which is the most restrictive. (Note: be aware of the issues with READ UNCOMMITTED or NOLOCK before you choose this option.)
Create a UNIQUE index with IGNORE_DUP_KEY = ON. This means we can insert ALL rows (and remove the "WHERE NOT EXISTS" clause altogether). Duplicates will be rejected, but the batch won't fail completely, and all other valid rows will still insert.
Change your query logic to do something like put all candidate rows into a temp table, then delete all rows that are already in the destination, then insert the rest.
In our case, we already had the data in a temp table, so we simply deleted the rows we didn't want inserted, and did a simple insert on the rest.
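In outline, using the names from the simplified example above, that approach looks something like this:

-- Stage the candidate rows
SELECT customerid, productid
INTO #candidates
FROM rawtable;

-- Delete the ones that already exist in the destination
DELETE c
FROM #candidates c
JOIN bigtable b ON b.customerid = c.customerid
              AND b.productid = c.productid;

-- Insert what is left
INSERT INTO bigtable (customerid, productid)
SELECT customerid, productid
FROM #candidates;

DROP TABLE #candidates;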
This can be difficult to diagnose. Microsoft has provided some information here:
INF: Understanding and resolving SQL Server blocking problems
A brute force way to kill the connection(s) causing the lock is documented here:
http://shujaatsiddiqi.blogspot.com/2009/01/killing-sql-server-process-with-x-lock.html
Some more Microsoft info here: http://support.microsoft.com/kb/323630
How big is the table? Do you have the problem if you call the procedure from a separate window? Maybe the problem is related to the amount of data the procedure is working with and a lack of indexes.
Is it possible to force a column in a SQL Server 2005 table to a certain value, regardless of the value used in an insert or update statement? Basically, there is a bug in an application that I don't have access to that is trying to insert a date of 1/1/0001 into a datetime column. This produces a SqlDateTime overflow exception. Since this column isn't even used for anything, I'd like to somehow update the constraints on the column, or something in the database, to avoid the error. This is obviously just a temporary emergency patch to avoid the problem... Ideas welcome...
How is the value being inserted? If it's through a stored proc, you could just modify the sproc to ignore that input parameter.
If it's through client-side generated SQL or an ORM tool, on the other hand, then as far as I know the only option is a "before" trigger that "replaces" the value with an acceptable one...
If you're using SQL 2005 you can create an INSTEAD OF trigger.
The code in this trigger will run instead of the original insert/update.
-Edoode
I'd create a trigger to check and change the value
If it is a third-party application then I will assume you don't have access to the stored procedures or the logic used to generate and insert that value (it is still worth checking the SPs in the application's database, though, to see if you can modify them).
As Charles suggested, if you don't have access to the source, then you need to have a trigger on the insert.
The Microsoft article here will give you some in depth information on creating triggers.
However, SQL Server doesn't have a true 'before insert' trigger (to my knowledge), so you need to use INSTEAD OF. Have a look here for more information. In that article, pay particular attention to section 37.7, and the following example (again from that article):
CREATE TRIGGER T_InsertInventory ON CurrentInventory
INSTEAD OF INSERT AS
BEGIN
INSERT INTO Inventory (PartNumber, Description, QtyOnOrder, QtyInStock)
SELECT PartNumber, Description, QtyOnOrder, QtyInStock
FROM inserted
END
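Adapted to the problem in the question, a sketch might look like the following (the table and column names here are invented, and the date column must be nullable; note also that if the overflow happens client-side, before the statement ever reaches SQL Server, no trigger can intercept it):

CREATE TRIGGER T_IgnoreBadDate ON dbo.SomeTable
INSTEAD OF INSERT AS
BEGIN
-- The column isn't used for anything, so store NULL regardless of
-- what the application tried to insert.
INSERT INTO dbo.SomeTable (SomeKey, SomeDate)
SELECT SomeKey, NULL
FROM inserted
END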
Nick.
The simplest hack would be to make the column a varchar and let the application insert the value as a string.
The more complicated answer is that you can massage the data with a trigger, but the value would still have to be valid in the first place. For instance, I can reset a field's value in an update/insert trigger, but it would still have to get through the insert first.
Suppose I have a database table that has a datetime column recording the last time a row was updated or inserted. Which would be preferable:
Have a trigger update the field.
Have the program that's doing the insertion/update set the field.
The first option seems to be the easiest since I don't even have to recompile to do it, but that's not really a huge deal. Other than that, I'm having trouble thinking of any reasons to do one over the other. Any suggestions?
The first option can be more robust because the database will be maintaining the field. This comes with the possible overhead of using triggers.
If you could have other apps writing to this table in the future, via their own interfaces, I'd go with a trigger so you're not repeating that logic anywhere else.
If your app is pretty much it, or any other apps would access the database through the same datalayer, then I'd avoid that nightmare that triggers can induce and put the logic directly in your datalayer (SQL, ORM, stored procs, etc.).
Of course you'd have to make sure your time-source (your app, your users' pcs, your SQL server) is accurate in either case.
Regarding why I don't like triggers:
Perhaps I was rash by calling them a nightmare. Like everything else, they are appropriate in moderation. If you use them for very simple things like this, I could get on board.
It's when the trigger code gets complex (and expensive) that triggers start to cause lots of problems. They are a hidden tax on every insert/update/delete query you execute (depending on the type of trigger). If that tax is acceptable then they can be the right tool for the job.
You didn't mention option 3: use a stored procedure to update the table. The procedure can set timestamps as desired.
Perhaps that's not feasible for you, but I didn't see it mentioned.
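A sketch of that approach (the table and column names are hypothetical):

CREATE PROCEDURE dbo.usp_UpdateWidget
    @WidgetID int,
    @Name varchar(50)
AS
BEGIN
    UPDATE dbo.Widget
    SET Name = @Name,
        LastUpdated = GETDATE() -- set by the procedure, not by the caller
    WHERE WidgetID = @WidgetID;
END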
As long as I'm using a DBMS in whose triggers I trust, I'd always go with the trigger option. It allows the DBMS to take care of as many things as possible, which is usually a good thing.
It would make sure, under any circumstances, that the timestamp column has the correct value. The overhead would be negligible.
The only thing that would be against triggers is portability. If that's not an issue, I don't think there is a question which direction to go.
I would say trigger, just in case someone uses something besides your app to update the table. You probably also want a LastUpdatedBy column, and SUSER_SNAME() for it; that way you can see who did the update.
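For example (a sketch; the table and key column are invented):

CREATE TRIGGER trg_Widget_Update ON dbo.Widget
AFTER UPDATE
AS
BEGIN
    SET NOCOUNT ON;
    UPDATE w
    SET LastUpdated = GETDATE(),
        LastUpdatedBy = SUSER_SNAME()
    FROM dbo.Widget w
    JOIN inserted i ON i.WidgetID = w.WidgetID;
END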
I'm a proponent of stored procedures for everything. Your update proc could contain a GETDATE() for the column.
And I don't like triggers for this kind of update. Lack of visibility of triggers tends to cause confusion.
This sounds like business logic to me ... I would be more disposed to putting this in the code. Let the database manage the storage of data ... No more and no less.
Triggers are a blessing and a curse.
Blessing: You can use them to enable all kinds of custom constraint checking and data management without backend systems knowledge or changes.
Curse: You don't know what's happening behind your back: concurrency issues and deadlocks caused by additional objects brought into transactions that were not originally expected; phantom behavior, including session environment changes and unreliable rowcounts; excessive triggering of conditions; additional hotspots and performance penalties.
The answer to this question (update dates implicitly via a trigger, or explicitly in code) usually depends heavily on context. For example, if you are using the last-change date as an informational field, you might want to change it only when a 'user' actually makes salient changes to a row, versus an automated process that simply updates some sort of internal marker users don't care about.
If you are using the trigger for change synchronization or you have no control over code that is executing a trigger makes a lot more sense.
My advice on trigger use is to be careful. Most systems allow you to filter execution based on the operation and the fields changed. Proper use of 'before' vs 'after' triggers can have a significant performance impact.
Finally, some systems are capable of executing a single trigger for multiple changes (multiple rows affected within a transaction); your code should be prepared to apply itself as a bulk update to multiple rows.
Normally I'd say do it database-side, but it depends on your application. If you're using LINQ-to-SQL you can just set the field as a Timestamp and have your DAL use the Timestamp field for concurrency. It handles that for you automatically, so having to repeat code is a non-event.
If you're writing your DAL yourself, though, then I'd be more likely to handle this on the database side, as it makes writing user interfaces far more flexible. I'd likely do it in a stored procedure that has "public" access, with the tables locked down - you don't want just any clown coming along and bypassing your stored procedure by writing to the tables directly. The exception is if you plan on making your DAL a standalone component that every future application must use to access the database; in that case you could code it directly into the DAL - but only if you can guarantee that everyone accessing the database does so through your DAL component.
If you're going to allow "public" access to the database to insert into tables, then you'll have to go with the trigger, because otherwise anyone can insert/update a single field in the table and the last-updated column might never get set.
I would have the date maintained at the database level, i.e., with a trigger, stored procedure, etc. In most database-driven applications the user app is not going to be the only means by which business users get at the data. There are reporting tools, extracts, user SQL, etc. There are also updates and corrections made by the DBA for which the application won't be providing the date.
But honestly the #1 reason I wouldn't do it from the application is you have no control over the date/time on the client machine. They might be rolling it back to get more days out of a trial license on something or may just want to do bad things to your program.
You can do this without the trigger if your database supports default values on the fields. For example, in SQL Server 2005 I have a table with a field created like this:
create table dbo.Repository
(
...
last_updated datetime default getdate(),
...
)
then the insert code just leaves that field out of the insert field list.
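For example (other columns omitted for brevity):

INSERT INTO dbo.Repository (filename, content)
VALUES ('example.cfg', 'some content');
-- last_updated is filled in by the getdate() default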
I forgot that only worked for the first insert - I do have an update trigger as well, to update the date fields and put a copy of the updated record in my history table - which I would post ... but the editor keeps erroring out on my code ...
Finally:
create trigger dbo.Repository_Upd on dbo.Repository instead of update
as
--**************************************************************************
-- Trigger: Repository_Upd
-- Author: Ron Savage
-- Date: 09/28/2008
--
-- Description:
-- This trigger sets the last_updated and updated_by fields before the update
-- and puts a copy of the updated row into the Repository_History table.
--
-- Modification History:
-- Date Init Comment
-- 10/22/2008 RS Blocked .prm files from updating the history as they
-- get updated every time the cfg file is run.
-- 10/21/2008 RS Updated the insert into the history table to use the
-- d.last_updated field from the Repository table rather
-- than getdate() to avoid micro second differences.
-- 09/28/2008 RS Created.
--**************************************************************************
begin
--***********************************************************************
-- Update the record but fill in the updated_by, updated_system and
-- last_updated date with current information.
--***********************************************************************
update cr set
cr.filename = i.filename,
cr.created_by = i.created_by,
cr.created_system = i.created_system,
cr.create_date = i.create_date,
cr.updated_by = user,
cr.updated_system = host_name(),
cr.last_updated = getdate(),
cr.content = i.content
from
Repository cr
JOIN Inserted i
on (i.config_id = cr.config_id);
--***********************************************************************
-- Put a copy in the history table
--***********************************************************************
declare @extention varchar(3);
select @extention = lower(right(filename,3)) from Inserted;
if (@extention <> 'prm')
begin
Insert into Repository_History
select
i.config_id,
i.filename,
i.created_by,
i.created_system,
i.create_date,
user as updated_by,
host_name() as updated_system,
d.last_updated,
d.content
from
Inserted i
JOIN Repository d
on (d.config_id = i.config_id);
end
end
Ron