SQL Server Two-Table Sync Recursion

I'm using triggers to keep two identical tables in a single database in sync. One is for an internal proprietary system, the other is used to expose a subset of data to the outside world. I'm not able to use the same table for both.
I need updates, inserts, and deletes in either table to be applied to the other.
So far I'm using triggers on both tables rather than a scheduled stored procedure because I wanted immediate updates. The problem is that an update to table A fires the trigger that updates table B, which fires the trigger that updates table A, which fires the trigger on table B... and so on.
What's the best way to stop the recursion?
One way is to check the data first to see if it is different, something like this:
DECLARE @cempno int, @count int -- types assumed

-- Key of the modified row (assumes single-row updates)
SELECT @cempno = inserted.cempno FROM inserted

-- Count rows where the two tables actually differ
SELECT @count = COUNT(*)
FROM jcempy j INNER JOIN zhhjcempy z ON j.cempno = z.cempno AND j.cempno = @cempno
WHERE (j.ccostcode <> z.ccostcode)
OR (j.cimearnreg <> z.cimearnreg)
OR (j.cimearnot <> z.cimearnot)
OR (j.cimearndt <> z.cimearndt)
OR (j.cimearnl1 <> z.cimearnl1)

IF @count = 1
BEGIN
-- Update the record
END
Another way could be to use a third table that holds status flags to show which table first initiated the update, but I've a feeling managing that will be a problem once I have 100 users hammering away at the system.
Any ideas, or comments on what sort of performance penalties the data check will incur?
Thanks!

If you can't avoid triggers, look at @@NESTLEVEL (or TRIGGER_NESTLEVEL()) within your trigger.
When the trigger is fired directly it will always be the same number. If the number is bigger, the trigger was fired by another trigger, so just do nothing.
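A minimal sketch of that guard, assuming a sync trigger on the internal table (the trigger name and body are illustrative):

CREATE TRIGGER trg_jcempy_sync ON jcempy
AFTER INSERT, UPDATE, DELETE
AS
BEGIN
-- If this trigger was fired by another trigger (i.e. by the sync
-- trigger on zhhjcempy), bail out to break the cycle.
IF TRIGGER_NESTLEVEL() > 1
RETURN;

-- ... apply the changes to zhhjcempy here ...
END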

Related

Quickly update and do statistical normalization of data

I have a SQL Server table which needs to be updated via a SQLCLR function. An update to a single row will need to trigger a table-wide update. I was wondering how to properly perform the update and normalize the table. The main problem I see is that the table is linked to a website, so multiple updates could come in at any time. The website will only see a few hundred visitors, and after a short period it will be closed (it is collecting data for research).
To give an idea of the SQLCLR call:
DECLARE @ID INT = 501;
SELECT dbo.fn_ComputeJaccard(
RectangleTwo.MinX, RectangleTwo.MinY, RectangleTwo.MaxX, RectangleTwo.MaxY,
RectangleOne.MinX, RectangleOne.MinY, RectangleOne.MaxX, RectangleOne.MaxY) AS RunningTotal
FROM PreProcessed RectangleOne
INNER JOIN PreProcessed RectangleTwo
ON RectangleTwo.ID <> RectangleOne.ID
WHERE dbo.fn_ComputeJaccard(
RectangleTwo.MinX, RectangleTwo.MinY, RectangleTwo.MaxX, RectangleTwo.MaxY,
RectangleOne.MinX, RectangleOne.MinY, RectangleOne.MaxX, RectangleOne.MaxY) > 0.97
AND RectangleTwo.ID = @ID
I would need to select this data into a temp table, normalize that table, and then perform an update to the original table with the values (newValue*0.5 + oldValue*0.9), then renormalize the whole table. I imagine this would take a while to process, so I'm looking for the most efficient way of doing that, plus a solution to the issue of multiple updates flying in. A rough sketch of the flow I have in mind (the Score column and the aggregate feeding it are placeholders):
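DECLARE @ID int = 501;

BEGIN TRANSACTION;

-- 1. Compute new values for the affected rows into a temp table
SELECT RectangleOne.ID, COUNT(*) AS NewScore -- placeholder aggregate
INTO #calc
FROM PreProcessed RectangleOne
INNER JOIN PreProcessed RectangleTwo ON RectangleTwo.ID <> RectangleOne.ID
WHERE dbo.fn_ComputeJaccard(
RectangleTwo.MinX, RectangleTwo.MinY, RectangleTwo.MaxX, RectangleTwo.MaxY,
RectangleOne.MinX, RectangleOne.MinY, RectangleOne.MaxX, RectangleOne.MaxY) > 0.97
AND RectangleTwo.ID = @ID
GROUP BY RectangleOne.ID;

-- 2. Weighted update of the original table
UPDATE p
SET p.Score = c.NewScore * 0.5 + p.Score * 0.9
FROM PreProcessed p
INNER JOIN #calc c ON c.ID = p.ID;

-- 3. Renormalize the whole table (here: divide by the column total)
UPDATE p
SET p.Score = p.Score / t.Total
FROM PreProcessed p
CROSS JOIN (SELECT SUM(Score) AS Total FROM PreProcessed) t;

DROP TABLE #calc;
COMMIT;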
Any advice you could give me would be great!
Thanks

Simultaneous updates to SQL table

I have a table that needs to be updated based on values within the table. The whole table will be updated, and I want to ensure that SQL Server handles the locking for me. I know I can use table locking hints (WITH (...)). But as my update will be something like:
SELECT MyTable.ID, MyTable.Cost
INTO #tempTable FROM MyTable
-- More complex calculations here
UPDATE MyTable
SET MyTable.Cost = #tempTable.Cost + 1
FROM #tempTable
WHERE MyTable.ID = #tempTable.ID
I wanted to be sure that I could lock the entire table so that if another request comes in to update the table it has to wait, but ideally, if a read request comes in it is processed. Is that possible?
Edit
Process:
Read from Table X
Perform calculations
Write to temp table Y
Average values between X and Y
Write back to X
Within the SP, once step 1 has happened, I don't want any other calls to the SP to be processed; I want them to sit in a queue. Because each update depends on the values already in the table, I cannot allow another call to the SP to read from the table.
In general, the SP needs to run serially: it should never be running twice at the same time. I don't mind if a call goes through and its SELECT is queued, but once a SELECT happens, another SELECT (within the SP) cannot be allowed to happen on another thread.
You can use a SQL transaction: http://msdn.microsoft.com/en-us/library/ms188929.aspx#fbid=-CQjt8-slkk
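For example, a minimal sketch (the procedure name is hypothetical; sp_getapplock inside the transaction is one way to make concurrent calls queue while ordinary reads of MyTable proceed):

CREATE PROCEDURE dbo.RecalculateCosts
AS
BEGIN
SET XACT_ABORT ON;
BEGIN TRANSACTION;

-- Only one session at a time can hold this application lock; other
-- callers of the procedure wait here, but plain SELECTs are unaffected.
EXEC sp_getapplock @Resource = 'RecalculateCosts',
@LockMode = 'Exclusive',
@LockOwner = 'Transaction';

SELECT MyTable.ID, MyTable.Cost
INTO #tempTable FROM MyTable;

-- More complex calculations here

UPDATE MyTable
SET MyTable.Cost = t.Cost + 1
FROM #tempTable t
WHERE MyTable.ID = t.ID;

COMMIT; -- releases the application lock
END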

Updating 100k records in one update query

Is it possible, or recommended at all, to run one update query that will update nearly 100k records at once?
If so, how can I do that? I am trying to pass an array to my stored proc, but it seems not to work. This is my SP:
CREATE PROCEDURE [dbo].[UpdateAllClients]
@ClientIDs varchar(max)
AS
BEGIN
DECLARE @vSQL varchar(max)
SET @vSQL = 'UPDATE Clients SET LastUpdate=GETDATE() WHERE ID IN (' + @ClientIDs + ')';
EXEC(@vSQL);
END
I have no idea what's not working, but it's just not updating the relevant records.
Anyone?
The UPDATE is reading your @ClientIDs (a comma-separated value) as a whole. To illustrate, assume @ClientIDs = 1,2,3,4,5.
Your UPDATE command is interpreting it like this:
UPDATE Clients SET LastUpdate=GETDATE() WHERE ID IN ('1,2,3,4,5')
and not
UPDATE Clients SET LastUpdate=GETDATE() WHERE ID IN (1,2,3,4,5)
One suggestion is to use a subquery in your UPDATE, for example:
UPDATE Clients
SET LastUpdate = GETDATE()
WHERE ID IN
(
SELECT ID
FROM tableName
-- WHERE condition
)
Hope this makes sense.
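If you do want to keep passing a CSV string, a sketch of an alternative that avoids dynamic SQL (this assumes SQL Server 2016 or later, where STRING_SPLIT is available):

UPDATE Clients
SET LastUpdate = GETDATE()
WHERE ID IN
(
SELECT CAST(value AS int) -- STRING_SPLIT returns a 'value' column
FROM STRING_SPLIT(@ClientIDs, ',')
)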
A few notes to be aware of.
Big updates like this can lock up the target table. If > 5000 rows are affected by the operation, the individual row locks will be promoted to a table lock, which would block other processes. Worth bearing in mind if this could cause an issue in your scenario. See: Lock Escalation
With a large number of rows to update like this, an approach I'd consider is (basic):
bulk insert the 100K Ids into a staging table (e.g. from .NET, use SqlBulkCopy)
update the target table, using a join onto the above staging table
drop the staging table
This gives you more room to control the process, e.g. breaking the workload up into chunks and doing x rows at a time. A sketch of those steps follows.
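A minimal sketch (the staging table name is illustrative; if you bulk copy into a temp table from .NET, it must happen on the same connection, otherwise use a permanent staging table):

-- 1. Staging table to receive the 100K Ids (e.g. via SqlBulkCopy)
CREATE TABLE #ClientIdStaging (ID int PRIMARY KEY);

-- 2. Update the target table via a join onto the staging table
UPDATE c
SET c.LastUpdate = GETDATE()
FROM Clients c
INNER JOIN #ClientIdStaging s ON s.ID = c.ID;

-- 3. Drop the staging table
DROP TABLE #ClientIdStaging;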
There is a practical limit on the number of items you can pass to IN as a literal list.
So, if you just want to update the whole table, skip the IN condition.
If not, put a subquery inside the IN. That should do the job.
The database will very likely reject that SQL statement because it is too long.
When you need to update so many records at once, maybe your database schema isn't appropriate. Perhaps the LastUpdate datum should not be stored separately for each client, but only once globally, or once for a constant group of clients?
But it's hard to recommend a good course of action without seeing the whole picture.
What version of SQL Server are you using? If it is 2008+ I would recommend using TVPs (table-valued parameters - http://msdn.microsoft.com/en-us/library/bb510489.aspx). The transfer of data will be faster (as opposed to building a huge string) and your query would look nicer:
update c
set lastupdate = getdate()
from clients c
join @mytvp t on c.Id = t.Id
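The table type and the procedure signature would look something like this (the type name is illustrative):

CREATE TYPE dbo.IdList AS TABLE (Id int PRIMARY KEY);
GO
CREATE PROCEDURE dbo.UpdateAllClients
@mytvp dbo.IdList READONLY -- TVPs must be declared READONLY
AS
BEGIN
UPDATE c
SET c.LastUpdate = GETDATE()
FROM Clients c
INNER JOIN @mytvp t ON c.Id = t.Id;
END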
Each SQL statement on its own is a transaction. This means SQL Server is going to grab locks for all these millions of rows, which can really degrade the table's performance, so you generally don't want to update millions of rows in a single statement. The workaround is to set ROWCOUNT before the DML operation:
SET ROWCOUNT 100
UPDATE Clients SET LastUpdate=GETDATE()
WHERE ID IN (1,2,3,4,5)
SET ROWCOUNT 0
or from SQL Server 2008 you can parameterize the TOP keyword:
DECLARE @value int
SET @value = 100000
again:
UPDATE TOP (@value) Clients SET LastUpdate=GETDATE()
WHERE ID IN (1,2,3,4,5)
IF @@ROWCOUNT != 0 GOTO again
See how long the above query takes, then adjust the value of the variable. You need to break the task into smaller units, as suggested in the other answers.
Method 1:
Split @clientids on the ',' delimiter,
put the values in an array and iterate over that array,
updating the Clients table for each id.
OR
Method 2:
Instead of taking @clientids as a varchar(max), follow these steps:
create a table type for the ids and use a join (see the TVP example above).
For faster processing you can also create an index on ClientID.

Slow join on Inserted/Deleted trigger tables

We have a trigger that creates audit records for a table and joins the inserted and deleted tables to see if any columns have changed. The join has been working well for small sets, but now I'm updating about 1 million rows and it doesn't finish in days. I tried updating a select number of rows at different orders of magnitude, and it's obvious the cost grows much faster than linearly, which would make sense if the inserted/deleted tables are being scanned to do the join.
I tried creating an index but get the error:
Cannot find the object "inserted" because it does not exist or you do not have permissions.
Is there any way to make this any faster?
Inserting into temporary tables indexed on the joining columns could well improve things, as inserted and deleted are not indexed.
You can check @@ROWCOUNT inside the trigger so you only perform this logic above some threshold number of rows, though on SQL Server 2008 this might overstate the number somewhat if the trigger was fired as the result of a MERGE statement (it will return the total number of rows affected by all MERGE actions, not just the one relevant to that specific trigger).
In that case you can just do something like SELECT @NumRows = COUNT(*) FROM (SELECT TOP 10 * FROM inserted) T to see if the threshold is met.
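A minimal sketch combining both ideas (the key column name ID is an assumption):

DECLARE @NumRows int
SELECT @NumRows = COUNT(*) FROM (SELECT TOP 10 * FROM inserted) T

IF @NumRows = 10 -- threshold met: copy to indexed temp tables first
BEGIN
SELECT * INTO #ins FROM inserted
SELECT * INTO #del FROM deleted
CREATE UNIQUE CLUSTERED INDEX IX_ins ON #ins (ID)
CREATE UNIQUE CLUSTERED INDEX IX_del ON #del (ID)
-- ... do the audit comparison joining #ins to #del
-- instead of inserted/deleted ...
END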
Addition
One other possibility you could experiment with is simply bypassing the trigger for these large updates. You could use SET CONTEXT_INFO to set a flag and check its value inside the trigger. You could then use OUTPUT inserted.*, deleted.* to get the "before" and "after" values for a row without needing to JOIN at all.
DECLARE @TriggerFlag varbinary(128)
SET @TriggerFlag = CAST('Disabled' AS varbinary(128))
SET CONTEXT_INFO @TriggerFlag
UPDATE YourTable
SET Bar = 'X'
OUTPUT inserted.*, deleted.* INTO #T
/*Reset the flag*/
SET CONTEXT_INFO 0x
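The matching check at the top of the trigger could look like this (a sketch; SUBSTRING is used so any padding of the session's context info doesn't affect the comparison):

IF SUBSTRING(CONTEXT_INFO(), 1, 8) = CAST('Disabled' AS binary(8))
RETURN -- a flagged bulk update is running, so skip the audit logic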

Basic SQL atomicity of "UPDATE ... SET ... WHERE ..."

I have a rather basic and general question about the atomicity of an "UPDATE ... SET ... WHERE ..." statement.
Given a table (without extra constraints):
+----+------+
| id | name |
+----+------+
| 1  | a    |
+----+------+
Now I would execute the following 4 statements "at the same time" (concurrently):
UPDATE table SET name='b1' WHERE name='a'
UPDATE table SET name='b2' WHERE name='a'
UPDATE table SET name='b3' WHERE name='a'
UPDATE table SET name='b4' WHERE name='a'
Would only one UPDATE statement actually get to update the table?
Or is it possible that more than one UPDATE statement really updates the table?
Do I need an extra transaction or lock to ensure that only one UPDATE writes values into the table?
thanks
[EDIT]
The 4 UPDATE statements are executed in parallel from different processes.
[EDIT] This is with PostgreSQL.
One of these statements will lock the record (or the page, or the whole table, depending on your engine and locking granularity) and will be executed.
The others will wait for the resource to be freed.
When the lucky statement commits, the others will either reread the table and do nothing (if your transaction isolation mode is READ COMMITTED) or fail to serialize the transaction (if the isolation level is SERIALIZABLE).
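For example (a T-SQL-flavored sketch of what each competing session effectively does):

SET TRANSACTION ISOLATION LEVEL READ COMMITTED; -- or SERIALIZABLE
BEGIN TRANSACTION;
-- Blocks here if another session holds the lock on the 'a' row
UPDATE [table] SET [name] = 'b1' WHERE [name] = 'a';
COMMIT;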
If you ran these UPDATEs in one go, they would run in order: the first would update everything to b1, and then the other 3 would update nothing, as there would be no 'a' rows left to update.
There would be an implicit transaction around each, which would hold the other UPDATEs in a queue. Only the first one through would win in this case, as each subsequent update would not see a row named 'a'.
Edit: I was assuming here that you were calling each UPDATE from separate processes. If they were called in a single script, they would run consecutively in order of appearance.
Only one UPDATE statement can ever access a record at a time. Before it runs, it starts a transaction and locks the table (or, more correctly, the page the record is on, or even only the record itself; this depends on many factors). Then it makes the change and unlocks the table again, committing the transaction in the process.
Updates/deletes/inserts to a given record are single-threaded by their very nature; this is required to ensure table integrity. It does not mean that updates to many records (that are on different pages) cannot run in parallel.
With the statements you posted, the updates after the first would simply find no row with name 'a' and update nothing. What are you trying to achieve with this?
Even if you don't specify BEGIN TRANSACTION and COMMIT TRANSACTION, a single SQL statement is always transactional. That means only one of the updates is allowed to modify the row at a time; the other queries will block on the update lock held by the first query.
I believe this may be what you are looking for (in T-SQL):
UPDATE [table]
SET [table].[name] = temp.new_name
FROM [table],
(
SELECT id,
new_name = 'b' + CAST(ROW_NUMBER() OVER (ORDER BY [name]) AS varchar)
FROM [table]
WHERE [table].name = 'a'
) temp
WHERE [table].id = temp.id
There should be an equivalent function to row_number in the other SQL flavors.