Output result and update 2nd table without using a temporary table - sql

I currently have the following query
SELECT TOP (@batchSize) * into #t
from iv.SomeView
where EXISTS(select *
             from iv.PendingChanges
             where SomeView.RecordGuid = PendingChanges.RecordGuid)

update iv.PendingChanges
set IsQueued = 1
where RecordGuid in (select #t.RecordGuid from #t)

select * from #t
This pattern (using different views and PendingChanges tables in the various queries) runs very frequently, and I was trying to figure out how to reduce the writes to tempdb caused by materializing #t.
I came up with the following solution, which does show a performance boost over the old version in the profiler:
select top (0) * into #t from iv.SomeView
insert into #t with (TABLOCK)
output inserted.*
SELECT TOP (@batchSize) *
from iv.SomeView
where EXISTS(select *
             from iv.PendingChanges
             where SomeView.RecordGuid = PendingChanges.RecordGuid)
update iv.PendingChanges
set IsQueued = 1
where RecordGuid in (select #t.RecordGuid from #t)
But is there any way to do this even better, so that I can both output the result and update PendingChanges.IsQueued without ever needing to write the result out temporarily to #t?
Important note: I can only have a single select from iv.SomeView, because the table is very active and doing multiple SELECT TOP (@batchSize) queries is not deterministic, nor do I have any field I could order by to make it so.

Have you tried code like:
update top (@batchsize) PC
set IsQueued = 1
output inserted.*
from iv.PendingChanges PC
inner join iv.SomeView SV
    on SV.RecordGuid = PC.RecordGuid;
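Since the OUTPUT clause of an UPDATE can also reference columns from other tables in the FROM clause (not just inserted/deleted), the same statement could return the view's columns directly, matching what select * from #t used to produce. A sketch under that assumption (and assuming, as in your original query, that RecordGuid matches rows one-to-one):
update top (@batchsize) PC
set IsQueued = 1
output SV.*    -- columns come from the joined view, not just PendingChanges
from iv.PendingChanges PC
inner join iv.SomeView SV
    on SV.RecordGuid = PC.RecordGuid;
This removes #t entirely: the update of IsQueued and the result set come out of a single statement, and iv.SomeView is read only once.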

Related

Using INSERT and/or UPDATE together from a single CTE

I'm trying to query a CTE called ss4 and, based on the results, either insert into or update another table, eventslog. I was unable to do this in one single query and needed to insert the data into a temp table first. I would like to get rid of the temp table insertion and just insert or update directly.
I think I was having issues with calling that CTE more than once. Is there a workaround? How can I either insert or update into a table by querying a single CTE? Any help is most appreciated.
-- ...preceding CTEs (including ss3) omitted...
,ss4
 AS (SELECT DISTINCT h.groupid,
                     h.eventid,
                     Sum(h.vcheck)    AS tot,
                     Max(h.eventtime) AS eventtime
     FROM   ss3 h
     GROUP  BY h.groupid,
               h.eventid)
INSERT INTO #glo
            (eventtime,
             eventid,
             groupid,
             vcheck)
SELECT DISTINCT i.eventtime,
                i.eventid,
                i.groupid,
                i.tot
FROM ss4 i

INSERT INTO eventslog
            (eventtime,
             eventid,
             groupid)
SELECT DISTINCT j.eventtime,
                j.eventid,
                j.groupid
FROM #glo j
WHERE j.vcheck = 0
  AND NOT EXISTS(SELECT eventid
                 FROM eventslog
                 WHERE eventid = j.eventid
                   AND groupid = j.groupid
                   AND clearedtime IS NULL)

UPDATE k
SET k.clearedtime = l.eventtime
FROM eventslog k
RIGHT JOIN #glo l
    ON k.groupid = l.groupid
   AND k.eventid = l.eventid
WHERE l.vcheck > 0
  AND k.groupid = l.groupid
I was working on something similar myself recently. I used a merge statement to handle doing an insert or update using a CTE.
Example below:
;WITH cte AS
(
SELECT id,
name
FROM [TableA]
)
MERGE INTO [TableA] AS A
USING cte
ON cte.ID = A.id
WHEN MATCHED
THEN UPDATE
SET A.name = cte.name
WHEN NOT MATCHED
THEN INSERT (name)
VALUES (cte.name);
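(Note that because the CTE in this example selects from [TableA] itself, every source row matches a target row, so the WHEN NOT MATCHED branch can never fire here; with the CTE pointing at a separate source table, as in your case, both branches become meaningful.)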
You can use the MERGE command, which is available since SQL Server 2008.
A very good tutorial is available here:
http://www.made2mentor.com/2013/05/writing-t-sql-merge-statements-the-right-way/
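Applied to your eventslog case, a single MERGE driven by ss4 could replace both the #glo temp table and the separate INSERT/UPDATE. This is only a sketch, under the assumption that (eventid, groupid) with clearedtime IS NULL identifies at most one open row in eventslog:
;WITH
-- ...ss1 through ss3 as in your query...
ss4 AS (SELECT h.groupid,
               h.eventid,
               Sum(h.vcheck)    AS tot,
               Max(h.eventtime) AS eventtime
        FROM   ss3 h
        GROUP  BY h.groupid, h.eventid)
MERGE INTO eventslog AS tgt
USING ss4 AS src
   ON tgt.eventid = src.eventid
  AND tgt.groupid = src.groupid
  AND tgt.clearedtime IS NULL
WHEN MATCHED AND src.tot > 0      -- mirrors your UPDATE ... WHERE l.vcheck > 0
    THEN UPDATE SET tgt.clearedtime = src.eventtime
WHEN NOT MATCHED AND src.tot = 0  -- mirrors your INSERT ... WHERE j.vcheck = 0 AND no open row
    THEN INSERT (eventtime, eventid, groupid)
         VALUES (src.eventtime, src.eventid, src.groupid);
Because ss4 is grouped by (groupid, eventid), the source is guaranteed unique per target row, which MERGE requires; the CTE is also referenced only once, which sidesteps the multiple-reference issue you ran into.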

Problems inserting new data into SQL Server table

I am trying to do an if-else insert into one table from another (a table type).
The problem is that the first time the script runs it adds all the data to the table fine, but if something is added to the source data afterwards, the new records are not inserted and I don't know why.
I can't include the exact code but it looks like this...
UPDATE CUSTOMER
Set Target.Desc = Source.Desc
From #source source
WHERE Target.AccountNumber = Source.AccountNumber
IF @@ROWCOUNT = 0
INSERT INTO CUSTOMER(AccountNumber, Desc)
SELECT Source.AccountNumber, Source.Desc
FROM #Source Source
I have also tried a traditional if else insert but it had the same results.
Can you see anything wrong that might be stopping the newly added records from being inserted?
Your current code will only work correctly if #source contains either all existing or all new rows.
You can use MERGE when this is not the case
MERGE CUSTOMER AS target
USING #source AS source
ON ( target.AccountNumber = source.AccountNumber )
WHEN MATCHED THEN
UPDATE SET [Desc] = source.[Desc]
WHEN NOT MATCHED THEN
INSERT (AccountNumber, [Desc])
VALUES (source.AccountNumber, source.[Desc]);
How about doing this instead of using @@ROWCOUNT:
-- update existing customers
UPDATE c
SET c.[Desc] = source.[Desc]
FROM #source source
INNER JOIN CUSTOMER c ON c.AccountNumber = source.AccountNumber

-- insert new customers
INSERT INTO CUSTOMER (AccountNumber, [Desc])
SELECT source.AccountNumber, source.[Desc]
FROM #source source
LEFT JOIN CUSTOMER c ON source.AccountNumber = c.AccountNumber
WHERE c.AccountNumber IS NULL
-- update all existing rows
update c set
    c.[Desc] = s.[Desc]
from CUSTOMER c
join #source s on s.AccountNumber = c.AccountNumber

-- insert all missing rows
insert into CUSTOMER (AccountNumber, [Desc])
select s.AccountNumber, s.[Desc]
from #source s
where not exists (
    select *
    from CUSTOMER c
    where c.AccountNumber = s.AccountNumber
)
The first time, there is no data in your target, so @@ROWCOUNT is 0.
The next time, the update updates all existing data, @@ROWCOUNT is not 0, and no data gets inserted.
You should not rely on @@ROWCOUNT; instead do what Andrew suggests: run both the UPDATE and the INSERT,
or use MERGE (both in one statement).
@@ROWCOUNT contains the number of rows affected by the previous statement. With your pattern, the INSERT only runs if every record is new; if any record gets updated, the INSERT is skipped even though the source contains new records.
If your requirement is to update existing records and insert new records from the source, you can use the code below.
-- update existing rows
UPDATE CUSTOMER
SET CUSTOMER.[Desc] = Source.[Desc]
FROM #Source Source
WHERE Source.AccountNumber = CUSTOMER.AccountNumber

-- insert new rows
INSERT INTO CUSTOMER (AccountNumber, [Desc])
SELECT Source.AccountNumber, Source.[Desc]
FROM #Source Source
WHERE NOT EXISTS (SELECT 1
                  FROM CUSTOMER
                  WHERE CUSTOMER.AccountNumber = Source.AccountNumber)

Avoiding creating the same query twice in SQL

I have a pretty simple and self-explanatory SQL statement:
ALTER PROCEDURE [dbo].[sp_getAllDebatesForAlias] (@SubjectAlias nchar(30))
AS
BEGIN
    -- SET NOCOUNT ON added to prevent extra result sets from
    -- interfering with SELECT statements.
    SET NOCOUNT ON;

    SELECT *
    FROM tblDebates
    WHERE SubjectID1 in (SELECT SubjectID FROM tblSubjectAlias WHERE SubjectAlias = @SubjectAlias)
       OR SubjectID2 in (SELECT SubjectID FROM tblSubjectAlias WHERE SubjectAlias = @SubjectAlias);
END
I am certain that there is a way to make this statement more efficient, or at least to get rid of the duplicated subquery in the IN clauses, i.e. the
SELECT SubjectID FROM tblSubjectAlias WHERE SubjectAlias = @SubjectAlias
part.
Any ideas?
Try:
select d.*
from tblDebates d
where exists
      (select 1
       from tblSubjectAlias s
       where s.SubjectID in (d.SubjectID1, d.SubjectID2) and
             s.SubjectAlias = @SubjectAlias)
SELECT d.*
FROM tblDebates d
inner join tblSubjectAlias s on s.SubjectID in (d.SubjectID1, d.SubjectID2)
where s.SubjectAlias = @SubjectAlias
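One difference worth noting: the join form can return the same debate row twice if both SubjectID1 and SubjectID2 match the alias (or if several tblSubjectAlias rows share one SubjectAlias), while the EXISTS form never duplicates. If that matters, add DISTINCT:
select distinct d.*
from tblDebates d
inner join tblSubjectAlias s on s.SubjectID in (d.SubjectID1, d.SubjectID2)
where s.SubjectAlias = @SubjectAlias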

How can I do a SQL UPDATE in batches, like an Update Top?

Is it possible to add a TOP or some sort of paging to a SQL Update statement?
I have an UPDATE query, that comes down to something like this:
UPDATE XXX SET XXX.YYY = #TempTable.ZZZ
FROM XXX
INNER JOIN (SELECT SomeFields ... ) #TempTable ON XXX.SomeId=#TempTable.SomeId
WHERE SomeConditions
This update will affect millions of records, and I need to do it in batches of, say, 100,000 at a time (the ordering doesn't matter).
What is the easiest way to do this?
Yes, I believe you can use TOP in an update statement, like so:
UPDATE TOP (10000) XXX SET XXX.YYY = #TempTable.ZZZ
FROM XXX
INNER JOIN (SELECT SomeFields ... ) #TempTable ON XXX.SomeId=#TempTable.SomeId
WHERE SomeConditions
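Note that TOP in an UPDATE does not accept ORDER BY, so the rows picked for each batch are arbitrary; that is fine here since the ordering doesn't matter. To work through all the rows, run this in a loop until @@ROWCOUNT comes back as 0 (as in the last answer below).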
You can use SET ROWCOUNT { number | @number_var }, which limits the number of rows processed before the query stops; example below:
SET ROWCOUNT 10000 -- define maximum updated rows at once
UPDATE XXX SET
XXX.YYY = #TempTable.ZZZ
FROM XXX
INNER JOIN (SELECT SomeFields ... ) #TempTable ON XXX.SomeId = #TempTable.SomeId
WHERE XXX.YYY <> #TempTable.ZZZ and OtherConditions
-- don't forget the statement below,
-- after everything is updated
SET ROWCOUNT 0
I've added XXX.YYY <> #TempTable.ZZZ to the where clause to make sure you do not update an already-updated value a second time.
Setting ROWCOUNT to 0 turns the limit off - don't forget about it.
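One caveat: Microsoft has deprecated the effect of SET ROWCOUNT on INSERT, UPDATE, and DELETE statements (the documentation warns it will stop affecting them in a future release), so TOP is the safer choice for new code.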
You can do something like the following:
declare @i int = 1
while @i <= 10 begin
    update top (10) percent mt
    set mt.colToUpdate = lt.valCol
    from masterTable as mt
    inner join lookupTable as lt
        on mt.colKey = lt.colKey
    where mt.colToUpdate is null
    print @i
    set @i += 1
end
-- one final update without TOP (assuming lookupTable.valCol is mostly not null)
update --top (10) percent
    mt
set mt.colToUpdate = lt.valCol
from masterTable as mt
inner join lookupTable as lt
    on mt.colKey = lt.colKey
where mt.colToUpdate is null
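The final pass is needed because each iteration only touches 10 percent of the rows that are still null: after ten passes roughly 0.9^10, i.e. about 35 percent, of the original nulls remain, so one last unrestricted update mops them up.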
Depending on your ability to change the data structure of the table, I would suggest adding a field that can hold some sort of batch identifier, e.g. a date stamp if you run this daily, an incremental value, or basically any value you can make unique for your batch. With the incremental approach, your update will then be:
UPDATE TOP (100000) XXX SET XXX.BATCHID = 1, XXX.YYY = ....
...
WHERE XXX.BATCHID < 1
AND (rest of WHERE-clause here).
Next time, you'll set the BATCHID = 2 and WHERE XXX.BATCHID < 2
If this is to be done repeatedly, you can set an index on the BATCHID and reduce load on the server.
DECLARE @updated_Rows INT;
SET @updated_Rows = 1;
WHILE (@updated_Rows > 0)
BEGIN
    UPDATE top (10000) XXX SET XXX.YYY = #TempTable.ZZZ
    FROM XXX
    INNER JOIN #TempTable ON XXX.SomeId = #TempTable.SomeId
    WHERE SomeConditions
    SET @updated_Rows = @@ROWCOUNT;
END
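One caution about this loop: if the UPDATE does not change anything that SomeConditions tests, the same 10000 rows keep qualifying, @@ROWCOUNT never drops to 0, and the loop runs forever. Borrowing the guard from the SET ROWCOUNT answer above fixes that, assuming YYY and ZZZ are not nullable:
DECLARE @updated_Rows INT = 1;
WHILE (@updated_Rows > 0)
BEGIN
    UPDATE top (10000) XXX SET XXX.YYY = #TempTable.ZZZ
    FROM XXX
    INNER JOIN #TempTable ON XXX.SomeId = #TempTable.SomeId
    WHERE XXX.YYY <> #TempTable.ZZZ    -- updated rows no longer qualify
      AND SomeConditions
    SET @updated_Rows = @@ROWCOUNT;
END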

Is there a way to query 135 distinct tables in one single query?

In our database, we have 135 tables that have a column named EquipmentId. I need to query each of those tables to determine if any of the them have an EquipmentId equal to a certain value. Any way to do this in a single query, instead of 135 separate queries?
Thanks very much.
You are looking at either Dynamic SQL to generate queries to all of the tables and perhaps union the results, or using something like the undocumented sp_MSforeachtable stored procedure.
sp_msforeachtable 'select * from ? where equipmentid = 5'
You could use a query to build a query:
select 'union all select * from ' + name +
' where EquipmentId = 42' + char(13) + char(10)
from sys.tables
Copy the result, strip the first union all, and run the query :)
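Since presumably not every table in the database has an EquipmentId column, it is safer to generate branches only for the tables that actually have one; selecting the table name rather than * also keeps the branches union-compatible. A sketch of the same generator with that filter (42 again standing in for your value):
select 'union all select ''' + t.name + ''' as TableName from ' + quotename(t.name) +
       ' where EquipmentId = 42' + char(13) + char(10)
from sys.tables t
where exists (select 1
              from sys.columns c
              where c.object_id = t.object_id
                and c.name = 'EquipmentId')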
I would dump them all into a temp table or something similar:
CREATE TABLE #TempTable (Equip NVARCHAR(50))

EXEC sp_msforeachtable 'INSERT INTO #TempTable (Equip) SELECT Equip FROM ?'

SELECT * FROM #TempTable

DROP TABLE #TempTable
I assume that not all the tables in the DB have an EquipmentId column.
If this is a valid assumption, then the @whereand parameter of sp_msforeachtable helps to filter the tables.
The query below will show the names of all tables that contain the specified EquipmentId.
Each table name is shown as many times as there are rows in that table with the specified EquipmentId.
declare @EquipmentId int = 666

create table #Result (TableName sysname)

declare @command nvarchar(4000) =
    'insert into #Result select ''?'' from ? where EquipmentId = ' + cast(@EquipmentId as varchar)

execute sp_msforeachtable
    @command1 = @command,
    @whereand = 'and o.id in (select object_id from sys.columns where name = ''EquipmentId'')'

select *
from #Result

drop table #Result
You're probably going to have to go with Dynamic SQL on this one - query the system tables for all the tables that have a column named EquipmentId, and build a dynamic SQL statement querying each table for the presence of the particular EquipmentId you need.
EDIT: @mellamokb's approach seems much easier - try that.
This can be implemented using LEFT JOINs. First we'll need a base table to hold the certain values of EquipmentID we're looking for:
CREATE TABLE #CertainValues
(
EquipmentID int
)
INSERT INTO #CertainValues(EquipmentID) VALUES (1)
INSERT INTO #CertainValues(EquipmentID) VALUES (2)
INSERT INTO #CertainValues(EquipmentID) VALUES (3)
We can then join the 135 known tables to this base table using their respective [EquipmentID] fields. To avoid cardinality issues (duplication) due to an [EquipmentID] appearing in multiple rows of one table, it's best to use a subquery to get counts per [EquipmentID] on each of the 135 tables.
SELECT
CV.EquipmentID,
ISNULL(T001.CNT, 0) AS T001,
ISNULL(T002.CNT, 0) AS T002,
...
ISNULL(T134.CNT, 0) AS T134,
ISNULL(T135.CNT, 0) AS T135
FROM
#CertainValues AS CV
LEFT OUTER JOIN (SELECT EquipmentID, SUM(1) AS CNT FROM Table001 GROUP BY EquipmentID) AS T001 ON CV.EquipmentID = T001.EquipmentID
LEFT OUTER JOIN (SELECT EquipmentID, SUM(1) AS CNT FROM Table002 GROUP BY EquipmentID) AS T002 ON CV.EquipmentID = T002.EquipmentID
...
LEFT OUTER JOIN (SELECT EquipmentID, SUM(1) AS CNT FROM Table134 GROUP BY EquipmentID) AS T134 ON CV.EquipmentID = T134.EquipmentID
LEFT OUTER JOIN (SELECT EquipmentID, SUM(1) AS CNT FROM Table135 GROUP BY EquipmentID) AS T135 ON CV.EquipmentID = T135.EquipmentID
This also gives us a more meaningful resultset which shows the count of rows per table for each of the certain values we are looking for. Below is a sample resultset:
EquipmentID T001 T002 ... T134 T135
----------- ---- ---- ... ---- ----
1 0 1 ... 2 3
2 3 2 ... 1 0
3 0 0 ... 0 0