I have two tables, TableA and TableB, with the same columns.
create table TableA (
id int
, name varchar
, last datetime
)
create table TableB (
id int
, name varchar
, last datetime
)
I'm populating TableA with a large amount of data, and I would like to either insert or update that data into TableB.
I'd like to take each row from TableA and insert it into TableB if the id and name don't match an existing row, or update the existing row if they do match.
I tried an ETL tool, but it was very slow. I have indexes on id and name, and I wanted to try this with SQL.
I have the following, but it is not working correctly:
SELECT @id = ID,
@name = name,
@LSDATE = LastSeen_DateTime
FROM DBO.A
IF EXISTS (SELECT ID, name FROM DBO.A
WHERE @ID = ID AND @name = Name)
begin
-- update
end
else
begin
--insert
end
I guess I need to put this in a loop, but I'm not quite sure how to make it run.
Thanks.
It's probably faster to do it in two statements, one update and one insert, rather than in a loop.
This statement updates all B rows using the data from A where the ID is the same but the name is different:
UPDATE b
SET name = a.name
FROM tableB b
INNER JOIN tableA a
ON b.id = a.id
AND a.name <> b.name
This statement inserts all A rows into B where the id doesn't exist in B:
INSERT INTO tableB (id, name)
SELECT a.id, a.name
FROM tableA a
WHERE NOT EXISTS (SELECT 1 FROM tableB b WHERE b.id = a.id)
Updated (reversed it from A into B rather than B into A)
If you were using SQL Server 2008 (or Oracle or DB2), then you could use a merge statement.
MERGE B
USING A AS source
ON (B.ID = source.ID and B.Name = source.Name)
WHEN MATCHED THEN
UPDATE SET Last = source.Last
WHEN NOT MATCHED BY TARGET THEN
INSERT (ID, Name, Last) VALUES (source.ID, source.Name, source.Last)
-- the following is optional, if you remove it, add a semicolon to the end of the above line.
OUTPUT $action,
inserted.ID AS SourceID, inserted.Name AS SourceName,
inserted.Last AS SourceLast,
deleted.ID AS TargetID, deleted.Name AS TargetName,
deleted.Last AS TargetLast ;
The bit with "OUTPUT $action" will display which rows are getting inserted and which rows are getting updated.
weasel words: I recognize this isn't exactly what you were looking for, but since others may search this topic, it may be helpful for others in the future.
DECLARE @id int
DECLARE @name nvarchar(100) -- length is an assumption; match the width of TableB.name
DECLARE @last datetime
DECLARE TableA_Cursor CURSOR FOR
select id
, name
, last
from TableA;
OPEN TableA_Cursor;
FETCH NEXT from TableA_Cursor
INTO @id, @name, @last;
WHILE @@FETCH_STATUS = 0
BEGIN
IF EXISTS (select 1 from TableB b where b.Id = @id)
update TableB
set Name = @name
, Last = @last
where Id = @id -- restrict the update to the current row
ELSE
insert into TableB (Id, Name, Last)
values (@id, @name, @last)
FETCH NEXT from TableA_Cursor
INTO @id, @name, @last
END
CLOSE TableA_Cursor;
DEALLOCATE TableA_Cursor;
There may be some syntax error, particularly around the IF condition, but you may get the point.
I have a table with around 17k unique rows, and for each row I need to run this set of statements in sequence:
INSERT INTO TABLE1 using MASTERTABLE data (MASTERTABLE has 6 columns)
SELECT the value of column ID (primary key) of the newly inserted row from TABLE1
UPDATE that ID value in TABLE2 using a stored procedure
I have tried:
while loop: took around 3 hours to complete the execution
cursor: cancelled the query after executing it overnight
In my understanding I cannot use a JOIN, as I need to execute the statements in sequence.
The question is not detailed enough, but the general idea I would use is something like this:
-- create an output table to hold the new ID and the key column to join on later
DECLARE @OutputTbl TABLE (ID INT, key_column <datatype>)
-- An INSERT's OUTPUT clause can only reference INSERTED columns,
-- so this assumes key_column is also stored in TABLE1
INSERT INTO TABLE1 (<column list>)
OUTPUT INSERTED.ID, INSERTED.key_column INTO @OutputTbl (ID, key_column)
SELECT <column list>
FROM MASTERTABLE
UPDATE t2
SET ID = o.ID
FROM TABLE2 t2
INNER JOIN @OutputTbl o
ON t2.key_column = o.key_column
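One caveat with this pattern: in a plain INSERT, the OUTPUT clause can only reference the INSERTED pseudo-table, not columns of the source table. If the key column is not itself stored in TABLE1, one workaround on SQL Server 2008 or later is a MERGE whose ON condition never matches, because MERGE's OUTPUT clause may also reference source columns. The following is only a minimal sketch: the temp tables #Master, #T1 and #T2 and the columns key_column and col1 are hypothetical stand-ins for MASTERTABLE, TABLE1 and TABLE2.

```sql
-- Hypothetical stand-ins for MASTERTABLE, TABLE1 and TABLE2.
CREATE TABLE #Master (key_column INT PRIMARY KEY, col1 VARCHAR(50));
CREATE TABLE #T1 (ID INT IDENTITY(1,1) PRIMARY KEY, col1 VARCHAR(50));
CREATE TABLE #T2 (key_column INT PRIMARY KEY, ID INT NULL);

INSERT INTO #Master (key_column, col1) VALUES (10, 'a'), (20, 'b');
INSERT INTO #T2 (key_column) VALUES (10), (20);

-- Holds each generated ID next to the source key it came from.
DECLARE @OutputTbl TABLE (ID INT, key_column INT);

MERGE INTO #T1 AS t
USING #Master AS m
    ON 1 = 0                          -- never matches, so every source row is inserted
WHEN NOT MATCHED BY TARGET THEN
    INSERT (col1) VALUES (m.col1)
OUTPUT INSERTED.ID, m.key_column      -- source column is visible here, unlike in INSERT
    INTO @OutputTbl (ID, key_column);

-- One set-based update instead of 17k procedure calls.
UPDATE t2
SET    t2.ID = o.ID
FROM   #T2 AS t2
INNER JOIN @OutputTbl AS o
    ON o.key_column = t2.key_column;

SELECT key_column, ID FROM #T2;
```

Whether this can replace the stored procedure depends on what that procedure does beyond setting the ID, which the question does not say.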
You could consider an AFTER INSERT trigger on TABLE1 that calls the stored procedure for TABLE2; then you can run your INSERT however you need, one row at a time or in blocks.
DROP TRIGGER TR_UPD_TABLE2
GO
CREATE TRIGGER TR_UPD_TABLE2 ON TABLE1 AFTER INSERT
AS
BEGIN
SET NOCOUNT ON
DECLARE @columnID INT = NULL
IF (SELECT COUNT(*) FROM INSERTED)=1 BEGIN
-- SINGLE INSERT
SET @columnID = (SELECT columnID FROM INSERTED)
EXEC TableTwoUpdateProcedure @columnID
END ELSE BEGIN
-- MASSIVE INSERT (IF NEEDED)
SET @columnID = 0
WHILE @columnID IS NOT NULL BEGIN
SET @columnID = (SELECT MIN(columnID) FROM INSERTED WHERE columnID > @columnID)
IF @columnID IS NOT NULL BEGIN
EXEC TableTwoUpdateProcedure @columnID
END
END
END
END
I have a table #data which has the columns Id and Count. In addition, I have a stored procedure MyProc which accepts a parameter @id (matching the Id column) and returns a dataset whose row count is the value I want in the Count column.
My goal is to fill the Count column for each Id by calling MyProc, without a cursor.
I know, something like this does not work:
UPDATE d
SET Count = (SELECT COUNT(*) FROM (EXEC MyProc d.Id))
FROM #data AS d
Is there a syntax I do not know or is a cursor the only option to achieve this?
PS: It is a code quality and performance problem for me. Calling the stored procedure would be the easiest way without repeating 50 lines of SQL but a cursor slows it down.
I believe you can use the query below:
IF OBJECT_ID('dbo.data') IS NOT NULL DROP TABLE data;
IF OBJECT_ID('dbo.MyFunct') IS NOT NULL DROP FUNCTION dbo.MyFunct;
GO
CREATE TABLE data
(
ID int,
[Count] int
);
INSERT data VALUES (1,5), (1,10), (2,3), (4,6);
GO
UPDATE d
SET d.[Count] = f.CNT
FROM
(SELECT ID,COUNT(id) AS CNT FROM data GROUP BY ID) f
INNER JOIN data d ON f.ID = d.ID
I couldn't find a way to use a stored procedure. If needed, you can use a table-valued function:
CREATE FUNCTION dbo.MyFunct(@id INT)
RETURNS @i TABLE
(ID INT , CNT INT)
AS
BEGIN
INSERT INTO @i
SELECT ID,COUNT(id) AS CNT FROM data GROUP BY ID
RETURN
END;
GO
UPDATE d
SET d.[Count] = f.CNT
FROM dbo.MyFunct(1) f INNER JOIN data d ON f.ID = d.ID
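The call above passes a fixed argument. If the per-Id logic can be written as a single SELECT, an inline table-valued function combined with CROSS APPLY lets the UPDATE invoke it once per row without a cursor. This is a minimal sketch, assuming the procedure's logic reduces to a count per Id; dbo.DataDemo and dbo.CountForId are hypothetical names used only for illustration.

```sql
-- Hypothetical demo table with the same shape as the sample data above.
CREATE TABLE dbo.DataDemo (ID INT, [Count] INT);
INSERT dbo.DataDemo VALUES (1, 5), (1, 10), (2, 3), (4, 6);
GO
-- Inline table-valued function: a single SELECT that uses its parameter.
CREATE FUNCTION dbo.CountForId (@id INT)
RETURNS TABLE
AS
RETURN
(
    SELECT COUNT(*) AS CNT
    FROM dbo.DataDemo
    WHERE ID = @id
);
GO
-- CROSS APPLY evaluates the function for each outer row, passing that row's ID.
UPDATE d
SET    d.[Count] = f.CNT
FROM   dbo.DataDemo AS d
CROSS APPLY dbo.CountForId(d.ID) AS f;

SELECT ID, [Count] FROM dbo.DataDemo ORDER BY ID;
```

An inline function is generally preferable to a multi-statement one here because the optimizer can fold its SELECT into the calling query.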
To do what you say, you need a function, not a procedure.
CREATE FUNCTION dbo.myFunc (@Id INT)
RETURNS INT
AS
BEGIN
-- A function cannot modify a permanent table, so it only reads and returns a value
RETURN (SELECT COUNT(*)
FROM someTable
WHERE id = @Id);
END
GO
Then call the function in your update statement:
UPDATE d
SET d.Count = dbo.myFunc(d.Id)
FROM #data AS d;
However, row-based operations are bad practice. You should always strive to perform set-based operations, but as I don't know what your procedure does, I cannot provide more than a wild guess as to what you should do (not using a procedure at all):
DECLARE @data TABLE (Id INT, [Count] INT);
UPDATE x
SET x.someCol = 'SomeVal'
OUTPUT INSERTED.id INTO @data (Id)
FROM someTable AS x
INNER JOIN @data AS d
ON d.Id = x.Id;
WITH cte (Id, myCount) AS (
SELECT d.Id
,COUNT(d.Id) AS myCount
FROM @data AS d
GROUP BY d.Id
)
UPDATE d
SET d.[Count] = c.myCount
FROM @data AS d
INNER JOIN cte AS c
ON c.Id = d.Id;
I don't fully understand what you're trying to do, but I think your solution will involve @@ROWCOUNT. Observe:
-- Sample data and proc...
----------------------------------------------------------------------
IF OBJECT_ID('tempdb..#data') IS NOT NULL DROP TABLE #data;
IF OBJECT_ID('dbo.MyProc') IS NOT NULL DROP PROC dbo.MyProc;
GO
CREATE TABLE #data
(
id int,
[Count] int
);
INSERT #data VALUES (1,5), (1,10), (2,3), (4,6);
GO
CREATE PROC dbo.MyProc(@id int)
AS
BEGIN
SELECT 'some value'
FROM #data
WHERE @id = id;
END;
GO
Data BEFORE:
id Count
----------- -----------
1 5
1 10
2 3
4 6
A routine that uses @@ROWCOUNT
DECLARE @someid int = 1; -- the value you're passing to your proc
EXEC dbo.MyProc @someid;
DECLARE @rows int = @@ROWCOUNT; -- this is what you need.
UPDATE #data
SET [Count] = @rows
WHERE id = @someid;
Data AFTER
id Count
----------- -----------
1 2
1 2
2 3
4 6
Browsing on various examples on how to create a "good" UPSERT statement shown here, I have created the following code (I have changed the column names):
BEGIN TRANSACTION
IF EXISTS (SELECT *
FROM Table1 WITH (UPDLOCK, SERIALIZABLE), Table2
WHERE Table1.Data3 = Table2.data3)
BEGIN
UPDATE Table1
SET Table1.someColumn = Table2.someColumn,
Table1.DateData2 = GETDATE()
FROM Table1
INNER JOIN Table2 ON Table1.Data3 = Table2.data3
END
ELSE
BEGIN
INSERT INTO Table1 (DataComment, Data1, Data2, Data3, Data4, DateData1, DateData2)
SELECT
'some comment', data1, data2, data3, data4, GETDATE(), GETDATE()
FROM
Table2
END
COMMIT TRANSACTION
My problem is that it never runs the INSERT part. The INSERT alone works fine, but the current script only ever does the update.
I suspect the insert only works if it can insert all of the data the SELECT finds, and otherwise it won't run at all. If so, how can I improve it?
I have also read about the MERGE statement and would like to avoid it.
//EDIT:
After trying out few samples found on the internet and explained here, I re-did my logic as follows:
BEGIN TRANSACTION
BEGIN
UPDATE table1
SET something
WHERE condition is met
UPDATE table2
SET helpColumn = 'DONE'
WHERE condition is met
END
BEGIN
INSERT INTO table1(data)
SELECT data
FROM table2
WHERE helpColumn != 'DONE'
END
COMMIT TRANSACTION
When trying other solutions, the INSERT usually failed or ran for a long time (on a few tables, I can accept it, but not good, if you plan to migrate entire data from one database to another database).
It's probably not the best solution, I think. But for now it works, any comments?
Instead of
if (something )
update query
else
insert query
Structure your logic like this:
update yourTable
etc
where some condition is met
insert into yourTable
etc
select etc
where some condition is met.
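The reason the original script never inserts is that IF EXISTS is evaluated once for the whole set: as soon as any row matches, only the UPDATE branch runs. Applied to the question's tables, the two-statement structure might look like the sketch below. It is only an illustration: #Table1 and #Table2 are temp-table stand-ins, and only a few of the question's columns are included.

```sql
-- Hypothetical stand-ins for Table1 (target) and Table2 (source).
CREATE TABLE #Table1 (Data3 INT PRIMARY KEY, someColumn VARCHAR(50), DateData2 DATETIME);
CREATE TABLE #Table2 (data3 INT PRIMARY KEY, someColumn VARCHAR(50));

INSERT INTO #Table1 (Data3, someColumn, DateData2) VALUES (1, 'old', GETDATE());
INSERT INTO #Table2 (data3, someColumn) VALUES (1, 'new'), (2, 'added');

BEGIN TRANSACTION;

-- Step 1: update only the rows that already exist in the target.
UPDATE t1
SET    t1.someColumn = t2.someColumn,
       t1.DateData2  = GETDATE()
FROM   #Table1 AS t1 WITH (UPDLOCK, SERIALIZABLE)
INNER JOIN #Table2 AS t2
    ON t1.Data3 = t2.data3;

-- Step 2: insert only the source rows that have no match in the target.
INSERT INTO #Table1 (Data3, someColumn, DateData2)
SELECT t2.data3, t2.someColumn, GETDATE()
FROM   #Table2 AS t2
WHERE  NOT EXISTS (SELECT 1 FROM #Table1 AS t1 WHERE t1.Data3 = t2.data3);

COMMIT TRANSACTION;

SELECT Data3, someColumn FROM #Table1 ORDER BY Data3;
```

Because each statement filters its own rows, no helper column such as helpColumn is needed to remember which rows were already handled.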
You cannot check this once for the whole set, as you are doing. You have to check, for each ID from Table 2, whether it exists in Table 1 or not. If it exists, update Table 1; otherwise insert into Table 1. This can be done in the following way.
We are going to iterate over Table 2, one ID at a time, using a cursor:
DECLARE @ID int
DECLARE mycursor CURSOR FORWARD_ONLY
FOR
SELECT ID FROM Table2 --Any Unique Column
OPEN mycursor
FETCH NEXT FROM mycursor
INTO @ID
WHILE @@FETCH_STATUS = 0
BEGIN
IF EXISTS (SELECT 1 FROM Table1 WHERE ID = @ID)
UPDATE t1 SET t1.data= T2.data --And whatever else you want to update
FROM
Table1 t1
INNER JOIN
Table2 t2
ON t1.ID = t2.ID --Joining column
WHERE t1.id = @ID
ELSE
INSERT INTO Table1
SELECT * FROM Table2 WHERE ID = @ID
FETCH NEXT FROM mycursor
INTO @ID
END
CLOSE mycursor
DEALLOCATE mycursor
TABLEA
MasterCategoryID MasterCategoryDesc
1 Housing
1 Housing
1 Housing
2 Car
2 Car
2 Car
3 Shop
TABLEB
ID Description
1 Home
2 Home
3 Plane
4 Car
INSERT into TableA
(
[MasterCategoryID],
[MasterCategoryDesc]
)
Select
case when (Description) not in (select MasterCategoryDesc from TableA)
then (select max(MasterCategoryID)+1 from TableA)
else (select top 1 MasterCategoryID from TableA where MasterCategoryDesc = Description)
end as [MasterCategoryID]
,Description as MasterCategoryDesc
from TableB
I want to insert rows from TableB into TableA using SQL or a stored procedure. For example, when inserting the first row, 'Home' does not exist in MasterCategoryDesc, so it should get '4' as its MasterCategoryID. The second row should get '4' again as its MasterCategoryID.
The code above does this; however, after the first row the MasterCategoryID remains the same for all rows. I don't know how to keep track of the IDs while inserting the new rows.
P.S. Please do not reply by saying I need to use an IDENTITY column. I have to keep the table structure the same and cannot change it. Thanks.
Create a new table your_table with the fields x_MasterCategoryDesc and x_SubCategoryDesc.
Insert all your values into that table and then run the stored procedure below.
CREATE PROCEDURE x_experiment
AS
BEGIN
IF object_id('TEMPDB..#TABLES') IS NOT NULL
BEGIN
DROP TABLE #TABLES
END
DECLARE @ROWCOUNT INT
DECLARE @ROWINDEX INT =0,
@MasterCategoryDesc VARCHAR(256),
@SubCategoryDesc VARCHAR(256)
select IDENTITY(int,1,1) as ROWID,*
into #TABLES
From your_table
SELECT @ROWCOUNT=COUNT(*) from #TABLES --where ROWID between 51 and 100
WHILE (@ROWINDEX<@ROWCOUNT)
BEGIN
set @ROWINDEX=@ROWINDEX+1
Select
@MasterCategoryDesc=x_MasterCategoryDesc,
@SubCategoryDesc=x_SubCategoryDesc
from #TABLES t
where rowid = @ROWINDEX
INSERT into Table1
([MasterCategoryID], [MasterCategoryDesc], [SubCategoryDesc], [SubCategoryID])
select TOP 1
case when @MasterCategoryDesc not in (select [MasterCategoryDesc] from Table1)
then (select max([MasterCategoryID])+1 from Table1)
else (select distinct max([MasterCategoryID]) from Table1
where [MasterCategoryDesc]=@MasterCategoryDesc
group by [MasterCategoryID])
end as [MasterCategoryID]
,@MasterCategoryDesc as [MasterCategoryDesc]
,@SubCategoryDesc as [SubCategoryDesc]
,case when @SubCategoryDesc not in (select [SubCategoryDesc] from Table1)
then (select max([SubCategoryID])+1 from Table1 )
else (select max([SubCategoryID]) from Table1
where [SubCategoryDesc]=@SubCategoryDesc
group by [SubCategoryID])
end as [SubCategoryID]
from Table1
END
select * from Table1 order by MasterCategoryID
END
GO
exec x_experiment --SP Execute
Use a CURSOR to do the work. The cursor loops through each row of TableB, and the MasterCategoryID increases if the description is not found in TableA. This happens before the next row of TableB is loaded into the cursor ...
DECLARE @ID int
DECLARE @Description VARCHAR(MAX)
DECLARE my_cursor CURSOR FOR
SELECT ID, Description FROM TableB
OPEN my_cursor
FETCH NEXT FROM my_cursor
INTO @ID, @Description
WHILE @@FETCH_STATUS = 0
BEGIN
INSERT into TableA(MasterCategoryID, MasterCategoryDesc)
SELECT CASE WHEN @Description NOT IN (SELECT MasterCategoryDesc FROM TableA)
THEN (SELECT MAX(MasterCategoryID)+1 FROM TableA)
ELSE (SELECT TOP 1 MasterCategoryID
FROM TableA
WHERE MasterCategoryDesc = @Description)
END AS MasterCategoryID, Description as MasterCategoryDesc
FROM TableB
WHERE ID = @ID
FETCH NEXT FROM my_cursor
INTO @ID, @Description
END
CLOSE my_cursor
DEALLOCATE my_cursor
Your data structure leaves something to be desired. You shouldn't have a master id column that has repeated values.
But you can still do what you want:
INSERT into TableA ([MasterCategoryID], [MasterCategoryDesc])
Select coalesce(a.MasterCategoryID,
amax.maxid + row_number() over (partition by a.MasterCategoryID order by b.ID)
),
coalesce(a.MasterCategoryDesc, b.Description)
from TableB b left outer join
(select MasterCategoryDesc, max(MasterCategoryID) as MasterCategoryID
from TableA
group by MasterCategoryDesc
) a
on b.Description = a.MasterCategoryDesc cross join
(select max(MasterCategoryID) as maxid
from TableA
) amax
The idea is to take the information from the master table when it is available. When not available, then MasterCategoryId will be NULL. A new id is calculated, using row_number() to generate sequential numbers. These are then added to the previous maximum id.
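Note that row_number() gives every new row its own number, so two 'Home' rows would receive different IDs, whereas the question wants repeated descriptions to share one ID. A variant using DENSE_RANK() over the description handles that case. This is a minimal sketch, not the answer's original method; #TableA and #TableB are temp-table stand-ins loaded with the question's sample data.

```sql
-- Temp-table stand-ins for TableA and TableB, using the question's sample rows.
CREATE TABLE #TableA (MasterCategoryID INT, MasterCategoryDesc VARCHAR(50));
CREATE TABLE #TableB (ID INT, Description VARCHAR(50));

INSERT INTO #TableA VALUES (1,'Housing'),(1,'Housing'),(1,'Housing'),
                           (2,'Car'),(2,'Car'),(2,'Car'),(3,'Shop');
INSERT INTO #TableB VALUES (1,'Home'),(2,'Home'),(3,'Plane'),(4,'Car');

WITH existing AS (
    -- One row per description that TableA already knows.
    SELECT MasterCategoryDesc, MAX(MasterCategoryID) AS MasterCategoryID
    FROM #TableA
    GROUP BY MasterCategoryDesc
),
numbered AS (
    SELECT b.Description,
           e.MasterCategoryID AS ExistingID,
           -- Rank unmatched rows separately from matched ones;
           -- equal descriptions get the same rank.
           DENSE_RANK() OVER (
               PARTITION BY CASE WHEN e.MasterCategoryID IS NULL THEN 1 ELSE 0 END
               ORDER BY b.Description) AS NewRank
    FROM #TableB AS b
    LEFT JOIN existing AS e
        ON e.MasterCategoryDesc = b.Description
)
INSERT INTO #TableA (MasterCategoryID, MasterCategoryDesc)
SELECT COALESCE(n.ExistingID, m.MaxID + n.NewRank), n.Description
FROM numbered AS n
CROSS JOIN (SELECT ISNULL(MAX(MasterCategoryID), 0) AS MaxID FROM #TableA) AS m;

SELECT MasterCategoryID, MasterCategoryDesc FROM #TableA ORDER BY MasterCategoryID;
```

With the sample data, both 'Home' rows should receive 4, 'Plane' should receive 5, and 'Car' keeps its existing 2. New IDs follow alphabetical order of the description rather than TableB's row order.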
I have a collection of rows that I get from a web service. Some of these rows are to be inserted, some are updates to existing rows. There is no way of telling unless I do a query for the ID in the table. If I find it, then update. If I don't, then insert.
Select @ID from tbl1 where ID = @ID
IF @@ROWCOUNT = 0
BEGIN
Insert into tbl1
values(1, 'AAAA', 'BBBB', 'CCCC', 'DDD')
END
ELSE
BEGIN
UPDATE tbl1
SET
A = @AAA,
B = @BBB,
C = @CCC,
D = @DDD
WHERE ID = @ID
END
I am trying to figure out the most efficient way to update/insert these rows into the table without passing them into a stored procedure one at a time.
UPDATE 1
I should have mentioned I am using SQL Server 2005. Also if I have 300 records I don't want to make 300 stored procedure calls.
The most efficient way is to try the update first, and only do the insert if the update reports 0 rows affected. For example:
UPDATE tbl1
SET
A = @AAA,
B = @BBB,
C = @CCC,
D = @DDD
WHERE ID = @ID
IF @@ROWCOUNT = 0
BEGIN
Insert into tbl1
values(1, 'AAAA', 'BBBB', 'CCCC', 'DDD')
END
Instead of paying for a seek first and then updating using another seek, just go ahead and try to update. If the update doesn't find any rows, you've still only paid for one seek, and didn't have to raise an exception, but you know that you can insert.
UPDATE dbo.tbl1 SET
A = @AAA,
B = @BBB,
C = @CCC,
D = @DDD
WHERE ID = @ID;
IF @@ROWCOUNT = 0
BEGIN
INSERT dbo.tbl1(ID,A,B,C,D)
VALUES(@ID,@AAA,@BBB,@CCC,@DDD);
END
You can also look at MERGE but I shy away from this because (a) the syntax is daunting and (b) there have been many bugs and several of them are still unresolved.
And of course instead of doing this one @ID at a time, you should use a table-valued parameter.
CREATE TYPE dbo.tbl1_type AS TABLE
(
ID INT UNIQUE,
A <datatype>,
B <datatype>,
C <datatype>,
D <datatype>
);
Now your stored procedure can look like this:
CREATE PROCEDURE dbo.tbl1_Update
@List AS dbo.tbl1_type READONLY
AS
BEGIN
SET NOCOUNT ON;
UPDATE t
SET A = i.A, B = i.B, C = i.C, D = i.D
FROM dbo.tbl1 AS t
INNER JOIN @List AS i
ON t.ID = i.ID;
INSERT dbo.tbl1
SELECT ID, A, B, C, D
FROM @List AS i
WHERE NOT EXISTS
(
SELECT 1
FROM dbo.tbl1 WHERE ID = i.ID
);
END
GO
Now you can just pass your DataTable or other collection from C# directly into the procedure as a single parameter.
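The same call can be tried from T-SQL by filling a variable of the table type and passing it as the single parameter. The sketch below is self-contained so it can be run on its own: dbo.tbl1_demo, dbo.tbl1_demo_type and dbo.tbl1_demo_Upsert are hypothetical names, and VARCHAR(10) stands in for the unspecified datatypes. Note that table-valued parameters require SQL Server 2008 or later, so this would not be available on SQL Server 2005.

```sql
-- Hypothetical demo objects; VARCHAR(10) is an assumed datatype.
CREATE TABLE dbo.tbl1_demo (ID INT PRIMARY KEY, A VARCHAR(10), B VARCHAR(10));
GO
CREATE TYPE dbo.tbl1_demo_type AS TABLE (ID INT UNIQUE, A VARCHAR(10), B VARCHAR(10));
GO
CREATE PROCEDURE dbo.tbl1_demo_Upsert
    @List AS dbo.tbl1_demo_type READONLY
AS
BEGIN
    SET NOCOUNT ON;

    -- Update rows whose ID already exists in the target.
    UPDATE t
    SET    A = i.A, B = i.B
    FROM   dbo.tbl1_demo AS t
    INNER JOIN @List AS i
        ON t.ID = i.ID;

    -- Insert rows whose ID is not in the target yet.
    INSERT dbo.tbl1_demo (ID, A, B)
    SELECT i.ID, i.A, i.B
    FROM   @List AS i
    WHERE  NOT EXISTS (SELECT 1 FROM dbo.tbl1_demo AS t WHERE t.ID = i.ID);
END
GO
-- Caller side: fill a variable of the table type and pass it as one parameter.
DECLARE @rows AS dbo.tbl1_demo_type;
INSERT INTO @rows (ID, A, B) VALUES (1, 'AAAA', 'BBBB'), (2, 'CCCC', 'DDDD');

EXEC dbo.tbl1_demo_Upsert @List = @rows;

SELECT ID, A, B FROM dbo.tbl1_demo ORDER BY ID;
```

Running the EXEC a second time with changed values for the same IDs should take the UPDATE path instead of inserting duplicates.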
From the collection of rows you get from the server, find out which ones are already there:
select id from tbl1 where id in (....)
Then you have a list of ids that are in the table and a list of ids that are not in the table.
You will then have 2 batch operations: one for update, the other for insert.
What I understand is this:
At the front end you issue a single SQL statement:
ArrayofIDsforUpdate = select ID from tbl1 where ID in (array of ids at the front end)
ArrayofIDsforInsert = (InitialArrayofIds at front end) - (ArrayofIDsforUpdate)
That leaves one insert into the table and one update of the table:
call the insert with ArrayofIDsforInsert, and
call the update with ArrayofIDsforUpdate.