If Record Exists, Update Else Insert - sql

I'm trying to move some data between two SQL Server 2008 tables. If the record exists in Table2 with the email from Table1 then update that record with the data from Table1, else insert a new record.
In Table1 I have a number of columns: first name, surname, email and so on.
I'm not quite sure how to structure the query to update Table2 if the email from Table1 exists or insert a new row if email from Table1 does not exist in Table2.
I tried doing a few searches on Google but most solutions seem to work by creating some stored procedure. So I wondered if anyone might know how to build a suitable query that might do the trick?

I think MERGE is what you want.

MERGE
INTO table2 t2
USING table1 t1
ON t2.email = t1.email
WHEN MATCHED THEN
UPDATE
SET t2.col1 = t1.col1,
t2.col2 = t1.col2
WHEN NOT MATCHED THEN
INSERT (col1, col2)
VALUES (t1.col1, t1.col2);
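As a rough sketch of how this might look with concrete columns, the script below runs the same kind of MERGE against two temporary tables. The table and column names (#table1, #table2, fname, surname, email) and the sample rows are made up for illustration. It also assumes email is unique in the source table, since MERGE raises an error if one target row matches more than one source row.

```sql
-- Hypothetical sample tables; names and column sizes are assumptions for illustration.
CREATE TABLE #table1 (fname varchar(20), surname varchar(20), email varchar(50));
CREATE TABLE #table2 (fname varchar(20), surname varchar(20), email varchar(50));

INSERT INTO #table1 VALUES ('Ann', 'Smith-Jones', 'ann@example.com');  -- already in #table2
INSERT INTO #table1 VALUES ('Bob', 'Brown', 'bob@example.com');        -- missing from #table2
INSERT INTO #table2 VALUES ('Ann', 'Smith', 'ann@example.com');

MERGE INTO #table2 AS t2
USING #table1 AS t1
    ON t2.email = t1.email
WHEN MATCHED THEN
    -- email exists in the target: overwrite the other columns
    UPDATE SET t2.fname = t1.fname,
               t2.surname = t1.surname
WHEN NOT MATCHED BY TARGET THEN
    -- email not found in the target: add the row
    INSERT (fname, surname, email)
    VALUES (t1.fname, t1.surname, t1.email);

-- Ann's row is updated in place and Bob's row is added.
SELECT fname, surname, email FROM #table2 ORDER BY email;

DROP TABLE #table1;
DROP TABLE #table2;
```

Note that the terminating semicolon is required after a MERGE statement in SQL Server.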

Microsoft released a tool to compare data between SQL tables; this might be a good option in certain situations.
Edit: Forgot to mention, it also generates a script to insert/update missing or different rows.
For completeness, I hacked up this query, which does what you want: it updates existing table2 records and adds those that are missing, based on the email address.
The 'updating' and 'insert missing' queries below are the ones you want.
BEGIN TRAN
create table #table1 (id int, fname varchar(20), email varchar(20))
insert into #table1 values (1, 'name_1_updated', 'email_1')
insert into #table1 values (3, 'name_3_updated', 'email_3')
insert into #table1 values (100, 'name_100', 'email_100')
create table #table2 (id int, fname varchar(20), email varchar(20))
insert into #table2 values (1, 'name_1', 'email_1')
insert into #table2 values (2, 'name_2', 'email_2')
insert into #table2 values (3, 'name_3', 'email_3')
insert into #table2 values (4, 'name_4', 'email_4')
print 'before update'
select * from #table2
print 'updating'
update #table2
set #table2.fname = t1.fname
from #table1 t1
where t1.email = #table2.email
print 'insert missing'
insert into #table2
select * from #table1
where #table1.email not in (select email from #table2 where email = #table1.email)
print 'after update'
select * from #table2
drop table #table1
drop table #table2
ROLLBACK

Related

Subquery for split_string

I have 2 tables in SQL Server. One holds the names of fields, the other is a combination of the ids. I'm fairly new to more advanced SQL queries and am having trouble figuring out a good way to accomplish this. If I were using JavaScript I would just split each data address into an array and loop over each to give me the desired output. I'm not sure how to accomplish this in SQL.
t1
id|name
0|Manager
1|Client
2|FirstName
3|LastName
t2
dataaddress
0.2
0.3
1.2
1.3
Desired Output:
addressname
Manager.FirstName
Manager.LastName
Client.FirstName
Client.LastName
I've tried using split_string to parse out each id from dataaddress, but I'm having trouble figuring out how to accomplish this, or even what to search for.
Please try the following solution.
SQL
-- DDL and sample data population, start
DECLARE @tbl1 TABLE (id INT PRIMARY KEY, name VARCHAR(20));
INSERT @tbl1 (id, name) VALUES
(0, 'Manager'),
(1, 'Client'),
(2, 'FirstName'),
(3, 'LastName');
DECLARE @tbl2 TABLE (dataaddress VARCHAR(20));
INSERT @tbl2 (dataaddress) VALUES
('0.2'),
('0.3'),
('1.2'),
('1.3');
-- DDL and sample data population, end
;WITH rs AS
(
SELECT t1_id = PARSENAME(dataaddress,2)
, t2_id = PARSENAME(dataaddress,1)
FROM @tbl2
)
SELECT addressname = CONCAT(t1.name, '.' , t2.name)
FROM @tbl1 AS t1 INNER JOIN rs ON t1.id = rs.t1_id
INNER JOIN @tbl1 AS t2 ON t2.id = rs.t2_id;
Output
addressname
Manager.FirstName
Manager.LastName
Client.FirstName
Client.LastName
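PARSENAME is intended for splitting object names such as server.database.schema.object, which is why it works here: it returns the requested dot-separated part, counting from the right. Below is a small sketch of the calls the CTE relies on, using a literal value (the variable name is just for illustration) rather than the table. PARSENAME handles at most four parts, so this approach assumes dataaddress never has more than four segments.

```sql
DECLARE @dataaddress varchar(20) = '0.2';

SELECT PARSENAME(@dataaddress, 2) AS left_id,   -- '0' (second part from the right)
       PARSENAME(@dataaddress, 1) AS right_id,  -- '2' (first part from the right)
       PARSENAME(@dataaddress, 3) AS missing;   -- NULL, there is no third part
```

The parts come back as strings, so the joins against the integer id column rely on implicit conversion.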

Put a table of values as an argument to a query

I need to do something like this:
SET @MyTableAsArgument = 'Foo;Bar\n1;2\n3;4\n'; -- CSV or any other table-format
SET @AnOtherArgument = 'somedata';
SELECT * FROM table1 t1, @MyTableAsArgument t2
WHERE t1.foo = t2.foo
AND t1.bar = @AnOtherArgument
Is there a way to do this?
The only other solution I see is:
Create a temporary table tmp1
Insert my MyTableAsArgument to the tmp1
Do My query on table1 and tmp1
Delete my temporary table tmp1
I am not sure if this is an abuse of temporary tables.
Is there a significant performance overhead with temporary tables as they are used for queries?
Can you use a table variable or temporary table?
DECLARE @MyTable TABLE (foo VARCHAR(255));
INSERT INTO @MyTable (foo)
VALUES ('Foo'), ('Bar\n12\n3'), ('4\n');
DECLARE @AnOtherArgument VARCHAR(255) = 'somedata';
SELECT t1.*
FROM table1 t1
WHERE t1.foo IN (SELECT foo FROM @MyTable) AND
t1.bar = @AnOtherArgument;
If this is not possible, you can use a split function. STRING_SPLIT() is built into SQL Server 2016 and later; for older versions, user-defined implementations are available on the web:
SELECT *
FROM table1 t1
WHERE t1.foo IN (SELECT foo FROM string_split(@MyTableLIST, ';') ss(foo)) AND
t1.bar = @AnOtherArgument;
@Cosinus, SQL Server also has table variables. A table variable lets you define the columns you want in the table, and you can insert rows into it using 'union all'.
See this mockup below.
DECLARE @MyTableAsArgument table(Name varchar(20), foo varchar(50), descb varchar(100), pj_id int)
DECLARE @AnOtherArgument varchar(20) = 'somedata';
INSERT into @MyTableAsArgument
SELECT 'James', 'iphone', 'cell phone', 1 union all
SELECT 'Michael', 'macbook', 'laptop', 2 union all
SELECT 'Henry', 'windows', 'os', 3
SELECT *
FROM
table1 t1
Join @MyTableAsArgument t2 ON
t1.foo = t2.foo
AND
t1.bar = @AnOtherArgument

update each row (sql server) based on subquery

I want to update each row of table1->keyField based on table2 value
Table1
Id|keyField
1|test_500
2|test_501
3|test_501
500 and 501 are primary keys of my other table, Table2
Table2
Id|value
500|A
501|B
502|C
I have tried something like
update table1 set keyField=(select value from table2 where id=substring(expression))
but my select returns multiple rows, so I'm unable to run the query.
Any help or direction, please?
You can use syntax like this:
UPDATE table1 SET keyField = table2.value
FROM table1 INNER JOIN table2
ON table2.Id = substring(expression)
If I get it right, this might be what you need:
UPDATE T1 SET
keyField = T2.Value
FROM
Table1 AS T1
INNER JOIN Table2 AS T2 ON T2.id = SUBSTRING(T1.keyField, 6, 100)
Be careful when comparing the substring result with a numeric value; you might get a conversion error.
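One way to sidestep the conversion problem is to compare as strings, converting the integer id rather than the substring. This is a minimal sketch, assuming keyField always has the form 'test_' followed by the id. The table variables and the 'test_abc' row are invented for illustration.

```sql
-- Illustrative stand-ins for Table1 and Table2.
DECLARE @Table1 TABLE (Id int, keyField varchar(10));
DECLARE @Table2 TABLE (Id int, value char(1));

INSERT INTO @Table1 VALUES (1, 'test_500'), (2, 'test_501'), (3, 'test_abc');
INSERT INTO @Table2 VALUES (500, 'A'), (501, 'B'), (502, 'C');

-- Casting T2.Id to varchar keeps the comparison string-to-string, so the
-- non-numeric suffix in row 3 simply fails to match instead of raising an error.
UPDATE T1
SET keyField = T2.value
FROM @Table1 AS T1
INNER JOIN @Table2 AS T2
    ON CAST(T2.Id AS varchar(10)) = SUBSTRING(T1.keyField, 6, 100);

-- Rows 1 and 2 now hold 'A' and 'B'; row 3 keeps 'test_abc'.
SELECT Id, keyField FROM @Table1 ORDER BY Id;
```

The trade-off is that wrapping the id column in a CAST may stop the optimizer from using an index on it, so this is worth testing on larger tables.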
Try this code (necessary notes are in comments below):
--generate some sample data (the same as you provided)
declare @table1 table (id int, keyField varchar(10))
insert into @table1 values (1,'test_500'),(2,'test_501'),(3,'test_502')
declare @table2 table (id int, value char(1))
insert into @table2 values (500,'A'),(501,'B'),(502,'C')
--in case you want to see tables first
--select * from @table1
--select * from @table2
--here you extract the number in first table in keyField column and match it with ID from second table, upon that, you update first table
update @table1 set keyField = value from @table2 [t2]
where cast(right(keyfield, len(keyfield) - charindex('_',keyfield)) as int) = [t2].id
select * from @table1

How do I update a table based off the generated index key of an insert?

I've created a temp table with most of the values I need to insert into a set of tables. From this temp table I have all the values I need for the insert to the first table, but the insert to the next table depends on the identity key generated by the insert to the first table.
I could very well just update my temp table after the first insert, but I'd like to try using the output clause.
I want something like this:
INSERT INTO Table1
<values from temp table>
OUTPUT <update my temp table with generated identity keys>
INSERT INTO Table2
<values from temp table including the output updated id column>
I think you'd be better off creating another temp table (or a table variable) and going from there, as shown below, because I don't think you can use the OUTPUT clause to update the same temp table you are inserting from.
CREATE TABLE TestTable (ID INT not null identity primary key,
TEXTVal VARCHAR(100))
create TABLE #tmp(ID INT, TEXTVal VARCHAR(100))
create TABLE #tmp1(ID INT, TEXTVal VARCHAR(100))
CREATE TABLE TestTable1 (ID INT not null, TEXTVal VARCHAR(100))
INSERT #tmp (ID, TEXTVal)
VALUES (1,'FirstVal')
INSERT #tmp (ID, TEXTVal)
VALUES (2,'SecondVal')
INSERT INTO TestTable (TEXTVal)
OUTPUT Inserted.ID, Inserted.TEXTVal INTO #tmp1
select TEXTVal from #tmp
INSERT INTO TestTable1 (ID, TEXTVal)
select ID, TEXTVal from #tmp1
You could merge your temp table into Table1, output the results to a table variable, and then insert the original data joined to that table variable into Table2.
Example:
DECLARE @MyIDs TABLE (TempTableID int NOT NULL, Table1ID int NOT NULL)
MERGE INTO Table1
USING TempTable AS Tmp
ON Table1.SomeValue = Tmp.SomeValue
WHEN NOT MATCHED THEN
INSERT (col1, col2, col3, col4, col5)
VALUES (tmp.col1, tmp.col2, tmp.col3, tmp.col4, tmp.col5)
OUTPUT Tmp.ID
,inserted.ID -- target columns are referenced through inserted/deleted in OUTPUT
INTO @MyIDs;
INSERT INTO Table2 (col1, col2, col3, col4, col5, Table1ID)
SELECT tmp.col1, tmp.col2, tmp.col3, tmp.col4, tmp.col5, new.Table1ID
FROM TempTable tmp
JOIN @MyIDs new ON tmp.ID = new.TempTableID

Refactoring SQL to avoid using TABLOCKX

I have two tables like this:
Table1                 Table2
----------------------------------
Table1Id IDENTITY      Table2Id
Table2Id NOT NULL      SomeStuff
                       SomeOtherStuff
With a foreign key constraint on Table2Id between them. It goes without saying (yet I'm saying it anyway) that a Table2 row needs to be inserted before its related Table1 row. The procedure that loads both tables works in bulk set operations, meaning I have a whole bunch of Table1 and Table2 data in a #temp table that was created with an IDENTITY column to keep track of things. I am currently doing the inserts like this (transaction and error handling omitted for brevity):
DECLARE @currentTable2Id INT
SET @currentTable2Id = IDENT_CURRENT('dbo.Table2')
INSERT INTO dbo.Table2 WITH (TABLOCKX)
( SomeStuff,
SomeOtherStuff
)
SELECT WhateverStuff,
WhateverElse
FROM #SomeTempTable
ORDER BY SomeTempTableId
INSERT INTO dbo.Table1
( Table2Id )
SELECT @currentTable2Id + SomeTempTableId
FROM #SomeTempTable
ORDER BY SomeTempTableId
This works fine, all of the relationships are sound after the inserts. However, due to the TABLOCKX, we are running into constant situations where people are waiting for each other's queries to finish, whether it be this "load" query, or other UPDATES and INSERTS (I'm using NOLOCK on selects). The nature of the project calls for a lot of data to be loaded, so there are times when this procedure can run for 20-30 minutes. There's nothing I can do about this performance. Trust me, I've tried.
I cannot use SET IDENTITY_INSERT ON, as the DBAs do not allow users to issue this command in production, and I think using IDENTITY_INSERT would require a TABLOCKX anyways. Is there any way I can do this sort of insert without using a TABLOCKX?
Make sure you have an ID field in #SomeTempTable. Create a new column TempID in Table2. Insert the ID from #SomeTempTable into TempID when you add rows to Table2. Use the TempID column in a join when you insert into Table1 to fetch the auto-incremented Table2ID.
Something like this:
alter table Table2 add TempID int
go
declare @SomeTempTable table(ID int identity, WhateverStuff int, WhateverElse int)
insert into @SomeTempTable values(1, 1)
insert into @SomeTempTable values(2, 2)
insert into Table2(SomeStuff, SomeOtherStuff, TempID)
select WhateverStuff, WhateverElse, ID
from @SomeTempTable
insert into Table1(Table2Id)
select Table2ID
from @SomeTempTable as S
inner join Table2 as T2
on S.ID = T2.TempID
go
alter table Table2 drop column TempID
Instead of adding and dropping the TempID column each time, you can leave it in place, but then you need to clear it before every run so that old values from previous runs don't interfere with your joins.
I assume that you're using tablockx in an attempt to prevent anything else from inserting into Table2 (and thus incrementing the identity value) for the duration of your process. Try this instead:
DECLARE @t TABLE (Table2Id int), @currentTable2Id int
INSERT INTO dbo.Table2
( SomeStuff,
SomeOtherStuff
)
OUTPUT INSERTED.Table2Id into @t
SELECT WhateverStuff,
WhateverElse
FROM #SomeTempTable
ORDER BY SomeTempTableId
-- base value just before the first generated id (assumes SomeTempTableId starts at 1)
SELECT @currentTable2Id = MIN(Table2Id) - 1 FROM @t
INSERT INTO dbo.Table1
( Table2Id )
SELECT @currentTable2Id + SomeTempTableId
FROM #SomeTempTable
ORDER BY SomeTempTableId
DELETE @t
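Adding an offset to SomeTempTableId only works if the generated identity values are contiguous, which is not guaranteed once other sessions can insert into Table2 during the load. One alternative, sketched here with the MERGE ... OUTPUT pattern, captures the temp-table id alongside each generated Table2Id, so neither arithmetic nor a table lock is needed. The temp-table definitions and the @map variable below are stand-ins invented for illustration. The ON 1 = 0 condition is simply a way to force every source row down the insert branch.

```sql
-- Stand-ins for the real tables; definitions are assumptions for illustration.
CREATE TABLE #Table2 (Table2Id int IDENTITY(1,1) PRIMARY KEY, SomeStuff int, SomeOtherStuff int);
CREATE TABLE #Table1 (Table1Id int IDENTITY(1,1) PRIMARY KEY, Table2Id int NOT NULL);
CREATE TABLE #SomeTempTable (SomeTempTableId int IDENTITY(1,1), WhateverStuff int, WhateverElse int);

INSERT INTO #SomeTempTable (WhateverStuff, WhateverElse) VALUES (10, 11), (20, 21);

DECLARE @map TABLE (SomeTempTableId int NOT NULL, Table2Id int NOT NULL);

-- ON 1 = 0 never matches, so every source row is inserted. Unlike INSERT ... OUTPUT,
-- MERGE lets the OUTPUT clause reference source columns such as src.SomeTempTableId.
MERGE INTO #Table2 AS tgt
USING #SomeTempTable AS src
    ON 1 = 0
WHEN NOT MATCHED BY TARGET THEN
    INSERT (SomeStuff, SomeOtherStuff)
    VALUES (src.WhateverStuff, src.WhateverElse)
OUTPUT src.SomeTempTableId, inserted.Table2Id INTO @map (SomeTempTableId, Table2Id);

-- Use the captured mapping instead of IDENT_CURRENT arithmetic.
INSERT INTO #Table1 (Table2Id)
SELECT m.Table2Id
FROM @map AS m
ORDER BY m.SomeTempTableId;

SELECT Table1Id, Table2Id FROM #Table1;

DROP TABLE #Table1;
DROP TABLE #Table2;
DROP TABLE #SomeTempTable;
```

If Table1 needs other columns from the temp table, join @map back to #SomeTempTable on SomeTempTableId in the second insert.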