Using merge to combine matching records - sql

I'm trying to combine matching records from a table into a single record of another table. I know this can be done with group by, and sum(), max(), etc..., My difficulty is that the columns that are not part of the group by are varchars that i need to concatenate.
I'm using Sybase ASE 15, so I do not have a function like MySQL's group_concat or similar.
I tried using merge without luck, the target table ended with the same number of records of source table.
create table #source_t(account varchar(10), event varchar(10))
Insert into #source_t(account, event) values ('account1','event 1')
Insert into #source_t(account, event) values ('account1','event 2')
Insert into #source_t(account, event) values ('account1','event 3')
create table #target(account varchar(10), event_list varchar(2048))
merge into #target as t
using #source_t as s
on t.account = s.account
when matched then update set event_list = t.event_list + ' | ' + s.event
when not matched then insert(account, event_list) values (s.account, s.event)
select * from #target
drop table #target
drop table #source_t
Considering the above tables, I wanted to have one record per account, with all the events of the account concatenated in the second column.
account, event_list
'account1', 'event 1 | event 2 | event 3'
However, all I've got is the same records as #source.
It seems to me that the match in merge is attempted against the "state" of the table at the beginning of statement execution, so the when matched never executes. Is there a way of telling the DBMS to match against the updated target table?
I managed to obtain the results I needed by using a cursor, so the merge statement is executed n times, n being the number of records in #source, thus the merge actually executes the when matched part.
The problem with it is the performance, removing duplicates this way takes about 5 minutes to combine 63K records into 42K.
Is there a faster way of achieving this?

There's a little known (poorly documented?) aspect of the UPDATE statement when using it to update a #variable which allows you to accumulate/concatenate values in the #variable as part of a set-based UPDATE operation.
This is easier to 'explain' with an example:
create table source
(account varchar(10)
,event varchar(10)
)
go
insert source values ('account1','event 1')
insert source values ('account1','event 2')
insert source values ('account1','event 3')
insert source values ('account2','event 1')
insert source values ('account3','event 1')
insert source values ('account3','event 2')
go
declare #account varchar(10),
#event_list varchar(40) -- increase the size to your expected max length
select #account = 'account1'
-- allow our UPDATE statement to cycle through the events for 'account1',
-- appending each successive event to #event_list
update source
set #event_list = #event_list +
case when #event_list is not NULL then ' | ' end +
event
from source
where account = #account
-- we'll display as a single-row result set; we could also use a 'print' statement ...
-- just depends on what format the calling process is looking for
select #account as account,
#event_list as event_list
go
account event_list
---------- ----------------------------------------
account1 event 1 | event 2 | event 3
PRO:
single UPDATE statement to process a single account value
CON:
still need a cursor to process a series of account values
if your desired final output is a single result set then you'll need to store intermediate results (eg, #account and #update) in a (temp) table, then run a final SELECT against this (temp) table to produce the desired result set
while you're not actually updating the physical table, you may run into problems if you don't have access to 'update' the table
NOTE: You could put the cursor/UPDATE logic in a stored proc, call the proc through a proxy table, and this would allow the output from a series of 'select #account,#update' statements to be returned to the calling process as a single result set ... but that's a whole 'nother topic on a (somewhat) convoluted coding method.
For your process you'll need a cursor to loop through your unique set of account values, but you'll be able to eliminate the cursor overhead for looping through the list of events for a given account. Net result is that you should see some improvement in the time it takes to run your process.

After applying the given suggestions, and also speaking with our DBA, the winner idea was to ditch the merge and use logical conditions over the loop.
Adding begin/commit seemed to reduce execution time by 1.5 to 3 seconds.
Adding a primary key to the target table did the best reduction, reducing execution time to about 13 seconds.
Converting the merge to conditional logic was the best option in this case, achieving the result in about 8 seconds.
When using conditionals, the primary key in target table is detrimental by a small amount (around 1 sec), but having it drastically reduces time afterwards, since this table is only a previous step for a big join. (That is, the result of this record-merging is latter used in a join with 11+ tables.) So I kept the P.K.
Since there seems to be no solution without a cursor loop, I used the conditionals to merge the values using variables and issuing only the inserts to the target table, thus eliminating the need to seek a record to update or to check its existence.
Here is a simplified example.
create table #source_t(account varchar(10), event varchar(10));
Insert into #source_t(account, event) values ('account1','event 1');
Insert into #source_t(account, event) values ('account1','event 2');
Insert into #source_t(account, event) values ('account1','event 3');
Insert into #source_t(account, event) values ('account2','came');
Insert into #source_t(account, event) values ('account2','saw');
Insert into #source_t(account, event) values ('account2','conquered');
create table #target(
account varchar(10), -- make primary key if the result is to be joined afterwards.
event_list varchar(2048)
);
declare ciclo cursor for
select account, event
from #source_t c
order by account --,...
for read only;
declare #account varchar(10), #event varchar(40), #last_account varchar(10), #event_list varchar(1000)
open ciclo
fetch ciclo into #account, #event
set #last_account = #account, #event_list = null
begin tran
while ##sqlstatus = 0 BEGIN
if #last_account <> #account begin -- if current record's account is different from previous, insert into table the concatenated event string
insert into #target(account, event_list) values (#last_account, #event_list)
set #event_list = null -- Empty the string for the next account
end
set #last_account = #account -- Copy current account to the variable that holds the previous one
set #event_list = case #event_list when null then #event else #event_list + ' | ' + #event end -- Concatenate events with separator
fetch ciclo into #account, #event
END
-- after the last fetch, ##sqlstatus changes to <> 0, the values remain in the variables but the loop ends, leaving the last record unprocessed.
insert into #target(account, event_list) values (#last_account, #event_list)
commit tran
close ciclo
deallocate cursor ciclo;
select * from #target;
drop table #target;
drop table #source_t;
Result:
account |event_list |
--------|---------------------------|
account1|event 1 | event 2 | event 3|
account2|saw | came | conquered |
This code worked fast enough in my real use case. However it could be further optimized by filtering the source table to hold only the values que would be necessary for the join afterward. For that matter I saved the final joined resultset (minus the join with #target) in another temp table, leaving some columns blank. Then #source_t was filled using only the accounts present in the resultset, processed it to #target and finally used #target to update the final result. With all that, execution time for production environment was dropped to around 8 seconds (including all steps).
UDF Solutions have solved this kind of problem for me before but in ASE 15 they have to be table-specific and need to write one function per column. Also, that is only possible in development environment, not authorized for production because of read only privileges.
In conclusion, A cursor loop combined with a merge statement is a simple solution for combining records using concatenation of certain values. A primary key or an index including the columns used for the match is required to boost performance.
Conditional logic results in even better performance but comes at the penalty of more complex code (the more you code, the more prone to error).

Related

How to INSERT INTO one table and one of the columns automatically 'INSERT INTO' another table? [duplicate]

My database contains three tables called Object_Table, Data_Table and Link_Table. The link table just contains two columns, the identity of an object record and an identity of a data record.
I want to copy the data from DATA_TABLE where it is linked to one given object identity and insert corresponding records into Data_Table and Link_Table for a different given object identity.
I can do this by selecting into a table variable and the looping through doing two inserts for each iteration.
Is this the best way to do it?
Edit : I want to avoid a loop for two reason, the first is that I'm lazy and a loop/temp table requires more code, more code means more places to make a mistake and the second reason is a concern about performance.
I can copy all the data in one insert but how do get the link table to link to the new data records where each record has a new id?
In one statement: No.
In one transaction: Yes
BEGIN TRANSACTION
DECLARE #DataID int;
INSERT INTO DataTable (Column1 ...) VALUES (....);
SELECT #DataID = scope_identity();
INSERT INTO LinkTable VALUES (#ObjectID, #DataID);
COMMIT
The good news is that the above code is also guaranteed to be atomic, and can be sent to the server from a client application with one sql string in a single function call as if it were one statement. You could also apply a trigger to one table to get the effect of a single insert. However, it's ultimately still two statements and you probably don't want to run the trigger for every insert.
You still need two INSERT statements, but it sounds like you want to get the IDENTITY from the first insert and use it in the second, in which case, you might want to look into OUTPUT or OUTPUT INTO: http://msdn.microsoft.com/en-us/library/ms177564.aspx
The following sets up the situation I had, using table variables.
DECLARE #Object_Table TABLE
(
Id INT NOT NULL PRIMARY KEY
)
DECLARE #Link_Table TABLE
(
ObjectId INT NOT NULL,
DataId INT NOT NULL
)
DECLARE #Data_Table TABLE
(
Id INT NOT NULL Identity(1,1),
Data VARCHAR(50) NOT NULL
)
-- create two objects '1' and '2'
INSERT INTO #Object_Table (Id) VALUES (1)
INSERT INTO #Object_Table (Id) VALUES (2)
-- create some data
INSERT INTO #Data_Table (Data) VALUES ('Data One')
INSERT INTO #Data_Table (Data) VALUES ('Data Two')
-- link all data to first object
INSERT INTO #Link_Table (ObjectId, DataId)
SELECT Objects.Id, Data.Id
FROM #Object_Table AS Objects, #Data_Table AS Data
WHERE Objects.Id = 1
Thanks to another answer that pointed me towards the OUTPUT clause I can demonstrate a solution:
-- now I want to copy the data from from object 1 to object 2 without looping
INSERT INTO #Data_Table (Data)
OUTPUT 2, INSERTED.Id INTO #Link_Table (ObjectId, DataId)
SELECT Data.Data
FROM #Data_Table AS Data INNER JOIN #Link_Table AS Link ON Data.Id = Link.DataId
INNER JOIN #Object_Table AS Objects ON Link.ObjectId = Objects.Id
WHERE Objects.Id = 1
It turns out however that it is not that simple in real life because of the following error
the OUTPUT INTO clause cannot be on
either side of a (primary key, foreign
key) relationship
I can still OUTPUT INTO a temp table and then finish with normal insert. So I can avoid my loop but I cannot avoid the temp table.
I want to stress on using
SET XACT_ABORT ON;
for the MSSQL transaction with multiple sql statements.
See: https://msdn.microsoft.com/en-us/library/ms188792.aspx
They provide a very good example.
So, the final code should look like the following:
SET XACT_ABORT ON;
BEGIN TRANSACTION
DECLARE #DataID int;
INSERT INTO DataTable (Column1 ...) VALUES (....);
SELECT #DataID = scope_identity();
INSERT INTO LinkTable VALUES (#ObjectID, #DataID);
COMMIT
It sounds like the Link table captures the many:many relationship between the Object table and Data table.
My suggestion is to use a stored procedure to manage the transactions. When you want to insert to the Object or Data table perform your inserts, get the new IDs and insert them to the Link table.
This allows all of your logic to remain encapsulated in one easy to call sproc.
If you want the actions to be more or less atomic, I would make sure to wrap them in a transaction. That way you can be sure both happened or both didn't happen as needed.
You might create a View selecting the column names required by your insert statement, add an INSTEAD OF INSERT Trigger, and insert into this view.
Before being able to do a multitable insert in Oracle, you could use a trick involving an insert into a view that had an INSTEAD OF trigger defined on it to perform the inserts. Can this be done in SQL Server?
Insert can only operate on one table at a time. Multiple Inserts have to have multiple statements.
I don't know that you need to do the looping through a table variable - can't you just use a mass insert into one table, then the mass insert into the other?
By the way - I am guessing you mean copy the data from Object_Table; otherwise the question does not make sense.
//if you want to insert the same as first table
$qry = "INSERT INTO table (one, two, three) VALUES('$one','$two','$three')";
$result = #mysql_query($qry);
$qry2 = "INSERT INTO table2 (one,two, three) VVALUES('$one','$two','$three')";
$result = #mysql_query($qry2);
//or if you want to insert certain parts of table one
$qry = "INSERT INTO table (one, two, three) VALUES('$one','$two','$three')";
$result = #mysql_query($qry);
$qry2 = "INSERT INTO table2 (two) VALUES('$two')";
$result = #mysql_query($qry2);
//i know it looks too good to be right, but it works and you can keep adding query's just change the
"$qry"-number and number in #mysql_query($qry"")
I have 17 tables this has worked in.
-- ================================================
-- Template generated from Template Explorer using:
-- Create Procedure (New Menu).SQL
--
-- Use the Specify Values for Template Parameters
-- command (Ctrl-Shift-M) to fill in the parameter
-- values below.
--
-- This block of comments will not be included in
-- the definition of the procedure.
-- ================================================
SET ANSI_NULLS ON
GO
SET QUOTED_IDENTIFIER ON
GO
ALTER PROCEDURE InsetIntoTwoTable
(
#name nvarchar(50),
#Email nvarchar(50)
)
AS
BEGIN
SET NOCOUNT ON;
insert into dbo.info(name) values (#name)
insert into dbo.login(Email) values (#Email)
END
GO

refactoring sql while loop to regular inserts

I am inserting parent records and child records at the same time in a stored procedure.
Rather than have outside code make nested calls to create each parent and then each child of that parent (which is even slower than my current approach), I am giving the sql a comma separated list of child types that I put into a temp table (#TempParentChildUpdateTable) which is associated with a parent record in a table value parameter (#PenguinParentChildUpdate).
Then I loop through both, insert the parent and then insert all related children.
The problem is this while loop is not performing very well at 13 requests per second.
How do I make this faster? How do I take out the while loop?
Is there a way to do this with non-looping inserts? If so, can the string parsing happen inside the inner insert?
DECLARE #RowCnt int = 0;
DECLARE #CounterId int = 1;
SELECT #RowCnt = COUNT(*) FROM #PenguinParentChildUpdate;
WHILE #CounterId <= #RowCnt
BEGIN
SELECT #GrandparentId = WorkflowInstanceId,
#ParentType = ParentType,
#ChildIds = ChildIds
FROM #PenguinParentChildUpdate
-- insert parent record
INSERT INTO WorkflowInstanceRole (ParentType, GrandparentId)
VALUES(#ParentType, #GrandparentId)
SET #ParentId = SCOPE_IDENTITY()
-- identify children
-- convert comma separated list (e.g. 4,91,12,6) into separate rows
INSERT INTO #TempParentChildUpdateTable (sq.ChildId, sq.ParentId)
select convert(int, value) ChildId, #ParentId RoleId FROM string_split(#ChildIds, CHAR(44))
-- insert children 7787+ =
IF (#ChildIds IS NOT NULL AND LEN(LTRIM(RTRIM(#ChildIds))) > 0)
BEGIN
INSERT INTO dbo.PenguinParentChild
(
grandparentId,
childid,
)
select
#ParentId,
childId,
from
#TempParentChildUpdateTable tsru
where
tsru.ParentId = #ParentId
END
SET #CounterId = #CounterId + 1
END
I am struggling to follow some of the logic in the loop, but I think the premise is
Insert to WorkFlowInstanceRole
Capture the inserted records then insert further children based on this.
Since step 2 requires data that is not in WorkFlowInstanceRole you need to use MERGE to add the new rows, rather than a standard insert. What this allows you to do is capture the ChildIds from the source table, even though you aren't inserting them. Something like this should do it (I've had to guess at some data types):
DECLARE #Output TABLE
(
ParentId INT,
ParentType INT,
GrandParentId INT,
ChildIDs VARCHAR(MAX)
);
MERGE INTO WorkflowInstanceRole AS t
USING #PenguinParentChildUpdate AS s
ON 1 = 0 -- Will never be true so will always insert
WHEN NOT MATCHED THEN
INSERT (ParentType, GrandparentId)
VALUES (s.ParentType, s.WorkflowInstanceId)
OUTPUT inserted.ParentId, inserted.ParentType, inserted.GrandParentId, s.ChilDIds
INTO #Output (ParentId,ParentType, GrandParentId, ChildIDs);
INSERT INTO dbo.PenguinParentChild (GrandParentId, ChildId)
SELECT o.ParentId,
CONVERT(int, ss.value)
FROM #Output AS o
CROSS APPLY STRING_SPLIT(i.ChileIds, CHAR(44)) AS ss;
The key part is the MERGE* really, since the condition is 1=0 then this will always insert. Unlike INSERT the OUTPUT clause on a merge will allow you to capture both the newly inserted identity value, and the non-inserted ChildIds column.
This is output into a temporary table, which you can then use, along with CROSS APPLY STRING_SPLIT() to insert the child records.
There may be some data errors, and logic may not be 100% perfect, but this should hopefully be a bit step in the right direction.
*MERGE has a number of known bugs, and I'd generally advise to steer clear, but I am not aware of any alternative that would allow you to capture the newly inserted identity value, and the non-inserted ChildIds column, and as far as I am aware none of these bugs affect this operation (Anecdotally, in that I have never encountered an issue using this method).

Get IDENTITY value in the same T-SQL statement it is created in?

I was asked if you could have an insert statement, which had an ID field that was an "identity" column, and if the value that was assigned could also be inserted into another field in the same record, in the same insert statement.
Is this possible (SQL Server 2008r2)?
Thanks.
You cannot really do this - because the actual value that will be used for the IDENTITY column really only is fixed and set when the INSERT has completed.
You could however use e.g. a trigger
CREATE TRIGGER trg_YourTableInsertID ON dbo.YourTable
AFTER INSERT
AS
UPDATE dbo.YourTable
SET dbo.YourTable.OtherID = i.ID
FROM dbo.YourTable t2
INNER JOIN INSERTED i ON i.ID = t2.ID
This would fire right after any rows have been inserted, and would set the OtherID column to the values of the IDENTITY columns for the inserted rows. But it's strictly speaking not within the same statement - it's just after your original statement.
You can do this by having a computed column in your table:
DECLARE #QQ TABLE (ID INT IDENTITY(1,1), Computed AS ID PERSISTED, Letter VARCHAR (1))
INSERT INTO #QQ (Letter)
VALUES ('h'),
('e'),
('l'),
('l'),
('o')
SELECT *
FROM #QQ
1 1 h
2 2 e
3 3 l
4 4 l
5 5 o
About the cheked answer:
You cannot really do this - because the actual value that will be used
for the IDENTITY column really only is fixed and set when the INSERT
has completed.
marc_s I suppose, you are not actually right. Yes, He can! ))
The way to solution is IDENT_CURRENT():
CREATE TABLE TemporaryTable(
Id int PRIMARY KEY IDENTITY(1,1) NOT NULL,
FkId int NOT NULL
)
ALTER TABLE TemporaryTable
ADD CONSTRAINT [Fk_const] FOREIGN KEY (FkId) REFERENCES [TemporaryTable] ([Id])
INSERT INTO TemporaryTable (FkId) VALUES (IDENT_CURRENT('[TemporaryTable]'))
INSERT INTO TemporaryTable (FkId) VALUES (IDENT_CURRENT('[TemporaryTable]'))
INSERT INTO TemporaryTable (FkId) VALUES (IDENT_CURRENT('[TemporaryTable]'))
INSERT INTO TemporaryTable (FkId) VALUES (IDENT_CURRENT('[TemporaryTable]'))
UPDATE TemporaryTable
SET [FkId] = 3
WHERE Id = 2
SELECT * FROM TemporaryTable
DROP TABLE TemporaryTable
More over, you can even use IDENT_CURRENT() as DEFAULT CONSTRAINT and it works instead of SCOPE_IDENTITY() for example. Try this:
CREATE TABLE TemporaryTable(
Id int PRIMARY KEY IDENTITY(1,1) NOT NULL,
FkId int NOT NULL DEFAULT IDENT_CURRENT('[TemporaryTable]')
)
ALTER TABLE TemporaryTable
ADD CONSTRAINT [Fk_const] FOREIGN KEY (FkId) REFERENCES [TemporaryTable] ([Id])
INSERT INTO TemporaryTable (FkId) VALUES (DEFAULT)
INSERT INTO TemporaryTable (FkId) VALUES (DEFAULT)
INSERT INTO TemporaryTable (FkId) VALUES (DEFAULT)
INSERT INTO TemporaryTable (FkId) VALUES (DEFAULT)
UPDATE TemporaryTable
SET [FkId] = 3
WHERE Id = 2
SELECT * FROM TemporaryTable
DROP TABLE TemporaryTable
You can do both.
To insert rows with a column "identity", you need to set identity_insert off.
Note that you still can't duplicate values!
You can see the command here.
Be aware to set identity_insert on afterwards.
To create a table with the same record, you simply need to:
create new column;
insert it with null value or other thing;
update that column after inserts with the value of the identity column.
If you need to insert the value at the same time, you can use the ##identity global variable. It'll give you the last inserted. So I think you need to do a ##identity + 1. In this case it can give wrong values because the ##identity is for all tables. So it'll count if the insert occurs in another table with identity.
Another solution is to get the max id and add one :) and you get the needed value!
use this simple code
`SCOPE_IDENTITY()+1
I know the original post was a long while ago. But, the top-most solution is using a trigger to update the field after the record has been inserted and I think there is a more efficient method.
Using a trigger for this has always bugged me. It always has seemed like there must be a better way. That trigger basically makes every insert perform 2 writes to the database, (1) the insert, and then (2) the update of the 2nd int. The trigger is also doing a join back into the table. This is a bit of overhead to have especially for a large database and large tables. And I suspect that as the table gets larger, the overhead of this approach does also. Maybe I'm wrong on that. But, it just doesn't seem like a good solution on a large table.
I wrote a function fn_GetIdent that can be used for this. It's funny how simple it is but really was some work to figure out. I stumbled onto this eventually. It turns out that calling IDENT_CURRENT(#variableTableName) from within a function that is called from the INSERT statements SET value assignment clause acts differently than if you call IDENT_CURRENT(#variableTableName) from the INSERT statement directly. And it makes it where you can get the new identity value for the record that you are inserting.
There is one caveat. When the identity is NULL (ie - an empty table with no records) it acts a little differently since the sys.identity_columns.last_value is NULL. So, you have to handle the very first record entered a little differently. I put code in the function to address that, and now it works.
This works because each call to the function, even within the same INSERT statement, is in it's own new "scope" within the function. (I believe that is the correct explanation). So, you can even insert multiple rows with one INSERT statement using this function. If you call IDENT_CURRENT(#variableTableName) from the INSERT statement directly, it will assign the same value for the newID in all rows. This is because the identity gets updated after the entire INSERT statement finishes processing (within the same scope). But, calling IDENT_CURRENT(#variableTableName) from within a function causes each insert to update the identity value with each row entered. But, it's all done in a function call from the INSERT statement itself. So, it's easy to implement once you have the function created.
This approach is a call to a function (from the INSERT statement) which does one read from the sys.identity_columns.last_value (to see if it is NULL and if a record exists) within the function and then calling IDENT_CURRENT(#variableTableName) and then returning out of the function to the INSERT statement to insert the row. So, it is one small read (for each row INSERTED) and then the one write of the insert which is less overhead than the trigger approach I think. The trigger approach could be rather inefficient if you use that for all tables in a large database with large tables. I haven't done any performance analysis on it compared to the trigger. But, I think this would be a lot more efficient, especially on large tables.
I've been testing it out and this seems to work in all cases. I would welcome feedback as to whether anyone finds where this doesn't work or if there is any problem with this approach. Can anyone can shoot holes in this approach? If so, please let me know. If not, could you vote it up? I think it is a better approach.
So, maybe being holed up due to COVID-19 out there, turned out to be productive for something. Thank you Microsoft for keeping me occupied. Anyone hiring? :) No, seriously, anyone hiring? OK, so now what am I going to do with myself now that I am done with this? :) Wishing everyone safe times out there.
Here is the code below. Wondering if this approach has any holes in it. Feedback welcomed.
IF OBJECT_ID('dbo.fn_GetIdent') IS NOT NULL
DROP FUNCTION dbo.fn_GetIdent;
GO
CREATE FUNCTION dbo.fn_GetIdent(#inTableName AS VARCHAR(MAX))
RETURNS Int
WITH EXECUTE AS CALLER
AS
BEGIN
DECLARE #tableHasIdentity AS Int
DECLARE #tableIdentitySeedValue AS Int
/*Check if the tables identity column is null - a special case*/
SELECT
#tableHasIdentity = CASE identity_columns.last_value WHEN NULL THEN 0 ELSE 1 END,
#tableIdentitySeedValue = CONVERT(int, identity_columns.seed_value)
FROM sys.tables
INNER JOIN sys.identity_columns
ON tables.object_id = identity_columns.object_id
WHERE identity_columns.is_identity = 1
AND tables.type = 'U'
AND tables.name = #inTableName;
DECLARE #ReturnValue AS Int;
SET #ReturnValue = CASE #tableHasIdentity WHEN 0 THEN #tableIdentitySeedValue
ELSE IDENT_CURRENT(#inTableName)
END;
RETURN (#ReturnValue);
END
GO
/* The function above only has to be created the one time to be used in the example below */
DECLARE #TableHasRows AS Bit
DROP TABLE IF EXISTS TestTable
CREATE TABLE TestTable (ID INT IDENTITY(1,1),
New INT,
Letter VARCHAR (1))
INSERT INTO TestTable (New, Letter)
VALUES (dbo.fn_GetIdent('TestTable'), 'H')
INSERT INTO TestTable (New, Letter)
VALUES (dbo.fn_GetIdent('TestTable'), 'e')
INSERT INTO TestTable (New, Letter)
VALUES (dbo.fn_GetIdent('TestTable'), 'l'),
(dbo.fn_GetIdent('TestTable'), 'l'),
(dbo.fn_GetIdent('TestTable'), 'o')
INSERT INTO TestTable (New, Letter)
VALUES (dbo.fn_GetIdent('TestTable'), ' '),
(dbo.fn_GetIdent('TestTable'), 'W'),
(dbo.fn_GetIdent('TestTable'), 'o'),
(dbo.fn_GetIdent('TestTable'), 'r'),
(dbo.fn_GetIdent('TestTable'), 'l'),
(dbo.fn_GetIdent('TestTable'), 'd')
INSERT INTO TestTable (New, Letter)
VALUES (dbo.fn_GetIdent('TestTable'), '!')
SELECT * FROM TestTable
/*
Result
ID New Letter
1 1 H
2 2 e
3 3 l
4 4 l
5 5 o
6 6
7 7 W
8 8 o
9 9 r
10 10 l
11 11 d
12 12 !
*/

At what point should a Table Variable be abandoned for a Temp Table?

Is there a row count that makes table variable's inefficient or what? I understand the differences between the two and I've seen some different figures on when that point is reached, but I'm curious if anyone knows.
When you need other indices on the table other than those that can be created on a temp table variable, or, for larger datasets (which are not likely to be persisted in available memory), when the table width (number of bytes per row) exceeds some threshold (This is because the number or rows of data per I/O page shrinks and the performance decreases... or if the changes you plan on making to the dataset need to be part of a multi-statement transaction which may need to be rolled back. (changes to Table variables are not written to the transaction log, changes to temp tables are...)
this code demonstrates that table variables are not stored in Transaction log:
create table #T (s varchar(128))
declare #T table (s varchar(128))
insert into #T select 'old value #'
insert into #T select 'old value #'
begin transaction
update #T set s='new value #'
update #T set s='new value #'
rollback transaction
select * from #T
select * from #T
Internally, table variables can be instantiated in tempdb as well as the temporary tables.
They differ in scope and persistence only.
Contrary to the popular belief, operations to the temp tables do affect the transaction log, despite the fact they are not subject to the transaction control.
To check it, run this simple query:
DECLARE #mytable TABLE (id INT NOT NULL PRIMARY KEY)
;
WITH q(num) AS
(
SELECT 1
UNION ALL
SELECT num + 1
FROM q
WHERE num <= 42
)
INSERT
INTO #mytable (id)
SELECT num
FROM q
OPTION (MAXRECURSION 0)
DBCC LOG(tempdb, -1)
GO
DBCC LOG(tempdb, -1)
GO
and browse the last entries from both recordsets.
In the first recordset, you will see 42 LOP_INSERT_ROWS entries.
In the second recordset (which is in another batch) you will see 42 LOP_DELETE_ROWS entries.
They are the result of the table variable getting out of scope and its record being deleted.

SQL Server: Is it possible to insert into two tables at the same time?

My database contains three tables called Object_Table, Data_Table and Link_Table. The link table just contains two columns, the identity of an object record and an identity of a data record.
I want to copy the data from DATA_TABLE where it is linked to one given object identity and insert corresponding records into Data_Table and Link_Table for a different given object identity.
I can do this by selecting into a table variable and the looping through doing two inserts for each iteration.
Is this the best way to do it?
Edit : I want to avoid a loop for two reason, the first is that I'm lazy and a loop/temp table requires more code, more code means more places to make a mistake and the second reason is a concern about performance.
I can copy all the data in one insert but how do get the link table to link to the new data records where each record has a new id?
In one statement: No.
In one transaction: Yes
BEGIN TRANSACTION
DECLARE #DataID int;
INSERT INTO DataTable (Column1 ...) VALUES (....);
SELECT #DataID = scope_identity();
INSERT INTO LinkTable VALUES (#ObjectID, #DataID);
COMMIT
The good news is that the above code is also guaranteed to be atomic, and can be sent to the server from a client application with one sql string in a single function call as if it were one statement. You could also apply a trigger to one table to get the effect of a single insert. However, it's ultimately still two statements and you probably don't want to run the trigger for every insert.
You still need two INSERT statements, but it sounds like you want to get the IDENTITY from the first insert and use it in the second, in which case, you might want to look into OUTPUT or OUTPUT INTO: http://msdn.microsoft.com/en-us/library/ms177564.aspx
The following sets up the situation I had, using table variables.
DECLARE #Object_Table TABLE
(
Id INT NOT NULL PRIMARY KEY
)
DECLARE #Link_Table TABLE
(
ObjectId INT NOT NULL,
DataId INT NOT NULL
)
DECLARE #Data_Table TABLE
(
Id INT NOT NULL Identity(1,1),
Data VARCHAR(50) NOT NULL
)
-- create two objects '1' and '2'
INSERT INTO #Object_Table (Id) VALUES (1)
INSERT INTO #Object_Table (Id) VALUES (2)
-- create some data
INSERT INTO #Data_Table (Data) VALUES ('Data One')
INSERT INTO #Data_Table (Data) VALUES ('Data Two')
-- link all data to first object
INSERT INTO #Link_Table (ObjectId, DataId)
SELECT Objects.Id, Data.Id
FROM #Object_Table AS Objects, #Data_Table AS Data
WHERE Objects.Id = 1
Thanks to another answer that pointed me towards the OUTPUT clause I can demonstrate a solution:
-- now I want to copy the data from from object 1 to object 2 without looping
INSERT INTO #Data_Table (Data)
OUTPUT 2, INSERTED.Id INTO #Link_Table (ObjectId, DataId)
SELECT Data.Data
FROM #Data_Table AS Data INNER JOIN #Link_Table AS Link ON Data.Id = Link.DataId
INNER JOIN #Object_Table AS Objects ON Link.ObjectId = Objects.Id
WHERE Objects.Id = 1
It turns out however that it is not that simple in real life because of the following error
the OUTPUT INTO clause cannot be on
either side of a (primary key, foreign
key) relationship
I can still OUTPUT INTO a temp table and then finish with normal insert. So I can avoid my loop but I cannot avoid the temp table.
I want to stress on using
SET XACT_ABORT ON;
for the MSSQL transaction with multiple sql statements.
See: https://msdn.microsoft.com/en-us/library/ms188792.aspx
They provide a very good example.
So, the final code should look like the following:
SET XACT_ABORT ON;
BEGIN TRANSACTION
DECLARE #DataID int;
INSERT INTO DataTable (Column1 ...) VALUES (....);
SELECT #DataID = scope_identity();
INSERT INTO LinkTable VALUES (#ObjectID, #DataID);
COMMIT
It sounds like the Link table captures the many:many relationship between the Object table and Data table.
My suggestion is to use a stored procedure to manage the transactions. When you want to insert to the Object or Data table perform your inserts, get the new IDs and insert them to the Link table.
This allows all of your logic to remain encapsulated in one easy to call sproc.
If you want the actions to be more or less atomic, I would make sure to wrap them in a transaction. That way you can be sure both happened or both didn't happen as needed.
You might create a View selecting the column names required by your insert statement, add an INSTEAD OF INSERT Trigger, and insert into this view.
Before being able to do a multitable insert in Oracle, you could use a trick involving an insert into a view that had an INSTEAD OF trigger defined on it to perform the inserts. Can this be done in SQL Server?
Insert can only operate on one table at a time. Multiple Inserts have to have multiple statements.
I don't know that you need to do the looping through a table variable - can't you just use a mass insert into one table, then the mass insert into the other?
By the way - I am guessing you mean copy the data from Object_Table; otherwise the question does not make sense.
//if you want to insert the same as first table
$qry = "INSERT INTO table (one, two, three) VALUES('$one','$two','$three')";
$result = #mysql_query($qry);
$qry2 = "INSERT INTO table2 (one,two, three) VVALUES('$one','$two','$three')";
$result = #mysql_query($qry2);
//or if you want to insert certain parts of table one
$qry = "INSERT INTO table (one, two, three) VALUES('$one','$two','$three')";
$result = #mysql_query($qry);
$qry2 = "INSERT INTO table2 (two) VALUES('$two')";
$result = #mysql_query($qry2);
//i know it looks too good to be right, but it works and you can keep adding query's just change the
"$qry"-number and number in #mysql_query($qry"")
I have 17 tables this has worked in.
-- ================================================
-- Template generated from Template Explorer using:
-- Create Procedure (New Menu).SQL
--
-- Use the Specify Values for Template Parameters
-- command (Ctrl-Shift-M) to fill in the parameter
-- values below.
--
-- This block of comments will not be included in
-- the definition of the procedure.
-- ================================================
SET ANSI_NULLS ON
GO
SET QUOTED_IDENTIFIER ON
GO
ALTER PROCEDURE InsetIntoTwoTable
(
#name nvarchar(50),
#Email nvarchar(50)
)
AS
BEGIN
SET NOCOUNT ON;
insert into dbo.info(name) values (#name)
insert into dbo.login(Email) values (#Email)
END
GO