When using OUTPUT in an insert statement, if you specify the order by in the select does it honor the order specified in the select?

I am inserting data into two tables. Each insert has an OUTPUT into a #temp table, and each #temp table has an identity column. The select that generates the data has the same ORDER BY for each insert. Later on I join the two #temp tables on the identity column. I would expect the identity numbers to line up, since the same ORDER BY is specified on both inserts. Every once in a long while those numbers don't match up, and the only thing I can think of is that OUTPUT doesn't always honor the ORDER BY in the select statements when writing the OUTPUT data to the temp tables.
CREATE TABLE #TempTable
(
RowNumber Integer IDENTITY (1,1) NOT NULL,
TableID Integer,
CONSTRAINT PK_TableID PRIMARY KEY NONCLUSTERED (RowNumber)
)
INSERT INTO Table
(column1,column2,column3,etc)
OUTPUT
INSERTED.ID
INTO #TempTable
(TableID)
SELECT
column1,column2,column3,etc
FROM
Other table
ORDER BY
SourceFlag,
StoreID,
storenumber,
EstablishDate,
TableID
What I would expect is that the statements would insert for example 25 rows in both statements in the same order 1 through 25. Then I should be able to join based on the row number 1 = 1, 25= 25, etc. in order to get the matching data. What I think is happening is somehow that order is getting messed up, so that row #1 from the first insert really matches say row #14 from the second, so when I later join 1 on 1 I'm getting mismatched data.

Apparently, it doesn't:
However, SQL Server does not guarantee the order in which rows are
processed and returned by DML statements using the OUTPUT clause.
You need to identify a natural key in your data and then reference it to match the newly inserted rows with the OUTPUT resultset.
Alternatively, you can replace the INSERT with MERGE; in this case, you will be able to catch the newly created identity values for your records in the OUTPUT clause.
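The natural-key approach can be sketched as follows. This is only an illustration under assumptions: dbo.SourceRows and dbo.TargetRows are hypothetical table names, Amount is a hypothetical payload column, and (StoreID, EstablishDate) is assumed to be unique in the source. The point is that the OUTPUT clause returns the key columns alongside the new identity value, so the later join no longer depends on row order.

```sql
-- Hypothetical names: dbo.SourceRows, dbo.TargetRows, Amount.
-- Assumes (StoreID, EstablishDate) uniquely identifies a source row.
CREATE TABLE #NewRows
(
    ID            int  NOT NULL,  -- identity value generated by dbo.TargetRows
    StoreID       int  NOT NULL,
    EstablishDate date NOT NULL
);

INSERT INTO dbo.TargetRows (StoreID, EstablishDate, Amount)
OUTPUT INSERTED.ID, INSERTED.StoreID, INSERTED.EstablishDate
INTO #NewRows (ID, StoreID, EstablishDate)
SELECT StoreID, EstablishDate, Amount
FROM dbo.SourceRows;

-- Match new rows to source rows by the natural key, not by arrival order.
SELECT s.StoreID, s.EstablishDate, n.ID AS NewID
FROM dbo.SourceRows AS s
JOIN #NewRows AS n
    ON  n.StoreID       = s.StoreID
    AND n.EstablishDate = s.EstablishDate;
```

If no natural key exists, the MERGE variant is the usual fallback, because its OUTPUT clause can also reference columns of the source.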

Related

Pull data without altering the item sequence in the reference table

I want to lookup values listed in a temp table:
So let us say:
Create Table #mylist
(
eserial nvarchar(35) Collate SQL_Latin1_General_CP850_CI_AS,
refdate datetime
)
Insert Into #mylist (eserial, refdate) Values ('A', '2015-09-15')
Insert Into #mylist (eserial, refdate) Values ('B', '2015-09-14')
Insert Into #mylist (eserial, refdate) Values ('C', '2015-09-13')
Insert Into #mylist (eserial, refdate) Values ('D', '2015-09-12')
I need the result to be the Top 1 date less than the reference date.
And should be returned in the same sequence as is in the temporary table.
What I tried:
Select
lst.eserial,
lst.refdate,
app.CREATEDDATETIME
From #mylist lst
Outer Apply
(Select Top 1 rec.CREATEDDATETIME, rec.ESERIAL, rec.ITEMID
From TableSource rec
Where lst.eserial=rec.ESERIAL And rec.CREATEDDATETIME<lst.refdate
Order By rec.CREATEDDATETIME Desc
) As app
This works but it is slow. Also, if the number of rows is increased, it does not consistently preserve the sequence of eserial. I need the query to preserve the order in which I put the rows into the temporary table.
Again my expected output is simply:
Where eserial is the same sequence as the temp table and CREATEDDATETIME is the maximum date less than the reference date. More like a conditional Vlookup if you know Excel.
It is not quite clear what you mean by "maintain the sequence of the items in the temporary table", but if you want to get the result ordered by eserial, then you have to add ORDER BY eserial to your query. Without ORDER BY the resulting rows can be returned in any order. This applies to any method that you choose.
So, taking your last query as a basis, it will look like this:
Select
lst.eserial
,lst.refdate
,app.CREATEDDATETIME
From
#mylist lst
Outer Apply
(
Select Top 1 rec.CREATEDDATETIME
From TableSource rec
Where lst.eserial=rec.ESERIAL And rec.CREATEDDATETIME<lst.refdate
Order By rec.CREATEDDATETIME Desc
) As app
ORDER BY lst.eserial;
To make it work fast and efficiently add an index to TableSource on (ESERIAL, CREATEDDATETIME). Order of columns in the index is important.
It is also important to know if there are any other columns that you use in OUTER APPLY query and how you use them. You mentioned column AREAID in the first variant in the question, but not in the last variant. If you do have more columns, then clearly show how you intend to use them, because the correct index would depend on it. The index on (ESERIAL, CREATEDDATETIME) is enough for the query I wrote above, but if you have more columns a different index may be required.
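As a rough sketch of what such indexes might look like: the index names below are made up, and ITEMID is used only because it appears in the question's subquery. Whether you need the INCLUDE variant depends on which columns the OUTER APPLY actually returns.

```sql
-- Enough for the query above: seek on ESERIAL, then read CREATEDDATETIME in order.
CREATE NONCLUSTERED INDEX IX_TableSource_ESERIAL_CREATED
    ON TableSource (ESERIAL, CREATEDDATETIME);

-- Alternative (instead of the index above) if the subquery also returns ITEMID:
-- adding it as an included column lets the index cover the query.
CREATE NONCLUSTERED INDEX IX_TableSource_ESERIAL_CREATED_Incl
    ON TableSource (ESERIAL, CREATEDDATETIME)
    INCLUDE (ITEMID);
```

Only one of the two is needed; the second is a superset of the first.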
It would also help optimizer if you defined your temp table with a PRIMARY KEY:
Create Table #mylist
(
eserial nvarchar(35) Collate SQL_Latin1_General_CP850_CI_AS PRIMARY KEY,
refdate datetime
)
Primary key would create a unique clustered index.
One more important note. What are the type and collation of the columns ESERIAL and CREATEDDATETIME in the main TableSource table? Make sure that the types and collation of the columns in your temp table match the main TableSource table. If the type is different (varchar vs. nvarchar or datetime vs. date) or the collation is different, the index may not be used, and the query will be slow.
Edit
You use the phrase "same sequence as the temp table" several times in the question, but it is not really clear what you mean by it. Your sample data doesn't help to resolve the ambiguity. The column name eserial also adds to the confusion. I can see two possible meanings:
Return rows from temp table ordered by values in eserial column.
Return rows from temp table in the same order as they were inserted.
My original answer implies (1): it returns rows from temp table ordered by values in eserial column.
If you want to preserve the order of rows as they were inserted into the table, you need to explicitly remember this order somehow. The easiest method is to add an IDENTITY column to the temp table and later order by this column. Like this:
Create Table #mylist
(
ID int IDENTITY PRIMARY KEY,
eserial nvarchar(35) Collate SQL_Latin1_General_CP850_CI_AS,
refdate datetime
)
And in the final query use ORDER BY lst.ID.
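Put together, a minimal sketch could look like this. It reuses the sample rows from the question and assumes TableSource has the columns ESERIAL and CREATEDDATETIME as described there. Because each row is inserted by its own statement, the IDENTITY values follow the insertion order.

```sql
CREATE TABLE #mylist
(
    ID      int IDENTITY(1,1) PRIMARY KEY,  -- remembers insertion order
    eserial nvarchar(35) COLLATE SQL_Latin1_General_CP850_CI_AS,
    refdate datetime
);

INSERT INTO #mylist (eserial, refdate) VALUES ('A', '2015-09-15');
INSERT INTO #mylist (eserial, refdate) VALUES ('B', '2015-09-14');
INSERT INTO #mylist (eserial, refdate) VALUES ('C', '2015-09-13');
INSERT INTO #mylist (eserial, refdate) VALUES ('D', '2015-09-12');

SELECT
    lst.eserial,
    lst.refdate,
    app.CREATEDDATETIME
FROM #mylist AS lst
OUTER APPLY
(
    -- Latest source row strictly before the reference date
    SELECT TOP 1 rec.CREATEDDATETIME
    FROM TableSource AS rec
    WHERE rec.ESERIAL = lst.eserial
      AND rec.CREATEDDATETIME < lst.refdate
    ORDER BY rec.CREATEDDATETIME DESC
) AS app
ORDER BY lst.ID;  -- the only thing that guarantees the output order
```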
That's easy using identity. A query without ORDER BY is not guaranteed to return rows in any particular order in SQL Server.
Create Table #mylist
(
seqId int identity(1,1),
eserial nvarchar(35) Collate SQL_Latin1_General_CP850_CI_AS,
refdate datetime
)
Use the table freely and put Order By seqId at the end of your query
Edit
Use MAX() instead of TOP 1 with ORDER BY if you have no clustered index on (ESERIAL, CREATEDDATETIME) on TableSource
https://stackoverflow.com/a/21420643/1287352
Select
lst.eserial,
lst.refdate,
app.CREATEDDATETIME
From #mylist lst
Outer Apply
(
-- Scalar aggregate: exactly one row per outer row (NULL when nothing matches).
-- The aggregate needs an alias so that app.CREATEDDATETIME can be referenced.
Select MAX(rec.CREATEDDATETIME) As CREATEDDATETIME
From TableSource rec
Where lst.eserial = rec.ESERIAL And rec.CREATEDDATETIME < lst.refdate
) As app
ORDER BY lst.seqId
Perhaps the performance issue is due to indexing. Try adding the indexes below, removing UNIQUE if the keys are not unique.
CREATE UNIQUE NONCLUSTERED INDEX idx ON #mylist (eserial, refdate);
CREATE UNIQUE NONCLUSTERED INDEX idx ON TableSource (eserial, CREATEDDATETIME);

Can I keep old keys linked to new keys when making a copy in SQL?

I am trying to copy a record in a table and change a few values with a stored procedure in SQL Server 2005. This is simple, but I also need to copy relationships in other tables with the new primary keys. As this proc is being used to batch copy records, I've found it difficult to store some relationship between old keys and new keys.
Right now, I am grabbing new keys from the batch insert using OUTPUT INTO.
ex:
INSERT INTO table
(column1, column2,...)
OUTPUT INSERTED.PrimaryKey INTO @TableVariable
SELECT column1, column2,...
Is there a way like this to easily get the old keys inserted at the same time I am inserting new keys (to ensure I have paired up the proper corresponding keys)?
I know cursors are an option, but I have never used them and have only heard them referenced in a horror story fashion. I'd much prefer to use OUTPUT INTO, or something like it.
If you need to track both old and new keys in your temp table, you need to cheat and use MERGE:
Data setup:
create table T (
ID int IDENTITY(5,7) not null,
Col1 varchar(10) not null
);
go
insert into T (Col1) values ('abc'),('def');
And the replacement for your INSERT statement:
declare @TV table (
Old_ID int not null,
New_ID int not null
);
merge into T t1
using (select ID,Col1 from T) t2
on 1 = 0
when not matched then insert (Col1) values (t2.Col1)
output t2.ID,inserted.ID into @TV;
And (actually needs to be in the same batch so that you can access the table variable):
select * from T;
select * from @TV;
Produces:
ID Col1
5 abc
12 def
19 abc
26 def
Old_ID New_ID
5 19
12 26
The reason you have to do this is because of an irritating limitation on the OUTPUT clause when used with INSERT - you can only access the inserted table, not any of the tables that might be part of a SELECT.
Related - More explanation of the MERGE abuse
INSERT statements loading data into tables with an IDENTITY column are guaranteed to generate the values in the same order as the ORDER BY clause in the SELECT.
If you want the IDENTITY values to be assigned in a sequential fashion
that follows the ordering in the ORDER BY clause, create a table that
contains a column with the IDENTITY property and then run an INSERT ..
SELECT … ORDER BY query to populate this table.
From: The behavior of the IDENTITY function when used with SELECT INTO or INSERT .. SELECT queries that contain an ORDER BY clause
You can use this fact to match your old with your new identity values. First collect the list of primary keys that you intend to copy into a temporary table. You can also include your modified column values as well if needed:
select
PrimaryKey,
Col1
--Col2... etc
into #NewRecords
from Table
--where whatever...
Then do your INSERT with the OUTPUT clause to capture your new ids into the table variable:
declare @NewIds table (
New_ID int not null
);
INSERT INTO Table
(Col1 /*,Col2... etc.*/)
OUTPUT INSERTED.PrimaryKey INTO @NewIds
SELECT Col1 /*,Col2... etc.*/
from #NewRecords
order by PrimaryKey
Because of the ORDER BY PrimaryKey clause, your New_ID numbers are guaranteed to be generated in the same order as the PrimaryKey values of the copied records. Now you can match them up by row numbers ordered by the ID values. The following query gives you the pairings:
select PrimaryKey, New_ID
from
(select PrimaryKey,
ROW_NUMBER() over (order by PrimaryKey) OldRow
from #NewRecords
) PrimaryKeys
join
(
select New_ID,
ROW_NUMBER() over (order by New_ID) NewRow
from @NewIds
) New_IDs
on OldRow = NewRow

INSERT INTO using SELECT and increment value in a column

I am trying to insert missing rows into a table. One of the columns is OrderNumber (sort number), this column should be +1 of the max value of OrderNumber returned for sID in the table. Some sIDs do not appear in the SPOL table which is why there is the WHERE clause at the end of the statement. I would run this statement again but set OrderNumber to 1 for the records where sID does not currently exist in the table.
The statement below doesn't work due to the OrderNumber causing issues with the primary key which is sID + OrderNumber.
How can I get the OrderNumber to increase for each row that is inserted based on the sID column?
INSERT INTO SPOL(sID, OrderNumber, oID)
SELECT
sID, OrderNumber, oID
FROM
(SELECT
sID,
(SELECT Max(OrderNumber) + 1
FROM SPOL
WHERE sID = TMPO.sID) AS OrderNumber,
oID
FROM TMPO
WHERE NOT EXISTS (SELECT * FROM SPOL
WHERE SPOL.oID = TMPO.oID)
) AS MyData
WHERE
OrderNumber IS NOT NULL
It's much better to handle this in the database design with an identity column - you don't mention whether or not you can change the schema but hopefully you can as queries will end up a lot cleaner if you don't have to manage it yourself.
You can set the Identity property to on for your OrderNumber column in SQL Server management studio, but the script it would generate clones the table with the new specification, inserts the values you've already got with Identity_Insert on, drops the original table, and renames the temporary one to replace it - this has massive overheads depending on how many rows you've got.
The most efficient way to go about it is probably:
create an additional column with the identity property on
copy across the values
rename the original column
rename the new column to the same name as the original
remove the original OrderNumber column
Once it's done, it's done though - and looks after itself. Wouldn't you rather your insert statement simply said something like this:
INSERT INTO SPOL (sID, oID)
SELECT sID, oID
FROM TMPO
WHERE OrderNumber IS NOT NULL
Use IDENTITY(1,1) to auto-increment your OrderNumber column; this would make your task easy.
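If the schema cannot be changed and OrderNumber has to keep counting per sID, one possible sketch is to number the new rows with ROW_NUMBER() and add the current maximum for that sID. The table and column names come from the question; ordering the new rows by oID is an arbitrary assumption, and the ISNULL(..., 0) makes sIDs that do not yet exist in SPOL start at 1, so a second pass is not needed.

```sql
INSERT INTO SPOL (sID, OrderNumber, oID)
SELECT
    t.sID,
    -- current max for this sID (0 if none) plus a per-sID running number
    ISNULL(m.MaxOrderNumber, 0)
        + ROW_NUMBER() OVER (PARTITION BY t.sID ORDER BY t.oID) AS OrderNumber,
    t.oID
FROM TMPO AS t
LEFT JOIN
(
    SELECT sID, MAX(OrderNumber) AS MaxOrderNumber
    FROM SPOL
    GROUP BY sID
) AS m
    ON m.sID = t.sID
WHERE NOT EXISTS (SELECT 1 FROM SPOL AS s WHERE s.oID = t.oID);
```

Each inserted row gets a distinct (sID, OrderNumber) pair within the statement, which avoids the primary key collision caused by every row for the same sID receiving Max + 1. Concurrent inserts into SPOL could still collide, so this assumes a single writer or suitable locking.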

Insert into a row at specific position into SQL server table with PK

I want to insert a row into a SQL server table at a specific position. For example my table has 100 rows and I want to insert a new row at position 9. But the ID column which is PK for the table already has a row with ID 9. How can I insert a row at this position so that all the rows after it shift to next position?
Relational tables have no 'position'. As an optimization, an index will sort rows by the specified key. If you wish to insert a row at a specific rank in the key order, insert it with a key that sorts in that rank position. In your case you'll have to update all rows with a value of ID greater than 8 to increment ID by 1, then insert the row with ID 9:
UPDATE table SET ID += 1 WHERE ID >= 9;
INSERT INTO table (ID, ...) VALUES (9, ...);
Needless to say, there cannot possibly be any sane reason for doing something like that. If you would truly have such a requirement, then you would use a composite key with two (or more) parts. Such a key would allow you to insert subkeys so that it sorts in the desired order. But much more likely your problem can be solved exclusively by specifying a correct ORDER BY, w/o messing with the physical order of the rows.
Another way to look at it is to reconsider what primary key means: the identifier of an entity, which does not change during that entity lifetime. Then your question can be rephrased in a way that makes the fallacy in your question more obvious:
I want to change the content of the entity with ID 9 to some new
value. The old values of the entity 9 should be moved to the content
of entity with ID 10. The old content of entity with ID 10 should be
moved to the entity with ID 11... and so on and so forth. The old
content of the entity with the highest ID should be inserted as a new
entity.
Usually you do not want to use primary keys this way. A better approach would be to create another column called 'position' or similar where you can keep track of your own ordering system.
To perform the shifting you could run a query like this:
UPDATE table SET id = id + 1 WHERE id >= 9
This does not work if your column is auto-incrementing (an IDENTITY column in SQL Server), because such columns cannot be updated.
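A separate position column might look like the sketch below. The table dbo.Items and its columns are hypothetical; the idea is that the primary key stays stable while Position carries the display order and is the only thing that gets shifted.

```sql
-- Hypothetical table: ID is a stable identifier, Position is the user-defined order.
CREATE TABLE dbo.Items
(
    ID       int IDENTITY(1,1) NOT NULL PRIMARY KEY,
    Name     varchar(50) NOT NULL,
    Position int NOT NULL
);

-- Insert a new row at position 9: make room first, then insert.
BEGIN TRANSACTION;

UPDATE dbo.Items
SET Position = Position + 1
WHERE Position >= 9;

INSERT INTO dbo.Items (Name, Position)
VALUES ('new row', 9);

COMMIT TRANSACTION;

-- The order only exists when you ask for it.
SELECT ID, Name, Position
FROM dbo.Items
ORDER BY Position;
```

Existing rows keep their IDs, so foreign keys pointing at them remain valid after the shift.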
No, you can't control where the new row is inserted. Actually, you don't need to: use the ORDER BY clause on your SELECT statements to order the results the way you need.
DECLARE @duplicateTable4 TABLE (id int,data VARCHAR(20))
INSERT INTO @duplicateTable4 VALUES (1,'not duplicate row')
INSERT INTO @duplicateTable4 VALUES (2,'duplicate row')
INSERT INTO @duplicateTable4 VALUES (3,'duplicate rows')
INSERT INTO @duplicateTable4 VALUES (4,'second duplicate row')
INSERT INTO @duplicateTable4 VALUES (5,'second duplicat rows')
DECLARE @duplicateTable5 TABLE (id int,data VARCHAR(20))
insert into @duplicateTable5 select * from @duplicateTable4
delete from @duplicateTable4
declare @i int , @cnt int
set @i=1
set @cnt=(select count(*) from @duplicateTable5)
while(@i<=@cnt)
begin
if @i=1
begin
insert into @duplicateTable4(id,data) select 11,'indian'
insert into @duplicateTable4(id,data) select id,data from @duplicateTable5 where id=@i
end
else
insert into @duplicateTable4(id,data) select id,data from @duplicateTable5 where id=@i
set @i=@i+1
end
select * from @duplicateTable4
This kind of violates the purpose of a relational table, but if you need, it's not really that hard to do.
1) use ROW_NUMBER() OVER(ORDER BY NameOfColumnToSort ASC) AS Row to make a column for the row numbers in your table.
2) From here you can copy (using SELECT columnsYouNeed INTO ) the before and after portions of the table into two separate tables (based on which row number you want to insert your values after) using a WHERE Row < ## and Row >= ## statement respectively.
3) Next you drop the original table using DROP TABLE.
4) Then you use a UNION for the before table, the row you want to insert (using a single explicitly defined SELECT statement without anything else), and the after table. By now you have two UNION statements for 3 separate select clauses. Here you can just wrap this in a SELECT INTO FROM clause calling it the name of your original table.
5) Last, you DROP TABLE the two tables you made.
This is similar to how an ALTER TABLE works.
INSERT INTO customers
(customer_id, last_name, first_name)
SELECT employee_number AS customer_id, last_name, first_name
FROM employees
WHERE employee_number < 1003;
FOR MORE REF: https://www.techonthenet.com/sql/insert.php

Row number in Sybase tables

Sybase db tables do not have a concept of self-updating row numbers. However, for one of the modules, I require a row number corresponding to each row in the table such that max(Column) would always tell me the number of rows in the table.
I thought I'll introduce an int column and keep updating this column to keep track of the row number. However I'm having problems in updating this column in case of deletes. What sql should I use in delete trigger to update this column?
You can easily assign a unique number to each row by using an identity column. The identity can be a numeric or an integer (in ASE12+).
This will almost do what you require. There are certain circumstances in which you will get a gap in the identity sequence. (These are called "identity gaps", the best discussion on them is here). Also deletes will cause gaps in the sequence as you've identified.
Why do you need to use max(col) to get the number of rows in the table, when you could just use count(*)? If you're trying to get the last row from the table, then you can do
select * from table where column = (select max(column) from table).
Regarding the delete trigger to update a manually managed column, I think this would be a potential source of deadlocks, and many performance issues. Imagine you have 1 million rows in your table, and you delete row 1, that's 999999 rows you now have to update to subtract 1 from the id.
Delete trigger
CREATE TRIGGER tigger ON myTable FOR DELETE
AS
update myTable
set id = id - (select count(*) from deleted d where d.id < t.id)
from myTable t
To avoid locking problems
You could add an extra table (which joins to your primary table) like this:
CREATE TABLE rowCounter
(id int, -- foreign key to main table
rownum int)
... and use the rownum field from this table.
If you put the delete trigger on this table then you would hugely reduce the potential for locking problems.
Approximate solution?
Does the table need to keep its rownumbers up to date all the time?
If not, you could have a job which runs every minute or so, which checks for gaps in the rownum, and does an update.
Question: do the rownumbers have to reflect the order in which rows were inserted?
If not, you could do far fewer updates, but only updating the most recent rows, "moving" them into gaps.
Leave a comment if you would like me to post any SQL for these ideas.
I'm not sure why you would want to do this. You could experiment with using temporary tables and "select into" with an Identity column like below.
create table test
(
col1 int,
col2 varchar(3)
)
insert into test values (100, "abc")
insert into test values (111, "def")
insert into test values (222, "ghi")
insert into test values (300, "jkl")
insert into test values (400, "mno")
select rank = identity(10), col1 into #t1 from test
select * from #t1
delete from test where col2="ghi"
select rank = identity(10), col1 into #t2 from test
select * from #t2
drop table test
drop table #t1
drop table #t2
This would give you a dynamic id (of sorts)