I want to Rebuild data table from a sorted table - sql

Okay, first a little background: I've inherited the maintenance of a database on MSSQL 2000.
The database contains a massive collection of tables, interconnected through foreign keys.
What I'm attempting to do is rebuild each table in sorted order so as to eliminate the gaps in the table's IDENT column.
On one table in particular I have the following columns:
RL_ID, RL_FK_RaidID, RL_FK_MemberID, RL_FK_ItemID, RL_ItemValue, RL_Notes, RL_IsUber, RL_IsWishItem, RL_LootModifier, RL_WishItemValue, RL_WeightedLootValue
It uses RL_ID as the IDENT column, which DBCC CHECKIDENT (Table) currently reports as 32620.
There are, however, only 12128 rows of data in this table.
So I tried a simple script to copy all the information in a sorted fashion into a new table:
INSERT INTO Table_1
SELECT RL_ID, RL_FK_RaidID, RL_FK_MemberID, RL_FK_ItemID, RL_ItemValue, RL_Notes, RL_IsUber, RL_IsWishItem, RL_LootModifier, RL_WishItemValue, RL_WeightedLootValue
FROM RaidLoot
ORDER BY RL_ID
Then delete all the rows from the source table with:
TRUNCATE TABLE RaidLoot
Verify the IDENT is 1 with:
DBCC CHECKIDENT (RaidLoot)
Now copy the Data back into the Original table from Row 1 to the end:
SET IDENTITY_INSERT RaidLoot ON
INSERT INTO RaidLoot (RL_ID, RL_FK_RaidID, RL_FK_MemberID, RL_FK_ItemID, RL_ItemValue, RL_Notes, RL_IsUber, RL_IsWishItem, RL_LootModifier, RL_WishItemValue, RL_WeightedLootValue)
SELECT RL_ID, RL_FK_RaidID, RL_FK_MemberID, RL_FK_ItemID, RL_ItemValue, RL_Notes, RL_IsUber, RL_IsWishItem, RL_LootModifier, RL_WishItemValue, RL_WeightedLootValue
FROM Table_1
ORDER BY RL_ID
SET IDENTITY_INSERT RaidLoot OFF
Now verify that I only have the 12128 rows of data:
DBCC CHECKIDENT (RaidLoot)
(Note: I end up with 32620 again, since this never renumbered RL_ID; it just put the rows back with their original values, gaps included.) So where and how can I get it to renumber the RL_ID column starting from 1, so that there are no gaps when the data is written back to the original table?
The only other solution I can see is the painful process of manually changing RL_ID on each row of Table_1 before I write it back to the original table. That isn't impossible, but I have another table with approx. 306,000 rows whose IDENT is reported as 450,123, so I'm hoping there is an easier way to automate the renumbering.

If you really have to do this (seems like a great waste of time to me), you will have to adjust all of the foreign key references as well.
Consider the strategy of adding a NewID column for each table and populate the new column sequentially. Then you can use this NewID column in the queries needed to adjust the foreign keys. Very messy nonetheless unless you can come up with a consistent pattern to do so.
Since you can query the metadata to determine foreign keys, etc. this is certainly possible, and definitely should be considered seriously if you really do have lots of tables.
ADDED
There is a simple way to populate the NewID column:
declare @id int
set @id = 0
update MyTable set NewID=@id, @id=@id+1
It is not obvious that this works, but it does.
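The NewID strategy above can be sketched end to end. This is only an illustration, written in Python with the standard sqlite3 module standing in for SQL Server so that it runs anywhere. The Raid table, its R_ID and R_Name columns, and the sample rows are hypothetical. On SQL Server the identity column itself cannot be updated, so the last step would instead be a copy out and back in with IDENTITY_INSERT ON.

```python
import sqlite3

# SQLite stands in for SQL Server; the Raid table and all sample rows are hypothetical.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Raid (R_ID INTEGER PRIMARY KEY, R_Name TEXT);
    CREATE TABLE RaidLoot (RL_ID INTEGER PRIMARY KEY, RL_FK_RaidID INTEGER, RL_Notes TEXT);
    INSERT INTO Raid VALUES (3, 'first'), (7, 'second'), (20, 'third');
    INSERT INTO RaidLoot VALUES (1, 7, 'a'), (2, 20, 'b'), (3, 3, 'c'), (4, 7, 'd');
""")

# Step 1: add the NewID column and fill it sequentially, in old-ID order.
conn.execute("ALTER TABLE Raid ADD COLUMN NewID INTEGER")
old_ids = [row[0] for row in conn.execute("SELECT R_ID FROM Raid ORDER BY R_ID")]
conn.executemany(
    "UPDATE Raid SET NewID = ? WHERE R_ID = ?",
    [(new_id, old_id) for new_id, old_id in enumerate(old_ids, start=1)],
)

# Step 2: repoint every foreign key while the old IDs are still available for lookup.
conn.execute("""
    UPDATE RaidLoot
    SET RL_FK_RaidID = (SELECT NewID FROM Raid WHERE R_ID = RaidLoot.RL_FK_RaidID)
""")

# Step 3: swap the new values into the key column. Going through negative numbers
# avoids a transient clash between a new ID and an old ID not yet rewritten
# (this assumes all existing IDs are positive).
conn.execute("UPDATE Raid SET R_ID = -NewID")
conn.execute("UPDATE Raid SET R_ID = -R_ID")
conn.commit()
```

The order matters: the foreign keys are translated before the parent keys change, because the old key is the only link between a child row and its NewID.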

I don't think it has to do with RL_ID being referenced by other tables in the schema - if I set up a single table test, the identity will always show up as the max number in the identity field:
CREATE TABLE #temp (id INT IDENTITY(1,1), other VARCHAR(1))
INSERT INTO #temp
( other )
VALUES ('a'), ('b'), ('c'), ('d'), ('e')
SELECT *
FROM #temp
SELECT *
INTO #holder
FROM #temp
WHERE other = 'C'
TRUNCATE TABLE #temp
SET IDENTITY_INSERT #temp ON
INSERT INTO #temp
( id, other )
SELECT id ,
other
FROM #holder
DBCC CHECKIDENT (#temp)
DROP TABLE #temp
DROP TABLE #holder
So your new identity is 32620 because that is the MAX(RL_ID)

Related

Updating Identity Column of a table with consecutive numbers through SQL Stored Procedure

After deleting the duplicate records from the table,
I want to update Identity column of a table with consecutive numbering starting with 1. Here is my table details
id(identity(1,1)),
EmployeeID(int),
Punch_Time(datetime),
Deviceid(int)
I need to perform this action through a stored procedure.
When I tried the following statements in a stored procedure
DECLARE @myVar int
SET @myVar = 0
set identity_insert TempTrans_Raw# ON
UPDATE TempTrans_Raw# SET @myVar = Id = @myVar + 1
set identity_insert TempTrans_Raw# off
I got the error: Cannot update identity column 'Id'
Can anyone please suggest how to update the identity column of that table with consecutive numbers starting at 1?
--before running this, make sure the foreign key constraints that reference the ID have been removed.
--insert everything into a temp table
SELECT ColumnList --every column except the identity column
INTO #tmpYourTable
FROM yourTable
--clear your table
DELETE FROM yourTable
-- reseed identity (after a DELETE, a reseed value of 0 makes the next generated ID 1)
DBCC CHECKIDENT('yourTable', RESEED, new reseed value)
--insert back all the values
INSERT INTO yourTable (ColumnList)
SELECT ColumnList FROM #tmpYourTable
--drop the temp table
DROP TABLE #tmpYourTable
GO
The IDENTITY keyword is used to generate a key which can be used in combination with the PRIMARY KEY constraint to get a technical key. Such keys are technical: they are used to link table records. They should have no other meaning (such as a sort order). SQL Server does not guarantee the generated IDs to be consecutive. It does guarantee, however, that you get them in order. (So you might get 1, 2, 4, ..., but never 1, 4, 2, ...)
Here is the documentation for IDENTITY: https://msdn.microsoft.com/de-de/library/ms186775.aspx.
Personally I don't like the guarantee that the generated IDs are in order. A technical ID is supposed to have no meaning other than offering a reference to a record. You can rely on the order, but if order is information you are interested in, you should in my opinion store that information explicitly (in the form of a timestamp, for example).
If you want a number telling you that a record is the fifth or sixteenth or whatever record in order, you can always get that number on the fly using the ROW_NUMBER function. So there is no need to generate and store such a consecutive value (which could also be quite troublesome when it comes to concurrent transactions on the table). Here is how to get that number:
select
row_number() over(order by id),
employeeid,
punch_time,
deviceid
from mytable;
Having said all this: it should never be necessary to change an ID. If you feel that need, it is a sign of inappropriate table design.
If you really need sequential numbers, may I suggest that you create a table ("OrderNumbers") with valid numbers, and then make your program pick one row from OrderNumbers when you add a row to yourTable.
If you do everything in one transaction (i.e. with Begin Tran and Commit), then you can get one number for one row with no gaps.
You should have either Primary Keys or Unique Keys on this column in both tables to protect against duplicates.
HIH,
Henrik
Check this function: DBCC CHECKIDENT('table', RESEED, new reseed value)

How to use multiple identity numbers in one table?

I have a web application that creates printable forms. Each form has a unique number on it; the problem is that I have 2 kinds of form that need separate number ranges,
i.e.
Form1- Numbered 2000000-2999999
Form2- Numbered 3000000-3999999
dbo.test2 - is my form information table
Tsel - is my autoinc table for the 3000000 series numbers
Tadv - is my autoinc table for the 2000000 series numbers
What I have done is create 2 tables with just an autoinc column (one for the 2000000 series and one for the 3000000 series). I then created a trigger that adds a record to the corresponding table, reads back the autoinc number, and stores it in the table that holds the form information, so that each form gets a number from the right series.
Although it does work, I'm concerned that the numbers will get messed up under load.
I'm not sure @@IDENTITY will always return the right value when many people are using the system. (I cannot have duplicates, and I need to use the numbering format shown above.)
See code below.
**** TRIGGER ****
CREATE TRIGGER MAKEANID2 ON dbo.test2
AFTER INSERT
AS
SET NOCOUNT ON
declare @someid int
declare @someid2 int
declare @startfrom int
declare @test1 varchar(10)
select @someid=@@IDENTITY
select @test1 = (Select name1 from test2 where sysid = @someid )
if @test1 = 'select'
begin
    insert into Tsel Default values
    select @someid2 = @@IDENTITY
end
if @test1 = 'adv'
begin
    insert into Tadv Default values
    select @someid2 = @@IDENTITY
end
update test2
set name2=(@someid2) where sysid = @someid
SET NOCOUNT OFF
The best way to keep the two IDs in sync is to create a persisted Computed Column based on the actual identity column. Where Col1 is the identity column and Col2 is the persisted computed column that is the result of some formula based on Col1. You can then even Create Indexes on Computed Columns.
test this out:
CREATE TABLE YourTable
(Col1 int not null identity(2000000,1)
,Col2 AS (Col1-2000000+3000000) PERSISTED
,Col3 varchar(5)
)
GO
insert into YourTable (col3) values ('a')
insert into YourTable (col3) SELECT 'b' UNION SELECT 'c'
SELECT * FROM YourTable
OUTPUT:
Col1 Col2 Col3
----------- ----------- -----
2000000 3000000 a
2000001 3000001 b
2000002 3000002 c
(3 row(s) affected)
EDIT: After the OP's comments, I'm still not 100% sure what you are after.
I never used SQL Server 2000 (we skipped that version), and I don't really want to look up how to do everything in that version; it is so limited without the OUTPUT clause, ROW_NUMBER(), CTEs, etc.
I can think of three methods:
1) You could just create a sequence table with 2 rows, one for A and one for B. Each time you need to insert a row, look up, increment, and save the value for the type of sequence you need, then insert with that value. For example, if you are inserting a type "A" row, do this:
INSERT INTO test2
(col1, col2, col3,...)
SELECT
ISNULL(MAX(NextSeq),0)+1, col2, col3,...
FROM YourSequenceTable WITH (UPDLOCK, HOLDLOCK)
WHERE SequenceType='A'
UPDATE YourSequenceTable
SET NextSeq=ISNULL(NextSeq,0)+1
WHERE SequenceType='A'
2) Change your table structure to just save the data in Tsel or Tadv, and have a trigger insert into a third, common table where you can have your additional "common" identity. The common table would look like:
CommonTable
ID int not null identity(1,1) primary key
TselID int null FK to Tsel.PK
TadvID int null FK to Tadv.PK
3) If you need a single table, try this, which is a real hack. Change your Tsel and Tadv tables to contain all the necessary columns. From the application, INSERT INTO Tsel when the value is select, and have a trigger grab that identity value, INSERT it into test2, and then remove the data from Tsel. Likewise, when the value is adv, just INSERT INTO Tadv and have a trigger on that table insert the data into test2 and remove the data from Tadv. You need all the data columns in Tsel and Tadv so the trigger can copy the values to test2, but the trigger will remove the rows from there (the identity will stay sequential even though the original rows are removed).
Your Tsel trigger would look like:
CREATE Trigger MAKEANID2_Tsel ON dbo.Tsel
AFTER INSERT
AS
--copy data from Tsel into test2., test2 can still have its own identity value
INSERT INTO test2
(PK, col1, col2, col3,...)
SELECT
col0, col1, col2, col3,....
FROM INSERTED
--remove rows from Tsel, which were just copied and not needed anymore.
DELETE Tsel
WHERE PK IN (SELECT PK FROM INSERTED)
GO
You are right to worry about @@IDENTITY; it is not a recommended piece of code. If someone else adds a different trigger that inserts into a table with an identity, and that one fires first, that is the value you will get.
But you have much bigger problems. Your trigger is designed to work on only one record at a time. This is a very, very, very bad thing to do with a trigger. Triggers operate on sets of data and must ALWAYS (even if you think there will never be more than one record inserted at a time) be set up to handle sets of data, not one record. Further, you don't need to ask for the identity: the identities of all the records inserted in the batch are available inside the trigger in a pseudo-table called inserted.
Now, reading one of your comments, you say you can't have any missing values at all. In that case you cannot under any circumstances use an identity column, as it will have gaps if any transaction is rolled back. You will have to write your own process to create the numbers based on the last number, and look out for race conditions.
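A hand-rolled, gap-free counter of the kind described above might look like the following sketch. It is written in Python with the standard sqlite3 module standing in for SQL Server; the FormSequence table and the insert_form helper are hypothetical names. The point it illustrates is that the counter read, the counter increment and the form insert share one transaction, so a rollback gives the number back and two sessions cannot draw the same value.

```python
import sqlite3

# isolation_level=None puts the connection in autocommit mode,
# so BEGIN / COMMIT / ROLLBACK are issued explicitly below.
conn = sqlite3.connect(":memory:", isolation_level=None)
conn.executescript("""
    CREATE TABLE FormSequence (SequenceType TEXT PRIMARY KEY, NextSeq INTEGER NOT NULL);
    INSERT INTO FormSequence VALUES ('adv', 2000000), ('select', 3000000);
    CREATE TABLE test2 (sysid INTEGER PRIMARY KEY, name1 TEXT, name2 INTEGER UNIQUE);
""")


def insert_form(conn, form_type):
    """Take the next number of the given series and insert the form row atomically."""
    conn.execute("BEGIN IMMEDIATE")  # take the write lock before reading the counter
    try:
        row = conn.execute(
            "SELECT NextSeq FROM FormSequence WHERE SequenceType = ?", (form_type,)
        ).fetchone()
        if row is None:
            raise ValueError(f"unknown form type: {form_type}")
        number = row[0]
        conn.execute(
            "UPDATE FormSequence SET NextSeq = NextSeq + 1 WHERE SequenceType = ?",
            (form_type,),
        )
        conn.execute(
            "INSERT INTO test2 (name1, name2) VALUES (?, ?)", (form_type, number)
        )
        conn.execute("COMMIT")
        return number
    except Exception:
        conn.execute("ROLLBACK")  # the counter increment is undone too, so no gap
        raise


numbers = [insert_form(conn, t) for t in ("adv", "select", "adv")]
```

The UNIQUE constraint on name2 is a backstop: if the locking were ever wrong, a duplicate number would fail loudly instead of being stored. On SQL Server the equivalent of BEGIN IMMEDIATE would be the UPDLOCK, HOLDLOCK hints used in method 1 above.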

Stuck trying to migrate two tables from one DB to another DB

I'm trying to migrate some data from two tables in an OLD database to a NEW database.
The problem is that I wish to generate new primary keys in the new database for the first table that is being imported. That's simple.
But the 2nd table in the old database has a foreign key dependency on the first table. So when I want to migrate the old data from the second table, the foreign keys don't match any more.
Are there any tricks/best practices to help me migrate the data?
Serious note: I cannot change the current schema of the new tables, which do not have any 'old id' column.
Let's use the following table schema:
Old Table1                New Table1
ParentId INT PK           ParentId INT PK
Name VARCHAR(50)          Name VARCHAR(50)
Old Table 2               New Table 2
ChildId INT PK            ChildId INT PK
ParentId INT FK           ParentId INT FK
Foo VARCHAR(50)           Foo VARCHAR(50)
So the table schemas are identical.
Thoughts?
EDIT:
For those who are asking, the RDBMS is SQL Server 2008. I didn't specify the software because I was hoping I would get an agnostic answer with some generic T-SQL :P
I think you need to do this in 2 steps.
You need to import the old tables and keep the old ids (and generate new ones). Then, once they're in the new database and have both new and old ids, you can use the old ids to associate the new ids, and then drop the old ids.
You can do this by importing into temporary (i.e. they will be thrown away) tables, then inserting into the permanent tables, leaving out the old ids.
Or import directly into the new tables (with the schema modified to also hold the old ids), then drop the old ids when they're no longer necessary.
EDIT:
OK, I'm a bit clearer on what you're looking for thanks to comments here and on other answers. I knocked this up; I think it'll do what you want.
Basically, without cursors, it steps through the parent table row by row and inserts the new parent row and all the child rows for that parent row, keeping the new ids in sync.
I tried it out and it should work. It doesn't need exclusive access to the tables and should be orders of magnitude faster than a cursor.
declare @oldId as int
declare @newId as int
select @oldId = Min(ParentId) from OldTable1
while not @oldId is null
begin
    Insert Into NewTable1 (Name)
    Select Name from OldTable1 where ParentId = @oldId
    Select @newId = SCOPE_IDENTITY()
    Insert Into NewTable2 (ParentId, Foo)
    Select @newId, Foo From OldTable2 Where ParentId = @oldId
    select @oldId = Min(ParentId) from OldTable1 where ParentId > @oldId
end
Hope this helps,
Well, I guess you'll have to determine other criteria to create a map like oldPK => newPK (for example: is the Name field equal?).
Then you can determine the new PK that matches the old PK and adjust the ParentID accordingly.
You may also do a little trick: Add a new column to the original Table1 which stores the new PK value for a copied record. Then you can easily copy the values of Table2 pointing them to the value of the new column instead of the old PK.
EDIT: I'm trying to provide some sample code of what I meant by my little trick. I'm not altering the original database structure, but I'm using a temporary table now.
OK, you might try the following:
1) Create temporary table that holds the values of the old table, plus, it gets a new PK:
CREATE TABLE #tempTable1
(
newPKField INT,
oldPKField INT,
Name VARCHAR(50)
)
2) Insert all the values from your old table into the temporary table calculating a new PK, copying the old PK:
INSERT INTO #tempTable1
SELECT
newPKValueHere AS newPKField,
ParentID as oldPKField,
Name
FROM
Table1
3) Copy the values to the new table
INSERT INTO NewTable1
SELECT
newPKField as ParentId,
Name
FROM
#tempTable1
4) Copy the values from Table2 to NewTable2
INSERT INTO NewTable2
SELECT
t2.ChildID,
t.newPKField AS ParentId,
t2.Foo
FROM
Table2 t2
-- match on the old key: #tempTable1 has no ParentId column
INNER JOIN #tempTable1 t ON t.oldPKField = t2.ParentId
This should do. Please note that this is only pseudo T-SQL Code - I have not tested this on a real database! However, it should come close to what you need.
Can you change the schema of the old tables? If so, you could put a "new id" column on the old tables, and use that as the reference.
You might have to do a row by row insert on the new table and then retrieve the scope_identity, store it in the old table1. But for table2, you can then join to the old table1 and grab the new_id.
First of all - can you not even have some temporary schema that you can later drop?! That would make life easier. Assuming you can't:
If you're lucky (and if you can guarantee that no other inserts will be happening at the same time) then when you insert the Table1's data into your new table you could perhaps cheat by relying on the sequential order of the inserts.
You could then create a view that joins the 2 tables on a row-count so that you have a way to correlate the keys to each other. That way you'd be one step closer to being able to identify the 'ParentId' for the new Table2.
I'm not sure from your question what database software you're using, but if temporary tables are an option, create a temporary table containing the original primary key of table1 and the new primary key of table1. Then create another temporary table with a copy of table2, update the copy using the "old key, new key" table you created earlier, then use "insert into select from" (or whatever the appropriate command is for your database) to copy the revised temporary table into its permanent location.
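The "old key, new key" temporary table can be sketched as follows. This is an illustration in Python with the standard sqlite3 module standing in for the real database; KeyMap, the sample rows and the AUTOINCREMENT keys on the new tables are assumptions made for the example. The new tables keep their schema; only the throwaway mapping table knows about the old ids.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE OldTable1 (ParentId INTEGER PRIMARY KEY, Name TEXT);
    CREATE TABLE OldTable2 (ChildId INTEGER PRIMARY KEY, ParentId INTEGER, Foo TEXT);
    CREATE TABLE NewTable1 (ParentId INTEGER PRIMARY KEY AUTOINCREMENT, Name TEXT);
    CREATE TABLE NewTable2 (ChildId INTEGER PRIMARY KEY AUTOINCREMENT, ParentId INTEGER, Foo TEXT);
    INSERT INTO OldTable1 VALUES (10, 'p10'), (25, 'p25');
    INSERT INTO OldTable2 VALUES (1, 25, 'x'), (2, 10, 'y'), (3, 25, 'z');
    -- Throwaway mapping table (hypothetical name); the new schema is left unchanged.
    CREATE TEMP TABLE KeyMap (OldId INTEGER PRIMARY KEY, NewId INTEGER NOT NULL);
""")

# Parents go in one at a time so each generated key can be recorded against its old key.
parents = conn.execute("SELECT ParentId, Name FROM OldTable1 ORDER BY ParentId").fetchall()
for old_id, name in parents:
    cur = conn.execute("INSERT INTO NewTable1 (Name) VALUES (?)", (name,))
    conn.execute("INSERT INTO KeyMap (OldId, NewId) VALUES (?, ?)", (old_id, cur.lastrowid))

# Children are copied in a single set-based statement, translating the
# foreign key through the mapping table.
conn.execute("""
    INSERT INTO NewTable2 (ParentId, Foo)
    SELECT m.NewId, c.Foo
    FROM OldTable2 AS c
    JOIN KeyMap AS m ON m.OldId = c.ParentId
    ORDER BY c.ChildId
""")
conn.commit()
```

Here cur.lastrowid plays the role that SCOPE_IDENTITY() plays in T-SQL. Only the parent table needs row-by-row handling; every dependent table can be copied with one join against the map.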
I had the wonderful opportunity to be dug deep in migration scripts last summer. I was using Oracle's PL/SQL for the task. But you did not mention what technology are you using? What are you migrating the data into? SQL Server? Oracle? MySQL?
The approach is to INSERT a row from table1 RETURNING the new primary key generated (probably by a SEQUENCE [in Oracle]), and then INSERT the dependent records from table2, changing their foreign key value to the value returned by the first INSERT. I can't help you any better unless you specify which DBMS you are migrating the data into.
The following pseudo-ish code should work for you:
-- IDENTITY on the key columns is an assumption; the original sketch only said "PK".
CREATE TABLE newtable1 (
    ParentId INT IDENTITY(1,1) PRIMARY KEY,
    OldId INT,
    Name VARCHAR(50)
)
CREATE TABLE newtable2 (
    ChildId INT IDENTITY(1,1) PRIMARY KEY,
    ParentId INT NULL REFERENCES newtable1 (ParentId),
    OldParent INT,
    Foo VARCHAR(50)
)
INSERT INTO newtable1 (OldId, Name)
SELECT ParentId, Name FROM oldtable1
INSERT INTO newtable2 (OldParent, Foo)
SELECT ParentId, Foo FROM oldtable2
-- translate each old parent key into the newly generated one
UPDATE newtable2 SET ParentId = (
    SELECT n.ParentId
    FROM newtable1 AS n
    WHERE n.OldId = newtable2.OldParent
)
ALTER TABLE newtable1 DROP COLUMN OldId
ALTER TABLE newtable2 DROP COLUMN OldParent

Row number in Sybase tables

Sybase db tables do not have a concept of self-updating row numbers. However, for one of the modules, I require a row number for each row in the table, such that max(Column) always tells me the number of rows in the table.
I thought I'll introduce an int column and keep updating this column to keep track of the row number. However I'm having problems in updating this column in case of deletes. What sql should I use in delete trigger to update this column?
You can easily assign a unique number to each row by using an identity column. The identity can be a numeric or an integer (in ASE12+).
This will almost do what you require. There are certain circumstances in which you will get a gap in the identity sequence. (These are called "identity gaps", the best discussion on them is here). Also deletes will cause gaps in the sequence as you've identified.
Why do you need to use max(col) to get the number of rows in the table, when you could just use count(*)? If you're trying to get the last row from the table, then you can do
select * from table where column = (select max(column) from table).
Regarding the delete trigger to update a manually managed column, I think this would be a potential source of deadlocks, and many performance issues. Imagine you have 1 million rows in your table, and you delete row 1, that's 999999 rows you now have to update to subtract 1 from the id.
Delete trigger
CREATE TRIGGER tigger ON myTable FOR DELETE
AS
update myTable
set id = id - (select count(*) from deleted d where d.id < t.id)
from myTable t
To avoid locking problems
You could add an extra table (which joins to your primary table) like this:
CREATE TABLE rowCounter
(id int, -- foreign key to main table
rownum int)
... and use the rownum field from this table.
If you put the delete trigger on this table then you would hugely reduce the potential for locking problems.
Approximate solution?
Does the table need to keep its rownumbers up to date all the time?
If not, you could have a job which runs every minute or so, which checks for gaps in the rownum, and does an update.
Question: do the rownumbers have to reflect the order in which rows were inserted?
If not, you could do far fewer updates, by only updating the most recent rows, "moving" them into gaps.
Leave a comment if you would like me to post any SQL for these ideas.
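The "move the most recent rows into the gaps" idea can be sketched like this. It is only an illustration, in Python with the standard sqlite3 module standing in for Sybase; fill_gaps and the payload column are hypothetical, and the sketch assumes the row numbers are distinct positive integers and that row order does not have to reflect insertion order.

```python
import sqlite3


def fill_gaps(conn):
    """Renumber as few rows as possible so that rownum covers 1..COUNT(*) exactly."""
    used = [r[0] for r in conn.execute("SELECT rownum FROM myTable ORDER BY rownum")]
    n = len(used)
    used_set = set(used)
    gaps = [i for i in range(1, n + 1) if i not in used_set]  # free numbers within 1..n
    movers = [r for r in used if r > n]                       # rows numbered beyond n
    # Every number missing from 1..n is matched by one row numbered above n,
    # so the two lists have the same length and zip pairs them all up.
    conn.executemany(
        "UPDATE myTable SET rownum = ? WHERE rownum = ?", list(zip(gaps, movers))
    )
    return len(gaps)


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE myTable (rownum INTEGER UNIQUE, payload TEXT)")
conn.executemany(
    "INSERT INTO myTable VALUES (?, ?)",
    [(1, "a"), (2, "b"), (3, "c"), (4, "d"), (5, "e"), (6, "f")],
)
conn.execute("DELETE FROM myTable WHERE rownum IN (2, 3)")  # leaves 1, 4, 5, 6
moved = fill_gaps(conn)
```

Deleting k rows costs at most k updates this way, instead of renumbering every row after the first deleted one as the delete trigger does.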
I'm not sure why you would want to do this. You could experiment with using temporary tables and "select into" with an Identity column like below.
create table test
(
col1 int,
col2 varchar(3)
)
insert into test values (100, "abc")
insert into test values (111, "def")
insert into test values (222, "ghi")
insert into test values (300, "jkl")
insert into test values (400, "mno")
select rank = identity(10), col1 into #t1 from test
select * from #t1
delete from test where col2="ghi"
select rank = identity(10), col1 into #t2 from test
select * from #t2
drop table test
drop table #t1
drop table #t2
This would give you a dynamic id (of sorts)

Insert into ... Select *, how to ignore identity?

I have a temp table with the exact structure of a concrete table T. It was created like this:
select top 0 * into #tmp from T
After processing and filling in content into #tmp, I want to copy the content back to T like this:
insert into T select * from #tmp
This is okay as long as T doesn't have identity column, but in my case it does. Is there any way I can ignore the auto-increment identity column from #tmp when I copy to T? My motivation is to avoid having to spell out every column name in the Insert Into list.
EDIT: toggling identity_insert wouldn't work because the pkeys in #tmp may collide with those in T if rows were inserted into T outside of my script, that's if #tmp has auto-incremented the pkey to sync with T's in the first place.
SET IDENTITY_INSERT T ON
-- INSERT command (it must list the columns explicitly while IDENTITY_INSERT is ON)
SET IDENTITY_INSERT T OFF
As identity will be generated during insert anyway, could you simply remove this column from #tmp before inserting the data back to T?
alter table #tmp drop column id
UPD: Here's an example I've tested in SQL Server 2008:
create table T(ID int identity(1,1) not null, Value nvarchar(50))
insert into T (Value) values (N'Hello T!')
select top 0 * into #tmp from T
alter table #tmp drop column ID
insert into #tmp (Value) values (N'Hello #tmp')
insert into T select * from #tmp
drop table #tmp
select * from T
drop table T
See answers here and here:
select * into without_id from with_id
union all
select * from with_id where 1 = 0
Reason:
When an existing identity column is selected into a new table, the new column inherits the IDENTITY property, unless one of the following conditions is true:
The SELECT statement contains a join, GROUP BY clause, or aggregate function.
Multiple SELECT statements are joined by using UNION.
The identity column is listed more than one time in the select list.
The identity column is part of an expression.
The identity column is from a remote data source.
If any one of these conditions is true, the column is created NOT NULL instead of inheriting the IDENTITY property. If an identity column is required in the new table but such a column is not available, or you want a seed or increment value that is different than the source identity column, define the column in the select list using the IDENTITY function. See "Creating an identity column using the IDENTITY function" in the Examples section below.
All credit goes to Eric Humphrey and bernd_k
Not with SELECT * - if you selected every column but the identity, it will be fine. The only way I can see is that you could do this by dynamically building the INSERT statement.
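Building the INSERT dynamically could look something like the sketch below. It is written in Python with the standard sqlite3 module standing in for SQL Server; copy_without_key is a hypothetical helper, and PRAGMA table_info is SQLite's way of listing columns. On SQL Server 2005 and later the column list would come from the catalog instead (for example sys.columns, which has an is_identity flag).

```python
import sqlite3


def copy_without_key(conn, source, target, key_column):
    """Copy all rows from source to target, leaving key_column for target to generate.

    Table and column names are spliced into the SQL text, so they must come
    from trusted code, never from user input.
    """
    # PRAGMA table_info returns one row per column; index 1 is the column name.
    columns = [row[1] for row in conn.execute(f'PRAGMA table_info("{source}")')]
    kept = [c for c in columns if c.lower() != key_column.lower()]
    column_list = ", ".join(f'"{c}"' for c in kept)
    sql = f'INSERT INTO "{target}" ({column_list}) SELECT {column_list} FROM "{source}"'
    conn.execute(sql)
    return sql


conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE T (ID INTEGER PRIMARY KEY AUTOINCREMENT, Value TEXT, Qty INTEGER);
    INSERT INTO T (Value, Qty) VALUES ('Hello T!', 1);
    CREATE TEMP TABLE tmp (ID INTEGER, Value TEXT, Qty INTEGER);
    INSERT INTO tmp VALUES (1, 'Hello tmp', 2);  -- ID 1 collides with the row already in T
""")
statement = copy_without_key(conn, "tmp", "T", "ID")
```

The generated statement names every column except the key, so the colliding ID in the temp table is simply ignored and T assigns a fresh one.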
Just list the columns you want to re-insert; you should never use select * anyway. If you don't want to type them, just drag them from the object browser. (If you expand the table and drag the word Columns, you will get all of them; just delete the id column.)
INSERT INTO #Table
SELECT MAX(Id) + ROW_NUMBER() OVER(ORDER BY Id)
set identity_insert on
Use this.
Might an "update where T.ID = #tmp.ID" work?
it gives me a chance to preview the data before I do the insert
I have joins between temp tables as part of my calculation; temp tables allows me to focus on the exact set data that I am working with. I think that was it. Any suggestions/comments?
For Part 1, as mentioned by Kolten in one of the comments, encapsulating your statements in a transaction and adding a parameter to toggle between display and commit will meet your needs. For Part 2, I would need to see what "calculations" you are attempting. Limiting your data to a temp table may be overcomplicating the situation.