I am working on SQL Server 2005 tables. I need to add a column named 'ID' as an IDENTITY column (with seed and increment values of 1 and 1).
Now my problem is that these tables already have thousands of records. So could you please suggest the best and easy way to perform this job?
Many Thanks,
Regards.
Anusha.
If you add an identity column all the existing records will get calculated incremental values based on the seed value you establish on the new column.
First make sure you have a good backup.
Here is an example:
alter table mydatabase.dbo.mytest
add id int identity (1,1)
The table will be locked until it finishes adding the identity column, so don't do this during peak hours. As always, test on dev first.
If you want to change an existing column to an identity that is harder.
I am using SQL Server 2008, and I have a table that contains about 50 million rows.
That table contains a primary identity column of type int.
I want to upgrade that column to be bigint.
I need to know how to do that in a quick way that will not make my DB server unavailable,
and will not delete or ruin any of my data
How should I best do it? What are the consequences of doing that?
Well, it won't be a quick'n'easy way to do this, really....
My approach would be this:
create a new table with identical structure - except for the ID column being BIGINT IDENTITY instead of INT IDENTITY
----[ put your server into exclusive single-user mode here; user cannot use your server from this point on ]----
find and disable all foreign key constraints referencing your table
turn SET IDENTITY_INSERT (your new table) ON
insert the rows from your old table into the new table
turn SET IDENTITY_INSERT (your new table) OFF
drop your old table
rename your new table to the old table name
update all tables that have a FK reference to your table to use BIGINT instead of INT (that should be doable with a simple ALTER TABLE ..... ALTER COLUMN FKID BIGINT)
re-create all foreign key relationships again
now you can return your server to normal multi-user usage again
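The steps above could look roughly like this. It is only a sketch: MyDatabase, dbo.Orders, dbo.OrderLines and every column and constraint name are made up for illustration. The foreign key is dropped rather than disabled because the referenced table itself gets dropped, and access is restricted at the database level, on the assumption that locking out this one database is enough. Try it on a restored copy first.

```sql
-- Hypothetical schema: dbo.Orders(ID INT IDENTITY PK, CustomerName)
-- referenced by dbo.OrderLines(OrderID) through FK_OrderLines_Orders.

-- 1. New table, identical except that ID is BIGINT IDENTITY.
CREATE TABLE dbo.Orders_New (
    ID           BIGINT IDENTITY(1,1) NOT NULL
                 CONSTRAINT PK_Orders_New PRIMARY KEY,
    CustomerName NVARCHAR(100) NOT NULL
);

-- 2. Keep other users out while the switch happens (assumption: database level is enough).
ALTER DATABASE MyDatabase SET SINGLE_USER WITH ROLLBACK IMMEDIATE;

-- 3. Remove the foreign keys that reference the old table.
ALTER TABLE dbo.OrderLines DROP CONSTRAINT FK_OrderLines_Orders;

-- 4. Copy the rows, keeping the existing ID values.
SET IDENTITY_INSERT dbo.Orders_New ON;
INSERT INTO dbo.Orders_New (ID, CustomerName)
SELECT ID, CustomerName
FROM dbo.Orders;
SET IDENTITY_INSERT dbo.Orders_New OFF;

-- 5. Swap the tables.
DROP TABLE dbo.Orders;
EXEC sp_rename 'dbo.Orders_New', 'Orders';

-- 6. Widen the referencing column, then put the foreign key back.
--    (Indexes on OrderID, if any, would have to be dropped and recreated around this.)
ALTER TABLE dbo.OrderLines ALTER COLUMN OrderID BIGINT NOT NULL;
ALTER TABLE dbo.OrderLines
    ADD CONSTRAINT FK_OrderLines_Orders
    FOREIGN KEY (OrderID) REFERENCES dbo.Orders (ID);

-- 7. Back to normal use.
ALTER DATABASE MyDatabase SET MULTI_USER;
```

An explicit column list is required for the INSERT while IDENTITY_INSERT is ON, which is why the copy does not use SELECT *.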
What am I missing?
Why can't you just do this:
ALTER TABLE tableName ALTER COLUMN ID bigint
I guess try it in a test environment first, but this always works for me.
Probably the best way is to create a new table with a BIGINT IDENTITY column, move the existing data using SET IDENTITY_INSERT ON; and then rename the tables. You will need to do this during a maintenance window, just as you would if you changed the data type in Management Studio (which would similarly create a new table, move the data, and block everyone in the process).
You could use an ALTER script for your column as @MobileMon said, but not before removing the constraints on it. Besides the FK constraints, you must also remove the PK constraint before changing the column type!
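For a table with no foreign keys pointing at it, the in-place route might look like the sketch below. dbo.tableName and PK_tableName are placeholder names (look up the real constraint name first), and on a large table each of these statements can hold locks for a long time.

```sql
-- Placeholder names: dbo.tableName with primary key constraint PK_tableName on ID.

-- The PK has to go first; ALTER COLUMN fails while the column is part of a key.
ALTER TABLE dbo.tableName DROP CONSTRAINT PK_tableName;

-- Widen the column. State NOT NULL explicitly so the nullability does not change.
ALTER TABLE dbo.tableName ALTER COLUMN ID BIGINT NOT NULL;

-- Put the primary key back.
ALTER TABLE dbo.tableName
    ADD CONSTRAINT PK_tableName PRIMARY KEY CLUSTERED (ID);
```

Any foreign keys referencing ID would need the same drop and recreate treatment around these statements.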
Also there is another creative way, if the ID data is not important (No FK etc):
Take a Backup of table (if it's in a separate FileGroup) or DB
Rename table (Having no more inserts)
Remove PK/Constraints from the column
Drop ID column
Add new ID column, with Identity
Apply PK
Rename table back to original name (back to work :) )
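A rough sketch of that first variant, where the old ID values can be thrown away. dbo.MyTable, MyTable_Maintenance and PK_MyTable are invented names, and GO is the batch separator understood by Management Studio and sqlcmd, not a T-SQL statement.

```sql
-- Invented names: dbo.MyTable with primary key PK_MyTable on ID.
-- Take the backup before running any of this.

-- Rename so that nothing keeps inserting under the old name.
EXEC sp_rename 'dbo.MyTable', 'MyTable_Maintenance';
GO

-- Remove the PK, then the old ID column.
ALTER TABLE dbo.MyTable_Maintenance DROP CONSTRAINT PK_MyTable;
ALTER TABLE dbo.MyTable_Maintenance DROP COLUMN ID;
GO

-- Add the new identity column; existing rows are numbered automatically.
ALTER TABLE dbo.MyTable_Maintenance
    ADD ID BIGINT IDENTITY(1,1) NOT NULL
    CONSTRAINT PK_MyTable PRIMARY KEY;
GO

-- Back to the original name.
EXEC sp_rename 'dbo.MyTable_Maintenance', 'MyTable';
GO
```

The order in which existing rows receive their new numbers is not guaranteed, which is another reason this only suits tables where the ID carries no meaning.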
And if the ID data is important:
Steps 1 and 2 as above
Create a new column
Transfer the data from the existing IDENTITY column to the new column
Drop the existing IDENTITY column & PK.
Make new column, with Identity
Apply PK
Rename table back to original name (back to work :) )
Important notes: 1. If the old ID values are not important and there are big gaps between your values (you have deletes besides inserts), you don't need BigInt. Just make the new ID column an Int again.
2. When the table grows and the ID is approaching the Int overflow value (about 2 billion), look at the actual row count under Properties, Storage for your table. The identity value may be close to overflow while the row count is much lower than that.
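The same comparison can be made with a query instead of the Properties dialog. A small sketch, assuming a table called dbo.MyTable (placeholder name) with an INT identity column:

```sql
-- Placeholder table name: dbo.MyTable.
SELECT
    IDENT_CURRENT('dbo.MyTable')              AS CurrentIdentity,
    2147483647 - IDENT_CURRENT('dbo.MyTable') AS RemainingIntValues,
    (SELECT SUM(p.rows)
     FROM sys.partitions AS p
     WHERE p.object_id = OBJECT_ID('dbo.MyTable')
       AND p.index_id IN (0, 1))              AS ApproxRowCount;  -- heap or clustered index only
```

A CurrentIdentity far above ApproxRowCount means most of the range was burned by deletes or rolled-back inserts rather than by live rows.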
Why would someone want to use a BigInt instead of Int as an IDENTITY?
Consider this scenario:
Your database exists in several environments including 1 instance in a live Production environment and several other instances in (TestA, B, C, etc.), (QA A, B, C, etc.), (Demo A, B, C, etc), (UAT A, B, C, etc.), (Training A, B, C, etc.) on and on and on... You don't even want to know...
This database IDENTITY field is used to pass in an unique number to a 3rd party provider which is a shared environment in the Non Production environments. The vendor charges an arm and a leg in order to set up multiple environments so the company has one for the production DB and one for ALL the others.
So... when testing happens in the non production environments these numbers can never cross each other from whatever non production environment you happen to be testing in. And the testing includes stress testing... sending 100's of thousands of rows at a time.
To top it off... ALL these environments get refreshed from Production, so the Identity field gets reset to whatever was in production. So one has to keep track of what spread was used in each environment and then reset the IDENTITY to a new spread that has never been used before. The 3rd party vendor will puke if an already-used number gets sent again in these environments. And the vendor is unwilling or unable to refresh or reset these numbers on their end.
This is a real world issue. The field is still an int in ALL environments, and the record of which spreads have been used gets updated every quarter, or whenever someone runs a massive stress test of 100's of thousands of transactions.
So in about 10 years this IDENTITY will have to be updated to a BIGINT or someone will have to convince the 3rd party vendor to refresh on their end.
Oh yeah, management could give a rat's ass about it until everything comes crashing down all of a sudden.
Then the HACK "ALTER TABLE tableName ALTER COLUMN ID bigint" will do just fine.
Space and index processing is CHEAP!
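For what it's worth, resetting the IDENTITY to a fresh spread after a refresh is usually done with DBCC CHECKIDENT. A sketch only; the table name dbo.VendorRequests and the value 500000000 are invented here and would come from whatever tracking sheet holds the spreads:

```sql
-- Invented table name and spread start.

-- Report the current identity value without changing anything.
DBCC CHECKIDENT ('dbo.VendorRequests', NORESEED);

-- Move this environment to the start of its new spread.
-- On a table that already has rows, the next insert gets 500000001.
DBCC CHECKIDENT ('dbo.VendorRequests', RESEED, 500000000);
```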
I have one staging table and want to insert its data into the Main table. While inserting from staging into Main I want to check whether each record exists: if it exists, update it, otherwise insert it as a new record. The issue is that neither the staging table nor the Main table has a key column on which I can compare values.
Is it possible to do this without key columns, i.e. without a primary key on either table? If yes, please suggest how.
Thanks in advance.
If there is no unique key or set of data within a row to define uniqueness, then no.
The set of data can be a combination of the data in each column, creating a sum of parts which will provide uniqueness; however without exposure to your data you would need to make that decision.
You write the WHERE-clause to include all the fields that make your record unique (ie. the fields that decide whether the record is new or should be updated.)
Take a look at this article (http://blogs.msdn.com/b/miah/archive/2008/02/17/sql-if-exists-update-else-insert.aspx) for hints on how to construct it.
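As a rough illustration of that WHERE-clause approach: the sketch below assumes dbo.Staging and dbo.Main both have FirstName, LastName and BirthDate columns that together identify a record, plus an Email column to be refreshed. All of these names are made up; substitute whatever combination is unique in your data.

```sql
-- Made-up columns: (FirstName, LastName, BirthDate) identify a record, Email is the payload.
BEGIN TRANSACTION;

-- Update the rows that already exist in Main.
UPDATE m
SET    m.Email = s.Email
FROM   dbo.Main AS m
JOIN   dbo.Staging AS s
       ON  s.FirstName = m.FirstName
       AND s.LastName  = m.LastName
       AND s.BirthDate = m.BirthDate;

-- Insert the rows that are not in Main yet.
INSERT INTO dbo.Main (FirstName, LastName, BirthDate, Email)
SELECT s.FirstName, s.LastName, s.BirthDate, s.Email
FROM   dbo.Staging AS s
WHERE  NOT EXISTS (SELECT 1
                   FROM   dbo.Main AS m
                   WHERE  m.FirstName = s.FirstName
                     AND  m.LastName  = s.LastName
                     AND  m.BirthDate = s.BirthDate);

COMMIT TRANSACTION;
```

Rows with NULL in any of the comparison columns never match with =, so they would be inserted again on every run unless the comparison handles NULLs explicitly.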
If you are using SQL Server 2008 or later, you could also use the MERGE statement. I haven't tried it on tables without keys, so I don't know whether it would work for you.
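MERGE does not need declared keys, only an ON condition, so something along these lines should be possible. A sketch, assuming the invented columns FirstName, LastName and BirthDate identify a record and Email is the value to refresh:

```sql
-- Invented columns: (FirstName, LastName, BirthDate) identify a record, Email is the payload.
MERGE dbo.Main AS m
USING dbo.Staging AS s
   ON  m.FirstName = s.FirstName
   AND m.LastName  = s.LastName
   AND m.BirthDate = s.BirthDate
WHEN MATCHED THEN
    UPDATE SET Email = s.Email
WHEN NOT MATCHED BY TARGET THEN
    INSERT (FirstName, LastName, BirthDate, Email)
    VALUES (s.FirstName, s.LastName, s.BirthDate, s.Email);
```

MERGE raises an error if more than one staging row matches the same Main row, so the staging data has to be unique on the comparison columns first.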
SQL Server 2008+
I have a table with an auto-increment column which I would like to have increment not only on insert but also update. This column is not the primary key, but there is also a primary key which is a GUID created automatically via newid().
As far as I can tell, there are two ways to do this.
1.) Delete the existing row and insert a new row with identical values (plus any updates).
or
2.) Update the existing row and use the following to get the "next" identity value:
IDENT_CURRENT('myTable') + IDENT_INCR('myTable')
In either case, I'm forced to allow identity inserts. (With option 1, because the primary key for the table needs to remain the same, and with option 2 because I'm updating the auto-increment column with a specific value.) I'm not sure what the locking/performance consequences of this are.
Any thoughts on this? Is there a better approach? The goal here is to maintain an always increasing set of integer values in the column whenever a row is inserted or updated.
I think a column of type rowversion (formerly known as "timestamp") might be your simplest choice, although at 8 bytes these can amount to fairly large integers. The "timestamp" syntax is deprecated in favor of rowversion (since ISO SQL has a timestamp datatype).
If you stay with the Identity column approach, you would probably want to put your logic into an UPDATE trigger, which would effectively replace the UPDATE with the INSERT and DELETE combination you've described.
Note that Identity column values are not guaranteed to be sequential, only increasing.
Does it need to be an integer column? A timestamp column will provide you the functionality you are looking for out of the box.
Columns with an identity property can't be updated. Once the column with an identity property on it has been assigned a value, either automatically, or with identity_insert on, it is an invariant value. Further the identity property may not be disabled or removed via alter column.
I believe what you want to look at is a SQL Server TIMESTAMP (now called rowversion in SQL Server 2008). It is fundamentally an auto-incrementing binary value. Each database has a unique rowversion counter. Each row insert/update in a table with a timestamp/rowversion column results in the counter being ticked up and the new value assigned to the inserted/modified row.
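A small sketch of the rowversion idea, using an invented table dbo.RowVersionDemo with a GUID primary key like the one in the question:

```sql
-- Invented table and column names.
CREATE TABLE dbo.RowVersionDemo (
    Id          UNIQUEIDENTIFIER NOT NULL DEFAULT NEWID() PRIMARY KEY,
    Payload     VARCHAR(50) NOT NULL,
    ChangeStamp ROWVERSION            -- set by the server on every insert and update
);

INSERT INTO dbo.RowVersionDemo (Payload) VALUES ('first');

-- ChangeStamp gets a new, higher value here without being mentioned.
UPDATE dbo.RowVersionDemo SET Payload = 'changed';

-- The 8-byte value can be read as a BIGINT if an integer is needed.
SELECT Id, Payload, CAST(ChangeStamp AS BIGINT) AS ChangeStampAsBigint
FROM dbo.RowVersionDemo;
```

The counter is shared by every table in the database that has a rowversion column, so the values in one table increase but will have gaps.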
I am working on "cleaning up" a database and need to synchronize the IDENTITY columns. I am using stored procedures to handle the data and mirror it from one table to the next (after cleaning it and correcting the datatypes). At some point in the future I will want to cut off the old table and use only the new table, my question is how to have the IDENTITY field stay in sync while they are both in use... Once the old table is removed the new one will need to continue auto-incrementing and rebuilding/altering it to change the IDENTITY field is not an option. Is this possible or is there a better way to go about this?
My other thought was to create a lookup table to store the ID columns of both tables, and any time there is an insert into the new table, take the old ID and the new ID and insert them into the lookup table. This gets kind of messy once the old table is out of the way, though.
Been there, done that. Put the old id in the new table as an FK. Drop that column just before you drop the old table.
Make the new table's ID a plain, non-identity column for now.
Modify your data population procedures to populate that non-identity column on the new table with the old table's identity value.
At cutover, switch your new field to auto-increment and set the seed number accordingly.
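Since the question rules out rebuilding or altering the table later, a different technique from the answers above may be worth considering: create the new table with IDENTITY from day one and have the mirroring procedure copy the old ID values under SET IDENTITY_INSERT. The sketch below assumes invented tables dbo.Customer_Old and dbo.Customer_New with a single Name column.

```sql
-- Invented names; this is a swapped-in alternative, not the method described above.
CREATE TABLE dbo.Customer_New (
    ID   INT IDENTITY(1,1) NOT NULL PRIMARY KEY,
    Name NVARCHAR(100) NOT NULL
);

-- Inside the mirroring procedure: copy rows that are not there yet, keeping the old IDs.
SET IDENTITY_INSERT dbo.Customer_New ON;

INSERT INTO dbo.Customer_New (ID, Name)
SELECT o.ID, LTRIM(RTRIM(o.Name))          -- whatever cleaning is needed goes here
FROM   dbo.Customer_Old AS o
WHERE  NOT EXISTS (SELECT 1 FROM dbo.Customer_New AS n WHERE n.ID = o.ID);

SET IDENTITY_INSERT dbo.Customer_New OFF;

-- At cutover: make sure the identity continues after the highest copied ID.
DBCC CHECKIDENT ('dbo.Customer_New', RESEED);
```

Only one table per session can have IDENTITY_INSERT switched ON at a time, so the OFF at the end matters if the procedure mirrors several tables.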
Why doesn't SQL Server allow more than one IDENTITY column in a table? Any specific reasons?
Why would you need it? SQL Server keeps track of a single value (current identity value) for each table with IDENTITY column so it can have just one identity column per table.
An identity column is a column (also known as a field) in a database table that:
Uniquely identifies every row in the table
Is made up of values generated by the database
This is much like an AutoNumber field in Microsoft Access or a sequence in Oracle.
An identity column differs from a primary key in that its values are managed by the server and (except in rare cases) can't be modified. In many cases an identity column is used as a primary key; however, this is not always the case.
SQL Server uses the identity column as the key value to refer to a particular row, so only a single identity column can be created. Also, if no identity column is explicitly declared, SQL Server internally stores a separate column which contains a key value for each row. As stated, if you want more than one column to hold unique values, you can make use of the UNIQUE keyword.
SQL Server stores the identity in an internal table, using the id of the table as its key. So it's impossible for SQL Server to have more than one identity column per table.
Because MS realized that better than 80% of users would only want one auto-increment column per table and the work-around to have a second (or more) is simple enough i.e. create an IDENTITY with seed = 1, increment = 1 then a calculated column multiplying the auto-generated value by a factor to change the increment and adding an offset to change the seed.
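That work-around might look like this. The table and column names are invented, and the factor 10 and offset 1000 are arbitrary examples:

```sql
-- Invented names; factor 10 changes the increment, offset 1000 changes the seed.
CREATE TABLE dbo.TwoCounters (
    ID       INT IDENTITY(1,1) NOT NULL PRIMARY KEY,
    SecondID AS (ID * 10 + 1000),      -- computed column: 1010, 1020, 1030, ...
    Payload  VARCHAR(50) NOT NULL
);

INSERT INTO dbo.TwoCounters (Payload) VALUES ('a');
INSERT INTO dbo.TwoCounters (Payload) VALUES ('b');

SELECT ID, SecondID, Payload FROM dbo.TwoCounters;
```

The second number is derived from the first, so the two can never drift apart, which also means it cannot count independently of ID.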
Yes, sequences allow more than one identity-like column in a table, but there are some issues here. In a typical development scenario I have seen developers manually inserting valid values into a column that is supposed to be populated from a sequence. Later on, when the sequence generates that same value for an insert, the insert may fail with a unique key violation.
Also, in a multi-developer / multi-vendor scenario, developers might use the same sequence for more than one table (sequences are not linked to tables). This can lead to missing values in one of the tables: tableA might get the value 1, tableB might use the value 2, and tableA will then get 3. This means that tableA will have 1 and 3 (missing 2).
Apart from this, there is another scenario where a table is truncated every day. Since sequences have no link to the table, the truncated table will keep using Seq.NextVal from where it left off (unless you manually reset the sequence), leading to missing values or, even more dangerous, an arithmetic overflow error after some time.
For the above reasons, I feel that both Oracle sequences and SQL Server identity columns are good for their purposes. I would like Oracle to implement the concept of an identity column and SQL Server to implement the sequence concept, so that developers can use either of the two as their requirements dictate.
The whole purpose of an identity column is that it will contain a unique value for each row in the table. So why would you need more than one of them in any given table?
Perhaps you need to clarify your question, if you have a real need for more than one.
An identity column is used to uniquely identify a single row of a table. If you want other columns to be unique, you can create a UNIQUE index for each "identity" column that you may need.
I've always seen this as an arbitrary and bad limitation for SQL Server. Yes, you only want one identity column to actually identify a row, but there are valid reasons why you would want the database to auto-generate a number for more than one field in the database.
That's the nice thing about sequences in Oracle. They're not tied to a table. You can use several different sequences to populate as many fields as you like in the same table. You could also have more than one table share the same sequence, although that's probably a really bad decision. But the point is you could. It's more granular and gives you more flexibility.
The bad thing about sequences is that you have to write code to actually increment them, whether it's in your insert statement or in an on-insert trigger on the table. The nice thing about SQL Server identity is that all you have to do is change a property or add a keyword to your table creation and you're done.
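Worth noting for anyone reading this later: SQL Server 2012 and newer do have sequence objects, which are not available in the 2005 and 2008 versions discussed above. A sketch with invented names, using column defaults so the insert does not have to mention the sequences:

```sql
-- Requires SQL Server 2012 or later. All names are invented.
CREATE SEQUENCE dbo.InvoiceNumbers AS BIGINT START WITH 1    INCREMENT BY 1;
CREATE SEQUENCE dbo.TicketNumbers  AS BIGINT START WITH 1000 INCREMENT BY 10;

CREATE TABLE dbo.SequenceDemo (
    ID        INT IDENTITY(1,1) NOT NULL PRIMARY KEY,
    InvoiceNo BIGINT NOT NULL DEFAULT (NEXT VALUE FOR dbo.InvoiceNumbers),
    TicketNo  BIGINT NOT NULL DEFAULT (NEXT VALUE FOR dbo.TicketNumbers),
    Payload   VARCHAR(50) NOT NULL
);

-- Three auto-generated numbers per row, one identity and two sequences.
INSERT INTO dbo.SequenceDemo (Payload) VALUES ('a'), ('b');

SELECT ID, InvoiceNo, TicketNo, Payload FROM dbo.SequenceDemo;
```

The same caveats raised above for Oracle apply: nothing ties a sequence to a table, so sharing one between tables or inserting values by hand can still produce gaps or duplicates.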