Replace identity column from int to bigint - SQL

I am using SQL Server 2008, and I have a table that contains about 50 million rows.
That table contains a primary identity column of type int.
I want to upgrade that column to bigint.
I need to know how to do that quickly, in a way that will not make my DB server unavailable
and will not delete or ruin any of my data.
How should I best do it? What are the consequences of doing that?

Well, there isn't a quick'n'easy way to do this, really...
My approach would be this:
create a new table with identical structure - except for the ID column being BIGINT IDENTITY instead of INT IDENTITY
----[ put your server into exclusive single-user mode here; users cannot use your server from this point on ]----
find and disable all foreign key constraints referencing your table
SET IDENTITY_INSERT (your new table) ON
insert the rows from your old table into the new table
SET IDENTITY_INSERT (your new table) OFF
delete your old table
rename your new table to the old table name
update all tables that have a FK reference to your table to use BIGINT instead of INT (that should be doable with a simple ALTER TABLE ..... ALTER COLUMN FKID BIGINT)
re-create all foreign key relationships again
now you can return your server to normal multi-user usage
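A minimal sketch of the core copy steps, assuming a hypothetical dbo.MyTable with an INT IDENTITY column named ID and a single payload column Col1; adapt the column list, constraints, and indexes to your real schema:

CREATE TABLE dbo.MyTable_New (
    ID   BIGINT IDENTITY(1,1) NOT NULL PRIMARY KEY,
    Col1 NVARCHAR(100) NULL              -- ...every other column, identical
);

SET IDENTITY_INSERT dbo.MyTable_New ON;

INSERT INTO dbo.MyTable_New (ID, Col1)   -- an explicit column list is
SELECT ID, Col1 FROM dbo.MyTable;        -- required with IDENTITY_INSERT ON

SET IDENTITY_INSERT dbo.MyTable_New OFF;

DROP TABLE dbo.MyTable;                  -- FKs referencing it must be gone
EXEC sp_rename 'dbo.MyTable_New', 'MyTable';

The identity seed carries over on its own: when IDENTITY_INSERT is ON and you insert a value higher than the current seed, SQL Server adopts it as the new current identity value.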

What am I missing?
Why can't you just do this:
ALTER TABLE tableName ALTER COLUMN ID bigint
I suggest trying it in a test environment first, but this has always worked for me.

Probably the best way is to create a new table with a BIGINT IDENTITY column, move the existing data using SET IDENTITY_INSERT ON; and then rename the tables. You will need to do this during a maintenance window, just as you would if you changed the data type in Management Studio (which would similarly create a new table, move the data, and block everyone in the process).

You could use an ALTER script for your column as @MobileMon said, but you can't do this before removing constraints. And besides FK constraints, you must also remove the PK constraint before changing the column type!
Also, there is another creative way, if the ID data is not important (no FKs etc.); a minimal sketch follows this list:
Take a backup of the table (if it's in a separate FileGroup) or of the DB
Rename the table (so there are no more inserts)
Remove the PK/constraints from the column
Drop the ID column
Add a new ID column, with IDENTITY
Apply the PK
Rename the table back to its original name (back to work :) )
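A minimal sketch of that recipe, assuming a hypothetical dbo.Orders with an INT IDENTITY PK named ID, a PK constraint named PK_Orders, and nothing referencing it:

EXEC sp_rename 'dbo.Orders', 'Orders_Work';              -- no more inserts

ALTER TABLE dbo.Orders_Work DROP CONSTRAINT PK_Orders;   -- assumed PK name
ALTER TABLE dbo.Orders_Work DROP COLUMN ID;

ALTER TABLE dbo.Orders_Work ADD ID BIGINT IDENTITY(1,1) NOT NULL;
ALTER TABLE dbo.Orders_Work ADD CONSTRAINT PK_Orders PRIMARY KEY (ID);

EXEC sp_rename 'dbo.Orders_Work', 'Orders';              -- back to work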
And if the ID data is important (sketch after this list as well):
Steps 1 and 2 as above
Create a new column
Transfer the data from the existing IDENTITY column to the new column
Drop the existing IDENTITY column & PK
Make the new column the ID, with IDENTITY
Apply the PK
Rename the table back to its original name (back to work :) )
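And a sketch of the value-preserving variant, with the same hypothetical names. One caveat: SQL Server cannot attach the IDENTITY property to an existing column, so this keeps the old values but auto-numbering has to be restored separately (a table rebuild, or a SEQUENCE default on 2012+):

ALTER TABLE dbo.Orders ADD NewID BIGINT NULL;

UPDATE dbo.Orders SET NewID = ID;                        -- copy the old values

ALTER TABLE dbo.Orders DROP CONSTRAINT PK_Orders;
ALTER TABLE dbo.Orders DROP COLUMN ID;

EXEC sp_rename 'dbo.Orders.NewID', 'ID', 'COLUMN';
ALTER TABLE dbo.Orders ALTER COLUMN ID BIGINT NOT NULL;
ALTER TABLE dbo.Orders ADD CONSTRAINT PK_Orders PRIMARY KEY (ID);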
Important notes:
1. If the old ID column's values are not important and there are big gaps between them (you have deletes besides inserts), you may not need BIGINT at all; just make the new ID column an INT again.
2. When the table grows and is approaching the overflow value (about 2 billion), look at the actual row count under your table's Properties, Storage. You may be approaching overflow even though your row count is much lower than that.

Why would someone want to use a BigInt instead of Int as an IDENTITY?
Consider this scenario:
Your database exists in several environments including 1 instance in a live Production environment and several other instances in (TestA, B, C, etc.), (QA A, B, C, etc.), (Demo A, B, C, etc), (UAT A, B, C, etc.), (Training A, B, C, etc.) on and on and on... You don't even want to know...
This database IDENTITY field is used to pass a unique number to a 3rd-party provider, which is a shared environment for the non-production environments. The vendor charges an arm and a leg to set up multiple environments, so the company has one for the production DB and one for ALL the others.
So... when testing happens in the non-production environments, these numbers can never cross each other, no matter which non-production environment you happen to be testing in. And the testing includes stress testing... sending hundreds of thousands of rows at a time.
To top it off... ALL these environments get refreshed from Production, so the IDENTITY field gets reset to whatever was in production. So one has to keep track of what spread was used in each environment and then reset the IDENTITY to a new spread that has never been used before. The 3rd-party vendor will puke if an already-used number gets sent again in these environments. And the vendor is unwilling or unable to refresh or reset these numbers on their end.
This is a real-world issue: the current field remains an int in ALL environments, and the bookkeeping of these spreads is updated every quarter, or whenever someone does massive stress testing with hundreds of thousands of transactions.
So in about 10 years this IDENTITY will have to be updated to a BIGINT, or someone will have to convince the 3rd-party vendor to refresh on their end.
Oh yeah, management couldn't give a rat's ass about it until everything comes crashing down all of a sudden.
Then the HACK "ALTER TABLE tableName ALTER COLUMN ID bigint" will do just fine.
Space and index processing is CHEAP!

Related

Postgres 9.4 Foreign Data Wrapper "FDW" is unable to keep the serial data type in sync between insertions from different ends

I have a simple setup of two CentOS servers, both running postgres-9.4, to simulate the FDW scenario in Postgres 9.4.
I used FDW to link a simple table with another database on another server. Reading worked perfectly from both ends; the issue was with the serial primary key, which was not in sync. In other words, if I inserted from the original table and afterwards inserted from the foreign table, the counter doesn't stay in sync, and vice versa.
Based on the comment I got from Nick Barnes: yes, I do need to keep the counter on both sides in sync, so I made a function that queries the actual database for the latest index every time, so that it always inserts to the correct record.
I am still not sure if this is going to survive, but I'll make it work in production really soon.
I blogged about it here with a table example.
I had the same problem, and tried it like Negma suggested in his blog. That solution only works if you insert a single row. If you insert more rows in the same transaction, select max(id) will always return the same id and you will get non-unique ids.
I solved this by changing the type of the id from serial/integer to uuid. Then you can do the same as Negma suggested, but with gen_random_uuid() from the pgcrypto extension.
So at the foreign server I did:
ALTER TABLE tablename ALTER COLUMN columnname SET DEFAULT gen_random_uuid();
And the same for the foreign table.
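For completeness, a slightly fuller sketch of that change, assuming a hypothetical table named tickets; pgcrypto must be available on the server that owns the real table, and regenerating ids is only safe when nothing references them:

-- On the server that owns the table:
CREATE EXTENSION IF NOT EXISTS pgcrypto;

ALTER TABLE tickets
    ALTER COLUMN id DROP DEFAULT,                       -- the old serial default can't be cast
    ALTER COLUMN id TYPE uuid USING gen_random_uuid(),  -- assigns brand-new ids!
    ALTER COLUMN id SET DEFAULT gen_random_uuid();

On the foreign side, declare the column as uuid and repeat the SET DEFAULT; a default declared on the foreign table is applied locally when you insert through it.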

Insert & Delete from SQL best practice

I have a database with 2 tables: CurrentTickets & ClosedTickets. When a user creates a ticket via the web application, a new row is created. When the user closes a ticket, the row from CurrentTickets is inserted into ClosedTickets and then deleted from CurrentTickets. If a user reopens a ticket, the same thing happens, only in reverse.
The catch is that one of the columns being copied back to CurrentTickets is the PK column (TicketID), whose identity is set to ON.
I know I can set IDENTITY_INSERT to ON, but as I understand it, this is generally frowned upon. I'm assuming that my database is a bit poorly designed. Is there a way for me to accomplish what I need without using IDENTITY_INSERT? How would I keep the TicketID column autoincremented without making it an identity column? I figure I could add another column RowID and make that the PK, but I still want the TicketID column to autoincrement if possible while not being considered an identity column.
This just seems like bad design with 2 tables. Why not just have a single tickets table that stores all tickets? Then add a column called IsClosed, which is false by default. Once a ticket is closed you simply update the value to true, and you don't have to do any copying to and from other tables.
All of your code around this part of your application will be much simpler and easier to maintain with a single table for tickets.
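For what it's worth, a minimal sketch of that single-table design, with a hypothetical default-constraint name:

ALTER TABLE dbo.CurrentTickets
    ADD IsClosed BIT NOT NULL
    CONSTRAINT DF_Tickets_IsClosed DEFAULT (0);

-- Closing (or reopening) a ticket becomes a one-column update
-- instead of a copy + delete:
UPDATE dbo.CurrentTickets SET IsClosed = 1 WHERE TicketID = 42;  -- sample id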
The simple answer is: DO NOT make a column an identity column if you want to influence the next Id generated in that column.
Also, I think you have a really poor schema. Rather than having two tables, just add another column to your CurrentTickets table, something like Open BIT; set its value to 1 by default and change the value to 0 when the client closes the ticket.
And you can turn it on/off as many times as the client changes his mind, without having to go through all the trouble of IDENTITY_INSERT and managing a whole separate table.
Update
Since you have now mentioned it's SQL Server 2014, you have access to something called a Sequence object.
You define the object once, and then every time you want a sequential number from it you just select the next value; it is a kind of hybrid between an identity column and a plain INT column.
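For illustration, a minimal sketch with a hypothetical sequence name (SQL Server 2012+):

CREATE SEQUENCE dbo.TicketSeq AS INT START WITH 1 INCREMENT BY 1;

-- Pull a value explicitly whenever you need one...
DECLARE @NextTicketID INT = NEXT VALUE FOR dbo.TicketSeq;

-- ...or bind it as a default so inserts auto-number without IDENTITY:
ALTER TABLE dbo.CurrentTickets
    ADD CONSTRAINT DF_CurrentTickets_TicketID
    DEFAULT (NEXT VALUE FOR dbo.TicketSeq) FOR TicketID;

Unlike IDENTITY, the same sequence can feed both tables, so a ticket keeps its number no matter which table it currently lives in.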
To achieve this in recent versions of SQL Server, use the OUTPUT clause (definition on MSDN).
OUTPUT clause used with a table variable:
DECLARE @MyTableVar TABLE (...);

DELETE FROM dbo.CurrentTickets
OUTPUT DELETED.* INTO @MyTableVar
WHERE <...>;

INSERT INTO dbo.ClosedTickets
SELECT * FROM @MyTableVar;
The second table should have an ID column, but without the IDENTITY property; the values are generated by the other table.

rebuild/refresh my table's PK list - gap in numbers

I have finished all my changes to a database table in sql server management studio 2012, but now I have a large gap between some values due to editing. Is there a way to keep my data, but re-assign all the ID's from 1 up to my last value?
I would like this cleaned up as I populate dropdownlists with these values and then I make interactions with my database with the assumption that my dropdownlist index and the table's ID match up, which is not the case right now.
My current DB has a large gap from 7 to 28. I would like to shift everything from 28 and up back down to 8, 9, 10, 11, etc., so that my database has NO gaps from 1 onward.
If the solution is tricky please give me some steps as I am new to SQL.
Thank you!
Yes, there are any number of ways to "close the gaps" in an auto generated sequence. You say you're new to SQL so I'll assume you're also new to relational concepts. Here is my advice to you: don't do it.
The ID field is a surrogate key. There are several aspects of surrogates one must be mindful of when using them, but the one I want to impress upon you is,
A surrogate key is used to make the row unique. Other than the guarantee that the value is unique, no other assumptions may be made concerning the value. In particular, no meaning may be derived from the value as to the contents of the row or the row's relationship to any other row.
You have designed your app with a built-in assumption of the value of the key field (that they will be consecutive). Already it is causing you problems. Do you really want to go through this every time you make changes to the table? And suppose a future feature requires you to filter out some of the choices according to an option the user has selected? Or enable the user to specify the order of the items? Not going to be easy. So what is the solution?
You can create an additional (non-visible) field in the dropdown list that contains the key value. When the user makes a selection, use that index to get the key value of the selection and then go out to the database and get whatever additional data you need. This will work if you populate the list from the entire table or just select a few according to some as yet unknown filtering criteria or change the order in any way.
Voilà. You never have this problem again, no matter how often you add and remove rows in the table.
However, on the off chance that you are as stubborn as me (not likely!) or just refuse to listen to the melodious voice of reason and experience, then try this:
Create a new table exactly like the old table, including auto incrementing PK.
Populate the new table using a Select from the old table. You can specify any order you want.
Drop the old table.
Rename the new table to the old table name.
You will have to drop and redefine any FKs from other tables. But this entire process can be placed in a script, because if you do this once, you'll probably do it again.
Now all the values are consecutive. Until you edit the table again...
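If you do go down that road, here is a minimal sketch with hypothetical names (dbo.Items with an ID identity column and one data column):

CREATE TABLE dbo.Items_New (
    ID   INT IDENTITY(1,1) NOT NULL PRIMARY KEY,
    Name NVARCHAR(100) NOT NULL
);

INSERT INTO dbo.Items_New (Name)   -- omit ID so values regenerate as 1..n
SELECT Name
FROM dbo.Items
ORDER BY ID;                       -- keeps the old relative order

DROP TABLE dbo.Items;
EXEC sp_rename 'dbo.Items_New', 'Items';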
You should refactor the code for your dropdown list and not the PK of the table.
If you do not agree, you can do one of the following:
Insert another column holding the dropdown's "order of appearance", make a unique index on it, and fill it by hand (or programmatically).
Replacing the SERIAL with an INT would also work; make a unique index on the column and fill it by hand (or programmatically).
Remove the large ids and reseed your serial; the code depends on your DBMS.
This happens to me all the time. If you don't have any foreign key constraints then it should be an easy fix.
Remember, a DELETE statement will remove the record but keep the identity seed the same. (If I remove id #5 and #5 was the last record inserted, then SQL Server still stores the identity seed value at "6".)
TRUNCATING the table will reset the identity seed back to its original value.
SET IDENTITY_INSERT [TABLE] ON can also be used to insert the correct data in the correct order if truncating cannot happen.
SELECT *
INTO #tempTable
FROM [TableTryingToFix];

TRUNCATE TABLE [TableTryingToFix];

INSERT INTO [TableTryingToFix] (COL1, COL2, COL3, ETC)
SELECT COL1, COL2, COL3, ETC
FROM #tempTable
ORDER BY oldTableID;

Identity column in SQL Server

Why doesn't SQL Server allow more than one IDENTITY column in a table? Any specific reasons?
Why would you need it? SQL Server keeps track of a single value (the current identity value) for each table with an IDENTITY column, so it can have just one identity column per table.
An identity column is a column (also known as a field) in a database table that:
Uniquely identifies every row in the table
Is made up of values generated by the database
This is much like an AutoNumber field in Microsoft Access or a sequence in Oracle.
An identity column differs from a primary key in that its values are managed by the server and (except in rare cases) can't be modified. In many cases an identity column is used as a primary key; however, this is not always the case.
SQL Server uses the identity column as the key value to refer to a particular row, so only a single identity column can be created. Also, if no identity column is explicitly declared, SQL Server internally stores a separate column which contains a key value for each row. As stated, if you want more than one column to hold unique values, you can make use of the UNIQUE keyword.
SQL Server stores the identity in an internal table, using the id of the table as its key. So it's impossible for SQL Server to have more than one identity column per table.
Because MS realized that better than 80% of users would only want one auto-increment column per table, and the workaround to have a second (or more) is simple enough: create an IDENTITY with seed = 1, increment = 1, then a computed column multiplying the auto-generated value by a factor to change the increment and adding an offset to change the seed.
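For illustration, a minimal sketch of that workaround with a hypothetical table; the computed ID2 behaves like a second auto-number with seed 510 and increment 10:

CREATE TABLE dbo.Demo (
    ID  INT IDENTITY(1,1) NOT NULL PRIMARY KEY,
    ID2 AS (ID * 10 + 500)    -- row 1 gets 510, row 2 gets 520, ...
);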
Yes, sequences allow more than one identity-like column in a table, but there are some issues here. In a typical development scenario I have seen developers manually inserting valid values into a column (which is supposed to be filled through a sequence). Later on, when the sequence tries inserting a value into the table, it may fail due to a unique key violation.
Also, in a multi-developer / multi-vendor scenario, developers might use the same sequence for more than one table (as sequences are not linked to tables). This might lead to missing values in one of the tables, i.e. tableA might get the value 1 while tableB might use value 2, and tableA will get 3. This means that tableA will have 1 and 3 (missing 2).
Apart from this, there is another scenario where you have a table which is truncated every day. Since sequences have no link to the table, the truncated table will continue consuming Seq.NextVal (unless you manually reset the sequence), leading to missing values or, even more dangerous, an arithmetic overflow error after some time.
Owing to the above reasons, I feel that both Oracle sequences and the SQL Server identity column are good for their purposes. I would prefer Oracle implementing the concept of the identity column and SQL Server implementing the sequence concept, so that developers can implement either of the two as per their requirements.
The whole purpose of an identity column is that it will contain a unique value for each row in the table. So why would you need more than one of them in any given table?
Perhaps you need to clarify your question, if you have a real need for more than one.
An identity column is used to uniquely identify a single row of a table. If you want other columns to be unique, you can create a UNIQUE index for each "identity" column that you may need.
I've always seen this as an arbitrary and bad limitation for SQL Server. Yes, you only want one identity column to actually identify a row, but there are valid reasons why you would want the database to auto-generate a number for more than one field in the database.
That's the nice thing about sequences in Oracle. They're not tied to a table. You can use several different sequences to populate as many fields as you like in the same table. You could also have more than one table share the same sequence, although that's probably a really bad decision. But the point is you could. It's more granular and gives you more flexibility.
The bad thing about sequences is that you have to write code to actually increment them, whether it's in your insert statement or in an on-insert trigger on the table. The nice thing about SQL Server identity is that all you have to do is change a property or add a keyword to your table creation and you're done.

Fixing DB Inconsistencies - ID Fields

I've inherited a (Microsoft?) SQL database that wasn't very pristine in its original state. There are still some very strange things in it that I'm trying to fix - one of them is inconsistent ID entries.
In the accounts table, each entry has a number called accountID, which is referenced in several other tables (notes, equipment, etc.). The problem is that the numbers (for some random reason) range from about -100000 to +2000000, when there are only about 7000 entries.
Is there any good way to re-number them while changing the corresponding numbers in the other tables? At my disposal I also have ColdFusion, so anything that works with SQL and/or ColdFusion I'll accept.
Surrogate keys are meant to be meaningless, so unless you actually had a database integrity issue (like there were no foreign key constraints properly defined) or your identity was approaching the maximum for its datatype, I would leave them alone and go after some other low-hanging fruit that would have more impact.
In this instance, it sounds like "why" is a better question than "how". The OP notes that there is a strange problem that needs to be fixed but doesn't say why it is a problem. Is it causing problems? What positive impact would changing these numbers have? Unless you originally programmed the system and understand precisely why the numbers are in their current state, you are taking quite a risk making changes like this.
I would talk to an accountant (or at least your financial people) before messing in any way with the numbers in the accounts tables if this is a financial app. The table of accounts is very critical to how finances are reported. These IDs may have meaning you don't understand. No one puts in a negative id unless they had a reason. I would under no circumstances change that unless I understood why it was negative to begin with. You could truly screw up your tax reporting or some other thing by making an unneeded change.
You could probably disable the foreign key relationships (if you're able to take it offline temporarily) and then update the primary keys using a script. I've used this update script before to change values, and you could pretty easily wrap this code in a cursor to go through the key values in question, one by one, and update the arbitrary value to an incrementing value you're keeping track of.
Check out the script here: http://vyaskn.tripod.com/sql_server_search_and_replace.htm
If you just have a list of tables that use the primary key, you could set up a series of UPDATE statements that run inside your cursor, and then you wouldn't need to use this script (which can be a little slow).
It's worth asking, though, why these values appear out of whack. Does this database have values added and deleted constantly? Are the primary key values really arbitrary, or do they just appear to be, while they really have meaning? Though I'm all for consolidating, you'd have to ensure that there's no purpose to those values.
With ColdFusion this shouldn't be a herculean task, but it will be messy and you'll have to be careful. One method you could use would be to script the database and then generate a brand new, blank table schema. Set the accountID as an identity field in the new database.
Then, using ColdFusion, write a query that will pull all of the old account data and insert it into the new database one row at a time. For each row, let the new database assign a new ID. After each insert, pull the new ID (using either @@IDENTITY or MAX(accountID)) and store the new ID and the old ID together in a temporary table so you know which old IDs belong to which new IDs.
Next, repeat the process with each of the child tables. For each old ID, pull its child entries and re-insert them into the new database using the new IDs. If the primary keys on the child tables are fine, you can insert them as-is or let the server assign new ones if they don't matter.
Assigning new IDs in place by disabling relationships temporarily may work, but you might also run into trouble if one of the entries is assigned an ID that is already being used by the old data.
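If you would rather build that old-to-new mapping in set-based SQL instead of row-by-row ColdFusion, one option is MERGE, whose OUTPUT clause can see source columns (plain INSERT ... OUTPUT cannot). A sketch with hypothetical names:

CREATE TABLE #IDMap (oldID INT NOT NULL, newID INT NOT NULL);

MERGE INTO new_accounts AS tgt
USING accounts AS src
   ON 1 = 0                               -- never matches, so every row inserts
WHEN NOT MATCHED THEN
    INSERT (AccountName)                  -- hypothetical payload column
    VALUES (src.AccountName)
OUTPUT src.accountID, INSERTED.accountID INTO #IDMap (oldID, newID);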
Create a new column in the accounts table for your new ID, and a new column in each of your related tables to reference the new ID column.
ALTER TABLE accounts
ADD new_accountID int IDENTITY;

ALTER TABLE notes
ADD new_accountID int;

ALTER TABLE equipment
ADD new_accountID int;
Then you can map the new_accountID column on each of your referencing tables to the accounts table.
UPDATE notes
SET new_accountID = accounts.new_accountID
FROM accounts
INNER JOIN notes ON (notes.accountID = accounts.accountID)
UPDATE equipment
SET new_accountID = accounts.new_accountID
FROM accounts
INNER JOIN equipment ON (equipment.accountID = accounts.accountID)
At this point, each table has both accountID with the old keys, and new_accountID with the new keys. From here it should be pretty straightforward.
Break all of the foreign keys on accountID.
On each table, UPDATE [table] SET accountID = new_accountID.
Re-add the foreign keys for accountID.
Drop new_accountID from all of the tables, as it's no longer needed.
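A minimal sketch of those last four steps for one child table, assuming a hypothetical FK name and that accountID is not itself an IDENTITY column (identity columns cannot be UPDATEd):

ALTER TABLE notes DROP CONSTRAINT FK_notes_accounts;     -- assumed FK name

UPDATE accounts SET accountID = new_accountID;
UPDATE notes    SET accountID = new_accountID;

ALTER TABLE notes ADD CONSTRAINT FK_notes_accounts
    FOREIGN KEY (accountID) REFERENCES accounts (accountID);

ALTER TABLE accounts DROP COLUMN new_accountID;
ALTER TABLE notes    DROP COLUMN new_accountID;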