Data gets changed when copying data in chunks between two identical tables


In short, I am trying to copy data from one table to another nearly identical table (minus constraints, indices, and a precision change to a decimal column) in batches using Insert [NewTable] Select Top X * from [Table] but some data is getting changed during the copy. Read on for more details.
Why we are copying in the first place
We are altering the precision of a couple of columns in our largest table and do not have the time in our deployment window to do a simple ALTER statement. As an alternative, we decided to create a table with the new schema and copy the data in batches in the days leading up to the deploy, so that we can simply drop the old table and rename the new one during the deployment window.
Creation scripts for new and old tables
These are not the exact tables we have in our DB, but they've been trimmed down for this question. The actual table has ~100 columns.
CREATE TABLE [dbo].[Table]
(
[Id] BIGINT NOT NULL PRIMARY KEY NONCLUSTERED IDENTITY,
[ForeignKey1] INT NOT NULL,
[ForeignKey2] INT NOT NULL,
[ForeignKey3] INT NOT NULL,
[Name] VARCHAR(MAX) NOT NULL,
[SomeValue] DECIMAL(14, 5) NULL,
CONSTRAINT [FK_Table_ForeignKeyTable1] FOREIGN KEY ([ForeignKey1]) REFERENCES [ForeignKeyTable1]([ForeignKey1]),
CONSTRAINT [FK_Table_ForeignKeyTable2] FOREIGN KEY ([ForeignKey2]) REFERENCES [ForeignKeyTable2]([ForeignKey2]),
CONSTRAINT [FK_Table_ForeignKeyTable3] FOREIGN KEY ([ForeignKey3]) REFERENCES [ForeignKeyTable3]([ForeignKey3]),
)
GO
CREATE INDEX [IX_Table_ForeignKey2] ON [dbo].[Table] ([ForeignKey2])
GO
CREATE TABLE [dbo].[NewTable]
(
[Id] BIGINT NOT NULL PRIMARY KEY NONCLUSTERED IDENTITY,
[ForeignKey1] INT NOT NULL,
[ForeignKey2] INT NOT NULL,
[ForeignKey3] INT NOT NULL,
[Name] VARCHAR(MAX) NOT NULL,
[SomeValue] DECIMAL(16, 5) NULL
)
SQL I wrote to copy data
DECLARE @BatchSize INT
DECLARE @Count INT

-- Leave these the same --
SET @Count = 1

-- Update these to modify run behavior --
SET @BatchSize = 5000

WHILE @Count > 0
BEGIN
SET IDENTITY_INSERT [dbo].[NewTable] ON;
INSERT INTO [dbo].[NewTable]
([Id],
[ForeignKey1],
[ForeignKey2],
[ForeignKey3],
[Name],
[SomeValue])
SELECT TOP (@BatchSize)
[Id],
[ForeignKey1],
[ForeignKey2],
[ForeignKey3],
[Name],
[SomeValue]
FROM [dbo].[Table]
WHERE not exists(SELECT 1 FROM [dbo].[NewTable] WHERE [dbo].[NewTable].Id = [dbo].[Table].Id)
ORDER BY Id

SET @Count = @@ROWCOUNT

SET IDENTITY_INSERT [dbo].[NewTable] OFF;
END
The Problem
Somehow data is getting garbled or modified in a seemingly random pattern during the copy. Most (maybe all) of the modified data we've seen has been in the ForeignKey2 column. The value we end up with in the new table is seemingly random as well, since it didn't exist at all in the old table. There doesn't seem to be any rhyme or reason to which records it affects either.
For example, here is one row for the original table and the corresponding row in the new table:
Old Table
ID: 204663
FK1: 452
FK2: 522413
FK3: 11190
Name: Masked
Some Value: 0.0
New Table
ID: 204663
FK1: 452
FK2: 120848
FK3: 11190
Name: Masked but matches Old Table
Some Value: 0.0
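To see how widespread the mismatch is, the two tables can be compared row by row on Id. The query below is only a sketch using the trimmed-down schema from this question; the aliases o and n and the output column names are made up for illustration.

```sql
-- Rows that exist in both tables but whose ForeignKey2 differs.
-- Both columns are NOT NULL in the posted schema, so <> is sufficient.
SELECT o.[Id],
       o.[ForeignKey2] AS OldForeignKey2,
       n.[ForeignKey2] AS NewForeignKey2
FROM [dbo].[Table] AS o
INNER JOIN [dbo].[NewTable] AS n
    ON n.[Id] = o.[Id]
WHERE o.[ForeignKey2] <> n.[ForeignKey2]
ORDER BY o.[Id];
```

The same pattern can be repeated for the other columns to confirm whether ForeignKey2 really is the only one affected.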
Environment
SQL was run in SSMS. Database is an Azure SQL Database.

Related

Having troubles with Identity field of SQL-SERVER

I'm doing a school project about a school theme where I need to create some tables for Students, Classes, Programmes...
I want to add a Group to certain classes with an auto-incrementing group_id. However, I want group_id to reset whenever any of these attributes (Classes_id, courses_acronym, year_Semesters) changes. How can I reset it every time any of those change?
Here is my table:
CREATE TABLE Classes_Groups(
Classes_id varchar(2),
Group_id INT IDENTITY(1,1),
courses_acronym varchar(4),
year_Semesters varchar(5),
FOREIGN KEY (Classes_id, year_Semesters,courses_acronym) REFERENCES Classes(id,year_Semesters, courses_acronym),
PRIMARY KEY(Classes_id,courses_acronym,year_Semesters,Group_id)
);

Normally, you do not (need to) reset the identity column of a table. An identity column is used to create unique values for every single record in a table.
So you want to generate entries in your groups table based on new entries in your classes table. You might create a trigger on your classes table for that purpose.
Since Group_id is already unique by itself (because of its IDENTITY), you do not need other fields in the primary key at all. Instead, you may create a separate UNIQUE constraint for the combination (Classes_id, courses_acronym, year_Semesters) if you need it.
And if the id field of your classes table is an IDENTITY column too, you could define the primary key of your classes table solely on that id field. Your foreign key constraint in the new groups table would then need to include only that Classes_id field.
So much for now. I guess that your database design needs some more additional tuning and tweaking. ;)

Where are you setting the values from? You could use a stored procedure and, in your query, give the variables an initial value when the stored procedure is hit, assuming there are values at the beginning. Then use an IF statement.
declare @initial_Classes_id varchar(2) = NULL; -- set to the initial value inserted
declare @initial_courses_acronym varchar(4) = NULL; -- set to the initial value inserted
declare @initial_year_Semesters varchar(5) = NULL; -- set to the initial value inserted
-- Look up the last inserted row. Ordering by the key column is a stand-in here;
-- I would add a DateAdded column and order by the last insert date instead.
declare @compare_Classes_id varchar(2) = (select top 1 Classes_id from Classes_Groups order by Group_id desc);
declare @compare_courses_acronym varchar(4) = (select top 1 courses_acronym from Classes_Groups where Classes_id = @compare_Classes_id);
declare @compare_year_Semesters varchar(5) = (select top 1 year_Semesters from Classes_Groups where Classes_id = @compare_Classes_id);
IF (@initial_Classes_id != @compare_Classes_id OR @initial_courses_acronym != @compare_courses_acronym OR @initial_year_Semesters != @compare_year_Semesters)
BEGIN
-- DBCC CHECKIDENT takes the table name, not the column name
DBCC CHECKIDENT ('Classes_Groups', RESEED, 1)
Insert into Classes_Groups (Classes_id, courses_acronym, year_Semesters)
values (
@initial_Classes_id,
@initial_courses_acronym,
@initial_year_Semesters
)
END
ELSE
BEGIN
Insert into Classes_Groups (Classes_id, courses_acronym, year_Semesters)
values (
@initial_Classes_id,
@initial_courses_acronym,
@initial_year_Semesters
)
END
NB: I would advise using an int for the primary key unless you have a specific reason not to.

Alter an existing Identity Column's Increment value

I am stumped,
I am trying to alter the increment value of Identity columns in a collection of existing MS SQL tables (which all have data) and have been trying to research if it is possible to do without writing custom scripts per table.
I can't find a solution that doesn't require dropping and recreating the tables which would require a different script for each table as they each have different column lists.
For example, I want to change the existing table
CREATE TABLE [dbo].[ActionType](
[ID] [int] IDENTITY(1,1) NOT NULL,
[Action] [varchar](100) NOT NULL,
CONSTRAINT [PK_ActionType] PRIMARY KEY CLUSTERED
(
[ID] ASC
) ON [PRIMARY]
) ON [PRIMARY]
To
CREATE TABLE [dbo].[ActionType](
[ID] [int] IDENTITY(1,5) NOT NULL,
[Action] [varchar](100) NOT NULL,
CONSTRAINT [PK_ActionType] PRIMARY KEY CLUSTERED
(
[ID] ASC
) ON [PRIMARY]
) ON [PRIMARY]
Via something like
exec sp_AlterIncrement @TABLE_NAME = 'ActionType', @NEW_ICREMENT = 5
While keeping the data.
This would fix a big deployment issue I am facing right now, so any help would be appreciated.

You cannot alter the identity increment after you create the column. It is only possible to change the seed value, with DBCC CHECKIDENT.
To change the increment, you have to drop and recreate the column.
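As a minimal sketch of the seed-only change mentioned above, using the ActionType table from the question (the value 100 is an arbitrary illustration, not something from the original post):

```sql
-- Report the current identity value without changing anything.
DBCC CHECKIDENT ('dbo.ActionType', NORESEED);

-- Set the current identity value to 100. On a table that already
-- contains rows, the next insert gets 100 plus the existing increment.
-- The increment itself is not affected.
DBCC CHECKIDENT ('dbo.ActionType', RESEED, 100);
```

This only moves where numbering continues from; it does not turn IDENTITY(1,1) into IDENTITY(1,5).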

I had to do that before on a small table and it's fairly easy to do, trick is that you have to update it to something that currently doesn't exist as a key, and then back, since you can't increment it by 1 because that key already exists. It takes 2 updates, for a table with IDs smaller than 100 for example:
update my_table set id = id+100;
update my_table set id = id-99;

That said, I do not understand why you want to alter the identity increment, because either way the column will stay the primary key or part of the clustered key.
Also, if a change to the column type is required, I don't think that is possible without altering the table structure.
Alter table ActionType
Alter column ID
You can also revert to the original structure when it is no longer required, so this can be used for the specified case as well if you only need it on demand.
Please let me know so that I can provide further feedback.

A couple of things, maybe too much info, but helpful when you do stuff like this. The following will reseed the identity to whatever you want (it changes the seed, not the increment):
DBCC CHECKIDENT ([DB.Schema.Table], reseed, 0) --First record will have a 1. You can set it to any value
If you want to insert data into a table that has an identity but you need to force the value to something specific, do this:
SET IDENTITY_INSERT [DB].[schema].[Table] ON
...Add your data here
SET IDENTITY_INSERT [DB].[schema].[Table] OFF

Sometimes this is necessary, and this might provide an answer. For example, the existing table is identity(1,1) (A in the example below).
It contains values, but you would like to change its increment to, let's say, 2 so that it works well with another table (B in the example below).
So A would have odd ids plus whatever it used to contain, while B would now have even numbers.
This script shows you how to do it.
create table A(id int identity(1,1),v char)
insert into A
Select 'A'
union select 'B'
union select 'C'
go
create table B(id int identity(1,2),v char)
go
SET IDENTITY_INSERT B ON
GO
insert into B(Id,v)
Select Id,v from A
go
SET IDENTITY_INSERT B OFF
GO
insert into B
Select 'D'
union select 'E'
go
drop table A
go
EXEC sp_RENAME 'B' , 'A'
go
Select * from A
go
Select max(Id)+1 from A
go
create table B(id int identity(8,2),v char)
go
insert into B
Select 'A'
union select 'B'
union select 'C'
go
Select * from B

If you need to renumber or compact your identity field, the easiest way is as follows:
Temporarily convert your identity field into an integer.
Replace the values, using for example an Excel sheet in order to fill them in.
Copy and paste the column from your Excel file into the Int field.
Save the table.
Open it again in design mode and change the Int field back into an Identity.
If this identity field is used in a child table, make sure you have a trigger to also propagate the new values to the dependent tables.
And that's all.
If you need to control identity data in your application, just change it to Int and manage the incremental values in code with the DMax function.
Hope it helps

Inserting record from one column to another column in the same scope or statement

I have a Stored Procedure that populates a table: This table as indicated in the code below has an identity column which is also the primary key column.
I would like to append the primary key to contain leading letters: Example: ABC123.
Obviously this is not possible because the Primary key column is INT datatype.
So I created an additional column so that I can insert the appended primary key. This works except I have to make the new column Null and I am using an UPDATE statement.
Something tells me there is a better way.
Is there a way I can do this without using UPDATE after the initial Insert and have the new column CategoryID as Not Null?
Table Code:
CREATE TABLE [dbo].[Registration] (
[SystemID] INT IDENTITY (100035891, 1) NOT NULL,
[CategoryID] CHAR (13) NULL,
[FName] VARCHAR (30) NOT NULL,
[LName] VARCHAR (30) NOT NULL,
[MInit] CHAR (1) NULL,
PRIMARY KEY CLUSTERED ([SystemID] ASC)
);
Stored Procedure:
CREATE PROCEDURE [dbo].[uspInsertRegistration]
@FName VARCHAR(30),
@LName VARCHAR(30),
@MInit CHAR(1),
@CategoryID CHAR(13),
@SystemID int OUTPUT
AS
BEGIN
SET NOCOUNT ON
DECLARE @ErrCode int
INSERT INTO [dbo].[Registration] ([FName],[LName],[MInit])
VALUES (@FName, @LName, @MInit)
SELECT @ErrCode = @@ERROR, @SystemID = SCOPE_IDENTITY()
UPDATE [dbo].[Registration]
SET CategoryID = 'ABC'+ CAST(SystemID AS CHAR)
SET NOCOUNT OFF
RETURN @ErrCode
END
Finally this is what the table looks like with the data:
Thanks for being contagious with your knowledge. :)
Guy

My suggestion is to use a computed column, as what you're trying to do introduces redundancy. See below:
http://msdn.microsoft.com/en-us/library/ms191250%28v=sql.105%29.aspx
Alternately, make it big enough to contain a GUID, put a GUID into the column on the insert, then update it afterwards.
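A minimal sketch of the computed-column suggestion, reusing the Registration table from the question. The VARCHAR(10) cast length and the sample row are assumptions made for illustration; the 'ABC' prefix comes from the question.

```sql
CREATE TABLE [dbo].[Registration] (
    [SystemID]   INT IDENTITY (100035891, 1) NOT NULL,
    -- Derived from SystemID on read, so no UPDATE is needed after the
    -- INSERT and the value can never drift from the key.
    [CategoryID] AS ('ABC' + CAST([SystemID] AS VARCHAR(10))),
    [FName]      VARCHAR (30) NOT NULL,
    [LName]      VARCHAR (30) NOT NULL,
    [MInit]      CHAR (1) NULL,
    PRIMARY KEY CLUSTERED ([SystemID] ASC)
);

-- Illustrative row; CategoryID is not listed because it is computed.
INSERT INTO [dbo].[Registration] ([FName], [LName], [MInit])
VALUES ('Ada', 'Lovelace', NULL);

SELECT [SystemID], [CategoryID] FROM [dbo].[Registration];
```

With this shape the stored procedure would no longer need its @CategoryID parameter or the UPDATE statement.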

Trigger for insert on identity column

I have a table A with an Identity Column which is the primary key.
The primary key is at the same time a foreign key that points towards another table B.
I am trying to build an insert trigger that inserts into Table B the identity column that is about to be created in table A and another custom value for example '1'.
I tried using @@IDENTITY but I keep getting a foreign key conflict. Thanks for your help.
create TRIGGER dbo.tr ON dbo.TableA FOR INSERT
AS
SET NOCOUNT ON
begin
insert into TableB
select @@identity, 1;
end

alexolb answered the question himself in the comments above. Another alternative is to use the IDENT_CURRENT function instead of selecting from the table. The drawback of this approach is that it always starts your number one higher than the seed, but that is easily remedied by setting the seed one unit lower. I think it feels better to use a function than a subquery.
For example:
CREATE TABLE [tbl_TriggeredTable](
[id] [int] identity(0,1) NOT NULL,
[other] [varchar](max)
)
GO
CREATE TRIGGER [trgMyTrigger]
ON [tbl_TriggeredTable]
INSTEAD OF INSERT,UPDATE,DELETE
AS
BEGIN
SET identity_insert tbl_TriggeredTable ON
INSERT INTO tbl_TriggeredTable (
[id],
[other]
)
SELECT
-- The identity column will have a zero in the inserted table when
-- it has not been populated yet, so we need to figure it out manually
case i.[id]
when 0 then IDENT_CURRENT('tbl_TriggeredTable') + IDENT_INCR('tbl_TriggeredTable')
ELSE i.[id]
END,
i.[other]
FROM inserted i
SET identity_insert tbl_TriggeredTable OFF
END

Constraint for only one record marked as default

How could I set a constraint on a table so that only one of the records has its isDefault bit field set to 1?
The constraint is not table scope, but one default per set of rows, specified by a FormID.

Use a unique filtered index
On SQL Server 2008 or higher you can simply use a unique filtered index
CREATE UNIQUE INDEX IX_TableName_FormID_isDefault
ON TableName(FormID)
WHERE isDefault = 1
Where the table is
CREATE TABLE TableName(
FormID INT NOT NULL,
isDefault BIT NOT NULL
)
For example if you try to insert many rows with the same FormID and isDefault set to 1 you will have this error:
Cannot insert duplicate key row in object 'dbo.TableName' with unique
index 'IX_TableName_FormID_isDefault'. The duplicate key value is (1).
Source: http://technet.microsoft.com/en-us/library/cc280372.aspx

Here's a modification of Damien_The_Unbeliever's solution that allows one default per FormID.
CREATE VIEW form_defaults
WITH SCHEMABINDING -- required before a view can be indexed
AS
SELECT FormID
FROM dbo.whatever
WHERE isDefault = 1
GO
CREATE UNIQUE CLUSTERED INDEX ix_form_defaults on form_defaults (FormID)
GO
But the serious relational folks will tell you this information should just be in another table.
CREATE TABLE form (
FormID int NOT NULL PRIMARY KEY,
DefaultWhateverID int FOREIGN KEY REFERENCES Whatever(ID)
)

From a normalization perspective, this would be an inefficient way of storing a single fact.
I would opt to hold this information at a higher level, by storing (in a different table) a foreign key to the identifier of the row which is considered to be the default.
CREATE TABLE [dbo].[Foo](
[Id] [int] NOT NULL,
CONSTRAINT [PK_Foo] PRIMARY KEY CLUSTERED
(
[Id] ASC
) ON [PRIMARY]
) ON [PRIMARY]
GO
CREATE TABLE [dbo].[DefaultSettings](
[DefaultFoo] [int] NULL
) ON [PRIMARY]
GO
ALTER TABLE [dbo].[DefaultSettings] WITH CHECK ADD CONSTRAINT [FK_DefaultSettings_Foo] FOREIGN KEY([DefaultFoo])
REFERENCES [dbo].[Foo] ([Id])
GO
ALTER TABLE [dbo].[DefaultSettings] CHECK CONSTRAINT [FK_DefaultSettings_Foo]
GO

You could use an insert/update trigger.
Within the trigger after an insert or update, if the count of rows with isDefault = 1 is more than 1, then rollback the transaction.
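A rough sketch of such a trigger, written against the TableName (FormID, isDefault) table used in the filtered-index answer above. The trigger name and error message are hypothetical, and the check is done per FormID to match the question.

```sql
CREATE TRIGGER trg_TableName_OneDefault
ON dbo.TableName
AFTER INSERT, UPDATE
AS
BEGIN
    SET NOCOUNT ON;

    -- Set-based check, so multi-row inserts and updates are covered.
    -- Only the FormIDs touched by this statement are examined.
    IF EXISTS (
        SELECT t.FormID
        FROM dbo.TableName AS t
        WHERE t.isDefault = 1
          AND t.FormID IN (SELECT FormID FROM inserted)
        GROUP BY t.FormID
        HAVING COUNT(*) > 1
    )
    BEGIN
        RAISERROR('Only one default row is allowed per FormID.', 16, 1);
        ROLLBACK TRANSACTION;
    END
END
```

Compared with the unique filtered index, this costs an extra query on every write, so it is mainly of interest where filtered indexes are not available.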

CREATE VIEW vOnlyOneDefault
WITH SCHEMABINDING -- required before a view can be indexed
AS
SELECT 1 as Lock
FROM dbo.<underlying table>
WHERE [Default] = 1
GO
CREATE UNIQUE CLUSTERED INDEX IX_vOnlyOneDefault on vOnlyOneDefault (Lock)
GO
You'll need to have the right ANSI settings turned on for this.

I don't know about SQL Server, but if it supports function-based indexes like Oracle does, I hope this can be translated. If not, sorry.
You can create an index like this, supposing that the default value is 1234, the column is DEFAULT_COLUMN, and ID_COLUMN is the primary key:
CREATE
UNIQUE
INDEX only_one_default
ON my_table
( DECODE(DEFAULT_COLUMN, 1234, -1, ID_COLUMN) )
This DDL creates a unique index that indexes -1 if the value of DEFAULT_COLUMN is 1234 and ID_COLUMN in any other case. Then, if two rows have the default value in DEFAULT_COLUMN, it raises an exception.

The question implies to me that you have a primary table that has some child records and one of those child records will be the default record. Using address and a separate default table here is an example of how to make that happen using third normal form. Of course I don't know if it's valuable to answer something that is so old but it struck my fancy.
--drop table dev.defaultAddress;
--drop table dev.addresses;
--drop table dev.people;
CREATE TABLE [dev].[people](
[Id] [int] identity primary key,
name char(20)
)
GO
CREATE TABLE [dev].[Addresses](
id int identity primary key,
peopleId int foreign key references dev.people(id),
address varchar(100)
) ON [PRIMARY]
GO
CREATE TABLE [dev].[defaultAddress](
id int identity primary key,
peopleId int foreign key references dev.people(id),
addressesId int foreign key references dev.addresses(id))
go
create unique index defaultAddress on dev.defaultAddress (peopleId)
go
create unique index idx_addr_id_person on dev.addresses(peopleid,id);
go
ALTER TABLE dev.defaultAddress
ADD CONSTRAINT FK_Def_People_Address
FOREIGN KEY(peopleID, addressesID)
REFERENCES dev.Addresses(peopleId, id)
go
insert into dev.people (name)
select 'Bill' union
select 'John' union
select 'Harry'
insert into dev.Addresses (peopleid, address)
select 1, '123 someplace' union
select 1,'work place' union
select 2,'home address' union
select 3,'some address'
insert into dev.defaultaddress (peopleId, addressesid)
select 1,1 union
select 2,3
-- so two home addresses are default now
-- try adding another default address to Bill and you get an error
select * from dev.people
join dev.addresses on people.id = addresses.peopleid
left join dev.defaultAddress on defaultAddress.peopleid = people.id and defaultaddress.addressesid = addresses.id
insert into dev.defaultaddress (peopleId, addressesId)
select 1,2
GO

You could do it through an instead of trigger, or if you want it as a constraint create a constraint that references a function that checks for a row that has the default set to 1
EDIT oops, needs to be <=
Create table mytable(id1 int, defaultX bit not null default(0))
go
create Function dbo.fx_DefaultExists()
returns int as
Begin
Declare @Ret int
Set @Ret = 0
Select @Ret = count(1) from mytable
Where defaultX = 1
Return @Ret
End
GO
Alter table mytable add
CONSTRAINT [CHK_DEFAULT_SET] CHECK
(([dbo].fx_DefaultExists()<=(1)))
GO
Insert into mytable (id1, defaultX) values (1,1)
Insert into mytable (id1, defaultX) values (2,1)

This is a fairly complex process that cannot be handled through a simple constraint.
We do this through a trigger. However before you write the trigger you need to be able to answer several things:
do we want to fail the insert if a default exists, change it to 0 instead of 1 or change the existing default to 0 and leave this one as 1?
what do we want to do if the default record is deleted and other non default records are still there? Do we make one the default, if so how do we determine which one?
You will also need to be very, very careful to make the trigger handle multiple row processing. For instance a client might decide that all of the records of a particular type should be the default. You wouldn't change a million records one at a time, so this trigger needs to be able to handle that. It also needs to handle that without looping or the use of a cursor (you really don't want the type of transaction discussed above to take hours locking up the table the whole time).
You also need a very extensive testing scenario for this trigger before it goes live. You need to test:
adding a record with no default when it is the first record for that customer
adding a record with a default when it is the first record for that customer
adding a record with no default when it is not the first record for that customer
adding a record with a default when it is not the first record for that customer
Updating a record to have the default when no other record has it (assuming you don't require one record to always be set as the default)
Updating a record to remove the default
Deleting the record with the default
Deleting a record without the default
Performing a mass insert with multiple situations in the data including two records which both have isdefault set to 1 and all of the situations tested when running individual record inserts
Performing a mass update with multiple situations in the data including two records which both have isdefault set to 1 and all of the situations tested when running individual record updates
Performing a mass delete with multiple situations in the data including two records which both have isdefault set to 1 and all of the situations tested when running individual record deletes

@Andy Jones gave an answer above closest to mine, but bearing in mind the Rule of Three, I placed the logic directly in the stored proc that updates this table. This was my simple solution. If I need to update the table from elsewhere, I will move the logic to a trigger. The one default rule applies to each set of records specified by a FormID and a ConfigID:
ALTER proc [dbo].[cpForm_UpdateLinkedReport]
@reportLinkId int,
@defaultYN bit,
@linkName nvarchar(150)
as
if @defaultYN = 1
begin
declare @formId int, @configId int
select @formId = FormID, @configId = ConfigID from csReportLink where ReportLinkID = @reportLinkId
update csReportLink set DefaultYN = 0 where isnull(ConfigID, @configId) = @configId and FormID = @formId
end
update
csReportLink
set
DefaultYN = @defaultYN,
LinkName = @linkName
where
ReportLinkID = @reportLinkId
