SQL Server create primary key constraint duplicate key error - sql

I have been experiencing some strange behaviour with one of my SQL commands taken from one of our stored procedures.
This command follows the below order of execution:
1) Drop table
2) Select * into table name from live server
3) Alter table to apply PK - this step fails once out of 4 daily executions
My SQL statement:
IF EXISTS (SELECT * FROM sys.objects WHERE object_id = OBJECT_ID(N'
[inf].[tblBase_MyTable]') AND type in (N'U'))
DROP TABLE [inf].[tblBase_MyTable]
SELECT * INTO [inf].[tblBase_MyTable]
FROM LiveServer.KMS_ALLOCATION WITH (NOLOCK)
ALTER TABLE [inf].[tblBase_MyTable] ADD
CONSTRAINT [PK_KMS_ALLOCATION] PRIMARY KEY NONCLUSTERED
(
[ID] ASC
)WITH (PAD_INDEX = OFF, STATISTICS_NORECOMPUTE = OFF, IGNORE_DUP_KEY =
OFF, ALLOW_ROW_LOCKS = ON, ALLOW_PAGE_LOCKS = ON)
GRANT SELECT ON [inf].[tblBase_MyTable] TO ourGroup
This is very strange considering the table is dropped, and I thought the indexes / keys would also be dropped. However I get this error at the same time every day. Any advice would be very much appreciated.
Error:
The CREATE UNIQUE INDEX statement terminated because a duplicate key was found for the object name 'inf.tblBase_MyTable' and the index name 'PK_KMS_ALLOCATION'.

Duplicate keys in [inf].[tblBase_MyTable] table are actually possible thanks to the WITH (NOLOCK) hint which allows "dirty reads". Have a look at blog which describes this in detail: SQL Server NOLOCK Hint & other poor ideas:
What many people think NOLOCK is doing
Most people think the NOLOCK hint just reads rows & doesn’t have to
wait till others have committed their updates or selects. If someone
is updating, that is OK. If they’ve changed a value then 99.999% of
the time they will commit, so it’s OK to read it before they commit.
If they haven’t changed the record yet then it saves me waiting, its
like my transaction happened before theirs did.
The Problem
The issue is that transactions do more than just update the row. Often
they require an index to be updated OR they run out of space on the
data page. This may require new pages to be allocated & existing rows
on that page to be moved, called a PageSplit. It is possible for your
select to completely miss a number of rows &/or count other rows
twice.

Well... you might have to repeat creating the new table and filling it until the check-query from #DarkoMartinovic does not return duplicates. Only then you can continue to add the PK. But this solution might cause heavy load on your live system. And you nave no guarantee that you have a 1:1 copy of the data as well.

Having reviewed various helpful comments here, I have decided against (for now) implementing SNAPSHOT isolation as this interface does not make use of a proper staging environment.
To move to this would mean either creating a staging area and setting that database to READ COMMITTED SNAPSHOT isolation, and a rebuild of the entire interface.
To that end and on the basis of saving development time, we have opted for ensuring that any ghost reads where dupes could be brought across from the source are handled before applying the PK.
This is by no means an ideal solution in terms of performance on the target server but will provide some headroom for now and certainly remove the previous error.
SQL approach below:
DECLARE #ALLOCTABLE TABLE
(SEQ INT, ID NVARCHAR(1000), CLASSID NVARCHAR(1000), [VERSION] NVARCHAR(25), [TYPE]
NVARCHAR(100), VERSIONSEQUENCE NVARCHAR(100), VERSIONSEQUENCE_TO NVARCHAR(100),
BRANCHID NVARCHAR(100), ISDELETED INT, RESOURCE_CLASS NVARCHAR(25), RESOURCE_ID
NVARCHAR(100), WARD_ID NVARCHAR(100), ISCOMPLETE INT, TASK_ID NVARCHAR(100));
------- ALLOCATION
IF EXISTS (SELECT * FROM sys.objects WHERE object_id = OBJECT_ID(N'[inf].
[tblBase_MyTable]') AND type in (N'U'))
DROP TABLE [inf].[tblBase_MyTable]
SELECT * INTO [inf].[tblBase_MyTable]
FROM LiveServer.KMS_ALLOCATION WITH (NOLOCK)
INSERT INTO #ALLOCTABLE
SELECT *
FROM
(SELECT
ROW_NUMBER() OVER (PARTITION BY ID ORDER BY ISCOMPLETE DESC) SEQ, AL.*
FROM [inf].[tblBase_MyTable] AL
)DUPS
WHERE SEQ >1
DELETE FROM [inf].[tblBase_MyTable]
WHERE ID IN (SELECT ID FROM #ALLOCTABLE)
AND ISCOMPLETE = 0
ALTER TABLE [inf].[tblBase_MyTable] ADD CONSTRAINT
[PK_KMS_ALLOCATION] PRIMARY KEY NONCLUSTERED
(
[ID] ASC
)WITH (PAD_INDEX = OFF, STATISTICS_NORECOMPUTE = OFF, IGNORE_DUP_KEY = OFF, ALLOW_ROW_LOCKS = ON, ALLOW_PAGE_LOCKS = ON)
GRANT SELECT ON [inf].[tblBase_MyTable] TO OurGroup

Related

Why does an insert that groups by the primary key throw a primary key constraint violation error?

I have an insert statement that's throwing a primary key error but I don't see how I could possibly be inserting duplicate key values.
First I create a temp table with a primary key.
SET TRANSACTION ISOLATION LEVEL READ UNCOMMITTED //Note: I've tried committed and uncommited, neither materially affects the behavior. See screenshots below for proof.
IF (OBJECT_ID('TEMPDB..#P')) IS NOT NULL DROP TABLE #P;
CREATE TABLE #P(idIsbn INT NOT NULL PRIMARY KEY, price SMALLMONEY, priceChangedDate DATETIME);
Then I pull prices from the Prices table, grouping by idIsbn, which is the primary key in the temp table.
INSERT INTO #P(idIsbn, price, priceChangedDate)
SELECT idIsbn ,
MIN(lowestPrice) ,
MIN(priceChangedDate)
FROM Price p
WHERE p.idMarketplace = 3100
GROUP BY p.idIsbn
I understand that grouping by idIsbn by definition makes it unique. The idIsbn in the prices table is: [idIsbn] [int] NOT NULL.
But every once in a while when I run this query I get this error:
Violation of PRIMARY KEY constraint 'PK__#P________AED35F8119E85FC5'. Cannot insert duplicate key in object 'dbo.#P'. The duplicate key value is (1447858).
NOTE: I've got a lot of questions about timing. I will select this statement, press F5, and no error will occur. Then I'll do it again, and it will fail, then I'll run it again and again and it will succeed a couple times before it fails again. I guess what I'm saying is that I can find no pattern for when it will succeed and when it won't.
How can I be inserting duplicate rows if (A) I just created the table brand new before inserting into it and (B) I'm grouping by the column designed to be the primary key?
For now, I'm solving the problem with IGNORE_DUP_KEY = ON, but I'd really like to know the root cause of the problem.
Here is what I'm actually seeing in my SSMS window. There is nothing more and nothing less:
##Version is:
Microsoft SQL Server 2008 (SP3) - 10.0.5538.0 (X64)
Apr 3 2015 14:50:02
Copyright (c) 1988-2008 Microsoft Corporation
Standard Edition (64-bit) on Windows NT 6.1 <X64> (Build 7601: Service Pack 1)
Execution Plan:
Here is an example of what it looks like when it runs fine. Here I'm using READ COMMITTED, but it doesn't matter b/c I get the error no matter whether I read it committed or uncommited.
Here is another example of it failing, this time w/ READ COMMITTED.
Also:
I get the same error whether I'm populating a temp table or a
persistent table.
When I add option (maxdop 1) to the end of the insert it seems to fail every time, though I can't be exhaustively sure of that b/c I can't run it for infinity. But it seems to be the case.
Here is the definition of the price table. Table has 25M rows. 108,529 updates in the last hour.
CREATE TABLE [dbo].[Price](
[idPrice] [int] IDENTITY(1,1) NOT NULL,
[idIsbn] [int] NOT NULL,
[idMarketplace] [int] NOT NULL,
[lowestPrice] [smallmoney] NULL,
[offers] [smallint] NULL,
[priceDate] [smalldatetime] NOT NULL,
[priceChangedDate] [smalldatetime] NULL,
CONSTRAINT [pk_Price] PRIMARY KEY CLUSTERED
(
[idPrice] ASC
)WITH (PAD_INDEX = OFF, STATISTICS_NORECOMPUTE = OFF, IGNORE_DUP_KEY = OFF, ALLOW_ROW_LOCKS = ON, ALLOW_PAGE_LOCKS = ON) ON [PRIMARY],
CONSTRAINT [uc_idIsbn_idMarketplace] UNIQUE NONCLUSTERED
(
[idIsbn] ASC,
[idMarketplace] ASC
)WITH (PAD_INDEX = OFF, STATISTICS_NORECOMPUTE = OFF, IGNORE_DUP_KEY = OFF, ALLOW_ROW_LOCKS = ON, ALLOW_PAGE_LOCKS = ON) ON [PRIMARY]
) ON [PRIMARY]
And the two non-clustered indexes:
CREATE NONCLUSTERED INDEX [IX_Price_idMarketplace_INC_idIsbn_lowestPrice_priceDate] ON [dbo].[Price]
(
[idMarketplace] ASC
)
INCLUDE ( [idIsbn],
[lowestPrice],
[priceDate]) WITH (PAD_INDEX = OFF, STATISTICS_NORECOMPUTE = OFF, SORT_IN_TEMPDB = OFF, DROP_EXISTING = OFF, ONLINE = OFF, ALLOW_ROW_LOCKS = ON, ALLOW_PAGE_LOCKS = ON) ON [PRIMARY]
GO
CREATE NONCLUSTERED INDEX [IX_Price_idMarketplace_priceChangedDate_INC_idIsbn_lowestPrice] ON [dbo].[Price]
(
[idMarketplace] ASC,
[priceChangedDate] ASC
)
INCLUDE ( [idIsbn],
[lowestPrice]) WITH (PAD_INDEX = OFF, STATISTICS_NORECOMPUTE = OFF, SORT_IN_TEMPDB = OFF, DROP_EXISTING = OFF, ONLINE = OFF, ALLOW_ROW_LOCKS = ON, ALLOW_PAGE_LOCKS = ON) ON [PRIMARY]
GO
You hadn't supplied your table structure.
This is a repro with some assumed details that causes the problem at read committed (NB: now you have supplied the definition I can see in your case updates to the priceChangedDate column will move rows around in the IX_Price_idMarketplace_priceChangedDate_INC_idIsbn_lowestPrice index if that's the one being seeked)
Connection 1 (Set up tables)
USE tempdb;
CREATE TABLE Price
(
SomeKey INT PRIMARY KEY CLUSTERED,
idIsbn INT IDENTITY UNIQUE,
idMarketplace INT DEFAULT 3100,
lowestPrice SMALLMONEY DEFAULT $1.23,
priceChangedDate DATETIME DEFAULT GETDATE()
);
CREATE NONCLUSTERED INDEX ix
ON Price(idMarketplace)
INCLUDE (idIsbn, lowestPrice, priceChangedDate);
INSERT INTO Price
(SomeKey)
SELECT number
FROM master..spt_values
WHERE number BETWEEN 1 AND 2000
AND type = 'P';
Connection 2
Concurrent DataModifications that move a row from the beginning of the seeked range (3100,1) to the end (3100,2001) and back again repeatedly.
USE tempdb;
WHILE 1=1
BEGIN
UPDATE Price SET SomeKey = 2001 WHERE SomeKey = 1
UPDATE Price SET SomeKey = 1 WHERE SomeKey = 2001
END
Connection 3 (Do the insert into a temp table with a unique constraint)
USE tempdb;
CREATE TABLE #P
(
idIsbn INT NOT NULL PRIMARY KEY,
price SMALLMONEY,
priceChangedDate DATETIME
);
WHILE 1 = 1
BEGIN
TRUNCATE TABLE #P
INSERT INTO #P
(idIsbn,
price,
priceChangedDate)
SELECT idIsbn,
MIN(lowestPrice),
MIN(priceChangedDate)
FROM Price p
WHERE p.idMarketplace = 3100
GROUP BY p.idIsbn
END
The plan has no aggregate as there is a unique constraint on idIsbn (a unique constraint on idIsbn,idMarketplace would also work) therefore the group by can be optimised out as there are no duplicate values.
But at read committed isolation level shared row locks are released as soon as the row is read. So it is possible for a row to move places and be read a second time by the same seek or scan.
The index ix doesn't explicitly include SomeKey as a secondary key column but as it is not declared unique SQL Server silently includes the clustering key behind the scenes, hence updating that column value can move rows around in it.

Changing GUID Primary Keys to Integer Primary Keys

I took over an application a few months ago that used Guids for primary keys on the main tables. We've been having some index related database problems lately and I've just been reading up on the use of Guids as primary keys and I've learnt how much of a bad idea they can be, and I thought it might pay to look into changing them before the database gets too large.
I'm wondering if there's an easy way to change them to Ints? Is there some amazing software that will do this for me? Or is it all up to me?
I was thinking of just adding an extra Int column too all appropriate tables, write some code to poplate this column with 1 - n based on the CreationDate column, writing some more code to populate columns in all related tables, then switching the relationships to the new int columns. Does't sound TOO difficult... Would this be the best way to do it?
After combining pieces from all the above links, I came up with this script, simplified for the sake of the answer.
Tables Before Changes
JOB
Id Guid PK
Name nvarchar
CreationDate datetime
REPORT
Id Guid PK
JobId int
Name nvarchar
CreationDate datetime
SCRIPT
-- Create new Job table with new Id column
select JobId = IDENTITY(INT, 1, 1), Job.*
into Job2
from Job
order by CreationDate
-- Add new JobId column to Report
alter table Report add JobId2 int
-- Populate new JobId column
update Report
set Report.JobId2 = Job2.JobId
from Job2
where Report.JobId = Job2.Id
-- Delete Old Id
ALTER TABLE Job2 DROP COLUMN Id
-- Delete Relationships
ALTER TABLE Report DROP CONSTRAINT [FK_Report_Job]
ALTER TABLE Job DROP CONSTRAINT PK_Job
-- Create Relationships
ALTER TABLE [dbo].[Job2] ADD CONSTRAINT [PK_Job] PRIMARY KEY CLUSTERED
([JobId] ASC)
WITH (PAD_INDEX = OFF, STATISTICS_NORECOMPUTE = OFF, SORT_IN_TEMPDB = OFF, IGNORE_DUP_KEY = OFF, ONLINE = OFF, ALLOW_ROW_LOCKS = ON, ALLOW_PAGE_LOCKS = ON, FILLFACTOR = 90) ON [PRIMARY]
ALTER TABLE [dbo].[Report] WITH CHECK ADD CONSTRAINT [FK_Report_Job] FOREIGN KEY([JobId2])
REFERENCES [dbo].[Job2] ([JobId])
ON DELETE CASCADE
ALTER TABLE [dbo].[Report] CHECK CONSTRAINT [FK_Report_Job]
-- Rename Columns
sp_RENAME 'Report.JobId', 'OldJobId' , 'COLUMN'
sp_RENAME 'Report.JobId2', 'JobId' , 'COLUMN'
-- Rename Tables
sp_rename Job, Job_Old
sp_rename Job2, Job
I created the Job2 table because it means I didn't have to touch the original Job table (apart from deleting the relationships), so that everything could easily be put back to it's original state in case something went bad.

Trigger to change autoincremented value

I have a simple table with tax rates
SET ANSI_NULLS ON
GO
SET QUOTED_IDENTIFIER ON
GO
CREATE TABLE [dbo].[TaxRates](
[Id] [bigint] IDENTITY(1,1) NOT NULL,
[Name] [nvarchar](50) NULL,
CONSTRAINT [PK_TaxRates] PRIMARY KEY CLUSTERED
(
[Id] ASC
)WITH (PAD_INDEX = OFF, STATISTICS_NORECOMPUTE = OFF, IGNORE_DUP_KEY = OFF, ALLOW_ROW_LOCKS = ON, ALLOW_PAGE_LOCKS = ON) ON [PRIMARY]
) ON [PRIMARY]
If user deleted record I want to not change autoincrementer while next insert.
To have more clearance.
Now I have 3 records with id 0,1 and 2. When I delete row with Id 2 and some time later I add next tax rate I want to have records in this table like before 0,1,2.
There shouldn't be chance to have a gap like 0,1,2,4,6. It must be trigger.
Could you help with that?
You need to accept gaps or don't use IDENTITY
id should have no external meaning
You can't update IDENTITY values
IDENTITY columns will always have gaps
In this case you'd update the clustered Pk which will be expensive
What about foreign keys? you'd need a CASCADE
Contiguous numbers can be generated with ROW_NUMBER() at read time
Without IDENTITY (whether you load this table or another) won't be concurrency-safe under load
Trying to INSERT into a gap (by an INSTEAD OF trigger) won't be concurrency-safe under load
(Edit) History tables may have the deleted values anyway
An option, if the identity column has become something passed around in your organization is to duplicate that column into a non-identity column on the same table and you can modify those new id values at will while retaining the actual identity field.
turning identity_insert on and off can allow you to insert identity values.

How can I change primary key on SQL Azure

I am going to change the primary key on SQL Azure. But it throws an error when using Microsoft SQL Server Management Studio to generate the scripts. Because every tables on SQL Azure must contains a primary key. And I can't drop it before create. What can I do if I must change it?
Script generated
IF EXISTS (SELECT * FROM sys.indexes WHERE object_id = OBJECT_ID(N'[dbo].[mytable]') AND name = N'PK_mytable')
ALTER TABLE [dbo].[mytable] DROP CONSTRAINT [PK_mytable]
GO
ALTER TABLE [dbo].[mytable] ADD CONSTRAINT [PK_mytable] PRIMARY KEY CLUSTERED
(
[id] ASC
)WITH (STATISTICS_NORECOMPUTE = OFF, IGNORE_DUP_KEY = OFF, ONLINE = OFF)
GO
Error message
Msg 40054, Level 16, State 2, Line 3
Tables without a clustered index are not supported in this version of SQL Server. Please create a clustered index and try again.
Msg 3727, Level 16, State 0, Line 3
Could not drop constraint. See previous errors.
The statement has been terminated.
Msg 1779, Level 16, State 0, Line 3
Table 't_event_admin' already has a primary key defined on it.
Msg 1750, Level 16, State 0, Line 3
Could not create constraint. See previous errors.
I ran into this exact problem and contacted the Azure team on the forums. Basically it isn't possible. You'll need to create a new table and transfer the data to it.
What I did was create a transaction and within it do the following:
Renamed the old table to OLD_MyTable.
Create the new table with the correct Primary Key and call it MyTable.
Select the contents from OLD_MyTable
into MyTable.
Drop OLD_MyTable.
You may also need to call sp_rename on any constraints so they don't conflict.
See also: http://social.msdn.microsoft.com/Forums/en/ssdsgetstarted/thread/5cc4b302-fa42-4c62-956a-bbf79dbbd040
upgrade SQL V12 and heaps are supported on it. So you can drop the primary key and recreate it.
I appreciate that this may be late in the day for yourself, but it may help others.
I recently came across this issue and found the least painful solution was to download the database from Azure, restore it locally, update the primary key locally (as the key constraint is a SQL Azure specific issue), and then restore the database back into Azure.
This saved any issues in regards to renaming databases or transferring data between them.
You can try the following scripts. Change it to suit for your table def.
EXECUTE sp_rename N'[PK_MyTable]', N'[PK_MyTable_old]', 'OBJECT'
CREATE TABLE [dbo].[Temp_MyTable](
[id] [int] NOT NULL,
[text] [text] NOT NULL CONSTRAINT [PK_MyTable] PRIMARY KEY CLUSTERED (
[id] ASC)WITH (PAD_INDEX = OFF, STATISTICS_NORECOMPUTE = OFF, IGNORE_DUP_KEY = OFF, ALLOW_ROW_LOCKS = ON, ALLOW_PAGE_LOCKS = ON))
INSERT INTO dbo.[Temp_MyTable] (Id, Text)
SELECT Id, Text FROM dbo.MyTable
drop table dbo.MyTable
EXECUTE sp_rename N'Temp_MyTable', N'MyTable', 'OBJECT'
This question is outdated because changing PK is already supported in latest version of SQL Azure. And you don't have to create temporary table.

Access DateTime of row written to table if no DateTime field in table - SQL Server 2008

I have a situation where I need to find out the most recent entry written to a table but the person who developed the database created the table without a DateTime field or an auto-incrementing ID field.
I am wondering is there any way of accessing some built in DateTime proprty recorded by SQL Server or any other way of determining the more recent entry out of say two results like this:
idp_fund_id IDPID
14 1653
18 1653
Below is the structure
CREATE TABLE [dbo].[web_IDP_Lifestyle](
[idp_fund_id] [int] NOT NULL,
[IDPID] [int] NOT NULL,
CONSTRAINT [PK_web_IDP_Lifestyle] PRIMARY KEY CLUSTERED
(
[idp_fund_id] ASC,
[IDPID] ASC
)WITH (PAD_INDEX = OFF, STATISTICS_NORECOMPUTE = OFF, IGNORE_DUP_KEY = OFF, ALLOW_ROW_LOCKS = ON, ALLOW_PAGE_LOCKS = ON) ON [PRIMARY]
) ON [PRIMARY]
In short: no.
You might be able to compare for existence of those values in backups... depending on how far the backups go.
I'm assuming that the idp_fund_id of 18 might have been written before the value of 14?
Nope.
SQL Server doesn't keep detailed records on this. If it DID, think about how large the metadata would have to be in order to track the update/insert time for each and every record in each and every table.
SQL doesn't order inserts by default, and I don't think you can check what page it was inserted on to determine this with any reliability, either.
If you want to track something like this, you need to create a field/table for it to track it yourself and implement some triggers.