EF Update Statement is Violating a Unique Index

I've got a bit of an issue which has exposed my lack of understanding of uniqueness constraints. The table in question has a unique, non-clustered index on its DataManagerId, SurgeonId and HospitalId columns, as shown in this image (apologies, I don't have Visio):
I am using Entity Framework ("EF") and it has generated an UPDATE statement akin to this:
exec sp_executesql N'UPDATE [dbo].[DataManagerSurgeon]
SET [DataManagerId] = @0, [SurgeonId] = @1, [HospitalId] = @2, [DateAdded] = @3, [DateRescinded] = @4
WHERE ([Id] = @5)',N'@0 int,@1 int,@2 int,@3 datetime2(7),@4 datetime2(7),@5 int',@0=24,@1=221,@2=10001,@3='2018-01-02 08:58:29.1061413',@4='2018-01-02 08:58:29.1061413',@5=122
I'll clean that up a bit for the purposes of this question:
UPDATE [dbo].[DataManagerSurgeon]
SET [DataManagerId] = 24, [SurgeonId] = 221, [HospitalId] = 10001, [DateAdded] = '2018-01-02 08:58:29.1061413', [DateRescinded] = '2018-01-02 08:58:29.1061413'
WHERE [Id] = 122
The problem is, this statement is altering data (hence the UPDATE), and there is already a row with the combination (24, 221, 10001) for (DataManagerId, SurgeonId, HospitalId). As a result, the database is having none of it:
Violation of UNIQUE KEY constraint 'UC_ManagerSurgeonHospital'. Cannot insert duplicate key in object 'dbo.DataManagerSurgeon'. The duplicate key value is (24, 221, 10001).
As further background, I am only trying to update the DateRescinded column, and if I were to write the statement manually, it would look like:
UPDATE [dbo].[DataManagerSurgeon]
SET [DateRescinded] = '2018-01-02 08:58:29.1061413'
WHERE [Id] = 122
I guess I've learnt that you can violate a uniqueness constraint with an UPDATE statement, even when identifying the row by its primary key and even when you would expect the conditions of the constraint to still be met afterwards. (SQL Server checks uniqueness as each statement completes, and here the completed statement really would leave two rows with the key (24, 221, 10001).)
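To make the mechanics concrete, here is a minimal sketch with a hypothetical repro table: the UPDATE identifies row 2 by its primary key, but writes the key combination that row 1 already holds, so it fails with exactly this error:
CREATE TABLE dbo.Repro (
Id int PRIMARY KEY,
DataManagerId int, SurgeonId int, HospitalId int,
CONSTRAINT UC_Repro UNIQUE (DataManagerId, SurgeonId, HospitalId)
);
INSERT INTO dbo.Repro VALUES (1, 24, 221, 10001), (2, 24, 222, 10001);
-- Fails: row 1 already holds (24, 221, 10001)
UPDATE dbo.Repro SET DataManagerId = 24, SurgeonId = 221, HospitalId = 10001 WHERE Id = 2;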
But my question remains: if EF is generating an UPDATE statement which violates a uniqueness constraint (index), how do I get around that? Note: the least optimal solution would be writing manual SQL and sending that to the database. It can be done, but it defeats the purpose of using an ORM like EF.
Also note that I added the following code in the Mapping file for that table to make EF "aware" of the index:
this.HasIndex(t => new { t.DataManagerId, t.SurgeonId, t.HospitalId })
.IsUnique(true)
.IsClustered(false)
.HasName("UC_ManagerSurgeonHospital");
Final addendum - I have been referring to this as an Index and I'm not sure how SQL Server distinguishes these from CONSTRAINTS, but the create statement for such an Index looks like:
ALTER TABLE [dbo].[DataManagerSurgeon] ADD CONSTRAINT [UC_ManagerSurgeonHospital] UNIQUE NONCLUSTERED
(
[DataManagerId] ASC,
[SurgeonId] ASC,
[HospitalId] ASC
)WITH (PAD_INDEX = OFF, STATISTICS_NORECOMPUTE = OFF, SORT_IN_TEMPDB = OFF, IGNORE_DUP_KEY = OFF, ONLINE = OFF, ALLOW_ROW_LOCKS = ON, ALLOW_PAGE_LOCKS = ON) ON [PRIMARY]
I only mention that because people might ask why I implemented it as an Index and not a CONSTRAINT. But apparently, this is how you do multi-column unique constraints in SQL Server, and behind the scenes the constraint is enforced by a unique index anyway.
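For what it's worth, the near-equivalent created directly as an index rather than a constraint would be:
CREATE UNIQUE NONCLUSTERED INDEX UC_ManagerSurgeonHospital
ON dbo.DataManagerSurgeon (DataManagerId, SurgeonId, HospitalId);
Both forms enforce the same uniqueness; they differ only in how they appear in the metadata.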
Thanks

Related

Include columns in an existing non-clustered index or add a new non-clustered index

Table Props already has a non-clustered index for column 'CreatedOn', but this index doesn't include certain other columns that are required in order to significantly improve the performance of a frequently run query.
To fix this, is it best to:
1. create an additional non-clustered index with the included columns or
2. alter the existing index to add the other columns as included columns?
In addition:
- How will my decision affect the performance of other queries currently using the non-clustered index?
- If it is best to alter the existing index should it be dropped and re-created or altered in order to add the included columns?
A simplified version of the table is below along with the index in question:
CREATE TABLE dbo.Props(
PropID int NOT NULL,
Reference nchar(10) NULL,
CustomerID int NULL,
SecondCustomerID int NULL,
PropStatus smallint NOT NULL,
StatusLastChanged datetime NULL,
PropType smallint NULL,
CreatedOn datetime NULL,
CreatedBy int NULL,
CONSTRAINT PK_Props PRIMARY KEY CLUSTERED
(
PropID ASC
)WITH (PAD_INDEX = OFF, STATISTICS_NORECOMPUTE = OFF, IGNORE_DUP_KEY = OFF, ALLOW_ROW_LOCKS = ON, ALLOW_PAGE_LOCKS = ON, FILLFACTOR = 90) ON [PRIMARY]
) ON [PRIMARY]
GO
Current index:
CREATE NONCLUSTERED INDEX idx_CreatedOn ON dbo.Props
(
CreatedOn ASC
)WITH (PAD_INDEX = OFF, STATISTICS_NORECOMPUTE = OFF, SORT_IN_TEMPDB = OFF, DROP_EXISTING = OFF, ONLINE = OFF, ALLOW_ROW_LOCKS = ON, ALLOW_PAGE_LOCKS = ON, FILLFACTOR = 90) ON [PRIMARY]
GO
All 5 of the columns required in the new or altered index are foreign key columns: a mixture of smallint and int, nullable and non-nullable.
In the example the columns to include are: CustomerID, SecondCustomerID, PropStatus, PropType and CreatedBy.
As always... It depends...
As a general rule, having redundant indexes is not desirable. So, in the absence of other information, you'd be better off adding the included columns, making it a covering index.
That said, the original index was likely built for another "high frequency" query... So now you have to determine whether or not the increased index page count is going to adversely affect the existing queries that use the index in its current state.
You'd also want to look at the expense of doing a key lookup in relation to the rest of the query. If the key lookup is only a minor part of the total cost, it's unlikely that the performance gains will offset the expense of maintaining the larger index.
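If altering the existing index is the way to go, note that ALTER INDEX cannot add included columns; the usual pattern is to recreate the index in place with DROP_EXISTING = ON. A sketch against the table above (adjust the WITH options to match your standards):
CREATE NONCLUSTERED INDEX idx_CreatedOn ON dbo.Props
(
CreatedOn ASC
)
INCLUDE (CustomerID, SecondCustomerID, PropStatus, PropType, CreatedBy)
WITH (DROP_EXISTING = ON, FILLFACTOR = 90) ON [PRIMARY]
GO
Recreating with DROP_EXISTING is a single operation, so it is generally preferable to an explicit DROP INDEX followed by CREATE INDEX, during which the index would be unavailable.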

Why does an insert that groups by the primary key throw a primary key constraint violation error?

I have an insert statement that's throwing a primary key error but I don't see how I could possibly be inserting duplicate key values.
First I create a temp table with a primary key.
SET TRANSACTION ISOLATION LEVEL READ UNCOMMITTED -- Note: I've tried committed and uncommitted; neither materially affects the behavior. See screenshots below for proof.
IF (OBJECT_ID('TEMPDB..#P')) IS NOT NULL DROP TABLE #P;
CREATE TABLE #P(idIsbn INT NOT NULL PRIMARY KEY, price SMALLMONEY, priceChangedDate DATETIME);
Then I pull prices from the Prices table, grouping by idIsbn, which is the primary key in the temp table.
INSERT INTO #P(idIsbn, price, priceChangedDate)
SELECT idIsbn ,
MIN(lowestPrice) ,
MIN(priceChangedDate)
FROM Price p
WHERE p.idMarketplace = 3100
GROUP BY p.idIsbn
I understand that grouping by idIsbn by definition makes it unique. The idIsbn in the prices table is: [idIsbn] [int] NOT NULL.
But every once in a while when I run this query I get this error:
Violation of PRIMARY KEY constraint 'PK__#P________AED35F8119E85FC5'. Cannot insert duplicate key in object 'dbo.#P'. The duplicate key value is (1447858).
NOTE: I've had a lot of questions about timing. I will select this statement, press F5, and no error will occur. Then I'll do it again, and it will fail; then I'll run it again and again, and it will succeed a couple of times before it fails again. I guess what I'm saying is that I can find no pattern for when it will succeed and when it won't.
How can I be inserting duplicate rows if (A) I just created the table brand new before inserting into it and (B) I'm grouping by the column designed to be the primary key?
For now, I'm solving the problem with IGNORE_DUP_KEY = ON, but I'd really like to know the root cause of the problem.
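For reference, on the temp table's key that workaround looks like the following (duplicate keys are then discarded with a warning instead of raising an error):
CREATE TABLE #P(idIsbn INT NOT NULL PRIMARY KEY WITH (IGNORE_DUP_KEY = ON), price SMALLMONEY, priceChangedDate DATETIME);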
Here is what I'm actually seeing in my SSMS window. There is nothing more and nothing less:
@@VERSION is:
Microsoft SQL Server 2008 (SP3) - 10.0.5538.0 (X64)
Apr 3 2015 14:50:02
Copyright (c) 1988-2008 Microsoft Corporation
Standard Edition (64-bit) on Windows NT 6.1 <X64> (Build 7601: Service Pack 1)
Execution Plan:
Here is an example of what it looks like when it runs fine. Here I'm using READ COMMITTED, but it doesn't matter b/c I get the error no matter whether I read it committed or uncommitted.
Here is another example of it failing, this time w/ READ COMMITTED.
Also:
- I get the same error whether I'm populating a temp table or a persistent table.
- When I add option (maxdop 1) to the end of the insert it seems to fail every time, though I can't be exhaustively sure of that b/c I can't run it for infinity. But it seems to be the case.
Here is the definition of the price table. Table has 25M rows. 108,529 updates in the last hour.
CREATE TABLE [dbo].[Price](
[idPrice] [int] IDENTITY(1,1) NOT NULL,
[idIsbn] [int] NOT NULL,
[idMarketplace] [int] NOT NULL,
[lowestPrice] [smallmoney] NULL,
[offers] [smallint] NULL,
[priceDate] [smalldatetime] NOT NULL,
[priceChangedDate] [smalldatetime] NULL,
CONSTRAINT [pk_Price] PRIMARY KEY CLUSTERED
(
[idPrice] ASC
)WITH (PAD_INDEX = OFF, STATISTICS_NORECOMPUTE = OFF, IGNORE_DUP_KEY = OFF, ALLOW_ROW_LOCKS = ON, ALLOW_PAGE_LOCKS = ON) ON [PRIMARY],
CONSTRAINT [uc_idIsbn_idMarketplace] UNIQUE NONCLUSTERED
(
[idIsbn] ASC,
[idMarketplace] ASC
)WITH (PAD_INDEX = OFF, STATISTICS_NORECOMPUTE = OFF, IGNORE_DUP_KEY = OFF, ALLOW_ROW_LOCKS = ON, ALLOW_PAGE_LOCKS = ON) ON [PRIMARY]
) ON [PRIMARY]
And the two non-clustered indexes:
CREATE NONCLUSTERED INDEX [IX_Price_idMarketplace_INC_idIsbn_lowestPrice_priceDate] ON [dbo].[Price]
(
[idMarketplace] ASC
)
INCLUDE ( [idIsbn],
[lowestPrice],
[priceDate]) WITH (PAD_INDEX = OFF, STATISTICS_NORECOMPUTE = OFF, SORT_IN_TEMPDB = OFF, DROP_EXISTING = OFF, ONLINE = OFF, ALLOW_ROW_LOCKS = ON, ALLOW_PAGE_LOCKS = ON) ON [PRIMARY]
GO
CREATE NONCLUSTERED INDEX [IX_Price_idMarketplace_priceChangedDate_INC_idIsbn_lowestPrice] ON [dbo].[Price]
(
[idMarketplace] ASC,
[priceChangedDate] ASC
)
INCLUDE ( [idIsbn],
[lowestPrice]) WITH (PAD_INDEX = OFF, STATISTICS_NORECOMPUTE = OFF, SORT_IN_TEMPDB = OFF, DROP_EXISTING = OFF, ONLINE = OFF, ALLOW_ROW_LOCKS = ON, ALLOW_PAGE_LOCKS = ON) ON [PRIMARY]
GO
You hadn't supplied your table structure when I first answered.
This is a repro with some assumed details that causes the problem at read committed. (NB: now that you have supplied the definition, I can see that in your case updates to the priceChangedDate column will move rows around in the IX_Price_idMarketplace_priceChangedDate_INC_idIsbn_lowestPrice index, if that's the one being seeked.)
Connection 1 (Set up tables)
USE tempdb;
CREATE TABLE Price
(
SomeKey INT PRIMARY KEY CLUSTERED,
idIsbn INT IDENTITY UNIQUE,
idMarketplace INT DEFAULT 3100,
lowestPrice SMALLMONEY DEFAULT $1.23,
priceChangedDate DATETIME DEFAULT GETDATE()
);
CREATE NONCLUSTERED INDEX ix
ON Price(idMarketplace)
INCLUDE (idIsbn, lowestPrice, priceChangedDate);
INSERT INTO Price
(SomeKey)
SELECT number
FROM master..spt_values
WHERE number BETWEEN 1 AND 2000
AND type = 'P';
Connection 2
Concurrent data modifications that move a row from the beginning of the seeked range (3100, 1) to the end (3100, 2001) and back again, repeatedly.
USE tempdb;
WHILE 1=1
BEGIN
UPDATE Price SET SomeKey = 2001 WHERE SomeKey = 1
UPDATE Price SET SomeKey = 1 WHERE SomeKey = 2001
END
Connection 3 (Do the insert into a temp table with a unique constraint)
USE tempdb;
CREATE TABLE #P
(
idIsbn INT NOT NULL PRIMARY KEY,
price SMALLMONEY,
priceChangedDate DATETIME
);
WHILE 1 = 1
BEGIN
TRUNCATE TABLE #P
INSERT INTO #P
(idIsbn,
price,
priceChangedDate)
SELECT idIsbn,
MIN(lowestPrice),
MIN(priceChangedDate)
FROM Price p
WHERE p.idMarketplace = 3100
GROUP BY p.idIsbn
END
The plan has no aggregate, as there is a unique constraint on idIsbn (a unique constraint on idIsbn, idMarketplace would also work), so the GROUP BY can be optimised out: there are no duplicate values.
But at read committed isolation level, shared row locks are released as soon as the row is read. So it is possible for a row to move places and be read a second time by the same seek or scan.
The index ix doesn't explicitly include SomeKey as a secondary key column, but as the index is not declared unique, SQL Server silently includes the clustering key behind the scenes; hence updating that column's value can move rows around in it.
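If blocking the concurrent writers is acceptable, one possible mitigation (besides IGNORE_DUP_KEY) is to hold the read locks for the whole statement so a row cannot move and be read twice, e.g. with a HOLDLOCK (serializable-range) hint on the source table; test the impact first, since the range locks will stall the updates for the duration of the insert:
INSERT INTO #P (idIsbn, price, priceChangedDate)
SELECT idIsbn,
MIN(lowestPrice),
MIN(priceChangedDate)
FROM Price p WITH (HOLDLOCK)
WHERE p.idMarketplace = 3100
GROUP BY p.idIsbn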

SQL Server ALTER COLUMN NOT NULL has no effect

The designers of table SOME_TABLE did not define a primary key, and worse, they set one of the columns that could define the primary key as NULLable (the others are OK).
The data for SOME_TABLE.PrinterPos does not contain any NULL values.
I am writing an upgrade script to apply to ~50 databases.
The following code is failing:
ALTER TABLE dbo.SOME_TABLE
ALTER COLUMN PrinterPos smallint NOT NULL;
ALTER TABLE dbo.SOME_TABLE
ADD CONSTRAINT PK_SOME_TABLE
PRIMARY KEY CLUSTERED (SOME_TABLE_ID ASC, Store_ID ASC, PrinterPos ASC)
WITH (PAD_INDEX = OFF, STATISTICS_NORECOMPUTE = OFF, SORT_IN_TEMPDB = OFF,
IGNORE_DUP_KEY = OFF, ONLINE = OFF, ALLOW_ROW_LOCKS = ON,
ALLOW_PAGE_LOCKS = ON) ON [PRIMARY];
I get the message
Cannot define PRIMARY KEY constraint on nullable column in table 'SOME_TABLE'.
It looks like the first command is being totally ignored, although there is no message to indicate this.
To put it in its own batch, I have tried executing the first command using sp_executesql, to no effect.
If I execute the first command in SQL Server Management Studio followed by the second then it executes OK.
I need to get this change fully automated. How can I get this to work via script?
Try adding a GO keyword between the two ALTER TABLE commands. GO is a batch separator recognised by SSMS and sqlcmd (it is not T-SQL itself), so the second statement is then parsed and validated only after the column is already NOT NULL.
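For example:
ALTER TABLE dbo.SOME_TABLE
ALTER COLUMN PrinterPos smallint NOT NULL;
GO
ALTER TABLE dbo.SOME_TABLE
ADD CONSTRAINT PK_SOME_TABLE
PRIMARY KEY CLUSTERED (SOME_TABLE_ID ASC, Store_ID ASC, PrinterPos ASC);
GO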

How can I change primary key on SQL Azure

I am going to change the primary key on SQL Azure, but it throws an error when using Microsoft SQL Server Management Studio to generate the scripts, because every table on SQL Azure must contain a primary key and I can't drop it before creating the new one. What can I do if I must change it?
Script generated
IF EXISTS (SELECT * FROM sys.indexes WHERE object_id = OBJECT_ID(N'[dbo].[mytable]') AND name = N'PK_mytable')
ALTER TABLE [dbo].[mytable] DROP CONSTRAINT [PK_mytable]
GO
ALTER TABLE [dbo].[mytable] ADD CONSTRAINT [PK_mytable] PRIMARY KEY CLUSTERED
(
[id] ASC
)WITH (STATISTICS_NORECOMPUTE = OFF, IGNORE_DUP_KEY = OFF, ONLINE = OFF)
GO
Error message
Msg 40054, Level 16, State 2, Line 3
Tables without a clustered index are not supported in this version of SQL Server. Please create a clustered index and try again.
Msg 3727, Level 16, State 0, Line 3
Could not drop constraint. See previous errors.
The statement has been terminated.
Msg 1779, Level 16, State 0, Line 3
Table 't_event_admin' already has a primary key defined on it.
Msg 1750, Level 16, State 0, Line 3
Could not create constraint. See previous errors.
I ran into this exact problem and contacted the Azure team on the forums. Basically it isn't possible. You'll need to create a new table and transfer the data to it.
What I did was create a transaction and within it do the following:
1. Rename the old table to OLD_MyTable.
2. Create the new table with the correct primary key and call it MyTable.
3. Select the contents from OLD_MyTable into MyTable.
4. Drop OLD_MyTable.
You may also need to call sp_rename on any constraints so they don't conflict.
See also: http://social.msdn.microsoft.com/Forums/en/ssdsgetstarted/thread/5cc4b302-fa42-4c62-956a-bbf79dbbd040
Upgrade to SQL V12: heaps are supported on it, so you can drop the primary key and recreate it.
I appreciate that this may be late in the day for yourself, but it may help others.
I recently came across this issue and found the least painful solution was to download the database from Azure, restore it locally, update the primary key locally (as the key constraint is a SQL Azure specific issue), and then restore the database back into Azure.
This saved any issues in regards to renaming databases or transferring data between them.
You can try the following script; change it to suit your table definition.
EXECUTE sp_rename N'PK_MyTable', N'PK_MyTable_old', 'OBJECT'
CREATE TABLE [dbo].[Temp_MyTable](
[id] [int] NOT NULL,
[text] [text] NOT NULL,
CONSTRAINT [PK_MyTable] PRIMARY KEY CLUSTERED (
[id] ASC) WITH (PAD_INDEX = OFF, STATISTICS_NORECOMPUTE = OFF, IGNORE_DUP_KEY = OFF, ALLOW_ROW_LOCKS = ON, ALLOW_PAGE_LOCKS = ON))
INSERT INTO dbo.[Temp_MyTable] (Id, Text)
SELECT Id, Text FROM dbo.MyTable
DROP TABLE dbo.MyTable
EXECUTE sp_rename N'Temp_MyTable', N'MyTable', 'OBJECT'
This question is outdated: changing the primary key is now supported in the latest version of SQL Azure, and you don't have to create a temporary table.

Problem in using indexes with SQL Server 2005 and Hibernate

I have a problem with a query generated by Hibernate that does not use an index. Access to the database is made from Java using jTDS, and the server version is SQL Server 2005, latest service pack.
The field is nullable and is a foreign key that, in some specific scenarios, could be entirely null. The column is indexed via a non-clustered index, but the index is never used when the column is entirely null, resulting in a large number of full table scans and performance issues.
The situation can also be verified using the standard query analyzer with the following SQL code:
Create table and indexes
CREATE TABLE [dbo].[TestNulls](
[PK] [varchar](36) NOT NULL,
[DATA] [varchar](36) NULL,
[DATANULL] [varchar](36) NULL,
CONSTRAINT [PK_TestNulls] PRIMARY KEY NONCLUSTERED
(
[PK] ASC
)WITH (PAD_INDEX = OFF, STATISTICS_NORECOMPUTE = OFF, IGNORE_DUP_KEY = OFF,
ALLOW_ROW_LOCKS = ON, ALLOW_PAGE_LOCKS = ON) ON [PRIMARY]
) ON [PRIMARY]
CREATE NONCLUSTERED INDEX [IDX_DATA] ON [dbo].[TestNulls]
(
[DATA] ASC
)WITH (PAD_INDEX = OFF, STATISTICS_NORECOMPUTE = OFF, SORT_IN_TEMPDB = OFF,
IGNORE_DUP_KEY = OFF, DROP_EXISTING = OFF, ONLINE = OFF, ALLOW_ROW_LOCKS = ON,
ALLOW_PAGE_LOCKS = ON) ON [PRIMARY]
GO
CREATE NONCLUSTERED INDEX [IDX_DATANULL] ON [dbo].[TestNulls]
(
[DATANULL] ASC
)WITH (PAD_INDEX = OFF, STATISTICS_NORECOMPUTE = OFF, SORT_IN_TEMPDB = OFF,
IGNORE_DUP_KEY = OFF, DROP_EXISTING = OFF, ONLINE = OFF, ALLOW_ROW_LOCKS = ON,
ALLOW_PAGE_LOCKS = ON) ON [PRIMARY]
GO
Fill it with some random data using the newid function
declare @i as int
set @i = 0
while (@i < 500000)
begin
set nocount on
insert into TestNulls values(NEWID(), NEWID(), null)
insert into TestNulls values(NEWID(), null, null)
insert into TestNulls values(NEWID(), null, null)
set @i = (@i + 1)
set nocount on
end;
This query performs a full table scan:
declare @p varchar(36)
set @p = NEWID()
select PK, DATA, DATANULL from TestNulls
where DATANULL = @p
If I complete the query with a "and DATANULL IS NOT NULL" the query now uses the index.
Help needed:
How can I force the JTDS/Hibernate combination to use the index (the sendStringParametersAsUnicode is already set to false by default)?
Is there a way to append "and column is not null" for all Hibernate queries that use a nullable field?
Any explanation about this behavior?
Regards
Massimo
1) I think we should avoid NULL values: use a DEFAULT and store some sentinel like {00000-0000-000...} in place of NULL. Your data-filling script generates too many NULL values, so the selectivity of this field's values is very low. I think SQL Server will choose to scan rather than use the index in this case (SQL Server decides by itself whether or not to use an index), and that makes sense. You should analyse your REAL data. In any case, you can force it to use some index. You can create a stored procedure on the server and then query it from Hibernate, for example,
or command Hibernate to use a custom query to request the data (I think it is possible) and add a table hint to your query to force the use of some index:
INDEX ( index_val [ ,...n ] ):
select PK, DATA, DATANULL from TestNulls WITH (INDEX(IDX_DATANULL))
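A minimal sketch of the stored-procedure route (hypothetical name), which pins the index hint on the server side:
CREATE PROCEDURE dbo.GetTestNullsByDataNull
@p varchar(36)
AS
BEGIN
SET NOCOUNT ON;
-- Force the non-clustered index on DATANULL regardless of the optimizer's estimate
SELECT PK, DATA, DATANULL
FROM dbo.TestNulls WITH (INDEX(IDX_DATANULL))
WHERE DATANULL = @p;
END
Hibernate can then call it through a native query.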
The selectivity is the "number of rows" / "cardinality"; so if you have 10K customers and search for all "female" ones, you have to consider that the search would return 10K/2 = 5K rows, which is a very "bad" selectivity.
Good luck.
You are using a table with no clustered index (a "heap table", as it is called), which is generally not very efficient for SELECTs, because any meaningful query requires either a bookmark lookup or a full table scan.
So, to use the index, the server has to: 1) find the given values in the index and retrieve the corresponding row IDs, and 2) retrieve the rows by those IDs and return the data.
Given the nature of your data, the optimizer "thinks" a full scan is more efficient.
I'd suggest you try:
Rebuilding the statistics on the table. Outdated stats could lead the optimizer to wrong decisions.
Forcing the index via a hint. Do not forget to test whether it is really faster on your actual data (sometimes the optimizer happens to know better than you).
Creating a covering index for this query by adding some data (it will make inserts/updates somewhat slower, so you should consider the overall impact on the system):
CREATE INDEX IDX_DATANULL_FULL ON TestNulls (DATANULL) INCLUDE (PK, DATA)