Unable to create index because of duplicate that doesn't exist? - sql-server-2005

I'm getting an error running the following Transact-SQL command:
CREATE UNIQUE NONCLUSTERED INDEX IX_TopicShortName
ON DimMeasureTopic(TopicShortName)
The error is:
Msg 1505, Level 16, State 1, Line 1
The CREATE UNIQUE INDEX statement terminated because a duplicate key was found for the object name 'dbo.DimMeasureTopic' and the index name 'IX_TopicShortName'. The duplicate key value is ().
When I run either of the following queries, the IX_TopicShortName index does not appear, so there doesn't seem to be a duplicate:
SELECT * FROM sys.indexes WHERE name = 'IX_TopicShortName'
SELECT * FROM sys.indexes WHERE object_id = OBJECT_ID(N'[dbo].[DimMeasureTopic]')
I have the same schema in another database and can create the index without issues there. Any ideas why it won't create here?

It's not that the index already exists; it's that there are duplicate values in the TopicShortName column of the table itself. According to the error message the duplicate value is an empty string (though that may just be how it was rendered when posted). Such duplicates prevent the creation of a UNIQUE index.
You could run a query to confirm that you have a duplicate:
SELECT TopicShortName, COUNT(*)
FROM DimMeasureTopic
GROUP BY TopicShortName
HAVING COUNT(*) > 1
Presumably in the other database the data are different, and the duplicates are not present.
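To see the full offending rows rather than just the counts, you could run something like:
SELECT *
FROM DimMeasureTopic
WHERE TopicShortName IN (SELECT TopicShortName
                         FROM DimMeasureTopic
                         GROUP BY TopicShortName
                         HAVING COUNT(*) > 1)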

The duplicate is in your data; try running this query to find it.
SELECT TopicShortName, COUNT(*)
FROM DimMeasureTopic
GROUP BY TopicShortName
HAVING COUNT(*) > 1

It's because you have records in the table already that are not unique (by the sounds of it, 2 records with a blank value in the TopicShortName field).
So, it's to do with the data, not the index itself.
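If the blank rows really are redundant, one possible cleanup (a sketch only; it assumes every duplicate beyond the first can safely be deleted, so inspect the rows first) is:
-- Destructive: keeps one arbitrary row per TopicShortName and deletes the rest
;WITH Ranked AS (
    SELECT ROW_NUMBER() OVER (PARTITION BY TopicShortName ORDER BY (SELECT NULL)) AS rn
    FROM DimMeasureTopic
)
DELETE FROM Ranked
WHERE rn > 1
After that, the CREATE UNIQUE INDEX statement should succeed.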

If you are using code-based migrations, and you rename an entity property that has a unique index on it, Entity Framework will create a new column and try to add a unique index to it. But the new column contains all NULL values, so creating the unique index fails. You need to manually modify the migration code to copy the data from the old column before the line that creates the index.
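In SQL terms, the backfill the migration needs to run before the index creation boils down to something like the following (the table and column names here, MyEntities with OldName/NewName, are hypothetical placeholders for your own):
-- Hypothetical names: backfill the renamed column before the unique index is created
UPDATE MyEntities
SET NewName = OldName
WHERE NewName IS NULL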

The error message should have specified the duplicate key value, e.g. "The duplicate key value is (' ', ' ', ' '). The statement has been terminated." You have duplicate values that need to be addressed.

Related

PostgreSQL: Do I need to lock the table to avoid concurrency errors during an update that uses a subquery to find the index to update in a JSONB array?

Assuming I want to update a JSONB column contacts (an array of objects) in a customers table, and I want to update a value of an object inside the array based on its index found by a subquery, do I need to lock the table against updates during execution to avoid concurrency problems?
In other words, could the table be altered between the subquery's read and the main query's write, making the index I selected with the subquery obsolete?
with contact_email as (
    select ('{' || index - 1 || ',value}')::text[] as path
    from customers
    cross join jsonb_array_elements(contacts) with ordinality arr(contact, index)
    where contact->>'type' = 'email' and name = 'john'
)
update customers
set contacts = jsonb_set(contacts, contact_email.path, '"john@example.com"', false)
from contact_email
where name = 'john'
-- `customers` table has a `name` column and a `contacts` column (jsonb)
-- `contacts` column contains things like `[{"type":"email","value":"x@y.z", …}]`
In the example above, if the array in the contacts column is altered between the table read (subquery) and the update (main query), the index selected would become wrong and I would update the wrong array entry.
If something is unclear I can edit my question and add more details.
Both parts of the query will see the same snapshot of the database, so the data are always consistent.
If some concurrent transaction changes the row between the time it is read and the time it is written, the outcome depends on your isolation level:
if you are running with the default READ COMMITTED isolation, the update will either overwrite that change or do nothing (the latter if name has changed)
if you are running with REPEATABLE READ or higher, you will get a serialization error and have to repeat the statement
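If you want to keep READ COMMITTED but make sure no concurrent writer can touch the row between the read and the write, one option is to lock it while computing the path. A sketch, assuming customers has an id primary key:
with contact_email as (
    select c.id,  -- assumes customers has an id primary key
           ('{' || index - 1 || ',value}')::text[] as path
    from customers c
    cross join jsonb_array_elements(c.contacts) with ordinality arr(contact, index)
    where arr.contact->>'type' = 'email' and c.name = 'john'
    for update of c  -- locks the matching customers rows until commit
)
update customers
set contacts = jsonb_set(contacts, contact_email.path, '"john@example.com"', false)
from contact_email
where customers.id = contact_email.id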

How to copy, change, and insert records in Postgres

In a PostgreSQL DB table, I need to copy a block of records from a prior month, change values in some of the columns, and append the updated records to the table. Details include:
The key id is configured with nextval to automatically create
unique key values
The target records have '200814' in group_tag
The new records need '200911' in group_tag
Several other fields need to be updated as shown in the SELECT
My script so far:
INSERT INTO hist.group_control(
id,
group_tag,
process_sequence,
state,
cbsa_code,
window_starts_on,
preceding_group,
preceding_origin,
preceding_window_starts_on
)
SELECT id,
'200911',
1,
state,
cbsa_code,
'2020-09-11',
'200814',
preceding_origin,
'2020-08-14'
FROM hist.group_control WHERE group_tag='200814';
This generates an error:
SQL Error [23505]: ERROR: duplicate key value violates unique constraint "group_control_pkey"
Detail: Key (id)=(12250) already exists.
Records with key values up to 13008 exist. I would have expected nextval to determine this and start the id value at 13009. I attempted to simply not include id in the statement thinking the nextval function would operate automatically, but that errored as well. Variations on the following have not worked due to the respective errors:
alter sequence group_control_id_seq restart with 13009;
SQL Error [42501]: ERROR: must be owner of relation group_control_id_seq
SELECT setval('group_control_id_seq', 13009, true);
SQL Error [42501]: ERROR: permission denied for sequence group_control_id_seq
Anyone know how to code the main statement so it doesn't generate the duplicate key, or alternatively, how to tell nextval to start at a value of 13009?
It appears your id column is serial, bigserial, or generated by default as identity. Any of these assigns a value only when the id column is not specified in the INSERT statement. If you specify the id column, Postgres will not assign the PK value; since you selected id, Postgres attempted to use the values you supplied. The solution is to drop id from the INSERT statement.
INSERT INTO hist.group_control(
group_tag,
process_sequence,
state,
cbsa_code,
window_starts_on,
preceding_group,
preceding_origin,
preceding_window_starts_on
)
SELECT '200911',
1,
state,
cbsa_code,
'2020-09-11',
'200814',
preceding_origin,
'2020-08-14'
FROM hist.group_control WHERE group_tag='200814';
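Separately, if the sequence itself has fallen behind the data, the sequence owner (your permission errors show your account isn't it) can resynchronize it. A sketch, using the sequence name from your error messages (schema-qualify it, e.g. hist.group_control_id_seq, if it isn't on your search_path):
-- Run as the sequence owner: align the sequence with the current max id
SELECT setval('group_control_id_seq', (SELECT MAX(id) FROM hist.group_control));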

SQL/DB2 SQLSTATE=23505 error when executing an UPDATE statement

I am getting a SQLSTATE=23505 error when I execute the following DB2 statement:
update SEOURLKEYWORD
set URLKEYWORD = REPLACE(URLKEYWORD, '/', '-')
where STOREENT_ID = 10701
and URLKEYWORD like '%/%';
After a quick search, a SQL state 23505 error is defined as follows:
AN INSERTED OR UPDATED VALUE IS INVALID BECAUSE THE INDEX IN INDEX SPACE CONSTRAINS COLUMNS OF THE TABLE SO NO TWO ROWS CAN CONTAIN DUPLICATE VALUES IN THOSE COLUMNS RID OF EXISTING ROW IS X
The full error I am seeing is:
DB2 Database Error: ERROR [23505] [IBM][DB2/LINUXX8664] SQL0803N One or more values in the INSERT statement, UPDATE statement, or foreign key update caused by a DELETE statement are not valid because the primary key, unique constraint or unique index identified by "2" constrains table "WSCOMUSR.SEOURLKEYWORD" from having duplicate values for the index key. SQLSTATE=23505
I'm not sure what the "index identified by '2'" means, but it could be significant.
The properties of the columns for the SEOURLKEYWORD table are as follows:
Based on my understanding of this information, the only column that is forced to be unique is SEOURLKEYWORD_ID, the primary key column. This makes it sound like the update statement I'm trying to run is attempting to insert a row that has a SEOURLKEYWORD_ID that already exists in the table.
If I run a select * statement on the rows I'm trying to update, here's what I get:
select * from SEOURLKEYWORD
where storeent_id = 10701
and lower(URLKEYWORD) like '%/%';
I don't understand how executing the UPDATE statement is resulting in an error here. There are only 4 rows this statement should even be looking at, and I'm not manually updating the primary key at all. It kind of seems like it's reinserting a duplicate row with the updated column value before deleting the existing row.
Why am I seeing this error when I try to update the URLKEYWORD column of these four rows? How can I resolve this issue?
IMPORTANT: As I wrote this question, I narrowed down the problem to the last of the four rows in the table above, SEOURLKEYWORD_ID = 3074457345616973668. I can update the other three rows just fine, but the 4th row causes the error, and I have no idea why. If I run select * from SEOURLKEYWORD where SEOURLKEYWORD_ID = 3074457345616973668;, I see only that one row.
The error is pretty clear. You have a unique index/constraint in the table. Say you have two rows like this:
STOREENT_ID   URLKEYWORD
10701         A/B
10701         A-B
When the first version is replaced by 'A-B', the result would violate a unique constraint on (STOREENT_ID, URLKEYWORD) or (URLKEYWORD) (do note that other columns could possibly be included in the unique constraint/index as well).
You could avoid these situations by not updating them. I don't know what columns the unique constraint is on, but let's say only on URLKEYWORD. Then:
update SEOURLKEYWORD
set URLKEYWORD = REPLACE(URLKEYWORD, '/', '-')
where STOREENT_ID = 10701 and
      URLKEYWORD like '%/%' and
      not exists (select 1
                  from SEOURLKEYWORD s2
                  -- exclude the row itself, otherwise every row matches and nothing is updated
                  where s2.SEOURLKEYWORD_ID <> SEOURLKEYWORD.SEOURLKEYWORD_ID
                    and replace(s2.URLKEYWORD, '/', '-') = REPLACE(SEOURLKEYWORD.URLKEYWORD, '/', '-'));
Note the replace() is required for both columns because you might have:
A-B/C
A/B-C
These only conflict after the replacement in both values.
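You can see the collision directly (SYSIBM.SYSDUMMY1 is DB2's one-row dummy table):
-- Both expressions yield 'A-B-C', so the two rows would collide after the update
SELECT REPLACE('A-B/C', '/', '-'), REPLACE('A/B-C', '/', '-')
FROM SYSIBM.SYSDUMMY1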
To complement the answer given by @GordonLinoff, here is a query that can be used to find a table's unique constraints, with their IDs, and the columns included in them:
SELECT c.tabschema, c.tabname, i.iid AS index_id, i.indname, ck.colname
FROM syscat.tabconst c
INNER JOIN syscat.indexes i
ON i.indname = c.constname -- unique index name matches constraint name
AND i.tabschema = c.tabschema AND i.tabname = c.tabname
INNER JOIN syscat.keycoluse ck
ON ck.constname = c.constname
AND ck.tabschema = c.tabschema AND ck.tabname = c.tabname
WHERE c.type = 'U' -- constraint type: unique
AND (c.tabschema, c.tabname) = ('YOURSCHEMA', 'YOURTABLE') -- replace schema/table
ORDER BY i.iid, ck.colseq

Insert statement with no joins results in duplicates where no duplicates existed previously?

I am having an issue with some SQL that is producing results I wouldn't expect. I am storing information from a variety of tables in another table which is used as part of a search page on a website. All of the page data for each page, along with data from other elements on other pages (like calendars, etc.), is referenced in a table called pageContentCache. This table normally has an index created with the following:
alter table pageContentCache add
constraint [IX_pageContentCache] PRIMARY KEY CLUSTERED (
[objectId]
)
For some reason, an issue that to me would appear to be a duplicate objectId has started occurring with one instance of this software, resulting in the following error:
Msg 1505, Level 16, State 1 Procedure sp_rebuildPageContentCache, Line 50
The CREATE UNIQUE INDEX statement terminated because a duplicate key was found for the object name 'dbo.pageContentCache' and the index name 'IX_pageContentCache'. The duplicate key value is (21912).
So, to debug the issue, I changed the procedure to first load all of the data it was going to insert into pageContentCache into a temporary table, #contentcache, so I could look through it.
This is where I'm starting to get a little confused...
Once the data has been inserted into #contentcache (which has two columns, objectId and content), I can run the following SQL statement and it will return nothing:
select objectId, count(objectId) from #contentcache
group by objectId having count(objectId) > 1
This returns no records. If I then run the following SQL:
insert into pageContentCache (objectId, contentData)
select objectId, content
from #contentcache
This inserts all of the data from #contentcache into pageContentCache as you'd expect. However, if I then run the same duplicate check against pageContentCache:
select objectId, count(objectId) from pageContentCache
group by objectId having count(objectId) > 1
This then returns duplicates:
objectId   (no column name)
21912      2
There are no triggers or anything like that associated with this table and the insert statement is merely copying the data from one table to another, so... where is this duplicate coming from?
Try the following:
insert into pageContentCache (objectId, contentData)
select distinct objectId, content
from #contentcache
Can't see why you would have duplicates since, as you mentioned, there are no joins in your select statement. Anyways, my guess is that the distinct keyword will ensure that the duplicates are eliminated.
This is a SQL Server database error I have seen before. You may want to apply the latest service pack and retry.
I am not so sure that this statement does what you think it does:
select objectId, count(objectId) from #contentcache
group by objectId having count(objectId) > 1
Can you try this instead:
WITH SUBQUERY AS (
    SELECT COUNT(objectId) OVER (PARTITION BY objectId) AS CNT_OBJECT_IDS,
           objectId
    FROM #contentcache
)
SELECT * FROM SUBQUERY WHERE CNT_OBJECT_IDS > 1
See if this gets you any rows back.
Also, I've never worked with clustered indexes before and I am wondering if they do some additional things that we are not aware of. Can you try just saying
PRIMARY KEY
instead of
PRIMARY KEY CLUSTERED
in your constraint definition and see if that affects your problem at all?
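That is, something like the sketch below; note that a bare PRIMARY KEY still defaults to clustered when the table has no other clustered index, so spelling out NONCLUSTERED makes the experiment explicit:
alter table pageContentCache add
constraint [IX_pageContentCache] PRIMARY KEY NONCLUSTERED (
[objectId]
)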

Change column to unique

I have a table with data, and I want to make one of its columns unique. It must be unique, but I'm worried that duplicate data in that column will cause problems in my database.
I want to know what happens if I change a column to unique when it doesn't contain unique data. Will I lose some records, just get an error message, or something else?
PS: I'm using SQL Server.
Thanks in advance.
You just won't be able to add a UNIQUE constraint on a column with duplicate data.
In SSMS, the error message is something like this:
The CREATE UNIQUE INDEX statement terminated because a duplicate key was found for the object name 'dbo.<Table>' and the index name '<constraintname>'. The duplicate key value is (<NULL>).
Could not create constraint. See previous errors.
So rest assured, you won't lose any data.
alter table YourTable add constraint UX_YourTable_YourColumn unique(YourColumn)
If there is duplicate data, the alter will abort without making any changes.
You can query duplicates like:
select YourColumn
, count(*) as DuplicateCount
from YourTable
group by
YourColumn
having count(*) > 1
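If duplicates do turn up and the extra rows can be discarded, one possible cleanup before adding the constraint (a sketch; it assumes YourTable has a single-column primary key Id, and it keeps the row with the lowest Id per value) is:
-- Destructive: removes every duplicate except the one with the smallest Id
delete t
from YourTable t
join YourTable keeper
  on keeper.YourColumn = t.YourColumn
 and keeper.Id < t.Id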