clustered index on a column which has duplicate values

I have a table with no index. I need to add a clustered index on one column, but the table does not have any column with unique data. Will I be able to add a clustered index on a column that contains duplicates?

A clustered index does not enforce uniqueness unless you specify the keyword UNIQUE.
CREATE CLUSTERED INDEX bob ON foo( bar )
is not the same as
CREATE UNIQUE CLUSTERED INDEX bob on foo( bar )
You may be thinking of a PRIMARY KEY constraint in a CREATE TABLE statement.
In this example:
CREATE TABLE foo ( bar int PRIMARY KEY )
ASE will create a UNIQUE, CLUSTERED index on bar.
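As a minimal sketch of the difference, assuming T-SQL syntax and a made-up table foo_demo (both names and data are hypothetical, not from the question), the plain clustered index below should accept the duplicate values while the UNIQUE variant would be rejected:

```sql
-- Hypothetical demo table: bar holds duplicates, baz keeps whole rows distinct.
CREATE TABLE foo_demo ( bar int NOT NULL, baz int NOT NULL );

INSERT INTO foo_demo ( bar, baz ) VALUES ( 1, 10 );
INSERT INTO foo_demo ( bar, baz ) VALUES ( 1, 20 );  -- same bar value as the row above
INSERT INTO foo_demo ( bar, baz ) VALUES ( 2, 30 );

-- Succeeds: without UNIQUE, duplicate key values are allowed.
CREATE CLUSTERED INDEX cix_foo_demo_bar ON foo_demo ( bar );

-- Would fail on the duplicate bar = 1, because UNIQUE is enforced:
-- CREATE UNIQUE CLUSTERED INDEX ucix_foo_demo_bar ON foo_demo ( bar );
```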

Related

Change clustered index without touching primary key

We have an existing database where we would like to change the clustered index to a unique, monotonically increasing field (as it should have been from the start), but we don't want to change the primary key because there is data referencing this primary key.
We have added a new column SequentialId and populated it with data, to serve as our new clustered index.
But how do we change the clustered index? If possible, we would like to either replace the existing clustered index OR add SequentialId to the current index as the first column.
How do we go about this? It seems we cannot change the clustered index without dropping the primary key (which we can't do).

Use ALTER TABLE to drop the PRIMARY KEY constraint (which is not the same as dropping the clustered index that enforces the constraint), then recreate it with the additional columns:
ALTER TABLE <Table_Name>
DROP CONSTRAINT <constraint_name>
ALTER TABLE <Table_Name>
ADD CONSTRAINT <constraint_name> PRIMARY KEY (<Column1>,<Column2>)
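For the situation in the question, where other tables reference the primary key, one possible sequence is sketched below. It assumes SQL Server and uses made-up names throughout (dbo.ParentDemo, dbo.ChildDemo, and all constraint and index names are hypothetical). The primary key keeps the same column and values, so referencing data is unchanged; only the constraints are dropped and recreated inside one transaction.

```sql
SET XACT_ABORT ON;

-- Hypothetical setup: a clustered primary key that another table references.
CREATE TABLE dbo.ParentDemo
(
    Id           int NOT NULL CONSTRAINT PK_ParentDemo PRIMARY KEY CLUSTERED,
    SequentialId int NOT NULL
);
CREATE TABLE dbo.ChildDemo
(
    ChildId  int NOT NULL CONSTRAINT PK_ChildDemo PRIMARY KEY,
    ParentId int NOT NULL
        CONSTRAINT FK_ChildDemo_ParentDemo REFERENCES dbo.ParentDemo (Id)
);

BEGIN TRAN;

-- 1. A referenced primary key cannot be dropped, so remove the foreign key first.
ALTER TABLE dbo.ChildDemo DROP CONSTRAINT FK_ChildDemo_ParentDemo;

-- 2. Dropping the clustered primary key leaves the table as a heap.
ALTER TABLE dbo.ParentDemo DROP CONSTRAINT PK_ParentDemo;

-- 3. Cluster on the new monotonically increasing column.
CREATE UNIQUE CLUSTERED INDEX CIX_ParentDemo_SequentialId
    ON dbo.ParentDemo (SequentialId);

-- 4. Recreate the primary key on the same column, this time nonclustered.
ALTER TABLE dbo.ParentDemo
    ADD CONSTRAINT PK_ParentDemo PRIMARY KEY NONCLUSTERED (Id);

-- 5. Restore the foreign key against the recreated primary key.
ALTER TABLE dbo.ChildDemo
    ADD CONSTRAINT FK_ChildDemo_ParentDemo
    FOREIGN KEY (ParentId) REFERENCES dbo.ParentDemo (Id);

COMMIT;
```

On a large table each of these steps rebuilds an index, so this is probably best tried on a copy first.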

Specifying existing non clustered unique index when defining a primary key constraint

I have a heap table - no clustered index defined - (call it table A), with a unique non clustered index on a non nullable column (call the column ID and the index IX).
I would like to use index IX when defining the primary key constraint on column ID for table A.
The documentation somewhere says this:
The Database Engine automatically creates a unique index to enforce the uniqueness requirement of the PRIMARY KEY constraint. If a clustered index does
not already exist on the table or a nonclustered index is not explicitly specified, a unique, clustered index is created to enforce the PRIMARY KEY constraint.
I've read through the entire ALTER TABLE documentation but there seems to be no syntax for "nonclustered index is ... explicitly specified, ".
I have tried defining the nonclustered index IX and then specifying the primary key, and have also tried various combinations of the alter table ... add constraint ... primary key statement, to no avail.
It makes sense that my index IX is equivalent to the nonclustered index that SQL Server creates when I simply specify the ID column in the alter table ... add constraint ... primary key (ID) statement, but I would prefer not to have this redundant index that SQL Server creates for me, and would rather make it use the index IX, which also has an include list of columns.
If I drop the index that SQL Server creates then the primary key constraint also vanishes.
If it were possible to alter the index that SQL Server creates my problem would be solved, but the alteration I would like to do to it requires a drop and recreate.

There is no way to create a constraint and associate it with an existing index that already guarantees the constraint.
This functionality does exist in other RDBMS. It would be particularly useful for the supertype/subtype pattern as this requires creating unique indexes on both Id and (Id, Type) even though the latter one (required for the FK) is logically ensured by the first.
It is possible to replace the unique index with a unique constraint as a metadata-only change using ALTER TABLE ... SWITCH, but attempting to do the same with a nonclustered PK constraint fails with:
ALTER TABLE SWITCH statement failed. There is no identical index in
source table 'A' for the index 'IX' in target table 'B'.
The code that performs this for a unique constraint is
Initial Position
CREATE TABLE dbo.A(ID INT NOT NULL, OtherCols VARCHAR(200));
CREATE UNIQUE NONCLUSTERED INDEX IX ON dbo.A(ID);
INSERT INTO dbo.A VALUES (1,'A'),(2,'B');
Replace unique index with unique constraint
SET XACT_ABORT ON;
BEGIN TRAN;
CREATE TABLE dbo.B
(
ID INT NOT NULL CONSTRAINT IX UNIQUE NONCLUSTERED,
OtherCols VARCHAR(200)
);
ALTER TABLE dbo.A
SWITCH TO dbo.B;
DROP TABLE dbo.A;
EXECUTE sp_rename
N'dbo.B',
N'A',
'OBJECT';
COMMIT;

Does SQL Server create a nonclustered index by default

Yes, it is a duplicate of this. But I just need clarification on this article by Pinal Dave, which says the following:
Scenario 4: Primary Key defaults to Clustered Index with other index
defaults to Non-clustered index
In this case we will create two indexes on the both the tables but we
will not specify the type of the index on the columns. When we check
the results we will notice that Primary Key is automatically defaulted
to Clustered Index and another column as a Non-clustered index.
-- Case 4 Primary Key and Defaults
USE TempDB
GO
-- Create table
CREATE TABLE TestTable
(ID INT NOT NULL PRIMARY KEY,
Col1 INT NOT NULL UNIQUE)
GO
-- Check Indexes
SELECT OBJECT_NAME(OBJECT_ID) TableObject,
[name] IndexName,
[Type_Desc]
FROM sys.indexes
WHERE OBJECT_NAME(OBJECT_ID) = 'TestTable'
GO
-- Clean up
DROP TABLE TestTable
GO

The only indexes that get created automatically:
the clustered index on your primary key (unless you specify otherwise - if you define your primary key to be nonclustered, then a nonclustered index will be created)
a unique nonclustered index when you apply a UNIQUE CONSTRAINT to a column (or set of columns)

Just to spell it out - the result of Pinal Dave's example is a pair of indexes similar to the following:
TestTable PK__TestTabl__3214EC2703317E3D CLUSTERED
TestTable UQ__TestTabl__A259EE55060DEAE8 NONCLUSTERED
Which can be explained as follows:
PK Clustered
If a table is created with a primary key, then it is a Clustered Table, and the Clustered Index is defaulted to the Primary Key unless you specify otherwise.
(Tables without a Clustered Index are Heaps)
UQ Nonclustered
SQL Server does not usually create any nonclustered indexes on a table by default.
However, as Marc has pointed out, because the table has a column with a UNIQUE constraint (Col1 INT NOT NULL UNIQUE), SQL Server implements the constraint as a unique, nonclustered index on that column.
See also: Is the Sql Server Unique Key also an Index?

Relationship of Primary Key and Clustered Index

Can a TABLE have a primary key without a clustered index?
And can a TABLE have a clustered index without having a primary key?
Can anybody briefly tell me the relationship between primary key and clustered index?

A primary key is a logical concept - it's the unique identifier for a row in a table. As such, it has a bunch of attributes - it may not be null, and it must be unique. Of course, as you're likely to be searching for records by their unique identifier a lot, it would be good to have an index on the primary key.
A clustered index is a physical concept - it's an index that affects the order in which records are stored on disk. This makes it a very fast index when accessing data, though it may slow down writes if your primary key is not a sequential number.
Yes, you can have a primary key without a clustered index - and sometimes, you may want to (for instance when your primary key is a combination of foreign keys on a joining table, and you don't want to incur the disk shuffle overhead when writing).
Yes, you can create a clustered index on columns that aren't a primary key.

A table can have a primary key that is not clustered, and a clustered table does not require a primary key. So the answer to both questions is yes.
A clustered index stores all columns at the leaf level. That means a clustered index contains all data in the table. A table without a clustered index is called a heap.
A primary key is enforced by a unique index that is clustered by default. "By default" means that when you create a primary key on a table that is not yet clustered, it is created as a unique clustered index, unless you explicitly specify the nonclustered option.
An example, where t1 has a nonclustered primary key, and t2 is not clustered but has a primary key:
create table t1 (id int not null, col1 int);
alter table t1 add constraint PK_T1 primary key nonclustered (id);
create clustered index IX_T1_COL1 on t1 (col1);
create table t2 (id int not null, col1 int);
alter table t2 add constraint PK_T2 primary key nonclustered (id);
Example at SQL Fiddle.

First of all, take a look at Index-Organized Tables and Clustered Indexes. Actually, I recommend reading the whole Use the Index Luke! site from the beginning until you reach the clustering topic to really understand what's going on.
Now, to your questions...
Can a TABLE have primary key without Clustered Index?
Yes, use NONCLUSTERED keyword when declaring your primary key to make a heap-based table. For example:
CREATE TABLE YOUR_TABLE (
YOUR_PK int PRIMARY KEY NONCLUSTERED
-- Other fields...
);
This is unfortunate, since a lot of people seem to just accept the default (which is CLUSTERED), even though in many cases a heap-based table would actually be better (as discussed in the linked article).
Can a TABLE have Clustered Index without primary key?
Unlike some other DBMSes, MS SQL Server will let you have a clustering index that is different from primary key, or even without having the primary key at all.
The following example creates a clustering index separate from the PK, that has a UNIQUE constraint on top of it, which is what you'd probably want in most cases:
CREATE TABLE YOUR_TABLE (
YOUR_PK int PRIMARY KEY,
YOUR_CLUSTERED_KEY int NOT NULL UNIQUE CLUSTERED
-- Other fields...
);
If you choose a non-unique clustering index (using CREATE CLUSTERED INDEX ...), MS SQL Server will automatically make it unique by adding a hidden field (a 4-byte "uniquifier") to rows whose key values are duplicated.
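The hidden field cannot be selected, but a rough sketch like the following (SQL Server assumed; the table dbo.UniquifierDemo and its columns are hypothetical) shows that duplicate clustering keys are accepted and that the catalog still reports the index as non-unique:

```sql
-- Hypothetical table clustered on a column that will contain duplicates.
CREATE TABLE dbo.UniquifierDemo
(
    GroupId int         NOT NULL,
    Payload varchar(20) NOT NULL
);
CREATE CLUSTERED INDEX CIX_UniquifierDemo_GroupId
    ON dbo.UniquifierDemo (GroupId);

-- Two rows share GroupId = 1; both inserts are accepted.
INSERT INTO dbo.UniquifierDemo (GroupId, Payload) VALUES (1, 'a');
INSERT INTO dbo.UniquifierDemo (GroupId, Payload) VALUES (1, 'b');
INSERT INTO dbo.UniquifierDemo (GroupId, Payload) VALUES (2, 'c');

-- The index is listed as CLUSTERED with is_unique = 0; the hidden field
-- is internal and does not show up as a column of the table.
SELECT name, type_desc, is_unique
FROM sys.indexes
WHERE object_id = OBJECT_ID(N'dbo.UniquifierDemo');
```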
Please note that the benefits of clustering are most visible for range scans. If you use a clustering index that doesn't "align" with range scans done by your client application(s) (such as when over-relying on the hidden column mentioned above, or clustering on a surrogate key), you are pretty much defeating the purpose of clustering.
Can anybody briefly tell me the relationship of primary key and clustered index?
Under MS SQL Server, primary key is also clustered by default. You can change that default, as discussed above.

Answers taken from MSDN Using Clustered Indexes
Can a TABLE have primary key without Clustered Index? - Yes.
Can a TABLE have Clustered Index without primary key? - Yes.
A Primary Key is a constraint that ensures uniqueness of the values, such that a row can always be identified specifically by that key.
An index is automatically assigned to a primary key (as rows are often "looked up" by their primary key).
A non-clustered index is a logical ordering of rows, by one (or more) of its columns. Think of it as effectively another "copy" of the table, ordered by whatever columns the index is across.
A clustered index is when the actual table is physically ordered by a particular column. A table will not always have a clustered index (i.e. while it will be physically ordered by something, that ordering might be undefined). A table cannot have more than one clustered index, although it can have a single composite clustered index (i.e. the table is physically ordered by, e.g., Surname, Firstname, DOB).
The PK is often (but not always) a clustered index.

For what it may be worth, in MS SQL Server all columns in the primary key must be defined as NOT NULL, while creating a unique clustered index does not require this. Not sure about other DB systems though.
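A small sketch of that difference, assuming SQL Server and a hypothetical table dbo.NullDemo; the statements expected to fail are left commented out:

```sql
-- Hypothetical table whose key column is nullable.
CREATE TABLE dbo.NullDemo (Id int NULL, Col1 int NULL);

-- Allowed: a unique clustered index on a nullable column.
CREATE UNIQUE CLUSTERED INDEX UCIX_NullDemo_Id ON dbo.NullDemo (Id);

-- Allowed once: the index treats NULL as a single key value.
INSERT INTO dbo.NullDemo (Id, Col1) VALUES (NULL, 1);

-- Would fail with a duplicate-key error, since NULL is already present:
-- INSERT INTO dbo.NullDemo (Id, Col1) VALUES (NULL, 2);

-- Would fail because a PRIMARY KEY cannot be defined on a nullable column:
-- ALTER TABLE dbo.NullDemo ADD CONSTRAINT PK_NullDemo PRIMARY KEY NONCLUSTERED (Id);
```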

This may not directly answer the question, but two important points about primary keys and clustered indexes are:
If a table already has a primary key backed by a clustered index (the default, although it can be changed), you cannot create another clustered index on that table.
If a table has no primary key yet but already has a clustered index, you cannot create the primary key as clustered.
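Both points come down to a table having at most one clustered index. A brief sketch of the second case, assuming SQL Server and a hypothetical table dbo.ClusterDemo:

```sql
-- Hypothetical table that is clustered before it gets a primary key.
CREATE TABLE dbo.ClusterDemo (Id int NOT NULL, Col1 int NOT NULL);
CREATE CLUSTERED INDEX CIX_ClusterDemo_Col1 ON dbo.ClusterDemo (Col1);

-- Would fail: the table already has a clustered index.
-- ALTER TABLE dbo.ClusterDemo
--     ADD CONSTRAINT PK_ClusterDemo PRIMARY KEY CLUSTERED (Id);

-- Works: with no index type given and a clustered index already present,
-- the primary key is enforced by a nonclustered unique index instead.
ALTER TABLE dbo.ClusterDemo
    ADD CONSTRAINT PK_ClusterDemo PRIMARY KEY (Id);

-- Lists one CLUSTERED index (on Col1) and one NONCLUSTERED index (the PK).
SELECT name, type_desc
FROM sys.indexes
WHERE object_id = OBJECT_ID(N'dbo.ClusterDemo');
```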

How to drop clustered property but retain primary key in a table. SQL Server 2005

I have the following key:
ALTER TABLE dbo.[Table] ADD CONSTRAINT PK_ID PRIMARY KEY CLUSTERED
(
ID ASC
)
So I have a clustered index and primary key on the ID column.
Now I need to drop the clustered index (I want to create a new clustered index on another column) but retain the primary key.
Is it possible?

It's not possible in one statement, but because DDL is transactional in MSSQL, you can simply do everything inside a transaction to prevent other sessions accessing the table while it has no primary key:
begin tran
alter table dbo.[Table] drop constraint pk_id
alter table dbo.[Table] add constraint pk_id primary key nonclustered (id)
commit tran
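To also cover the second half of the question (a new clustered index on another column), the same transaction could be extended roughly as follows. This is only a sketch: the table dbo.OrdersDemo, the column CreatedAt, and the index name are hypothetical, and it assumes no foreign keys reference the primary key. Creating the clustered index before re-adding the primary key avoids building the nonclustered index on the heap and then rebuilding it once the table is clustered.

```sql
-- Hypothetical starting point: primary key and clustered index both on ID.
CREATE TABLE dbo.OrdersDemo
(
    ID        int      NOT NULL,
    CreatedAt datetime NOT NULL,
    CONSTRAINT PK_OrdersDemo PRIMARY KEY CLUSTERED (ID ASC)
);

SET XACT_ABORT ON;
BEGIN TRAN;

-- Drop the clustered primary key; the table becomes a heap for a moment.
ALTER TABLE dbo.OrdersDemo DROP CONSTRAINT PK_OrdersDemo;

-- Cluster on the other column first.
CREATE CLUSTERED INDEX CIX_OrdersDemo_CreatedAt
    ON dbo.OrdersDemo (CreatedAt);

-- Re-add the primary key on the same column, now nonclustered.
ALTER TABLE dbo.OrdersDemo
    ADD CONSTRAINT PK_OrdersDemo PRIMARY KEY NONCLUSTERED (ID);

COMMIT TRAN;
```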

It is not possible, as the index is a physical implementation of the constraint.
