Relationship of Primary Key and Clustered Index - sql

Can a TABLE have a primary key without a clustered index?
And can a TABLE have a clustered index without having a primary key?
Can anybody briefly tell me the relationship between primary key and clustered index?

A primary key is a logical concept - it's the unique identifier for a row in a table. As such, it has a bunch of attributes - it may not be null, and it must be unique. Of course, as you're likely to be searching for records by their unique identifier a lot, it would be good to have an index on the primary key.
A clustered index is a physical concept - it's an index that affects the order in which records are stored on disk. This makes it a very fast index when accessing data, though it may slow down writes if the clustering key is not sequential (for example, a random GUID), because new rows then have to be inserted in the middle of the existing data.
Yes, you can have a primary key without a clustered index - and sometimes, you may want to (for instance when your primary key is a combination of foreign keys on a joining table, and you don't want to incur the disk shuffle overhead when writing).
Yes, you can create a clustered index on columns that aren't a primary key.

A table can have a primary key that is not clustered, and a clustered table does not require a primary key. So the answer to both questions is yes.
A clustered index stores all columns at the leaf level. That means a clustered index contains all data in the table. A table without a clustered index is called a heap.
A primary key is enforced by a unique index that is clustered by default. "By default" means that when you create a primary key on a table that is not clustered yet, the primary key is created as a unique clustered index, unless you explicitly specify the nonclustered option.
An example, where t1 has a nonclustered primary key plus a separate clustered index, and t2 is a heap (not clustered) that still has a primary key:
create table t1 (id int not null, col1 int);
alter table t1 add constraint PK_T1 primary key nonclustered (id);
create clustered index IX_T1_COL1 on t1 (col1);
create table t2 (id int not null, col1 int);
alter table t2 add constraint PK_T2 primary key nonclustered (id);
Example at SQL Fiddle.
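To check what the script above actually produced, the sys.indexes catalog view can be queried. This is only a sketch: it assumes t1 and t2 were created in the current database exactly as shown, and the column aliases are illustrative.

```sql
-- Assumes t1 and t2 from the example above exist in the current database.
SELECT OBJECT_NAME(i.object_id) AS table_name,
       i.name                   AS index_name,
       i.type_desc,             -- HEAP, CLUSTERED or NONCLUSTERED
       i.is_primary_key,
       i.is_unique
FROM sys.indexes AS i
WHERE i.object_id IN (OBJECT_ID('t1'), OBJECT_ID('t2'))
ORDER BY table_name, i.index_id;
```

For t1 this should list the clustered index IX_T1_COL1 next to the nonclustered PK_T1. For t2 it should list a HEAP entry (with no index name) next to the nonclustered PK_T2.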

First of all, take a look at Index-Organized Tables and Clustered Indexes. Actually, I recommend reading the whole Use the Index Luke! site from the beginning until you reach the clustering topic to really understand what's going on.
Now, to your questions...
Can a TABLE have primary key without Clustered Index?
Yes, use the NONCLUSTERED keyword when declaring your primary key to make a heap-based table. For example:
CREATE TABLE YOUR_TABLE (
YOUR_PK int PRIMARY KEY NONCLUSTERED
-- Other fields...
);
It is unfortunate that a lot of people seem to just accept the default (which is CLUSTERED), even though in many cases a heap-based table would actually be better (as discussed in the linked article).
and Can a TABLE have Clustered Index without primary key?
Unlike some other DBMSes, MS SQL Server will let you have a clustering index that is different from the primary key, or even a clustering index on a table with no primary key at all.
The following example creates a clustering index separate from the PK, that has a UNIQUE constraint on top of it, which is what you'd probably want in most cases:
CREATE TABLE YOUR_TABLE (
YOUR_PK int PRIMARY KEY,
YOUR_CLUSTERED_KEY int NOT NULL UNIQUE CLUSTERED
-- Other fields...
);
If you choose a non-unique clustering index (using CREATE CLUSTERED INDEX ...), MS SQL Server will automatically make it unique by adding a hidden field (the "uniquifier") to it.
Please note that the benefits of clustering are most visible for range scans. If you use a clustering index that doesn't "align" with range scans done by your client application(s) (such as when over-relying on the hidden column mentioned above, or clustering on a surrogate key), you are pretty much defeating the purpose of clustering.
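As a rough illustration of a non-unique clustering index that does align with range scans, consider the sketch below. The table EVENT_LOG, its columns, and the index name are hypothetical and not from the original answer; the assumption is that the application mostly reads rows by date range.

```sql
-- Hypothetical table: the primary key is nonclustered so that the
-- clustered slot stays free for the column used in range scans.
CREATE TABLE EVENT_LOG (
    EVENT_ID   int          NOT NULL PRIMARY KEY NONCLUSTERED,
    CREATED_AT datetime     NOT NULL,
    PAYLOAD    varchar(200) NULL
);

-- Non-unique clustered index; SQL Server adds the hidden field itself
-- wherever CREATED_AT values collide.
CREATE CLUSTERED INDEX CIX_EVENT_LOG_CREATED_AT ON EVENT_LOG (CREATED_AT);

-- A range scan that follows the clustering order.
SELECT EVENT_ID, CREATED_AT, PAYLOAD
FROM EVENT_LOG
WHERE CREATED_AT >= '20240101' AND CREATED_AT < '20240201';
```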
Can anybody briefly tell me the relationship of primary key and clustered index?
Under MS SQL Server, the primary key is clustered by default. You can change that default, as discussed above.

Answers taken from MSDN Using Clustered Indexes
Can a TABLE have primary key without Clustered Index? - Yes.
Can a TABLE have Clustered Index without primary key? - Yes.
A Primary Key is a constraint that ensures uniqueness of the values, such that a row can always be identified specifically by that key.
An index is automatically assigned to a primary key (as rows are often "looked up" by their primary key).
A non-clustered index is a logical ordering of rows, by one (or more) of its columns. Think of it as effectively another "copy" of the table, ordered by whatever columns the index is across.
A clustered index is when the actual table is physically ordered by a particular column. A table will not always have a clustered index (i.e. while it'll be physically ordered by something, that thing might be undefined). A table cannot have more than one clustered index, although it can have a single composite clustered index (i.e. the table is physically ordered by, e.g., Surname, Firstname, DOB).
The PK is often (but not always) a clustered index.

For what it may be worth, in MS SQL Server all columns in the primary key must be defined as NOT NULL, while creating a unique clustered index does not require this. Not sure about other DB systems though.
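The difference can be tried out with a sketch like the following. The table NULLABLE_DEMO and the constraint and index names are hypothetical; the last statement is expected to be rejected, so it is best run on its own.

```sql
-- Hypothetical table with a nullable column.
CREATE TABLE NULLABLE_DEMO (
    CODE int         NULL,
    NAME varchar(20) NULL
);

-- Accepted: a unique clustered index on a nullable column.
-- SQL Server treats NULL as a value here, so at most one row may have CODE = NULL.
CREATE UNIQUE CLUSTERED INDEX CIX_NULLABLE_DEMO_CODE ON NULLABLE_DEMO (CODE);

-- Expected to fail: a primary key cannot be defined on a nullable column.
ALTER TABLE NULLABLE_DEMO
    ADD CONSTRAINT PK_NULLABLE_DEMO PRIMARY KEY NONCLUSTERED (CODE);
```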

This might not directly answer the question, but some important points about primary keys and clustered indexes are:
If there is already a primary key backed by a clustered index (which is the default, although we can change that), then we cannot create another clustered index on that table.
Conversely, if there is no primary key yet but the table already has a clustered index, then we cannot create the primary key as clustered; it has to be nonclustered.
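Both cases can be reproduced with a short sketch. The table and index names (PK_FIRST, CIX_FIRST and so on) are made up for illustration; the statement marked as failing is expected to raise an error, so run it separately from the rest.

```sql
-- Case 1: the primary key takes the clustered slot by default.
CREATE TABLE PK_FIRST (ID int NOT NULL PRIMARY KEY, COL1 int NOT NULL);

-- Expected to fail: a table cannot have more than one clustered index.
CREATE CLUSTERED INDEX CIX_PK_FIRST_COL1 ON PK_FIRST (COL1);

-- Case 2: the clustered index exists before the primary key.
CREATE TABLE CIX_FIRST (ID int NOT NULL, COL1 int NOT NULL);
CREATE CLUSTERED INDEX CIX_CIX_FIRST_COL1 ON CIX_FIRST (COL1);

-- Succeeds without the CLUSTERED keyword: the primary key is created as nonclustered.
ALTER TABLE CIX_FIRST ADD CONSTRAINT PK_CIX_FIRST PRIMARY KEY (ID);
```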

Related

Change clustered index without touching primary key

We have an existing database where we would like to change the clustered index to a unique, monotonically increasing field (as it should have been from the start), but we don't want to change the primary key because there is data referencing this primary key.
We have added a new column SequentialId and populated it with data, to serve as our new clustered index.
But how do we change the clustered index? If possible, we would like to either replace the existing clustered index OR add SequentialId to the current index as the first column.
How do we go about this? It seems we cannot change the clustered index without dropping the primary key (which we can't do).
Using the ALTER TABLE command, drop the PRIMARY KEY constraint (which is not the same as dropping the clustered index that enforces the PRIMARY KEY constraint) and recreate it with the additional columns:
ALTER TABLE <Table_Name>
DROP CONSTRAINT <constraint_name>
ALTER TABLE <Table_Name>
ADD CONSTRAINT <constraint_name> PRIMARY KEY (<Column1>,<Column2>)
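An alternative that leaves the primary key columns unchanged is to recreate the primary key as nonclustered and cluster on the new column instead. The following is only a sketch under stated assumptions: dbo.Invoice, PK_Invoice, InvoiceGuid and CIX_Invoice_SequentialId are hypothetical names (only SequentialId comes from the question), and any foreign keys that reference the primary key are assumed to have been dropped beforehand and recreated afterwards, because SQL Server refuses to drop a constraint that is still referenced.

```sql
-- Hypothetical starting point: a table clustered on its primary key.
CREATE TABLE dbo.Invoice (
    InvoiceGuid  uniqueidentifier  NOT NULL
        CONSTRAINT PK_Invoice PRIMARY KEY CLUSTERED,
    SequentialId int IDENTITY(1,1) NOT NULL,
    Amount       money             NULL
);

-- 1. Drop the clustered primary key; the table becomes a heap.
--    (Assumes no foreign key currently references PK_Invoice.)
ALTER TABLE dbo.Invoice DROP CONSTRAINT PK_Invoice;

-- 2. Cluster the table on the new sequential column.
CREATE UNIQUE CLUSTERED INDEX CIX_Invoice_SequentialId
    ON dbo.Invoice (SequentialId);

-- 3. Put the primary key back on the same column, now nonclustered.
ALTER TABLE dbo.Invoice
    ADD CONSTRAINT PK_Invoice PRIMARY KEY NONCLUSTERED (InvoiceGuid);
```

Creating the clustered index before re-adding the primary key means the nonclustered index is built only once, on top of the final clustering key.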

Does a multicolumn constraint create indexes for each column or a unified index?

I have a unique constraint over two columns in an association table, e.g.:
CREATE TABLE PersonImage (
id serial PRIMARY KEY,
person_id integer REFERENCES Person (id),
image_id integer REFERENCES Image (id),
CONSTRAINT person_image_uc UNIQUE (person_id, image_id));
Does the constraint make indexes for person_id and image_id independently or does it create a unified index?
My aim is make it faster to search through person_id and image_id independently by creating indexes, but I don't want to create more overhead if this is automatically done in the constraint.
A UNIQUE constraint is almost, but not quite, the same as a UNIQUE index on the same columns in the same order; there are several subtle differences.
See:
How does PostgreSQL enforce the UNIQUE constraint / what type of index does it use?
Does a Postgres UNIQUE constraint imply an index?
Effectively, you get a single multicolumn unique index on (person_id, image_id) with your constraint. Since you also want to search on image_id independently, add another index on (image_id) or (image_id, person_id). Detailed explanation:
Is a composite index also good for queries on the first field?
This is functionally equivalent to:
create unique index unq_personimage_person_image on personimage(person_id, image_id);
The unique index is used to implement the constraint. Because the constraint is on two columns, both must be in the index.
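Putting the advice above together for PostgreSQL might look like the sketch below. The Person and Image tables are reduced to a bare id column so the script runs on its own, and the index name person_image_image_id_idx is an assumption, not something from the question.

```sql
-- Minimal parent tables so the sketch is self-contained (real tables would have more columns).
CREATE TABLE Person (id serial PRIMARY KEY);
CREATE TABLE Image  (id serial PRIMARY KEY);

CREATE TABLE PersonImage (
    id        serial PRIMARY KEY,
    person_id integer REFERENCES Person (id),
    image_id  integer REFERENCES Image (id),
    CONSTRAINT person_image_uc UNIQUE (person_id, image_id)
);

-- The constraint's index leads with person_id, so add one that leads with image_id.
CREATE INDEX person_image_image_id_idx ON PersonImage (image_id);

-- List the indexes on the table; unquoted identifiers are stored in lower case.
SELECT indexname, indexdef
FROM pg_indexes
WHERE tablename = 'personimage';
```

The listing should show the primary key index, the single two-column index behind person_image_uc, and the extra index on image_id.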

Primary key constraint and unique constraint defined on the same column of a table

Today I came across a scenario where I observed that SQL Server actually allows us to create both a primary key constraint and a unique constraint on the same column. I expected that it wouldn't throw any error syntactically.
I tested it out and it seems to work fine.
Sample code:
CREATE TABLE testtable
(
id INT IDENTITY(1,1),
name VARCHAR(10),
CONSTRAINT PK_ID PRIMARY KEY (id),
CONSTRAINT uk_id UNIQUE (id)
)
I also saw that it created a PK constraint and unique constraint separately.
I wanted to get your thoughts on what would be the advantages of having this unique key separately created?
Would it be a good practice to always create a unique constraint along with the primary key? If the answer is "No", in what cases would it be advantageous?
I feel like it's a very basic question, but I wanted to get some experts' thoughts and advice.
Thank you.
A unique constraint is a unique index that's listed as a constraint object in the database.
This will exist separately from your clustered index (your primary key in this instance).
This can be of some benefit, e.g. for other tables' foreign key lookups against this table, as the unique index created will be smaller than the clustered index (if there are other columns in your table).
Adding nonclustered index on primary keys
Unique Constraints and Unique Indexes
I think you'd have a harder time finding an advantage if your primary key wasn't also the clustering key, in which case you would be adding a redundant unique nonclustered index (which is allowed even if one isn't a primary key).
The question has been edited to make the other two answers a bit misleading. The edit removed the explicit (and sort-of redundant) CLUSTERED declaration for the primary key.
In SQL Server, primary keys are not necessarily clustered. Primary keys have two characteristics:
They are NOT NULL
They are UNIQUE
In some databases, a PRIMARY KEY declaration necessarily creates a clustered index. In SQL Server, this is the default behavior, but it is not required:
CLUSTERED | NONCLUSTERED
Indicate that a clustered or a nonclustered index is created for the
PRIMARY KEY or UNIQUE constraint. PRIMARY KEY constraints default
to CLUSTERED, and UNIQUE constraints default to NONCLUSTERED.
Hence, the UNIQUE index is redundant to the PRIMARY KEY definition. I can only imagine that someone would create it first, and then decide to make the column a PRIMARY KEY, forgetting to remove the clustered index. One reason this might happen is if code used an INDEX hint with an explicit name for the index.
There is no advantage; the unique key is redundant with the clustered primary key.
Typically unique keys are used to enforce unique constraints that are not on the primary key, or foreign keys that require a different key than the primary key.
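A sketch of that more typical use: a unique constraint on a column other than the primary key, which a foreign key in another table then references. The tables customer and newsletter_signup and all constraint names are hypothetical.

```sql
-- Hypothetical parent table: surrogate primary key plus a unique business key.
CREATE TABLE customer
(
    id    INT IDENTITY(1,1) NOT NULL,
    email VARCHAR(100)      NOT NULL,
    CONSTRAINT PK_customer PRIMARY KEY (id),
    CONSTRAINT UQ_customer_email UNIQUE (email)
);

-- Hypothetical child table: the foreign key targets the unique key, not the primary key.
CREATE TABLE newsletter_signup
(
    id    INT IDENTITY(1,1) NOT NULL,
    email VARCHAR(100)      NOT NULL,
    CONSTRAINT PK_newsletter_signup PRIMARY KEY (id),
    CONSTRAINT FK_signup_customer_email
        FOREIGN KEY (email) REFERENCES customer (email)
);
```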

Does SQL Server create a nonclustered index by default?

Yes, it is a duplicate of this. But I just need a clarification on this article by Pinal Dave, which says the following:
Scenario 4: Primary Key defaults to Clustered Index with other index
defaults to Non-clustered index
In this case we will create two indexes on the both the tables but we
will not specify the type of the index on the columns. When we check
the results we will notice that Primary Key is automatically defaulted
to Clustered Index and another column as a Non-clustered index.
-- Case 4 Primary Key and Defaults
USE TempDB
GO
-- Create table
CREATE TABLE TestTable
(ID INT NOT NULL PRIMARY KEY,
Col1 INT NOT NULL UNIQUE)
GO
-- Check Indexes
SELECT OBJECT_NAME(OBJECT_ID) TableObject,
[name] IndexName,
[Type_Desc]
FROM sys.indexes
WHERE OBJECT_NAME(OBJECT_ID) = 'TestTable'
GO
-- Clean up
DROP TABLE TestTable
GO
The only indexes that get created automatically:
the clustered index on your primary key (unless you specify otherwise - if you define your primary key to be nonclustered, then a nonclustered index will be created)
a unique nonclustered index when you apply a UNIQUE CONSTRAINT to a column (or set of columns)
Just to spell it out - the result of Pinal Dave's example is a pair of indexes similar to the following:
TestTable PK__TestTabl__3214EC2703317E3D CLUSTERED
TestTable UQ__TestTabl__A259EE55060DEAE8 NONCLUSTERED
Which can be explained as follows:
PK Clustered
If a table is created with a primary key, then it is a Clustered Table, and the Clustered Index is defaulted to the Primary Key unless you specify otherwise.
(Tables without a Clustered Index are Heaps)
UQ Nonclustered
SQL Server does not usually create any non-clustered indexes on a table by default.
However, as Marc has pointed out, because the table has a column with a UNIQUE constraint, (Col1 INT NOT NULL UNIQUE), MS SQL implements the constraint as a unique, non-clustered index on that column.
See also: Is the Sql Server Unique Key also an Index?

MySQL foreign key question

Does defining a foreign key also define an index? I have MySQL v5.1.46 and I am looking at the MySQL Administrator tool, which shows the foreign key as an index, so I wanted to confirm.
If there already is a usable index (an index where the foreign key columns are listed as the first columns in the same order) then a new index is not created.
If there is no usable index then creating a foreign key also creates an index.
This is covered in the documentation.
InnoDB requires indexes on foreign keys and referenced keys so that foreign key checks can be fast and not require a table scan. In the referencing table, there must be an index where the foreign key columns are listed as the first columns in the same order. Such an index is created on the referencing table automatically if it does not exist. (This is in contrast to some older versions, in which indexes had to be created explicitly or the creation of foreign key constraints would fail.) index_name, if given, is used as described previously.
Yes, MySQL 5.1 automatically creates an index on the referencing table when you define a foreign key constraint. MySQL requires an index on both the referencing table and referenced table for foreign keys.
Note however that the index is created automatically only on the referencing table, and not on the referenced table. MySQL won't allow you to create a foreign key that references a field in the referenced table that cannot use an index:
CREATE TABLE orders (
id int PRIMARY KEY,
code int,
name varchar(10)
) ENGINE=INNODB;
CREATE TABLE order_details (
detail_id int PRIMARY KEY,
order_code int,
value int,
FOREIGN KEY (order_code) REFERENCES orders(code)
) ENGINE=INNODB;
ERROR 1005 (HY000): Can't create table 'test.order_details'
This is not very common, since you'd often be creating foreign key constraints that reference the primary key of the referenced table, and primary keys are indexed automatically. However it is probably worth keeping in mind.
Creating an index on the code field of the orders table would solve the problem:
CREATE INDEX ix_orders_code ON orders(code);
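For reference, a version of the example that should go through in one pass declares the index on the referenced column up front. This is a sketch based on the tables above; only the inline INDEX clause and the final SHOW INDEX statement are additions.

```sql
-- Referenced table, with the index on code declared inline.
CREATE TABLE orders (
    id   int PRIMARY KEY,
    code int,
    name varchar(10),
    INDEX ix_orders_code (code)
) ENGINE=INNODB;

-- Referencing table: no explicit index on order_code is declared here.
CREATE TABLE order_details (
    detail_id  int PRIMARY KEY,
    order_code int,
    value      int,
    FOREIGN KEY (order_code) REFERENCES orders(code)
) ENGINE=INNODB;

-- Lists the primary key plus the index created automatically for the foreign key.
SHOW INDEX FROM order_details;
```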