SQL Server Creating Index over Index

Just wanted to know what happens internally in SQL Server when we create,
Non Clustered over a Clustered Index.
Clustered Index over Non Clustered Index.
Non Clustered Index Over Non Clustered Index.
Please comment.

For all of the execution plans I have used the same base table filled with meaningless data:
CREATE TABLE dbo.T (ID INT NOT NULL, Filler CHAR(200) NULL);
INSERT dbo.T (ID, Filler)
SELECT ROW_NUMBER() OVER(ORDER BY Object_ID),
CAST(NULLIF(ABS(Object_ID) % 10, 0) AS CHAR(200))
FROM sys.all_objects
1. What happens internally in SQL Server when we create a Non Clustered over a Clustered Index.
This is just a normal index build: SQL Server sorts the data in the manner specified and builds the intermediate and root levels up from the leaf level. Each leaf row stores the clustering key (here ID) as a "pointer" back to the clustered index, which is what allows key lookups.
So once the clustered index is in place on dbo.T (ID), the execution plan for creating the nonclustered index shows the sort:
Hovering over the sort shows that it orders by Filler and then by ID, the clustering key, which makes the sort deterministic:
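The statements behind the plans in part 1 are not shown above. A minimal sketch of what they would look like, assuming the same index names that are used in part 2 of this answer:

```sql
-- Part 1: the clustered index exists first, then the nonclustered index is built on top of it.
CREATE CLUSTERED INDEX IX_T_ID ON dbo.T (ID);
CREATE NONCLUSTERED INDEX IX_T_Filler ON dbo.T (Filler);

-- To repeat the experiment starting from a heap (part 2), drop both again.
-- Dropping the nonclustered index first avoids rebuilding it when the clustered index goes.
DROP INDEX IX_T_Filler ON dbo.T;
DROP INDEX IX_T_ID ON dbo.T;
```

Enable "Include Actual Execution Plan" in the client before running the CREATE INDEX statements to see the sort operator.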
2. What happens internally in SQL Server when we create a Clustered Index over Non Clustered Index.
I think to explain this properly I need to explain how a nonclustered index works on a table with no clustered index. Such a table is called a "heap", which just means that the data is stored in no specific order, usually simply where there is free space as rows are inserted. Each row in a heap is identified by a row identifier (RID), a physical locator made up of the file, page and slot numbers rather than a key value. Without the constraints of an explicit clustered index, SQL Server is free to move a row when it no longer fits on its page, leaving a forwarding record behind at the old location (more reading on forwarding records). The nonclustered index then stores the RID at the leaf level so it has a way of performing lookups.
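As a rough way to see whether a heap currently contains forwarded rows, the physical-stats DMV can be queried. This is only a sketch, and it assumes dbo.T is still a heap and that you are connected to the database that holds it:

```sql
-- index_id 0 addresses the heap itself; forwarded_record_count is only
-- populated in SAMPLED or DETAILED mode.
SELECT index_type_desc,
       page_count,
       record_count,
       forwarded_record_count
FROM sys.dm_db_index_physical_stats(
         DB_ID(),               -- current database
         OBJECT_ID(N'dbo.T'),   -- the demo table
         0,                     -- heap
         NULL,                  -- all partitions
         'DETAILED');
```

On the freshly loaded demo table the count will normally be zero, since forwarding only happens when rows grow after being inserted.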
When you then create a clustered index on a heap, the table has to be rebuilt in the order of the columns you have specified, which means that every nonclustered index is rebuilt as well. To show this I first added the nonclustered index to dbo.T:
CREATE NONCLUSTERED INDEX IX_T_Filler ON dbo.T (Filler);
Unlike above, a table scan is done because there is no clustered index to use, and the sort done when creating the index does not include the ID column as it did above:
Then afterwards add the clustered index:
CREATE CLUSTERED INDEX IX_T_ID ON dbo.T (ID);
You can see in the execution plan that the nonclustered index is also rebuilt, so that its leaf level points to the new clustering key rather than to the row ID as it did previously. (Note that the second query is the same as in the first part, when the nonclustered index was built on the clustered index.)
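One way to watch the table change from a heap to a clustered index is to query the catalog before and after the CREATE CLUSTERED INDEX statement. A small sketch against the demo table:

```sql
-- Before the clustered index exists, dbo.T has a row with index_id = 0 and type_desc = 'HEAP'.
-- Afterwards that row is replaced by index_id = 1 with type_desc = 'CLUSTERED'.
SELECT i.name,
       i.index_id,
       i.type_desc
FROM sys.indexes AS i
WHERE i.object_id = OBJECT_ID(N'dbo.T')
ORDER BY i.index_id;
```

The nonclustered index keeps its own index_id throughout; only what its leaf rows point to changes.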
3. What happens internally in SQL Server when we create a Non Clustered Index Over Non Clustered Index.
Nonclustered indexes are completely independent of each other, so this is the same as 1 (or the first part of 2 if there is no clustered index). In other words, it does not matter how many nonclustered indexes already exist; the method of creating a new one remains the same.

Related

SQL: Add Primary Key to Non-Unique Index

Let's say a query is filtering on two fields and returning primary key values.
SELECT RowIdentifier
FROM Table
WHERE QualifierA = 'exampleA' AND QualifierB = 'exampleB'
Assuming the clustered index is not the primary key, would a non-unique index that contains QualifierA and QualifierB be best served by adding RowIdentifier as a key column (Scenario A and Scenario B)? Or would it be more appropriate to simply include it (Scenario C)?
Scenario A: Non-Unique, Non-Clustered
CREATE NONCLUSTERED INDEX IX_Table_QualifierA
ON [dbo].[Table] ([QualifierA],[QualifierB],[RowIdentifier])
Scenario B: Unique, Non-Clustered
CREATE UNIQUE NONCLUSTERED INDEX IX_Table_QualifierA
ON [dbo].[Table] ([QualifierA],[QualifierB],[RowIdentifier])
Scenario C:
CREATE NONCLUSTERED INDEX IX_Table_QualifierA
ON [dbo].[Table] ([QualifierA],[QualifierB])
INCLUDE ([RowIdentifier])
Finally, I'm assuming that if the primary key were the clustered index, neither would be necessary. Is this accurate?
If there is a clustered index, its key columns are automatically carried in every nonclustered index on the table. You can add them explicitly, but it is not required.
The UNIQUE index simply enforces uniqueness. The PK should already have this constraint. You do not need to re-enforce it in every index.
If you are including the PK in your WHERE clause, the optimizer will almost certainly use the PK index to find that row because it is guaranteed to return the fewest results, so adding it as a key column gains you nothing for lookups. It could also skew the cardinality estimates and make SQL Server think the index is more selective than it really is.
For the above reasons, I would select Option C
CREATE NONCLUSTERED INDEX IX_Table_QualifierA
ON [dbo].[Table] ([QualifierA],[QualifierB])
INCLUDE ([RowIdentifier])
I would use this regardless of which column is clustered. This will give you the performance, ensure the index continues to cover the query regardless of the clustered index, and make it explicit what the index is used for.
I'm wondering what's more appropriate? A non-clustered unique index incorporating all three fields, or a non-clustered non-unique index incorporating just the two fields(QualifierA & QualifierB) but including the PrimaryKey.
There's a third option. A non-clustered, non-unique index incorporating all three fields.
When you make an index, the indexed columns are copied into a separate structure so the server can get at those columns with ease. If you only have QualifierA and QualifierB in the index, it will find the rows in that index that meet your criteria and then go back to the main table to pick up the RowIdentifier. Instead, put all three in the index to improve performance.
Remember to put QualifierA and QualifierB before RowIdentifier in your index. The order of the columns determines how the data is ordered.
Try it out with some test data if you like, and look at the query plan to see what it's doing.
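A minimal test harness for that kind of experiment might look like the following. The table dbo.QualifierDemo, its CreatedAt column and all index names are hypothetical stand-ins for the asker's schema, which is not shown; the clustered index is deliberately placed on a column other than the primary key to match the question's assumption.

```sql
-- Hypothetical table: the primary key is nonclustered, the clustered index is on CreatedAt.
CREATE TABLE dbo.QualifierDemo
(
    RowIdentifier INT IDENTITY(1,1) NOT NULL,
    QualifierA    VARCHAR(20) NOT NULL,
    QualifierB    VARCHAR(20) NOT NULL,
    CreatedAt     DATETIME NOT NULL,
    CONSTRAINT PK_QualifierDemo PRIMARY KEY NONCLUSTERED (RowIdentifier)
);
CREATE CLUSTERED INDEX CIX_QualifierDemo_CreatedAt ON dbo.QualifierDemo (CreatedAt);

-- Scenario C from the question: two key columns, RowIdentifier included at the leaf.
CREATE NONCLUSTERED INDEX IX_QualifierDemo_AB
ON dbo.QualifierDemo (QualifierA, QualifierB)
INCLUDE (RowIdentifier);

-- Load representative test data here, then compare logical reads and plans per scenario.
SET STATISTICS IO ON;
SELECT RowIdentifier
FROM dbo.QualifierDemo
WHERE QualifierA = 'exampleA' AND QualifierB = 'exampleB';
SET STATISTICS IO OFF;
```

Swapping the index definition for Scenario A or B and rerunning the SELECT lets you compare the plans side by side.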

Covering index adventureworks 2008 R2 - Why is column WorkorderID not included?

A covering non-clustered index is created to meet the requirements of a given query.
If a column the query needs is not present in an index, SQL Server has to execute a key lookup. To prevent the key lookup, a covering index is created, but what I don't understand is why the following is a covering index when one of the columns is not included.
Database: Adventureworks 2008 R2
Table: Production.WorkOrder
Index name: IX_WorkOrder_ProductID
Query:
SELECT WorkOrderID,StartDate
FROM Production.WorkOrder
WHERE ProductID = 757;
The index IX_WorkOrder_ProductID starts with only the column ProductID
It's dropped and re-created as follows:
CREATE INDEX IX_WorkOrder_ProductID
ON Production.WorkOrder (ProductID)
INCLUDE (StartDate);
After executing the query, the actual execution plan uses an Index Seek (NonClustered) with a cost of 100%.
My question is: why is it not required to add the column WorkOrderID to the index IX_WorkOrder_ProductID as well? Why is it a covering index without WorkOrderID?
Since WorkOrderID is the clustering key of the table Production.WorkOrder, it is already and automatically included in every single non-clustered index that you create on that table.
There's really no need to include that again - it's already there.
So your new index IX_WorkOrder_ProductID is in fact covering the query - the WorkOrderID is present due to the fact the clustering key is present in every non-clustered index anyway, the ProductID column is part of the index definition, and StartDate is an included column.
The fact that the clustering key column(s) are included in every single non-clustered index on that table is one more reason why the clustering key should be chosen very carefully, and should be as small as possible - ideally an INT or BIGINT.
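To check this for yourself, you can ask for the actual plan of the query from the question and look for the absence of a Key Lookup operator. A small sketch, assuming the index has been re-created with INCLUDE (StartDate) as described above:

```sql
-- Returns the result set plus the actual execution plan as XML.
SET STATISTICS XML ON;

SELECT WorkOrderID, StartDate
FROM Production.WorkOrder
WHERE ProductID = 757;

SET STATISTICS XML OFF;
```

If StartDate is removed from the index again, the same query would be expected to show a Key Lookup against the clustered index, while WorkOrderID still comes from the nonclustered index.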

SQL Server: Why Clustered Index Scan and not Table Scan?

I have following table schema -
CREATE TABLE [dbo].[TEST_TABLE]
(
[TEST_TABLE_ID] [int] IDENTITY(1,1) NOT NULL,
[NAME] [varchar](40) NULL,
CONSTRAINT [PK_TEST_TABLE] PRIMARY KEY CLUSTERED
(
[TEST_TABLE_ID] ASC
)
)
I have inserted a large amount of data into TEST_TABLE.
As I have marked the TEST_TABLE_ID column as the primary key, a clustered index will be created on TEST_TABLE_ID.
When I run the following query, the execution plan shows a Clustered Index Scan, which is expected.
SELECT * FROM TEST_TABLE WHERE TEST_TABLE_ID = 34
But, when I am running following query I was expecting Table Scan as NAME column does not have any index:
SELECT * FROM TEST_TABLE WHERE NAME LIKE 'a%'
But in execution plan it is showing Clustered Index Scan.
As the NAME column does not have any index, why is it accessing the clustered index?
I believe this is happening because the clustered index resides on the data pages.
Can anyone tell me if my assumption is correct? Or is there any other reason?
A clustered index is the index that stores all the table data. So a table scan is the same as a clustered index scan.
In a table without a clustered index (a "heap"), a table scan requires crawling through all data pages. That is what the query optimizer calls a "table scan".
As others explained already, for a table that has a clustered index, a Clustered Index Scan means a Table Scan.
In other words, the table is the clustered index.
What you have wrong is your first query execution plan:
SELECT *
FROM TEST_TABLE
WHERE TEST_TABLE_ID = 34 ;
It does a Clustered Index Seek and not a Scan. It doesn't have to search (scan) the whole table (clustered index), it goes directly to the point (seeks) and checks if a row with id=34 exists.
You can see a simple test in SQL-Fiddle, and how the two execution plans differ.
The table is stored as a clustered index. The only way to scan the table is to scan the clustered index. Only tables with no clustered index can have a "table scan" per se.
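The difference between the operators can be reproduced with a heap copy of the table. This is only a sketch; dbo.TEST_TABLE_HEAP is a hypothetical name, and the operators in the comments are what the answers above lead one to expect rather than captured output.

```sql
-- SELECT ... INTO creates the new table as a heap (no clustered index).
SELECT TEST_TABLE_ID, NAME
INTO dbo.TEST_TABLE_HEAP
FROM dbo.TEST_TABLE;

-- Heap, no usable index: expected operator is Table Scan.
SELECT * FROM dbo.TEST_TABLE_HEAP WHERE NAME LIKE 'a%';

-- Clustered table, no index on NAME: expected operator is Clustered Index Scan.
SELECT * FROM dbo.TEST_TABLE WHERE NAME LIKE 'a%';

-- Clustered table, predicate on the clustering key: expected operator is Clustered Index Seek.
SELECT * FROM dbo.TEST_TABLE WHERE TEST_TABLE_ID = 34;
```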
It is because this table has a clustered index, so SQL Server scans the entire clustered index to find the rows that match the WHERE clause. However, you should be seeing a missing index message.
When you build a Clustered Index on a table, then SQL Server logically orders the rows of that table based on the Clustered Index Key, which in your case is Test_Table_ID.
However, when you see the Clustered Index Scan operator, this COULD be a little misleading. If certain conditions are met (which equate to SQL Server not caring about the order of the data), then SQL Server is still able to perform an unordered allocation scan, which is more similar to a table scan than to a clustered index scan, as it actually reads the leaf level of the CI (the table's data pages) in allocation order, based on the IAM chain, as opposed to following the pointers in the index. This can potentially give you a performance improvement, as fragmentation (pages being out of physical order) does not decrease performance.
To see if this is happening, look at the Ordered property in the execution plan. If this is set to False, then you have an unordered allocation scan.

What are the difference between clustered and a non-clustered index? [duplicate]

What are the differences between a clustered and a non-clustered index?
A clustered index is a special type of index that reorders the way records in the table are physically stored. Therefore table can have only one clustered index. The leaf nodes of a clustered index contain the data pages.
A non clustered index is a special type of index in which the logical order of the index does not match the physical stored order of the rows on disk. The leaf node of a non clustered index does not consist of the data pages. Instead, the leaf nodes contain index rows.
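In DDL terms the distinction looks like this. The table dbo.IndexDemo and the index names are hypothetical and exist only for illustration:

```sql
CREATE TABLE dbo.IndexDemo
(
    ID   INT NOT NULL,
    Name VARCHAR(40) NULL
);

-- The clustered index: its leaf level is the table's data pages, ordered by ID.
CREATE CLUSTERED INDEX CIX_IndexDemo_ID ON dbo.IndexDemo (ID);

-- A nonclustered index: a separate structure whose leaf rows hold Name plus the clustering key.
CREATE NONCLUSTERED INDEX IX_IndexDemo_Name ON dbo.IndexDemo (Name);

-- A second clustered index on the same table is rejected, so this line is left commented out.
-- CREATE CLUSTERED INDEX CIX_IndexDemo_Name ON dbo.IndexDemo (Name);
```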

What is advantages of non clustered index over primary key (clustered index)

I have a table that stores forum data (so normally no edits or updates, just inserts) with a primary key column which, as we know, is a clustered index.
Please tell me, will I get any advantage if I create a non-clustered index on that same primary key column?
EDIT: my table currently has around 60000 records. Which is better: adding the non-clustered index to it in place, or creating an identical new table, creating the index there, and then copying the records from the old table to the new one?
Thanks
Every table should have a clustered index
Non-clustered indexes allow INCLUDEs which is very useful
Non-clustered indexes allow filtering in SQL Server 2008+
Notes:
Primary key is a constraint which happens to be a clustered index by default
One clustered index only, many non-clustered indexes
One advantage: you can INCLUDE other columns in the index.
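As an illustration of INCLUDE and of filtered indexes, here is a short sketch. The forum table dbo.ForumPost and every column in it are hypothetical, since the asker's schema is not shown:

```sql
CREATE TABLE dbo.ForumPost
(
    PostID    INT IDENTITY(1,1) NOT NULL
        CONSTRAINT PK_ForumPost PRIMARY KEY CLUSTERED,
    ThreadID  INT NOT NULL,
    PostedAt  DATETIME NOT NULL,
    IsDeleted BIT NOT NULL,
    Body      VARCHAR(MAX) NULL
);

-- INCLUDE: PostedAt is stored at the leaf level only, so it can be returned
-- without a key lookup but does not widen the index key.
CREATE NONCLUSTERED INDEX IX_ForumPost_ThreadID
ON dbo.ForumPost (ThreadID)
INCLUDE (PostedAt);

-- Filtered index (SQL Server 2008 and later): only rows matching the WHERE clause are indexed.
CREATE NONCLUSTERED INDEX IX_ForumPost_ThreadID_Live
ON dbo.ForumPost (ThreadID)
WHERE IsDeleted = 0;
```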
A clustered index specifies the physical storage order of the table data (this is why there can only be one clustered index per table).
If there is no clustered index, inserts will typically be faster since the data doesn't have to be stored in a specific order but can just be appended at the end of the table.
On the other hand, index searches on the key column will typically be slower, since the searches cannot use the advantages of the clustered index.
The only possible advantage that I can see comes from the fact that the entries on the leaf pages of a nonclustered index are not as wide. They only contain the index columns, while the clustered index's leaf pages are the actual rows of data. Therefore, if you need something like select count(your_column_name) from your_table, then scanning the nonclustered index will involve a considerably smaller number of pages. Or, if the number of index columns is greater than one and you run any query which does not need data from non-indexed columns, then again a nonclustered index scan will be faster.
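The page-count difference can be measured by forcing each index in turn and comparing logical reads. This is a sketch on a hypothetical table dbo.CountDemo; the index hints are there purely to make the comparison explicit, not as a recommendation for production queries.

```sql
-- Wide rows make the gap between clustered and nonclustered leaf pages visible.
CREATE TABLE dbo.CountDemo
(
    ID     INT IDENTITY(1,1) NOT NULL
        CONSTRAINT PK_CountDemo PRIMARY KEY CLUSTERED,
    Filler CHAR(200) NULL
);

INSERT dbo.CountDemo (Filler)
SELECT CAST(name AS CHAR(200))
FROM sys.all_objects;

-- Narrow nonclustered index on the same column as the primary key.
CREATE NONCLUSTERED INDEX IX_CountDemo_ID ON dbo.CountDemo (ID);

SET STATISTICS IO ON;
SELECT COUNT(ID) FROM dbo.CountDemo WITH (INDEX(1));               -- index_id 1 = clustered index
SELECT COUNT(ID) FROM dbo.CountDemo WITH (INDEX(IX_CountDemo_ID)); -- narrow nonclustered index
SET STATISTICS IO OFF;
```

Compare the "logical reads" figures in the messages output for the two SELECT statements.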