SQL Server: Why Clustered Index Scan and not Table Scan?

I have the following table schema:
CREATE TABLE [dbo].[TEST_TABLE]
(
[TEST_TABLE_ID] [int] IDENTITY(1,1) NOT NULL,
[NAME] [varchar](40) NULL,
CONSTRAINT [PK_TEST_TABLE] PRIMARY KEY CLUSTERED
(
[TEST_TABLE_ID] ASC
)
)
I have inserted a large amount of data into TEST_TABLE.
Because I marked the TEST_TABLE_ID column as the primary key, a clustered index is created on TEST_TABLE_ID.
When I run the following query, the execution plan shows a Clustered Index Scan, which is expected.
SELECT * FROM TEST_TABLE WHERE TEST_TABLE_ID = 34
But when I run the following query, I expected a Table Scan, as the NAME column does not have an index:
SELECT * FROM TEST_TABLE WHERE NAME LIKE 'a%'
But the execution plan shows a Clustered Index Scan.
As the NAME column does not have an index, why is it accessing the clustered index?
I believe this happens because the clustered index resides on the data pages.
Can anyone tell me if my assumption is correct, or is there another reason?

A clustered index is the index that stores all the table data. So a table scan is the same as a clustered index scan.
In a table without a clustered index (a "heap"), a table scan requires crawling through all data pages. That is what the query optimizer calls a "table scan".
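A minimal sketch of that distinction, assuming two throwaway tables (DEMO_HEAP and DEMO_CLUSTERED are hypothetical names, not from the question). GO is the batch separator used by SSMS and sqlcmd; SET SHOWPLAN_TEXT has to be the only statement in its batch, and while it is ON the queries are compiled but not executed.

```sql
-- Hypothetical demo tables: one heap, one with a clustered primary key.
CREATE TABLE dbo.DEMO_HEAP
(
    ID   int         NOT NULL,
    NAME varchar(40) NULL
);
CREATE TABLE dbo.DEMO_CLUSTERED
(
    ID   int         NOT NULL PRIMARY KEY CLUSTERED,
    NAME varchar(40) NULL
);
GO
SET SHOWPLAN_TEXT ON;
GO
-- No index on NAME in either table, so every row has to be examined.
SELECT * FROM dbo.DEMO_HEAP      WHERE NAME LIKE 'a%';  -- plan typically shows Table Scan
SELECT * FROM dbo.DEMO_CLUSTERED WHERE NAME LIKE 'a%';  -- plan typically shows Clustered Index Scan
GO
SET SHOWPLAN_TEXT OFF;
GO
```

Both plans do the same job (read every row and filter on NAME); only the operator name differs, because of how the table is stored.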

As others explained already, for a table that has a clustered index, a Clustered Index Scan means a Table Scan.
In other words, the table is the clustered index.
What you have wrong is your first query execution plan:
SELECT *
FROM TEST_TABLE
WHERE TEST_TABLE_ID = 34 ;
It does a Clustered Index Seek and not a Scan. It doesn't have to search (scan) the whole table (clustered index), it goes directly to the point (seeks) and checks if a row with id=34 exists.
You can see a simple test in SQL-Fiddle, and how the two execution plans differ.
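As a rough way to see the difference without reading the plan, the two queries from the question can be run with I/O statistics switched on. This is only a sketch against the TEST_TABLE defined above; the actual read counts depend on how much data the table holds.

```sql
SET STATISTICS IO ON;

-- Seek: navigates the B-tree from the root to one leaf page,
-- so the logical reads reported should be a small handful.
SELECT * FROM dbo.TEST_TABLE WHERE TEST_TABLE_ID = 34;

-- Scan: reads every leaf page of the clustered index,
-- so logical reads grow with the size of the table.
SELECT * FROM dbo.TEST_TABLE WHERE NAME LIKE 'a%';

SET STATISTICS IO OFF;
```

The "logical reads" figures appear on the Messages tab in SSMS, one line per query.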

The table is stored as a clustered index. The only way to scan the table is to scan the clustered index. Only tables with no clustered index can have a "table scan" per se.
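One way to check how a table is stored is to look it up in the sys.indexes catalog view. A small sketch, using the table from the question:

```sql
-- index_id = 0 / type_desc = HEAP      -> no clustered index; scans show as "Table Scan"
-- index_id = 1 / type_desc = CLUSTERED -> the table itself is the clustered index
SELECT name, index_id, type_desc
FROM sys.indexes
WHERE object_id = OBJECT_ID(N'dbo.TEST_TABLE');
```

For TEST_TABLE as defined in the question, this should return a single CLUSTERED row named PK_TEST_TABLE.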

This is because the table has a clustered index, so SQL Server scans the entire clustered index and returns the rows that match the WHERE clause. However, you should be seeing a missing index message.

When you build a Clustered Index on a table, then SQL Server logically orders the rows of that table based on the Clustered Index Key, which in your case is Test_Table_ID.
However, when you see the Clustered Index Scan operator, this COULD be a little misleading. If certain conditions are met (which equate to SQL Server not caring about the order of the data), then SQL Server is still able to perform an unordered allocation scan. This is more similar to a table scan than a clustered index scan, as it reads the leaf level of the CI (the table's data pages) in allocation order, based on the IAM chain, as opposed to following the pointers in the index. This can potentially give you a performance improvement, as fragmentation (pages being out of physical order) does not decrease performance.
To see if this is possible, look at the Ordered property of the scan operator in the execution plan. If it is set to False, the plan does not require ordered output, so SQL Server is free to use an unordered allocation scan.
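If you prefer to inspect the property outside the graphical plan, a sketch like the following returns the actual plan as XML alongside the results. The exact XML shape can vary between SQL Server versions, so treat the attribute described in the comments as something to look for rather than a guarantee.

```sql
SET STATISTICS XML ON;
GO
SELECT * FROM dbo.TEST_TABLE WHERE NAME LIKE 'a%';
GO
SET STATISTICS XML OFF;
GO
-- In the returned plan XML, find the IndexScan element under the
-- "Clustered Index Scan" operator and check its Ordered attribute.
-- A false value (shown as "false" or "0", depending on version) means
-- the plan did not ask for rows in index key order.
```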

Related

Why the differences in execution plan between nonclustered and clustered index in SQL Server?

Please explain the differences below between a nonclustered and a clustered index.
First, I run the two SELECT statements below.
select *
from [dbo].[index_test2]
where id = 1 -- Nonclustered index on id column
select *
from [dbo].[index_test1]
where id = 1 -- Clustered index on id column
The execution plan shows "Table scan" for the first query and "Clustered index seek (clustered)" for the second query.
Then I run the two statements below.
select id
from [dbo].[index_test2]
where id = 1 -- Nonclustered index on id column
select id
from [dbo].[index_test1]
where id = 1 -- Clustered index on id column
The execution plan shows "Index seek (NonClustered)" for the first query and "Clustered index seek (Clustered)" for the second query.
As you can see from the two cases above, the clustered index always results in an index seek, whereas the nonclustered index shows "Table scan" when executed with * and "Index seek (NonClustered)" when selecting only the indexed column id.
Can anyone clarify why the nonclustered index behaves differently in the two cases?
A clustered index defines the order in which data is stored in the table, whereas a nonclustered index does not sort the data inside the table. In fact, a nonclustered index is stored in one place and the table data in another.
When a query references only columns that the nonclustered index contains, SQL Server can answer it with an Index Seek (NonClustered). If the WHERE clause filters on the indexed column but the SELECT list asks for columns the index does not cover, every matching row needs an extra lookup into the table, and the optimizer may decide that a Table Scan is cheaper.
Indexes with included columns provide the greatest benefit when they cover the query, meaning the index contains all columns the query references. Included columns can also have data types, counts, or sizes that are not allowed for index key columns.
With a clustered index, the leaf level of the index is the table data itself, so both queries are answered with a Clustered Index Seek (Clustered).
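To see the plan the optimizer decided against, you can ask for a seek explicitly. This is a sketch only: the FORCESEEK table hint is available in SQL Server 2008 and later, and the query fails with an error if no usable index exists on id.

```sql
-- Default choice on the heap: the optimizer may prefer a Table Scan,
-- because SELECT * needs columns that the index on id does not contain.
SELECT * FROM dbo.index_test2 WHERE id = 1;

-- Forced alternative: Index Seek (NonClustered) on id, followed by a
-- RID Lookup into the heap to fetch the remaining columns of each match.
SELECT * FROM dbo.index_test2 WITH (FORCESEEK) WHERE id = 1;
```

Comparing the estimated costs of the two plans shows why the optimizer made its choice; on a very small table the scan is often the cheaper option.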

Could not Get ClusteredIndexSeek for a where with primary key

I have created a sample table with a clustered index as below and inserted 1500 records.
CREATE CLUSTERED INDEX IX_mytable_myid ON dbo.MyTable(myid)
When I execute the query below, the execution plan shows a Clustered Index Scan instead of a seek. I am not sure why the index is scanned.
SELECT myid FROM dbo.MyTable WHERE myid=1666
Apologies. I identified the cause through the warning symbol in the execution plan: the myid field is actually a varchar, so an implicit conversion happens, which forces a scan rather than a seek.
Upon querying like this
SELECT myid FROM dbo.MyTable WHERE myid='1666'
it does the seek.
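A short sketch of how one might catch this earlier: check the column's declared type first, then compare against a value of that same type. The varchar(20) length below is an assumption for illustration; use whatever the first query reports.

```sql
-- Step 1: confirm the declared type of the column being filtered.
SELECT COLUMN_NAME, DATA_TYPE, CHARACTER_MAXIMUM_LENGTH
FROM INFORMATION_SCHEMA.COLUMNS
WHERE TABLE_SCHEMA = 'dbo'
  AND TABLE_NAME   = 'MyTable'
  AND COLUMN_NAME  = 'myid';

-- Step 2: compare with a matching type. int has higher data type precedence
-- than varchar, so "myid = 1666" converts the column on every row, whereas a
-- varchar value lets the index be searched directly.
DECLARE @id varchar(20) = '1666';  -- length is assumed, not taken from the question
SELECT myid FROM dbo.MyTable WHERE myid = @id;
```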

If available why does the processor not use a clustered index scan

So I have this structure on a user table:
But if I run the following code:
select count(*)
from WH.dbo.tb_DimUserAccount
It seems to go for the Non-Unique Non-Clustered index ix_DimUserAccount_UserType:
This is the Index Scan:
Why doesn't it go for a scan of ix_DimUserAccount_Unique ? Should I change my code to somehow use a different index?
Because the non clustered index is probably narrower than the clustered index and it is cheaper to scan than the clustered index (fewer pages to read).
The NCI leaf pages just contain values for the index keys and any included columns. The clustered index leaf pages need to contain values (or pointers to the values) for all columns in the table.
Thus the clustered index will (assuming equal fill factors) generally fit fewer rows per page than an NCI, except where an NCI includes all columns in the table.
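To check the "fewer pages to read" explanation on your own table, a sketch like this lists the page count of each index. It assumes you run it in the WH database with permission to view database state; the index names come from the question.

```sql
USE WH;
GO
-- One row per index (per partition): the index with the smallest
-- used_page_count is usually the cheapest one to scan for COUNT(*).
SELECT i.name,
       i.type_desc,
       ps.row_count,
       ps.used_page_count
FROM sys.dm_db_partition_stats AS ps
JOIN sys.indexes AS i
  ON i.object_id = ps.object_id
 AND i.index_id  = ps.index_id
WHERE ps.object_id = OBJECT_ID(N'dbo.tb_DimUserAccount')
ORDER BY ps.used_page_count;
GO
```

There is normally no need to change the query or add an index hint; the optimizer picks the narrowest index that can return the count.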

Query not doing index seek or scan

I am fairly new to indexes. I have the following table [FORUM1]:
[msg_id] [int] IDENTITY(1,1) NOT NULL,
[cat_id] [int] NULL,
[msg_title] [nvarchar](255) NULL
And I have created a nonclustered index:
CREATE NONCLUSTERED INDEX catindex ON forum1(cat_id)
Now when I run this simple query, I can see the index is not being used:
SELECT msg_title FROM forum1 where cat_id=4
The index only gets used if I create a CI and include the MSG_TITLE field. But the issue is that I have to run many more similar queries on the actual table, such as date=something, userid=20, status=1. So including columns in every index doesn't seem good to me.
The msg_title column is not contained in the index, so every value found in the nonclustered index needs a lookup into the actual data pages (a RID lookup on a heap, a key lookup on a table with a clustered index), which is an expensive operation. Therefore, most likely, a table scan is quicker. Plus: the "table scan" indicates you have a heap - a table without a clustered index - which is a bad thing (most of the time) to begin with. Why don't you have a clustered index?
You can fix this by e.g. including the msg_title in your index:
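-- catindex already exists from the earlier CREATE, so rebuild it in place
CREATE NONCLUSTERED INDEX catindex
ON forum1(cat_id) INCLUDE(msg_title)
WITH (DROP_EXISTING = ON)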
and now, I'm pretty sure, SQL Server will use that index (since it can find all the data needed for the query in the index structure - the index is said to be a covering index). The benefit here is: the extra column is only included in the leaf level of the index, so it makes the index only minimally bigger. Yet, it can lead to the index being used just all that more often. Well worth it!

What is advantages of non clustered index over primary key (clustered index)

I have a table (it stores forum data, so normally there are no edits or updates, just inserts) with a primary key column, which, as we know, is a clustered index.
Please tell me, will I get any advantage if I create a nonclustered index on that column (the primary key column)?
EDIT: My table currently has around 60000 records. Which is better: placing a nonclustered index on it, or creating an identical new table, creating the index, and then copying the records from the old table to the new one?
Thanks
Every table should have a clustered index
Non-clustered indexes allow INCLUDEs which is very useful
Non-clustered indexes allow filtering in SQL Server 2008+
Notes:
Primary key is a constraint which happens to be a clustered index by default
One clustered index only, many non-clustered indexes
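The two nonclustered-only features mentioned above can be sketched as follows. The table dbo.ForumPosts and its columns are hypothetical, invented purely for illustration; filtered indexes require SQL Server 2008 or later.

```sql
-- Hypothetical table: dbo.ForumPosts(post_id PK, author_id, title, created_at, is_answered)

-- INCLUDE: title is stored only at the leaf level, so queries that filter on
-- author_id and return title can be answered from the index alone.
CREATE NONCLUSTERED INDEX IX_ForumPosts_Author
ON dbo.ForumPosts (author_id)
INCLUDE (title);

-- Filtered index: only rows matching the WHERE predicate are indexed,
-- which keeps the index small when most posts are already answered.
CREATE NONCLUSTERED INDEX IX_ForumPosts_Unanswered
ON dbo.ForumPosts (created_at)
WHERE is_answered = 0;
```

Neither option is available on the clustered index itself, since its leaf level already contains every column of every row.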
One advantage: you can INCLUDE other columns in the index.
A clustered index specifies the physical storage order of the table data (this is why there can only be one clustered index per table).
If there is no clustered index, inserts will typically be faster since the data doesn't have to be stored in a specific order but can just be appended at the end of the table.
On the other hand, index searches on the key column will typically be slower, since the searches cannot use the advantages of the clustered index.
The only possible advantage that I can see comes from the fact that the entries on the leaf pages of a nonclustered index are not as wide. They only contain the index columns, while the clustered index's leaf pages are the actual rows of data. Therefore, if you need something like select count(your_column_name) from your_table, then scanning the nonclustered index will involve a considerably smaller number of pages. Or, if the index has more than one column and you run a query that does not need data from non-indexed columns, then again a nonclustered index scan will be faster.