Why does SQL ignore an index hint and opt for a different index?

Why does SQL ignore an index hint and opt for a different index? - sql

Given a table that has two indexes on it, one sorted in the reverse from the other and given these two queries.
Select value From SomeTable wITH (INDEX(IV_Sort_Asc))
Select value From SomeTable wITH (INDEX(IV_Sort_Desc))
I've come across a case in SQL Server 2008 where the hints are ignored and in both cases the IV_Sort_Desc index is used instead of the first one.
I realize many people will immediately suggest to not supply the hint, however given my specific case this is not an option.
What would cause this and what can I do to fix it? Surely you would expect SQL Server to honour an index hint and not use a different one?

I ran into the same problem when I wanted SQL to use an index on a view. It turned out I had to use the NOEXPAND option as well:
WITH (FORCESEEK, INDEX (IndexName),NOEXPAND)
https://technet.microsoft.com/en-us/library/bb510478%28v=sql.105%29.aspx

Related

Strange Clustered Key Lookup on a Filtered Index in SQL Server 2014 when using IS NULL [duplicate]

Assume I'm running a website that shows funny cat pictures. I have a table called CatPictures with the columns Filename, Awesomeness, and DeletionDate, and the following index:
create nonclustered index CatsByAwesomeness
on CatPictures (Awesomeness)
include (Filename)
where DeletionDate is null
My main query is this:
select Filename from CatPictures where DeletionDate is null and Awesomeness > 10
I, as a human being, know that the above index is all that SQL Server needs, because the index filter condition already ensures the DeletionDate is null part.
SQL Server however doesn't know this; the execution plan for my query will not use my index:
Even if adding an index hint, it will still explicitly check DeletionDate by looking at the actual table data:
(and in addition complain about a missing index that would include DeletionDate).
Of course I could
include (Filename, DeletionDate)
instead, and it will work:
But it seems a waste to include that column, since this just uses up space without adding any new information.
Is there a way to make SQL Server aware that the filter condition is already doing the job of checking DeletionDate?

No, not currently.
See this connect item. It is Closed as Won't Fix. (Or this one for the IS NULL case specifically)
The connect item does provide a workaround shown below.
Posted by RichardB CFCU on 29/09/2011 at 9:15 AM
A workaround is to INCLUDE the column that is being filtered on.
Example:
CREATE NONCLUSTERED INDEX [idx_FilteredKey1] ON [dbo].[TABLE]
(
[TABLE_ID] ASC,
[TABLE_ID2] ASC
)
INCLUDE ( [REMOVAL_TIMESTAMP]) --explicitly include the column here
WHERE ([REMOVAL_TIMESTAMP] IS NULL)

Is there a way to make SQL Server aware that the filter condition is
already doing the job of checking DeletionDate?
No.
Filtered indexes were designed to solve certain problems, not ALL. Things evolve and some day, you may see SQL Server supporting the feature you expect of filtered indexes, but it is also possible that you may never see it.
There are several good reasons I can see for how it works.
What it improves on:
Storage. The index contains only keys matching the filtering condition
Performance. A shoo-in from the above. Less to write and fewer pages = faster retrieval
What it does not do:
Change the query engine radically
Putting them together, considering that SQL Server is a heavily pipelined, multi-processor parallelism capable beast, we get the following behaviour when dealing with servicing a query:
Pre-condition to the query optimizer selecting indexes: check whether a Filtered Index is applicable against the WHERE clause.
Query optimizer continues it's normal work of determining selectivity from statistics, weighing up index->bookmark lookup vs clustered/heap scan depending on whether the index is covering etc
Threading the condition against the filtered index into the query optimizer "core" I suspect is going to be a much bigger job than leaving it at step 1.
Personally, I respect the SQL Server dev team and if it were easy enough, they might pull it into a not-too-distant sprint and get it done. However, what's there currently has achieved what it was intended to and makes me quite happy.

Just found that "gap in functionality", it's really sad that filtered indexes are ignored by optimizer.
I think I'll try to use indexed views for that, take a look at this article
http://www.sqlperformance.com/2013/04/t-sql-queries/optimizer-limitations-with-filtered-indexes

How to use index in SQL query

Well i am new to this stuff ..I have created an index in my SP at start like follows
Create Index index_fab
ON TblFab (Fab_name)
Now i have query under this
select fab_name from TblFab where artc = 'x' and atelr = 'y'.
now Is it necessary to use this index name in select clause or it will automatically used to speed up queries
Do i have to use something like
select fab_name from TblFab WITH(INDEX(index_fab)) where artc = 'x' and atelr = 'y'.
or any other method to use this index in query
and also how to use index if we are using join on this table?

Firstly, do you mean you're creating the index in a stored procedure? That's a bad idea - if you run the stored procedure twice, it will fail because the index already exists.
Secondly, your query doesn't use the column mentioned in the index, so it will have no impact.
Thirdly, as JodyT writes, the query analyzer (SQL Server itself) will decide which index to use; it's almost certainly better at it than you are.
Finally, to speed up the query you mention, create an index on columns artc and atelr.

The Query Optimizer of SQL Server will decide if it the index is suitable for the query. You can't force it to use a specific index. You can give hints on which you want it to use but it won't be a guarantee that it will use it.

As the other people answered your question to help you to understand better, my opinion is, you should first understand why you need to use indexes. As we know that indexes increase the performance , they could also cause performance issues as well. Its better to know when you need to use indexes, why you need to use indexes instead of how to use indexes.
You can read almost every little detail from here .
Regarding your example, your query's index has no impact. Because it doesn't have the mentioned column in your query's where clause.
You can also try:
CREATE INDEX yourIndexName
ON yourTableName (column_you_are_looking_for1,column_you_are_lookingfor2)
Also good to know: If no index exists on a table, a table scan must be performed for each table referenced in a database query. The larger the table, the longer a table scan takes because a table scan requires each table row to be accessed sequentially. Although a table scan might be more efficient for a complex query that requires most of the rows in a table, for a query that returns only some table rows an index scan can access table rows more efficiently. (source from here )
Hope this helps.

An index should be used by default if you run a query against the table using it.
But I think in the query you posted it will not be used, because you are not filtering your data by the column you created your index on.
I think you would have to create the index for the artc and atelr columns to profit from that.
To see wether your index is used take a look at the execution plan that was used in the SQL Management Studio.
more info on indices: use the index luke

You dont need to include index in your query. Its managed by sql server. Also you dont need to include index in select if you want to make join to this table. Hope its clear.

You're index use "Fab_name" column which you don't filter on in your select statement, so it's of no use.
Since you're new to this, you might benefit from an index like this :
Create Index index_fab
ON TblFab (artc, atelr)
or maybe like this
Create Index index_fab
ON TblFab (atelr, artc)
...yes there are a lot of subtleties to learn.

For better performance:
List out the columns /tables which are frequently used,
Create index on those tables/columns only.

If index is properly set up, optimizer will use it automatically. By properly set up, I mean that it's selective enough, can effectively help the query etc. Read about it. You can check by yourself if index is being used by using "include actual execution plan" option in ssms.
It's generally not advised to use with(index()) hints and let optimizer decided by itself, except from very special cases when you just know better ;).

Using temp table for sorting data in SQL Server

Recently, I came across a pattern (not sure, could be an anti-pattern) of sorting data in a SELECT query. The pattern is more of a verbose and non-declarative way for ordering data. The pattern is to dump relevant data from actual table into temporary table and then apply orderby on a field on the temporary table. I guess, the only reason why someone would do that is to improve the performance (which I doubt) and no other benefit.
For e.g. Let's say, there is a user table. The table might contain rows in millions. We want to retrieve all the users whose first name starts with 'G' and sorted by first name. The natural and more declarative way to implement a SQL query for this scenario is:
More natural and declarative way
SELECT * FROM Users
WHERE NAME LIKE 'G%'
ORDER BY Name
Verbose way
SELECT * INTO TempTable
FROM Users
WHERE NAME LIKE 'G%'
SELECT * FROM TempTable
ORDER BY Name
With that context, I have few questions:
Will there be any performance difference between two ways if there is no index on the first name field. If yes, which one would be better.
Will there be any performance difference between two ways if there is an index on the first name field. If yes, which one would be better.
Should not the SQL Server optimizer generate same execution plan for both the ways?
Is there any benefit in writing a verbose way from any other persective like locking/blocking?
Thanks in advance.

Reguzlarly: Anti pattern by people without an idea what they do.
SOMETIMES: ok, because SQL Server has a problem that is not resolvable otherwise - not seen that one in yeas, though.
It makes things slower because it forces the tmpddb table to be fully populated FIRST, while otherwise the query could POSSIBLY be resoled more efficiently.
last time I saw that was like 3 years ago. We got it 3 times as fast by not being smart and using a tempdb table ;)
Answers:
1: No, it still needs a table scan, obviously.
2: Possibly - depends on data amount, but an index seek by index would contain the data in order already (as the index is ordered by content).
3: no. Obviously. Query plan optimization is statement by statement. By cutting the execution in 2, the query optimizer CAN NOT merge the join into the first statement.
4: Only if you run into a query optimizer issue or a limitation of how many tables you can join - not in that degenerate case (degenerate in a technical meaning - i.e. very simplistic). BUt if you need to join MANY MANY tables it may be better to go with an interim step.

If the field you want to do an order by on is not indexed, you could put everything into a temp table and index it and then do the ordering and it might be faster. You would have to test to make sure.

There is never any benefit of the second approach that I can think of.
It means if the data is available pre-ordered SQL Server can't take advantage of this and adds an unnecessary blocking operator and additional sort to the plan.
In the case that the data is not available pre-ordered SQL Server will sort it in a work table either in memory or tempdb anyway and adding an explicit #temp table just adds an unnecessary additional step.
Edit
I suppose one case where the second approach could give an apparent benefit might be if the presence of the ORDER BY caused SQL Server to choose a different plan that turned out to be sub optimal. In which case I would resolve that in a different way by either improving statistics or by using hints/query rewrite to avoid the undesired plan.

How to use index in select statement?

Lets say in the employee table, I have created an index(idx_name) on the emp_name column of the table.
Do I need to explicitly specify the index name in select clause or it will automatically used to speed up queries.
If it is required to be specified in the select clause, What is the syntax for using index in select query ?

If you want to test the index to see if it works, here is the syntax:
SELECT *
FROM Table WITH(INDEX(Index_Name))
The WITH statement will force the index to be used.

Good question,
Usually the DB engine should automatically select the index to use based on query execution plans it builds. However, there are some pretty rare cases when you want to force the DB to use a specific index.
To be able to answer your specific question you have to specify the DB you are using.
For MySQL, you want to read the Index Hint Syntax documentation on how to do this

How to use index in select statement? this way:
SELECT * FROM table1 USE INDEX (col1_index,col2_index)
WHERE col1=1 AND col2=2 AND col3=3;
SELECT * FROM table1 IGNORE INDEX (col3_index)
WHERE col1=1 AND col2=2 AND col3=3;
SELECT * FROM t1 USE INDEX (i1) IGNORE INDEX (i2) USE INDEX (i2);
And many more ways check this
Do I need to explicitly specify?
No, no Need to specify explicitly.
DB engine should automatically select the index to use based on query execution plans it builds from #Tudor Constantin answer.
The optimiser will judge if the use of your index will make your query run faster, and if it is, it will use the index. from #niktrl answer

In general, the index will be used if the assumed cost of using the index, and then possibly having to perform further bookmark lookups is lower than the cost of just scanning the entire table.
If your query is of the form:
SELECT Name from Table where Name = 'Boris'
And 1 row out of 1000 has the name Boris, it will almost certainly be used. If everyone's name is Boris, it will probably resort to a table scan, since the index is unlikely to be a more efficient strategy to access the data.
If it's a wide table (lot's of columns) and you do:
SELECT * from Table where Name = 'Boris'
Then it may still choose to perform the table scan, if it's a reasonable assumption that it's going to take more time retrieving the other columns from the table than it will to just look up the name, or again, if it's likely to be retrieving a lot of rows anyway.

The optimiser will judge if the use of your index will make your query run faster, and if it is, it will use the index.
Depending on your RDBMS you can force the use of an index, although it is not recommended unless you know what you are doing.
In general you should index columns that you use in table join's and where statements

Generally, when you create an index on a table, database will automatically use that index while searching for data in that table. You don't need to do anything about that.
However, in MSSQL, you can specify an index hint which can specify that a particular index should be used to execute this query. More information about this can be found here.
Index hint is also seems to be available for MySQL. Thanks to Tudor Constantine.

By using the column that the index is applied to within your conditions, it will be included automatically. You do not have to use it, but it will speed up queries when it is used.
SELECT * FROM TABLE WHERE attribute = 'value'
Will use the appropriate index.

The index hint is only available for Microsoft Dynamics database servers.
For traditional SQL Server, the filters you define in your 'Where' clause should persuade the engine to use any relevant indices...
Provided the engine's execution plan can efficiently identify how to read the information (whether a full table scan or an indexed scan) - it must compare the two before executing the statement proper, as part of its built-in performance optimiser.
However, you can force the optimiser to scan by using something like
Select *
From [yourtable] With (Index(0))
Where ...
Or to seek a particular index by using something like
Select *
From [yourtable] With (Index(1))
Where ...
The choice is yours. Look at the table's index properties in the object panel to get an idea of which index you want to use. It ought to match your filter(s).
For best results, list the filters which would return the fewest results first.
I don't know if I'm right in saying, but it seems like the query filters are sequential; if you get your sequence right, the optimiser shouldn't have to do it for you by comparing all the combinations, or at least not begin the comparison with the more expensive queries.

SQL `LIKE` complexity

Does anyone know what the complexity is for the SQL LIKE operator for the most popular databases?

Let's consider the three core cases separately. This discussion is MySQL-specific, but might also apply to other DBMS due to the fact that indexes are typically implemented in a similar manner.
LIKE 'foo%' is quick if run on an indexed column. MySQL indexes are a variation of B-trees, so when performing this query it can simply descend the tree to the node corresponding to foo, or the first node with that prefix, and traverse the tree forward. All of this is very efficient.
LIKE '%foo' can't be accelerated by indexes and will result in a full table scan. If you have other criterias that can by executed using indices, it will only scan the the rows that remain after the initial filtering.
There's a trick though: If you need to do suffix matching - searching for file names with extension .foo, for instance - you can achieve the same performance by adding a column with the same contents as the original one but with the characters in reverse order.
ALTER TABLE my_table ADD COLUMN col_reverse VARCHAR (256) NOT NULL;
ALTER TABLE my_table ADD INDEX idx_col_reverse (col_reverse);
UPDATE my_table SET col_reverse = REVERSE(col);
Searching for rows with col ending in .foo then becomes:
SELECT * FROM my_table WHERE col_reverse LIKE 'oof.%'
Finally, there's LIKE '%foo%', for which there are no shortcuts. If there are no other limiting criterias which reduces the amount of rows to a feasible number, it'll cause a hard performance hit. You might want to consider a full text search solution instead, or some other specialized solution.

If you are asking about the performance impact:
The problem of like is that it keeps the database from using an index. On Oracle I think it doesn't use indexes anymore (but I'm still on Oracle 9). SqlServer uses indexes if the wildcard is only at the end. I don't know about other databases.

Depends on the RDBMS, the data (and possibly size of data), indexes and how the LIKE is used (with or without prefix wildcard)!
You are asking too general a question.

We Keep Coding

sql objective-c vba vb.net react-native apache vue.js tensorflow api pandas