SQL Server non-clustered index and the query optimizer

In one of the projects I am working on, there is a table with about one million records. To improve performance I created a non-clustered index with the sid column as the index key. When I execute this query:
SELECT [id]
,[sid]
,[idm]
,[origin]
,[status]
,[pid]
FROM [EpollText_Db].[dbo].[PhoneNumbers] where sid = 9
The execution plan is shown in the picture above. My question is: why does SQL Server ignore the sid index and scan all one million records to find the query result? Your help is greatly appreciated.

I believe the problem is the size of your result set. You are selecting ten thousand records, which is quite a lot once you consider the query plan an index seek would require. A plan that uses the index would be an index seek on sid followed by a key lookup into the clustered index for every matching row.
That means ten thousand key lookups and a significant number of random logical reads. Because of this, if your table rows are small, the optimizer may decide that a clustered index scan is cheaper. If you are really concerned about the performance of this query, create a covering index:
CREATE INDEX idx_PhoneNumbers_sid
ON [EpollText_Db].[dbo].[PhoneNumbers](sid)
INCLUDE ([id],[idm],[origin],[status],[pid])
However, this may slow down inserts, deletes, and updates, and it may also nearly double the storage used by the table, because the index keeps its own copy of every included column.
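The difference between a narrow index and a covering index can be tried out on a small scale. The sketch below is not SQL Server: it uses Python's built-in sqlite3 module as a stand-in, with made-up data, and the index names idx_sid and idx_sid_covering are my own. SQLite has no INCLUDE clause, so the extra columns are appended as key columns instead, which is only an approximation of the index above.

```python
import sqlite3

# In-memory stand-in for the PhoneNumbers table (sample data is invented).
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE PhoneNumbers ("
    "id INTEGER PRIMARY KEY, sid INTEGER, idm INTEGER, "
    "origin TEXT, status INTEGER, pid INTEGER)"
)
conn.executemany(
    "INSERT INTO PhoneNumbers (sid, idm, origin, status, pid) VALUES (?, ?, ?, ?, ?)",
    [(i % 100, i, "web", 1, i % 7) for i in range(10_000)],
)

QUERY = "SELECT id, sid, idm, origin, status, pid FROM PhoneNumbers WHERE sid = 9"


def plan(sql):
    """Return the detail column of EXPLAIN QUERY PLAN as one string."""
    rows = conn.execute("EXPLAIN QUERY PLAN " + sql).fetchall()
    return "\n".join(row[-1] for row in rows)


# Narrow index: every match still needs a lookup into the table itself.
conn.execute("CREATE INDEX idx_sid ON PhoneNumbers (sid)")
narrow_plan = plan(QUERY)

# Covering index: all selected columns live in the index, so no lookup is needed.
# (SQLite has no INCLUDE, so the extra columns become trailing key columns.)
conn.execute(
    "CREATE INDEX idx_sid_covering ON PhoneNumbers (sid, idm, origin, status, pid)"
)
covering_plan = plan(QUERY)

print("narrow:  ", narrow_plan)
print("covering:", covering_plan)
```

On SQL Server itself you would compare the actual execution plans before and after creating the index rather than rely on this stand-in.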

Related

SQL query optimization. Estimated vs actual number of rows

I'm dealing with a query that takes very long to execute. After looking at the query execution plan and live query statistics, I found a huge difference between the estimated and actual number of rows. Why could this happen?
I need help optimizing that query.
Regards.

RE: "Need help to optimize that query."
The query plan shows 2 clustered index scans. If those are large tables, that could be a very big slowdown.
Query plan also shows a recommended missing index to be created.
Start with creating the recommended index and see if DRIVER_ALLOCATIONS clustered index scan converts to a seek. My guess is that -- after the recommended index is added -- the next query plan will show another missing index for the other clustered index scan.

Creating a non clustered index on a table with existing 1mln records affects that data immediately?

I have a table with 1 million records. If I create a non-clustered index on column 'A' and then filter by that column, should I immediately notice that the request takes much less time? Or should I create the index on an empty table first, and only then add data to the table, in order to feel the power of the index?

I cannot explain why you would or would not feel that a query is taking too much time.
But, once you have added an index -- and the statement completes -- then the index is available for any query that is compiled after that point in time.
As a rule, we can think that creating an index will remove the plan from the query cache. This is effectively what happens, but the actual sequence of events is that the next execution of the query will replace the plan. You can think of this as "delayed removal".
Creating an index on a table at the time the table is created means that the index will be available for all queries on the table.
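A quick way to convince yourself that an index built on already-populated data is usable right away is to look at the query plan before and after creating it. This is a minimal sketch using Python's built-in sqlite3 module rather than SQL Server; the table t, its columns, and the index name idx_t_a are made up for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE t (a INTEGER, b TEXT)")

# Load the data first; the index does not exist yet.
conn.executemany(
    "INSERT INTO t VALUES (?, ?)",
    [(i, f"row{i}") for i in range(100_000)],
)

QUERY = "SELECT b FROM t WHERE a = 500"


def plan(sql):
    """Return the detail column of EXPLAIN QUERY PLAN as one string."""
    rows = conn.execute("EXPLAIN QUERY PLAN " + sql).fetchall()
    return "\n".join(row[-1] for row in rows)


plan_before = plan(QUERY)  # full scan: there is nothing else to use

# Build the index on the existing rows.
conn.execute("CREATE INDEX idx_t_a ON t (a)")

plan_after = plan(QUERY)   # the very next compilation can use the new index

print("before:", plan_before)
print("after: ", plan_after)
```

The rows were inserted before the index existed, yet the second plan already searches via idx_t_a. There is no need to create the index on an empty table first.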

How to get list of values stored in index?

I'm having this issue in Oracle 11g R2. The table contains a NOT NULL column that is indexed with a non-unique index. The index does not contain any other columns.
I assumed that if I queried the distinct values of the column from the table, Oracle would use the index to get them (sounds logical to me). However, the explain plan at least tells me it is doing a full table scan. It also took some time, so the plan probably did not change at run time. An optimizer index hint didn't help.
I searched for an answer but had no luck. Is there a way to get the values stored in the index, or to somehow run the query without "touching" the table at all (like multi-column index joins can)?
Thanks!
EDIT: This was about the Oracle EBS gl_balances table and the gl_balances_n2 index. I got an answer, and this changed the explain plan:
select /*+ index_ffs(gl gl_balances_n2) */
distinct gl.period_name
from gl_balances gl;

It may not be more efficient to scan the index than to scan the table -- don't forget that the index segment also contains branch nodes, and each index entry has to contain a ROWID of about 16 bytes (if memory serves).
So a "fast full index scan", which is the plan you're looking to get, may not be as fast as a full table scan. (You'd use an index_ffs() hint for that, by the way.)
edit: It may be possible to use a more exotic method:
Maintaining your own list by periodically querying the table using DBMS_Scheduler.
A materialized view. Complete refresh on demand might be adequate, though barely better than just periodically querying the data and maintaining your own unique list.
Making the index compressed, though that would only be of value for longish index keys.
A bitmap index -- not for a concurrently modified table though.
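To see what "answering the query from the index alone" looks like in a plan, here is a small sketch on a different engine: Python's built-in sqlite3 module, not Oracle. The table gl_balances_demo, its columns other than period_name, the data, and the index name are all invented for the example, and SQLite's planner makes its own cost decisions, so treat this as an illustration of the idea rather than a prediction of what Oracle will do.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Hypothetical stand-in table: one indexed NOT NULL column plus wider payload columns.
conn.execute(
    "CREATE TABLE gl_balances_demo ("
    "period_name TEXT NOT NULL, account TEXT, amount REAL, note TEXT)"
)
conn.executemany(
    "INSERT INTO gl_balances_demo VALUES (?, ?, ?, ?)",
    [(f"2011-{(i % 12) + 1:02d}", f"acct{i}", i * 1.5, "x" * 50) for i in range(5_000)],
)
conn.execute("CREATE INDEX gl_balances_demo_n2 ON gl_balances_demo (period_name)")

QUERY = "SELECT DISTINCT period_name FROM gl_balances_demo"

# The plan shows whether the distinct values are read from the index or the table.
plan_rows = conn.execute("EXPLAIN QUERY PLAN " + QUERY).fetchall()
plan_text = "\n".join(row[-1] for row in plan_rows)

periods = [row[0] for row in conn.execute(QUERY)]

print(plan_text)
print(len(periods), "distinct periods")
```

Whether reading the index is actually cheaper than reading the table depends on how wide the rows are compared with the index entries, which is the point made above.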

How do i optimize this query?

I have a very specific query. I tried lots of approaches but I couldn't reach the performance I want.
SELECT *
FROM
items
WHERE
user_id=1
AND
(item_start < 20000 AND item_end > 30000)
I created an index on user_id, item_start, item_end.
This didn't work, so I dropped all indexes and created new indexes:
user_id, (item_start, item_end)
This didn't work either.
(user_id, item_start and item_end are int)
edit: database is MySQL 5.1.44, engine is InnoDB

UPDATE: per your comment below, you need all the columns in the query (hence your SELECT *). If that's the case, you have a few options to maximize query performance:
Create (or change) your clustered index to be on user_id, item_start, item_end. This will ensure that as few rows as possible are examined for each query. Per my original answer below, this approach may speed up this particular query but may slow down others, so you'll need to be careful.
If it's not practical to change your clustered index, you can create a non-clustered index on user_id, item_start, item_end and any other columns your query needs. This will slow down inserts somewhat and will roughly double the storage required for your table, but it will speed up this particular query.
There are always other ways to increase performance (e.g. by reducing the size of each row), but the primary way is to decrease the number of rows that must be accessed and to increase the percentage of rows that are accessed sequentially rather than randomly. The indexing suggestions above do both.
ORIGINAL ANSWER BELOW:
Without knowing the exact schema or query plan, the main performance problem with this query is that SELECT * forces a lookup back to your clustered index for every row. If there are large numbers of matching rows for a particular user ID and your clustered index's first column is not user_id, then this will likely be a very inefficient operation, because your disk will be trying to fetch lots of randomly distributed rows from the clustered index.
In other words, even though filtering the rows you want is fast (because of your index), actually fetching the data is slower.
If, however, your clustered index is ordered by user_id, item_start, item_end, then that should speed things up. Note that this is not a panacea: if you have other queries that depend on a different ordering, or if you're inserting rows in a different order, you could end up slowing down other queries.
A less impactful solution would be to create a covering index that contains only the columns you want (also ordered by user_id, item_start, item_end, followed by the other columns you need). Then change your query to pull back only the columns you need, instead of using SELECT *.
If you could post more info about the DBMS brand and version, and the schema of your table, we can help with more details.
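The effect of SELECT * versus a narrower column list on a composite index can be sketched as follows. This uses Python's built-in sqlite3 module as a stand-in, not MySQL/InnoDB; the id and payload columns, the sample data, and the index name idx_items_user_range are assumptions made for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Hypothetical schema: only user_id, item_start and item_end come from the question.
conn.execute(
    "CREATE TABLE items ("
    "id INTEGER PRIMARY KEY, user_id INTEGER, item_start INTEGER, "
    "item_end INTEGER, payload TEXT)"
)
conn.executemany(
    "INSERT INTO items (user_id, item_start, item_end, payload) VALUES (?, ?, ?, ?)",
    [(i % 50, i * 10, i * 10 + 15_000, "x" * 20) for i in range(20_000)],
)
conn.execute(
    "CREATE INDEX idx_items_user_range ON items (user_id, item_start, item_end)"
)

WHERE = "WHERE user_id = 1 AND item_start < 20000 AND item_end > 30000"


def plan(sql):
    """Return the detail column of EXPLAIN QUERY PLAN as one string."""
    rows = conn.execute("EXPLAIN QUERY PLAN " + sql).fetchall()
    return "\n".join(row[-1] for row in rows)


# SELECT * needs payload, which is not in the index: every match goes back to the table.
plan_star = plan("SELECT * FROM items " + WHERE)

# Only indexed columns (plus the primary key): the index alone can answer the query.
plan_narrow = plan("SELECT id, item_start, item_end FROM items " + WHERE)

matches = conn.execute("SELECT id, item_start, item_end FROM items " + WHERE).fetchall()

print("SELECT *:    ", plan_star)
print("narrow list: ", plan_narrow)
print(len(matches), "matching rows")
```

Both plans search on user_id and the item_start range; only the narrow column list avoids the extra trip to the table for each matching row.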

Do you need to SELECT *?
If not, you can create an index on user_id, item_start, item_end with the fields you need in the SELECT part as included columns. This all assumes you're using Microsoft SQL Server 2005+.

Will I save any time on a INDEX that SELECTs only once?

With DBD::SQLite on SQLite3:
If I am going to run a SELECT only once,
should I CREATE an INDEX first and then run the SELECT,
or
just run the SELECT without an INDEX?
Which is faster?
If it needs to be specified, the column to be indexed is an INTEGER that is either undef or 1; those are the only 2 possibilities.

Building an index takes longer than just doing a table scan. So, if your single query — which you're only running once — is just a table scan, adding an index will be slower.
However, if your single query is not just a table scan, adding the index may be faster. For example, without an index, the database may perform a join as many table scans, once for each joined row. Then the index would probably be faster.
I'd say to benchmark it, but that sounds silly for a one-off query that you're only ever going to run once.
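If you do want to measure it anyway, a rough benchmark is only a few lines. The sketch below uses Python's sqlite3 module rather than Perl's DBD::SQLite; the table t, the column name flag, the row count, and the query are all assumptions, and timings from an in-memory database will not match an on-disk one.

```python
import sqlite3
import time


def build_db():
    """Create a fresh in-memory table whose flag column is either NULL or 1."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE t (id INTEGER PRIMARY KEY, flag INTEGER)")
    conn.executemany(
        "INSERT INTO t (flag) VALUES (?)",
        [(1 if i % 2 else None,) for i in range(200_000)],
    )
    return conn


QUERY = "SELECT COUNT(*) FROM t WHERE flag = 1"


def run_without_index():
    """Time the one-off query as a plain table scan."""
    conn = build_db()
    start = time.perf_counter()
    result = conn.execute(QUERY).fetchone()[0]
    return result, time.perf_counter() - start


def run_with_index():
    """Time index creation plus the one-off query together."""
    conn = build_db()
    start = time.perf_counter()
    conn.execute("CREATE INDEX idx_t_flag ON t (flag)")
    result = conn.execute(QUERY).fetchone()[0]
    return result, time.perf_counter() - start


count_scan, secs_scan = run_without_index()
count_indexed, secs_indexed = run_with_index()

print(f"scan only:     {count_scan} rows in {secs_scan:.4f}s")
print(f"index + query: {count_indexed} rows in {secs_indexed:.4f}s")
```

The timer for the indexed run deliberately includes the CREATE INDEX, since for a query that runs only once the cost of building the index is part of the price.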

If you are considering an index on a column that has only two possible values, it is not worth the effort, as the index will give very little improvement. Indexes are useful on columns that have a high degree of uniqueness and are frequently queried for a certain value or range. On the other hand, indexes make inserting and updating slower, so in this case you should skip it.
