How to get list of values stored in index?

I'm having this issue in Oracle 11g R2. I have a table with a NOT NULL column that is indexed by a non-unique index. The index contains no other columns.
I assumed that if I queried the distinct values of the column, Oracle would use the index to get them (sounds logical to me). However, the explain plan says it is doing a full table scan. The query also took some time, so the plan probably was not changed at run time. An optimizer index hint didn't help.
I searched for an answer but had no luck. Is there a way to get the values stored in the index, or to somehow run the query without "touching" the table at all (like multi-column index joins can)?
Thanks!
EDIT: This was about the Oracle EBS gl_balances table and the gl_balances_n2 index. I got an answer, and this changed the explain plan:
select /*+ index_ffs(gl gl_balances_n2) */
distinct gl.period_name
from gl_balances gl;

It may not be more efficient to scan the index than to scan the table -- don't forget that the index segment also contains branch nodes, and each index entry has to contain a ROWID in addition to the key value (6 bytes in an ordinary B-tree index, 10 in a global index on a partitioned table, if memory serves).
So an "index fast full scan", which is the plan you're looking to get, may not be as fast as a full table scan. (You'd use an index_ffs() hint for that, by the way.)
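One way to check this is to compare the two segment sizes and then look at the plan the hint produces. This is only a sketch: it assumes you are allowed to read DBA_SEGMENTS (which needs extra privileges) and that the object names from the question are not reused in another schema.

```sql
-- Compare the on-disk size of the table and of the index.
SELECT owner, segment_name, segment_type,
       ROUND(bytes / 1024 / 1024) AS size_mb
FROM   dba_segments
WHERE  segment_name IN ('GL_BALANCES', 'GL_BALANCES_N2');

-- Ask the optimizer for the plan of the hinted query ...
EXPLAIN PLAN FOR
SELECT /*+ index_ffs(gl gl_balances_n2) */
       DISTINCT gl.period_name
FROM   gl_balances gl;

-- ... and display it. Look for INDEX FAST FULL SCAN
-- where the unhinted plan showed TABLE ACCESS FULL.
SELECT * FROM TABLE(DBMS_XPLAN.DISPLAY);
```

If the index segment turns out to be nearly as large as the table, the hinted plan is unlikely to be much of a win.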
edit: It may be possible to use a more exotic method:
Maintaining your own list by periodically querying the table using DBMS_Scheduler.
A materialized view. Complete refresh on demand might be adequate, though barely better than just periodically querying the data and maintaining your own unique list.
Making the index compressed, though that would only be of value for longish index keys.
A bitmap index -- not for a concurrently modified table though.
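As a rough illustration of the materialized view option, something like the following might do. The view name gl_period_names_mv is made up for this sketch, and whether a complete refresh on demand is acceptable depends on how stale the list is allowed to be.

```sql
-- Hypothetical materialized view holding only the distinct period names.
CREATE MATERIALIZED VIEW gl_period_names_mv
  BUILD IMMEDIATE
  REFRESH COMPLETE ON DEMAND
AS
SELECT DISTINCT period_name
FROM   gl_balances;

-- Re-run the defining query whenever the list should be brought up to date
-- ('C' requests a complete refresh).
BEGIN
  DBMS_MVIEW.REFRESH(list => 'GL_PERIOD_NAMES_MV', method => 'C');
END;
/

-- Reading the list then touches neither the base table nor its index.
SELECT period_name FROM gl_period_names_mv;
```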

Related

Creating a non clustered index on a table with existing 1mln records affects that data immediately?

I have a table with 1 mln records. If I create a non-clustered index on column 'A' and then filter by that column, should I immediately feel that the request takes much less time? Or should I create the index on the empty table first, and only then add data to the table, in order to feel the power of the index?
I cannot explain why you would or would not feel that a query is taking too much time.
But, once you have added an index -- and the statement completes -- then the index is available for any query that is compiled after that point in time.
As a rule, we can think that creating an index will remove the plan from the query cache. This is effectively what happens, but the actual sequence of events is that the next execution of the query will replace the plan. You can think of this as "delayed removal".
Creating the index at the time the table is created simply means that the index will be available for every query ever run against the table.
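To make that concrete, here is a minimal sketch in SQL Server syntax (an assumption, since the question only says "non clustered"). The table dbo.MyTable, the index name, and the numeric literal are all hypothetical.

```sql
-- The table already holds its ~1 million rows; the index is built
-- over the existing data in one statement.
CREATE NONCLUSTERED INDEX IX_MyTable_A
    ON dbo.MyTable (A);

-- Once CREATE INDEX has completed, any query compiled from now on
-- can be given a plan that uses the new index.
SELECT A
FROM   dbo.MyTable
WHERE  A = 42;
```

There is no need to empty and reload the table: the index covers all rows that existed when it was built and is maintained for every row added afterwards.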

Avoid applying a function to an index column

I need to filter out data that exceeds a certain length but the column that contains the data is an indexed column. If I apply a function to the column I lose the benefit of the index.
I cannot create a new index or alter the column as I am not an admin to the database.
I would prefer not to drop the data after the fact.
I know of a few ways to filter the column but all would use some kind of function.
select
table.name
from
table
where
length(table.name)>12
;
The field table.name is not nullable.
If I apply a function to the column I lose the benefit of the index.
Ah, but what is the benefit of an index?
Consider these two values:
AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA
ZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZ
Are they both longer than 12 characters? Yes. Are they likely to be adjacent in the index? Of course not. Therefore the only way for Oracle to use an index to find those values is to execute a Fast Full Scan over the index and evaluate the length of each entry. Now Oracle can do that, but is it worthwhile?
Your posted query is selecting just name. In a comment you say name is not nullable. In that case it would be efficient for Oracle to use the index, because there is no need to read the table records: the index has sufficient information to satisfy the query.
However.
In that comment you also say:
the query is not that simple
If your actual query includes other columns in the projection then the database does have to visit the table to get those values. At which point the rule of thumb for indexed reads kicks in: if the result set of the query is greater than 1-2% of all the rows in the table it's more efficient to do a Full Table Scan than use an index. So the number of records in the table becomes pertinent, and especially the proportion of records where length(name) > 12. If 99% of the records have short names then it is probably still more efficient to Fast Full Scan the index. But if it's only 90%, using the index would probably be deadly to performance.
Likewise, if your actual query applies additional criteria in the WHERE clause it may be more efficient to do a Full Table Scan (because the database needs to read the records to evaluate those filters) or to use a different index, if there is an appropriate one.
So, while the index would be useful for the toy query you posted in your question it may not help with your actual query, and indeed could lead to a sub-optimal access path.
is it a case by case situation depending on query complexity?
Yes. The answer is always, it depends. That's why database tuning professionals can charge the fat consultancy fees they do. If you don't provide the whole query the best we can do is point you at this post which explains how to ask performance tuning questions and wish you good luck.
If the column is NOT NULL, then Oracle can answer the query using a full index scan. It will need to read every entry in the index in order to find only those with a length greater than 12. If the index is smaller than the table this is faster than a full table scan.
You are only selecting the indexed column so Oracle would not need to visit the table but can get the result entirely from the index. If you were to select other columns that were not in that index, Oracle would also need to read the table row after locating the entry in the index.
There is no way around this without adding a more suitable index or otherwise changing the database schema.
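For illustration only, the two situations might look like this. The table name customers and both index names are invented for the sketch (the question uses a placeholder table name), and the second statement is something only a user with the right privileges could run.

```sql
-- Index-only access with the existing index: every name is present in the
-- index because the column is NOT NULL, so the table need not be read.
SELECT /*+ index_ffs(c customers_name_ix) */
       c.name
FROM   customers c
WHERE  LENGTH(c.name) > 12;

-- A "more suitable index" would be a function-based one. Adding name as a
-- second column lets the query above be answered from this index alone,
-- with a range scan on LENGTH(name) instead of reading every entry.
CREATE INDEX customers_name_len_ix
    ON customers (LENGTH(name), name);
```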

Why does selecting a field change the Index Scan Type?

As you can see from the picture:
Query1 and Query2 use the same tables and WHERE clauses. But when I add a field from the Address table, the plan goes from an Index Scan to a Table Scan. My question is why?
Note: I see recommended index, but I do not think I have the authority to change the database.
This is more than likely happening because StateProvCode is not a column in the PK of the Address table, nor is it an INCLUDE column. So SQL Server must be determining that it would be cheaper to simply scan the table, instead of scanning the PK and then doing additional lookups in the Address table to get the value of StateProvCode for each row. It's possible that your performance won't suffer all that much because scanning an index might only be a little faster than scanning the table (unless you have a filtered index in place). Of course, as you can see, you probably need to create an index to really improve performance.
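The kind of index that would avoid the extra lookups could look roughly like the one below. Only Address and StateProvCode come from the question; the key column AddressID, the dbo schema, and the index name are guesses, and the recommended index shown in the plan is the better guide to the real key columns.

```sql
-- Hypothetical covering index: the key column is whatever the query joins
-- or filters on, and StateProvCode is carried along as an INCLUDE column
-- so it can be read from the index without visiting the table.
CREATE NONCLUSTERED INDEX IX_Address_AddressID_StateProvCode
    ON dbo.Address (AddressID)
    INCLUDE (StateProvCode);
```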

Will I save any time on a INDEX that SELECTs only once?

On DBD::SQLite of SQLite3
If I am going to query a SELECT only once.
Should I CREATE a INDEX first and then query the SELECT
or
just query the SELECT without an INDEX,
which is faster ?
If it needs to be specified: the column to be indexed is an INTEGER that is either undef (NULL) or 1, just these two possibilities.
Building an index takes longer than just doing a table scan. So, if your single query — which you're only running once — is just a table scan, adding an index will be slower.
However, if your single query is not just a table scan, adding the index may be faster. For example, without an index, the database may perform a join as many table scans, once for each joined row. Then the index would probably be faster.
I'd say to benchmark it, but that sounds silly for a one-off query that you're only ever going to run once.
If you are considering an index on a column that has only two possible values, it's not worth the effort, as the index will give very little improvement. Indexes are useful on columns that have a high degree of uniqueness and are frequently queried for a certain value or range. On the other hand, indexes make inserting and updating slower, so in this case you should skip it.
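If you do want to see what SQLite would do in each case without timing anything, EXPLAIN QUERY PLAN shows the chosen access path. The table items and its columns are hypothetical stand-ins for the real schema.

```sql
-- Hypothetical table: flag is either NULL or 1.
CREATE TABLE items (id INTEGER PRIMARY KEY, flag INTEGER);

-- Without an index the plan is a single scan of the table.
EXPLAIN QUERY PLAN
SELECT count(*) FROM items WHERE flag = 1;

-- Building the index has to read the whole table once and sort the values,
-- which is already at least as much work as the one-off scan above.
CREATE INDEX items_flag_ix ON items (flag);

-- With the index in place the plan changes to a search on items_flag_ix.
EXPLAIN QUERY PLAN
SELECT count(*) FROM items WHERE flag = 1;
```

From DBD::SQLite these statements can be run through the usual prepare and execute calls, and the plan comes back as ordinary rows.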

Does an index on a unique field in a table allow a select count(*) to happen instantly? If not why not?

I know just enough about SQL tuning to get myself in trouble. Today I was doing EXPLAIN plan on a query and I noticed it was not using indexes when I thought it probably should. Well, I kept doing EXPLAIN on simpler and simpler (and more indexable in my mind) queries, until I did EXPLAIN on
select count(*) from table_name
I thought for sure this would return instantly and that the explain would show use of an index, as we have many indexes on this table, including an index on the row_id column, which is unique. Yet the explain plan showed a FULL table scan, and it took several seconds to complete. (We have 3 million rows in this table).
Why would Oracle be doing a full table scan to count the rows in this table? I would like to think that since Oracle is indexing unique fields already, and having to track every insert and update on that table, it would be caching the row count somewhere. Even if it's not, wouldn't it be faster to scan the entire index than to scan the entire table?
I have two theories. Theory one is that my mental picture of how indexes work is wrong. Theory two is that some setting or parameter somewhere in our Oracle setup is messing with Oracle's ability to optimize queries (we are on Oracle 9i). Can anyone enlighten me?
Oracle does not cache COUNT(*).
MySQL with MyISAM does (and can afford to), because MyISAM is non-transactional and the same COUNT(*) is visible to everyone.
Oracle is transactional, and a row deleted in another transaction that has not yet committed is still visible to your transaction.
Oracle has to scan it, see that it's deleted, visit the UNDO, make sure it's still in place from your transaction's point of view, and add it to the count.
Indexing a UNIQUE value differs from indexing a non-UNIQUE one only logically.
In fact, you can create a UNIQUE constraint over a column with a non-unique index defined, and the index will be used to enforce the constraint.
If a column is marked as non-NULL, then an INDEX FAST FULL SCAN over this column can be used for COUNT.
It's a special access method, used for cases when the index order is not important. It does not traverse the B-Tree, but instead just reads the pages sequentially.
Since an index has fewer pages than the table itself, the COUNT can be faster with an INDEX_FFS than with a FULL table scan.
It is certainly possible for Oracle to satisfy such a query with an index (specifically with an INDEX FAST FULL SCAN).
In order for the optimizer to choose that path, at least two things have to be true:
Oracle has to be certain that every row in the table is represented in the index -- basically, that there are no NULL entries that would be missing from the index. If you have a primary key this should be guaranteed.
Oracle has to calculate the cost of the index scan as lower than the cost of a table scan. I don't think it necessarily true to assume that an index scan is always cheaper.
Possibly, gathering statistics on the table would change the behavior.
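A sketch of both steps follows. It assumes the table lives in your own schema, that row_id is declared NOT NULL (or is the primary key), and that the unique index is called table_name_row_id_ux, which is a made-up name.

```sql
-- Refresh optimizer statistics for the table and, via cascade, its indexes.
BEGIN
  DBMS_STATS.GATHER_TABLE_STATS(
    ownname => USER,
    tabname => 'TABLE_NAME',
    cascade => TRUE);
END;
/

-- If the optimizer still prefers the table scan, the hint asks for the
-- index explicitly. It is ignored if the index could miss rows (NULLs).
SELECT /*+ index_ffs(t table_name_row_id_ux) */
       COUNT(*)
FROM   table_name t;
```

Comparing the explain plan before and after gathering statistics shows whether stale or missing statistics were the reason for the full table scan.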
Expanding a little on the "transactions" reason. When a database supports transactions, at any point in time there might be records in different states, even in a "deleted" state. If a transaction fails, the states are rolled back.
A full table scan is done so that the current "version" of each record can be accessed for that point in time.
MySQL MyISAM doesn't have this problem since it uses table locking, instead of the record locking required for transactions, and caches the record count, so the count is always returned instantly. InnoDB under MySQL works the same way as Oracle, and the row count it keeps in its table statistics is only an "estimate".
You may be able to get a quicker query by counting the distinct values of the primary key; then only the index would be accessed.