I have a table with several columns and a unique RAW column, and I created a unique index on the RAW column.
My query selects all columns from the table (6 million rows).
When I look at the cost of the query, it is too high (51K), and it is still using an INDEX FULL SCAN. The query does not have any filter conditions; it is a plain select * from.
Please suggest how I can tune this query.
Thanks in advance.
Why are you hinting it to use the index if you're retrieving all columns from all rows? The index would only help if you were filtering on the indexed column. If you were only retrieving the indexed column then an INDEX_FFS hint might help. But if you have to go back to the data for any non-indexed columns then using the index at all becomes counterproductive beyond a certain proportion of returned data as you're having to access both the index data blocks and the table data blocks repeatedly.
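To sketch the INDEX_FFS case - the column name below is an assumption, since the question doesn't name the RAW column:

select /*+ index_ffs(rawdata idx_test) */
       rawdata.raw_col   -- hypothetical: the single indexed RAW column
from   v_wis_cds_cp_rawdata_test rawdata;

Because the query projects only the indexed column, a fast full scan of idx_test can answer it without ever visiting the table.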
So, your query is:
select /*+ index(rawdata idx_test) */
       rawdata.*
from   v_wis_cds_cp_rawdata_test rawdata
and you want to know why Oracle is choosing an INDEX FULL scan?
Well, as Alex said, the reason is the "index(rawdata idx_test)" hint. This is a directive that tells the Oracle optimizer, "when you access rawdata, use an index access on the idx_test index", which means that's what Oracle will do if at all possible - even if that's not the best plan.
Hints don't make queries faster automatically. They are a way of telling the optimizer what not to do.
I've seen queries like this before - sometimes a hint like this is added in order to return the rows in sorted order without actually doing a sort. However, if this was the requirement, I'd strongly recommend adding an ORDER BY clause anyway, because if the hint becomes invalid for some reason (e.g. the index gets dropped or renamed), the sorting would no longer happen and no error would be reported.
If you don't need the rows returned in any particular order, I suggest you remove the hint and see if the performance improves.
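For illustration, a minimal rewrite along those lines - the ORDER BY column is an assumption, since the question doesn't name the RAW column:

select rawdata.*
from   v_wis_cds_cp_rawdata_test rawdata
order by rawdata.raw_col;   -- hypothetical column; drop this line if no ordering is needed

With the hint gone, the optimizer is free to choose a full table scan, which is almost always the cheapest way to return every row of a 6-million-row table.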
I need to filter out data that exceeds a certain length but the column that contains the data is an indexed column. If I apply a function to the column I lose the benefit of the index.
I cannot create a new index or alter the column as I am not an admin to the database.
I would prefer not to drop the data after the fact.
I know of a few ways to filter the column but all would use some kind of function.
select table.name
from table
where length(table.name) > 12;
The field table.name is not nullable.
If I apply a function to the column I lose the benefit of the index.
Ah, but what is the benefit of an index?
Consider these two values:
AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA
ZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZ
Are they both longer than 12 characters? Yes. Are they likely to be adjacent in the index? Of course not. Therefore the only way for Oracle to use an index to find those values is to execute a Fast Full Scan over the index and evaluate the length of each entry. Now Oracle can do that, but is it worthwhile?
Your posted query is selecting just name. In a comment you say name is not nullable. In that case it would be efficient for Oracle to use the index, because there is no need to read the table records: the index has sufficient information to satisfy the query.
However.
In that comment you also say:
the query is not that simple
If your actual query includes other columns in the projection then the database does have to visit the table to get those values. At which point the rule of thumb for indexed reads kicks in: if the result set of the query is greater than 1-2% of all the rows in the table, it's more efficient to do a Full Table Scan than to use an index. So the number of records in the table becomes pertinent, and especially the proportion of records where length(name) > 12. If 99% of the records have short names then it is probably still more efficient to Fast Full Scan the index. But if it's only 90%, using the index would probably be deadly to performance.
Likewise, if your actual query applies additional criteria in the WHERE clause it may be more efficient to do a Full Table Scan (because the database needs to read the records to evaluate those filters) or to use a different index, if there is an appropriate one.
So, while the index would be useful for the toy query you posted in your question it may not help with your actual query, and indeed could lead to a sub-optimal access path.
is it a case by case situation depending on query complexity?
Yes. The answer is always: it depends. That's why database tuning professionals can charge the fat consultancy fees they do. If you don't provide the whole query, the best we can do is point you at this post, which explains how to ask performance tuning questions, and wish you good luck.
If the column is NOT NULL, then Oracle can answer the query using a full index scan. It will need to read every row in the index in order to find only those rows with a length greater than 12. If the index is smaller than the table, this is faster than a full table scan.
You are only selecting the indexed column, so Oracle does not need to visit the table but can get the result entirely from the index. If you were to select other columns that were not in that index, Oracle would also need to read the table row, having first located it via the index.
There is no way around this without adding a more suitable index or otherwise changing the database schema.
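For reference, the "more suitable index" for this predicate would usually be a function-based index - a sketch, assuming you eventually get the privileges, with a placeholder table name:

create index name_len_idx on my_table (length(name));   -- hypothetical index and table names

select name
from   my_table
where  length(name) > 12;

Because the indexed expression matches the predicate exactly, the optimizer can range-scan the index instead of evaluating length() against every entry.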
I have the following query:
SELECT * FROM messages GROUP BY peer
(really it's more complicated with joins, but I omitted them here for simplicity)
The problem is that SQLite doesn't use any indexes and always performs a full scan of the table. Expectedly, it works fast on small data sets but it's noticeably slow with a big table containing thousands of rows. Here's the output of the EXPLAIN QUERY PLAN command:
0|0|0|SCAN TABLE messages USING INDEX messages_peer_mid (~1000000 rows)
Despite saying "USING INDEX", it still performs a full scan. Is there any way to make SQLite use an index for this query, or is it better to give up on GROUP BY and look for some other approach?
The plan takes into account the amount of data and performs a scan because its algorithm probably concludes it's faster to do so.
Other comments: your query has no WHERE condition and you are returning ALL columns, so why wouldn't you expect a table scan?
Indexes assist in selecting records from a table (via a WHERE clause or as the result of a JOIN operation). GROUP BY is performed on a set of records after they've been selected and retrieved from the table, so an index generally cannot speed it up beyond supplying the rows in grouped order.
If you want to know more about what options are available for index use in your query, please post the entire query.
Also, you note that the SQL you gave is a symbolic representation of the code you're running, but if you're really using *, or any non-aggregated field names other than peer, in your statement, you may not be getting the results you want: SQLite will return a value from an arbitrary row of each group for such columns.
Finally, you ask whether "it's better to give up with GROUP BY and look for some other approach". GROUP BY serves a specific function in SQL (producing new aggregated result sets from non-aggregated data). If that's your goal, GROUP BY is likely to be the best solution, because it defers the decision about how to retrieve and process the data to the database engine, which is highly optimized and cognizant of database statistics. If that's not your goal and you're using GROUP BY as an "approach" to some other functionality, let us know what it is you're actually trying to achieve.
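As a minimal sketch of that intended usage - the aggregates and the mid column are assumptions, the column inferred only from the index name messages_peer_mid:

CREATE INDEX IF NOT EXISTS idx_messages_peer ON messages(peer);   -- hypothetical; unnecessary if messages_peer_mid already leads with peer

SELECT peer, COUNT(*) AS message_count, MAX(mid) AS last_mid
FROM messages
GROUP BY peer;

Every selected column here is either the grouping key or an aggregate, so the result is well-defined, and SQLite can walk a peer-leading index to produce the groups without a separate sort.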
I am trying to write a simple query to count the results from a big table.
SELECT COUNT(*)
FROM DM.DM_CUSTOMER_SEG_BRIDGE_CORP_DW AL3
WHERE (AL3.REFERENCE_YEAR(+) =2012)
The above query is taking around 24 seconds to return output. If I remove the WHERE clause and execute the same query, it gives me the result in 2 seconds.
May I know the reason for that? I am relatively new to SQL queries.
Please help.
Thanks,
Naveen
You might need an index on the table. Typically you will need an index on any column used in the WHERE clause.
As for the (+) syntax, I think it is redundant here (I'm no Oracle expert), but see Difference between Oracle's plus (+) notation and ansi JOIN notation?
The reason may seem subtle. But there are multiple ways that Oracle could approach a query like this:
SELECT COUNT(*)
FROM DM.DM_CUSTOMER_SEG_BRIDGE_CORP_DW AL3
One way is to read all the rows in the table. Because this is a big table, that is not the most efficient approach. A second method would be to use statistics of some sort, where the number of rows is stored in the statistics. I don't think Oracle ever does this, but it is conceivable.
The final method is to read an index. Typically, an index would be much smaller than the table and it might already be in memory. The above query would be reading a much smaller amount of data. (Here is an interesting article on counting all the rows in a table.)
When you introduce the where clause,
WHERE (AL3.REFERENCE_YEAR(+) =2012)
Oracle can no longer scan just any index; it would have to scan an index on reference_year. What is the problem? If it scanned any other index, it would still need to fetch the data records to get the value of reference_year - and that is equivalent to (actually worse than) scanning the whole table.
Even with an index on reference_year, you are not guaranteed to use the index. The problem is something called selectivity. The number of rows that you are fetching may still be quite large relative to the number of rows in the table (in this context, 10% is "quite large"). The Oracle optimizer may choose to do a full table scan rather than read the index.
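A hedged sketch of such an index - the index name is invented, and whether it pays off depends on the selectivity point above:

CREATE INDEX idx_bridge_ref_year
    ON DM.DM_CUSTOMER_SEG_BRIDGE_CORP_DW (REFERENCE_YEAR);

SELECT COUNT(*)
FROM DM.DM_CUSTOMER_SEG_BRIDGE_CORP_DW AL3
WHERE AL3.REFERENCE_YEAR = 2012;

Since COUNT(*) with this WHERE clause needs no column other than REFERENCE_YEAR, Oracle can answer it from the index alone. (The (+) has been dropped here: it marks an outer join, which has no meaning in a single-table query.)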
I have a large SAS dataset sorted by field 'A'. I'd like to do a query that references fields 'A' and 'B'. To speed up performance I created an index on 'B'. This results in an unhelpful message:
INFO: Index B not used. Sorting into index order may help.
Of course sorting on B would help. But that's not the point. Indexes are for the case when you are already sorted on some other field.
In a similar query, SAS gives this message:
INFO: Use of index C for WHERE clause optimization canceled.
Any tips on getting SAS to use my indexes? In one case the query is taking 2 hours to run because SAS doesn't use the index.
If the query is not selective enough - taking most of the source records into the result - using the index may not help performance, and can eventually make things worse. That's probably why the optimizer decided not to use the index.
To force the use of an index, try the IDXNAME= data set option (on both tables, probably).
Refer to http://support.sas.com/documentation/cdl/en/lrdict/64316/HTML/default/viewer.htm#a000414058.htm.
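A minimal PROC SQL sketch of that option - the library, dataset, and column names are placeholders, not taken from the question:

proc sql;
  create table work.result as
  select a, b
  from mylib.big(idxname=b)   /* hypothetical: force the index named b for WHERE processing */
  where b = 'target value';
quit;

IDXNAME= overrides the optimizer's choice outright, so confirm the index really is faster for your data before leaving it in place.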
Without seeing the query and knowing some characteristics of data (at least record counts of input tables and expected size of the query result) it's hard to tell the optimal approach.
Anyway, for optimal performance when joining tables, both tables need to be indexed similarly and all the join keys need to be part of the index.
Can't answer a question like this without seeing the query you are trying to run. An index will only be useful if the SAS optimizer determines it will improve performance. Can you show a simple example of the code you want to run?
I found that a table has 50 thousand records and it takes one minute to fetch the data from the SQL Server table just by issuing a SQL statement. There is a primary key, which means a clustered index already exists. I just do not understand why it takes one minute. Besides an index, what other ways are there to optimize a table to get the data faster? What do I need to do in this situation for a faster response? Also, tell me how we can always write an optimized SQL statement. Please tell me all the steps for optimization in detail.
Thanks.
The fastest way to optimize indexes on a table is to use the SQL Server Tuning Advisor. Take a look here: http://www.youtube.com/watch?v=gjT8wL92mqE
Select only the columns you need, rather than select *. If your table has some large columns, e.g. OLE types or other binary data (maybe used for storing images etc.), then you may be transferring vastly more data off disk and over the network than you need.
As others have said, an index is no help to you when you are selecting all rows (no where clause). Using an index would be slower in such cases because of the index read and table lookup for each row, vs full table scan.
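To illustrate the first point against the employee table mentioned elsewhere in this thread - the column names are assumptions:

SELECT employee_id, first_name, last_name   -- hypothetical columns; list only what you need
FROM employee;

A narrow column list avoids shipping wide columns over the network and, for off-row binary data, avoids reading those pages at all.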
If you are running select * from employee (as per question comment) then no amount of indexing will help you. It's an "Every column for every row" query: there is no magic for this.
Adding a WHERE clause usually won't help a select * query either.
What you can check is index and statistics maintenance. Do you do any? Here's a Google search
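If no maintenance is happening, a hedged starting point might look like this - the table name comes from the question's comments, and the schedule is up to you:

ALTER INDEX ALL ON employee REBUILD;   -- rebuilds fragmented indexes on the table
UPDATE STATISTICS employee;            -- refreshes the statistics the optimizer relies on

Stale statistics and fragmented indexes won't change the "every column for every row" cost, but they can make an already-slow query worse.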
Or change how you use the data...
Edit:
Why a WHERE clause usually won't help...
If you add a WHERE that is not on the PK:
- you'll still need to scan the table unless you add an index on the searched column
- then you'll need a key/bookmark lookup unless you make the index covering
- with SELECT * you need to add all columns to the index to make it covering
- for many hits, the index will probably be ignored to avoid key/bookmark lookups
Unless there is a network issue or some such, the issue is reading all columns, not the lack of a WHERE clause.
If you did SELECT col13 FROM MyTable and had an index on col13, the index would probably be used.
For SELECT * FROM MyTable WHERE DateCol < '20090101' with an index on DateCol that matched 40% of the table, the index would probably be ignored or you'd pay for expensive key/bookmark lookups.
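A sketch of the covering-index idea from the list above - the index name and INCLUDE column are assumptions:

CREATE INDEX IX_MyTable_DateCol
    ON MyTable (DateCol)
    INCLUDE (col13);   -- covering for the narrow query below: no key/bookmark lookups

SELECT col13
FROM MyTable
WHERE DateCol < '20090101';

The same index would not make SELECT * covering; you'd have to INCLUDE every column, at which point you've more or less duplicated the table.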
Irrespective of the merits of returning the whole table to your application, that does sound like an unexpectedly long time to retrieve just 50,000 rows of employee data.
Does your query have an ORDER BY or is it literally just select * from employee?
What is the definition of the employee table? Does it contain any particularly wide columns? Are you storing binary data such as their CVs or employee photo in it?
How are you issuing the SQL and retrieving the results?
What isolation level are your select statements running at? (You can use SQL Profiler to check this.)
Are you encountering blocking? Does adding NOLOCK to the query speed things up dramatically?
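On that last point, a quick diagnostic only - NOLOCK reads uncommitted data, so it shouldn't stay in production code:

SELECT * FROM employee WITH (NOLOCK);   -- dirty read; for troubleshooting blocking only

If this runs dramatically faster, the original query was being blocked by concurrent writers rather than being slow in itself.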