Does indexes work with group function in oracle? - sql

I am running following query.
SELECT Table_1.Field_1,
Table_1.Field_2,
SUM(Table_1.Field_5) BALANCE_AMOUNT
FROM Table_1, Table_2
WHERE Table_1.Field_3 NOT IN (1, 3)
AND Table_2.Field_2 <> 2
AND Table_2.Field_3 = 'Y'
AND Table_1.Field_1 = Table_2.Field_1
AND Table_1.Field_4 = '31-oct-2011'
GROUP BY Table_1.Field_1, Table_1.Field_2;
I have created index for columns (Field_1,Field_2,Field_3,Field_4) of Table_1 but the index is not getting used.
If I remove the SUM(Table_1.Field_5) from select clause then index is getting used.
I am confused if optimizer is not using this index or its because of SUM() function I have used in query.
Please share your explaination on the same.

When you remove the SUM you also remove field_5 from the query. All the data needed to answer the query can then be found in the index, which may be quicker than scanning the table. If you added field_5 to the index the query with SUM might use the index.

If your query is returning the large percentage of table's rows, Oracle may decide that doing a full table scan is cheaper than "hopping" between the index and the table's heap (to get the values in Table_1.Field_5).
Try adding Table_1.Field_5 to the index (thus covering the whole query with the index) and see if this helps.
See the Index-Only Scan: Avoiding Table Access at Use The Index Luke for conceptual explanation of what is going on.

As you mentioned, the presence of the summation function results in the the Index being overlooked.
There are function based indexes:
A function-based index includes columns that are either transformed by a function, such as the UPPER function, or included in an expression, such as col1 + col2.
Defining a function-based index on the transformed column or expression allows that data to be returned using the index when that function or expression is used in a WHERE clause or an ORDER BY clause. Therefore, a function-based index can be beneficial when frequently-executed SQL statements include transformed columns, or columns in expressions, in a WHERE or ORDER BY clause.
However, as with all, function based indexes have their restrictions:
Expressions in a function-based index cannot contain any aggregate functions. The expressions must reference only columns in a row in the table.

Though I see some good answers here couple of important points are being missed -
SELECT Table_1.Field_1,
Table_1.Field_2,
SUM(Table_1.Field_5) BALANCE_AMOUNT
FROM Table_1, Table_2
WHERE Table_1.Field_3 NOT IN (1, 3)
AND Table_2.Field_2 <> 2
AND Table_2.Field_3 = 'Y'
AND Table_1.Field_1 = Table_2.Field_1
AND Table_1.Field_4 = '31-oct-2011'
GROUP BY Table_1.Field_1, Table_1.Field_2;
Saying that having SUM(Table_1.Field_5) in select clause causes index not to be used in not correct. Your index on (Field_1,Field_2,Field_3,Field_4) can still be used. But there are problems with your index and sql query.
Since your index is only on (Field_1,Field_2,Field_3,Field_4) even if your index gets used DB will have to access the actual table row to fetch Field_5 for applying filter. Now it completely depends on the execution plan charted out of sql optimizer which one is cost effective. If SQL optimizer figures out that full table scan has less cost than using index it will ignore the index. Saying so I will now tell you probable problems with your index -
As others have states you could simply add Field_5 to the index so that there is no need for separate table access.
Your order of index matters very much for performance. For eg. in your case if you give order as (Field_4,Field_1,Field_2,Field_3) then it will be quicker since you have equality on Field_4 -Table_1.Field_4 = '31-oct-2011'. Think of it this was -
Table_1.Field_4 = '31-oct-2011' will give you less options to choose final result from then Table_1.Field_3 NOT IN (1, 3). Things might change since you are doing a join. It's always best to see the execution plan and design your index/sql accordingly.

Related

Performance Tuning SQL

I have the following sql. When I check the execution plan of this query we observe an index scan. How do I replace it with index seek. I have non clustered index on IdDeleted column.
SELECT Cast(Format(Sum(COALESCE(InstalledSubtotal, 0)), 'F') AS MONEY) AS TotalSoldNet,
BP.BoundProjectId AS ProjectId
FROM BoundProducts BP
WHERE BP.IsDeleted=0 or BP.IsDeleted is null
GROUP BY BP.BoundProjectId
I tried like this and got index seek, but the result was wrong.
SELECT Cast(Format(Sum(COALESCE(InstalledSubtotal, 0)), 'F') AS MONEY) AS TotalSoldNet,
BP.BoundProjectId AS ProjectId
FROM BoundProducts BP
WHERE BP.IsDeleted=0
GROUP BY BP.BoundProjectId
Can anyone kindly suggest me to get the right result set with index seek.
I mean how to I replace (BP.IsDeleted=0 or BP.IsDeleted is null) condition to make use of index seek.
Edit, added row counts from comments of one of the answers:
null: 254962 rows
0: 392002 rows
1: 50211 rows
You're not getting an index seek because you're fetching almost 93% of the rows in the table and in that kind of scenario, just scanning the whole index is faster and cheaper to do.
If you have performance issues, you should look into removing format() -function, especially if the query returns a lot of rows. Read more from this blog post
Other option might be to create an indexed view and pre-aggregate your data. This of course adds an overhead to update / insert operations, so consider that only if this is done really often vs. how often the table is updated.
have you tried a UNION ALL?
select ...
Where isdeleted = 0
UNION ALL
select ...
Where isdeleted is null
Or you can add an Query hint (index= [indexname])
also be aware that the cardinality will determine if SQL uses a seek or a scan - seeks are quicker but can require key lookups if the index doesnt cover all the columns required. A good rule of thumb is that if the query will return more than 5% of the table then SQL will prefer a scan.

In Oracle, if I make a composite index on 2 columns, then in which situation this index will be used to search the record?

In Oracle, if I make a composite index on 2 columns, then in which situation this index will be used to search the record ?
a) If my query has a WHERE clause which involves first column
e.g. WHERE first_column = 'John'
b) If my query has a WHERE clause which involves second column
e.g. WHERE second_column = 'Sharma'
c) Either a or b
d) Both a and b
e) Not specifically these 2 columns but it could be any column in the WHERE clause.
f) Only column a or both columns a and b
I happen to think that MySQL does a pretty good job of describing how composite indexes are used. The documentation is here.
The basic idea is that the index would normally be used in the following circumstances:
When the where condition is an equality on col1 (col1 = value).
When the where condition is an inequality or in on col1 (col1 in (list), col1 < value)
When the where condition is an equality on col1 and col2, connected by an and (col1 = val1 and col2 = val2)
When the where condition is an equality on col1 and an inequality or in on col2.
Any of the above four cases where additional columns are used with additional conditions on other columns, connected by an and.
In addition, the index would normally be used if col1 and col2 are the only columns referenced in the query. This is called a covering index, and -- assuming there are other columns in the table -- it is faster to read the index than the original table because the index is smaller.
Oracle has a pretty smart optimizer, so it might also use the index in some related circumstances, for instance when col1 uses an in condition along with a condition on col2.
In general, a condition will not qualify for an index if the column is an argument to a function. So, these clauses would not use a basic index:
where month(col1) = 3
where trunc(col1) = trunc(sysdate)
where abs(col1) < 1
Oracle supports functional indexes, so if these constructs are actually important, you can create an index on month(col1), trunc(col1), or abs(col1).
Also, or tends to make the use of indexes less likely.
d) Both a or b
If the leading column is used, Oracle will likely use a regular index range scan and just ignore the unused columns.
If a non-leading column is used, Oracle can use an index skip scan. In practice a skip scan is not used very often.
There are two completely different questions here: when can Oracle use an index and when will Oracle use an index. The above explains that Oracle can use an index in either case, and you can test that out with a hint: /*+ index(table_name index_name) */.
Determining when Oracle will use an index is much trickier. Oracle uses multi-block reads for full table scans and fast full index scans, and uses single-block reads for other index scans. This means a full table scan is more efficient when reading a larger percent of the data. But there are a lot of factors involved: the percentage of data, how big is the index, system statistics that tell Oracle how fast single- and multi-block IO are, the number of distinct values (especially important for choosing a skip scan), index clustering factor (how ordered is the table by the index columns), etc.
The optimizer will use indexes in several scenarios. Even if not "perfect".
Optimaly, if you are querying using the first columns in the index, then the index will be used. Even if you're referencing only the first column, then it will still use the index if the optimizer deems it filters out enough data.
If the indexed columns aren't answering the query requirement (for instance only referencing the second column in the where clause), the optimizer could still use the index for a full (table) index scan, if it holds all of the data required, because the index is smaller than the full table.
In your example, if you are only querying from that table, and you only have that one index, (a) will use the index, (b) will use it if you are only querying columns in the index, while the table itself has more.
If you have other indexes, or join other tables, then that could affect the explain plan compeltely.
Check out http://docs.oracle.com/cd/B19306_01/server.102/b14231/indexes.htm

sql server query performance where clause

I got a table like with 10 000 lines.
declare #a table
(
id bigint not null,
nm varchar(100) not null,
filter bigint
primary key (id)
)
A select, with 4-5 join, is taking x seconds. If a where clause is added, it's now taking 3x seconds. The where clause:
filter = #filder or
filter is null
I applied a nonclustered index on the column, but I'm getting only 10% on perfomance.
Any tips?
edit: the perfomance issue happens when the filter column is added. all joins are on primary keys.
I have a few thoughts on this:
Chances are that your joins are joining on table.id - which is a primary key and has an index - bingo - high selectivity (because the values are unique). With it being indexed the optimizer can really optimize access to this table when it is used in joins.
I'm not 100% sure but - either you do not have an index on filter or it is not selective enough. If you do not have an index - the optimizer will use a table scan. If you do have an index, but it is not selective enough, it will use a table scan anyways. Scans are expensive.
Even if you do have an index on filter The optimizer does not like OR predicates. Basically when using an OR the optimizer might end up using an index scan instead of an index seek. Try using this instead: #filder = ISNULL(Filter, #filder as #sut13 suggested.
So to improve the performance: add an index on filter if you do not have one and adjust your where clause to not use OR as I have suggested.
Also:
You shouldn't expect the query with the where filter to perform equal to or better than the query with 4-5 joins. If the query with the joins is more selective and makes better use of indexes it is going to perform better
It's likely that the lack of index (based on what you described on the structure) on the filter column is leading to a table scan. The only way to be sure is to take a look at the execution plan for the query. That will tell you what the optimizer is doing with the query and usually give you enough information so you can understand why it's doing that and what you need to do to fix it.
Probably, you need an index on the filter column. But, with the 'OR filter IS NULL' might lead to a scan anyway, depending on how many null values are in the data.
If you use the ISNULL as outlined, unfortunately, that's a function on the column and will probably (depending on the indexes used and other columns in the WHERE clause that can be used to initially filter the data, etc.) result in a scan and not a seek.

Decision when to create Index on table column in database?

I am not db guy. But I need to create tables and do CRUD operations on them. I get confused should I create the index on all columns by default
or not? Here is my understanding which I consider while creating index.
Index basically contains the memory location range ( starting memory location where first value is stored to end memory location where last value is
stored). So when we insert any value in table index for column needs to be updated as it has got one more value but update of column
value wont have any impact on index value. Right? So bottom line is when my column is used in join between two tables we should consider
creating index on column used in join but all other columns can be skipped because if we create index on them it will involve extra cost of
updating index value when new value is inserted in column.Right?
Consider this scenario where table mytable contains two three columns i.e col1,col2,col3. Now we fire this query
select col1,col2 from mytable
Now there are two cases here. In first case we create the index on col1 and col2. In second case we don't create any index.** As per my understanding
case 1 will be faster than case2 because in case 1 we oracle can quickly find column memory location. So here I have not used any join columns but
still index is helping here. So should I consider creating index here or not?**
What if in the same scenario above if we fire
select * from mytable
instead of
select col1,col2 from mytable
Will index help here?
Don't create Indexes in every column! It will slow things down on insert/delete/update operations.
As a simple reminder, you can create an index in columns that are common in WHERE, ORDER BY and GROUP BY clauses. You may consider adding an index in colums that are used to relate other tables (through a JOIN, for example)
Example:
SELECT col1,col2,col3 FROM my_table WHERE col2=1
Here, creating an index on col2 would help this query a lot.
Also, consider index selectivity. Simply put, create index on values that has a "big domain", i.e. Ids, names, etc. Don't create them on Male/Female columns.
but update of column value wont have any impact on index value. Right?
No. Updating an indexed column will have an impact. The Oracle 11g performance manual states that:
UPDATE statements that modify indexed columns and INSERT and DELETE
statements that modify indexed tables take longer than if there were
no index. Such SQL statements must modify data in indexes and data in
tables. They also create additional undo and redo.
So bottom line is when my column is used in join between two tables we should consider creating index on column used in join but all other columns can be skipped because if we create index on them it will involve extra cost of updating index value when new value is inserted in column. Right?
Not just Inserts but any other Data Manipulation Language statement.
Consider this scenario . . . Will index help here?
With regards to this last paragraph, why not build some test cases with representative data volumes so that you prove or disprove your assumptions about which columns you should index?
In the specific scenario you give, there is no WHERE clause, so a table scan is going to be used or the index scan will be used, but you're only dropping one column, so the performance might not be that different. In the second scenario, the index shouldn't be used, since it isn't covering and there is no WHERE clause. If there were a WHERE clause, the index could allow the filtering to reduce the number of rows which need to be looked up to get the missing column.
Oracle has a number of different tables, including heap or index organized tables.
If an index is covering, it is more likely to be used, especially when selective. But note that an index organized table is not better than a covering index on a heap when there are constraints in the WHERE clause and far fewer columns in the covering index than in the base table.
Creating indexes with more columns than are actually used only helps if they are more likely to make the index covering, but adding all the columns would be similar to an index organized table. Note that Oracle does not have the equivalent of SQL Server's INCLUDE (COLUMN) which can be used to make indexes more covering (it's effectively making an additional clustered index of only a subset of the columns - useful if you want an index to be unique but also add some data which you don't want to be considered in the uniqueness but helps to make it covering for more queries)
You need to look at your plans and then determine if indexes will help things. And then look at the plans afterwards to see if they made a difference.

Creating Indexes for Group By Fields?

Do you need to create an index for fields of group by fields in an Oracle database?
For example:
select *
from some_table
where field_one is not null and field_two = ?
group by field_three, field_four, field_five
I was testing the indexes I created for the above and the only relevant index for this query is an index created for field_two. Other single-field or composite indexes created on any of the other fields will not be used for the above query. Does this sound correct?
It could be correct, but that would depend on how much data you have. Typically I would create an index for the columns I was using in a GROUP BY, but in your case the optimizer may have decided that after using the field_two index that there wouldn't be enough data returned to justify using the other index for the GROUP BY.
No, this can be incorrect.
If you have a large table, Oracle can prefer deriving the fields from the indexes rather than from the table, even there is no single index that covers all values.
In the latest article in my blog:
NOT IN vs. NOT EXISTS vs. LEFT JOIN / IS NULL: Oracle
, there is a query in which Oracle does not use full table scan but rather joins two indexes to get the column values:
SELECT l.id, l.value
FROM t_left l
WHERE NOT EXISTS
(
SELECT value
FROM t_right r
WHERE r.value = l.value
)
The plan is:
SELECT STATEMENT
HASH JOIN ANTI
VIEW , 20090917_anti.index$_join$_001
HASH JOIN
INDEX FAST FULL SCAN, 20090917_anti.PK_LEFT_ID
INDEX FAST FULL SCAN, 20090917_anti.IX_LEFT_VALUE
INDEX FAST FULL SCAN, 20090917_anti.IX_RIGHT_VALUE
As you can see, there is no TABLE SCAN on t_left here.
Instead, Oracle takes the indexes on id and value, joins them on rowid and gets the (id, value) pairs from the join result.
Now, to your query:
SELECT *
FROM some_table
WHERE field_one is not null and field_two = ?
GROUP BY
field_three, field_four, field_five
First, it will not compile, since you are selecting * from a table with a GROUP BY clause.
You need to replace * with expressions based on the grouping columns and aggregates of the non-grouping columns.
You will most probably benefit from the following index:
CREATE INDEX ix_sometable_23451 ON some_table (field_two, field_three, field_four, field_five, field_one)
, since it will contain everything for both filtering on field_two, sorting on field_three, field_four, field_five (useful for GROUP BY) and making sure that field_one is NOT NULL.
Do you need to create an index for fields of group by fields in an Oracle database?
No. You don't need to, in the sense that a query will run irrespective of whether any indexes exist or not. Indexes are provided to improve query performance.
It can, however, help; but I'd hesitate to add an index just to help one query, without thinking about the possible impact of the new index on the database.
...the only relevant index for this query is an index created for field_two. Other single-field or composite indexes created on any of the other fields will not be used for the above query. Does this sound correct?
Not always. Often a GROUP BY will require Oracle to perform a sort (but not always); and you can eliminate the sort operation by providing a suitable index on the column(s) to be sorted.
Whether you actually need to worry about the GROUP BY performance, however, is an important question for you to think about.