Query regarding Oracle index - sql

I have two tables in an Oracle DB and I use them in queries like the one below. One table has an index and the other doesn't.
select * from (
select * from table_with_an_index
union all
select * from table_without_an_index
) first_table
join second_table
on first_table.index_col = second_table.col
My question is: in the above query, will the index on the first table be used? Or will Oracle first load the records from both tables into memory and then apply the filter without using that index?
I searched the internet for this and could not find a clear answer. Any pointers would be appreciated.

In this case the CBO (cost-based optimizer) is likely to do two full scans, then the union, then a hash join.
If the second table is small, contains few values, and the join touches only a small percentage of table_with_an_index, then the CBO will probably push the join predicate into the union and do an index access on the indexed table, a full scan on the other table, and then nested loops.
Index access is not always the fastest option.
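Which of the two plans the optimizer actually picks can be checked with EXPLAIN PLAN. The following is a minimal sketch that reuses the table and column names from the question; the plan you get depends on your data volumes and statistics, so treat it as a way to look, not as a prediction.

```sql
-- Ask the optimizer for its plan without running the query.
EXPLAIN PLAN FOR
SELECT *
FROM (
  SELECT * FROM table_with_an_index
  UNION ALL
  SELECT * FROM table_without_an_index
) first_table
JOIN second_table
  ON first_table.index_col = second_table.col;

-- Show the plan that was just stored in PLAN_TABLE.
SELECT * FROM TABLE(DBMS_XPLAN.DISPLAY);
```

An INDEX RANGE SCAN (or similar index operation) under the UNION ALL branch for table_with_an_index would indicate that the join predicate was pushed down; TABLE ACCESS FULL on both branches followed by a HASH JOIN would indicate the first scenario described above.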

Related

How to index the attributes of the associated table in Oracle?

To optimize the following query, add an index:
SELECT SUPPLIER.COMPANY_NAME, SUPPLIER.CITY
FROM PRODUCT JOIN SUPPLIER
ON PRODUCT.SUPPLIER_NAME = SUPPLIER.COMPANY_NAME;
The statement I wrote is as follows:
EXPLAIN PLAN FOR SELECT PRODUCT.SUPPLIER_NAME, SUPPLIER.COMPANY_NAME FROM PRODUCT,SUPPLIER;
SELECT * FROM TABLE(DBMS_XPLAN.DISPLAY);
CREATE INDEX PS_IDX_SC ON PRODUCT,SUPPLIER(PRODUCT.) ;
EXPLAIN PLAN FOR SELECT PRODUCT.SUPPLIER_NAME, SUPPLIER.COMPANY_NAME FROM PRODUCT JOIN SUPPLIER;
SELECT * FROM TABLE(DBMS_XPLAN.DISPLAY);
DROP INDEX PS_IDX_SC;
How should I write the CREATE INDEX statement? Thanks.
You cannot create an index on two tables.
You need to create two separate indexes, as follows:
CREATE INDEX PS_IDX_PS ON PRODUCT(SUPPLIER_NAME) ;
CREATE INDEX PS_IDX_SC ON SUPPLIER(COMPANY_NAME) ;
Let me try to answer your question in a different way, with a short overview of what indexes are for and why they are sometimes not the answer. You are joining two tables on a condition, but without any filter. When you analyse a performance issue and think an index is the answer, it pays to think a bit further.
In your specific case the join has no filter, and your query returns only two columns: supplier_name from the product table and company_name from the supplier table. But what is the join condition here? I guess that company_name and supplier_name hold the same values, in which case it makes little sense to retrieve the same column from both tables.
Original query
SQL> SELECT PRODUCT.SUPPLIER_NAME, SUPPLIER.COMPANY_NAME FROM PRODUCT JOIN SUPPLIER;
Rewrite query
SQL> SELECT PRODUCT.SUPPLIER_NAME, SUPPLIER.COMPANY_NAME FROM PRODUCT JOIN SUPPLIER
on PRODUCT.SUPPLIER_NAME = SUPPLIER.COMPANY_NAME;
Always try to write the join condition; it makes the query more readable. In your case you could create an index on each of the two tables, as @Tejash has already shown, but let me explain a bit more.
If your SQL query only retrieves columns present in the index, Oracle will probably use the indexes to access the data. In that case, access by index is faster than access by table because the indexes are smaller than the tables.
However, if your SQL query retrieves more columns than the ones contained in the indexes (for example, product_name), it would be interesting to see whether the indexes make the query faster when there is no filter. In this case Oracle would probably use an operation called TABLE ACCESS BY INDEX ROWID: it reads the index to retrieve the rowid, then goes to the table to fetch the data using that rowid. When more columns are involved and the tables are big enough, I bet a full table scan is faster than access by index.
My advice: gather statistics on both tables using DBMS_STATS. And if you have Oracle 11g or higher, which you most probably do, you might want to use invisible indexes to check the performance of those queries with the indexes in place without affecting your environment; once you are sure, you can make them visible.
SQL> CREATE INDEX IDX_PRO_SUP ON PRODUCT(SUPPLIER_NAME) INVISIBLE;
SQL> CREATE INDEX IDX_SUP_COM ON SUPPLIER(COMPANY_NAME) INVISIBLE;
To see how the indexes affect the explain plan in your own session:
SQL> ALTER SESSION SET OPTIMIZER_USE_INVISIBLE_INDEXES=TRUE;
SQL> EXPLAIN PLAN FOR SELECT PRODUCT.SUPPLIER_NAME, SUPPLIER.COMPANY_NAME FROM
PRODUCT JOIN SUPPLIER ON PRODUCT.SUPPLIER_NAME = SUPPLIER.COMPANY_NAME;
SQL> SELECT * FROM TABLE(DBMS_XPLAN.DISPLAY);
Then when you are sure those indexes work as you expect:
SQL> ALTER INDEX IDX_PRO_SUP VISIBLE;
SQL> ALTER INDEX IDX_SUP_COM VISIBLE;
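For the statistics step mentioned above, a minimal sketch could look like this. It assumes both tables live in the schema of the current user (hence ownname => USER); adjust the owner if that is not the case.

```sql
BEGIN
  -- cascade => TRUE also gathers statistics on the indexes of each table.
  DBMS_STATS.GATHER_TABLE_STATS(ownname => USER, tabname => 'PRODUCT',  cascade => TRUE);
  DBMS_STATS.GATHER_TABLE_STATS(ownname => USER, tabname => 'SUPPLIER', cascade => TRUE);
END;
/
```

Running this before comparing plans helps ensure the optimizer's cost estimates reflect the current data rather than stale or missing statistics.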
Hope it helps.
Best regards

UNION ALL from each row of a results set

I have a view that runs quickly if I feed it a single parameter, i.e.:
SELECT * FROM v_myView WHERE myVal = 'thisValue';
If I ask it for multiple values IN, it evaluates the entire view's data, and picks the results from the data, in an operation taking around 20 seconds. So this is slow:
SELECT * FROM v_myView WHERE myVal IN (SELECT theseValues FROM myTable);
I have it in mind that, for a dataset I know is small, it would be quicker to take all the results from SELECT theseValues FROM myTable, query their matches individually from v_myView, and UNION ALL the results, so that I'm effectively generating the query:
SELECT * FROM v_myView WHERE myVal = 'thisValue1'
UNION ALL
SELECT * FROM v_myView WHERE myVal = 'thisValue2'
UNION ALL
etc...;
Is there any way to force this to happen in a "simple" query without using a stored procedure or dynamic sql, or am I just going to have to do this long-hand?
Seeing the execution plan and the structure of the underlying table and view would make it possible to answer the question better.
EXISTS usually gives the best performance of the three commonly used ways to achieve this:
EXISTS
INNER JOIN
IN
However, the comparison depends on the structure and indexing of the underlying tables.
EXISTS in effect short-circuits the search as soon as it finds a match, which can make it better than the other two approaches.
Query:
SELECT *
FROM v_myView V
WHERE EXISTS (
SELECT 1
FROM myTable T
WHERE T.theseValues = V.myVal
);
But the query also needs supporting indexes to perform well:
The myVal column of the table underlying the v_myView view should have a nonclustered rowstore index (unless myVal is already the clustered index key of that table).
The theseValues column of myTable should have a nonclustered rowstore index (unless theseValues is already the clustered index key of the table).
Do you need to fetch all the columns of the view v_myView in your final result? I would suggest fetching only the required columns. The selected columns should be covered either by a nonclustered rowstore index with an INCLUDE clause or by a single nonclustered columnstore index per underlying table that covers those columns.
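The "one lookup per value" idea from the question can also be written as a lateral join. This is only a sketch: CROSS APPLY is available in SQL Server and in Oracle 12c or later, and nothing guarantees that the optimizer will really probe the view once per row, so check the execution plan.

```sql
-- For each distinct value in myTable, fetch the matching rows from the view.
-- DISTINCT keeps the result equivalent to the IN (...) version when
-- myTable contains duplicate values.
SELECT v.*
FROM (SELECT DISTINCT theseValues FROM myTable) t
CROSS APPLY (
  SELECT *
  FROM v_myView mv
  WHERE mv.myVal = t.theseValues
) v;
```

Compared with writing the UNION ALL branches by hand, this keeps the list of values in the table and needs no dynamic SQL.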

Avoid full table scan

I have an SQL select query to be tuned. The FROM clause of the query uses a view built on four tables. When the query is executed, a full table scan takes place on all four tables, which causes CPU spikes. The four tables have valid indexes built on them.
The query looks similar to this:
SELECT DISTINCT ID, TITLE,......
FROM FINDSCHEDULEDTESTCASE
WHERE STEP_PASS_INDEX = 1 AND LOWER(COMPAREANAME) ='abc' ORDER BY ID;
The dots indicate that there are many more columns. Here FINDSCHEDULEDTESTCASE is a view on four tables.
Can someone guide me on how to avoid the full table scans on those four tables?
In any case, with your condition
AND LOWER(COMPAREANAME) ='abc'
you will get a full scan of the COMPAREANAME values, because the LOWER function must be calculated for each value.
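If the database is Oracle (the question does not say, so this is an assumption), one common way around this is an index on the expression itself. The table name scheduled_test_case below is hypothetical; it stands for whichever base table of the view actually holds COMPAREANAME.

```sql
-- Function-based index: stores LOWER(COMPAREANAME) so the predicate
-- LOWER(COMPAREANAME) = 'abc' can be answered from the index.
-- scheduled_test_case is a hypothetical base-table name.
CREATE INDEX ix_stc_lower_name
  ON scheduled_test_case (LOWER(COMPAREANAME));
```

SQL Server does not accept an expression in CREATE INDEX; there the usual equivalent is an indexed computed column.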
It depends on so many things!
SELECT DISTINCT ID, TITLE, ......
Depending on how many columns you SELECT, it is possible that SQL Server decides to do a table scan instead of using your indexes.
Also, depending on your WHERE conditions, SQL Server can decide to do a table scan instead of using your indexes.
Which version of SQL Server are you using?
There can be ways to improve the indexes on the tables if, for example, the conditions in the WHERE clause match less than 50% of the rows and you are using SQL Server 2008 (with filtered indexes: http://msdn.microsoft.com/en-us/library/ms188783.aspx ).
Or you can create indexes on views (http://msdn.microsoft.com/en-us/library/ms191432.aspx ).
There really are not enough details in your question to be able to help you.

mysql: which queries can utilize which indexes?

I'm using MySQL 5.0 and am a bit new to indexes. Which of the following queries can be helped by indexing, and which indexes should I create?
(Don't assume either table has unique values. This isn't homework; it's just some examples I made up to try to get my head around indexing.)
Query1:
Select a.*, b.*
From a
Left Join b on b.type=a.type;
Query2:
Select a.*, b.*
From a,b
Where a.type=b.type;
Query3:
Select a.*
From a
Where a.type in (Select b.type from b where b.brand=5);
Here is my guess for what indexes would be used for these different kinds of queries:
Query1:
Create Index Query1 Using Hash on b (type);
Query2:
Create Index Query2a Using Hash on a (type);
Create Index Query2b Using Hash on b (type);
Query3:
Create Index Query3 Using Hash on b (brand,type);
Am I correct that neither Query1 nor Query3 would utilize any indexes on table a?
I believe these should all be hash indexes because there is only = or !=, right?
Thanks
using the explain command in mysql will give a lot of great info on what mysql is doing and how a query can be optimized.
in q1 and q2: an index on (a.type, all other a cols) and one on (b.type, all other b cols)
in q3: an index on (a.b_type, all other a cols) and one on b (brand, type)
ideally, you'd want all the columns that were selected stored directly in the index so that mysql doesn't have to jump from the index back to the table data to fetch the selected columns. however, that is not always manageable (i.e.: sometimes you need to select * and indexing all columns is too costly), in which case indexing just the search columns is fine.
so everything you said works great.
query 3 is invalid, but i assume you meant
where a.type in ....
Query 1 is nearly the same as Query 2 with better syntax (Query 1 is a LEFT JOIN, so it also keeps rows of a with no match in b); both probably have similar query plans and both can use the indexes.
Query 3 will use the index on b.brand, but not the type portion of it. It would also use an index on a.type if you had one.
You are right that they should be hash indexes.
Query 3 could utilize an index on a.type if the number of b's with brand=5 is close to zero
Query2 will utilize indices if they are B-trees (and thus are sorted). Using hash indices with index-join may slow down your query (because you'll have to read Size(a) values in non-sequential way)
Query optimization and indexing is a huge topic, so you'll definitely want to read about MySQL and the specific storage engines you're using. USING HASH is honored by the MEMORY and NDB engines; InnoDB and MyISAM accept the syntax but build a B-tree index instead.
The joins you have will perform a full table or index scan even though the join condition is an equality; every row will have to be read because there's no WHERE clause.
You'll probably be better off with a standard B-tree index, but measure it and investigate the query plan with EXPLAIN. MySQL InnoDB stores row data organized by primary key, so you should also have a primary key on your tables, not just an index. It's best if you can use the primary key in your joins, because otherwise MySQL retrieves the primary key from the index and then does another fetch to get the row. The nice exception to that rule is when your secondary index includes all the columns you need in the query. That's called a covering index, and MySQL will not have to look up the row at all.
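As a rough illustration of the covering-index point, here is a sketch for Query3 using the tables a and b from the question. The index names are made up for the example, and the indexes are default B-tree indexes rather than hash indexes.

```sql
-- Index for the outer lookup on a.type.
CREATE INDEX idx_a_type ON a (type);

-- (brand, type) covers the subquery: it filters on brand and returns type,
-- so MySQL can answer it from the index without reading rows of b.
CREATE INDEX idx_b_brand_type ON b (brand, type);

-- Inspect how MySQL plans the query.
EXPLAIN
SELECT a.*
FROM a
WHERE a.type IN (SELECT b.type FROM b WHERE b.brand = 5);
```

In the EXPLAIN output, "Using index" in the Extra column for table b is the sign that the index is being used as a covering index.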

Creating Indexes for Group By Fields?

Do you need to create an index for fields of group by fields in an Oracle database?
For example:
select *
from some_table
where field_one is not null and field_two = ?
group by field_three, field_four, field_five
I was testing the indexes I created for the above and the only relevant index for this query is an index created for field_two. Other single-field or composite indexes created on any of the other fields will not be used for the above query. Does this sound correct?
It could be correct, but that would depend on how much data you have. Typically I would create an index for the columns I was using in a GROUP BY, but in your case the optimizer may have decided that, after using the field_two index, there wouldn't be enough data returned to justify using the other index for the GROUP BY.
No, this can be incorrect.
If you have a large table, Oracle can prefer deriving the fields from the indexes rather than from the table, even if there is no single index that covers all values.
In the latest article in my blog, NOT IN vs. NOT EXISTS vs. LEFT JOIN / IS NULL: Oracle, there is a query in which Oracle does not use a full table scan but rather joins two indexes to get the column values:
SELECT l.id, l.value
FROM t_left l
WHERE NOT EXISTS
(
SELECT value
FROM t_right r
WHERE r.value = l.value
)
The plan is:
SELECT STATEMENT
 HASH JOIN ANTI
  VIEW , 20090917_anti.index$_join$_001
   HASH JOIN
    INDEX FAST FULL SCAN, 20090917_anti.PK_LEFT_ID
    INDEX FAST FULL SCAN, 20090917_anti.IX_LEFT_VALUE
  INDEX FAST FULL SCAN, 20090917_anti.IX_RIGHT_VALUE
As you can see, there is no TABLE SCAN on t_left here.
Instead, Oracle takes the indexes on id and value, joins them on rowid and gets the (id, value) pairs from the join result.
Now, to your query:
SELECT *
FROM some_table
WHERE field_one is not null and field_two = ?
GROUP BY
field_three, field_four, field_five
First, it will not compile, since you are selecting * from a table with a GROUP BY clause.
You need to replace * with expressions based on the grouping columns and aggregates of the non-grouping columns.
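A version of the query that would compile might look like the sketch below. The COUNT(*) aggregate and the bind variable name :field_two_value are only placeholders for whatever the real query needs.

```sql
-- Select only grouping columns and aggregates; SELECT * is not allowed here.
SELECT field_three, field_four, field_five, COUNT(*) AS row_count
FROM some_table
WHERE field_one IS NOT NULL
  AND field_two = :field_two_value
GROUP BY field_three, field_four, field_five;
```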
You will most probably benefit from the following index:
CREATE INDEX ix_sometable_23451 ON some_table (field_two, field_three, field_four, field_five, field_one)
This index contains everything the query needs: filtering on field_two, sorting on field_three, field_four, field_five (useful for GROUP BY), and checking that field_one is NOT NULL.
Do you need to create an index for fields of group by fields in an Oracle database?
No. You don't need to, in the sense that a query will run irrespective of whether any indexes exist or not. Indexes are provided to improve query performance.
It can, however, help; but I'd hesitate to add an index just to help one query, without thinking about the possible impact of the new index on the database.
...the only relevant index for this query is an index created for field_two. Other single-field or composite indexes created on any of the other fields will not be used for the above query. Does this sound correct?
Not always. Often a GROUP BY will require Oracle to perform a sort (but not always); and you can eliminate the sort operation by providing a suitable index on the column(s) to be sorted.
Whether you actually need to worry about the GROUP BY performance, however, is an important question for you to think about.