I need to increase performance of this query:
select t.*
,(select max(1)
from schema1.table_a t1
where 1=1
to_date(t.misdate, 'YYYYMMDD') between t1.startdateref and t1.enddateref
and sysdate between t1.startdatevalue and t1.enddatevalue
and t1.idpma = t.idpm)
from schema2.table_b t
Any Ideas?
Thanks
Well you don't have any filtering condition on table_b. This means the best plan includes a full table scan on table_b. This would be optimal.
Having said that, now you need to focus on table_a. That one should be accessed using index range scans on either:
idpma, then by startdateref.
or idpma, then by startdateref.
Yes, it's one or the other. For Oracle's cost-based optimizer (CBO) to pick the best plan, you'll need to add the following indexes:
create index ix1 on schema1.table_a (idpma, startdateref);
create index ix2 on schema1.table_a (idpma, startdatevalue);
Try with this ones and see how it works.
Related
I have some questions about index.
First, if I use a index column in WITH clause, dose this column still works as index column in main query?
For example,
WITH TEST AS (
SELECT EMP_ID
FROM EMP_MASTER
)
SELECT *
FROM TEST
WHERE EMP_ID >= '2000'
'EMP_ID' in 'EMP_MASTER' table is PK and index for EMP_MASTER consists of EMP_ID.
In this situation, Does 'Index Scan' happen in main query?
Second, if I join two tables and then use two index columns from each table in WHERE, does 'Index Scan' happen?
For example,
SELECT *
FROM A, B
WHERE A.COL1 = B.COL1 AND
A.COL1 > 200 AND
B.COL1 > 100
Index for table A consists of 'COL1' and index for table B consists of 'COL1'.
In this situation, does 'Index Scan' happen in each table before table join?
If you give me some proper advice, I really appreciate that.
First, SQL is a declarative language, not a procedural language. That is, a SQL query describes the result set, not the specific processing. The SQL engine uses the optimizer to determine the best execution plan to generate the result set.
Second, Oracle has a reasonable optimizer.
Hence, Oracle will probably use the indexes in these situations. However, you should look at the execution plan to see what Oracle really does.
First, if I use a index column in WITH clause, dose this column still works as index column in main query?
Yes. A CTE (the WITH part) is a query just like any other - and if a query references a physical table column used by an index then the engine will use the index if it thinks it's a good idea.
In this situation, Does 'Index Scan' happen in main query?
We can't tell from the limitated information you've provided. An engine will scan or seek an index based on its heuristics about the distribution of data in the index (e.g. STATISTICS objects) and other information it has, such as cached query execution plans.
In this situation, does 'Index Scan' happen in each table before table join?
As it's a range query, it probably would make sense for an engine to use an index scan rather than an index seek - but it also could do a table-scan and ignore the index if the index isn't selective and specific enough. Also factor in query flags to force reading non-committed data (e.g. for performance and to avoid locking).
In order to optimize the query of the following statement add an index:
SELECT SUPPLIER.COMPANY_NAME, SUPPLIER.CITY
FROM PRODUCT JOIN SUPPLIER
ON PRODUCT.SUPPLIER_NAME = SUPPLIER.COMPANY_NAME;
The statement I wrote is as follows:
EXPLAIN PLAN FOR SELECT PRODUCT.SUPPLIER_NAME, SUPPLIER.COMPANY_NAME FROM PRODUCT,SUPPLIER;
SELECT * FROM TABLE(DBMS_XPLAN.DISPLAY);
CREATE INDEX PS_IDX_SC ON PRODUCT,SUPPLIER(PRODUCT.) ;
EXPLAIN PLAN FOR SELECT PRODUCT.SUPPLIER_NAME, SUPPLIER.COMPANY_NAME FROM PRODUCT JOIN SUPPLIER;
SELECT * FROM TABLE(DBMS_XPLAN.DISPLAY);
DROP INDEX PS_IDX_SC;
How to write the statement on line 45,thanks.
You can not crete the index on two tables.
You need to create two separate index as follows:
CREATE INDEX PS_IDX_PS ON PRODUCT(SUPPLIER_NAME) ;
CREATE INDEX PS_IDX_SC ON SUPPLIER(COMPANY_NAME) ;
Let me try to answer your question in a different way, trying to give you a short overview of what indexes are for, and that sometimes they are not the answer. You are joining two tables based on a condition, but without filtering. When you need to analyse a performance issue, and you think an index is the answer, try to think a bit more.
In your specific case, the join has no filter, so you show the supplier name and company name. But your query shows two columns only: supplier_name from the product table, and company_name from the supplier table. However, what is the join condition here ? I guess that company_name and supplier_name are the same, however it does not make any sense to retrieve the same column from both tables, if you ask me.
Original query
SQL> SELECT PRODUCT.SUPPLIER_NAME, SUPPLIER.COMPANY_NAME FROM PRODUCT JOIN SUPPLIER;
Rewrite query
SQL> SELECT PRODUCT.SUPPLIER_NAME, SUPPLIER.COMPANY_NAME FROM PRODUCT JOIN SUPPLIER
on PRODUCT.SUPPLIER_NAME = SUPPLIER.COMPANY_NAME;
Try to write always the join condition, makes the query more readable. In your case you could create two indexes in both tables, as #Tejash has shown you before, but let me explain you a bit more something else.
If your SQL query only retrieves the columns present in the index, Oracle probably will use the indexes to access the data. In this case, accessing by index will be faster than by table because the indexes are smaller than the tables.
However, if your SQL query retrieves more columns than the ones contained in the indexes (for example, the product_name), then it would be very interesting see whether than indexes make the query faster when you have no filter on it. In this case Oracle probably would use a method called TABLE ACCESS BY INDEX ROWID. It means that Oracle access the index to retrieve the rowid, then it goes to the table to get the data using the rowid retrieved from the index. In this case, when more columns are involved, if the tables are big enough, I bet accessing by table full scan is faster than accessing by index.
My advice: Get statistics of both tables by using DBMS_STATS. And, if you have Oracle 11g or higher, that you most probably do, you might want to use Invisible Indexes to verify the performance of those queries when you add the indexes without affecting your environment, then when you are sure, you can make them visible.
SQL> CREATE INDEX IDX_PRO_SUP ON PRODUCT(SUPPLIER_NAME) INVISIBLE;
SQL> CREATE INDEX IDX_SUP_COM SUPPLIER(COMPANY_NAME) INVISIBLE;
To see how the indexes will work with your explain plan in your own session.
SQL> ALTER SESSION SET OPTIMIZER_USE_INVISIBLE_INDEXES=TRUE;
SQL> EXPLAIN PLAN FOR SELECT PRODUCT.SUPPLIER_NAME, SUPPLIER.COMPANY_NAME FROM
PRODUCT,SUPPLIER;
SQL> SELECT * FROM TABLE(DBMS_XPLAN.DISPLAY);
Then when you are sure those indexes work as you expect:
SQL> ALTER INDEX IDX_PRO_SUP VISIBLE;
SQL> ALTER INDEX IDX_SUP_COM VISIBLE;
Hope it helps.
Best regards
I have a long query, but similar to the short version here:
select * from table_a a
left join table_b b on
b.id = a.id and
b.name = 'CONSTANT';
There are 2 indexes on table_b for both id and name, idx_id has fewer duplicates and idx_name has a lot of duplicates. This is quite a large table (20M+ records). And the join is taking 10min+.
A simple explain plan shows a lot of memory uses on the join part, and it shows it uses the index for name as opposed to id.
How to solve this issue? How to force using the idx_id index?
I was thinking of putting b.name='CONSTANT' to where clause, but this is a left join and where will remove all the record that exists in table_a.
Updated explain plan. Sorry cannot paste the whole plan.
Explain plan with b.name='CONSTANT':
Explain plan when commenting b.name clause:
Add an optimizer hint to your query.
Without knowing your 'long' query, it's difficult to know if Oracle is using the wrong one, or if your interpretation that indexb < indexa so therefore must be quicker for query z is correct.
To add a hint the syntax is
select /*+ index(table_name index_name) */ * from ....;
What is the size of TABLE_A relative to TABLE_B? It wouldn't make sense to use the ID index unless TABLE_A had significantly less rows than TABLE_B.
Index range scans are generally only useful when they access a small percentage of the rows in a table. Oracle reads the index one-block-at-a-time, and then still has to pull the relevant row from the table. If the index isn't very selective, that process can be slower than a multi-block full table scan.
Also, it might help if you can post the full explain plan using this text format:
explain plan for select ... ;
select * from table(dbms_xplan.display);
I've got the following query in Oracle10g:
select *
from DATA_TABLE DT,
LOOKUP_TABLE_A LTA,
LOOKUP_TABLE_B LTB
where DT.COL_A = LTA.COL_A (+)
and DT.COL_B = LTA.COL_B (+)
and LTA.COL_C = LTB.COL_C
and LTA.COL_B = LTB.COL_B
and ( DT.REF_TXT = :refTxt or DT.ALT_REF_TXT = :refTxt )
and DT.CREATED_DATE between :startDate and :endDate
And was wondering whether you've got any hints for optimising the query.
Currently I've got the following indices:
IDX1 on DATA_TABLE (REF_TXT, CREATED_DATE)
IDX2 on DATA_TABLE (ALT_REF_TXT, CREATED_DATE)
LOOKUP_A_PK on LOOKUP_TABLE_A (COL_A, COL_B)
LOOKUP_A_IDX1 on LOOKUP_TABLE_A (COL_C, COL_B)
LOOKUP_B_PK on LOOKUP_TABLE_B (COL_C, COL_B)
Note, the LOOKUP tables are very small (<200 rows).
EDIT:
Explain plan:
Query Plan
SELECT STATEMENT Cost = 8
FILTER
NESTED LOOPS
NESTED LOOPS
TABLE ACCESS BY INDEX ROWID DATA_TABLE
BITMAP CONVERSION TO ROWIDS
BITMAP OR
BITMAP CONVERSION FROM ROWIDS
SORT ORDER BY
INDEX RANGE SCAN IDX1
BITMAP CONVERSION FROM ROWIDS
SORT ORDER BY
INDEX RANGE SCAN IDX2
TABLE ACCESS BY INDEX ROWID LOOKUP_TABLE_A
INDEX UNIQUE SCAN LOOKUP_A_PK
TABLE ACCESS BY INDEX ROWID LOOKUP_TABLE_B
INDEX UNIQUE SCAN LOOKUP_B_PK
EDIT2:
The data looks like this:
There will be 10000s of distinct REF_TXT, which 10-100s of CREATED_DTs for each. ALT_REF_TXT will mostly NULL but there are going to be 100s-1000s which it will be different from REF_TXT.
EDIT3: Fixed what ALT_REF_TXT actually contains.
The execution plan you're currently getting looks pretty good. There's no obvious improvement to be made.
As other have noted, you have some outer join indicators, but then you essentially prevent the outer join by requiring equality on other columns in the two outer tables. As you can see from the execution plan, no outer join is happening. If you don't want an outer join, remove the (+) operators, they're just confusing the issue. If you do want an outer join, rewrite the query as shown by #Dems.
If you're unhappy with the current performance, I would suggest running the query with the gather_plan_statistics hint, then using DBMS_XPLAN.DISPLAY_CURSOR(?,?,'ALLSTATS LAST') to view the actual execution statistics. This will show the elapsed time attributed to each step in the execution plan.
You might get some benefit from converting one or both of the lookup tables into index-organized tables.
Your 2 index range scans on IDX1 and IDX2 will produce at most 100 rows, so your BITMAP CONVERSION TO ROWIDS will produce at most 200 rows. And from there on, it's only indexed access by rowids, leading to a likely sub-second execution. So are you really experiencing performance problems? If so, how long does it take exactly?
If you are experiencing performance problems, then please follow Dave Costa's advice and get the real plan, because in that case it's likely that you are using another plan runtime, possibly due to certain bind variable values or different optimizer environment settings.
Regards,
Rob.
This is one of those cases where it makes very little sense to try to optimize the DBMS performance without knowing what your data means.
Do you have many, many distinct CREATED_DATE values and a few rows in your DT for each date? If so you want an index on CREATED_DATE, as it will be the primary way for the DBMS to reject columns it doesn't want to process.
On the other hand, do you have only a handful of dates, and many distinct values of REF_TXT or ALT_REF_TXT? In that case you probably have the correct compound index choices.
The presence of OR in your query complicates things greatly, and throws most guesswork out the window. You must look at EXPLAIN PLAN to see what's going on.
If you have tens of millions of distinct REF_TXT and ALT_REF_TXT values, you may want to consider denormalizing this schema.
Edit.
Thanks for the additional info. Your explain plan contains no smoking guns that I can see. Some things to try next if you're not happy with performance yet.
Flip the order of the columns in your compound indexes on your data tables. Maybe that will get you simpler index range scans instead of all the bitmap monkey business.
Exchange your SELECT * for the names of the columns you actually need in the query resultset. That's good programming practice in any case, and it MAY allow the optimizer to avoid some work.
If things are still too slow, try recasting this as a UNION of two queries rather than using OR. That MAY allow the alt_ref_txt part of your query, which is made a little more complex by all the NULL values in that column, to be optimized separately.
This may be the query you want using a more upto date syntax.
(And without inner joins breaking outer joins)
select
*
from
DATA_TABLE DT
left outer join
(
LOOKUP_TABLE_A LTA
inner join
LOOKUP_TABLE_B LTB
on LTA.COL_C = LTB.COL_C
and LTA.COL_B = LTB.COL_B
)
on DT.COL_A = LTA.COL_A
and DT.COL_B = LTA.COL_B
where
( DT.REF_TXT = :refTxt or DT.ALT_REF_TXT = :refTxt )
and DT.CREATED_DATE between :startDate and :endDate
INDEXes that I'd have are...
LOOKUP_TABLE_A (COL_A, COL_B)
LOOKUP_TABLE_B (COL_B, COL_C)
DATA_TABLE (REF_TXT, CREATED_DATE)
DATA_TABLE (ALT_REF_TXT, CREATED_DATE)
Note: The first condition in the WHERE clause about contains an OR that will likely frag the use of INDEXes. In such case I have seen performance benefits in UNIONing two queries together...
<your query>
where
DT.REF_TXT = :refTxt
and DT.CREATED_DATE between :startDate and :endDate
UNION
<your query>
where
DT.ALT_REF_TXT = :refTxt
and DT.CREATED_DATE between :startDate and :endDate
Provide output of this query with "set autot trace". Let's see how many blocks it is pulling. Explain plan looks good, it should be very fast. If you need more, denormalize the lookup table info into DT. Violates 3rd normal form, but it will make your query faster by eliminating the joins. In a situation where milliseconds counts, everything is in buffers, and you need that query to run 1000 times/second, it can help by driving down the number of blocks looked at per row. It is the ultimate way to boost read performance, but complicates your app (and ruins your lovely ER diagram).
I'm using Mysql 5.0 and am a bit new to indexes. Which of the following queries can be helped by indexing and which index should I create?
(Don't assume either table to have unique values. This isn't homework, its just some examples I made up to try and get my head around indexing.)
Query1:
Select a.*, b.*
From a
Left Join b on b.type=a.type;
Query2:
Select a.*, b.*
From a,b
Where a.type=b.type;
Query3:
Select a.*
From a
Where a.type in (Select b.type from b where b.brand=5);
Here is my guess for what indexes would be use for these different kinds of queries:
Query1:
Create Index Query1 Using Hash on b (type);
Query2:
Create Index Query2a Using Hash on a (type);
Create Index Query2b Using Hash on b (type);
Query3:
Create Index Query2a Using Hash on b (brand,type);
Am I correct that neither Query1 or Query3 would utilize any indexes on table a?
I believe these should all be hash because there is only = or !=, right?
Thanks
using the explain command in mysql will give a lot of great info on what mysql is doing and how a query can be optimized.
in q1 and q2: an index on (a.type, all other a cols) and one on (b.type, all other b cols)
in q3: an index on (a.b_type, all other a cols) and one on b (brand, type)
ideally, you'd want all the columns that were selected stored directly in the index so that mysql doesn't have to jump from the index back to the table data to fetch the selected columns. however, that is not always manageable (i.e.: sometimes you need to select * and indexing all columns is too costly), in which case indexing just the search columns is fine.
so everything you said works great.
query 3 is invalid, but i assume you meant
where a.type in ....
Query 1 is the same as query two, just better syntax, both probably have the same query plan and both will use both indexes.
Query 3 will use the index on b.brand, but not the type portion of it. It would also use an index on a.type if you had one.
You are right that they should be hash indexes.
Query 3 could utilize an index on a.type if the number of b's with brand=5 is close to zero
Query2 will utilize indices if they are B-trees (and thus are sorted). Using hash indices with index-join may slow down your query (because you'll have to read Size(a) values in non-sequential way)
Query optimization and indexing is a huge topic, so you'll definitely want to read about MySQL and the specific storage engines you're using. The "using hash" is supported by InnoDB and NDB; I don't think MyISAM supports it.
The joins you have will perform a full table or index scan even though the join condition is equality; Every row will have to be read because there's no where clause.
You'll probably be better off with a standard b-tree index, but measure it and investigate the query plan with "explain". MySQL InnoDB stores row data organized by primary key so you should also have a primary key on your tables, not just an index. It's best if you can use the primary key in your joins because otherwise MySQL retrieves the primary key from the index, then does another fetch to get the row. The nice exception to that rule is if your secondary index includes all the columns you need in the query. That's called a covering index and MySQL will not have to lookup the row at all.