Oracle complex query with multiple joins on same table - sql

I am dealing with a monster query ( ~800 lines ) on oracle 11, and its taking expensive resources.
The main problem here is a table mouvement with about ~18 million lines, on which I have like 30 left joins on this table.
LEFT JOIN mouvement mracct_ad1
ON mracct_ad1.code_portefeuille = t.code_portefeuille
AND mracct_ad1.statut_ligne = 'PROPRE'
AND substr(mracct_ad1.code_valeur,1,4) = 'MRAC'
AND mracct_ad1.code_transaction = t.code_transaction
LEFT JOIN mouvement mracct_zias
ON mracct_zias.code_portefeuille = t.code_portefeuille
AND mracct_zias.statut_ligne = 'PROPRE'
AND substr(mracct_zias.code_valeur,1,4) = 'PRAC'
AND mracct_zias.code_transaction = t.code_transaction
LEFT JOIN mouvement mracct_zixs
ON mracct_zias.code_portefeuille = t.code_portefeuille
AND mracct_zias.statut_ligne = 'XROPRE'
AND substr(mracct_zias.code_valeur,1,4) = 'MRAT'
AND mracct_zias.code_transaction = t.code_transaction
is there some way so I can get rid of the left joins, (union join or example) to make the query faster and consumes less? execution plan or something?

Just a note on performance. Usually you want to "rephrase" conditions like:
AND substr(mracct_ad1.code_valeur,1,4) = 'MRAC'
In simple words, expressions on the left side of the equality will prevent the best usage of indexes and may push the SQL optimizer toward a less than optimal plan. The database engine will end up doing more work than is really needed, and the query will be [much] slower. In extreme cases they can even decide to use a Full Table Scan. In this case you can rephrase it as:
AND mracct_ad1.code_valeur like 'MRAC%'
or:
AND mracct_ad1.code_valeur >= 'MRAC' AND mracct_ad1.code_valeur < 'MRAD'

I am guessing so. Your code sample doesn't make much sense, but you can probably do conditional aggregation:
left join
(select m.code_portefeuille, m.code_transaction,
max(case when m.statut_ligne = 'PROPRE' and m.code_valeur like 'MRAC%' then ? end) as ad1,
max(case when m.statut_ligne = 'PROPRE' and m.code_valeur like 'MRAC%' then ? end) as zia,
. . . -- for all the rest of the joins as well
from mouvement m
group by m.code_portefeuille, m.code_transaction
) m
on m.code_portefeuille = t.code_portefeuille and m.code_transaction = t.code_transaction
You can probably replace all 30 joins with a single join to the aggregated table.

Related

Oracle Query Optimization for Multiple Joins

Following query is taking too long (30 seconds).
I think bottleneck is in where clause. Is it possible to optimize ?
SELECT
COUNT(*) AS IN_HELP
FROM
HCM.TableA
INNER JOIN
HCM.EMPLOYEE_POSITION_BOX EPB
ON
TableA.POSITION_BOX_CODE = EPB.POSITION_BOX_CODE AND NVL (V_DATE,SYSDATE) BETWEEN EPB.EFFECTIVE_DATE AND EPB.END_DATE
LEFT OUTER JOIN
HCM.NODE_LEVEL_MASTER NLM
ON
HCM.F_EMPLOYEE_ACTUAL_NODE(EPB.EMPLOYEE_CODE,TRUNC(SYSDATE)) = NLM.NODE_NO
LEFT JOIN
HCM.POSITION_STATUS_MASTER PSM
ON
TableA.STATUS = PSM.POSITION_STATUS_NO
WHERE
NLM.NODE_NO_LEVEL2 = V_LOOP.NODE_NO_LEVEL2
AND HCM.F_EMPLOYEE_ACTUAL_NODE(EPB.EMPLOYEE_CODE,TRUNC (SYSDATE)) <> TableA.NODE_NO
AND PSM.FLAG = 'Y'
AND TableA.STATUS <> '0108'
First of all, check the indexes for your tables from WHERE clause:
NLM.NODE_NO_LEVEL2
V_LOOP.NODE_NO_LEVEL2
TableA.NODE_NO
TableA.STATUS
PSM.FLAG shouldn't be added to the index as it will not provide some efficient filtering the data in tables.
Also, am I right that HCM.F_EMPLOYEE_ACTUAL_NODE is a function? If so, then you should consider to remove it from WHERE clause, as the execution plan won't examine the inside query, and it doesn't being optimized at all, which lead to performance issue.
The main problem is your function HCM.F_EMPLOYEE_ACTUAL_NODE. first move the function call to a subquery (epbx):
SELECT
COUNT(*) AS IN_HELP
FROM
HCM.TableA
INNER JOIN
(SELECT HCM.F_EMPLOYEE_ACTUAL_NODE(EPB.EMPLOYEE_CODE,TRUNC(SYSDATE)) node, epb.* from HCM.EMPLOYEE_POSITION_BOX EPB) epbx
ON
TableA.POSITION_BOX_CODE = EPBx.POSITION_BOX_CODE AND NVL (V_DATE,SYSDATE) BETWEEN EPBx.EFFECTIVE_DATE AND EPBx.END_DATE
LEFT OUTER JOIN
HCM.NODE_LEVEL_MASTER NLM
ON
epbx.node = NLM.NODE_NO
LEFT JOIN
HCM.POSITION_STATUS_MASTER PSM
ON
TableA.STATUS = PSM.POSITION_STATUS_NO
WHERE
NLM.NODE_NO_LEVEL2 = V_LOOP.NODE_NO_LEVEL2
AND epbx.node <> TableA.NODE_NO
AND PSM.FLAG = 'Y'
AND TableA.STATUS <> '0108'
Then use explain plan to find out whether your query is written efficiently.

Optimize SQL query with many left join

I have a SQL query with many left joins
SELECT COUNT(DISTINCT po.o_id)
FROM T_PROPOSAL_INFO po
LEFT JOIN T_PLAN_TYPE tp ON tp.plan_type_id = po.Plan_Type_Fk
LEFT JOIN T_PRODUCT_TYPE pt ON pt.PRODUCT_TYPE_ID = po.cust_product_type_fk
LEFT JOIN T_PROPOSAL_TYPE prt ON prt.PROPTYPE_ID = po.proposal_type_fk
LEFT JOIN T_BUSINESS_SOURCE bs ON bs.BUSINESS_SOURCE_ID = po.CONT_AGT_BRK_CHANNEL_FK
LEFT JOIN T_USER ur ON ur.Id = po.user_id_fk
LEFT JOIN T_ROLES ro ON ur.roleid_fk = ro.Role_Id
LEFT JOIN T_UNDERWRITING_DECISION und ON und.O_Id = po.decision_id_fk
LEFT JOIN T_STATUS st ON st.STATUS_ID = po.piv_uw_status_fk
LEFT OUTER JOIN T_MEMBER_INFO mi ON mi.proposal_info_fk = po.O_ID
WHERE 1 = 1
AND po.CUST_APP_NO LIKE '%100010233976%'
AND 1 = 1
AND po.IS_STP <> 1
AND po.PIV_UW_STATUS_FK != 10
The performance seems to be not good and I would like to optimize the query.
Any suggestions please?
Try this one -
SELECT COUNT(DISTINCT po.o_id)
FROM T_PROPOSAL_INFO po
WHERE PO.CUST_APP_NO LIKE '%100010233976%'
AND PO.IS_STP <> 1
AND po.PIV_UW_STATUS_FK != 10
First, check your indexes. Are they old? Did they get fragmented? Do they need rebuilding?
Then, check your "execution plan" (varies depending on the SQL Engine): are all joins properly understood? Are some of them 'out of order'? Do some of them transfer too many data?
Then, check your plan and indexes: are all important columns covered? Are there any outstandingly lengthy table scans or joins? Are the columns in indexes IN ORDER with the query?
Then, revise your query:
- can you extract some parts that normally would quickly generate small rowset?
- can you add new columns to indexes so join/filter expressions will get covered?
- or reorder them so they match the query better?
And, supporting the solution from #Devart:
Can you eliminate some tables on the way? does the where touch the other tables at all? does the data in the other tables modify the count significantly? If neither SELECT nor WHERE never touches the other joined columns, and if the COUNT exact value is not that important (i.e. does that T_PROPOSAL_INFO exist?) then you might remove all the joins completely, as Devart suggested. LEFTJOINs never reduce the number of rows. They only copy/expand/multiply the rows.

Oracle Sub-select taking a long time

I have a SQL Query that comprise of two level sub-select. This is taking too much time.
The Query goes like:
select * from DALDBO.V_COUNTRY_DERIV_SUMMARY_XREF
where calculation_context_key = 130205268077
and DERIV_POSITION_KEY in
(select ctry_risk_derivs_psn_key
from DALDBO.V_COUNTRY_DERIV_PSN
where calculation_context_key = 130111216755
--and ctry_risk_derivs_psn_key = 76296412
and CREDIT_PRODUCT_TYPE = 'SWP OP'
and CALC_OBLIGOR_COUNTRY_OF_ASSETS in
(select ctry_cd
from DALDBO.V_PSN_COUNTRY
where calculation_context_key = 130134216755
--and ctry_risk_derivs_psn_key = 76296412
)
)
These tables are huge! Is there any optimizations available?
Without knowing anything about your table or view definitions, indexing, etc. I would start by looking at the sub-selects and ensuring that they are performing optimally. I would also want to know how many values are being returned by each sub-select as this can impact performance.
How is calculation_context_key used to retrieve rows from V_COUNTRY_DERIV_PSN and V_PSN_COUNTRY? Is it an optimal execution plan?
How is DERIV_POSITION_KEY and CALC_OBLIGOR_COUNTRY_OF_ASSETS used in V_COUNTRY_DERIV_SUMMARY_XREF to retrieve rows? Again, look at the explain plan.
first of all, can you write this query using inner joins (and not subselect) ??
select A.*
from DALDBO.V_COUNTRY_DERIV_SUMMARY_XREF a,
DALDBO.V_COUNTRY_DERIV_PSN b,
DALDBO.V_PSN_COUNTRY c
where calculation_context_key = 130205268077
and a.DERIV_POSITION_KEY = b.ctry_risk_derivs_psn_key
and b.calculation_context_key = 130111216755
--and b.ctry_risk_derivs_psn_key = 76296412
and b.CREDIT_PRODUCT_TYPE = 'SWP OP'
and b.CALC_OBLIGOR_COUNTRY_OF_ASSETS = c.ctry_cd
and c.calculation_context_key = 130134216755
--and c.ctry_risk_derivs_psn_key = 76296412
second, best practice says that when you don't query any data from the tables in the subselect you better of using an EXISTS instead of IN. new versions of oracle does that automatically and actually rewrite the whole thing as an inner join.
last, without any knowledge on you data and of what you are trying to do i would suggest you to try and use views as less as you can - if you can query the underling tables it would be best and you will probably see immediate performance improvement.

SQL joins vs nested query

This two SQL statements return equal results but first one is much slower than the second:
SELECT leading.email, kstatus.name, contacts.status
FROM clients
JOIN clients_leading ON clients.id_client = clients_leading.id_client
JOIN leading ON clients_leading.id_leading = leading.id_leading
JOIN contacts ON contacts.id_k_p = clients_leading.id_group
JOIN kstatus on contacts.status = kstatus.id_kstatus
WHERE (clients.email = 'some_email' OR clients.email1 = 'some_email')
ORDER BY contacts.date DESC;
SELECT leading.email, kstatus.name, contacts.status
FROM (
SELECT *
FROM clients
WHERE (clients.email = 'some_email' OR clients.email1 = 'some_email')
)
AS clients
JOIN clients_leading ON clients.id_client = clients_leading.id_client
JOIN leading ON clients_leading.id_leading = leading.id_leading
JOIN contacts ON contacts.id_k_p = clients_leading.id_group
JOIN kstatus on contacts.status = kstatus.id_kstatus
ORDER BY contacts.date DESC;
But I'm wondering why is it so? It looks like in the firt statement joins are done first and then WHERE clause is applied and in second is just the opposite. But will it behave the same way on all DB engines (I tested it on MySQL)?
I was expecting DB engine can optimize queries like the fors one and firs apply WHERE clause and then make joins.
There are a lot of different reasons this could be (keying etc), but you can look at the explain mysql command to see how the statements are being executed. If you can run that and if it still is a mystery post it.
You always can replace join with nested query... It's always faster but lot messy...

Bad performance of SQL query due to ORDER BY clause

I have a query joining 4 tables with a lot of conditions in the WHERE clause. The query also includes ORDER BY clause on a numeric column. It takes 6 seconds to return which is too long and I need to speed it up. Surprisingly I found that if I remove the ORDER BY clause it takes 2 seconds. Why the order by makes so massive difference and how to optimize it? I am using SQL server 2005. Many thanks.
I cannot confirm that the ORDER BY makes big difference since I am clearing the execution plan cache. However can you shed light at how to speed this up a little bit? The query is as follows (for simplicity there is "SELECT *" but I am only selecting the ones I need).
SELECT *
FROM View_Product_Joined j
INNER JOIN [dbo].[OPR_PriceLookup] pl on pl.siteID = NodeSiteID and pl.skuid = j.skuid
LEFT JOIN [dbo].[OPR_InventoryRules] irp on irp.ID = pl.SkuID and irp.InventoryRulesType = 'Product'
LEFT JOIN [dbo].[OPR_InventoryRules] irs on irs.ID = pl.siteID and irs.InventoryRulesType = 'Store'
WHERE (((((SiteName = N'EcommerceSite') AND (Published = 1)) AND (DocumentCulture = N'en-GB')) AND (NodeAliasPath LIKE N'/Products/Cats/Computers/Computer-servers/%')) AND ((NodeSKUID IS NOT NULL) AND (SKUEnabled = 1) AND pl.PriceLookupID in (select TOP 1 PriceLookupID from OPR_PriceLookup pl2 where pl.skuid = pl2.skuid and (pl2.RoleID = -1 or pl2.RoleId = 13) order by pl2.RoleID desc)))
ORDER BY NodeOrder ASC
Why the order by makes so massive difference and how to optimize it?
The ORDER BY needs to sort the resultset which may take long if it's big.
To optimize it, you may need to index the tables properly.
The index access path, however, has its drawbacks so it can even take longer.
If you have something other than equijoins in your query, or the ranged predicates (like <, > or BETWEEN, or GROUP BY clause), then the index used for ORDER BY may prevent the other indexes from being used.
If you post the query, I'll probably be able to tell you how to optimize it.
Update:
Rewrite the query:
SELECT *
FROM View_Product_Joined j
LEFT JOIN
[dbo].[OPR_InventoryRules] irp
ON irp.ID = j.skuid
AND irp.InventoryRulesType = 'Product'
LEFT JOIN
[dbo].[OPR_InventoryRules] irs
ON irs.ID = j.NodeSiteID
AND irs.InventoryRulesType = 'Store'
CROSS APPLY
(
SELECT TOP 1 *
FROM OPR_PriceLookup pl
WHERE pl.siteID = j.NodeSiteID
AND pl.skuid = j.skuid
AND pl.RoleID IN (-1, 13)
ORDER BY
pl.RoleID desc
) pl
WHERE SiteName = N'EcommerceSite'
AND Published = 1
AND DocumentCulture = N'en-GB'
AND NodeAliasPath LIKE N'/Products/Cats/Computers/Computer-servers/%'
AND NodeSKUID IS NOT NULL
AND SKUEnabled = 1
ORDER BY
NodeOrder ASC
The relation View_Product_Joined, as the name suggests, is probably a view.
Could you please post its definition?
If it is indexable, you may benefit from creating an index on View_Product_Joined (SiteName, Published, DocumentCulture, SKUEnabled, NodeOrder).