I am a newbie in SQL querying and I am spending 3hrs to get the whole result of joining 2 queries.
I have focused on using left joins and avoided using subqueries on the select statement after researching. However it is still extremely slow. I have no close friends who know sql enough to explain whats wrong or what I approach I should take.
I am also new here so if this question is not allowed please inform me and I will remove it immediately.
This is the structure of the query...
The first query will get the member details.
The second query will get the transaction details.
The relationship is,
one product has many sub-plans which has many members.
One product also has many transactions which is made on a per product basis.
I am required to show all transactions and duplicate each line for each member.
I joined the queries using the product primary key.
Prior to joining, I have tested both individual queries and they turned out fine. Only 1-2 secs and I get the result.
But joining the two, I end up with 3 hrs of waiting.
SELECT
MPPFF.N_DX,
MPPFF.PM_A_P,
MPPFF.FEE1,
MPPFF.FEE2,
MPPFF.FEE3,
MPPFF.FEE4,
MPPFF.FEE11,
MPPFF.FEE12,
MPPFF.FEE5,
MPPFF.N_NO,
MPPFF.SETN_DX,
MPPFF.PRIME_NO,
MPPFF.SECN_NO,
MPPFF.COMM_A,
MPPFF.TYX_NO,
MPPFF.P_NAME,
MPPFF.B_BFX,
MPPFF.B_FM,
MPPFF.B_TO,
MPPFF.BB_NAME_P,
MPPFF.BB_NAME_S,
MPPFF.REVERSE_BFX,
MPPFF.TYX_REF_NO,
MPPFF.BB_NO_AX,
MPPFF.BB_NAME_AX,
MPPFF.DXC,
MPPFF.ST,
MPPFF.DAY,
MPPFF.CE_D_PRODUCT,
MPPFF.CE_H,
MPPFF.AS_C_E,
MPPFF.BCH,
MPPFF.RCPY_NO,
MPPFF.RE_BFX,
MPPFF.A_END,
MPPFF.PLACE,
MPPFF.MEMB_DX,
MPPFF.MBR_NO,
MPPFF.MBR_TR_BFX,
MPPFF.CE_D_TERM_CE,
MPPFF.MEMBER_AS,
MPPFF.C_USER,
MPPFF.C_BFX,
MPPFF.U_USER,
MPPFF.U_BFX
FROM (
SELECT
FF.N_DX,
FF.PM_A_P,
FF.FEE1,
FF.FEE2,
FF.FEE3,
FF.FEE4,
FF.FEE11,
FF.FEE12,
FF.FEE5,
FF.N_NO,
FF.SETN_DX,
FF.PRIME_NO,
FF.SECN_NO,
FF.COMM_A,
FF.TYX_NO,
FF.P_NAME,
FF.B_BFX,
FF.B_FM,
FF.B_TO,
FF.BB_NAME_P,
FF.BB_NAME_S,
FF.REVERSE_BFX,
FF.TYX_REF_NO,
FF.BB_NO_AX,
FF.BB_NAME_AX,
FF.DXC,
FF.ST,
FF.DAY,
FF.CE_D_PRODUCT,
FF.CE_H,
FF.AS_C_E,
FF.RCPY_NO,
FF.RE_BFX,
FF.A_END,
FF.BCH,
MPP.MBR_NO,
MPP.MBR_TR_BFX,
MPP.CE_D_TERM_CE,
MPP.C_USER,
MPP.C_BFX,
MPP.U_USER,
MPP.U_BFX,
MPP.PLACE,
MPP.MEMBER_AS,
MPP.TYX_DX,
MPP.AS_DX,
MPP.PRODUCT,
MPP.POPL_DX,
MPP.MEMB_DX,
FF.TYX_DX
FROM (
SELECT
MBR.MEMB_DX,
MBR.MBR_NO,
MBR.MBR_TR_BFX,
MBR.CE_D_TERM_CE,
MBR.C_USER,
MBR.C_BFX,
MBR.U_USER,
MBR.U_BFX,
MPP.PLACE,
MPP.MEMBER_AS,
MPP.TYX_DX,
MPP.AS_DX,
MPP.PRODUCT,
MPP.POPL_DX
FROM (
SELECT
MPP.PLACE,
MPP.MEMBER_AS,
MPP.TYX_DX,
MPP.AS_DX,
MPP.PRODUCT,
MPP.POPL_DX,
MMP.MEMB_DX
FROM(
SELECT
MPP.PLACE,
MPP.TYX_AS_DXC MEMBER_AS,
MPP.TYX_DX,
MPP.AS_DX,
MPP.POPL_DX,
RPT.PRODUCT
FROM
TABLE1 MPP
LEFT JOIN (
SELECT
SUBSTR(CE_D_PRODUCT,9) PRODUCT,
AS_DX
FROM
TABLE6 RPT,
TABLE7 PP
WHERE
PP.PRTY_DX = RPT.PRTY_DX
) RPT
ON MPP.AS_DX = RPT.AS_DX
) MPP
LEFT JOIN (
SELECT
POPL_DX,
MEMB_DX
FROM
TABLE4
)MMP
ON MPP.POPL_DX=MMP.POPL_DX
) MPP,
(
SELECT
MBR.MEMB_DX,
MBR.MBR_NO,
MBR.TERM_BFX MBR_TR_BFX,
MBR.CE_D_TERM_CE,
MBR.C_USER,
MBR.C_BFX,
MBR.U_USER,
MBR.U_BFX
FROM
TABLE8 MBR
) MBR
WHERE
MPP.MEMB_DX = MBR.MEMB_DX
) MPP
INNER JOIN
(
SELECT
FF.N_DX,
ROUND(CB.FEE5 * FF.RATE,2) PM_A_P,
CB.FEE1,
CB.FEE2,
CB.FEE3,
CB.FEE4,
CB.FEE11,
CB.FEE12,
CB.FEE5,
FF.N_NO,
FF.SETN_DX,
FF.PRIME_NO,
FF.SECN_NO,
FF.COMM_A,
FF.TYX_NO,
FF.P_NAME_1||', '||FF.P_NAME_2||' '||FF.P_NAME_3 P_NAME,
FF.B_BFX,
FF.B_FM,
FF.B_TO,
FF.BB_NAME_1_P||', '||FF.BB_NAME_2_P BB_NAME_P,
FF.BB_NAME_1_S||', '||FF.BB_NAME_2_S BB_NAME_S,
CB.REVERSE_BFX,
FF.TYX_REF_NO,
FF.BB_NO_AX,
FF.BB_NAME_1_AX||' '|| FF.BB_NAME_2_AX BB_NAME_AX,
CASE
WHEN FF.CE_D_ST IN ('A', 'B', 'C') THEN 'AC'
WHEN FF.DAY >1 THEN 'NEW'
ELSE 'AB'
END DXC,
FF.CE_D_ST ST,
FF.DAY,
FF.CE_D_PRODUCT,
FF.CE_D_COMP CE_H,
FF.AS_C AS_C_E,
FF.RCPY_NO,
FF.RE_BFX,
ROUND(CB.A_S,2) A_END,
FF.TYX_DX,
MP.BCH
FROM
TABLE2 CB,
TABLE3 FF
LEFT JOIN (
SELECT
SUBSTR(CE_D_BCH_O,13) BCH,
TYX_DX
FROM
TABLE5 MP
)MP
ON MP.TYX_DX = FF.TYX_DX
WHERE
FF.SETN_DX = CB.SETN_DX AND
EXTRACT( YEAR FROM FF.EFF_BFX) >=2013
) FF
ON MPP.TYX_DX = FF.TYX_DX
)MPPFF
;
Use ROWNUM to prevent optimizer transformations from degrading the performance.
You are encountering a common problem - two queries run fast separately but run slow when put together. Oracle does not have to run the queries in the order they are written. It can merge views, push predicates around, and generally completely re-write the query to run in a different order. Normally this is a great thing because you don't want to have to worry about which physical order to join tables. But sometimes Oracle applies the wrong transformations and the results are disastrous.
There are two ways to solve these problems.
Look at table structures, the statements, the execution plans, SQL monitoring or traces, statistics, etc. Try to find out which operation is slow, and why (use cardinality as your guide), and then try to fix it. This process can easily take hours, maybe even days, but it's the best way to learn.
Stop the optimizer from combining the queries with a simple trick. There are a few ways to do this but in my experience the simplest way is to add the pseudo-column ROWNUM to any inline view that you do not want transformed. ROWNUM is a special column that tells Oracle "this query block must be returned in a specific way, don't do anything to it".
Change this:
--This is slow:
select ...
from
(
--This is fast:
select ...
) inline_view1
join
(
--This is fast:
select ...
) inline_view2
on ...
to this:
--Now this is fast.
select ...
from
(
--This is fast:
select rownum /*add rownum to prevent slow transformations*/, ...
) inline_view1
join
(
--This is fast:
select rownum /*add rownum to prevent slow transformations*/, ...
) inline_view2
on ...
In your code I believe the two inline views to modify would be the outer-most MPP and FF.
On a side note, I disagree with with some of the other comments and answers.
A CTE will not help here since none of the tables are used twice.
You don't always need to know a million details about the query to tune it. Unless you have the time and want to improve your skills.
I think your over-all query structure is good. You are on the right path to building great SQL statements. Inline views are the key to writing SQL - build small units of code, combine them in simple steps, repeat. Putting all the tables together in one massive join is a recipe for spaghetti code. Although I agree with others that you should avoid the old-fashioned join syntax. And the query would really benefit from some comments and more meaningful names. And don't feel afraid to put all the select list items on one line. Having a 500-column line isn't ideal, but you want to focus on the joins, not the simple list of columns.
Your query is almost unreadable, because of all the nesting. And you are mixing pre 1992 style joins with current join syntax. Don't use the outdated comma-separated join syntax. It is prone to errors. All your outer-joins are void, because at some point you will always have criteria that dismisses outer-joined records, such as when inner-joining table8 on the outer-joined table4's memb_dx.
Your query seems to translate to
select
<several fields from the tables>
from table1 mpp
join table6 rpt on rpt.as_dx = mpp.as_dx
join table7 pp on pp.prty_dx = rpt.prty_dx
join table4 mmp on mmp.popl_dx = mpp.popl_dx
join table8 mbr on mpp.memb_dx = mmp.memb_dx
join table3 ff on ff.tyx_dx = mpp.tyx_dx and extract(year from ff.eff_bfx) >= 2013
join table2 cb on ff.setn_dx = cb.setn_dx
left join table5 mp on mp.tyx_dx = ff.tyx_dx;
and maybe you want it to be
select
<several fields from the tables>
from table1 mpp
left join table6 rpt on rpt.as_dx = mpp.as_dx
left join table7 pp on pp.prty_dx = rpt.prty_dx
left join table4 mmp on mmp.popl_dx = mpp.popl_dx
left join table8 mbr on mpp.memb_dx = mmp.memb_dx
join table3 ff on ff.tyx_dx = mpp.tyx_dx and extract(year from ff.eff_bfx) >= 2013
join table2 cb on ff.setn_dx = cb.setn_dx
left join table5 mp on mp.tyx_dx = ff.tyx_dx;
instead or something along the lines. Get rid of all the nesting and stay with a clear and easy to read from clause.
One thing others haven't mentioned is the use of
EXTRACT( YEAR FROM FF.EFF_BFX) >=2013
This applies the EXTRACT function to every row selected from TABLE3 (I believe that's what FF refers to at this point in the query). I suggest replacing the above with
FF.EFF_BFX >= TO_DATE('01-JAN-2013', 'DD-MON-YYYY')
or something similar. This requires only a single call to TO_DATE to generate the date constant, which is then compared directly to FF.EFF_BFX, which appears to be a column of type DATE.
This query also uses the same table alias (e.g. FF, MPP, etc) multiple times for different entities in different contexts. In my opinion this is bad practice, and I suggest you rework your query to use a unique alias for each entity, which will make the query easier to understand.
As others have mentioned, getting rid of the pre-1992 joins in the WHERE clause would also help clarify what's going on, as would getting rid of the long column lists. A couple of the subqueries could be eliminated as well which would make the query cleaner and clearer.
After dealing with all the above I get the following:
SELECT *
FROM (SELECT *
FROM TABLE1 MPP
LEFT OUTER JOIN (SELECT SUBSTR(CE_D_PRODUCT, 9) PRODUCT,
AS_DX
FROM TABLE6 RPT
INNER JOIN TABLE7 PP
ON PP.PRTY_DX = RPT.PRTY_DX) RPT
ON MPP.AS_DX = RPT.AS_DX
LEFT OUTER JOIN TABLE4 MMP
ON MPP.POPL_DX = MMP.POPL_DX) MPP
INNER JOIN TABLE8 MBR
ON MPP.MEMB_DX = MBR.MEMB_DX
INNER JOIN (SELECT FF.*,
CB.*,
ROUND(CB.FEE5 * FF.RATE,2) PM_A_P,
FF.P_NAME_1 || ', ' || FF.P_NAME_2 || ' ' || FF.P_NAME_3 P_NAME,
FF.BB_NAME_1_P || ', ' || FF.BB_NAME_2_P BB_NAME_P,
FF.BB_NAME_1_S || ', ' || FF.BB_NAME_2_S BB_NAME_S,
FF.BB_NAME_1_AX || ' ' || FF.BB_NAME_2_AX BB_NAME_AX,
CASE
WHEN FF.CE_D_ST IN ('A', 'B', 'C') THEN 'AC'
WHEN FF.DAY > 1 THEN 'NEW'
ELSE 'AB'
END DXC,
ROUND(CB.A_S,2) A_END,
SUBSTR(MP.CE_D_BCH_O, 13) AS BCH
FROM TABLE2 CB
INNER JOIN TABLE3 FF
ON FF.SETN_DX = CB.SETN_DX
LEFT OUTER JOIN TABLE5 MP
ON MP.TYX_DX = FF.TYX_DX
WHERE FF.EFF_BFX >= TO_DATE('01-JAN-2013', 'DD-MON-YYYY')) FF
ON MPP.TYX_DX = FF.TYX_DX
Best of luck.
I tried to make your query more readable:
SELECT MPPFF.*
FROM
(SELECT FF.*, MPP.*
FROM
(SELECT MBR.*, MPP.*
FROM
(SELECT MPP.*, MMP.*
FROM
(SELECT MPP.*, RPT.*
FROM TABLE1 MPP
LEFT JOIN (SELECT * FROM TABLE6 RPT, TABLE7 PP WHERE PP.PRTY_DX = RPT.PRTY_DX) RPT ON MPP.AS_DX = RPT.AS_DX) MPP
LEFT JOIN (SELECT * FROM TABLE4) MMP ON MPP.POPL_DX=MMP.POPL_DX) MPP,
(SELECT MBR.* FROM TABLE8 MBR) MBR
WHERE MPP.MEMB_DX = MBR.MEMB_DX) MPP
INNER JOIN (SELECT FF.*, CB.* FROM TABLE2 CB, TABLE3 FF
LEFT JOIN (SELECT * FROM TABLE5 MP ) MP ON MP.TYX_DX = FF.TYX_DX
WHERE FF.SETN_DX = CB.SETN_DX
AND EXTRACT( YEAR FROM FF.EFF_BFX) >=2013) FF ON MPP.TYX_DX = FF.TYX_DX) MPPFF
;
You select 8 different tables and the only WHERE condition is EXTRACT( YEAR FROM FF.EFF_BFX) >= 2013
Unless the tables are tiny it will always take some time to query them all together.
Why do you mix ANSI join syntax and old-style Oracle join syntax?
Related
I need help in optimizing this SQL query.
In the main SELECT statement there are three columns which is dependent on the outer query result. This is why my query is taking a long time to return data. I have tried making left joins but this is not working properly.
Can anyone help me to resolve this issue?
SELECT
DISTINCT ou.OrganizationUserID AS StudentID,
ou.FirstName,
ou.LastName,
(
SELECT
STRING_AGG(
(ug.UG_Name),
','
)
FROM
Groups ug
INNER JOIN ApplicantUserGroup augm ON augm.AUGM_UserGroupID = ug.UG_ID
WHERE
augm.AUGM_OrganizationUserID = ou.OrganizationUserID
AND ug.UG_IsDeleted = 0
AND augm.AUGM_IsDeleted = 0
) AS UserGroups,
order1.OrderNumber AS OrderId -- UAT-2455
,
(
SELECT
STRING_AGG(
(CActe.CustomAttribute),
','
)
FROM
CustomAttributeCte CActe
WHERE
CActe.HierarchyNodeID = dpm.DPM_ID
AND CActe.OrganizationUserID = ps.OrganizationUserID
) AS CustomAttributes -- UAT-2455
,
(
SELECT
STRING_AGG(
(CActe.CustomAttributeID),
','
)
FROM
CustomAttributeCte CActe
WHERE
CActe.HierarchyNodeID = dpm.DPM_ID
AND CActe.OrganizationUserID = ps.OrganizationUserID
) AS CustomAttributeID
FROM
ApplicantData acd WITH (NOLOCK)
INNER JOIN ClientPackage ps WITH (NOLOCK) ON acd.ClientSubscriptionID = ps.ClientSubscriptionID
INNER JOIN [ClientOrder] order1 WITH (NOLOCK) ON order1.OrderID = ps.OrderID
AND order1.IsDeleted = 0
INNER JOIN OUser ou WITH (NOLOCK) ON ou.OrganizationUserID = ps.OrganizationUserID
It looks like this query can be simplified, and the dependent subqueries in your SELECT clause removed, Consider your second and third dependent subqueries. You can refactor them into one nondependent subquery with a LEFT JOIN. Using nondependent subqueries is more efficient because the query planner can run them just once, rather than once for each row.
You want two STRING_AGG() results from the same table. This subquery gives those two outputs for every possible combination of HierarchyNodeID and OrganizationUserID values. STRING_AGG() is an aggregate function like SUM() and so works nicely with GROUP BY.
SELECT HierarchyNodeID, OrganizationUserID,
STRING_AGG((CActe.CustomAttribute), ',') CustomAttributes -- UAT-2455,
STRING_AGG((CActe.CustomAttributeID), ',') CustomAttributeIDs -- UAT-2455
FROM CustomAttributeCte CActe
GROUP BY HierarchyNodeID, OrganizationUserID
You can run this subquery itself to convince yourself it works.
Now, we can LEFT JOIN that into your query. Like this. (For readability I took out the NOLOCKs and used JOIN: it means the same thing as INNER JOIN.)
SELECT DISTINCT
ou.OrganizationUserID AS StudentID,
ou.FirstName,
ou.LastName,
'tempvalue' AS UserGroups, -- shortened for testing
order1.OrderNumber AS OrderId, -- UAT-2455
uat2455.CustomAttributes, -- UAT-2455
uat2455.CustomAttributeIDs -- UAT-2455
FROM ApplicantData acd
JOIN ClientPackage ps
ON acd.ClientSubscriptionID = ps.ClientSubscriptionID
JOIN ClientOrder order1
ON order1.OrderID = ps.OrderID
AND order1.IsDeleted = 0
JOIN OUser ou
ON ou.OrganizationUserID = ps.OrganizationUserID
LEFT JOIN (
SELECT HierarchyNodeID, OrganizationUserID,
STRING_AGG((CActe.CustomAttribute), ',') CustomAttributes -- UAT-2455,
STRING_AGG((CActe.CustomAttributeID), ',') CustomAttributeIDs -- UAT-2455
FROM CustomAttributeCte CActe
GROUP BY HierarchyNodeID, OrganizationUserID
) uat2455
ON uat2455.HierarchyNodeID = dpm.DPM_ID
AND uat2455.OrganizationUserId = ps.OrganizationUserID
See how we collapsed your second and third dependent subqueries to just one, then used it as a virtual table with LEFT JOIN? We transformed the WHERE clauses from the dependent subqueries into an ON clause.
You can test this: run it with TOP(50) and eyeball the results.
When you're happy, the next step is to transform your first dependent subquery the same way.
Pro tip Don't use WITH (NOLOCK), ever, unless a database administration expert tells you to after looking at your specific query. If your query's purpose is a historical report and you don't care whether the most recent transactions in your database are represented exactly right, you can precede your query with this statement. It also allows the query to run while avoiding locks.
SET TRANSACTION ISOLATION LEVEL READ UNCOMMITTED;
Pro tip Be obsessive about formatting your queries for readability. You, your colleagues, and yourself a year from now must be able to read and reason about queries like this.
I'have a specific problem. I need get some some data from database. I have a mechanism to retrieve data from a program. I need to use it, no modifications possible. The original query is:
SELECT it_Symbol AS Symbol, tt_Name AS Nazwa, tt_Price AS Cena,
tt_Quantity AS Ilosc, tt_Id
FROM tr__Transaction INNER JOIN tr_Item
ON tt_TransId=tr_Id LEFT OUTER JOIN it__Item
ON tt_ItemId = it_Id RIGHT JOIN reg_Site
ON tr_SiteId = rs_Id LEFT OUTER JOIN it_ItemSite
ON it_Id = is_ItemId
WHERE tt_TransId=#transId
GROUP BY tt_Id, tt_Quantity, tr_Id, it_Name, tt_Price,it_Symbol,
is_Name, tt_Name, tt_ItemId, tt_Id
The problem is that I need to get some additional data from tr__Transaction table.
It has a field tr_Source. I need this fields value, but for tr__transaction records which have tr_Id listed in returned tt_Id field.
Any way to do a subquery returning values dependant on tt_Id column values?
Or maybe any other joins combination? I've spend whole week with this, and have no more ideas or skills to do this:/
Any help would be very appreciated.
Ok still don't know exactly what you need but it is an atempt to clean up the question. So this is a working version of the answer since you can't format code in comments.
Please explain if some relations are wrong.
you can always join tables multiple times under different conditions as long as they have different aliases.
For instance:
SELECT c.it_Symbol AS Symbol, a.tt_Name AS Nazwa, a.tt_Price AS Cena,
a.tt_Quantity AS Ilosc, a.tt_Id, f.tr_Source
FROM tr__Transaction a
INNER JOIN tr_Item b
ON a.tt_TransId=b.tr_Id
LEFT OUTER JOIN it__Item c
ON a.tt_ItemId = c.it_Id
RIGHT JOIN reg_Site d
ON a.tr_SiteId = d.rs_Id
LEFT OUTER JOIN it_ItemSite e
ON c.it_Id = e.is_ItemId
LEFT OUTER JOIN tr__Transaction f
ON c.tt_id = f.tr_id
WHERE a.tt_TransId=#transId
GROUP BY a.tt_Id, a.tt_Quantity, a.tr_Id, c.it_Name, a.tt_Price,c.it_Symbol,
e.is_Name, a.tt_Name, a.tt_ItemId, a.tt_Id
I'm not sure if I understand you question correctly, but assuming you are saying that the Original SQL statement cannot be changed (ie, it's in a read-only View). Then you can wrap it around another SELECT statement.
SELECT tblOriginal.*, tblExtend.tt_Source
FROM (
SELECT it_Symbol AS Symbol, tt_Name AS Nazwa, tt_Price AS Cena,
tt_Quantity AS Ilosc, tt_Id
FROM tr__Transaction INNER JOIN tr_Item
ON tt_TransId=tr_Id LEFT OUTER JOIN it__Item
ON tt_ItemId = it_Id RIGHT JOIN reg_Site
ON tr_SiteId = rs_Id LEFT OUTER JOIN it_ItemSite
ON it_Id = is_ItemId
WHERE tt_TransId=#transId
GROUP BY tt_Id, tt_Quantity, tr_Id, it_Name, tt_Price,it_Symbol,
is_Name, tt_Name, tt_ItemId, tt_Id
) AS tblOriginal
INNER JOIN tr__Transaction AS tblExtend
ON tblOriginal.tt_Id = tblExtend.tt_Id
But I suspect your problem is more complicated that that as you've spent over a week on it. In that case, can you elaborate?
I am new to SQL and thus facing some problems in enhancing a SQL query.
There are two tables basically: one is ef_dat_app_ef_link and the other is ef_dat_lspd_machine_type_model
Here is the query I am talking about. I have added an inner join in the 10th line which is the base of all problems.
select country_isocode as country,
language_isocode as language, model_number,
machine_type_model_name model_name,machine_type_group_id,
machine_type_model_id, product_type_name product_type,
machine_type_model_id_to,filenet_link
from ef_dat_lspd_machine_type_model,
cross join ef_dim_lspd_country_language
left join ef_dat_lspd_model_country
using (machine_type_model_id, country_isocode)
inner join ef_dat_app_ef_link on (ef_dat_lspd_machine_type_model.lpmd_revenue_pid = ef_dat_app_ef_link.ef_product_revenue_pid)
where (ef_dat_app_ef_link.country_isocode='US' or ef_dat_app_ef_link.country_isocode='CA') and ef_dat_app_ef_link.language_isocode='en'
join ef_dat_lspd_product_type using (product_type_id)
left join ef_dat_lspd_model_relationship on (machine_type_model_id_from = machine_type_model_id)
where (discontinue_date is null or discontinue_date > sysdate) and
(announce_date is null or announce_date <= sysdate)
and (machine_type_model_id in (select machine_type_model_id from ef_dat_lspd_model_parts)
or not machine_type_model_id_to is null) order by machine_type_model_id
Below is the unchanged query which I am supposed to work on.
select country_isocode as country,
language_isocode as language,
machine_type_model_id,
product_type_name product_type,machine_type_model_id_to,
image_url
from ef_dat_lspd_machine_type_model cross join ef_dim_lspd_country_language
left join ef_dat_lspd_model_country using (machine_type_model_id, country_isocode)
join ef_dat_lspd_product_type using (product_type_id)
left join ef_dat_lspd_model_relationship on (machine_type_model_id_from = machine_type_model_id)
where (discontinue_date is null or discontinue_date > sysdate) and
(announce_date is null or announce_date <= sysdate) and
(machine_type_model_id in (select machine_type_model_id from ef_dat_lspd_model_parts) or >not machine_type_model_id_to is null)
order by machine_type_model_id
Now in the ef_dat_app_ef_link table there is a link with images,countrycode and languagecode and a revenueid and the other is ef_dat_lspd_machine_type_model which contains the image links and a few more columns. What I am trying to do is, create a query which shall replace pull up images from the ef_dat_app_ef_link table where the revenue id in this table is equal to the revenue id in the other table. A lot of columns come up when a table is searched by the revenue id, thats why i want to pull up a row which has language as 'en' and country as 'US' or 'CA'.
I have added an inner join statement to the same effect but it is consistently throwing a ORA00905 error for a missing keyword.
I have italicized the changes I have made. Sorry for such a bad representation of the code. I couldn't make head or tail of how to make it look better.
You added a WHERE clause smack dab in the middle of the existing query. You can't have more than one WHERE clause in a query, and you also can't place it in the middle of your table joins.
Merge the WHERE clause you added to the pre-existing one (indicated by the green arrow in the snapshot).
You can't mix the structure like that you have to do it like
select *
from t
join t1
join t2
where
and
or
order by
you can't do this
select *
from t
join t2
where
order
join
where
order
try
select country_isocode as country,
language_isocode as language, model_number,
machine_type_model_name model_name,machine_type_group_id,
machine_type_model_id, product_type_name product_type,
machine_type_model_id_to,filenet_link
from ef_dat_lspd_machine_type_model
cross join ef_dim_lspd_country_language
left join ef_dat_lspd_model_country
using (machine_type_model_id, country_isocode)
inner join ef_dat_app_ef_link on (ef_dat_lspd_machine_type_model.lpmd_revenue_pid = ef_dat_app_ef_link.ef_product_revenue_pid)
join ef_dat_lspd_product_type using (product_type_id)
left join ef_dat_lspd_model_relationship on (machine_type_model_id_from = machine_type_model_id)
where (ef_dat_app_ef_link.country_isocode='US' or ef_dat_app_ef_link.country_isocode='CA') and ef_dat_app_ef_link.language_isocode='en'
and (discontinue_date is null or discontinue_date > sysdate) and
(announce_date is null or announce_date <= sysdate)
and (machine_type_model_id in (select machine_type_model_id from ef_dat_lspd_model_parts)
or not machine_type_model_id_to is null) order by machine_type_model_id
I posted a query yesterday (see here) that was horrible (took over a minute to run, resulting in 18,215 records):
SELECT DISTINCT
dbo.contacts_link_emails.Email, dbo.contacts.ContactID, dbo.contacts.First AS ContactFirstName, dbo.contacts.Last AS ContactLastName, dbo.contacts.InstitutionID,
dbo.institutionswithzipcodesadditional.CountyID, dbo.institutionswithzipcodesadditional.StateID, dbo.institutionswithzipcodesadditional.DistrictID
FROM
dbo.contacts_def_jobfunctions AS contacts_def_jobfunctions_3
INNER JOIN
dbo.contacts
INNER JOIN
dbo.contacts_link_emails
ON dbo.contacts.ContactID = dbo.contacts_link_emails.ContactID
ON contacts_def_jobfunctions_3.JobID = dbo.contacts.JobTitle
INNER JOIN
dbo.institutionswithzipcodesadditional
ON dbo.contacts.InstitutionID = dbo.institutionswithzipcodesadditional.InstitutionID
LEFT OUTER JOIN
dbo.contacts_def_jobfunctions
INNER JOIN
dbo.contacts_link_jobfunctions
ON dbo.contacts_def_jobfunctions.JobID = dbo.contacts_link_jobfunctions.JobID
ON dbo.contacts.ContactID = dbo.contacts_link_jobfunctions.ContactID
WHERE
(dbo.contacts.JobTitle IN
(SELECT JobID
FROM dbo.contacts_def_jobfunctions AS contacts_def_jobfunctions_1
WHERE (ParentJobID <> '1841')))
AND
(dbo.contacts_link_emails.Email NOT IN
(SELECT EmailAddress
FROM dbo.newsletterremovelist))
OR
(dbo.contacts_link_jobfunctions.JobID IN
(SELECT JobID
FROM dbo.contacts_def_jobfunctions AS contacts_def_jobfunctions_2
WHERE (ParentJobID <> '1841')))
AND
(dbo.contacts_link_emails.Email NOT IN
(SELECT EmailAddress
FROM dbo.newsletterremovelist AS newsletterremovelist))
ORDER BY EMAIL
With a lot of coaching and research, I've tuned it up to the following:
SELECT contacts.ContactID,
contacts.InstitutionID,
contacts.First,
contacts.Last,
institutionswithzipcodesadditional.CountyID,
institutionswithzipcodesadditional.StateID,
institutionswithzipcodesadditional.DistrictID
FROM contacts
INNER JOIN contacts_link_emails ON
contacts.ContactID = contacts_link_emails.ContactID
INNER JOIN institutionswithzipcodesadditional ON
contacts.InstitutionID = institutionswithzipcodesadditional.InstitutionID
WHERE
(contacts.ContactID IN
(SELECT contacts_2.ContactID
FROM contacts AS contacts_2
INNER JOIN contacts_link_emails AS contacts_link_emails_2 ON
contacts_2.ContactID = contacts_link_emails_2.ContactID
LEFT OUTER JOIN contacts_def_jobfunctions ON
contacts_2.JobTitle = contacts_def_jobfunctions.JobID
RIGHT OUTER JOIN newsletterremovelist ON
contacts_link_emails_2.Email = newsletterremovelist.EmailAddress
WHERE (contacts_def_jobfunctions.ParentJobID <> 1841)
GROUP BY contacts_2.ContactID
UNION
SELECT contacts_1.ContactID
FROM contacts_link_jobfunctions
INNER JOIN contacts_def_jobfunctions AS contacts_def_jobfunctions_1 ON
contacts_link_jobfunctions.JobID = contacts_def_jobfunctions_1.JobID
AND contacts_def_jobfunctions_1.ParentJobID <> 1841
INNER JOIN contacts AS contacts_1 ON
contacts_link_jobfunctions.ContactID = contacts_1.ContactID
INNER JOIN contacts_link_emails AS contacts_link_emails_1 ON
contacts_link_emails_1.ContactID = contacts_1.ContactID
LEFT OUTER JOIN newsletterremovelist AS newsletterremovelist_1 ON
contacts_link_emails_1.Email = newsletterremovelist_1.EmailAddress
GROUP BY contacts_1.ContactID))
While this query is now super fast (about 3 seconds), I've blown part of the logic somewhere - it only returns 14,863 rows (instead of the 18,215 rows that I believe is accurate).
The results seem near correct. I'm working to discover what data might be missing in the result set.
Can you please coach me through whatever I've done wrong here?
Thanks,
Russell Schutte
The main problem with your original query was that you had two extra joins just to introduce duplicates and then a DISTINCT to get rid of them.
Use this:
SELECT cle.Email,
c.ContactID,
c.First AS ContactFirstName,
c.Last AS ContactLastName,
c.InstitutionID,
izip.CountyID,
izip.StateID,
izip.DistrictID
FROM dbo.contacts c
INNER JOIN
dbo.institutionswithzipcodesadditional izip
ON izip.InstitutionID = c.InstitutionID
INNER JOIN
dbo.contacts_link_emails cle
ON cle.ContactID = c.ContactID
WHERE cle.Email NOT IN
(
SELECT EmailAddress
FROM dbo.newsletterremovelist
)
AND EXISTS
(
SELECT NULL
FROM dbo.contacts_def_jobfunctions cdj
WHERE cdj.JobId = c.JobTitle
AND cdj.ParentJobId <> '1841'
UNION ALL
SELECT NULL
FROM dbo.contacts_link_jobfunctions clj
JOIN dbo.contacts_def_jobfunctions cdj
ON cdj.JobID = clj.JobID
WHERE clj.ContactID = c.ContactID
AND cdj.ParentJobId <> '1841'
)
ORDER BY
email
Create the following indexes:
newsletterremovelist (EmailAddress)
contacts_link_jobfunctions (ContactID, JobID)
contacts_def_jobfunctions (JobID)
Do you get the same results when you do:
SELECT count(*)
FROM
dbo.contacts_def_jobfunctions AS contacts_def_jobfunctions_3
INNER JOIN
dbo.contacts
INNER JOIN
dbo.contacts_link_emails
ON dbo.contacts.ContactID = dbo.contacts_link_emails.ContactID
ON contacts_def_jobfunctions_3.JobID = dbo.contacts.JobTitle
SELECT COUNT(*)
FROM
contacts
INNER JOIN contacts_link_jobfunctions
ON contacts.ContactID = contacts_link_jobfunctions.ContactID
INNER JOIN contacts_link_emails
ON contacts.ContactID = contacts_link_emails.ContactID
If so keep adding each join conditon on until you don't get the same results and you will see where your mistake was. If all the joins are the same, then look at the where clauses. But I will be surprised if it isn't in the first join because the syntax you have orginally won't even work on SQL Server and it is pretty nonstandard SQL and may have been incorrect all along but no one knew.
Alternatively, pick a few of the records that are returned in the orginal but not the revised. Track them through the tables one at a time to see if you can find why the second query filters them out.
I'm not directly sure what is wrong, but when I run in to this situation, the first thing I do is start removing variables.
So, comment out the where clause. How many rows are returned?
If you get back the 11,604 rows then you've isolated the problems to the joins. Work though the joins, commenting each one out (remove the associated columns too) and figure out how many rows are eliminated.
As you do this, aim to find what is causing the desired rows to be eliminated. Once isolated, consider the join differences between the first query and the second query.
In looking at the first query, you could probably just modify that to eliminate any INs and instead do a EXISTS instead.
Consider your indexes as well. Any thing in the where or join clauses should probably be indexed.
I have a big SELECT statement which has many nested selects in it. When I run it, it gives me an ORA-22813 error:
Ora-22813:- The Collection value from one of the inner sub queries has exceeded the system limits and hence this error.
I have given below some of the nested selects which return huge data.
---The 1st select returns the most data.
Can I handle and process the huge data returned by the INNER SELECTs into the tables in any alternate way so that there is no error of memory less, sort size less.
get, any other way so that the QUERY successfully processes without error.
/*****************************************BEGIN
LEFT OUTER JOIN
( SELECT *
FROM STUDENT_COURSE stu_c
LEFT OUTER JOIN STUDENT_history ch on stu_c.course_id = ch.ch_course_id
LEFT OUTER JOIN STUDENT_master stu_mca on ch.course_history_id = stu_mca.item_id
) stu_c ON stu_c.HISTORY_ID = toa.ACTIVITY_ID ----->This table is joined earlier
LEFT OUTER JOIN
(SELECT c_e.EV_ID, c_e.EV_NAME, ma.item_id, ma.cata_id
FROM EVENTS c_e LEFT OUTER
JOIN COURSE_master ma on c_e.event_Id = ma.item_id ) c_e ON c_e.EVENT_ID = toa.ACTIVITY_ID
After these selects---we have GROUP_BYs to further sort.
---I have checked that if I put a extra limit qualification
like where rownum <30,<20 in each of these SELECTs it works fine.
Full query
SELECT * FROM (SELECT
mcat.CATALOG_ITEM_ID,
mcat.CATALOG_ITEM_NAME ,
mcat.DESCRIPTION,
mcat.CATALOG_ITEM_TYPE,
mcat.DELIVERY_METHOD,
XMLElement("TRAINING_PLAN",XMLAttributes( TP.TPLAN_ID as "id" ),
XMLELEMENT("COMPLETE_QUANTITY", TP.COMPLETE_QUANTITY),
XMLELEMENT("COMPLETE_UNIT", TP.COMPLETE_UNIT),
XMLElement("TOTAL_CREDITS", TP.numberOfCredits ),
XMLELEMENT("IS_CREDIT_BASED", TP.IS_CREDIT_BASED),
XMLELEMENT("IS_FOR_CERT", TP.IS_FOR_CERT),
XMLELEMENT("ACCREDIT_ORG_NAME", TP.ACCRED_ORG_NAME),
XMLELEMENT("ACCREDIT_ORG_ID", TP.accredit_org_id ),
XMLElement("OBJECTIVE_LIST", TP.OBJECTIVE_LIST )
).extract('/').getClobVal() AS PLAN_LIST
FROM
student_master_catalog mcat
INNER JOIN
(SELECT stu_tp.TPLAN_ID,
stu_tp.COMPLETE_QUANTITY,
stu_tp.COMPLETE_UNIT,
stu_tp.TPLAN_XML_DATA.extract('//numberOfCredits/text()').getStringVal() as numberOfCredits,
stu_tp.IS_CREDIT_BASED,
stu_tp.IS_FOR_CERT,
stu_oa.ACCRED_ORG_NAME,
stu_tp.TPLAN_XML_DATA.extract('//accreditingOrg/text()').getStringVal() as accredit_org_id,
objective_list.OBJECTIVE_LIST
FROM
student_training_catalog stu_tp
LEFT OUTER JOIN
stu_accrediting_org stu_oa on stu_tp.TPLAN_XML_DATA.extract('//accreditingOrg/text()').getStringVal() = stu_oa.ACCRED_ORG_ID
INNER JOIN
(SELECT *
FROM
(SELECT
stu_tpo.TPLAN_ID AS OBJECTIVE_TPLAN_ID,
XMLAgg(
XMLElement("OBJECTIVE",
XMLElement("OBJECTIVE_ID",stu_tpo.T_OBJECTIVE_ID ),
XMLElement("OBJECTIVE_NAME",stu_to.T_OBJECTIVE_NAME ),
XMLElement("OBJECTIVE_REQUIRED_CREDITS_OR_ACTIVITIES",stu_tpo.REQUIRED_CREDITS ),
XMLElement("ITEM_ORDER", stu_tpo.ITEM_ORDER ),
XMLElement("ACTIVITY_LIST", activity_list.ACTIVITY_LIST )
)
) as OBJECTIVE_LIST
FROM
stu_TP_OBJECTIVE stu_tpo
INNER JOIN
stu_TRAINING_OBJECTIVE stu_to ON stu_tpo.T_OBJECTIVE_ID = stu_to.T_OBJECTIVE_ID
INNER JOIN
(SELECT *
FROM
(SELECT stu_toa.T_OBJECTIVE_ID AS ACTIVITY_TOBJ_ID, XMLAgg(
XMLElement("ACTIVITY",
XMLElement("ACTIVITY_ID",stu_toa.ACTIVITY_ID ),
XMLElement("CATALOG_ID",COALESCE(stu_c.CATALOG_ID, COALESCE( stu_e.CATALOG_ID, stu_t.CATALOG_ID ) ) ),
XMLElement("CATALOG_ITEM_ID",COALESCE(stu_c.CATALOG_ITEM_ID, COALESCE( stu_e.CATALOG_ITEM_ID, stu_t.CATALOG_ITEM_ID ) ) ),
XMLElement("DELIVERY_METHOD",COALESCE(stu_c.DELIVERY_METHOD, COALESCE( stu_e.DELIVERY_METHOD, stu_t.DELIVERY_METHOD ) ) ),
XMLElement("ACTIVITY_NAME",COALESCE(stu_c.COURSE_NAME, COALESCE( stu_e.EVENT_NAME, stu_t.TEST_NAME ) ) ),
XMLElement("ACTIVITY_TYPE",initcap( stu_toa.ACTIVITY_TYPE ) ),
XMLElement("IS_REQUIRED",stu_toa.IS_REQUIRED ),
XMLElement("IS_PREFERRED",stu_toa.IS_PREFERRED ),
XMLElement("NUMBER_OF_CREDITS",stu_lac.CREDIT_HOURS),
XMLElement("ITEM_ORDER", stu_toa.ITEM_ORDER )
)) as ACTIVITY_LIST
FROM stu_TRAIN_OBJ_ACTIVITY stu_toa
LEFT OUTER JOIN
(
SELECT distinct lac.LEARNING_ACTIVITY_ID, lac.CREDIT_HOURS
FROM student_training_catalog tp
INNER JOIN stu_TP_OBJECTIVE tpo on tp.TPLAN_ID = tpo.TPLAN_ID
INNER JOIN stu_TRAIN_OBJ_ACTIVITY toa on tpo.T_OBJECTIVE_ID = toa.T_OBJECTIVE_ID
INNER JOIN stu_LEARNINGACTIVITY_CREDITS lac on lac.LEARNING_ACTIVITY_ID = toa.ACTIVITY_ID and tp.TPLAN_XML_DATA.extract ('//accreditingOrg/text()').getStringVal() = lac.ACC_ORG_ID
where tp.tplan_id ='*************'
) stu_lac ON stu_lac.LEARNING_ACTIVITY_ID = stu_toa.ACTIVITY_ID ------>This Select returns correct no. of rows
I want to join the below nested SELECTs with stu_toa.ACTIVITY_ID. This would solve my issues.
This below SELECT inside the LEFT OUTER JOIN is the Problem. it returns too much because 3 tables are joined directly without any value qualification.
LEFT OUTER JOIN
( SELECT ch.COURSE_HISTORY_ID, stu_c.COURSE_NAME, mca.catalog_item_id, mca.catalog_id, mca.delivery_method
FROM stu_COURSE stu_c
LEFT OUTER JOIN stu_course_history ch on stu_c.course_id = ch.ch_course_id -
--If I can qualify here with ch.ch_course_id = stu_toa.ACTIVITY_ID (stu_toa.ACTIVITY_ID from the above select with correct no. of rows )
--Here, I get errors because I can't access outside values inside a left outer join
LEFT OUTER JOIN student_master_catalog mca on ch.course_history_id = mca.catalog_item_id
) stu_c ON stu_c.COURSE_HISTORY_ID = stu_toa.ACTIVITY_ID
LEFT OUTER JOIN
(SELECT stu_e.EVENT_ID, stu_e.EVENT_NAME, mca.catalog_item_id, mca.catalog_id, mca.delivery_method FROM stu_EVENTS stu_e LEFT OUTER JOIN student_master_catalog mca on stu_e.event_Id = mca.catalog_item_id ) stu_e ON stu_e.EVENT_ID = stu_toa.ACTIVITY_ID
LEFT OUTER JOIN
(SELECT stu_t.TEST_HISTORY_ID, stu_t.TEST_NAME, mca.catalog_item_id, mca.catalog_id, mca.delivery_method FROM stu_TEST_HISTORY stu_t LEFT OUTER JOIN student_master_catalog mca on stu_t.test_history_id = mca.catalog_item_id) stu_t ON stu_t.test_history_id = stu_toa.ACTIVITY_ID
GROUP BY stu_toa.T_OBJECTIVE_ID) ) activity_list ON activity_list.ACTIVITY_TOBJ_ID = stu_tpo.T_OBJECTIVE_ID
GROUP BY stu_tpo.TPLAN_ID) ) objective_list ON objective_list.OBJECTIVE_TPLAN_ID = stu_tp.TPLAN_ID
)TP ON TP.TPLAN_ID = mcat.CATALOG_ITEM_ID
WHERE
mcat.CATALOG_ITEM_ID = '*****************' and mcat.CATALOG_ORG_ID = '********')
Please post the DDLs, approximate sizes (relative to each other), and the complete query, rather than just an excerpt.
Some quick hits that may or may not solve your problem (for better help, I need better information) --
Are you sure you mean OUTER join? Outer joining students to courses means students who are not taking any courses will still be around. Is that the desired behaviour?
Don't select * if you only want a limited subset of the columns. Enumerate the exact columns you need. The rest might not seem like much on a row-by-row basis, but when you multiply by the total number of rows you have, this sort of thing can mean the difference between in-memory sorts and spilling to disk.
How many rows of data are you looking at? there are times when separate queries with programmatic aggregation can work better. Someone with more knowledge of Oracle query optimization may be able to help, also, tweaking the settings could help here too...
I've had instances where a sproc was being called that aggregated data from more than one source took exponentially longer than two calls in the app, and putting it together in memory.
Post DDL of your tables and exact plan of the query.
Meanwhile, try increasing pga_aggregate_target, sort_area_size and hash_area_size