Sort & Parallelism costing my query too much time - sql

My SQL query takes a long time to run. I wrote a similar query and pitted the two against each other: this one runs FASTER on a small dataset (10K rows), but about 20-30x slower than the other one on a LARGE dataset (500K+ rows). However, my first query is missing ONE column that I need, and I cannot add it without taking this different approach.
SELECT a.[RFIDTAGID], a.[JOB_NUMBER], d.[PROJECT_NUMBER], a.[PART_NUMBER], a.[QUANTITY], b.[DESIGNATION] as LOCATION,
c.[DESIGNATION] as CONTAINER, a.[LAST_SEEN_TIME], b.[TYPE], b.[BLDG], d.[PBG], d.[PLANNED_MFG_DELIVERY_DATE], d.[EXTENSION_DATE], a.[ORG_ID]
FROM [LTS].[dbo].[LTS_PACKAGE] as a
LEFT OUTER JOIN (
SELECT [DESIGNATION], [CONTAINER_ID], [LOCATION_ID]
FROM [LTS].[dbo].[LTS_CONTAINER]
) c ON a.[CONTAINER_ID] = c.[CONTAINER_ID]
LEFT OUTER JOIN (
SELECT [DESIGNATION], [TYPE], [BLDG], [LOCATION_ID]
FROM [LTS].[dbo].[LTS_LOCATION]
) b ON a.[LAST_SEEN_LOC_ID] = b.[LOCATION_ID] OR b.[LOCATION_ID] = c.[LOCATION_ID]
INNER JOIN (
SELECT [PBG], [PLANNED_MFG_DELIVERY_DATE], [EXTENSION_DATE], [DISCRETE_JOB_NUMBER], [PROJECT_NUMBER]
FROM [LTS].[dbo].[LTS_DISCRETE_JOB_SUMMARY]
)d ON a.[JOB_NUMBER] = d.[DISCRETE_JOB_NUMBER]
WHERE
d.[PLANNED_MFG_DELIVERY_DATE] <= GETDATE()
AND b.[TYPE] NOT IN('MFG', 'Manufacturing')
AND (b.[DESIGNATION] IS NOT NULL OR c.[DESIGNATION] IS NOT NULL)
ORDER BY [JOB_NUMBER], d.[PLANNED_MFG_DELIVERY_DATE] desc, [RFIDTAGID];
You can see below the usage, 100% is roughly 20,000, whereas my other query is about 900:
Is there something I can do to speed up my query, or where did I bog it down?

Remove inner selects and join directly to the tables:
SELECT a.[RFIDTAGID], a.[JOB_NUMBER], d.[PROJECT_NUMBER], a.[PART_NUMBER], a.[QUANTITY], b.[DESIGNATION] as LOCATION,
c.[DESIGNATION] as CONTAINER, a.[LAST_SEEN_TIME], b.[TYPE], b.[BLDG], d.[PBG], d.[PLANNED_MFG_DELIVERY_DATE], d.[EXTENSION_DATE], a.[ORG_ID]
FROM [LTS].[dbo].[LTS_PACKAGE] a
LEFT OUTER JOIN [LTS].[dbo].[LTS_CONTAINER]
c ON a.[CONTAINER_ID] = c.[CONTAINER_ID]
LEFT OUTER JOIN [dbo].[LTS_LOCATION]
b ON a.[LAST_SEEN_LOC_ID] = b.[LOCATION_ID] OR b.[LOCATION_ID] = c.[LOCATION_ID]
INNER JOIN
[LTS].[dbo].[LTS_DISCRETE_JOB_SUMMARY]
d ON a.[JOB_NUMBER] = d.[DISCRETE_JOB_NUMBER]
WHERE
d.[PLANNED_MFG_DELIVERY_DATE] <= GETDATE()
AND b.[TYPE] NOT IN('MFG', 'Manufacturing')
AND (b.[DESIGNATION] IS NOT NULL OR c.[DESIGNATION] IS NOT NULL)
ORDER BY [JOB_NUMBER], d.[PLANNED_MFG_DELIVERY_DATE] desc, [RFIDTAGID];

Related

Clustered and non-clustered index seeking increase execution time in stored procedure

I have a stored procedure which takes over 3 minutes to execute. When I show the execution plan, I find a clustered index seek and a non-clustered index seek.
My query:
SELECT distinct
[tbl_worflowprocess].[currenttid]
,USR2.[firstname] AS [prev_action_user_name]
,USR3.[firstname] AS [current_action_user_name]
,COD2.[Code] AS [reasontext]
,[tbl_application_details].[application_id] AS [ApplicationId]
,[tbl_application_details].[application_number] AS [ApplicationNumber]
,[dbo].[fn_app_GetApplicationId]([tbl_application_details].[link_application_id]) AS [LinkApplicationId]
,[tbl_application_details].[link_type] AS [LinkType]
,[dbo].[fn_app_CountProductsInApplication]([tbl_application_details].[application_id]) AS [ProductsCount]
,[tbl_application_details].[submission_date] AS [SubmissionDate]
,[tbl_jurisdiction].[jurisdictionname]
,[tbl_devicetype].[devicetype]
,COD1.[Code] AS [ClassificationName]
,EST1.[name] AS [ApplicantName]
,EST2.[name] AS [ManufacturerName]
,[dbo].[fnGetApplicationStatusFromTaskId]([tbl_worflowprocess].[currenttid]) AS [AppStatus]
,[dbo].[fnGetApplicationStatusText](@pLoggedInUserRoleId,[tbl_worflowprocess].[currenttid]) AS [StatusText]
,[Paid] = (CASE [tbl_application_details].[paid] WHEN 1 THEN 'Yes' ELSE 'No' END)
,[CreationDate] = [tbl_worflowprocess].[creationdate]
,[CommentText] =
(select CommentText
from dbo.tbl_application_comments
where Id = (select max(Id) from dbo.tbl_application_comments
where ApplicationId= [tbl_application_details].[application_id] and UserId = @pLoggedInUserID ))
,[LastCab] = (select isnull(dbo.fnGetLastCabForApplication([tbl_application_details].[application_id]),'-'))
,[tbl_application_details].[ArExpired]
FROM
[tbl_worflowprocess]
INNER JOIN
(SELECT
[application_id], [actionbyuser_id],
[actionbyrole_id], [reason_id], createddate
FROM
[tbl_applicationworkflowhistory]
INNER JOIN
(SELECT
[application_id] AS C1, MAX([version]) AS C2
FROM
[tbl_applicationworkflowhistory]
WHERE
(@pCurrentRoleId IS NULL
OR [application_id] IN (SELECT [application_id]
FROM [tbl_applicationworkflowhistory]
INNER JOIN [tbl_worflowprocess] ON [tbl_applicationworkflowhistory].[application_id] = [tbl_worflowprocess].[applicationid]
WHERE
(@pSearchInHistory = 0 OR [tbl_applicationworkflowhistory].[actionbyrole_id] = @pCurrentRoleId OR [tbl_worflowprocess].[currentroleid] = @pCurrentRoleId)
AND (@pSearchInHistory = 1 OR [tbl_worflowprocess].[currentroleid] = @pCurrentRoleId)
)
) AND
(@pCurrentUserId IS NULL OR [application_id] IN (
SELECT [application_id]
FROM [tbl_applicationworkflowhistory]
INNER JOIN [tbl_worflowprocess] ON [tbl_applicationworkflowhistory].[application_id]=[tbl_worflowprocess].[applicationid]
WHERE
(@pSearchInHistory=0 OR [tbl_applicationworkflowhistory].[actionbyuser_id] =@pCurrentUserId OR [tbl_worflowprocess].[currentuserid]=@pCurrentUserId)
AND (@pSearchInHistory=1 OR [tbl_worflowprocess].[currentuserid]=@pCurrentUserId)
)
) AND
(@pCurrentEstablishmentId IS NULL OR [application_id] IN (
SELECT [application_id]
FROM [tbl_applicationworkflowhistory]
INNER JOIN [tbl_worflowprocess] ON [tbl_applicationworkflowhistory].[application_id]=[tbl_worflowprocess].[applicationid]
WHERE
(@pSearchInHistory=0 OR [tbl_applicationworkflowhistory].[actionbyuser_id] IN
(SELECT [user_id] FROM [tbl_user] WHERE [establishment_id]=@pCurrentEstablishmentId) OR [tbl_worflowprocess].[currentuserid] IN
(SELECT [user_id] FROM [tbl_user] WHERE [establishment_id]=@pCurrentEstablishmentId))
AND (@pSearchInHistory=1 OR [tbl_worflowprocess].[currentuserid] IN
(SELECT [user_id] FROM [tbl_user] WHERE [establishment_id]=@pCurrentEstablishmentId))
)
)
GROUP BY [application_id]
)AS T1 ON ([tbl_applicationworkflowhistory].[application_id]=T1.C1 AND [tbl_applicationworkflowhistory].[version]=T1.C2)
) AS T2 ON([tbl_worflowprocess].[applicationid]=T2.[application_id])
INNER JOIN [tbl_application_details] ON [tbl_application_details].[application_id]=[tbl_worflowprocess].[applicationid]
INNER JOIN [tbl_user] USR1 ON USR1.[user_id]=[tbl_application_details].[responsible_user_id]
INNER JOIN [tbl_establishments] EST1 on EST1.[establishment_id] = USR1.[establishment_id]
LEFT OUTER JOIN [tbl_user] USR2 ON USR2.[user_id]=T2.[actionbyuser_id]
LEFT OUTER JOIN [tbl_user] USR3 ON USR3.[user_id]=[tbl_worflowprocess].[currentuserid]
LEFT OUTER JOIN [tbl_establishments] EST2 on EST2.[establishment_id] = [tbl_application_details].[manufacturer_id]
LEFT OUTER JOIN [tbl_jurisdiction] ON [tbl_jurisdiction].[jurisdiction_id]=[tbl_application_details].[jurisdiction_id]
LEFT OUTER JOIN [tbl_devicetype] ON [tbl_devicetype].[devicetype_id]=[tbl_application_details].[device_type_id]
LEFT OUTER JOIN [tbl_codes] COD1 ON COD1.[code_id]=[tbl_application_details].[device_classification_id]
LEFT OUTER JOIN [tbl_codes] COD2 ON COD2.[code_id]=T2.[reason_id]
LEFT OUTER JOIN [tbl_certificates] CERTF ON CERTF.[application_id]=[tbl_application_details].[application_id]
WHERE
(@pWFTasks IS NULL OR
[tbl_worflowprocess].[currenttid] IN (SELECT item
FROM [dbo].[fnSplit](@pWFTasks,',')))
Any way to improve my query?
Try creating indexes on the tables based on what the query needs. Use the suggested index improvements if there are any, as long as they do not interfere with the rest of your database.
If the execution plan shows a table scan even though an index already exists on that column, try changing the index to include the columns you select.
If you can, avoid UDFs when the query returns many rows.
If you can pre-calculate something, do so using table variables or even CTEs. For example, this subquery (if it returns more than one value) can be stored in a table variable: (SELECT [user_id] FROM [tbl_user] WHERE [establishment_id]=@pCurrentEstablishmentId)
Lookups such as "select max(Id) from dbo.tbl_application_comments" can be improved by storing the result in a simple variable before the main query.
Use SNAPSHOT or READ UNCOMMITTED isolation, or at least (nolock). Note that READ UNCOMMITTED and (nolock) allow dirty reads.
Make sure the statistics on the tables are up to date!
Check that you are using LEFT JOIN correctly (an INNER JOIN is usually faster).
Limit the number of rows going into each join as much as possible: use WHERE clauses to cut down the data.
More advice could be given; query optimization is an interesting field.
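To illustrate the point about UDFs, here is a minimal sketch using Python's sqlite3 module rather than SQL Server, so treat it as an analogy only. The tables, the data, and the count_products function are hypothetical stand-ins for something like fn_app_CountProductsInApplication. It shows that a scalar function in the select list is invoked once per returned row, while a join with GROUP BY produces the same counts in one set-based statement.

```python
import sqlite3

# Hypothetical data: one application_id per product row.
PRODUCTS = [(1,), (1,), (2,)]

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE applications (application_id INTEGER PRIMARY KEY);
    CREATE TABLE products (product_id INTEGER PRIMARY KEY, application_id INTEGER);
    INSERT INTO applications VALUES (1), (2), (3);
""")
conn.executemany("INSERT INTO products (application_id) VALUES (?)", PRODUCTS)

udf_calls = 0

def count_products(application_id):
    """Hypothetical scalar UDF: does its own lookup on every call."""
    global udf_calls
    udf_calls += 1
    return sum(1 for (app_id,) in PRODUCTS if app_id == application_id)

conn.create_function("count_products", 1, count_products)

# Row-by-row: the function runs once for each application row.
per_row = conn.execute(
    "SELECT application_id, count_products(application_id) FROM applications"
).fetchall()

# Set-based: one join plus GROUP BY yields the same counts.
set_based = conn.execute("""
    SELECT a.application_id, COUNT(p.product_id)
    FROM applications a
    LEFT JOIN products p ON p.application_id = a.application_id
    GROUP BY a.application_id
""").fetchall()

print("UDF invocations:", udf_calls)
print(sorted(per_row) == sorted(set_based))
```

With three application rows the function is called three times; on a result set of hundreds of thousands of rows that per-row cost is what tends to dominate.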

How do I optimize my query in MySQL?

I need to improve my query, especially the execution time. This is my query:
SELECT SQL_CALC_FOUND_ROWS p.*,v.type,v.idName,v.name as etapaName,m.name AS manager,
c.name AS CLIENT,
(SELECT SEC_TO_TIME(SUM(TIME_TO_SEC(duration)))
FROM activities a
WHERE a.projectid = p.projectid) AS worked,
(SELECT SUM(TIME_TO_SEC(duration))
FROM activities a
WHERE a.projectid = p.projectid) AS worked_seconds,
(SELECT SUM(TIME_TO_SEC(remain_time))
FROM tasks t
WHERE t.projectid = p.projectid) AS remain_time
FROM projects p
INNER JOIN users m
ON p.managerid = m.userid
INNER JOIN clients c
ON p.clientid = c.clientid
INNER JOIN `values` v
ON p.etapa = v.id
WHERE 1 = 1
ORDER BY idName
ASC
The execution time of this is approx. 5 sec. If I remove this part: (SELECT SUM(TIME_TO_SEC(remain_time)) FROM tasks t WHERE t.projectid = p.projectid) AS remain_time
the execution time is reduced to 0.3 sec. Is there a way to get the remain_time values with a shorter execution time?
The SQL is invoked from PHP (if this is relevant to any proposed solution).
It sounds like you need an index on tasks.
Try adding this one:
create index idx_tasks_projectid_remaintime on tasks(projectid, remain_time);
The correlated subquery should just use the index and go much faster.
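As a rough way to see the effect of such an index, here is a sketch that uses SQLite (through Python's sqlite3) instead of MySQL, so the plan text will not match what MySQL's EXPLAIN prints. The table contents are made up and remain_time is simplified to an integer number of seconds. It compares the plan for the inner lookup before and after creating the index.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE tasks (taskid INTEGER PRIMARY KEY, projectid INTEGER, remain_time INTEGER)"
)
# Made-up data: 1000 tasks spread over 100 projects.
conn.executemany(
    "INSERT INTO tasks (projectid, remain_time) VALUES (?, ?)",
    [(i % 100, i) for i in range(1000)],
)

# The lookup the correlated subquery performs for each project row.
QUERY = "SELECT SUM(remain_time) FROM tasks WHERE projectid = ?"

def plan(sql):
    """Return SQLite's plan description for the statement as one string."""
    rows = conn.execute("EXPLAIN QUERY PLAN " + sql, (1,)).fetchall()
    return " | ".join(row[-1] for row in rows)

plan_before = plan(QUERY)
conn.execute(
    "CREATE INDEX idx_tasks_projectid_remaintime ON tasks (projectid, remain_time)"
)
plan_after = plan(QUERY)

print("before:", plan_before)
print("after: ", plan_after)
```

Before the index, the plan is a scan of the whole table; afterwards it is a search on the index, which also covers remain_time, so the table itself does not need to be read.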
Optimizing the query as it is written would give significant performance benefits (see below). But the FIRST QUESTION TO ASK when approaching any optimization is whether you really need to see all the data: there is no filtering of the result set implemented here. This has a HUGE impact on how you optimize a query.
Adding an index for the query above will only help if the optimizer opens a new cursor on the tasks table for every row returned by the main query. In the absence of any filtering, it will be much faster to do a single full table scan of the tasks table.
SELECT ilv.*, remaining.rtime
FROM (
SELECT p.*, v.type, v.idName, v.name AS etapaName,
m.name AS manager, c.name AS CLIENT,
SEC_TO_TIME(asbq.worked) AS worked, asbq.worked AS seconds_worked
FROM projects p
INNER JOIN users m
ON p.managerid = m.userid
INNER JOIN clients c
ON p.clientid = c.clientid
INNER JOIN `values` v
ON p.etapa = v.id
-- total worked seconds per project, computed in a single pass over activities
LEFT JOIN (
SELECT a.projectid, SUM(TIME_TO_SEC(duration)) AS worked
FROM activities a
GROUP BY a.projectid
) asbq
ON asbq.projectid=p.projectid
) ilv
-- total remaining seconds per project, computed in a single pass over tasks
LEFT JOIN (
SELECT t.projectid, SUM(TIME_TO_SEC(remain_time)) AS rtime
FROM tasks t
GROUP BY t.projectid
) remaining
ON ilv.projectid=remaining.projectid
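To check that the pre-aggregated join returns the same values as the correlated subquery, here is a small sketch in SQLite via Python's sqlite3 (not MySQL). The tables are cut down to hypothetical minimal versions, and remain_seconds stands in for TIME_TO_SEC(remain_time).

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE projects (projectid INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE tasks (taskid INTEGER PRIMARY KEY, projectid INTEGER, remain_seconds INTEGER);
    INSERT INTO projects VALUES (1, 'alpha'), (2, 'beta'), (3, 'gamma');
    INSERT INTO tasks (projectid, remain_seconds) VALUES (1, 600), (1, 300), (2, 120);
""")

# Original shape: one subquery evaluated per project row.
correlated = conn.execute("""
    SELECT p.projectid,
           (SELECT SUM(t.remain_seconds)
            FROM tasks t
            WHERE t.projectid = p.projectid) AS remain_time
    FROM projects p
    ORDER BY p.projectid
""").fetchall()

# Rewritten shape: aggregate tasks once, then LEFT JOIN the totals.
pre_aggregated = conn.execute("""
    SELECT p.projectid, remaining.rtime
    FROM projects p
    LEFT JOIN (
        SELECT t.projectid, SUM(t.remain_seconds) AS rtime
        FROM tasks t
        GROUP BY t.projectid
    ) remaining ON remaining.projectid = p.projectid
    ORDER BY p.projectid
""").fetchall()

print(correlated)
print(pre_aggregated)
```

The LEFT JOIN matters here: project 3 has no tasks and still appears with a NULL total, just as it does with the correlated subquery.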

Strange performance issue with SELECT (SUBQUERY)

I have a stored procedure that has been having some issues lately and I finally narrowed it down to 1 SELECT. The problem is I cannot figure out exactly what is happening to kill the performance of this one query. I re-wrote it, but I am not sure the re-write is the exact same data.
Original Query:
SELECT
@userId, p.job, p.charge_code, p.code
, (SELECT SUM(b.total) FROM dbo.[backorder w/total] b WHERE b.ponumber = p.ponumber AND b.code = p.code)
, ISNULL(jm.markup, 0)
, (SELECT SUM(b.TOTAL_TAX) FROM dbo.[backorder w/total] b WHERE b.ponumber = p.ponumber AND b.code = p.code)
, p.ponumber
, p.billable
, p.[date]
FROM dbo.PO p
INNER JOIN dbo.JobCostFilter jcf
ON p.job = jcf.jobno AND p.charge_code = jcf.chargecode AND jcf.userno = @userId
LEFT JOIN dbo.JobMarkup jm
ON jm.jobno = p.job
AND jm.code = p.code
LEFT JOIN dbo.[Working Codes] wc
ON p.code = wc.code
INNER JOIN dbo.JOBFILE j
ON j.JOB_NO = p.job
WHERE (wc.brcode <> 4 OR @BmtDb = 0)
GROUP BY p.job, p.charge_code, p.code, p.ponumber, p.billable, p.[date], jm.markup, wc.brcode
This query will practically never finish running. It actually times out for some larger jobs we have.
And if I change the 2 subqueries in the select to read like joins instead:
SELECT
@userid, p.job, p.charge_code, p.code
, (SELECT SUM(b.TOTAL))
, ISNULL(jm.markup, 0)
, (SELECT SUM(b.TOTAL_TAX))
, p.ponumber, p.billable, p.[date]
FROM dbo.PO p
INNER JOIN dbo.JobCostFilter jcf
ON p.job = jcf.jobno AND p.charge_code = jcf.chargecode AND jcf.userno = 11190030
INNER JOIN [BACKORDER W/TOTAL] b
ON P.PONUMBER = b.ponumber AND P.code = b.code
LEFT JOIN dbo.JobMarkup jm
ON jm.jobno = p.job
AND jm.code = p.code
LEFT JOIN dbo.[Working Codes] wc
ON p.code = wc.code
INNER JOIN dbo.JOBFILE j
ON j.JOB_NO = p.job
WHERE (wc.brcode <> 4 OR @BmtDb = 0)
GROUP BY p.job, p.charge_code, p.code, p.ponumber, p.billable, p.[date], jm.markup, wc.brcode
The data comes out looking very nearly identical to me (though there are thousands of lines overall so I could be wrong), and it runs very quickly.
Any ideas appreciated..
Performance
In the second query you have fewer logical reads because the table [BACKORDER W/TOTAL] is scanned only once. In the first query the two subqueries are processed independently, so the table is scanned twice even though both subqueries have the same predicates.
Correctness
If you want to check whether two queries return the same result set, you can use the EXCEPT operator:
If both statements:
First SELECT Query...
EXCEPT
Second SELECT Query...
and
Second SELECT Query..
EXCEPT
First SELECT Query...
return an empty set, the result sets are identical.
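Here is a minimal sketch of that two-way EXCEPT check, run in SQLite through Python's sqlite3 rather than SQL Server. The tables result_a and result_b are hypothetical stand-ins for the output of the two queries.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE result_a (job TEXT, total INTEGER);
    CREATE TABLE result_b (job TEXT, total INTEGER);
    INSERT INTO result_a VALUES ('J1', 10), ('J2', 20), ('J3', NULL);
    INSERT INTO result_b VALUES ('J1', 10), ('J2', 20);
""")

def only_in(first, second):
    """Rows that appear in `first` but not in `second`."""
    sql = f"SELECT job, total FROM {first} EXCEPT SELECT job, total FROM {second}"
    return conn.execute(sql).fetchall()

a_not_b = only_in("result_a", "result_b")
b_not_a = only_in("result_b", "result_a")

print("only in A:", a_not_b)
print("only in B:", b_not_a)
print("identical:", not a_not_b and not b_not_a)
```

One caveat: EXCEPT is a set operation and collapses duplicates, so this check will not notice when the two queries return the same distinct rows a different number of times. Comparing row counts as well covers that case.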
In terms of correctness, you are inner joining [BACKORDER W/TOTAL] in the second query, so if the first query has Null values in the subqueries, these rows would be missing in the second query.
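The missing-rows effect can be reproduced with a small sketch, again in SQLite via Python's sqlite3 rather than SQL Server. The po and backorder tables are hypothetical, reduced versions of dbo.PO and [BACKORDER W/TOTAL].

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE po (ponumber INTEGER, code TEXT);
    CREATE TABLE backorder (ponumber INTEGER, code TEXT, total INTEGER);
    INSERT INTO po VALUES (1, 'A'), (2, 'B');          -- PO 2 has no backorder rows
    INSERT INTO backorder VALUES (1, 'A', 50), (1, 'A', 25);
""")

# Subquery form: every PO row is kept, the sum is NULL when nothing matches.
subquery_rows = conn.execute("""
    SELECT p.ponumber, p.code,
           (SELECT SUM(b.total) FROM backorder b
            WHERE b.ponumber = p.ponumber AND b.code = p.code)
    FROM po p
    ORDER BY p.ponumber
""").fetchall()

JOIN_TEMPLATE = """
    SELECT p.ponumber, p.code, SUM(b.total)
    FROM po p
    {join} backorder b ON b.ponumber = p.ponumber AND b.code = p.code
    GROUP BY p.ponumber, p.code
    ORDER BY p.ponumber
"""

# INNER JOIN drops PO 2; LEFT JOIN keeps it with a NULL sum.
inner_join_rows = conn.execute(JOIN_TEMPLATE.format(join="INNER JOIN")).fetchall()
left_join_rows = conn.execute(JOIN_TEMPLATE.format(join="LEFT JOIN")).fetchall()

print(subquery_rows)
print(inner_join_rows)
print(left_join_rows)
```

In this toy data a LEFT JOIN gives the same rows as the subquery version, which suggests trying LEFT JOIN in the rewritten query if the NULL rows are needed.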
For performance, the optimizer is a heuristic - it will sometimes use spectacularly bad query plans, and even minimal changes can sometimes lead to a completely different query plan. Your best chance is to compare the query plans and see what causes the difference.

How order of joins affect performance of a query

I'm experiencing big differences in time performance in my query, and it seems the order in which the joins (inner and left outer) occur in the query makes all the difference.
Are there some "ground rules" for what order joins should be in?
Both of them are part of a bigger query.
The difference between them is that the left join is placed last in the faster query.
Slow query: (> 10 minutes)
SELECT [t0].[Ref], [t1].[Key], [t1].[Name],
(CASE
WHEN [t3].[test] IS NULL THEN CONVERT(NVarChar(250),@p0)
ELSE CONVERT(NVarChar(250),[t3].[Key])
END) AS [value],
(CASE
WHEN 0 = 1 THEN CONVERT(NVarChar(250),@p1)
ELSE CONVERT(NVarChar(250),[t4].[Key])
END) AS [value2]
FROM [dbo].[tblA] AS [t0]
INNER JOIN [dbo].[tblB] AS [t1] ON [t0].[RefB] = [t1].[Ref]
LEFT OUTER JOIN (
SELECT 1 AS [test], [t2].[Ref], [t2].[Key]
FROM [dbo].[tblC] AS [t2]
) AS [t3] ON [t0].[RefC] = ([t3].[Ref])
INNER JOIN [dbo].[tblD] AS [t4] ON [t0].[RefD] = ([t4].[Ref])
Faster query: (~ 30 seconds)
SELECT [t0].[Ref], [t1].[Key], [t1].[Name],
(CASE
WHEN [t3].[test] IS NULL THEN CONVERT(NVarChar(250),@p0)
ELSE CONVERT(NVarChar(250),[t3].[Key])
END) AS [value],
(CASE
WHEN 0 = 1 THEN CONVERT(NVarChar(250),@p1)
ELSE CONVERT(NVarChar(250),[t4].[Key])
END) AS [value2]
FROM [dbo].[tblA] AS [t0]
INNER JOIN [dbo].[tblB] AS [t1] ON [t0].[RefB] = [t1].[Ref]
INNER JOIN [dbo].[tblD] AS [t4] ON [t0].[RefD] = ([t4].[Ref])
LEFT OUTER JOIN (
SELECT 1 AS [test], [t2].[Ref], [t2].[Key]
FROM [dbo].[tblC] AS [t2]
) AS [t3] ON [t0].[RefC] = ([t3].[Ref])
Generally, INNER JOIN order won't matter because inner joins are commutative and associative. In both cases you still have t0 inner join t4, so it should make no difference.
Rephrasing that: SQL is declarative. You say "what you want", not "how". The optimiser works out the "how" and will re-order JOINs as needed, looking at WHEREs etc. too in practice.
In complex queries, a cost-based query optimiser won't exhaust all permutations, so it could matter occasionally.
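The commutativity claim is easy to check on toy data. The sketch below uses SQLite through Python's sqlite3 rather than SQL Server, with hypothetical cut-down versions of tblA, tblB and tblD. It only shows that the result is the same for both join orders; it says nothing about which order a particular optimiser will execute faster.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE tbl_a (ref INTEGER PRIMARY KEY, ref_b INTEGER, ref_d INTEGER);
    CREATE TABLE tbl_b (ref INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE tbl_d (ref INTEGER PRIMARY KEY, d_key TEXT);
    INSERT INTO tbl_a VALUES (1, 10, 100), (2, 20, 200), (3, 30, 999);
    INSERT INTO tbl_b VALUES (10, 'b10'), (20, 'b20'), (30, 'b30');
    INSERT INTO tbl_d VALUES (100, 'd100'), (200, 'd200');
""")

# Join order as written: a -> b -> d
b_then_d = conn.execute("""
    SELECT a.ref, b.name, d.d_key
    FROM tbl_a a
    INNER JOIN tbl_b b ON a.ref_b = b.ref
    INNER JOIN tbl_d d ON a.ref_d = d.ref
    ORDER BY a.ref
""").fetchall()

# Join order as written: d -> a -> b
d_then_b = conn.execute("""
    SELECT a.ref, b.name, d.d_key
    FROM tbl_d d
    INNER JOIN tbl_a a ON a.ref_d = d.ref
    INNER JOIN tbl_b b ON a.ref_b = b.ref
    ORDER BY a.ref
""").fetchall()

print(b_then_d)
print(b_then_d == d_then_b)
```

Row 3 of tbl_a has no match in tbl_d, so it is absent from both results regardless of the order in which the joins are written.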
So, I'd check for these:
You said these are part of a bigger query, so this section matters less because the whole query matters.
Complexity can be hidden using views too if any of the tables are actually views
Is this repeatable, no matter what order code runs in?
What are the query plan differences?
See some other SO questions:
how to best organize the Inner Joins in (select) statement
SQL Server 2005 - Order of Inner Joins
If you have more than 2 tables, the order of the table joins is important. It can make big differences. The first table should get a leading hint. The first table should be the object with the most selective rows. For example: if you have a member table with 1,000,000 people, you only want to select the female members, and it is the first table, then you only join 500,000 records to the next table. If this table is at the end of the join order (maybe table 4, 5 or 6), then each record (worst case 1,000,000) will be joined. This applies to both inner and outer joins.
The Rule: start with the most selective table, then join the next most selective table.
Conversion functions and formatting should be done last. Sometimes it is better to
wrap the whole SQL in brackets and apply expressions and functions in the outer select statement.
In the case of left joins, the order impacts performance a lot. I was having a problem with a select query that looked like this:
select distinct count(p0_.id) over () as col_0_0_,
p0_.id as col_1_0_,
p0_.lp as col_2_0_,
0
as col_3_0_,
max(coalesce(i6_.cft, i7_.rfo,
'')) as col_4_0_,
p0_.pdv as col_5_0_,
(s8_.qer)
as col_6_0_,
cf1_.ests as col_7_0_
from Produit p0_
left outer join CF cf1_ on p0_.fk_cf = cf1_.id
left outer join CA c2_ on cf1_.fk_ca = c2_.id
left outer join ml mt on c2_.fk_m = mt.id
left outer join sk s8_ on p0_.id = s8_.fk_p
left outer join rf r5_ on
rp4_.fk_r = r5_.id
left outer join
in i6_ on r5_.fk_ireftc = i6_.id
left outer join r_p_r rp4_ on p0_.id = rp4_.fk_p
left outer join
ir i7_ on r5_.fk_if = i7_.id
left outer join re_p_g gc9_ on p0_.id = gc9_.fk_p
left outer join gc g10_ on gc9_.fk_g = g10_.id
where
and (p0_.lC is null or p0_.lS = 'E')
and g10_.id is null or g10_.id
and r5_.fk_i is null
group by col_1_0_, col_2_0_, col_3_0_, col_5_0_, col_6_0_, col_7_0_
order by col_2_0_ asc, p0_.id
limit 10;
The query takes 13 to 15 seconds to execute; when I change the order, it takes 1 to 2 seconds.
select distinct count(p0_.id) over () as col_0_0_,
p0_.id as col_1_0_,
p0_.lp as col_2_0_,
0
as col_3_0_,
max(coalesce(i6_.cft, i7_.rfo,
'')) as col_4_0_,
p0_.pdv as col_5_0_,
(s8_.qer)
as col_6_0_,
cf1_.ests as col_7_0_
from Produit p0_
left outer join CF cf1_ on p0_.fk_cf = cf1_.id
left outer join sk s8_ on p0_.id = s8_.fk_p
left outer join r_p_r rp4_ on p0_.id = rp4_.fk_p
left outer join re_p_g gc9_ on p0_.id = gc9_.fk_p
left outer join CA c2_ on cf1_.fk_ca = c2_.id
left outer join ml mt on c2_.fk_m = mt.id
left outer join rf r5_ on
rp4_.fk_r = r5_.id
left outer join
in i6_ on r5_.fk_ireftc = i6_.id
left outer join
ir i7_ on r5_.fk_if = i7_.id
left outer join gc g10_ on gc9_.fk_g = g10_.id
where
and (p0_.lC is null or p0_.lS = 'E')
and(g10_.id is null
and r5_.fk_i is null
group by col_1_0_, col_2_0_, col_3_0_, col_5_0_, col_6_0_, col_7_0_
order by col_2_0_ asc, p0_.id
limit 10;
In my case, I changed the order so that once a table is loaded, all the joins that use that table follow it directly rather than appearing in another block. For example, for my p0_ table I put all of its left joins in the first 4 lines, unlike in the first code.
PS: to test my performance in PostgreSQL I use this website: http://tatiyants.com/pev/#/plans/new
At least in SQLite, I found out that it makes a huge difference. Actually it didn't need to be a very complex query for the difference to show itself. My JOIN statements were inside an embedded clause however.
Basically, you should start with the most specific limitations first, as Christian has pointed out.

ORA-22813 error with SQL complex query

I have a big SELECT statement with many nested selects in it. When I run it, it gives me an ORA-22813 error:
ORA-22813: The collection value from one of the inner subqueries has exceeded the system limits, hence this error.
Below are some of the nested selects that return huge amounts of data.
---The 1st select returns the most data.
Can I handle and process the huge data returned by the inner SELECTs in some alternative way, so that the query completes successfully without running out of memory or sort area?
/*****************************************BEGIN
LEFT OUTER JOIN
( SELECT *
FROM STUDENT_COURSE stu_c
LEFT OUTER JOIN STUDENT_history ch on stu_c.course_id = ch.ch_course_id
LEFT OUTER JOIN STUDENT_master stu_mca on ch.course_history_id = stu_mca.item_id
) stu_c ON stu_c.HISTORY_ID = toa.ACTIVITY_ID ----->This table is joined earlier
LEFT OUTER JOIN
(SELECT c_e.EV_ID, c_e.EV_NAME, ma.item_id, ma.cata_id
FROM EVENTS c_e LEFT OUTER
JOIN COURSE_master ma on c_e.event_Id = ma.item_id ) c_e ON c_e.EVENT_ID = toa.ACTIVITY_ID
After these selects---we have GROUP_BYs to further sort.
---I have checked that if I put a extra limit qualification
like where rownum <30,<20 in each of these SELECTs it works fine.
Full query
SELECT * FROM (SELECT
mcat.CATALOG_ITEM_ID,
mcat.CATALOG_ITEM_NAME ,
mcat.DESCRIPTION,
mcat.CATALOG_ITEM_TYPE,
mcat.DELIVERY_METHOD,
XMLElement("TRAINING_PLAN",XMLAttributes( TP.TPLAN_ID as "id" ),
XMLELEMENT("COMPLETE_QUANTITY", TP.COMPLETE_QUANTITY),
XMLELEMENT("COMPLETE_UNIT", TP.COMPLETE_UNIT),
XMLElement("TOTAL_CREDITS", TP.numberOfCredits ),
XMLELEMENT("IS_CREDIT_BASED", TP.IS_CREDIT_BASED),
XMLELEMENT("IS_FOR_CERT", TP.IS_FOR_CERT),
XMLELEMENT("ACCREDIT_ORG_NAME", TP.ACCRED_ORG_NAME),
XMLELEMENT("ACCREDIT_ORG_ID", TP.accredit_org_id ),
XMLElement("OBJECTIVE_LIST", TP.OBJECTIVE_LIST )
).extract('/').getClobVal() AS PLAN_LIST
FROM
student_master_catalog mcat
INNER JOIN
(SELECT stu_tp.TPLAN_ID,
stu_tp.COMPLETE_QUANTITY,
stu_tp.COMPLETE_UNIT,
stu_tp.TPLAN_XML_DATA.extract('//numberOfCredits/text()').getStringVal() as numberOfCredits,
stu_tp.IS_CREDIT_BASED,
stu_tp.IS_FOR_CERT,
stu_oa.ACCRED_ORG_NAME,
stu_tp.TPLAN_XML_DATA.extract('//accreditingOrg/text()').getStringVal() as accredit_org_id,
objective_list.OBJECTIVE_LIST
FROM
student_training_catalog stu_tp
LEFT OUTER JOIN
stu_accrediting_org stu_oa on stu_tp.TPLAN_XML_DATA.extract('//accreditingOrg/text()').getStringVal() = stu_oa.ACCRED_ORG_ID
INNER JOIN
(SELECT *
FROM
(SELECT
stu_tpo.TPLAN_ID AS OBJECTIVE_TPLAN_ID,
XMLAgg(
XMLElement("OBJECTIVE",
XMLElement("OBJECTIVE_ID",stu_tpo.T_OBJECTIVE_ID ),
XMLElement("OBJECTIVE_NAME",stu_to.T_OBJECTIVE_NAME ),
XMLElement("OBJECTIVE_REQUIRED_CREDITS_OR_ACTIVITIES",stu_tpo.REQUIRED_CREDITS ),
XMLElement("ITEM_ORDER", stu_tpo.ITEM_ORDER ),
XMLElement("ACTIVITY_LIST", activity_list.ACTIVITY_LIST )
)
) as OBJECTIVE_LIST
FROM
stu_TP_OBJECTIVE stu_tpo
INNER JOIN
stu_TRAINING_OBJECTIVE stu_to ON stu_tpo.T_OBJECTIVE_ID = stu_to.T_OBJECTIVE_ID
INNER JOIN
(SELECT *
FROM
(SELECT stu_toa.T_OBJECTIVE_ID AS ACTIVITY_TOBJ_ID, XMLAgg(
XMLElement("ACTIVITY",
XMLElement("ACTIVITY_ID",stu_toa.ACTIVITY_ID ),
XMLElement("CATALOG_ID",COALESCE(stu_c.CATALOG_ID, COALESCE( stu_e.CATALOG_ID, stu_t.CATALOG_ID ) ) ),
XMLElement("CATALOG_ITEM_ID",COALESCE(stu_c.CATALOG_ITEM_ID, COALESCE( stu_e.CATALOG_ITEM_ID, stu_t.CATALOG_ITEM_ID ) ) ),
XMLElement("DELIVERY_METHOD",COALESCE(stu_c.DELIVERY_METHOD, COALESCE( stu_e.DELIVERY_METHOD, stu_t.DELIVERY_METHOD ) ) ),
XMLElement("ACTIVITY_NAME",COALESCE(stu_c.COURSE_NAME, COALESCE( stu_e.EVENT_NAME, stu_t.TEST_NAME ) ) ),
XMLElement("ACTIVITY_TYPE",initcap( stu_toa.ACTIVITY_TYPE ) ),
XMLElement("IS_REQUIRED",stu_toa.IS_REQUIRED ),
XMLElement("IS_PREFERRED",stu_toa.IS_PREFERRED ),
XMLElement("NUMBER_OF_CREDITS",stu_lac.CREDIT_HOURS),
XMLElement("ITEM_ORDER", stu_toa.ITEM_ORDER )
)) as ACTIVITY_LIST
FROM stu_TRAIN_OBJ_ACTIVITY stu_toa
LEFT OUTER JOIN
(
SELECT distinct lac.LEARNING_ACTIVITY_ID, lac.CREDIT_HOURS
FROM student_training_catalog tp
INNER JOIN stu_TP_OBJECTIVE tpo on tp.TPLAN_ID = tpo.TPLAN_ID
INNER JOIN stu_TRAIN_OBJ_ACTIVITY toa on tpo.T_OBJECTIVE_ID = toa.T_OBJECTIVE_ID
INNER JOIN stu_LEARNINGACTIVITY_CREDITS lac on lac.LEARNING_ACTIVITY_ID = toa.ACTIVITY_ID and tp.TPLAN_XML_DATA.extract ('//accreditingOrg/text()').getStringVal() = lac.ACC_ORG_ID
where tp.tplan_id ='*************'
) stu_lac ON stu_lac.LEARNING_ACTIVITY_ID = stu_toa.ACTIVITY_ID ------>This Select returns correct no. of rows
I want to join the below nested SELECTs with stu_toa.ACTIVITY_ID. This would solve my issues.
This below SELECT inside the LEFT OUTER JOIN is the Problem. it returns too much because 3 tables are joined directly without any value qualification.
LEFT OUTER JOIN
( SELECT ch.COURSE_HISTORY_ID, stu_c.COURSE_NAME, mca.catalog_item_id, mca.catalog_id, mca.delivery_method
FROM stu_COURSE stu_c
LEFT OUTER JOIN stu_course_history ch on stu_c.course_id = ch.ch_course_id -
--If I can qualify here with ch.ch_course_id = stu_toa.ACTIVITY_ID (stu_toa.ACTIVITY_ID from the above select with correct no. of rows )
--Here, I get errors because I can't access outside values inside a left outer join
LEFT OUTER JOIN student_master_catalog mca on ch.course_history_id = mca.catalog_item_id
) stu_c ON stu_c.COURSE_HISTORY_ID = stu_toa.ACTIVITY_ID
LEFT OUTER JOIN
(SELECT stu_e.EVENT_ID, stu_e.EVENT_NAME, mca.catalog_item_id, mca.catalog_id, mca.delivery_method FROM stu_EVENTS stu_e LEFT OUTER JOIN student_master_catalog mca on stu_e.event_Id = mca.catalog_item_id ) stu_e ON stu_e.EVENT_ID = stu_toa.ACTIVITY_ID
LEFT OUTER JOIN
(SELECT stu_t.TEST_HISTORY_ID, stu_t.TEST_NAME, mca.catalog_item_id, mca.catalog_id, mca.delivery_method FROM stu_TEST_HISTORY stu_t LEFT OUTER JOIN student_master_catalog mca on stu_t.test_history_id = mca.catalog_item_id) stu_t ON stu_t.test_history_id = stu_toa.ACTIVITY_ID
GROUP BY stu_toa.T_OBJECTIVE_ID) ) activity_list ON activity_list.ACTIVITY_TOBJ_ID = stu_tpo.T_OBJECTIVE_ID
GROUP BY stu_tpo.TPLAN_ID) ) objective_list ON objective_list.OBJECTIVE_TPLAN_ID = stu_tp.TPLAN_ID
)TP ON TP.TPLAN_ID = mcat.CATALOG_ITEM_ID
WHERE
mcat.CATALOG_ITEM_ID = '*****************' and mcat.CATALOG_ORG_ID = '********')
Please post the DDLs, approximate sizes (relative to each other), and the complete query, rather than just an excerpt.
Some quick hits that may or may not solve your problem (for better help, I need better information) --
Are you sure you mean OUTER join? Outer joining students to courses means students who are not taking any courses will still be around. Is that the desired behaviour?
Don't select * if you only want a limited subset of the columns. Enumerate the exact columns you need. The rest might not seem like much on a row-by-row basis, but when you multiply by the total number of rows you have, this sort of thing can mean the difference between in-memory sorts and spilling to disk.
How many rows of data are you looking at? There are times when separate queries with programmatic aggregation work better. Someone with more knowledge of Oracle query optimization may be able to help; tweaking the settings could help here too.
I've had instances where a sproc that aggregated data from more than one source took far longer than making two calls from the app and putting the results together in memory.
Post DDL of your tables and exact plan of the query.
Meanwhile, try increasing pga_aggregate_target, sort_area_size and hash_area_size