Top 1 on Left Join SubQuery - sql

I am trying to take a person and display their current insurance along with their former insurance. I guess one could say that I'm trying to flaten my view of customers or people. I'm running into an issue where I'm getting multiple records back due to multiple records existing within my left join subqueries. I had hoped I could solve this by adding "TOP 1" to the subquery, but that actually returns nothing...
Any ideas?
SELECT
p.person_id AS 'MIRID'
, p.firstname AS 'FIRST'
, p.lastname AS 'LAST'
, pg.name AS 'GROUP'
, e.name AS 'AOR'
, p.leaddate AS 'CONTACT DATE'
, [dbo].[GetPICampaignDisp](p.person_id, '2009') AS 'PI - 2009'
, [dbo].[GetPICampaignDisp](p.person_id, '2008') AS 'PI - 2008'
, [dbo].[GetPICampaignDisp](p.person_id, '2007') AS 'PI - 2007'
, a_disp.name AS 'CURR DISP'
, a_ins.name AS 'CURR INS'
, a_prodtype.name AS 'CURR INS TYPE'
, a_t.date AS 'CURR INS APP DATE'
, a_t.effdate AS 'CURR INS EFF DATE'
, b_disp.name AS 'PREV DISP'
, b_ins.name AS 'PREV INS'
, b_prodtype.name AS 'PREV INS TYPE'
, b_t.date AS 'PREV INS APP DATE'
, b_t.effdate AS 'PREV INS EFF DATE'
, b_t.termdate AS 'PREV INS TERM DATE'
FROM
[person] p
LEFT OUTER JOIN
[employee] e
ON
e.employee_id = p.agentofrecord_id
INNER JOIN
[dbo].[person_physician] pp
ON
p.person_id = pp.person_id
INNER JOIN
[dbo].[physician] ph
ON
ph.physician_id = pp.physician_id
INNER JOIN
[dbo].[clinic] c
ON
c.clinic_id = ph.clinic_id
INNER JOIN
[dbo].[d_Physgroup] pg
ON
pg.d_physgroup_id = c.physgroup_id
LEFT OUTER JOIN
(
SELECT
tr1.*
FROM
[transaction] tr1
LEFT OUTER JOIN
[d_vendor] ins1
ON
ins1.d_vendor_id = tr1.d_vendor_id
LEFT OUTER JOIN
[d_product_type] prodtype1
ON
prodtype1.d_product_type_id = tr1.d_product_type_id
LEFT OUTER JOIN
[d_commission_type] ctype1
ON
ctype1.d_commission_type_id = tr1.d_commission_type_id
WHERE
prodtype1.name <> 'Medicare Part D'
AND tr1.termdate IS NULL
) AS a_t
ON
a_t.person_id = p.person_id
LEFT OUTER JOIN
[d_vendor] a_ins
ON
a_ins.d_vendor_id = a_t.d_vendor_id
LEFT OUTER JOIN
[d_product_type] a_prodtype
ON
a_prodtype.d_product_type_id = a_t.d_product_type_id
LEFT OUTER JOIN
[d_commission_type] a_ctype
ON
a_ctype.d_commission_type_id = a_t.d_commission_type_id
LEFT OUTER JOIN
[d_disposition] a_disp
ON
a_disp.d_disposition_id = a_t.d_disposition_id
LEFT OUTER JOIN
(
SELECT
tr2.*
FROM
[transaction] tr2
LEFT OUTER JOIN
[d_vendor] ins2
ON
ins2.d_vendor_id = tr2.d_vendor_id
LEFT OUTER JOIN
[d_product_type] prodtype2
ON
prodtype2.d_product_type_id = tr2.d_product_type_id
LEFT OUTER JOIN
[d_commission_type] ctype2
ON
ctype2.d_commission_type_id = tr2.d_commission_type_id
WHERE
prodtype2.name <> 'Medicare Part D'
AND tr2.termdate IS NOT NULL
) AS b_t
ON
b_t.person_id = p.person_id
LEFT OUTER JOIN
[d_vendor] b_ins
ON
b_ins.d_vendor_id = b_t.d_vendor_id
LEFT OUTER JOIN
[d_product_type] b_prodtype
ON
b_prodtype.d_product_type_id = b_t.d_product_type_id
LEFT OUTER JOIN
[d_commission_type] b_ctype
ON
b_ctype.d_commission_type_id = b_t.d_commission_type_id
LEFT OUTER JOIN
[d_disposition] b_disp
ON
b_disp.d_disposition_id = b_t.d_disposition_id
WHERE
pg.d_physgroup_id = #PhysGroupID

In Sql server 2005 you can use OUTER APPLY
SELECT p.person_id, s.e.employee_id
FROM person p
OUTER APPLY (SELECT TOP 1 *
FROM Employee
WHERE /*JOINCONDITION*/
ORDER BY /*Something*/ DESC) s
http://technet.microsoft.com/en-us/library/ms175156.aspx

The pattern I normally use for this is:
SELECT whatever
FROM person
LEFT JOIN subtable AS s1
ON s1.personid = person.personid
...
WHERE NOT EXISTS
( SELECT 1 FROM subtable
WHERE personid = person.personid
AND orderbydate > s1.orderbydate
)
Which avoids the TOP 1 clause and maybe makes it a little clearer.
BTW, I like the way you've put this query together in general, except I'd leave out the brackets, assuming you have rationally named tables and columns; and you might even gain some performance (but at least elegance) by listing columns for tr1 and tr2, rather than "tr1.*" and "tr2.*".

Thanks for all of the feedback and ideas...
In the simplest of terms, I have a person table that stores contact information like name, email, etc. I have another table that stores transactions. Each transaction is really an insurance policy that would contain information on the provider, product type, product name, etc.
I want to avoid giving the user duplicate person records since this causes them to look for the duplicates prior to running mail merges, etc. I'm getting duplicates when there is more than 1 transaction that has not been terminated, and when there is more than 1 transaction that has been terminated.
Someone else suggested that I consider a cursor to grab my distinct contact records and then perform the sub selects to get the current and previous insurance information. I don't know if I want to head down that path though.

It's difficult to understand your question so first I'll throw this out there: does changing your SELECT to SELECT DISTINCT do what you want?
Otherwise, let me get this straight, you're trying to get your customers' current insurance and previous insurance, but they may actually have many insurances before that, recorded in the [transactions] table? I looked at your SQL for quite a few minutes but can't figure out what it all means, so could you please reduce it down to only the parts that are necessary? Then I'll think about it some more. It sounds to me like you need a GROUP BY somehow, but I can't work it out exactly yet.

Couldn't take the time to dig through all your SQL (what a beast!), here's an idea that might make things easier to handle:
select
p.person_id, p.name <and other person columns>,
(select <current policy columns>
from pol <and other tables for policy>
where pol.<columns for join> = p.person_id
and <restrictions for current policy>),
(select <previous policy columns>
from pol <and other tables for policy>
where pol.<columns for join> = p.person_id
and <restrictions for previouspolicy>),
<other columns>
from person p <and "directly related" tables>
This makes the statement easier to read by separating the different parts into their own subselects, and it also makes it easier to add a "Top 1" in without affecting the rest of the statement. Hope that helps.

Related

Trying to get the right SQL Query (Joins)

I've been trying to join a couple tables to get the right result.
The problem is its returning some NULL values where this shouldn't be the case.
EP.ProductId, CreateDate, IsproductActivated and Subject are returning a Null Value.I've noticed it That DossierID and dossier will return a value when I make a right join as stated below.
Could anyone help me with my case?
SELECT
EP.ProductId AS [ProductId]
,MAX(EP.CreateDate) AS [CreateDate]
,EP.IsProductActivated AS [IsProductActivated]
,PC.EntityId AS [DossierID]
,PC.EntityDiscriminator AS [EntityDiscriminator]
,PDCP.Name AS [Subject]
,D.Name AS [Dossier]
FROM
[tblEntityProduct] AS EP
LEFT JOIN .tblPDCProduct AS PDCP
ON PDCP.id = EP.Productid
LEFT JOIN
[tblProductContainerContent] AS PCC
ON PCC.EntityProductId = EP.ProductId
RIGHT JOIN
[tblProductContainer] AS PC
ON PC.Id = PCC.Productcontainerid
LEFT JOIN
[tblDossier] AS D
ON D.Id = PC.Entityid
WHERE PC.EntityId = 2803
GROUP BY EP.Productid
,IsProductActivated
,EntityId
,EntityDiscriminator
,PDCP.name
,D.name
The picture below shows the result of the above query. I want the values it suppose to have instead of the NULL
use inner join to avoid getting null values, because The LEFT JOIN keyword returns all records from the left table (table1), and the matched records from the right table (table2). The result is NULL from the right side, if there is no match. for right join opposite is true of left join
SELECT
EP.ProductId AS [ProductId]
,MAX(EP.CreateDate) AS [CreateDate]
,EP.IsProductActivated AS [IsProductActivated]
,PC.EntityId AS [DossierID]
,PC.EntityDiscriminator AS [EntityDiscriminator]
,PDCP.Name AS [Subject]
,D.Name AS [Dossier]
FROM
[tblEntityProduct] AS EP
JOIN .tblPDCProduct AS PDCP
ON PDCP.id = EP.Productid
JOIN
[tblProductContainerContent] AS PCC
ON PCC.EntityProductId = EP.ProductId
JOIN
[tblProductContainer] AS PC
ON PC.Id = PCC.Productcontainerid
JOIN
[tblDossier] AS D
ON D.Id = PC.Entityid
WHERE PC.EntityId = 2803
GROUP BY EP.Productid
,IsProductActivated
,EntityId
,EntityDiscriminator
,PDCP.name
,D.name
I hate right-joins. They are basically a REQUIRE THIS TABLE regardless of the others leading up to it. Instead, I reversed the query to put the REQUIRED Table in the first position, then LEFT-JOIN to the underlying. Now, based on the structure / join, I would THINK that you start with a product container -- will always exist. Each product container has stuff in it (ProductContainerContent). Then, each content item IS an entity product to get its description. Each container has its own Dossier which may (or not) be assigned at the time the container is being prepared. If I am accurate, most of the LEFT-JOINs should just be standard (inner) JOINs, but lets see what this does against your data.
SELECT
EP.ProductId,
MAX(EP.CreateDate) CreateDate,
EP.IsProductActivated,
PC.EntityId DossierID,
PC.EntityDiscriminator,
PDCP.Name Subject,
D.Name Dossier
FROM
tblProductContainer PC
LEFT JOIN tblProductContainerContent PCC
ON PC.Id = PCC.Productcontainerid
LEFT JOIN tblEntityProduct EP
ON PCC.EntityProductId = EP.ProductId
LEFT JOIN tblPDCProduct PDCP
ON EP.Productid = PDCP.id
LEFT JOIN tblDossier D
ON PC.Entityid = D.Id
WHERE
PC.EntityId = 2803
GROUP BY
EP.Productid,
IsProductActivated,
EntityId,
EntityDiscriminator,
PDCP.name,
D.name

SQL query. Very restricted due to environment. Is it even possible?

I'have a specific problem. I need get some some data from database. I have a mechanism to retrieve data from a program. I need to use it, no modifications possible. The original query is:
SELECT it_Symbol AS Symbol, tt_Name AS Nazwa, tt_Price AS Cena,
tt_Quantity AS Ilosc, tt_Id
FROM tr__Transaction INNER JOIN tr_Item
ON tt_TransId=tr_Id LEFT OUTER JOIN it__Item
ON tt_ItemId = it_Id RIGHT JOIN reg_Site
ON tr_SiteId = rs_Id LEFT OUTER JOIN it_ItemSite
ON it_Id = is_ItemId
WHERE tt_TransId=#transId
GROUP BY tt_Id, tt_Quantity, tr_Id, it_Name, tt_Price,it_Symbol,
is_Name, tt_Name, tt_ItemId, tt_Id
The problem is that I need to get some additional data from tr__Transaction table.
It has a field tr_Source. I need this fields value, but for tr__transaction records which have tr_Id listed in returned tt_Id field.
Any way to do a subquery returning values dependant on tt_Id column values?
Or maybe any other joins combination? I've spend whole week with this, and have no more ideas or skills to do this:/
Any help would be very appreciated.
Ok still don't know exactly what you need but it is an atempt to clean up the question. So this is a working version of the answer since you can't format code in comments.
Please explain if some relations are wrong.
you can always join tables multiple times under different conditions as long as they have different aliases.
For instance:
SELECT c.it_Symbol AS Symbol, a.tt_Name AS Nazwa, a.tt_Price AS Cena,
a.tt_Quantity AS Ilosc, a.tt_Id, f.tr_Source
FROM tr__Transaction a
INNER JOIN tr_Item b
ON a.tt_TransId=b.tr_Id
LEFT OUTER JOIN it__Item c
ON a.tt_ItemId = c.it_Id
RIGHT JOIN reg_Site d
ON a.tr_SiteId = d.rs_Id
LEFT OUTER JOIN it_ItemSite e
ON c.it_Id = e.is_ItemId
LEFT OUTER JOIN tr__Transaction f
ON c.tt_id = f.tr_id
WHERE a.tt_TransId=#transId
GROUP BY a.tt_Id, a.tt_Quantity, a.tr_Id, c.it_Name, a.tt_Price,c.it_Symbol,
e.is_Name, a.tt_Name, a.tt_ItemId, a.tt_Id
I'm not sure if I understand you question correctly, but assuming you are saying that the Original SQL statement cannot be changed (ie, it's in a read-only View). Then you can wrap it around another SELECT statement.
SELECT tblOriginal.*, tblExtend.tt_Source
FROM (
SELECT it_Symbol AS Symbol, tt_Name AS Nazwa, tt_Price AS Cena,
tt_Quantity AS Ilosc, tt_Id
FROM tr__Transaction INNER JOIN tr_Item
ON tt_TransId=tr_Id LEFT OUTER JOIN it__Item
ON tt_ItemId = it_Id RIGHT JOIN reg_Site
ON tr_SiteId = rs_Id LEFT OUTER JOIN it_ItemSite
ON it_Id = is_ItemId
WHERE tt_TransId=#transId
GROUP BY tt_Id, tt_Quantity, tr_Id, it_Name, tt_Price,it_Symbol,
is_Name, tt_Name, tt_ItemId, tt_Id
) AS tblOriginal
INNER JOIN tr__Transaction AS tblExtend
ON tblOriginal.tt_Id = tblExtend.tt_Id
But I suspect your problem is more complicated that that as you've spent over a week on it. In that case, can you elaborate?

Super Slow Query - sped up, but not perfect... Please help

I posted a query yesterday (see here) that was horrible (took over a minute to run, resulting in 18,215 records):
SELECT DISTINCT
dbo.contacts_link_emails.Email, dbo.contacts.ContactID, dbo.contacts.First AS ContactFirstName, dbo.contacts.Last AS ContactLastName, dbo.contacts.InstitutionID,
dbo.institutionswithzipcodesadditional.CountyID, dbo.institutionswithzipcodesadditional.StateID, dbo.institutionswithzipcodesadditional.DistrictID
FROM
dbo.contacts_def_jobfunctions AS contacts_def_jobfunctions_3
INNER JOIN
dbo.contacts
INNER JOIN
dbo.contacts_link_emails
ON dbo.contacts.ContactID = dbo.contacts_link_emails.ContactID
ON contacts_def_jobfunctions_3.JobID = dbo.contacts.JobTitle
INNER JOIN
dbo.institutionswithzipcodesadditional
ON dbo.contacts.InstitutionID = dbo.institutionswithzipcodesadditional.InstitutionID
LEFT OUTER JOIN
dbo.contacts_def_jobfunctions
INNER JOIN
dbo.contacts_link_jobfunctions
ON dbo.contacts_def_jobfunctions.JobID = dbo.contacts_link_jobfunctions.JobID
ON dbo.contacts.ContactID = dbo.contacts_link_jobfunctions.ContactID
WHERE
(dbo.contacts.JobTitle IN
(SELECT JobID
FROM dbo.contacts_def_jobfunctions AS contacts_def_jobfunctions_1
WHERE (ParentJobID <> '1841')))
AND
(dbo.contacts_link_emails.Email NOT IN
(SELECT EmailAddress
FROM dbo.newsletterremovelist))
OR
(dbo.contacts_link_jobfunctions.JobID IN
(SELECT JobID
FROM dbo.contacts_def_jobfunctions AS contacts_def_jobfunctions_2
WHERE (ParentJobID <> '1841')))
AND
(dbo.contacts_link_emails.Email NOT IN
(SELECT EmailAddress
FROM dbo.newsletterremovelist AS newsletterremovelist))
ORDER BY EMAIL
With a lot of coaching and research, I've tuned it up to the following:
SELECT contacts.ContactID,
contacts.InstitutionID,
contacts.First,
contacts.Last,
institutionswithzipcodesadditional.CountyID,
institutionswithzipcodesadditional.StateID,
institutionswithzipcodesadditional.DistrictID
FROM contacts
INNER JOIN contacts_link_emails ON
contacts.ContactID = contacts_link_emails.ContactID
INNER JOIN institutionswithzipcodesadditional ON
contacts.InstitutionID = institutionswithzipcodesadditional.InstitutionID
WHERE
(contacts.ContactID IN
(SELECT contacts_2.ContactID
FROM contacts AS contacts_2
INNER JOIN contacts_link_emails AS contacts_link_emails_2 ON
contacts_2.ContactID = contacts_link_emails_2.ContactID
LEFT OUTER JOIN contacts_def_jobfunctions ON
contacts_2.JobTitle = contacts_def_jobfunctions.JobID
RIGHT OUTER JOIN newsletterremovelist ON
contacts_link_emails_2.Email = newsletterremovelist.EmailAddress
WHERE (contacts_def_jobfunctions.ParentJobID <> 1841)
GROUP BY contacts_2.ContactID
UNION
SELECT contacts_1.ContactID
FROM contacts_link_jobfunctions
INNER JOIN contacts_def_jobfunctions AS contacts_def_jobfunctions_1 ON
contacts_link_jobfunctions.JobID = contacts_def_jobfunctions_1.JobID
AND contacts_def_jobfunctions_1.ParentJobID <> 1841
INNER JOIN contacts AS contacts_1 ON
contacts_link_jobfunctions.ContactID = contacts_1.ContactID
INNER JOIN contacts_link_emails AS contacts_link_emails_1 ON
contacts_link_emails_1.ContactID = contacts_1.ContactID
LEFT OUTER JOIN newsletterremovelist AS newsletterremovelist_1 ON
contacts_link_emails_1.Email = newsletterremovelist_1.EmailAddress
GROUP BY contacts_1.ContactID))
While this query is now super fast (about 3 seconds), I've blown part of the logic somewhere - it only returns 14,863 rows (instead of the 18,215 rows that I believe is accurate).
The results seem near correct. I'm working to discover what data might be missing in the result set.
Can you please coach me through whatever I've done wrong here?
Thanks,
Russell Schutte
The main problem with your original query was that you had two extra joins just to introduce duplicates and then a DISTINCT to get rid of them.
Use this:
SELECT cle.Email,
c.ContactID,
c.First AS ContactFirstName,
c.Last AS ContactLastName,
c.InstitutionID,
izip.CountyID,
izip.StateID,
izip.DistrictID
FROM dbo.contacts c
INNER JOIN
dbo.institutionswithzipcodesadditional izip
ON izip.InstitutionID = c.InstitutionID
INNER JOIN
dbo.contacts_link_emails cle
ON cle.ContactID = c.ContactID
WHERE cle.Email NOT IN
(
SELECT EmailAddress
FROM dbo.newsletterremovelist
)
AND EXISTS
(
SELECT NULL
FROM dbo.contacts_def_jobfunctions cdj
WHERE cdj.JobId = c.JobTitle
AND cdj.ParentJobId <> '1841'
UNION ALL
SELECT NULL
FROM dbo.contacts_link_jobfunctions clj
JOIN dbo.contacts_def_jobfunctions cdj
ON cdj.JobID = clj.JobID
WHERE clj.ContactID = c.ContactID
AND cdj.ParentJobId <> '1841'
)
ORDER BY
email
Create the following indexes:
newsletterremovelist (EmailAddress)
contacts_link_jobfunctions (ContactID, JobID)
contacts_def_jobfunctions (JobID)
Do you get the same results when you do:
SELECT count(*)
FROM
dbo.contacts_def_jobfunctions AS contacts_def_jobfunctions_3
INNER JOIN
dbo.contacts
INNER JOIN
dbo.contacts_link_emails
ON dbo.contacts.ContactID = dbo.contacts_link_emails.ContactID
ON contacts_def_jobfunctions_3.JobID = dbo.contacts.JobTitle
SELECT COUNT(*)
FROM
contacts
INNER JOIN contacts_link_jobfunctions
ON contacts.ContactID = contacts_link_jobfunctions.ContactID
INNER JOIN contacts_link_emails
ON contacts.ContactID = contacts_link_emails.ContactID
If so keep adding each join conditon on until you don't get the same results and you will see where your mistake was. If all the joins are the same, then look at the where clauses. But I will be surprised if it isn't in the first join because the syntax you have orginally won't even work on SQL Server and it is pretty nonstandard SQL and may have been incorrect all along but no one knew.
Alternatively, pick a few of the records that are returned in the orginal but not the revised. Track them through the tables one at a time to see if you can find why the second query filters them out.
I'm not directly sure what is wrong, but when I run in to this situation, the first thing I do is start removing variables.
So, comment out the where clause. How many rows are returned?
If you get back the 11,604 rows then you've isolated the problems to the joins. Work though the joins, commenting each one out (remove the associated columns too) and figure out how many rows are eliminated.
As you do this, aim to find what is causing the desired rows to be eliminated. Once isolated, consider the join differences between the first query and the second query.
In looking at the first query, you could probably just modify that to eliminate any INs and instead do a EXISTS instead.
Consider your indexes as well. Any thing in the where or join clauses should probably be indexed.

SQL joins "going up" two tables

I'm trying to create a moderately complex query with joins:
SELECT `history`.`id`,
`parts`.`type_id`,
`serialized_parts`.`serial`,
`history_actions`.`action`,
`history`.`date_added`
FROM `history_actions`, `history`
LEFT OUTER JOIN `parts` ON `parts`.`id` = `history`.`part_id`
LEFT OUTER JOIN `serialized_parts` ON `serialized_parts`.`parts_id` = `history`.`part_id`
WHERE `history_actions`.`id` = `history`.`action_id`
AND `history`.`unit_id` = '1'
ORDER BY `history`.`id` DESC
I'd like to replace `parts`.`type_id` in the SELECT statement with `part_list`.`name` where the relationship I need to enforce between the two tables is `part_list`.`id` = `parts`.`type_id`. Also I have to use joins because in some cases `history`.`part_id` may be NULL which obviously isn't a valid part id. How would I modify the query to do this?
Here is some sample date as requested:
history table:
(source: ianburris.com)
serialized_parts table:
(source: ianburris.com)
parts table:
(source: ianburris.com)
part_list table:
(source: ianburris.com)
And what I want to see is:
id name serial action date_added
4 Battery 567 added 2010-05-19 10:42:51
3 Antenna Board 345 added 2010-05-19 10:42:51
2 Main Board 123 added 2010-05-19 10:42:51
1 NULL NULL created 2010-05-19 10:42:51
This would at least be on the right track...
If you're looking to NOT show any parts with an invalid ID, simply change the LEFT JOINs to INNER JOINs (they will restrict NULL values)
SELECT `history`.`id`
, `parts`.`type_id`
, `part_list`.`name`
, `serialized_parts`.`serial`
, `history_actions`.`action`
, `history`.`date_added`
FROM `history_actions`
INNER JOIN `history` ON `history`.`action_id` = `history_actions`.`id`
LEFT JOIN `parts` ON `parts`.`id` = `history`.`part_id`
LEFT JOIN `serialized_parts` ON `serialized_parts`.`parts_id` = `history`.`part_id`
LEFT JOIN `part_list` ON `part_list`.`id` = `parts`.`type_id`
WHERE `history`.`unit_id` = '1'
ORDER BY `history`.`id` DESC
Boy, these backticks make my eyes hurt.
SELECT
h.id,
p.type_id,
pl.name,
sp.serial,
ha.action,
h.date_added
FROM
history h
INNER JOIN history_actions ha ON ha.id = h.action_id
LEFT JOIN parts p ON p.id = h.part_id
LEFT JOIN serialized_parts sp ON sp.parts_id = h.part_id
LEFT JOIN part_list pl ON pl.id = p.type_id
WHERE
h.unit_id = '1'
ORDER BY
history.id DESC

ORA-22813 error with SQL complex query

I have a big SELECT statement which has many nested selects in it. When I run it, it gives me an ORA-22813 error:
Ora-22813:- The Collection value from one of the inner sub queries has exceeded the system limits and hence this error.
I have given below some of the nested selects which return huge data.
---The 1st select returns the most data.
Can I handle and process the huge data returned by the INNER SELECTs into the tables in any alternate way so that there is no error of memory less, sort size less.
get, any other way so that the QUERY successfully processes without error.
/*****************************************BEGIN
LEFT OUTER JOIN
( SELECT *
FROM STUDENT_COURSE stu_c
LEFT OUTER JOIN STUDENT_history ch on stu_c.course_id = ch.ch_course_id
LEFT OUTER JOIN STUDENT_master stu_mca on ch.course_history_id = stu_mca.item_id
) stu_c ON stu_c.HISTORY_ID = toa.ACTIVITY_ID ----->This table is joined earlier
LEFT OUTER JOIN
(SELECT c_e.EV_ID, c_e.EV_NAME, ma.item_id, ma.cata_id
FROM EVENTS c_e LEFT OUTER
JOIN COURSE_master ma on c_e.event_Id = ma.item_id ) c_e ON c_e.EVENT_ID = toa.ACTIVITY_ID
After these selects---we have GROUP_BYs to further sort.
---I have checked that if I put a extra limit qualification
like where rownum <30,<20 in each of these SELECTs it works fine.
Full query
SELECT * FROM (SELECT
mcat.CATALOG_ITEM_ID,
mcat.CATALOG_ITEM_NAME ,
mcat.DESCRIPTION,
mcat.CATALOG_ITEM_TYPE,
mcat.DELIVERY_METHOD,
XMLElement("TRAINING_PLAN",XMLAttributes( TP.TPLAN_ID as "id" ),
XMLELEMENT("COMPLETE_QUANTITY", TP.COMPLETE_QUANTITY),
XMLELEMENT("COMPLETE_UNIT", TP.COMPLETE_UNIT),
XMLElement("TOTAL_CREDITS", TP.numberOfCredits ),
XMLELEMENT("IS_CREDIT_BASED", TP.IS_CREDIT_BASED),
XMLELEMENT("IS_FOR_CERT", TP.IS_FOR_CERT),
XMLELEMENT("ACCREDIT_ORG_NAME", TP.ACCRED_ORG_NAME),
XMLELEMENT("ACCREDIT_ORG_ID", TP.accredit_org_id ),
XMLElement("OBJECTIVE_LIST", TP.OBJECTIVE_LIST )
).extract('/').getClobVal() AS PLAN_LIST
FROM
student_master_catalog mcat
INNER JOIN
(SELECT stu_tp.TPLAN_ID,
stu_tp.COMPLETE_QUANTITY,
stu_tp.COMPLETE_UNIT,
stu_tp.TPLAN_XML_DATA.extract('//numberOfCredits/text()').getStringVal() as numberOfCredits,
stu_tp.IS_CREDIT_BASED,
stu_tp.IS_FOR_CERT,
stu_oa.ACCRED_ORG_NAME,
stu_tp.TPLAN_XML_DATA.extract('//accreditingOrg/text()').getStringVal() as accredit_org_id,
objective_list.OBJECTIVE_LIST
FROM
student_training_catalog stu_tp
LEFT OUTER JOIN
stu_accrediting_org stu_oa on stu_tp.TPLAN_XML_DATA.extract('//accreditingOrg/text()').getStringVal() = stu_oa.ACCRED_ORG_ID
INNER JOIN
(SELECT *
FROM
(SELECT
stu_tpo.TPLAN_ID AS OBJECTIVE_TPLAN_ID,
XMLAgg(
XMLElement("OBJECTIVE",
XMLElement("OBJECTIVE_ID",stu_tpo.T_OBJECTIVE_ID ),
XMLElement("OBJECTIVE_NAME",stu_to.T_OBJECTIVE_NAME ),
XMLElement("OBJECTIVE_REQUIRED_CREDITS_OR_ACTIVITIES",stu_tpo.REQUIRED_CREDITS ),
XMLElement("ITEM_ORDER", stu_tpo.ITEM_ORDER ),
XMLElement("ACTIVITY_LIST", activity_list.ACTIVITY_LIST )
)
) as OBJECTIVE_LIST
FROM
stu_TP_OBJECTIVE stu_tpo
INNER JOIN
stu_TRAINING_OBJECTIVE stu_to ON stu_tpo.T_OBJECTIVE_ID = stu_to.T_OBJECTIVE_ID
INNER JOIN
(SELECT *
FROM
(SELECT stu_toa.T_OBJECTIVE_ID AS ACTIVITY_TOBJ_ID, XMLAgg(
XMLElement("ACTIVITY",
XMLElement("ACTIVITY_ID",stu_toa.ACTIVITY_ID ),
XMLElement("CATALOG_ID",COALESCE(stu_c.CATALOG_ID, COALESCE( stu_e.CATALOG_ID, stu_t.CATALOG_ID ) ) ),
XMLElement("CATALOG_ITEM_ID",COALESCE(stu_c.CATALOG_ITEM_ID, COALESCE( stu_e.CATALOG_ITEM_ID, stu_t.CATALOG_ITEM_ID ) ) ),
XMLElement("DELIVERY_METHOD",COALESCE(stu_c.DELIVERY_METHOD, COALESCE( stu_e.DELIVERY_METHOD, stu_t.DELIVERY_METHOD ) ) ),
XMLElement("ACTIVITY_NAME",COALESCE(stu_c.COURSE_NAME, COALESCE( stu_e.EVENT_NAME, stu_t.TEST_NAME ) ) ),
XMLElement("ACTIVITY_TYPE",initcap( stu_toa.ACTIVITY_TYPE ) ),
XMLElement("IS_REQUIRED",stu_toa.IS_REQUIRED ),
XMLElement("IS_PREFERRED",stu_toa.IS_PREFERRED ),
XMLElement("NUMBER_OF_CREDITS",stu_lac.CREDIT_HOURS),
XMLElement("ITEM_ORDER", stu_toa.ITEM_ORDER )
)) as ACTIVITY_LIST
FROM stu_TRAIN_OBJ_ACTIVITY stu_toa
LEFT OUTER JOIN
(
SELECT distinct lac.LEARNING_ACTIVITY_ID, lac.CREDIT_HOURS
FROM student_training_catalog tp
INNER JOIN stu_TP_OBJECTIVE tpo on tp.TPLAN_ID = tpo.TPLAN_ID
INNER JOIN stu_TRAIN_OBJ_ACTIVITY toa on tpo.T_OBJECTIVE_ID = toa.T_OBJECTIVE_ID
INNER JOIN stu_LEARNINGACTIVITY_CREDITS lac on lac.LEARNING_ACTIVITY_ID = toa.ACTIVITY_ID and tp.TPLAN_XML_DATA.extract ('//accreditingOrg/text()').getStringVal() = lac.ACC_ORG_ID
where tp.tplan_id ='*************'
) stu_lac ON stu_lac.LEARNING_ACTIVITY_ID = stu_toa.ACTIVITY_ID ------>This Select returns correct no. of rows
I want to join the below nested SELECTs with stu_toa.ACTIVITY_ID. This would solve my issues.
This below SELECT inside the LEFT OUTER JOIN is the Problem. it returns too much because 3 tables are joined directly without any value qualification.
LEFT OUTER JOIN
( SELECT ch.COURSE_HISTORY_ID, stu_c.COURSE_NAME, mca.catalog_item_id, mca.catalog_id, mca.delivery_method
FROM stu_COURSE stu_c
LEFT OUTER JOIN stu_course_history ch on stu_c.course_id = ch.ch_course_id -
--If I can qualify here with ch.ch_course_id = stu_toa.ACTIVITY_ID (stu_toa.ACTIVITY_ID from the above select with correct no. of rows )
--Here, I get errors because I can't access outside values inside a left outer join
LEFT OUTER JOIN student_master_catalog mca on ch.course_history_id = mca.catalog_item_id
) stu_c ON stu_c.COURSE_HISTORY_ID = stu_toa.ACTIVITY_ID
LEFT OUTER JOIN
(SELECT stu_e.EVENT_ID, stu_e.EVENT_NAME, mca.catalog_item_id, mca.catalog_id, mca.delivery_method FROM stu_EVENTS stu_e LEFT OUTER JOIN student_master_catalog mca on stu_e.event_Id = mca.catalog_item_id ) stu_e ON stu_e.EVENT_ID = stu_toa.ACTIVITY_ID
LEFT OUTER JOIN
(SELECT stu_t.TEST_HISTORY_ID, stu_t.TEST_NAME, mca.catalog_item_id, mca.catalog_id, mca.delivery_method FROM stu_TEST_HISTORY stu_t LEFT OUTER JOIN student_master_catalog mca on stu_t.test_history_id = mca.catalog_item_id) stu_t ON stu_t.test_history_id = stu_toa.ACTIVITY_ID
GROUP BY stu_toa.T_OBJECTIVE_ID) ) activity_list ON activity_list.ACTIVITY_TOBJ_ID = stu_tpo.T_OBJECTIVE_ID
GROUP BY stu_tpo.TPLAN_ID) ) objective_list ON objective_list.OBJECTIVE_TPLAN_ID = stu_tp.TPLAN_ID
)TP ON TP.TPLAN_ID = mcat.CATALOG_ITEM_ID
WHERE
mcat.CATALOG_ITEM_ID = '*****************' and mcat.CATALOG_ORG_ID = '********')
Please post the DDLs, approximate sizes (relative to each other), and the complete query, rather than just an excerpt.
Some quick hits that may or may not solve your problem (for better help, I need better information) --
Are you sure you mean OUTER join? Outer joining students to courses means students who are not taking any courses will still be around. Is that the desired behaviour?
Don't select * if you only want a limited subset of the columns. Enumerate the exact columns you need. The rest might not seem like much on a row-by-row basis, but when you multiply by the total number of rows you have, this sort of thing can mean the difference between in-memory sorts and spilling to disk.
How many rows of data are you looking at? there are times when separate queries with programmatic aggregation can work better. Someone with more knowledge of Oracle query optimization may be able to help, also, tweaking the settings could help here too...
I've had instances where a sproc was being called that aggregated data from more than one source took exponentially longer than two calls in the app, and putting it together in memory.
Post DDL of your tables and exact plan of the query.
Meanwhile, try increasing pga_aggregate_target, sort_area_size and hash_area_size