Oracle SQL Query (LEFT OUTER JOIN) - sql

I have five tables and i want to get result out of them. Here is what i am doing:
select
person.SERVICE_NO as Service_No, person.CNIC_NO as CNIC, person.NAME as NAME , card.CPLC_SERIAL_NO as Card_Number,
child_dc.NAME as Child_DC, root_dc.NAME as Root_DC, person.OU as OU, person.EMAIL as Email
from
person,card,person_card,child_dc,root_dc
where
person_card.PERSON_ID = person.ID
and
person_card.CARD_ID = card.ID
and
person.CHILD_DC_ID = child_dc.ID
and
root_dc.ID = child_dc.ID;
This query give redundant values, (not if i place a distinct with it). I was thinking of doing it with left out join; which means that i would be LEFT OUTER JOINING with 5 tables. How would i do this. If anyone has more optimized query or any other idea, that would be great.

Query Updated:
select distinct
person.SERVICE_NO as Service_No,
person.CNIC_NO as CNIC, person.NAME as NAME ,
card.CPLC_SERIAL_NO as Card_Number,
child_dc.NAME as Child_DC,
root_dc.NAME as Root_DC, person.OU as OU,
person.EMAIL as Email
from
person_card inner join person
on person_card.PERSON_ID = person.ID
inner join card
on person_card.CARD_ID = card.ID
left outer join child_dc
on person.CHILD_DC_ID = child_dc.ID
left outer join root_dc
on child_dc.ID = root_dc.ID;

Related

Looking to add in a Count query with Group by INTO an existing working query

Goal:
I wish to get the Count of how many times a WorkItem was re-assigned
From what I understand the proper query is the following:
SELECT
WorkItemDimvw.Id,
COUNT(WorkItemAssignedToUserFactvw.WorkItemAssignedToUser_UserDimKey) AS Assignments
FROM WorkItemDimvw INNER JOIN WorkItemAssignedToUserFactvw
ON WorkItemDimvw.WorkItemDimKey = WorkItemAssignedToUserFactvw.WorkItemDimKey
GROUP BY WorkItemDimvw.Id
The EXISTING query is below and I'm wondering / forgeting if I should:
Just add in COUNT(WorkItemAssignedToUserFactvw.WorkItemAssignedToUser_UserDimKey) AS Assignments since joins are existing, except it is group by WorkItemDimvw.Id
Should it instead be a subquery in the Select below?
Query:
SELECT
SRD.ID,
SRD.Title,
SRD.Description,
SRD.EntityDimKey,
WI.WorkItemDimKey,
IATUFact.DateKey
FROM
SLAConfigurationDimvw
INNER JOIN SLAInstanceInformationFactvw
ON SLAConfigurationDimvw.SLAConfigurationDimKey = SLAInstanceInformationFactvw.SLAConfigurationDimKey
RIGHT OUTER JOIN ServiceRequestDimvw AS SRD
INNER JOIN WorkItemDimvw AS WI
ON SRD.EntityDimKey = WI.EntityDimKey
LEFT OUTER JOIN WorkItemAssignedToUserFactvw AS IATUFact
ON WI.WorkItemDimKey = IATUFact.WorkItemDimKey
AND IATUFact.DeletedDate IS NULL
The trick is to aggregate the data on a sub query, before you join it.
SELECT
SRD.ID,
SRD.Title,
SRD.Description,
SRD.EntityDimKey,
WI.WorkItemDimKey,
IATUFact.DateKey,
IATUFact.Assignments
FROM
SLAConfigurationDimvw
INNER JOIN
SLAInstanceInformationFactvw
ON SLAConfigurationDimvw.SLAConfigurationDimKey = SLAInstanceInformationFactvw.SLAConfigurationDimKey
RIGHT OUTER JOIN
ServiceRequestDimvw AS SRD
ON <you're missing something here>
INNER JOIN
WorkItemDimvw AS WI
ON SRD.EntityDimKey = WI.EntityDimKey
LEFT OUTER JOIN
(
SELECT
WorkItemDimKey,
DateKey,
COUNT(WorkItemAssignedToUser_UserDimKey) AS Assignments
FROM
WorkItemAssignedToUserFactvw
WHERE
DeletedDate IS NULL
GROUP BY
WorkItemDimKey,
DateKey
)
IATUFact
ON WI.WorkItemDimKey = IATUFact.WorkItemDimKey

SQL join on two tables

How can I join two tables on either value?
select cases.*
from client_cases
inner join cases on client_cases.id = cases.timeline
left join customers on client_cases.customer_id = customers.id or customers.email = 'test#gmail.com'
where cases.timeline in (
select timeline from cases where cases.payload -> 'person' ->> 'phone' ~ '4625152'
)
I'm trying to get a result if either customers.email = 'test#gmail.com' or if the cases.payload JSONB phone field has the value '4625152'
The above query will only return a result if the phone number has a match. Not if the email has a match and the phone number doesn't.
I think it makes more sense to put email condition in WHERE clause.
select cases.*
from client_cases
inner join cases on client_cases.id = cases.timeline
left join customers on client_cases.customer_id = customers.id
where cases.timeline in (
select timeline from cases where cases.payload -> 'person' ->> 'phone' ~ '4625152'
) or customers.email = 'test#gmail.com'
The on clause is used when the join is looking for matching rows.
The where clause is used to filter rows after all the joining is done.
you have been doing the join with customers.email = 'test#gmail.com' but you filter that with WHERE case.timeline IN.... That's explain the result you had
SELECT cases.*
FROM client_cases
INNER JOIN cases ON (client_cases.id = cases.timeline)
LEFT JOIN customers ON (client_cases.customer_id = customers.id)
WHERE cases.timeline IN (
SELECT timeline
FROM cases
WHERE cases.payload -> 'person' ->> 'phone' ~ '4625152'
) OR customers.email = 'test#gmail.com'

Why is this SQL query returning this result?

Can anyone tell me why this sql query is returning this result.
SELECT
M.MatterNumber_Id, M.MatterNumber, M.MatterName,
ISNULL(MP.Role_Cd, 'No Primary') AS PrimaryRole,
ISNULL(E.Name, 'No Primary') AS PrimaryName,
ISNULL(C.CommNumber, 'l.w#abc.ca;m.j#abc.com') AS Email
FROM
Matter M
LEFT OUTER JOIN
MatterPlayer MP ON M.MatterNumber_Id = MP.MatterNumber_Id AND
MP.Role_Cd IN ('Primary Lawyer', 'Primary Staff Member') AND
MP.EndDate IS NULL
LEFT OUTER JOIN
Entity E on MP.Entity_EID = E.Entity_EID
LEFT OUTER JOIN Communication C on MP.Entity_EID = C.Entity_EID AND
C.CommunicationType_Cd = 'Email'
LEFT OUTER JOIN
MatterExposure ME on M.MatterNumber_Id = ME.MatterNumber_Id AND
ME.AssessedDate > '7/10/2014' AND
ME.Currency_CD IS NOT NULL
WHERE
M.MatterStatus_Cd = 'Active' AND
ME.AssessedDate IS NULL AND
M.Matter_Cd in
(SELECT
rpl.Type_CD
FROM
RuleProfile_Tabs rpt
INNER JOIN
RuleProfile_LookupCode rpl ON rpt.RuleProfile_ID = rpl.RuleProfile_ID
WHERE
tab_id = 1034 AND Caption LIKE 'Reportable Matter%')
ORDER BY
Email, M.MatterName
But When i see the results I check one of the records returned and it should not have been returned.
One of the results returned:
Bailey, Richard
In the db i check the values in the tables it should not have been returned because Currency_CD is null and in the sql it states currency_cd is not null. Also the assessed date is AFTER 7/10/2014
Table Values:
MatterStatus_CD:
Active
AssessedDate:
7/24/2014
Currency_CD:
NULL
EndDate:
NULL
Where clauses are used for filtering, not the join clause. Every JOIN should have an ON. Here's a head start.
Select M.MatterNumber_Id, M.MatterNumber, M.MatterName,
IsNull(MP.Role_Cd, 'No Primary') as PrimaryRole,
IsNull(E.Name, 'No Primary') as PrimaryName,
IsNull(C.CommNumber, 'l.w#abc.ca;m.j#abc.com') as Email
From Matter M
Left Outer Join MatterPlayer MP on M.MatterNumber_Id = MP.MatterNumber_Id
Left Outer Join Entity E on MP.Entity_EID = E.Entity_EID
Left Outer Join Communication C on MP.Entity_EID = C.Entity_EID
Left Outer Join MatterExposure ME on M.MatterNumber_Id = ME.MatterNumber_Id
Where M.MatterStatus_Cd = 'Active'
AND ME.AssessedDate is Null
and M.Matter_Cd in (select rpl.Type_CD
from RuleProfile_Tabs rpt
inner join RuleProfile_LookupCode rpl on rpt.RuleProfile_ID = rpl.RuleProfile_ID
where tab_id = 1034 AND Caption like 'Reportable Matter%')
and MP.Role_Cd In ('Primary Lawyer', 'Primary Staff Member')
and MP.EndDate is Null
and ME.AssessedDate > '7/10/2014'
and ME.Currency_CD IS NOT NULL
and C.CommunicationType_Cd = 'Email'
Order by Email, M.MatterName
You are LEFT JOINing MatterExposure onto Matter. Your Me.Currency_CD IS NOT NULL criteria in the LEFT JOIN's ON statement is only saying, "don't join to records where Currency_CD is null..." but because this is a left join, none of your Matter table's records are lost in the join. So when you ask for Currency_CD you will get NULL.
To remove records from your query where Currency_CD is null in the MatterExposure table you will either need to specify Currency_CD IS NOT NULL in your where statement (instead of your join) or will need to use an INNER JOIN to your MatterExposure table.
It is because this join is a LEFT OUTER JOIN. If you want to enforce this join make it an inner join. Or move that check to the WHERE clause.

Super Slow Query - sped up, but not perfect... Please help

I posted a query yesterday (see here) that was horrible (took over a minute to run, resulting in 18,215 records):
SELECT DISTINCT
dbo.contacts_link_emails.Email, dbo.contacts.ContactID, dbo.contacts.First AS ContactFirstName, dbo.contacts.Last AS ContactLastName, dbo.contacts.InstitutionID,
dbo.institutionswithzipcodesadditional.CountyID, dbo.institutionswithzipcodesadditional.StateID, dbo.institutionswithzipcodesadditional.DistrictID
FROM
dbo.contacts_def_jobfunctions AS contacts_def_jobfunctions_3
INNER JOIN
dbo.contacts
INNER JOIN
dbo.contacts_link_emails
ON dbo.contacts.ContactID = dbo.contacts_link_emails.ContactID
ON contacts_def_jobfunctions_3.JobID = dbo.contacts.JobTitle
INNER JOIN
dbo.institutionswithzipcodesadditional
ON dbo.contacts.InstitutionID = dbo.institutionswithzipcodesadditional.InstitutionID
LEFT OUTER JOIN
dbo.contacts_def_jobfunctions
INNER JOIN
dbo.contacts_link_jobfunctions
ON dbo.contacts_def_jobfunctions.JobID = dbo.contacts_link_jobfunctions.JobID
ON dbo.contacts.ContactID = dbo.contacts_link_jobfunctions.ContactID
WHERE
(dbo.contacts.JobTitle IN
(SELECT JobID
FROM dbo.contacts_def_jobfunctions AS contacts_def_jobfunctions_1
WHERE (ParentJobID <> '1841')))
AND
(dbo.contacts_link_emails.Email NOT IN
(SELECT EmailAddress
FROM dbo.newsletterremovelist))
OR
(dbo.contacts_link_jobfunctions.JobID IN
(SELECT JobID
FROM dbo.contacts_def_jobfunctions AS contacts_def_jobfunctions_2
WHERE (ParentJobID <> '1841')))
AND
(dbo.contacts_link_emails.Email NOT IN
(SELECT EmailAddress
FROM dbo.newsletterremovelist AS newsletterremovelist))
ORDER BY EMAIL
With a lot of coaching and research, I've tuned it up to the following:
SELECT contacts.ContactID,
contacts.InstitutionID,
contacts.First,
contacts.Last,
institutionswithzipcodesadditional.CountyID,
institutionswithzipcodesadditional.StateID,
institutionswithzipcodesadditional.DistrictID
FROM contacts
INNER JOIN contacts_link_emails ON
contacts.ContactID = contacts_link_emails.ContactID
INNER JOIN institutionswithzipcodesadditional ON
contacts.InstitutionID = institutionswithzipcodesadditional.InstitutionID
WHERE
(contacts.ContactID IN
(SELECT contacts_2.ContactID
FROM contacts AS contacts_2
INNER JOIN contacts_link_emails AS contacts_link_emails_2 ON
contacts_2.ContactID = contacts_link_emails_2.ContactID
LEFT OUTER JOIN contacts_def_jobfunctions ON
contacts_2.JobTitle = contacts_def_jobfunctions.JobID
RIGHT OUTER JOIN newsletterremovelist ON
contacts_link_emails_2.Email = newsletterremovelist.EmailAddress
WHERE (contacts_def_jobfunctions.ParentJobID <> 1841)
GROUP BY contacts_2.ContactID
UNION
SELECT contacts_1.ContactID
FROM contacts_link_jobfunctions
INNER JOIN contacts_def_jobfunctions AS contacts_def_jobfunctions_1 ON
contacts_link_jobfunctions.JobID = contacts_def_jobfunctions_1.JobID
AND contacts_def_jobfunctions_1.ParentJobID <> 1841
INNER JOIN contacts AS contacts_1 ON
contacts_link_jobfunctions.ContactID = contacts_1.ContactID
INNER JOIN contacts_link_emails AS contacts_link_emails_1 ON
contacts_link_emails_1.ContactID = contacts_1.ContactID
LEFT OUTER JOIN newsletterremovelist AS newsletterremovelist_1 ON
contacts_link_emails_1.Email = newsletterremovelist_1.EmailAddress
GROUP BY contacts_1.ContactID))
While this query is now super fast (about 3 seconds), I've blown part of the logic somewhere - it only returns 14,863 rows (instead of the 18,215 rows that I believe is accurate).
The results seem near correct. I'm working to discover what data might be missing in the result set.
Can you please coach me through whatever I've done wrong here?
Thanks,
Russell Schutte
The main problem with your original query was that you had two extra joins just to introduce duplicates and then a DISTINCT to get rid of them.
Use this:
SELECT cle.Email,
c.ContactID,
c.First AS ContactFirstName,
c.Last AS ContactLastName,
c.InstitutionID,
izip.CountyID,
izip.StateID,
izip.DistrictID
FROM dbo.contacts c
INNER JOIN
dbo.institutionswithzipcodesadditional izip
ON izip.InstitutionID = c.InstitutionID
INNER JOIN
dbo.contacts_link_emails cle
ON cle.ContactID = c.ContactID
WHERE cle.Email NOT IN
(
SELECT EmailAddress
FROM dbo.newsletterremovelist
)
AND EXISTS
(
SELECT NULL
FROM dbo.contacts_def_jobfunctions cdj
WHERE cdj.JobId = c.JobTitle
AND cdj.ParentJobId <> '1841'
UNION ALL
SELECT NULL
FROM dbo.contacts_link_jobfunctions clj
JOIN dbo.contacts_def_jobfunctions cdj
ON cdj.JobID = clj.JobID
WHERE clj.ContactID = c.ContactID
AND cdj.ParentJobId <> '1841'
)
ORDER BY
email
Create the following indexes:
newsletterremovelist (EmailAddress)
contacts_link_jobfunctions (ContactID, JobID)
contacts_def_jobfunctions (JobID)
Do you get the same results when you do:
SELECT count(*)
FROM
dbo.contacts_def_jobfunctions AS contacts_def_jobfunctions_3
INNER JOIN
dbo.contacts
INNER JOIN
dbo.contacts_link_emails
ON dbo.contacts.ContactID = dbo.contacts_link_emails.ContactID
ON contacts_def_jobfunctions_3.JobID = dbo.contacts.JobTitle
SELECT COUNT(*)
FROM
contacts
INNER JOIN contacts_link_jobfunctions
ON contacts.ContactID = contacts_link_jobfunctions.ContactID
INNER JOIN contacts_link_emails
ON contacts.ContactID = contacts_link_emails.ContactID
If so keep adding each join conditon on until you don't get the same results and you will see where your mistake was. If all the joins are the same, then look at the where clauses. But I will be surprised if it isn't in the first join because the syntax you have orginally won't even work on SQL Server and it is pretty nonstandard SQL and may have been incorrect all along but no one knew.
Alternatively, pick a few of the records that are returned in the orginal but not the revised. Track them through the tables one at a time to see if you can find why the second query filters them out.
I'm not directly sure what is wrong, but when I run in to this situation, the first thing I do is start removing variables.
So, comment out the where clause. How many rows are returned?
If you get back the 11,604 rows then you've isolated the problems to the joins. Work though the joins, commenting each one out (remove the associated columns too) and figure out how many rows are eliminated.
As you do this, aim to find what is causing the desired rows to be eliminated. Once isolated, consider the join differences between the first query and the second query.
In looking at the first query, you could probably just modify that to eliminate any INs and instead do a EXISTS instead.
Consider your indexes as well. Any thing in the where or join clauses should probably be indexed.

Top 1 on Left Join SubQuery

I am trying to take a person and display their current insurance along with their former insurance. I guess one could say that I'm trying to flaten my view of customers or people. I'm running into an issue where I'm getting multiple records back due to multiple records existing within my left join subqueries. I had hoped I could solve this by adding "TOP 1" to the subquery, but that actually returns nothing...
Any ideas?
SELECT
p.person_id AS 'MIRID'
, p.firstname AS 'FIRST'
, p.lastname AS 'LAST'
, pg.name AS 'GROUP'
, e.name AS 'AOR'
, p.leaddate AS 'CONTACT DATE'
, [dbo].[GetPICampaignDisp](p.person_id, '2009') AS 'PI - 2009'
, [dbo].[GetPICampaignDisp](p.person_id, '2008') AS 'PI - 2008'
, [dbo].[GetPICampaignDisp](p.person_id, '2007') AS 'PI - 2007'
, a_disp.name AS 'CURR DISP'
, a_ins.name AS 'CURR INS'
, a_prodtype.name AS 'CURR INS TYPE'
, a_t.date AS 'CURR INS APP DATE'
, a_t.effdate AS 'CURR INS EFF DATE'
, b_disp.name AS 'PREV DISP'
, b_ins.name AS 'PREV INS'
, b_prodtype.name AS 'PREV INS TYPE'
, b_t.date AS 'PREV INS APP DATE'
, b_t.effdate AS 'PREV INS EFF DATE'
, b_t.termdate AS 'PREV INS TERM DATE'
FROM
[person] p
LEFT OUTER JOIN
[employee] e
ON
e.employee_id = p.agentofrecord_id
INNER JOIN
[dbo].[person_physician] pp
ON
p.person_id = pp.person_id
INNER JOIN
[dbo].[physician] ph
ON
ph.physician_id = pp.physician_id
INNER JOIN
[dbo].[clinic] c
ON
c.clinic_id = ph.clinic_id
INNER JOIN
[dbo].[d_Physgroup] pg
ON
pg.d_physgroup_id = c.physgroup_id
LEFT OUTER JOIN
(
SELECT
tr1.*
FROM
[transaction] tr1
LEFT OUTER JOIN
[d_vendor] ins1
ON
ins1.d_vendor_id = tr1.d_vendor_id
LEFT OUTER JOIN
[d_product_type] prodtype1
ON
prodtype1.d_product_type_id = tr1.d_product_type_id
LEFT OUTER JOIN
[d_commission_type] ctype1
ON
ctype1.d_commission_type_id = tr1.d_commission_type_id
WHERE
prodtype1.name <> 'Medicare Part D'
AND tr1.termdate IS NULL
) AS a_t
ON
a_t.person_id = p.person_id
LEFT OUTER JOIN
[d_vendor] a_ins
ON
a_ins.d_vendor_id = a_t.d_vendor_id
LEFT OUTER JOIN
[d_product_type] a_prodtype
ON
a_prodtype.d_product_type_id = a_t.d_product_type_id
LEFT OUTER JOIN
[d_commission_type] a_ctype
ON
a_ctype.d_commission_type_id = a_t.d_commission_type_id
LEFT OUTER JOIN
[d_disposition] a_disp
ON
a_disp.d_disposition_id = a_t.d_disposition_id
LEFT OUTER JOIN
(
SELECT
tr2.*
FROM
[transaction] tr2
LEFT OUTER JOIN
[d_vendor] ins2
ON
ins2.d_vendor_id = tr2.d_vendor_id
LEFT OUTER JOIN
[d_product_type] prodtype2
ON
prodtype2.d_product_type_id = tr2.d_product_type_id
LEFT OUTER JOIN
[d_commission_type] ctype2
ON
ctype2.d_commission_type_id = tr2.d_commission_type_id
WHERE
prodtype2.name <> 'Medicare Part D'
AND tr2.termdate IS NOT NULL
) AS b_t
ON
b_t.person_id = p.person_id
LEFT OUTER JOIN
[d_vendor] b_ins
ON
b_ins.d_vendor_id = b_t.d_vendor_id
LEFT OUTER JOIN
[d_product_type] b_prodtype
ON
b_prodtype.d_product_type_id = b_t.d_product_type_id
LEFT OUTER JOIN
[d_commission_type] b_ctype
ON
b_ctype.d_commission_type_id = b_t.d_commission_type_id
LEFT OUTER JOIN
[d_disposition] b_disp
ON
b_disp.d_disposition_id = b_t.d_disposition_id
WHERE
pg.d_physgroup_id = #PhysGroupID
In Sql server 2005 you can use OUTER APPLY
SELECT p.person_id, s.e.employee_id
FROM person p
OUTER APPLY (SELECT TOP 1 *
FROM Employee
WHERE /*JOINCONDITION*/
ORDER BY /*Something*/ DESC) s
http://technet.microsoft.com/en-us/library/ms175156.aspx
The pattern I normally use for this is:
SELECT whatever
FROM person
LEFT JOIN subtable AS s1
ON s1.personid = person.personid
...
WHERE NOT EXISTS
( SELECT 1 FROM subtable
WHERE personid = person.personid
AND orderbydate > s1.orderbydate
)
Which avoids the TOP 1 clause and maybe makes it a little clearer.
BTW, I like the way you've put this query together in general, except I'd leave out the brackets, assuming you have rationally named tables and columns; and you might even gain some performance (but at least elegance) by listing columns for tr1 and tr2, rather than "tr1.*" and "tr2.*".
Thanks for all of the feedback and ideas...
In the simplest of terms, I have a person table that stores contact information like name, email, etc. I have another table that stores transactions. Each transaction is really an insurance policy that would contain information on the provider, product type, product name, etc.
I want to avoid giving the user duplicate person records since this causes them to look for the duplicates prior to running mail merges, etc. I'm getting duplicates when there is more than 1 transaction that has not been terminated, and when there is more than 1 transaction that has been terminated.
Someone else suggested that I consider a cursor to grab my distinct contact records and then perform the sub selects to get the current and previous insurance information. I don't know if I want to head down that path though.
It's difficult to understand your question so first I'll throw this out there: does changing your SELECT to SELECT DISTINCT do what you want?
Otherwise, let me get this straight, you're trying to get your customers' current insurance and previous insurance, but they may actually have many insurances before that, recorded in the [transactions] table? I looked at your SQL for quite a few minutes but can't figure out what it all means, so could you please reduce it down to only the parts that are necessary? Then I'll think about it some more. It sounds to me like you need a GROUP BY somehow, but I can't work it out exactly yet.
Couldn't take the time to dig through all your SQL (what a beast!), here's an idea that might make things easier to handle:
select
p.person_id, p.name <and other person columns>,
(select <current policy columns>
from pol <and other tables for policy>
where pol.<columns for join> = p.person_id
and <restrictions for current policy>),
(select <previous policy columns>
from pol <and other tables for policy>
where pol.<columns for join> = p.person_id
and <restrictions for previouspolicy>),
<other columns>
from person p <and "directly related" tables>
This makes the statement easier to read by separating the different parts into their own subselects, and it also makes it easier to add a "Top 1" in without affecting the rest of the statement. Hope that helps.